Inline Functions and their Uses

It’s good practice to divide a program into several functions, so that parts of the program aren’t repeated a lot and the code is easier to understand.


We all know that calling and returning from a function incurs some overhead. This overhead is sometimes large enough to have a significant effect on the overall speed of complex, function-heavy programs. In most cases, only a few functions are used extensively enough to make a significant impact on the performance of the whole program.


Not using functions is not an option; using function-like macros is one option, but there is a better solution: inline functions.


Yes, just as it sounds, inline functions are expanded at the place of the call rather than being “really called”, thus reducing the overhead. Wherever we call an inline function, the compiler can expand the function’s code there, and no actual call is made.


Member functions of classes are often made inline, since in many cases these functions are short, but any other function can be inline too. A function is made inline by preceding its definition with the keyword “inline”, as shown in the example below:



// precede the definition
// of function with "inline"
// it's optional to do so for
// the prototype
inline ret-type func-name(arg-list...)
{
...
...
...
}


Member functions (class) can be made inline as below:



class myclass
{
private:
...
...

public:
ret-type func-name(arg-list...);
...
...
};

inline ret-type myclass::func-name(arg-list...)
{
...
...
}



Or simply as



class myclass
{
private:
...
...

public:
inline ret-type func-name(arg-list...)
{
...
...
}
...
...
};


Short member functions are usually made inline as above.


Please note one thing though: inline is a request to the compiler, not a command, which means it depends on the compiler and the circumstances whether a particular function will really be made “inline” or not.


As inlining duplicates parts of the code, performance is gained at the expense of program size, which is pretty obvious. As I’ve said, not all functions make a major impact on the overall performance of the program, so we should carefully select which functions to inline and which not to: if we inline many functions, we can’t be sure it will do the performance any good, but it is certain that the program size will increase unnecessarily.
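
To make this concrete, here is a minimal complete example (my own illustration, not from any particular source); whether the call is actually expanded in place is still up to the compiler:

// a small function that is a
// good candidate for inlining
#include <iostream>
using namespace std;

inline int square(int x)
{
// the compiler may expand this body
// at each call site instead of
// emitting a real call
return x*x;
}

int main()
{
for(int i=1;i<=5;i++)
cout<<square(i)<<endl;
return 0;
}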




Search Engine and Ranking

Crawling

Start from an HTML, save the file, crawl its links

def collect_doc(doc)
-- save doc
-- for each link in doc
---- collect_doc(link.href)

Build Word Index

Build a set of tuples

{
word1 => {doc1 => [pos1, pos2, pos3], doc2 => [pos4]}
word2 => {...}
}

def build_index(doc)
-- for each word in doc.extract_words
---- index[word][doc] << word.position

Search

Given a set of words, locate the relevant docs

A simple example ...

def search(query)
-- Find the most selective word within query
-- for each doc in index[most_selective_word].keys
---- if doc.has_all_words(query)
------ result << doc
-- return result
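
To make the index and search steps concrete, here is a minimal C++ sketch (container layout and names are my own, mirroring the tuples above); for simplicity it scans the first query word's posting list rather than the most selective one:

#include <map>
#include <set>
#include <string>
#include <vector>
using namespace std;

// word => (doc => [positions])
typedef map<string, map<string, vector<int> > > Index;

void build_index(Index& index, const string& doc,
                 const vector<string>& words)
{
    for (int pos = 0; pos < (int)words.size(); pos++)
        index[words[pos]][doc].push_back(pos);
}

// AND-query: return docs that contain every query word
set<string> search(const Index& index, const vector<string>& query)
{
    set<string> result;
    if (query.empty()) return result;
    Index::const_iterator first = index.find(query[0]);
    if (first == index.end()) return result;
    map<string, vector<int> >::const_iterator d;
    for (d = first->second.begin(); d != first->second.end(); ++d) {
        bool has_all = true;
        for (size_t i = 1; i < query.size(); i++) {
            Index::const_iterator w = index.find(query[i]);
            if (w == index.end() ||
                w->second.find(d->first) == w->second.end()) {
                has_all = false;
                break;
            }
        }
        if (has_all) result.insert(d->first);   // doc has all words
    }
    return result;
}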


Ranking

There are many scoring functions, and the ranking system needs to be able to combine these scoring functions in a flexible way

def scoringFunction(query, doc)
-- do something from query and doc
-- return score

A scoring function can be based on word counts, page rank, the location of words within the doc, etc.
Scores need to be normalized, say to within the same range.
Each weight is between 0 and 1, and the sum of all weights equals 1.

def score(query, weighted_functions, docs)
-- scored_docs = []
-- for each weighted_function in weighted_functions
---- for each doc in docs
------ scored_docs[doc] += (weighted_function[:func](query, doc) * weighted_function[:weight])
-- return scored_docs.sort(:rank)
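
A sketch of the combination step in C++ (the scoring functions themselves are placeholders to be supplied by the caller):

#include <map>
#include <string>
#include <vector>
using namespace std;

typedef double (*ScoringFunction)(const string& query, const string& doc);

struct WeightedFunction {
    ScoringFunction func;
    double weight;          // between 0 and 1; weights sum to 1
};

// combine the normalized scores into one rank value per doc
map<string, double> score(const string& query,
                          const vector<WeightedFunction>& wfs,
                          const vector<string>& docs)
{
    map<string, double> scored_docs;
    for (size_t f = 0; f < wfs.size(); f++)
        for (size_t d = 0; d < docs.size(); d++)
            scored_docs[docs[d]] +=
                wfs[f].func(query, docs[d]) * wfs[f].weight;
    return scored_docs;   // sort by value to get the final ranking
}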

Collaborative Filtering

Given a set of users' ratings on movies, how do we recommend movies to a new user?

Ideas from the book: Collective Intelligence chapter one

Given a set of ratings per user
[ person1 => {movieA => rating-1A, movieB => rating-1B},
person2 => {movieX => rating-2X, movieY => rating-2Y, movieA => rating-2A}
...
]

Determine similarity

person1 and person2 are similar if they have rated the same movies with similar ratings

person_distance =
square_root of sum of
-- for each movie_name in person1.ratings.keys
---- if (person2.ratings.keys contains movie_name)
------ square(person1.ratings[movie_name] - person2.ratings[movie_name])


Person similarity =
0 if no common movies in corresponding ratings
1 / (1 + person_distance) otherwise

How to find similar persons to personK?
Calculate every other person's similarity to personK, and sort by similarity.
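
Sketched in C++ (my own rendering of the pseudocode above, with ratings keyed by movie name):

#include <cmath>
#include <map>
#include <string>
using namespace std;

typedef map<string, double> Ratings;   // movie name => rating

double person_similarity(const Ratings& p1, const Ratings& p2)
{
    double sum = 0;
    int common = 0;
    for (Ratings::const_iterator it = p1.begin(); it != p1.end(); ++it) {
        Ratings::const_iterator other = p2.find(it->first);
        if (other != p2.end()) {          // movie rated by both
            double diff = it->second - other->second;
            sum += diff * diff;
            common++;
        }
    }
    if (common == 0) return 0;            // no common movies
    return 1.0 / (1.0 + sqrt(sum));       // 1 / (1 + person_distance)
}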

How about movie similarity?

Invert the set of ratings to ...
[ movieA => {person1 => rating-1A, person2 => rating-2A},
movieB => {person1 => rating-1B}
movieX => {person2 => rating-2X}
movieY => {person2 => rating-2Y}
...
]

movie_distance =
square_root of sum of
-- for each person_name in movieX.ratings.keys
---- if (movieY.ratings.keys contains person_name)
------ square(movieX.ratings[person_name] - movieY.ratings[person_name])


Movie similarity =
0 if no common persons in corresponding ratings
1 / (1 + movie_distance) otherwise

How to find similar movies to movieX?
Calculate every other movie's similarity to movieX, and sort by similarity.

Making recommendations

Let's say a new personK provides his ratings. How do we recommend movies that may interest him?

User-based filtering

For each person in persons
-- similarity = person_similarity(personK, person)
-- For each movie_name in person.ratings.keys
---- weighted_ratings[movie_name] += (similarity * person.ratings[movie_name])
---- sum_similarity[movie_name] += similarity

For each movie_name in weighted_ratings.keys
-- weighted_ratings[movie_name] /= sum_similarity[movie_name]

return weighted_ratings.sort_by(:rating)
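
A C++ sketch of this user-based prediction, reusing person_similarity() from the earlier sketch (typedefs repeated for readability):

#include <map>
#include <string>
using namespace std;

typedef map<string, double> Ratings;        // movie => rating
typedef map<string, Ratings> AllRatings;    // person => ratings

double person_similarity(const Ratings& p1, const Ratings& p2);  // from the earlier sketch

// predicted ratings for personK, weighted by person similarity
Ratings recommend(const Ratings& personK, const AllRatings& persons)
{
    Ratings weighted_ratings, sum_similarity;
    for (AllRatings::const_iterator p = persons.begin();
         p != persons.end(); ++p) {
        double similarity = person_similarity(personK, p->second);
        for (Ratings::const_iterator m = p->second.begin();
             m != p->second.end(); ++m) {
            weighted_ratings[m->first] += similarity * m->second;
            sum_similarity[m->first]   += similarity;
        }
    }
    // normalize each movie's weighted rating
    for (Ratings::iterator m = weighted_ratings.begin();
         m != weighted_ratings.end(); ++m)
        if (sum_similarity[m->first] > 0)
            m->second /= sum_similarity[m->first];
    return weighted_ratings;   // sort by value for the top picks
}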


Item-based filtering

Pre-calculate the movie similarity

For each movie in movies
-- neighbors[movie] = rank_similarity(movie, no_of_close_neighbors)

{
movie1 => {movieA => similarity1A, movieB => similarity1B}
movie2 => {movieC => similarity2C, movieD => similarity2D}
}

At run time ...
personK => {movieX => rating-kX, movieY => rating-kY}

For each movie in personK.ratings.keys
-- for each close_movie in neighbors[movie]
---- weight_ratings[close_movie] += neighbors[movie][close_movie] * personK.ratings[movie]
---- similarity_sum[close_movie] += neighbors[movie][close_movie]

For each target_movie in weight_ratings.keys
-- weight_ratings[target_movie] /= similarity_sum[target_movie]

return weight_ratings.sort_by(:rating)

Using a Stack to Reverse Numbers

Yeah, I hear many of you saying this, and I know it’s no big deal to reverse a number, nor is it to use a stack to do so. I am writing this just to give you an example of how certain things in a program can be done using stacks. So, let’s move on…


As many of you already know, a stack is a data structure in which data is added and retrieved from only one end (the same end). Data is stored linearly, and the last item added is the first one to be retrieved; because of this it is also known as a Last-In-First-Out data structure. For more info please read Data Structures: Introduction to Stacks.


Now, let’s talk about reversing a number: reversing means rearranging the digits of a number in the opposite order, from one end to the other.


Suppose we have a number


12345


then its reverse will be


54321


Ok, now let’s have a look at the example program which does this:



// Program in C++ to reverse
// a number using a Stack

// PUSH -> Adding data to the stack
// POP -> Retrieving data from the stack

#include <iostream>
using namespace std;

// stack class
class stack
{
int arr[100];
// 'top' will hold the
// index number in the
// array from which all
// the pushing and popping
// will be done
int top;
public:
stack();
void push(int);
int pop();
};


// member functions
// of the stack class
stack::stack()
{
// initialize the top
// position
top=-1;
}

void stack::push(int num)
{
// the last valid index is 99
if(top==99)
{
cout<<"\nStack Full!\n";
return;
}

top++;
arr[top]=num;
}

int stack::pop()
{
// return -1 when the stack is
// empty; -1 can never be a
// digit, so it is a safe sentinel
if(top==-1)
{
return -1;
}

return arr[top--];
}
// member function definition ends

int main()
{
stack st;
int num, rev_num=0;
int i=1, tmp;

cout<<"Enter Number: ";
cin>>num;

// this code will store
// the individual digits
// of the number in the
// Stack
while(num>0)
{
st.push(num%10);
num/=10;
}

// code below will retrieve
// digits from the stack
// to create the reversed
// number
while((tmp=st.pop())!=-1)
{
rev_num+=(tmp*i);
i*=10;
}

cout<<"Reverse: "<<rev_num<<endl;
return 0;
}



The above code is pretty much straightforward, and I leave it up to you to understand it!


P.S. If you experience any problems understanding it, please first read the article Data Structures: Introduction to Stacks.




Classes and Structures in C++

In C, a structure (struct) gives us the ability to organize related data together. You may wonder why I say “in C”. It is because the structure is one of the few things that are more or less entirely different in the two languages (C and C++).


In C++, the role of structures is elevated so much as to be the same as that of a class. In C, a structure could only include data, as variables and arrays, but in C++ structures can also include functions, constructors, destructors, etc., and in fact everything else that a class can. Knowing this, it wouldn’t be wrong to say that in C++, structures are an alternate way of defining a class. However, there are some differences.


Look at the following code:



// First difference between a class
// and a structure in C++

// define a structure
struct mystruct
{
char name[25];
int id_no;
};

int main()
{
mystruct a;
// in C, it is necessary to
// include the struct keyword

// Example:
// struct mystruct a;

...
...
...
}



In C, we must declare an object of a structure type by using the keyword struct, but in C++ it is optional to do so (just like a class).


Another difference is the fact that all the data and functions inside a struct are public by default, contrary to a class, inside which the scope is private by default.


It is obvious from the following code:



// Second difference between a class
// and a structure in C++

// define a structure
struct mystruct
{
// public by default
// So convenient to start with
// public declarations

void func1();
void func2();
...
...

private:
// now private
int data1;
int data2;
...
...
};

// define a class
class myclass
{
// private by default
int data1;
int data2;
...
...

public:
// now public
void func1();
void func2();
...
...
};


While you’ve got the power, you should not define a class as a struct, since it is not good practice. Structures and classes should be used as per their original purposes, to avoid unnecessary complications and misunderstandings.




Initialization lists and base class members

Just some thoughts on a question that I ran across: why can't you initialize base class members in derived classes? Put another way, why can't you use base class data members in a derived class's initialization list? The simplest reason: the standard, the language rules, doesn't allow it. But let's have some fun with the "what-if"s.

One reason that struck me almost as a first thought is that the data member could be private, in which case you won't have access to it in the derived class. But what if it is public or protected? Let us take an example:


    class Base
    {
    public:
        Base(){}
        Base(int member_) : member (member_){}
        int member;
        //other members
    };

    class Derived : public Base
    {
    public:
        // ill-formed: 'member' belongs to Base, so it cannot
        // appear in Derived's initializer list
        Derived(int member_) : member(member_){}
        //other members
    };

    int main()
    {
        Derived derivedObject(10);
    }


You would have expected it to work, since the base member is accessible in the derived class, it being public. After all, it is perfectly okay to use it in the derived class constructor body! What is so special about initialization lists that only direct base classes, virtual bases, and the containing class's own data members may appear in them?

You might think that the base class object has not been allocated, or does not exist at all, while in the initialization list of the derived class, and hence it is not allowed. But that reasoning would be flawed. Why? For the following reason (quoting section 12.6.2 (5) from the standard):

Initialization shall proceed in the following order:

    — First, and only for the constructor of the most derived class as
described below, virtual base classes shall be initialized in the
order they appear on a depth-first left-to-right traversal of the
directed acyclic graph of base classes, where “left-to-right” is
the order of appearance of the base class names in the derived
class base-specifier-list.

    — Then, direct base classes shall be initialized in declaration order
as they appear in the base-specifier-list (regardless of the order
of the mem-initializers).

    — Then, non-static data members shall be initialized in the order
they were declared in the class definition (again regardless of
the order of the mem-initializers).

    — Finally, the body of the constructor is executed.

    [ Note: the declaration order is mandated to ensure that base and
member subobjects are destroyed in the reverse order of initialization.
    —end note ]

So, the base is already initialized before anything else happens in the initialization list. What could the reason be, then? Probably that it does not make sense! Once the base constructor has initialized the base member, the derived class initializing it again does not make sense. How can one thing be initialized twice? The second time, it has to be an assignment.

But again, that holds true only for non-POD members. For the object being constructed in the above code, via the default constructor of the base class, the POD member "member" remains uninitialized! It will have an indeterminate value. So it wouldn't be double initialization, would it? It would be initialized just once, and that from the derived class constructor (initialization list).

Now, consider a further derived class that publicly derives from the above "Derived" class with the same member initialization syntax. Now, that makes it two. This could probably have been dealt with by some complicated set of rules, but why add that logical overhead?

That is not all, though. The *rules* can get more complex: the consideration of different access specifiers (private/protected), different inheritance types (private/protected), an explicit constructor call that also initializes the member, virtual bases, different treatment for non-POD and POD types, and what not. Things just start to get too complex and dirty if you allow that.

Simply speaking, base class members should be the base class's responsibility, and a derived class should only be concerned with the construction abstraction provided by the base classes in the form of the base constructors and their initialization lists. Making things more coupled is always a sign of a bad design choice, where things start to fall apart as soon as something changes. That is not good code.

To make things clear and simple, it's best said and accepted that the standard does not allow initialization lists to take up the responsibility of initializing base class members; they handle just the direct bases, virtual bases, and the class's own members.
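
For completeness, here is a minimal sketch of the accepted pattern: the derived class forwards the value up through the base constructor, leaving the base responsible for its own member:

    class Base
    {
    public:
        Base() : member(0) {}
        Base(int member_) : member(member_) {}
        int member;
    };

    class Derived : public Base
    {
    public:
        // forward to the base constructor instead of naming
        // Base::member in the initializer list
        Derived(int member_) : Base(member_) {}
    };

    int main()
    {
        Derived derivedObject(10);   // Base::member == 10
    }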

Have fun... Cheers!

Distributed UUID Generation

How do we generate a unique ID within a distributed environment, with scalability in mind?

Also, the following feature requirements are given ...
  • The ID must be guaranteed globally unique (this rules out probabilistic approaches)
  • The assigned ID will be used by the client for an unbounded time period (this means that once assigned, the ID is gone forever and cannot be reused for a subsequent assignment)
  • The length of the ID is small (say 64-bit) compared to a machine-unique identifier (this rules out the possibility of using a combination of MAC address + process id + thread id + timestamp to create a unique ID)
Architectural goals ...
  • The solution needs to be scalable with respect to growth in request rates
  • The solution needs to be resilient to component failure

General Solution

Using a general distributed computing scheme, there will be a pool of ID generators (called "workers") residing in many inter-connected, cheap, commodity machines.

Depending on the application, the client (which requests a unique ID) may be collocated on the same machine as the worker, or it may reside on a separate machine. In the latter case, a load balancer will sit between the clients and the workers to make sure the client workload is evenly distributed.

Central Bookkeeper

One straightforward (maybe naive) approach is to have the worker (i.e. the ID generator) make a request to a centralized book-keeping server which maintains a counter. Obviously this central book-keeper can become a performance bottleneck as well as a single point of failure.

The performance bottleneck can be mitigated by having the worker request a "number range" from the centralized book-keeper rather than the "id" itself. In this case, id assignment is done locally by the worker within this number range; only when the whole range is exhausted will the book-keeper be contacted again.

When the book-keeper receives a request for a number range, it persists the allocated number range to disk before returning to the worker. So if the book-keeper crashes, it knows where to start allocating number ranges after reboot. To prevent the disk itself from being a SPOF, mirrored disks should be used.

The SPOF problem of the book-keeper can be addressed by having a pair of book-keepers (primary/standby). The primary book-keeper needs to synchronize its counter state to the standby via shared disks or counter-change replication. The standby continuously monitors the health of the primary book-keeper and takes over its role if it crashes.
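
A minimal sketch of the worker-side logic, assuming a hypothetical allocate_range() call that stands in for the RPC asking the book-keeper to persist and hand out the next block:

#include <cstdint>
#include <mutex>

// Stand-in for the real book-keeper RPC: a process-local counter.
// In the real design this call goes over the network, and the
// book-keeper persists the allocation before replying.
uint64_t allocate_range(uint64_t block_size)
{
    static uint64_t next_block = 0;
    uint64_t start = next_block;
    next_block += block_size;
    return start;
}

class IdWorker
{
    std::mutex lock;
    uint64_t next;                        // next id to hand out
    uint64_t end;                         // first id beyond our range
    static const uint64_t block_size = 1000;
public:
    IdWorker() : next(0), end(0) {}

    uint64_t next_id()
    {
        std::lock_guard<std::mutex> guard(lock);
        if (next == end)                  // range exhausted:
        {                                 // contact the book-keeper
            next = allocate_range(block_size);
            end  = next + block_size;
        }
        return next++;                    // assignment is local otherwise
    }
};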


Peer Multicast

Instead of using a central book-keeper to track the number range assignment, we can replicate this information among the workers themselves. Under this scheme, every worker keeps track of its current allocated range as well as the highest allocated range. So when a worker exhausts its current allocated range, it broadcasts a "range allocation request" to all other workers, waits for all of their acknowledgments, and then updates its current allocated range.

It is possible that multiple workers make requests at the same time. This kind of conflict can be resolved by distributed coordination algorithms (there are many well-known ones; one of them is the bully algorithm).

For performance reasons, the worker doesn't persist its most recently allocated id. If the worker crashes, it will request a brand-new range after bootup, even if the previous range was not fully utilized.


Distributed Hashtable

By using the "key-based-routing" model of DHT, each worker will create some "DHT Node" (with a randomly generated 64-bit node id) and join a "DHT ring". Under this scheme, the number range is allocated implicitly (between the node id and its immediate neighbor's node id).

Now we can utilize a number of nice characteristics of the DHT model, such as spreading the workers across a large pool of machines with O(log N) routing hops. Also, the DHT nodes contain replicated data of their neighbors, so that if one DHT node crashes, its neighbor takes over its number range immediately.

Now, what if a DHT node has exhausted its implicitly allocated number range? When this happens, the worker starts a new DHT node (which joins at some point in the ring and thus gets a newly assigned number range).

Threading Building Blocks

Just a quick note. Intel recently made their TBB library open source, and since it has a different, task-based approach to incorporating parallelism in C++, it felt interesting. It uses threads internally but keeps the user code away from threads themselves by parallelizing the actions/tasks the user performs in his/her code. Looks nice, sounds good: threads are basically a low-level detail of achieving the benefits of parallel processing, and if a layer of abstraction can insulate programmers from those details, it could ease and quicken the development of parallel processing within applications without getting into the nitty-gritty of threads. Analogously, do we ever deal ourselves with the parallel processing across FPUs?

A few days back I downloaded the open-sourced Intel Threading Building Blocks library and started to test a few samples with it.

One of the sample solutions in there was using the parallel_for algorithm (parallel_for.h). It was giving me a fatal error that the file affinity.h could not be found. It did not exist! This file was being included by parallel_for.h.

After much searching and going quickly through the docs, I could not find the reason, but I luckily bumped into a thread in the library's discussion forum, where one user complained of the file missing from the development release.

The suggestion to fix this was:

1. Either comment out the include.
2. Add an empty file in the include path.

This resolves the compilation issue, but why keep such an include anyway? It wastes a bit of the time of everyone who is new to the project and wants to quickly get the samples building, test/debug them, and see the library in action. Probably they will fix this soon.

More information can be found here - Latest Developer Release Missing affinity.h.

Hope this helps anyone who is starting out on TBB and runs into the same problem until that header comes alive in the project, while I go back to experimenting more with it. Good luck and have fun! :)

Something about Local Classes

We all know that identifiers (variables, objects, functions, etc.) may have two scopes in C++: they may be declared globally or locally to a block.


We have seen identifiers like variables and functions defined locally and globally quite often, but there is one identifier which is not that commonly declared as local; yeah, you guessed right, it's classes.


You might have noticed that classes are almost always declared globally, even when they are to be used only in one block. This is because of some restrictions that we'll discuss below.


First let’s have a look at a class declared locally:



// this code contains a local class
#include <iostream>

void func();

int main()
{
// myclass unknown here
}

void func()
{
class myclass
{
...
...
};
}


While classes may also be defined locally, there are some restrictions on what can and cannot be done with them. They are listed below:

  • Member functions must be defined inside the class definition (they are implicitly inline)
  • Members of a local class cannot access other variables within the enclosing block
  • A local class cannot have static member variables

Here is an example program:



// this code contains a local class
#include <iostream>
using namespace std;

void func();

int main()
{
// myclass unknown here
func();
}

void func()
{
int num;

class myclass
{
int a;

public:
// member functions MUST be defined
// inside the class
void set(int x)
{
// can't access num
a=x;
}
void show()
{
cout<<a<<endl;
}
};


myclass ob;
ob.set(10);
ob.show();
}




How bad code is formed

Ignoring the case where consultants do this purposefully to keep their contracts, in general no one intentionally creates unmaintainable code. However, bad code silently creeps in over the product's life cycle.

Manager's mind

Let's write the code quick. We don't have enough time for the next rollout. If we cannot roll out, the project may die quickly. Let's make long-term maintainability a secondary priority; we'll improve it after the rollout, when we have time. By the way, who knows if I'll still be working on this project in a year.
  • Unfortunately, if this rollout is successful, the next release cycle has an even tighter schedule
  • Code is "written once" and "read several hundred times" throughout its life cycle. The subsequent maintenance cost is several hundred times more than the gain made in the initial rollout.
I cannot justify any resources spent on refactoring because it doesn't add any features to my next release. The code is working already. You mention improved code maintainability, and I don't know how to quantify that.
  • This is how bad code persists and accumulates. After its momentum has grown past a certain point, everyone just accepts the reality that the code is already bad enough that there is no motivation to improve it.
Let's have the developers focus on writing the real code. We'll have the QA folks worry about the testing. This way, they can proceed at full speed in parallel.

Developer's mind

Similar logic exists elsewhere, but it doesn't fit exactly what I want. It doesn't provide enough features for my needs, and it also provides some features that I don't need. It is not worthwhile to spend my time reading and understanding how it works; let's re-implement my version from scratch.
  • This is how duplicated logic gets into the system. There are many ways to do one thing.
Similar logic is already working elsewhere. It is not worthwhile to spend my time reinventing the wheel. Let's copy it here and modify it for my needs.
  • This is another way duplicated code gets into the system. When you change the code in one place, you need to make corresponding changes in all the places where you have made copies; otherwise some copies may stop working correctly.
  • Due to "laziness", variable names are kept the same in the copies, making the copied code even harder to understand.
This code is not broken, please don't make any change.
  • This is how bad code persists and accumulates. After its momentum has grown past a certain point, everyone just accepts the reality that the code is already bad enough that there is no motivation to improve it.

Of course the code is hard to read, because it is solving a complex problem. If the code were easy to read, it must not be doing anything significant. You just need to be smarter when reading the code.

I am not convinced that this is the right way of doing this kind of thing. Therefore, I am not going to follow the convention, even though it has been followed in many other places. I am going to do it my way.

I don't know why I have to do it this way. But since this is the way we do this kind of thing in many other places, I just follow that convention. Is "consistency" a problem?

Coding for maintainability

One important criterion for judging the value of any serious software product is how maintainable it is over its life cycle. It is important to notice that code is "written once" but "read many times" as people subsequently maintain it (enhancing features, fixing bugs, optimizing performance). Any cost spent in understanding any non-trivial code is multiplied many times over and added to the overall maintenance budget of the software product.

It is therefore worth spending the effort to make the code as readable as possible. Once you do, you will collect the dividends over the product's life cycle.

Let's set out some criteria/characteristics of code quality from a maintainability standpoint.

Good code

Just by looking at its signature, you can guess what a class/method/function does without looking at its implementation. Very descriptive names for the class/method/function and its parameters/variables are carefully chosen so that the associated semantics are clear. When you open up the implementation, it does exactly what you guessed.

Since the naming is so descriptive, it doesn't need a lot of comments to be readable. Comments appear only in code segments that implement a complex algorithm, or in performance hotspots where code readability is intentionally sacrificed.

The responsibility of each class/method/function is very clear. There is only one way to do a particular thing, and it is well-encapsulated within the responsible class/method/function. There is no duplicated code.

Responsibility tends to be distributed evenly, so you don't have a package with too many classes or a class with too many instance variables/methods. And you won't have a method that spans more than two pages of code.

Bad code

Similar code shows up in many places, and it is hard to know which of these are "similar" and which are "exactly the same".

Extremely long methods (a couple hundred lines of code in one method), usually accompanied by deeply nested case statements.

Only trivial comments, and a lack of useful ones.

There are many different ways to do one particular thing, although they are supposed to achieve the same result.

To do a particular thing, the caller needs to make multiple calls to different objects in a certain sequence.

A class/method/function seems to imply certain responsibilities, but when you open up its implementation, you find it does something completely different. You quickly lose confidence in your understanding and have to trace down into all the implementation details to see what it actually does in every place.

There are some places in the code where you cannot figure out what it is trying to do.


In my next blog, I'll discuss how bad code is formed.

Easy Freeware Downloads - My New Blog

Easy Freeware Downloads


Q. What is this?

A. This is the screenshot of my new blog, Easy Freeware Downloads.


Q. What is it about?

A. It is pretty much a freeware (software) archive. It is a blog, so new freeware is added with a brief description and feature list.


Q. Why is it named so?

A. For each freeware we list, a direct download link makes
downloading easy with just ONE CLICK, hence the name.


Q. Is it worth visiting now?

A. I guess so, because I am announcing it after having worked on it for more than a month.


It has 60+ freeware listed (as of 28-Oct-07).


Click Easy Freeware Downloads to visit.

Understand Legacy Code

The first step in supporting a legacy system is to understand the code. One approach is to read through all the lines of code. Quite often this is ineffective: you may waste a lot of time reading dead code.

What I typically do is quickly scan through the package structure of the code, the class names, and the public methods. If the code structure is good, I can get a rough idea of where the logic lies. However, I rely more on analyzing the run-time behavior of the code to determine its logic flow.

A technique that I use frequently is to just run the system and determine the execution path. As a first step, I need to determine the entry points of the system.

Locate the entry points

Usually, the entry points are ...
  • main() method where the system boots
  • background maintenance tasks
  • event-driven callbacks
Finding these entry points is non-trivial, especially if the documentation of the overall system is poor. I have built a tool that allows me to instrument the classes I am interested in, so that when I start the system, it prints out all the entry points as they execute. An entry point is the lowest method in the stack trace that you are interested in. The tool is based on AspectJ, which allows me to inject print statements when certain methods are executed.

After I get an idea about where the entry points are, the next step is to drill down into each execution point to study what it is doing.

Drill down into entry points

The first thing I usually do is read through the code, trying to understand what it is doing. The Eclipse IDE is very effective for navigating through the code to see which method calls which. The most frequently used Eclipse features are ...
  • "Open declaration", which jumps to the method being called
  • "References", which lists all the methods that call this method
  • "Call hierarchy", which shows a tree of the callers of this method
  • "Type hierarchy", which shows the class hierarchy
After that, I add trace statements to the code to verify my understanding. Think of a trace statement as an enhanced logging mechanism which includes the thread id, the calling stack trace, and a message. Trace statements are temporary; their purpose is to help me understand the code, and I remove them afterwards. Since I don't want anyone to use the trace statements in the production environment, I wrote a "Trace" class so that I can easily remove them when I am done.

Document the understanding

I draw a UML diagram to describe my understanding, and also take notes about any observed issues and what could be improved.

Legacy System Phenomenon

As mentioned in my earlier blog, the software engineering skills we've learned assume a green-field project where we start from a clean sheet of paper. In that case, we can apply a number of practices to build a solid foundation for future evolution and maintenance.
  1. Test Driven Development will force you to document the external behavior of your system in terms of Unit Test
  2. Enforce a coding standard and use a tool to enforce that
  3. Make sure the code is regularly reviewed to ensure readability and modularity
In reality, fewer than 2% of projects are in this situation. Most likely, we are dealing with a piece of software that is not built on this foundation. In many cases, even a project that starts from a good foundation deteriorates over time because of schedule pressure, lack of engineering skills, people turnover ... etc.

How can one support such a legacy system? By support, I mean the following ...
  • Fix bugs when they occur
  • Tune performance when load increases
  • Add features when new requirements arrive
In order to do these things effectively, we need to first "understand" the legacy code and then "refactor" it. "Understanding" is about how to quickly establish an insight into the inner workings of the legacy code without making any modifications that might accidentally change its external behavior and cause bugs. "Refactoring" is about how to improve the understandability of legacy code by modifying it without changing its external behavior.

In later blogs, I will share my techniques of how to understand a piece of code written by someone else and also how to refactor it safely.

"Test" as the "Spec"

A frequently encountered question in enterprise software development is where the hand-off happens from the architect (who designs the software) to the developers (who implement it). Usually, the architect designs the software at a higher level of abstraction and then communicates her design to the developers, who break it down into a more concrete, detailed design before turning it into implementation-level code.

The hand off happens typically via an architecture spec written by the architect, composed of UML class diagrams, sequence diagrams or state transition diagrams ... etc. Based on the understanding from these diagrams, the developers go ahead to write the code.

However, the progression is not as smooth as we would expect. There is a gap between the architecture diagrams and the code, and a transformation is required. During that transformation there is the possibility of misinterpretation and wrong assumptions. Quite often, the system ends up wrongly implemented due to miscommunication between the developers and the architect. Such miscommunication can occur either because the architect hasn't described her design clearly enough in the spec, or because the developer is not experienced enough to fill in details that the architect considered obvious.

One way to mitigate this problem is to have more design review meetings or code review sessions to make sure what is implemented correctly reflects the design. Unfortunately, I find such review sessions usually don't happen, either because the architect is too busy with other tasks, or because she is reluctant to read the developers' code. It ends up that the implementation doesn't match the design. Quite often this discrepancy is discovered at a very late stage, leaving no time to fix it. While the developers start patching the implementation for bug fixes or new features, the architect loses control of the architecture's evolution.

Is there a way for the architect to enforce her design at an early stage, given the following common constraints?
  1. The architect cannot afford frequent progress/checkpoint review meetings
  2. While making sure the implementation compliant with the design at a higher level, the architect doesn't want to dictate the low level implementation details
Test As A Specification

The solution is to have the architect write the unit tests (e.g. JUnit test classes in Java), which act as the "spec" of her design.

In this model, the architect focuses on the "interface" aspect and on how the system interacts with external parties, such as the client (how this system will be used) as well as the collaborators (how this system uses other systems).



The system will expose a set of "Facade" classes which encapsulate the system's external behavior and act as the entry point for its clients. Therefore, by writing unit tests against these "Facades", the architect can fully specify the external behavior of the system.

A set of "Collaborator" classes is also defined to explicitly capture the interaction of this system to other supporting systems. This "Collaborator" classes are specified in terms of Mock Objects so that the required behavior of supporting system is fully specified. On the other hand, the interaction sequence with the supporting systems are specified via "expectation" of Mock objects.

The architect specifies the behavior of the "Facade" and "Collaborator" classes in a set of xUnit test cases, which act as the design spec of the system. This way, the architect defines the external behavior of the system while giving the developers enough freedom to decide on the internal implementation structure. Typically, there are many "Impl Detail" classes that the Facade delegates to, and these "Impl Detail" classes invoke the "Collaborator" interfaces to get things done.

Note that the architect is not writing ALL the test cases. Architecture-level unit tests are just a small subset of the overall test cases, specifically focused on the architecture-level abstraction. These tests are deliberately written to ignore implementation detail so that their stability is not affected by changes in implementation logic.

On the other hand, the developers will provide a different set of test cases that cover the "Impl Detail" classes. This set of "impl-level test cases" will change when the developers change the internal implementation. By separating these two sets of test cases into different categories, they can evolve independently as different aspects of the system change along its life cycle, resulting in a more maintainable system.

Example: User Authentication

To illustrate, let's go through an example using a user authentication system.
There may be 40 classes that implement this whole UserAuthSubsystem, but the architecture-level test cases focus only on the Facade classes and specify only the "external behavior" of what this subsystem should provide. They don't touch any of the underlying implementation classes, because those are the implementor's choices, which the architect doesn't want to constrain.

** User Authentication Subsystem Spec starts here **

Responsibility:

  1. Register User -- register a new user
  2. Remove User -- delete a registered user
  3. Process User Login -- authenticate a user login and activate a user session
  4. Process User Logout -- inactivate an existing user session

Collaborators:

  • Credit Card Verifier -- tells if the user name matches the card holder
  • User Database -- stores the user's login name, password, and personal information

public class UserAuthSystemTest {
    UserDB mockedUserDB;
    CreditCardVerifier mockedCardVerifier;
    UserAuthSubsystem uas;

    @Before
    public void setUp() {
        // Set up the mock collaborators
        mockedUserDB = createMock(UserDB.class);
        mockedCardVerifier = createMock(CreditCardVerifier.class);
        uas = new UserAuthSubsystem(mockedUserDB, mockedCardVerifier);
    }

    @Test
    public void testUserLogin_withIncorrectPassword() {
        String userName = "ricky";
        String password = "test1234";

        // Define the interactions with the collaborators
        expect(mockedUserDB.checkPassword(userName, password))
            .andReturn(false);
        replay(mockedUserDB, mockedCardVerifier);

        // Check the external behavior is correct
        assertFalse(uas.login(userName, password));
        assertNull(uas.getLoginSession(userName));

        // Check the collaboration with the collaborators
        verify(mockedUserDB, mockedCardVerifier);
    }

    @Test
    public void testRegistration_withGoodCreditCard() {
        String userName = "Ricky TAM";
        String password = "testp";
        String creditCard = "123456781234";
        expect(mockedCardVerifier.checkCard(userName, creditCard))
            .andReturn(true);
        mockedUserDB.addUser(userName, password);
        replay(mockedUserDB, mockedCardVerifier);
        uas.registerUser(userName, creditCard, password);
        verify(mockedUserDB, mockedCardVerifier);
    }

    @Test
    public void testUserLogin_withCorrectPassword() { .... }

    @Test
    public void testRegistration_withBadCreditCard() { .... }

    @Test
    public void testUserLogout() { .... }

    @Test
    public void testUnregisterUser() { .... }
}

** User Authentication Subsystem Spec ends here **

Summary

This approach ("Test" as a "Spec") has a number of advantages ...
  • There is no ambiguity about the system's external behavior, and hence no room for miscommunication, since the intended behavior of the system is communicated clearly in code.
  • The architect can write the test cases at the level of abstraction she chooses. She has full control over what she wants to constrain and what she wants to leave free.
  • By elevating architect-level test cases to be the spec of the system's external behavior, they become more stable and independent of changes in implementation details.
  • This approach forces the architect to think repeatedly about what the "interface" of the subsystem is and who its collaborators are. So the system design is forced to have a clean boundary.

Open Source Phenomenon

The open source phenomenon has drastically changed the skill set required of software development professionals today.

Traditionally, software development involved not just writing the business logic of your application, but also a lot of plumbing logic (such as data persistence and fault resilience). This plumbing logic is typically independent of your business logic, and also quite complicated (although these are well-studied engineering problems). In most projects, a large portion of the engineering effort went into re-inventing solutions for these recurring plumbing problems.

The situation is quite different in today's software development world. Thanks to the open source phenomenon, most of the plumbing logic is available in some form of software library or framework with source code available. There are different kinds of open source licensing models with different degrees of usage or modification restrictions. Nevertheless, the software development game plan has changed drastically: instead of building all the plumbing from scratch, we can just leverage what is out there as open source libraries/frameworks. The open source phenomenon brings huge savings in project cost and also reduces time to market.

However, surviving in the open source world requires a different set of software engineering skills that is not taught in college. Most techniques we learn in college assume a green-field development environment: we learn how to design a good system from scratch rather than how to understand or adapt existing code. We learn how to "write code" but not how to "read code".

Skills to survive in the Open Source world are quite different. Such as ...

"Knowing how to find" is more important than "knowing how to build"

"Google", "Wikipedia" are your friend. Of course, "Apache", "SourceForge" is good place to check. Sometimes I find it is not easy for a hard-core software engineer to look for other people's code to use. Especially before she has certain familiarity of the code base. The mindset need to be changed now. The confidence should be based on its popularity and in some case, you just use it and see how it goes.

"Quick learning" and "start fast"

Looking at usage examples is usually a good way to get a quick understanding of how things work. Then start to write some unit test code to get a better feel for the area you are interested in.

Be prepared to "switch"

This is very important when you are using less proven open source products. Even with a proven open source product, the level of support may fluctuate in the future, and that is not something under your control. Therefore, your application architecture should try hard to minimize the coupling (and dependencies) to the open source products.

In most cases, you need just a few features out of the whole suite of the open source product. A common technique that I often use to confine my usage is to define an interface that captures all the API that I need, and then write an implementation of it which depends on the open source product. Now the dependency is localized in the implementation class: whenever I switch products, I just need to rewrite the implementation of the interface. I can even use an IoC (inversion of control) mechanism to achieve zero dependencies between my application and the open source product.
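
As a minimal sketch of this technique (all names here are made up for illustration), suppose the application needs only simple key/value storage out of a much larger third-party product:

#include <string>

// The only API surface my application needs.
class KeyValueStore
{
public:
    virtual ~KeyValueStore() {}
    virtual void put(const std::string& key, const std::string& value) = 0;
    virtual std::string get(const std::string& key) = 0;
};

// The single class that knows about the third-party product.
// Switching products means rewriting only this class.
class VendorKeyValueStore : public KeyValueStore
{
    // vendor::Client client_;       // hypothetical vendor type
public:
    void put(const std::string& key, const std::string& value)
    {
        // client_.set(key, value);  // delegate to the vendor API
    }
    std::string get(const std::string& key)
    {
        // return client_.fetch(key);
        return "";
    }
};

The rest of the application is written against KeyValueStore only, and an IoC container (or a plain factory) decides which implementation to wire in.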

Ability to drill down

Quite often, the open source product provides most of what you need, but not 100%. In other scenarios, you want to make an enhancement or fix a bug. You need to be able to drill down into the code and feel comfortable making changes to it.

This is a tricky area. I know many good engineers who can write good code, but only very few can pick up code written by other people and become familiar with it. As I said earlier, we are trained to write code, not to read code. In dealing with open source products, code reading skills are very important.

In fact, the skill of reading code is generally very important, as most projects are in maintenance mode rather than green-field development mode. I will share code reading techniques in more detail in a later blog, when I talk about doing surgery on legacy code. But here is a quick summary of an important subset that is relevant to dissecting open source products ...
  1. Look at the package structure, class names, and method names to get a rough idea of the code structure and data structures.
  2. Put in some trace code that prints out the execution path, then run some examples to verify your understanding.
  3. For further drill-down, put in breakpoints and step through the code.
  4. Document your findings.
  5. Subscribe to the community mailing list and send an email to ask.

Reverse Engineering Tools

There are a couple of tools that analyze an existing code base to extract its design or external behavior. For example, Agitar can analyze your Java bytecode and generate JUnit tests, and Altova can analyze your code to generate UML diagrams.

These reverse engineering tools provide value for legacy applications where you have a piece of running production code but very little information about its design, and very few tests from which to understand its expected behavior. These tools can quickly analyze the code and tell you its observed behavior. They "discover" what the code is currently doing and give you a better starting point for further analysis.

But a better starting point is all they provide. They are typically unable to distinguish the intended behavior (sitting at a higher level of abstraction) from the actual implementation (sitting at a much lower level). In fact, I doubt that automatically extracting the intended behavior is possible at all. The end result is that they swamp you with a lot of low-level, implementation-specific details (because the actual implementation is all they know), and then you have to manually analyze what is important to you and what is not.

In other words, they give you a different starting point in a different format (e.g. a set of test methods and the class invariants they detect, instead of implementation Java code). But they don't raise your level of abstraction, which I argue has to be done manually anyway. So do they give you a better start? I think it is better, but not too much better. By following a set of code surgery and refactoring techniques, we can achieve a much better result. I will share such techniques in later blogs.

Now, are these tools useful for new development? I don't think so. I am a strong believer in TDD (Test-Driven Development), where the test itself should be regarded as the "spec" and so by definition has to be written first, and written manually, by the application designer. In TDD, you write your tests first and write them manually (not generated). You put all the expected behavior in your test methods, and you also put in mock objects to test the interactions. You diagram your UML model to capture your major classes and methods at a higher abstraction level.

Unit Testing Async Calls

Most unit testing frameworks assume a synchronous call model: the calling thread blocks until the called method returns. So a typical test method looks like the following ...
class SyncServiceTest {
    @Test
    public void testSyncCall() {
        Object input = ...
        Object expectedOutput = ...
        SyncService syncService = new SyncService();
        assertEquals(syncService.call(input), expectedOutput);
    }
}

However, this model doesn't fit if the call is asynchronous; i.e., the calling thread is not blocked, and the result comes back later on a separate callback thread. For example, consider the following AsyncService, where the call() method returns immediately and the result comes back through a callback interface.
class AsyncService {
    public static interface ResultHandler {
        void handleResult(Object result);
    }

    ResultHandler resultHandler;

    public AsyncService(ResultHandler resultHandler) {
        this.resultHandler = resultHandler;
    }

    public void call(final Object request) throws InterruptedException {
        // hand the work off to another thread; the result is
        // delivered later via the callback interface
        new ThreadedExecutor().execute(new Runnable() {
            public void run() {
                Object result = process(request);
                resultHandler.handleResult(result);
            }
        });
    }
}

The test method needs to change: you have to block the calling thread until the result comes back from the callback thread. The modified test code looks like the following ...
class AsyncServiceTest {

    Object response;

    @Test
    public void testAsyncCall() throws InterruptedException {
        Object input = ...
        Object expectedOutput = ...
        final Latch latch = new Latch();

        AsyncService asyncService =
            new AsyncService(new AsyncService.ResultHandler() {
                public void handleResult(Object result) {
                    response = result;
                    latch.release();
                }
            });

        asyncService.call(input);
        latch.attempt(3000);   // wait up to 3 seconds for the callback
        assertEquals(response, expectedOutput);
    }
}

In this test method, the calling thread waits for the callback thread. A latch is used for the synchronization. You cannot use Object.wait() and Object.notify() for this purpose, because there is a possibility that notify is called before the wait, in which case one thread will wait indefinitely. A latch is a one-shot switch: if "release" is called first, all subsequent "attempt" calls return immediately.

Half Life 2 Episode 2

I've been a big fan of Half Life and its sequels for years, and Valve just released HL2 Episode 2, so I've looked at some reviews online. It's receiving very high marks from those who've played it.

So why am I posting about a video game here? Because I'm dying to play it, but I won't play because I can't allow myself to be distracted from working on Run BASIC right now. Probably I'll give myself permission to enjoy some game playing abandon when the holidays roll around.

In the meantime Run BASIC development is a lot of fun too in many important ways. :-)

Overloading the Parenthesis () Operator

First, I want to apologize to my readers for not being able to post lately, partly due to my being very busy these days ;-)


As we all know, there are certain ways by which we can pass data to objects (of a class). We can pass values at the time of construction, as below:


  class-name ob-name(values);

Or we may define a member function to accept data, which can be called as below:


  class-name ob-name;
ob-name.set(values);

Where set is a member function of the class.


But there is actually one more way to do so. Yeah, you guessed it right: by overloading the parenthesis () operator.


The parenthesis () operator, like other operators, is overloaded with the following prototype:


  ret-type operator()(arg1, arg2,...);


The only argument passed to this function implicitly is the this pointer; the explicit argument list may contain as many arguments as you wish to pass.


The following example program illustrates the overloading of parenthesis ()
operator in C++.



// Overloading Parenthesis ()
// Operator
#include <iostream>
using namespace std;

class myclass
{
int a,b;

public:
myclass(){}
myclass(int,int);
myclass operator()(int,int);
void show();
};

// ------------------------
// --- MEMBER FUNCTIONS ---
myclass::myclass(int x, int y)
{
a=x;
b=y;
}

myclass myclass::operator()(int x, int y)
{
a=x;
b=y;

return *this;
}

void myclass::show()
{
cout<<a<<endl<<b<<endl;
}
// --- MEMBER FUNCTIONS ---
// ------------------------

int main()
{
myclass ob1(10,20);
myclass ob2;

ob1.show();

// it's a nice way to pass
// values, otherwise we
// would have to define and
// call a member/friend function
// to do so
ob2(100,200);
ob2.show();
}




Parsing XML in Run BASIC

One of the important features that a Web 2.0 language needs is an XML parser. Run BASIC now has one built in. The XMLPARSER statement parses an XML string and returns an XML accessor object with a bunch of handy built-in methods for making your way through an XML document.
Here is a simple example of what that sort of code looks like:

a$ = "<program name=""myprog"" author=""Carl Gundel""/>"
xmlparser #parser, a$
print #parser key$()
for x = 1 to #parser attribCount()
key$ = #parser attribKey$(x)
print key$; ", ";
print #parser attribValue$(x)
next x

This short program produces:

program
name, myprog
author, Carl Gundel

And here is a short program which will display the tag names and contents of an arbitrarily nested XML document:

xmlparser #doc, s$
print #doc key$()
call displayElements #doc
end

sub displayElements #xmlDoc
count = #xmlDoc elementCount()
for x = 1 to count
#elem = #xmlDoc #element(x)
print "Key: "; #elem key$();
value$ = #elem value$()
if value$ <> "" then
print " Value: "; value$
end if
print
call displayElements #elem
next x
end sub


Do-it-yourself programming

A friend of mine pointed me at this article "Do-It-Yourself Software" in the Wall Street Journal.

http://online.wsj.com/public/article/SB119023041951932741.html

Run BASIC and Liberty BASIC are both aimed at this market, and in fact this is the traditional niche of BASIC language products.

More on modularity

Run BASIC gives you the ability to create web pages that are component based. You define your own components, and the ability to plug these things together comes essentially for free. Here is a really simple example that I posted in our beta testing forum:

[masterPage]
cls
html "Program manager"
link #wiki, "Wiki", [runTheWiki]
print " ";
link #multi, "Multicounter", [runTheMulticounter]
print
if launchFlag then render #subProg
wait

[runTheWiki]
run "runWiki", #subProg
launchFlag = 1
goto [masterPage]

[runTheMulticounter]
run "multicounter", #subProg
launchFlag = 1
goto [masterPage]

Here is a version that doesn't use any GOTOs:

global launchFlag
call displayMasterPage
wait

sub displayMasterPage
cls
html "Program manager"
link #wiki, "Wiki", runSubprogram
#wiki setkey("runWiki")
print " ";
link #multi, "Multicounter", runSubprogram
#multi setkey("multicounter")
print
if launchFlag then render #subProg
end sub

sub runSubprogram linkKey$
run linkKey$, #subProg
launchFlag = 1
call displayMasterPage
end sub

So what does this do? It creates a simple web page with a title and two links. Click on the wiki link and the wiki program becomes part of the web page. Click on the multicounter link and the multicounter replaces the wiki part of the page. You can switch back and forth between the wiki and the multicounter at will with just the click of a mouse. What's even more interesting is that the multicounter is already a modular program, so you get three levels of modularity but you aren't limited to that.

So for programmers who like to put all their code in one file, there's nothing to prevent that. But people who like modules can have a field day.

Six sixes in one over in Twenty20 Cricket - World record

Six sixes in one over in Twenty20 (Twenty Twenty, T20) cricket? Oh! World record!!! It was an amazing over; all the balls went over the rope like rockets. The star was Yuvraj Singh of India, and the poor bowler was Stuart Broad (England). This happened on 2007-09-19, in a Twenty20 World Cup match. Yuvraj did this in the 19th over.

Unsurprisingly, he also scored 50 runs in just 12 deliveries, creating another world record. And think about it for a minute: he scored his 50 with 3 fours and 6 sixes, which adds up to 48 runs off just 9 balls. This record may well stand longer than any other.

With these six sixes, Yuvraj became the 4th member of the six-sixes club. Before him, West Indies great Sir Garfield Sobers and India's Ravi Shastri had done it, while Herschelle Gibbs joined the club this year in March. Well done, Yuvraj.

[Java Tips] Add Array into a List and convert a List into an Array

In Java, putting the contents of an array into a new List object, or adding them to an existing List object, can be done with a for() loop: just go through each element in the array and add it to the List one at a time. But that is not needed; Java provides a method that achieves it with a single call. The following code snippet shows how.

//import java.util.List;
//import java.util.ArrayList;
//import java.util.Arrays;
String[] array = {"one", "two", "three"};
List newListObject = Arrays.asList(array);

//adding to an existing List
String[] newArray = {"four", "five"};

List all = new ArrayList();
all.addAll(newListObject);
all.addAll(Arrays.asList(newArray));


Note that Arrays.asList() returns a fixed-size list backed by the array, which is why the snippet above copies it into a new ArrayList before adding further elements.

Similarly, creating a new array from an existing List object can be done with another for() loop: create an array whose size matches the list size and copy each element over one at a time. But for this requirement too, there is a set of methods.

List list = new ArrayList();
list.add("one");
list.add("two");

Object[] array1 = list.toArray(); //1
String[] array2 = (String[])list.toArray(new String[0]); //2

String[] array3 = new String[2];
list.toArray(array3); //3


Line #1 returns an array of Object type, while line #2 returns a new array object of String type.
Line #3 uses the same method as #2, but in this case we have provided an array object with the same size as the list. Because of that, in line #3 the provided array object itself is populated with the list elements.
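
One related detail worth knowing (a small illustrative sketch, not from the original snippet): if the array passed to toArray() is too small to hold the list, the method allocates and returns a new array rather than filling the one you gave it, so it is safest to always use the returned reference.

String[] small = new String[1]; // shorter than the 2-element list
String[] result = (String[])list.toArray(small);
System.out.println(result == small); // false - a new array was allocated
System.out.println(result.length);   // 2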

QuickTime security issue fixed with Firefox 2.0.0.7 new security release

Firefox has shipped another security release, 2.0.0.7. As addicted users of FF, we are really happy with how Firefox progresses: whenever issues are fixed, they provide us with a new release. Users also don't need to think twice about installing the new release, as it does not add more issues to an existing installation. When Internet Explorer (IE) 7 came out users had issues, and we still have many issues with IE7, but not with Firefox.

With this release they fixed the bug related to QuickTime: "Code execution via QuickTime Media-link files".
QuickTime Media-Link files contain a qtnext attribute that could be used on Windows systems to launch the default browser with arbitrary command-line options. When the default browser is Firefox 2.0.0.6 or earlier use of the -chrome option allowed a remote attacker to run script commands with the full privileges of the user. This could be used to install malware, steal local data, or otherwise corrupt the victim's computer. Read more

We really appreciate quick responses to security issues like this.

boost::any

boost::any is a strong concept and a much better replacement for void* for holding any type of data. You can build heterogeneous containers using it as well. Let us see how it works in a very simplified way. The idea is to have a template class that can wrap any type and hold a value of that type. Something like this:

template<typename T>
class HoldData
{
    T t;
};

Then we derive this wrapper from a common base class and add a constructor (which requires the stored type to be copy constructible), so the above becomes:

class BaseHolder
{
    public:
        virtual ~BaseHolder(){}
};

template<typename T>
class HoldData : public BaseHolder
{
    public:
        HoldData(const T& t_) : t(t_){}
    private:
        T t;
};

Now you would add a class, call it Variant, that takes inputs of all types and holds a pointer to the wrapper's base type. So now you have (including the above classes):

class BaseHolder
{
    public:
        virtual ~BaseHolder(){}
};

template<typename T>
class HoldData : public BaseHolder
{
    public:
        HoldData(const T& t_) : t(t_){}
    private:
        T t;
};

class Variant
{
    public:
        template<typename T>
        Variant(const T& t) : data(new HoldData<T>(t)){}
        ~Variant(){delete data;}
    private:
        BaseHolder* data;
};

You construct a wrapper object for the corresponding type and save its pointer in the Variant class, which can hold, and help retrieve, any data type without losing the respective type information. That is essentially what boost::any does. Take a look at the code here - boost::any code.

The documentation on it can be found here - boost::any documentation.
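
The simplified Variant above can store a value but has no way to get it back out. Here is a hedged sketch of one way retrieval could work, using dynamic_cast to recover the stored type; the get() member and the public t field are my illustrative additions, not how boost::any itself is implemented (boost uses a virtual type() function and any_cast):

#include <iostream>
#include <string>

class BaseHolder
{
    public:
        virtual ~BaseHolder(){}
};

template<typename T>
class HoldData : public BaseHolder
{
    public:
        HoldData(const T& t_) : t(t_){}
        T t; // public in this sketch so Variant::get() can reach it
};

class Variant
{
    public:
        template<typename T>
        Variant(const T& t) : data(new HoldData<T>(t)){}
        ~Variant(){delete data;}

        // dynamic_cast recovers the concrete wrapper; it yields 0
        // when the requested type does not match the stored one
        template<typename T>
        T* get() const
        {
            HoldData<T>* held = dynamic_cast<HoldData<T>*>(data);
            return held ? &held->t : 0;
        }

    private:
        BaseHolder* data;

        // copying is left out of this sketch; a full implementation
        // (like boost::any) deep-copies via a virtual clone()
        Variant(const Variant&);
        Variant& operator=(const Variant&);
};

int main()
{
    Variant v(std::string("hello"));

    if (std::string* s = v.get<std::string>())
        std::cout << *s << std::endl;           // prints "hello"

    if (v.get<int>() == 0)
        std::cout << "not an int" << std::endl; // type was preserved

    return 0;
}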

svnadmin create a new empty project repository in subversion (svn) in Linux

If you have installed Subversion (used for version control) and are looking to create a repository inside it, you are at the right place. The command to create a new project repository inside Subversion is svnadmin create.

# svnadmin create repoPath
- this will create an empty repository.
- repoPath has to be a path to a folder; if that folder does not exist, a new folder will be created.

Consider the following examples.
# svnadmin create /usr/local/svn/clients/MyProject
or
# svnadmin create .

The first command will create a repository inside "/usr/local/svn/clients/MyProject", while the second command creates a repository in your current directory.

After creating the repository, you should alter the access controls. For that, open conf/svnserve.conf found inside the newly created repository folder.
Common values to alter (see the sample configuration after this list) are:
anon-access
- access control for non-authenticated users
- better to set it to none (anon-access = none)
auth-access
- access control for authenticated users
- you will need to set it to "read" or "write" (auth-access = read)
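
For example, after editing, the [general] section of conf/svnserve.conf might look like the following (a typical setup; the password-db entry points at the passwd file in the same conf folder, where user accounts are listed):

[general]
anon-access = none
auth-access = write
password-db = passwd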

Call JavaScript in the body tag on different events

JavaScript can be called inside the body tag of a web page. Mostly, JavaScript functions are called on events such as loading a web page, clicking a button, moving the mouse, focusing on an element, etc. For each of these events there is a defined handler attribute, and it fires at that particular event. For example, onclick() is triggered on a mouse click event, while onmouseover() is called when the mouse moves over an element. But you can call JavaScript functions even without any event, as the sketch below shows.
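
For instance, a script block placed in the page runs as soon as the browser parses it, with no event involved. A minimal sketch (sayHello is just a made-up function for illustration):

<script type="text/javascript">
function sayHello() { alert("Hello!"); }
sayHello(); // called directly, not tied to any event
</script>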

The available events for the body tag can be listed as follows.

ONCLICK : mouse button clicked
ONDBLCLICK : mouse button double-clicked

ONMOUSEDOWN : mouse button is pressed
ONMOUSEOVER : mouse moved onto an element
ONMOUSEMOVE : mouse moved over an element
ONMOUSEOUT : mouse moved out of an element
ONMOUSEUP : mouse button is released

ONKEYPRESS : key pressed and released
ONKEYDOWN : key pressed
ONKEYUP : key released

There are two special events that are specific to the body tag. Those are:

ONLOAD : document loaded completely
ONUNLOAD : document unloaded

Calling a JavaScript method
On any of the above events, you can call any JavaScript function from your body tag. A JavaScript function named testAlert() can be called as below.
<body onload="testAlert();">

To call more than one JavaScript function, write the function calls separated by semicolons ( ; ) as below.
<body onclick="validate(); calculate(); submit();">

Which software (geek) monkey is worth the most?

An interesting conversation!!!

A tourist walked into a pet shop and was looking at the animals on display. While he was there, another customer walked in and said to the shopkeeper, "I'll have a C monkey please." The shopkeeper nodded, went over to a cage at the side of the shop and took out a monkey. He fitted a collar and leash and handed it to the customer, saying, "That'll be $5,000." The customer paid and walked out with his monkey. Startled, the tourist went over to the shopkeeper and said, "That was a very expensive monkey. Why did it cost so much?" The shopkeeper answered, "Ah, that monkey can program in C: very fast, tight code, no bugs, well worth the money."

The tourist looked at the monkey in another cage. "That one's even more expensive! $10,000! What does it do?" "Oh, that one's a C++ monkey; it can manage object-oriented programming, Visual C++, even some Java. All the really useful stuff," said the shopkeeper. The tourist looked around for a little longer and saw a third monkey in a cage of its own. The price tag around its neck read $50,000. He gasped to the shopkeeper, "That one costs more than all the others put together! What on earth does it do?"

The shopkeeper replied, "Well, I haven't actually seen it doing anything, but the other monkeys call him the project manager."


Is this true? Can we work on huge software projects without project managers? Who would do the estimations and resource utilization? To be frank, I cannot agree 100% with this, but there are so many people who survive as non-working project managers. They usually come to work early, but do nothing for the sake of their team members. Some of them have no idea about the responsibilities of their role in a software development process. Which type of project manager are you?

Run BASIC enters beta testing

Well, it's been quiet here for a couple of months, but now that the summer activities are over, things have begun to pick up steam. In particular, we started beta testing Run BASIC Personal Server a few weeks ago. We are still looking for a few more people to help test.

Run BASIC Personal Server is an all-in-one web app server, BASIC scripting language, database engine, and more. It offers an extremely easy way to get into web application development. When I say extremely easy, I am not exaggerating. We are talking "a child could do it" easy web programming.

When I asked my testers how they would describe Run BASIC, here is what some of them said:

" - - Run BASIC provides a complete alternative to the complex development languages that have evolved to script web content. Run BASIC wrests control back to you, allowing BASIC language scripting of web content. Create web pages in an easy to use project development environment and publish on the web at the click of a mouse."

" - - If you've ever used one of the classic BASIC interpreters, then you already know most of what you need to build dynamic websites using Run Basic."

" - - Run BASIC moves desktop programming out onto the internet. The screen displays, forms, interactions, graphics, and data-handling processes you create with clear, understandable BASIC programs suddenly become web applications, usable by anyone, on any platform -- Windows, Mac, Linux -- on any computer with browser access to the internet. With Run BASIC, you can write your program from anywhere, on any computer, and run it everywhere, on every computer."

If you are interested in Run BASIC and feel you have enough time to spend at least a few hours testing it out, please send an email to me at carlg@libertybasic.com and explain why you would like to be a beta tester.

We will accept only a certain number of applicants and I'll post a note here when we have enough.

Thanks!

-Carl Gundel
