How to use the same 'Model' classes in both Android application and GWT using JPA to store data on a MySQL database - J2EE Video Tutorial

http://www.youtube.com/watch?v=6R9rkaotBHI

This video describes how you can use the same 'Model' classes in your Android mobile application, in your GWT web front-end, and on the back-end, using JPA to store them in a MySQL database.

Integrating GWT with Spring and Hibernate ORM (transactions, exception handling, DAO templates, etc.) - GWT Video Tutorial

http://www.youtube.com/watch?v=JnVxtLPUE9I

Organized by Marakana, the San Francisco Java User Group hosted an event on April 13th, 2010 with Kunal Jaggi, who talked about integrating GWT with Spring and Hibernate ORM. In addition to the standard Spring+Hibernate integration (transactions, exception handling, DAO templates, etc.), Kunal discussed how to access Spring beans via GWT's RPC facilities. He covered client-side remote interfaces and service proxies, exception handling, domain object serialization to GWT, etc., and demonstrated how all this fits together with a demo app he had prepared.

http://www.sfjava.org/calendar/12296574/

How to install the Google plugin on Eclipse (Google Web Toolkit [GWT] and Google App Engine [GAE]) - GWT Video Tutorial

http://www.youtube.com/watch?v=kRvU8gG-_JA

A simple tutorial showing the steps for installing and testing GWT in Eclipse Galileo SR2.

Android Demonstration of a solution that consumes a SOAP web service developed in Dotnet - Android Video Tutorial

http://www.youtube.com/watch?v=_MMByNiwqMc

An Android demonstration of a solution that consumes a SOAP web service developed in .NET.

Reading file names from directory

The other day I had to read all the file names from a directory, and I found it surprisingly difficult to write a simple program to do that. While searching I ended up at Stack Overflow, and the following example is adapted from an answer there.

Note that 'dirent.h' is not a standard header on Windows. The best option is to download a Windows port of it and add it to the include directory of your project.

The program is as follows:


//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include <iostream>

#pragma warning( disable : 4996 )
#include "dirent.h"
#pragma warning( default : 4996 )

using namespace std;

int main(int argc, char *argv[])
{
    //Expect exactly one argument: the directory to list
    if (argc != 2)
    {
        cout << "Usage: " << argv[0] << " <Directory name>" << endl;
        return -1;
    }

    DIR *dir = opendir(argv[1]);
    if (dir == NULL)
    {
        cout << "Invalid Directory name" << endl;
        return -1;
    }

    //Print every entry name found in the directory
    cout << "Directory Listing for " << argv[1] << " : " << endl;
    struct dirent *ent;
    while ((ent = readdir(dir)) != NULL)
    {
        cout << ent->d_name << endl;
    }

    closedir(dir); //Release the directory handle
    return 0;
}

Note that this program takes the command-line arguments argc and argv. This means that the final .exe is run from the command prompt: the first parameter is the exe file name and the second parameter is the directory path. Alternatively, you can add the directory path under Debugging->Command Arguments in the project properties.

You may also encounter the following warning:

warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.

To get rid of it, I wrapped the include of dirent.h in #pragma warning directives that disable warning 4996 and then restore it.

You may also get the following error:

error C2664: 'FindFirstFileW' : cannot convert parameter 1 from 'char *' to 'LPCWSTR'

To get rid of it, go to Properties->General->Character Set. The default is 'Use Unicode Character Set'; change it to 'Use Multi-Byte Character Set'.

The output is as follows:

Project using Google App Engine and Google Web Toolkit (GWT) in Real Life - GWT Video Tutorial

http://www.youtube.com/watch?v=NWY24EvwqUw

This webinar focuses on starting a real life project using App Engine and Google Web Toolkit. This project showcases a prototype built for the PPCRV for Free Elections in the Philippines and constitutes the Registration Module.

Google Web Toolkit (GWT) Video Tutorial Part1-b

http://www.youtube.com/watch?v=sA3ioQQhgMU

Google Web Toolkit (GWT) Video Tutorial Part1-a

http://www.youtube.com/watch?v=ZF4dtlj6KD8

Google Web Toolkit (GWT /ˈɡwɪt/) is an open source set of tools that allows web developers to create and maintain complex JavaScript front-end applications in Java. Other than a few native libraries, everything is Java source that can be built on any supported platform with the included GWT Ant build files. It is licensed under the Apache License version 2.0.

GWT emphasizes reusable, efficient solutions to recurring Ajax challenges, namely asynchronous remote procedure calls, history management, bookmarking, internationalization and cross-browser portability.

Google Web Toolkit (GWT) Video Tutorial Part1-c

http://www.youtube.com/watch?v=Wm0TCj87DJk

Google Web Toolkit (GWT) Video Tutorial Part2-a

http://www.youtube.com/watch?v=4w5n1TT7cJk

Development with GWT

Using GWT, developers can rapidly develop and debug AJAX applications in the Java language using the Java development tools of their choice. When the application is deployed, the GWT cross-compiler translates the Java application to standalone JavaScript files that are optionally obfuscated and deeply optimized.

GWT does not revolve only around user interface programming; it is a general set of tools for building any sort of high-performance client-side JavaScript functionality. In live presentations, the developers of GWT emphasize that "GWT is not its libraries": it includes a library but is not fundamentally yet another AJAX library. This open-ended philosophy sometimes surprises developers new to GWT who expect it to provide an end-to-end "on rails" application framework. Indeed, many key architectural decisions are left completely to the developer. The GWT mission statement clarifies the philosophical breakdown of GWT's role versus the developer's role. History is one example: although GWT manages history tokens as users click Back or Forward in the browser, it does not prescribe how to map history tokens to an application state.

GWT applications can be run in two modes:
- Development mode (formerly Hosted mode): The application is run as Java bytecode within the Java Virtual Machine (JVM). This mode is typically used for development, supporting hot swapping of code and debugging.
- Web mode: The application is run as pure JavaScript and HTML, compiled from the Java source. This mode is typically used for deployment.

Several open-source plugins make GWT development easier with other IDEs, e.g. GWT4NB for NetBeans and Cypal Studio for GWT, with support also available for Eclipse and JDeveloper. The Google Plugin for Eclipse handles most GWT related tasks in the IDE, including creating projects, invoking the GWT compiler, creating GWT launch configurations, validations, syntax highlighting, etc.

Google Web Toolkit (GWT) Video Tutorial Part2-b

http://www.youtube.com/watch?v=bOqhbTTvfFU

Components

The major GWT components include:
- GWT Java-to-JavaScript Compiler: Translates the Java programming language to the JavaScript programming language.
- GWT Development Mode: Allows the developers to run and execute GWT applications in development mode (the app runs as Java in the JVM without compiling to JavaScript). Prior to 2.0, GWT hosted mode provided a special-purpose "hosted browser" to debug your GWT code. In 2.0, the web page being debugged is viewed within a regular browser. Development mode is supported through the use of a native-code plugin called the Google Web Toolkit Developer Plugin for many popular browsers.
- JRE emulation library: JavaScript implementations of the commonly used classes in the Java standard class library (such as most of the java.lang package classes and a subset of the java.util package classes).
- GWT Web UI class library: A set of custom interfaces and classes for creating widgets.

Google Web Toolkit (GWT) Video Tutorial Part2-c

http://www.youtube.com/watch?v=HJbvijJWIdY

Google Web Toolkit (GWT) Video Tutorial Part2-d

http://www.youtube.com/watch?v=09dmV-MW53A

Google Web Toolkit (GWT) Video Tutorial Part3-a

http://www.youtube.com/watch?v=na0Qgx7956U

Google Web Toolkit (GWT) Video Tutorial Part3-b

http://www.youtube.com/watch?v=Az7PaLpflBo

Google Web Toolkit (GWT) Video Tutorial Part3-c

http://www.youtube.com/watch?v=beUNFjzuce0

Google Web Toolkit (GWT) is a development toolkit for building and optimizing complex browser-based applications. GWT is used by many products at Google, including Google AdWords and Orkut. It's open source, completely free, and used by thousands of developers around the world.

Google Web Toolkit (GWT) Video Tutorial Part3-d

http://www.youtube.com/watch?v=8LWiJSXuYYc

CRUD GWT Project with Smart GWT , LGPL and APPENGINE

http://www.youtube.com/watch?v=XIfeLaYxNts

A CRUD GWT project with Smart GWT, LGPL and App Engine. For more information:
http://jumanor.blogspot.com/2010/12/primeros-pasos-con-smartgwt.html

Adding GWT to a simple Struts 1.x application - Struts Video Training

http://www.youtube.com/watch?v=100T13qLqxI

The full project is at http://codeherding.blogspot.com/2010/06/gwt-struts-1x-netbeans-tutorial-intro.html

How to Build a Web Application to manage a collection of books & CD's with Java EE 6 JSF 2.0 , EJB 3.1 , JPA 2.0 RESTful JAX-RS 1.1 and CDI 1.0 - JSF & JPA & EJB Video Tutorial

http://www.youtube.com/watch?v=vuwXxuCjOm0

This video tutorial explains how to build a web application with Java EE 6.

JPA Video Tutorial Part2 - Java Persistence API Video Tutorial

http://www.youtube.com/watch?v=PiSvH18afyE

JPA Video Tutorial Part1 - Java Persistence API Video Tutorial

http://www.youtube.com/watch?v=43_ARlfF5E0

Web Application with JSF 2.0 EJB 3.1 /JavaEE 6 / GlassFish 3 On NetBeans 6.8 - JSF Video Tutorial

http://www.youtube.com/watch?v=9LoBI79rzGI

How to create a web service in Netbeans 6.7.1 ? - Web Service Video Tutorial

http://www.youtube.com/watch?v=OytDHyD-f4o

This video tutorial explains how to create a basic web service and a web client that consumes the service.

Pt. 1/4 - Android for Java Developers - Learn about - Activities - Intents - Services - Broadcast Receivers - Content Providers- Android Video Tutorial

http://www.youtube.com/watch?v=XFRS5j3BOkw

Part 1 of 4. While Android is based on Java, there are some fundamental differences and Android-specific constructs to consider. In this presentation, Marko takes you through the anatomy of an Android application and demonstrates key Android building blocks.

You will learn about:
- Activities
- Intents
- Services
- Broadcast Receivers
- Content Providers

Get the code and slides at: http://marakana.com/f/270

Pt. 2/4 - Android for Java Developers - the anatomy of an Android application and a demonstration of key Android building blocks - Android Video Tutorial

http://www.youtube.com/watch?v=hcxIchfpHfU

Part 2 of 4. While Android is based on Java, there are some fundamental differences and Android-specific constructs to consider. In this presentation, Marko takes you through the anatomy of an Android application and demonstrates key Android building blocks.

You will learn about:
- Activities
- Intents
- Services
- Broadcast Receivers
- Content Providers

Get the code and slides at: http://marakana.com/f/270

Pt. 3/4 - Android for Java Developers - While Android is based on Java, there are some fundamental differences and Android specific constructs to consider - Android Video Tutorial

http://www.youtube.com/watch?v=vUXyBB-V_GM

Part 3 of 4. While Android is based on Java, there are some fundamental differences and Android-specific constructs to consider. In this presentation, Marko takes you through the anatomy of an Android application and demonstrates key Android building blocks.

Pt. 4/4 - Android for Java Developers - fundamental differences and Android-specific constructs to consider, the anatomy of an Android application and a demonstration of key Android building blocks - Android Video Tutorial

http://www.youtube.com/watch?v=Ok7pJgzfQOY

Part 4 of 4. While Android is based on Java, there are some fundamental differences and Android-specific constructs to consider. In this presentation, Marko takes you through the anatomy of an Android application and demonstrates key Android building blocks.

Learn about C2DM (Cloud to Device Messaging framework), the protocol, its requirements, its limitations, and how to get started in building applications that take advantage of this amazing framework. You will get to see a complete end-to-end application (both the Android client and its server-side counterpart) and understand how all of the pieces fit together. - C2DM Video Tutorial

http://www.youtube.com/watch?v=51F5LWzJqjg

While many people agree that Android's 2.2 release was a major milestone in its evolutionary path, one of its most important features is still waiting to be discovered: C2DM.

The Android Cloud to Device Messaging framework, which was first introduced at Google I/O 2010, has the potential to enable a whole new breed of applications for the platform. In a nutshell, C2DM makes it possible for developers to push data from their servers to their applications on Android devices.

C2DM is a relatively simple, very lightweight messaging technology that transcends carriers' networks and allows innovative ways to connect with our users, all without having to drain the batteries on their phones or waste wireless data, which is what we were forced to do with pull/polling-based approaches.

In this session you will learn about C2DM, the protocol, its requirements, its limitations, and how to get started in building applications that take advantage of this amazing framework. You will get to see a complete end-to-end application (both the Android client and its server-side counterpart) and understand how all of the pieces fit together.

Reading Files into Vector

I thought of this while trying to create a parser. The intention is to read a complete text file into a vector and then use this vector for other operations. This program shows how to read a text file into a vector of strings, one element per line.

The input file is as follows:

Line num 1

Another line

Line number 3

4th Line!

**%** Last line **%**

Program as follows:



//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include <iostream>
#include <vector>
#include <fstream>
#include <string>

using namespace std;

int main()
{
    vector<string> lines;
    lines.reserve(5000); //Pre-allocate for about 5K lines; the vector still grows if the file has more

    string fileName("test.txt");

    ifstream file;
    file.open(fileName.c_str());

    if (!file.is_open())
    {
        cerr << "Error opening file : " << fileName << endl;
        return -1;
    }

    //Read the lines and store them in the vector
    string line;
    while (getline(file, line))
    {
        lines.push_back(line);
    }

    file.close();

    //Dump all the lines to the output, prefixed with their index
    for (size_t i = 0; i < lines.size(); i++)
    {
        cout << i << ". " << lines[i] << endl;
    }

    return 0;
}




Output as follows:

Compare Machine Learning models with ROC Curve

The ROC curve is a common method for comparing the performance of different models. It can also be used to pick the trade-off between "false positives" and "false negatives". The ROC curve is defined as a plot of the "true positive rate" against the "false positive rate". However, I don't find the ROC concept intuitive and struggled for a while to grasp it.

Here is my attempt to explain the ROC curve from a different angle. We use a binary classification example to illustrate the idea (e.g. predicting whether a patient has cancer or not).

First of all, no predictive model is 100% correct. The desirable state is that a person who actually has cancer gets a positive test result, and a person who actually has no cancer gets a negative test result. Since the test is imperfect, it is possible that a person who actually has cancer tests negative (i.e. a failure to detect) or a person who actually has no cancer tests positive (i.e. a false alarm).


In reality, there is always a tradeoff between the false negative rate and the false positive rate. People can tune the decision threshold to adjust them (e.g. in a "random forest", we can predict positive when more than 30% of the decision trees predict positive). Usually, the threshold is set based on the consequence or cost of misclassification (e.g. in this example, a failure to detect has a much higher cost than a false alarm).


This can also be used to compare model performance. A good model is one that has both a low false positive rate and a low false negative rate, which is indicated by the size of the gray area below (the smaller the better).

"Random guess" is the worst prediction model and is used as a baseline for comparison. For a random guess, the decision threshold is a number between 0 and 1 that determines how often it predicts positive rather than negative.


The ROC curve is basically what I have described above with one transformation: the y-axis changes from "fail to detect" to 1 - "fail to detect", which becomes "success to detect" (the true positive rate). Honestly, I don't understand why this representation is better, though.

Now, the ROC curve will look as follows ...

Predictive Analytics Conference 2011

I attended the Predictive Analytics conference in San Francisco this week and got a chance to chat with some of the best data mining practitioners in the country. Here is a summary of my key takeaways.

How is labor divided between human and machine?

Another way to ask this question is how “machine learning” and “domain expertise” work together and complement each other, since each has different strengths and weaknesses.


Machine learning is very good at processing large amounts of data in an unbiased way, while a human is unable to process the same data volume and human judgment is usually biased. However, a machine cannot look beyond the data it is given. For example, if the prediction power is low, machine learning methods cannot distinguish whether that is because the data is not clean, because the wrong model was chosen, or because some important input feature was not captured. Domain expertise must be brought in to figure out the problem.

So the consensus is that data mining / machine learning is simply a toolbox that can be used to augment human domain expertise, but can never replace it. For example, the domain expert can throw a large number of input features at the machine learning model, which can determine the subset that is most influential. But if the domain expert doesn’t recognize an important input feature (and so doesn’t capture it), there is no way the machine learning model can figure out what is missing, or even recognize that something is missing.


On the other hand, humans are also very good at visualizing data patterns. “Data visualization” techniques can be a powerful means to get a good sense of the data and quickly identify the areas where drilldown analysis should be conducted. Of course, visualization is limited to low-dimensional data, as humans cannot comprehend more than a handful of dimensions. Humans are also easily biased, so they may find patterns that are actually coincidences. When human and machine work together, they complement each other very well.

What are some of the key design decisions in data mining?
  1. Balance false positives against false negatives based on the cost / consequence of making a wrong decision.
  2. We don’t have to use one method from beginning to end. We can use different methods at different stages of the analysis. For example, in a multi-class (A, B, C) problem, we can use a decision tree to distinguish A from not-A (i.e. B, C) and then use a support vector machine to separate B and C. As another example, we can use a decision tree to determine the best input attributes to be used by a neural network.

What is the most powerful / most commonly used supervised machine learning modeling technique?


The general answer is that each modeling technique has its strengths and weaknesses, and none of them wins in all situations. So understanding their respective strengths and weaknesses is important for picking the right one.

Generalized Linear Regression
Linear and logistic regression are both based on fitting a linear plane to a set of data points. Linear regression, used for numeric output, minimizes the root mean square error (the distance between predicted output and actual output); logistic regression, used for categorical output, fits a linear function to the logit of the output and is usually estimated by maximum likelihood rather than by squared error. They are by far the most commonly used techniques, have a long history in statistics, and are supported in pretty much all commercial and open source data mining tools.

Linear and logistic regression models require a certain amount of data preparation, such as missing data handling. They also assume that the output (or the logit of the output) is a linear combination of the input features, and the error is expected to be normally distributed. However, real-life scenarios are not always linear. To deal with non-linearity, input terms are mixed (usually by cross-multiplication) in different ways to generate additional input terms called “interactions”. This process is trial and error and can generate a huge number of combinations. Nevertheless, these models do a reasonably good job on a wide spectrum of business problems and are well understood by statisticians and data miners. They are also commonly used as a baseline for comparison with other models.

Neural Network
A neural network is based on multiple layers of perceptrons (each is like a logistic regression with binary input and output). There is typically one hidden layer (so the number of layers is 3) with N perceptrons (where N is chosen by trial and error). Because of the extra layer and the logit() function in the neural network, it can handle non-linearity very well. If it has good predictors in its input data, a neural network can achieve very high prediction performance.

Similar to linear regression, a neural network requires careful data preparation to remove noisy data as well as redundant input attributes (those that are highly correlated). A neural network also takes much longer to train than other methods. In addition, the model that a neural network has learned is not explainable, and it is hard to make good sense out of it.
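
To illustrate why the hidden layer helps with non-linearity, here is a small sketch of a 3-layer network that computes XOR, a function no single linear boundary can separate. The weights are hand-picked assumptions for illustration only, not learned from data, and the names logistic and xorNetwork are hypothetical.

```cpp
#include <cmath>

// Logistic (sigmoid) activation used by each perceptron.
double logistic(double z)
{
    return 1.0 / (1.0 + std::exp(-z));
}

// Inputs x1 and x2 are expected to be 0 or 1. Returns a value close to 1
// when exactly one of them is 1, and close to 0 otherwise.
double xorNetwork(double x1, double x2)
{
    // Hidden layer: two perceptrons with hand-picked weights.
    const double h1 = logistic(20.0 * x1 + 20.0 * x2 - 10.0);  // behaves like OR
    const double h2 = logistic(-20.0 * x1 - 20.0 * x2 + 30.0); // behaves like NAND
    // Output layer: behaves like AND of the two hidden outputs.
    return logistic(20.0 * h1 + 20.0 * h2 - 30.0);
}
```

Training a real network means finding such weights automatically, which is where the long training time mentioned above comes from.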

Support Vector Machine
A Support Vector Machine is a binary classifier with numeric input features. It is based on finding a linear plane that separates the two output classes such that the margin is maximized. The optimal solution is expressed in terms of dot products of vectors. If the points are not linearly separable, we can use a function to transform the points to a higher-dimensional space in which they are linearly separable. The math shows that the dot product (after transforming to a high-dimensional space) can be generalized into a kernel function (the radial basis function being the most common one). Although the underlying math is not easy for everyone to understand, SVM has demonstrated outstanding performance on a wide spectrum of problems and has recently become one of the most effective methods.
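
As a small illustration, the radial basis function kernel mentioned above can be written as k(a, b) = exp(-gamma * ||a - b||^2). The sketch below computes just that value; the name rbfKernel is hypothetical, gamma is a tuning parameter assumed to be positive, and both vectors are assumed to have the same length.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Radial basis function kernel: 1.0 for identical points, approaching 0.0
// as the points move further apart.
double rbfKernel(const std::vector<double>& a,
                 const std::vector<double>& b,
                 double gamma)
{
    double squaredDistance = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double d = a[i] - b[i];
        squaredDistance += d * d;
    }
    return std::exp(-gamma * squaredDistance);
}
```

An SVM uses this value wherever the linear version would use the plain dot product of the two points.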

Despite its powerful capability, SVM is not broadly implemented in commercial products because of a patent issue, as AT&T holds the patent on SVM. In addition, the non-linear kernel function (such as the most common radial basis function) is difficult to implement in a parallel programming model such as Map/Reduce. SVM is undergoing active research, and a derivative, Support Vector Regression, can be used to predict numeric output.


Tree Ensembles

This combines “ensemble methods” with “decision trees”.

The decision tree is a first-generation machine learning algorithm based on a greedy approach. For a classification problem, a decision tree tries to split a branch where the combined “purity” (measured by either the Gini index or entropy) after the split is maximized. For a regression problem, a decision tree tries to split where the combined “between-class-variance” divided by “within-class-variance” is maximized. This is equivalent to maximizing the F-value after the split. The splitting continues until a terminating condition is reached, such as too few members remaining in the branch or the gain of a further split being insignificant.

Decision trees are very good at dealing with missing values (simply not using that value in learning, and going down both paths in scoring). Using a decision tree to capture the decision model is also very comprehensible and explainable. However, a decision tree is relatively sensitive to noise and can easily overfit the data. Although the learning mechanism is easy to understand, a single decision tree doesn’t perform very well in general and is rarely used in real systems. However, when decision trees are used together with ensemble methods, they become extraordinarily powerful, as these weaknesses largely disappear.


The idea of an ensemble is simple. Instead of learning one model, we learn multiple models and combine the estimates of the individual learners (e.g. we let them vote on categorical output and compute the average for numeric output).


There are two main approaches for creating different learners. One is called “bagging”, which basically draws samples (with replacement) from the training set and then has the same tree algorithm learn on each sampled data set. The other is called “boosting”, which runs a sequence of iterations where samples are drawn from the training set according to a probability distribution in which the items wrongly predicted in the last round have a higher chance of being selected. In other words, the algorithm pays more attention to learning from wrongly classified examples.
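
The sampling step of bagging can be sketched like this; it covers only drawing the sample, not training the trees, and the name bootstrapIndices is hypothetical. Each call returns n row indices drawn uniformly with replacement, so some rows appear several times and others not at all.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Draws n indices uniformly from [0, n) with replacement.
// Each learner in the ensemble would be trained on the rows selected by one such draw.
std::vector<std::size_t> bootstrapIndices(std::size_t n, std::mt19937& rng)
{
    std::vector<std::size_t> indices;
    if (n == 0) return indices;
    indices.reserve(n);
    std::uniform_int_distribution<std::size_t> pick(0, n - 1);
    for (std::size_t i = 0; i < n; ++i)
        indices.push_back(pick(rng));
    return indices;
}
```

Boosting would replace the uniform distribution with one that gives the previously misclassified rows a higher weight.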


It turns out that the tree ensemble is the most popular method at this moment, as it achieves very good prediction across the board, is easy to understand, and can be implemented in Map/Reduce. Google recently published a good paper on their PLANET project, which implements tree ensembles on Map/Reduce.

Automatic file backup - using a timer

Once our file backup utility is running and our window is open, we need to be able to use the Start and Stop buttons to control the activity of the program.

Here is the code for the start action. It ties into the [start] label in the Start button.

[start] 'startup the backup timer
#main.start "!disable"
#main.stop "!enable"
gosub [checkInitialFiles]
#main.statusLog "Starting backup"
#main.interval "!contents? interval"
timer interval * 1000, [checkFiles]
wait

When the button is clicked we disable the Start button and enable the Stop button to show which operations are valid. We call [checkInitialFiles], which we haven't written yet (we'll get into that later). We show in the statusLog texteditor that we are starting the backup process. Then we get the contents of the interval textbox and start the timer. The reason we multiply the interval by 1000 is that the timer measures time in milliseconds, so if we want a 5 second interval we need to give the timer a value of 5000. Finally we stop and wait for a timer tick or for user interaction.

Once our timer is running we need to be able to stop it. Here is our stop handler:

[stop] 'stop the backup timer
timer 0
#main.start "!enable"
#main.stop "!disable"
#main.statusLog "Stopping backup"
wait

This is real simple. First thing is to stop the timer with timer 0. Then we reverse the enabling of the Start and Stop buttons. Compare this to the way that [start] does it. Then we log to the statusLog texteditor that we are stopping. Finally we wait.

The purpose of the [checkInitialFiles] subroutine is to create a description of the files we are interested in and the time and date they were last modified. Then each time the timer ticks after this we create a new description of these files. If the date and time changes on any of these files then it's time to make a new backup.
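The snapshot-and-compare idea described above is independent of Liberty BASIC. As a rough sketch in Java (the class and method names here are hypothetical, and this is not the routine the tutorial goes on to write), a snapshot can be a map from file path to last-modified time, and a change is detected when two snapshots differ:

```java
import java.io.File;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FileSnapshotSketch {
    // Record the last-modified timestamp of every file of interest.
    // File.lastModified() returns 0L when the file does not exist.
    static Map<String, Long> snapshot(List<String> paths) {
        Map<String, Long> stamps = new LinkedHashMap<>();
        for (String path : paths) {
            stamps.put(path, new File(path).lastModified());
        }
        return stamps;
    }

    // A backup is due when any path or timestamp differs between snapshots.
    static boolean hasChanged(Map<String, Long> before, Map<String, Long> after) {
        return !before.equals(after);
    }
}
```

The first snapshot would be taken when Start is clicked, and a fresh one on every timer tick.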

For now let's just create an empty [checkInitialFiles] subroutine:

[checkInitialFiles] 'snapshot of filenames and timestamps
return

The routine doesn't do anything yet, so we only have a RETURN statement.

Now we'll create a [checkFiles] routine which will be called each time the timer ticks. For now the routine will not do much. We will write the full routine in a later section.

[checkFiles] 'are there new files
#main.statusLog "tick"
'temporarily disable the timer
timer 0
'perform the check here
'reenable the timer
timer interval * 1000, [checkFiles]
wait

The first thing we do is log the word "tick" to the statusLog. This is for instructive purposes only and will be removed later. We do this so that we can see that the timer is working. After this we disable the timer. This might seem like a strange idea, but the reason we do it is because the next thing we do is check the files to see if they changed (we'll write this part later). If they change we don't want the timer to be running because if it takes a while to backup the files and the timer is still running then the timer events can build up. Once the file check and possible backup are finished we reenable the timer.
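For comparison, the same "don't let ticks pile up while a check is still running" behaviour can be sketched in Java. This is an analogy under assumptions, not a translation of the Liberty BASIC code: the class name BackupTimerSketch is hypothetical, and it relies on ScheduledExecutorService.scheduleWithFixedDelay, which only starts counting the next delay after the previous run has finished.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class BackupTimerSketch {
    private ScheduledExecutorService scheduler;

    // The interval is entered in seconds, the scheduler is driven in milliseconds.
    static long toMillis(long intervalSeconds) {
        return intervalSeconds * 1000L;
    }

    // Start ticking. With a fixed delay the next tick is scheduled only after
    // checkFiles has returned, so slow backups cannot cause ticks to build up.
    // intervalSeconds must be greater than zero.
    synchronized void start(long intervalSeconds, Runnable checkFiles) {
        stop();
        long delay = toMillis(intervalSeconds);
        scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleWithFixedDelay(checkFiles, delay, delay, TimeUnit.MILLISECONDS);
    }

    // Stop ticking (the counterpart of "timer 0").
    synchronized void stop() {
        if (scheduler != null) {
            scheduler.shutdownNow();
            scheduler = null;
        }
    }

    synchronized boolean isRunning() {
        return scheduler != null;
    }
}
```

A caller would pass the file check as the Runnable, for example start(5, ...) for a five second interval, and call stop() from the Stop button handler.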

The entire listing so far is posted below. Try running it. When you click Start it will begin logging its activity. Notice that the word tick gets logged every five seconds. Click the Stop button and change the interval to 1. Start it again and the logging will happen once per second.


dim info$(10,10)
setupPath$ = DefaultDir$+"\backupsetup.ini"

WindowWidth = 560
WindowHeight = 460
statictext #main, "Files to backup:", 5, 5, 94, 20
texteditor #main.listOfFiles, 5, 26, 530, 95
statictext #main, "Destination folder:", 5, 132, 107, 20
textbox #main.destination, 115, 127, 420, 25
statictext #main, "Backup interval in seconds:", 5, 157, 163, 20
textbox #main.interval, 170, 152, 100, 25
button #main.save,"Save",[save], UL, 495, 152, 42, 25
button #main.start,"Start",[start], UL, 5, 187, 75, 25
button #main.stop,"Stop",[stop], UL, 90, 187, 70, 25
statictext #main, "Backup status log", 5, 217, 106, 20
texteditor #main.statusLog, 5, 237, 530, 160
menu #main, "Edit"
open "Backup Utility" for window_nf as #main
#main.stop "!disable"
gosub [loadSetup]
wait

[loadSetup]
#main.listOfFiles "!cls";
if fileExists(setupPath$) then
open setupPath$ for input as #setup
while filename$ <> "end!"
line input #setup, filename$
if filename$ <> "end!" then
#main.listOfFiles filename$
end if
wend
line input #setup, destination$
#main.destination destination$
line input #setup, interval
#main.interval interval
close #setup
end if
return

[start] 'startup the backup timer
#main.start "!disable"
#main.stop "!enable"
#main.interval "!contents? interval"
gosub [checkInitialFiles]
#main.statusLog "Starting backup"
timer interval * 1000, [checkFiles]
wait

[stop] 'stop the backup timer
timer 0
#main.start "!enable"
#main.stop "!disable"
#main.statusLog "Stopping backup"
wait

[checkInitialFiles] 'snapshot of filenames and timestamps
return

[checkFiles] 'are there new files
#main.statusLog "tick"
'temporarily disable the timer
timer 0
'perform the check here
'reenable the timer
timer interval * 1000, [checkFiles]
wait

'return a true if the file in fullPath$ exists, else return false
function fileExists(fullPath$)
files pathOnly$(fullPath$), filenameOnly$(fullPath$), info$()
fileExists = val(info$(0, 0)) > 0
end function

'return just the directory path from a full file path
function pathOnly$(fullPath$)
pathOnly$ = fullPath$
while right$(pathOnly$, 1) <> "\" and pathOnly$ <> ""
pathOnly$ = left$(pathOnly$, len(pathOnly$)-1)
wend
end function

'return just the filename from a full file path
function filenameOnly$(fullPath$)
pathLength = len(pathOnly$(fullPath$))
filenameOnly$ = right$(fullPath$, len(fullPath$)-pathLength)
end function

1.1 - Introduction to Spring Framework part 1 - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=Jjp_EYEn4bcendofvid [starttext] Download the PPT and example code from http://www.java9s.com/spring-framework/spring-3.0/java-spring-introduction-tu...
This is an introduction to the Spring Framework, which was introduced by Rod Johnson. Spring offers dependency injection and aspect-oriented programming, and contributes to loose coupling and inversion of control. The tutorial gives an overview of how the Spring Framework contributes to loose coupling. Spring also has modules such as Spring JDBC and Spring JMS, which reduce the complexity of working with those APIs.
[endtext]

1.2 - Introduction to Spring Framework part 2 - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=mFDGhE0sKfYendofvid [starttext] Download the example code and PPT from http://www.java9s.com/spring-framework/spring-3.0/java-spring-introduction-tu...
This is an introduction to the Spring Framework, which was introduced by Rod Johnson. Spring offers dependency injection and aspect-oriented programming, and contributes to loose coupling and inversion of control. The tutorial gives an overview of how the Spring Framework contributes to loose coupling. Spring also has modules such as Spring JDBC and Spring JMS, which reduce the complexity of working with those APIs. Each session is explained with a Spring Framework example.
[endtext]

1.3 - Introduction to Spring Framework part 3 - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=d9kMhAEJTLEendofvid [starttext] Download the example code and PPT from http://www.java9s.com/spring-framework/spring-3.0/java-spring-introduction-tu...
This is an introduction to the Spring Framework, which was introduced by Rod Johnson. Spring offers dependency injection and aspect-oriented programming, and contributes to loose coupling and inversion of control. The tutorial gives an overview of how the Spring Framework contributes to loose coupling. Spring also has modules such as Spring JDBC and Spring JMS, which reduce the complexity of working with those APIs. Each session is explained with a Spring Framework example.
[endtext]

2 - Basic Bean Wiring - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=G3pCkkKA5F8endofvid [starttext] Download example code from http://java9s.com/spring-framework/spring-3.0/java-spring-basic-bean-wiring.html
This session elaborates on Spring basic bean wiring. In this video I explain how to inject values into a bean through setter methods and constructor arguments.
The video also covers Spring list configuration and configuring Set collections, Maps and Properties.

It also elaborates on how the Spring framework containers work and how the bean life cycle is managed. I first explain the Spring framework BeanFactory and then the ApplicationContext containers.
[endtext]

3 - Spring Auto Wiring - byType, byName, Constructor - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=ZlLDHblbZ5kendofvid [starttext]Download the example code and PPT from http://www.java9s.com/spring-framework/spring-3.0/java-spring-auto-wiring.html
This video explains the advantages of Spring autowiring and the different types of Spring autowiring: byName, byType and constructor.

Autowiring byName:
With this mode, the Spring framework looks at the properties of the target bean and checks whether any bean in the XML has the same name as a property. If one matches, Spring instantiates it and injects it into the target bean.
Autowiring byType:
With this mode, the Spring framework looks at the type of each property and checks whether there is a bean of the same type. If the type matches, it creates the bean and injects it into the target bean.

Autowiring by constructor:
With this mode, the Spring framework checks each argument of the constructor and checks whether a bean of the same type is present in the configuration. If present, Spring instantiates it and injects it through the constructor.
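The difference between matching by name and matching by type can be illustrated with a tiny hand-rolled registry. This is a sketch for intuition only: AutowireSketch and its methods are hypothetical and are not Spring classes or Spring's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class AutowireSketch {
    // Bean name -> bean instance, standing in for the beans declared in the XML.
    private final Map<String, Object> beans = new LinkedHashMap<>();

    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    // byName: a property is satisfied by the bean whose name equals the property name.
    Object byName(String propertyName) {
        return beans.get(propertyName);
    }

    // byType: a property is satisfied by the single bean assignable to its type.
    // More than one candidate is ambiguous, so it is reported as an error here.
    <T> T byType(Class<T> propertyType) {
        T match = null;
        for (Object bean : beans.values()) {
            if (propertyType.isInstance(bean)) {
                if (match != null) {
                    throw new IllegalStateException(
                            "More than one bean of type " + propertyType.getName());
                }
                match = propertyType.cast(bean);
            }
        }
        return match;
    }
}
```

Constructor autowiring can be thought of as applying the byType lookup to each constructor argument in turn.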
[endtext]

4 - Spring Method Injection and Bean Scope - Spring Framework Video Tutorial

http://www.youtube.com/watch?v=_IWXG3Y-qjAendofvid [starttext] This video tutorial elaborates on Spring Method Injection and Spring Bean Scopes. The Spring Framework has the feature of Method Injection to control which methods are executed and what return types are returned by the Spring Framework.
There are two types of Method Injection. The first is the Replace Method, using a Method Replacer, and the second is the Lookup Method, which is used to return a different return type than the conventional return type already defined in the class.
Regarding bean scope, I have discussed the importance of the Singleton and Prototype scopes. A bean defined with singleton scope has only one instance created for its entire life, while with prototype scope a new instance is created on every call, so n calls to the getBean() method of the Spring Framework create n instances.

Please remember that you need to have the CGLIB-NODEPS jar in the classpath to execute this example.
You can find the source code on the website http://java9s.com
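The singleton and prototype behaviour described above can be mimicked with a very small container. This is a simplified sketch and not Spring code: ScopeSketch, registerSingleton and registerPrototype are hypothetical names, and the singleton is simply created once at registration time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class ScopeSketch {
    private final Map<String, Object> singletons = new HashMap<>();
    private final Map<String, Supplier<Object>> prototypes = new HashMap<>();

    // Singleton scope: the factory runs once and the same instance is reused.
    void registerSingleton(String name, Supplier<Object> factory) {
        singletons.put(name, factory.get());
    }

    // Prototype scope: the factory is kept and run again on every lookup.
    void registerPrototype(String name, Supplier<Object> factory) {
        prototypes.put(name, factory);
    }

    Object getBean(String name) {
        if (singletons.containsKey(name)) {
            return singletons.get(name);
        }
        Supplier<Object> factory = prototypes.get(name);
        if (factory == null) {
            throw new IllegalArgumentException("No bean named " + name);
        }
        return factory.get();
    }
}
```

Calling getBean twice for a singleton name returns the same object, while two calls for a prototype name return two distinct objects.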
[endtext]

Introduction To Spring - Spring Video Tutorial - Spring Training

http://www.youtube.com/watch?v=Q6mz3lrZqs0endofvid [starttext]

Spring is one of the most popular frameworks outside of the standard. The basic idea that Spring promotes is inversion of control. Spring was first introduced by Rod Johnson in 2004. The Spring framework put the following principles forward as its mission statement. These statements do not look path-breaking now, as most of the newer generation frameworks have adopted these ideas, but they were path-breaking when first introduced, at a time when the world was struggling with the seemingly endless interfaces of EJB 2.1. Now the mission statements:

  • J2EE should be easier to use - Consider this statement against the amount of code you have to write in EJB 2.1, or just to execute a statement against a database.
  • It’s best to program to interfaces, rather than classes. Spring reduces the complexity cost of using interfaces to zero. - This is a basic programming principle and is actually at a higher level than Spring. It's about loose coupling and building services which are bound by a contract and not glued tightly to implementations.
  • JavaBeans offer a great way of configuring applications - Your class is a plain POJO and does not depend on the framework. This improves the testability of the program.
  • OO design is more important than any implementation technology. - Again this comes basically from EJB 2.1. The framework forces you to write so many things as part of the implementation that you start losing your grip on the business logic, which is better expressed with the OO paradigm.
  • Checked exceptions are overused in Java. A framework shouldn’t force you to catch exceptions you’re unlikely to be able to recover from. - Checked exceptions are a nuisance most of the time. Ask yourself whether you have ever done anything useful with a SQLException when dealing with JDBC code.
  • Testability is essential, and a framework such as Spring should help make your code easier to test. - This again comes from loose coupling between the pieces of the program, so that they can be tested independently.

Before we move further, download the Spring framework from http://www.springsource.org/download. Download spring-framework*-with-dependencies.zip, which also includes all the dependent libraries. When you unzip it, you will find the following important directories.

  • dist - Contains spring framework.jar
  • docs - Contains documents.
  • lib - Contains all dependent libraries.
  • src - Contains source code.

Spring Hello World Program


Let's write a simple Hello World program the Spring way:

Add the following jars to the library path:

  • dist/spring.jar
  • lib/log4j/log4j-1.2.14.jar
  • lib/jakarta-commons/commons-logging.jar

Write a HelloWorld bean which has a printMessage method:


public class HelloWorld {
    public void printMessage() {
        System.out.println("Hello World");
    }
}


Write the Spring configuration file. We will name it context.xml, although the name can be anything. Put the file somewhere on the classpath.





Write the main program

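import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class Main {
    public static void main(String[] args) {
        String[] files = {"context.xml"};

        //Start the context of Spring
        ApplicationContext appContext = new ClassPathXmlApplicationContext(files);

        //Get the HelloWorld bean from the Spring context
        HelloWorld helloWorld = (HelloWorld)appContext.getBean("helloWorld");
        helloWorld.printMessage();
    }
}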


If you look at the main program, we are not creating the HelloWorld object ourselves; instead we ask the Spring factory to return an instance of HelloWorld. This is the most important aspect of Spring: the ability to create objects. Spring is essentially a factory which creates objects for us.

Let's now say that we want to decouple the functionality of providing message to HelloWorld and pass that responsibility to a different class called HelloWorldMessage.
HelloWorldMessage:

public class HelloWorldMessage {
    public String getMessage() {
        return "HelloWorld";
    }
}

And our HelloWorld class would look like

public class HelloWorld {
    private HelloWorldMessage message;

    public void setMessage(HelloWorldMessage message) {
        this.message = message;
    }

    public void printMessage() {
        System.out.println(message.getMessage());
    }
}

Now HelloWorld has a dependency relationship and it needs an object of HelloWorldMessage. For that we need to modify the context.xml








The main class remains the same. What we did here is use Spring to build the relationship. This is another important aspect of Spring: wiring the relationship.

To reiterate, the two important things that Spring brings to the table are:

  • Ability to create objects.
  • Ability to build the relationship between objects.

The corollary is that if you are creating object instances yourself or building relationships yourself, then you are not using Spring effectively.

Please follow the video to see how to build the application:

These are the basic premises of Inversion of Control (IoC) or Dependency Injection (DI). We will not get into a theoretical debate here. The important thing to understand is that the control of building objects and wiring relationships is delegated to the environment. You as an application developer go to the Spring container and ask for the object you need to do your work. The object is handed to you already created, with all its relationships set.
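To make the two responsibilities concrete, here is roughly what the container does on our behalf for the HelloWorld example, written out by hand. It is a sketch under assumptions: the XML shown in the comment is only the likely shape of the wiring (the bean id helloWorldMessage is hypothetical, only helloWorld appears in the main program), and HandWiringSketch is a made-up name.

```java
// Assumed shape of the bean definitions in context.xml
// (the id "helloWorldMessage" is a hypothetical choice):
//   <bean id="helloWorldMessage" class="HelloWorldMessage"/>
//   <bean id="helloWorld" class="HelloWorld">
//       <property name="message" ref="helloWorldMessage"/>
//   </bean>

class HelloWorldMessage {
    public String getMessage() {
        return "HelloWorld";
    }
}

class HelloWorld {
    private HelloWorldMessage message;

    public void setMessage(HelloWorldMessage message) {
        this.message = message;
    }

    public void printMessage() {
        System.out.println(message.getMessage());
    }
}

class HandWiringSketch {
    // What the container does for us: create both objects, then call the
    // setter that matches the "message" property to wire them together.
    static HelloWorld wire() {
        HelloWorldMessage message = new HelloWorldMessage(); // create
        HelloWorld helloWorld = new HelloWorld();            // create
        helloWorld.setMessage(message);                      // wire
        return helloWorld;
    }
}
```

With Spring in place, none of the code in wire() lives in the application; the main program only asks the container for the finished helloWorld bean.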
[endtext]

SEO tutorials - meta tags - SEO Video Tutorial

http://www.youtube.com/watch?v=eaa_P9ls9a8endofvid [starttext] This video tutorial covers the SEO book, video and tools from www.tutorials-seo.com
Meta tags
[endtext]

Learn how to develop for Android, Beyond HelloWorld - Android Video Tutorial

http://www.youtube.com/watch?v=rm-hNlTD1H0endofvid [starttext] In this video tutorial, Marko Gargenta delivers a tutorial-style talk on how to develop for Google's Android platform (beyond the HelloWorld) at the San Francisco Android User Group.
** Get the source code at: http://marakana.com/forums/android/general/23.html **

Head to http://marakana.com to see more educational videos on Android and open source development.

Organized and Sponsored by Marakana
[endtext]

Javascript course - Javascript Video Tutorial - Java Script training video

http://www.youtube.com/watch?v=uUhOEj4z8Foendofvid [starttext] This video tutorial is part of the Lecture Series on Internet Technologies by Prof. I. Sengupta, Department of Computer Science Engineering, IIT Kharagpur. For more details on NPTEL visit http://nptel.iitm.ac.in
[endtext]

Javascript Examples (Continued) - Javascript Video Tutorial

http://www.youtube.com/watch?v=3uxp7mqUIfkendofvid [starttext] This video tutorial is part of the Lecture Series on Internet Technologies by Prof. I. Sengupta, Department of Computer Science Engineering, IIT Kharagpur. For more details on NPTEL visit http://nptel.iitm.ac.in
[endtext]

How to change a GWT's module name and update the run configuration in Eclipse - GWT Video Tutorial

http://www.youtube.com/watch?v=UW4WSYs1bKEendofvid [starttext] This video tutorial explains how to rename a GWT module in Eclipse
[endtext]

Switch Case using string

When I started thinking of this example, I also faced some of the other problems that I have encountered in my initial days of C++. One of these problems was how to convert String to multiple Ints and how to split strings easily, etc. Hopefully you will find the example useful.



//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include<iostream>
#include<string>
#include<sstream>
#include<map>

using namespace std;

void split(const string& input, int& num1, int& num2, string& operation);
int calculate(int num1, int num2, string operation);

int main()
{
    string s[5] = {"2 + 2", "4 - 1", "4 * 6", "18 / 3", "12 ^ 2"};

    for (int i = 0; i < 5; i++)
    {
        int num1 = 0, num2 = 0;
        string op;
        split(s[i], num1, num2, op);
        int retVal = calculate(num1, num2, op);
        cout << s[i] << " = " << retVal << endl;
    }

    return 0;
}

void split(const string& input, int& num1, int& num2, string& operation)
{
    istringstream iss(input);
    iss >> num1;
    iss >> operation;
    iss >> num2;
}

int calculate(int num1, int num2, string operation)
{
    enum Operators {unknown, add, sub, mul, div};
    map<string, Operators> mapOfOperators;
    mapOfOperators["+"] = add;
    mapOfOperators["-"] = sub;
    mapOfOperators["*"] = mul;
    mapOfOperators["/"] = div;

    switch (mapOfOperators[operation])
    {
        case add:
            return num1 + num2;
        case sub:
            return num1 - num2;
        case mul:
            return num1 * num2;
        case div:
            return num1 / num2;
        default:
            cout << "Unrecognised Operator " << operation << endl;
    }

    return 0;
}


The output is as follows:

2 + 2 = 4
4 - 1 = 3
4 * 6 = 24
18 / 3 = 6
Unrecognised Operator ^
12 ^ 2 = 0

See Also: Codeguru article on 'Switch on Strings in C++'.

Removing all white-spaces from a string

Continuing on the same theme as the last post. What if all the spaces need to be stripped out from the input string:



//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include<iostream>
#include<string>

using namespace std;

string removeAllSpaces(const string& s)
{
    string newStr(s);
    bool spacesLeft = true;

    while (spacesLeft)
    {
        //find() returns string::npos when no space is left, so keep the
        //position in the string's own size type rather than in an int
        string::size_type pos = newStr.find(" ");
        if (pos != string::npos)
        {
            newStr.erase(pos, 1);
        }
        else
            spacesLeft = false;
    }

    return newStr;
}

int main()
{
    string aString("This string has multiple spaces problem!");

    cout << "Original : " << aString << endl;
    cout << "Modified : " << removeAllSpaces(aString) << endl;

    return 0;
}





The output is as follows:

Original : This string has multiple spaces problem!
Modified : Thisstringhasmultiplespacesproblem!

Automatic file backup - loading setup

Now that we've got a simple GUI designed, let's write some code which will load the files we want to backup, the destination path, and the interval of our automatic backup.

We need a subroutine to load the setup which we will call using GOSUB. We can call this right after we open the window.

open "Backup Utility" for window_nf as #main
gosub [loadSetup]
wait


We want to know if the setup file exists. There is an example program called fileExists.bas that comes with the functions we need to check for file existence. We'll just grab those. Here they are:

'return a true if the file in fullPath$ exists, else return false
function fileExists(fullPath$)
files pathOnly$(fullPath$), filenameOnly$(fullPath$), info$()
fileExists = val(info$(0, 0)) > 0
end function

'return just the directory path from a full file path
function pathOnly$(fullPath$)
pathOnly$ = fullPath$
while right$(pathOnly$, 1) <> "\" and pathOnly$ <> ""
pathOnly$ = left$(pathOnly$, len(pathOnly$)-1)
wend
end function

'return just the filename from a full file path
function filenameOnly$(fullPath$)
pathLength = len(pathOnly$(fullPath$))
filenameOnly$ = right$(fullPath$, len(fullPath$)-pathLength)
end function


We will call the fileExists( ) function from our [loadSetup] subroutine. A really simple example of the data in our setup file would have a list of file paths, an end! marker for the end of that list of files, a single line with the desired destination path, and another line with the interval in seconds between backup attempts.

Example backupSetup.ini
c:\myfolder\test.txt
c:\myfolder\backMeUp.dat
c:\myfolder\SillyPutty.exe
end!
c:\backupFolder\files
5
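
The layout of this file is simple enough that the parsing logic can be shown in isolation. The following Java sketch is only an illustration of the format (the BackupSetup class and its field names are hypothetical, and the tutorial's own loader is the Liberty BASIC [loadSetup] routine): it reads paths until the end! marker, then the destination, then the interval.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class BackupSetup {
    final List<String> files = new ArrayList<>();
    String destination = "";
    int intervalSeconds = 0;

    // Parse the lines of a setup file: file paths, the "end!" marker,
    // the destination folder, and the interval in seconds.
    static BackupSetup parse(List<String> lines) {
        BackupSetup setup = new BackupSetup();
        Iterator<String> it = lines.iterator();
        while (it.hasNext()) {
            String line = it.next();
            if (line.equals("end!")) {
                break; // the marker itself is not a file to back up
            }
            setup.files.add(line);
        }
        if (it.hasNext()) {
            setup.destination = it.next();
        }
        if (it.hasNext()) {
            setup.intervalSeconds = Integer.parseInt(it.next().trim());
        }
        return setup;
    }
}
```

Note that the marker line is consumed but never added to the list of files, which is the same check the loader has to make before writing a line into the listOfFiles texteditor.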


Once we know the file exists we can open it up and read it, placing the information into the different fields in our GUI.

[loadSetup]
#main.listOfFiles "!cls";
if fileExists(setupPath$) then
open setupPath$ for input as #setup
while filename$ <> "end!"
line input #setup, filename$
if filename$ <> "end!" then
#main.listOfFiles filename$
end if
wend
line input #setup, destination$
#main.destination destination$
line input #setup, interval
#main.interval interval
close #setup
end if
return


Here is the complete listing so far:

dim info$(10,10)
setupPath$ = DefaultDir$+"\backupsetup.ini"

WindowWidth = 560
WindowHeight = 460
statictext #main, "Files to backup:", 5, 5, 94, 20
texteditor #main.listOfFiles, 5, 26, 530, 95
statictext #main, "Destination folder:", 5, 132, 107, 20
textbox #main.destination, 115, 127, 420, 25
statictext #main, "Backup interval in seconds:", 5, 157, 163, 20
textbox #main.interval, 170, 152, 100, 25
button #main.save,"Save",[save], UL, 495, 152, 42, 25
button #main.start,"Start",[start], UL, 5, 187, 75, 25
button #main.stop,"Stop",[stop], UL, 90, 187, 70, 25
statictext #main, "Backup status log", 5, 217, 106, 20
texteditor #main.statusLog, 5, 237, 530, 160
menu #main, "Edit"
open "Backup Utility" for window_nf as #main
gosub [loadSetup]
wait

[loadSetup]
#main.listOfFiles "!cls";
if fileExists(setupPath$) then
open setupPath$ for input as #setup
while filename$ <> "end!"
line input #setup, filename$
if filename$ <> "end!" then
#main.listOfFiles filename$
end if
wend
line input #setup, destination$
#main.destination destination$
line input #setup, interval
#main.interval interval
close #setup
end if
return

'return a true if the file in fullPath$ exists, else return false
function fileExists(fullPath$)
files pathOnly$(fullPath$), filenameOnly$(fullPath$), info$()
fileExists = val(info$(0, 0)) > 0
end function

'return just the directory path from a full file path
function pathOnly$(fullPath$)
pathOnly$ = fullPath$
while right$(pathOnly$, 1) <> "\" and pathOnly$ <> ""
pathOnly$ = left$(pathOnly$, len(pathOnly$)-1)
wend
end function

'return just the filename from a full file path
function filenameOnly$(fullPath$)
pathLength = len(pathOnly$(fullPath$))
filenameOnly$ = right$(fullPath$, len(fullPath$)-pathLength)
end function

Check out this stream