String Shift and Print

After my last blog post, someone pointed out a similar problem here. The problem goes:

Write a program that generates from the string "abcdefghijklmnopqrstuvwxyz{" the following:
a
bcb
cdedc
defgfed
efghihgfe
fghijkjihgf
ghijklmlkjihg
hijklmnonmlkjih
ijklmnopqponmlkji
jklmnopqrsrqponmlkj
klmnopqrstutsrqponmlk
lmnopqrstuvwvutsrqponml
mnopqrstuvwxyxwvutsrqponm
nopqrstuvwxyz{zyxwvutsrqpon

Well there is no time limit this time and we dont even need to use google. I think this is slightly trickier than the last one.

Here is my solution:



//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy

#include<iostream>
#include<string>

using namespace
std;

int
main()
{

string str = "abcdefghijklmnopqrstuvwxyz{";
unsigned
strlen = str.length();
string tempStr;
int
loop = 0;
while
(tempStr.length() < strlen)
{

int
start1 = loop;
int
count1 = loop + 1;
tempStr = str.substr(start1,count1);

count1--;
for
(; count1 > 0; count1--)
tempStr.append(str.substr(start1+count1-1, 1));

cout<<tempStr;
cout<<endl;
loop++;
}

return
0;
}




ASCII Print

Few weeks back, someone I know got asked the following question in an interview:

Write a C++ program in 15 minutes to print the following in the output:
a
bcb
cdedc
defgfed
efghihgfe
fghijkjihgf
ghijklmlkjihg
hijklmnonmlkjih
ijklmnopqponmlkji
jklmnopqrsrqponmlkj
klmnopqrstutsrqponmlk
lmnopqrstuvwvutsrqponml
mnopqrstuvwxyxwvutsrqponm
nopqrstuvwxyz{zyxwvutsrqpon

You can do search on the web for any information but all your searches would be logged and used to assess your answer.

Well my obvious guess would be to take help of ASCII table. My only search was for Ascii table in google. I did try and make it in less than 15 mins. My solution is below but do try it on your own before looking at my solution.


//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include<iostream>
#include<string>

using namespace
std;

int
main()
{

char
c=(char)97; //'a' in ASCII
char loop = 0;
while
(c <= 'n')
{

char
x = 0;
for
(; x <=loop; x++)
{

cout<<char(c+x);
}

for
(char y = 1; y < x; y++)
{

cout<<char(c+x-y-1);
}

cout<<endl;
c++;
loop++;
}


return
0;
}



Conditional (Ternary) Operator in c++ [? :]

Conditional (Ternary) Operator in c++ [? :]


The condional operator can often be used instead of the if else statement. Since it is the only operator that requires three operands in c++, it is also called ternary operator.

For example, consider the assignment statement :
x = y > 3 ? 2 : 4;
If y is greater than 3 then 2 will be assigned to variable x or else the value 4 will be assigned to x.

/* A simple c++ program example to demonstrate the use of ternary operator. */


// Program 1


#include <iostream>
#include <iomanip>

using namespace std;

int main ()
{
    int first, second;

    cout << "Enter two integers." << endl;

    cout << "First" << setw (3) << ": ";
    cin >> first;

    cout << "Second" << setw (2) << ": ";
    cin >> second;

    string message = first > second ? "first is greater than second" :
                                "first is less than or equal to second";

    cout << message << endl;

    return 0;
}


Compare the above "Program 1" and below "Program 2" with "Program 2" and "Program 3" of Selection Statement (if-else if-else) in c++ respectively, for better understanding.

/* A c++ program example to demonstrate the use ternary operator.*/


// Program 2


#include <iostream>
#include <iomanip>

using namespace std;

int main ()
{
    int first, second;

    cout << "Enter two integers." << endl;

    cout << "First" << setw (3) << ": ";
    cin >> first;

    cout << "Second" << setw (2) << ": ";
    cin >> second;

    string message = first > second ? "first is greater than second" :
                                first < second ? "first is less than second" :
                                "first and second are equal";

    cout << message << endl;

    return 0;
}

Difference between Deep copy and Shallow copy in C++

So what is the difference between 'Deep copy' (sometimes referred to as 'Hard copy') and 'Shallow copy' in C++

Shallow copy: Copies the member values from one object into another.

Deep Copy: Copies the member values from one object into another. Any pointer objects are duplicated and Deep Copied.

There are some interesting discussions on Stack Overflow here and here. A related discussions here is interesting as well.

Program to demonstrate as follows:


//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include<iostream>
#include<string>

using namespace
std;

class
MyString
{

public
:
MyString(){}
MyString(string str)
{

size = str.size();
data = new char(size+1); //+1 for '\0'
memcpy(data, str.c_str(), size+1);
}

//MyString(const MyString& copy); - Use default for Shallow
void DeepCopy(const MyString& copy) //Deep Copy
{
size = copy.size;
data = new char(size+1); //+1 for '\0'
memcpy(data, copy.data, size+1);
}

int
size;
char
* data;
};


int
main()
{

MyString m1("Zahid");
cout<<"\nm1.size = "<<m1.size<<", &m1.data = "<<(void *)m1.data<<", *m1.data = "<<m1.data<<endl;

MyString m2(m1); //Uses the default copy constructor - Shallow Copy
cout<<"\nm2.size = "<<m2.size<<", &m2.data = "<<(void *)m2.data<<", *m2.data = "<<m2.data<<endl;

MyString m3; //Another default constructor
m3.DeepCopy(m1);
cout<<"\nm3.size = "<<m3.size<<", &m3.data = "<<(void *)m3.data<<", *m3.data = "<<m3.data<<endl;

return
0;
}


Output as follows:


BI at large scale

As more and more data being collected everywhere from pretty much everything a user do, such as transactions activities, social interactions, information search ... enterprises has been actively looking into ways to turn these vast amount of raw data into useful information.

BI process flow

It include the following stages of processing
  1. ETL: Extract operational data (inside enterprise or external sources) into data warehouse (typically organized in Star/Snowflake schema with Fact and Dimension tables).
  2. Data exploration: Get insight into data using simple visualization tools (e.g. histogram, summary statistics) or sophisticated OLAP tools (slice, dice, rollup, drilldown)
  3. Report generation: Produce executive reports
  4. Data mining: Extract patterns of the underlying data to form models (e.g. bayesian networks, linear regression, neural networks, decision trees, support vector machines, nearest neighbors, association rules, principal component analysis)
  5. Feedback: The model will be used to assist business decision making (predicting the future)
The gap of processing BIG data
Many data mining and machine learning algorithms are available in both commercial packages (e.g. SAS, SPSS) as well as open source libraries (e.g. Weka, R). Nevertheless, most of these ML algorithms implementation are based on fitting al data in memory and not designed to process big data (e.g. Tera byte data volume).

On the other hand, massively parallel processing platform such as Hadoop, Map/Reduce, over the last few years, has been proven in processing Terabyte or even Petabyte range of data. Although many sequential algorithm can be restructured to run in map reduce, including a big portion of machine learning algorithm, there isn't a corresponding parallel implementation of ML available in massively parallel form.

Approach 1: Apache Mahout
One approach is to "re-implement" the ML algorithm in Map/Reduce and this is the path of Apache Mahout project. Mahout seems to have implemented an impressive list of algorithms although I haven't used them for my projects yet.

Approach 2: Ensemble of parallel independent learners
This is an alternative path that doesn't require re-implementation of existing algorithms. It works in the following way.
  1. Draw samples from the Big data into many sample data sets, which can fit into the memory of a single, individual learner.
  2. Assign each sample data set to an individual learner, who use existing algorithms to learn the model. After learning, each individual learner keep their own learned model
  3. When a decision / prediction request is received, each individual learner will come up with its own prediction and then combine their results in some ways. (e.g. for classification task, the learners will vote for the predicted class and the majority wins. for regression, the average of the estimate values will be used to predict the output value)

I also found this approach can smoothly fade out outdated model. As user's behavior may change over time, same happens to the validity of a learned model. With this ensemble approach, I can have multiple learners each learn their model periodically. Everytime when a prediction is needed, I will pick the latest k models and combine the final prediction based on a time-decayed weighted voting model. Outdated model will automatically slide out the k-size window automatically.

One gotchas of sampling approach is the handling of rare events (since you may lost those rare events in sampling). In this case, stratified sampling (instead of simple random sampling) should be used.

Difference between 'new', 'new operator' and 'operator new' in C++

A question asked in a forum said:

Why does void* p = new (1024); gives me a compilation error while void* p = operator new (1024); works. What is the difference between new and "operator new"?

Lets go back to the beginning. Once upon a time there was this C language that used 'malloc' to allocate memory. Then when C++ came, a new way was defined of allocating memory by the use of 'new'. Its supposed to be much safter and better way but some software gurus may differ. Memory for 'new' is allocated from 'Free Store' and memory by 'malloc' is generated from 'heap'. The 'heap' and 'Free Store' may be the same area and is a compiler implementation detail.

The syntax for 'operator new' is:
void* new (std::size_t size);

All it does is allocate a memory of size specified

On the other hand, 'new' does 2 things:
1. It calls 'operator new'
2. It calls constructor for the type of object

In the above case since 1024 is not a type, it will fail in calling its constructor.

'new' is also referred to as 'keyword new' or 'new operator' to cause more confusion :)

Here is a small example to play with 'operator new' after the example you will realise that regardless of what is passed, it always allocates a 4 byte memory.



//Program tested on Microsoft Visual Studio 2008 - Zahid Ghadialy
#include<iostream>

using namespace
std;

int
main()
{

//void* p = new (1024);

void
* p = operator new (1024);

//Lets test if we can do anything with the memory
cout<<"p = "<<p<<endl;
cout<<"sizeof(p) = "<<sizeof(p)<<endl;
int
*x = static_cast<int *>(p);
cout<<"*x before = "<<*x<<endl;
*
x = 23456;
cout<<"*x after = "<<*x<<endl;

void
* q = operator new (0);
cout<<"\nq = "<<q<<endl;
cout<<"sizeof(q) = "<<sizeof(q)<<endl;
x = static_cast<int *>(q);
cout<<"*x before = "<<*x<<endl;
*
x = 65432;
cout<<"*x after = "<<*x<<endl;

void
* z = operator new ('a');
cout<<"\nz = "<<z<<endl;
cout<<"sizeof(z) = "<<sizeof(z)<<endl;
x = static_cast<int *>(z);
cout<<"*x before = "<<*x<<endl;
*
x = 11111;
cout<<"*x after = "<<*x<<endl;

return
0;
}


The output is as follows:


My knowledge in this area is quite limited so please feel free to improve on my explanation or correct it.

The switch statement in c++

The switch statement in c++


The program that we create should be readable. To increase the readability of the program we should use tools that is simple to read and understand. When possible use switch statement rather than if else statement, as it can be more readable than if else statement. But switch statement has limitation. It can't replace if else completely but can be helpful at certain situation. It can't do everything thing that if else statement can do. For example, switch statement can take only int or char datatype in c++. The following programs will help you to understand the switch statement.

/* A simple c++ program example that demonstrate the use of switch statement in c++ by taking character input.*/


// Program 1


#include <iostream>

using namespace std;

int main ()
{
    char permit;

    cout << "Are you sure you want to quit? (y/n) : ";
    cin >> permit;

    switch (permit)
    {
        case 'y' :
            cout << "Hope to see you again!" << endl;
            break;
        case 'n' :
            cout << "Welcome back!" < < endl;
            break;
        default:
            cout << "What? I don't get it!" << endl;
    }

    return 0;
}

/* A c++ program example that demonstrate the use of switch statement in c++ by taking integer input. */


// Program 2

#include <iostream>
#include <iomanip>

using namespace std;

int main ()
{
    const int CHEESE_PIZZA = 11;
    const int SPINACH_PIZZA = 13;
    const int CHICKEN_PIZZA = 14;

    cout << " *********** MENU ***********" << endl;
    cout << setw (9) << "ITEM" << setw (20) << "PRICE" << endl;
    cout << " (1) Cheese Pizza" << setw (8) << "$"
            << CHEESE_PIZZA << endl;
    cout << " (2) Spinach Pizza" << setw (7) << "$"
            << SPINACH_PIZZA << endl;
    cout << " (3) Chicken Pizza" << setw (7) << "$"
            << CHICKEN_PIZZA << endl;
    cout << endl;

    cout << "What do you want? ";
    int option;
    cin >> option;

    cout << "How many? ";
    int quantity;
    cin >> quantity;

    int price;

    switch (option)
    {
        case 1:
            price = CHEESE_PIZZA;
            break;
        case 2:
            price = SPINACH_PIZZA;
            break;
        case 3:
            price = CHICKEN_PIZZA;
            break;
        default:
            cout << "Please select valid item from menu. " << endl;
            return 1;
    }

    int amount = price * quantity;
    cout << "Your Bill: $ " << amount << endl;

    return 0;
}


Explanation for the above program:


In the above program we take an integer value from the user which is stored in 'option' variable. We pass this value to switch statement. The switch statement has 3 cases: case 1, case 2 and case 3. The case 1: is similar to if (option == 1). This is the advantage of switch statement over if else statement. You don't need to type the name of variable again and again if you are doing selection operation on same variable. You just put the variable name on switch statement and then just specify the value after 'case'. One more thing to be noted is that it requires 'break' statement at the end of each 'case'. If you remove the break statement then it will jump to the case that follows it. Try it and check by yourself. The 'default' is same as else in if else statement.

Check out this stream