Posted by: zyxo | February 21, 2010

Data mining at a higher level

Data mining at a higher level?
What is this higher level, and what was the “normal” level?

The classic commercial data mining approach, the “normal” level, goes as follows:

  • take your historical data
  • identify the customers who did something (e.g. bought product xyz) and the ones who did not.
  • make a model to distinguish between the do-ers and the non-do-ers.
  • use that model to calculate for each of your identified customers the probability that they will do that something.
  • contact those with the highest probability in some marketing campaign
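
The steps above can be sketched in a few lines. This is only an illustration under made-up assumptions: the customer records, the single `visits` input and the tiny gradient-descent logistic regression are all hypothetical stand-ins for whatever data and modeling tool you actually use.

```python
import math

# Hypothetical historical data: (customer_id, visits_last_month, bought_xyz)
history = [
    ("c1", 0, 0), ("c2", 1, 0), ("c3", 2, 0), ("c4", 3, 0),
    ("c5", 4, 1), ("c6", 5, 0), ("c7", 6, 1), ("c8", 8, 1),
]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(rows, steps=5000, rate=0.05):
    """Fit p(buy) = sigmoid(a + b * visits) by plain gradient descent."""
    a, b = 0.0, 0.0
    n = len(rows)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for _, x, y in rows:
            err = sigmoid(a + b * x) - y
            grad_a += err
            grad_b += err * x
        a -= rate * grad_a / n
        b -= rate * grad_b / n
    return a, b

def score(model, x):
    """Predicted purchase probability for a customer with x visits."""
    a, b = model
    return sigmoid(a + b * x)

# Model that separates the do-ers from the non-do-ers.
model = fit_logistic(history)

# Score every customer and contact the top 25% by predicted probability.
scored = sorted(history, key=lambda r: score(model, r[1]), reverse=True)
n_contact = max(1, len(scored) // 4)
to_contact = [cid for cid, _, _ in scored[:n_contact]]
print(to_contact)
```

In practice the model would be built on many inputs and validated on held-out data; the point here is only the flow from history to model to score to selection.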

You can find more on this “normal” approach in my previous posts :
data mining for marketing campaigns : interpretation of lift
Mining highly imbalanced data sets with logistic regressions
Howmany inputs do data miners need ?
Classification, Prior Probabilities and Soft Metrics
data mining with decision trees : what they never tell you

But there are better ways, which need a higher level of data mining:

  • estimation of the $ purchase amount
  • net lift of purchase probability
  • net lift of $ purchase amount
  • the above combined
  • optimal pricing modeling

Estimation of the $ purchase amount

This is a straightforward one. Instead of modeling only the probability of purchase, you can also model the expected purchase amount. A simple multivariate linear regression can probably do the trick (there are also possibilities with decision trees; I will write about that later). When selecting the people to target in your marketing campaign, you simply multiply the estimated absolute purchase probability by the estimated purchase amount. That way you contact the people you expect to spend the most in your stores.
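
As a small illustration of that multiplication, assume the two models have already produced a purchase probability and an expected amount per customer. The names and numbers below are invented for the example.

```python
# Hypothetical model outputs per customer:
# (estimated purchase probability, estimated purchase amount in $)
predictions = {
    "anna": (0.10, 40.0),
    "bob": (0.30, 10.0),
    "carla": (0.05, 200.0),
    "dave": (0.20, 25.0),
}

def expected_value(prob, amount):
    """Expected spend = purchase probability times purchase amount."""
    return prob * amount

# Rank customers by expected spend, highest first.
ranked = sorted(predictions,
                key=lambda c: expected_value(*predictions[c]),
                reverse=True)

for customer in ranked:
    print(customer, round(expected_value(*predictions[customer]), 2))
```

Note that bob has the highest purchase probability but ends up last, while carla, with the lowest probability, comes first because of her large expected amount.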

Net lift of purchase probability

This involves a lot more effort. You have to do it in three big steps:

  1. First step: organize a marketing campaign in which you contact (by e-mail, snail mail or personalized banner) a sufficiently large random target group of your customers. Be sure to keep an equally large control group, chosen just as randomly.
  2. Second step (the modeling step): build two “normal level” models for calculating the purchase probability. For the first one you use only the targeted people (I call it the targeted model); for the second one you use only the non-targeted people (the control model).
  3. Third step (the scoring step): for each customer in your database, calculate the purchase probability twice, once with each of the two new models. Since the models are different, the two probabilities for a single customer will differ. What you are looking for is the customers for whom the probability from the targeted model is a lot higher than the probability from the control model. For them, the e-mail or the banner raises the purchase probability. What you really model this way is the campaign effect: it gives you a means of selecting the people on whose purchasing behavior your campaign will have the biggest effect.
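
The three steps can be sketched as follows. To keep the example short, each "model" is simply the observed purchase rate per customer segment; that stand-in, the segment names and the campaign results are all assumptions for illustration, and a real project would use proper predictive models here.

```python
from collections import defaultdict

# Step 1. Hypothetical test-campaign results: (segment, was_targeted, purchased)
campaign = [
    ("young", True, 1), ("young", True, 1), ("young", True, 0), ("young", True, 0),
    ("young", False, 0), ("young", False, 0), ("young", False, 1), ("young", False, 0),
    ("senior", True, 1), ("senior", True, 1), ("senior", True, 1), ("senior", True, 0),
    ("senior", False, 1), ("senior", False, 1), ("senior", False, 1), ("senior", False, 0),
]

def fit_rate_model(rows):
    """Stand-in 'model': observed purchase rate per segment."""
    bought, total = defaultdict(int), defaultdict(int)
    for segment, purchased in rows:
        bought[segment] += purchased
        total[segment] += 1
    return {s: bought[s] / total[s] for s in total}

# Step 2. One model on the targeted people, one on the control group.
targeted_model = fit_rate_model([(s, y) for s, t, y in campaign if t])
control_model = fit_rate_model([(s, y) for s, t, y in campaign if not t])

def net_lift(segment):
    """Targeted-model probability minus control-model probability."""
    return targeted_model[segment] - control_model[segment]

# Step 3. Score every customer with both models and compare.
customers = {"erin": "young", "frank": "senior"}
for name, segment in customers.items():
    print(name, targeted_model[segment], control_model[segment], net_lift(segment))
```

In this made-up data the seniors buy more often than the young customers, but they do so whether contacted or not, so their net lift is zero. The young customers are the ones the campaign actually moves.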

Net lift of $ purchase amount

This works the same way as the net lift of purchase probability above, except that here you model the campaign's impact on the $ purchase amount of your customers.
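
A minimal sketch of the same two-model idea for amounts, assuming a single input (past yearly spend) and a hand-written least-squares line as the regression. The data points are invented and deliberately lie exactly on two lines so the result is easy to check.

```python
def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def predict(model, x):
    intercept, slope = model
    return intercept + slope * x

# Hypothetical (past yearly spend in $, spend during the campaign in $) pairs
targeted = [(100, 30), (200, 50), (300, 70), (400, 90)]
control = [(100, 20), (200, 30), (300, 40), (400, 50)]

targeted_model = fit_line(targeted)  # amount model on contacted customers
control_model = fit_line(control)    # amount model on the control group

def net_amount_lift(past_spend):
    """Predicted $ amount if contacted minus predicted $ amount if not."""
    return predict(targeted_model, past_spend) - predict(control_model, past_spend)

for spend in (100, 500):
    print(spend, round(net_amount_lift(spend), 2))
```

Here the estimated campaign effect on the amount grows with past spend, so heavier past spenders would be the more attractive targets.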

The above combined

Obviously this is just … the above combined: you try to maximize
( net increase of purchase probability × net increase of purchase amount ) – contact cost
summed over all the customers in your campaign target group.
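
A sketch of that selection, assuming the two net lift estimates per customer are already available and that customers can be chosen independently of each other. The contact cost, names and numbers are hypothetical.

```python
CONTACT_COST = 0.50  # hypothetical cost per contact, in $

# Hypothetical per-customer estimates:
# (net increase of purchase probability, net increase of purchase amount in $)
estimates = {
    "gina": (0.10, 20.0),
    "hugo": (0.02, 10.0),
    "ines": (0.05, 30.0),
    "jack": (-0.01, 5.0),
}

def campaign_value(prob_lift, amount_lift, cost=CONTACT_COST):
    """The per-customer term of the sum: lift product minus contact cost."""
    return prob_lift * amount_lift - cost

# The sum is largest when it contains only the customers whose term is positive.
target_group = [c for c, (p, a) in estimates.items() if campaign_value(p, a) > 0]
total = sum(campaign_value(*estimates[c]) for c in target_group)

print(target_group, round(total, 2))
```

With a fixed campaign budget you would instead sort by `campaign_value` and contact from the top down until the budget runs out.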

That leaves us with one more: optimal pricing modeling

Figure: optimal price chart (price versus probability to buy)

This one works in much the same way as the net lift modeling of the purchase probability. There we calculated the probability differences between the target group and the control group. For optimal pricing we also need a first campaign to collect the data, and then we build several models, each on a different target group. The difference between the target groups is, well, the price. So we need a business where we can easily ask different prices from different customers for the same service or item. Say we ask 5$ for the item in target group A, 5.5$ in target group B and 6$ in target group C.
After this test campaign we build three models: the 5$ model, the 5.5$ model and the 6$ model.
With these models we calculate three purchase probabilities for each customer. The price we ask a customer for our item should be the one for which the price times his calculated purchase probability is the highest. If John has a probability of 7% of purchasing at 5$, 6.7% at 5.5$ and 5.5% at 6$, you get 0.35, 0.3685 and 0.33 respectively. So you should try to sell your item to John at a price of 5.5$: each “John” will pay you on average 5.5$ × 6.7% = 0.3685$.
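
John's case can be checked with a few lines. The probabilities are the ones from the example above; the function name `best_price` is just a label chosen for this sketch.

```python
# John's purchase probabilities from the three price models (numbers from the text)
john = {5.0: 0.07, 5.5: 0.067, 6.0: 0.055}

def best_price(prob_by_price):
    """Return the (price, expected revenue) pair with the highest price * probability."""
    return max(((price, price * prob) for price, prob in prob_by_price.items()),
               key=lambda pair: pair[1])

# Expected revenue at each tested price.
for price, prob in john.items():
    print(price, round(price * prob, 4))

price, revenue = best_price(john)
print("ask John:", price, round(revenue, 4))  # ask John: 5.5 0.3685
```

Only the prices that were actually tested in the campaign can be compared this way; the sketch does not interpolate between them.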
