You are lazy, or you do not have much time, or the boss does not want to hire another data miner … But still you want to do more …

For example, like me you use decision trees to calculate models for response prediction, because decision trees are easy: they are non-parametric, which means you do not need to go through all the cumbersome work of imputing missing values, coding dummies for categorical variables, doing transformations to get more normally distributed variables, and so on.

But still you want to do more. You not only want to predict the probability of purchasing, you also want to estimate the amount that your customer will spend.

Well, that’s a problem. Did you ever try to run a multiple regression with 500 variables where 95% of the observations have a missing value for at least one variable? Right, you get nothing, nada, nil …

But there’s no time to do all the work of imputing missing values, coding dummies for categorical variables, doing transformations to get … !

Wait! There IS another way. Perhaps not perfect, but what the heck, Microsoft software isn’t either …

**Decision Trees! YES! Decision Trees!**

The trick is simple:

- take all your customers who made some purchases in the past and calculate the amount (for example per week, per month …)
- calculate the distribution of these amounts to get an idea of what you are dealing with
- transform the amount into, let’s say, five categories (very low, low, moderate, high, very high) with a fair number of customers in each category. (What’s a fair number? In data mining it is simple: “there’s no data like more data”)
- now you can choose: either you calculate 5 decision trees, each with a binary target, to predict the probability that your customer belongs to one of the 5 categories, or you calculate one decision tree with a multi-level target of 5 values
- for each customer, you calculate the probability of each of the 5 amount categories. Keep the category with the highest probability as the right one.
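The steps above can be sketched in a few lines of scikit-learn. Everything here is illustrative: the data is simulated, and the variable names, tree depth, and number of customers are my own assumptions, not the post’s.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Toy data: 1000 customers with 5 predictor variables and a skewed
# purchase amount (all names and numbers are illustrative).
n = 1000
X = rng.normal(size=(n, 5))
amount = np.exp(X[:, 0] + 0.5 * rng.normal(size=n))

# Quantile cut points give five categories (very low .. very high)
# with a roughly equal number of customers in each.
edges = np.quantile(amount, [0.2, 0.4, 0.6, 0.8])
category = np.digitize(amount, edges)  # 0 = very low .. 4 = very high

# One decision tree with a multi-level (5-class) target.
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X, category)

# Probability of each of the 5 amount categories per customer; keep
# the category with the highest probability as the predicted one.
proba = tree.predict_proba(X)  # shape (1000, 5)
predicted_category = proba.argmax(axis=1)
```

The alternative from the post, 5 separate trees with binary targets, would simply fit one `DecisionTreeClassifier` per category on the indicator `category == k`.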

You could even go further: instead of your original 500 variables, which were no good for a decent multivariate regression, you now have 5 nice new continuous variables which you can use in a multiple regression to model the original amount. Which means that the end result is not just a category, but a genuine estimated purchase amount:

*estimated purchase amount = constant + a × probability_very_low + b × probability_low + c × probability_moderate + d × probability_high + e × probability_very_high*
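That second stage is an ordinary least-squares fit with the five probabilities as predictors. A minimal sketch, assuming the five probability columns already exist (simulated here with a Dirichlet draw, and with made-up category means):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)

# Suppose each customer already has five probabilities from the tree,
# one per amount category (simulated; rows sum to 1).
n = 500
proba = rng.dirichlet(np.ones(5), size=n)

# Hypothetical mean spend per category, used only to simulate amounts.
centers = np.array([5.0, 15.0, 40.0, 90.0, 250.0])
amount = proba @ centers + rng.normal(scale=5.0, size=n)

# amount ~ constant + a..e times the five probabilities.
reg = LinearRegression().fit(proba, amount)
estimated = reg.predict(proba)  # a genuine estimated purchase amount
```

Note that the five probabilities sum to 1, so they are collinear with the constant; the fitted coefficients are not unique, but the predicted amounts are unaffected.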

What do you think ? A silly idea ? Worth a try ? Please let me know.



I’ve tried a similar thing to predict spend, but pricing plans in the telco market change so fast…

Regarding your comment:

– “transform the amount into let’s say five categories”

I use frequency binning to create a nice round number of categories, treat the result as a decimal numeric, and use a simple backprop NN to predict.
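Tim’s variant might look roughly like this. This is my own sketch of what he describes, with simulated data, an assumed bin count of 10, and an arbitrary small network; none of these choices come from his comment.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Simulated customers: 3 predictors and a skewed spend.
n = 400
X = rng.normal(size=(n, 3))
spend = np.exp(X[:, 0]) + rng.normal(scale=0.1, size=n)

# Frequency binning: 10 equal-count bins, and the bin index is then
# treated as a plain numeric target.
edges = np.quantile(spend, np.linspace(0.1, 0.9, 9))
target = np.digitize(spend, edges).astype(float)  # 0.0 .. 9.0

# A small backprop network regressing the binned target.
nn = MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=0)
nn.fit(X, target)
predicted_bin = nn.predict(X)
```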

Cheers

Tim

By:

Tim Manns, on March 10, 2010 at 9:10 pm
