Data miners who have visited my blog before already know that I like decision trees. They are without any doubt my favorite data mining tool.
Want to know why? Because they are simply the best data mining algorithm.
For a number of reasons:
- Decision trees are white boxes = means they generate simple, understandable rules. You can look into the trees, clearly understand each and every split, see the impact of that split and even compare it to alternative splits.
- Decision trees are non-parametric = means no specific data distribution is necessary. Decision trees easily handle both continuous and categorical variables.
- Decision trees handle missing values as easily as any ordinary value of the variable.
- Decision trees allow elegant tweaking. You can choose to set the depth of the tree, the minimum number of observations needed for a split or for a leaf, the number of leaves per split (in the case of multilevel target variables), and many more.
- Decision trees are one of the best algorithms for independent variable selection. If you really want to build a model with logistic (or linear) regression or with neural networks, but first want to reduce the number of variables by selecting only the relevant ones: use decision trees. They are fast and, unlike simple correlations with the target variable, they also take the interactions between variables into account.
- Decision trees are weak learners. At first sight this seems to be a disadvantage, but NO! Weak learners are great when you want to use lots of them in ensembles, because ensembles like bagging, boosting, random forests and TreeNet become very powerful algorithms when the individual models are weak learners.
- Decision trees identify subgroups. Each terminal or intermediate leaf in a decision tree can be seen as a subgroup/segment of your population.
- Decision trees run fast, even with lots of observations and variables.
- Decision trees can be used for supervised AND unsupervised learning. Yes, even though a decision tree is by definition a supervised learning algorithm that needs a target variable, it can be used for unsupervised learning, like clustering. For this, see one of my previous posts.
- Decision trees are simple. I mean: the algorithm is simple. No complicated mathematics is needed to understand how they work.
- Decision trees deliver high-quality models and are able to squeeze pretty much all the information out of the data, especially if you use them in ensembles.
- Decision trees can easily handle unbalanced datasets. If you have 0.1% positive targets and 99.9% negative ones: no problem for decision trees! (see one of my previous posts)
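To make a few of these points concrete, here is a small sketch of my own using scikit-learn (an assumed tool choice; the original post names no software, and any decision tree package would do). It shows the white-box rules, the tweaking parameters, the built-in variable ranking, the `class_weight` option for unbalanced targets, and an ensemble of shallow weak learners:

```python
# Sketch only: scikit-learn and its bundled breast-cancer dataset are
# assumptions for illustration, not the tools used in the post.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# Elegant tweaking: tree depth, minimum observations per split and per leaf.
# class_weight="balanced" re-weights classes for unbalanced datasets.
tree = DecisionTreeClassifier(
    max_depth=3,
    min_samples_split=20,
    min_samples_leaf=10,
    class_weight="balanced",
    random_state=0,
)
tree.fit(X, y)

# White box: print the simple, understandable rules behind every split.
print(export_text(tree, feature_names=list(X.columns)))

# Variable selection: rank variables by contribution (interactions included)
# and keep only the ones the tree actually used.
importances = sorted(
    zip(X.columns, tree.feature_importances_), key=lambda t: -t[1]
)
selected = [name for name, imp in importances if imp > 0]
print(selected)

# Weak learners in an ensemble: many shallow trees, bagged into a forest.
forest = RandomForestClassifier(n_estimators=200, max_depth=3, random_state=0)
forest.fit(X, y)
```

The `selected` list can then feed a logistic regression or neural network, as described above.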
Reasons enough? Do you know other algorithms with such beautiful characteristics?
Please do let me know !