Decision trees strike a nice balance between interpretability and accuracy. They capture nonlinear relationships and high-degree interactions, yet they still produce simple rules or diagrams that explain their decisions. They are also a robust modeling technique: they generally tolerate missing values, variables on disparate scales, and correlated variables.
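To make the interpretability point concrete, here is a minimal sketch using scikit-learn (an assumed library choice; the dataset and the max_depth setting are illustrative only) that fits a shallow tree and prints its learned rules:

```python
# A minimal sketch of an interpretable decision tree, assuming scikit-learn
# is available; the dataset and hyperparameters are illustrative only.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = load_breast_cancer(return_X_y=True, as_frame=True)

# A shallow tree keeps the rule set small enough to read.
tree = DecisionTreeClassifier(max_depth=3, random_state=0)
tree.fit(X, y)

# export_text prints the splits as nested if/else rules.
print(export_text(tree, feature_names=list(X.columns)))
```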
Many techniques have evolved for combining multiple decision trees into ensemble models. These ensembles reduce the variance error that a single tree can show on new data, typically without increasing bias error. Tree-based ensembles are often among the most accurate model types for tabular data.
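As a rough illustration of the variance-reduction claim, the sketch below (again assuming scikit-learn; the dataset, number of trees, and fold count are illustrative) compares cross-validated accuracy of a single tree against a random forest built from many bagged trees:

```python
# A minimal sketch comparing a single tree to a tree ensemble, assuming
# scikit-learn; exact results will vary with the data and hyperparameters.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

single_tree = DecisionTreeClassifier(random_state=0)
forest = RandomForestClassifier(n_estimators=300, random_state=0)

# Averaging many decorrelated trees typically lowers variance on new data,
# so the ensemble's cross-validated score is usually higher and more stable.
for name, model in [("single tree", single_tree), ("random forest", forest)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```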
- Overview of training decision trees in Enterprise Miner - Blackboard electronic reserves
- Predictive Modeling and Decision Trees in Enterprise Miner - Blackboard electronic reserves
- Introduction to Statistical Learning, Chapter 8
- Introduction to Data Mining, Chapter 4
- Elements of Statistical Learning, Chapters 10 and 15
- Pattern Recognition and Machine Learning, Chapter 14
- Random Forests by Leo Breiman
- Greedy Function Approximation: A Gradient Boosting Machine by Jerome Friedman
- Extremely Randomized Trees by Pierre Geurts, Damien Ernst, and Louis Wehenkel
- Stacked and blended ensemble models (a short stacking sketch follows this list):
  - Stacked Generalization by David Wolpert, 1992
  - Super Learner by Van der Laan et al., 2007
  - StackNet by Marios Michailidis
  - Ensemble Models in SAS Enterprise Miner
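For the stacked and blended ensemble references above, a minimal stacked-generalization sketch might look like the following (assuming scikit-learn's StackingClassifier; the choice of base learners and meta-learner is illustrative, not what any of the cited papers prescribe):

```python
# A minimal stacked-generalization sketch, assuming scikit-learn; the choice
# of base learners and meta-learner here is illustrative, not prescriptive.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# Level-0 base learners: their out-of-fold predictions become the inputs to
# the level-1 meta-learner, in the spirit of Wolpert's stacked generalization.
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
    ("gbm", GradientBoostingClassifier(random_state=0)),
]
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression(max_iter=1000))

print(cross_val_score(stack, X, y, cv=5).mean())
```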