Intelligible Models for Classification and Regression Yin Lou, Rich Caruana, Johannes Gehrke 2012 ACM Conference on Knowledge Discovery and Data Mining Presented by Lindsay Stetson
Outline • Background and Motivation • Generalized Additive Models • Experimental Overview • Results • Conclusion
Background • Linear Model • Regression: y = β0 + β1x1 + … + βnxn • Classification: y = logit(β0 + β1x1 + … + βnxn) • Easy to interpret and intelligible, but less accurate • Complex Model (SVM, Random Forest, Neural Networks) • y = f(x1, …, xn) • More accurate, but usually unintelligible
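To make the two linear forms concrete, here is a minimal sketch; the coefficients are invented for illustration, not learned from data, and the classification form applies the inverse logit (sigmoid) to the linear score:

```python
import math

# Illustrative coefficients only; a real model would learn these.
beta = [0.5, 1.2, -0.7]          # beta0, beta1, beta2

def linear_regression(x):
    # y = beta0 + beta1*x1 + ... + betan*xn
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

def logistic_classification(x):
    # Inverse logit of the linear score: a probability in (0, 1).
    z = linear_regression(x)
    return 1.0 / (1.0 + math.exp(-z))

print(linear_regression([2.0, 1.0]))        # 0.5 + 2.4 - 0.7 = 2.2
print(logistic_classification([2.0, 1.0]))  # sigmoid(2.2) ≈ 0.90
```

Intelligibility here means each coefficient βi directly states how much feature xi moves the prediction.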
Goals of Work “…construct accurate models that are interpretable.” Intelligibility is important! In applied fields like biology, physics, and medicine we need to understand the individual contributions of the features in the model.
Outline • Background and Motivation • Generalized Additive Models • Experimental Overview • Results • Conclusion
Generalized Additive Model • Regression: y = f1(x1) + … + fn(xn) • Classification: y = logit(f1(x1) + … + fn(xn)) • Each feature's contribution is modeled by its own shape function fi • Goal: accurate and intelligible
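The additive structure can be sketched directly. The toy shape functions below are invented for illustration (in the paper they are learned splines or trees), but they show why each feature's contribution can be inspected on its own:

```python
import math

# Toy per-feature shape functions (illustrative, not learned).
shape_functions = [
    lambda x: x ** 2,          # f1: nonlinear in feature 1
    lambda x: math.sin(x),     # f2: nonlinear in feature 2
]

def gam_regression(x):
    # y = f1(x1) + ... + fn(xn)
    return sum(f(xi) for f, xi in zip(shape_functions, x))

def gam_classification(x):
    # Inverse logit of the additive score.
    return 1.0 / (1.0 + math.exp(-gam_regression(x)))

# Intelligibility: each feature's contribution is visible in isolation.
contributions = [f(xi) for f, xi in zip(shape_functions, [2.0, 0.0])]
print(contributions)               # [4.0, 0.0]
print(gam_regression([2.0, 0.0]))  # 4.0
```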
Fitting Generalized Additive Models • Splines (SP) • Single Tree (TR) • Bagged Trees (bagTR) • Boosted Trees (bstTR) • Boosted Bagged Trees (bbTR)
Learning Methods • Penalized Least Squares / Penalized Iteratively Reweighted Least Squares (P-LS/P-IRLS) • Backfitting (BF) • Gradient Boosting (BST)
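A minimal sketch of backfitting (BF), assuming a depth-1 regression tree (stump) as the per-feature smoother; real implementations use splines or larger trees, and gradient boosting (BST) would instead grow each fi in small stages. Each shape function is repeatedly refit to the partial residual left over by the other features:

```python
def fit_stump(xs, rs):
    """Fit a depth-1 regression tree (stump) to residuals rs."""
    best = None
    for split in sorted(set(xs)):
        left  = [r for x, r in zip(xs, rs) if x <= split]
        right = [r for x, r in zip(xs, rs) if x > split]
        if not left or not right:
            continue
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lmean) ** 2 for r in left) + \
              sum((r - rmean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, split, lmean, rmean)
    if best is None:                      # constant feature: fit the mean
        m = sum(rs) / len(rs)
        return lambda x: m
    _, split, lmean, rmean = best
    return lambda x: lmean if x <= split else rmean

def backfit(X, y, n_iters=20):
    n_feats = len(X[0])
    fs = [lambda x: 0.0] * n_feats
    for _ in range(n_iters):
        for j in range(n_feats):
            # Partial residual: subtract every other feature's fit.
            r = [yi - sum(fs[k](row[k]) for k in range(n_feats) if k != j)
                 for row, yi in zip(X, y)]
            fs[j] = fit_stump([row[j] for row in X], r)
    return fs

# Toy additive data: y = 2*[x1 > 0] + 3*[x2 > 1]
X = [[-1, 0], [-1, 2], [1, 0], [1, 2]]
y = [0, 3, 2, 5]
fs = backfit(X, y)
pred = [sum(f(row[j]) for j, f in enumerate(fs)) for row in X]
print(pred)   # recovers [0.0, 3.0, 2.0, 5.0]
```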
Conclusion • Generalized additive models are accurate and intelligible • Trees have low bias but high variance • Bagging reduces the variance, making the tree methods high performers • Gradient-boosted bagged trees with a small number of leaves are the most accurate
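The variance-reduction claim can be illustrated with a toy experiment; a bootstrap sample mean stands in for a high-variance tree fit here, and the setup is invented for illustration, not taken from the paper's experiments:

```python
import random

random.seed(0)

def sample_mean_estimator(data):
    # One "fit" on a bootstrap resample of the data (high variance).
    boot = [random.choice(data) for _ in data]
    return sum(boot) / len(boot)

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

data = [random.gauss(5.0, 2.0) for _ in range(50)]

# Single bootstrap fits vs. bagged fits (averaged over 25 resamples):
singles = [sample_mean_estimator(data) for _ in range(200)]
bagged  = [sum(sample_mean_estimator(data) for _ in range(25)) / 25
           for _ in range(200)]
print(variance(singles), variance(bagged))  # bagged variance is far smaller
```

Averaging over bootstrap fits shrinks the spread of the prediction, which is why bagged trees outperform single trees in the paper's results.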