Discover how Sayl maximizes area under the uplift curve, outperforming previous methods in breast cancer application. Quickly identify watchful waiting candidates and gain clinically interpretable insights with this innovative methodology.
Score As You Lift (SAYL): A Statistical Relational Learning Approach to Uplift Modeling
Houssam Nassif (1), Finn Kuusisto (1), Elizabeth S. Burnside (1), David Page (1), Jude Shavlik (1), and Vítor Santos Costa (2)
(1) University of Wisconsin, Madison, USA
(2) CRACS-INESC TEC and DCC-FCUP, University of Porto, Portugal
Score As You Lift The Task • Identify patients with breast cancer who may be good candidates for watchful waiting • Produce interpretable classifiers, for insight on the problem
Breast Cancer Stages • In situ: earlier stage; cancer is localized • Invasive: later stage; cancer has invaded surrounding tissue
Breast Cancer Age Difference • Older: cancer tends to progress less aggressively; patient has less time for cancer progression • Younger: cancer tends to progress more aggressively; patient has more time for cancer progression
Overtreatment Problem • Who is treated? Everyone • Can we reduce overtreatment in older patients with in situ cancer?
Problem Definition • Given: a multi-relational DB and a labeling • Find: watchful waiting candidates and interpretable explanations
Differential Prediction • New problem? • Psychology: results of a test on two populations • Medicine: ADR • Marketing: uplift modeling
Model Filtering (MF) • Use ILP to learn theories • Learn a theory T_O for older patients • T = { rules in T_O that do badly on younger patients }
Differentially Predictive Search (DPS) • Use ILP as before • But change the search: node scoring incorporates cases and controls (+ on cases, - on controls)
Uplift Modeling • How to evaluate different approaches? • Key work from marketing • Idea: performance on cases - performance on controls • Robust measure of performance: area under the lift curve (AUL)
Lift Curve • Rank examples by classifier score • Lift(x): number of positive examples correctly classified within the top fraction x of examples • Lift(0) = 0 • Lift(1) = Pos (the total number of positives) • Similar to ROC curve
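The lift curve on this slide can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `lift_curve` and `area_under_lift`, the toy scores, and the choice of trapezoidal integration are assumptions, not taken from the slides. Ties in score keep their input order.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, int]


def lift_curve(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Return (fraction of examples, positives found) points.

    Examples are ranked by descending score; labels are 1 (positive) or 0.
    Assumes at least one example.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    points: List[Point] = [(0.0, 0)]  # Lift(0) = 0
    found = 0
    for i, (_, label) in enumerate(ranked, start=1):
        found += label
        points.append((i / n, found))
    return points  # the last point is (1.0, Pos)


def area_under_lift(points: Sequence[Point]) -> float:
    """Trapezoidal area under the lift curve (AUL)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Toy data (hypothetical): four examples, two of them positive.
curve = lift_curve([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0])
print(curve[0], curve[-1])   # endpoints Lift(0) and Lift(1)
print(area_under_lift(curve))
```

The y-axis counts positives rather than a rate, which is why the curve ends at Pos instead of 1.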
AUL and Uplift • Uplift_{S,C}(x) = Lift_S(x) - Lift_C(x) • Uplift_{S,C}(0) = 0 • Uplift_{S,C}(1) = Pos_S - Pos_C • AU_UPLIFT(S, C) = AUL(S) - AUL(C)
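The definitions above can be checked on toy data. In the sketch below, S is taken to be the subject (case) group and C the control group. All function names and data are hypothetical, and linear interpolation between curve points is an assumption made so that two groups of different sizes can be compared at the same x.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def lift_points(scores: Sequence[float], labels: Sequence[int]) -> List[Point]:
    """Lift curve as (fraction of examples, positives found); needs >= 1 example."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n, found, pts = len(ranked), 0, [(0.0, 0.0)]
    for i, (_, label) in enumerate(ranked, start=1):
        found += label
        pts.append((i / n, float(found)))
    return pts


def lift_at(pts: Sequence[Point], x: float) -> float:
    """Linearly interpolate the lift curve at fraction x in [0, 1]."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= x <= x1:
            return y0 + (y1 - y0) * (x - x0) / (x1 - x0)
    raise ValueError("x must lie in [0, 1]")


def aul(pts: Sequence[Point]) -> float:
    """Trapezoidal area under the lift curve."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(pts, pts[1:]))


def uplift_at(pts_s: Sequence[Point], pts_c: Sequence[Point], x: float) -> float:
    """Uplift_{S,C}(x) = Lift_S(x) - Lift_C(x)."""
    return lift_at(pts_s, x) - lift_at(pts_c, x)


def au_uplift(pts_s: Sequence[Point], pts_c: Sequence[Point]) -> float:
    """AU_UPLIFT(S, C) = AUL(S) - AUL(C)."""
    return aul(pts_s) - aul(pts_c)


# Toy data (hypothetical): 4 subjects with 2 positives, 2 controls with 1 positive.
subjects = lift_points([0.9, 0.8, 0.4, 0.2], [1, 0, 1, 0])
controls = lift_points([0.9, 0.1], [0, 1])
print(uplift_at(subjects, controls, 0.0))  # Uplift(0)
print(uplift_at(subjects, controls, 1.0))  # Uplift(1) = Pos_S - Pos_C
print(au_uplift(subjects, controls))
```

Because AU_UPLIFT is a difference of two areas, it can be computed without interpolation; interpolation is only needed to evaluate the uplift curve pointwise.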
Lift and Uplift
Maximise UPLIFT: SAYL • Can we do better than DPS? • SAYU was designed to maximize an external theory score • Can we use SAYU for a differential score?
SAYU • SAYU learns a classifier from ILP rules • Classifiers are naive Bayes (NBAYES) or tree-augmented naive Bayes (TAN) • It learns incrementally, by adding rules • A rule is kept if it improves the score on the tuning set
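The incremental "keep a rule only if it improves the score" loop can be sketched as follows. This is a simplified illustration, not the authors' implementation: in SAYU the candidate rules come from an ILP search and the score comes from retraining a naive Bayes or TAN classifier and evaluating it on tuning data. Here both are replaced by hypothetical stand-ins (`TOY_GAIN`, `toy_score`) so the control flow can run on its own.

```python
from typing import Callable, Iterable, List


def greedy_rule_selection(candidates: Iterable[str],
                          score: Callable[[List[str]], float]) -> List[str]:
    """Add candidate rules one at a time, keeping a rule only if it
    strictly improves the externally supplied score."""
    kept: List[str] = []
    best = score(kept)
    for rule in candidates:
        trial = kept + [rule]
        trial_score = score(trial)
        if trial_score > best:
            kept, best = trial, trial_score
    return kept


# Hypothetical stand-in for "retrain the classifier with these rules as
# features and score it on the tuning set": each rule has a fixed gain.
TOY_GAIN = {"rule_a": 0.30, "rule_b": -0.10, "rule_c": 0.20}


def toy_score(rules: List[str]) -> float:
    return sum(TOY_GAIN[r] for r in rules)


selected = greedy_rule_selection(["rule_a", "rule_b", "rule_c"], toy_score)
print(selected)
```

Passing the score as a callable keeps the loop independent of the metric, so a differential score could be plugged in without changing the selection logic.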
SAYL • Separates cases (subjects) and controls • Learns one classifier for each • Scores on uplift
SAYL Algorithm
Evaluation • How does it perform? • Do rules interact? • Are they useful?
Dataset
Performance – Uplift Curves
Performance - Significance
Example Older TAN Model
Example Younger TAN Model
Example Rules • Current study combined BI-RADS increased up to 3 points over previous mammogram. • Patient had previous in situ biopsy at same location. • Breast BI-RADS score = 4.
Conclusions • SAYL maximizes area under the uplift curve. • SAYL significantly outperforms previous ILP methods on breast cancer application. • Output rules are clinically interpretable.
Future Work • Investigate clinical relevance further. • Investigate uplift maximization using different metrics. • Investigate use of initial TAN model structure.
Uplift and AUC • AUL is proportional to AUC • AUL = P (skew/2 + (1 - skew) AUC) • 0 <= skew <= 1, so AUL is linear and monotonic in AUC • UPLIFT(S, C) = C1 + C2 (AUC_S - AUC_C)
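The AUL formula can be checked numerically. The sketch below assumes that P is the number of positives and that skew is the fraction of positive examples, P / (P + N); the slide does not define either, so both readings are assumptions. It also assumes distinct scores and a trapezoidal AUL. The data and function names are hypothetical.

```python
from typing import Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.
    Assumes distinct scores and at least one example of each class."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1 for p in pos for q in neg if p > q)
    return wins / (len(pos) * len(neg))


def aul(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Trapezoidal area under the lift curve (y-axis counts positives)."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    n, found, area = len(ranked), 0, 0.0
    for _, label in ranked:
        # Trapezoid of width 1/n between lift values found and found + label.
        area += (found + found + label) / (2 * n)
        found += label
    return area


# Toy data (hypothetical): 7 examples, 4 positives.
scores = [0.95, 0.9, 0.7, 0.6, 0.5, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0, 1]

P = sum(labels)
skew = P / len(labels)  # assumed meaning of "skew"
lhs = aul(scores, labels)
rhs = P * (skew / 2 + (1 - skew) * auc(scores, labels))
print(lhs, rhs)  # the two values should agree up to rounding
```

Under these assumptions, P and skew are fixed for a given dataset, so ranking models by AUL and ranking them by AUC give the same order.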
Marketing Customer Groups • Persuadables: customers who will respond only when targeted • Sure Things: customers who will respond even when not targeted • Lost Causes: customers who will not respond, whether or not they are targeted • Sleeping Dogs: customers who will not respond as a result of being targeted
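The four groups amount to a two-by-two table over a customer's response with and without targeting. The sketch below encodes only those definitions. The function name and boolean arguments are hypothetical, and in practice both outcomes can never be observed for the same customer, which is why uplift has to be estimated from separate targeted and control groups.

```python
def customer_group(responds_if_targeted: bool,
                   responds_if_not_targeted: bool) -> str:
    """Map the two (hypothetical, never jointly observed) outcomes
    to one of the four marketing customer groups."""
    if responds_if_targeted and not responds_if_not_targeted:
        return "Persuadable"   # responds only when targeted
    if responds_if_targeted and responds_if_not_targeted:
        return "Sure Thing"    # responds either way
    if not responds_if_targeted and not responds_if_not_targeted:
        return "Lost Cause"    # never responds
    return "Sleeping Dog"      # targeting turns a response into none


for targeted in (True, False):
    for untargeted in (True, False):
        print(targeted, untargeted, customer_group(targeted, untargeted))
```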
Marketing Ideal Ranking
Marketing Dataset
In Situ vs. Invasive Dataset