Refining Rules Incorporated into Knowledge-Based Support Vector Learners via Successive Linear Programming
Richard Maclin, University of Minnesota - Duluth
Edward Wild, Jude Shavlik, Lisa Torrey, Trevor Walker, University of Wisconsin - Madison

The Setting
Given
• Examples for a classification/regression task
• Advice from an expert about the task
Do
• Learn an accurate model
• Refine the advice (if needed)
Knowledge-Based Support Vector Classification/Regression

Motivation
• Advice-taking methods incorporate a human user's knowledge
• But users may not be able to define advice precisely
• Idea: allow users to specify advice, but refine the advice with the data

An Example of Advice
True concept
    IF (3x1 – 4x2) > -1 THEN class = + ELSE class = -
Examples
    0.8, 0.7, 0.3, 0.2, +
    0.2, 0.6, 0.8, 0.1, -
Advice
    IF (3x1 – 4x2) > 0 THEN class = + ELSE class = -
    (wrong, threshold should be -1)

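To make the mismatch concrete, the two rules can be evaluated on the two example rows. This is a minimal sketch that assumes each row is four feature values followed by a label, and that x1 and x2 are the first two features; the function names are hypothetical.

```python
# Assumption: each example is (x1, x2, x3, x4, label); only x1 and x2 appear in the rules.
examples = [
    ([0.8, 0.7, 0.3, 0.2], "+"),
    ([0.2, 0.6, 0.8, 0.1], "-"),
]


def true_concept(x):
    """True concept from the slide: threshold is -1."""
    return "+" if 3 * x[0] - 4 * x[1] > -1 else "-"


def advice(x):
    """Expert advice from the slide: same rule, but threshold 0."""
    return "+" if 3 * x[0] - 4 * x[1] > 0 else "-"


for x, label in examples:
    print(x, "label:", label, "true concept:", true_concept(x), "advice:", advice(x))
```

Under these assumptions the true concept reproduces both labels, while the advice labels the first example "-" even though it is "+". That is the kind of error a threshold refinement is meant to repair.
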
SVM Formulation
min (model complexity) + C (penalties for error)
such that
    model fits data (with slack vars for error)

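As a rough illustration of the two terms in this objective, the sketch below scores a candidate linear model using the 1-norm form that appears on the linear programming slide, ||w||1 + |b| + C||s||1. It assumes labels encoded as +1/-1 and sets each slack to the smallest value satisfying y(w·x + b) + s ≥ 1; the function name and the candidate model are hypothetical.

```python
def svm_objective(w, b, examples, C):
    """Return (objective, slacks) for a linear model f(x) = w.x + b.

    Each slack is the smallest s >= 0 with y * f(x) + s >= 1.
    """
    slacks = []
    for x, y in examples:
        f_x = sum(wi * xi for wi, xi in zip(w, x)) + b
        slacks.append(max(0.0, 1.0 - y * f_x))
    complexity = sum(abs(wi) for wi in w) + abs(b)  # ||w||_1 + |b|
    return complexity + C * sum(slacks), slacks


# The two example rows from the deck, with + / - encoded as +1 / -1 (an assumption).
data = [([0.8, 0.7, 0.3, 0.2], +1), ([0.2, 0.6, 0.8, 0.1], -1)]

# Candidate model chosen for illustration only: f(x) = 3*x1 - 4*x2 + 1.
total, slacks = svm_objective([3.0, -4.0, 0.0, 0.0], 1.0, data, C=1.0)
print(total, slacks)
```

Raising C makes the slack term dominate, so the learner tolerates a more complex model in exchange for fewer margin violations.
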
Knowledge-Based SVMs [Fung et al., 2002, 2003 (KBSVM); Mangasarian et al., 2004 (KBKR)]
min (model complexity) + C (penalties for error) + (µ1, µ2) (penalties for not following advice)
such that
    model fits data (with slack vars for error)
    + model fits advice (also with slacks)

Refining Advice
min (model complexity) + C (penalties for error) + (µ1, µ2) (penalties for not following advice) + ρ (penalties for changing advice)
such that
    model fits data (with slack vars for error)
    + model fits advice (also with slacks)
    + variables to refine advice

Incorporating Advice in KBKR
Advice format: $Bx \le d \Rightarrow f(x) \ge \beta$
Example: IF (3x1 – 4x2) > 0 THEN class = +, which is encoded as $f(x) \ge 1$

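As a worked illustration of the advice format, the example rule can be rewritten with a "less than or equal" antecedent. This sketch assumes four input features, as in the example rows shown earlier in the deck, and treats the strict inequality as non-strict; neither choice is stated on the slide.

```latex
\begin{aligned}
3x_1 - 4x_2 > 0
  &\;\Longleftrightarrow\; -3x_1 + 4x_2 < 0 \\
  &\;\leadsto\; Bx \le d
  \quad\text{with}\quad
  B = \begin{bmatrix} -3 & 4 & 0 & 0 \end{bmatrix},\;\; d = 0,\;\; \beta = 1
\end{aligned}
```
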
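Linear Programming with Advice
Advice: $Bx \le d \Rightarrow f(x) \ge \beta$, e.g. IF (3x1 – 4x2) > 0 THEN class = +
KBSVMs:
$$
\begin{aligned}
\min \quad & \|w\|_1 + |b| + C\|s\|_1 + \sum_{k} \left( \mu_1 \|z_k\|_1 + \mu_2 \zeta_k \right) \\
\text{such that} \quad & Y(w^T x + b) + s \ge 1 \\
\text{for each advice } k: \quad & w_k + B_k^T u_k = z_k \\
& -d^T u_k + \zeta_k \ge \beta_k - b_k \\
& (s, u_k, \zeta_k) \ge 0
\end{aligned}
$$
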
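Refining Advice
Advice: $Bx \le (d - \delta) \Rightarrow f(x) \ge \beta$
We would like to simply add δ to the linear programming formulation:
KBSVMs:
$$
\begin{aligned}
\min \quad & \|w\|_1 + |b| + C\|s\|_1 + \sum_{k} \left( \mu_1 \|z_k\|_1 + \mu_2 \zeta_k + \rho \|\delta\|_1 \right) \\
\text{such that} \quad & Y(w^T x + b) + s \ge 1 \\
\text{for each advice } k: \quad & w_k + B_k^T u_k = z_k \\
& (\delta - d)^T u_k + \zeta_k \ge \beta_k - b_k \\
& (s, u_k, \zeta_k) \ge 0
\end{aligned}
$$
But we cannot solve for δ and u simultaneously! The constraint now contains the product of two unknowns, so the problem is no longer a linear program.
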
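A short worked expansion, in the slide's notation, of what happens to the refined advice constraint when either block of variables is held constant. The bars marking fixed values ($\bar\delta$, $\bar u_k$) are notation introduced here for illustration only.

```latex
\begin{aligned}
\text{Both unknown:} \quad
  & \delta^T u_k - d^T u_k + \zeta_k \ge \beta_k - b_k
  && \text{(bilinear in } \delta,\, u_k \text{)} \\
\delta \text{ fixed at } \bar\delta: \quad
  & (\bar\delta - d)^T u_k + \zeta_k \ge \beta_k - b_k
  && \text{(linear in } u_k,\, \zeta_k,\, b_k \text{)} \\
u_k \text{ fixed at } \bar u_k: \quad
  & \bar u_k^T \delta + \zeta_k \ge \beta_k - b_k + d^T \bar u_k
  && \text{(linear in } \delta,\, \zeta_k,\, b_k \text{)}
\end{aligned}
```

With either $\delta$ or $u_k$ held constant, every constraint is linear again and the remaining problem is an ordinary linear program.
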
Solution: Successive Linear Programming
Rule-Refining Support Vector Machines (RRSVM) algorithm:
Set δ = 0
Repeat
    Fix value of δ and solve LP for u
    Fix value of u and solve LP for δ
Until no change to δ or max # of repeats

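The alternation above can be sketched as a small driver loop. This is only a skeleton under stated assumptions: `solve_for_u` and `solve_for_delta` are hypothetical callables that would each wrap one LP solve (they are not part of any library), and "no change to δ" is approximated by a tolerance `tol`, which the slide does not specify.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def successive_lp(
    solve_for_u: Callable[[Vector], Vector],      # hypothetical: LP with delta fixed
    solve_for_delta: Callable[[Vector], Vector],  # hypothetical: LP with u fixed
    n_thresholds: int,
    max_repeats: int = 20,
    tol: float = 1e-8,
) -> Tuple[Vector, int]:
    """Alternate the two LP solves until delta stops changing.

    Returns the final delta and the number of repeats performed.
    """
    delta = [0.0] * n_thresholds  # Set delta = 0
    repeats = 0
    while repeats < max_repeats:
        u = solve_for_u(delta)          # fix delta, solve LP for u
        new_delta = solve_for_delta(u)  # fix u, solve LP for delta
        change = max(
            (abs(a - b) for a, b in zip(new_delta, delta)), default=0.0
        )
        delta = new_delta
        repeats += 1
        if change <= tol:  # "no change to delta", up to the assumed tolerance
            break
    return delta, repeats
```

In practice each callable would build and solve the corresponding LP with a solver of the reader's choice and return the relevant part of the solution; the driver itself does not depend on which solver is used.
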
Experiments
Artificial data sets
• IF (3x1 – 4x2) > -1 THEN class = + ELSE class = -
• Data randomly generated (with and w/o noise)
• Errors added (e.g., -1 dropped) to make advice
Promoter data set
• Data: Towell et al. (1990)
• Domain theory: Ortega (1995)

Methodology
• Experiments repeated twenty times
• Artificial data results – training and test sets randomly generated (separately)
• Promoter data – ten-fold cross validation
• Parameters chosen using cross validation (ten folds) on training data
    Standard SVMs: C
    KBSVMs: C, µ1, µ2
    RRSVMs: C, µ1, µ2, ρ

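The parameter selection step above can be sketched as a grid search over ten folds. This is a minimal illustration, not the authors' procedure: the fold construction, the grid values, and the `cv_error` callable (which would train on each fold's training split and return the mean held-out error) are all assumptions.

```python
import random
from itertools import product


def ten_fold_indices(n_examples, n_folds=10, seed=0):
    """Yield (train, held_out) index lists, one pair per fold."""
    idx = list(range(n_examples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for i, held_out in enumerate(folds):
        train = [j for k, fold in enumerate(folds) if k != i for j in fold]
        yield train, held_out


def select_parameters(grid, cv_error):
    """Return the setting in `grid` with the lowest cross-validation error.

    `cv_error` is a hypothetical callable mapping a parameter dict to an error.
    """
    names = list(grid)
    candidates = [dict(zip(names, values)) for values in product(*grid.values())]
    return min(candidates, key=cv_error)


# Hypothetical grid for RRSVMs; the values are placeholders, not from the slides.
rrsvm_grid = {"C": [1, 10, 100], "mu1": [1, 10], "mu2": [1, 10], "rho": [0.1, 1]}
```

The grid grows multiplicatively with each added parameter, so RRSVMs (four parameters) need more LP solves during selection than standard SVMs (one parameter).
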
Related Work
• Knowledge-Based Kernel Methods
    • Fung et al., NIPS 2002, COLT 2003
    • Mangasarian et al., JMLR 2005
    • Maclin et al., AAAI 2005, 2006
    • Le et al., ICML 2006
    • Mangasarian and Wild, IEEE Trans Neural Nets 2006
• Knowledge Refinement
    • Towell et al., AAAI 1990
    • Pazzani and Kibler, MLJ 1992
    • Ourston and Mooney, AIJ 1994
• Extracting Learned Knowledge from Networks
    • Fu, AAAI 1991
    • Towell and Shavlik, MLJ 1993
    • Thrun, 1995
    • Fung et al., KDD 2005

Future Work
• Test on other domains
• Address limitations (speed, # of parameters)
• Refine multipliers of antecedents
• Add additional terms to rules
• Investigate rule extraction methods

Conclusions
RRSVM
• Key idea: refine advice by adjusting thresholds of rules
• Can produce more accurate models
• Able to produce changes to advice
• Have shown that RRSVM converges

Acknowledgements
• US Naval Research Laboratory grant N00173-06-1-G002 (to RM)
• DARPA grant HR0011-04-1-0007 (to JS)