Linear Programming As Used for Discriminant Analysis
Objectives • Maximize minimum distance from critical value • Minimize sum of deviations from critical value • Simple • Direct • Free of statistical assumptions • Flexible
Requirements to use LP • LP modeling skills • Commercial software
Linear Discriminant Analysis • Separate data into groups so that: • Distance within each group is minimized • Distance to other groups is maximized • Can have: • Binary (2 groups) • Multiple categories (more than 2 groups)
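Minimize Sum of Deviations (MSD)
Minimize $\alpha_1 + \dots + \alpha_n$
Subject to:
$A_{11} x_1 + \dots + A_{1r} x_r \le b + \alpha_1$ for $A_1$ in B,
…………
$A_{n1} x_1 + \dots + A_{nr} x_r \ge b - \alpha_n$ for $A_n$ in G,
$\alpha_1, \dots, \alpha_n \ge 0$
(one deviation $\alpha_i$ per observation $A_i$; B = Bad group, G = Good group, b = critical value)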
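Maximize Minimum Distance (MMD)
Maximize $\beta_1 + \dots + \beta_n$
Subject to:
$A_{11} x_1 + \dots + A_{1r} x_r \le b - \beta_1$ for $A_1$ in B,
…………
$A_{n1} x_1 + \dots + A_{nr} x_r \ge b + \beta_n$ for $A_n$ in G,
$\beta_1, \dots, \beta_n \ge 0$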
Example
Minimize $\alpha_1 + \alpha_2$
Subject to:
$6 x_1 + 8 x_2 \le b + \alpha_1$ for $A_1$ in B,
$15 x_1 + 31 x_2 \ge b - \alpha_2$ for $A_2$ in G,
$\alpha_1, \alpha_2 \ge 0$
Use b = 9
Optimal solution: x1* = 0, x2* = 0.290323
A1 X* = 2.35 < 9 so BAD; A2 X* = 9.00001 > 9 so GOOD
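The check behind this example can be sketched in a few lines of Python. This is only an illustrative sketch: the helper names `score` and `msd_deviation` are hypothetical, and the code does not solve the LP. It plugs the reported weights back into the two constraints, assuming "≤" for the Bad record and "≥" for the Good record, and confirms that no deviation is needed.

```python
# Hypothetical helper names; the records, cutoff b and weights come from the example.

def score(weights, record):
    """Linear discriminant score A_i X for one record."""
    return sum(w * a for w, a in zip(weights, record))

def msd_deviation(weights, record, group, b):
    """Smallest deviation alpha_i that satisfies the MSD constraint for one record."""
    s = score(weights, record)
    if group == "B":            # Bad records should satisfy A_i X <= b
        return max(0.0, s - b)
    return max(0.0, b - s)      # Good records should satisfy A_i X >= b

b = 9
x_star = (0.0, 0.290323)                      # reported optimal weights
records = [((6, 8), "B"), ((15, 31), "G")]    # (attributes, known group)

total_deviation = sum(msd_deviation(x_star, a, g, b) for a, g in records)
labels = ["BAD" if score(x_star, a) < b else "GOOD" for a, _ in records]

print(labels, total_deviation)
```

A total deviation of zero means the cutoff b separates the two records perfectly under these weights.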
Perfect Separation: A X* = 9 • Bad: A1 X* = 2.345840 • Good: A2 X* = 9.000013
Three-Class Linear Discriminant Analysis • [Figure: classes C1, C2, C3 plotted against X and a1, with boundaries labeled bL1, bU1, bL2, bU2, bL3, bU3]
MCLP Classification • Two or more criteria • Create deviational variables for each criterion a: $\text{Function}_a + d_a^- - d_a^+ = \text{Target}_a$ • Objective: Min weighted sum of deviations • IDEAL POINT: all desired deviations = 0
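As a rough illustration of the deviational-variable idea, the sketch below computes the under- and over-achievement for each criterion and the weighted sum that the objective would minimize. The function names, the sample values, and the weights are all assumptions made for illustration; in an actual MCLP model the deviations are decision variables chosen by the solver, not computed after the fact.

```python
def deviations(achieved, target):
    """Return (d_minus, d_plus) such that achieved + d_minus - d_plus == target."""
    d_minus = max(0.0, target - achieved)   # shortfall below the target
    d_plus = max(0.0, achieved - target)    # overshoot above the target
    return d_minus, d_plus

def weighted_deviation(achieved_values, targets, weights):
    """Weighted sum of deviations; weights are (w_minus, w_plus) per criterion."""
    total = 0.0
    for f, t, (w_minus, w_plus) in zip(achieved_values, targets, weights):
        d_minus, d_plus = deviations(f, t)
        total += w_minus * d_minus + w_plus * d_plus
    return total

# Hypothetical data: two criteria, penalizing shortfall on the first
# and overshoot on the second.
achieved = [7.0, 12.0]
targets = [9.0, 10.0]
weights = [(1.0, 0.0), (0.0, 2.0)]

print(weighted_deviation(achieved, targets, weights))
```

At the ideal point every penalized deviation is zero, so the weighted sum is zero as well.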
Fuzzy LP Classification • Not all data precise • Fuzzy concept: • Membership function 0 ≤ MF ≤ 1 • Can have MF for any number of states • Example, at 50 degrees: • Cold MF might be 0.7 • Warm MF might be 0.4 • Hot MF might be 0
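One way to picture the membership values above is with piecewise-linear membership functions. The sketch below is purely illustrative: the breakpoints (35, 85, 42, 62, 75, 95) are assumptions picked so that a reading of 50 degrees reproduces the values on the slide, and they are not taken from the source.

```python
def ramp_down(x, full, zero):
    """Membership 1 at or below `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)

def ramp_up(x, zero, full):
    """Membership 0 at or below `zero`, rising linearly to 1 at `full`."""
    if x <= zero:
        return 0.0
    if x >= full:
        return 1.0
    return (x - zero) / (full - zero)

# Hypothetical breakpoints, chosen only to match the slide at 50 degrees.
def cold(t):
    return ramp_down(t, 35, 85)

def warm(t):
    return min(ramp_up(t, 42, 62), ramp_down(t, 75, 95))

def hot(t):
    return ramp_up(t, 75, 95)

print(cold(50), warm(50), hot(50))
```

Note that the memberships need not sum to 1: a single reading can belong partly to several states at once.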
Fuzzy MOLP • Discriminate among the various classes available • [Figure: X-axis is alpha; Y-axis is beta]
Real Application: Credit Card • Outcomes • Bankruptcy • Good • Scoring techniques • Behavior Score • Credit Bureau Scores • Proprietary Bankruptcy Score • Set Enumeration Decision Tree
Real Application – Credit Card • LP an alternative to these scoring methods • Classify cardholders in terms of payment • Common variables: • Balance • Purchase • Payment • Cash advance • State of residence • Job security
Real Application – Credit Card • FDR model • 38 original variables over 7 months • 65 derived variables generated • Separation criteria: • Information value – mean difference/STD • Concordance • Kolmogorov-Smirnov (best)
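For readers unfamiliar with the Kolmogorov-Smirnov criterion, a minimal sketch of the usual two-sample statistic follows: the largest gap between the cumulative score distributions of the two groups. The function name and the sample scores are hypothetical, and the source does not say how the FDR model computed its KS value.

```python
def ks_statistic(bad_scores, good_scores):
    """Maximum gap between the empirical CDFs of the two groups' scores."""
    thresholds = sorted(set(bad_scores) | set(good_scores))
    best = 0.0
    for t in thresholds:
        cdf_bad = sum(s <= t for s in bad_scores) / len(bad_scores)
        cdf_good = sum(s <= t for s in good_scores) / len(good_scores)
        best = max(best, abs(cdf_bad - cdf_good))
    return best

# Hypothetical scores for illustration only.
bad = [2.3, 4.1, 5.0, 9.5]
good = [8.7, 9.0, 10.2, 12.4]

print(ks_statistic(bad, good))
```

A value of 1 means some cutoff separates the groups completely; a value near 0 means the score distributions are nearly indistinguishable.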
Real Application – Credit Cards • Sampled 6,000 records • 2-class output • 65 attributes • 50 LP solutions computed • Varied fuzzy parameters, setoff limits • Used 1000, 3000, 6000 records • Compared with decision tree, neural network model • MCLP best at not calling actual bad cases good • But this was on a small test set • Fuzzy LP best on large test set
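The comparison criterion "not calling actual bad cases good" can be expressed as a simple rate. The sketch below uses made-up labels and a hypothetical function name; it does not reproduce any figures from the study.

```python
def missed_bad_rate(actual, predicted):
    """Share of actual Bad cases that the classifier labeled Good."""
    bad_total = sum(1 for a in actual if a == "Bad")
    missed = sum(
        1 for a, p in zip(actual, predicted) if a == "Bad" and p == "Good"
    )
    return missed / bad_total if bad_total else 0.0

# Hypothetical labels for six cardholders.
actual = ["Bad", "Bad", "Good", "Good", "Bad", "Good"]
predicted = ["Bad", "Good", "Good", "Bad", "Bad", "Good"]

print(missed_bad_rate(actual, predicted))
```

A lower rate is better under this criterion, since each missed Bad case is a bankruptcy the model failed to flag.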