Predictive Modeling CAS Reinsurance Seminar May 7, 2007

Presentation Transcript


  1. Predictive Modeling CAS Reinsurance Seminar, May 7, 2007 Louise Francis, FCAS, MAAA Louise.francis@data-mines.com Francis Analytics and Actuarial Data Mining, Inc. www.data-mines.com

  2. Why Predictive Modeling? • Better use of data than traditional methods • Advanced methods for dealing with messy data now available Francis Analytics www.data-mines.com

  3. Data Mining Goes Prime Time Francis Analytics www.data-mines.com

  4. Becoming A Popular Tool In All Industries Francis Analytics www.data-mines.com

  5. Real Life Insurance Application – The “Boris Gang” Francis Analytics www.data-mines.com

  6. Predictive Modeling Family Francis Analytics www.data-mines.com

  7. Data Quality: A Data Mining Problem • Actuary reviewing a database Francis Analytics www.data-mines.com

  8. A Problem: Nonlinear Functions • An Insurance Nonlinear Function: Provider Bill vs. Probability of Independent Medical Exam Francis Analytics www.data-mines.com

  9. Classical Statistics: Regression • Estimation of parameters: fit the line that minimizes the squared deviation between actual and fitted values Francis Analytics www.data-mines.com
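
As an illustration, the following is a minimal least-squares sketch in Python with NumPy; the provider bill and paid loss figures are invented for the example, not taken from the slides.

```python
import numpy as np

# Hypothetical data: provider bill (x) and paid loss (y)
x = np.array([1200.0, 2500.0, 4800.0, 7500.0, 12000.0])
y = np.array([3000.0, 5200.0, 9100.0, 15800.0, 26000.0])

# Ordinary least squares: choose the intercept and slope that minimize
# the sum of squared deviations between actual and fitted values
X = np.column_stack([np.ones_like(x), x])        # design matrix with intercept
beta, *_ = np.linalg.lstsq(X, y, rcond=None)     # solves min ||X b - y||^2
intercept, slope = beta
fitted = X @ beta
sse = np.sum((y - fitted) ** 2)
print(f"intercept={intercept:.2f}, slope={slope:.4f}, SSE={sse:,.0f}")
```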

  10. Generalized Linear Models – Common Links for GLMs • The identity link: h(Y) = Y • The log link: h(Y) = ln(Y) • The inverse link: h(Y) = 1/Y • The logit link: h(Y) = ln(Y / (1 − Y)) • The probit link: h(Y) = Φ⁻¹(Y), the inverse standard normal CDF Francis Analytics www.data-mines.com
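
The links listed above can be written out directly; a small sketch in Python (the example values of Y are arbitrary):

```python
import numpy as np
from scipy.stats import norm

# The common GLM link functions h(Y) from the slide
identity = lambda y: y
log_link = lambda y: np.log(y)
inverse  = lambda y: 1.0 / y
logit    = lambda y: np.log(y / (1.0 - y))   # for Y in (0, 1)
probit   = lambda y: norm.ppf(y)             # inverse standard normal CDF

y = np.array([0.1, 0.5, 0.9])
for name, h in [("identity", identity), ("log", log_link),
                ("inverse", inverse), ("logit", logit), ("probit", probit)]:
    print(name, np.round(h(y), 3))
```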

  11. Major Kinds of Data Mining • Supervised learning – the most common situation: there is a dependent variable (frequency, loss ratio, fraud/no fraud); some methods: regression, CART, some neural networks • Unsupervised learning – no dependent variable: group like records together (a group of claims with similar characteristics might be more likely to be fraudulent); examples: territory assignment, text mining; some methods: association rules, k-means clustering, Kohonen neural networks (see the clustering sketch below) Francis Analytics www.data-mines.com
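
To make the unsupervised case concrete, here is a minimal k-means clustering sketch using scikit-learn; the claim features and values are hypothetical, chosen only to show records being grouped without a dependent variable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical claim features: provider bill and number of treatments
claims = np.array([
    [500, 2], [650, 3], [700, 2],        # low-bill claims
    [5200, 18], [6100, 22], [5800, 20],  # high-bill, high-treatment claims
])

# Unsupervised learning: no dependent variable, just group similar records
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(claims)
print(labels)                    # cluster assignment for each claim
print(kmeans.cluster_centers_)   # the "typical" claim in each group
```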

  12. Desirable Features of a Data Mining Method • Any nonlinear relationship can be approximated • A method that works when the form of the nonlinearity is unknown • The effect of interactions can be easily determined and incorporated into the model • The method generalizes well on out-of-sample data Francis Analytics www.data-mines.com

  13. The Fraud Surrogates used as Dependent Variables • Independent Medical Exam (IME) requested • Special Investigation Unit (SIU) referral • (IME successful) • (SIU successful) • Data: Detailed Auto Injury Claim Database for Massachusetts • Accident Years (1995-1997) Francis Analytics www.data-mines.com

  14. Predictor Variables • Claim file variables • Provider bill, Provider type • Injury • Derived from claim file variables • Attorneys per zip code • Docs per zip code • Using external data • Average household income • Households per zip Francis Analytics www.data-mines.com

  15. Different Kinds of Decision Trees • Single Trees (CART, CHAID) • Ensemble Trees, a more recent development (TREENET, RANDOM FOREST) • A composite or weighted average of many trees (perhaps 100 or more) Francis Analytics www.data-mines.com

  16. Non Tree Methods • MARS – Multivariate Adaptive Regression Splines • Neural Networks • Naïve Bayes (Baseline) • Logistic Regression (Baseline) Francis Analytics www.data-mines.com

  17. Classification and Regression Trees (CART) • Tree splits are binary • If the variable is numeric, the split is based on R² or the sum or mean of squared errors • For any variable, choose the two-way split of the data that reduces the MSE the most (see the split-search sketch below) • Do this for all independent variables • Choose the variable that reduces the squared errors the most • When the dependent variable is categorical, other goodness-of-fit measures (Gini index, deviance) are used Francis Analytics www.data-mines.com
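
A minimal sketch of that split search in Python; the bill and paid-loss data below are simulated, not the Massachusetts claim data used in the slides.

```python
import numpy as np

def best_split(x, y):
    """Search the two-way splits of a numeric predictor and return the
    cutpoint that minimizes the total SSE of the two resulting groups."""
    best = (None, np.inf)
    for cut in np.unique(x)[1:]:                  # candidate cutpoints
        left, right = y[x < cut], y[x >= cut]
        sse = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if sse < best[1]:
            best = (cut, sse)
    return best

# Simulated data: provider bill vs. paid loss, with a jump around $5,000
rng = np.random.default_rng(0)
bill = rng.uniform(0, 20000, 500)
paid = np.where(bill > 5000, 12000, 3000) + rng.normal(0, 1000, 500)

cut, sse = best_split(bill, paid)
print(f"best cutpoint = {cut:,.0f}, SSE = {sse:,.0f}")
```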

  18. CART – Example of 1st Split on Provider 2 Bill, With Paid as Dependent • For the entire database, the total squared deviation of paid losses around the predicted value (i.e., the mean) is 4.95 × 10^13. The SSE declines to 4.66 × 10^13 after the data are partitioned using $5,021 as the cutpoint. • Any other partition of the provider bill produces a larger SSE than 4.66 × 10^13. For instance, if a cutpoint of $10,000 is selected, the SSE is 4.76 × 10^13. Francis Analytics www.data-mines.com

  19. Continue Splitting to get more homogeneous groups at the terminal nodes Francis Analytics www.data-mines.com

  20. Ensemble Trees: Fit More Than One Tree • Fit a series of trees • Each tree added improves the fit of the model • Average or Sum the results of the fits • There are many methods to fit the trees and prevent overfitting • Boosting: Iminer Ensemble and Treenet • Bagging: Random Forest Francis Analytics www.data-mines.com
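
For illustration, a short scikit-learn sketch of the two ensemble ideas, boosting and bagging, fit to a synthetic binary target standing in for the IME-requested flag; none of this uses the actual claim data or the commercial packages named above.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Synthetic binary target standing in for IME requested
X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Boosting: trees are fit in sequence, each improving on the last
boosted = GradientBoostingClassifier(n_estimators=100).fit(X_tr, y_tr)
# Bagging: many trees fit on bootstrap samples and averaged
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)

for name, model in [("boosting", boosted), ("bagging", forest)]:
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: out-of-sample AUROC = {auc:.3f}")
```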

  21. Treenet Prediction of IME Requested Francis Analytics www.data-mines.com

  22. Neural Networks Francis Analytics www.data-mines.com

  23. Neural Networks • Also minimizes squared deviation between fitted and actual values • Can be viewed as a non-parametric, non-linear regression Francis Analytics www.data-mines.com

  24. Hidden Layer of Neural Network (Input Transfer Function) Francis Analytics www.data-mines.com

  25. The Activation Function (Transfer Function) • The sigmoid (logistic) function: f(x) = 1 / (1 + e^(−x)) Francis Analytics www.data-mines.com
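
A minimal sketch of that activation function and a one-hidden-layer forward pass in Python; the weights are random placeholders, not a fitted model.

```python
import numpy as np

def sigmoid(x):
    """Logistic activation: squashes any input into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, w2, b2):
    """One hidden layer: weighted inputs pass through the sigmoid,
    then the hidden outputs are combined into the prediction."""
    hidden = sigmoid(x @ W1 + b1)
    return sigmoid(hidden @ w2 + b2)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3))                     # 5 records, 3 predictors
W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)   # 4 hidden nodes
w2, b2 = rng.normal(size=4), 0.0
print(forward(x, W1, b1, w2, b2))               # predicted probabilities
```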

  26. Neural Network: Provider 2 Bill vs. IME Requested Francis Analytics www.data-mines.com

  27. MARS: Provider 2 Bill vs. IME Requested Francis Analytics www.data-mines.com

  28. How MARS Fits the Nonlinear Function • MARS fits a piecewise regression • BF1 = max(0, X − 1,401.00) • BF2 = max(0, 1,401.00 − X) • BF3 = max(0, X − 70.00) • Y = 0.336 + 0.000145626 * BF1 − 0.000199072 * BF2 − 0.000145947 * BF3 • BF1, BF2, BF3 are basis functions • MARS uses statistical optimization to find the best basis function(s) • A basis function is similar to a dummy variable in regression: like a combination of a dummy indicator and a linear independent variable Francis Analytics www.data-mines.com
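
The piecewise fit above can be evaluated directly; a short Python sketch using the slide's basis functions and coefficients (the bill values tried are arbitrary examples):

```python
import numpy as np

def mars_fit(x):
    """Piecewise-linear MARS fit from the slide, built from hinge basis functions."""
    bf1 = np.maximum(0, x - 1401.00)
    bf2 = np.maximum(0, 1401.00 - x)
    bf3 = np.maximum(0, x - 70.00)
    return 0.336 + 0.000145626 * bf1 - 0.000199072 * bf2 - 0.000145947 * bf3

bills = np.array([0, 70, 500, 1401, 3000, 6000])
print(np.round(mars_fit(bills), 3))   # fitted IME probability at each bill level
```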

  29. Baseline Method: Naive Bayes Classifier • Naive Bayes assumes that the features (predictor variables) are independent conditional on each category • The probability that an observation X will have a specific set of values for the independent variables is the product of the conditional probabilities of observing each of the values given target category c_j, j = 1 to m (m is typically 2) Francis Analytics www.data-mines.com

  30. Naive Bayes Formula • P(c_j | X) = P(c_j) ∏_i P(x_i | c_j) / P(X), where the denominator P(X) is a constant across categories Francis Analytics www.data-mines.com
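
A minimal numeric sketch of that calculation in Python; the priors and conditional probabilities below are invented purely for illustration.

```python
import numpy as np

# Hypothetical priors P(c_j) and conditionals P(x_i | c_j) for two categories
priors = {"fraud": 0.2, "no_fraud": 0.8}
cond = {
    "fraud":    {"attorney=yes": 0.7, "soft_tissue=yes": 0.8},
    "no_fraud": {"attorney=yes": 0.3, "soft_tissue=yes": 0.4},
}

# Posterior is proportional to prior times the product of conditionals;
# the denominator P(X) is the same constant for every category
scores = {c: priors[c] * np.prod(list(cond[c].values())) for c in priors}
total = sum(scores.values())
posterior = {c: s / total for c, s in scores.items()}
print(posterior)
```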

  31. Advantages/Disadvantages • Computationally efficient • Under many circumstances has performed well • Assumption of conditional independence often does not hold • Can’t be used for numeric variables Francis Analytics www.data-mines.com

  32. Naïve Bayes Predicted IME vs. Provider 2 Bill Francis Analytics www.data-mines.com

  33. True/False Positives and True/False Negatives (Type I and Type II Errors): The “Confusion” Matrix • Choose a “cut point” in the model score • Claims scoring above the cut point are classified “yes” Francis Analytics www.data-mines.com
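
A minimal sketch of tabulating that matrix at a chosen cut point in Python; the scores and outcomes are made up for the example.

```python
import numpy as np

def confusion(scores, actual, cut):
    """Classify 'yes' when the model score exceeds the cut point and
    tabulate true/false positives and true/false negatives."""
    pred = scores > cut
    tp = np.sum(pred & (actual == 1))
    fp = np.sum(pred & (actual == 0))
    fn = np.sum(~pred & (actual == 1))
    tn = np.sum(~pred & (actual == 0))
    return tp, fp, fn, tn

scores = np.array([0.9, 0.8, 0.35, 0.6, 0.2, 0.1])
actual = np.array([1, 1, 1, 0, 0, 0])
tp, fp, fn, tn = confusion(scores, actual, cut=0.5)
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"sensitivity={tp/(tp+fn):.2f}, specificity={tn/(tn+fp):.2f}")
```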

  34. ROC Curves and Area Under the ROC Curve • Want good performance on both sensitivity and specificity • Sensitivity and specificity depend on the cut point chosen • Choose a series of different cut points, and compute sensitivity and specificity for each of them • Graph the results: plot sensitivity vs. 1 − specificity • Compute an overall measure of “lift”, or area under the curve Francis Analytics www.data-mines.com
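
For illustration, a short scikit-learn sketch of sweeping the cut points and computing the area under the curve; the scores and flags are hypothetical, not the IME results reported on the next slide.

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

# Hypothetical model scores and actual IME-requested flags
actual = np.array([1, 1, 1, 0, 1, 0, 0, 0, 1, 0])
scores = np.array([0.9, 0.8, 0.7, 0.65, 0.6, 0.5, 0.4, 0.35, 0.3, 0.1])

# Sweep through cut points, computing sensitivity and 1 - specificity at each
fpr, tpr, cuts = roc_curve(actual, scores)
for f, t, c in zip(fpr, tpr, cuts):
    print(f"cut={c:.2f}  sensitivity={t:.2f}  1-specificity={f:.2f}")

print("AUROC =", round(roc_auc_score(actual, scores), 3))
```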

  35. TREENET ROC Curve – IME (AUROC = 0.701) Francis Analytics www.data-mines.com

  36. Ranking of Methods/Software – IME Requested Francis Analytics www.data-mines.com

  37. Some Software Packages That Can Be Used • Excel • Access • Free software: R • Web-based software • S-Plus (a commercial package similar to R) • SPSS • CART/MARS • Data mining suites (SAS Enterprise Miner, SPSS Clementine) Francis Analytics www.data-mines.com

  38. References • Derrig, R. and Francis, L., “Distinguishing the Forest from the Trees: A Comparison of Tree-Based Data Mining Methods”, CAS Winter Forum, March 2006, www.casact.org • Derrig, R. and Francis, L., “A Comparison of Methods for Predicting Fraud”, Risk Theory Seminar, April 2006 • Francis, L., “Taming Text: An Introduction to Text Mining”, CAS Winter Forum, March 2006, www.casact.org • Francis, L., “Neural Networks Demystified”, Casualty Actuarial Society Forum, Winter 2001, pp. 254–319 • Francis, L., “Martian Chronicles: Is MARS Better than Neural Networks?”, Casualty Actuarial Society Forum, Winter 2003, pp. 253–320 • Dhar, V., Seven Methods for Transforming Corporate Data into Business Intelligence, Prentice Hall, 1997 • The web site www.data-mines.com has some tutorials and presentations Francis Analytics www.data-mines.com

  39. Predictive Modeling CAS Reinsurance Seminar, May 7, 2007 Louise.francis@data-mines.com www.data-mines.com
