Does Credit Score Really Help Explain Insurance Losses? Cheng-Sheng Peter Wu, FCAS, ASA, MAAA, Jim Guszcza, ACAS, MAAA, Ph.D.
Themes • The History • What Does the Question Mean? • Simpson’s Paradox - Need for Multivariate Analysis • What Has Been Done So Far? • Our Large-Scale Data Mining Experience • Going Beyond Credit • Conclusions
The History Pricing/Class Plans • Few factors before World War II • Explosion of class plan factors after the War • Current class plans (Auto) – territory, driver, vehicle, loss and violation, others, tiers/company, etc. • Actuarial techniques – Minimum Bias & GLM
The History Credit • First important factor identified over the past 2 decades • Composite multivariate score vs. raw credit information • Introduced in the late 1980s and early 1990s • Viewed at first as a “secret weapon” • Currently almost everyone is using it • Industry scores vs. proprietary scores • Quiet, confidential, controversial, black-box, etc.
What Does the Question Mean? Can Credit Score Really “Explain” Insurance Losses? • “X explains Y” • Weaker than claiming that X causes Y • Stronger than merely reporting that X is correlated with Y
What Does the Question Mean? Working Definition • We say that “X helps explain Y” if: • X is correlated with Y • The correlation does not go away when other available, measurable information is introduced
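One way to make the working definition concrete is a partial correlation: the correlation between X and Y after a third variable Z has been controlled for. The sketch below is illustrative only and is not the authors' method. The data are synthetic, and the two scenarios (X as a mere proxy for Z, and X carrying information of its own) are assumptions chosen to show the contrast.

```python
import math
import random

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)

def partial_corr(x, y, z):
    """Correlation of x and y after controlling for a single variable z."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]   # "other available information"
x = [zi + rng.gauss(0, 1) for zi in z]    # candidate variable, correlated with z

# Scenario A (synthetic): y depends only on z, so x is a mere proxy for z.
y_proxy = [zi + rng.gauss(0, 1) for zi in z]
# Scenario B (synthetic): y depends on z and on x itself.
y_real = [zi + xi + rng.gauss(0, 1) for zi, xi in zip(z, x)]

proxy_raw = pearson(x, y_proxy)
proxy_partial = partial_corr(x, y_proxy, z)
real_raw = pearson(x, y_real)
real_partial = partial_corr(x, y_real, z)

print(f"Scenario A: raw r = {proxy_raw:.2f}, controlling for z = {proxy_partial:.2f}")
print(f"Scenario B: raw r = {real_raw:.2f}, controlling for z = {real_partial:.2f}")
```

In scenario A the raw correlation is sizeable but shrinks toward zero once Z is introduced, so X would not "help explain" Y under the definition. In scenario B the correlation survives, which is the pattern the definition asks for.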
What Does the Question Mean? Intuition Behind the Definition • It might be okay for X to be a proxy for a “true” cause of Y • Testosterone level might be a true cause of auto losses…. But it’s not available • Age/Gender is a reasonable proxy • It might not be okay for X to be a proxy for other available predictive information
What Does the Question Mean? Applying the Definition • Suppose we see that credit score plays an important role in a multivariate regression equation that predicts loss ratio • Then it is fair to say that credit score helps explain insurance losses • A multivariate study is needed
Simpson’s Paradox – Need for Multivariate Analysis • Statistics can lie • Illustrates how a univariate association can lead to a spurious conclusion • The “true” explanatory factor is masked by the spurious correlation • Famous example: 1973 Berkeley admissions data
Simpson’s Paradox – Need for Multivariate Analysis The Berkeley Example (stylized) • 2200 people applied for admission • 1100 men; 1100 women • 210 men, 120 women were accepted. • Clear-cut case of gender discrimination… • …. Or is it?
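The "or is it?" can be made concrete with a small calculation. In the sketch below, only the totals (1100 applicants of each gender, 210 men and 120 women accepted) come from the slide. The split into two departments is hypothetical, chosen so that it reproduces those totals while each department admits men and women at the same rate.

```python
import math

# (applicants, accepted) by department and gender.
# The department split is HYPOTHETICAL; only the overall totals come from the slide.
data = {
    "Dept A": {"men": (1000, 200), "women": (100, 20)},
    "Dept B": {"men": (100, 10), "women": (1000, 100)},
}

def rate(applied, accepted):
    """Acceptance rate as a fraction."""
    return accepted / applied

def totals(gender):
    """Total (applicants, accepted) for one gender across all departments."""
    applied = sum(dept[gender][0] for dept in data.values())
    accepted = sum(dept[gender][1] for dept in data.values())
    return applied, accepted

# Univariate view: acceptance rate by gender only.
overall = {g: rate(*totals(g)) for g in ("men", "women")}

# Multivariate view: acceptance rate by gender within each department.
by_dept = {
    dept: {g: rate(*counts) for g, counts in genders.items()}
    for dept, genders in data.items()
}

for g, r in overall.items():
    print(f"overall {g}: {r:.1%}")
for dept, rates in by_dept.items():
    print(f"{dept}: men {rates['men']:.1%}, women {rates['women']:.1%}")
```

Under these assumed numbers the overall rates are about 19.1% for men and 10.9% for women, yet within each department the two rates are identical. The gap comes entirely from which department each group applied to, which is why the univariate comparison alone can mislead.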
What Has Been Done So Far • We (actuaries) have been quiet • Few published actuarial studies/opinions • NAIC/Tillinghast (1997) • Monaghan’s Study (2000) • Recent/related studies • Virginia State Study (1999) • CAS Sub-Committee (2002) • Washington State Study (2003) • University of Texas Study (2003)
What Has Been Done So Far Relevant Actuarial/Statistical Principles • Pure premium vs. loss ratio • Loss ratio studies go beyond existing rating plans, and are implicitly multivariate • Independence vs. correlation • Most insurance variables are correlated • Univariate vs. multivariate • Correlated variables call for multivariate studies for true answers (Simpson’s Paradox) • Credibility vs. homogeneity • Studies need to be credible and representative
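The first principle above can be illustrated with a small calculation. Pure premium is losses per unit of exposure, while loss ratio is losses per dollar of premium. Because premium already reflects the existing rating plan, a loss ratio difference between two groups is a difference the plan has not yet captured. The figures and segment names below are hypothetical and chosen only for round arithmetic.

```python
import math

# Hypothetical segments: (name, exposures in car-years, earned premium, incurred losses)
segments = [
    ("low credit score", 1000, 900_000, 720_000),
    ("high credit score", 1000, 600_000, 360_000),
]

def pure_premium(losses, exposures):
    """Losses per unit of exposure; ignores what the rating plan already charges."""
    return losses / exposures

def loss_ratio(losses, premium):
    """Losses per dollar of premium; premium embeds the existing rating factors."""
    return losses / premium

results = {
    name: {
        "pure_premium": pure_premium(losses, exposures),
        "loss_ratio": loss_ratio(losses, premium),
    }
    for name, exposures, premium, losses in segments
}

low, high = results["low credit score"], results["high credit score"]
pp_relativity = low["pure_premium"] / high["pure_premium"]
lr_relativity = low["loss_ratio"] / high["loss_ratio"]

print(f"pure premium relativity (low vs. high): {pp_relativity:.2f}")
print(f"loss ratio relativity (low vs. high):   {lr_relativity:.2f}")
```

In this made-up case the low-score group costs twice as much per car-year, but the existing plan already charges it more, so the loss ratio relativity is smaller. The part that remains is the signal a loss ratio study attributes to the new variable beyond the current rating plan.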
What Has Been Done So Far The Tillinghast Study • 9 companies’ data, seems representative • Loss ratio study • No other predictive variables included in the study • No detailed information given about the data • Strong correlation with loss ratio, seems credible • This is true, but it doesn’t answer our question and doesn’t quiet the critics
What Has Been Done So Far Monaghan’s Study • Loss ratio study • Large amount of data – credible analysis • Analyze individual credit variables as well as score • Multivariate analysis – limited to score + 1 traditional rating variable at a time • Shows strong correlations with loss ratio do not go away in the presence of other variables • Another good step, but we can go further
Our Large-Scale Data Mining Experience Our Work • Loss ratio studies • Multiple studies – representative • Large amounts of data – credible • Hundreds of variables tested along with credit – truly multivariate • Policy, driver, vehicle, coverages, billing, agency, external data, synthetic, etc. • Sound actuarial and statistical model design • Disciplined data mining process
Our Large-Scale Data Mining Experience What Have We Found Out? • Credit score is always one of the top variables selected for the multivariate models • Credit score has among the strongest parameter estimates and test statistics (t-scores) • Credit’s predictive power does not go away in the truly multivariate context • Removing credit score dampens the predictive power of the models
Our Large-Scale Data Mining Experience What Do We Conclude? • We conclude that credit score bears an unambiguous relationship to insurance losses, and is not a mere proxy for other kinds of information available to insurance companies. • This does not mean that credit score is the “cause” of insurance losses
Our Large-Scale Data Mining Experience Why Is Credit Score Correlated with Insurance Losses? • Beyond the scope of our work • Emphasis is not causation • Plausible speculations include • Stress/planning & organization • Risk-seeking behavior • ?? • Analogy: Age/Gender might be a proxy for testosterone
Going Beyond Credit Can We Do Well Without Credit? • YES: non-credit predictive models are • Valuable alternative to credit scores • Flexible • Tailored to individual companies • Comparable predictive power to credit scores • Also possible to build mixed credit/non-credit models
Going Beyond Credit Keys to Building Successful Non-Credit Models: • Fully utilize all sources of information • Leverage company’s internal data sources • Enriched with other external data sources • Use large amount of data • Employ disciplined analytical process • Utilize state-of-the-art modeling tools • Apply multivariate methodology
Going Beyond Credit Advantages of Going Beyond Credit • Next generation of competitive advantage • More variables, more predictive power • Leverages company’s internal data sources • More flexibility • Address regulatory issues and public concerns • Expense savings • Everyone gets a score (less of a “no hit” problem) • More customized – less “plain vanilla” than credit score
Conclusions • Credit works… even in a fully multivariate setting • But non-credit models can work well too! • What it means to us – beginning of a new era • Advances in computer technology • Advances in predictive modeling techniques • Large scale multivariate studies now practical • More external and internal info, anything else out there? • Other ways to go beyond credit?
Conclusions Future work on this topic • Multivariate pure premium analysis would provide more insights • Further study of public policy issues • The Washington and Virginia state studies came to opposite conclusions • Comparison of various existing scoring models