By Cheng-sheng Peter Wu, FCAS, ASA, MAAA John Lucker, CISA Deloitte Consulting, LLP

A View Inside the “Black Box”: A Review and Analysis of Personal Lines Insurance Credit Scoring Models Filed in the State of Virginia By Cheng-sheng Peter Wu, FCAS, ASA, MAAA John Lucker, CISA Deloitte Consulting, LLP CAS 2004 Ratemaking Seminar, Call-3 Philadelphia March 11, 2004

Our Efforts on this Topic to Date • Does Credit Work? “Does Credit Score Really Explain Insurance Losses? Multivariate Analysis from a Data Mining Point of View” • How Can You Go Beyond Credit? “Mining the Most from Credit and Non-Credit Data” • How Do Credit Score Models Work? “A View Inside the “Black Box”: A Review and Analysis of Personal Lines Insurance Credit Scoring Models Filed in the State of Virginia”

The Motivation for Our Paper • What did we want to do? • Contribute substantive material and insights to the ongoing debate over credit scoring • Assist companies, regulators and the public in understanding how the Credit Scoring “Black Box” works • Show the similarities and differences in credit scoring models • What did we not want to do? • Attribute our findings directly to the filing companies and their business practices • Expose proprietary information beyond what is publicly available • Render opinions on the superiority of one model over another

The History of Personal Lines Pricing and Class Plans • Few class plan factors before World War II • Proliferation of class plan factors after the war • Class plans for Personal Auto – territory, driver, vehicle, coverage, loss and violation, others, tiers/company, etc. • Class plans for Homeowners – territory, construction class, protection class, insurance amount, coverage, prior loss, others, tiers/company, etc. • Insurance credit scoring started in the late 1980s and early 1990s as research and a developing concept – became widespread from the mid-1990s onward

The History of Personal Lines Credit Scoring • Credit Score was the first important rating factor identified in 20 years • Credit Score is a composite multivariate score vs. raw credit info • Until recently, it was viewed as a “secret weapon” worthy of secrecy • Today 90+% of Personal Lines insurers use credit scoring for some form of new business acquisition, risk selection, pricing, and renewal • Credit Score has been easy and relatively inexpensive to get, “quiet” to use, confidential, and straightforward to implement • Today, it is the hottest, most widely contested and debated topic in the Personal Lines insurance industry

The Current Environment • Continues to be a hot topic for debate • Many entities have conducted studies on the true correlation with loss ratio and the Disparate Impact issue • Virginia, Washington, Maryland, Texas, Missouri • NAIC, CAS, Tillinghast-Towers Perrin, EPIC • Many states have restricted (or are considering restricting) the usage of the score or certain credit information • More states want the “black box” filed and opened • More companies are considering proprietary credit models for greater transparency and non-credit scoring models

Study of VA Credit Score Filings • Insurers filed over 40 credit scoring models in Virginia in 2002 • Deloitte obtained copies of 11 of these filings, covering: • 9 filings for Personal Auto and 2 filings for Homeowners • 8 insurance groups • $45 billion in personal lines premiums

Types of Models • Industry Model – Fair Isaac (FICO) • 4 different FICO scores used by 3 insurance groups • Uses credit information from TransUnion • Multiple models by line, by market segment, and by version • Industry Model – ChoicePoint • 3 insurance groups for Auto • Uses credit information from Experian • Open model • Insurance Company Proprietary / Custom Models • 2 insurance groups • Uses credit information from TransUnion • Home and Auto are the same models

Scoring Functions • Rule-based • Table driven format • If factor x is equal to y, then get z points, etc... • Sum all the points to generate a raw score • All FICO models and one of the two proprietary models use this technique • Formula • Can be linear or non-linear • Need to determine the parameters/weights • One of the two proprietary models uses this technique • The ChoicePoint model is a mix of the two, but is more of a formula function

Scoring Functions • Rule-based • Advantages: simplicity, easier to explain, easier integration with a company’s class plan • Disadvantages: must predetermine the groupings, potential limitations in the number of variables used in the model • Formula • Advantages: easier to include more variables, formula is a direct result of the modeling process and doesn’t require transformation • Disadvantages: more difficult to explain and interpret

Scoring Functions • One way to compare a rule-based function and a formula function: review the “delta” • A formula function: • Z = 2X + 3Y • An increase of 1 in X gives an increase of 2 in Z • An increase of 1 in Y gives an increase of 3 in Z • A rule-based function: • If X=1 then 20 points; if X=2 then 40 points; if X=3 then 60 points • If Y=1 then 10 points; if Y=2 then 40 points; if Y=3 then 70 points • These two functions are essentially the same: each step in X adds 20 points and each step in Y adds 30 points, the same 2:3 ratio as the formula (points = 10Z − 20)
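The equivalence claimed on this slide can be checked directly. The short Python sketch below encodes the slide's formula and its two point tables, then confirms that over every (X, Y) combination the tables cover, the rule-based points are a linear rescaling of Z. The function and table names are illustrative and do not come from any filing.

```python
from itertools import product

# Formula function from the slide: Z = 2X + 3Y
def formula_score(x, y):
    return 2 * x + 3 * y

# Rule-based function from the slide: point tables keyed by factor value
X_POINTS = {1: 20, 2: 40, 3: 60}
Y_POINTS = {1: 10, 2: 40, 3: 70}

def rule_score(x, y):
    return X_POINTS[x] + Y_POINTS[y]

# Compare the two functions over every (X, Y) combination the tables cover
for x, y in product(X_POINTS, Y_POINTS):
    z = formula_score(x, y)
    points = rule_score(x, y)
    # The rule-based points are a linear rescaling of Z, so both rank risks identically
    assert points == 10 * z - 20
    print(f"X={x} Y={y}  Z={z:2d}  points={points:3d}")
```

Because the rescaling has a positive slope, sorting risks by points or by Z gives the same order. That shared ordering is the sense in which the two functions are the same.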

Scoring Process • Step 1 – calculate the raw score • Step 2 – scale the raw score to the final score (Score Scaling) • Transform a raw score to a final score (e.g. 0.34778 becomes 570) • Monotonic functions are used • Simple Scaling Functions vs Complex Scaling Functions • Simple scaling functions: linear shift and expand (a*score+b) • Complex scaling: non-linear formula / transformation
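A minimal Python sketch of the two scaling styles follows. Every parameter (a, b, low, high, k, mid) is a hypothetical value chosen for illustration, and the logistic curve is only one possible non-linear monotonic transformation. Neither function is taken from a filed model or meant to reproduce the 0.34778 to 570 example.

```python
import math

def simple_scale(raw, a=1000.0, b=200.0):
    """Linear shift and expand: a * raw + b (a > 0 preserves the ordering)."""
    return a * raw + b

def complex_scale(raw, low=200.0, high=900.0, k=6.0, mid=0.5):
    """Non-linear but still monotonic: a logistic curve stretched to (low, high)."""
    return low + (high - low) / (1.0 + math.exp(-k * (raw - mid)))

# Hypothetical raw scores, already sorted from lowest to highest
raw_scores = [0.10, 0.34778, 0.50, 0.90]
simple = [round(simple_scale(r)) for r in raw_scores]
complex_ = [round(complex_scale(r)) for r in raw_scores]

# A monotonic scaling never reorders risks, so both outputs stay sorted
assert simple == sorted(simple)
assert complex_ == sorted(complex_)
```

Either scaling leaves the ranking of risks unchanged. Only the spacing between final scores differs.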

[Figure: Score Scaling Function, Simple vs Complex. Two panels, “Simple Scaling Function” and “Complex Scaling Function”, each with axes Raw Score and Final Score.]

Score Ranges

Model Variables • Fair Isaac – 10 to 13 variables, depending on the models • ChoicePoint – 29 variables for “thin file” scores, and 37 variables for “thick file” scores • Proprietary #1 – 10 variables • Proprietary #2 – 36 variables

Model Variables

More Comparisons Possible • More model comparisons that could be performed: • Variable strength comparison between models • Score changes from one model to another • Model lift and stability from one model to another • To find out the answers to these questions: “Normalization of the Score Ranking and then Testing with Real Data”

Normalization of the Score Ranking & Testing with Real Data • Score a group of risks with different models • Sort the scores for the risks from the best to the worst • Group the sorted risks into deciles (or quintiles, quartiles, etc) • Use the deciles (or quintiles, quartiles, etc) as the “score” for comparison between models for • Predictive Power / Lift • Variable Strength • Score Changes / Migration
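The ranking steps above can be sketched in Python as follows. The two score lists are made-up data for ten hypothetical risks, and the helper name `decile_ranks` is illustrative. The sketch assumes a higher score is better and splits tied scores by input order; a real study might handle ties differently.

```python
def decile_ranks(scores, n_groups=10, higher_is_better=True):
    """Map raw model scores to groups 1..n_groups (1 = best-scoring group)."""
    n = len(scores)
    # Sort risk indices from best score to worst score
    order = sorted(range(n), key=lambda i: scores[i], reverse=higher_is_better)
    groups = [0] * n
    for position, i in enumerate(order):
        # Equal-count groups by sorted position; tied scores are split by input order
        groups[i] = position * n_groups // n + 1
    return groups

# Hypothetical scores for the same 10 risks under two models with different ranges
model_a = [712, 650, 801, 590, 745, 688, 623, 770, 560, 735]
model_b = [48, 35, 61, 22, 50, 41, 37, 55, 18, 53]

deciles_a = decile_ranks(model_a)
deciles_b = decile_ranks(model_b)

# Score migration: how many groups each risk moves going from model A to model B
migration = [b - a for a, b in zip(deciles_a, deciles_b)]
print(deciles_a)
print(deciles_b)
print(migration)
```

Once both models are on the same decile scale, their different raw score ranges no longer matter. Measuring lift would also need loss data for each risk, which this sketch does not include.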

Score Comparison Between 2 FICO Models - Original Score

Score Comparison Between 2 FICO Models – Normalization with Decile Score Ranking

Considerations for Building or Selecting a Model • How does your competitive advantage impact your choices? • Degree of predictive power desired relative to other factors? • How stable is the score from one period to the next? • How flexible do you want your company’s models to be? • What is your resource availability for development, “care & feeding”? • What are your expectations with regard to the regulatory climate? • What is the impact of the regulatory environment on your company? • What are your potential cost savings for credit scores & credit data purchases? • How can model performance be measured and monitored?

Conclusions • All models are similar in the type, form and structure of the variables and the data sources they come from • The models use different scoring functions and implementation approaches • The models produce scores with different score ranges • To perform a real comparison we must rank test the various models with real data • This will continue to be a hot topic in the industry – stay tuned…!
