GLMs in Personal Lines Pricing

MAF Fall Meeting, September 26, 2002
Claudine Modlin, FCAS
Watson Wyatt Insurance & Financial Services Inc.
www.watsonwyatt.com/pretium

Agenda
• Overview of GLMs in the rating process
• GLMs in practice
  • data
  • diagnostics
  • interactions
• Territory analysis
• How to get started

Objective
[Diagram: Age, Sex, Vehicle, Area, Claim Limit -> Rate Scheme -> Premium]

Modeling the cost of claims
[Diagram: Age, Sex, Vehicle, Area, Claim Limit -> Model -> Expected cost of claims]

Modeling the cost of claims
• BI: Freq x Amt = Cost 1
• PD: Freq x Amt = Cost 2
• MED: Freq x Amt = Cost 3
• COL: Freq x Amt = Cost 4
• OTC: Freq x Amt = Cost 5
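
The decomposition above (frequency times average amount, by claim type) can be sketched in a few lines. The frequencies and amounts below are made-up illustrative values, not figures from the presentation, and summing the five costs into one expected cost is an assumption about how the pieces are combined.

```python
# Hypothetical frequency (claims per exposure) and average amount per claim
# for each claim type. These numbers are illustrative only.
components = {
    "BI":  {"freq": 0.010, "amt": 12000.0},
    "PD":  {"freq": 0.040, "amt": 2500.0},
    "MED": {"freq": 0.008, "amt": 3000.0},
    "COL": {"freq": 0.050, "amt": 3000.0},
    "OTC": {"freq": 0.030, "amt": 800.0},
}


def expected_costs(components):
    """Return the cost per claim type (freq x amt) and their total."""
    costs = {name: c["freq"] * c["amt"] for name, c in components.items()}
    return costs, sum(costs.values())


costs, total_cost = expected_costs(components)
for name, cost in costs.items():
    print(f"{name}: {cost:.2f}")
print(f"Total expected cost of claims: {total_cost:.2f}")
```

Keeping frequency and amount separate per claim type lets each piece be modeled with its own factors and error structure before recombining.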

Modeling the cost of claims
• Rating factors
• Statistical techniques

Example auto rating factors
• Standard factors: Age, Sex, Marital status, Number years licensed, Claim experience, Territory, Usage, Mileage, Limits, Deductibles, Make/Model of vehicle, Violations, Credit, Multi-line, Multi-car, Safety devices, Theft devices
• External data: geodemographic data, geophysical data
• Data from other products: banking data, other insurance data

The failings of one-way analysis

Claims
        T      C      Total
M       80     20     100
F       20     20     40
Total   100    40     140

True risk
        T      C
M       40%    20%
F       20%    10%

Exposure
        T      C      Total
M       200    100    300
F       100    200    300
Total   300    300    600

One-way
        Exp    Claims   Ratio
M       300    100      33.3%
F       300    40       13.3%
T       300    100      33.3%
C       300    40       13.3%

Relativities (M vs F, and T vs C): x 2 from the true risk table, x 2.5 from the one-way ratios
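
The arithmetic behind this slide can be checked directly. The sketch below uses the exposure and claims figures from the tables; the variable and function names are chosen here for illustration.

```python
# Exposure and claims by (sex, area) cell, as given on the slide.
exposure = {("M", "T"): 200, ("M", "C"): 100, ("F", "T"): 100, ("F", "C"): 200}
claims = {("M", "T"): 80, ("M", "C"): 20, ("F", "T"): 20, ("F", "C"): 20}


def one_way_ratio(level, position):
    """Claims / exposure for one level of one factor, ignoring the other factor.

    position 0 selects the sex factor, position 1 the area factor.
    """
    exp = sum(v for k, v in exposure.items() if k[position] == level)
    clm = sum(v for k, v in claims.items() if k[position] == level)
    return clm / exp


def cell_rate(sex, area):
    """True claim rate in a single cell."""
    return claims[(sex, area)] / exposure[(sex, area)]


# One-way relativity of M to F: 33.3% / 13.3%
sex_one_way = one_way_ratio("M", 0) / one_way_ratio("F", 0)
# True relativity of M to F within area T: 40% / 20%
sex_true = cell_rate("M", "T") / cell_rate("F", "T")

print(f"One-way M/F relativity: {sex_one_way:.2f}")
print(f"True M/F relativity:    {sex_true:.2f}")
```

The one-way figure is overstated because M exposure is concentrated in area T, so the sex comparison picks up part of the area effect.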

Example correlation
[Chart: number of policies (0 to 35) by Vehicle Age (Old to New) and Vehicle Value (High to Low)]

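Generalized linear models
$E[Y] = \mu = g^{-1}(X\beta + \xi)$
$\mathrm{Var}[Y] = \phi \, V(\mu) / \omega$
(where $g$ is the link function, $X$ the design matrix, $\beta$ the parameter vector, $\xi$ an offset, $\phi$ the scale parameter, $V$ the variance function and $\omega$ the prior weights)
• Consider all factors simultaneously
• Allow for nature of random process
• Robust and transparent
• EU industry standard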
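
As a minimal sketch of what "consider all factors simultaneously" means in practice, the code below fits a Poisson GLM with a log link and log(exposure) as the offset to the four-cell M/F by T/C example from the one-way analysis slide, using iteratively reweighted least squares in plain Python. The choice of Poisson error and log link, and the helper names `solve` and `fit_poisson_glm`, are assumptions made for illustration; the presentation does not prescribe an implementation.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_poisson_glm(X, y, exposure, iterations=25):
    """Poisson GLM, log link, offset log(exposure), fitted by IRLS."""
    offset = [math.log(e) for e in exposure]
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / sum(exposure))  # start at the overall frequency
    for _ in range(iterations):
        eta = [sum(x * b for x, b in zip(row, beta)) + o for row, o in zip(X, offset)]
        mu = [math.exp(e) for e in eta]
        # Working response (offset removed); the IRLS weights equal mu for Poisson/log.
        z = [e - o + (yi - mi) / mi for e, o, yi, mi in zip(eta, offset, y, mu)]
        xtwx = [[sum(w * r[j] * r[k] for r, w in zip(X, mu)) for k in range(p)]
                for j in range(p)]
        xtwz = [sum(w * r[j] * zi for r, w, zi in zip(X, mu, z)) for j in range(p)]
        beta = solve(xtwx, xtwz)
    return beta


# Rows: (M,T), (M,C), (F,T), (F,C). Columns: intercept, is_M, is_T.
X = [[1, 1, 1], [1, 1, 0], [1, 0, 1], [1, 0, 0]]
claims = [80, 20, 20, 20]
exposure = [200, 100, 100, 200]

beta = fit_poisson_glm(X, claims, exposure)
relativities = [math.exp(b) for b in beta]
print("Base frequency (F, C):", round(relativities[0], 4))
print("M relativity:", round(relativities[1], 4))
print("T relativity:", round(relativities[2], 4))
```

Because both factors are fitted together, the model recovers the true x 2 relativities rather than the x 2.5 suggested by the one-way ratios.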

Why GLMs over other methods
• One-way and two-way analyses
  • Distorted by correlations, no diagnostics
• Iteratively standardized one-ways
  • No diagnostics, no faster than GLMs, less flexibility for allowance of random process, not always tractable solution
• Neural networks
  • Not transparent, hard to interpret, can be unstable with new types of policy, easy to over/under fit
• Cluster analyses / "segmenting"
  • Suitable for marketing but less appropriate for assessing continuous risk; does not fit with rating structures
• Data mining
  • General term for all of the above but can often be merely one-way or two-way analyses on subsets of data

Example of GLM output (real UK data)
[Chart: log of multiplier by factor level (1 to 7), with exposure (policy years) shown as bars. Series: GLM estimate; approx 2 SE from estimate.]

Example of GLM output (real UK data)
[Chart: as the previous chart, with the one-way relativities overlaid. Series: one-way relativities; GLM estimate; approx 2 SE from estimate; exposure (policy years).]

The premium rating process
[Diagram: Risk model, Rate level adjustments, Profit loadings]

The premium rating process
[Diagram: Risk Model, Rate level adjustments, Profit loadings; Compare with Current Rates]

Factor effect analysis

Factor effect analysis
[Chart: Demonstration job, Run 10 Model 2 (third party material, unsmoothed standard risk premium model). Log of multiplier and exposure by MPFREQ, payment frequency (Yearly, Half-yearly, Quarterly). Series: approx 2 SEs from unsmoothed estimate; unsmoothed unrestricted estimate; unsmoothed restricted estimate; current rating structure.]

Impact analysis
[Chart: Example job. Count of records by ratio of Risk Premium / Current tariff, in bands of width 0.05 from 0.450 to 2.500, annotated "Currently profitable business" and "Currently unprofitable business".]

Impact analysis
[Chart: Example job. Count of records and loss ratio (Yearly Claims / Earnedprem, 30% to 180%) by ratio of Risk Premium / Current tariff, in bands of width 0.05 from 0.450.]
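
The banding behind the impact analysis charts can be sketched as follows: each record's modeled risk premium is divided by its current tariff, the ratio is placed in a band of width 0.05, and record counts and loss ratios are accumulated per band. The four records and all names here are hypothetical, chosen only to make the mechanics visible.

```python
import math
from collections import defaultdict

# Hypothetical records: (risk_premium, current_tariff, incurred_claims, earned_premium)
records = [
    (310.0, 500.0, 200.0, 500.0),
    (485.0, 500.0, 450.0, 500.0),
    (615.0, 500.0, 700.0, 500.0),
    (620.0, 500.0, 800.0, 500.0),
]

BAND_WIDTH = 0.05


def band_lower_bound(ratio, width=BAND_WIDTH):
    """Lower edge of the band containing the ratio, e.g. 1.23 -> 1.2."""
    return round(math.floor(ratio / width) * width, 2)


counts = defaultdict(int)
claims_by_band = defaultdict(float)
premium_by_band = defaultdict(float)

for risk_premium, tariff, incurred, earned in records:
    band = band_lower_bound(risk_premium / tariff)
    counts[band] += 1
    claims_by_band[band] += incurred
    premium_by_band[band] += earned

loss_ratio = {b: claims_by_band[b] / premium_by_band[b] for b in counts}

for b in sorted(counts):
    print(f"{b:.2f} - {b + BAND_WIDTH:.2f}: n={counts[b]}, loss ratio={loss_ratio[b]:.0%}")
```

If the risk model is sound, observed loss ratios should rise with the band, which is what the second chart is used to check.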

Impact analysis

The premium rating process
[Diagram: for each claim type (TPBI, TPPD, AD, FT, WS), Freq x Amt = Cost 1 to Cost 5, feeding the Risk Model; Expense loadings and Profit loadings; Compare with Current Rates and Competitor Model; output: New Rates]

Competitive position
• Survey market
  • rate filings
  • quotation systems
  • question policyholder
  • mystery shopping
• Investigate competitors' structures
• Apply "cheapest" tariff to own portfolio
• Use in retention / new business model

The premium rating process
[Diagram: for each claim type (TPBI, TPPD, AD, FT, WS), Freq x Amt = Cost 1 to Cost 5, feeding the Risk Model; Expense loadings and Profit loadings; Compare with Current Rates and Competitor Model; Lapse/take-up Model; output: New Rates]

Modeling retention
[Diagram: Age, Sex, Vehicle age, Δ Premium, Claims, Premium / Competitors' premium -> Model -> Probability of lapsing]
• Model
  - rating factors
  - other products held
  - payment method
  - change in coverage
  - discount expectation
  plus…
  - source
  - change in premium
  - claims history
  - competitiveness

Retention model - Policyholder age
[Chart: log of multiplier by age of policyholder (20 to 70). Series: unsmoothed estimate; approx 2 SEs from estimate.]

Retention model - Change in premium
[Chart: log of multiplier (-0.5 to 1) by change in premium on renewal (-100 to 90). Series: unsmoothed estimate; approx 2 SEs from estimate.]

New business model - Competitiveness of premium
[Chart: log of multiplier of p/(1-p) by Quote / Average of the three cheapest quotes on the market (0.6 to 2). Series: smoothed estimate; approx 2 SD from estimate.]
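
The axis label "log of multiplier of p/(1-p)" suggests that the fitted effects act multiplicatively on the odds, as in a logit-link model. Under that assumption, the sketch below shows how a log multiplier read off such a chart would move a base probability; the function name and the example values are hypothetical.

```python
import math


def apply_odds_multiplier(base_prob, log_multiplier):
    """Apply a multiplier (given on the log scale) to the odds p / (1 - p)."""
    odds = base_prob / (1.0 - base_prob) * math.exp(log_multiplier)
    return odds / (1.0 + odds)


# Hypothetical base conversion probability of 20% (odds of 0.25).
base = 0.20
for log_mult in (-0.5, 0.0, 0.5, math.log(4.0)):
    print(f"log multiplier {log_mult:+.3f} -> p = {apply_odds_multiplier(base, log_mult):.3f}")
```

Working on the odds scale keeps every fitted probability strictly between 0 and 1, however large the combined effect of the factors.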

Customer lifetime value
[Diagram: grid of Profitability (Current Rates - Risk Model; Low to High) against Retention (Lapse model; High to Low), with an action per segment: "Target marketing at these"; "Increase premiums"; "Actively target at renewal (discount vouchers / phone calls)".]

Price elasticity

The premium rating process
[Diagram: for each claim type (TPBI, TPPD, AD, FT, WS), Freq x Amt = Cost 1 to Cost 5, feeding the Risk Model; Expense loadings and Profit loadings; Compare with Current Rates and Competitor Model; Lapse/take-up Model; Model office; output: New Rates]

Data required
• Linked policy + claims data
• Record: one insured risk (e.g. car) for one policy period, or portion of policy period for which risk has not changed
• Fields:
  • explanatory variables - rating, underwriting, marketing, external
  • stats - earned exposure, incurred claim count, incurred loss, earned premium (optional)
• Minimum of 100,000 earned exposures
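
One way to picture the record definition above is a small data structure in which a policy period is split wherever the risk changes, with claims attached to the segment in which they occurred. This is only a sketch: the class, field names, the 365-day year and the example policy are all assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ExposureRecord:
    policy_id: str
    start: date            # inclusive
    end: date              # exclusive
    rating_factors: dict   # explanatory variables in force during the segment
    claim_count: int = 0
    incurred_loss: float = 0.0

    @property
    def earned_exposure(self) -> float:
        """Earned exposure in policy years (assumes a 365-day year)."""
        return (self.end - self.start).days / 365.0


def build_records(policy_id, segments, policy_end, claims):
    """Build one record per period over which the risk did not change.

    segments: list of (start_date, rating_factors), sorted by start date.
    claims:   list of (loss_date, incurred_amount).
    """
    records = []
    for i, (seg_start, factors) in enumerate(segments):
        seg_end = segments[i + 1][0] if i + 1 < len(segments) else policy_end
        losses = [amt for loss_date, amt in claims if seg_start <= loss_date < seg_end]
        records.append(ExposureRecord(policy_id, seg_start, seg_end, dict(factors),
                                      len(losses), sum(losses)))
    return records


# Hypothetical policy: the territory changes mid-term, and one claim follows the change.
records = build_records(
    "P1",
    segments=[(date(2001, 1, 1), {"territory": "A"}),
              (date(2001, 7, 1), {"territory": "B"})],
    policy_end=date(2002, 1, 1),
    claims=[(date(2001, 9, 15), 1500.0)],
)
for r in records:
    print(r.rating_factors, round(r.earned_exposure, 3), r.claim_count, r.incurred_loss)
```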

Data considerations
• Reflect cancellation/endorsement
• Include time lag to reduce effect of IBNR
• Include dummy variables to standardize for geography (if countrywide study) and time
• Display rating factors applicable at time of exposure, categorized on current basis

Model iteration diagnostics
[Diagram: Factors 1 to 7 narrowing to Factors 3, 5 and 6]
• Standard errors of parameter estimates
• F-tests / $\chi^2$ tests on deviances (with ranks)
• Consistency over time
• Common sense

Standard errors of parameter estimates

Deviances
• Model A: Age, Sex, Vehicle, Zone (?), Multi-car, Claims -> Fitted value; Deviance = 9585, df = 109954
• Model B: Age, Sex, Vehicle, Multi-car, Claims -> Fitted value; Deviance = 9604, df = 109965
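
The comparison of Models A and B can be turned into a test of whether Zone earns its place. The sketch below assumes the quoted deviances are scaled deviances with a known scale parameter, so that the drop in deviance is compared with a chi-square distribution on the difference in degrees of freedom; with an estimated scale an F-test would be used instead. The helper `chi2_sf` is written here for illustration using only the standard library.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df.

    Uses the recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    for the regularized upper incomplete gamma function, with h = x / 2.
    """
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # Q(1, h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # Q(1/2, h)
    while a < df / 2.0:
        q += half ** a * math.exp(-half) / math.gamma(a + 1.0)
        a += 1.0
    return q


deviance_a, df_a = 9585.0, 109954   # Model A: with Zone
deviance_b, df_b = 9604.0, 109965   # Model B: without Zone

deviance_drop = deviance_b - deviance_a   # 19
df_drop = df_b - df_a                     # 11 extra parameters for Zone
p_value = chi2_sf(deviance_drop, df_drop)

print(f"Deviance drop {deviance_drop:.0f} on {df_drop} df, p-value = {p_value:.3f}")
```

On these figures the p-value falls between 5% and 10%, so Zone is borderline on this test alone and the other diagnostics (standard errors, consistency over time, common sense) carry more of the decision.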

Consistency over time

Common sense
• Does it make sense given correlations?
• Are ordered categorical variables well behaved?
• Can you believe it?
• Can underwriters believe it?
• Consider results for frequency and amounts at the same time
• Consider results for each claim type at the same time

Interactions

Geographic rating
• Territory is one of the main drivers of cost
• Considerable variety in how insurers rate for territory
• One insurer will have limited exposure in any one area

Spatial smoothing
• Fit GLM (excluding current territories)
• Map "residual" risk by "region"
• Make this residual risk more predictive
• Categorize into territories to derive appropriate loadings
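
The presentation does not say how the residual risk is made "more predictive". One simple possibility, shown here purely as an illustration and not as the method the speaker has in mind, is to blend each region's residual with the exposure-weighted average of its neighbours' residuals. The regions, adjacency list, exposures and the 50/50 blend are all hypothetical.

```python
def smooth_residuals(residual, exposure, neighbours, own_weight=0.5):
    """Blend each region's residual with its neighbours' exposure-weighted mean."""
    smoothed = {}
    for region, value in residual.items():
        nbrs = neighbours.get(region, [])
        nbr_exposure = sum(exposure[n] for n in nbrs)
        if nbr_exposure == 0:
            smoothed[region] = value  # no neighbour information: keep as is
            continue
        nbr_mean = sum(residual[n] * exposure[n] for n in nbrs) / nbr_exposure
        smoothed[region] = own_weight * value + (1.0 - own_weight) * nbr_mean
    return smoothed


# Hypothetical residual relativities (observed / GLM-expected) for three regions in a row.
residual = {"A": 1.2, "B": 1.0, "C": 0.8}
exposure = {"A": 100.0, "B": 100.0, "C": 100.0}
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}

smoothed = smooth_residuals(residual, exposure, neighbours)
for region in sorted(smoothed):
    print(region, round(smoothed[region], 3))
```

The smoothed values can then be grouped into territories, for example by banding them, to derive the loadings mentioned in the last bullet.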
