The Cotor Challenge, Round 2

This analysis explores the distribution of a large-claims data set with the SAS exploratory procedures GCHART, BOXPLOT, and UNIVARIATE. It then fits the tail of the loss distribution using Extreme Value Theory and the Generalized Pareto distribution (GPD).

  Matthew Flynn (860) 633-4119 x8764 Matt.Flynn@sas.com

  2. A little EDA … Proc GCHART: the data are dominated by a single large claim. Dashed horizontal lines mark the 95th and 99th percentiles (chart labels: $10M, $5M, 99th pctile, 95th pctile).

  3. A little EDA … Proc BOXPLOT: the data are dominated by a single large claim.

  4. A little EDA continued… Proc UNIVARIATE;

  5. A little EDA … Proc UNIVARIATE; Loss Histogram – very, very long tail

  6. A little EDA … Proc UNIVARIATE; Losses versus the exponential distribution. The large loss appears at the upper right.

  7. A little EDA continued … Proc UNIVARIATE - logLoss; distributions fitted to the whole sample are unlikely to fit the tails well.

  8. A little EDA … Proc GCHART: the data are dominated by a single large claim. Vertical lines are at $5M and $10M.

  9. A little EDA continued … Proc UNIVARIATE - logLoss;

  10. A little EDA continued … - logLoss; the top loss is 60% of total dollars, and 90% of all dollars are in the top 25 losses (the top 1%).
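The concentration figures on this slide are shares of total dollars held by the largest claims. A minimal sketch of that calculation follows, written in Python rather than SAS; the function name `top_share` and the loss values are hypothetical and are not the Cotor data.

```python
def top_share(losses, k):
    """Fraction of total dollars contributed by the k largest losses."""
    if k < 0:
        raise ValueError("k must be non-negative")
    ordered = sorted(losses, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("total loss must be positive")
    return sum(ordered[:k]) / total


# Hypothetical sample: four large claims plus 100 small ones (illustrative only).
sample = [6_000_000, 1_500_000, 900_000, 600_000] + [10_000] * 100

share_top_1 = top_share(sample, 1)   # share of dollars in the single largest claim
share_top_4 = top_share(sample, 4)   # share of dollars in the four largest claims
```

Sorting once and summing the head of the list keeps the sketch short; for a real claims file the same ratio could be read off a cumulative sum.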

  11. Sample Mean Excess Function. The sample mean excess function is the sum of the excesses over the threshold u divided by the number of data points, n − k + 1, that exceed u. It describes the expected excess over a threshold given that an exceedance occurs, and is an empirical estimate of the mean excess function e(u) = E[X − u | X > u]. If a graph of the sample mean excess function is horizontal, the tail is exponential. An upward-sloping graph indicates a tail that is ‘fat’ relative to an exponential.
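The definition above can be sketched in a few lines. This is an illustrative Python version, not the SAS code behind the slide; the names `mean_excess` and `mean_excess_curve` and the sample values are assumptions made for the example.

```python
def mean_excess(losses, u):
    """Sample mean excess at threshold u: average of (x - u) over x > u."""
    excesses = [x - u for x in losses if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold")
    return sum(excesses) / len(excesses)


def mean_excess_curve(losses):
    """(threshold, mean excess) pairs for plotting.

    Each distinct observation except the maximum is used as a threshold,
    so at least one point always lies above it.
    """
    thresholds = sorted(set(losses))[:-1]
    return [(u, mean_excess(losses, u)) for u in thresholds]


# Hypothetical losses (illustrative only, not the Cotor data).
sample = [1.0, 2.0, 3.0, 4.0, 10.0]
curve = mean_excess_curve(sample)
```

Plotting the second element of each pair against the first gives the mean excess plot: a roughly flat line suggests an exponential tail, and an upward slope suggests a heavier one.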

  12. Extreme Value Theory – “Peaks Over Threshold” and the Generalized Pareto distribution. Next, fit a GPD to the tail of the loss distribution with SAS Proc NLMIXED, using the observations with logLoss above the threshold 11.9:

      /* GPD log-likelihood for exceedances of logLoss over the threshold 11.9 */
      proc nlmixed data=Cotor(where=(logLoss > 11.9));
        parms sigma=1 xi=0.3;
        bounds sigma > 0;
        /* outside the GPD support: large negative penalty, since NLMIXED maximizes lnlike */
        if (1 + xi * ((logLoss - 11.9) / sigma)) <= 0 then lnlike = -1e6;
        else lnlike = -log(sigma) - (1 + (1 / xi)) * log(1 + xi * ((logLoss - 11.9) / sigma));
        model logLoss ~ general(lnlike);
      run;
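To make the likelihood in the NLMIXED step easier to inspect, here is a rough Python sketch of the same GPD log-likelihood, summed over the excesses y = x − u. The function name `gpd_loglik` is hypothetical; returning negative infinity outside the support, and the exponential limit at xi = 0, are choices made for this sketch and are not taken from the slide.

```python
import math


def gpd_loglik(excesses, sigma, xi):
    """Sum of GPD log-densities for excesses y = x - u, with y >= 0."""
    if sigma <= 0:
        return -math.inf
    total = 0.0
    for y in excesses:
        if xi == 0:
            # Exponential limit of the GPD as xi -> 0.
            total += -math.log(sigma) - y / sigma
            continue
        z = 1.0 + xi * y / sigma
        if z <= 0:
            # Observation lies outside the support for these parameters.
            return -math.inf
        total += -math.log(sigma) - (1.0 + 1.0 / xi) * math.log(z)
    return total


# Hypothetical log-loss values above the threshold 11.9 (illustrative only).
threshold = 11.9
log_losses = [12.1, 12.4, 13.0, 15.2]
ll = gpd_loglik([x - threshold for x in log_losses], sigma=1.0, xi=0.3)
```

Evaluating the function at a few (sigma, xi) pairs is a quick sanity check on the estimates an optimizer reports.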

  13. Quantile or Tail Estimator – VaR (Value at Risk) See: McNeil, Alexander J. The Peaks over Thresholds Method for Estimating High Quantiles of Loss Distributions, ASTIN Colloquium, 1997, equation 5, page 10.
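For orientation, the standard peaks-over-threshold quantile estimator is x_p = u + (sigma / xi) * (((n / N_u) * (1 − p))^(−xi) − 1), where n is the total number of observations and N_u the number above the threshold u. The Python sketch below assumes this is the estimator the slide refers to; the function name `gpd_var` and the parameter values are hypothetical.

```python
import math


def gpd_var(p, u, sigma, xi, n, n_u):
    """POT quantile (VaR) estimate at probability level p.

    u: threshold, sigma/xi: fitted GPD scale and shape,
    n: total observations, n_u: observations above u.
    Only valid for quantiles at or beyond the threshold, p >= 1 - n_u / n.
    """
    if not (1.0 - n_u / n <= p < 1.0):
        raise ValueError("p must satisfy 1 - n_u/n <= p < 1")
    tail = (n / n_u) * (1.0 - p)
    if xi == 0:
        # Exponential-tail limit as xi -> 0.
        return u - sigma * math.log(tail)
    return u + (sigma / xi) * (tail ** (-xi) - 1.0)


# Hypothetical fit (illustrative values, not the deck's estimates).
var_99 = gpd_var(p=0.99, u=10.0, sigma=2.0, xi=0.5, n=1000, n_u=100)
```

If the GPD is fitted on a log scale, as with logLoss, the result is a quantile of the log loss; because the exponential is monotone, exponentiating it gives the corresponding quantile in dollars.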

  14. Expected Shortfall – Tail VaR – Conditional Tail Expectation If things go bad, how bad is bad? Expected value of a layer from r to R
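Under a fitted GPD tail, both quantities named on this slide have closed forms. For xi < 1 the expected shortfall is ES_p = VaR_p / (1 − xi) + (sigma − xi * u) / (1 − xi), and the expected value of a layer from r to R is the integral of the estimated tail survival function between r and R. The Python sketch below assumes these standard formulas; the function names and parameter values are hypothetical and are not the deck's estimates.

```python
def gpd_es(var_p, u, sigma, xi):
    """Expected shortfall given the VaR at the same level (requires xi < 1)."""
    if xi >= 1:
        raise ValueError("expected shortfall is infinite for xi >= 1")
    return var_p / (1.0 - xi) + (sigma - xi * u) / (1.0 - xi)


def gpd_layer(r, R, u, sigma, xi, n, n_u):
    """Expected loss in the layer from r to R, for u <= r <= R.

    Integrates the tail estimate S(x) = (n_u / n) * (1 + xi*(x - u)/sigma)**(-1/xi)
    from r to R. This sketch assumes xi is neither 0 nor 1.
    """
    if xi in (0, 1):
        raise ValueError("this sketch does not cover xi = 0 or xi = 1")
    if not (u <= r <= R):
        raise ValueError("need u <= r <= R")

    def g(x):
        return (1.0 + xi * (x - u) / sigma) ** (1.0 - 1.0 / xi)

    return (n_u / n) * (sigma / (1.0 - xi)) * (g(r) - g(R))


# Hypothetical fit (illustrative values only).
es = gpd_es(var_p=18.0, u=10.0, sigma=2.0, xi=0.5)
layer = gpd_layer(r=10.0, R=18.0, u=10.0, sigma=2.0, xi=0.5, n=1000, n_u=100)
```

Both results are on the scale of the fitted variable. Unlike a quantile, an expectation computed on a log scale cannot simply be exponentiated to get a dollar figure.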

  15. GPD Model Fit – Parameter estimates

  16. GPD Model Fit – additional estimates – estimated percentiles, expected shortfall

  17. Sensitivity analysis – expected shortfall, varying the size of the single largest loss. $5M xs $5M layer price estimate: $2,364

  18-22. The entire analysis can be run directly from Excel

  26. Matt Flynn (860) 633-4119 x8764 Matt.Flynn@sas.com

