Actuarial Modeling in R. Glenn Meyers, FCAS, MAAA; Jim Guszcza, FCAS, MAAA. CAS Predictive Modeling Seminar, Las Vegas, October 2007.
R Background • R is an open-source statistical programming language • Pre-History: • R is based on the S statistical programming language, developed at Bell Labs beginning in the 1970s • The commercial package S-PLUS is based on the S language • R is an open-source implementation of the S language • Developed by Ross Ihaka & Robert Gentleman at the University of Auckland • Maintained and developed by the R Development Core Team • Features: • R is a high-level, object-oriented programming environment • R has advanced graphical capabilities • Statisticians around the world contribute add-on packages • Highly interactive in nature • Allows for experimentation and creativity • Known as “the statistician’s calculator”
Installing R • Go to http://cran.r-project.org/ • Or just type “R” into Google and click “I’m Feeling Lucky” • Click on “Download CRAN” on the left of the screen • Click on one of the USA CRAN mirror sites • Click on “Windows (95 and later)” • Click on “base” • Right-click on R-2.5.1-win32.exe • “Save target as” into any directory • After you’ve downloaded this setup program, double-click on it and follow the instructions
Add-on Packages • Click on “Packages” • Select “Install Package(s)” • Select a CRAN mirror
Add-on Packages • The “Packages” window will appear • Select “MASS” and click OK • MASS stands for Modern Applied Statistics with S • By Venables and Ripley • … add anything else you like. • It’s all free • There are hundreds of add-on packages available
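The same menu steps can also be carried out from the R console. This is a minimal sketch, assuming an internet connection and a reachable CRAN mirror in case MASS is not already installed (it is bundled with most R distributions):

```r
# Install MASS only if it is missing, then load it for this session
if (!requireNamespace("MASS", quietly = TRUE)) {
  install.packages("MASS")
}
library(MASS)

# Peek at a few of the functions and datasets the package provides
head(ls("package:MASS"))
```

Any other CRAN package can be installed the same way by substituting its name.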
R Warm-Up • R as a Calculator • Assignments • Vectors, Matrices, Data Frames • Getting Help • Linear Models • Maximum Likelihood Estimation
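The warm-up topics can be tried in a few lines at the console. The transcript does not include the seminar's own warm-up script, so the sketch below uses made-up numbers and a simulated exponential sample purely for illustration:

```r
# R as a calculator
2 + 3 * 4
sqrt(16)

# Assignments and vectors
x <- c(1, 2, 3, 4, 5)
y <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Matrices and data frames
m <- matrix(1:6, nrow = 2)
dim(m)
df <- data.frame(x = x, y = y)

# Getting help (uncomment in an interactive session)
# ?lm

# Linear model: least-squares fit of y on x
fit <- lm(y ~ x, data = df)
coef(fit)

# Maximum likelihood: estimate the rate of an exponential distribution
set.seed(1)
losses <- rexp(1000, rate = 0.01)   # simulated, illustrative data
negloglik <- function(rate) -sum(dexp(losses, rate = rate, log = TRUE))
mle <- optimize(negloglik, interval = c(1e-6, 1), tol = 1e-8)
mle$minimum          # numerical MLE of the rate
1 / mean(losses)     # closed-form MLE, for comparison
```

For the exponential distribution the closed-form MLE is the reciprocal of the sample mean, so the last two lines should agree closely.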
Example 1 Estimating a non-trivial loss distribution
Example 1: Fitting a Non-trivial Loss Distribution • Here is a size-of-loss histogram for 539 claims • Let’s estimate the true distribution that generated these claims.
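The 539 claims and the distribution fitted in the seminar are not part of this transcript. As a hedged illustration of the general approach, the sketch below simulates 539 lognormal losses (an assumption, with made-up parameters) and fits them by maximum likelihood with `fitdistr` from MASS:

```r
library(MASS)

# Simulated stand-in for the 539 claims; the lognormal form and its
# parameters are assumptions chosen for illustration only
set.seed(539)
losses <- rlnorm(539, meanlog = 8, sdlog = 1.2)

# Size-of-loss histogram on the density scale
hist(losses, breaks = 50, freq = FALSE, main = "Size of loss", xlab = "Loss")

# Maximum likelihood fit of a lognormal distribution
fit <- fitdistr(losses, "lognormal")
fit$estimate   # fitted meanlog and sdlog
fit$sd         # approximate standard errors

# Overlay the fitted density on the histogram
curve(dlnorm(x, fit$estimate[["meanlog"]], fit$estimate[["sdlog"]]),
      add = TRUE, lwd = 2)
```

A genuinely non-trivial loss distribution (for example a mixture) would need a custom likelihood passed to an optimizer such as `optim`, along the lines of the warm-up.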
Example 2 Curve Fitting
Example 2: Curve Fitting • In this dataset, Y has a non-linear relationship with X • Let’s fit a curve to this data
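The dataset and the curve used in the seminar are not included here. The sketch below uses simulated data (a noisy sine curve, an assumption) to compare a straight line with two common curve-fitting tools in base R, a cubic polynomial and a loess smoother:

```r
# Simulated stand-in data: Y depends on X through a sine curve plus noise
set.seed(42)
x <- seq(0, 2 * pi, length.out = 200)
y <- sin(x) + rnorm(200, sd = 0.2)
dat <- data.frame(x = x, y = y)

# Three candidate fits
fit_line  <- lm(y ~ x, data = dat)            # straight line
fit_cubic <- lm(y ~ poly(x, 3), data = dat)   # cubic polynomial
fit_loess <- loess(y ~ x, data = dat, span = 0.5)  # local regression

# Plot the data with the fitted curves
plot(x, y, col = "grey40")
abline(fit_line, lty = 3)
lines(x, fitted(fit_cubic), lwd = 2)
lines(x, predict(fit_loess), lwd = 2, lty = 2)

# Residual sums of squares: smaller means a closer fit to this sample
rss <- c(line  = sum(resid(fit_line)^2),
         cubic = sum(resid(fit_cubic)^2),
         loess = sum(resid(fit_loess)^2))
rss
```

A closer in-sample fit is not automatically a better model, so the loess span and polynomial degree would normally be checked on held-out data.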
Example 3 Non-Linear Predictive Modeling
Example 3: Predictive Modeling Problem • We have data on 369 Workers Comp claims • Age of claimant • Distance to work • Claim Duration • Let’s build a model to predict Duration using Age and Distance
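The Workers Comp data and the modeling technique used in the seminar are not in this transcript. As one possible sketch, the code below simulates 369 claims (all values and the functional form are assumptions) and compares a purely linear model with one that allows curvature in both predictors:

```r
# Simulated stand-in for the 369 Workers Comp claims (illustrative only)
set.seed(369)
n <- 369
claims <- data.frame(Age = round(runif(n, 18, 65)),
                     Distance = runif(n, 0, 50))
claims$Duration <- 10 + 0.02 * (claims$Age - 40)^2 +
  3 * sqrt(claims$Distance) + rnorm(n, sd = 3)

# Linear model versus a model with quadratic terms in Age and Distance
fit_linear <- lm(Duration ~ Age + Distance, data = claims)
fit_curved <- lm(Duration ~ poly(Age, 2) + poly(Distance, 2), data = claims)

# Lower AIC indicates the better trade-off between fit and complexity
AIC(fit_linear, fit_curved)

# Predict Duration for a hypothetical new claimant
new_claim <- data.frame(Age = 30, Distance = 12)
pred <- predict(fit_curved, newdata = new_claim)
pred
```

Polynomial terms are only one way to capture non-linearity; smoothers and tree-based methods are common alternatives.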
Tinn-R (Tinn Is Not Notepad) – A text editor for R • Search for Tinn-R on Google • Free download • Helpful in a lot of little ways
Example 4 Generalized Additive Model (GAM) Example
Generalized Additive Model (GAM) • Similar to a Generalized Linear Model (GLM) • Allows smooth, non-linear functions of the predictors • Select the mgcv package
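A GAM fit with mgcv looks much like a GLM call, with `s()` marking the predictors that get a smooth term. The seminar's data is not available, so this sketch uses simulated Poisson counts (an assumption) in which one predictor acts non-linearly:

```r
library(mgcv)

# Simulated data: x1 has a non-linear effect, x2 a linear one (log scale)
set.seed(4)
n <- 300
x1 <- runif(n, 0, 10)
x2 <- runif(n, 0, 1)
counts <- rpois(n, exp(0.5 + sin(x1) + 2 * x2))

# GLM-style call; s(x1) requests a smooth function of x1
fit_gam <- gam(counts ~ s(x1) + x2, family = poisson)
gam_summary <- summary(fit_gam)
gam_summary

# Effective degrees of freedom of the smooth: about 1 would mean "linear"
gam_summary$edf

# plot(fit_gam) draws the estimated smooth with a confidence band
```

The effective degrees of freedom reported for `s(x1)` indicate how far the fitted smooth departs from a straight line.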
Example 5 Collective Risk Model Example
Collective Risk Model • Easily viewed as a simulation • Select χ from a gamma distribution with mean 1 and variance c • Select N from a Poisson distribution with mean χ·λ • Note – This process gives N a negative binomial distribution • Select N claims from a Pareto distribution • X = Sum of the N claims • Fast calculation with FFTs • Discretize the claim severity distribution • Reference: Loss Models by Klugman, Panjer and Willmot • p 185 and p 656 (2nd Edition)
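Both routes described on this slide can be sketched in base R. All parameter values below are hypothetical, the gamma mixing variable is called `chi`, the Poisson mean parameter `lambda`, the gamma variance `contagion`, and the Pareto is taken to be the two-parameter (Lomax) form; none of these choices come from the seminar's own code.

```r
set.seed(2007)
n_sim <- 10000
lambda <- 10       # expected claim count (hypothetical)
contagion <- 0.1   # variance of the gamma mixing variable (hypothetical)
alpha <- 3         # Pareto shape (hypothetical)
theta <- 2000      # Pareto scale (hypothetical)

# --- Simulation ---
# Gamma with mean 1 and variance `contagion`
chi <- rgamma(n_sim, shape = 1 / contagion, scale = contagion)
# Poisson counts with mean chi * lambda (negative binomial overall)
N <- rpois(n_sim, chi * lambda)
# Pareto severities by inverse transform; a count of zero gives a sum of 0
rpareto2 <- function(n) theta * (runif(n)^(-1 / alpha) - 1)
X <- vapply(N, function(k) sum(rpareto2(k)), numeric(1))

# --- FFT ---
h <- 100       # discretization span
n <- 2^14      # number of grid points
x_grid <- (0:(n - 1)) * h
ppareto2 <- function(x) 1 - (theta / (x + theta))^alpha
# Rounding method: mass at j*h is F((j + 0.5)h) - F((j - 0.5)h)
edges <- pmax((0:n - 0.5) * h, 0)
sev <- diff(ppareto2(edges))
sev <- sev / sum(sev)   # fold the negligible tail beyond the grid back in
# Negative binomial pgf applied to the transform of the severity
phi <- fft(sev)
agg_hat <- (1 - contagion * lambda * (phi - 1))^(-1 / contagion)
agg <- Re(fft(agg_hat, inverse = TRUE)) / n

# Compare the two approaches
c(simulated_mean = mean(X), fft_mean = sum(x_grid * agg))
c(simulated_q99 = unname(quantile(X, 0.99)),
  fft_q99 = x_grid[which(cumsum(agg) >= 0.99)[1]])
```

The grid must be long enough that almost no aggregate probability lies beyond its end; otherwise the circular convolution behind the FFT wraps that mass around to small loss amounts.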
Example 6 Parameter Risk in Loss Reserving
Parameter Risk in Loss Reserve Estimates • Expected Loss Payment in Lag t = Premium·ELR·(β(t|a,b) − β(t−1|a,b)) • Observed loss in Lag t has an overdispersed Poisson distribution with Mean = Expected Loss Payment in Lag t • Estimate ELR, a and b by maximum likelihood • Repeat on overdispersed Poisson data simulated from fixed ELR, a and b • References • Clark, CAS Forum (Fall 2003) • England and Verrall, PCAS 2001
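The procedure on this slide can be sketched as a small simulation study. The transcript does not define the payout function in the formula, so the code assumes it is the cumulative beta distribution evaluated at t divided by the number of lags; the premium, true parameters, dispersion `phi`, and triangle shape are all hypothetical, and the overdispersed Poisson is simulated as `phi` times a Poisson variable.

```r
set.seed(2007)
n_lag <- 10
premium <- 50000                           # per accident year (hypothetical)
true_par <- c(ELR = 0.7, a = 1.5, b = 4)   # fixed "true" values (hypothetical)
phi <- 100                                 # dispersion, treated as known

# Upper-left triangle: accident year ay observed through lag n_lag + 1 - ay
cells <- expand.grid(ay = 1:n_lag, lag = 1:n_lag)
cells <- cells[cells$ay + cells$lag <= n_lag + 1, ]

# Expected payment in each lag; par = c(ELR, a, b)
# ASSUMPTION: payout pattern is the beta CDF at lag / n_lag
expected_paid <- function(par, lag) {
  growth <- pbeta(lag / n_lag, par[2], par[3]) -
    pbeta((lag - 1) / n_lag, par[2], par[3])
  premium * par[1] * growth
}

# Overdispersed Poisson: variance = phi * mean
simulate_paid <- function() {
  phi * rpois(nrow(cells), expected_paid(true_par, cells$lag) / phi)
}

# Maximum (quasi-)likelihood on the log scale to keep parameters positive
fit_mle <- function(paid) {
  negloglik <- function(logpar) {
    m <- pmax(expected_paid(exp(logpar), cells$lag), 1e-12)
    -sum(paid * log(m) - m) / phi
  }
  opt <- optim(log(c(0.5, 1, 1)), negloglik, control = list(maxit = 2000))
  setNames(exp(opt$par), c("ELR", "a", "b"))
}

# Repeat on simulated triangles to see how much the estimates vary
estimates <- t(replicate(50, fit_mle(simulate_paid())))
apply(estimates, 2, median)
apply(estimates, 2, sd)
```

The spread of the 50 estimates around the fixed true values is a direct picture of parameter risk: even with the model form known, a single triangle pins down ELR, a and b only approximately.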
A Package in R for Actuarial Science – ASTIN Colloquium • actuar: an R package for Actuarial Science, by Vincent Goulet. The actuar project is a package of Actuarial Science functions for the R statistical system. The project was launched in 2005 and the package has been available on CRAN (Comprehensive R Archive Network) since February 2006. The current version of the package contains functions for use in the fields of risk theory, loss distributions and credibility theory. This talk will present the most recent developments and demonstrate how the package can be useful in teaching, research and practice. http://www.actuaries.org/ASTIN/Colloquia/Orlando/Papers/Goulet.pdf • Also, http://www.actuarialoutpost.com has an excellent discussion of R under the “Software and Technology” forum. My nomination for the best reference from that site: http://toolkit.pbwiki.com/RToolkit