Bridging the Academic–Practitioner Divide in Credit Risk Modeling Vadim Melnitchouk, Metropolitan State University, Saint Paul, MN, US
Agenda 1. Academic model selection by a practitioner and organizational issues 2. 'Optimal complexity model': stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity 3. Data access, collaboration & prototype development
Who is a practitioner? 1. Ph.D. in applied math, former academic, teaching 'Data Mining' part-time. 2. Ph.D. in physics, former academic. 3. M.S. in OR, former 'Fed' examiner. 4. M.S. in Econometrics.
Time to Default: Optimal complexity model • According to Bellotti & Crook (2007), survival (hazard) modeling is a competitive alternative to logistic regression for predicting default events. • The method has become a model of choice in recent publications, but its complexity makes the technique infeasible for practitioners. • It also has limitations. Bellotti (2010) believes that 'any credit risk model with macroeconomic variables can't be expected to capture the direct reason for default like a loss of job, negative equity or a sudden personal crisis such as sickness or divorce'.
Methodology • The goal of this paper is to present a more practical method that can also take unobserved obligor heterogeneity into account. • The stochastic parametric time-to-event method is well known in marketing (Hardie & Fader, 2001). • It was also applied by Brusilovskiy (2005) to predict the time of the first home purchase by immigrants. • As far as we know, the method has not been used in credit risk by academics or practitioners.
Assumptions & inputs 1. Time to default follows a Weibull distribution (Appendix). 2. Default propensity across obligors follows a Gamma distribution (to capture unobserved consumer heterogeneity). 3. Modeling is done at the vintage aggregate level to avoid so-called aggregation bias when unemployment is used. Inputs: 1. Monthly number of defaults. 2. Time-varying covariates: unemployment and the Home Price Index (HPI). Macroeconomic factors are incorporated into the hazard rate function.
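The Weibull/Gamma assumptions above yield a closed-form population survival curve, from which expected monthly default counts for a vintage follow directly. A minimal sketch (a Hardie & Fader-style Weibull-Gamma mixture; the parameter names `r`, `alpha`, `c` are illustrative, not the paper's notation, and covariates are omitted here for brevity):

```python
import numpy as np

def weibull_gamma_survival(t, r, alpha, c):
    # Individual lifetimes are Weibull with shape c; the Weibull scale
    # varies across obligors as Gamma(r, alpha), capturing unobserved
    # heterogeneity. Mixing gives the closed-form population survival
    # S(t) = (alpha / (alpha + t**c)) ** r.
    return (alpha / (alpha + t ** c)) ** r

def expected_monthly_defaults(n_loans, n_months, r, alpha, c):
    # Expected defaults in month m for a vintage of n_loans obligors:
    # E[defaults in month m] = n_loans * (S(m-1) - S(m)).
    t = np.arange(0, n_months + 1, dtype=float)
    s = weibull_gamma_survival(t, r, alpha, c)
    return n_loans * (s[:-1] - s[1:])
```

Fitting this aggregate curve to monthly default counts is what makes the method practical: no loan-level covariate matrix is required.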
Recent trends in mortgage default rate & data 1. Default rates spiked above historical trends in 2005, and more significantly in 2006 and 2007, beginning almost immediately after origination. 2. The average time to reach the maximum default rate decreased from 5-6 years (vintages 2001-2004) to 2-3 years (vintages 2005-2007). 3. LPS data on prime, first-lien, fixed-rate 30-year mortgages originated in 2006 were used to build the model (Schelkle, 2011).
Model training and out-of-time validation • The model training period for vintage 2006 was June 2006 - March 2009. • April 2009 - March 2010 was selected for 'out-of-time' validation because unemployment rose from 8.5% to 10.1% during this period. • The model was implemented in MS Excel (using Solver) and in SAS/IML. Maximum likelihood estimation was used to obtain values for the five parameters.
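The maximum likelihood fit described above can be sketched as follows. This is an assumed parameterization, not the authors' exact one: five parameters (log of the Weibull/Gamma quantities `r`, `alpha`, `c`, plus two coefficients for unemployment and HPI), with time-varying covariates entering the cumulative hazard piecewise and a Poisson likelihood for monthly default counts:

```python
import numpy as np
from scipy.optimize import minimize

def survival_path(params, x):
    # params: [log r, log alpha, log c, beta_unemployment, beta_hpi]
    # x: (T, 2) matrix of monthly macroeconomic covariate values.
    r, alpha, c = np.exp(params[:3])
    beta = params[3:]
    T = x.shape[0]
    t = np.arange(0, T + 1, dtype=float)
    # Covariates enter the transformed time piecewise:
    # A(t) = sum_{s<=t} exp(x_s . beta) * (s**c - (s-1)**c)
    incr = np.exp(x @ beta) * np.diff(t ** c)
    A = np.concatenate([[0.0], np.cumsum(incr)])
    return (alpha / (alpha + A)) ** r

def neg_log_lik(params, defaults, n_loans, x):
    S = survival_path(params, x)
    p = np.clip(S[:-1] - S[1:], 1e-12, None)  # per-month default probability
    mu = n_loans * p
    # Poisson likelihood for the monthly counts (a modeling assumption)
    return float(np.sum(mu - defaults * np.log(mu)))

# Usage: minimize(neg_log_lik, np.zeros(5), args=(defaults, n_loans, x),
#                 method="Nelder-Mead")
```

Nelder-Mead mirrors the derivative-free search Excel's Solver performs; SAS/IML offers equivalent nonlinear optimization routines.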
Forecasted vs. actual monthly number of defaults: Weibull/Gamma model for the 2006 mortgage origination vintage (LPS data).
Results & Discussion Forecast accuracy for the 'out-of-time' period is at an acceptable level (low forecast error and a conservative estimate for regulators). Issues with the one-segment model: • The time-varying covariates formula is taken from a marketing application and is not flexible enough for credit risk modeling (Appendix). • The impact of unemployment and HPI can be double-counted.
Next steps in collaboration with academics • Bayesian parameter estimation was applied in collaboration with Prof. Shemyakin (St. Thomas University, St. Paul, MN) and his students to improve numeric stability. • A two-segment latent class Weibull model (Appendix) was also used to estimate parameters of the consumer segment whose default hazard increases over time. • Unemployment and HPI were not included, to avoid double counting (the academics' preference).
Data access 1. It is very difficult to obtain loan-level data from financial firms for joint projects. 2. Aggregate-level delinquency and default data for mortgages, credit cards, installment loans, and commercial lending can be extracted from public websites. 3. But completely aggregated data, such as the Federal Reserve's (Appendix), must first be decomposed by vintage before vintage-based modeling can be applied.
Next search for an optimal complexity model: combined Markov chain and survival analysis
Conclusions • The stochastic parametric method with macroeconomic variables and unobserved consumer heterogeneity can be used by practitioners as an alternative to survival modeling. • The optimal complexity model can provide an incentive to bridge the academic-practitioner divide.
Latent class Weibull model with two segments Assumptions: • All obligors can be divided into two segments, each with its own fixed but unknown shape and scale parameters. • The large segment has a decreasing default hazard. • A relatively small consumer segment exists whose default hazard increases over time. The segment size (percentage) is a latent variable that must be estimated for each vintage.
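The two-segment assumptions above can be sketched as a simple Weibull mixture (a hypothetical parameterization; `pi` is the latent share of the increasing-hazard segment, and Weibull shape < 1 gives a decreasing hazard while shape > 1 gives an increasing one):

```python
import numpy as np

def latent_class_survival(t, pi, c_small, lam_small, c_large, lam_large):
    # Large segment (share 1 - pi): decreasing hazard, shape c_large < 1.
    # Small latent segment (share pi): increasing hazard, shape c_small > 1.
    s_large = np.exp(-lam_large * t ** c_large)
    s_small = np.exp(-lam_small * t ** c_small)
    return (1.0 - pi) * s_large + pi * s_small
```

In estimation, `pi` is fit per vintage alongside the two shape/scale pairs, so the model can attribute late-rising defaults to the small segment without macroeconomic covariates.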