Materials for Lecture 12. Chapter 7 – Study this closely Chapter 16 Sections 3.9.1-3.9.7 and 4.3 Lecture 12 Multivariate Empirical Dist.xls Lecture 12 Multivariate Normal Dist.xls. Multivariate Probability Distributions.
Materials for Lecture 12 • Chapter 7 – Study this closely • Chapter 16 Sections 3.9.1-3.9.7 and 4.3 • Lecture 12 Multivariate Empirical Dist.xls • Lecture 12 Multivariate Normal Dist.xls
Multivariate Probability Distributions • Definition: a Multivariate (MV) distribution is a distribution of two or more random variables that are correlated • In the MV case you have one distribution with two or more random variables that are related to each other • In the univariate case you have many distributions (one for each random variable)
Parameter Estimation for MV Dist. • Data were generated contemporaneously • Output observed each year or month, • Prices observed each year for related commodities • Corn and sorghum used interchangeably for animal feed • Steer and heifer prices related • Fed steer price and Feeder steer prices related • Supply and demand forces affect prices similarly, bear market or bull market; prices move together • Prices for tech stocks move together • Prices for an industry or sector’s stocks move together
Different MV Distributions • Multivariate Normal distribution – MVN • Multivariate Empirical – MVE • Multivariate Mixed where each variable is distributed differently, such as • X ~ Uniform • Y ~ Normal • Z ~ Empirical • R ~ Beta • S ~ Gamma
Simulating an MV Distribution as Independent
• If correlation is ignored when random variables are correlated, results are biased
• If Z = Ỹ1 + Ỹ2 or Z = Ỹ1 * Ỹ2 and the model is simulated without correlation:
  • If the true ρ1,2 > 0, the model will understate the risk for Z
  • If the true ρ1,2 < 0, the model will overstate the risk for Z
• If Z = Ỹ1 * Ỹ2, the mean of Z is biased as well
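A quick numerical check of the bias can help here. The sketch below is illustrative only: the means, standard deviations, and ρ = 0.8 are made-up values, and NumPy stands in for the spreadsheet simulation. It draws the same two variables once with correlation and once independently, then compares the risk of the sum and the mean of the product.

```python
import numpy as np

rng = np.random.default_rng(12)
n = 200_000

# Hypothetical parameters, chosen only for illustration
mean = np.array([10.0, 20.0])
sd = np.array([2.0, 5.0])
rho = 0.8

# Correlated draws: build the 2x2 covariance matrix from sd and rho
cov = np.array([[sd[0] ** 2, rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1] ** 2]])
y_corr = rng.multivariate_normal(mean, cov, size=n)

# Independent draws: same marginals, correlation ignored
y_ind = rng.normal(mean, sd, size=(n, 2))

sum_corr, sum_ind = y_corr.sum(axis=1), y_ind.sum(axis=1)
prod_corr, prod_ind = y_corr.prod(axis=1), y_ind.prod(axis=1)

# Risk of the sum: the independent model understates it when rho > 0
print("StdDev of sum  (correlated, independent):", sum_corr.std(), sum_ind.std())
# Mean of the product: shifted by the covariance term rho * sd1 * sd2
print("Mean of product (correlated, independent):", prod_corr.mean(), prod_ind.mean())
```

With these inputs the theoretical standard deviation of the sum is sqrt(45) with correlation and sqrt(29) without, and the mean of the product is 208 with correlation and 200 without, so the simulated values should land close to those figures.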
Parameters for a MVN Distribution
• Deterministic component
  • Ŷij: a vector of means or predicted values for period i for each of the j variables to simulate, for example: Ŷij = ĉ0 + ĉ1 X1 + ĉ2 X2
• Stochastic component
  • êij: a matrix of residuals from the predicted or mean values for each (j) of the M random variables, êij = Yij - Ŷij, and the StdDev of the residuals, σêj
• Multivariate component calculated from residuals
  • Covariance matrix (Σ) for all M random variables in the distribution: an MxM covariance matrix (in the general case use the correlation matrix)
  • Estimate the covariance (or correlation) matrix using residuals about the forecast (the deterministic component). For M = 4, showing only the upper triangle of each symmetric matrix:

\[
\Sigma =
\begin{bmatrix}
\sigma^2_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
 & \sigma^2_{22} & \sigma_{23} & \sigma_{24} \\
 & & \sigma^2_{33} & \sigma_{34} \\
 & & & \sigma^2_{44}
\end{bmatrix}
\quad \text{OR} \quad
P =
\begin{bmatrix}
1 & \rho_{12} & \rho_{13} & \rho_{14} \\
 & 1 & \rho_{23} & \rho_{24} \\
 & & 1 & \rho_{34} \\
 & & & 1
\end{bmatrix}
\]
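3 Variable MVN Distribution
• Deterministic component for three random variables
  • Ĉi = a + b1 Ci-1
  • Ŵi = a + b1 Ti + b2 Wi-1
  • Ŝi = a + b1 Ti
• Stochastic component
  • êCi = Ci - Ĉi
  • êWi = Wi - Ŵi
  • êSi = Si - Ŝi
• Multivariate component calculated from the residuals (upper triangle of the symmetric matrix shown):

\[
\Sigma =
\begin{bmatrix}
\sigma^2_{cc} & \sigma_{cw} & \sigma_{cs} \\
 & \sigma^2_{ww} & \sigma_{ws} \\
 & & \sigma^2_{ss}
\end{bmatrix}
\]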
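The three components can be sketched outside the spreadsheet as follows. This is a minimal illustration, not the lecture's workbook: the ten observations for C, W, and S are invented, `ols_residuals` is a hypothetical helper, and ordinary least squares via NumPy is assumed for the deterministic equations.

```python
import numpy as np

def ols_residuals(y, *regressors):
    """Regress y on an intercept plus the given regressors; return the residuals."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef

# Hypothetical 10-period history (made-up numbers, for illustration only)
T = np.arange(1, 11, dtype=float)
C = np.array([2.1, 2.3, 2.2, 2.6, 2.5, 2.9, 3.1, 3.0, 3.4, 3.3])
W = np.array([3.0, 3.4, 3.1, 3.8, 3.6, 4.1, 4.4, 4.0, 4.7, 4.9])
S = np.array([1.9, 2.0, 2.2, 2.1, 2.5, 2.4, 2.8, 2.7, 3.0, 3.1])

# The lagged models lose the first observation, so all three
# equations are fit on periods 2..10 to keep the residuals aligned.
e_C = ols_residuals(C[1:], C[:-1])          # C_i = a + b1 * C_{i-1}
e_W = ols_residuals(W[1:], T[1:], W[:-1])   # W_i = a + b1 * T_i + b2 * W_{i-1}
e_S = ols_residuals(S[1:], T[1:])           # S_i = a + b1 * T_i

# Multivariate component: covariance and correlation of the residuals
resid = np.column_stack([e_C, e_W, e_S])
Sigma = np.cov(resid, rowvar=False)
P = np.corrcoef(resid, rowvar=False)
print(np.round(Sigma, 4))
print(np.round(P, 3))
```

Note that `np.cov` divides by n - 1; whether the spreadsheet uses n or n - 1 is not stated in the slides.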
Simulating MVN in Simetar
• One Step procedure for a 4 variable MVN
  Highlight 4 cells if the distribution is for 4 variables, type
  =MVNORM( 4x1 Means Vector, 4x4 Covariance Matrix)
  =MVNORM( A1:A4 , B1:E4)
  and end with Control Shift Enter
  where: the 4 means or forecasted values are in column A rows 1-4, and the covariance matrix is in columns B-E, rows 1-4
• If you use the historical means, the MVN will validate perfectly, but it only forecasts (simulates) the future if the data are stationary.
• If you use forecasts rather than means, the validation test fails for the mean vector.
• Because the covariance matrix is unchanged, the simulated CV moves inversely to the means: it falls below the historical CV when the means are above history and rises above it when the means are below history
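For readers who want to see what a one-step MVN draw involves, here is a sketch using a Cholesky factor of the covariance matrix. It is an assumption that this mirrors what MVNORM does internally; the slides do not say. The function name `mvn_draws` and all parameter values are hypothetical. The example also illustrates the CV point: the forecast means are set 20% above the historical means while the covariance matrix is left unchanged.

```python
import numpy as np

def mvn_draws(means, cov, n_iter, rng):
    """Draw n_iter correlated normal vectors using cov = L @ L.T."""
    L = np.linalg.cholesky(cov)
    z = rng.standard_normal((n_iter, len(means)))  # independent N(0, 1)
    return means + z @ L.T

rng = np.random.default_rng(0)

# Hypothetical historical parameters for 4 variables
hist_means = np.array([100.0, 50.0, 20.0, 5.0])
sd = np.array([10.0, 5.0, 4.0, 1.0])
corr = np.array([[1.0, 0.5, 0.2, 0.1],
                 [0.5, 1.0, 0.3, 0.1],
                 [0.2, 0.3, 1.0, 0.4],
                 [0.1, 0.1, 0.4, 1.0]])
cov = corr * np.outer(sd, sd)          # sigma_ij = rho_ij * sd_i * sd_j

# Simulate about forecast means that are 20% above history
fcst_means = 1.2 * hist_means
draws = mvn_draws(fcst_means, cov, 50_000, rng)

hist_cv = sd / hist_means
sim_cv = draws.std(axis=0) / draws.mean(axis=0)
print("historical CV:", hist_cv)
print("simulated CV :", np.round(sim_cv, 4))   # lower, since the means rose
```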
Simulating MVN in Simetar
• Two Step procedure for a 4 variable MVN
  Highlight 4 cells if the distribution is for 4 variables, and type
  =CUSD( Location of Correlation Matrix)
  =CUSD( B1:E4) for a 4x4 correlation matrix in cells B1:E4
  and end with Control Shift Enter
  Next use the individual CUSDs to calculate the random values, using the Simetar NORM function:
  For Ỹ1 = NORM( Mean1 , σ1 , CUSD1 )
  For Ỹ2 = NORM( Mean2 , σ2 , CUSD2 )
  For Ỹ3 = NORM( Mean3 , σ3 , CUSD3 )
  For Ỹ4 = NORM( Mean4 , σ4 , CUSD4 )
• Use Two Step if you want more control of the process
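The two steps can be imitated in Python to show the logic. This is a sketch under stated assumptions, not Simetar's implementation: the hypothetical `cusd` function produces correlated uniform deviates by correlating standard normals with a Cholesky factor and passing them through the normal CDF, and the hypothetical `norm` function applies the inverse normal CDF to a uniform deviate. All parameter values are invented.

```python
import numpy as np
from statistics import NormalDist

_std = NormalDist()
phi = np.vectorize(_std.cdf)          # standard normal CDF
phi_inv = np.vectorize(_std.inv_cdf)  # standard normal inverse CDF

def cusd(corr, n_iter, rng):
    """Step 1: correlated uniform(0, 1) deviates, one column per variable."""
    L = np.linalg.cholesky(corr)
    z = rng.standard_normal((n_iter, corr.shape[0])) @ L.T  # correlated N(0, 1)
    return phi(z)

def norm(mean, sd, u):
    """Step 2: turn a uniform deviate into a Normal(mean, sd) value."""
    return mean + sd * phi_inv(u)

rng = np.random.default_rng(1)
corr = np.array([[1.0, 0.5, 0.2, 0.1],
                 [0.5, 1.0, 0.3, 0.1],
                 [0.2, 0.3, 1.0, 0.4],
                 [0.1, 0.1, 0.4, 1.0]])
means = np.array([100.0, 50.0, 20.0, 5.0])
sds = np.array([10.0, 5.0, 4.0, 1.0])

u = cusd(corr, 20_000, rng)   # 20,000 iterations x 4 variables
y = norm(means, sds, u)       # broadcasting applies each mean/sd to its column
print(np.round(np.corrcoef(y, rowvar=False), 3))
```

Keeping the uniform deviates as a separate step is what gives the extra control: the same deviates can be fed to any inverse distribution function, not only the normal.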
Example of MVN Distribution • Demonstrate MVN for a distribution with 3 variables • One Step procedure in row 63 • Means in row 55 and covariance matrix in B58:D60 • Validation test shows the random variables maintained the historical covariance
Review Steps for MVN • Develop parameters • Calculate averages (and standard deviations used for two step procedure) • Calculate Covariance matrix • Calculate Correlation matrix (Used for Two Step procedure and for validation of One Step procedure) • One Step MVN procedure is easier • Use Two Step MVN procedure for more control of the process • Validate simulated MVN values vs. historical series • If you use different means than in history, the validation test for means vector WILL fail
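The covariance and correlation matrices in these steps are tied together by the standard deviations, which is why the Two Step procedure needs the standard deviations alongside the correlation matrix. In standard notation (the symbol σ_i for the standard deviation of variable i is an assumption about notation, not taken from the slides):

```latex
\rho_{ij} = \frac{\sigma_{ij}}{\sigma_i \, \sigma_j},
\qquad
\sigma_{ij} = \rho_{ij} \, \sigma_i \, \sigma_j,
\qquad
\sigma_{ii} = \sigma_i^{2}
```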
Parameters for MV Empirical • Step I Deterministic component for three random variables • Ĉi = a + b1Ci-1 • Ŵi = a + b1Ti + b2 Wi-1 • Ŝi = a + b1Ti • Step II Stochastic component calculated from residuals • êCi = Ci – Ĉi • êWi = Wi – Ŵi • êSi = Si – Ŝi • Step III Calculate the stochastic empirical distribution’s parameters • SCi = Sorted (êCi / Ĉi) • SWi = Sorted (êWi / Ŵi) • SSi = Sorted (êSi / Ŝi) • Step IV Multivariate component is a correlation matrix calculated using unsorted residuals in Step II
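Steps II through IV can be sketched as below. The data are invented, the deterministic component is simply the column mean, and `mve_parameters` is a hypothetical helper. The cumulative probabilities F(Si) are an assumption: the slides do not say how they are assigned, so equally spaced midpoints (i - 0.5)/n are used here.

```python
import numpy as np

def mve_parameters(actual, predicted):
    """Return sorted fractional deviations, their probabilities, and the correlation matrix."""
    resid = actual - predicted                        # Step II: residuals
    sorted_dev = np.sort(resid / predicted, axis=0)   # Step III: each column sorted on its own
    n = actual.shape[0]
    cum_prob = (np.arange(1, n + 1) - 0.5) / n        # assumed F(S_i), not from the slides
    corr = np.corrcoef(resid, rowvar=False)           # Step IV: uses the UNSORTED residuals
    return sorted_dev, cum_prob, corr

# Hypothetical history: 8 observations of 3 variables (C, W, S)
actual = np.array([[2.1, 3.0, 1.9],
                   [2.4, 3.5, 2.0],
                   [2.2, 3.1, 2.3],
                   [2.7, 3.9, 2.1],
                   [2.5, 3.6, 2.6],
                   [3.0, 4.2, 2.4],
                   [2.8, 4.0, 2.9],
                   [3.2, 4.6, 2.7]])
# Simplest deterministic component: the mean of each column
predicted = np.tile(actual.mean(axis=0), (actual.shape[0], 1))

sorted_dev, cum_prob, corr = mve_parameters(actual, predicted)
print(np.round(sorted_dev, 3))
print(np.round(corr, 3))
```

Sorting destroys the pairing of observations across variables, which is why the correlation matrix has to come from the unsorted residuals.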
Simulating MVE in Simetar
• One Step procedure for a 4 variable MVE
  Highlight 4 cells if the distribution is for 4 variables, then type
  =MVEMP( Location Actual Data ,,,, Location Y-Hats, Option)
    Option = 0 use actual data
    Option = 1 use Percent deviations from Mean
    Option = 2 use Percent deviations from Trend
    Option = 3 use Differences from Mean
  End this function with Control Shift Enter
  =MVEMP(B5:D14 ,,,, G7:I6, 2)
  where the 10 observations for the 3 random variables are in rows 5-14 of columns B-D, simulated as percent deviations from trend
Two Step MVE
• Two Step procedure for a 4 variable MVE
  Highlight 4 cells if the distribution is for 4 variables, type
  =CUSD( Location of Correlation Matrix)
  =CUSD( A12:A15)
  and end with Control Shift Enter
  Next use the CUSDs to calculate the random values (Mean here could also be Ŷ)
  For Ỹ1 = Mean1 + Mean1 * Empirical(S1, F(Si) , CUSD1)
  For Ỹ2 = Mean2 + Mean2 * Empirical(S2, F(Si) , CUSD2)
  For Ỹ3 = Mean3 + Mean3 * Empirical(S3, F(Si) , CUSD3)
  For Ỹ4 = Mean4 + Mean4 * Empirical(S4, F(Si) , CUSD4)
• Use Two Step if you want more control of the process
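The second step, turning a uniform deviate into an empirical draw, can be illustrated with linear interpolation on the sorted deviations. This is a sketch only: `empirical` is a hypothetical stand-in for the spreadsheet function, the interpolation rule and the midpoint probabilities are assumptions, and the numbers are invented. Deviates outside the first and last probabilities are clamped to the smallest and largest observed deviation.

```python
import numpy as np

def empirical(sorted_dev, cum_prob, u):
    """Inverse transform: interpolate the sorted deviations at probability u."""
    return np.interp(u, cum_prob, sorted_dev)

# Hypothetical sorted fractional deviations S_i and assumed probabilities F(S_i)
sorted_dev = np.array([-0.20, -0.10, 0.00, 0.10, 0.20])
cum_prob = (np.arange(1, 6) - 0.5) / 5      # 0.1, 0.3, 0.5, 0.7, 0.9

mean = 100.0
u = np.array([0.5, 0.8, 0.95])              # stand-ins for correlated uniform deviates

# Y = Mean + Mean * Empirical(S, F(S), u)
y = mean + mean * empirical(sorted_dev, cum_prob, u)
print(y)
```

In a full model the `u` values would be the correlated uniform deviates from step one, so each variable keeps its own empirical shape while the set of variables keeps the historical correlation.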
If the Correlation Matrix Cannot Be Factored • When the matrix is over defined, use the “Always Calculate” option
If You Cannot Get CUSDs or CSNDs • When the matrix is over defined, you cannot calculate CUSDs or CSNDs • In that case, use the “Always Calculate” option
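Before simulating, it can be useful to check whether a correlation matrix can be factored at all. The sketch below assumes that "over defined" corresponds to a matrix that is not positive definite, which is one plausible reading and is not stated in the slides. It says nothing about what the “Always Calculate” option does internally. The function names and both matrices are hypothetical.

```python
import numpy as np

def can_factor(corr):
    """True if a Cholesky factor exists, i.e. the matrix is positive definite."""
    try:
        np.linalg.cholesky(np.asarray(corr, dtype=float))
        return True
    except np.linalg.LinAlgError:
        return False

def smallest_eigenvalue(corr):
    """A negative value signals an internally inconsistent correlation matrix."""
    return float(np.linalg.eigvalsh(np.asarray(corr, dtype=float)).min())

good = [[1.0, 0.5, 0.2],
        [0.5, 1.0, 0.3],
        [0.2, 0.3, 1.0]]

# Inconsistent: 1 and 2 move together, 2 and 3 move together,
# yet 1 and 3 are claimed to move strongly in opposite directions.
bad = [[1.0, 0.9, -0.9],
       [0.9, 1.0, 0.9],
       [-0.9, 0.9, 1.0]]

print(can_factor(good), smallest_eigenvalue(good))
print(can_factor(bad), smallest_eigenvalue(bad))
```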
MV Mixed Distributions • What if you need to simulate a MV distribution made up of variables that are not all Normal or all Empirical? For example: • X is ~ Normal • Y is ~ Beta • T is ~ Gamma • Z is ~ Empirical • Develop parameters for each variable • Estimate the correlation matrix for the random variables in the distribution
MV Mixed Distributions
• Simulate a vector of Correlated Uniform Standard Deviates using the =CUSD() function
  =CUSD( correlation matrix )
  is an array function, so highlight the number of cells that matches the number of variables in the distribution
• Use the CUSDi values in the appropriate Simetar functions for each random variable
  =NORM(Mean, Std Dev, CUSD1)
  =BETAINV(CUSD2, Alpha, Beta)
  =GAMMAINV(CUSD3, P1, P2)
  =Mean*(1+EMP(Si, F(Si), CUSD4))
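The same idea can be sketched in Python with three different marginals driven by one set of correlated uniform deviates. Several things here are assumptions made to keep the example dependency-free: the uniforms come from correlated normals passed through the normal CDF, the Gamma variable is restricted to shape 1 (an exponential, whose inverse CDF has a closed form), the Beta case is omitted because it needs a statistics library, and all parameter values are invented.

```python
import numpy as np
from statistics import NormalDist

_std = NormalDist()
phi = np.vectorize(_std.cdf)
phi_inv = np.vectorize(_std.inv_cdf)

rng = np.random.default_rng(21)
n_iter = 20_000

# Hypothetical correlation matrix for X (Normal), T (Gamma), Z (Empirical)
corr = np.array([[1.0, 0.7, -0.4],
                 [0.7, 1.0, -0.2],
                 [-0.4, -0.2, 1.0]])

# Correlated uniform deviates (stand-in for CUSD)
z = rng.standard_normal((n_iter, 3)) @ np.linalg.cholesky(corr).T
u = phi(z)

# Each variable uses its own inverse distribution function
x = 50.0 + 8.0 * phi_inv(u[:, 0])        # Normal(mean 50, sd 8)
t = -4.0 * np.log1p(-u[:, 1])            # Gamma(shape 1, scale 4), i.e. exponential
s_sorted = np.array([-0.30, -0.12, -0.05, 0.02, 0.10, 0.25])
f_s = (np.arange(1, 7) - 0.5) / 6        # assumed F(S_i)
zv = 120.0 * (1.0 + np.interp(u[:, 2], f_s, s_sorted))   # Mean * (1 + EMP(...))

sim_corr = np.corrcoef(np.column_stack([x, t, zv]), rowvar=False)
print(np.round(sim_corr, 3))
```

The simulated Pearson correlations will sit somewhat below the inputs in absolute value for the non-normal variables, because the inverse transforms are nonlinear; the signs and ordering are preserved.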
Validation of MV Distributions
• Simulate the model and specify the random variables as the KOVs, then test the simulated random values
• Perform the following tests
  • Use the Compare Two Series Tab in HoHi to:
    • Test means for the historical series or the forecasted means vs. the simulated means
    • Test means and covariance for historical series vs. simulated
  • Use the Check Correlation Tab to test the correlation matrix used as input for the MV model vs. the implied correlation in the simulated random variables
    • Null hypothesis (Ho) is: Simulated correlationij = Historical correlation coefficientij
    • Critical t statistic is 1.98 for 100 iterations; if the null hypothesis is not rejected, the calculated t statistics will be less than 1.98
• Use caution on means tests if your forecasted Ŷ is different from the historical Ῡ
Test Correlation for MV Distributions • Test simulated values for MVE and MVN distributions to ensure the historical correlation matrix is reproduced in simulation • The data series input is the simulated values for all random variables in the MV distribution • The correlation matrix input is the original correlation matrix used to simulate the MVE or MVN distribution
Validation Tests in Simetar • A Student t test is used to test whether each simulated correlation coefficient differs significantly from the historical correlation coefficient • You want the calculated t statistic to be less than the critical value • If the calculated t statistic is larger than the critical value, it is shown in bold
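The decision rule can be sketched as follows. The slides do not give the formula Simetar uses, so this example substitutes a common alternative, the Fisher z-transformation, as the test statistic; only the 1.98 critical value for 100 iterations comes from the lecture. The function names are hypothetical.

```python
import math

CRITICAL_VALUE = 1.98   # from the lecture, for 100 iterations

def corr_test_stat(r_sim, rho_hist, n_iter):
    """Fisher z statistic for Ho: simulated correlation = historical correlation.

    An assumed stand-in for the spreadsheet's t statistic, not its actual formula.
    """
    return abs(math.atanh(r_sim) - math.atanh(rho_hist)) * math.sqrt(n_iter - 3)

def reject_null(r_sim, rho_hist, n_iter, critical=CRITICAL_VALUE):
    """True when the statistic exceeds the critical value (the 'bold' case)."""
    return corr_test_stat(r_sim, rho_hist, n_iter) > critical

# Simulated 0.55 vs. historical 0.50: close enough, Ho is not rejected
print(corr_test_stat(0.55, 0.50, 100), reject_null(0.55, 0.50, 100))
# Simulated 0.90 vs. historical 0.50: the correlation was not reproduced
print(corr_test_stat(0.90, 0.50, 100), reject_null(0.90, 0.50, 100))
```

The statistic is only defined for off-diagonal coefficients strictly between -1 and 1, so the diagonal of the correlation matrix is not tested.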