Semivariance Significance in the S&P500

Semivariance Significance in the S&P500 Baishi Wu, 4/7/08

Outline • Motivation • Background Math • Data Information • Summary Statistics • Regressions • Appendix

Introduction • Want to examine predictive regressions for realized variance by using realized semi-variance as a regressor • Test significance of realized semi-variance and realized up-variance by correlation with daily open-close returns • Regressions are of the HAR-RV form from Corsi (2003) • Semi-variance from Barndorff-Nielsen, Kinnebrock, and Shephard (2008)

Equations • Realized Variance (RV) • Bipower Variation (BV)
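The formulas on this slide did not survive extraction. As a hedged reconstruction, the standard definitions of these two measures are given below. The notation is an assumption, not taken from the slides: r_{t,j} is the j-th five-minute log return on day t, and M is the number of intraday returns per day. Finite-sample correction factors, which some authors apply, are omitted.

```latex
RV_t = \sum_{j=1}^{M} r_{t,j}^{2}
\qquad
BV_t = \mu_1^{-2} \sum_{j=2}^{M} |r_{t,j}|\,|r_{t,j-1}|,
\qquad
\mu_1 = \sqrt{2/\pi}
```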

Equations • Realized Semivariance (RS) • Realized upVariance (upRV): upRV = RV - RS • Bipower Downward Variation (BPDV)
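A minimal sketch of how the measures on these slides could be computed from one day of intraday log returns. It assumes the usual definitions from Barndorff-Nielsen, Kinnebrock, and Shephard (2008): RS sums the squared negative returns, and BPDV = RS - BV/2. The function name `realized_measures` and the dictionary layout are hypothetical, and no finite-sample corrections are applied.

```python
import numpy as np

MU1 = np.sqrt(2.0 / np.pi)  # E|Z| for a standard normal Z


def realized_measures(r):
    """Return RV, BV, RS, upRV and BPDV for one day of intraday log returns.

    Assumed definitions (not shown in the slides):
      RV   = sum of squared returns
      BV   = mu_1^{-2} * sum of |r_j| * |r_{j-1}|
      RS   = sum of squared negative returns
      upRV = RV - RS          (as stated in the slides)
      BPDV = RS - 0.5 * BV
    """
    r = np.asarray(r, dtype=float)
    rv = np.sum(r ** 2)
    bv = MU1 ** -2 * np.sum(np.abs(r[1:]) * np.abs(r[:-1]))
    rs = np.sum(r[r < 0] ** 2)
    up_rv = rv - rs
    bpdv = rs - 0.5 * bv
    return {"RV": rv, "BV": bv, "RS": rs, "upRV": up_rv, "BPDV": bpdv}
```

Applying the function to each trading day's vector of five-minute returns would give one daily series per measure.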

Equations • Daily open-to-close returns (r_i): r_i = log(price_close) - log(price_open) • The daily open-to-close returns are correlated with RV, upRV, and RS to determine whether market volatility depends on direction • The returns are also squared to determine whether the size of the open-to-close price move correlates with the magnitude of realized volatility
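The correlation exercise described above can be sketched as follows. This is an illustration only: the function names and the dictionary-of-series layout are hypothetical, and plain Pearson correlation is assumed because the slides do not say which correlation measure was used.

```python
import numpy as np


def open_close_return(price_open, price_close):
    """Daily open-to-close log return: log(close) - log(open)."""
    return np.log(price_close) - np.log(price_open)


def correlation_table(returns, measures):
    """Correlate daily returns and squared daily returns with each measure.

    `measures` maps a name (e.g. "RV", "RS", "upRV") to a daily series that
    is aligned with `returns`. Each entry of the result is a pair:
    (corr with returns, corr with squared returns).
    """
    returns = np.asarray(returns, dtype=float)
    table = {}
    for name, series in measures.items():
        series = np.asarray(series, dtype=float)
        with_level = np.corrcoef(returns, series)[0, 1]
        with_square = np.corrcoef(returns ** 2, series)[0, 1]
        table[name] = (with_level, with_square)
    return table
```

The first element of each pair speaks to direction (sign of the return) and the second to magnitude.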

Equations • Heterogeneous Auto-Regressive Realized Volatility (HAR-RV) from Corsi (2003): • Multi-period normalized realized variation is defined as the average of one-period measures. The model uses approximately daily, weekly, and monthly periods.
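The model equation itself is missing from the extracted slide. A commonly used form of the HAR-RV regression is written out below as a hedged reconstruction. The horizons of 5 and 22 trading days for the weekly and monthly terms are an assumption, as are the coefficient names.

```latex
RV_{t,t+h} = \frac{1}{h}\left( RV_{t+1} + RV_{t+2} + \dots + RV_{t+h} \right)

RV_{t+1} = \beta_0 + \beta_D\, RV_{t} + \beta_W\, RV_{t-5,t} + \beta_M\, RV_{t-22,t} + \varepsilon_{t+1}
```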

Equations • Extensions of HAR-RV • Built regressions that use lagged RS and lagged upRV to predict RV, giving HAR-RS and HAR-upRV • Compared these to the original HAR-RV model • Built a combined regression, HAR-RS-upRV, that uses both lagged RS and lagged upRV to predict RV
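A minimal sketch of how these HAR-type regressions could be estimated by ordinary least squares. The helper names `trailing_mean` and `har_fit` are hypothetical, the horizons (1, 5, 22) are an assumption about what "daily, weekly, monthly" means, and the sketch returns only coefficients and R2 (no standard errors or significance tests).

```python
import numpy as np


def trailing_mean(x, h):
    """Mean of the last h observations up to and including day t (NaN before)."""
    x = np.asarray(x, dtype=float)
    out = np.full(x.shape, np.nan)
    c = np.cumsum(np.insert(x, 0, 0.0))
    out[h - 1:] = (c[h:] - c[:-h]) / h
    return out


def har_fit(target, regressors, horizons=(1, 5, 22)):
    """OLS of target[t+1] on a constant and trailing means of each regressor.

    `regressors` is a list of daily series. Columns are ordered regressor by
    regressor, and within each regressor by horizon. Returns (beta, r2).
    """
    target = np.asarray(target, dtype=float)
    cols = [trailing_mean(x, h) for x in regressors for h in horizons]
    X = np.column_stack([np.ones(len(target))] + cols)[:-1]  # known at day t
    y = target[1:]                                           # realized at t+1
    keep = ~np.isnan(X).any(axis=1)                          # drop warm-up rows
    X, y = X[keep], y[keep]
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2
```

Under these assumptions, `har_fit(rv, [rv])` would correspond to HAR-RV, `har_fit(rv, [rs])` and `har_fit(rv, [up_rv])` to RV predicted from lagged RS or lagged upRV, and `har_fit(rv, [rs, up_rv])` to the combined HAR-RS-upRV regression.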

Equations • Tri-Power Quarticity • Relative Jump
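The two formulas on this slide were lost in extraction. The definitions below are the usual ones for these quantities and are offered as a hedged reconstruction. As before, r_{t,j} denotes the j-th intraday return on day t and M the number of intraday returns (both assumed notation), and finite-sample correction factors are omitted.

```latex
TP_t = M\, \mu_{4/3}^{-3} \sum_{j=3}^{M}
       |r_{t,j-2}|^{4/3}\, |r_{t,j-1}|^{4/3}\, |r_{t,j}|^{4/3},
\qquad
\mu_{4/3} = 2^{2/3}\, \frac{\Gamma(7/6)}{\Gamma(1/2)}

RJ_t = \frac{RV_t - BV_t}{RV_t}
```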

Equations • Max Version z-Statistic (Tri-Power) • The max version of the Tri-Power z-statistic is used here to detect jumps in the data • Take one-sided significance at the 0.999 level, i.e. z = 3.09
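A sketch of the jump test under stated assumptions. The slide's formula is missing, so the code uses the ratio-type max-adjusted statistic in the form commonly attributed to Huang and Tauchen (2005): z = RJ / sqrt((mu_1^-4 + 2 mu_1^-2 - 5) / M * max(1, TP / BV^2)). The function name is hypothetical and no finite-sample corrections are applied.

```python
import math
from statistics import NormalDist

import numpy as np

MU1 = math.sqrt(2.0 / math.pi)                                      # E|Z|
MU43 = 2.0 ** (2.0 / 3.0) * math.gamma(7.0 / 6.0) / math.gamma(0.5)  # E|Z|^(4/3)

# One-sided critical value at the 0.999 level (about 3.09, as on the slide).
CRITICAL_Z = NormalDist().inv_cdf(0.999)


def jump_z_statistic(r):
    """Max-adjusted tri-power z-statistic for one day of intraday returns."""
    a = np.abs(np.asarray(r, dtype=float))
    m = a.size
    rv = np.sum(a ** 2)
    bv = MU1 ** -2 * np.sum(a[1:] * a[:-1])
    tp = m * MU43 ** -3 * np.sum((a[2:] * a[1:-1] * a[:-2]) ** (4.0 / 3.0))
    rj = (rv - bv) / rv  # relative jump
    scale = (MU1 ** -4 + 2.0 * MU1 ** -2 - 5.0) / m * max(1.0, tp / bv ** 2)
    return rj / math.sqrt(scale)
```

A day would be flagged as containing a jump when `jump_z_statistic(r) > CRITICAL_Z`.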

Data Preparation • Collected at five-minute intervals • S&P 500 Data Set • 1985 to late 2007 (5751 Observations) – Included large spike in RV/BV, fewer sampling days in this data set • 1990 to late 2007 (4487 Observations) – Largely influenced by upward trend of S&P 500 in the 1990s • 2000 to late 2007 (1959 Observations) – Possibly examines a period of the greatest market volatility • Chose different sample lengths in order to test the consistency of the correlations and regressions

Data Preparation S&P500, 1985-2007

Summary Statistics • Numbers are similar except for daily returns

Correlation • Semi-variance correlates the highest with squared daily returns; is this indicative of higher volatility in a down market? • Realized up-variance is not higher than Realized Variance S&P500, 1985-2007

Correlation • This segment has the lowest correlation of semi-variance with realized up-variance • Semi-variance does not have a higher correlation with squared daily returns than either RV or upRV S&P500, 1990-2007

Correlation • Only segment where daily squared returns are positively correlated (though slightly) with daily returns S&P500, 2000-2007

Correlation Summary • Anticipate positive correlations of realized up-variance with daily returns, negative correlations of semi-variance • Both semi-variance statistics ought to have a higher correlation with the daily returns than the realized variance (found untrue in 1985-2007 dataset) • Expected to see a higher correlation with semi-variance and daily squared returns in order to indicate higher volatility in a down market (not the case) • Bipower Downward Variation is a combination of Bipower Variation and Semivariance; correlates very negatively with daily returns (why?)

HAR-RV • R2 = 0.1088 • Monthly regressor not significant, very low correlation S&P500, 1985-2007

HAR-RV • R2 = 0.3648 • Daily lag not significant S&P500, 1990-2007

HAR-RV • R2 = 0.4972 • Daily, monthly not significant S&P500, 2000-2007

HAR-RS • R2 = 0.2110 • Weekly lag very insignificant, monthly lag also insignificant S&P500, 1985-2007

HAR-RS • R2 = 0.3158 • Daily lag not significant S&P500, 1990-2007

HAR-RS • R2 = 0.4221 • Daily, monthly not significant S&P500, 2000-2007

HAR-upRV • R2 = 0.0616 • Very low R2 value, monthly regressor very insignificant, daily insignificant S&P500, 1985-2007

HAR-upRV • R2 = 0.2600 • Daily insignificant S&P500, 1990-2007

HAR-upRV • R2 = 0.3985 • Daily, weekly (slightly) insignificant S&P500, 2000-2007

Normal Regressions Summary • The low R2 in the 1985-2007 S&P 500 dataset seems largely caused by the realized up-variance. This is also the only dataset in which the R2 of the RV regression (0.1088) is lower than the average of its parts (0.2110 for RS and 0.0616 for upRV, average 0.1363) • Observe similar levels of correlation and similar significant variables regardless of the specific statistic (RV, RS, or upRV) • Generally there do not seem to be any noticeable trends that are specific to any individual test statistic; the significance of the regressors seems to be a function of the data set and not of the test statistic

RV Regressed with RS • R2 = 0.2191 • Only monthly lags not significant S&P500, 1985-2007

RV Regressed with RS • R2 = 0.3950 • Daily lags are not as significant S&P500, 1990-2007

RV Regressed with RS • R2 = 0.5134 • Monthly lags not significant S&P500, 2000-2007

RV Regressed with upRV • R2 = 0.0565 • Very low correlation, monthly lags not significant S&P500, 1985-2007

RV Regressed with upRV • R2 = 0.3034 • Daily lags not significant S&P500, 1990-2007

RV Regressed with upRV • R2 = 0.4398 • Daily and weekly (to a lesser extent) are not significant S&P500, 2000-2007

RV Regressed with RS and upRV • R2 = 0.5910 • Both monthly lags in general not significant S&P500, 1985-2007

RV Regressed with RS and upRV • R2 = 0.3957 • Semi-variance statistics much more significant than realized up-variance statistics S&P500, 1990-2007

RV Regressed with RS and upRV • R2 = 0.5194 • Semi-variance statistics much more significant than realized up-variance statistics S&P500, 2000-2007

Combined Regressors Summary • The highest R2 values were found for the HAR-RS-upRV regression, which uses both the semi-variances and the realized up-variances (could this be the zeros?) • In general, semi-variance is a better predictor of RV than realized up-variance and even RV itself; does this indicate that the down market predicts overall volatility best? (or am I over-interpreting the value of R2?) • For the combined regression, the semi-variance coefficients were found to be much more significant

Summary Statistics

Appendix • Graphs for 1990-2007 S&P 500 Data Set • Realized Variance and Bipower Variation • Z-Scores with 0.001 Significance • Semivariance, Realized upVariance • Bipower Variation and Bipower Downward Variation • Autocorrelation Plots for 1990-2007 • Realized Variance • Semivariance • Realized upVariance

Realized and Bipower Variance S&P500, 1990-2007

Z-Scores S&P500, 1990-2007

Semivariance, Realized upVariance S&P500, 1990-2007

Bipower Downward Variation S&P500, 1990-2007

Correlogram – Realized Variance S&P500, 1990-2007

Correlogram – Realized Semivariance S&P500, 1990-2007

Correlogram – Realized upVariance S&P500, 1990-2007
