Which Model Should Be Appropriate to Describe the Dynamics of Trading Duration in Chinese Stock Market? Wanbo Lu luwb@swufe.edu.cn School of Statistics, Southwestern University of Finance and Economics, 55 Guanghua Village Road, Chengdu, 610074, China
Outline 1. Introduction 2. Parametric ACD models and estimation 3. Nonparametric ACD model and estimation 4. Monte Carlo simulations 5. Empirical results 6. Conclusion and expansion www.swingtum.com/institute/IWIF
1. Introduction
• High-frequency data usually refers to data sampled by the hour, minute or second, while ultra-high-frequency data refers to tick-by-tick data from the trading process. These data are very useful for studying market microstructure and trading mechanisms, in both theory and practice.
• Among their many properties, the most important is that trades occur irregularly, at random times.
• An econometric framework for modeling inter-temporally correlated event arrival times was provided by Engle and Russell (1997), who introduced the autoregressive conditional duration (ACD) model.
• Engle and Russell (1998) and Engle (2000) complemented and expanded the ACD model. Following the contributions of Engle and Russell (1997, 1998) and Engle (2000), a series of extended models has been proposed, including the ACD-GARCH model, the Box-Cox ACD model, the Log-ACD model, the threshold ACD (TACD) model, the Markov switching ACD model, the asymmetric ACD model and so on.
• Engle and Russell (1998) based the ACD model on the exponential and Weibull distributions, Grammig and Maurer (2000) discussed the ACD model based on the Burr distribution, and Lunde (1999) used an ACD model based on the generalized Gamma distribution.
• Bauwens, Giot, Grammig and Veredas (2004) used density forecast evaluation techniques to compare the predictive performance of econometric specifications developed for modeling duration processes in intraday financial markets.
• Cosma and Galli (2006) used an existing algorithm to describe the dynamics of the process nonparametrically, in terms of its lagged realizations and of a latent variable, its conditional mean.
• For reviews of the ACD model, readers can see Lu Wanbo (2005) and Maria (2006).
We Consider
• As the review above shows, research on transaction durations has gradually moved from linear to nonlinear specifications of the ACD model. The model is estimated by nonlinear maximum likelihood, based on an assumption about the error distribution. To find the best model, it is therefore necessary to test the random point process to determine which error distribution is suitable.
• Few researchers have compared parametric and nonparametric ACD models. Moreover, which model is appropriate for describing the dynamics of trading duration in the Chinese stock market remains relatively unexplored terrain for econometricians, so researchers lack guidance in choosing the right ACD model to explain true trading behavior. This article discusses the problem and gives a way to test which model is appropriate.
2. Parametric ACD models and estimation
• Let $t_i$ be the $i$-th transaction time and $X_i = t_i - t_{i-1}$ be the duration between trades.
• We focus on the adjusted time duration $x_i = X_i / \phi(t_i)$, where $\phi(t_i)$ is a deterministic function consisting of the cyclical component of $X_i$.
• The basic ACD(p,q) model is defined as $x_i = \psi_i \varepsilon_i$. (1)
• $\psi_i$ is the conditional expected duration, which depends linearly on q past durations and p past expected durations: $\psi_i = \omega + \sum_{j=1}^{q} \alpha_j x_{i-j} + \sum_{j=1}^{p} \beta_j \psi_{i-j}$. (2)
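The linear recursion for the conditional expected duration can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `conditional_durations`, the symbols `omega`, `alphas`, `betas`, and the choice to initialise pre-sample values with the sample mean are all assumptions made for the example.

```python
def conditional_durations(x, omega, alphas, betas):
    """Return the conditional expected durations psi_i of a linear ACD(p, q).

    x      : list of (diurnally adjusted) durations
    alphas : coefficients on the q most recent durations
    betas  : coefficients on the p most recent expected durations
    """
    # Assumption: pre-sample durations and psi values are set to the sample mean.
    mean_x = sum(x) / len(x)
    psi = []
    for i in range(len(x)):
        value = omega
        for j, a in enumerate(alphas, start=1):
            value += a * (x[i - j] if i - j >= 0 else mean_x)
        for j, b in enumerate(betas, start=1):
            value += b * (psi[i - j] if i - j >= 0 else mean_x)
        psi.append(value)
    return psi
```

For an ACD(1,1), `alphas` and `betas` each hold a single coefficient, for example `conditional_durations(x, 0.1, [0.2], [0.7])` with purely illustrative values.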
• When the distribution of $\varepsilon_i$ is exponential, the resulting model is called an EACD(p,q) model. Similarly, if $\varepsilon_i$ follows a Weibull, Gamma, or Burr distribution, the model is a WACD(p,q), GACD(p,q), or BACD(p,q) model, respectively.
• Since the distribution is known, the parameters can be estimated by maximizing the conditional likelihood function.
• A crucial assumption for obtaining consistent QML estimates of the ACD model is that the conditional expectation of the duration is correctly specified.
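As an illustration of the likelihood being maximized, the sketch below evaluates the conditional log-likelihood of an EACD(1,1) model, where a unit-mean exponential error gives the per-observation term -(log psi_i + x_i / psi_i). The function name `eacd11_loglik`, the parameter names, and the start-up value (the sample mean) are assumptions for the example.

```python
import math


def eacd11_loglik(params, x):
    """Conditional log-likelihood of an EACD(1,1) model for durations x."""
    omega, alpha, beta = params
    # Reject parameter values outside the positivity and stationarity region.
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        return -math.inf
    psi = sum(x) / len(x)  # assumed start-up value for the recursion
    loglik = 0.0
    for xi in x:
        # Exponential density with mean psi: (1/psi) * exp(-xi/psi).
        loglik -= math.log(psi) + xi / psi
        psi = omega + alpha * xi + beta * psi
    return loglik
```

In practice the negative of this function would be handed to a numerical optimizer; the same function also serves as a quasi-likelihood when the true error distribution is not exponential.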
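3. Nonparametric ACD model and estimation
• Let $\{x_i\}$ be a nonnegative stationary process adapted to the filtration $\{\mathcal{F}_i\}$, with $\mathcal{F}_i = \sigma(x_s;\, s \le i)$. The nonparametric ACD(p,q) model is defined as $x_i = \psi_i \varepsilon_i$, $\psi_i = f(x_{i-1}, \dots, x_{i-q}, \psi_{i-1}, \dots, \psi_{i-p})$. (4)
• In order to estimate $f$, we rewrite (4) in the additive form $x_i = f(x_{i-1}, \dots, x_{i-q}, \psi_{i-1}, \dots, \psi_{i-p}) + \eta_i$, $\eta_i = \psi_i (\varepsilon_i - 1)$. (5)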
• $f$ could be estimated by regressing $x_i$ on $(x_{i-1}, \dots, x_{i-q}, \psi_{i-1}, \dots, \psi_{i-p})$ nonparametrically.
• In practice, the $\psi_i$'s are unobserved variables.
• To overcome this problem, Cosma and Galli (2006) adapted the recursive algorithm suggested by Bühlmann and McNeil (2002); we also adopt this algorithm.
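The idea of the recursive algorithm can be sketched as follows for the (1,1) case: start from a rough parametric guess of the conditional durations, regress each duration on the previous duration and the previous conditional duration, take the fitted values as the new conditional durations, and repeat. This is a simplified sketch in the spirit of that algorithm, not a reproduction of it. The Nadaraya-Watson smoother, the Gaussian product kernel, the rule-of-thumb bandwidth, the starting coefficients, and all function names are assumptions made for illustration.

```python
import math


def nw_fitted_values(targets, features, bandwidths):
    """Nadaraya-Watson fitted values, evaluated at the sample points."""
    fitted = []
    for u in features:
        num = den = 0.0
        for y, v in zip(targets, features):
            dist2 = sum(((a - b) / h) ** 2 for a, b, h in zip(u, v, bandwidths))
            w = math.exp(-0.5 * dist2)  # Gaussian product kernel weight
            num += w * y
            den += w
        fitted.append(num / den)
    return fitted


def rule_of_thumb_bandwidth(values):
    """Bandwidth std * n^(-1/6), a rate suited to two regressors (assumed)."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return max(std * n ** (-1.0 / 6.0), 1e-8)  # guard against a zero bandwidth


def nonparametric_acd11(x, n_iter=5, start=(0.1, 0.1, 0.8)):
    """Iteratively estimate psi_i = f(x_{i-1}, psi_{i-1}) without a parametric f."""
    omega, alpha, beta = start
    # Step 0: initial psi from a rough linear ACD(1,1) recursion (assumed values).
    psi = [sum(x) / len(x)]
    for i in range(1, len(x)):
        psi.append(omega + alpha * x[i - 1] + beta * psi[i - 1])
    for _ in range(n_iter):
        # Regress x_i on (x_{i-1}, psi_{i-1}) using the current psi estimates.
        features = list(zip(x[:-1], psi[:-1]))
        bandwidths = [rule_of_thumb_bandwidth(col) for col in zip(*features)]
        fitted = nw_fitted_values(x[1:], features, bandwidths)
        psi = [psi[0]] + fitted  # fitted values become the updated psi
    return psi
```

Each pass costs on the order of n squared kernel evaluations, so this plain-Python version is only practical for short series. A production version would use a vectorised smoother and a data-driven bandwidth.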
4. Monte Carlo simulations
• We consider the ACD(1,1) model that was simulated and studied by Tsay (2002): $x_i = \psi_i \varepsilon_i$, $\psi_i = \omega + \alpha x_{i-1} + \beta \psi_{i-1}$, (6)
• where $\varepsilon_i$ is assumed to follow one of four distributions: exponential, Weibull, Gamma or Burr.
• We ran two different kinds of simulation experiments.
• The first experiment (Experiment Ⅰ) estimates the correctly specified data-generating process (DGP) with MLE and with the nonparametric method, and compares the estimation results.
• The second experiment (Experiment Ⅱ) estimates a misspecified DGP with the same two methods and compares the estimation results: (E→W, E→G), (W→E, W→G), (G→E, G→W).
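A data-generating process of this kind can be simulated with the Python standard library alone. The sketch below draws unit-mean errors from an exponential, Weibull or Gamma distribution and feeds them through an ACD(1,1) recursion. The Burr case is omitted, and the function names, burn-in length and any parameter values passed in are assumptions for illustration, not the settings of Table 1.

```python
import math
import random


def unit_mean_innovation(rng, dist, shape=1.0):
    """Draw one nonnegative error with mean 1 from the chosen distribution."""
    if dist == "exponential":
        return rng.expovariate(1.0)
    if dist == "weibull":
        # Weibull mean is scale * Gamma(1 + 1/shape); rescale so the mean is 1.
        scale = 1.0 / math.gamma(1.0 + 1.0 / shape)
        return rng.weibullvariate(scale, shape)
    if dist == "gamma":
        # Gamma mean is shape * scale; use scale = 1/shape so the mean is 1.
        return rng.gammavariate(shape, 1.0 / shape)
    raise ValueError("unsupported distribution: " + dist)


def simulate_acd11(n, omega, alpha, beta, dist="exponential", shape=1.0,
                   seed=0, burn_in=500):
    """Simulate n durations and their conditional means from an ACD(1,1)."""
    rng = random.Random(seed)
    psi = omega / (1.0 - alpha - beta)  # start at the unconditional mean
    x_prev = psi
    durations, psis = [], []
    for i in range(n + burn_in):
        psi = omega + alpha * x_prev + beta * psi
        x_prev = psi * unit_mean_innovation(rng, dist, shape)
        if i >= burn_in:  # discard the burn-in draws
            durations.append(x_prev)
            psis.append(psi)
    return durations, psis
```

Fitting an exponential likelihood to a series generated with `dist="weibull"` would mimic the kind of misspecified estimation studied in the second experiment.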
• Table 1 gives the DGPs under different known parameters.
Table 1. DGP in Experiment Ⅰ
• We used MLE and the nonparametric method to estimate the simulated series. Table 2 gives the MLE and standard error (in brackets) for each parameter in Experiment Ⅰ.
• MLE clearly gives good results when the true distribution of $\varepsilon_i$ is known.
• We also estimated the simulated series by the nonparametric method.
Table 2. MLE in Experiment Ⅰ
• The performance of the parametric and nonparametric estimators was compared by computing three widely used measures of estimation error: MSE, MPE and MAE.
• Table 3 gives the estimation errors (in brackets) of MLE and of the nonparametric estimation. The MSE, MPE and MAE of the nonparametric estimation are smaller than those of MLE for the WACD(1,1) and BACD(1,1) models, while the MSE and MPE of the nonparametric estimation are larger than those of MLE for the EACD(1,1) and GACD(1,1) models.
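The three error measures can be computed as in the sketch below. It assumes that the errors compare estimated conditional durations with the true ones from the simulation, and it takes MPE to mean the average absolute error relative to the true value. Both are assumptions, since the slides do not spell out the definitions, and the function and argument names are illustrative.

```python
def estimation_errors(psi_true, psi_hat):
    """Return MSE, MAE and MPE between true and estimated conditional durations."""
    n = len(psi_true)
    diffs = [h - t for t, h in zip(psi_true, psi_hat)]
    mse = sum(d * d for d in diffs) / n
    mae = sum(abs(d) for d in diffs) / n
    # Assumed definition: mean absolute error as a fraction of the true value.
    mpe = sum(abs(d) / t for d, t in zip(diffs, psi_true)) / n
    return {"MSE": mse, "MAE": mae, "MPE": mpe}
```

With simulated data the true conditional durations are known, so the same function can score both the MLE-based and the nonparametric estimates on equal terms.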
Table 3. Estimation errors in Experiment Ⅰ
• Therefore, if we do not know the exact distribution of $\varepsilon_i$, we can first use nonparametric estimation to obtain the residuals, and then test which residual distribution is appropriate before estimating the parameters by MLE.
• Table 4 gives the MLE and standard error for each parameter in Experiment Ⅱ.
• The estimated parameter values are far from the true parameters of the DGP, especially for the shape and scale parameters.
• Table 5 gives the MSE, MPE and MAE in Experiment Ⅱ.
• Compared with Table 3, the estimation errors of MLE all increase, and in Table 5 the estimation errors of MLE are generally larger than those of the nonparametric estimation.
Table 4. MLE in Experiment Ⅱ
Table 5. Estimation errors in Experiment Ⅱ
Table 3. Estimation errors in Experiment Ⅰ
Summary
• The MLE results will be wrong if the distribution of $\varepsilon_i$ is misspecified, while the nonparametric estimation remains consistent with the true DGP.
• Misspecifying the distribution of the random error has serious consequences for MLE, while nonparametric estimation avoids this misspecification.
• If we do not know the exact distribution, we can first use nonparametric estimation to obtain the residuals, and then test which residual distribution is appropriate before estimating the parameters by MLE.
5. Empirical results
• The irregularly spaced intraday data were provided by Shenzhen Tinysoft Technology Development Co., Ltd. The data set contains several variables, including bid-ask quotes and the associated arrival times. To study the typical duration pattern of prices, only trading price duration and volume are analyzed.
• The subsample consists of one-minute trading data for Pufa Bank over 149 trading days (7 months), 35760 observations in total, covering the period from June 1, 2004 to December 31, 2004.
• Because we are interested in price movements of a certain magnitude, we use the average volume as the threshold for deciding whether a trading event has occurred.
• So, if the volume in a given minute is larger than the average volume of that trading day, a trading event is deemed to have occurred. There are 10924 observations after this filtering.
• Cubic splines are used to smooth the time-of-day function that captures the intraday seasonal effects.
• Table 6 gives the basic descriptive statistics after removing the deterministic part of the intraday durations. The shortest trading duration is 0.2076 minutes, the longest is 26.4190 minutes, and the average is 0.9974 minutes.
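The volume-threshold filter can be sketched as follows. The input layout (tuples of day, minute index and volume), the function name, and the decision not to measure durations across the overnight gap are assumptions for the example. The cubic-spline diurnal adjustment is a separate step and is not shown here.

```python
from collections import defaultdict


def volume_filtered_durations(bars):
    """Durations (in minutes) between bars whose volume exceeds the daily mean.

    bars : iterable of (day, minute_index, volume) tuples, one per minute bar
    """
    by_day = defaultdict(list)
    for day, minute, volume in bars:
        by_day[day].append((minute, volume))
    durations = []
    for day in sorted(by_day):
        rows = sorted(by_day[day])
        avg_volume = sum(v for _, v in rows) / len(rows)
        # A trading event is a minute whose volume exceeds that day's average.
        event_minutes = [m for m, v in rows if v > avg_volume]
        # Assumption: durations are measured within a day, not across days.
        durations.extend(b - a for a, b in zip(event_minutes, event_minutes[1:]))
    return durations
```

The raw durations returned here would then be divided by the fitted time-of-day function to obtain the adjusted durations that enter the ACD models.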
Table 6. Basic descriptive statistics after removing the deterministic part of intraday durations
• We detect strong serial correlation in the durations.
• The sample ACF values are all significantly positive at the 5% level, and the sample PACF values are all significantly positive at the 5% level except for 12 lags. The Ljung–Box statistic with 15 lags for the durations is 4391.459, with a P-value far below 0.05; the test statistic follows a $\chi^2(15)$ distribution, whose 5% critical value is about 25. Moreover, the sample ACF values decay slowly, displaying strong autocorrelation and long memory.
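For reference, the Ljung-Box statistic is Q = n(n+2) times the sum over k = 1..m of rho_k^2 / (n-k), where rho_k is the lag-k sample autocorrelation. A minimal sketch in plain Python follows; the function name `ljung_box` and the constant name are illustrative.

```python
def ljung_box(x, lags=15):
    """Ljung-Box Q statistic for the series x using the first `lags` lags."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    q = 0.0
    for k in range(1, lags + 1):
        # Lag-k sample autocorrelation.
        rho_k = sum(dev[i] * dev[i - k] for i in range(k, n)) / denom
        q += rho_k ** 2 / (n - k)
    return n * (n + 2) * q


# 5% critical value of the chi-square distribution with 15 degrees of freedom.
CHI2_15_CRITICAL_5PCT = 24.996
```

A value of Q above the critical value rejects the hypothesis of no serial correlation up to the chosen lag. When the test is applied to model residuals, some authors reduce the degrees of freedom by the number of estimated dynamic parameters.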
• Under different assumptions about the distribution of $\varepsilon_i$, we estimate the EACD(1,1), WACD(1,1), GACD(1,1) and BACD(1,1) models. The results are in Table 7.
• All of the estimates are statistically significant at the 5% level. The sum of $\alpha$ and $\beta$ (the coefficients on the lagged duration and the lagged expected duration) is 0.9879, 0.9963, 0.9336 and 0.8282 respectively, which indicates high persistence, but the process is still ergodic since $\alpha + \beta < 1$.
• We compute the LB statistics with 15 lags for these models.
• For the EACD(1,1) model, the LB statistic is 9.8633 (P-value=0.557). For the WACD(1,1) model, the LB statistic is 9.5442 (P-value=0.504). For the GACD(1,1) model, the LB statistic is 8.4097 (P-value=0.325). For the BACD(1,1) model, the LB statistic is 35.7445 (P-value=0.000).
• These results indicate that the first three models explain the temporal dependence in the transaction durations, and that the EACD(1,1) model is the best.
• Using the nonparametric estimation algorithm, we obtain the estimate of the nonparametric ACD(1,1) model.
• We also compute the Ljung-Box statistic with 15 lags for this model. The LB statistic is 4.6356 (P-value=0.767). The filtering effect of the nonparametric ACD(1,1) model is better than that of the EACD(1,1) model.
• Figure 1 gives the nonparametric kernel density estimate of the residuals in the parametric ACD models.
• Figure 2 gives the nonparametric kernel density estimate of the residuals in the nonparametric ACD model.
• We can consider selecting the EACD(1,1) model or the GACD(1,1) model.
• For simplicity, the exponential distribution can be used as a reasonable and appropriate assumption.
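The graphical comparison behind these figures can be sketched by estimating the residual density with a kernel smoother and setting it against a candidate density such as the unit exponential. The Gaussian kernel, the rule-of-thumb bandwidth, and the function names below are assumptions for illustration, since the slides do not state which kernel or bandwidth was used.

```python
import math


def gaussian_kde(sample, grid, bandwidth=None):
    """Gaussian kernel density estimate of `sample`, evaluated on `grid`."""
    n = len(sample)
    if bandwidth is None:
        # Rule-of-thumb bandwidth 1.06 * std * n^(-1/5) (an assumed default).
        mean = sum(sample) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
        bandwidth = 1.06 * std * n ** (-0.2)
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((g - s) / bandwidth) ** 2) for s in sample) / norm
        for g in grid
    ]


def exponential_pdf(grid):
    """Density of the unit-mean exponential distribution on `grid`."""
    return [math.exp(-g) if g >= 0 else 0.0 for g in grid]
```

Plotting `gaussian_kde(residuals, grid)` against `exponential_pdf(grid)` gives a visual check of the exponential assumption. Because residuals are nonnegative, a Gaussian kernel understates the density near zero, so the comparison is most reliable away from the boundary.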
Table 7. MLE of ACD models for the Pufa Bank data set
Figure 1. Nonparametric kernel density estimation of residual in parametric ACD model
Figure 2. Nonparametric kernel density estimation of residual in nonparametric ACD model
6. Conclusion and expansion
• We introduce several parametric ACD models and the nonparametric ACD model, together with their estimation methods.
• We compare the fitting ability of the parametric and nonparametric ACD models by Monte Carlo simulations.
• We also compare these models under different duration distributions, and give a graphical method, based on nonparametric density estimation, for choosing the appropriate distribution in the ACD model.
• 1) Based on the simulation results, the fitting performance of the parametric and nonparametric estimators is almost the same in Experiment Ⅰ, while the nonparametric estimator fits better than the parametric estimators in Experiment Ⅱ. So misspecifying the distribution of the random error has serious consequences for MLE, while nonparametric estimation avoids this misspecification. Before the exact functional form and distribution are known, nonparametric estimation can help us select an appropriate distribution.
• 2) Based on the empirical results, the sample ACF values of the transaction duration data of Pufa Bank decay slowly, displaying strong autocorrelation and long memory. The EACD(1,1) and GACD(1,1) models can explain the temporal dependence in the transaction durations. For simplicity, the exponential distribution can be used as a reasonable and appropriate assumption.
Further research
• An initial exploration of how to choose the better ACD model for the Chinese stock market
• More empirical results
• Dynamics of duration, volume and return
• Semi-parametric method
• SSM+nonparametric
Thank you for your attention!