http://www.fsv.cuni.cz. Charles University. Founded 1348. Prague. COMPSTAT 2004. 23.-27.8. 2004. ROBUSTIFYING INSTRUMENTAL VARIABLES.
Jan Ámos Víšek
http://samba.fsv.cuni.cz/~visek/compstat
Institute of Economic Studies, Faculty of Social Sciences, Charles University, Prague
Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic
Topic of presentation
● Recalling the “classical” Instrumental Variables and their robust version
● Why another robust version of the Instrumental Variables?
● Recalling definition of the Least Weighted Squares
● Proposing an instrumental version of the Least Weighted Squares
● Conditions for their consistency and asymptotic normality
● An algorithm for their evaluation
● A heretic question at the end
Classical regression (by the Ordinary Least Squares)
Model [formula], with [formula] i.i.d. r.v.’s.
Notice that the true value of the vector of regression coefficients is [formula].
If [formula], the Ordinary Least Squares are not consistent, hence ....
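The formulas of this slide are not preserved in the transcript. A minimal sketch in standard textbook notation follows; every symbol in it is an assumption made for illustration and not necessarily the author's own.

```latex
% Assumed notation (not from the slide): Y_i response, X_i explanatory variables,
% e_i disturbance, \beta^0 true coefficients, X and Y the stacked data.
\begin{align*}
Y_i &= X_i^{\top}\beta^0 + e_i, \qquad i = 1, \dots, n, \\
\hat\beta^{(OLS,n)} &= (X^{\top}X)^{-1}X^{\top}Y
  = \beta^0 + (X^{\top}X)^{-1}X^{\top}e .
\end{align*}
```

Under this notation, the second term does not vanish in probability when $\mathsf{E}[X_1 e_1] \neq 0$, which is the usual reason for the inconsistency of the Ordinary Least Squares.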
The Instrumental Variables
[formula] as “close” as possible to [formula], BUT [formula].
The Instrumental Variables are consistent.
Notice that the Instrumental Variables are the solution of the normal equations [formula].
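The consistency claim can be illustrated numerically. The sketch below assumes the textbook form of the estimator, i.e. the solution of the normal equations Z'(Y - Xb) = 0; the function names `ols`, `iv` and `simulate`, as well as the simulated data design, are hypothetical and chosen only for this illustration.

```python
import numpy as np


def ols(X, y):
    """Ordinary Least Squares: solve X'(y - X b) = 0."""
    return np.linalg.solve(X.T @ X, X.T @ y)


def iv(X, Z, y):
    """Instrumental Variables (assumed textbook form): solve Z'(y - X b) = 0."""
    return np.linalg.solve(Z.T @ X, Z.T @ y)


def simulate(n=20000, beta=2.0, seed=0):
    """Hypothetical design: x is correlated with the disturbance e through u,
    while the instrument z is correlated with x but not with e."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    u = rng.normal(size=n)            # common shock, source of endogeneity
    x = z + u + rng.normal(size=n)
    e = u + rng.normal(size=n)
    y = beta * x + e
    X, Z = x[:, None], z[:, None]
    return ols(X, y)[0], iv(X, Z, y)[0]


if __name__ == "__main__":
    b_ols, b_iv = simulate()
    print("OLS estimate:", b_ols)
    print("IV estimate: ", b_iv)
```

In this design the OLS estimate stays away from the true coefficient however large n is, whereas the IV estimate approaches it.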
The Instrumental Variables continued
As [formula], the Instrumental Variables are vulnerable to the influential observations.
What about M-version of the Instrumental Variables, i.e. [formula] or [formula]?
Víšek, J.Á. (1998): Robust instruments. Proc. Robust'98 (ed. J. Antoch & G. Dohnal), Union of Czechoslovak Mathematicians and Physicists, 195-224.
Víšek, J.Á. (2000): Robust instrumental variables and specification test. Proc. PRASTAN 2000, ISBN 80-227-1486-0, 133-164.
The Instrumental Variables continued
Since M-estimators are not scale- and regression-equivariant, for discussion see
Bickel, P.J. (1975): One-step Huber estimates in the linear model. JASA 70, 428-433.
Jurečková, J., P.K. Sen (1984): On adaptive scale-equivariant M-estimators in linear models. Statistics and Decisions, vol. 2 (1984), Suppl. Issue No. 1.
the M-version of the Instrumental Variables is not scale- and regression-equivariant, either!
The Instrumental Variables continued
There are basically two possibilities:
● Studentization of residuals by an estimator of scale which has to be scale-equivariant and regression-invariant. Not very easy to evaluate. See again
Jurečková, J., P.K. Sen (1984): On adaptive scale-equivariant M-estimators in linear models. Statistics and Decisions, vol. 2 (1984), Suppl. Issue No. 1.
Víšek, J.Á. (1998): Robust estimation of regression model. Bulletin of the Czech Econometric Society, Vol. 6, No 9/1999, 57-79.
● To start with a robust, scale- and regression-equivariant estimator. Much easier to carry out.
Let’s employ the Least Weighted Squares .....
The Least Weighted Squares
[formula]
Weight function [formula]: non-increasing, absolutely continuous.
Víšek, J.Á. (2000): Regression with high breakdown point. ROBUST 2000, 324-356, ISBN 80-7015-792-5. If interested, ask me to send it by e-mail.
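The defining formula is not preserved in the transcript. A sketch of how the Least Weighted Squares estimator is commonly written follows; the notation is an assumption for illustration and may differ from the slide.

```latex
% Assumed notation: r_i(\beta) = Y_i - X_i^{\top}\beta is the i-th residual,
% r^2_{(1)}(\beta) \le \dots \le r^2_{(n)}(\beta) are the ordered squared residuals,
% w : [0,1] \to [0,1] is the non-increasing, absolutely continuous weight function.
\begin{equation*}
\hat\beta^{(LWS,n,w)}
  = \operatorname*{arg\,min}_{\beta \in \mathbb{R}^p}
    \sum_{i=1}^{n} w\!\left(\frac{i-1}{n}\right) r^2_{(i)}(\beta)
\end{equation*}
```

In this form the weights are attached to the order statistics of the squared residuals, not to fixed observations, so the smallest squared residuals receive the largest weights. A zero-one weight function would give the Least Trimmed Squares as a special case.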
Why the Least Weighted Squares?
High breakdown point (assuming deletion of some observations) may sometimes be self-destructive!!
Let us agree, for a while, that the majority of data determines the “true” model. Then a small change of even one observation can cause a large change of the estimate.
What is the problem? The method relies too much on selected “true” points!
Hence, it may be preferable to reject observations “smoothly”. Moreover, ...
The Least Weighted Squares .... General discussion
Requirements on the estimator of regression coefficients
Naturally inherited from the classical statistics
● Consistency
● Asymptotic normality
● Controllable level of efficiency
● Scale- and regression-equivariance
Hampel’s paradigm of robust estimation
● Controllable gross-error sensitivity
● Controllable local shift sensitivity
● Controllable breakdown point
● Possibly finite rejection point
Hampel, F.R., E.M. Ronchetti, P.J. Rousseeuw, W.A. Stahel (1986): Robust Statistics - The Approach Based on Influence Functions. New York: J. Wiley & Sons.
The Least Weighted Squares .... continued General discussion
Requirements inevitable for meaningful, competent and reliable application
● Existence of an implementation of the algorithm with acceptable complexity and tested reliability of evaluation. Extremely important, hence discussed in detail below.
● An efficient and acceptable heuristics. Evidently geometric, similar to the Least Squares.
● Available diagnostics, sensitivity studies and accompanying procedures. In progress, something already available.
Víšek, J.Á. (2000): A new paradigm of point estimation. Proc. of Data Analysis 2000/II, Modern Statistical Methods - Modeling, Regression, Classification and Data Mining, ISBN 80-238-6590-0, 195-230.
The Least Weighted Squares ... Already available
Both in the framework of random carriers
Mašíček, L. (2003): Diagnostika a sensitivita robustního odhadu. (Diagnostics and sensitivity of robust estimators, in Czech.) Dissertation at the Faculty of Mathematics, Charles University.
Mašíček, L. (2003): Consistency of the least weighted squares estimator. To appear in Kybernetika.
as well as for deterministic ones
Plát, P. (2003): Nejmenší vážené čtverce. (The Least Weighted Squares, in Czech.) Diploma thesis at the Faculty of Nuclear and Physical Engineering, the Czech Technical University, Prague.
we have consistency, asymptotic normality and Bahadur representation of the Least Weighted Squares.
There are also some optimality results
Mašíček, L. (2003): Optimality of the least weighted squares estimator. To appear in the Proceedings of ICORS'2003.
The instrumental version of the Least Weighted Squares
Recalling [formula], let [formula] denote the ranks of the squared residuals.
Hence define [formula].
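The formulas of this slide are not preserved in the transcript. One way to write the construction it describes, with all notation assumed for illustration, is to start from the normal equations of the Least Weighted Squares and replace the explanatory variables in them by the instruments.

```latex
% Assumed notation: \pi(\beta, i) is the rank of the i-th squared residual
% r_i^2(\beta) = (Y_i - X_i^{\top}\beta)^2 among all n squared residuals,
% Z_i is the vector of instruments for the i-th observation.
\begin{align*}
\text{LWS normal equations:} \quad
  & \sum_{i=1}^{n} w\!\left(\frac{\pi(\beta,i)-1}{n}\right)
    X_i \left(Y_i - X_i^{\top}\beta\right) = 0, \\
\text{instrumental version:} \quad
  & \sum_{i=1}^{n} w\!\left(\frac{\pi(\beta,i)-1}{n}\right)
    Z_i \left(Y_i - X_i^{\top}\beta\right) = 0 .
\end{align*}
```

Under this reading the instrumental estimator is defined as a solution of the second system, in the same way as the classical Instrumental Variables solve the normal equations with instruments in place of the regressors.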
The instrumental version ...
Notice that [formula] can be written as [formula].
It is nearly equivalent to [formula], which can be interpreted as an empirical counterpart of [formula].
Conclusion: The instrumental version of the Least Weighted Squares can be interpreted as a Weighted GMM estimation, see
Víšek, J.Á. (2004): Weighted GMM estimation. Submitted to ROBUST 2004.
The instrumental version of the Least Weighted Squares ... Assumptions
[formula] i.i.d. r.v.’s with absolutely continuous d.f. [formula]
[formula] bounded
[formula] compact support
[formula] absolutely continuous, non-increasing
[formula] bounded from below
[formula] independent
[formula] positive definite
The instrumental version of the Least Weighted Squares ... Assumptions continued
[formula] for all [formula]
[formula] only for [formula]
Requirements inevitable for meaningful and competent application: an example
Existence of an implementation of the algorithm with acceptable complexity and tested reliability of evaluation
Hettmansperger, T.P., S.J. Sheather (1992): A Cautionary Note on the Method of Least Median Squares. The American Statistician 46, 79-83.
Engine knock data, treated by the Least Median of Squares
Number of observations: 16
Response variable: number of knocks of an engine
Explanatory variables:
- the timing of sparks
- air / fuel ratio
- intake temperature
- exhaust temperature
A small change (7.2%) of one value in the data caused a large change of the estimates. The results were due to the bad algorithm they used. They are on the next page.
Requirements inevitable for meaningful and competent application, continued
Existence of an implementation of the algorithm with ....
Engine knock data - results by Hettmansperger and Sheather
[table; surviving row label: Minimized squared residual]
A new algorithm, based on the simplex method, was nearly immediately available, although published a bit later.
Boček, P., P. Lachout (1995): Linear programming approach to LMS-estimation. Mem. vol. Comput. Statist. & Data Analysis 19 (1995), 129-134.
It indicates that the reliability of the algorithm and its implementation is crucial.
Requirements inevitable for meaningful and competent application: another example
An efficient and acceptable heuristics (?)
In 1989 Martin et al. studied estimators minimizing their maximal bias
Martin, R.D., V.J. Yohai, R.H. Zamar (1989): Min-max bias robust regression. Ann. Statist. 17, 1608-1630.
- the maximum was taken over some set of underlying d.f.’s and the minimum over possible estimators,
- it seems a quite acceptable heuristics, unfortunately it does not work,
- for an example of data for which the min-max estimator failed see
Víšek, J.Á. (2000): On the diversity of estimates. CSDA 34 (2000), 67-89.
- the problem is that the method implicitly takes the maximum over an “unexpected” set of d.f.’s.
But papers like
Hansen, L.P. (1982): Large sample properties of generalized method of moments estimators. Econometrica 50, no. 4, 1029-1054.
hint that, in the case of sufficient “demand for data-processing”, we may “cope” without any heuristics.
The Least Weighted Squares ...
There is also an algorithm for evaluating the LEAST WEIGHTED SQUARES. It is a modification of the algorithm for the LEAST TRIMMED SQUARES, which was described and tested in:
Víšek, J.Á. (1996): On high breakdown point estimation. Computational Statistics (1996) 11:137-146.
Víšek, J.Á. (2000): On the diversity of estimates. CSDA 34 (2000), 67-89.
Čížek, P., J.Á. Víšek (2000): The least trimmed squares. User Guide of Explore, Humboldt University.
(Of course, the algorithm for LTS is available in the package EXPLORE.)
The algorithm for the instrumental version of the Least Weighted Squares is a straightforward, slight generalization of the algorithm for the Least Weighted Squares.
The Least Weighted Squares - algorithm
Put [formula].
A: Select randomly p + 1 observations and find the regression plane through them.
Evaluate the squared residuals for all observations, order these squared residuals from the smallest to the largest, multiply them by the weights [formula] and evaluate the sum of these products.
Is this sum of weighted squared residuals smaller than the sum from the previous step?
Yes: Order the observations in the same order as the squared residuals and apply the classical weighted least squares to them with weights [formula], i.e. [formula], and so find a new regression plane. Then evaluate the squared residuals again. (This step will be modified for ILWS.)
No: Go to B.
The Least Weighted Squares - algorithm, continued
B: A stopping rule. Have we already found 20 identical models (an arbitrary reasonable number) or have we exhausted an a priori given number of repetitions?
Yes: End of evaluation.
No: Return to A.
In the cases when we were able to pass through all n! orderings of the observations (fewer than 18 observations), i.e. when we were able to find the LEAST WEIGHTED SQUARES estimator precisely, the algorithm returned the same value.
The algorithm is available in MATLAB.
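The steps A and B above can be sketched in a few lines. This is not the author's MATLAB code but a minimal Python illustration under stated assumptions: the design matrix `X` already contains the intercept column (so the random start uses as many points as there are coefficients), the weight function is a hypothetical linear taper, and all function names and default settings are invented for the example.

```python
import numpy as np


def tapered_weights(n, h=0.5, g=0.75):
    """Hypothetical non-increasing weights w((i-1)/n): 1 up to h, linear to 0 at g."""
    u = np.arange(n) / n
    return np.clip((g - u) / (g - h), 0.0, 1.0)


def lws_objective(beta, X, y, w):
    """Sum of the ordered squared residuals multiplied by the weights."""
    r2 = (y - X @ beta) ** 2
    return float(np.sum(w * np.sort(r2)))


def weighted_ls_step(beta, X, y, w):
    """Give each observation the weight of its residual rank, then run WLS."""
    r2 = (y - X @ beta) ** 2
    ranks = np.argsort(np.argsort(r2))   # 0 = smallest squared residual
    XtW = X.T * w[ranks]
    return np.linalg.solve(XtW @ X, XtW @ y)


def lws_fit(X, y, w, n_starts=500, n_same=20, max_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_beta, best_val, same = None, np.inf, 0
    for _start in range(n_starts):
        # Step A: exact fit through randomly selected observations.
        idx = rng.choice(n, size=p, replace=False)
        try:
            beta = np.linalg.solve(X[idx], y[idx])
            val = lws_objective(beta, X, y, w)
            # Repeat the weighted least squares step while the sum decreases.
            for _it in range(max_iter):
                new_beta = weighted_ls_step(beta, X, y, w)
                new_val = lws_objective(new_beta, X, y, w)
                if not new_val < val:
                    break
                beta, val = new_beta, new_val
        except np.linalg.LinAlgError:
            continue                      # degenerate start, draw again
        # Step B: stopping rule based on repeated hits of the best model.
        if best_beta is not None and np.allclose(beta, best_beta):
            same += 1
        elif val < best_val:
            best_beta, best_val, same = beta, val, 1
        if same >= n_same:
            break
    return best_beta, best_val
```

A possible usage, on simulated data with a fifth of the responses shifted upwards:

```python
rng = np.random.default_rng(1)
x = rng.normal(size=100)
X = np.column_stack([np.ones(100), x])
y = 1.0 + 2.0 * x + 0.1 * rng.normal(size=100)
y[:20] += 20.0                            # gross outliers
beta, value = lws_fit(X, y, tapered_weights(100))
```

Each weighted least squares step cannot increase the weighted sum as long as the weights are non-increasing, which is why the inner loop may stop at the first step that brings no improvement.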
The instrumental version of the Least Weighted Squares - algorithm
The only modification of the previous algorithm:
Instead of employing [formula], we utilize [formula].
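The two formulas are not preserved in the transcript. A plausible reading, assumed here, is that the weighted least squares step (X'WX)^{-1} X'WY is replaced by its instrumental counterpart (Z'WX)^{-1} Z'WY, where W is the diagonal matrix of rank-based weights. The function names below are hypothetical.

```python
import numpy as np


def weighted_ls_step(X, y, wi):
    """Classical weighted least squares: solve X' W (y - X b) = 0."""
    XtW = X.T * wi
    return np.linalg.solve(XtW @ X, XtW @ y)


def weighted_iv_step(X, Z, y, wi):
    """Assumed instrumental counterpart: solve Z' W (y - X b) = 0."""
    ZtW = Z.T * wi
    return np.linalg.solve(ZtW @ X, ZtW @ y)
```

With `Z = X` the second function reduces to the first one, so under this reading the rest of the algorithm (random starts, ordering of squared residuals, stopping rule) can stay unchanged.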
A heretic question ...
[formula] as “close” as possible to [formula], BUT [formula]