ASYMPTOTIC PROPERTIES OF LASSO-TYPE ESTIMATORS Merve Yasemin Tekbudak April 24th, 2014
What is the LASSO? • The LASSO (Least Absolute Shrinkage and Selection Operator) is a shrinkage and variable-selection method proposed by R. Tibshirani in 1996 that penalizes the absolute size of the regression coefficients. • The LASSO minimizes the residual sum of squares (RSS) subject to the constraint that the sum of the absolute values of the coefficients is less than a constant.
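The constrained problem described in the bullet above can be written out as follows (a standard formulation, consistent with Tibshirani, 1996):

```latex
\hat{\beta}^{\text{lasso}}
  = \operatorname*{arg\,min}_{\beta}\;
    \sum_{i=1}^{n} \Bigl( y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j \Bigr)^{2}
  \quad \text{subject to} \quad
    \sum_{j=1}^{p} \lvert \beta_j \rvert \le t ,
```

or, equivalently, in the penalized (Lagrangian) form with tuning parameter $\lambda \ge 0$:

```latex
\hat{\beta}^{\text{lasso}}
  = \operatorname*{arg\,min}_{\beta}\;
    \Bigl\{ \operatorname{RSS}(\beta) + \lambda \sum_{j=1}^{p} \lvert \beta_j \rvert \Bigr\}.
```

Each value of the bound $t$ corresponds to some value of $\lambda$; smaller $t$ (larger $\lambda$) shrinks more coefficients exactly to zero.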
Drawbacks of Least Squares Estimates • Prediction accuracy • Low bias but large variance • How to improve? By shrinking: sacrifice a little bias to reduce the variance of the predicted values. • Interpretation • With a large number of predictors, we often would like to determine a smaller subset that exhibits the strongest effects.
Drawbacks of Subset Selection and Ridge Regression • Subset selection provides interpretable models, BUT it can be extremely variable. • Ridge regression shrinks coefficients and is more stable, BUT it does not set any coefficient exactly to 0 and hence does not give an easily interpretable model.
Standard Errors Since the lasso estimate is a non-linear, non-differentiable function of the response values, even for a fixed value of t, it is difficult to obtain an accurate estimate of its standard error. To get an approximate closed-form estimate, we need to rewrite the penalty in a ridge-like (quadratic) form.
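The trick behind this approximation is well known from Tibshirani (1996); the sketch below reconstructs it, though the notation may differ slightly from the slide's source. The absolute-value penalty is rewritten as a weighted quadratic penalty:

```latex
\sum_{j} \lvert \beta_j \rvert
  = \sum_{j} \frac{\beta_j^{2}}{\lvert \beta_j \rvert},
```

so the lasso solution can be approximated by a ridge-type estimator

```latex
\tilde{\beta} = \bigl( X^{\top}X + \lambda W^{-} \bigr)^{-1} X^{\top} y,
\qquad
W = \operatorname{diag}\bigl( \lvert \tilde{\beta}_j \rvert \bigr),
```

where $W^{-}$ denotes a generalized inverse of $W$. The approximate covariance matrix of the estimator then takes the sandwich form

```latex
\widehat{\operatorname{Cov}}(\tilde{\beta})
  = \bigl( X^{\top}X + \lambda W^{-} \bigr)^{-1}
    X^{\top}X
    \bigl( X^{\top}X + \lambda W^{-} \bigr)^{-1} \hat{\sigma}^{2}.
```

Note that this approximation assigns a standard error of zero to coefficients estimated as exactly zero, which is one reason accurate lasso standard errors are hard to obtain.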
Estimation of the parameter t • The tuning parameter t is chosen to minimize an estimate of the mean-squared error (MSE), typically via cross-validation over a grid of values of t.
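The formula that followed on the original slide was lost in extraction; a plausible reconstruction of the standard definitions (as used in Tibshirani, 1996, where $\eta(x)$ is the true mean function and $\hat{\eta}(x)$ its estimate) is:

```latex
\text{ME} = E\bigl\{ \hat{\eta}(x) - \eta(x) \bigr\}^{2},
\qquad
\text{PE} = E\bigl\{ y - \hat{\eta}(x) \bigr\}^{2} = \sigma^{2} + \text{ME},
```

so minimizing an estimate of the prediction error PE over t is equivalent to minimizing the mean-squared error ME, up to the irreducible noise variance $\sigma^{2}$.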
LASSO in SAS and R • SAS: PROC GLMSELECT DATA=prostate PLOTS=ALL; MODEL lpsa = pgg45 gleason lcp svi lbph age lweight lcavol / SELECTION=LASSO(STOP=NONE) STATS=SBC; RUN; • R: library(glmnet) fit <- glmnet(X, y)
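For readers working outside SAS and R, a comparable fit can be sketched in Python; this example is an addition (not from the original slides) and assumes scikit-learn is available. It traces the lasso coefficient path over a decreasing grid of penalties, illustrating the variable-selection behavior: at a strong enough penalty every coefficient is shrunk exactly to zero.

```python
# Lasso coefficient path in Python via scikit-learn (illustrative sketch;
# the original slides show only SAS PROC GLMSELECT and R's glmnet).
import numpy as np
from sklearn.linear_model import lasso_path

rng = np.random.default_rng(0)
n, p = 100, 8
X = rng.standard_normal((n, p))
# True model: only the first two predictors have nonzero coefficients.
beta = np.array([3.0, -2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta + rng.standard_normal(n)

# Compute lasso solutions over an automatically chosen, decreasing
# grid of penalty values (alphas are returned largest first).
alphas, coefs, _ = lasso_path(X, y)

# At the largest penalty, all coefficients are exactly zero;
# as the penalty shrinks, variables enter the model one by one.
print("nonzero coefs at strongest penalty:", np.count_nonzero(coefs[:, 0]))
print("nonzero coefs at weakest penalty:  ", np.count_nonzero(coefs[:, -1]))
```

This mirrors what `SELECTION=LASSO(STOP=NONE)` produces in PROC GLMSELECT and what `glmnet(X, y)` produces in R: a full sequence of fits along the regularization path rather than a single model.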
For Further Information • Small-sample behavior of Bridge estimators • Bootstrapping • Asymptotics for nearly singular designs See Knight and Fu (2000).
References
Chen, X., Wang, Z. J. & McKeown, M. J. (2010). Asymptotic analysis of robust LASSOs in the presence of noise with large variance. IEEE Transactions on Information Theory, 56(10), 5131-5149.
Hastie, T., Tibshirani, R. & Friedman, J. (2009). The elements of statistical learning: data mining, inference and prediction. New York, NY: Springer.
Huang, J., Horowitz, J. L. & Ma, S. (2008). Asymptotic properties of Bridge estimators in sparse high-dimensional regression models. The Annals of Statistics, 36(2), 587-613.
Knight, K. & Fu, W. (2000). Asymptotics for lasso-type estimators. The Annals of Statistics, 28(5), 1356-1378.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society, Series B, 58(1), 267-288.
Tibshirani, R. (2011). Regression shrinkage and selection via the lasso: a retrospective. Journal of the Royal Statistical Society, Series B, 73(3), 273-282.