The Principal Components Regression Method David C. Garen, Ph.D. Hydrologist USDA Natural Resources Conservation Service National Water and Climate Center Portland, Oregon
The General Linear Regression Model

Y = b0 + b1X1 + b2X2 + . . . + bnXn

where:
Y = dependent variable
Xi = independent variables
bi = regression coefficients
n = number of independent variables
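A minimal sketch of fitting this model by ordinary least squares with NumPy; the data and variable names below are illustrative and are not from the presentation's example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_obs, n_vars = 30, 3

# Illustrative data: columns of X are the independent variables X1..Xn,
# Y is the dependent variable generated from known coefficients plus noise.
X = rng.normal(size=(n_obs, n_vars))
Y = 4.0 + X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.3, size=n_obs)

# Augment X with a column of ones for the intercept b0, then solve for the b's.
A = np.column_stack([np.ones(n_obs), X])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
print("b0..bn:", b)
```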
The Problem If the X’s are intercorrelated, they contain redundant information, and the b’s cannot be meaningfully estimated. However, we don’t want to throw out most of the X’s; we prefer to retain them for robustness.
The Solution Possibilities: 1) Pre-combine X’s into composite index(es), e.g., Z-score method 2) Principal components regression These are similar in concept but differ in the mathematics.
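For orientation, here is a hedged sketch of the composite-index idea behind option 1: standardize each X and combine the standardized values into a single index before regressing Y on it. Weighting each standardized X by its correlation with Y is one plausible choice for illustration; the operational Z-score method may define the weights differently.

```python
import numpy as np

def zscore_index_regression(X, Y):
    """Regress Y on a single composite index built from standardized X's.

    The weights here are the correlations of each X with Y -- an assumption
    made for illustration, not necessarily the exact Z-score formulation.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardized (Z-score) X's
    w = np.array([np.corrcoef(Z[:, j], Y)[0, 1] for j in range(Z.shape[1])])
    index = Z @ (w / np.abs(w).sum())                  # single composite index
    A = np.column_stack([np.ones(len(Y)), index])
    b, *_ = np.linalg.lstsq(A, Y, rcond=None)          # Y = b0 + b1 * index
    return b, index
```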
Principal Components Analysis Principal components regression is just like standard regression except the independent variables are principal components rather than the original X variables. Principal components are linear combinations of the X’s.
Principal Components Analysis Each principal component is a weighted sum of all the X’s:

PC1 = e11 X1 + e12 X2 + . . . + e1n Xn
PC2 = e21 X1 + e22 X2 + . . . + e2n Xn
. . .
PCn = en1 X1 + en2 X2 + . . . + enn Xn
Principal Components Analysis The e’s are called eigenvectors, derived from a matrix equation whose input is the correlation matrix of all the X’s with each other. Principal components are new variables that are not correlated with each other. The principal components transformation is equivalent to a rotation of axes.
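A hedged sketch of this transformation in NumPy, assuming the X’s are standardized before the eigenvectors of their correlation matrix are applied (function and variable names are illustrative):

```python
import numpy as np

def principal_components(X):
    """Return (eigenvectors, PC scores), ordered by decreasing eigenvalue."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize the X's
    R = np.corrcoef(Z, rowvar=False)                   # correlation matrix of the X's
    eigvals, eigvecs = np.linalg.eigh(R)               # eigenvectors = the e weights
    order = np.argsort(eigvals)[::-1]                  # largest variance first
    eigvecs = eigvecs[:, order]
    PC = Z @ eigvecs                                   # new, mutually uncorrelated variables
    return eigvecs, PC
```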
Principal Components Analysis The eigenvectors (weights) are based solely on the intercorrelations among the X’s and make no use of Y (in contrast to the Z-score method, whose composite weights do depend on Y). Principal components can be used for purely descriptive purposes, but here we want to use them as independent variables in a regression.
Principal Components Analysis -- Example

Independent Variables:
X1 – X5    Snow water equivalent at 5 stations
X6 – X10   Water year to date precipitation at 5 stations
X11        Antecedent streamflow
X12        Climate teleconnection index
Principal Components Regression Procedure • Try the PC’s in order • Test for regression coefficient significance (t-test) • Stop at first insignificant component • Transform regression coefficients to be in terms of original variables • Sign test – coefficient signs must be same as correlation with Y
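A hedged sketch of this stepwise selection, assuming the components come from the principal_components function sketched earlier and using the critical t value from the example that follows; the sign test and the back-transformation of coefficients are handled separately.

```python
import numpy as np

def select_components(PC, Y, t_crit=1.2):
    """Add PCs in order; stop at the first whose coefficient fails the t-test."""
    n = len(Y)
    k_keep = 0
    for k in range(1, PC.shape[1] + 1):
        A = np.column_stack([np.ones(n), PC[:, :k]])     # intercept + first k PCs
        b, *_ = np.linalg.lstsq(A, Y, rcond=None)
        resid = Y - A @ b
        sigma2 = resid @ resid / (n - A.shape[1])        # residual variance
        cov_b = sigma2 * np.linalg.inv(A.T @ A)          # coefficient covariance
        t_newest = b[-1] / np.sqrt(cov_b[-1, -1])        # t statistic of newest PC
        if abs(t_newest) < t_crit:
            break                                        # first insignificant component
        k_keep = k
    return k_keep
```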
Principal Components Regression Procedure

t-test iterations for example data set (tcrit = 1.2):

1 PC:   10.243
2 PCs:  10.105   0.622          : stop here, use only first PC

Continuing ...
3 PCs:  10.225   0.629   1.235  : 3rd PC exceeds tcrit
4 PCs:  10.261   0.632   1.239  -1.073
5 PCs:  10.092   0.621   1.219  -1.055  -0.588
6 PCs:  11.723   0.722   1.416  -1.225  -0.683  -2.764
7 PCs:  11.395   0.702   1.376  -1.191  -0.664  -2.686  -0.073
Principal Components Regression Procedure

Final model for example data set (1 PC):

Y = 2.91 X1 + 3.34 X2 + 2.44 X3 + 2.27 X4 + 2.50 X5
  + 3.34 X6 + 2.69 X7 + 2.45 X8 + 2.97 X9 + 2.78 X10
  + 0.55 X11 + 2.47 X12 - 79.78

R = 0.906   JR = 0.890   SE = 62.558   JSE = 67.410
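The step that produces such a model is the back-transformation of the PC regression coefficients to the original X variables. A hedged sketch, assuming the PCs were built from standardized X’s as in the earlier sketches (names are illustrative):

```python
import numpy as np

def back_transform(a, eigvecs, k, X_mean, X_std):
    """Express Y = a0 + a1*PC1 + ... + ak*PCk in terms of the original X's.

    a       : coefficients [a0, a1, ..., ak] from the regression on the PCs
    eigvecs : eigenvector matrix (columns = components) used to build the PCs
    """
    w = eigvecs[:, :k] @ a[1:]                 # weights on the standardized X's
    b = w / X_std                              # coefficients on the original X's
    b0 = a[0] - np.sum(w * X_mean / X_std)     # intercept adjusted for standardization
    return b0, b
```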
Summary • Principal components analysis is a standard multivariate statistical procedure • Can be used for descriptive purposes to reduce the dimensionality of correlated variables • Can be taken a step further to provide new, non-correlated independent variables for regression • PC’s taken in order, subject to t-test and sign test • Final model is expressed in terms of original X variables