Principal Component Analysis
Tanya and Caroline
Overview
• Basic function is to condense data
• PCA is used when several underlying factors shape the data (e.g., differences in geology between two areas)
• Unlike Bray-Curtis ordination, PCA is objective
• It finds the most useful angle from which to view the shape of the pattern the data points make
PCA is NOT…
• Factor Analysis or Principal Coordinates Analysis (PCO)
• A test of significance
• No null hypothesis is required
• Prior to ordination, there is no way to objectively decide which variables to include
• After analysis, there is no way to decide which variables were unimportant
• Cannot cope with missing values
2-D vs. multi-D
• Make a scatter plot of all data points
• As the number of variables increases, the data space becomes harder to visualize
This is where PCA comes in!
PCA
• Simplifies data by reducing the dimensions of the data space
• Finds the most informative viewpoint from which to view the scatter of data points
• Produces low-dimensional images of high-dimensional shapes
• Shows how much of the variance each axis accounts for
• Find the first principal axis, which always passes through the overall mean of the dataset
• Find the second principal axis, which must be orthogonal (at 90°) to the first axis
• Each successive axis explains no more variance than its predecessors and is assumed to be less important
• The first principal axis accounts for the greatest possible percentage of the overall variance; the second accounts for the greatest possible percentage of the variance that remains
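The properties listed above can be checked numerically. The following is a minimal sketch, assuming NumPy and a made-up dataset (50 samples, 3 variables) that is not from the slides; it finds the axes from the covariance matrix, confirms the first two are orthogonal, and compares the variance along the first axis with the variance along 1000 random directions.

```python
import numpy as np

# Made-up data for illustration only: 50 samples, 3 variables,
# the first two of which share a common underlying factor.
rng = np.random.default_rng(0)
base = rng.normal(size=(50, 1))
data = np.hstack([
    base + 0.1 * rng.normal(size=(50, 1)),
    2.0 * base + 0.5 * rng.normal(size=(50, 1)),
    rng.normal(size=(50, 1)),
])

# Centering moves the origin to the overall mean, so every axis passes through it.
centered = data - data.mean(axis=0)

# The eigenvectors of the covariance matrix are the principal axes.
cov = np.cov(centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)

# eigh returns ascending eigenvalues; reorder so the largest comes first.
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Orthogonality: the dot product of axis 1 and axis 2 should be ~0.
dot_12 = float(eigenvectors[:, 0] @ eigenvectors[:, 1])

# Variance of the data projected onto each principal axis, in axis order.
axis_variances = (centered @ eigenvectors).var(axis=0, ddof=1)

# Variance along 1000 random unit directions, for comparison with axis 1.
directions = rng.normal(size=(1000, 3))
directions /= np.linalg.norm(directions, axis=1, keepdims=True)
random_variances = (centered @ directions.T).var(axis=0, ddof=1)

print("axis 1 . axis 2 =", dot_12)
print("variance per axis:", axis_variances)
print("largest variance among random directions:", random_variances.max())
```

No random direction should beat the first principal axis, and the per-axis variances should never increase from one axis to the next.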
Mechanics of PCA
• Normalizing data
• Generating principal axes
• Loadings come from the eigenvalues and eigenvectors of the correlation matrix
  Eigenvalue: the rate of growth per multiplication when a vector is repeatedly multiplied by the matrix
  Eigenvector: the pattern formed
• Interpretation of eigenvalues: each gives the importance of one ordination axis; the largest eigenvalue indicates the first principal axis, the next largest the second, and so on
• Eigenvalues and eigenvectors summarize the underlying structure of a matrix
• Deriving axis scores: multiply the normalized data by the first eigenvector to get the scores on the first principal axis, then do the same with the second eigenvector for the second axis
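The mechanics above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the procedure used for the slides: the function names `pca_scores` and `power_iteration` and the 6-by-3 data table are made up for the example. The second function shows the "rate of growth per multiplication" idea: multiplying a vector by the matrix again and again makes it settle into the first eigenvector, growing by the largest eigenvalue each time.

```python
import numpy as np

def pca_scores(data):
    """Eigenvalues, eigenvectors and axis scores for a samples-by-variables array."""
    # Normalize: zero mean and unit standard deviation for each variable.
    normalized = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    # Correlation matrix between the variables.
    corr = np.corrcoef(data, rowvar=False)
    # eigh is meant for symmetric matrices; it returns ascending eigenvalues.
    eigenvalues, eigenvectors = np.linalg.eigh(corr)
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    # Axis scores: normalized data times each eigenvector (one column per axis).
    scores = normalized @ eigenvectors
    return eigenvalues, eigenvectors, scores

def power_iteration(matrix, steps=200):
    """Hypothetical helper: repeated multiplication reveals the dominant eigenpair."""
    vector = np.ones(matrix.shape[0])
    growth = 0.0
    for _ in range(steps):
        product = matrix @ vector
        # Growth per multiplication settles at the largest eigenvalue.
        growth = np.linalg.norm(product) / np.linalg.norm(vector)
        # Rescale so the pattern (direction) is kept but the length stays 1.
        vector = product / np.linalg.norm(product)
    return growth, vector

# Made-up table for illustration: 6 samples (rows) by 3 variables (columns).
data = np.array([
    [2.0, 1.0, 5.0],
    [3.0, 2.0, 4.0],
    [4.0, 2.5, 4.5],
    [5.0, 4.0, 2.0],
    [6.0, 4.5, 1.5],
    [7.0, 6.0, 1.0],
])

eigenvalues, eigenvectors, scores = pca_scores(data)
growth, pattern = power_iteration(np.corrcoef(data, rowvar=False))

print("eigenvalues:", eigenvalues)
print("first-axis scores:", scores[:, 0])
print("growth per multiplication:", growth)
```

Because the data are normalized and the correlation matrix is used, the variance of the scores on each axis equals that axis's eigenvalue, and the eigenvalues add up to the number of variables.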
Example
• 2 sites: site 1 is a heath and site 2 is a mound
• PCA is run only on the data for the 8 plant species (vegetation)
The larger the percentage of variance, the greater the amount of information that has been condensed into the ordination axis
[Figure axis label: % Variance]
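The percentage of variance for each axis comes straight from the eigenvalues: divide each eigenvalue by the sum of all of them. The sketch below assumes NumPy and uses made-up eigenvalues for an 8-variable correlation matrix; they are not the values from the heath and mound example.

```python
import numpy as np

# Made-up eigenvalues for illustration only (not the values from the example).
# For a correlation matrix of 8 variables, the eigenvalues sum to 8.
eigenvalues = np.array([4.0, 2.0, 1.0, 0.5, 0.25, 0.125, 0.075, 0.05])

# Percentage of the overall variance condensed into each ordination axis.
percent_variance = 100.0 * eigenvalues / eigenvalues.sum()

# Running total: how much is captured by the first k axes together.
cumulative_percent = np.cumsum(percent_variance)

for axis, (pct, cum) in enumerate(zip(percent_variance, cumulative_percent), start=1):
    print(f"Axis {axis}: {pct:5.1f}% of variance, {cum:5.1f}% cumulative")
```

With these made-up numbers the first two axes together carry about 75% of the variance, so a 2-D plot of the first two axis scores would keep most of the information in the 8 original variables.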
Homework
• What is the purpose of PCA, and what 3 things does it give us?
• Define eigenvalue and eigenvector.
• Interpret Figure 6.10 on page 111: what does the first principal axis show, and what does the second principal axis show?