Explore the use of Principal Components Analysis to measure and evaluate different city settings based on various metrics like mountain height and coastline length. Learn how to combine metrics, calculate variances, and define principal components. Discover how to find uncorrelated weights, maximize variance, and express data in terms of principal components scores.
Principal Components: A Mathematical Introduction
Simon Mason
International Research Institute for Climate Prediction, The Earth Institute of Columbia University
Linking Science to Society
What is the most beautiful city setting? The setting could be measured on a variety of metrics, such as the height of the surrounding mountains or the length of the coastline. But if more than one metric is used, some combined measure will need to be devised. Linking Science to Sight-Seeing!
The city scores can be represented by a matrix, X. For simplicity, the scores are considered on only two metrics, and for only three cities. The metrics are sea and mountains, and the cities are San Francisco, Hong Kong, and Cape Town:
The means and variances are: The variance is used to distinguish the cities’ attractiveness. The total variance is 5.
X can be expressed as an anomaly matrix or a standardized anomaly matrix:
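As a rough illustration of these two transformations, the sketch below computes the means, variances, anomalies, and standardized anomalies with NumPy. The scores are hypothetical stand-ins (the values from the slides are not preserved here), and the n - 1 divisor for the variance is an assumption.

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
# Rows: San Francisco, Hong Kong, Cape Town; columns: sea, mountains.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])

means = X.mean(axis=0)               # mean of each metric
variances = X.var(axis=0, ddof=1)    # sample variance (n - 1 divisor, an assumption)
total_variance = variances.sum()     # sum of the two variances

anomalies = X - means                            # subtract each metric's mean
standardized = anomalies / np.sqrt(variances)    # then divide by its standard deviation

print(means, variances, total_variance)
print(anomalies)
print(standardized)
```

Each column of `anomalies` sums to zero, and each column of `standardized` additionally has unit variance.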
In general, if X has columns (a, b, c) and (d, e, f), matrix multiplication gives X^T X = [[a² + b² + c², ad + be + cf], [ad + be + cf, d² + e² + f²]]. So, if a + b + c = 0, and d + e + f = 0, then the columns of X are anomalies: the diagonal elements of X^T X are the sums of squared anomalies, and the off-diagonal elements are the sums of cross-products.
If X contains data expressed as anomalies, X^T X (suitably scaled by the sample size) is the variance-covariance matrix. If X contains data expressed as standardized anomalies, it is the correlation matrix.
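The relationship between the matrix product and the two matrices can be checked numerically. This is a minimal sketch with hypothetical scores (not the slides' values), assuming the product is scaled by n - 1.

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])
n = X.shape[0]

A = X - X.mean(axis=0)           # anomalies
S = A / A.std(axis=0, ddof=1)    # standardized anomalies (n - 1 divisor, an assumption)

cov_from_product = A.T @ A / (n - 1)     # scaled sums of squares and cross-products
corr_from_product = S.T @ S / (n - 1)    # same product on standardized anomalies

# Compare against NumPy's own estimators (columns are the variables).
print(np.allclose(cov_from_product, np.cov(X, rowvar=False)))
print(np.allclose(corr_from_product, np.corrcoef(X, rowvar=False)))
```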
Using the city data expressed in standardized anomalies:
The variance-covariance matrix for the city data is: Note that the covariances are greater than zero, implying that both metrics represent a common aspect of city attractiveness.
Because of the covariance (or correlation) between the two metrics, we could combine them into a single new metric that represents the variance common to both. Specifically, we want to define sets of weights so that the new variables are uncorrelated and have maximized variance. Let the weights for the first principal component be a, and, for the second principal component, b.
In matrix notation, the data are post-multiplied by the weights, represented as U: Z = XU. This gives the principal components, Z. The scores on the principal components are:
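A short sketch of the post-multiplication follows. It takes the weights from the eigenvectors of the variance-covariance matrix, which is one standard way to obtain weights with the stated properties. The data are hypothetical, not the slides' values.

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])
A = X - X.mean(axis=0)                 # anomalies
C = np.cov(A, rowvar=False)            # variance-covariance matrix

eigvals, U = np.linalg.eigh(C)         # eigh returns eigenvalues in ascending order
eigvals, U = eigvals[::-1], U[:, ::-1] # reorder so the first component has the largest variance

Z = A @ U                              # principal component scores: data times weights

print(Z)
print(np.cov(Z, rowvar=False))         # diagonal: component variances; off-diagonal: ~0
```

The covariance matrix of `Z` is diagonal up to rounding, so the new variables are uncorrelated, and the first column carries the larger variance.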
The principal components are defined as weighted sums of the original metrics. Note that the squared weights for each component sum to 1. Also, if the principal components are to be uncorrelated, the weights also need to be uncorrelated (orthogonal).
These two properties of the weights are useful: U^T U = I. The diagonals are the sums of the squares of each column of U. A column of U contains the weights for one of the principal components, so the diagonal elements of U^T U are 1. Because the weights are uncorrelated, the off-diagonals are 0.
So if we post-multiply Z = XU by U^T, we get Z U^T = X U U^T = X (U is square, so U U^T = I as well as U^T U = I), allowing us to express X in terms of the principal component scores and loadings.
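Both properties of the weights, and the reconstruction of the data from scores and loadings, can be verified in a few lines. This is an illustrative check on hypothetical data (not the slides' values), with the weights again taken from an eigendecomposition.

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])
A = X - X.mean(axis=0)                          # anomalies

_, U = np.linalg.eigh(np.cov(A, rowvar=False))  # columns of U are the weights
Z = A @ U                                       # principal component scores

gram = U.T @ U            # diagonals: sums of squared weights; off-diagonals: cross-products
A_rebuilt = Z @ U.T       # post-multiplying the scores by U^T recovers the anomalies

print(np.allclose(gram, np.eye(2)))
print(np.allclose(A_rebuilt, A))
```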
Using the city data:
which gives us the variances for both principal components. (Note the total variance.) The eigenvectors can be obtained by solving:
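For a 2 × 2 variance-covariance matrix, the eigenvalues and eigenvectors can be worked out by hand. The sketch below does so for a hypothetical matrix `C` (not the slides' values): the eigenvalues are the roots of the characteristic polynomial, and each eigenvector solves (C - λI)u = 0.

```python
import numpy as np

# Hypothetical variance-covariance matrix, NOT the one from the slides.
C = np.array([[4.0, 7.0],
              [7.0, 13.0]])

# Eigenvalues: roots of lambda^2 - trace * lambda + det = 0.
trace, det = np.trace(C), np.linalg.det(C)
disc = np.sqrt(trace**2 - 4.0 * det)
lam = np.array([(trace + disc) / 2.0, (trace - disc) / 2.0])

# Eigenvectors: (C - lambda I) u = 0 is satisfied by u proportional to (C01, lambda - C00).
U = np.column_stack([[C[0, 1], l - C[0, 0]] for l in lam])
U /= np.linalg.norm(U, axis=0)      # rescale so the squared weights sum to 1

print(lam, lam.sum(), trace)        # the eigenvalues sum to the total variance
print(U)
```

The eigenvalues are the variances of the two principal components, and their sum equals the trace of `C`, which is the total variance of the original metrics.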
Therefore the squared singular values are simply rescaled eigenvalues, with a scaling factor that depends on the sample size n.
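The link between the singular value decomposition and the eigendecomposition can be checked numerically. This sketch uses hypothetical data (not the slides' values) and assumes the variance-covariance matrix uses an n - 1 divisor, in which case the scaling factor is 1 / (n - 1).

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])
n = X.shape[0]
A = X - X.mean(axis=0)                                        # anomalies

eigvals = np.linalg.eigvalsh(np.cov(A, rowvar=False))[::-1]   # eigenvalues, largest first
left, s, Vt = np.linalg.svd(A, full_matrices=False)           # singular values, largest first

rescaled = s**2 / (n - 1)     # squared singular values, rescaled (n - 1 is an assumption)

print(eigvals)
print(rescaled)
```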
Finally, the SVD is useful for demonstrating the equivalence between S- and T-mode analyses.
Therefore a T-mode principal components analysis will generate the same results as an S-mode analysis, except that the loadings and the scores are swapped, and the singular values will be scaled by a different value for n.
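The swap can be seen by applying the SVD to a matrix and to its transpose. The sketch below uses one hypothetical anomaly matrix (not the slides' values) for both decompositions, and compares singular vectors by absolute value because their signs are arbitrary.

```python
import numpy as np

# Hypothetical scores, NOT the values from the slides.
X = np.array([[4.0, 3.0],
              [6.0, 5.0],
              [8.0, 10.0]])
A = X - X.mean(axis=0)     # anomalies; cities in rows, metrics in columns

# Decompose the matrix as it stands, then its transpose.
left_s, s_s, right_s_t = np.linalg.svd(A, full_matrices=False)
left_t, s_t, right_t_t = np.linalg.svd(A.T, full_matrices=False)

same_singular_values = np.allclose(s_s, s_t)
# Left vectors of one decomposition match right vectors of the other, up to sign.
left_matches_right = np.allclose(np.abs(left_t), np.abs(right_s_t.T))
right_matches_left = np.allclose(np.abs(right_t_t.T), np.abs(left_s))

print(same_singular_values, left_matches_right, right_matches_left)
```

The singular values themselves are identical. Only the divisor used to turn them into variances changes, since the number of cases differs between the two modes.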