Principal Component Analysis Zelin Jia Shengbin Lin 10/20/2015
What is PCA? • An orthogonal transformation • Converts correlated variables into new, uncorrelated artificial variables (principal components) • The resulting vectors form an orthogonal basis set • A tool in exploratory data analysis https://en.wikipedia.org/wiki/Principal_component_analysis
Why use PCA? • Reduce the dimensionality of the data • Compress the data • Prepare the data for further analysis using other techniques • Understand your data better by interpreting the loadings, and by graphing the derived variables http://psych.colorado.edu/wiki/lib/exe/fetch.php?media=labs:learnr:emily_-_principal_components_analysis_in_r:pca_how_to.pdf Dr. Peter Westfall
How PCA works • PCA begins with the covariance matrix of the mean-centered data: Cov(X) = XᵀX / (n - 1) • Calculate the eigenvectors and eigenvalues of this covariance matrix • This gives a set of eigenvectors zᵢ with eigenvalues λᵢ (constraint: zᵢᵀzᵢ = 1) • Arrange the eigenvectors in decreasing order of their eigenvalues • Pick eigenvectors and multiply the original data matrix X by them to obtain the principal component (score) matrix https://www.riskprep.com/all-tutorials/36-exam-22/132-understanding-principal-component-analysis-pca
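A minimal R sketch of these steps, using simulated placeholder data rather than the slide's data set (eigen() already returns unit-length eigenvectors with eigenvalues in decreasing order):

set.seed(1)
X  <- matrix(rnorm(25 * 8), nrow = 25, ncol = 8)  # placeholder data: 25 obs, 8 variables
Xc <- scale(X, center = TRUE, scale = FALSE)      # mean-center each column
S   <- crossprod(Xc) / (nrow(Xc) - 1)             # covariance matrix: t(Xc) %*% Xc / (n - 1)
eig <- eigen(S)                                   # eigenvectors (columns) and eigenvalues
Z   <- eig$vectors                                # loadings: one column per component
PC  <- Xc %*% Z                                   # principal component scores
round(eig$values / sum(eig$values), 3)            # proportion of variance captured by each PC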
Example of how PCA works (in R) • A financial sample data set with 8 variables and 25 observations • Perform PCA on this data and reduce the number of variables from 8 to something more manageable https://www.riskprep.com/all-tutorials/36-exam-22/132-understanding-principal-component-analysis-pca
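The slides do not include the data file itself, so the sketch below uses a made-up file name ("rates.csv") only to show the typical prcomp() workflow for an 8-variable, 25-observation data set:

rates <- read.csv("rates.csv")          # hypothetical file: 25 rows, 8 numeric columns
pca   <- prcomp(rates, scale. = TRUE)   # standardize the variables, then run PCA
summary(pca)                            # proportion of variance explained by each component
pca$rotation                            # loadings: how each original variable enters each PC
head(pca$x)                             # scores: the data expressed in the new coordinates
cum <- cumsum(pca$sdev^2) / sum(pca$sdev^2)   # cumulative proportion of variance
k   <- which(cum >= 0.95)[1]                  # smallest number of PCs covering 95% of the variance
reduced <- pca$x[, 1:k]                       # the "more manageable" reduced data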
Simulate PCs on uncorrelated data and on highly correlated data (in R) • PCA works better on highly correlated data, where a greater reduction in dimensionality is achievable. Provided by Dr. Peter Westfall
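One way the simulation could look in R (the sample size and correlation structure here are assumptions, not the original code):

set.seed(42)
n <- 200; p <- 5
uncorr <- matrix(rnorm(n * p), n, p)                        # independent variables
shared <- rnorm(n)
corr   <- sapply(1:p, function(i) shared + 0.2 * rnorm(n))  # variables driven by one common factor
pc1_share <- function(X)
  summary(prcomp(X, scale. = TRUE))$importance["Proportion of Variance", 1]
pc1_share(uncorr)   # roughly 1/p: little to gain from PCA
pc1_share(corr)     # close to 1: one component summarizes nearly everything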
PCA standardization Why: a variable with smaller numbers, even though it may be the more important one, will be overwhelmed by the variables with larger numbers in what it contributes to the covariance. Standardizing (equivalently, working with the correlation matrix instead of the covariance matrix) puts every variable on an equal footing. https://www.riskprep.com/all-tutorials/36-exam-22/132-understanding-principal-component-analysis-pca
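A small illustration of the scaling problem, with made-up units: without standardization the large-scale variable dominates the first loading; with scale. = TRUE both variables contribute.

set.seed(7)
income <- rnorm(100, mean = 50000, sd = 8000)   # measured in dollars: large numbers
rate   <- rnorm(100, mean = 0.05,  sd = 0.01)   # measured as a fraction: small numbers
X <- cbind(income, rate)
prcomp(X)$rotation                  # unstandardized: PC1 is essentially 'income' alone
prcomp(X, scale. = TRUE)$rotation   # standardized: both variables contribute to PC1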
Properties of PCs • The number of principal components is less than or equal to the number of original variables. • The first principal component has the largest possible variance. • Each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to the preceding components. https://en.wikipedia.org/wiki/Principal_component_analysis
What is SVD? • The singular value decomposition factors an n × p matrix X as X = U L^(1/2) Zᵀ, where the columns of U and Z are orthonormal and L^(1/2) is a diagonal matrix holding the singular values of X Applied_Regression_Analysis_A_Research_Tool.pdf
Relationship between SVD and PCA • From the SVD we have X = U L^(1/2) Zᵀ → W = XZ = U L^(1/2) • If X is an n × p matrix of observations on p variables, each column of W is a new variable defined as a linear transformation of the original variables. Applied_Regression_Analysis_A_Research_Tool.pdf
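This identity is easy to check numerically with R's svd(); the data below is simulated, and prcomp() is included only to show that it produces the same scores up to column signs:

set.seed(3)
X <- scale(matrix(rnorm(25 * 8), 25, 8), center = TRUE, scale = FALSE)  # centered data
s  <- svd(X)                 # s$u = U, s$d = singular values (diagonal of L^(1/2)), s$v = Z
W1 <- X %*% s$v              # scores via X times the right singular vectors
W2 <- s$u %*% diag(s$d)      # scores via U L^(1/2)
all.equal(W1, W2)            # TRUE up to floating point
all.equal(abs(unname(prcomp(X)$x)), abs(W1))  # prcomp uses the same SVD, up to column signs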
EFA vs PCA • EFA: EFA provides a model that explains why the data look the way they do. • PCA: PCA is not a model that explains how the data look; there is no model at all. Provided by Dr. Peter Westfall
EFA vs PCA http://www.gac-usp.com.br/resources/use_of_exploratory_factor_analysis_park_dailey.pdf
EFA vs PCA • EFA: in EFA one postulates that there is a smaller set of unobserved (latent) variables or constructs underlying the variables actually observed or measured (this is commonly done to assess validity) • PCA: in PCA one is simply trying to mathematically derive a relatively small number of variables to use to convey as much of the information in the observed/measured variables as possible http://www.gac-usp.com.br/resources/use_of_exploratory_factor_analysis_park_dailey.pdf
Applications of PCA • Data visualization • Image compression
Data visualization • If a multivariate dataset is visualized as a set of coordinates in a high-dimensional data space (1 axis per variable), PCA can supply the user with a lower-dimensional picture. https://en.wikipedia.org/wiki/Principal_component_analysis
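For example, with R's built-in iris data (four measured variables), the first two principal components give a two-dimensional picture of the data; this data set is used here only for illustration and is not part of the original slides:

pca <- prcomp(iris[, 1:4], scale. = TRUE)   # four measured variables -> four PCs
plot(pca$x[, 1], pca$x[, 2],
     col = as.integer(iris$Species), pch = 19,
     xlab = "PC1", ylab = "PC2",
     main = "Iris measurements projected onto the first two PCs")
legend("topright", legend = levels(iris$Species), col = 1:3, pch = 19)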
Using PCA to compress images • The PCA formulation may be used as a digital image compression algorithm with a low level of loss. http://www.scielo.br/scielo.php?script=sci_arttext&pid=S1679-45082012000200004
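A rough sketch of the idea in R, assuming 'img' is a grayscale image stored as a numeric matrix; the function name compress_pca is made up for illustration:

compress_pca <- function(img, k) {
  pca <- prcomp(img, center = TRUE, scale. = FALSE)   # rows of the image act as the "observations"
  approx <- pca$x[, 1:k] %*% t(pca$rotation[, 1:k])   # rank-k reconstruction of the centered image
  sweep(approx, 2, pca$center, `+`)                   # add the column means back
}
# img50 <- compress_pca(img, k = 50)   # keep 50 components instead of all columns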
princomp vs prcomp • For prcomp: • The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. • For princomp: • The calculation is done using eigen on the correlation or covariance matrix, as determined by cor. This is done for compatibility with the S-PLUS result. A preferred method of calculation is to use svd on x, as is done in prcomp. http://stats.stackexchange.com/questions/20101/what-is-the-difference-between-r-functions-prcomp-and-princomp
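A quick side-by-side in R on the built-in USArrests data (chosen here only for illustration):

p1 <- prcomp(USArrests, scale. = TRUE)   # SVD-based; scale. = TRUE standardizes the variables
p2 <- princomp(USArrests, cor = TRUE)    # eigen-based; cor = TRUE uses the correlation matrix
p1$sdev          # component standard deviations (divisor n - 1)
p2$sdev          # nearly identical, but princomp divides by n and loading signs may differ
head(p1$x)       # scores from prcomp
head(p2$scores)  # scores from princomp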