Factor Analysis: an alternative technique for studying correlation and covariance structure
Let $\mathbf{x} = (x_1, x_2, \ldots, x_p)'$ have a $p$-variate Normal distribution with mean vector $\boldsymbol{\mu} = (\mu_1, \mu_2, \ldots, \mu_p)'$.

The Factor Analysis Model: Let $F_1, F_2, \ldots, F_k$ denote independent standard normal random variables (the Factors). Let $e_1, e_2, \ldots, e_p$ denote independent normal random variables with mean 0 and $\operatorname{var}(e_i) = \psi_i$. Suppose there exist constants $\ell_{ij}$ (the loadings) such that

$x_1 = \mu_1 + \ell_{11}F_1 + \ell_{12}F_2 + \cdots + \ell_{1k}F_k + e_1$
$x_2 = \mu_2 + \ell_{21}F_1 + \ell_{22}F_2 + \cdots + \ell_{2k}F_k + e_2$
$\;\;\vdots$
$x_p = \mu_p + \ell_{p1}F_1 + \ell_{p2}F_2 + \cdots + \ell_{pk}F_k + e_p$
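As an informal illustration (not part of the original slides), the following Python sketch simulates data from this model with made-up loadings and specific variances and checks that the sample covariance is close to the implied covariance $LL' + \Psi$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and made-up parameters
p, k, n = 4, 2, 1000
mu = np.zeros(p)                          # mean vector
Lam = rng.normal(size=(p, k))             # loading matrix (l_ij)
psi = rng.uniform(0.2, 1.0, size=p)       # specific variances psi_i

# x = mu + L F + e, with F ~ N(0, I_k) and e_i ~ N(0, psi_i)
F = rng.standard_normal((n, k))
e = rng.standard_normal((n, p)) * np.sqrt(psi)
x = mu + F @ Lam.T + e

# The implied covariance is L L' + Psi; the sample covariance of the
# simulated data should be close to it.
Sigma = Lam @ Lam.T + np.diag(psi)
print(np.round(Sigma, 2))
print(np.round(np.cov(x, rowvar=False), 2))
```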
Factor Analysis Model (matrix form):

$\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \mathbf{e}$

where

$L = \begin{bmatrix} \ell_{11} & \ell_{12} & \cdots & \ell_{1k} \\ \ell_{21} & \ell_{22} & \cdots & \ell_{2k} \\ \vdots & \vdots & & \vdots \\ \ell_{p1} & \ell_{p2} & \cdots & \ell_{pk} \end{bmatrix}$ (the matrix of loadings)

and

$\mathbf{F} = (F_1, F_2, \ldots, F_k)'$, $\mathbf{e} = (e_1, e_2, \ldots, e_p)'$.
Note:

$\Sigma = \operatorname{cov}(\mathbf{x}) = LL' + \Psi$ where $\Psi = \operatorname{diag}(\psi_1, \psi_2, \ldots, \psi_p)$,

hence

$\operatorname{var}(x_i) = \sigma_{ii} = \ell_{i1}^2 + \ell_{i2}^2 + \cdots + \ell_{ik}^2 + \psi_i = h_i^2 + \psi_i$

and

$h_i^2 = \ell_{i1}^2 + \ell_{i2}^2 + \cdots + \ell_{ik}^2$, i.e. the component of variance of $x_i$ that is due to the common factors $F_1, F_2, \ldots, F_k$ (the communality);

$\psi_i$, i.e. the component of variance of $x_i$ that is specific only to that observation (the specific variance).
$F_1, F_2, \ldots, F_k$ are called the common factors; $e_1, e_2, \ldots, e_p$ are called the specific factors.

Also $\operatorname{cov}(x_i, F_j) = \ell_{ij}$, so $\ell_{ij}/\sqrt{\sigma_{ii}}$ = the correlation between $x_i$ and $F_j$, and $\ell_{ij}$ = the correlation between $x_i$ and $F_j$ if $\operatorname{var}(x_i) = 1$ (i.e. if the variables are standardized).
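A small Python sketch (using a hypothetical loading matrix, not one from the slides) showing how the communalities $h_i^2$, the specific variances $\psi_i$, and the correlations between $x_i$ and $F_j$ fit together:

```python
import numpy as np

# Hypothetical 2-factor loading matrix for p = 4 variables (illustration only)
Lam = np.array([[0.9, 0.1],
                [0.8, 0.3],
                [0.2, 0.7],
                [0.1, 0.8]])
psi = np.array([0.18, 0.27, 0.47, 0.35])   # specific variances psi_i

Sigma = Lam @ Lam.T + np.diag(psi)          # implied covariance matrix

communality = (Lam ** 2).sum(axis=1)        # h_i^2 = sum_j l_ij^2
print(communality + psi)                    # equals diag(Sigma) = var(x_i)

# Correlation between x_i and F_j is l_ij / sqrt(var(x_i))
corr_xF = Lam / np.sqrt(np.diag(Sigma))[:, None]
print(np.round(corr_xF, 3))
```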
Rotating Factors

Recall the Factor Analysis model

$\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \mathbf{e}$

This gives rise to the vector $\mathbf{x}$ having covariance matrix

$\Sigma = LL' + \Psi$

Let $P$ be any orthogonal matrix, then $PP' = P'P = I$ and

$\Sigma = LPP'L' + \Psi = (LP)(LP)' + \Psi$
Hence if $\mathbf{x} = \boldsymbol{\mu} + L\mathbf{F} + \mathbf{e}$ with $\Sigma = LL' + \Psi$ is a Factor Analysis model, then so also is $\mathbf{x} = \boldsymbol{\mu} + L^{*}\mathbf{F}^{*} + \mathbf{e}$ with $L^{*} = LP$ and $\mathbf{F}^{*} = P'\mathbf{F}$, where $P$ is any orthogonal matrix.
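A quick numerical check of this rotation invariance (made-up loadings; the orthogonal matrix $P$ comes from a QR decomposition of a random matrix):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical loadings and specific variances (illustration only)
Lam = rng.normal(size=(5, 2))
psi = rng.uniform(0.2, 1.0, size=5)
Sigma = Lam @ Lam.T + np.diag(psi)

# Random 2x2 orthogonal matrix P
P, _ = np.linalg.qr(rng.normal(size=(2, 2)))

# Rotated loadings L* = L P give exactly the same covariance matrix
Lam_star = Lam @ P
Sigma_star = Lam_star @ Lam_star.T + np.diag(psi)
print(np.allclose(Sigma, Sigma_star))   # True
```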
The process of exploring other models through orthogonal transformations of the factors is called rotating the factors. There are many techniques for rotating the factors:
• VARIMAX
• Quartimax
• Equimax
VARIMAX rotation attempts to have each individual variable load highly on only a subset of the factors.
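The slides name VARIMAX but do not give its algorithm; the sketch below implements one common SVD-based varimax iteration (Kaiser's criterion) as an illustration, not necessarily the exact variant intended in the course:

```python
import numpy as np

def varimax(Lam, gamma=1.0, max_iter=100, tol=1e-6):
    """Orthogonal varimax rotation of a p x k loading matrix."""
    p, k = Lam.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        d_old = d
        L = Lam @ R
        # Standard varimax update: SVD of the criterion gradient
        B = Lam.T @ (L ** 3 - (gamma / p) * L @ np.diag(np.sum(L ** 2, axis=0)))
        u, s, vt = np.linalg.svd(B)
        R = u @ vt
        d = s.sum()
        if d_old != 0 and d / d_old < 1 + tol:
            break
    return Lam @ R, R
```

Because the returned rotation matrix R is orthogonal, the rotated loadings reproduce the same covariance matrix $LL' + \Psi$; only the interpretation of the factors changes.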
Estimating the loadings: several methods exist – we consider two:
• Principal Component Method
• Maximum Likelihood Method
Principal Component Method

Recall the spectral decomposition of the sample covariance matrix $S$:

$S = \lambda_1\mathbf{v}_1\mathbf{v}_1' + \lambda_2\mathbf{v}_2\mathbf{v}_2' + \cdots + \lambda_p\mathbf{v}_p\mathbf{v}_p'$

where $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_p$ are eigenvectors of $S$ of length 1 and $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p \ge 0$ are the eigenvalues of $S$.
Hence

$S = (\sqrt{\lambda_1}\mathbf{v}_1)(\sqrt{\lambda_1}\mathbf{v}_1)' + \cdots + (\sqrt{\lambda_p}\mathbf{v}_p)(\sqrt{\lambda_p}\mathbf{v}_p)' = LL'$

where $L = [\sqrt{\lambda_1}\mathbf{v}_1, \sqrt{\lambda_2}\mathbf{v}_2, \ldots, \sqrt{\lambda_p}\mathbf{v}_p]$.

Thus $S = LL' + \Psi$ with $\Psi = 0$. This is the Principal Component Solution with $p$ factors. Note: the specific variances, $\psi_i$, are all zero.
The objective in Factor Analysis is to explain the correlation structure in the data vector with as few factors as necessary. It may happen that the last (smallest) eigenvalues of $S$ are small, so that only the first $k$ terms of the decomposition contribute appreciably.
In this case

$S \approx (\sqrt{\lambda_1}\mathbf{v}_1)(\sqrt{\lambda_1}\mathbf{v}_1)' + \cdots + (\sqrt{\lambda_k}\mathbf{v}_k)(\sqrt{\lambda_k}\mathbf{v}_k)' = LL'$

where $L = [\sqrt{\lambda_1}\mathbf{v}_1, \ldots, \sqrt{\lambda_k}\mathbf{v}_k]$ is now $p \times k$.

In addition let

$\psi_i = s_{ii} - \sum_{j=1}^{k}\ell_{ij}^2$ and $\Psi = \operatorname{diag}(\psi_1, \ldots, \psi_p)$.

Then $S \approx LL' + \Psi$, and the equality will be exact along the diagonal.
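A minimal Python sketch of this principal component solution with $k$ factors, using a small made-up correlation matrix purely for illustration:

```python
import numpy as np

def pc_factor_solution(S, k):
    """Principal component solution with k factors for a covariance/correlation matrix S.

    Returns the p x k loading matrix L (sqrt(eigenvalue) * unit eigenvector)
    and the specific variances psi_i = s_ii - sum_j l_ij^2.
    """
    eigval, eigvec = np.linalg.eigh(S)          # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:k]        # indices of the k largest eigenvalues
    L = eigvec[:, order] * np.sqrt(eigval[order])
    psi = np.diag(S) - (L ** 2).sum(axis=1)
    return L, psi

# Example with a small made-up correlation matrix (illustration only)
S = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
L, psi = pc_factor_solution(S, k=1)
print(np.round(L, 3), np.round(psi, 3))

# The diagonal of L L' + Psi reproduces diag(S) exactly
print(np.allclose(np.diag(L @ L.T + np.diag(psi)), np.diag(S)))   # True
```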
Maximum Likelihood Estimation

Let $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ denote a sample from the $p$-variate Normal distribution $N_p(\boldsymbol{\mu}, \Sigma)$ where

$\Sigma = LL' + \Psi$

The joint density of $\mathbf{x}_1, \ldots, \mathbf{x}_n$ is

$f(\mathbf{x}_1, \ldots, \mathbf{x}_n) = \prod_{i=1}^{n}\frac{1}{(2\pi)^{p/2}\left|\Sigma\right|^{1/2}}\exp\!\left\{-\tfrac{1}{2}(\mathbf{x}_i-\boldsymbol{\mu})'\Sigma^{-1}(\mathbf{x}_i-\boldsymbol{\mu})\right\}$
The likelihood function is

$\mathcal{L}(\boldsymbol{\mu}, L, \Psi) = (2\pi)^{-np/2}\left|LL'+\Psi\right|^{-n/2}\exp\!\left\{-\tfrac{1}{2}\sum_{i=1}^{n}(\mathbf{x}_i-\boldsymbol{\mu})'(LL'+\Psi)^{-1}(\mathbf{x}_i-\boldsymbol{\mu})\right\}$

The maximum likelihood estimates $\hat{\boldsymbol{\mu}}$, $\hat{L}$, $\hat{\Psi}$ are obtained by numerical maximization of $\mathcal{L}(\boldsymbol{\mu}, L, \Psi)$.
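In practice this numerical maximization is usually delegated to a library routine. The sketch below uses scikit-learn's FactorAnalysis, which fits the same Normal factor model by maximum likelihood (remember that the loadings are only determined up to an orthogonal rotation); the data and parameters are simulated and made up for illustration:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

rng = np.random.default_rng(2)

# Simulate data from a 2-factor model (made-up parameters, illustration only)
p, k, n = 6, 2, 2000
L_true = rng.normal(size=(p, k))
psi_true = rng.uniform(0.3, 1.0, size=p)
X = rng.standard_normal((n, k)) @ L_true.T \
    + rng.standard_normal((n, p)) * np.sqrt(psi_true)

# Maximum likelihood factor analysis with k factors
fa = FactorAnalysis(n_components=k)
fa.fit(X)

L_hat = fa.components_.T        # estimated p x k loading matrix (up to rotation)
psi_hat = fa.noise_variance_    # estimated specific variances

# Compare the fitted covariance L L' + Psi with the sample covariance
print(np.round(L_hat @ L_hat.T + np.diag(psi_hat), 2))
print(np.round(np.cov(X, rowvar=False), 2))
```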
Example: Olympic Decathlon Scores

Data were collected for n = 160 starts (139 athletes) for the ten decathlon events (100-m run, Long Jump, Shot Put, High Jump, 400-m run, 110-m hurdles, Discus, Pole Vault, Javelin, 1500-m run). The sample correlation matrix is given on the next slide.