
Factor Analysis, Part 1


Presentation Transcript


  1. Factor Analysis, Part 1 BMTRY 726 4/1/14

  2. Uses Goal: Similar to PCA… describe the covariance of a large set of measured traits using a few linear combinations of underlying latent traits Why: again, similar reasons to PCA (1) Dimension Reduction (use k of p components) (2) Remove redundancy/duplication from a set of correlated variables (3) Represent correlated variables with a smaller set of “derived” variables (4) Create “new” factor variables that are independent

  3. For Example Say we want to define “frailty” in a population of cancer patients. We have a concept of what “frailty” is but no direct way to measure it. We believe an individual’s frailty has to do with their weight, strength, speed, agility, balance, etc. We therefore want to be able to define frailty as some composite measure of all of these factors…

  4. Key Concepts Fi is a latent underlying variable (i = 1, 2, …, m). X’s are observed variables related to what we think Fi might be. ej is the measurement error for Xj, j = 1, 2, …, p. lji are the factor “loadings” for Xj.

  5. Orthogonal Factor Model Consider data with p observed variables. The orthogonal factor model writes each centered variable as a linear combination of m common factors plus an error term: Xj − μj = lj1F1 + lj2F2 + … + ljmFm + εj, or in matrix form X − μ = LF + ε, where L is the p × m matrix of factor loadings.

  6. Model Assumptions We must make some serious assumptions: E(F) = 0 and Cov(F) = I (the factors are uncorrelated with unit variance); E(ε) = 0 and Cov(ε) = Ψ, a diagonal matrix; and F and ε are independent, so Cov(ε, F) = 0. Note, these are very strong assumptions, which implies only narrow application. These models are also best when p >> m.

  7. Model Assumptions Our assumptions can be related back to the variability of our original X’s: under the orthogonal factor model, Cov(X) = Σ = LL′ + Ψ and Cov(X, F) = L.
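
A quick simulation sketch (my own, not from the slides) can make this concrete: draw F and ε under the assumptions above and check that the sample covariance of X = LF + ε is close to LL′ + Ψ. The loadings and specific variances below are made-up illustrative values.

```python
import numpy as np

rng = np.random.default_rng(0)
L = np.array([[0.9, 0.1],
              [0.8, 0.3],
              [0.2, 0.7]])                    # p = 3 variables, m = 2 factors
psi = np.array([0.18, 0.27, 0.47])            # chosen so Var(Xj) = 1

n = 200_000
F = rng.normal(size=(n, 2))                   # E(F) = 0, Cov(F) = I
eps = rng.normal(size=(n, 3)) * np.sqrt(psi)  # Cov(eps) = diag(psi)
X = F @ L.T + eps                             # X - mu = LF + eps

print(np.round(np.cov(X, rowvar=False), 2))   # approximately LL' + Psi
print(np.round(L @ L.T + np.diag(psi), 2))
```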

  8. Model Terms Decomposition of the variance of Xj: Var(Xj) = σjj = (lj1² + lj2² + … + ljm²) + ψj. The proportion of variance of the jth measurement Xj contributed by the m factors F1, F2, …, Fm is called the jth communality, hj² = lj1² + … + ljm².

  9. Model Terms Decomposition of the variance of Xj (continued): the remaining proportion of the variance of the jth measurement, associated with ej, is called the uniqueness or specific variance ψj, so σjj = hj² + ψj. Note, we are assuming that the variances and covariances of X can be reconstructed from our pm factor loadings lji and the p specific variances ψj.
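
As a minimal sketch (with made-up loadings, assuming standardized X’s with unit variance), the decomposition is just row-wise sums of squared loadings:

```python
import numpy as np

L = np.array([[0.90, 0.10],
              [0.80, 0.30],
              [0.20, 0.70]])          # p = 3 variables, m = 2 factors

communality = (L ** 2).sum(axis=1)    # hj^2 = sum over i of lji^2
psi = 1.0 - communality               # uniqueness for standardized Xj

for j, (h2, u) in enumerate(zip(communality, psi), start=1):
    print(f"X{j}: communality = {h2:.3f}, specific variance = {u:.3f}")
```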

  10. Potential Pitfall The problem is, most covariance matrices cannot be factored in the manner we have defined for a factor model: Σ = LL′ + Ψ. For example… consider data with 3 variables which we are trying to describe with 1 factor.

  11. Potential Pitfall We can write our factor model as follows: Xj − μj = ljF1 + εj for j = 1, 2, 3. Using our factor model representation, Σ = LL′ + Ψ, we can define the following six equations: σ11 = l1² + ψ1, σ22 = l2² + ψ2, σ33 = l3² + ψ3, σ12 = l1l2, σ13 = l1l3, σ23 = l2l3.

  12. Potential Pitfall Use these equations to find the factor loadings and specific variances: the off-diagonal equations give l1² = σ12σ13/σ23 (and similarly for l2² and l3²), and then ψj = σjj − lj².

  13. Potential Pitfall However, this can result in the following problems: the solved loadings may satisfy |lj| > 1 (impossible for standardized variables, where lj is the correlation between Xj and the factor), and the implied specific variances ψj = σjj − lj² may be negative, which is not a valid variance.
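
A small numeric sketch of this pitfall (the correlation values below are assumed purely for illustration; they are not from the slides):

```python
import numpy as np

# Hypothetical correlations for 3 standardized variables, m = 1 factor
r12, r13, r23 = 0.9, 0.7, 0.4

# Off-diagonal equations: r12 = l1*l2, r13 = l1*l3, r23 = l2*l3
l1_sq = r12 * r13 / r23        # (l1 l2)(l1 l3)/(l2 l3) = l1^2
l1 = np.sqrt(l1_sq)
psi1 = 1.0 - l1_sq             # specific variance of X1

print(f"l1   = {l1:.3f}  (> 1: larger than any correlation can be)")
print(f"psi1 = {psi1:.3f}  (< 0: a negative 'variance')")
```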

  14. Limitations • Linearity - Assuming the X’s are linear combinations of the underlying factors - Factors are unobserved, so we cannot verify this assumption - If the relationship is non-linear, the linear combinations may provide a good approximation for only a small range of values • The elements of Σ are described by mp factor loadings in L and p specific variances {ψj} - Model most useful for small m, but often mp + p parameters are not sufficient and Σ is not close to LL′ + Ψ

  15. Limitations • Even when m < p, we can find L such that Σ = LL′ + Ψ… but L is not unique - Suppose T is an orthogonal matrix (T′ = T⁻¹ and TT′ = I) - Then L* = LT gives (LT)(LT)′ = LTT′L′ = LL′, so we can use any orthogonal matrix and get the same representation - Thus we have infinitely many possible loadings
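
A two-line check (made-up loadings, arbitrary rotation angle) that an orthogonal T leaves LL′ unchanged:

```python
import numpy as np

L = np.array([[0.9, 0.1],
              [0.8, 0.3],
              [0.2, 0.7]])

theta = np.pi / 6                       # any angle gives an orthogonal T
T = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])

L_star = L @ T                          # rotated loadings
print(np.allclose(L @ L.T, L_star @ L_star.T))   # True: identical LL'
```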

  16. Methods of Estimation We need to estimate the factor loadings L and the specific variances Ψ. We have a random sample of n subjects from a population and measure p attributes for each of the n subjects; the latent factors themselves are unobserved.

  17. Methods of Estimation We could also standardize our variables and work with the sample correlation matrix R instead of S. Methods of estimation: 1. Principal components method 2. Principal factor method 3. Maximum likelihood method

  18. Principal Component Method Given Σ (or S if we have a sample from the population), consider the spectral decomposition Σ = λ1e1e1′ + λ2e2e2′ + … + λpepep′, which has the factor-model form LL′ with L = [√λ1 e1, √λ2 e2, …, √λp ep] and Ψ = 0.

  19. Principal Component Method The problem here is that m = p, so we want to drop some factors. We drop the eigenvalues λ that are small (i.e., stop at λm).

  20. Principal Component Method Estimate L and Ψ by substituting the estimated eigenvalues/eigenvectors of S or R: L̂ = [√λ̂1 ê1, …, √λ̂m êm]. To make the diagonal elements of L̂L̂′ + Ψ̂ equal those of S, we let ψ̂j = sjj − Σi l̂ji².

  21. Principal Component Method The optimality of using L̂L̂′ + Ψ̂ to approximate S is due to the bound: the sum of squared elements of the residual matrix S − (L̂L̂′ + Ψ̂) is at most λ̂m+1² + … + λ̂p². Note, the sum of squared elements is an approximation of the sum of squared error. We can also estimate the proportion of the total sample variance due to the jth factor: λ̂j/(s11 + s22 + … + spp), which is λ̂j/p when using R.
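
A minimal sketch of the whole principal component recipe (my own code; R below is an arbitrary illustrative correlation matrix, not the stock data from the slides that follow):

```python
import numpy as np

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
m = 1                                        # number of factors to keep

eigvals, eigvecs = np.linalg.eigh(R)         # ascending eigenvalues
idx = np.argsort(eigvals)[::-1][:m]          # indices of the m largest
L_hat = eigvecs[:, idx] * np.sqrt(eigvals[idx])     # sqrt(lambda_i) * e_i
psi_hat = np.diag(R) - (L_hat ** 2).sum(axis=1)     # match the diagonal of R

residual = R - (L_hat @ L_hat.T + np.diag(psi_hat))
print("loadings:\n", np.round(L_hat, 3))
print("specific variances:", np.round(psi_hat, 3))
print("proportion of variance per factor:", np.round(eigvals[idx] / R.shape[0], 3))
print("residual matrix (diagonal is exactly 0):\n", np.round(residual, 3))
```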

  22. Example Stock price data consist of n = 100 weekly rates of return on p = 5 variables. Data were standardized and factor analysis performed on the sample correlation matrix R.

  23. Example Given the eigenvalues/vectors of R, find the first two factors.

  24. Example Given a 2-factor solution, we can find the communalities and specific variances based on our loadings and R.

  25. Example What is the cumulative proportion of variance accounted for by factor 1? What about both factors?

  26. Example How does our model check out? Examine the residual matrix R − (L̂L̂′ + Ψ̂).

  27. Example Two-factor (m = 2) solution: data standardized and factor analysis performed on the sample correlation matrix R. How might we interpret these factors?

  28. Principal Component Method Estimated loadings on the first factors do not change as the number of factors increases. Diagonal elements of S (or R) exactly equal the diagonal elements of L̂L̂′ + Ψ̂, but the sample covariances may not be exactly reproduced. Select the number of factors m to make the off-diagonal elements of the residual matrix small. The contribution of the kth factor to the total variance is λ̂k.

  29. Principal Factor Method Consider the model R = LL′ + Ψ, so that R − Ψ = LL′. Suppose initial estimates are available for the communalities (equivalently, the specific variances).

  30. Principal Factor Method Then we can factor the “reduced” correlation matrix Rr, which is R with the initial communality estimates substituted on the diagonal, using its eigenvalues/eigenvectors just as in the principal component method.

  31. Principal Factor Method Apply the procedure iteratively (a sketch follows this list): 1. Start with initial communality estimates 2. Compute factor loadings from eigenvalues/vectors of Rr 3. Compute new communality values 4. Repeat steps 2 and 3 until the algorithm converges Problems: - some eigenvalues of Rr can be negative - choice of m (if m is too large, some communalities exceed 1 and the iteration terminates)
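
A runnable sketch of that loop, under assumptions I am adding (squared-multiple-correlation starting values; an illustrative R; eigenvalues clipped at zero to sidestep the negative-eigenvalue problem noted above):

```python
import numpy as np

R = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])
m = 1

# Step 1: initial communalities via squared multiple correlations
h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))

for _ in range(200):
    Rr = R.copy()
    np.fill_diagonal(Rr, h2)                 # reduced correlation matrix
    eigvals, eigvecs = np.linalg.eigh(Rr)
    idx = np.argsort(eigvals)[::-1][:m]
    # Step 2: loadings from the m largest eigenpairs (clip negatives)
    L_hat = eigvecs[:, idx] * np.sqrt(np.maximum(eigvals[idx], 0.0))
    # Step 3: new communalities
    h2_new = (L_hat ** 2).sum(axis=1)
    if np.max(np.abs(h2_new - h2)) < 1e-9:   # Step 4: converged?
        break
    h2 = h2_new

print("loadings:\n", np.round(L_hat, 3))
print("specific variances:", np.round(1.0 - h2, 3))
```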

  32. Example Principal factor method m = 2 factor solution:

  33. Maximum Likelihood Method A likelihood function is needed, so additional assumptions must be made: the factors F and errors ε (and hence the X’s) are taken to be multivariate normal. An additional restriction, that L′Ψ⁻¹L be diagonal, specifies a unique solution. The MLEs L̂ and Ψ̂ are then obtained by numerically maximizing the likelihood.

  34. Maximum Likelihood Method For m factors: - estimated communalities ĥj² = l̂j1² + … + l̂jm² - proportion of the total sample variance due to the kth factor: (l̂1k² + … + l̂pk²)/p for standardized variables
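
One practical way to get a maximum likelihood fit is scikit-learn’s FactorAnalysis, which maximizes the same normal likelihood (its solution is defined only up to an orthogonal rotation, so it need not satisfy the diagonal L′Ψ⁻¹L restriction). This sketch uses simulated placeholder data, not the stock example from the slides:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # placeholder for a real n x p data matrix

Z = StandardScaler().fit_transform(X)    # standardize: work on the correlation scale
fa = FactorAnalysis(n_components=2).fit(Z)

L_hat = fa.components_.T                 # p x m loading matrix
psi_hat = fa.noise_variance_             # specific variances
print("communalities:", np.round((L_hat ** 2).sum(axis=1), 3))
print("proportion of variance per factor:",
      np.round((L_hat ** 2).sum(axis=0) / Z.shape[1], 3))
```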

  35. Example Two-factor (m = 2) solution: data standardized and factor analysis performed on the sample correlation matrix R.

  36. Large Sample Test for number of factors We want to be able to decide if the number of common factors m we’ve chosen is sufficient. So if n is large, we can do hypothesis testing: H0: Σ = LL′ + Ψ (with L a p × m matrix) versus H1: Σ is any other positive definite matrix. We can consider our estimates in our hypothesis statement…

  37. Large Sample Test for number of factors From this we develop a likelihood ratio test, comparing the fitted Σ̂ = L̂L̂′ + Ψ̂ under H0 to the unrestricted sample estimate Sn.
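
A hedged sketch of the usual large-sample version of this test (the Bartlett-corrected statistic as given in Johnson & Wichern, which I am assuming is the form on the slide): reject H0 when the statistic exceeds the chi-square critical value with [(p − m)² − p − m]/2 degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

def lrt_num_factors(S, L_hat, psi_hat, n, m):
    """Bartlett-corrected LRT that m common factors are sufficient."""
    p = S.shape[0]
    Sigma_hat = L_hat @ L_hat.T + np.diag(psi_hat)
    bartlett = n - 1 - (2 * p + 4 * m + 5) / 6
    stat = bartlett * np.log(np.linalg.det(Sigma_hat) / np.linalg.det(S))
    df = ((p - m) ** 2 - p - m) / 2          # must be > 0 for the test to apply
    return stat, df, chi2.sf(stat, df)

# usage sketch: stat, df, pval = lrt_num_factors(S, L_hat, psi_hat, n=100, m=2)
```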

  38. Test Results What does it mean if we reject the null hypothesis? - m is not an adequate number of factors Problems with the test: - If n is large and m is small compared to p, this test will very often reject the null - The result is we tend to keep more factors than we need - This can defeat the purpose of factor analysis - Exercise caution when using this test
