
Factor Analysis 1


jefferylee

Presentation Transcript


  1. Factor Analysis 1

  2. What is Factor Analysis (FA)? • Method of data reduction • take many variables and explain them with a few “factors” or “components” • correlated variables are grouped together and separated from other variables with low or no correlation • seeks underlying unobservable (latent) variables that are reflected in the observed variables (manifest variables)

  3. More on Factor Analysis • requires a large sample size since it is based on the correlation matrix of the variables involved • 50 cases is very poor • 100 is poor • 200 is fair • 300 is good • 500 is very good, and 1000 or more is excellent. • rule of thumb – a bare minimum of 10 observations per variable is necessary to avoid computational difficulties.

  4. “Good Factor” • A good factor: • makes sense • easy to interpret • simple structure • lacks complex loadings

  5. Problems with Factor Analysis • There is no statistical criterion to compare the linear combination to as in MANOVA or Canonical Correlations • It is more art than science • several extraction methods • several rotation methods • number of factors to extract • communality estimates • Life (researcher) saver • Often, when nothing else can be salvaged from research a FA will be conducted.

  6. Types of Factor Analysis • Exploratory Factor Analysis (EFA) • Confirmatory Factor Analysis (CFA)

  7. Exploratory Factor Analysis (EFA) • summarizing data by grouping correlated variables • investigating sets of measured variables related to theoretical constructs • usually done near the onset of research • generate “factor scores” which represent values of the underlying constructs for use in other analyses • often confused with Principal Component Analysis (PCA) which is a similar statistical procedure

  8. FA vs. PCA • EFA • analyzes only the variance shared among the variables (common variance without error or unique variance) • examines what are the underlying processes that could produce these correlations • produces factors • factors cause variables • PCA • analyzes all of the variance • only summarizes empirical associations • very data driven • produces components • components are aggregates of the variables

  9. Confirmatory Factor Analysis (CFA) • Confirmatory FA • more advanced technique • used when factor structure is known or at least theorized • testing generalization of factor structure to new data, etc. • tested through Structural Equation Modeling (SEM) methods discussed later in course

  10. Application of Factor Analysis • defining indicators of constructs • ideally 4 or more measures should be chosen to represent each construct of interest • choice of measures should, as much as possible, be guided by theory, previous research, and logic • selecting items or scales to be included in a measure • determine what items or scales should be included and excluded from a measure • results of the analysis should not be used alone in making decisions of inclusions or exclusions • decisions should be taken in conjunction with the theory and what is known about the construct(s) that the items or scales assess

  11. Assumptions Underlying Factor Analysis • measured variables are linearly related to the factors plus errors • this assumption is likely to be violated if items use limited response scales, e.g. too many dichotomous variables • each pair of variables should have a bivariate normal distribution • observations are independent • variables are assumed to be determined by common factors and unique factors • unique factors are assumed to be uncorrelated with each other and with the common factors

  12. Terminology • Reproduced Correlation Matrix • correlation matrix based on the extracted factors • want the values in the reproduced matrix to be as close to the values in the original correlation matrix as possible • If reproduced matrix is very similar to the original correlation matrix, then the few factors do a good job of representing the original data • Residual Correlation Matrix • represents the differences between original correlations and the reproduced correlations • should be close to zero
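
The reproduced and residual matrices described above can be sketched in a few lines. This is a minimal illustration, assuming orthogonal (uncorrelated) factors, in which case the reproduced correlations are the loading matrix times its transpose. The loadings and observed correlations are hypothetical values chosen for the example, not output from any real analysis.

```python
def reproduced_correlations(loadings):
    """Return L * L^T for a (variables x factors) loading matrix.

    Assumes orthogonal factors. The diagonal holds the communalities.
    """
    p = len(loadings)
    return [[sum(a * b for a, b in zip(loadings[i], loadings[j]))
             for j in range(p)] for i in range(p)]


def residual_correlations(observed, reproduced):
    """Observed minus reproduced correlations; the diagonal is set to 0."""
    p = len(observed)
    return [[0.0 if i == j else observed[i][j] - reproduced[i][j]
             for j in range(p)] for i in range(p)]


# Hypothetical one-factor loadings for three variables.
loadings = [[0.8], [0.7], [0.6]]
# Hypothetical observed correlation matrix for the same variables.
observed = [[1.00, 0.58, 0.46],
            [0.58, 1.00, 0.44],
            [0.46, 0.44, 1.00]]

reproduced = reproduced_correlations(loadings)
residuals = residual_correlations(observed, reproduced)
largest_residual = max(abs(r) for row in residuals for r in row)
print("largest absolute residual:", round(largest_residual, 3))
```

Small residuals throughout suggest that the extracted factors represent the original correlations well.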

  13. Terminology • Eigenvalues • number of variables which the factor represents • amount of variance in the data described by the factor • Communalities • proportion of the variance in the original variables that can be explained by the factors • factor solution should explain at least half of each original variable's variance, so the communality value for each variable should be 0.50 or higher
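
As a rough sketch of how these two quantities relate to the loadings: a communality is the sum of squared loadings across a variable's row, and the variance attributed to a factor is the sum of squared loadings down its column (for an unrotated principal components solution, that column sum equals the eigenvalue). The loading values and variable names below are hypothetical.

```python
def communalities(loadings):
    """Row sums of squared loadings: share of each variable's variance explained."""
    return [sum(value * value for value in row) for row in loadings]


def factor_variances(loadings):
    """Column sums of squared loadings: variance accounted for by each factor."""
    n_factors = len(loadings[0])
    return [sum(row[j] ** 2 for row in loadings) for j in range(n_factors)]


# Hypothetical loadings for four variables on two orthogonal factors.
names = ["v1", "v2", "v3", "v4"]
loadings = [[0.8, 0.1],
            [0.7, 0.2],
            [0.1, 0.9],
            [0.2, 0.6]]

h2 = communalities(loadings)
variances = factor_variances(loadings)
# Flag variables that fall below the 0.50 communality guideline from the slide.
low = [name for name, value in zip(names, h2) if value < 0.50]
print("communalities:", [round(value, 2) for value in h2])
print("factor variances:", [round(value, 2) for value in variances])
print("below 0.50:", low)
```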

  14. Terminology • Rotated Factor Matrix • represents both how the variables are weighted for each factor and also the correlation between the variables and the factor • these are correlations, so possible values range from -1 to +1 • In SPSS, you can tell it to suppress printing of any correlations whose absolute value is less than a particular value (0.3 is commonly used) • makes the output easier to read by removing the clutter of low correlations that are probably not meaningful anyway
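
The same decluttering idea can be imitated outside SPSS. The sketch below is an illustration only: `format_loadings` is a hypothetical helper, and the loadings and item names are made up. Loadings whose absolute value falls below the cutoff are printed as blanks.

```python
def format_loadings(loadings, names, cutoff=0.3):
    """Return a text table in which loadings with |value| < cutoff are blanked."""
    lines = []
    for name, row in zip(names, loadings):
        cells = [f"{value:6.2f}" if abs(value) >= cutoff else " " * 6
                 for value in row]
        lines.append(f"{name:<8}" + " ".join(cells))
    return "\n".join(lines)


# Hypothetical rotated loadings for four items on two factors.
names = ["item1", "item2", "item3", "item4"]
loadings = [[0.81, 0.12],
            [0.74, -0.05],
            [0.10, 0.77],
            [-0.45, 0.62]]

table = format_loadings(loadings, names)
print(table)
```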

  15. General Steps to FA • Step 1: Selecting and Measuring a set of variables in a given domain • Step 2: Data screening in order to prepare the correlation matrix • Step 3: Factor Extraction • Step 4: Factor Rotation to increase interpretability • Step 5: Interpretation • Further Steps: Validation and Reliability of the measures

  16. The Correlation Matrix • generate a correlation matrix for all variables • identify variables not related to other variables • if correlations between variables are small, it is unlikely that they share common factors (variables must be related to each other for the factor model to be appropriate) • think of correlations in absolute value • correlation coefficients > 0.3 in absolute value are indicative of acceptable correlations • examine visually the appropriateness of the factor model
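
A minimal sketch of this screening step, using only the standard library: compute the Pearson correlation matrix and flag any variable whose correlations with all other variables are at or below 0.3 in absolute value. The data and the helper names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation matrix for a list of variables (each a list of scores)."""
    p = len(columns)
    return [[1.0 if i == j else pearson(columns[i], columns[j])
             for j in range(p)] for i in range(p)]


def weakly_related(R, names, threshold=0.3):
    """Names of variables with no |r| above the threshold with any other variable."""
    flagged = []
    for i, name in enumerate(names):
        if all(abs(R[i][j]) <= threshold for j in range(len(R)) if j != i):
            flagged.append(name)
    return flagged


# Hypothetical scores: a and b move together, c is unrelated to both.
names = ["a", "b", "c"]
columns = [[1, 2, 3, 4, 5, 6],
           [2, 1, 4, 3, 6, 5],
           [3, 1, 2, 2, 1, 3]]

R = correlation_matrix(columns)
flagged = weakly_related(R, names)
print("candidates to drop before factoring:", flagged)
```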

  17. The Correlation Matrix • Bartlett Test of Sphericity • tests the null hypothesis that the correlation matrix is an identity matrix (all diagonal terms are 1 and all off-diagonal terms are 0) • want to reject this null hypothesis • If the value of the test statistic for sphericity is large and the associated significance level is small, it is unlikely that the population correlation matrix is an identity matrix
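
The test statistic can be sketched as follows, assuming the usual chi-square approximation: statistic = -(n - 1 - (2p + 5) / 6) * ln|R| with p(p - 1) / 2 degrees of freedom, where n is the sample size, p the number of variables, and |R| the determinant of the correlation matrix. The correlation matrix and sample size below are hypothetical, and the example compares the statistic to a tabled critical value instead of computing a p-value.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def bartlett_sphericity(R, n_obs):
    """Return (chi-square statistic, degrees of freedom).

    Requires a positive determinant; a singular R makes the log undefined.
    """
    p = len(R)
    statistic = -(n_obs - 1 - (2 * p + 5) / 6) * math.log(determinant(R))
    df = p * (p - 1) // 2
    return statistic, df


# Hypothetical correlation matrix for 3 variables measured on 100 cases.
R = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.6],
     [0.6, 0.6, 1.0]]

statistic, df = bartlett_sphericity(R, n_obs=100)
CRITICAL_05_DF3 = 7.815  # chi-square critical value for alpha = 0.05, df = 3
print("statistic:", round(statistic, 2), "df:", df)
print("reject identity matrix:", statistic > CRITICAL_05_DF3)
```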

  18. The Correlation Matrix • The Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy • index for comparing the magnitude of the observed correlation coefficients to the magnitude of the partial correlation coefficients • the closer the KMO measure is to 1, the stronger the evidence of sampling adequacy • 0.8 and higher are great • 0.7 is acceptable • 0.6 is mediocre • < 0.5 is unacceptable • Small KMO values indicate that a factor analysis may not be a good idea.
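
A sketch of the overall KMO index, assuming the standard definition: the sum of squared off-diagonal correlations divided by that sum plus the sum of squared off-diagonal partial correlations. The partial correlations are taken from the inverse of the correlation matrix. The matrix below is hypothetical, and the small Gauss-Jordan inverse is included only to keep the example dependency-free.

```python
import math


def invert(matrix):
    """Matrix inverse by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    a = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        pivot_value = a[col][col]
        a[col] = [x / pivot_value for x in a[col]]
        for r in range(n):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]


def kmo(R):
    """Overall Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    p = len(R)
    inverse = invert(R)
    sum_r2 = 0.0  # squared observed correlations
    sum_a2 = 0.0  # squared partial correlations
    for i in range(p):
        for j in range(p):
            if i != j:
                partial = -inverse[i][j] / math.sqrt(inverse[i][i] * inverse[j][j])
                sum_r2 += R[i][j] ** 2
                sum_a2 += partial ** 2
    return sum_r2 / (sum_r2 + sum_a2)


# Hypothetical correlation matrix for three variables.
R = [[1.0, 0.6, 0.6],
     [0.6, 1.0, 0.6],
     [0.6, 0.6, 1.0]]

value = kmo(R)
print("KMO:", round(value, 3))
```

For this hypothetical matrix the index lands in the "acceptable" band of the scale given on the slide.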

  19. Factor Extraction • primary objective is to determine the factors • initial decisions can be made here about the number of factors underlying a set of measured variables • several factor extraction methods • Principal Component Analysis – used for data reduction • Maximum likelihood method • Principal axis factoring • Alpha method • Unweighted least squares method • Generalized least squares method • Image factoring

  20. Factor Extraction • To decide how many factors are needed to represent the data, use 2 statistical criteria: • Eigenvalues • Scree Plot • Determination of the number of factors is usually done by considering only factors with Eigenvalues > 1. • Factors with a variance less than 1 are no better than a single variable, since each standardized variable has a variance of 1.
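
The eigenvalue-greater-than-1 rule can be sketched as below. To stay dependency-free, the example finds the eigenvalues of a symmetric matrix with cyclic Jacobi rotations; in practice a library routine such as `numpy.linalg.eigvalsh` would normally be used instead. The correlation matrix is hypothetical.

```python
import math


def symmetric_eigenvalues(matrix, sweeps=50, tol=1e-20):
    """Eigenvalues of a symmetric matrix via cyclic Jacobi rotations, largest first."""
    a = [row[:] for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                # Rotation angle that zeroes the (p, q) element.
                tau = (a[q][q] - a[p][p]) / (2 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(1 + tau * tau))
                c = 1 / math.sqrt(1 + t * t)
                s = t * c
                for k in range(n):  # update columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # update rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(n)), reverse=True)


# Hypothetical correlation matrix: two pairs of correlated variables.
R = [[1.0, 0.8, 0.0, 0.0],
     [0.8, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.5],
     [0.0, 0.0, 0.5, 1.0]]

eigenvalues = symmetric_eigenvalues(R)
n_factors = sum(1 for value in eigenvalues if value > 1)
print("eigenvalues:", [round(value, 3) for value in eigenvalues])
print("factors retained by the eigenvalue > 1 rule:", n_factors)
```

The eigenvalues of a correlation matrix sum to the number of variables, which is why an eigenvalue of 1 corresponds to the variance of a single standardized variable.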

  21. Factor Extraction • Examination of the Scree Plot provides a visual of the total variance associated with each factor. • Steep slope shows large factors. • Gradual trailing off (scree) shows the rest of the factors usually have Eigenvalue < 1. • In choosing the number of factors, in addition to the statistical criteria, make initial decisions based on conceptual and theoretical grounds.
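
A rough text-only stand-in for a scree plot, given eigenvalues that have already been computed. The eigenvalues are hypothetical, and the helper names are made up for this illustration. Each row shows the eigenvalue as a bar, and the table reports the proportion and cumulative proportion of total variance.

```python
def variance_explained(eigenvalues):
    """Rows of (factor number, eigenvalue, proportion, cumulative proportion)."""
    total = sum(eigenvalues)
    running = 0.0
    rows = []
    for number, value in enumerate(eigenvalues, start=1):
        running += value
        rows.append((number, value, value / total, running / total))
    return rows


def text_scree(eigenvalues, width=40):
    """One bar per factor; factors with eigenvalue > 1 are marked."""
    top = max(eigenvalues)
    lines = []
    for number, value in enumerate(eigenvalues, start=1):
        bar = "#" * round(width * value / top)
        marker = " <- eigenvalue > 1" if value > 1 else ""
        lines.append(f"{number:2d} {value:5.2f} {bar}{marker}")
    return "\n".join(lines)


# Hypothetical eigenvalues for six variables (they sum to 6).
eigenvalues = [3.2, 1.4, 0.6, 0.4, 0.25, 0.15]

rows = variance_explained(eigenvalues)
plot = text_scree(eigenvalues)
print(plot)
for number, value, share, cumulative in rows:
    print(f"factor {number}: {share:.1%} of variance, {cumulative:.1%} cumulative")
```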

  22. Factor Extraction – using PCA

  23. Factor Extraction using Principal Axis Factoring

  24. Factor Rotation • Unrotated factors are typically not very interpretable (most factors are correlated with many variables). • Factors are rotated to make them more meaningful and easier to interpret (each variable is associated with a minimal number of factors). • Different rotation methods may result in the identification of somewhat different factors.

  25. Factor Rotation • Two types of rotation • Orthogonal – produces uncorrelated factors/components • Varimax: most popular • attempts to minimize the number of variables that have high loadings on a factor. • enhances the interpretability of the factors • Quartimax • Equamax • Oblique – produces correlated factors/components • used less frequently because results are more difficult to summarize • types • Direct Quartimin • Promax • Harris-Kaiser Orthoblique
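
To make the idea of an orthogonal rotation concrete, here is a minimal sketch of raw varimax using pairwise planar rotations: for each pair of factors it picks the angle that maximizes the variance of the squared loadings. It assumes no Kaiser normalization, which statistical packages often apply by default, so results may differ slightly from packaged output. The unrotated loadings are hypothetical.

```python
import math


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    p = len(loadings)
    total = 0.0
    for j in range(len(loadings[0])):
        squares = [row[j] ** 2 for row in loadings]
        mean = sum(squares) / p
        total += sum((s - mean) ** 2 for s in squares) / p
    return total


def varimax(loadings, max_sweeps=100, tol=1e-10):
    """Raw varimax rotation by repeated pairwise planar rotations."""
    L = [row[:] for row in loadings]
    p, k = len(L), len(L[0])
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for a in range(k - 1):
            for b in range(a + 1, k):
                u = [row[a] ** 2 - row[b] ** 2 for row in L]
                v = [2 * row[a] * row[b] for row in L]
                A, B = sum(u), sum(v)
                C = sum(ui * ui - vi * vi for ui, vi in zip(u, v))
                D = 2 * sum(ui * vi for ui, vi in zip(u, v))
                # Angle that maximizes the criterion for this pair of factors.
                angle = 0.25 * math.atan2(D - 2 * A * B / p,
                                          C - (A * A - B * B) / p)
                cos_a, sin_a = math.cos(angle), math.sin(angle)
                for row in L:
                    x, y = row[a], row[b]
                    row[a], row[b] = x * cos_a + y * sin_a, -x * sin_a + y * cos_a
                largest_angle = max(largest_angle, abs(angle))
        if largest_angle < tol:
            break
    return L


# Hypothetical unrotated loadings: every variable loads on both factors.
unrotated = [[0.7, 0.5],
             [0.6, 0.6],
             [0.7, -0.5],
             [0.6, -0.6]]

rotated = varimax(unrotated)
for row in rotated:
    print([round(value, 2) for value in row])
```

Because the rotation is orthogonal, each variable's communality is unchanged; only the distribution of its loadings across factors shifts toward simple structure.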

  26. Factor Rotation • A factor is interpreted or named by examining the largest values linking the factor to the measured variables in the rotated factor matrix.
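
One simple way to prepare for naming factors is to group each variable under the factor on which it has its largest absolute loading. The sketch below does that; the variable names, loadings, and the `group_by_factor` helper are all hypothetical, and the 0.3 cutoff is only the conventional display threshold mentioned earlier.

```python
def group_by_factor(loadings, names, cutoff=0.3):
    """Map factor index -> list of (variable, loading), strongest first.

    Each variable goes to the factor with its largest absolute loading;
    variables with no loading at or above the cutoff are left out.
    """
    groups = {}
    for name, row in zip(names, loadings):
        best = max(range(len(row)), key=lambda j: abs(row[j]))
        if abs(row[best]) >= cutoff:
            groups.setdefault(best, []).append((name, row[best]))
    for members in groups.values():
        members.sort(key=lambda item: abs(item[1]), reverse=True)
    return groups


# Hypothetical rotated loadings for four test scores on two factors.
names = ["reading", "vocabulary", "geometry", "algebra"]
loadings = [[0.78, 0.20],
            [0.85, 0.10],
            [0.05, 0.72],
            [0.15, 0.80]]

groups = group_by_factor(loadings, names)
for factor, members in sorted(groups.items()):
    print(f"factor {factor + 1}:", [name for name, _ in members])
```

In this made-up example the first factor might be labeled something like "verbal" and the second "quantitative"; the label itself remains a judgment based on the meaning of the grouped variables.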

  27. Making Final Decisions • should base the final decision on the number of factors for the rotated solution that is most interpretable. • identify factors by grouping variables that have large loadings for the same factor • interpret factors according to the meaning of the variables • decision should be guided by: • conceptual beliefs about the number of factors from past research or theory • Eigenvalues computed earlier • relative interpretability of the rotated solutions computed
