This book provides an introduction to factor analysis, a statistical technique used to reduce variables and identify underlying structures in data. It covers the uses of factor analysis in data reduction, creating and validating composites/scales for psychometric instruments, and the exploration and development of questionnaires. The book discusses different approaches to factor analysis, methodological considerations, and the assumptions and requirements of the technique.
An Introduction to Factor Analysis: Reducing variables and/or detecting underlying structures
Uses • Data reduction: 24 actual variables reduced to two latent variables (Factor 1 and Factor 2)
Uses • Create composites/scales for psychometric instruments Depression Anxiety
Uses • Validate composites/scales for psychometric instruments Depression Anxiety
Summary of uses • Also used in the development or exploration of questionnaires or other psychometric instruments. • Factor analytic techniques are most commonly used to reduce many items into a more usable number of factors, so that the simplified data can be used more easily in research.
Latent variables: A metaphor
An example of common variance using bivariate relationships • I measure a sample of kindergarten children’s ability to recognize the sound(s) at the beginning of words, e.g., /k/ in “cat” • I also measure the children’s ability to segment (break apart) sounds e.g., “cat” = /k/ /a/ /t/ • I correlate these two measures
[Figure: Phoneme Segmentation and Beginning letter sounds]
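The idea of common variance between two measures can be illustrated with a small sketch. The scores below are synthetic (simulated, not data from the study described in the slides), and the variable names are hypothetical. The squared correlation is read as the proportion of variance the two measures share.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical number of kindergarten children

# Assumption: both skills draw on one shared (latent) ability plus their own noise
shared_ability = rng.normal(size=n)
beginning_sounds = shared_ability + rng.normal(scale=0.8, size=n)
segmentation = shared_ability + rng.normal(scale=0.8, size=n)

# Bivariate correlation between the two measures
r = np.corrcoef(beginning_sounds, segmentation)[0, 1]

# r squared: proportion of variance the two measures have in common
common_variance = r ** 2
print(f"r = {r:.2f}, shared variance = {common_variance:.2f}")
```

With more than two measures, factor analysis looks for this kind of overlap across the whole set of variables at once.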
Not useful when • A vast array of variables with no theoretical association is forced into analysis just to see what turns up • The variables have inadequate reliability. This lack of stability of measurement affects the meaningfulness of the derived factors.
Approaches to Factor Analytic Techniques • Exploratory: Mathematically driven technique. Seeks to identify the underlying structure of a set of items or variables. Uses scholarly intuition to figure out what the factors mean. • Confirmatory: Starts with a theory of what you expect to confirm (a priori). Do the items load as you expected on the factors that you predicted? Much more involved: a Structural Equation Modeling approach with a test of model fit.
Methodological Considerations • Selection of variables • Size of sample • Reliability of measures • Appropriateness of using Factor Analytic techniques (given the goal of the research) • Choice of method (how to extract the factors) • How many factors to retain • Methods of rotation (to ease interpretability) Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.
Methodological Considerations • Selection of variables Hogarty, K. Y., Kromrey, J. D., Ferron, J. M., & Hines, C. V. (2004). Selection of variables in exploratory factor analysis: An empirical comparison of a stepwise and traditional approach. Psychometrika, 69(4), 593-611.
Assumptions and Requirements of Factor Analytic Techniques • More than one variable involved • Sample acquired through random selection • Robust bivariate relationships among variables • Variables are measured using either interval or ratio (or ordinal/quasi-interval?) level data • Data approximate a normal distribution (multivariate normality is also nice) • Relationships among variables are linear • Variables are measured reliably • No multicollinearity (e.g., bivariate r above 0.90) • Few missing observations • “Large” number of observations
Size of sample What is a reasonable sample size? How many observations do you need? • Old school: Ten observations per planned extracted factor (with a minimum of 100 recommended) • “More is better” rule: the reasoning is similar to that for other parametric statistical techniques, but fewer observations can be acceptable under some circumstances. • More recently, it has become more widely recognized that smaller samples can be reasonably factor analyzed, but this is still hotly debated.
Reliability of measures • Factor analysis is a correlational technique (multiple regression) • Low reliabilities attenuate correlations • Low reliabilities introduce “noise” and obscure “signal” for the factors you are trying to detect and extract
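How low reliability attenuates a correlation can be sketched with the classical attenuation formula from test theory, in which the observed correlation equals the true-score correlation multiplied by the square root of the product of the two reliabilities. The slides do not give this formula; it is brought in here as an illustration, and the function name and numbers are hypothetical.

```python
import math

def attenuated_correlation(true_r, reliability_x, reliability_y):
    """Expected observed correlation given the true-score correlation
    and the reliabilities of the two measures (classical attenuation formula)."""
    return true_r * math.sqrt(reliability_x * reliability_y)

# Hypothetical values: a true correlation of .60 measured with
# reliabilities of .70 and .80
observed = attenuated_correlation(0.60, 0.70, 0.80)
print(round(observed, 3))  # 0.449
```

Under these assumed values, a true correlation of .60 would be observed as roughly .45, which is the kind of weakened "signal" that makes factors harder to detect.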
Appropriateness of Factor Analysis • Test development and instrument validation • Create composites/sub-scales for psychometric instruments • Detect underlying structures within • Construct validity • Evaluation of a theory • Data reduction • Reduce multiple variables to a smaller group, while maintaining the diversity of information offered. • Demonstrate that multiple instruments test the same thing • Demonstrate that items load on one factor, or no factors, or multiple factors
Partitioning Variance • Variance common to other variables • Variance specific to that variable • Random measurement error
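The three-way partition above can be written compactly for a standardized variable (total variance of 1). The symbols are assumptions chosen for illustration and do not appear in the slides: h² for the common variance (communality), s² for the variance specific to the variable, and e² for random measurement error.

```latex
1 \;=\; \underbrace{h_i^2}_{\text{common}}
  \;+\; \underbrace{s_i^2}_{\text{specific}}
  \;+\; \underbrace{e_i^2}_{\text{error}},
\qquad
u_i^2 \;=\; s_i^2 + e_i^2 \;=\; 1 - h_i^2
```

Here u² is the unique variance of item i, i.e., everything that is not shared with the other variables.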
Common Factor Analysis (CFA) • Assumption: The factors explain the correlations among the variables (variance in common) • Finds common variance among many items, groups it, and then it must be appropriately labeled • Goal: To find the fewest number of factors that account for the relationships among variables • Most common methods of extracting factors? [Figure: common variance and three items’ unique variance, with the label “CFA considers this variance”] DeCoster (1998) Overview of Factor Analysis; Kahn 2006
Principal Components Analysis (PCA) • Assumption: Components explain the variance in common among the variables and the amount of unique variance (item & error) present • Goal: To find the fewest components that account for the relationships among variables [Figure: three items’ unique variance (item + error)]
Comparisons • Common Factor Analysis: Seeks the factors that account for the common variance among the variables. Used for Exploratory Factor Analysis (EFA) or Confirmatory Factor Analysis (CFA). Easier to generalize to other samples/populations since the unique and error variance of items isn’t considered. Most often used to detect underlying structures among variables. • Principal Components Analysis: Seeks factors that account for all of the common and other variance among the variables. Harder to generalize since other sources of variance (that are item specific and not shared) are included in the model. Most often used for data reduction to use in research.
Factor Analytic Techniques • Exploratory questions: What factors exist among the variables? To what degree are the variables (items) related to the factors that were extracted? (FACTOR LOADINGS) [Figure: path diagram of observed variables (Items 1 to 10), each with its own unique variance, and two unobserved latent variables (Factor 1 and Factor 2)] Kahn 2006
Common Factor Analysis • CFA takes into account shared (common) and item specific variance and uses the squared multiple correlation (R squared) as the measure of communality. • Communality is the variance in one variable that is shared with the other variables. • The factors extracted by CFA, therefore, explain the shared variance common to more than one variable.
Common Factor Analysis • Variance common to other variables Multicultural Counseling Inventory—Item 6: “I include the facts of age, gender roles, and socioeconomic status in my understanding of different minority cultures.” The measured overlap (R square) between this item and the other items on the MCI is the communality.
Common Factor Analysis • Partitions out the variance in each variable that is in common with the other variables. • How? Uses Multiple Regression (MR): • Use each item as an outcome in MR • Use all other items as predictors • Finds the communality among all of the variables, relative to one another
Common Factor Analysis: Item 1 • Outcome: Item 1 • Predictors: Item 2, Item 3, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10 • The R square is the variance that item shares with the other items
Common Factor Analysis: Item 2 • Outcome: Item 2 • Predictors: Item 1, Item 3, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10 • The R square is the variance that item shares with the other items
Common Factor Analysis: Item 3 • Outcome: Item 3 • Predictors: Item 1, Item 2, Item 4, Item 5, Item 6, Item 7, Item 8, Item 9, Item 10 • The R square is the variance that item shares with the other items
How is communality reported with CFA? Squared multiple correlations (R square) are on the diagonal of the correlation matrix
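The regression-based communality estimate can be sketched in a few lines. Rather than running one regression per item, the sketch uses the standard identity that the squared multiple correlation of each variable with all the others equals 1 − 1/(R⁻¹)ᵢᵢ, where R is the correlation matrix. The three-item matrix is hypothetical, and the function name is an assumption made for illustration.

```python
import numpy as np

def squared_multiple_correlations(R):
    """R square of each variable regressed on all the others,
    computed from the inverse of the correlation matrix."""
    R = np.asarray(R, dtype=float)
    return 1.0 - 1.0 / np.diag(np.linalg.inv(R))

# Hypothetical correlation matrix for three items
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

smc = squared_multiple_correlations(R)

# Reduced correlation matrix: the communality estimates replace
# the 1s on the diagonal before factors are extracted
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)

print(np.round(smc, 3))
print(np.round(R_reduced, 3))
```

Each entry of `smc` matches the R square that would be obtained by regressing that item on the remaining items.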
What makes a good factor? • It is consistent with the literature regarding past investigations of variable relationships • It is easy to understand and interpret • It adheres to the “simple structure” model
Principal Component Analysis Data reduction
Principal Component Analysis • How many components are there that can account for all or most of the information contained in the original data? [Figure: path diagram of Items 1 to 10, each with its own unique variance, and two components (Component 1 and Component 2)] Kahn 2006
CFA vs. PCA • Common factor analysis and principal components analysis often yield similar results when sample sizes are large and/or if item communalities are large. • Common factor analysis is preferred in situations in which these criteria are not met, especially when the researcher wishes to better understand the latent variables that underlie a mass of items.
Factor Analytic Family of Techniques Metaphors for extraction of factors/components
With each extraction of a component, less and less variance is unaccounted for. [Figure: components numbered 1 through 8]
Factor Analysis Metaphor • ITEM POOL: Variance-covariance matrix for an instrument • First factor: Extracts the shared variation only (i.e., plusses) • ITEM POOL: There is still shared variance left, but it is different than the first batch • Second factor: Extracts the shared variation only (i.e., plusses) [Figure: grids of + and - marks representing the item pool before and after each factor is extracted]
The Principle of Parsimony • Goal: We often want to use the smallest number of separate variables to convey the most information about the relationships among constructs. “Less is more” Kahn 2006
How many factors to retain? If you keep letting the program extract factors, it will extract as many factors as there are items. So how do you decide how many factors to extract? Bryant & Yarnold (1995). Principal-Components and Factor Analysis from Grimm & Yarnold’s (Eds.) Reading and Understanding Multivariate Statistics
You want the fewest factors necessary to account for the most variance. Factor Analytic techniques will give you as many factors as you want (even if they’re complete nonsense). The aim is to find the real factors that are consistent with the theoretical structure, not just factors that pop up and have no logical explanation. Ferketich & Muller (1999) Readings in Research Methodology, Second Edition
How many factors to retain? • A priori criterion: replication criterion, percentage criterion • Stopping rules: Kaiser rule, Cattell’s scree plot, parallel analysis. Bryant & Yarnold (1995). Principal-Components and Factor Analysis from Grimm & Yarnold’s (Eds.) Reading and Understanding Multivariate Statistics
A priori criterion 1. You are replicating research and want to retain the same number of factors as previous researchers. 2. You decide on a cut-off point, based on some theoretical rationale (e.g., retain factors until 80% of the variance is explained by the extracted factors).
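The percentage cut-off can be sketched as a cumulative sum over the eigenvalues. The eigenvalues below are hypothetical, and the function name and the 80% default are assumptions for illustration only.

```python
import numpy as np

def n_factors_for_variance(eigenvalues, threshold=0.80):
    """Smallest number of factors whose cumulative share of the
    total variance reaches the chosen threshold."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative proportion that meets the threshold
    idx = int(np.searchsorted(cumulative, threshold))
    return min(idx + 1, len(eigenvalues))

# Hypothetical eigenvalues; cumulative shares are .50, .75, .88, .95, 1.00
eigenvalues = [3.0, 1.5, 0.8, 0.4, 0.3]
print(n_factors_for_variance(eigenvalues, threshold=0.80))  # 3
```

The threshold itself still has to come from a theoretical rationale; the code only applies it.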
Eigenvalues The eigenvalue is the total variance, summed across all of the variables, that is accounted for by the factor in question. The sum of all eigenvalues = number of variables/items in component analysis Ferketich & Muller (1999) Readings in Research Methodology, Second Edition
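A quick numerical check of the statement that the eigenvalues sum to the number of variables in a component analysis. The correlation matrix is hypothetical; because each standardized variable contributes a variance of 1, the sum of the eigenvalues equals the number of items.

```python
import numpy as np

# Hypothetical correlation matrix for three items
R = np.array([
    [1.0, 0.6, 0.5],
    [0.6, 1.0, 0.4],
    [0.5, 0.4, 1.0],
])

# Eigenvalues of a symmetric matrix, sorted from largest to smallest
eigenvalues = np.linalg.eigvalsh(R)[::-1]

# Share of the total variance accounted for by each component
proportion = eigenvalues / R.shape[0]

print(np.round(eigenvalues, 3))
print(round(float(eigenvalues.sum()), 6))
print(np.round(proportion, 3))
```

Dividing each eigenvalue by the number of items gives the proportion of total variance that component accounts for.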
How many factors to retain? Kaiser criterion: Retain all factors with an eigenvalue greater than 1.0. This sets the limit so that a component must account for at least as much variance as a single variable (to be considered useful). (For CFA, which SPSS calls principal axis factoring, this would be “factor” instead of “component.”) Kahn 2006
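A minimal sketch of the Kaiser criterion applied to the eigenvalues of a correlation matrix. The four-item matrix is hypothetical and built so that items 1 and 2 correlate with each other, as do items 3 and 4, with no correlation across the pairs. The function name is an assumption.

```python
import numpy as np

def kaiser_retained(R):
    """Count the components whose eigenvalue exceeds 1.0, i.e., that
    account for more variance than a single standardized variable."""
    eigenvalues = np.linalg.eigvalsh(np.asarray(R, dtype=float))[::-1]
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Hypothetical correlation matrix: two pairs of related items
R = np.array([
    [1.0, 0.7, 0.0, 0.0],
    [0.7, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.6],
    [0.0, 0.0, 0.6, 1.0],
])

n_retained, eigenvalues = kaiser_retained(R)
print(np.round(eigenvalues, 2))  # eigenvalues are 1.7, 1.6, 0.4, 0.3
print(n_retained)                # 2
```

In this constructed case the rule retains two components, one per pair of items. The rule is a heuristic, which is presumably why the slides list it alongside other stopping rules.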