PC Decisions: # PCs, Rotation & Interpretation. Remembering the process Some cautionary comments Statistical approaches Mathematical approaches “Nontrivial factors” approaches Simple Structure & Factor Rotation Major Kinds of Factor Rotation Factor Interpretation.
How the process really works… Here’s the series of steps we talked about earlier. • # factors decision • Rotate the factors • interpreting the factors • factor scores These “steps” aren’t made independently and done in this order! Considering the interpretations of the factors can aid the # factors decision! Considering how the factor scores (representing the factors) relate to each other and to variables external to the factoring can aid both the # factors decision and interpretation.
Statistical Procedures • PCs are extracted from a correlation matrix (R) • PCs should only be extracted if there is “systematic covariation” in the correlation matrix • This is known as the “sphericity question” • Note: the test asks if the next PC should be extracted • There are two different sphericity tests • Whether there is any systematic covariation in the original R • Whether there is any systematic covariation left in the partial R, after a given number of factors has been extracted • Both tests are called “Bartlett’s Sphericity Test”
Statistical Procedures, cont. • Applying Bartlett’s Sphericity Tests • Retaining H0: means “don’t extract another factor” • Rejecting H0: means “extract the next factor” • Significance tests provide a p-value, and so a known probability that the next factor is “1 too many” (a type I error) • Like all significance tests, these are influenced by “N” • larger N = more power = more likely to reject H0: = more likely to “keep the next factor” (& make a Type I error) • Quandary?!? • Samples large enough to have a stable R are likely to have “excessive power” and lead to “over factoring” • Be sure to consider % variance, replication & interpretability
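A minimal sketch of the first of the two tests (any systematic covariation in the original R), assuming the usual chi-square form of Bartlett's statistic, -(N - 1 - (2k + 5)/6) · ln|R| with k(k - 1)/2 degrees of freedom. The function name `bartlett_sphericity` and the 3-variable R are made up for illustration; the residual-matrix version of the test is not shown.

```python
import numpy as np

def bartlett_sphericity(R, n):
    """Chi-square statistic and df for H0: R is an identity matrix.

    R is a k x k correlation matrix, n is the sample size.
    """
    R = np.asarray(R, dtype=float)
    k = R.shape[0]
    # slogdet returns (sign, log|det|); the log form is numerically safer
    _, log_det = np.linalg.slogdet(R)
    chi_square = -(n - 1 - (2 * k + 5) / 6.0) * log_det
    df = k * (k - 1) // 2
    return chi_square, df

# Hypothetical correlation matrix for 3 variables, N = 100
R = np.array([[1.0, 0.5, 0.4],
              [0.5, 1.0, 0.3],
              [0.4, 0.3, 1.0]])
chi_square, df = bartlett_sphericity(R, n=100)

# The .05 critical value of chi-square with 3 df is 7.815;
# a statistic above it rejects H0 ("extract the next factor").
reject_h0 = chi_square > 7.815
```

Because N multiplies ln|R|, the same R gives a larger statistic as N grows, which is the "excessive power" quandary noted above.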
Mathematical Procedures • The most commonly applied decision rule (and the default in most stats packages -- chicken & egg ?) is the λ > 1.00 rule … here’s the logic Part 1 • Imagine a spherical R (of k variables) • each variable is independent and carries unique information • so, each variable has 1/kth of the information in R • For a “normal” R (of k variables) • each variable, on average, has 1/kth of the information in R
Mathematical Procedure, cont. Part 2 • The “trace” of a matrix is the sum of its diagonal • So, the trace of R (with 1’s in the diag) = k (# vars) • λ tells the amount of variance in R accounted for by each extracted PC • for a full PC solution Σλ = k (accounts for all variance) Part 3 • PC is about data reduction and parsimony • “trading” fewer more-complex things (PCs - linear combinations of variables) for more more-simple things (original variables)
Mathematical Procedure, cont. Putting it all together (hold on tight !) • Any PC with λ > 1.00 accounts for more variance than the average variable in that R • That PC “has parsimony” -- the more complex composite has more information than the average variable • Any PC with λ < 1.00 accounts for less variance than the average variable in that R • That PC “doesn’t have parsimony” -- the more complex composite has no more information than the average variable
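The logic above can be checked numerically. This is a small sketch with a made-up 4-variable R (not data from these slides): the eigenvalues sum to the trace of R, which is k, so the average λ is 1.00 and the rule keeps only PCs that beat that average.

```python
import numpy as np

# Hypothetical correlation matrix for k = 4 variables
R = np.array([[1.0, 0.6, 0.5, 0.1],
              [0.6, 1.0, 0.4, 0.1],
              [0.5, 0.4, 1.0, 0.2],
              [0.1, 0.1, 0.2, 1.0]])
k = R.shape[0]

# eigvalsh is meant for symmetric matrices; sort largest first
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]

trace_R = np.trace(R)                       # equals k for a correlation matrix
total_lambda = eigenvalues.sum()            # also equals k (full PC solution)
prop_variance = eigenvalues / k             # share of variance for each PC
n_retained = int(np.sum(eigenvalues > 1.0)) # the "lambda > 1.00" rule

for i, (lam, prop) in enumerate(zip(eigenvalues, prop_variance), start=1):
    print(f"PC{i}: lambda = {lam:.3f}, variance = {prop:.1%}")
print("PCs retained by the rule:", n_retained)
```

For a spherical R (an identity matrix) every λ equals 1.00, so no PC exceeds the average variable and the rule retains none.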
Mathematical Procedure, cont. There have been examinations of the accuracy of this criterion • The usual procedure is to generate a set of variables from a known number of factors (vk = b1k*PC1 + … +bfk*PCf, etc.) --- while varying N, # factors, # PCs & communalities • Then factor those variables and see if λ > 1.00 leads to the correct number of factors Results -- the rule “works pretty well on the average”, which really means that it gets the # factors right sometimes, underestimates sometimes and overestimates sometimes • No one has generated an accurate rule for assessing when which of these occurs • But the rule is most accurate with k < 40, f between k/5 and k/3 and N > 300
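One run of that kind of simulation might look like the sketch below. Everything specific here is an assumption chosen for illustration (N = 500, k = 12, f = 3, loadings of .7, four variables per factor, a fixed random seed); it sits inside the "most accurate" region described above, so it shows the rule on a favorable case only.

```python
import numpy as np

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
N, k, f = 500, 12, 3             # cases, variables, true factors (assumed)

# Each variable loads .7 on exactly one factor: b[v, j] in v = sum_j b*PC_j
loadings = np.zeros((k, f))
for j in range(f):
    loadings[4 * j:4 * (j + 1), j] = 0.7

factors = rng.standard_normal((N, f))                  # uncorrelated factors
unique_sd = np.sqrt(1.0 - np.sum(loadings**2, axis=1)) # unit total variance
noise = rng.standard_normal((N, k)) * unique_sd
data = factors @ loadings.T + noise

# Factor the generated variables and apply the lambda > 1.00 rule
R = np.corrcoef(data, rowvar=False)
eigenvalues = np.sort(np.linalg.eigvalsh(R))[::-1]
n_kept = int(np.sum(eigenvalues > 1.0))

print("true # factors:", f, " kept by rule:", n_kept)
```

Rerunning with smaller N, lower communalities, or more variables per factor is the way to see the under- and over-estimation the results describe.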
Nontrivial Factors Procedures These “common sense” approaches became increasingly common as… • the limitations of statistical and mathematical procedures became better known • the distinction between exploratory and confirmatory factoring developed and the crucial role of “successful exploring” became better known These procedures are more like “judgement calls” and require greater application of content knowledge and “persuasion”, but are often the basis of good factorings !!
Nontrivial factors Procedures, cont. Scree -- the “junk” that piles up at the foot of a glacier -- a “diminishing returns” approach • plot the λ for each factor and look for the “elbow” • “Old rule” -- # factors = elbow (1966; 3 below) • “New rule” -- # factors = elbow - 1 (1967; 2 below) • Sometimes there isn’t a clear elbow -- try another “rule” • This approach seems to work best when combined with attention to interpretability !! [Figure: scree plot of λ (0 to 4) against # PC (1 to 6)]
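Finding the elbow is a visual judgement, but a rough numeric stand-in can help when comparing solutions. The sketch below assumes one possible heuristic (the point where the curve bends most sharply, measured by the largest second difference); that heuristic, the function name `candidate_elbow`, and the six λ values are all illustrative assumptions, not the procedure behind either rule.

```python
def candidate_elbow(eigenvalues):
    """Return the 1-based position of the sharpest bend in the scree curve.

    The bend at position i is lambda[i-1] - 2*lambda[i] + lambda[i+1];
    the first and last points cannot be an elbow.
    """
    best_pos, best_bend = None, float("-inf")
    for i in range(1, len(eigenvalues) - 1):
        bend = eigenvalues[i - 1] - 2 * eigenvalues[i] + eigenvalues[i + 1]
        if bend > best_bend:
            best_pos, best_bend = i + 1, bend
    return best_pos

# Hypothetical eigenvalues for 6 PCs (they sum to k = 6)
eigenvalues = [2.9, 1.6, 0.5, 0.4, 0.35, 0.25]

elbow = candidate_elbow(eigenvalues)
old_rule = elbow        # "Old rule": # factors = elbow
new_rule = elbow - 1    # "New rule": # factors = elbow - 1
print("elbow at PC", elbow, "| old rule:", old_rule, "| new rule:", new_rule)
```

When the bends are all similar in size there is no clear elbow, and another rule plus attention to interpretability is the better guide.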
An Example… A buddy in graduate school wanted to build a measure of “contemporary morality”. He started with the “10 Commandments” and the “7 Deadly Sins” and created a 56-item scale with 8 subscales. His scree plot looked like… [Figure: scree plot of λ (0 to 20) against factor number (1 to 56)] How many factors? • 1? -- big elbow at 2, so ’67 rule suggests a single factor, which clearly accounts for the biggest portion of variance • 7? -- smaller elbow at 8, so ’67 rule suggests 7 • 8? -- smaller elbow at 8, ’66 rule gives the 8 he was looking for -- also 8th has λ > 1.0 and 9th had λ < 1.0 • Remember that these are subscales of a central construct, so… • items will have substantial correlations both within and between subscales • to maximize the variance accounted for, the first factor is likely to pull in all these inter-correlated variables, leading to a large λ for the first (general) factor and much smaller λs for subsequent factors • This is a common scree configuration when factoring items from a multi-subscale scale!
Kinds of well-defined factors • There is a trade-off between “parsimony” and “specificity” whenever we are factoring • This trade-off influences both the #-of-factors and cutoff decisions, both of which, in turn, influence factor interpretation • general and “larger” group factors include more variables, account for more variance -- are more parsimonious • unique and “smaller” group factors include fewer variables & may be more focused -- are often more specific • Preference really depends upon ... • what you are expecting • what you are trying to accomplish with the factoring
Kinds of ill-defined factors Unique factors • hard to know what construct is represented by a 1-variable factor • especially if that variable is multi-vocal • then the factor is defined by “part” of that single variable -- but maybe not the part defined by its name Group factors can be ill-defined • “odd combinations” can be hard to interpret -- especially later factors comprised of multi-vocal variables (knowledge of variables & population is very important!)
Simple Structure • The idea of simple structure is very appealing ... • Each factor of any solution should have an unambiguous interpretation, because the variable loadings for each factor should be simple and clear. • There have been several different characterizations of this idea, and varying degrees of success with translating those characterizations into mathematical operations and objective procedures. Here are some of the most common.
Components of Simple Structure Each factor should have several variables with strong loadings • admonition for well-defined factors • remember that “strong” loadings can be “+” or “-” Each variable should have a strong loading for only one factor • admonition against multi-vocal items • admonition for conceptually separable factors • admonition that each variable should “belong” to some factor Each variable should have a large communality • implying that its membership “accounts” for its variance
The benefit of “simple structure” ? • Remember that … • we’re usually factoring to find “groups of variables” • But, the extraction process is trying to “reproduce variance” • the factor plot often looks simpler than the structure matrix
Structure matrix:
      PC1    PC2
V1    .7     .5
V2    .6     .6
V3    .6    -.5
V4    .7    -.6
[Figure: factor plot of V1, V2, V3 and V4 against the PC1 and PC2 axes]
• True, this gets more complicated with more variables and factors, but “simple structure” is basically about “seeing” in the structure matrix what is apparent in the plot
How rotation relates to “Simple Structure” Factor Rotations -- changing the “viewing angle” of the factor space -- have been the major approach to providing simple structure • structure is “simplified” if the factor vectors “spear” the variable clusters
      Unrotated        Rotated
      PC1    PC2       PC1    PC2
V1    .7     .5        .7    -.1
V2    .6     .6        .7     .1
V3    .6    -.5        .1     .5
V4    .7    -.6        .2     .6
[Figure: factor plot of V1, V2, V3 and V4 showing the original axes PC1 and PC2 and the rotated axes PC1’ and PC2’]
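Changing the “viewing angle” amounts to multiplying the loading matrix by a rotation matrix. The sketch below applies this to the unrotated loadings from this slide; the 40° angle is an assumption picked by eye from the positions of the variable clusters, so the result only resembles the rotated table rather than reproducing it.

```python
import numpy as np

# Unrotated loadings from the slide (rows V1..V4, columns PC1, PC2)
unrotated = np.array([[0.7,  0.5],
                      [0.6,  0.6],
                      [0.6, -0.5],
                      [0.7, -0.6]])

def rotate(loadings, degrees):
    """Loadings on axes turned counterclockwise by `degrees` (orthogonal)."""
    theta = np.radians(degrees)
    T = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return loadings @ T

rotated = rotate(unrotated, 40.0)   # 40 degrees is an assumed angle

# Communalities (row sums of squared loadings) do not change under
# an orthogonal rotation; only how they are split across factors does.
h2_before = np.sum(unrotated**2, axis=1)
h2_after = np.sum(rotated**2, axis=1)

print(np.round(rotated, 2))
```

After this rotation V1 and V2 load almost entirely on the first rotated factor and V3 and V4 on the second. The loadings of V3 and V4 come out negative here; the sign of a factor is arbitrary, so reflecting that axis gives positive values.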
Major Types of Rotation Remember -- extracted factors are orthogonal (uncorrelated) • Orthogonal Rotation -- resulting factors are uncorrelated • more parsimonious & efficient, but less “natural” • Oblique Rotation -- resulting factors are correlated • more “natural” & better “spearing”, but more complicated [Figure: two factor plots of V1, V2, V3 and V4 with rotated axes PC1’ and PC2’. Orthogonal Rotation: angle between the rotated axes is 90°. Oblique Rotation: angle between the rotated axes is less than 90°]
Major Types of Orthogonal Rotation & their “tendencies” Varimax -- most commonly used and common default • “simplifies factors” by maximizing variance of loadings of variables of a factor (minimizes #vars with high loadings) • tends to produce group factors Quartimax • “simplifies variables” by maximizing variance of loadings of a variable across factors (minimizes #factors a var loads on) • tends to “move” vars from extraction less than varimax • tends to produce a general factor & small group factors Equimax • designed to “balance” varimax and quartimax tendencies • didn’t work very well -- can’t do both simultaneously -- whichever is done first dominates the final structure
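For readers who want to see the two criteria side by side, here is a sketch of a widely used SVD-based iteration for the orthomax family, in which a weight `gamma` of 1 gives varimax and 0 gives quartimax. The function name `orthomax` and its defaults are illustrative assumptions, and stats packages may differ in details such as row normalization, so results need not match package output exactly.

```python
import numpy as np

def orthomax(loadings, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal rotation; gamma=1.0 is varimax, gamma=0.0 is quartimax.

    Returns (rotated_loadings, rotation_matrix).
    """
    A = np.asarray(loadings, dtype=float)
    p, m = A.shape                      # p variables, m factors
    T = np.eye(m)
    previous = 0.0
    for _ in range(max_iter):
        L = A @ T
        # Gradient of the orthomax criterion with respect to T
        col_ss = np.diag(np.sum(L**2, axis=0))
        grad = A.T @ (L**3 - (gamma / p) * (L @ col_ss))
        U, s, Vt = np.linalg.svd(grad)
        T = U @ Vt                      # nearest orthogonal matrix
        current = s.sum()
        if previous != 0.0 and current / previous < 1.0 + tol:
            break
        previous = current
    return A @ T, T

# Unrotated loadings used earlier in these slides
unrotated = np.array([[0.7,  0.5],
                      [0.6,  0.6],
                      [0.6, -0.5],
                      [0.7, -0.6]])

varimax_loadings, T_varimax = orthomax(unrotated, gamma=1.0)
quartimax_loadings, T_quartimax = orthomax(unrotated, gamma=0.0)
print(np.round(varimax_loadings, 2))
print(np.round(quartimax_loadings, 2))
```

With only two clean clusters the two rotations land in nearly the same place; the different “tendencies” show up with more variables and a strong first factor.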
Major Types of Oblique Rotation & their “tendencies” Promax • computes best orthogonal solution and then “relaxes” orthogonality constraints to better “spear” variable clusters with factor vectors (giving simpler structure) Direct Oblimin • spearing variable clusters as well as possible to produce lowest occurrence of multi-vocality All oblique rotations have a parameter (δ, γ, κ) that sets the maximum correlation allowed between rotated factors • changing this parameter can “importantly” change the resulting rotation and interpretation • try at least a couple of values & look for consistency
Some things that are different (or not) when you use an Oblique Rotation Different things: • There will be a Φ (phi) matrix that holds the factor intercorrelations • The λ-values and variances accounted for by the rotated factors will be different than those of the extracted factors • compute λ for each factor by summing the squared structure loadings for that factor • compute the variance accounted for as the newly computed λ / k Same things: • the communality of each variable will be the same -- but can’t be computed by summing squared structure loadings for each variable (since factors are correlated)
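A small numeric sketch of these points. It introduces the pattern matrix (the regression-style weights of variables on correlated factors), which these slides do not name, and uses the standard relations structure = pattern · Φ and communality = diag(pattern · Φ · pattern′). All numbers are made up for illustration.

```python
import numpy as np

# Hypothetical oblique solution: 4 variables, 2 correlated factors
pattern = np.array([[0.8, 0.0],
                    [0.7, 0.1],
                    [0.1, 0.7],
                    [0.0, 0.8]])
phi = np.array([[1.0, 0.4],     # factor intercorrelations
                [0.4, 1.0]])
k = pattern.shape[0]

# Structure loadings: correlations between variables and factors
structure = pattern @ phi

# lambda for each rotated factor: sum of squared structure loadings
lam = np.sum(structure**2, axis=0)
prop_variance = lam / k

# Communality must take the factor correlation into account
communality = np.sum(pattern * structure, axis=1)   # diag(P Phi P')

# Summing squared structure loadings per variable overstates it here
naive = np.sum(structure**2, axis=1)

print("communality:", np.round(communality, 3))
print("naive sum  :", np.round(naive, 3))
```

Because the factors overlap, the λ values computed this way can add up to more than the variance the solution actually accounts for, which is why they are not simply additive across factors.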
Interpretation & Cut-offs • Interpretation is the process of naming factors based on the variables that “load on” them • Which variables “load” is decided based on a “cutoff” • cutoffs usually range from .3 to .4 ( + or - ) • Higher cutoffs limit # loading variables • factors may be ill-defined, some variables may not load • Lower cutoffs increase # loading variables • variables more likely to be multi-vocal • Worry & make a careful decision when your interpretation depends upon the cutoff that is chosen !!
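A short sketch of how the cutoff changes which variables “load”. The loadings and the helper name `assign_by_cutoff` are hypothetical, with V4 deliberately given a borderline loading so that the two common cutoffs disagree.

```python
def assign_by_cutoff(loadings, cutoff):
    """Map each variable to the factors (1-based) it loads on.

    A variable loads on a factor when |loading| >= cutoff, so strong
    negative loadings count too.
    """
    return {var: [j + 1 for j, value in enumerate(row) if abs(value) >= cutoff]
            for var, row in loadings.items()}

# Hypothetical rotated loadings for 4 variables on 2 factors
loadings = {
    "V1": [0.70, -0.10],
    "V2": [0.70, 0.10],
    "V3": [0.10, 0.50],
    "V4": [0.35, 0.60],   # borderline loading on factor 1
}

for cutoff in (0.3, 0.4):
    assigned = assign_by_cutoff(loadings, cutoff)
    multivocal = [v for v, fs in assigned.items() if len(fs) > 1]
    unassigned = [v for v, fs in assigned.items() if not fs]
    print(f"cutoff {cutoff}: {assigned}")
    print(f"  multi-vocal: {multivocal}  not loading: {unassigned}")
```

Here V4 is multi-vocal at .3 but belongs only to factor 2 at .4, the kind of cutoff-dependent interpretation that deserves a careful decision.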
Combining #-factors & Rotation to Select “the best Factor Solution” To specify “the solution” you must pick the #-factors, type of rotation & cutoff ! • Apply the different rules to arrive at an initial “best guess” of the #-factors • Obtain orthogonal and oblique rotations for that many factors, for one fewer and for one more • Compare the solutions to find “your favorite” – remember this is exploratory factoring, so explore! • parsimony vs. specificity • different cutoffs (.3 - .4) • rotational survival • simple structure • conceptual sense • interesting surprises (about factors and/or variables)