FACTOR ANALYSIS LECTURE 11 EPSY 625
PURPOSES • SUPPORT VALIDITY OF TEST SCALE WITH RESPECT TO UNDERLYING TRAITS (FACTORS) • EFA- EXPLORE/UNDERSTAND UNDERLYING FACTORS FOR A TEST • CFA- CONFIRM THEORETICAL STRUCTURE IN A TEST
HISTORICAL DEVELOPMENT • PEARSON (1901) - eigenvalue/eigenvector problem (dimensional reduction), the "method of principal axes" • SPEARMAN (1904) - "General Intelligence, Objectively Measured and Determined" • Others: Burt, Thompson, Garnett, Holzinger, Harman, Thurstone
FACTOR MODELS

                      SUBJECTS Fixed                              SUBJECTS Sample
VARIABLES Fixed       Principal components, common factors, Image Canonical factor analysis
VARIABLES Sample      Alpha factor analysis
EXPLORATORY FACTOR ANALYSIS • USE PRINCIPAL AXIS METHOD: • ASSUMES THERE ARE 3 VARIANCE COMPONENTS IN EACH ITEM: • COMMUNALITY (h²) • UNIQUENESS: • SPECIFICITY (s²) • ERROR (e²)
SINGLE FACTOR • REQUIRES AT LEAST 3 ITEMS OR MEASUREMENTS TO UNIQUELY DETERMINE
SPECIFICITY ASSUMED = 0 FOR PARALLEL ITEMS
[Path diagram: one FACTOR with three items; the factor loadings (.7, .8, .6) are the correlations between each item and the factor; each item also has an error term e; e.g., the unique variance of ITEM1 = 1 − .7² = .51, giving an error path of .714]
ALPHA = SPEARMAN-BROWN STEPPED-UP AVERAGE INTER-ITEM CORRELATION: (.56 + .42 + .48)/3 = .49
ALPHA = 3(.49)/[1 + 2(.49)] = .74
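The alpha computation on this slide can be checked numerically; a minimal sketch, where the loadings .7, .8, .6 come from the diagram and everything else is illustrative:

```python
# Coefficient alpha via the Spearman-Brown step-up of the average
# inter-item correlation, for the three-item single-factor example.
loadings = [0.7, 0.8, 0.6]

# Implied inter-item correlations: product of the two loadings (path tracing).
r12 = loadings[0] * loadings[1]   # .56
r13 = loadings[0] * loadings[2]   # .42
r23 = loadings[1] * loadings[2]   # .48

r_bar = (r12 + r13 + r23) / 3     # average inter-item correlation (~.49)
k = 3
alpha = k * r_bar / (1 + (k - 1) * r_bar)
print(round(r_bar, 4), round(alpha, 2))
```

The stepped-up value matches the slide's .74.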
TWO FACTORS • NEED AT LEAST 2 ITEMS OR MEASUREMENTS PER FACTOR, ASSUMING FACTORS ARE CORRELATED
[Path diagram: FACTOR 1 with ITEM1 (loading .7) and ITEM2 (.8); FACTOR 2 with ITEM3 (.6) and ITEM4 (.7); correlation between factors = .5; each item has an error term e]
CORRELATION BETWEEN ANY TWO ITEMS = PRODUCT OF ALL PATHS BETWEEN THEM; EX. r(ITEM1, ITEM4) = .7 × .5 × .7 = .245
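The path-tracing rule is the matrix product AΦA′; a small sketch using the diagram's loadings (the matrix formulation itself is our own illustration):

```python
import numpy as np

# Implied correlations from the two-factor diagram via path tracing:
# r(item_i, item_j) = sum over factor pairs of loading_i * phi * loading_j.
A = np.array([[0.7, 0.0],
              [0.8, 0.0],
              [0.0, 0.6],
              [0.0, 0.7]])          # pattern matrix: items 1-2 on F1, 3-4 on F2
Phi = np.array([[1.0, 0.5],
                [0.5, 1.0]])        # factor correlation = .5

R_implied = A @ Phi @ A.T           # off-diagonals are implied correlations
print(round(R_implied[0, 3], 3))    # r(item1, item4) = .7 x .5 x .7
```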
SIMPLE STRUCTURE • TRY TO CREATE A SCALE IN WHICH EACH ITEM CORRELATES WITH ONLY ONE FACTOR:

          FACTOR 1   FACTOR 2   FACTOR 3
ITEM 1       1          0          0
ITEM 2       1          0          0
ITEM 3       0          1          0
ETC.
CRITERIA FOR SIMPLE STRUCTURE • Structural equation modeling provides a chi-square (χ²) test of fit • Compares the observed covariance (correlation) matrix with the predicted/fitted matrix • Alternatively, look at the RMSEA (root mean square error of approximation) of deviations from the fitted matrix
MATHEMATICAL MODEL • Z = persons-by-variables (N × k) matrix of k standardized variables (mean = 0, SD = 1) • Z′Z = NR, where R is the k × k correlation matrix • zi = ai Fi + ei
MATHEMATICAL MODEL • Z = AF + U = C + U (common part plus unique part) • ZZ′/N = R = AΦA′ + U² • S = ZF′/N (structure matrix: correlations between Z and F) = AFF′/N • Φ = FF′/N (correlations among factors) • A = pattern matrix
MATHEMATICAL MODEL • S = AΦ • A = SΦ⁻¹ (if factors are uncorrelated, Φ = I and A = S: pattern matrix = structure matrix) • R = ZZ′/N = CC′/N + U²
MATHEMATICAL MODEL • If we take the covariance matrix of F to be diagonal, with the metric of the variances of the Fi set to 1.0, then Φ = I and • R − U² = AA′ = SA′ = AS′
MATHEMATICAL MODEL • Now let zi = ai Fi + si + ei • Let Ŕ = R − D², where D² is a diagonal matrix of specificities and errors: s²i + e²i • Then Ŕ = AFF′A′/N = AΦA′ = SA′ = AS′ • If Φ = I, Ŕ = AA′
MATHEMATICAL MODEL • How do we estimate s²i? • Instead, estimate h²i = 1 − s²i − e²i directly • Consider for each zi that it is predictable from the rest: zi = b1z1 + b2z2 + … + bi−1zi−1 + … • Then R²i = variance common to all the other variables (the squared multiple correlation, or SMC), an estimate of h²i, the communality for item i • Due to Dwyer (1939)
MATHEMATICAL MODEL • The SMC is estimable from the observed data, so that Ŕ = R − [1 − SMCi], where [1 − SMCi] is a diagonal matrix with 1 − SMC for each variable on the diagonal and zeros off-diagonal • Theorem: SMCs guarantee that the number of factors ≥ the number of eigenvalues > 1.0
MATHEMATICAL MODEL
The SMCs replace the unit diagonal of Ŕ:

| R²1·234…      0         0         0      … |
|     0     R²2·134…      0         0      … |
|     0         0     R²3·124…      0      … |
|     0         0         0     R²4·123…   … |
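The SMCs can be computed directly from the inverse of R, since SMC_i = 1 − 1/(R⁻¹)ii; a sketch with an illustrative correlation matrix (the numbers are made up for the example):

```python
import numpy as np

# Squared multiple correlations (SMCs) as communality estimates.
R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

R_inv = np.linalg.inv(R)
smc = 1 - 1 / np.diag(R_inv)       # SMC_i = 1 - 1/(R^-1)_ii

# Reduced matrix: replace the unit diagonal with the SMCs.
R_reduced = R.copy()
np.fill_diagonal(R_reduced, smc)
print(np.round(smc, 3))
```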
MATHEMATICAL MODEL • SOLUTIONS: PRINCIPAL COMPONENTS (R = Ŕ):
Rq = λq,  RQ = QΛ,  Λ = diagonal [λi]
Q⁻¹RQ = Λ;  QQ′ = I, so Q⁻¹ = Q′
Q′RQ = Λ (Spectral Theorem)
MATHEMATICAL MODEL • SOLUTIONS: PRINCIPAL AXIS: (Ŕ − λI)q = 0 • That is, solve for the first eigenvalue λ1: • |Ŕ − λI| = 0, solved by iterating • Rᵐq = λᵐq: begin with m = 2: R²q = λ²q, then put the solution back in, R(Rq1) = λ²q1, and iterate for m = 4, …
MATHEMATICAL MODEL • Now compute the residual correlation matrix: R1 = Ŕ − a1a1′ (where a1 = √λ1 q1), and iterate for the next factor
EIGENVALUES • λi = variance of the ith factor • λi / Σλi = proportion of total variance accounted for by the ith factor • λi < 1: chance factor • Scree plot (eigenvalues plotted against factor number, ordered from greatest to lowest)
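A quick numerical check of these eigenvalue facts (the correlation matrix is illustrative; `eigvalsh` is appropriate because R is symmetric):

```python
import numpy as np

# Eigenvalues of a correlation matrix: lambda_i is the variance of the
# ith factor; lambda_i / sum(lambda) is the proportion of total variance.
R = np.array([[1.00, 0.56, 0.42],
              [0.56, 1.00, 0.48],
              [0.42, 0.48, 1.00]])

eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]   # largest first
prop = eigvals / eigvals.sum()                   # eigenvalues sum to trace(R) = k

for i, (lam, p) in enumerate(zip(eigvals, prop), start=1):
    print(f"factor {i}: eigenvalue {lam:.3f}, proportion {p:.3f}")
```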
[Scree plot: eigenvalues on the y-axis (reference line at 1.0) against factor number 1, 2, 3, … k on the x-axis]
ROTATION • MEANING CRITERION: SIMPLE STRUCTURE, POSITIVE MANIFOLD • B = AT; A = INITIAL FACTOR MATRIX, T = TRANSFORMATION MATRIX, B = FINAL FACTOR MATRIX • TT′ = Φ
VARIMAX ROTATION (uncorrelated factors) • ORTHOGONAL (RIGID) ROTATION • Maximize V = n Σ(bjp/hj)⁴ − Σ[Σ(b²jp/h²j)]² • Geometric problem: rotate each pair of axes through an angle φ:

(X, Y) = (x, y) | cos φ   −sin φ |
                | sin φ    cos φ |

VARIMAX ROTATION • with uj = x²j − y²j and vj = 2xjyj, let A = Σuj, B = Σvj, C = Σ(u²j − v²j), D = 2Σujvj • solve tan 4φ = [D − 2AB/n] / [C − (A² − B²)/n], with −45° ≤ φ ≤ 45°
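In practice the pairwise angle solution is usually replaced by an equivalent iterative SVD-based algorithm; a sketch of that common formulation (the loading matrix is illustrative, and `varimax` is our own helper, not notation from the slides):

```python
import numpy as np

def varimax(Lam, gamma=1.0, max_iter=100, tol=1e-8):
    """Orthogonal (varimax) rotation of a loading matrix, via the
    standard SVD-based iteration for Kaiser's criterion."""
    p, k = Lam.shape
    T = np.eye(k)
    d_old = 0.0
    for _ in range(max_iter):
        B = Lam @ T
        # Gradient of the varimax criterion with respect to T.
        G = Lam.T @ (B**3 - (gamma / p) * B @ np.diag(np.diag(B.T @ B)))
        U, s, Vt = np.linalg.svd(G)
        T = U @ Vt                  # nearest orthogonal matrix to G
        d_new = s.sum()
        if d_old != 0 and d_new < d_old * (1 + tol):
            break
        d_old = d_new
    return Lam @ T, T

A0 = np.array([[0.7, 0.3],
               [0.8, 0.2],
               [0.2, 0.7],
               [0.3, 0.8]])         # illustrative initial loadings
B, T = varimax(A0)
print(np.round(B, 3))
```

Because T is orthogonal, the rotation leaves the communalities and the reproduced reduced correlation matrix unchanged.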
[Figure: orthogonal (perpendicular) rotation of axes; points plotted by unrotated factor 1 loading values (x-axis) and unrotated factor 2 loading values (y-axis)]
OBLIQUE SOLUTION (correlated factors) • MINIMIZE S (OBLIMIN): • S = Σ[ nΣ(v²jp/h²j)(v²jg/h²j) − Σ(v²jp/h²j) Σ(v²jg/h²j) ] • PROMAX: • start with VARIMAX, B = AT, then transform with • vjp = (b⁴jp)/bjp (fourth power, preserving sign)
FACTOR CORRELATION • Φ = TT′
• Tij = | cos φij   −sin φij |
        | sin φij    cos φij |
• rij = [cos φij][−sin φij] + [sin φij][cos φij] = T11T12 + T21T22
FACTOR CORRELATION • S = PΦ (structure matrix = pattern matrix × factor correlation matrix) • P = A(T′)⁻¹ • A = PT′
[Figure: oblique rotation of axes through angle φij]
ALPHA FACTOR ANALYSIS • Estimates the population h²i for each variable • Differs little from common factor analysis
Canonical Factor Analysis • Uses canonical analysis to maximize the correlation between factors and variables; an iterative maximum likelihood analysis
Image Analysis • h²i = R²i·1,2,…,k • pj = Σ wjk zk (standard regression): the image of zj • ej = zj − pj, called the anti-image • Var(ej) > Var(εj), where Var(εj) = the anti-image variance for the regression of zj on the factors F1, F2, … FK
FACTOR CONGRUENCE • Alternative to confirmatory analysis for two groups hypothesized to have the same factor structure: • Spq = Σj ajp bjq / √[Σj a²jp · Σj b²jq] • This is basically the correlation between the factor loadings on the comparable factors for the two groups
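The congruence coefficient is straightforward to compute; a sketch with hypothetical loadings for the two groups:

```python
import numpy as np

# Tucker's congruence coefficient between comparable factors:
# S_pq = sum(a_jp * b_jq) / sqrt(sum(a_jp^2) * sum(b_jq^2)).
a = np.array([0.70, 0.80, 0.60, 0.10])   # loadings on factor p, group 1
b = np.array([0.65, 0.75, 0.70, 0.05])   # loadings on factor q, group 2

congruence = (a @ b) / np.sqrt((a @ a) * (b @ b))
print(round(congruence, 3))
```

Values near 1.0 indicate that the two groups' factors are essentially the same.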
Example of 2-factor structure • Achievement (reading, math) and IQ (verbal, nonverbal) • quasi-multitrait-multimethod analysis: • reading is "verbal" • math is "nonverbal"
BASIC PRINCIPLES • Σ = Σ(θ)
• Σxx =
| σ²x1                  |
| σx1x2   σ²x2          |
| σx1x3   σx2x3   σ²x3  |
BASIC PRINCIPLES • σ²x1 = λ²11 φ11 + θ1 • σ²xk = λ²k1 φ11 + θk • σxixk = λi1 φ11 λk1
IDENTIFICATION RULES • t-rule: t ≤ ½q(q + 1), q = # manifest variables • necessary but not sufficient • 3-indicator rule: 1 factor → at least 3 indicators • sufficient but not necessary • 2-indicator rule: 2+ factors → at least 2 indicators each • local vs. global identification: • local: sample estimates of parameters independent; necessary but not sufficient • global: population parameters independent; necessary and sufficient
ESTIMATION • MODEL EVALUATION • FIT: FML used to evaluate Σ̂ against S • Residuals: E = S − Σ̂ • RMR = SD(sij − σ̂ij) • RMSEA = √[(χ²/df − 1)/(N − 1)] • note: factor analyze E; it should be 0
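The RMSEA formula on this slide can be coded directly (the chi-square, df, and N values below are illustrative):

```python
import math

# RMSEA from a model chi-square, its degrees of freedom, and sample size:
# RMSEA = sqrt(max(chi2/df - 1, 0) / (N - 1)); negative values truncate to 0.
def rmsea(chi2, df, n):
    return math.sqrt(max(chi2 / df - 1.0, 0.0) / (n - 1))

print(round(rmsea(chi2=45.0, df=20, n=300), 3))
```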
Hancock’s Formula: reliability for a given factor
Hj = 1 / [1 + {1 / (Σ[l²ij/(1 − l²ij)])}]
Ex. l1 = .7, l2 = .8, l3 = .6:
H = 1 / [1 + 1/(.49/.51 + .64/.36 + .36/.64)]
  = 1 / [1 + 1/(.96 + 1.78 + .56)]
  = 1 / (1 + 1/3.30) = .77
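The worked example can be reproduced directly from the formula (same loadings as the slide):

```python
# Hancock's coefficient H from standardized loadings:
# H = 1 / (1 + 1 / sum(l^2 / (1 - l^2))).
def coefficient_h(loadings):
    total = sum(l**2 / (1 - l**2) for l in loadings)
    return 1.0 / (1.0 + 1.0 / total)

h = coefficient_h([0.7, 0.8, 0.6])
print(round(h, 3))
```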
Hancock’s Formula Explained
Hj = 1 / [1 + {1 / (Σ[l²ij/(1 − l²ij)])}]
Now assume strict parallelism: then l²ij = ρ²xt, thus
Hj = 1 / [1 + {1 / (Σ[ρ²xt/(1 − ρ²xt)])}] = kρ²xt / [1 + (k − 1)ρ²xt]
= the Spearman-Brown formula
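The parallelism result can be checked numerically: with equal loadings, H equals the Spearman-Brown stepped-up value (k and l below are illustrative):

```python
# Under strict parallelism, every squared loading equals rho, and Hancock's H
# reduces to the Spearman-Brown/alpha form k*rho / (1 + (k-1)*rho).
k, l = 3, 0.7
rho = l**2

h = 1.0 / (1.0 + 1.0 / (k * rho / (1 - rho)))   # Hancock's H, equal loadings
spearman_brown = k * rho / (1 + (k - 1) * rho)  # stepped-up formula
print(round(h, 6), round(spearman_brown, 6))
```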
χ² TEST • (N − 1)FML ~ χ² • used for nested models: a model with one or more restrictions from the original • restriction = known parameter, or equality of two or more parameters • Proof: Bollen shows (N − 1)[−2log(L1/L0)] = (N − 1)FML, where L0 is the unrestricted and L1 the restricted model
INCREMENTAL FIT • Bentler and Bonett: Δ1 = (Fb − Fm)/Fb, where Fb = fit function value for the baseline model and Fm = fit function value for the model of interest • can be used to compare improvements over an original model or against a standard or baseline
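A minimal sketch of the index (the fit-function values below are illustrative):

```python
# Bentler-Bonett incremental fit: Delta1 = (F_b - F_m) / F_b, where F_b is
# the baseline model's fit function value and F_m is the target model's.
def incremental_fit(f_baseline, f_model):
    return (f_baseline - f_model) / f_baseline

delta1 = incremental_fit(f_baseline=2.50, f_model=0.20)
print(round(delta1, 2))
```

Values close to 1 mean the model recovers nearly all of the covariation the baseline model leaves unexplained.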