Principal Components Analysis • A method for data reduction
Factor Analytic Techniques • Reduce the number of variables • Detect structure in the relationships among variables
Principal Factor Analysis (Common Factor Analysis) • A method for detecting structure • Model: Y = XB + E • In this equation, X is the matrix of factor scores, E is the matrix of unique factors, and B' (the transpose of B) is the factor pattern. There are two critical assumptions: • The unique factors are uncorrelated with each other. • The unique factors are uncorrelated with the common factors.
y_ij is the value of the ith observation on the jth variable • x_ik is the value of the ith observation on the kth common factor • b_kj is the regression coefficient of the kth common factor for predicting the jth variable • e_ij is the value of the ith observation on the jth unique factor • q is the number of common factors
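Written out element by element, the matrix model Y = XB + E with the symbols defined above reads as follows. This is simply the (i, j) entry of the matrix product plus the unique-factor term, shown as a reading aid:

```latex
\[
y_{ij} \;=\; x_{i1} b_{1j} + x_{i2} b_{2j} + \cdots + x_{iq} b_{qj} + e_{ij}
       \;=\; \sum_{k=1}^{q} x_{ik}\, b_{kj} + e_{ij}
\]
```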
Sample Dimensions Y = XB + E • Y – (n x p) • X – (n x q) • B – (q x p) • E – (n x p)
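The dimensions can be checked numerically. The following is a minimal sketch in NumPy that simulates data from the model; the sizes n, p, q, the random seed, and the distributions used for X, B, and E are all hypothetical choices made only for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: n observations, p variables, q common factors.
n, p, q = 200, 6, 2

X = rng.standard_normal((n, q))        # factor scores, n x q
B = rng.uniform(-1.0, 1.0, (q, p))     # coefficients, q x p (B' is the factor pattern)
E = 0.5 * rng.standard_normal((n, p))  # unique factors, n x p

Y = X @ B + E                          # observed data, n x p

print(Y.shape)    # (200, 6)
print(B.T.shape)  # (6, 2): one row per variable, one column per factor
```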
Random Variable Dimensions Y = XB + E • Y – (1 x p) • X – (1 x q) • B – (q x p) • E – (1 x p)
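In the random-variable form, the two assumptions translate directly into a covariance structure. The sketch below assumes zero means and introduces the symbols \Phi for the covariance of the common factors and \Psi for the covariance of the unique factors; neither symbol appears in the slides. The last line additionally assumes standardized, uncorrelated common factors, which the slides do not state.

```latex
\[
y = xB + e, \qquad
\operatorname{Cov}(x) = \Phi \;(q \times q), \qquad
\operatorname{Cov}(e) = \Psi \;(p \times p)
\]
\[
\operatorname{Cov}(y)
  = B^{\top}\,\mathrm{E}[x^{\top}x]\,B
  + B^{\top}\,\mathrm{E}[x^{\top}e]
  + \mathrm{E}[e^{\top}x]\,B
  + \mathrm{E}[e^{\top}e]
  = B^{\top}\Phi B + \Psi
\]
% The cross terms vanish because the unique factors are uncorrelated with the
% common factors; \Psi is diagonal because the unique factors are uncorrelated
% with each other.
\[
\Phi = I \;\Longrightarrow\; \operatorname{Cov}(y) = B^{\top}B + \Psi
\]
```

Under that extra assumption, the jth diagonal element of B'B is the communality of the jth variable.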
Principal Factor Analysis (a.k.a. Principal Axis Factoring, and sometimes even Principal Components Factoring!) Come up with an initial estimate of the communality for each variable and replace the diagonal elements of the correlation matrix with those estimates. Then run principal components on this reduced correlation matrix and keep the loadings on the first m components, where m is the number of common factors retained (q in the model above). Because the specificity has been taken out, the residual (error) matrix should be much closer to a diagonal matrix. Several estimates are used for the initial communalities: the absolute value of the largest correlation of that variable with any of the others, and the squared multiple correlation coefficient for predicting that variable from the others in multiple regression, which can be computed as 1 - 1/r^ii, where r^ii is the corresponding diagonal element of the inverse of the correlation matrix. There seems to be no agreement on which is best, but the first is slightly easier to program.
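The procedure described above can be sketched in a few lines of NumPy. This is a minimal, non-iterated illustration rather than a reference implementation: the function names `initial_communalities` and `principal_factor`, the `method` labels, and the demo correlation matrix are all hypothetical. The squared multiple correlation is computed from the inverse of the correlation matrix as 1 - 1/r^ii, which assumes the correlation matrix is invertible.

```python
import numpy as np

def initial_communalities(R, method="smc"):
    """Initial communality estimates from a correlation matrix R."""
    R = np.asarray(R, dtype=float)
    if method == "max_corr":
        # Largest absolute off-diagonal correlation in each row.
        off_diag = np.abs(R - np.diag(np.diag(R)))
        return off_diag.max(axis=1)
    if method == "smc":
        # Squared multiple correlation: 1 - 1 / (diagonal of R inverse).
        return 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    raise ValueError(f"unknown method: {method!r}")

def principal_factor(R, m, method="smc"):
    """Non-iterated principal factor extraction of m factors."""
    R = np.asarray(R, dtype=float)
    h2 = initial_communalities(R, method)

    # Reduced correlation matrix: communalities replace the unit diagonal.
    R_reduced = R.copy()
    np.fill_diagonal(R_reduced, h2)

    # Principal components of the reduced matrix (symmetric, so use eigh).
    eigvals, eigvecs = np.linalg.eigh(R_reduced)
    order = np.argsort(eigvals)[::-1][:m]      # indices of the m largest
    vals = np.clip(eigvals[order], 0.0, None)  # guard against negative values
    loadings = eigvecs[:, order] * np.sqrt(vals)

    # Residual matrix; its off-diagonal entries should be close to zero.
    residual = R_reduced - loadings @ loadings.T
    return loadings, h2, residual

# Hypothetical demo: correlations implied by one factor with these loadings.
true_loadings = np.array([0.8, 0.7, 0.6, 0.5])
R_demo = np.outer(true_loadings, true_loadings)
np.fill_diagonal(R_demo, 1.0)

loadings, h2, residual = principal_factor(R_demo, m=1, method="smc")
print(np.round(np.abs(loadings[:, 0]), 3))  # sign of a factor is arbitrary
print(np.round(h2, 3))
```

The sign of each extracted factor is arbitrary, so the sketch prints absolute loadings. Iterated versions of the method would recompute the communalities from the loadings and repeat until they stabilize; that refinement is omitted here.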