Factor Analysis Statistical Learning Theory Fall 2005
Outline • General Motivation • Definition/Derivation • The Graphical Model • Implications/Interpretations • Maximum Likelihood Estimation • Motivation • Application (using the EM Algorithm)
Motivation • Know: Discrete Mixture Models (ch.10) • Application: HMM • Want: Continuous Mixture Models • Application: ??
Definition: Factor Analysis • We consider density estimation here, but Factor Analysis can be extended to regression and classification problems. Consider a “high-d” data vector V in R^n such that V lies “near” a lower-dimensional manifold M. The factor analysis model is then the product of the following assumptions: • A point in M is generated according to a PDF. • V is then generated conditionally according to another (simple) PDF, centered on that point in M. • M is a linear subspace of R^n.
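The three assumptions above can be sketched as a sampling procedure. This is a minimal sketch, not the slides' own code: it assumes both PDFs are Gaussian, and the names `sample_factor_model`, `Lambda`, `mu`, and `psi_diag` are hypothetical, chosen for illustration.

```python
import numpy as np

def sample_factor_model(n_samples, mu, Lambda, psi_diag, rng):
    """Draw samples from a linear-Gaussian factor model (assumed form).

    Lambda   : (q, p) matrix whose columns span the linear subspace M
    mu       : (q,) offset of the subspace in the observation space
    psi_diag : (q,) per-coordinate noise variances
    """
    q, p = Lambda.shape
    # Step 1: a point in M is generated from a (Gaussian) PDF
    # over the p low-dimensional coordinates.
    X = rng.standard_normal((n_samples, p))
    # Step 2: the observation is generated from a simple (Gaussian) PDF
    # centered on that point, mu + Lambda x.
    noise = rng.standard_normal((n_samples, q)) * np.sqrt(psi_diag)
    Y = mu + X @ Lambda.T + noise
    return X, Y

rng = np.random.default_rng(0)
p, q = 2, 5                      # latent and observed dimensions, p < q
Lambda = rng.standard_normal((q, p))
mu = rng.standard_normal(q)
psi_diag = np.full(q, 0.1)

X, Y = sample_factor_model(20000, mu, Lambda, psi_diag, rng)
# What is left after removing the point in M is the noise around M.
residual = Y - mu - X @ Lambda.T
print(residual.var(axis=0))      # each entry should be close to 0.1
```

The smaller `psi_diag` is, the more tightly the samples hug the subspace, which is the sense in which the data lie “near” M.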
Another Definition… Factor analysis is a statistical technique that originated in psychometrics. It is used in the social sciences and in marketing, product management, operations research, and other applied sciences that deal with large quantities of data. The objective is to explain most of the variability among a number of observable random variables in terms of a smaller number of unobservable random variables called factors. The observable random variables are modeled as linear combinations of the factors, plus "error" terms. ~[Wikipedia]
The Graphical Model NOTE: p < q
Derivation We assume: Now we need: and
Derivation cont’d… Identities: These imply:
Derivation cont’d… Let Then
Result #1: The Joint Distribution • The joint distribution of X and Y is therefore Gaussian, with: • So that
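For reference, here is a sketch of this result under the standard factor analysis assumptions. The symbols are assumptions of this sketch and may differ from the slides' own notation: X is the p-dimensional factor, Y the q-dimensional observation, \Lambda a q×p loading matrix, and \Psi a diagonal q×q noise covariance.

```latex
\begin{align*}
X &\sim \mathcal{N}(0, I_p), \qquad
Y \mid X = x \sim \mathcal{N}(\mu + \Lambda x,\; \Psi) \\
E(Y) &= \mu, \qquad
\operatorname{Var}(Y) = \Lambda\Lambda^{T} + \Psi, \qquad
\operatorname{Cov}(X, Y) = \Lambda^{T} \\
\begin{pmatrix} X \\ Y \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} 0 \\ \mu \end{pmatrix},
\begin{pmatrix} I_p & \Lambda^{T} \\ \Lambda & \Lambda\Lambda^{T} + \Psi \end{pmatrix}
\right)
\end{align*}
```

The moments follow from writing Y = \mu + \Lambda X + W with W \sim \mathcal{N}(0, \Psi) independent of X.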
Calculating the Conditional… The results of chapter 13’s discussion of marginalization and conditioning for the multivariate Gaussian yield: (see equations 13.26 and 13.27 in [Jordan])
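As a hedged sketch of what the conditioning formulas give, assume the standard model X \sim \mathcal{N}(0, I_p) and Y \mid X = x \sim \mathcal{N}(\mu + \Lambda x, \Psi). The symbols \Lambda (q×p loadings) and \Psi (diagonal noise covariance) are assumptions of this sketch. Applying the Gaussian conditioning rule to the joint of (X, Y) would then give:

```latex
\begin{align*}
E(X \mid y) &= \Lambda^{T}\left(\Lambda\Lambda^{T} + \Psi\right)^{-1}(y - \mu) \\
\operatorname{Var}(X \mid y) &= I_p - \Lambda^{T}\left(\Lambda\Lambda^{T} + \Psi\right)^{-1}\Lambda
\end{align*}
```

Both expressions involve the inverse of the q×q matrix \Lambda\Lambda^{T} + \Psi, which is the marginal covariance of Y.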
Implementation Issues • The derived expressions require the inversion of a q×q matrix. • Jordan claims that the following forms are equivalent: • Note that these require only the inversion of a p×p matrix! (recall that p<q)
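The claimed equivalence can be checked numerically. The sketch below assumes the standard factor analysis forms, stated in the comments. The function names and the parameters `Lambda`, `mu`, and `psi_diag` are hypothetical and are not taken from the slides.

```python
import numpy as np

def posterior_qxq(y, mu, Lambda, psi_diag):
    """Posterior of X given Y=y via the q x q system (assumed form):
    E(X|y)   = L^T (L L^T + Psi)^{-1} (y - mu)
    Var(X|y) = I - L^T (L L^T + Psi)^{-1} L
    """
    p = Lambda.shape[1]
    Sigma_yy = Lambda @ Lambda.T + np.diag(psi_diag)   # q x q
    # solve() gives Sigma_yy^{-1} L; its transpose is L^T Sigma_yy^{-1}
    # because Sigma_yy is symmetric.
    gain = np.linalg.solve(Sigma_yy, Lambda).T         # p x q
    return gain @ (y - mu), np.eye(p) - gain @ Lambda

def posterior_pxp(y, mu, Lambda, psi_diag):
    """Same posterior via a p x p inverse only (assumed form):
    Var(X|y) = (I + L^T Psi^{-1} L)^{-1}
    E(X|y)   = Var(X|y) L^T Psi^{-1} (y - mu)
    """
    p = Lambda.shape[1]
    # Psi is diagonal, so Psi^{-1} is an elementwise division:
    # column j of L^T is divided by psi_j.
    Lt_Pinv = Lambda.T / psi_diag                      # p x q
    cov = np.linalg.inv(np.eye(p) + Lt_Pinv @ Lambda)  # p x p
    return cov @ Lt_Pinv @ (y - mu), cov

rng = np.random.default_rng(1)
p, q = 3, 8                                            # p < q
Lambda = rng.standard_normal((q, p))
mu = rng.standard_normal(q)
psi_diag = rng.uniform(0.5, 1.5, size=q)
y = rng.standard_normal(q)

mean_q, cov_q = posterior_qxq(y, mu, Lambda, psi_diag)
mean_p, cov_p = posterior_pxp(y, mu, Lambda, psi_diag)
print(np.allclose(mean_q, mean_p), np.allclose(cov_q, cov_p))
```

Because Ψ is diagonal, inverting it is an elementwise division, so the only dense inverse left in the second form is p×p.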
Interpretations… • Our discussion of factor analysis so far can be seen as a discussion of an update process. • Before the data Y is observed, X has a Gaussian distribution centered at the origin of the lower-dimensional subspace M. • Observing Y=y updates the distribution of X, as given by our derivation of E(X|y) and Var(X|y).
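Geometric Interpretation [Figure: observation space with axes y1, y2, y3 (labeled R^{p=3}), showing the subspace M through µ and an observed point Y=y] (see Ch. 14 p.7)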