
Factor Analysis

Presentation Transcript


  1. Factor Analysis Statistical Learning Theory Fall 2005

  2. Outline • General Motivation • Definition/Derivation • The Graphical Model • Implications/Interpretations • Maximum Likelihood Estimation • Motivation • Application (using the EM Algorithm)

  3. Motivation • Know: Discrete Mixture Models (ch.10) • Application: HMM • Want: Continuous Mixture Models • Application: ??

  4. Definition: Factor Analysis • We consider here density estimation, but factor analysis can be extended to regression and classification problems. Consider a “high-dimensional” data vector V in R^n whose entries lie “near” a lower-dimensional manifold M. The factor analysis model is then defined by the following assumptions: • A point in M is generated according to a PDF. • V is then generated conditionally according to another (simple) PDF, centered on that point. • M is a linear subspace of R^n.
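The generative story in these three assumptions can be made concrete with a few lines of NumPy. The sketch below is illustrative only and not from the slides: it fixes a latent dimension p = 2 and an observed dimension q = 5, anticipates the Gaussian choices made on slide 7, and draws one data vector lying near the linear subspace M.

```python
import numpy as np

rng = np.random.default_rng(0)
p, q = 2, 5                                   # latent dim p < observed dim q

Lam = rng.normal(size=(q, p))                 # loading matrix: maps factors into R^q
mu = rng.normal(size=q)                       # offset of the subspace M
psi = rng.uniform(0.1, 0.5, size=q)           # diagonal of the noise covariance Psi

# Assumption 1: a point of M is generated from a simple PDF
# (here a standard Gaussian factor x, so the point is mu + Lam @ x).
x = rng.standard_normal(p)

# Assumptions 2-3: the data vector is generated from another (simple) PDF
# centered on that point of the linear subspace M.
v = mu + Lam @ x + rng.standard_normal(q) * np.sqrt(psi)
print(v)
```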

  5. Another Definition… Factor analysis is a statistical technique that originated in psychometrics. It is used in the social sciences and in marketing, product management, operations research, and other applied sciences that deal with large quantities of data. The objective is to explain most of the variability among a number of observable random variables in terms of a smaller number of unobservable random variables called factors. The observable random variables are modeled as linear combinations of the factors, plus "error" terms. ~[Wikipedia]

  6. The Graphical Model [Figure: two-node directed graphical model, latent factor X → observed vector Y] NOTE: p < q, where p is the dimension of the latent factor X and q is the dimension of the observed Y.

  7. Derivation We assume: X ~ N(0, I) for the p-dimensional factor X, and Y | X = x ~ N(µ + Λx, Ψ), where Λ is a q×p loading matrix and Ψ is a diagonal q×q covariance. Now we need: the joint p(x, y) and the conditional p(x | y).

  8. Derivation cont’d… Identities: writing Y = µ + ΛX + ε with ε ~ N(0, Ψ) independent of X, we use E(AX + b) = A E(X) + b and Var(AX + b) = A Var(X) Aᵀ. These imply: E(Y) = µ, Var(Y) = ΛΛᵀ + Ψ, and Cov(X, Y) = Λᵀ.

  9. Derivation cont’d… Let Z = (X, Y) be the stacked vector. Then Z is a linear function of the independent Gaussians X and ε, so Z is itself jointly Gaussian, with mean and covariance assembled from the moments above.

  10. Result #1: The Joint Distribution • So now we can say that the joint is a Gaussian distribution with: mean (0, µ) and block covariance [[I, Λᵀ], [Λ, ΛΛᵀ + Ψ]] • So that the marginal of Y is N(µ, ΛΛᵀ + Ψ).
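As a quick sanity check on these block formulas, the sketch below (illustrative, with made-up parameters) draws many samples from the model of slide 7 and compares the empirical moments of Y against µ, ΛΛᵀ + Ψ, and Λᵀ; the printed maximum deviations should be close to zero.

```python
import numpy as np

rng = np.random.default_rng(1)
p, q, n = 2, 5, 200_000

Lam = rng.normal(size=(q, p))
mu = rng.normal(size=q)
psi = rng.uniform(0.1, 0.5, size=q)                    # diagonal of Psi

X = rng.standard_normal((n, p))                        # X ~ N(0, I)
Y = mu + X @ Lam.T + rng.standard_normal((n, q)) * np.sqrt(psi)

# Empirical moments vs. the derived ones (small Monte Carlo error expected).
print(np.abs(Y.mean(axis=0) - mu).max())                                     # E(Y) = mu
print(np.abs(np.cov(Y, rowvar=False) - (Lam @ Lam.T + np.diag(psi))).max())  # Var(Y)
cross = (X - X.mean(0)).T @ (Y - Y.mean(0)) / (n - 1)
print(np.abs(cross - Lam.T).max())                                           # Cov(X, Y) = Lam^T
```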

  11. Calculating the Conditional… The results of chapter 13’s discussion of the marginalization and conditioning of the multivariate Gaussian yield: X | Y = y is Gaussian with E(X | y) = Λᵀ(ΛΛᵀ + Ψ)⁻¹(y − µ) and Var(X | y) = I − Λᵀ(ΛΛᵀ + Ψ)⁻¹Λ (see equations 13.26 and 13.27 in [Jordan])
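A direct translation of these conditioning formulas into NumPy might look like the following sketch (the function name and arguments are mine, not from the slides). Note that it inverts the q×q matrix ΛΛᵀ + Ψ, which is exactly the cost slide 12 addresses.

```python
import numpy as np

def posterior_direct(y, mu, Lam, Psi):
    """E(X | y) and Var(X | y) via Gaussian conditioning on the joint;
    requires inverting the q x q matrix Lam Lam^T + Psi."""
    Sigma_yy = Lam @ Lam.T + Psi
    K = Lam.T @ np.linalg.inv(Sigma_yy)        # p x q
    mean = K @ (y - mu)                        # E(X | y)
    cov = np.eye(Lam.shape[1]) - K @ Lam       # Var(X | y) = I - K Lam
    return mean, cov
```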

  12. Implementation Issues • The derived expressions require the inversion of a q×q matrix. • Jordan claims that the following forms are equivalent: E(X | y) = (I + ΛᵀΨ⁻¹Λ)⁻¹ΛᵀΨ⁻¹(y − µ) and Var(X | y) = (I + ΛᵀΨ⁻¹Λ)⁻¹ • Note that these only require the inversion of a p×p matrix (Ψ is diagonal, so Ψ⁻¹ is trivial)! (recall that p < q)
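The sketch below (again with made-up parameters) implements the p×p form and checks numerically that it matches the direct q×q computation from the previous sketch; since Ψ is diagonal, Ψ⁻¹ reduces to an element-wise reciprocal.

```python
import numpy as np

rng = np.random.default_rng(2)
p, q = 2, 5
Lam = rng.normal(size=(q, p))
mu = rng.normal(size=q)
psi = rng.uniform(0.1, 0.5, size=q)            # diagonal of Psi
y = rng.normal(size=q)

# Direct form: inverts the q x q matrix Lam Lam^T + Psi.
K = Lam.T @ np.linalg.inv(Lam @ Lam.T + np.diag(psi))
mean_q, cov_q = K @ (y - mu), np.eye(p) - K @ Lam

# Equivalent form: inverts only the p x p matrix I + Lam^T Psi^{-1} Lam.
Lt_Psinv = Lam.T / psi                         # Lam^T Psi^{-1} (Psi is diagonal)
M = np.linalg.inv(np.eye(p) + Lt_Psinv @ Lam)
mean_p, cov_p = M @ (Lt_Psinv @ (y - mu)), M

print(np.allclose(mean_q, mean_p), np.allclose(cov_q, cov_p))   # True True
```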

  13. Interpretations… • Our discussion of factor analysis so far can be seen as a discussion of an update process. • Before data Y is observed, X has a Gaussian distribution centered at the origin of the lower-dimensional subspace M. • Observing Y = y, in a sense, updates the distribution of X, as given by our derivation of E(X | y) and Var(X | y).

  14. Geometric Interpretation [Figure: the three-dimensional observed space with axes y1, y2, y3, showing the mean µ, the linear subspace M, and an observation Y = y.] (see Ch. 14 p.7)
