Introducing Latent Semantic Analysis
Thomas K. Landauer et al., "An Introduction to Latent Semantic Analysis," Discourse Processes, Vol. 25 (2-3), pp. 259-284, 1998.
Scott Deerwester et al., "Indexing by Latent Semantic Analysis," Journal of the American Society for Information Science, Vol. 41 (6), pp. 391-407, 1990.
Kirk Baker, "Singular Value Decomposition Tutorial," electronic document, 2005.
Aug 22, 2014, Hee-Gook Jun
Outline • SVD • SVD to LSA • Conclusion
Eigendecomposition vs. Singular Value Decomposition • Eigendecomposition: A = PΛP⁻¹ • Works only for a diagonalizable matrix • Must be a square matrix: an n x n matrix must have n linearly independent eigenvectors (e.g., any symmetric matrix) • Singular Value Decomposition: A = U∑Vᵀ • Computable for a matrix A of any size (m x n)
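To make the contrast concrete, here is a minimal R sketch (the matrix values are made up for illustration): eigen() only accepts a square matrix, while svd() factors a matrix of any shape.

  # eigendecomposition: square, diagonalizable matrices only
  A_square <- matrix(c(2, 1,
                       1, 2), nrow = 2, byrow = TRUE)
  eigen(A_square)    # A = P %*% diag(lambda) %*% solve(P)

  # SVD: works for any m x n matrix
  A_rect <- matrix(c(3, 1, 1,
                    -1, 3, 1), nrow = 2, byrow = TRUE)
  # eigen(A_rect)    # error: non-square matrix in 'eigen'
  svd(A_rect)        # A = U %*% diag(d) %*% t(V)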
U: Left Singular Vectors of A • Unitary matrix: columns of U are orthonormal (orthogonal + normalized) • They are the orthonormal eigenvectors of AAᵀ in A = U∑Vᵀ • Example: u₁ = [0, 0, 0, 1] and u₂ = [0, 1, 0, 0] • u₁ · u₂ = (0x0) + (0x1) + (0x0) + (1x0) = 0, so the columns are orthogonal • ||u₁|| = √(0² + 0² + 0² + 1²) = 1, so each column is a unit (normal) vector
V: Right Singular Vectors of A • Unitary matrix: columns of V are orthonormal (orthogonal + normalized) • They are the orthonormal eigenvectors of AᵀA in A = U∑Vᵀ
∑ (or S) • Diagonal matrix in A = U∑Vᵀ • Diagonal entries are the singular values of A • The non-zero singular values are the square roots of the non-zero eigenvalues of AAᵀ (equivalently, of AᵀA), arranged in descending order
Calculation Procedure for A = U∑Vᵀ • ① U is built from the eigenvectors of AAᵀ • Compute AAᵀ • Compute the eigenvectors of AAᵀ • Orthonormalize them • ② V is built from the eigenvectors of AᵀA • Compute AᵀA • Compute the eigenvectors of AᵀA • Orthonormalize them and transpose • ③ ∑ is built from the eigenvalues of AAᵀ or AᵀA (their non-zero eigenvalues are equal): its diagonal holds their square roots
1.1 Matrix U – Compute AAᵀ • Start with the matrix A • Compute its transpose Aᵀ • Then form the product AAᵀ
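A short R sketch of this step; since no concrete matrix is fixed here, the 2 x 3 matrix below is a stand-in chosen for illustration:

  A   <- matrix(c(3, 1, 1,
                 -1, 3, 1), nrow = 2, byrow = TRUE)   # stand-in 2 x 3 matrix
  At  <- t(A)        # transpose of A
  AAt <- A %*% At    # the 2 x 2 product A %*% t(A)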
1.2 Matrix U – Eigenvectors and Eigenvalues [1/2] • An eigenvector is a nonzero vector v satisfying Av = λv, where A is a square matrix, λ is an eigenvalue (scalar), and v is the eigenvector • Rearranging gives (A − λI)v = 0 • Since v is nonzero, set the determinant of the coefficient matrix to zero: det(A − λI) = 0, which yields the eigenvalues
1.2 Matrix U – Eigenvectors and Eigenvalues [2/2] • Solve det(AAᵀ − λI) = 0 to get the eigenvalues λ₁ and λ₂ • ① Substitute λ₁ into (AAᵀ − λ₁I)v = 0 and solve for the eigenvector v₁ • ② Substitute λ₂ likewise to obtain v₂ • This gives the set of eigenvectors {v₁, v₂}
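In R this whole slide is one call (continuing the stand-in example): eigen() returns the eigenvalues in decreasing order together with unit-length eigenvectors.

  e <- eigen(AAt)
  e$values    # eigenvalues of A %*% t(A), largest first
  e$vectors   # eigenvectors, one per column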
1.3 Matrix U – Orthonormalization • Turn the set of eigenvectors {v₁, v₂} into an orthonormal matrix with Gram-Schmidt orthonormalization • Normalize v₁: u₁ = v₁ / ||v₁|| • Find w₂ orthogonal to u₁: w₂ = v₂ − (v₂ · u₁)u₁ • Normalize w₂: u₂ = w₂ / ||w₂||
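For reference, a hand-rolled Gram-Schmidt in R (a sketch, not code from the slides; eigen() already returns orthonormal columns for a symmetric matrix such as AAᵀ, so in practice this step is often a no-op):

  gram_schmidt <- function(V) {
    U <- V
    for (j in seq_len(ncol(V))) {
      w <- V[, j]
      for (i in seq_len(j - 1)) {
        w <- w - sum(w * U[, i]) * U[, i]   # remove the component along u_i
      }
      U[, j] <- w / sqrt(sum(w * w))        # normalize to unit length
    }
    U
  }
  U <- gram_schmidt(eigen(AAt)$vectors)     # the left singular vectors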
2.1 Matrix Vᵀ – Compute AᵀA • Start with the matrix A • Compute its transpose Aᵀ • Then form the product AᵀA
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [1/2] • As before, an eigenvector is a nonzero vector v satisfying Av = λv • Rearranging gives (AᵀA − λI)v = 0 • Set the determinant of the coefficient matrix to zero: det(AᵀA − λI) = 0, evaluated here by cofactor expansion since AᵀA is larger than 2 x 2
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [2/2] • ① Substitute λ₁ into (AᵀA − λ₁I)v = 0 and solve for the eigenvector v₁ • ② Substitute λ₂ likewise for v₂ • ③ Substitute λ₃ likewise for v₃ • This gives the set of eigenvectors {v₁, v₂, v₃}
2.3 Matrix Vᵀ – Orthonormalization and Transposition • Apply Gram-Schmidt orthonormalization to the set of eigenvectors {v₁, v₂, v₃} to obtain the orthonormal matrix V • Normalize v₁: u₁ = v₁ / ||v₁|| • Find w₂ orthogonal to u₁: w₂ = v₂ − (v₂ · u₁)u₁, then normalize: u₂ = w₂ / ||w₂|| • Find w₃ orthogonal to both u₁ and u₂: w₃ = v₃ − (v₃ · u₁)u₁ − (v₃ · u₂)u₂, then normalize: u₃ = w₃ / ||w₃|| • Finally, transpose V to obtain Vᵀ
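The V side reuses the same helper (continuing the sketch with the stand-in matrix):

  AtA <- t(A) %*% A                         # 3 x 3 for the stand-in matrix
  V   <- gram_schmidt(eigen(AtA)$vectors)   # already orthonormal; harmless to re-run
  Vt  <- t(V)                               # transpose to obtain V^T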
3.1 Matrix ∑ (= S) • Square roots of the non-zero eigenvalues • Populate the diagonal with the values • Diagonal entries in ∑ are the singular values of A
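Continuing the sketch, ∑ can be assembled from the eigenvalues found earlier; the tolerance below is an arbitrary choice for dropping numerically zero eigenvalues.

  vals  <- eigen(AAt)$values
  vals  <- vals[vals > 1e-12]                    # keep non-zero eigenvalues only
  Sigma <- diag(sqrt(vals), nrow = length(vals)) # singular values, descending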
Outline • SVD • SVD to LSA • Conclusion
Latent Semantic Analysis • Uses SVD (Singular Value Decomposition) to simulate human learning of word and passage meaning • Represents word and passage meanings as high-dimensional vectors in a semantic space
LSA Example • First analysis: Document Similarity • Second analysis: Term Similarity
doc 1: "modem the steering linux. modem, linux the modem. steering the modem. linux"
doc 2: "linux; the linux. the linux modem linux. the modem, clutch the modem. petrol"
doc 3: "petrol! clutch the steering, steering, linux. the steering clutch petrol. clutch the petrol; the clutch"
doc 4: "the the the. clutch clutch clutch! steering petrol; steering petrol petrol; steering petrol"
LSA Example: Build a Term Frequency Matrix • Count how often each term occurs in each document (stop-word removal is skipped in this toy example), giving the 6 x 4 matrix A:
            doc1  doc2  doc3  doc4
linux         3     4     1     0
modem         4     3     0     0
the           3     4     4     3
clutch        0     1     4     3
steering      2     0     3     3
petrol        0     1     3     4
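This matrix can be rebuilt in base R from the four documents; a sketch that assumes whitespace tokenization after punctuation is stripped:

  docs <- c(
    "modem the steering linux modem linux the modem steering the modem linux",
    "linux the linux the linux modem linux the modem clutch the modem petrol",
    "petrol clutch the steering steering linux the steering clutch petrol clutch the petrol the clutch",
    "the the the clutch clutch clutch steering petrol steering petrol petrol steering petrol"
  )
  terms <- c("linux", "modem", "the", "clutch", "steering", "petrol")
  A <- sapply(docs, function(d) {
    tokens <- strsplit(d, " ")[[1]]              # split on spaces
    sapply(terms, function(w) sum(tokens == w))  # term frequency
  })
  dimnames(A) <- list(terms, paste0("doc", 1:4))
  A   # the 6 x 4 term-frequency matrix above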
LSA Example: Compute SVD of Matrix A • In R: result <- svd(A) • This factors A (6 x 4) into U (6 x 4) x S (4 x 4) x Vᵀ (4 x 4)
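A slightly fuller version of that call: svd() returns the singular values d and the factors u and v, and multiplying them back reproduces A.

  result <- svd(A)
  result$d                          # 4 singular values, largest first
  dim(result$u); dim(result$v)      # 6 x 4 and 4 x 4
  max(abs(A - result$u %*% diag(result$d) %*% t(result$v)))  # ~ 0: U S t(V) recovers A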
LSA Example: Reduced SVD • Full SVD: A (6 x 4) = U (6 x 4) x S (4 x 4) x Vᵀ (4 x 4) • Keep only the k = 2 largest singular values and the matching columns of U and V: A ≈ U₂ (6 x 2) x S₂ (2 x 2) x V₂ᵀ (2 x 4)
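The truncation in R, continuing from result above:

  k  <- 2
  Uk <- result$u[, 1:k]        # 6 x 2
  Sk <- diag(result$d[1:k])    # 2 x 2
  Vk <- result$v[, 1:k]        # 4 x 2
  A2 <- Uk %*% Sk %*% t(Vk)    # best rank-2 approximation of A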
LSA Example: Document Similarity • Multiply S₂ (2 x 2) by V₂ᵀ (2 x 4): each column of S₂V₂ᵀ is a 2-dimensional vector locating one of doc 1 to doc 4 in the latent space • Documents whose vectors lie close together are similar, even if they share few literal words
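A sketch of the comparison; cosine similarity is an assumed choice here, as the slides do not name a measure:

  doc_coords <- Sk %*% t(Vk)   # 2 x 4: one 2-D column per document
  cosine <- function(x, y) sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
  cosine(doc_coords[, 1], doc_coords[, 2])   # e.g. doc 1 vs. doc 2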
LSA Example: Term Similarity • Multiply U₂ (6 x 2) by S₂ (2 x 2): each row of U₂S₂ is a 2-dimensional vector locating one term (linux, modem, the, clutch, steering, petrol) in the same latent space • Terms with nearby vectors tend to occur in similar contexts
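The term side under the same assumption:

  term_coords <- Uk %*% Sk     # 6 x 2: one 2-D row per term
  rownames(term_coords) <- terms
  cosine(term_coords["linux", ], term_coords["modem", ])   # related terms score high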
Conclusion • Pros • Computes document similarity even when the documents share no common words • Cons • How many singular values to keep in the reduction remains an open design question • Lacks a solid statistical foundation → addressed later by PLSA