Introducing Latent Semantic Analysis
Thomas K. Landauer et al., "An Introduction to Latent Semantic Analysis," Discourse Processes, Vol. 25 (2-3), pp. 259-284, 1998.
Scott Deerwester et al., "Indexing by Latent Semantic Analysis," Journal of the American Society for Information Science, Vol. 41 (6), pp. 391-407, 1990.
Kirk Baker, "Singular Value Decomposition Tutorial," electronic document, 2005.
Aug 22, 2014, Hee-Gook Jun
Outline • SVD • SVD to LSA • Conclusion
Eigendecomposition vs. Singular Value Decomposition • Eigendecomposition: A = PΛP⁻¹ • Works only for a diagonalizable matrix • Must be a square matrix: an n x n matrix must have n linearly independent eigenvectors (e.g., any symmetric matrix) • Singular Value Decomposition: A = U∑Vᵀ • Computable for a matrix A of any size (m x n)
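To make the contrast concrete, here is a minimal R sketch (the matrix values are made up for illustration): eigen() only accepts a square matrix, while svd() factors a matrix of any shape.

  # eigendecomposition: square, diagonalizable matrices only
  A_square <- matrix(c(2, 1,
                       1, 2), nrow = 2, byrow = TRUE)
  eigen(A_square)    # A = P %*% diag(lambda) %*% solve(P)

  # SVD: works for any m x n matrix
  A_rect <- matrix(c(3, 1, 1,
                    -1, 3, 1), nrow = 2, byrow = TRUE)
  # eigen(A_rect)    # error: non-square matrix in 'eigen'
  svd(A_rect)        # A = U %*% diag(d) %*% t(V)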
U: Left Singular Vectors of A • Unitary matrix: columns of U are orthonormal (orthogonal + normalized) • They are the orthonormal eigenvectors of AAᵀ in A = U∑Vᵀ • Example: u₁ = [0, 0, 0, 1] and u₂ = [0, 1, 0, 0] • u₁ · u₂ = (0x0) + (0x1) + (0x0) + (1x0) = 0, so the columns are orthogonal • ||u₁|| = √(0² + 0² + 0² + 1²) = 1, so each column is a unit (normal) vector
V: Right Singular Vectors of A • Unitary matrix: columns of V are orthonormal (orthogonal + normalized) • They are the orthonormal eigenvectors of AᵀA in A = U∑Vᵀ
∑ (or S) • Diagonal matrix in A = U∑Vᵀ • Diagonal entries are the singular values of A • The non-zero singular values are the square roots of the non-zero eigenvalues of AAᵀ (equivalently, of AᵀA), arranged in descending order
Calculation Procedure for A = U∑Vᵀ • ① U is built from the eigenvectors of AAᵀ • Compute AAᵀ • Compute the eigenvectors of AAᵀ • Orthonormalize them • ② V is built from the eigenvectors of AᵀA • Compute AᵀA • Compute the eigenvectors of AᵀA • Orthonormalize them and transpose • ③ ∑ is built from the eigenvalues of AAᵀ or AᵀA (their non-zero eigenvalues are equal): its diagonal holds their square roots
1.1 Matrix U – Compute AAᵀ • Start with the matrix A • Compute its transpose Aᵀ • Then form the product AAᵀ
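A short R sketch of this step; since no concrete matrix is fixed here, the 2 x 3 matrix below is a stand-in chosen for illustration:

  A   <- matrix(c(3, 1, 1,
                 -1, 3, 1), nrow = 2, byrow = TRUE)   # stand-in 2 x 3 matrix
  At  <- t(A)        # transpose of A
  AAt <- A %*% At    # the 2 x 2 product A %*% t(A)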
1.2 Matrix U – Eigenvectors and Eigenvalues [1/2] • An eigenvector is a nonzero vector v satisfying Av = λv, where A is a square matrix, λ is an eigenvalue (scalar), and v is the eigenvector • Rearranging gives (A − λI)v = 0 • Since v is nonzero, set the determinant of the coefficient matrix to zero: det(A − λI) = 0, which yields the eigenvalues
1.2 Matrix U – Eigenvectors and Eigenvalues [2/2] • Solve det(AAᵀ − λI) = 0 to get the eigenvalues λ₁ and λ₂ • ① Substitute λ₁ into (AAᵀ − λ₁I)v = 0 and solve for the eigenvector v₁ • ② Substitute λ₂ likewise to obtain v₂ • This gives the set of eigenvectors {v₁, v₂}
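In R this whole slide is one call (continuing the stand-in example): eigen() returns the eigenvalues in decreasing order together with unit-length eigenvectors.

  e <- eigen(AAt)
  e$values    # eigenvalues of A %*% t(A), largest first
  e$vectors   # eigenvectors, one per column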
1.3 Matrix U – Orthonormalization • Turn the set of eigenvectors {v₁, v₂} into an orthonormal matrix with Gram-Schmidt orthonormalization • Normalize v₁: u₁ = v₁ / ||v₁|| • Find w₂ orthogonal to u₁: w₂ = v₂ − (v₂ · u₁)u₁ • Normalize w₂: u₂ = w₂ / ||w₂||
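For reference, a hand-rolled Gram-Schmidt in R (a sketch, not code from the slides; eigen() already returns orthonormal columns for a symmetric matrix such as AAᵀ, so in practice this step is often a no-op):

  gram_schmidt <- function(V) {
    U <- V
    for (j in seq_len(ncol(V))) {
      w <- V[, j]
      for (i in seq_len(j - 1)) {
        w <- w - sum(w * U[, i]) * U[, i]   # remove the component along u_i
      }
      U[, j] <- w / sqrt(sum(w * w))        # normalize to unit length
    }
    U
  }
  U <- gram_schmidt(eigen(AAt)$vectors)     # the left singular vectors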
2.1 Matrix Vᵀ – Compute AᵀA • Start with the matrix A • Compute its transpose Aᵀ • Then form the product AᵀA
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [1/2] • As before, an eigenvector is a nonzero vector v satisfying Av = λv • Rearranging gives (AᵀA − λI)v = 0 • Set the determinant of the coefficient matrix to zero: det(AᵀA − λI) = 0, evaluated here by cofactor expansion since AᵀA is larger than 2 x 2
2.2 Matrix Vᵀ – Eigenvectors and Eigenvalues [2/2] • ① Substitute λ₁ into (AᵀA − λ₁I)v = 0 and solve for the eigenvector v₁ • ② Substitute λ₂ likewise for v₂ • ③ Substitute λ₃ likewise for v₃ • This gives the set of eigenvectors {v₁, v₂, v₃}
2.3 Matrix Vᵀ – Orthonormalization and Transposition • Apply Gram-Schmidt orthonormalization to the set of eigenvectors {v₁, v₂, v₃} to obtain the orthonormal matrix V • Normalize v₁: u₁ = v₁ / ||v₁|| • Find w₂ orthogonal to u₁: w₂ = v₂ − (v₂ · u₁)u₁, then normalize: u₂ = w₂ / ||w₂|| • Find w₃ orthogonal to both u₁ and u₂: w₃ = v₃ − (v₃ · u₁)u₁ − (v₃ · u₂)u₂, then normalize: u₃ = w₃ / ||w₃|| • Finally, transpose V to obtain Vᵀ
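The V side reuses the same helper (continuing the sketch with the stand-in matrix):

  AtA <- t(A) %*% A                         # 3 x 3 for the stand-in matrix
  V   <- gram_schmidt(eigen(AtA)$vectors)   # already orthonormal; harmless to re-run
  Vt  <- t(V)                               # transpose to obtain V^T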
3.1 Matrix ∑ (= S) • Square roots of the non-zero eigenvalues • Populate the diagonal with the values • Diagonal entries in ∑ are the singular values of A
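Continuing the sketch, ∑ can be assembled from the eigenvalues found earlier; the tolerance below is an arbitrary choice for dropping numerically zero eigenvalues.

  vals  <- eigen(AAt)$values
  vals  <- vals[vals > 1e-12]                    # keep non-zero eigenvalues only
  Sigma <- diag(sqrt(vals), nrow = length(vals)) # singular values, descending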
Outline • SVD • SVD to LSA • Conclusion
Latent Semantic Analysis • Uses SVD (Singular Value Decomposition) to simulate human learning of word and passage meaning • Represents word and passage meanings as high-dimensional vectors in a semantic space
LSA Example • First analysis: Document Similarity • Second analysis: Term Similarity
doc 1: "modem the steering linux. modem, linux the modem. steering the modem. linux"
doc 2: "linux; the linux. the linux modem linux. the modem, clutch the modem. petrol"
doc 3: "petrol! clutch the steering, steering, linux. the steering clutch petrol. clutch the petrol; the clutch"
doc 4: "the the the. clutch clutch clutch! steering petrol; steering petrol petrol; steering petrol"
LSA Example: Build a Term Frequency Matrix • Count how often each term occurs in each document (stop-word removal is skipped in this toy example), giving the 6 x 4 matrix A:
            doc1  doc2  doc3  doc4
linux         3     4     1     0
modem         4     3     0     0
the           3     4     4     3
clutch        0     1     4     3
steering      2     0     3     3
petrol        0     1     3     4
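This matrix can be rebuilt in base R from the four documents; a sketch that assumes whitespace tokenization after punctuation is stripped:

  docs <- c(
    "modem the steering linux modem linux the modem steering the modem linux",
    "linux the linux the linux modem linux the modem clutch the modem petrol",
    "petrol clutch the steering steering linux the steering clutch petrol clutch the petrol the clutch",
    "the the the clutch clutch clutch steering petrol steering petrol petrol steering petrol"
  )
  terms <- c("linux", "modem", "the", "clutch", "steering", "petrol")
  A <- sapply(docs, function(d) {
    tokens <- strsplit(d, " ")[[1]]              # split on spaces
    sapply(terms, function(w) sum(tokens == w))  # term frequency
  })
  dimnames(A) <- list(terms, paste0("doc", 1:4))
  A   # the 6 x 4 term-frequency matrix above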
LSA Example: Compute SVD of Matrix A • In R: result <- svd(A) • This factors A (6 x 4) into U (6 x 4) x S (4 x 4) x Vᵀ (4 x 4)
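A slightly fuller version of that call: svd() returns the singular values d and the factors u and v, and multiplying them back reproduces A.

  result <- svd(A)
  result$d                          # 4 singular values, largest first
  dim(result$u); dim(result$v)      # 6 x 4 and 4 x 4
  max(abs(A - result$u %*% diag(result$d) %*% t(result$v)))  # ~ 0: U S t(V) recovers A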
LSA Example: Reduced SVD • Full SVD: A (6 x 4) = U (6 x 4) x S (4 x 4) x Vᵀ (4 x 4) • Keep only the k = 2 largest singular values and the matching columns of U and V: A ≈ U₂ (6 x 2) x S₂ (2 x 2) x V₂ᵀ (2 x 4)
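The truncation in R, continuing from result above:

  k  <- 2
  Uk <- result$u[, 1:k]        # 6 x 2
  Sk <- diag(result$d[1:k])    # 2 x 2
  Vk <- result$v[, 1:k]        # 4 x 2
  A2 <- Uk %*% Sk %*% t(Vk)    # best rank-2 approximation of A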
LSA Example: Document Similarity • Multiply S₂ (2 x 2) by V₂ᵀ (2 x 4): each column of S₂V₂ᵀ is a 2-dimensional vector locating one of doc 1 to doc 4 in the latent space • Documents whose vectors lie close together are similar, even if they share few literal words
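A sketch of the comparison; cosine similarity is an assumed choice here, as the slides do not name a measure:

  doc_coords <- Sk %*% t(Vk)   # 2 x 4: one 2-D column per document
  cosine <- function(x, y) sum(x * y) / (sqrt(sum(x^2)) * sqrt(sum(y^2)))
  cosine(doc_coords[, 1], doc_coords[, 2])   # e.g. doc 1 vs. doc 2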
LSA Example: Term Similarity • Multiply U₂ (6 x 2) by S₂ (2 x 2): each row of U₂S₂ is a 2-dimensional vector locating one term (linux, modem, the, clutch, steering, petrol) in the same latent space • Terms with nearby vectors tend to occur in similar contexts
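The term side under the same assumption:

  term_coords <- Uk %*% Sk     # 6 x 2: one 2-D row per term
  rownames(term_coords) <- terms
  cosine(term_coords["linux", ], term_coords["modem", ])   # related terms score high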
Conclusion • Pros • Computes document similarity even when the documents share no common words • Cons • How many singular values to keep in the reduction remains an open design question • Lacks a solid statistical foundation → addressed later by PLSA