Lecture 21 SVD and Latent Semantic Indexing and Dimensional Reduction


Presentation Transcript


  1. Lecture 21: SVD and Latent Semantic Indexing and Dimensional Reduction. Shang-Hua Teng

  2. Singular Value Decomposition • A = U S VT = σ1u1v1T + σ2u2v2T + … + σrurvrT • where u1 … ur are the r orthonormal vectors that form a basis of C(A) and • v1 … vr are the r orthonormal vectors that form a basis of C(AT)
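A minimal numpy sketch of this decomposition (the 2 × 3 test matrix is a made-up example, not from the lecture): the reduced SVD is computed and A is rebuilt as a sum of rank-one terms σi ui viT.

```python
import numpy as np

# Made-up example matrix, rank 2.
A = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)  # reduced SVD
r = np.linalg.matrix_rank(A)                      # here r = 2

# Rebuild A as sigma_1 * u1 v1^T + ... + sigma_r * ur vr^T.
A_rebuilt = sum(s[i] * np.outer(U[:, i], Vt[i, :]) for i in range(r))
assert np.allclose(A, A_rebuilt)

# Columns of U: orthonormal basis of C(A); rows of Vt: orthonormal basis of C(A^T).
```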

  3. Low Rank Approximation and Reduction

  4. The Singular Value Decomposition • Full SVD: A = U S VT with A: m x n, U: m x m, S: m x n, VT: n x n • Reduced SVD: A = U S VT with A: m x n, U: m x r, S: r x r, VT: r x n (r = rank of A)

  5. The Singular Value Reduction • Reduced SVD: A = U S VT with A: m x n, U: m x r, S: r x r, VT: r x n • Keeping only the k largest singular values: Ak = Uk Sk VkT with Ak: m x n, Uk: m x k, Sk: k x k, VkT: k x n
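A short numpy sketch of the truncation step; the random test matrix and the choice k = 2 are arbitrary, not values from the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((8, 6))
k = 2

U, s, Vt = np.linalg.svd(A, full_matrices=False)
# (m x k)(k x k)(k x n) = m x n, but rank at most k.
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

print(np.linalg.matrix_rank(Ak))  # 2
```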

  6. How Much Information Lost?

  7. Distance between Two Matrices • Frobenius Norm of a matrix A: ||A||F = √(Σi Σj aij²) • Distance between two matrices A and B: d(A, B) = ||A − B||F
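The same two definitions in numpy; the matrices A and B here are made-up 2 × 2 examples.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[1.0, 0.0], [0.0, 4.0]])

fro_A = np.linalg.norm(A, 'fro')        # sqrt of the sum of squared entries
dist_AB = np.linalg.norm(A - B, 'fro')  # Frobenius distance between A and B
print(fro_A, dist_AB)
```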

  8. How Much Information Lost?

  9. Approximation Theorem • [Schmidt 1907; Eckart and Young 1936] Among all m by n matrices B of rank at most k, Ak is the one that minimizes ||A − B||F • The error itself is ||A − Ak||F² = σk+1² + … + σr²
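A quick numerical check of the error formula, using an arbitrary random matrix: the Frobenius error of the rank-k truncation equals the square root of the sum of the discarded squared singular values.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((10, 7))
k = 3

U, s, Vt = np.linalg.svd(A, full_matrices=False)
Ak = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# ||A - Ak||_F should equal sqrt(sigma_{k+1}^2 + ... + sigma_r^2).
err = np.linalg.norm(A - Ak, 'fro')
assert np.isclose(err, np.sqrt(np.sum(s[k:] ** 2)))
```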

  10. Application: Image Compression • Uncompressed m by n pixel image: m × n numbers • Rank k approximation of image: • k singular values • The first k columns of U (m-vectors) • The first k columns of V (n-vectors) • Total: k × (m + n + 1) numbers
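A sketch of this bookkeeping in numpy; compress_image and decompress_image are illustrative names, not functions from the lecture.

```python
import numpy as np

def compress_image(M: np.ndarray, k: int):
    """Keep only the factors of the rank-k approximation of a grayscale
    image M: k singular values plus the first k columns of U and V,
    i.e. k * (m + n + 1) stored numbers instead of m * n."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]

def decompress_image(Uk, sk, Vtk):
    # (Uk * sk) scales column i of Uk by the i-th singular value,
    # so this equals Uk @ diag(sk) @ Vtk.
    return (Uk * sk) @ Vtk
```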

  11. Example: Yogi (Uncompressed) • Source: [Will] • Yogi: Rock photographed by the Sojourner Mars mission • 256 × 264 grayscale bitmap → 256 × 264 matrix M • Pixel values ∈ [0, 1] • 256 × 264 = 67584 numbers

  12. Example: Yogi (Compressed) • M has 256 singular values • Rank 81 approximation of M: • 81 × (256 + 264 + 1) = 42201 numbers, about 62% of the uncompressed 67584

  13. Example: Yogi (Both)

  14. Eigenface • Patented by MIT • Utilizes two-dimensional, global grayscale images • Each face is mapped to a vector of numbers • Creates an image subspace (the "face space") that best discriminates between faces • Works only on properly lit, frontal images
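A minimal sketch of the eigenface computation as PCA via the SVD; the array layout (one flattened face image per row) and the function names are assumptions for illustration, not the lecture's code.

```python
import numpy as np

def eigenfaces(faces: np.ndarray, k: int):
    """faces: (num_images, num_pixels) array of flattened grayscale faces.
    Returns the mean face and the top-k eigenfaces, an orthonormal basis
    of the face space, computed by PCA via the SVD."""
    mean_face = faces.mean(axis=0)
    X = faces - mean_face                 # center the data
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return mean_face, Vt[:k]              # rows of Vt = principal directions

def project(face, mean_face, basis):
    # k coordinates of one face in face space
    return (face - mean_face) @ basis.T
```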

  15. The Face Database • Set of normalized face images • Used ORL Face DB

  16. Two-dimensional Embedding

  17. EigenFaces • Eigenface (PCA)

  18. Latent Semantic Analysis (LSA) • Latent Semantic Indexing (LSI) • Principal Component Analysis (PCA)

  19. Term-Document Matrix • Index each document (by human or by computer) • fij: counts, frequencies, weights, etc. • Each document (a column) can be regarded as a point in m dimensions, m = number of terms

  20. Document-Term Matrix • Index each document (by human or by computer) • fij: counts, frequencies, weights, etc. • Each document (a row) can be regarded as a point in n dimensions, n = number of terms
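A minimal sketch of building such a matrix; the two-document corpus and the naive whitespace tokenizer are made-up assumptions.

```python
import numpy as np

docs = ["human machine interface", "user interface survey"]

vocab = sorted({w for d in docs for w in d.split()})
col = {w: j for j, w in enumerate(vocab)}

F = np.zeros((len(docs), len(vocab)))  # documents x terms
for i, d in enumerate(docs):
    for w in d.split():
        F[i, col[w]] += 1              # f_ij = count of term j in document i
# Each row of F is one document, a point in n = len(vocab) dimensions.
```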

  21. Term Occurrence Matrix

  22.            c1  c2  c3  c4  c5  m1  m2  m3  m4
      human       1   0   0   1   0   0   0   0   0
      interface   1   0   1   0   0   0   0   0   0
      computer    1   1   0   0   0   0   0   0   0
      user        0   1   1   0   1   0   0   0   0
      system      0   1   1   2   0   0   0   0   0
      response    0   1   0   0   1   0   0   0   0
      time        0   1   0   0   1   0   0   0   0
      EPS         0   0   1   1   0   0   0   0   0
      survey      0   1   0   0   0   0   0   0   1
      trees       0   0   0   0   0   1   1   1   0
      graph       0   0   0   0   0   0   1   1   1
      minors      0   0   0   0   0   0   0   1   1

  23. Another Example

  24. Term Document Matrix

  25. D T LSI using k=2… “applications& algorithms” LSI Factor 2 LSI Factor 1 “differentialequations” Each term’s coordinates specified in first K valuesof its row. Each doc’s coordinates specified in first K valuesof its column.

  26. Positive Definite Matrices and Quadratic Shapes

  27. Positive Definite Matrices and Quadratic Shapes • For any m x n matrix A, all eigenvalues of AAT and ATA are non-negative • Symmetric matrices that have positive eigenvalues are called positive definite matrices • Symmetric matrices that have non-negative eigenvalues are called positive semi-definite matrices
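A quick numpy check of the first claim; the random test matrix is arbitrary. The eigenvalues of AAT and ATA are the squared singular values of A, hence non-negative.

```python
import numpy as np

A = np.random.default_rng(2).random((4, 3))

eig_AAt = np.linalg.eigvalsh(A @ A.T)  # eigvalsh: for symmetric matrices
eig_AtA = np.linalg.eigvalsh(A.T @ A)
assert (eig_AAt > -1e-12).all() and (eig_AtA > -1e-12).all()
```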
