1. Latent Semantic Indexing via a Semi-discrete Matrix Decomposition
2. Papers from the same authors with similar topics
Kolda, T.G. & O'Leary, D.P. A semidiscrete matrix decomposition for latent semantic indexing in information retrieval. ACM Trans. Inf. Syst., 1998, 16, 322–346
Kolda, T.G. & O'Leary, D.P. Latent semantic indexing via a semi-discrete matrix decomposition. In: Cybenko, G. & O'Leary, D.P. (eds.), Springer-Verlag, 1999, 107, 73–80
Kolda, T.G. & O'Leary, D.P. Algorithm 805: computation and uses of the semidiscrete matrix decomposition. ACM Transactions on Mathematical Software, 2000, 26, 415–435
3. Vector Space Framework Query:
4. Weight of term in a document
5. Weight of term in a document
6. Motivation for using SDD
Singular Value Decomposition (SVD) is used in Latent Semantic Indexing (LSI) to estimate the structure of word usage across documents.
Using a Semi-discrete Decomposition (SDD) instead of the SVD for LSI saves storage space and retrieval time.
7. Why?
Claim: the SVD has nice theoretical properties, but it contains a lot of information, probably more than this application needs.
8. SVD vs SDD
SVD:
SDD:
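For reference, the two factorizations are usually written as below. The notation (U_k, Σ_k, V_k for the SVD; X_k, D_k, Y_k for the SDD; an m × n term-document matrix A) is assumed here and follows the convention of the papers listed on slide 2 rather than anything shown on this slide.

```latex
\begin{align*}
\text{SVD:}\quad A \approx A_k &= U_k \Sigma_k V_k^T = \sum_{i=1}^{k} \sigma_i\, u_i v_i^T,
  & u_i &\in \mathbb{R}^m,\ v_i \in \mathbb{R}^n,\ \sigma_i \ge 0 \\
\text{SDD:}\quad A \approx A_k &= X_k D_k Y_k^T = \sum_{i=1}^{k} d_i\, x_i y_i^T,
  & x_i &\in \{-1,0,1\}^m,\ y_i \in \{-1,0,1\}^n,\ d_i > 0
\end{align*}
```

The only structural difference is that the SDD vectors are restricted to the three values -1, 0 and 1, which is what makes them cheap to store.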
9. SDD is an approximate representation of the matrix. Repackaging the matrix as an SDD, even without removing anything, might not reproduce the original matrix. Theorems show that the k-term approximation converges to the original matrix as the number of terms k tends to infinity, although the convergence can be slow. The speed of convergence depends on the initial estimate used to "initialize" the iterative decomposition algorithm.
10. Result: Storage Space
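A rough way to see where the storage saving comes from is to count what each factorization has to keep. The sketch below is illustrative only: the function names are hypothetical, the byte sizes (8-byte floats for the SVD, 4-byte floats for the SDD weights, 2 bits per ternary entry) are assumptions, and the dimensions in the usage note are made up rather than taken from the slides.

```python
def svd_storage_bytes(m, n, k, float_bytes=8):
    """Bytes for a rank-k SVD: U_k (m*k), V_k (n*k) and k singular values.

    float_bytes=8 (double precision) is an assumption.
    """
    return float_bytes * k * (m + n + 1)


def sdd_storage_bytes(m, n, k, float_bytes=4, bits_per_entry=2):
    """Bytes for a k-term SDD: k*(m+n) entries from {-1, 0, 1} plus k weights.

    Two bits per ternary entry and 4-byte weights are assumptions.
    """
    ternary_bits = bits_per_entry * k * (m + n)
    # Round the packed ternary entries up to whole bytes.
    return float_bytes * k + (ternary_bits + 7) // 8
```

For a hypothetical 1000 × 500 matrix with k = 10, `svd_storage_bytes(1000, 500, 10)` gives 120080 bytes and `sdd_storage_bytes(1000, 500, 10)` gives 3790 bytes, so under these assumptions an SDD can afford many more terms than an SVD in the same space.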
11. Medline test case
12. Results on Medline test case
13. Method for SDD
14. Metrics in those papers
Kolda, T.G. & O'Leary, D.P. A semidiscrete matrix decomposition for latent semantic indexing in information retrieval. ACM Trans. Inf. Syst., 1998, 16, 322–346
Kolda, T.G. & O'Leary, D.P. Latent semantic indexing via a semi-discrete matrix decomposition. In: Cybenko, G. & O'Leary, D.P. (eds.), Springer-Verlag, 1999, 107, 73–80
Kolda, T.G. & O'Leary, D.P. Algorithm 805: computation and uses of the semidiscrete matrix decomposition. ACM Transactions on Mathematical Software, 2000, 26, 415–435
15. Greedy Algorithm
16. Notes on the algorithm
Starting vector y: every 100th element is 1 and all the others are 0.
A_k → A as k → ∞
Minimizing the F-norm (Frobenius norm) can be simplified to finding an optimal x.
The improvement threshold may be 0.01, where improvement = |new - old| / old.
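The notes above can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the authors' Algorithm 805 code. The names `sdd`, `best_ternary` and `max_inner` are hypothetical. "new" and "old" are assumed to be successive values of (x^T R y)^2 / (||x||^2 ||y||^2), the amount by which one term reduces the squared F-norm of the residual R. The fallback starting vector used when R y is zero is also an assumption.

```python
import numpy as np


def best_ternary(s):
    """Return x in {-1, 0, 1}^len(s) maximizing (x.s)^2 / (x.x) for nonzero s."""
    mags = np.abs(s)
    order = np.argsort(-mags)                 # indices by decreasing |s_i|
    csum = np.cumsum(mags[order])             # sums of the J largest magnitudes
    J = int(np.argmax(csum ** 2 / np.arange(1, len(s) + 1))) + 1
    x = np.zeros(len(s))
    x[order[:J]] = np.sign(s[order[:J]])      # keep the signs of the J largest
    return x


def sdd(A, k_max, tol=0.01, max_inner=100):
    """Greedy k-term SDD sketch: A ~ X @ diag(D) @ Y.T with ternary X and Y."""
    A = np.asarray(A, dtype=float)
    m, n = A.shape
    R = A.copy()                              # residual, starts as A itself
    X, D, Y = [], [], []
    for _ in range(k_max):
        if not R.any():
            break                             # residual is exactly zero
        y = np.zeros(n)
        y[::100] = 1.0                        # every 100th element is 1
        if not (R @ y).any():
            # Assumed fallback: point y at the column holding the largest entry.
            y = np.zeros(n)
            y[np.argmax(np.abs(R).max(axis=0))] = 1.0
        old = None
        for _ in range(max_inner):
            x = best_ternary(R @ y)           # optimal x for fixed y
            y = best_ternary(R.T @ x)         # optimal y for fixed x
            new = (x @ R @ y) ** 2 / ((x @ x) * (y @ y))
            if old is not None and abs(new - old) / old <= tol:
                break                         # improvement below the threshold
            old = new
        d = (x @ R @ y) / ((x @ x) * (y @ y))  # optimal weight for this x, y
        R -= d * np.outer(x, y)               # remove the new term from the residual
        X.append(x)
        D.append(d)
        Y.append(y)
    return np.array(X).T, np.array(D), np.array(Y).T
```

Calling `X, D, Y = sdd(A, k_max=10)` returns the factors so that `(X * D) @ Y.T` is the 10-term approximation of A. Each added term strictly reduces the F-norm of the residual, which is the behaviour behind the convergence statement above.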
17. Finding x and d
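One way to see how x and d are found for a fixed y is the short derivation below. It is a sketch in assumed notation: R is the current residual matrix, s = R y, and the subscript F denotes the Frobenius norm.

```latex
\begin{align*}
\|R - d\,x y^T\|_F^2 &= \|R\|_F^2 - 2d\,x^T R y + d^2\,\|x\|_2^2\,\|y\|_2^2 \\
d^* &= \frac{x^T R y}{\|x\|_2^2\,\|y\|_2^2}
  \qquad \text{(set the derivative with respect to } d \text{ to zero)} \\
\|R - d^* x y^T\|_F^2 &= \|R\|_F^2 - \frac{(x^T R y)^2}{\|x\|_2^2\,\|y\|_2^2} \\
x^* &= \arg\max_{x \in \{-1,0,1\}^m,\ x \neq 0} \frac{(x^T s)^2}{\|x\|_2^2},
  \qquad s = R y
\end{align*}
```

Since y is fixed, minimizing the F-norm reduces to the last line. If x has J nonzero entries, the best choice puts them at the J largest values of |s_i| with x_i = sign(s_i), so only the m candidate values of J need to be compared after sorting |s|.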