
Discussion Class 4


epipkin

Presentation Transcript


  1. Discussion Class 4: Latent Semantic Indexing

  2. Discussion Classes. Format: Question. Ask a member of the class to answer. Provide opportunity for others to comment. When answering: Stand up. Give your name. Make sure that the TA hears it. Speak clearly so that all the class can hear. Suggestions: Do not be shy about presenting partial answers. Differing viewpoints are welcome.

  3. Question 1: Basics (a) Explain the name "latent semantic analysis". (b) What problems is latent semantic analysis attempting to solve? (c) What criteria were used in selecting singular-value decomposition?

  4. Question 2 [Figure not reproduced. Legend: term, document, query; dashed lines mark cosine > 0.9.]

  5. Question 3: Rank Reduction (a) Explain the matrices in the singular value decomposition: $X = T_0 S_0 D_0'$. (b) The rank reduction method is to keep the first $k$ elements of $S_0$ and set the others to zero. This gives: $X \approx \hat{X} = T S D'$. What has this to do with latent semantics?

  6. Question 4: Experimental Results: 100 Factors (a) LSI-100 does better at the right of this graph than on the left. What has this to do with synonymy and polysemy? (b) Why were the authors surprised that TERM and SMART gave similar results?

  7. Question 5: Experimental Results (a) Describe the methodology of the MED experiment. (b) What conclusions can you draw from this experiment? (c) The results of the CISI experiment were disappointing. What are some possible explanations? (d) This is a new method. What comes next?

  8. Question 6: Number of Factors What data does this graph plot? What conclusions can you draw from this graph?

  9. Question 7: Performance The paper states, "the only way documents can be retrieved is by an exhaustive comparison of a query vector against all stored document vectors." Explain this statement. Is this a serious problem?
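
As a rough illustration of Questions 2, 3, and 7, the sketch below runs the rank reduction and the exhaustive query comparison on a tiny example. Everything specific in it is an assumption made for illustration: the term-document counts are invented, the names `lsi_fit` and `cosine_to_all` are hypothetical, and scaling both query and documents by S before taking cosines is one common convention rather than necessarily the exact procedure of the paper under discussion.

```python
import numpy as np

# Hypothetical toy term-document matrix X (rows = terms, columns = documents).
# The counts are invented; document 1 does not contain the term "computer".
X = np.array([
    [1, 1, 0, 0],  # human
    [1, 0, 0, 0],  # computer
    [1, 1, 0, 0],  # interface
    [0, 0, 1, 1],  # graph
    [0, 0, 1, 1],  # tree
], dtype=float)


def lsi_fit(X, k):
    """Return T (terms x k), s (k,), D (docs x k) of the rank-k truncated SVD."""
    T0, s0, D0t = np.linalg.svd(X, full_matrices=False)  # X = T0 diag(s0) D0'
    # Keep the k largest singular values; dropping the rest is the rank reduction.
    return T0[:, :k], s0[:k], D0t[:k, :].T


def cosine_to_all(q, T, s, D):
    """Exhaustively compare a query term vector q with every document vector."""
    q_k = q @ T          # folded-in query (q' T S^-1), rescaled by S
    docs_k = D * s       # rows of D S: document coordinates in the same space
    norms = np.linalg.norm(docs_k, axis=1) * np.linalg.norm(q_k)
    return (docs_k @ q_k) / norms  # one cosine per stored document


T, s, D = lsi_fit(X, k=2)
X_hat = (T * s) @ D.T    # rank-2 approximation: X is approximated by T S D'

q = np.array([0, 1, 0, 0, 0], dtype=float)  # query containing only "computer"
cosines = cosine_to_all(q, T, s, D)
ranking = np.argsort(-cosines)              # best match first
print(np.round(cosines, 3), ranking)
```

With these invented counts, document 1 shares no term with the query, yet it lies in the same direction as document 0 in the two-factor space and so passes a cosine > 0.9 threshold. Note also that `cosine_to_all` touches every stored document vector, so query cost grows linearly with the number of documents, which is the concern raised in Question 7.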
