Dimension of Meaning

Author: Hinrich Schutze. Presenter: Marian Olteanu. Introduction: represent context as vectors; the dimensions of the space are words; initial vectors are determined by word occurrence; this paper reduces dimensionality by singular value decomposition. Applications: WSD.


Presentation Transcript


  1. Dimension of Meaning Author: Hinrich Schutze Presenter: Marian Olteanu

  2. Introduction • Represent context as vectors • Dimensions of space – words • Initial vectors – determined by word occurrence • This paper – reduces dimensionality by singular value decomposition (SVD) • Applications: word sense disambiguation (WSD), thesaurus induction

  3. Introduction • Classic scheme in information retrieval (IR) • Documents are represented as vectors of words in term space • Extension – represent contexts as vectors of the words within a fixed window • Disadvantage – the same content can be expressed with different words that are close in meaning • This approach: represent words as term vectors that reflect their pattern of usage in a large corpus

  4. Introduction • Dimension in this space: • Cash • Sport • Measure • Cosine of the angle between vectors
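
The idea on slides 3 and 4 can be sketched in a few lines: count which words occur near each word, then compare two words by the cosine of the angle between their count vectors. This is a minimal sketch, not the paper's implementation; the toy sentence, the window size of 2, and the function names `cooccurrence_vectors` and `cosine` are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(tokens, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy corpus (assumed); a real word space is built from a large corpus.
tokens = "the bank pays cash interest the team plays sport daily".split()
vecs = cooccurrence_vectors(tokens, window=2)

sim_related = cosine(vecs["bank"], vecs["cash"])     # share neighbours
sim_unrelated = cosine(vecs["bank"], vecs["sport"])  # share none
```

In this toy corpus "bank" and "cash" share neighbouring words, so their cosine is positive, while "bank" and "sport" share none and score zero.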

  5. Introduction • Compute a representation of context more robust than bag-of-words • Centroid (normalized average) of the vectors of the words in a context • Practical applications • Thousands of dimensions (words) • Co-occurrence matrix with only 10% zeros
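
The centroid on slide 5 is the average of the word vectors in a context, scaled to unit length. A minimal sketch, assuming dense vectors of equal length; the three-dimensional `word_vectors` table and the choice to skip words without a vector are hypothetical, not taken from the paper.

```python
import math


def centroid(vectors):
    """Normalized average of a non-empty list of equal-length vectors."""
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    norm = math.sqrt(sum(x * x for x in mean))
    return [x / norm for x in mean] if norm > 0 else mean


# Hypothetical word vectors; real ones have thousands of dimensions.
word_vectors = {
    "bank": [4.0, 0.0, 1.0],
    "loan": [3.0, 0.0, 0.0],
    "river": [0.0, 5.0, 1.0],
}

context = ["bank", "loan", "teller"]  # "teller" has no vector and is skipped
context_vector = centroid([word_vectors[w] for w in context if w in word_vectors])
```

Because the result has unit length, the cosine between two context vectors reduces to a plain dot product.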

  6. Application • WSD • Done by clustering the contexts • AutoClass • Buckshot • Assign a sense for each cluster
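
Clustering contexts for WSD can be illustrated with a small spherical k-means loop. This is a stand-in technique chosen for brevity; it is neither AutoClass nor Buckshot, and the two-dimensional context vectors, the seed choice, and the sense names are assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cluster_contexts(contexts, seeds, iterations=10):
    """Spherical k-means: assign each context to its most similar centroid."""
    points = [normalize(c) for c in contexts]
    centroids = [normalize(s) for s in seeds]
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [
            max(range(len(centroids)), key=lambda k: dot(p, centroids[k]))
            for p in points
        ]
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = normalize([sum(col) for col in zip(*members)])
    return labels


# Hypothetical context vectors for occurrences of an ambiguous word.
contexts = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.7]]
labels = cluster_contexts(contexts, seeds=[contexts[0], contexts[2]])

# Assigning a sense to each cluster is a separate, manual step (names assumed).
sense_names = {0: "finance", 1: "river"}
senses = [sense_names[lab] for lab in labels]
```

A new occurrence would then be disambiguated by computing its context vector and picking the sense of the nearest cluster centroid.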

  7. Word space

  8. Window size, dimension sets

  9. Discussion • Resembles LSI • Uses SVD • Purpose of space reduction • LSI – improve the quality of representation (because of null values) • This paper: reduce computation and detect term dependencies (similar terms) • SVD does not affect the accuracy of WSD
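
The dimensionality reduction discussed here can be sketched with NumPy's `numpy.linalg.svd`: keep the top `k` singular directions and use the projected rows as shorter word vectors. The 4 x 5 count matrix, `k=2`, and the helper names `reduce_dimensions` and `cosine` are assumptions for illustration; the paper works with far larger matrices.

```python
import numpy as np


def reduce_dimensions(matrix, k):
    """Project the rows of `matrix` onto its top-k singular directions."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


def cosine(a, b):
    """Cosine of the angle between two dense vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


# Hypothetical counts: 4 words (rows) by 5 context words (columns).
counts = np.array([
    [4.0, 3.0, 0.0, 0.0, 1.0],
    [3.0, 4.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 5.0, 4.0, 0.0],
    [0.0, 1.0, 4.0, 5.0, 0.0],
])

reduced = reduce_dimensions(counts, k=2)   # each word is now a 2-d vector
sim_same_topic = cosine(reduced[0], reduced[1])
sim_other_topic = cosine(reduced[0], reduced[2])
```

Words with similar usage patterns (rows 0 and 1) stay close after the reduction, while each later similarity computation touches only `k` numbers per word.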

  10. Discussion • Small number of parameters (thousands) compared to other statistical approaches (e.g., trigrams)
