Latent Semantic Mapping: Dimensionality Reduction via Globally Optimal Continuous Parameter Modeling Jerome R. Bellegarda
Outline • Introduction • LSM • Applications • Conclusions
Introduction • LSA in IR: • Relates the words of queries to the words of documents • Improves both recall and precision • Assumption: there is some underlying latent semantic structure in the data • The latent structure is conveyed by word-document correlation patterns • Documents: bag-of-words model • LSA improves separability among different topics
Introduction • Success of LSA: • Word clustering • Document clustering • Language modeling • Automated call routing • Semantic Inference for spoken interface control • These solutions all leverage LSA’s ability to expose global relationships in context and meaning
Introduction • Three unique factors underlie LSA: • The mapping of discrete entities into a continuous space • The dimensionality reduction • The intrinsically global outlook • The terminology is changed to latent semantic mapping (LSM) to convey increased reliance on these general properties, beyond the original IR setting
Latent Semantic Mapping • LSM defines a mapping between two discrete sets and a continuous vector space • M: an inventory of M individual units, such as words • N: a collection of N meaningful compositions of units, such as documents • L: a continuous vector space • ri: unit in M • cj: composition in N
Feature Extraction • Construction of a matrix W of co-occurrences between units and compositions • The cell of W: wij = (1 - εi) cij / nj, where cij is the count of unit ri in composition cj, nj is the total number of units in cj, and εi is the normalized entropy of ri
Feature Extraction • The normalized entropy of ri: εi = -(1 / log N) Σj (cij / ti) log(cij / ti), with ti = Σj cij • A value of εi close to 0 means that the unit is present only in a few specific compositions • The global weight 1 - εi is therefore a measure of the indexing power of the unit ri
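A minimal sketch of this weighting in Python with numpy; the function and variable names (build_weighted_matrix, counts) are illustrative, and the input is assumed to be the raw M x N matrix of counts cij:

```python
import numpy as np

def build_weighted_matrix(counts):
    """Entropy-weighted unit-composition matrix W.

    counts : (M, N) array, counts[i, j] = number of times unit ri occurs
             in composition cj (assumed raw counts).
    Returns W with cells wij = (1 - eps_i) * cij / nj, plus the entropies.
    """
    counts = np.asarray(counts, dtype=float)
    M, N = counts.shape

    t = counts.sum(axis=1, keepdims=True)          # total count of each unit
    p = np.divide(counts, t, out=np.zeros_like(counts), where=t > 0)

    # Normalized entropy of each unit across compositions:
    # close to 0 when the unit is concentrated in a few compositions.
    with np.errstate(divide="ignore", invalid="ignore"):
        plogp = np.where(p > 0, p * np.log(p), 0.0)
    eps = -plogp.sum(axis=1) / np.log(N)

    n = counts.sum(axis=0, keepdims=True)          # length of each composition
    W = (1.0 - eps)[:, None] * np.divide(counts, n,
                                         out=np.zeros_like(counts), where=n > 0)
    return W, eps
```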
Singular Value Decomposition • The MxN unit-composition matrix W defines two vector representations for the units and the compositions • ri: a row vector of dimension N • cj: a column vector of dimension M • This is impractical: • M, N can be extremely large • The vectors ri, cj are typically very sparse • The two spaces are distinct from each other
Singular Value Decomposition • Employ the truncated SVD: W ≈ Ŵ = U S Vᵀ • U: MxR left singular matrix with row vectors ui • S: RxR diagonal matrix of singular values • V: NxR right singular matrix with row vectors vj • U, V are column-orthonormal: UᵀU = VᵀV = IR • R << min(M, N)
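A short sketch of the rank-R truncated SVD with numpy, under the assumption that W comes from the weighting step above; the name lsm_svd is illustrative:

```python
import numpy as np

def lsm_svd(W, R):
    """Rank-R decomposition W ~ U S V^T defining the LSM space L.

    Returns U (MxR), S (RxR diagonal), V (NxR); the rows of U and V are the
    unit and composition vectors ui and vj.
    """
    U_full, s_full, Vt_full = np.linalg.svd(W, full_matrices=False)
    U = U_full[:, :R]                  # left singular vectors
    S = np.diag(s_full[:R])            # R largest singular values
    V = Vt_full[:R, :].T               # right singular vectors (as rows of V)
    return U, S, V
```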
Singular Value Decomposition • The rank-R approximation Ŵ captures the major structural associations in W and ignores higher-order effects • The closeness of vectors in L supports three comparisons: • Unit-unit comparison • Composition-composition comparison • Unit-composition comparison
Closeness Measure • WWᵀ: co-occurrences between units • WᵀW: co-occurrences between compositions • ri, rj are close when the units have a similar pattern of occurrence across the compositions • ci, cj are close when the compositions have a similar pattern of occurrence across the units
Closeness Measure • Unit-Unit Comparisons: • Cosine measure: K(ri, rj) = cos(ui S, uj S) = ui S² ujᵀ / (‖ui S‖ ‖uj S‖) • Associated distance (arccos of the measure): [0, π]
Closeness Measure • Composition-Composition Comparisons: • Cosine measure: K(ci, cj) = cos(vi S, vj S) = vi S² vjᵀ / (‖vi S‖ ‖vj S‖) • Associated distance (arccos of the measure): [0, π]
Closeness Measure • Unit-Composition Comparisons: • Cosine measure: K(ri, cj) = cos(ui S^1/2, vj S^1/2) = ui S vjᵀ / (‖ui S^1/2‖ ‖vj S^1/2‖) • Associated distance (arccos of the measure): [0, π]
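The three comparisons can be computed directly from the scaled singular vectors; a minimal sketch reusing U, S, V from the SVD sketch above (the function names are mine, not from the slides):

```python
import numpy as np

def _cos(x, y):
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def unit_unit(U, S, i, j):
    # K(ri, rj) = cos(ui S, uj S): similar occurrence patterns across compositions.
    return _cos(U[i] @ S, U[j] @ S)

def comp_comp(V, S, i, j):
    # K(ci, cj) = cos(vi S, vj S): similar occurrence patterns across units.
    return _cos(V[i] @ S, V[j] @ S)

def unit_comp(U, V, S, i, j):
    # K(ri, cj) = cos(ui S^1/2, vj S^1/2): unit-to-composition closeness.
    S_half = np.sqrt(S)
    return _cos(U[i] @ S_half, V[j] @ S_half)
```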
LSM Framework Extension • Observe a new composition c̃p, p > N; the tilde reflects the fact that the composition was not part of the original N compositions • c̃p, a column vector of dimension M, can be thought of as an additional column of the matrix W • U and S do not change; the new composition is folded in as ṽp = c̃pᵀ U S⁻¹
LSM Framework Extension • c̃p: pseudo-composition • ṽp: pseudo-composition vector • If the addition of c̃p causes the major structural associations in W to shift in some substantial manner, the singular vectors will become inadequate
LSM Framework Extension • In that case it would be necessary to re-compute the SVD to find a proper representation for c̃p
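Folding a new composition into the existing space reuses U and S; a sketch assuming c_new is the M-dimensional weighted count vector of the unseen composition:

```python
import numpy as np

def fold_in(c_new, U, S):
    """Pseudo-composition vector for a new column of W.

    c_new : (M,) weighted count vector of the unseen composition c~p.
    Returns the (R,) vector v~p = c~p^T U S^-1 in the existing LSM space.
    If the new data shifts the global structure substantially, the SVD
    has to be recomputed instead.
    """
    return c_new @ U @ np.linalg.inv(S)
```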
Salient Characteristics of LSM • A single vector embedding for both units and compositions in the same continuous vector space L • A relatively low dimensionality, which makes operations such as clustering meaningful and practical • An underlying structure reflecting globally meaningful relationships, with natural similarity metrics to measure the distance between units, between compositions, or between units and compositions in L
Applications • Semantic classification • Multi-span language modeling • Junk e-mail filtering • Pronunciation modeling • TTS Unit Selection
Semantic Classification • Semantic classification refers to determining which one of several predefined topics a given document is most closely aligned with • The centroid of each cluster can be viewed as the semantic representation of that topic in the LSM space • Semantic anchor • A newly observed word sequence is classified by computing the distance between its document vector and each semantic anchor, and picking the minimum
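A minimal sketch of anchor-based classification, assuming each training document has already been folded into the LSM space; the helper names are illustrative:

```python
import numpy as np

def semantic_anchors(doc_vectors, labels):
    """Centroid (semantic anchor) of the LSM vectors for each topic label."""
    anchors = {}
    for lab in set(labels):
        vecs = np.array([v for v, l in zip(doc_vectors, labels) if l == lab])
        anchors[lab] = vecs.mean(axis=0)
    return anchors

def classify(doc_vector, anchors):
    """Pick the topic whose anchor is closest (largest cosine) to the document."""
    def cos(x, y):
        return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))
    return max(anchors, key=lambda lab: cos(doc_vector, anchors[lab]))
```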
Semantic Classification • Domain knowledge is automatically encapsulated in the LSM space in a data-driven fashion • For desktop interface control: semantic inference (mapping a spoken command onto the intended action)
Multi-Span Language Modeling • In a standard n-gram, the history is the string wq-1 wq-2 ... wq-n+1 • In LSM language modeling, the history is the current document up to word wq-1 • Pseudo-document: d̃q-1 • Continually updated as q increases
Multi-Span Language Modeling • An integrated n-gram + LSM formulation for the overall language model probability: P(wq | Hq-1) = P(wq | wq-1 ... wq-n+1) P(d̃q-1 | wq) / Σwi∈V P(wi | wq-1 ... wq-n+1) P(d̃q-1 | wi) • Different syntactic constructs can be used to carry the same meaning (content words)
Multi-Span Language Modeling • This assumes that the probability of the document history given the current word is not affected by the immediate n-gram context preceding it
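A sketch of the integrated probability, assuming hypothetical helpers ngram_prob(w, history) and lsm_prob(d_vec, w) are supplied (the latter would typically be derived from the unit-composition closeness between the word and the pseudo-document):

```python
def multispan_prob(w, ngram_history, d_vec, vocab, ngram_prob, lsm_prob):
    """Integrated n-gram + LSM probability:

        P(wq | Hq-1) = P(wq | ngram history) * P(d~q-1 | wq)
                       / sum over w' of P(w' | ngram history) * P(d~q-1 | w')

    assuming the pseudo-document history is conditionally independent of the
    immediate n-gram context given the current word.
    """
    num = ngram_prob(w, ngram_history) * lsm_prob(d_vec, w)
    den = sum(ngram_prob(wi, ngram_history) * lsm_prob(d_vec, wi) for wi in vocab)
    return num / den
```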
Junk E-mail Filtering • It can be viewed as a degenerate case of semantic classification (two categories) • Legitimate • Junk • M: an inventory of words, symbols • N: a binary collection of email messages • Two semantic anchors
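With only two classes, filtering reduces to comparing a folded-in message vector against the two anchors; a self-contained toy illustration in which all numbers are invented purely to show the mechanics:

```python
import numpy as np

# Two semantic anchors (one per class) in a 3-dimensional LSM space;
# the values are made up for illustration only.
anchors = {"legitimate": np.array([0.9, 0.1, 0.3]),
           "junk":       np.array([0.1, 0.8, 0.5])}
message_vec = np.array([0.2, 0.7, 0.4])   # folded-in vector of a new message

def cos(x, y):
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

print(max(anchors, key=lambda lab: cos(message_vec, anchors[lab])))   # -> junk
```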
Pronunciation Modeling • Also called grapheme-to-phoneme conversion (GPC) • Orthographic anchors (one for each in-vocabulary word) • Orthographic neighborhood: the in-vocabulary words with the highest closeness to a given out-of-vocabulary word, used to derive its pronunciation
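A rough sketch of finding an orthographic neighborhood, under the assumption that the LSM space was trained on a letter-n-gram x word matrix; the callable letter_ngram_counts and all names here are hypothetical:

```python
import numpy as np

def orthographic_neighborhood(oov_word, anchors, U, S, letter_ngram_counts, k=5):
    """Return the k in-vocabulary words closest to an out-of-vocabulary spelling.

    anchors            : dict word -> orthographic anchor vector in the LSM space
    letter_ngram_counts: callable mapping a spelling to its (M,) count vector
                         over letter n-grams (hypothetical helper).
    The phoneme sequences of these neighbors can then be combined to propose
    a pronunciation for the unseen word.
    """
    v = letter_ngram_counts(oov_word) @ U @ np.linalg.inv(S)   # fold in the spelling

    def cos(x, y):
        return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))

    ranked = sorted(anchors.items(), key=lambda kv: cos(v, kv[1]), reverse=True)
    return [w for w, _ in ranked[:k]]
```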
Conclusions • Descriptive power: forgoing local constraints is not acceptable in some situations • Domain sensitivity: performance depends on the quality of the training data; polysemy remains problematic • Updating the LSM space: re-computing the SVD on the fly is not practical • The success of LSM stems from its three characteristics: discrete-to-continuous mapping, dimensionality reduction, and a global outlook