
Why Spectral Retrieval Works

Presentation Transcript


1. Why Spectral Retrieval Works
SIGIR 2005 in Salvador, Brazil, August 15 – 19
Holger Bast, Max-Planck-Institut für Informatik (MPII), Saarbrücken, Germany
Joint work with Debapriyo Majumdar

2. What we mean by spectral retrieval
• Ranked retrieval in the term space: documents d_i are ranked by their cosine similarity to the query q, i.e. by q^T d_i / (|q| |d_i|)
• [Figure: five example documents with "true" similarities 1.00, 1.00, 0.00, 0.50, 0.00 to the query, but cosine similarities 0.82, 0.00, 0.00, 0.38, 0.00 in the term space]

3. What we mean by spectral retrieval
• Ranked retrieval in the term space: cosine similarities q^T d_i / (|q| |d_i|)
• Spectral retrieval = linear projection to an eigensubspace by a projection matrix L, followed by ranking with the cosine similarities in the subspace, (Lq)^T (L d_i) / (|Lq| |L d_i|)
• [Figure: the same five documents; their subspace similarities 0.98, 0.98, -0.25, 0.73, 0.01 track the "true" similarities 1.00, 1.00, 0.00, 0.50, 0.00 much more closely]
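To make the two rankings concrete, here is a minimal Python sketch (toy data and labels, not the paper's collections): it ranks the documents once by cosine similarity in the term space and once after projection to a k-dimensional eigensubspace spanned by the top-k left singular vectors.

```python
# A minimal sketch of plain vs. spectral retrieval on assumed toy data.
import numpy as np

def cosine(x, y):
    """Cosine similarity; 0 if either vector is all zeros."""
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    return 0.0 if nx == 0 or ny == 0 else float(x @ y) / (nx * ny)

# Toy term-document matrix: rows = terms, columns = documents.
A = np.array([[1, 1, 0, 0, 0],    # "web"      (illustrative labels)
              [1, 0, 0, 1, 0],    # "internet"
              [0, 0, 1, 0, 1],    # "beach"
              [0, 1, 0, 1, 1]],   # "surfing"
             dtype=float)
q = np.array([1.0, 0.0, 0.0, 0.0])           # query: just "web"

# Plain ranked retrieval in the term space.
plain = [cosine(q, A[:, j]) for j in range(A.shape[1])]

# Spectral retrieval: project with L = U_k^T, then compare in the subspace.
k = 2
U, _, _ = np.linalg.svd(A, full_matrices=False)
L = U[:, :k].T
spectral = [cosine(L @ q, L @ A[:, j]) for j in range(A.shape[1])]

print("term space:", np.round(plain, 2))
print("subspace  :", np.round(spectral, 2))
```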

4. Why and when does this work?
• Previous work: if the term-document matrix is a slight perturbation of a rank-k matrix, then projection to a k-dimensional subspace works
  • Papadimitriou, Tamaki, Raghavan, Vempala, PODS'98
  • Ding, SIGIR'99
  • Ando and Lee, SIGIR'01
  • Azar, Fiat, Karlin, McSherry, Saia, STOC'01
• Our explanation: spectral retrieval works through its ability to identify pairs of terms with similar co-occurrence patterns
  • no single subspace is appropriate for all term pairs
  • we fix that problem

5. Spectral retrieval — alternative view
• Ranked retrieval in the term space
• Spectral retrieval = linear projection to an eigensubspace by a projection matrix L
• Cosine similarities in the subspace: (Lq)^T (L d_1) / (|Lq| |L d_1|) = q^T (L^T L d_1) / (|Lq| |L^T L d_1|)

6. Spectral retrieval — alternative view
• The rewritten similarity q^T (L^T L d_1) / (|Lq| |L^T L d_1|) compares the original query q against the document d_1 multiplied by the expansion matrix L^T L

7. Spectral retrieval — alternative view
• Since |Lq| is the same for every document, replacing it by |q| does not change the ranking; this yields the similarity after document expansion, q^T (L^T L d_1) / (|q| |L^T L d_1|)
• Spectral retrieval = document expansion (not query expansion)
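A quick numerical check of the identity on this slide, assuming L comes from the top-k left singular vectors so that its rows are orthonormal and hence |L^T L d| = |L d|; all data and names here are illustrative stand-ins.

```python
# Checking: cosine in the subspace == cosine after document expansion.
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((40, 60))                   # stand-in term-document matrix
U, _, _ = np.linalg.svd(A, full_matrices=False)
L = U[:, :10].T                            # projection to a 10-dim subspace
E = L.T @ L                                # the expansion matrix L^T L

q, d = rng.random(40), A[:, 0]
subspace = (L @ q) @ (L @ d) / (np.linalg.norm(L @ q) * np.linalg.norm(L @ d))
expanded = q @ (E @ d) / (np.linalg.norm(L @ q) * np.linalg.norm(E @ d))
print(np.isclose(subspace, expanded))      # True: same similarity value
```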

8. Why document "expansion"
• [Figure: a document vector over the terms internet, surfing, beach, web is multiplied by a 0-1 expansion matrix]

9. Why document "expansion"
• [Figure: the same multiplication; a 1-entry in the expansion matrix adds "internet" to every document in which "web" is present]

10. Why document "expansion"
• Ideal expansion matrix has
  • high scores for intuitively related terms
  • low scores for intuitively unrelated terms
• [Figure: the expansion matrix L^T L for a matrix L projecting to 2 dimensions; it adds "internet" where "web" is present]
• The expansion matrix depends heavily on the subspace dimension!

11. Why document "expansion"
• Ideal expansion matrix has
  • high scores for intuitively related terms
  • low scores for intuitively unrelated terms
• [Figure: the expansion matrix L^T L for a matrix L projecting to 3 dimensions; it adds "internet" where "web" is present]
• The expansion matrix depends heavily on the subspace dimension!
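The dependence on the dimension that these two slides illustrate can be observed directly. The following toy Python snippet (assumed random data, not from the paper) prints one entry of the expansion matrix L^T L = U_k U_k^T for several subspace dimensions k.

```python
# One expansion-matrix entry across several subspace dimensions.
import numpy as np

rng = np.random.default_rng(1)
A = rng.random((50, 200))                  # stand-in term-document matrix
U, _, _ = np.linalg.svd(A, full_matrices=False)

i, j = 0, 1                                # an arbitrary term pair
for k in (2, 3, 4, 10, 25):
    Uk = U[:, :k]                          # L = Uk^T, hence L^T L = Uk Uk^T
    print(f"k={k:2d}  entry={Uk[i] @ Uk[j]: .3f}")
```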

12. Our Key Observation
• We studied how the entries in the expansion matrix depend on the dimension of the subspace to which the documents are projected
• [Figure: expansion matrix entry plotted against the subspace dimension (0 to 600) for the term pairs logic/logics, node/vertex, and logic/vertex]
• No single dimension is appropriate for all term pairs

13. Our Key Observation
• [Figure: the same three plots for logic/logics, node/vertex, and logic/vertex]
• No single dimension is appropriate for all term pairs ... but the shape of the curve is a good indicator of relatedness!
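The curves on these slides are cheap to compute: the (i, j) entry of U_k U_k^T for every k at once is a cumulative sum over the singular-vector components. A minimal sketch, with a random stand-in matrix:

```python
# The whole curve k -> (U_k U_k^T)_{ij} for one term pair.
import numpy as np

def expansion_curve(U, i, j):
    """Entry (i, j) of U_k U_k^T for every subspace dimension k >= 1."""
    return np.cumsum(U[i] * U[j])

rng = np.random.default_rng(2)
A = rng.random((100, 300))                 # stand-in term-document matrix
U, _, _ = np.linalg.svd(A, full_matrices=False)
curve = expansion_curve(U, 0, 1)
print(np.round(curve[:10], 3))             # the first ten dimensions
```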

14. Curves for related terms
• We call two terms perfectly related if they have an identical co-occurrence pattern
• [Figure: three plots of expansion matrix entry vs. subspace dimension (0 to 600): the proven shape for perfectly related terms, the provably small change after a slight perturbation, and a curve half way to a real matrix]
• The point of fall-off is different for every term pair!
• The up-and-then-down shape remains

15. Curves for unrelated terms
• Co-occurrence graph:
  • vertices = terms
  • edge = two terms co-occur
• We call two terms perfectly unrelated if no path connects them in the graph
• [Figure: three plots of expansion matrix entry vs. subspace dimension (0 to 600): the proven shape for perfectly unrelated terms, the provably small change after a slight perturbation, and a curve half way to a real matrix]
• Curves for unrelated terms are random oscillations around zero
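A minimal sketch of this definition, using plain breadth-first search on a toy term-document matrix (no claim that this matches the paper's implementation):

```python
# Co-occurrence graph and the "perfectly unrelated" test.
import numpy as np
from collections import deque

def cooccurrence_adjacency(A):
    """Adjacency: terms i, j are adjacent if they share a document."""
    B = (A > 0).astype(int)
    C = B @ B.T                      # C[i, j] = number of shared documents
    np.fill_diagonal(C, 0)
    return C > 0

def connected(adj, i, j):
    """Breadth-first search for a path from term i to term j."""
    seen, queue = {i}, deque([i])
    while queue:
        u = queue.popleft()
        if u == j:
            return True
        for v in np.flatnonzero(adj[u]):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

A = np.array([[1, 1, 0],    # term 0
              [0, 1, 0],    # term 1 shares a document with term 0
              [0, 0, 1]])   # term 2 occurs alone
adj = cooccurrence_adjacency(A)
print(connected(adj, 0, 1))  # True  -> connected in the graph
print(connected(adj, 0, 2))  # False -> "perfectly unrelated"
```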

16. Telling the shapes apart — TN
• Normalize the term-document matrix so that the theoretical point of fall-off is equal for all term pairs
• For each term pair: if the curve is never negative before this point, set the entry in the expansion matrix to 1, otherwise to 0
• [Figure: three example curves; the first two get entry 1, the oscillating third gets entry 0]
• A simple 0-1 classification, no fractional entries!
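A sketch of the TN test as described on this slide. The fall-off dimension is passed in as a stand-in parameter here; in the paper it follows from the normalization.

```python
# TN: non-negativity test on the initial part of the curve.
import numpy as np

def tn_entry(U, i, j, falloff):
    """1 if the curve for pair (i, j) never goes negative before the
    fall-off dimension, else 0 (a pure 0-1 decision, no fractions)."""
    curve = np.cumsum(U[i] * U[j])         # (U_k U_k^T)_{ij} for all k
    return 1 if np.all(curve[:falloff] >= 0) else 0

rng = np.random.default_rng(3)
A = rng.random((50, 200))                  # stand-in term-document matrix
U, _, _ = np.linalg.svd(A, full_matrices=False)
print(tn_entry(U, 0, 1, falloff=30))
```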

17. An alternative algorithm — TM
• Again, normalize the term-document matrix so that the theoretical point of fall-off is equal for all term pairs
• For each term pair compute the monotonicity of its initial curve (= 1 if perfectly monotone, → 0 as the number of turns increases)
• If the monotonicity is above some threshold, set the entry in the expansion matrix to 1, otherwise to 0
• [Figure: three example curves with monotonicity scores 0.07, 0.69, and 0.82; the two sufficiently monotone curves get entry 1, the oscillating one entry 0]
• Again: a simple 0-1 classification!
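A sketch of the TM test. The slide does not spell out the monotonicity measure, so this sketch uses one plausible formalization, net change divided by total variation, which is 1 for a perfectly monotone curve and tends to 0 as the number of turns grows.

```python
# TM: monotonicity test on the initial part of the curve.
import numpy as np

def monotonicity(curve):
    """1.0 for a monotone curve; smaller the more the curve oscillates."""
    diffs = np.diff(curve)
    total = np.abs(diffs).sum()
    return 1.0 if total == 0 else abs(curve[-1] - curve[0]) / total

def tm_entry(U, i, j, falloff, threshold=0.5):
    """0-1 classification: 1 if the initial curve is monotone enough."""
    curve = np.cumsum(U[i] * U[j])[:falloff]
    return 1 if monotonicity(curve) >= threshold else 0

rng = np.random.default_rng(4)
A = rng.random((50, 200))                  # stand-in term-document matrix
U, _, _ = np.linalg.svd(A, full_matrices=False)
print(tm_entry(U, 0, 1, falloff=30))
```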

18. Experimental results (average precision)
• Collection: 425 docs, 3882 terms
• Methods compared:
  • Baseline: cosine similarity in term space
  • LSI: Latent Semantic Indexing, Dumais et al. 1990
  • LSI-RN: term-normalized LSI, Ding et al. 2001
  • CORR: correlation-based LSI, Dupret et al. 2001
  • IRR: Iterative Residual Rescaling, Ando & Lee 2001
  • TN: our non-negativity test
  • TM: our monotonicity test
• [Table of average-precision scores; the numbers are not preserved in this transcript]
• * The numbers for LSI, LSI-RN, CORR, IRR are for the best subspace dimension!

19. Experimental results (average precision)
• Collections: 425 docs / 3882 terms, 21578 docs / 5701 terms, 233445 docs / 99117 terms
• [Table of average-precision scores for all methods on all three collections; the numbers are not preserved in this transcript]
• * The numbers for LSI, LSI-RN, CORR, IRR are for the best subspace dimension!

20. Conclusions
• Main message: spectral retrieval works through its ability to identify pairs of terms with similar co-occurrence patterns
  • a simple 0-1 classification that considers a sequence of subspaces is at least as good as schemes that commit to a fixed subspace
• Some useful corollaries ...
  • new insights into the effect of term-weighting and other normalizations for spectral retrieval
  • straightforward integration of known word relationships
  • consequences for spectral link analysis?

21. Conclusions
• Main message: spectral retrieval works through its ability to identify pairs of terms with similar co-occurrence patterns
  • a simple 0-1 classification that considers a sequence of subspaces is at least as good as schemes that commit to a fixed subspace
• Some useful corollaries ...
  • new insights into the effect of term-weighting and other normalizations for spectral retrieval
  • straightforward integration of known word relationships
  • consequences for spectral link analysis?
Obrigado! (Thank you!)

22. Why document "expansion"
• Ideal expansion matrix has
  • high scores for related terms
  • low scores for unrelated terms
• Expansion matrix L^T L depends on the subspace dimension
• [Figure: the expansion matrix L^T L for a matrix L projecting to 4 dimensions; it adds "internet" where "web" is present]
