Attention-Based Information Retrieval. Georg Buscher German Research Center for Artificial Intelligence (DFKI) Knowledge Management Department Kaiserslautern, Germany. SIGIR 07 Doctoral Consortium. Motivation.
Motivation • Magnetic Resonance Imaging uses magnetic fields and radio waves to produce high-quality two- or three-dimensional images of brain structures. Sensors read frequencies of radio waves and a computer uses the information to construct an image of the brain (see 2). • Homer's personality is one of frequent stupidity, laziness, and explosive anger. He also suffers from a short attention span, which complements his intense but short-lived passion for hobbies, enterprises and various causes. Furthermore, he is prone to emotional outbursts. • Positron Emission Tomography measures emissions from radioactively labeled, metabolically active chemicals that have been injected into the bloodstream. The emission data are computer-processed to produce two- or three-dimensional images of the distribution of the chemicals throughout the brain. Especially useful is a wide array of chemicals used to map different aspects of neurotransmitter activity (see 3). [Figure labels: 1, 2, 3]
Outline • Acquiring attention evidence • Attention evidence through eye tracking • Attention annotation and derivation with Dempster-Shafer • Applications in Information Retrieval • Attention-based TfIdf • Context elicitation • Context-based Index • Query Expansion / result re-ranking
Sources of Attention-Data • There are many indications of attention from the user: • Reading evidence (implicit): read, skimmed, longer viewed • Annotations (explicit)
Attention Annotations Imply Different Levels of Attention • Attention evidence values are intervals, e.g. [1.0; 1.0], [0.7; 1.0], [0.5; 1.0], …, [0.2; 0.7] • Values range from 0 to 1 • The width of an interval expresses uncertainty
Dempster-Shafer Combination of Attention Evidence • Example: a read sentence, split into segments with their evidence marks (R, H, U) and combined intervals: • [The demo … provide] R: [0.5; 1] • [different] R, H: [0.85; 1] • [visualizations] R, H, U: [0.96; 1] • [and interfaces] R, U: [0.85; 1] • [according … situation.] R: [0.5; 1] • Calculate one value of attention, att(t) = bel(t) − 0.2·bel(t) + 0.2·pl(t), per segment: 0.6, 0.88, 0.97, 0.88, 0.6 • In that way, the function att provides an attention value for every term of the document: att_{different,d} = 0.88, att_{according,d} = 0.6, att_{somethingElse,d} = 0
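The numbers on this slide can be reproduced with a minimal sketch, assuming every piece of evidence is a simple support function for "this text was attended" with plausibility 1 (as in the example intervals). Under that assumption, Dempster's rule reduces to bel = 1 − ∏(1 − bel_i). The function names and the per-mark belief values (0.5 for R, 0.7 for H and for U) are inferred from the example figures, not stated on the slide.

```python
from math import prod

def combine_beliefs(beliefs):
    """Dempster's rule for simple support functions on the same hypothesis.

    Assumption: every piece of evidence has plausibility 1, so only the
    belief values need to be combined.
    """
    return 1.0 - prod(1.0 - b for b in beliefs)

def att(bel, pl=1.0, weight=0.2):
    """Single attention value from an interval [bel; pl]:
    att = bel - 0.2*bel + 0.2*pl, as given on the slide."""
    return bel - weight * bel + weight * pl

# Assumed evidence strengths (inferred, not from the slide): R=0.5, H=0.7, U=0.7
r, h, u = 0.5, 0.7, 0.7
for marks, evidence in [("R", [r]), ("R,H", [r, h]), ("R,H,U", [r, h, u])]:
    bel = combine_beliefs(evidence)
    print(marks, round(bel, 3), round(att(bel), 3))
```

For the three-mark segment this yields a belief of about 0.955; the slide's 0.97 presumably results from rounding the belief to 0.96 before applying att.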
Outline • Acquiring attention evidence • Attention evidence through eye tracking • Attention annotation and derivation with Dempster-Shafer • Applications in Information Retrieval • Attention-based TfIdf Desktop Index • Context elicitation • Context-based Index • Query Expansion
Attention-Based Desktop Index • A desktop index is used especially for re-finding known documents. • You can better remember those parts of a document that you paid attention to. • Attended terms should be weighted higher. • TfIdf-based modification • Attention is a local factor (like tf) • The higher the maximal intensity of an attended document part, the more weight should be assigned to the attention value. • The lower the maximal intensity of an attended document part, the more weight should be assigned to tf. • [Formula: term weight composed of an attention part and a term frequency part] • tf_{t,d}: term frequency of term t in document d • att_{t,d}: attention value of term t in document d • α in [0; 1]: balancing factor defining the influence of attention relative to term frequency
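The slide's actual formula is not preserved in this transcript. The sketch below is only one hypothetical weighting that satisfies the stated properties: attention and tf are blended as local factors, and the share given to attention grows with the maximal attention intensity in the document. The function name, the normalisation of tf by the document's largest tf, and the logarithmic idf are all assumptions.

```python
import math

def attention_tfidf(tf, max_tf, att, max_att, df, n_docs, alpha=0.5):
    """Hypothetical attention-aware term weight (NOT the slide's formula).

    tf, max_tf : term frequency of t in d, and the largest tf in d
    att        : attention value of t in d, in [0, 1]
    max_att    : maximal attention value of any part of d, in [0, 1]
    df, n_docs : document frequency of t and number of indexed documents
    alpha      : balancing factor in [0, 1]
    """
    tf_norm = tf / max_tf if max_tf else 0.0
    lam = alpha * max_att                      # share of the attention part
    local = lam * att + (1.0 - lam) * tf_norm  # attention part + tf part
    idf = math.log(n_docs / df) if df else 0.0
    return local * idf

# A rarely occurring but strongly attended term gains weight
# compared with the plain tf-idf case (alpha=0).
plain = attention_tfidf(tf=2, max_tf=10, att=0.88, max_att=0.97,
                        df=5, n_docs=100, alpha=0.0)
boosted = attention_tfidf(tf=2, max_tf=10, att=0.88, max_att=0.97,
                          df=5, n_docs=100, alpha=0.5)
print(plain < boosted)
```

With max_att = 0 (nothing in the document was attended) or alpha = 0, this reduces to an ordinary normalised tf·idf weight.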
Why Context? The Search for the Mental Model • If a knowledge worker tries to recall something concerning a topic, does he primarily think • on the basis of documents and document structures or • on the basis of former thematic contexts? Rather the latter… • While re-finding some information, one does not search primarily for the document, but for the former mental model. Documents mediate.
Elicitation and Representation of the Thematic Context • Some read sub-documents: Document 1 (Brain imaging), Document 2 (Brain imaging), Document 3 (The Simpsons), Document 4 (Brain imaging) • Combination of the viewed sub-documents into one virtual context document (only those attended parts that have a thematic overlap) • Resulting thematic context: Brain imaging
Determination of Thematic Overlap • Determine buzzwords for each viewed document by using • Attention value • Idf of desktop index • Compare buzzword vector with previous context vectors • If there is a similarity, then merge with context vector • Else the buzzword vector is a new context • [Figure: currently viewed document (part) compared with previous contexts]
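The steps on this slide can be sketched as follows. The slide does not name a similarity measure or a merge rule, so cosine similarity, the threshold of 0.3, additive merging, and all function names are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity of two sparse term vectors (dicts term -> weight)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def buzzwords(att, idf, k=5):
    """Top-k terms of a viewed document, scored by attention value * idf."""
    scores = {t: a * idf.get(t, 0.0) for t, a in att.items()}
    top = sorted(scores, key=scores.get, reverse=True)[:k]
    return {t: scores[t] for t in top if scores[t] > 0}

def assign_context(vector, contexts, threshold=0.3):
    """Merge the buzzword vector into the most similar previous context,
    or open a new context; returns the index of the context used."""
    best, best_sim = None, 0.0
    for i, ctx in enumerate(contexts):
        sim = cosine(vector, ctx)
        if sim > best_sim:
            best, best_sim = i, sim
    if best is not None and best_sim >= threshold:
        for t, w in vector.items():          # assumed merge rule: add weights
            contexts[best][t] = contexts[best].get(t, 0.0) + w
        return best
    contexts.append(dict(vector))
    return len(contexts) - 1

contexts = []
first = assign_context({"brain": 1.0, "imaging": 0.8}, contexts)
second = assign_context({"homer": 1.0, "simpsons": 0.9}, contexts)
third = assign_context({"brain": 0.7, "tomography": 0.6}, contexts)
print(first, second, third, len(contexts))
```

In this toy run the third vector shares the term "brain" with the first context and is merged into it, while the unrelated second vector opens a context of its own.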
Context-Based Vector-Space Index • Common index structure: term–document matrix [Figure: matrix over Term1–Term3 and Doc1–Doc3 with values 0 1 0 4 0 1 2 3 1] • Idea: two indexes: 1. Term – Context, 2. Context – Document • A context is represented by a virtual context document • The value for each term–context relation is influenced by the degree of attention [Figure: term–context matrix over Term1–Term4 and C1–C3 with values 2 1 0 3 1 2 1 3 5 2 0 1; context–document matrix over C1–C3 and Doc1–Doc2 with four cells marked x]
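A minimal sketch of the two-index idea, assuming a term–context index with attention-influenced weights and a context–document index that only records membership. The class name, method names, scoring by summed weights, and all example values are hypothetical and are not taken from the slide's matrices.

```python
from collections import defaultdict

class ContextIndex:
    """Two-level index: term -> context weights, context -> documents."""

    def __init__(self):
        self.term_context = defaultdict(dict)  # term -> {context: weight}
        self.context_docs = defaultdict(set)   # context -> {document ids}

    def add(self, context, doc, term_weights):
        """Register a document (part) under a context with its
        attention-influenced term weights."""
        self.context_docs[context].add(doc)
        for term, w in term_weights.items():
            row = self.term_context[term]
            row[context] = row.get(context, 0.0) + w

    def search(self, query_terms):
        """Rank contexts by summed term weights, then list their documents."""
        scores = defaultdict(float)
        for term in query_terms:
            for ctx, w in self.term_context.get(term, {}).items():
                scores[ctx] += w
        ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
        return [(ctx, score, sorted(self.context_docs[ctx]))
                for ctx, score in ranked]

idx = ContextIndex()
idx.add("C1", "Doc1", {"mri": 0.9, "brain": 0.6})
idx.add("C1", "Doc2", {"brain": 0.8})
idx.add("C2", "Doc3", {"homer": 1.0})
results = idx.search(["brain"])
print(results)
```

A query is resolved in two hops: terms select contexts, and each context then yields the documents that contributed to it, including documents that do not contain the query term themselves.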
New Kinds of Search Tasks Possible • Local search: Find, for the current task, (parts of) documents that I formerly used for a similar task. • Enterprise-wide search: Find, for the current task, (parts of) documents that I do not know yet, but that have been used by some colleague for a similar task.
Evaluation of the Context-Based Index • The main advantage is expected to show up only after several weeks of use. • It is not possible to do real-world eye tracking studies for such a long time. • Artificial experiment: • Several different exploration tasks within some hours • Then some re-finding tasks about previously viewed content • Open question: measure the time or the user satisfaction during the search process? • Comparison: context-based search vs. normal search
Contextual Attention-Based Relevance Feedback • Problem with the context-based index: it does not scale to web search; therefore, query expansion is used instead • The currently elicited context (i.e. a term vector) expresses the current interest of the user • The topmost characteristic keywords will be used for query expansion
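As a rough illustration of the last bullet, the following sketch appends the highest-weighted context terms to a user query. The function name, the choice of k, and the rule of skipping terms already present in the query are assumptions; the slide does not specify how the keywords are selected or combined.

```python
def expand_query(query, context_vector, k=3):
    """Append the k highest-weighted context terms that are not already
    part of the query (hypothetical selection rule)."""
    present = {t.lower() for t in query.split()}
    ranked = sorted(context_vector.items(), key=lambda kv: kv[1], reverse=True)
    extra = [t for t, _ in ranked if t.lower() not in present][:k]
    return " ".join([query] + extra)

# Example context vector (term -> weight); the values are made up.
context = {"mri": 0.9, "brain": 0.8, "tomography": 0.7, "emission": 0.4}
expanded = expand_query("brain scan", context, k=2)
print(expanded)
```

The expanded query can then be sent to an ordinary web search engine, so no attention-aware index is needed on the web side.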
The Global Picture • Eye Tracker • Text Mark Recognition • Attention data generation module • Attention-annotated document • Attention-based desktop index • Context document • Context-based index • Query expansion for web search • Thank you for your attention!