Dragon Star Program Course (龙星计划课程): Information Retrieval. Personalized Search & User Modeling. ChengXiang Zhai (翟成祥), Department of Computer Science, Graduate School of Library & Information Science, Institute for Genomic Biology, and Statistics, University of Illinois, Urbana-Champaign. http://www-faculty.cs.uiuc.edu/~czhai, czhai@cs.uiuc.edu
What is Personalized Search?
• Use more user information than the user's query in retrieval
• "More information" = the user's interaction history → implicit feedback
• "More information" = the user's judgments or answers to clarification questions → explicit feedback
• Personalization can be done in multiple ways:
  • Personalize the collection
  • Personalize ranking
  • Personalize result presentation
  • …
• Personalized search = user modeling + model exploitation
Why Personalized Search?
• The more we know about the user's information need, the more likely we are to retrieve relevant documents, so we should learn as much about the user as we can
• Personalized search is especially helpful when the query alone does not work well
Client-Side vs. Server-Side Personalization
• Server-side (most work, including commercial products):
  • Sees global information (all documents, all users)
  • Limited user information (cannot see activities outside the search results)
  • Privacy concerns
• Client-side (UCAIR):
  • More information about the user, thus more accurate user modeling (complete interaction history + other user activities)
  • More scalable ("distributed personalization")
  • Alleviates the privacy problem
• Combination of server-side and client-side? How?
Outline
• A framework for optimal interactive retrieval
• Implicit feedback (no user effort)
  • Within a search session
  • For improving result organization
• Explicit feedback (with user effort)
  • Term feedback
  • Active feedback
  • Improving search result organization
1. A Framework for Optimal Interactive Retrieval [Shen et al. 05]
IR as Sequential Decision Making
User (information need) ⟷ System (model of the information need)
• A1: the user enters a query; the system decides which documents to present and how to present them, returning Ri: results (i = 1, 2, 3, …); the user decides which documents to view
• A2: the user views a document; the system decides which part of the document to show and how, returning R': document content; the user decides whether to view more
• A3: the user clicks the "Back" button; …
Retrieval Decisions
• Document collection C; e.g., query = "Jaguar", click on the "Next" button
• User U:   A1, A2, …, At-1, At
• System:   R1, R2, …, Rt-1, Rt = ?
• History H = {(Ai, Ri)}, i = 1, …, t-1
• Given U, C, At, and H, choose the best Rt from r(At), the set of all possible responses to At
  • e.g., the best ranking of C for the query (from all possible rankings of C)
  • e.g., the best ranking of unseen docs (from all possible rankings of unseen docs)
A Risk Minimization Framework
• Observed: user U, interaction history H, current user action At, document collection C
• All possible responses: r(At) = {r1, …, rn}
• Inferred user model: M = (S, θU, …), where S = seen docs and θU = information need
• Loss function L(ri, At, M)
• Optimal response r*: the response with minimum Bayes risk, i.e., minimum expected loss under p(M | U, H, At, C)
A Simplified Two-Step Decision-Making Procedure • Approximate the Bayes risk by the loss at the mode of the posterior distribution • Two-step procedure • Step 1: Compute an updated user model M* based on the currently available information • Step 2: Given M*, choose a response to minimize the loss function
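The two-step procedure can be written as a minimal Python sketch; the callbacks `update_user_model` and `loss`, and all argument names below, are illustrative placeholders rather than anything defined in [Shen et al. 05]:

```python
def optimal_response(user, history, action, collection,
                     candidate_responses, update_user_model, loss):
    """Two-step approximation of the Bayes-optimal response.
    Step 1: point-estimate the user model at the posterior mode,
            M* ~ argmax_M p(M | U, H, A_t, C)  (supplied by the callback).
    Step 2: return r* = argmin_r L(r, A_t, M*) over the candidate responses."""
    m_star = update_user_model(user, history, action, collection)  # step 1
    return min(candidate_responses,                                # step 2
               key=lambda r: loss(r, action, m_star))
```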
Optimal Interactive Retrieval
• User U issues action A1; the IR system (over collection C) infers M*1 from P(M1 | U, H, A1, C), minimizes L(r, A1, M*1), and returns response R1
• The user reacts with A2; the system infers M*2 from P(M2 | U, H, A2, C), minimizes L(r, A2, M*2), and returns R2
• The user reacts with A3, and so on
Refinement of Risk Minimization
• r(At): decision space (At dependent)
  • r(At) = all possible subsets of C (document selection)
  • r(At) = all possible rankings of docs in C
  • r(At) = all possible rankings of unseen docs
  • r(At) = all possible subsets of C + summarization strategies
• M: user model
  • Essential component: θU = user information need
  • S = seen documents
  • n = "topic is new to the user"
• L(Rt, At, M): loss function
  • Generally measures the utility of Rt for a user modeled as M
  • Often encodes retrieval criteria (e.g., using M to select a ranking of docs)
• P(M | U, H, At, C): user model inference
  • Often involves estimating a unigram language model θU
Case 1: Context-Insensitive IR
• At = "enter a query Q"
• r(At) = all possible rankings of docs in C
• M = θU, a unigram language model (word distribution)
• p(M | U, H, At, C) = p(θU | Q)
Case 2: Implicit Feedback
• At = "enter a query Q"
• r(At) = all possible rankings of docs in C
• M = θU, a unigram language model (word distribution)
• H = {previous queries} + {viewed snippets}
• p(M | U, H, At, C) = p(θU | Q, H)
Case 3: General Implicit Feedback
• At = "enter a query Q", or click the "Back" or "Next" button
• r(At) = all possible rankings of unseen docs in C
• M = (θU, S), where S = seen documents
• H = {previous queries} + {viewed snippets}
• p(M | U, H, At, C) = p(θU | Q, H)
Case 4: User-Specific Result Summary
• At = "enter a query Q"
• r(At) = {(D, π)}, where D ⊆ C, |D| = k (the k most relevant docs), and π ∈ {"snippet", "overview"}
• M = (θU, n), where n ∈ {0, 1} indicates "topic is new to the user"
• p(M | U, H, At, C) = p(θU, n | Q, H), with mode M* = (θ*, n*)
• If the topic is new (n* = 1), give an overview summary; otherwise, a regular snippet summary
What You Should Know • Disadvantages and advantages of client-side vs. server-side personalization • The optimal interactive retrieval framework provides a general way to model personalized search • Maximum user modeling • Immediate benefit (“eager feedback”) • Personalization can be potentially done for all the components and steps in a retrieval system
"Jaguar" Example
Suppose we know:
1. The previous query was "racing cars" vs. "Apple OS"
2. "Car" occurs far more frequently than "Apple" in pages browsed by the user in the last 20 days
3. The user just viewed an "Apple OS" document
(Example result list for "Jaguar", labeled by sense: Car, Car, Software, Car, Animal, Car)
How can we exploit such implicit feedback information that already naturally exists to improve ranking accuracy?
Risk Minimization for Implicit Feedback
• At = "enter a query Q"
• r(At) = all possible rankings of docs in C
• M = θU, a unigram language model (word distribution)
• H = {previous queries} + {viewed snippets}
• p(M | U, H, At, C) = p(θU | Q, H)
• Need to estimate a context-sensitive LM
Scenario 1: Use Information in One Session [Shen et al. 05]
• User queries Q1, Q2, …, Qk (e.g., Q1 = "Apple software", …, Qk = "Jaguar")
• User clickthroughs C1 = {C1,1, C1,2, C1,3, …}, C2 = {C2,1, C2,2, C2,3, …}, … (e.g., "Apple - Mac OS X: The Apple Mac OS X product page. Describes features in the current version of Mac OS X, …")
• User model: built from the query history and the clickthrough history
Method 1: Fixed Coefficient Interpolation (FixInt)
• Average the user's query history Q1, …, Qk-1 and the clickthrough history C1, …, Ck-1
• Linearly interpolate the two averages into a single history model
• Linearly interpolate the current query Qk with the history model, using fixed coefficients
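A minimal Python sketch of FixInt over unigram language models represented as word-to-probability dictionaries; the helper names and the exact weighting below are illustrative assumptions, with the precise equations given in [Shen et al. 05]:

```python
from collections import defaultdict

def average_lms(lms):
    """Uniform average of unigram language models (dicts: word -> prob)."""
    avg = defaultdict(float)
    for lm in lms:
        for w, p in lm.items():
            avg[w] += p / len(lms)
    return dict(avg)

def interpolate(lm_a, lm_b, weight_a):
    """Return weight_a * lm_a + (1 - weight_a) * lm_b."""
    out = defaultdict(float)
    for w, p in lm_a.items():
        out[w] += weight_a * p
    for w, p in lm_b.items():
        out[w] += (1.0 - weight_a) * p
    return dict(out)

def fixint(current_query_lm, past_query_lms, clickthrough_lms,
           alpha=0.5, beta=0.5):
    """FixInt sketch with fixed coefficients alpha and beta:
    history = beta * avg(clickthrough models) + (1 - beta) * avg(past query models)
    theta   = alpha * current query model + (1 - alpha) * history
    """
    hc = average_lms(clickthrough_lms) if clickthrough_lms else {}
    hq = average_lms(past_query_lms) if past_query_lms else {}
    history = interpolate(hc, hq, beta)
    return interpolate(current_query_lm, history, alpha)
```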
Method 2: Bayesian Interpolation (BayesInt)
• Average the user's query history Q1, …, Qk-1 and clickthrough history C1, …, Ck-1
• Use the averaged history models as a Dirichlet prior for the current query Qk
• Intuition: trust the current query Qk more if it is longer
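A sketch of BayesInt under the same dictionary representation; the update below is the standard Dirichlet-prior form and the parameter values are placeholders, so treat the details as assumptions rather than the paper's exact estimator:

```python
from collections import Counter

def bayesint(query_terms, history_query_lm, history_click_lm, mu=0.2, nu=5.0):
    """BayesInt sketch: Dirichlet-prior interpolation of the current query
    with the averaged history models (dicts mapping word -> probability):
        p(w|theta) = (c(w, Qk) + mu * p(w|HQ) + nu * p(w|HC)) / (|Qk| + mu + nu)
    The longer the current query, the more it dominates, matching the slide's
    intuition."""
    counts = Counter(query_terms)
    vocab = set(counts) | set(history_query_lm) | set(history_click_lm)
    denom = len(query_terms) + mu + nu
    return {w: (counts[w]
                + mu * history_query_lm.get(w, 0.0)
                + nu * history_click_lm.get(w, 0.0)) / denom
            for w in vocab}
```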
Method 3: Online Bayesian Updating (OnlineUp)
• Update the language model after each observation, in the order Q1, C1, Q2, C2, …, Qk
• Intuition: incremental updating of the language model
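A sketch of the incremental update, again with placeholder names; the per-step Dirichlet smoothing is an assumption standing in for the paper's exact update:

```python
from collections import Counter

def dirichlet_update(prior_lm, observed_terms, mu=5.0):
    """One Bayesian step: combine a prior LM (dict: word -> prob, Dirichlet
    strength mu) with newly observed term counts to get a posterior LM."""
    counts = Counter(observed_terms)
    total = sum(counts.values())
    vocab = set(counts) | set(prior_lm)
    return {w: (counts[w] + mu * prior_lm.get(w, 0.0)) / (total + mu)
            for w in vocab}

def onlineup(evidence_stream, mu=5.0):
    """OnlineUp sketch: fold each piece of evidence (a list of terms from a
    query or a clicked snippet) into the model one step at a time, in the
    order Q1, C1, Q2, C2, ..., Qk."""
    lm = {}
    for terms in evidence_stream:
        lm = dirichlet_update(lm, terms, mu)
    return lm
```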
Method 4: Batch Bayesian Updating (BatchUp)
• Fold in the query history Q1, Q2, …, Qk, then add all clickthrough data C1, C2, …, Ck-1 in one batch
• Intuition: all clickthrough data are equally useful
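BatchUp differs only in when the clickthrough evidence enters; the sketch below reuses `dirichlet_update` from the OnlineUp sketch above, and the defaults follow the parameter-sensitivity slide (μ=2.0, ν=15.0), but the batch step itself is a simplified stand-in for the paper's update:

```python
def batchup(query_term_lists, clickthrough_term_lists, mu=2.0, nu=15.0):
    """BatchUp sketch: fold in the query history Q1..Qk sequentially, then add
    all clickthrough evidence in a single batch so that every clickthrough
    contributes with equal weight."""
    lm = {}
    for q in query_term_lists:                         # Q1, Q2, ..., Qk
        lm = dirichlet_update(lm, q, mu)
    click_terms = [t for c in clickthrough_term_lists for t in c]
    return dirichlet_update(lm, click_terms, nu) if click_terms else lm
```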
TREC Style Evaluation • Data collection: TREC AP88-90 • Topics: 30 hard topics of TREC topics 1-150 • System: search engine + RDBMS • Context: Query and clickthrough history of 3 participants (http://sifaka.cs.uiuc.edu/ir/ucair/QCHistory.zip)
Example of a Hard Topic
<topic>
<number> 2 (283 relevant docs in 242918 documents)
<title> Acquisitions
<desc> Document discusses a currently proposed acquisition involving a U.S. company and a foreign company.
<narr> To be relevant, a document must discuss a currently proposed acquisition (which may or may not be identified by type, e.g., merger, buyout, leveraged buyout, hostile takeover, friendly acquisition). The suitor and target must be identified by name; the nationality of one of the companies must be identified as U.S. and the nationality of the other company must be identified as NOT U.S.
</topic>
Performance of the Hard Topic
• Q1: acquisition u.s. foreign company (MAP: 0.004; Pr@20: 0.000)
• Q2: acquisition merge takeover u.s. foreign company (MAP: 0.026; Pr@20: 0.100)
• Q3: acquire merge foreign abroad international (MAP: 0.004; Pr@20: 0.050)
• Q4: acquire merge takeover foreign european japan (MAP: 0.027; Pr@20: 0.200)
Overall Effect of Search Context
• Short-term context helps the system improve retrieval accuracy
• BayesInt is better than FixInt; BatchUp is better than OnlineUp
Using Clickthrough Data Only (BayesInt, μ=0.0, ν=5.0)

Query | MAP | Pr@20
Q3 | 0.0331 | 0.125
Q3+HC | 0.0661 | 0.178
Improve | 99.7% | 42.4%
Q4 | 0.0442 | 0.165
Q4+HC | 0.0739 | 0.188
Improve | 67.2% | 13.9%

Performance on unseen docs:
Query | MAP | Pr@20
Q3 | 0.0421 | 0.1483
Q3+HC | 0.0521 | 0.1820
Improve | 23.8% | 23.0%
Q4 | 0.0536 | 0.1930
Q4+HC | 0.0620 | 0.1850
Improve | 15.7% | -4.1%

• Snippets for non-relevant docs are still useful!
• Clickthrough is the major contributor
Sensitivity of BatchUp Parameters
• BatchUp is stable across different parameter settings
• Best performance is achieved with μ=2.0 and ν=15.0
A User Study of Implicit Feedback
• The UCAIR toolbar (a client-side personalized search agent using implicit feedback) is used in this study
• Six participants use the UCAIR toolbar for web search
• 32 topics are selected from the TREC Web track and Terabyte track
• Participants explicitly judge the relevance of the top 30 search results from Google and from UCAIR
UCAIR Outperforms Google: Precision at N Docs
• More user interactions → better user models → better retrieval accuracy
Scenario 2: Use the Entire History of a User [Tan et al. 06] • Challenge: the search log is noisy • How do we handle the noise? • Can we still improve performance? • Solution: • Assign weights to the history data (Cosine, EM algorithm) • Conclusions: • All the history information is potentially useful • Most helpful for recurring queries • History weighting is crucial (EM better than Cosine)
Sample Results: EM vs. Baseline History is helpful and weighting is important
Sample Results: Different Weighting Methods EM is better than Cosine; hybrid is feasible
What You Should Know • All search history information helps • Clickthrough information is especially useful; it’s useful even when the actual document is non-relevant • Recurring queries get more help, but fresh queries can also benefit from history information
Term Feedback for Information Retrieval with Language Models Bin Tan, Atulya Velivelli, Hui Fang, ChengXiang Zhai University of Illinois at Urbana-Champaign
Problems with Doc-Based Feedback
• A relevant document may contain non-relevant parts
• Sometimes none of the top-ranked documents is relevant
• The user only indirectly controls the learned query model
What about Term Feedback?
• Present a list of terms to the user and ask for judgments
• More direct contribution to estimating θq
• Works even when no relevant document appears near the top
• Challenges:
  • How do we select the terms to present to the user?
  • How do we exploit term feedback to improve our estimate of θq?
Improve θq with Term Feedback
• Query → retrieval engine (over the document collection) → initial results (e.g., d1 3.5, d2 2.4, …)
• Term extraction → terms presented to the user → term judgments
• Term judgments → term feedback models → improved estimate of θq → retrieval engine
Feedback Term Selection • General (old) idea: • The original query is used for an initial retrieval run • Feedback terms are selected from top N documents • New idea: • Model subtopics • Select terms to represent every subtopic well • Benefits • Avoid bias in term feedback • Infer relevant subtopics, thus achieve subtopic feedback
User-Guided Query Model Refinement
(Diagram: the document space is divided into subtopic regions T1, T2, T3 with representative terms t11, t12, t21, t22, t31, t32, …; the user's positive/negative term judgments on each region separate the explored area from the unexplored areas, from which the system infers the topic preference direction and the most promising new topic areas to move to.)
Collaborative Estimation of θq
• Start from the original query model θq (p(w|θq)) and retrieve the top N docs d1, d2, d3, …, dN ranked by D(θq || θd)
• Cluster the top docs into K subtopic clusters C1, C2, C3, …, CK with word distributions p(w|θ1), p(w|θ2), …, p(w|θK)
• Present the feedback terms t1, t2, t3, …, tL to the user, who judges them
• TFB: a model estimated directly from the judged feedback terms, e.g., p(t1|θTFB) = 0.2, …, p(t3|θTFB) = 0.1, …
• CFB: weight the clusters by the judged terms (e.g., C1: 0.2, C2: 0.1, C3: 0.3, …, CK: 0.1) and mix them: p(w|θCFB) = 0.2·p(w|θ1) + 0.1·p(w|θ2) + …
• TCFB: combine TFB and CFB into a refined query model θq', then rank docs by D(θq' || θd)
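The three feedback models can be sketched as follows; the uniform TFB distribution, the relevant-term-share cluster weights, and the interpolation weight are simplified assumptions, not the paper's exact estimators:

```python
def term_feedback_models(cluster_lms, presented_terms, relevant_terms,
                         term_cluster, tfb_weight=0.5):
    """Sketch of TFB, CFB, and TCFB.
    cluster_lms:     list of subtopic LMs, each a dict word -> p(w|theta_i)
    presented_terms: the L feedback terms shown to the user
    relevant_terms:  the subset the user judged relevant
    term_cluster:    dict mapping each presented term to its cluster index"""
    relevant = [t for t in presented_terms if t in relevant_terms]
    # TFB: equal probability mass on every judged-relevant term
    tfb = {t: 1.0 / len(relevant) for t in relevant} if relevant else {}
    # CFB: weight each cluster by its share of relevant terms, then mix the LMs
    weights = [0.0] * len(cluster_lms)
    for t in relevant:
        weights[term_cluster[t]] += 1.0
    total = sum(weights) or 1.0
    cfb = {}
    for lm, w in zip(cluster_lms, weights):
        for word, p in lm.items():
            cfb[word] = cfb.get(word, 0.0) + (w / total) * p
    # TCFB: interpolate the two feedback models into the refined query model
    tcfb = {w: tfb_weight * tfb.get(w, 0.0) + (1.0 - tfb_weight) * cfb.get(w, 0.0)
            for w in set(tfb) | set(cfb)}
    return tfb, cfb, tcfb
```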
Discovering Subtopic Clusters with PLSA [Hofmann 99, Zhai et al. 04]
• Query = "transportation tunnel disaster"
• Each word w in a document d of the collection is "generated" from a mixture of k theme models and a background model:
  • Theme 1: traffic 0.3, railway 0.2, …
  • Theme 2: tunnel 0.1, fire 0.05, smoke 0.02, …
  • Theme k: tunnel 0.2, amtrack 0.1, train 0.05, …
  • Background θB: is 0.05, the 0.04, a 0.03, …
• The background is chosen with probability λB; otherwise a theme is chosen with document-specific weights πd,1, …, πd,k
• Parameters are estimated by maximum likelihood (EM algorithm)
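A compact NumPy sketch of PLSA with a fixed background model; the variable names, the fixed λB, and the random initialization are illustrative choices, and the E/M steps follow the standard mixture-model derivation rather than any particular implementation:

```python
import numpy as np

def plsa_with_background(doc_term, k, background, lam_b=0.9, iters=50, seed=0):
    """PLSA sketch with a background model.
    doc_term:   (D x V) document-term count matrix
    background: length-V probability vector p(w|B)
    Each word is generated by the background with probability lam_b, else by
    one of k themes mixed with document-specific weights pi[d].  Returns the
    theme-word distributions theta (k x V) and the theme weights pi (D x k)."""
    rng = np.random.default_rng(seed)
    D, V = doc_term.shape
    theta = rng.random((k, V)); theta /= theta.sum(axis=1, keepdims=True)
    pi = np.full((D, k), 1.0 / k)
    for _ in range(iters):
        theta_new = np.zeros_like(theta)
        pi_new = np.zeros_like(pi)
        for d in range(D):
            mix = pi[d] @ theta                          # p(w | themes, d)
            denom = lam_b * background + (1 - lam_b) * mix + 1e-12
            p_not_bg = (1 - lam_b) * mix / denom         # word not from background
            z = (pi[d][:, None] * theta) / (mix + 1e-12) # theme posterior (k x V)
            expected = z * (doc_term[d] * p_not_bg)      # expected theme counts
            theta_new += expected
            pi_new[d] = expected.sum(axis=1)
        theta = theta_new / (theta_new.sum(axis=1, keepdims=True) + 1e-12)
        pi = pi_new / (pi_new.sum(axis=1, keepdims=True) + 1e-12)
    return theta, pi
```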
Selecting L Representative Terms
• Original query terms are excluded
• Shared terms are assigned to their most likely clusters
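A sketch of the selection rules on this slide; `terms_per_cluster` and the greedy top-probability choice are assumptions about details the slide does not specify:

```python
def select_feedback_terms(cluster_lms, query_terms, terms_per_cluster=10):
    """Pick representative feedback terms from the subtopic cluster LMs
    (each a dict word -> p(w|theta_i)): exclude original query terms, assign
    each shared term to the cluster where it is most probable, then keep the
    top terms of each cluster.  Returns one term list per cluster; their
    union is the L terms presented to the user."""
    query = set(query_terms)
    best_cluster = {}
    for i, lm in enumerate(cluster_lms):
        for w, p in lm.items():
            if w in query:
                continue
            if w not in best_cluster or p > cluster_lms[best_cluster[w]][w]:
                best_cluster[w] = i
    selected = []
    for i, lm in enumerate(cluster_lms):
        own = sorted((t for t in lm if best_cluster.get(t) == i),
                     key=lambda t: lm[t], reverse=True)
        selected.append(own[:terms_per_cluster])
    return selected
```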