
Implicit User Modeling for Personalized Search


Presentation Transcript


  1. Implicit User Modeling for Personalized Search Xuehua Shen, Bin Tan, ChengXiang Zhai Department of Computer Science University of Illinois, Urbana-Champaign

  2. Current Search Engines are Mostly Document-Centered… • [Diagram: many different users send their queries to a single Search Engine over the same collection of Documents] • Search is generally non-personalized…

  3. Example of Non-Personalized Search • Query = “Jaguar” • [Screenshot of results as of Oct. 17, 2005: mostly pages about the car, mixed with pages about the software and the animal] • Without knowing more about the user, it’s hard to optimize…

  4. Therefore, personalization is necessary to improve existing search engines. However, many questions need to be answered…

  5. Research Questions • Client-side or server-side personalization? • Implicit or explicit user modeling? • What’s a good retrieval framework for personalized search? • How to evaluate personalized search? • …

  6. Client-Side vs. Server-Side Personalization • So far, personalization has mostly been done on the server side • We emphasize client-side personalization, which has three advantages: • More information about the user, thus more accurate user modeling (complete interaction history + other user activities) • More scalable (“distributed personalization”) • Alleviates privacy concerns

  7. Implicit vs. Explicit User Modeling • Explicit user modeling • More accurate, but users generally don’t want to provide additional information • E.g., relevance feedback • Implicit user modeling • Less accurate, but no extra effort for users • E.g., implicit feedback • We emphasize implicit user modeling

  8. Example Revisited • Query = “Jaguar” • Suppose we know: 1. Previous query = “racing cars” 2. “car” occurs far more frequently than “Apple” in pages browsed by the user in the last 20 days 3. User just viewed an “Apple OS” document • All this information is naturally available to an IR system

  9. Remaining Research Questions • Client-side or server-side personalization? • Implicit or explicit user modeling? • What’s a good retrieval framework for personalized search? • How to evaluate personalized search? • …

  10. Outline • A decision-theoretic framework • UCAIR personalized search agent • Evaluation of UCAIR

  11. Implicit user information exists in the user’s interaction history. We thus need to develop a retrieval framework for interactive retrieval…

  12. Modeling Interactive IR • Model interactive IR as an “action dialog”: cycles of user action (Ai) and system response (Ri)

  13. Retrieval Decisions • Given U, C, At, and the history H = {(Ai, Ri)}, i = 1, …, t-1, choose the best Rt from r(At), the set of all possible responses to At • [Diagram: user U issues actions A1, A2, …, At-1, At; the system returns responses R1, R2, …, Rt-1, and Rt = ?] • E.g., if At = the query “Jaguar”, r(At) = all possible rankings of the document collection C, and the best Rt is the best ranking for the query; if At = a click on the “Next” button, r(At) = all possible rankings of unseen docs, and the best Rt is the best ranking of unseen docs

  14. Decision-Theoretic Framework • Observed: user U, interaction history H, current user action At, document collection C, all possible responses r(At) = {r1, …, rn} • Inferred: user model M = (S, U, …), where S = seen docs and U = information need • Loss function: L(ri, At, M) • Optimal response: Rt = the response with minimum loss, i.e., minimum expected risk: Rt = argmin over r in r(At) of ∫ L(r, At, M) P(M | U, H, At, C) dM

  15. A Simplified Two-Step Decision-Making Procedure • Approximate the expected risk by the loss at the mode of the posterior distribution • Two-step procedure • Step 1: Compute an updated user model M* based on the currently available information • Step 2: Given M*, choose a response to minimize the loss function
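
To make the two-step procedure concrete, here is a minimal, toy Python sketch of the decision loop. The term-overlap user model and the rank-discounted loss are illustrative assumptions, not the actual UCAIR implementation.

```python
# Toy sketch of the two-step decision-making procedure (illustrative only).

def update_user_model(history, action):
    """Step 1: approximate M* (the mode of the posterior) by pooling the terms
    the user has typed so far -- a deliberate simplification."""
    terms = set(action.split())
    for past in history:
        terms.update(past.split())
    return terms

def loss(ranking, user_terms, docs):
    """L(r, At, M): fewer user-model terms near the top means higher loss."""
    return -sum(len(user_terms & set(docs[d].split())) / (rank + 1)
                for rank, d in enumerate(ranking))

def respond(history, action, docs, candidate_rankings):
    """Step 2: given M*, return the candidate response with minimum loss."""
    m_star = update_user_model(history, action)
    return min(candidate_rankings, key=lambda r: loss(r, m_star, docs))
```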

  16. Optimal Interactive Retrieval • [Diagram: the user U interacts with the IR system over collection C: after action A1 the system infers M*1 from P(M1 | U, H, A1, C) and returns the response R1 that minimizes L(r, A1, M*1); after A2 it infers M*2 from P(M2 | U, H, A2, C) and returns R2 minimizing L(r, A2, M*2); after A3, …]

  17. Refinement of Decision Theoretic Framework • r(At): decision space (At dependent) • r(At) = all possible rankings of docs in C • r(At) = all possible rankings of unseen docs • M: user model • Essential component: U = user information need • S = seen documents • L(ri,At,M): loss function • Generally measures the utility of ri for a user modeled as M • P(M|U, H, At, C): user model inference • Often involves estimating U

  18. Case 1: Non-Personalized Retrieval • At = “enter a query Q” • r(At) = all possible rankings of docs in C • M = U, a unigram language model (word distribution) • p(M|U,H,At,C) = p(U|Q)
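
For reference, a non-personalized ranker in this case can score each document by query likelihood under a smoothed unigram language model. The sketch below uses Jelinek-Mercer smoothing with an assumed lambda = 0.5; the slide does not prescribe a particular smoothing method.

```python
import math
from collections import Counter

def score_query_likelihood(query, doc, collection_tf, collection_len, lam=0.5):
    """Score a doc by log p(Q | doc model) with a Jelinek-Mercer-smoothed
    unigram language model. lam is an assumed interpolation weight."""
    doc_tf = Counter(doc.split())
    doc_len = sum(doc_tf.values())
    score = 0.0
    for w in query.split():
        p_doc = doc_tf[w] / doc_len if doc_len else 0.0
        p_coll = collection_tf.get(w, 0) / collection_len
        score += math.log(lam * p_doc + (1 - lam) * p_coll + 1e-12)
    return score
```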

  19. Case 2: Implicit Feedback for Retrieval • At = “enter a query Q” • r(At) = all possible rankings of docs in C • M = U, a unigram language model (word distribution) • H = {previous queries} + {viewed snippets} • p(M|U,H,At,C) = p(U|Q,H) (implicit user modeling)
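
One simple way to realize p(U|Q,H) is fixed-coefficient interpolation of the current query’s word distribution with a distribution estimated from previous queries and viewed snippets. In the sketch below, the weight alpha = 0.5 and the pooling of all history texts into one model are assumptions made for illustration.

```python
from collections import Counter

def unigram(text):
    """Maximum-likelihood word distribution of a piece of text."""
    counts = Counter(text.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def implicit_query_model(query, history_texts, alpha=0.5):
    """Approximate p(U | Q, H): interpolate the query model with a history
    model built from past queries and viewed snippets (alpha is assumed)."""
    q_model = unigram(query)
    h_model = unigram(" ".join(history_texts)) if history_texts else {}
    vocab = set(q_model) | set(h_model)
    return {w: alpha * q_model.get(w, 0.0) + (1 - alpha) * h_model.get(w, 0.0)
            for w in vocab}
```

Plugging this interpolated model into a query-likelihood scorer such as the one sketched above gives one possible implicit-feedback ranker for this case.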

  20. Case 3: More General Personalized Search with Implicit Feedback • At = “enter a query Q”, click the “Back” button, or click the “Next” link • r(At) = all possible rankings of unseen docs in C • M = (U, S), S = seen documents • H = {previous queries} + {viewed snippets} • p(M|U,H,At,C) = p(U|Q,H) (eager feedback)

  21. Benefit of the Framework • Traditional view of IR: Retrieval = matching a query against documents • Insufficient for modeling personalized search (the user and the interaction history are not part of a retrieval model) • The new framework provides a map for systematic exploration of • Methods for implicit user modeling • Models for eager feedback • The framework also provides guidance on how to design a personalized search agent (optimizing responses to every user action)

  22. The UCAIR Toolbar

  23. UCAIR Toolbar Architecture (http://sifaka.cs.uiuc.edu/ir/ucair/download.html) • [Diagram: the user’s query enters UCAIR, which applies Query Modification before sending it to the Search Engine (e.g., Google); returned results pass through Result Re-Ranking and a Result Buffer before being shown; a Search History Log (e.g., past queries, clicked results) records clickthrough data and feeds User Modeling, which drives both query modification and re-ranking]
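
As a rough structural sketch (in Python, purely illustrative; the real toolbar is a browser plug-in whose class and method names are not shown in the slides), the components in the diagram could fit together like this:

```python
class UCAIRPipelineSketch:
    """Illustrative glue between the components in the architecture diagram;
    not the actual toolbar's code."""

    def __init__(self, search_engine):
        self.search_engine = search_engine   # callable: query -> list of results
        self.history = []                    # Search History Log: queries + clicks
        self.result_buffer = []              # Result Buffer of fetched results

    def submit_query(self, query):
        expanded = self.modify_query(query)               # Query Modification
        self.result_buffer = self.search_engine(expanded)
        self.history.append({"query": query, "clicks": []})
        return self.result_buffer

    def record_click(self, result):
        if self.history:
            self.history[-1]["clicks"].append(result)     # log clickthrough

    def modify_query(self, query):
        # User Modeling would decide whether and how to expand; identity here.
        return query

    def rerank_unseen(self):
        # Result Re-Ranking would reorder the buffer using the user model.
        return self.result_buffer
```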

  24. Decision-Theoretic View of UCAIR • User actions modeled • A1 = Submit a keyword query • A2 = Click the “Back” button • A3 = Click the “Next” link • System responses • r(Ai) = rankings of the unseen documents • History • H = {previous queries, clickthroughs} • User model: M = (X, S) • X = vector representation of the user’s information need • S = documents seen by the user

  25. Decision-Theoretic View of UCAIR (cont.) • Loss functions: • L(r, A2, M) = L(r, A3, M) → reranking with a vector space model • L(r, A1, M) ≈ L(q, A1, M) → query expansion, favoring a good expanded query q • Implicit user model inference: • X* = argmax_x p(x|Q,H), computed using Rocchio feedback over the vectors of seen snippets • S* = all seen docs in H • Newer versions of UCAIR have adopted language models
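
A sketch of how X* might be computed with Rocchio feedback over term-frequency vectors of the query and of the viewed (seen) snippets; the weights alpha and beta below are assumed values, since the slides do not give UCAIR’s actual parameters.

```python
from collections import Counter

def tf_vector(text):
    """Simple term-frequency vector for a query or a viewed snippet."""
    return Counter(text.split())

def rocchio(query, clicked_snippets, alpha=1.0, beta=0.5):
    """X* ~ alpha * query vector + beta * centroid of clicked-snippet vectors.
    alpha and beta are illustrative, not UCAIR's actual settings."""
    x = Counter({w: alpha * c for w, c in tf_vector(query).items()})
    if clicked_snippets:
        centroid = Counter()
        for s in clicked_snippets:
            centroid.update(tf_vector(s))
        n = len(clicked_snippets)
        for w, c in centroid.items():
            x[w] += beta * c / n
    return x
```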

  26. UCAIR in Action • In responding to a query • Decide the relationship of the current query to the previous query (based on result similarity) • Possibly do query expansion using the previous query and results • Return a ranked list of documents using the (expanded) query • In responding to a click on “Next” or “Back” • Compute an updated user model based on clickthroughs (using Rocchio) • Rerank unseen documents (using a vector space model)
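
To illustrate the reranking step, here is a sketch that reorders only the unseen result snippets by cosine similarity to the updated user-model vector X* (for example, the output of the Rocchio sketch above); the snippet-vector representation and the plain cosine scoring are assumptions.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(u[w] * v.get(w, 0.0) for w in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rerank_unseen(x_star, results, seen_ids):
    """Rerank only results the user has not yet seen (eager feedback);
    each result is assumed to be a dict with 'id' and 'snippet' keys."""
    unseen = [r for r in results if r["id"] not in seen_ids]
    return sorted(unseen,
                  key=lambda r: cosine(Counter(r["snippet"].split()), x_star),
                  reverse=True)
```

Restricting the reordering to unseen results is what the slides call eager feedback: documents the user has already examined are left in place.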

  27. Screenshot for Result Reranking

  28. A User Study of Personalized Search • Six participants used the UCAIR toolbar to do web search • Topics were selected from the TREC web track and terabyte track • Participants explicitly evaluated the relevance of the top 30 search results from Google and UCAIR

  29. UCAIR Outperforms Google: Precision at N Docs • More user interactions → better user models → better retrieval accuracy

  30. UCAIR Outperforms Google: PR Curve

  31. Summary • Proposed a decision-theoretic framework to model interactive IR • Built a personalized search agent for web search • Conducted a user study of web search showing that the UCAIR personalized search agent can improve retrieval accuracy

  32. The End • Thank you!
