Enhancing Search Relevance Using Collaborative Filtering

Future Direction: Collaborative Filtering
Motivating Observations:
• Relevance feedback is useful, but expensive
• Humans rarely have time to give positive/negative judgments on a long list of returned web pages to improve individual searches
• Effort is used once, then wasted → want pooling and re-use of efforts across individuals
cs466-25

Collaborative Filtering
Motivating Observations (continued): Relevance ≠ Quality
Queries: bootleg CD’s, NAFTA, Medical School Admissions, Simulated Annealing, REM, Alzheimer’s
• Many web pages can be “about” a topic (specialized unit)
• But there are great differences in quality of presentation, detail, professionalism, substance, etc.

Possible Solution: build a supervised learner for quality, NOT topic matter
• Train on examples of each, learn distinguishing properties

One Solution: Supervised Learner for “Quality” of a Page
• P(Quality | Features) in addition to topic similarity
• Salient features may include:
  • # of links
  • Size
  • How often cited
  • Variety of content
  • “Top 5% of Web” awards, etc.
  • Assessment of usage counter (hit count)
  • Complexity of graphics ∝ quality??
  • Prior quality rating of server
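
A minimal sketch of estimating P(Quality | Features), assuming a logistic-regression learner trained by stochastic gradient ascent. The slides do not name a learning algorithm, so this choice is an assumption, as are the function names, the three features, and the toy training pages, which are invented purely for illustration.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def p_quality(weights, bias, features):
    """Estimate P(Quality | Features) under the logistic model."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_quality_model(pages, labels, lr=0.1, epochs=2000):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = [0.0] * len(pages[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(pages, labels):
            err = y - p_quality(weights, bias, x)
            bias += lr * err
            weights = [w + lr * err * xi for w, xi in zip(weights, x)]
    return weights, bias


# Hypothetical pages: (number of links, how often cited, hit count),
# each scaled to [0, 1]. Label 1 = judged high quality, 0 = low quality.
pages = [(0.9, 0.8, 0.7), (0.8, 0.9, 0.6), (0.2, 0.1, 0.3), (0.1, 0.2, 0.1)]
labels = [1, 1, 0, 0]

weights, bias = train_quality_model(pages, labels)
high = p_quality(weights, bias, (0.85, 0.85, 0.65))
low = p_quality(weights, bias, (0.15, 0.15, 0.2))
```

In a search setting this estimate would be used alongside the topic-similarity score, not in place of it.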

Collaborative Filtering
Problem: Different humans have different profiles of relevance/quality
Query: Alzheimer’s disease
[Figure: each marker is a document or web page; regions mark pages appropriate or relevant (high quality) for a 6th Grader, a Care Giver, or a Medical Researcher]

One Solution: Pool collective wisdom and compute a weighted average of page rankings across multiple users in an affinity group (taking into account topic relevance, quality, and other intangibles)
Hypothesis: humans have a better idea than machines of what other humans will find interesting
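
The weighted average can be sketched as follows. The function name `pooled_score`, the 1-to-5 rating scale, and the affinity weights are assumptions made for illustration; the slides do not specify how the weights are chosen.

```python
def pooled_score(ratings, affinity):
    """Weighted average of one page's ratings across users in a group.

    ratings:  {user: rating of the page}
    affinity: {user: weight expressing how much that user's opinion counts}
    Returns None when no rater carries any weight.
    """
    total_weight = sum(affinity[user] for user in ratings)
    if total_weight == 0:
        return None
    weighted_sum = sum(affinity[user] * r for user, r in ratings.items())
    return weighted_sum / total_weight


# Hypothetical ratings of one page by three users (1 = poor, 5 = excellent).
page_ratings = {"A": 5, "B": 3, "C": 1}
# Hypothetical affinity weights; user A counts twice as much as B or C.
affinity = {"A": 0.5, "B": 0.25, "C": 0.25}

score = pooled_score(page_ratings, affinity)
print(score)  # 3.5
```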

Collaborative Filtering
Idea: instead of trying to model (often intangible) quality judgments, keep a record of previous human relevance and quality judgments
Query: Alzheimer’s
[Table of user rankings of web pages for a query: users A, B, C, D, E, F, G against web pages 1, 2, 3, 4, ..., 1059, 1060, 1061]

Solution 1: Identify individuals with similar tastes (high Pearson’s coefficient on similar ranking judgments)
instead of: P(relevant to me | Page_i content)
compute: P(relevant to me | relevant to you) * P(relevant to you | Page_i content)
• P(relevant to me | relevant to you) ≈ my similarity to you
• P(relevant to you | Page_i content) ≈ your judgments
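
A rough sketch of Solution 1, assuming ratings are stored per user as {page: rating} dictionaries. Pearson's coefficient is computed over the pages both users judged, and a prediction is a similarity-weighted average of other users' ratings. Skipping users with non-positive similarity is a simplifying assumption, and the function names and toy ratings are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson's coefficient over the pages rated by both users."""
    common = [page for page in a if page in b]
    n = len(common)
    if n < 2:
        return 0.0
    mean_a = sum(a[p] for p in common) / n
    mean_b = sum(b[p] for p in common) / n
    cov = sum((a[p] - mean_a) * (b[p] - mean_b) for p in common)
    var_a = sum((a[p] - mean_a) ** 2 for p in common)
    var_b = sum((b[p] - mean_b) ** 2 for p in common)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def predict(me, others, page):
    """Similarity-weighted average of other users' ratings of `page`."""
    num = den = 0.0
    for ratings in others.values():
        if page not in ratings:
            continue
        sim = pearson(me, ratings)
        if sim <= 0:
            continue  # assumption: ignore users with dissimilar tastes
        num += sim * ratings[page]
        den += sim
    return num / den if den else None


# Hypothetical ratings for the query "Alzheimer's" (page number -> rating).
me = {1: 5, 2: 3, 3: 1}
others = {
    "B": {1: 4, 2: 3, 3: 2, 4: 5},  # ranks pages in the same order as me
    "C": {1: 1, 2: 3, 3: 5, 4: 1},  # ranks pages in the opposite order
}

sim_b = pearson(me, others["B"])
sim_c = pearson(me, others["C"])
predicted = predict(me, others, 4)  # page 4 is one I have not judged
```

Here only user B contributes to the prediction for page 4, because user C's judgments run opposite to mine.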

Solution 2: Model group profiles for relevance judgments (e.g. junior high school students vs. medical researchers)
compute: P(relevant to me | relevant to group_g) * P(relevant to group_g | Page_i content)
• P(relevant to me | relevant to group_g) ≈ my similarity to the group
• P(relevant to group_g | Page_i content) ≈ group’s collective (avg) relevance judgments
• Supervised learning
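
A small sketch of Solution 2, assuming each group profile is simply the average rating per page across the group's members. Using Pearson's coefficient for "my similarity to the group" and clipping negative similarity to zero are both assumptions; the group names, function names, and ratings are hypothetical.

```python
import math
from collections import defaultdict


def group_profile(members):
    """Average rating per page across all members who rated that page."""
    by_page = defaultdict(list)
    for ratings in members:
        for page, rating in ratings.items():
            by_page[page].append(rating)
    return {page: sum(rs) / len(rs) for page, rs in by_page.items()}


def pearson(a, b):
    """Pearson's coefficient over the pages present in both rating dicts."""
    common = [page for page in a if page in b]
    if len(common) < 2:
        return 0.0
    xs = [a[p] for p in common]
    ys = [b[p] for p in common]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy) if vx and vy else 0.0


def predict_from_groups(me, profiles, page):
    """Weight each group's average judgment by my similarity to the group."""
    num = den = 0.0
    for profile in profiles.values():
        if page not in profile:
            continue
        sim = max(0.0, pearson(me, profile))  # assumption: clip negatives
        num += sim * profile[page]
        den += sim
    return num / den if den else None


# Hypothetical member ratings (page number -> rating) for two groups.
profiles = {
    "junior_high": group_profile([{1: 5, 2: 4, 3: 1, 4: 5},
                                  {1: 5, 2: 2, 3: 1, 4: 4}]),
    "researchers": group_profile([{1: 1, 2: 3, 3: 5, 4: 1},
                                  {1: 2, 2: 3, 3: 4, 4: 2}]),
}
me = {1: 4, 2: 3, 3: 2}
predicted = predict_from_groups(me, profiles, 4)
```

Compared with matching individuals, a group profile needs only one similarity computation per group, at the cost of averaging away differences within the group.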
