Collaborative Filtering: Some Comments on the State of the Art Jon Herlocker Assistant Professor School of Electrical Engineering and Computer Science Oregon State University Corvallis, OR
Yahoo! Employee-to-be? Audrey Herlocker, Age 1, 8/18/2004
Take-Aways • We have a problem synthesizing research in CF • CoFE: free, could increase research productivity, and reduce barriers to standardization • More focus on the user experience is needed • There is great potential for CF in information retrieval (i.e., not just product recommendation)
What is the State of the Art? • 10+ years of collaborative filtering (CF) research • CF == machine learning? • 20+ years of machine learning? • Still hasn’t transitioned from a science to engineering • Still no “recommender system cookbook”
What do we know? • Consider the academic literature on CF • Lots of disconnected discoveries • Hard to synthesize • Different data sets • Variance in algorithm implementation • Variance in experimental procedures • Analysis of systems, not features • Private knowledge not shared • High barrier to formal experimentation and publication • No venue or reward for negative results • Commercial discoveries == intellectual property • So the sum of all knowledge? • Doesn’t add up
Productivity of CF Research Community • How to increase productivity of CF research? • Each effort should have greater effect on total knowledge • Each effort should cost less • Increase the quantity of practical experience with CF • Our contribution: • CoFE: Collaborative Filtering Engine
Shared Research Infrastructure • Concept • Free, open-source, infrastructure for rapid development and analysis of algorithms • Also make it fast and stable enough for mid-scale production • Facilitates • Lower cost methodical research • Sharing of new algorithms • Repositories • Comparability in analysis methods and algorithm implementations • More practical usage of CF
CoFE • CoFE - “Collaborative Filtering Engine” • Open source framework for Java • Easy to create new algorithms • Includes testing infrastructure (next month) • Reference implementations of many popular CF algorithms • Can support high-performance deployment • Production-ready (see Furl.net)
CoFE Architecture [Diagram]: a server instance hosts a Data Manager object (an in-memory cache with high-performance data structures) backed by either a relational DB (MySQL) or a delimited data file; algorithm objects plug in through a common Algorithm interface; an analysis framework is driven by an XML experiment configuration file and records XML experiment metadata.
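The plug-in Algorithm interface in the architecture above can be illustrated with a minimal sketch. CoFE itself is Java; this is a language-neutral illustration in Python, and every name below is invented for the example, not CoFE's actual API:

```python
# A minimal sketch of a plug-in CF algorithm interface, in the spirit of
# the Algorithm interface described above. All names are hypothetical.

class Algorithm:
    """Anything the engine can train on ratings and query for predictions."""
    def train(self, ratings):
        """ratings: dict mapping (user, item) -> rating value."""
        raise NotImplementedError

    def predict(self, user, item):
        raise NotImplementedError


class ItemMean(Algorithm):
    """Baseline algorithm: predict each item's mean rating."""
    def train(self, ratings):
        sums, counts = {}, {}
        for (user, item), value in ratings.items():
            sums[item] = sums.get(item, 0.0) + value
            counts[item] = counts.get(item, 0) + 1
        self.item_means = {i: sums[i] / counts[i] for i in sums}
        self.global_mean = sum(sums.values()) / sum(counts.values())

    def predict(self, user, item):
        # Fall back to the global mean for never-rated items.
        return self.item_means.get(item, self.global_mean)
```

New algorithms only implement `train` and `predict`; the engine's data manager, analysis framework, and deployment machinery stay unchanged, which is what makes shared reference implementations comparable.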
Checkpoint: Take-Aways • We have a problem synthesizing research in CF • CoFE: free, could increase research productivity and reduce barriers to standardization • Coming up • More focus on the user experience is needed • There is great potential for CF in information retrieval (i.e., not just product recommendation) • CoFE URL: • http://eecs.oregonstate.edu/iis/CoFE
Does the Algorithm Really Matter? • Where do we get the most impact? (benefit/cost) • A. Improving the algorithm? • B. Changing user interface/user interaction?
Does the Algorithm Really Matter? • Where do we get the most impact? (benefit/cost) • A. Improving the algorithm? • B. Changing user interface/user interaction? • Answer: • Unless you have already optimized your user interface extensively, the answer is usually B.
Scenario from a Related Field • Document retrieval study by Turpin and Hersh (SIGIR 2001) • Two groups of medical students • Compared human performance with • A 1970s search model (basic TF-IDF) • A recent OKAPI search model with greatly improved Mean Average Precision • Identical user interfaces • Task: locating medical information • Result: no statistically significant difference!
Turpin & Hersh Findings • Humans quickly compensate for poor algorithm performance • Possible conclusion: provide user interfaces that allow users to compensate • Many relevant results weren’t selected as relevant • Possible conclusion: focus on persuading as well as recommending
Analyzing Algorithms for End-user Effects • Algorithms believed “reasonable” may actually be terrible! • McLaughlin & Herlocker, SIGIR 2004. • In this case, poor handling of low confidence recommendations • In situations with small amounts of data • Changes in algorithm -> big changes in recommendations • Analyze exact recommendations seen by end-user • Instead of just items with existing ratings
Checkpoint: Take-Aways • Previously • We have a problem synthesizing research in CF • CoFE: free, could increase research productivity and reduce barriers to standardization • More focus on the user experience is needed • Coming up • There is great potential for CF in information retrieval (i.e., not just product recommendation) • CoFE URL: • http://eecs.oregonstate.edu/iis/CoFE
Exploring Library Search Interfaces With Janet Webster, Oregon State University Libraries
Features of Web-based Library Search • Diverse content • Web pages, catalogs, journal indexes, electronic journals, maps, and various other digital “special collections” • Searchable databases are important “sources” • Library responsibility • Guiding people to “appropriate” content • Understanding what the user’s “real” need is
The Human Element • Capture and leverage the experience of every user • Recommendations are based on human evaluation • Explicit votes • Inferred votes (implicit) • Recommend (question, document) pairs • Not just documents • Human can determine if questions have similarity • System gets smarter with each use • Not just each new document
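The slide above recommends (question, document) pairs by noticing when a new query resembles questions that earlier users voted on. The deck does not specify how questions are matched; as one hedged illustration, a simple token-overlap (Jaccard) matcher might look like this (all names are invented for the example):

```python
def jaccard(a, b):
    """Token-overlap similarity between two questions."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def recommend(query, qa_pairs, threshold=0.3):
    """Recommend documents whose stored questions resemble the new query.

    qa_pairs: (question, document) pairs accumulated from earlier users'
    explicit or inferred votes -- the "human evaluation" the system reuses.
    """
    scored = sorted(((jaccard(query, q), doc) for q, doc in qa_pairs),
                    reverse=True)
    return [doc for score, doc in scored if score >= threshold]
```

Because matches are made at the question level rather than the document level, every answered question makes the system smarter, even when no new documents are added.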
Three months of SERF usage: 1,194 search transactions
• Google results only: 706 transactions (59.13%)
  • Average visited documents: 2.196
  • Clicked: 172 (24.4%); no clicks: 534 (75.6%)
  • Average rating: 14.727 (49% voted useful)
• Google results + recommendations: 488 transactions (40.87%)
  • Average visited documents: 1.598
  • Clicked: 197 (40.4%); no clicks: 291 (59.6%)
  • First click on a Google result: 56 (28.4%); first click on a recommendation: 141 (71.6%)
  • Average rating: 20.715 (69% voted useful)
• Rating scale: a vote of yes = 30, a vote of no = 0
Conclusion • No large leaps in language understanding expected • Understanding the meaning of language is *very* hard • Collaborative filtering (CF) bypasses this problem • Humans do the analysis • Technology is widely applicable
Final Take-Aways • We have a problem synthesizing research in CF • CoFE: free, could increase research productivity and reduce barriers to standardization • More focus on the user experience needed • Great potential for CF in information retrieval (i.e. not just product recommendation)
Links & Contacts • Research Group Home Page • http://eecs.oregonstate.edu/iis • CoFE • http://eecs.oregonstate.edu/iis/CoFE • SERF • http://osulibrary.oregonstate.edu/ • Jon Herlocker • herlock@cs.orst.edu • +1 (541) 737-8894
Simple CF • [Diagram: users and items connected by user-user links, item-item links, and observed preferences]
Ending Thoughts • Recommendation vs. persuasion
Stereotypical Integrator of RS Has: • Large item catalog • With item attributes (e.g. keywords, metadata such as author, subject, cross-references, …) • Large user base • With user attributes (age, gender, city, country, …) • Evidence of customer preferences • Explicit ratings (powerful, but harder to elicit) • Observations of user activity (purchases, page views, emails, prints, …)
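The explicit/implicit split above can be sketched as a rule that maps observed user activity to inferred votes. The weights below are purely hypothetical (real systems tune them empirically), and the names are invented for the example:

```python
# Hypothetical signal strengths for inferring votes from observed activity;
# explicit ratings, where available, would override these.
IMPLICIT_WEIGHTS = {"purchase": 5.0, "print": 4.0, "email": 3.0, "page_view": 1.0}


def infer_votes(events):
    """events: iterable of (user, item, action) observations.

    Keeps the strongest observed signal per (user, item) pair, so a
    page view followed by a purchase registers as a purchase.
    """
    votes = {}
    for user, item, action in events:
        weight = IMPLICIT_WEIGHTS.get(action, 0.0)
        votes[(user, item)] = max(votes.get((user, item), 0.0), weight)
    return votes
```

This reflects the trade-off on the slide: implicit observations are plentiful but noisy, while explicit ratings are powerful but harder to elicit.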
The RS Space • [Diagram: users and items connected by user-user links, item-item links, and observed preferences]
Traditional Personalization • [Diagram: users and items connected by user-user links, item-item links, and observed preferences]
Classic CF • [Diagram: users and items connected by user-user links, item-item links, and observed preferences] • In the end, most models will be hybrid
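The classic user-user CF on this slide can be sketched in two steps: score neighbors by Pearson correlation over co-rated items, then take a similarity-weighted average of positively correlated neighbors' ratings. This is a minimal sketch; production systems add significance weighting, neighborhood size caps, and rating normalization:

```python
from math import sqrt


def pearson(ra, rb):
    """Pearson correlation between two users' ratings over co-rated items."""
    common = set(ra) & set(rb)
    if len(common) < 2:
        return 0.0
    ma = sum(ra[i] for i in common) / len(common)
    mb = sum(rb[i] for i in common) / len(common)
    num = sum((ra[i] - ma) * (rb[i] - mb) for i in common)
    den = sqrt(sum((ra[i] - ma) ** 2 for i in common) *
               sum((rb[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0


def predict(user, item, ratings):
    """ratings: {user: {item: value}}; returns None with no usable neighbors."""
    neighbors = [(pearson(ratings[user], r), r[item])
                 for u, r in ratings.items() if u != user and item in r]
    num = sum(s * v for s, v in neighbors if s > 0)
    den = sum(s for s, v in neighbors if s > 0)
    return num / den if den else None
```

Note that nothing here inspects item attributes or user demographics: the prediction comes entirely from observed preferences, which is what makes pure CF applicable to any rate-able item.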
Advantages of Pure CF • No expensive and error-prone user attributes or item attributes • Incorporates quality and taste • Works on any rate-able item • One data model => many content domains • Serendipity • Users understand and connect with it!