Personalizing Java based Answers for Hundreds of Millions of Users

Presentation Transcript


  1. Personalizing Java based Answers for Hundreds of Millions of Users. Anurag Gupta, Senior Architect, Yahoo Answers & Groups, anuragg@yahoo-inc.com

  2. Agenda • Industry Gaps • Vision • Strategy • Use Cases • Architecture • Next Steps

  3. 2010: Resurgence of Q&A. 2010: A year of highlights… 2011: The story continues… Quora, location-based Q&A apps (Crowd Beacon, Hipster), Facebook Questions and Mahalo pivoting, Answers.com acquisition… Yahoo! Answers is still #1 (twice the size of its nearest competitor). Diagram labels: Launch; Acquisition; Investment; Mobile play.

  4. Why this activity? Companies are entering the market to address deficiencies of social media
     • Meeting unmet needs:
     • Improving signal-to-noise ratio
     • Beyond realtime: creating User Generated Content of lasting, evergreen value
     • Organising people’s knowledge and opinion for mass consumption
     • Allowing people to connect and share based on common interests, locations etc.
     • Providing platforms for people to become regarded as experts
     • Identifying untapped monetisation opportunities
     • Mining intent, interest and information from participating users

  5. Industry Gaps

  6. Yahoo Answers is the place to share opinions, experience & knowledge around personal interests

  7. Y! Answers: Leading site, over 2X the size of the next competitor

  8. Strengthen core and reach out. Diagram labels: Monetization; Ecosystem; Distribution; Personalization, User Interest Graph; User Reputation.

  9. Personalization & Relevance. Diagram labels: Connected Devices; Ranked content, video, ads; User clicks; Insights; Users; APIs; Social graph; Ads; Content; Yahoo; User Generated Content, tagging; Partner Data; APIs; Publisher Partners.

  10. Personalization & Relevance. Diagram labels: Users; Finance; Ads; 3rd party publisher; Sports; News; Search; Content & Ad Server; In-memory user-content-relevance_score; Gaps drive acquisition of new relevant long-tail content; User clicks; Search terms; Ranked content & ad; Collaborative Filtering, social, geo, time; Feeds; User Segments; Content-Tags; Tag; User Interest Graph; Ad & Content; Advertisers; Social Graph; ‘like’; Interactions: UGC, tags, Q&A; Publishers.
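
A minimal sketch of how a User Interest Graph might be derived from the interaction signals named on slide 10 (user clicks, search terms, UGC and tags). The event shape, the `SIGNAL_WEIGHTS` values and the function name are assumptions for illustration; the slides do not specify them.

```python
from collections import Counter, defaultdict

# Hypothetical weights per signal type; the slide names the signals but gives no numbers.
SIGNAL_WEIGHTS = {"click": 1.0, "search": 2.0, "ugc": 3.0}


def build_interest_graph(events):
    """Aggregate (user_id, signal, tag) events into normalized per-user tag weights."""
    raw = defaultdict(Counter)
    for user_id, signal, tag in events:
        raw[user_id][tag] += SIGNAL_WEIGHTS.get(signal, 0.0)

    graph = {}
    for user_id, tags in raw.items():
        total = sum(tags.values())
        if total > 0:
            # Normalize so each user's tag weights sum to 1.
            graph[user_id] = {tag: weight / total for tag, weight in tags.items()}
    return graph


events = [
    ("u1", "click", "java"),
    ("u1", "ugc", "java"),
    ("u1", "search", "hadoop"),
    ("u2", "click", "sports"),
]
graph = build_interest_graph(events)
```

Normalizing per user keeps heavy and light users comparable when their interests are later matched against content tags.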

  11. Yahoo Answers Personalization Use Cases
     • Learn about new users’ interests (cold-start)
     • Show relevant questions to a user who arrives via a search engine
     • Show relevant questions to an Answerer on Y! Answers or a 3rd party site
     • Use knowledge of user interests to increase user engagement, page views, reach, monetization

  12. Answers: Relevance & Content Quality. Increase signal-to-noise ratio. Reward content creators with relevant audience. Help audience discover relevant high quality content. Diagram labels: High quality, high relevance Q&A page; Green – Y! wide; Yellow – Answers specific; Viewer’s interest; Question Popularity; Quality of Answers; Answerers with High PeopleRank; Answerability; User Interest Graph; PeopleRank of Viewer who voted “useful”; # Best Answers Attributed To Answerer; Useful Vote; Like Vote.
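
A minimal sketch of how some of the signals on slide 12 (viewer's interest, question popularity, quality of answers, answerability) could be blended into one ranking score. The linear form, the `WEIGHTS` values and the 0..1 scaling of each signal are assumptions; the slides name the signals but not how they are combined.

```python
from dataclasses import dataclass

# Hypothetical blend weights; not taken from the presentation.
WEIGHTS = {
    "viewer_interest": 0.4,
    "popularity": 0.2,
    "answer_quality": 0.3,
    "answerability": 0.1,
}


@dataclass
class QuestionSignals:
    question_id: str
    viewer_interest: float  # match against the viewer's interest graph, assumed 0..1
    popularity: float       # question popularity, assumed 0..1
    answer_quality: float   # quality of answers, assumed 0..1
    answerability: float    # how answerable the question is, assumed 0..1


def relevance_score(s):
    """Weighted linear blend of the per-question signals."""
    return (
        WEIGHTS["viewer_interest"] * s.viewer_interest
        + WEIGHTS["popularity"] * s.popularity
        + WEIGHTS["answer_quality"] * s.answer_quality
        + WEIGHTS["answerability"] * s.answerability
    )


def rank(questions):
    """Order questions from most to least relevant for one viewer."""
    return sorted(questions, key=relevance_score, reverse=True)


candidates = [
    QuestionSignals("q1", 0.9, 0.2, 0.5, 0.5),
    QuestionSignals("q2", 0.1, 0.9, 0.5, 0.5),
]
ranked = [q.question_id for q in rank(candidates)]
```

With these assumed weights, a question that matches the viewer's interests outranks a merely popular one.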

  13. Architecture for Online & Offline Computation. Diagram labels: Relevance computation; Tags; Front-End; Fast path; Notification; userId, contentId, relevance_score; Middle-tier; search terms, UGC; User Profile Services; NoSQL; Long Tail Cache; Oracle; User interest; Answerability; Collaborative Filtering; PeopleRank; Feed Acquisition; Quality of Answers; Thumbs-up; Content; Question Popularity; Tags; 3rd party feeds; New Online serving; Answers serving; New Offline on Hadoop Grid.
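
A minimal sketch of the "fast path" idea on slide 13: precomputed (userId, contentId, relevance_score) tuples are held in memory and served directly, with a fallback list for users who have no scores yet. The `RelevanceCache` class and its methods are hypothetical; the slide does not describe the actual cache or NoSQL interface.

```python
import heapq


class RelevanceCache:
    """In-memory lookup of precomputed relevance scores (hypothetical interface)."""

    def __init__(self, fallback):
        self._scores = {}                # user_id -> {content_id: score}
        self._fallback = list(fallback)  # e.g. popular content for unknown users

    def load(self, tuples):
        """Ingest (user_id, content_id, relevance_score) tuples."""
        for user_id, content_id, score in tuples:
            self._scores.setdefault(user_id, {})[content_id] = score

    def top(self, user_id, n):
        """Return the n highest-scoring content ids, or the fallback list."""
        scores = self._scores.get(user_id)
        if not scores:
            return self._fallback[:n]
        return heapq.nlargest(n, scores, key=scores.get)


cache = RelevanceCache(fallback=["q9", "q8"])
cache.load([("u1", "q1", 0.2), ("u1", "q2", 0.9), ("u1", "q3", 0.5)])
top_for_u1 = cache.top("u1", 2)
top_for_new_user = cache.top("unknown", 1)
```

Serving from precomputed tuples keeps the request path to a dictionary lookup; the scores themselves are produced elsewhere.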

  14. Offline Relevance Computation. Components: User Interest Graph; PeopleRank; Relevance Computation; Answers Data on Grid. Numbered flow labels: 1, userID; 2, viewer interests; 3, viewer interests; 3b, viewer interests; 4, top answerers; 4b, popular Qs; 5, top answerers; 6, Qs answered; 7, userID-Q-relevance_score.
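
One possible reading of the numbered flow on slide 14, as a minimal batch sketch: for each userID, look up viewer interests, collect popular questions and questions answered by top answerers for those interests, then emit userID-Q-relevance_score tuples. The function name, the dictionary-shaped inputs and the additive scoring are assumptions; the slide gives only the step labels.

```python
def offline_relevance(user_ids, interest_graph, top_answerers, popular_qs, qs_answered):
    """Yield (user_id, question_id, score) tuples for a batch of users.

    interest_graph: user_id -> {tag: weight}     (steps 1-3)
    top_answerers:  tag -> [answerer_id, ...]    (steps 4-5)
    popular_qs:     tag -> [question_id, ...]    (step 4b)
    qs_answered:    answerer_id -> [question_id] (step 6)
    """
    for user_id in user_ids:
        interests = interest_graph.get(user_id, {})
        scores = {}
        for tag, weight in interests.items():
            # Popular questions under a tag the viewer cares about.
            for q in popular_qs.get(tag, []):
                scores[q] = scores.get(q, 0.0) + weight
            # Questions answered by top answerers for that tag.
            for answerer in top_answerers.get(tag, []):
                for q in qs_answered.get(answerer, []):
                    scores[q] = scores.get(q, 0.0) + weight
        # Step 7: emit one tuple per candidate question.
        for q, score in sorted(scores.items()):
            yield (user_id, q, score)


rows = list(
    offline_relevance(
        user_ids=["u1"],
        interest_graph={"u1": {"java": 0.75, "hadoop": 0.25}},
        top_answerers={"java": ["a1"]},
        popular_qs={"java": ["q1"], "hadoop": ["q2"]},
        qs_answered={"a1": ["q1", "q3"]},
    )
)
```

On a Hadoop grid the same per-user loop would presumably be partitioned by userID; this in-process version only illustrates the data flow.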

  15. Incremental Online Relevance Computation. Components: Front End; Middle Tier; PeopleRank; UPS; Relevance Computation; Answers Oracle Database. Numbered flow labels: 1, click, search, UGC; 2 (unlabeled); 3, userID, tags; 4, viewer interests; 5, viewer interests; 5b, viewer interests; 6, top answerers; 6b, popular Qs; 7, top answerers; 8, Qs answered; 9, relevant Qs; 10, relevant Qs.
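
A minimal sketch of what an incremental update might look like when a click, search or UGC event arrives with tags (steps 1 and 3 on slide 15). The exponential moving average used here is a stand-in technique chosen for illustration, and `update_interests` and `alpha` are hypothetical names; the slides do not say how viewer interests are actually updated.

```python
def update_interests(interests, event_tags, alpha=0.2):
    """Blend one event's tags into a tag -> weight map (exponential moving average).

    Existing weights decay by (1 - alpha); the event's tags share alpha equally.
    If the input weights sum to 1, the output weights also sum to 1.
    """
    if not event_tags:
        return dict(interests)
    share = 1.0 / len(event_tags)
    updated = {tag: (1 - alpha) * weight for tag, weight in interests.items()}
    for tag in event_tags:
        updated[tag] = updated.get(tag, 0.0) + alpha * share
    return updated


before = {"java": 1.0}
after = update_interests(before, ["hadoop"], alpha=0.5)
```

Because each event touches only one user's map, the update is cheap enough to run on the request path, with the full recomputation left to the batch side.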

  16. Next Steps
     • Move Oracle batch processing to Hadoop grid
     • Get Answers data on Hadoop grid
     • Annotation of source property for user interest
     • Detect useful vs. interesting feedback
     • User Interest Graph
     • PeopleRank
     • Tag computation
     • Bucketing infrastructure
     • Notification services