Personalizing Java based Answers for Hundreds of Millions of Users
Anurag Gupta
Senior Architect, Yahoo Answers & Groups
anuragg@yahoo-inc.com
Agenda
• Industry Gaps
• Vision
• Strategy
• Use Cases
• Architecture
• Next Steps
2010: Resurgence of Q&A
• 2010: A year of highlights…
• 2011: The story continues… Quora, location-based Q&A apps (Crowd Beacon, Hipster), Facebook Questions and Mahalo pivoting, Answers.com acquisition…
• Yahoo! Answers is still #1 (twice the size of its nearest competitor)
[Figure labels: Launch, Acquisition, Investment, Mobile play]
Why this activity?
Companies are entering the market to address deficiencies of social media
• Meeting unmet needs:
  • Improving signal-to-noise ratio
  • Beyond realtime: creating User Generated Content of lasting, evergreen value
  • Organising people’s knowledge and opinion for mass consumption
  • Allowing people to connect and share based on common interests, locations etc.
  • Providing platforms for people to become regarded as experts
• Identifying untapped monetisation opportunities
• Mining intent, interest and information from participating users
Yahoo Answers is the place to share opinions, experience & knowledge around personal interests
Strengthen core and reach out
• Monetization
• Ecosystem
• Distribution
• Personalization, User Interest Graph
• User Reputation
Personalization & Relevance
[Diagram labels: Connected Devices; Ranked content, video, ads; User clicks; Insights; Users; APIs; Social graph; Ads; Content; Yahoo; User Generated Content, tagging; Partner Data APIs; Publisher Partners]
Personalization & Relevance
• Gaps drive acquisition of new relevant long-tail content
• Content & Ad Server: in-memory user-content-relevance_score
• Inputs: User clicks; Search terms; Interactions: UGC, tags, Q&A; ‘like’; Feeds
• Signals: Collaborative Filtering, social, geo, time
• Output: Ranked content & ad
[Other diagram labels: Users; Finance; Sports; News; Search; Ads; 3rd party publisher; User Segments; Content-Tags; Tag; User Interest Graph; Social Graph; Ad & Content; Advertisers; Publishers]
Yahoo Answers Personalization Use Cases
• Learn about new users’ interests (cold-start)
• Show relevant questions to users who arrive via a search engine
• Show relevant questions to Answerers on Y! Answers or 3rd party sites
• Use knowledge of user interests to increase user engagement, page views, reach, and monetization
Answers: Relevance & Content Quality
• Increase signal-to-noise ratio
• Reward content creators with relevant audience
• Help audience discover relevant high quality content
[Diagram: Q&A page; High quality; High relevance. Legend: Green – Y! wide; Yellow – Answers specific]
Signals shown:
• Viewer’s interest
• Question Popularity
• Quality of Answers
• Answerers with High PeopleRank
• Answerability
• User Interest Graph
• PeopleRank of Viewer who voted “useful”
• # Best Answers Attributed To Answerer
• Useful Vote
• Like Vote
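The slide names the signals but not how they are combined. As a minimal sketch only, assuming each signal is normalized to the range 0..1 and combined by a weighted sum (the weights, field names, and the `relevance_score` function are hypothetical and not taken from the talk), ranking a question for a viewer might look like this:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QuestionSignals:
    """Signals named on the slide, each assumed to be normalized to 0..1."""
    viewer_interest: float
    question_popularity: float
    answer_quality: float
    answerer_people_rank: float
    answerability: float


# Hypothetical weights chosen for illustration; they sum to 1.0.
WEIGHTS = {
    "viewer_interest": 0.4,
    "question_popularity": 0.15,
    "answer_quality": 0.2,
    "answerer_people_rank": 0.15,
    "answerability": 0.1,
}


def relevance_score(signals: QuestionSignals) -> float:
    """Weighted sum of the signals (an assumed combination rule)."""
    return sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())
```

A weighted sum is the simplest choice here; a learned ranking model could be substituted without changing the inputs.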
Architecture for Online & Offline Computation
[Diagram legend: New Online serving; Answers serving; New Offline on Hadoop Grid]
• Serving components: Front-End; Middle-tier; User Profile Services; NoSQL; Long Tail Cache; Oracle; Notification
• Fast path: userId, contentId, relevance_score
• Online inputs: search terms, UGC
• Relevance computation: Tags; User interest; Answerability; Collaborative Filtering; PeopleRank; Quality of Answers; Question Popularity
• Content and feeds: Feed Acquisition; 3rd party feeds; Thumbs-up; Content; Tags
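The fast path on this slide carries (userId, contentId, relevance_score) triples. The following is a rough sketch of how a serving tier could hold such triples in memory and return the top items for a user. The `RelevanceStore` class, its method names, and the fallback to a caller-supplied list for unknown users are all assumptions for illustration, not the architecture's actual API:

```python
import heapq
from collections import defaultdict


class RelevanceStore:
    """In-memory map of user_id -> {content_id: relevance_score} (hypothetical)."""

    def __init__(self):
        self._scores = defaultdict(dict)

    def put(self, user_id, content_id, score):
        """Record or overwrite one (userId, contentId, relevance_score) triple."""
        self._scores[user_id][content_id] = score

    def top_n(self, user_id, n, fallback=()):
        """Return the n highest-scored content ids for the user.

        If nothing is stored for the user (for example a cold-start user),
        return the first n entries of the fallback list instead.
        """
        scored = self._scores.get(user_id)
        if not scored:
            return list(fallback)[:n]
        return heapq.nlargest(n, scored, key=scored.get)
```

For example, after `store.put("u1", "q2", 0.9)` and `store.put("u1", "q3", 0.5)`, calling `store.top_n("u1", 1)` returns the single best-scored question id for that user.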
Offline Relevance Computation
Components: User Interest Graph; PeopleRank; Relevance Computation; Answers Data on Grid
Numbered flows in the diagram:
• 1: userID
• 2: viewer interests
• 3: viewer interests
• 3b: viewer interests
• 4: top answerers
• 4b: popular Qs
• 5: top answerers
• 6: Qs answered
• 7: userID-Q-relevance_score
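The numbered flows suggest a batch job that starts from a userID, looks up the viewer's interests, gathers questions from top answerers and popular questions, and emits userID-Q-relevance_score rows. A minimal single-user sketch of that flow follows. The function name, the dictionary-based inputs, and the scoring rule (sum of the viewer's interest weights over a question's tags) are assumptions for illustration; the talk does not specify them:

```python
def offline_relevance(user_id, interest_graph, top_answerers_by_tag,
                      answered_by, popular_questions, question_tags):
    """Return (user_id, question_id, relevance_score) rows for one user.

    interest_graph:        user_id -> {tag: weight}      (viewer interests)
    top_answerers_by_tag:  tag -> [answerer_id, ...]     (from PeopleRank)
    answered_by:           answerer_id -> [question_id]  (Qs answered)
    popular_questions:     [question_id, ...]            (popular Qs)
    question_tags:         question_id -> [tag, ...]
    """
    interests = interest_graph.get(user_id, {})

    # Candidate questions: popular ones plus those answered by top
    # answerers for any tag the viewer is interested in.
    candidates = set(popular_questions)
    for tag in interests:
        for answerer in top_answerers_by_tag.get(tag, []):
            candidates.update(answered_by.get(answerer, []))

    # Assumed scoring rule: total interest weight over the question's tags.
    rows = []
    for question_id in sorted(candidates):
        score = sum(interests.get(tag, 0.0)
                    for tag in question_tags.get(question_id, []))
        rows.append((user_id, question_id, score))
    return rows
```

On a Hadoop grid the same logic would typically be expressed as joins over the per-user, per-tag, and per-answerer datasets rather than as in-memory dictionaries.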
Incremental Online Relevance Computation
Components: Front End; Middle Tier; UPS; PeopleRank; Relevance Computation; Answers Oracle Database
Numbered flows in the diagram:
• 1: click, search, UGC
• 2: (unlabeled)
• 3: userID, tags
• 4: viewer interests
• 5: viewer interests
• 5b: viewer interests
• 6: top answerers
• 6b: popular Qs
• 7: top answerers
• 8: Qs answered
• 9: relevant Qs
• 10: relevant Qs
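The online path begins with a click, search, or UGC event that carries a userID and tags. One way to fold such an event into the viewer's stored interests without recomputing everything is an exponentially decayed update. This is an illustrative technique swapped in here, not one the slides describe; the `apply_event` function and its `event_weight` and `decay` parameters are hypothetical:

```python
def apply_event(interests, event_tags, event_weight=1.0, decay=0.5):
    """Return updated {tag: weight} interests after one user event.

    Existing weights are multiplied by `decay` so older activity fades,
    then each tag on the new event gains `event_weight`.
    The input dictionary is not modified.
    """
    updated = {tag: weight * decay for tag, weight in interests.items()}
    for tag in event_tags:
        updated[tag] = updated.get(tag, 0.0) + event_weight
    return updated
```

The updated interests could then feed the same relevance computation used offline, restricted to the single active user.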
Next Steps
• Move Oracle batch processing to Hadoop grid
• Get Answers data on Hadoop grid
• Annotation of source property for user interest
• Detect useful vs. interesting feedback
• User Interest Graph
• PeopleRank
• Tag computation
• Bucketing infrastructure
• Notification services