
Yury Lifshits Yahoo! Research yury.name

Yury Lifshits Yahoo! Research http://yury.name. St. Petersburg CS Club December 2008. Future of Search. Outline. Structured Search Yahoo! Work in Search SearchMonkey BOSS Research Agenda. Structured Search: work in progress. Structured Search = Bring structured data to search users.


Presentation Transcript


  1. Yury Lifshits, Yahoo! Research, http://yury.name. St. Petersburg CS Club, December 2008. Future of Search

  2. Outline • Structured Search • Yahoo! Work in Search • SearchMonkey • BOSS • Research Agenda

  3. Structured Search: work in progress

  4. Structured Search = Bring structured data to search users. M.K. Bergman. The Deep Web: Surfacing Hidden Value. 2001.

  5. Value Proposition • Coverage • Real-time data • Semi-private data • Structured queries • Ordering and filtering results • Straight-to-answers

  6. User Interface: Query • Search assist: Yahoo! • Selector: LinkedIn, VKontakte.ru • Multiple search buttons: Gmail • Search tabs: Yahoo / Google

  7. User Interface: Results • Federated page • Facets • Search transfer / search form K.P. Yee, K. Swearingen, K. Li, M. Hearst. Faceted metadata for image search and browsing. CHI 2003. Fernando Diaz. Aggregation of News Content Into Web Results. WSDM 2009. http://glue.yahoo.com http://au.alpha.yahoo.com
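As a rough illustration of the "Facets" item above, the sketch below counts how many results fall under each value of a metadata field and then narrows the list to one value. The result records, the field name `type`, and both function names are made up for the example; they do not come from any Yahoo! interface.

```python
from collections import Counter

def facet_counts(results, field):
    """Count how many results carry each value of the given metadata field."""
    return Counter(r[field] for r in results if field in r)

def filter_by_facet(results, field, value):
    """Keep only the results whose field equals the selected facet value."""
    return [r for r in results if r.get(field) == value]

# Hypothetical search results with one structured field.
sample_results = [
    {"title": "Funding round announced", "type": "post"},
    {"title": "Acme Inc. profile", "type": "company"},
    {"title": "Product launch review", "type": "post"},
]

counts = facet_counts(sample_results, "type")
posts_only = filter_by_facet(sample_results, "type", "post")
```

The counts would drive the facet sidebar, and the filter would run when the user clicks a facet value.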

  8. Data Supply Chain • Atomic fact: Flight, Event, Patent • Data aggregator: US Patents, Amadeus/Sabre flights, Upcoming.com • Domain search: Expedia, Spock • General purpose search: Yahoo!, Google, Yandex, Baidu

  9. Getting structured data • Entity extraction • Markup • Feeds • Search API (OpenSearch) OR • Do a search transfer
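For the "Search API (OpenSearch)" option, a site publishes a URL template and the search engine fills it with the user's query, which is also one way to implement a search transfer. The sketch below fills such a template, assuming the OpenSearch convention that parameters look like `{searchTerms}` and that a trailing `?` marks an optional parameter. The template URL and the function name `fill_template` are hypothetical.

```python
import re
from urllib.parse import quote

def fill_template(template, **values):
    """Substitute {name} and {name?} placeholders in an OpenSearch-style URL template."""
    def substitute(match):
        name, optional = match.group(1), match.group(2) == "?"
        if name in values:
            # Percent-encode the value so it is safe inside a URL.
            return quote(str(values[name]), safe="")
        if optional:
            # Optional parameters with no value become the empty string.
            return ""
        raise KeyError(f"missing required template parameter: {name}")
    return re.sub(r"\{([A-Za-z:]+)(\??)\}", substitute, template)

# Hypothetical template a flight-search site might publish.
template = "http://example.com/search?q={searchTerms}&start={startIndex?}"
url = fill_template(template, searchTerms="flight to SPb")
```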

  10. Give Us Your Data For … • Traffic via search transfer (Firefox search box) • Better presentation in search (SearchMonkey) • Hosted search (BOSS Custom) • Showing your ads (Yahoo Local + AT&T)

  11. Yahoo! Work in Search

  12. Slides by: Paul Tarjan, Chief Technical Monkey (ptarjan@yahoo-inc.com) Full version http://www.slideshare.net/ptarjan/searchmonkey-presentation

  13. What is SearchMonkey? an open platform for using structured data to build more useful and relevant search results Before After

  14. Enhanced Result: Zagat Image Links Key/Value Pairs or Abstract

  15. Infobar: Wikipedia Preview Summary Blob

  16. Creating an Infobar • Infobar advantages • Annotate someone else’s site • Use links and images from other domains • Mash up info from multiple sites • Affiliate / coupon links? Hmmm… • Can act on *, all websites • But these apps can be annoying if poorly designed • Key design principles • Put something useful in the summary • Be creative with the HTML

  17. How to get data to SearchMonkey? • Humans see: • name • picture of a person • current job • industry, … • Computers see: • an undifferentiated • blob of HTML • Can we make computers smarter?

  18. How does it work? (1) Site owners/publishers share structured data with Yahoo!. (2) Site owners & third-party developers build SearchMonkey apps. (3) Consumers customize their search experience with Enhanced Results or Infobars. Diagram labels: Page Extraction • RDF/Microformat Markup • Acme.com’s Web Pages • Index • DataRSS feed • Web Services • Acme.com’s database

  19. SearchMonkey Resources • Main: • http://developer.yahoo.com/searchmonkey • Lists and forums: • searchmonkey-developers@yahoogroups.com • http://suggestions.yahoo.com/searchmonkey

  20. Vik Singh (Architect), Graham Mudd (Senior PMM)

  21. What BOSS = Build your Own Search Service Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search Unrestricted

  22. What • Unrestricted: • Unlimited queries • Blend, re-order, discard • Full presentation control • Non-search apps OK • Monetization: Free or CPM or Ads

  23. Why • Barriers to entry are massive • $300M, top talent, a prayer to get to basic parity • No monopoly over great ideas • Search anywhere • Improve Vertical Quality w/ Web comprehensiveness • Fragment the market, foster more players, choice, competition • Yahoo extends advertising reach, 3rd parties revenue share

  24. Why + BOSS Distribution Traditional Search Distribution

  25. Tracks • API: A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences. • Custom: Working with 3rd parties to build a more relevant, brand/site specific web search experience. This option is jointly built by Yahoo! and select partners. • Academic: Working with the following universities to allow for wide-scale research in the search field: UIUC, CMU, Stanford, Purdue, IIT Bombay, MIT, UMass • Interested in Custom? Email us bosscustom@yahoo-inc.com

  26. BOSS API v1 http://boss.yahooapis.com/ysearch/{vert}/v1/{q} {vert} := {web, news, images, spelling} @ required appid @ optional (Y!OS compliant) start, count, lang, region, format, callback, sites
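The URL pattern on this slide can be turned into a small request builder. The sketch below only assembles the URL from the template, the four verticals, the required `appid`, and the optional parameter names listed above; it sends no request, and the service may no longer be reachable at this address. The function name `build_boss_url` and the `MY_APPID` placeholder are assumptions for illustration.

```python
from urllib.parse import quote, urlencode

BASE_URL = "http://boss.yahooapis.com/ysearch"
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL_PARAMS = {"start", "count", "lang", "region", "format", "callback", "sites"}

def build_boss_url(vert, query, appid, **params):
    """Build a BOSS v1 URL following the slide's template /ysearch/{vert}/v1/{q}."""
    if vert not in VERTICALS:
        raise ValueError(f"unknown vertical: {vert}")
    unknown = set(params) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unsupported parameters: {sorted(unknown)}")
    # appid is required; optional parameters follow in a stable (sorted) order.
    query_string = urlencode([("appid", appid)] + sorted(params.items()))
    return f"{BASE_URL}/{vert}/v1/{quote(query, safe='')}?{query_string}"

# MY_APPID is a placeholder, not a real application id.
url = build_boss_url("web", "structured search", "MY_APPID", count=5, format="json")
```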

  27. BOSS Mashup Framework Python (v2.5+) library BOSS Search SDK plus … SQL for remixing arbitrary XML/JSON sources Loosely Functional programming paradigm
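The slide does not show the framework's own API, so the sketch below is not BMF code. It only illustrates the general idea of "SQL for remixing" JSON sources, using Python's standard `sqlite3` module: two JSON lists are loaded into in-memory tables, joined on URL, and re-ranked. Both data sources, the table names, and the function name `remix` are invented for the example.

```python
import json
import sqlite3

def remix(news_json, votes_json):
    """Join two JSON result lists on URL and order by the second source's score."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE news (title TEXT, url TEXT)")
    conn.execute("CREATE TABLE votes (url TEXT, score INTEGER)")
    # Named placeholders let each JSON object (a dict) fill one row.
    conn.executemany("INSERT INTO news VALUES (:title, :url)", json.loads(news_json))
    conn.executemany("INSERT INTO votes VALUES (:url, :score)", json.loads(votes_json))
    rows = conn.execute(
        "SELECT n.title, n.url, v.score "
        "FROM news n JOIN votes v ON n.url = v.url "
        "ORDER BY v.score DESC"
    ).fetchall()
    conn.close()
    return rows

# Made-up sample data standing in for a search result feed and a voting feed.
news = '[{"title": "A", "url": "http://example.com/a"}, {"title": "B", "url": "http://example.com/b"}]'
votes = '[{"url": "http://example.com/a", "score": 3}, {"url": "http://example.com/b", "score": 9}]'
ranked = remix(news, votes)
```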

  28. BMF + Google App Engine Ported enhanced version of BMF to GAE platform http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/ Easiest way to deploy a BOSS application online

  29. Examples http://www.4hoursearch.com http://123people.com Mashable! Contest for BOSS search engines http://mashable.com/boss/

  30. BOSS Custom for TechCrunch

  31. TechCrunch Network Search • CrunchBase + Posts + Web • Sort by time / relevance • Enhanced results • Domain-specific facets • Yahoo! sponsored search • Real-time indexing • Special results

  32. Research Agenda

  33. Structured Search • Analysis of search demand • Intent classification • General search vs. vertical • Incentives in data supply • Push & real-time indexing • Search user interface • One box vs. multi-box • General vs. vertical • Deciding search transfer • When? • To whom?

  34. Key Scientific Challenges. Draft: http://research.yahoo.com/ksc • Search intent • Quality metrics • Web mining • Multilingual IR • Nextgen search • Synthesized result pages • World knowledge A.Z. Broder. Taxonomy of web search. SIGIR 2002.

  35. More Problems • Discovery search • Web search vs. asking people • Event search

  36. Thanks for your attention! Yury Lifshits http://yury.name yury@yury.name
