Yury Lifshits Yahoo! Research http://yury.name. St. Petersburg CS Club December 2008. Future of Search. Outline. Structured Search Yahoo! Work in Search SearchMonkey BOSS Research Agenda. Structured Search: work in progress. Structured Search = Bring structured data to search users.
Yury Lifshits Yahoo! Research http://yury.name St. Petersburg CS Club December 2008 Future of Search
Outline • Structured Search • Yahoo! Work in Search • SearchMonkey • BOSS • Research Agenda
Structured Search = Bring structured data to search users M.K. Bergman. The Deep Web: Surfacing Hidden Value. 2001.
Value Proposition • Coverage • Real-time data • Semi-private data • Structured queries • Ordering and filtering results • Straight-to-answers
User Interface: Query • Search assist: Yahoo! • Selector: LinkedIn, VKontakte.ru • Multiple search buttons: Gmail • Search tabs: Yahoo / Google
User Interface: Results • Federated page • Facets • Search transfer / search form K.P. Yee, K. Swearingen, K. Li, M. Hearst. Faceted metadata for image search and browsing. CHI 2003. Fernando Diaz. Aggregation of News Content Into Web Results. WSDM 2009. http://glue.yahoo.com http://au.alpha.yahoo.com
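The "Facets" bullet above can be illustrated with a minimal sketch: count how many results carry each value of a structured attribute, then narrow the result list when the user picks a value. The field names (city, type) and the sample events are made up for illustration and do not come from any real index.

```python
from collections import Counter

# Hypothetical structured results; every field here is illustrative.
EVENTS = [
    {"title": "Future of Search", "city": "St. Petersburg", "type": "lecture"},
    {"title": "Search meetup", "city": "Moscow", "type": "meetup"},
    {"title": "IR seminar", "city": "St. Petersburg", "type": "seminar"},
]


def facet_counts(results, facet_names):
    """Count how many results carry each value of each facet."""
    counts = {name: Counter() for name in facet_names}
    for item in results:
        for name in facet_names:
            if name in item:
                counts[name][item[name]] += 1
    return counts


def filter_by_facets(results, selected):
    """Keep only results that match every selected facet value."""
    return [
        item
        for item in results
        if all(item.get(name) == value for name, value in selected.items())
    ]
```

The counts are what a results page would show next to each facet value; the filter is what runs when a value is clicked.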
Data Supply Chain • Atomic fact Flight, Event, Patent • Data aggregator US Patents, Amadeus/Sabre flights, Upcoming.com • Domain search Expedia, Spock • General purpose search Yahoo!, Google, Yandex, Baidu
Getting structured data • Entity extraction • Markup • Feeds • Search API (OpenSearch) OR • Do a search transfer
Give Us Your Data For … • Traffic via search transfer Firefox search box • Better presentation in search SearchMonkey • Hosted search BOSS Custom • Showing your ads Yahoo Local + AT&T
Slides by: Paul Tarjan, Chief Technical Monkey (ptarjan@yahoo-inc.com) Full version http://www.slideshare.net/ptarjan/searchmonkey-presentation
What is SearchMonkey? an open platform for using structured data to build more useful and relevant search results Before After
Enhanced Result: Zagat Image Links Key/Value Pairs or Abstract
Infobar: Wikipedia Preview Summary Blob
Creating an Infobar • Infobar advantages • Annotate someone else’s site • Use links and images from other domains • Mash up info from multiple sites • Affiliate / coupon links? Hmmm… • Can act on *, all websites • But these apps can be annoying if poorly designed • Key design principles • Put something useful in the summary • Be creative with the HTML
How to get data to SearchMonkey? • Humans see: • name • picture of a person • current job • industry, … • Computers see: • an undifferentiated • blob of HTML • Can we make computers smarter?
How does it work? 1. Site owners/publishers share structured data with Yahoo!. 2. Site owners & third-party developers build SearchMonkey apps. 3. Consumers customize their search experience with Enhanced Results or Infobars. Diagram labels: Page Extraction, RDF/Microformat Markup, Acme.com’s Web Pages, Index, DataRSS feed, Web Services, Acme.com’s database
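To make the "Microformat Markup" route concrete, here is a rough sketch of turning marked-up HTML into key/value pairs. It assumes hCard-style class names (fn, title, org, photo) and a made-up sample page; it is an illustration of the idea only, not how SearchMonkey itself extracts data, and it does not handle nested tags inside a field.

```python
from html.parser import HTMLParser

# Made-up page fragment using hCard-style class names.
SAMPLE = """
<div class="vcard">
  <img class="photo" src="http://example.com/jane.jpg">
  <span class="fn">Jane Doe</span>
  <span class="title">Research Scientist</span>
  <span class="org">Acme</span>
</div>
"""

TEXT_FIELDS = {"fn", "title", "org"}


class HCardParser(HTMLParser):
    """Collect text of elements whose class names one of TEXT_FIELDS."""

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field whose text is being collected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if tag == "img" and "photo" in classes:
            self.fields["photo"] = attrs.get("src")
            return
        matched = classes & TEXT_FIELDS
        if matched:
            self._current = sorted(matched)[0]
            self.fields.setdefault(self._current, "")

    def handle_data(self, data):
        if self._current:
            self.fields[self._current] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self._current = None


def extract_hcard(html):
    """Return the marked-up fields as a plain dict."""
    parser = HCardParser()
    parser.feed(html)
    parser.close()
    return {
        key: value.strip() if isinstance(value, str) else value
        for key, value in parser.fields.items()
    }
```

With markup like this, the "undifferentiated blob of HTML" becomes a small record (name, job title, organization, picture) that an Enhanced Result could display.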
SearchMonkey Resources • Main: • http://developer.yahoo.com/searchmonkey • Lists and forums: • searchmonkey-developers@yahoogroups.com • http://suggestions.yahoo.com/searchmonkey
What BOSS = Build your Own Search Service Open Yahoo’s core search features via web services to let 3rd parties revolutionize Search Unrestricted
What • Unrestricted: • Unlimited queries • Blend, re-order, discard • Full presentation control • Non-search apps OK • Monetization: Free or CPM or Ads
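"Blend, re-order, discard" can be sketched on the client side as three small list operations. The result shape (dicts with url and date keys) and the sample lists are assumptions made for this example; they are not the BOSS response format.

```python
from itertools import chain, zip_longest
from urllib.parse import urlparse

# Made-up result lists; the "url" and "date" keys are illustrative.
WEB = [
    {"url": "http://a.example/1", "date": "2008-11-01"},
    {"url": "http://b.example/2", "date": "2008-12-01"},
]
NEWS = [
    {"url": "http://c.example/3", "date": "2008-12-05"},
    {"url": "http://a.example/1", "date": "2008-11-01"},
]


def blend(*result_lists):
    """Round-robin interleave ranked lists, dropping duplicate URLs."""
    seen = set()
    blended = []
    for item in chain.from_iterable(zip_longest(*result_lists)):
        if item is None or item["url"] in seen:
            continue
        seen.add(item["url"])
        blended.append(item)
    return blended


def discard(results, blocked_hosts):
    """Drop results hosted on any of the blocked hosts."""
    return [r for r in results if urlparse(r["url"]).hostname not in blocked_hosts]


def reorder(results, key, reverse=False):
    """Re-rank results by an application-specific key."""
    return sorted(results, key=key, reverse=reverse)
```

For example, reorder(discard(blend(WEB, NEWS), {"b.example"}), key=lambda r: r["date"], reverse=True) yields a deduplicated, filtered list with the newest item first.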
Why • Barriers to entry are massive • $300M, top talent, a prayer to get to basic parity • No monopoly over great ideas • Search anywhere • Improve Vertical Quality w/ Web comprehensiveness • Fragment the market, foster more players, choice, competition • Yahoo extends advertising reach, 3rd parties revenue share
Why + BOSS Distribution Traditional Search Distribution
Tracks API A self-service, web services model for developers and start-ups to quickly build and deploy new search experiences. CUSTOM Working with 3rd parties to build a more relevant, brand/site specific web search experience. This option is jointly built by Yahoo! and select partners. • ACADEMIC • Working with the following universities to allow for wide-scale research in the search field: • UIUC • CMU • Stanford • Purdue • IIT Bombay • MIT • UMass Interested in Custom? Email us bosscustom@yahoo-inc.com
BOSS API v1 http://boss.yahooapis.com/ysearch/{vert}/v1/{q} {vert} := {web, news, images, spelling} @ required appid @ optional (Y!OS compliant) start, count, lang, region, format, callback, sites
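As a sketch of how the URL template above might be filled in: the base URL, the four verticals, and the parameter names are taken from the slide, while passing appid and the optional parameters as a query string is an assumption. The helper name build_boss_url is hypothetical, and the function only builds a string; it sends no request.

```python
from urllib.parse import quote, urlencode

BOSS_BASE = "http://boss.yahooapis.com/ysearch"
VERTICALS = {"web", "news", "images", "spelling"}
OPTIONAL_PARAMS = {"start", "count", "lang", "region", "format", "callback", "sites"}


def build_boss_url(vert, query, appid, **options):
    """Fill in the v1 template {vert}/v1/{q}; assumes query-string parameters."""
    if vert not in VERTICALS:
        raise ValueError("unknown vertical: %r" % vert)
    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError("unsupported parameters: %s" % sorted(unknown))
    # appid is required; optional parameters follow in sorted order.
    params = [("appid", appid)] + sorted(options.items())
    return "%s/%s/v1/%s?%s" % (
        BOSS_BASE,
        vert,
        quote(query, safe=""),  # percent-encode the query as a path segment
        urlencode(params),
    )
```

A call such as build_boss_url("web", "future of search", "APPID", count=5, format="json") returns a URL that follows the template, with the query percent-encoded in the path.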
BOSS Mashup Framework Python (v2.5+) library BOSS Search SDK plus … SQL for remixing arbitrary XML/JSON sources Loosely Functional programming paradigm
BMF + Google App Engine Ported enhanced version of BMF to GAE platform http://zooie.wordpress.com/2008/08/04/yahoo-boss-google-app-engine-integrated/ Easiest way to deploy a BOSS application online
Examples http://www.4hoursearch.com http://123people.com Mashable! Contest for BOSS search engines http://mashable.com/boss/
TechCrunch Network Search • CrunchBase + Posts + Web • Sort by time / relevance • Enhanced results • Domain-specific facets • Yahoo! sponsored search • Real-time indexing • Special results
Structured Search • Analysis of search demand • Intent classification • General search vs. vertical • Incentives in data supply • Push & real-time indexing • Search user interface • One box vs. multi-box • General vs. vertical • Deciding search transfer • When? • To whom?
Key Scientific Challenges Draft: http://research.yahoo.com/ksc • Search intent • Quality metrics • Web mining • Multilingual IR • Nextgen search • Synthesized result pages • World knowledge A.Z. Broder. Taxonomy of web search. SIGIR 2002.
More Problems • Discovery search • Web search vs. asking people • Event search
Thanks for your attention! Yury Lifshits http://yury.name yury@yury.name