Hack the BOSS. Ted DRAKE, Yahoo! France.
BOSS Basics • “BOSS is a data API. It’s not a search API” - Vik Singh, BOSS Architect, www2009 Conference, Madrid
BOSS = Freedom • Change ranking • Create your own look and feel • Use your favorite ads • Mash with external APIs
Coming Soon… • SLA • Customer Support • Fees: • Free for most uses • Costs based on usage
BOSS Details • REST based API. • XML or JSON output • Web, News, Image, SiteSearch, and Spelling Suggestion services • Time span filtering for News Search • Delicious Tags and Popularity • Keyterm extraction • Microformat and RDF data • Extended abstracts • Recognizes most search filters from Yahoo! and Google (backdoor hacks)
What is the most important part of your application? • The results display? • The text ads? • The rounded borders? • The smooth animations? • The perfect URL? THE QUERY STRING!!!
The Query • Tells you what the user is looking for • Generates related topics • Powers secondary APIs • Can be generated by a search box, URL, tags, or keyword extraction from the page. • The Query is your BFF!
Let’s Start Hacking! • Get an API key • http://developer.yahoo.com • You don’t need a URL for now. • Update it later for better tracking and promotion.
Site Specific Results • Search only one site: /ysearch/web/v1/golf+site:vw.com? • Search from a select group of sites: /ysearch/web/v1/golf?sites=vw.com,vwtrendsweb.com,performancevwmag.com,caranddriver.com
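The two request shapes above can be assembled programmatically. What follows is a minimal Python sketch, not code from the BOSS documentation: the function name `web_search_url` and the `format=json` default are assumptions for illustration, and the host is the one used in the sample request later in this deck.

```python
from urllib.parse import quote, urlencode

# Host taken from the sample request shown later in the deck.
BOSS_BASE = "http://boss.yahooapis.com"


def web_search_url(query, appid, single_site=None, sites=None):
    """Build a BOSS web search URL limited to one site or a group of sites.

    Hypothetical helper: the name and argument layout are illustrative only.
    """
    if single_site:
        # The site: operator travels inside the query itself.
        query = f"{query} site:{single_site}"
    # Keep ':' readable for operators; encode spaces as '+', as in the slide.
    path = "/ysearch/web/v1/" + quote(query, safe=":").replace("%20", "+")
    params = {"appid": appid, "format": "json"}
    if sites:
        # A group of sites goes in the comma-separated sites= parameter.
        params["sites"] = ",".join(sites)
    return BOSS_BASE + path + "?" + urlencode(params, safe=",")


if __name__ == "__main__":
    print(web_search_url("golf", "YourAppId", single_site="vw.com"))
    print(web_search_url("golf", "YourAppId", sites=["vw.com", "caranddriver.com"]))
```

The single-site form changes the query, while the multi-site form leaves the query untouched and only adds a parameter, so the second is easier to toggle per request.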
Tag or Title Filters • Use the inurl: filter to simulate tag search: /ysearch/web/v1/inurl:golf? • Use intitle: to keep only results with the query in the title: /ysearch/web/v1/intitle:golf?
Get Related Sites • Use related:foo.html to find related sites: /ysearch/web/v1/related:http://www.caranddriver.com/car/2006-models/2006-golf.html?
BOSS Keyterms • Keyterms are words used to find a site while searching on Yahoo! • Listed in order of relevance. • /web/v1/{query}?view=keyterms
Delicious Tags and Popularity • How many times has a page been saved in Delicious? • What tags have been associated with the page? How many times? • view=delicious_saves,delicious_toptags
KeyTerms + Delicious Tags: What are they good for? • Relevancy • Related Searches • Search Suggest • Tag Clouds • Trigger secondary APIs • Highlight Popular Results
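As one illustration of the tag cloud idea, the sketch below turns tag counts into font sizes. It assumes the counts have already been pulled out of a BOSS response into a plain dict; the function name `tag_cloud_sizes`, the pixel range, and the log scaling are all choices made for this example, not anything BOSS prescribes.

```python
import math


def tag_cloud_sizes(tag_counts, min_px=12, max_px=32):
    """Map {tag: save_count} to {tag: font size in px} using log scaling.

    Hypothetical helper: input shape and size range are assumptions.
    """
    # Log scaling keeps one very popular tag from flattening all the others.
    logs = {tag: math.log(count) for tag, count in tag_counts.items() if count > 0}
    if not logs:
        return {}
    lowest, highest = min(logs.values()), max(logs.values())
    span = highest - lowest
    sizes = {}
    for tag, value in logs.items():
        # With a single distinct count, every tag gets the largest size.
        weight = (value - lowest) / span if span else 1.0
        sizes[tag] = round(min_px + weight * (max_px - min_px))
    return sizes


if __name__ == "__main__":
    print(tag_cloud_sizes({"golf": 100, "vw": 10, "cars": 1}))
```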
What it looks like <keyterms> <terms> <term>Bucharest</term> <term>city</term> <term>Romanian</term> <term>population</term> <term>Romania</term> <term>architecture</term> <term>city centre</term> <term>clubs</term> </terms> </keyterms>
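A fragment like the one above can be read with the Python standard library. This is a sketch under one assumption: it parses the bare fragment exactly as shown on the slide. A full BOSS response wraps this fragment in other elements and may use XML namespaces, which this example does not handle. The function name `extract_keyterms` is illustrative.

```python
import xml.etree.ElementTree as ET

# The fragment from the slide, reproduced as a test input.
SAMPLE = """
<keyterms>
  <terms>
    <term>Bucharest</term>
    <term>city</term>
    <term>Romanian</term>
    <term>population</term>
    <term>Romania</term>
    <term>architecture</term>
    <term>city centre</term>
    <term>clubs</term>
  </terms>
</keyterms>
"""


def extract_keyterms(xml_fragment):
    """Return the keyterms in document order (the slides say most relevant first)."""
    root = ET.fromstring(xml_fragment.strip())
    return [term.text.strip() for term in root.iter("term") if term.text]


if __name__ == "__main__":
    print(extract_keyterms(SAMPLE))
```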
BOSS Mashup Framework • Python based framework to mash BOSS API with secondary web services and proprietary data • Easy integration with Google App Engine • Powers the infamous YUIL (4 hour search) project. • Fast prototyping with minimal code
BOSSY Code on BOSS Mashup Platform

    __author__ = "Vik Singh (viksi@yahoo-inc.com)"

    from yos.util import text, typechecks
    from yos.yql import db
    from yos.boss import ysearch

    def month_lookup(s):
        for m in ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sept", "oct", "nov", "dec"]:
            if s.startswith(m):
                return m

    def parse_month(s):
        months = filter(lambda m: m is not None, map(month_lookup, text.uniques(s)))
        if len(months) > 0:
            return text.norm(months[0]).capitalize()

    def parse_year(s):
        years = filter(lambda t: len(t) == 4 and typechecks.is_int(t), text.uniques(s))
        if len(years) > 0:
            return text.norm(years[0])
Location Based Relevancy • Where am I? • Where am I going? • What can I find? Map generated by FirePin application on iPhone
Location Based Relevancy • Fire Eagle: Standardized location and sharing platform • Live location tracking • Find upcoming traffic cameras, landmarks, restaurants, headlines, photos, twitter buzz, etc… • Shared locations with friends • Mining Interesting Locations and Travel Sequences from GPS Trajectories for Mobile Users by Yu Zheng, Lizhu Zhang, Xing Xie and Wei-Ying Ma
Secondary Sources: Wikipedia, Craigslist, Government Data… • Multiple sources to increase relevance • DuckDuckGo.com = BOSS + Wikipedia (and other services) • Understanding User's Query Intent with Wikipedia by Jian Hu, Gang Wang, Fred Lochovsky and Zheng Chen - www2009 conference • OpenData: DataMob.org, TheInfo.org, InfoChimps.org
Real Time Events • Tweet News: Twitter + News Search • Twitter users share most timely articles • Relevancy highlights tweeted stories
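The idea of letting tweets influence news ranking can be sketched as a simple re-sort. This is not how Tweet News is implemented; it is a hypothetical illustration that assumes the news results are already dicts with a `url` key and that tweet counts per URL were gathered separately. The name `rank_by_tweets` is made up for this example.

```python
def rank_by_tweets(news_results, tweet_counts):
    """Move the most-tweeted stories to the top.

    news_results: list of dicts with a "url" key (assumed shape).
    tweet_counts: {url: number of tweets}, collected elsewhere (assumed).
    Stories with equal counts keep their original relevance order,
    because Python's sort is stable.
    """
    return sorted(
        news_results,
        key=lambda result: tweet_counts.get(result["url"], 0),
        reverse=True,
    )


if __name__ == "__main__":
    results = [{"url": "a"}, {"url": "b"}, {"url": "c"}]
    print(rank_by_tweets(results, {"b": 5, "c": 2}))
```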
Internal + External Data Sources • Tech Crunch Search: BOSS + Access to proprietary data • Create custom tables in YQL • BOSS “Vertical Lens” defines what internal data BOSS should index as well as your preferred external sources.
Offline Analysis • Coloralo • requests extra images • caches them • analyzes them for relevancy • Coloralo finds coloring book images.
Quick and Easy Semantic Search • Limit your results to sites with microformats or RDF data: searchmonkeyid:com.yahoo.page.uf.hreview • Request structured data, keyterms, and Delicious data from BOSS: view=keyterms,searchmonkey_feed,searchmonkey_rdf,delicious_toptags,delicious_saves • Sample request: http://boss.yahooapis.com/ysearch/web/v1/cocorosie+searchmonkeyid:com.yahoo.page.uf.hreview?appid=YourAppId&format=xml&start=0&count=15&view=keyterms%2Csearchmonkey_feed%2Csearchmonkey_rdf%2Cdelicious_toptags
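The sample request can be rebuilt from its parts, which makes it easier to swap the query or the list of views. The sketch below is illustrative only: the function name `semantic_search_url` and its argument names are assumptions, while the host, path, parameters, and values come from the sample request on this slide.

```python
from urllib.parse import quote, urlencode


def semantic_search_url(query, appid, views, searchmonkey_id, start=0, count=15):
    """Rebuild the slide's sample request from its parts (hypothetical helper)."""
    # The searchmonkeyid: filter is appended to the query itself.
    terms = f"{query} searchmonkeyid:{searchmonkey_id}"
    path = quote(terms, safe=":").replace("%20", "+")
    params = [
        ("appid", appid),
        ("format", "xml"),
        ("start", start),
        ("count", count),
        # urlencode turns the commas into %2C, as in the sample request.
        ("view", ",".join(views)),
    ]
    return "http://boss.yahooapis.com/ysearch/web/v1/" + path + "?" + urlencode(params)


if __name__ == "__main__":
    print(
        semantic_search_url(
            "cocorosie",
            "YourAppId",
            ["keyterms", "searchmonkey_feed", "searchmonkey_rdf", "delicious_toptags"],
            "com.yahoo.page.uf.hreview",
        )
    )
```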
Inurl and Intitle Hacks • Use your favorite search engine hacks with BOSS. • Most of the advanced search tricks that work on a search results page (SERP) will work with your BOSS requests. • This does not include patterns specific to Google, Yahoo!, or another search site, such as !sports
Website Description • Get a more complete picture of a target web site by combining multiple requests • Find the number of external sites linking to the site: /ysearch/se_inlink/v1/{site}?omit_inlinks=domain • Find the pages within the site: /ysearch/se_pagedata/v1/{site}? • Find related web pages: /ysearch/web/v1/related:{site}?view=delicious_saves,delicious_toptags
Filter News by Time • Older, less timely articles may have more natural relevancy. Control this by selecting the age range for news articles. • Use orderby=date to show latest instead of most relevant. • What happened while you were asleep: /ysearch/news/v1/{query}?age=9h&orderby=date • Limit news articles to 1-7 days old: /ysearch/news/v1/{query}?age=1d-7d
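The age filter is easy to get wrong by hand, so a small builder with validation can help. This sketch assumes only the two forms shown above (a single age such as 9h, or a range such as 1d-7d) and only the h and d units; BOSS may accept other units that these slides do not mention. The name `news_search_path` is hypothetical.

```python
import re
from urllib.parse import quote, urlencode

# Only the forms seen in the slides: "9h" or "1d-7d", with units h and d.
_AGE_PATTERN = re.compile(r"^\d+[hd](-\d+[hd])?$")


def news_search_path(query, age, orderby=None):
    """Build the path and query string for a time-filtered news search."""
    if not _AGE_PATTERN.match(age):
        raise ValueError(f"unsupported age value: {age!r}")
    params = {"age": age}
    if orderby:
        # orderby=date shows the latest articles instead of the most relevant.
        params["orderby"] = orderby
    return "/ysearch/news/v1/" + quote(query, safe="") + "?" + urlencode(params)


if __name__ == "__main__":
    print(news_search_path("golf", "9h", orderby="date"))
    print(news_search_path("golf", "1d-7d"))
```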
Vertical Focus • Vertical Search Engines already have a niche audience. • Limit searches to appropriate sites: InsiderFood • Truevert builds a model of word relations in the context of its niche: the environment.
Go Beyond the Web Site • Desktop: Xobni for Outlook • Tools: Zemanta finds related information for blogs and emails • Modular: Create an application for Facebook, Yahoo, MySpace and more with the Open Social standard.
Go from Search to Action • Keyword Finder uses BOSS keyterms to return the top 10 keywords used by successful sites for a query • Bossy returns a single answer to questions. Where is Big Ben? London.
Resources • Yahoo! BOSS: http://developer.yahoo.com/boss • BOSS Mashup Framework: http://developer.yahoo.com/search/boss/mashup.html • YQL: http://developer.yahoo.com/yql • Fire Eagle: http://developer.yahoo.com/fireeagle/ • Google App Engine: http://appengine.google.com • Amazon Web Services: http://aws.amazon.com • oAuth: http://oauth.net/ • Open Social: http://www.opensocial.org/ • Open Data: http://theinfo.org • Alt Search Engines: http://www.altsearchengines.com/ • BOSS Hacks: http://bosshacks.com • Add your hack to http://www.bosshacks.com/hacks/open-hack-day-london-2009