Semantic Search Engines – On the Way to Web 3.0
Ariel Frank, Department of Computer Science, Bar-Ilan University
ariel@cs.biu.ac.il
A. Frank
Contents
• Web 3.0 & Semantic Search
• General Search
• "Natural Language" Search
• Vertical Search
• "Social Networking" Search
• Personalized Search
What is Web 2.0?! Open Gardens blog, Ajit Jaokar
http://opengardensblog.futuretext.com/archives/2005/12/mobile_Web_20_w.html
“The good, the bad and the …”
Web 1.0, Web 2.0, Web 3.0, Web X.0…
Semantic Search
• Syntactic search can match the query against:
  • an index of the textual content of the resources
  • URIs (URLs, URNs) in the system
  • literals in the RDF metadata
  • or a combination of these, possibly using exact, prefix or substring match, stemming, or minimal edit distance
• Semantic search, in addition to syntactic search, can use:
  • an index of the meaning of sentences in each resource
  • semantic information and analysis
  • the graph structure of the RDF metadata
  • or a combination of these, possibly using query expansion, classification/categorization, tagging, graph traversal, microformats, and RDF & OWL inferencing and reasoning
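To make the contrast concrete, here is a minimal Python sketch, not taken from the slides. It assumes a tiny in-memory term list and a hand-written synonym table (`SYNONYMS`, hypothetical) standing in for real semantic knowledge. The syntactic part covers exact, prefix, substring and edit-distance matching; the semantic part shows only query expansion layered on top.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def syntactic_match(query: str, term: str, max_edits: int = 1):
    """Return the kind of syntactic match between query and term, or None."""
    q, t = query.lower(), term.lower()
    if q == t:
        return "exact"
    if t.startswith(q):
        return "prefix"
    if q in t:
        return "substring"
    if edit_distance(q, t) <= max_edits:
        return "edit-distance"
    return None


# Hypothetical synonym table; a real engine would derive this from an
# ontology, a thesaurus or RDF/OWL reasoning.
SYNONYMS = {"car": {"automobile", "vehicle"}}


def expand_query(query: str) -> set:
    """Query expansion: the query itself plus its known synonyms."""
    q = query.lower()
    return {q} | SYNONYMS.get(q, set())


def semantic_search(query: str, terms: list) -> list:
    """Keep every term that syntactically matches any expanded query."""
    expanded = expand_query(query)
    return [t for t in terms if any(syntactic_match(q, t) for q in expanded)]
```

With the term list `["car", "cart", "automobile", "bicycle"]`, a purely syntactic search for "car" finds "car" and "cart", while `semantic_search` also finds "automobile" through the expanded query.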
Can Semantic SEs answer this :-?)
Types/Examples of Semantic SEs
• General Search: MetaWeb Freebase, Yahoo! Microsearch, …
• "Natural Language" Search: Powerset, Hakia, AskMeNow AskWiki, …
• Vertical Search: Kango, AdaptiveBlue, ReportLinker, …
• "Social Networking" Search: SemantiNet, Delver, Google Social Graph API, …
• Personalized Search: Twine, MavinIT PSS, …
MetaWeb Technologies - Freebase
• Based in San Francisco, MetaWeb Technologies was spun out of Applied Minds in July 2005.
• Goal: build a better infrastructure for Web application developers and publishers.
Freebase Rationale
• An open, shared database of the world’s knowledge: it collects data from the Web into a massive, collaboratively edited database of cross-linked data.
• It is built by the community, for the community.
• Free for anyone to query, contribute to, build applications on top of, or integrate into their Web sites.
• The focus is on organizing and managing complex data structures using Semantic Web technologies.
• Enables the extraction of ordered knowledge from the information chaos that is the current Web.
Freebase
Freebase Repository
• Covers millions of topics in hundreds of categories.
• Draws from large open repositories like Wikipedia, MusicBrainz, and the SEC archives.
• Contains structured information on many popular topics, like movies, music, people and locations, all reconciled and freely available via an open API.
• Freebase information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything relevant.
Domains and Types
Google Company
Freebase Help Center
Freebase Semantics
• Freebase spans domains, but requires that a particular topic exist only once, even if it might normally be found in multiple databases.
• For example, Arnold Schwarzenegger would appear in a movie database as an actor, a political database as a governor and a bodybuilder database as a Mr. Universe.
• In Freebase, there is only one topic for Arnold Schwarzenegger, with all three facets of his public persona brought together.
• The unified topic acts as an information hub, making it easy to find and contribute information about him.
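The "one topic, many facets" idea can be illustrated with a small Python sketch. It is only an illustration, not Freebase's actual data model: the record layout and the `reconcile` helper are hypothetical, and matching on the exact name is a deliberate simplification of real identity reconciliation.

```python
from collections import defaultdict

# Hypothetical records, shaped as they might appear in three separate databases.
records = [
    {"source": "movies", "name": "Arnold Schwarzenegger", "role": "actor"},
    {"source": "politics", "name": "Arnold Schwarzenegger", "role": "governor"},
    {"source": "bodybuilding", "name": "Arnold Schwarzenegger", "role": "Mr. Universe"},
]


def reconcile(records):
    """Merge records that share a name into a single topic with many facets."""
    topics = defaultdict(lambda: {"facets": {}})
    for rec in records:
        # Each source database contributes one facet to the unified topic.
        topics[rec["name"]]["facets"][rec["source"]] = rec["role"]
    return dict(topics)


topics = reconcile(records)
```

After reconciliation, `topics` holds a single entry for Arnold Schwarzenegger whose `facets` dictionary carries the actor, governor and Mr. Universe roles side by side, which is what lets the topic act as a hub.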
Arnold Schwarzenegger (1)
Arnold Schwarzenegger (2)
Freebase Dynamics
• For users who are developers, or just mildly technical, Freebase offers tools that make it easy to query the data and integrate it into Web applications, blogs, wikis, user pages or anything else that would benefit from an injection of structured information.
• In addition to reconciling many facets of one topic, the underlying structure of Freebase lets the user run more complex queries.
• For example, when Freebase is asked for films starring Jennifer Connelly and actors who have appeared in Steven Spielberg movies, it returns a list of 8 movies.
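A compound query of this kind can be sketched in Python over a toy in-memory data set. Everything here is an assumption for illustration: the film records use placeholder names rather than real filmographies, and `actors_directed_by` and `films_pairing` are hypothetical helpers, not part of any Freebase API.

```python
# Toy, made-up film records; the names are placeholders, not real data.
films = [
    {"title": "Film A", "director": "Director X", "cast": {"Star S", "Actor 1"}},
    {"title": "Film B", "director": "Director D", "cast": {"Actor 1", "Actor 2"}},
    {"title": "Film C", "director": "Director X", "cast": {"Star S", "Actor 3"}},
    {"title": "Film D", "director": "Director D", "cast": {"Actor 4"}},
]


def actors_directed_by(films, director):
    """All actors who appear in at least one film by the given director."""
    return set().union(*(f["cast"] for f in films if f["director"] == director))


def films_pairing(films, star, director):
    """Films starring `star` together with someone who worked with `director`."""
    co_stars = actors_directed_by(films, director) - {star}
    return sorted(
        f["title"]
        for f in films
        if star in f["cast"] and f["cast"] & co_stars
    )


result = films_pairing(films, "Star S", "Director D")
```

The query joins two relations (film to cast, film to director) in one pass, which is straightforward over structured records but hard to express as a keyword search over articles.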
…Films starring Jennifer Connelly
Freebase vs. Wikipedia
• The difference lies in the way they store information.
• Wikipedia arranges information in the form of articles.
• Freebase lists facts and statistics. Its list form is good not only for people who like to glance at facts, but also for people who want to use the data to build other Web sites and software. (Information in article form can’t be reused in the same way.)
• Topics covered by Freebase include subjects that are too obscure for Wikipedia, which strives for notability appropriate to an encyclopedia.
Powerset
• Powerset is a Silicon Valley company.
• Goal: build a transformative consumer search engine based on Natural Language Processing (NLP).
Powerset Rationale
• Unlike conventional search engines that use keywords, Powerset reads and understands every sentence on a Web page and allows asking questions in plain English.
• Its innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language.
• Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search.
• By making search more natural and intuitive, Powerset aims to fundamentally change how we search the Web and to deliver higher-quality results.
Who proved Fermat’s last theorem?
What did Bush say about Gore?
Powerlabs
• Powerlabs is a community where users can:
  • interact with demonstrations of Powerset’s technology before the search engine launches in 2008
  • give feedback to help improve the "Natural Language" indexing
  • suggest ideas for the ideal search engine.
• Involving users on this scale, and this early in development, recognizes the potential of the wisdom of crowds to guide Powerset.
Powerlabs Sign In
Wiki Search Sneak Peek
• Access to the first open search box covering Wikipedia.
• Powerset uses linguistic analyses of both the query and Wikipedia to find the best matches.
• The Miniviewer lets users view highlighted matches in the context of a Wikipedia article without ever having to leave the results page.
• By incorporating semantic information from Powerset’s indexing process into republished Wiki pages, internal page search enables a whole new kind of search: semantic-search-within-the-page.
Explore Wikipedia
Google acquire something
Google acquire company
Search Wikipedia
Companies acquired in 2001
Powerset PowerMouse
• PowerMouse is an application that provides a view into Powerset’s technology, letting users examine how structured information is extracted from open text.
• It is not intended as a search application per se, but it lets users search for and navigate through facts encoded in Powerset’s Wikipedia index.
• It shows in dramatic fashion how compactly large amounts of data can be organized and displayed based on a few semantic relationships.
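Queries such as "Google acquire something" read like subject-relation-object patterns with a wildcard slot. The Python sketch below shows that idea over a handful of toy facts. It is an assumption about the general technique, not a description of Powerset's index: `FACTS`, `WILDCARD` and `match` are hypothetical names, and the facts were written by hand rather than extracted from text.

```python
# Toy subject-relation-object facts, written by hand for illustration.
FACTS = [
    ("Google", "acquire", "YouTube"),
    ("rabbit", "eats", "carrot"),
    ("horse", "eats", "carrot"),
]

# The word that stands for "any value" in a query pattern (an assumption).
WILDCARD = "something"


def match(pattern, facts=FACTS):
    """Return the facts whose three slots all agree with the pattern.

    A slot agrees when the pattern holds the wildcard or the same word,
    compared case-insensitively.
    """
    return [
        fact
        for fact in facts
        if all(p == WILDCARD or p.lower() == v.lower()
               for p, v in zip(pattern, fact))
    ]
```

For instance, `match(("something", "eats", "carrot"))` returns both the rabbit and the horse facts. A typed slot such as "person" in "person won nobel" would additionally need a type table for entities, which this sketch leaves out.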
PowerMouse Examples
Google acquire something
something eats carrot
person won nobel
Kango
• Vertical semantic search engine for personalized travel information.
• Goal: be the first step in deciding where to go, where to stay or what to do; find the trip that is right for you.
Kango Rationale
• Kango indexes the collective wisdom on travel from the entire Web.
• Recommendations are based on a gestalt of voices heard in over 20 million reviews, ratings, blogs, journals, and articles collected from over a thousand sources such as Web sites, books and magazines.
• Organizes and presents the most relevant opinions and product details in a "federated" search display based on what is known about the user's travel preferences.
Kango Repository
• Kango has scoured the Web to collect all kinds of places to go, things to do and places to stay.
• It then analyzed and organized millions of travelers' opinions to enable search based on exact travel requirements and preferences.
• Kango brings together:
  • more than a thousand sites
  • 400,000 lodging, activity and destination options
  • 20 million reviews, ratings and blogs.
How Kango Works
Kango Semantics
• It provides many options for specifying a trip.
• Kango thinks about those options in terms of the "Long Tail" concept to help make trips distinct and memorable.
• It "understands" travel lingo, so it helps users make informed decisions about what best fits their specific travel preferences.
• Kango is creating an ontology of global travel content that includes ranking of superlatives within review sites.
Lodging
Things to Do
Kango Dynamics
• Enables new ways of filtering its collection to surface the recommendations most relevant to the traveler's preferences and priorities.
• Based on who the user is traveling with, the kind of destination sought, and the activities likely to be done, it sifts through its information to deliver the right getaway.
• For example, it returns:
  • one set of hotel and activity recommendations when traveling to Monterey for a romantic getaway
  • a different set when going to Monterey with the family to visit the aquarium and hang out on the beaches.
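Preference-based filtering of this sort can be sketched in Python as tag overlap scoring. This is a minimal illustration under stated assumptions, not Kango's method: the `Place` class, the `CATALOG` entries and their tags are all hypothetical, and a real system would derive such tags from analyzed reviews.

```python
from dataclasses import dataclass, field


@dataclass
class Place:
    name: str
    destination: str
    tags: set = field(default_factory=set)


# Hypothetical catalog; names and tags are made up for illustration.
CATALOG = [
    Place("Hotel A", "Monterey", {"romantic", "quiet"}),
    Place("Hotel B", "Monterey", {"family", "beach"}),
    Place("Aquarium visit", "Monterey", {"family", "aquarium"}),
    Place("Hotel C", "Elsewhere", {"romantic"}),
]


def recommend(catalog, destination, preferences):
    """Rank places at the destination by how many preference tags they share."""
    scored = [
        (len(p.tags & preferences), p.name)
        for p in catalog
        if p.destination == destination
    ]
    # Highest overlap first; ties broken alphabetically; zero overlap dropped.
    ranked = sorted(scored, key=lambda s: (-s[0], s[1]))
    return [name for score, name in ranked if score > 0]


romantic = recommend(CATALOG, "Monterey", {"romantic"})
family = recommend(CATALOG, "Monterey", {"family", "aquarium", "beach"})
```

The same destination yields two different result sets: `romantic` contains only "Hotel A", while `family` contains "Aquarium visit" and "Hotel B".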