Making Sense of the Semantic Web Nova Spivack CEO & Founder Radar Networks
About This Talk • Making sense of the semantic sector • Making the Semantic Web more useable • Future outlook • Twine.com • Q & A
The social graph just connects people. The semantic graph connects everything… • People • Companies • Emails • Places • Products • Interests • Services • Web Pages • Activities • Documents • Projects • Events • Multimedia • Groups • The Big Opportunity… • Better search • More targeted ads • Smarter collaboration • Deeper integration • Richer content • Better personalization
The third decade of the Web • A period in time, not a technology… • Enrich the structure of the Web • Improve the quality of search, collaboration, publishing, advertising • Enable applications to become more integrated and intelligent • Transform the Web from fileserver to database • Semantic technologies will play a key role
The Intelligence is in the Connections • [Figure: timeline chart with two axes, "Connections between people" and "Connections between Information"] • 1980 - 1990: PC Era • 1990 - 2000: Web 1.0 • 2000 - 2010: Web 2.0 (Social Web) • 2010 - 2020: Web 3.0 (Semantic Web) • 2020 - 2030: Web 4.0 (Intelligent Web) • Items plotted on the chart: The PC, The Internet, The Web, Web OS, FTP, USENET, Email, IRC, PCs, File Systems, File Servers, Windows, MacOS, SQL, Databases, SGML, Groupware, MMOs, BBS, Gopher, Websites, HTTP, HTML, Java, Directory Portals, Keyword Search, Lightweight Collaboration, Wikis, VR, SaaS, Social Networking, Javascript, Flash, SOAP, XML, Weblogs, Social Media Sharing, Office 2.0, P2P, RSS, ATOM, Widgets, Mashups, AJAX, OpenID, RDF, SPARQL, OWL, SWRL, Semantic Search, Semantic Databases, Distributed Search, Intelligent personal agents
Beyond the Limits of Keyword Search • [Figure: chart of "Productivity of Search" against "Amount of data"] • 1980 - 1990: PC Era (The Desktop) • 1990 - 2000: Web 1.0 (The World Wide Web) • 2000 - 2010: Web 2.0 (The Social Web) • 2010 - 2020: Web 3.0 (The Semantic Web) • 2020 - 2030: Web 4.0 (The Intelligent Web) • Search techniques plotted: Files & Folders, Databases, Directories, Keyword search, Tagging, Natural language search, Semantic Search, Reasoning
A Higher Resolution Web • [Figure: graph of typed entities joined by labeled relationships] • Entities: IBM.com (Web Site), IBM (Company), Palo Alto (City), Joe (Person), Sue (Person), Jane (Person), Bob (Person), Dave (Person), Dave.com (RSS Feed), Dave.com (Weblog), Coldplay (Band), Design Team (Group), Stanford Alumnae (Group), 123.JPG (Photo) • Relationships: Lives in, Employee of, Publisher of, Subscriber to, Fan of, Friend of, Married to, Member of, Author of, Source of, Depiction of
Five Approaches to Semantics • Tagging • Statistics • Linguistics • Semantic Web • Artificial Intelligence
The Tagging Approach • Pros: Easy for users to add and read tags; Tags are just strings; No algorithms or ontologies to deal with; No technology to learn • Cons: Easy for users to add and read tags; Tags are just strings; No algorithms or ontologies to deal with; No technology to learn • Examples: Technorati, Del.icio.us, Flickr, Wikipedia
The Statistical Approach • Pros: Pure mathematical algorithms; Massively scalable; Language independent • Cons: No understanding of the content; Hard to craft good queries; Best for finding really popular things, not good at finding needles in haystacks; Not good for structured data • Examples: Google, Lucene, Autonomy
The Linguistic Approach • Pros: True language understanding; Extract knowledge from text; Best for searching for particular facts or relationships; More precise queries • Cons: Computationally intensive; Difficult to scale; Lots of errors; Language-dependent • Examples: Powerset, Hakia, Inxight, Attensity, and others…
The Semantic Web Approach • Pros: More precise queries; Smarter apps with less work; Not as computationally intensive; Share & link data between apps; Works for both unstructured and structured data • Cons: Lack of tools; Difficult to scale; Who makes all the metadata? • Examples: Radar Networks, DBpedia Project, Metaweb
The Artificial Intelligence Approach • Pros: Smart in narrow domains; Answer questions intelligently; Reasoning and learning • Cons: Computationally intensive; Difficult to scale; Extremely hard to program; Does not work well outside of narrow domains; Training takes a lot of work • Examples: Cycorp
The Approaches Compared • [Figure: chart with two axes, "Make the data smarter" and "Make the software smarter"] • Approaches plotted: Tagging, Statistics, Linguistics, Semantic Web, A.I.
Two Paths to Adding Semantics • “Bottom-Up” (Classic) • Add semantic metadata to pages and databases all over the Web • Every Website becomes semantic • Everyone has to learn RDF/OWL • “Top-Down” (Contemporary) • Automatically generate semantic metadata for vertical domains • Create services that provide this as an overlay to non-semantic Web • Nobody has to learn RDF/OWL -- Alex Iskold
In Practice: Hybrid Approach Works Best • [Figure: diagram combining the approaches along "Top-down" and "Bottom-up" directions] • Approaches shown: Tagging, Semantic Web, Statistics, Linguistics, Artificial intelligence
The Semantic Web is a Key Enabler • Moves the “intelligence” out of applications, into the data • Data becomes self-describing; Meaning of data becomes part of the data • Apps can become smarter with less work, because the data carries knowledge about what it is and how to use it • Data can be shared and linked more easily
The Semantic Web = Open database layer for the Web • [Figure: layered diagram] • What sits on top of the layer: User Profiles, Web Content, Ads & Listings, Data Records, Apps & Services • What the layer provides: Open Query Interfaces, Open Data Mappings, Open Data Records, Open Rules, Open Ontologies
Semantic Web Open Standards • RDF – Store data as “triples” • OWL – Define systems of concepts called “ontologies” • SPARQL – Query data in RDF • SWRL – Define rules • GRDDL – Transform data to RDF
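To make "query data in RDF" a little more concrete, here is a minimal sketch in plain Python. It is not SPARQL and uses no RDF library; the `match` helper and the `TRIPLES` list are hypothetical names made up for illustration. It only shows the core idea behind a basic query pattern: fix some positions of a triple and leave the others open.

```python
from typing import Iterable, Iterator, Optional, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Illustrative data only, written as plain strings instead of real URIs.
TRIPLES = [
    ("002", "isA", "Person"),
    ("002", "firstName", "Nova"),
    ("002", "lastName", "Spivack"),
    ("002", "hasColleague", "003"),
    ("003", "isA", "Person"),
    ("003", "firstName", "Chris"),
]


def match(
    triples: Iterable[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield every triple that agrees with the fixed positions.

    A position left as None acts like a query variable: it matches anything.
    """
    for subj, pred, obj in triples:
        if s is not None and subj != s:
            continue
        if p is not None and pred != p:
            continue
        if o is not None and obj != o:
            continue
        yield (subj, pred, obj)


# "Give me every first name", i.e. the pattern (?, firstName, ?).
first_names = [obj for _, _, obj in match(TRIPLES, p="firstName")]
print(first_names)
```

A real SPARQL engine adds joins across several patterns, named variables, filters and much more; this sketch covers a single pattern only.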
RDF “Triples” • [Figure: Subject → Predicate → Object] • the subject, which is an RDF URI reference or a blank node • the predicate, which is an RDF URI reference • the object, which is an RDF URI reference, a literal or a blank node • Source: http://www.w3.org/TR/rdf-concepts/#section-triples
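The three rules above can be sketched as a small data structure. This is an illustration under assumptions, not an RDF library: the class names `URIRef`, `BlankNode`, `Literal` and `Triple` and the `example.org` URIs are invented here, and literals are simplified to plain strings with no datatype or language tag.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BlankNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str  # simplified: no datatype or language tag


@dataclass(frozen=True)
class Triple:
    subject: Union[URIRef, BlankNode]
    predicate: URIRef
    obj: Union[URIRef, Literal, BlankNode]

    def __post_init__(self) -> None:
        # Subject: a URI reference or a blank node, never a literal.
        if not isinstance(self.subject, (URIRef, BlankNode)):
            raise TypeError("subject must be a URIRef or a BlankNode")
        # Predicate: always a URI reference.
        if not isinstance(self.predicate, URIRef):
            raise TypeError("predicate must be a URIRef")
        # Object: a URI reference, a literal, or a blank node.
        if not isinstance(self.obj, (URIRef, Literal, BlankNode)):
            raise TypeError("object must be a URIRef, Literal or BlankNode")


# Hypothetical URIs, for illustration only.
name_triple = Triple(
    URIRef("http://example.org/person/002"),
    URIRef("http://example.org/vocab/firstName"),
    Literal("Nova"),
)
```

Passing a `Literal` as the subject raises `TypeError`, which mirrors the rule that only the object position may hold a literal.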
Semantic Web Data is Self-Describing Linked Data • [Figure: a Data Record whose ID and fields each link to a Definition held in Ontologies] • Data Record: ID, Field 1 (Value), Field 2 (Value), Field 3 (Value), Field 4 (Value) • Ontologies: one Definition linked from each part of the record
RDBMS vs Triplestore

Person Table (RDBMS)
ID    f_name   l_name
001   jim      wissner
002   nova     spivack
003   chris    jones
004   lew      tucker

Colleagues Table (RDBMS)
SRC-ID   TGT-ID
001      001
001      002
001      003
001      004
002      001
002      002
002      003
002      004
003      001
003      002
003      003
003      004
004      001
004      002
004      003
004      004

Triplestore (S = Subject, P = Predicate, O = Object)
S     P              O
001   isA            Person
001   firstName      Jim
001   lastName       Wissner
001   hasColleague   002
002   isA            Person
002   firstName      Nova
002   lastName       Spivack
002   hasColleague   003
003   isA            Person
003   firstName      Chris
003   lastName       Jones
003   hasColleague   004
004   isA            Person
004   firstName      Lew
004   lastName       Tucker
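One way to read this slide is as a mechanical conversion: each row becomes a subject, each column becomes a predicate, and each cell becomes an object. The sketch below assumes that reading. The function name `rows_to_triples` and the column-to-predicate mapping are hypothetical, and only the three colleague links shown on the triplestore side are used rather than all sixteen rows of the Colleagues table.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Rows of the Person table from the slide.
PERSON_ROWS: List[Dict[str, str]] = [
    {"ID": "001", "f_name": "jim", "l_name": "wissner"},
    {"ID": "002", "f_name": "nova", "l_name": "spivack"},
    {"ID": "003", "f_name": "chris", "l_name": "jones"},
    {"ID": "004", "f_name": "lew", "l_name": "tucker"},
]

# (SRC-ID, TGT-ID) pairs; only the links that appear in the triplestore view.
COLLEAGUE_ROWS: List[Tuple[str, str]] = [
    ("001", "002"),
    ("002", "003"),
    ("003", "004"),
]

# Assumed mapping from column names to predicate names.
COLUMN_TO_PREDICATE = {"f_name": "firstName", "l_name": "lastName"}


def rows_to_triples(
    person_rows: List[Dict[str, str]],
    colleague_rows: List[Tuple[str, str]],
) -> List[Triple]:
    """Flatten both tables into one list of (S, P, O) triples."""
    triples: List[Triple] = []
    for row in person_rows:
        subject = row["ID"]
        # The table a row lives in becomes an explicit type statement.
        triples.append((subject, "isA", "Person"))
        for column, predicate in COLUMN_TO_PREDICATE.items():
            triples.append((subject, predicate, row[column]))
    # A join table collapses into one predicate between two subjects.
    for src, tgt in colleague_rows:
        triples.append((src, "hasColleague", tgt))
    return triples


triples = rows_to_triples(PERSON_ROWS, COLLEAGUE_ROWS)
for triple in triples:
    print(triple)
```

Note that the schema (table and column names) ends up inside the data as predicates, which is the point the slide is making.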
Merging Databases in RDF is Easy • [Figure: several Subject-Predicate-Object (S P O) tables combined into one]
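A rough sketch of why the merge is easy: if two sources both store (S, P, O) triples, combining them is a set union, with no schema migration step. The big assumption, stated loudly, is that both sources already use the same identifiers for the same things; in practice, agreeing on identifiers is the hard part. The names `DB_A`, `DB_B` and `describe` are made up for this example.

```python
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# Two hypothetical sources that know different facts about subject "002".
DB_A: Set[Triple] = {
    ("002", "isA", "Person"),
    ("002", "firstName", "Nova"),
    ("002", "hasColleague", "003"),
}
DB_B: Set[Triple] = {
    ("002", "isA", "Person"),        # same fact as in DB_A
    ("002", "lastName", "Spivack"),
    ("003", "firstName", "Chris"),
}

# Merging is a set union: identical triples collapse into one.
merged: Set[Triple] = DB_A | DB_B


def describe(triples: Set[Triple], subject: str) -> Dict[str, str]:
    """Collect predicate -> object for one subject.

    Simplification: a predicate with several values keeps only one of them.
    """
    return {p: o for s, p, o in triples if s == subject}


print(len(merged))
print(describe(merged, "002"))
```

Neither source had to know about the other's predicates in advance, which is the contrast with merging two relational schemas.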
The Web IS the Database! • [Figure: the entity graph from the "A Higher Resolution Web" slide, shared by Application A and Application B] • Entities: IBM.com (Web Site), IBM (Company), Palo Alto (City), Joe (Person), Sue (Person), Jane (Person), Bob (Person), Dave (Person), Dave.com (RSS Feed), Dave.com (Weblog), Coldplay (Band), Design Team (Group), Stanford Alumnae (Group), 123.JPG (Photo) • Relationships: Lives in, Employee of, Publisher of, Subscriber to, Fan of, Friend of, Married to, Member of, Author of, Source of, Depiction of
Are RDF/OWL the Only Way to Express Semantics? • Other contenders: • String tags • Taxonomies and controlled vocabularies • Microformats • Ad hoc [name, value] pairs • Alternative semantic metadata notations
One Semantic Web or Many? • The answer is….Both • The Semantic Web is a web of semantic webs • Each of us may have our own semantic web…
Why has it Taken So Long? • The Dream of the Semantic Web has been slow to arrive • The original vision was too focused on A.I. • Technologies and tools were insufficient • Needs for open data on the Web were not strong enough • Keyword search and tagging were good enough…for a while • Lack of end-user facing killer apps • Lots of misunderstanding to clear up
Crossing the Chasm… • Communicating the vision • Focus on open data, not A.I. • Technology progress • Standards & tools finally maturing • Needs were not strong enough • Keyword search and tagging not as productive anymore • Apps need better way to share data • Killer apps and content • Several companies are starting to expose data to the Semantic Web. Soon there will be a lot of data. • Market Education • Show the market what the benefits are
Future Outlook • 2007 – 2009 • Early-Adoption • A few killer apps emerge • Other apps start to integrate • 2010 – 2020 • Mainstream Adoption • Semantics widely used in Web content and apps • 2020 + • Next big cycle: Reasoning and A.I. • The Intelligent Web • The Web learns and thinks collectively
The Future of the Platform… • 1980’s -- The desktop is the platform • 1990’s -- The browser is the platform • 2000’s -- The Web is the platform • 2010’s -- The Graph is the platform • 2020’s -- The network is the platform • 2030’s -- The body is the platform…?
What is Twine? • Twine is a new service for managing & sharing information on the Web • Works for content, knowledge, data, or any other kinds of information • Designed for individuals and groups that need a better way to organize, search, share and keep track of their information
How Twine Works • Collect or author structured or unstructured information into Twine via email, the Web or the desktop • Twine creates a knowledge web automatically • Understands, tags & links information automatically • Automatically does further research for you on the Web • Organizes information automatically • Provides semantic search, discovery & interest tracking • Helps you connect with other people & groups to grow and share knowledge webs around common interests
Use-Cases • Individuals • Collect & author information about interests • Share with your friends & colleagues • Find and discover things more relevantly • Groups & Teams • Manage content & knowledge related to common interests, goals, or activities • Leverage and contribute to collective intelligence • Collaborate more productively
Contact Info • Visit www.twine.com to sign up for the invite beta wait-list • You can email me at nova@radarnetworks.com • My blog is at http://www.mindingtheplanet.net • Thanks!
Rights • This presentation is licensed under the Creative Commons Attribution License. • Details: This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. • If you reproduce or redistribute in whole or in part, please give attribution to Nova Spivack, with a link to http://www.mindingtheplanet.net