Semantic Web
Tim Berners-Lee’s vision:
• Web as a means of collaboration for people
• Web as a means of collaboration for machines
The Semantic Web is a web of data that machines can “understand” too.

The Semantic Web Vision (1)
The entertainment system was belting out the Beatles' "We Can Work It Out" when the phone rang. When Pete answered, his phone turned the sound down by sending a message to all the other local devices that had a volume control. His sister, Lucy, was on the line from the doctor's office: "Mom needs to see a specialist and then has to have a series of physical therapy sessions. Biweekly or something. I'm going to have my agent set up the appointments." Pete immediately agreed to share the chauffeuring.
At the doctor's office, Lucy instructed her Semantic Web agent through her handheld Web browser. The agent promptly retrieved information about Mom's prescribed treatment from the doctor's agent, looked up several lists of providers, and checked for the ones in-plan for Mom's insurance within a 20-mile radius of her home and with a rating of excellent or very good on trusted rating services. It then began trying to find a match between available appointment times (supplied by the agents of individual providers through their Web sites) and Pete's and Lucy's busy schedules.

The Semantic Web Vision (2)
In a few minutes the agent presented them with a plan. Pete didn't like it. University Hospital was all the way across town from Mom's place, and he'd be driving back in the middle of rush hour. He set his own agent to redo the search with stricter preferences about location and time. Lucy's agent, having complete trust in Pete's agent in the context of the present task, automatically assisted by supplying access certificates and shortcuts to the data it had already sorted through.
Almost instantly the new plan was presented: a much closer clinic and earlier times, but there were two warning notes. First, Pete would have to reschedule a couple of his less important appointments. He checked what they were, not a problem. The other was something about the insurance company's list failing to include this provider under physical therapists: "Service type and insurance plan status securely verified by other means," the agent reassured him. "(Details?)"
Lucy registered her assent at about the same moment Pete was muttering, "Spare me the details," and it was all set. (Of course, Pete couldn't resist the details and later that night had his agent explain how it had found that provider even though it wasn't on the proper list.)

Difficulties for the SemWeb
• How is information represented on the actual Web?
  • As documents written in natural language
  • As graphs, pictures, tables, videos, and other multimedia
• Humans are good at:
  • deducing facts from partial (incomplete) information
  • creating associations between facts
  • aggregating information from several sources
• But machines:
  • cannot use partial (or incomplete) information
  • have difficulty aggregating several sources of information
  • can read but cannot “understand” information

Semantic Web (1998 – 2008)
[Figures: Semantic Web layers in 2001; Semantic Web layers in 2008]

Semantic Web Layers
• URI/IRI: Uniform Resource Identifier / Internationalized Resource Identifier
• XML: Extensible Markup Language
• RDF: Resource Description Framework
• RDFS: RDF Schema
• RIF: Rule Interchange Format
• SPARQL: SPARQL Protocol and RDF Query Language
• OWL: Web Ontology Language

What is needed for the SemWeb?
• The technologies shown in the previous picture.
• Existing data, which today is meaningful only to people, must be represented in a form that machines can process. This means annotating data with metadata (data about data).
• Ontologies: documents that define the relations among terms.
• Software agents that can process the data on behalf of humans, and automated web services that provide data.

XML (Extensible Markup Language)
• XML is a flexible text format that is widely used to structure, store, and transport data.
• Unlike HTML, XML is not about displaying data; it describes what the data is.
• Unlike HTML, XML has no predefined tags: you create your own tags to annotate data.
• XML is the basis of other languages and formats such as XHTML and RSS, and of the XML serializations of RDF and OWL.
• To learn XML, go to: http://www.w3schools.com/xml/

An XML Example
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>

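To show how a program might consume the bookstore document, here is a minimal sketch using Python's standard `xml.etree.ElementTree` module. The function name `read_books` and the choice of a list of dictionaries are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET

# The bookstore document from the slide, embedded as a string.
BOOKSTORE_XML = """
<bookstore>
  <book category="COOKING">
    <title lang="en">Everyday Italian</title>
    <author>Giada De Laurentiis</author>
    <year>2005</year>
    <price>30.00</price>
  </book>
  <book category="CHILDREN">
    <title lang="en">Harry Potter</title>
    <author>J K. Rowling</author>
    <year>2005</year>
    <price>29.99</price>
  </book>
</bookstore>
"""


def read_books(xml_text):
    """Return one dictionary per <book> element (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    books = []
    for book in root.findall("book"):
        books.append({
            "category": book.get("category"),    # attribute value
            "title": book.findtext("title"),     # text of a child element
            "author": book.findtext("author"),
            "year": int(book.findtext("year")),
            "price": float(book.findtext("price")),
        })
    return books


books = read_books(BOOKSTORE_XML)
for b in books:
    print(b["title"], "by", b["author"])
```

The tags carry structure, but the program still has to be told what `author` or `price` means. That gap is what RDF and ontologies try to close.
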
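RDF (Resource Description Framework)
• RDF is a W3C standard for describing resources on the Web.
• The meaning of data is encoded in sets of triples.
• Triples are “subject, predicate, object” statements.
• Subjects and predicates are identified by URIs; an object is either a URI or a literal value. (Subjects and objects may also be anonymous "blank nodes".)
• URIs represent both resources and relations.
• RDF can be written in XML (RDF/XML); other syntaxes, such as Turtle and N-Triples, also exist.
• RDF is to the Semantic Web what HTML was to the Web.
Example statement: Harry Potter has as author J. K. Rowling.
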
An RDF Example
Triple:
  subject:   http://en.wikipedia.org/wiki/Harry_Potter
  predicate: dc:creator
  object:    http://en.wikipedia.org/wiki/J._K._Rowling
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Harry_Potter">
    <dc:creator rdf:resource="http://en.wikipedia.org/wiki/J._K._Rowling"/>
  </rdf:Description>
</rdf:RDF>

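As a rough illustration of how the XML syntax maps to triples, the sketch below reads a well-formed RDF/XML rendering of the Harry Potter statement with Python's standard library and prints (subject, predicate, object) tuples. It is deliberately simplistic: the helper `extract_triples` is hypothetical and handles only `rdf:Description` elements whose properties are either `rdf:resource` links or plain text. A real application would use a full RDF parser.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Harry_Potter">
    <dc:creator rdf:resource="http://en.wikipedia.org/wiki/J._K._Rowling"/>
  </rdf:Description>
</rdf:RDF>
"""


def extract_triples(rdf_xml):
    """Toy extractor (assumption: only the simple RDF/XML shape shown above)."""
    root = ET.fromstring(rdf_xml)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # ElementTree stores qualified names as "{namespace}local".
            namespace, local = prop.tag[1:].split("}")
            predicate = namespace + local
            obj = prop.get(f"{{{RDF_NS}}}resource")
            if obj is None:
                # No rdf:resource attribute: treat the element text as a literal.
                obj = (prop.text or "").strip()
            triples.append((subject, predicate, obj))
    return triples


triples = extract_triples(RDF_XML)
for s, p, o in triples:
    print(s, p, o)
```

Note that the prefix `dc:` expands to a full URI, so the predicate itself is a Web resource that can be looked up and described.
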
Other RDF-related technologies
• RDFS supports the expression of structured vocabularies (classes, properties, and their hierarchies). It can be used to represent minimal ontologies.
• RDF triples are stored in dedicated repositories (triple stores). For an example, refer to openRDF.org.
• GRDDL (Gleaning Resource Descriptions from Dialects of Languages): a means to extract RDF from XML or XHTML documents.
• SPARQL: a query language for RDF data.

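The core idea behind querying RDF is matching triple patterns that contain variables. The sketch below is a toy matcher for a single pattern over an in-memory list of triples. It is not SPARQL and not any library's API. The function `match_pattern`, the abbreviated names such as `wiki:` and `ex:`, and the second triple (built from the bookstore example) are assumptions made for illustration.

```python
# Triples as (subject, predicate, object) tuples with abbreviated names.
# The "ex:" entries are hypothetical identifiers invented for this sketch.
TRIPLES = [
    ("wiki:Harry_Potter", "dc:creator", "wiki:J._K._Rowling"),
    ("ex:Everyday_Italian", "dc:creator", "ex:Giada_De_Laurentiis"),
]


def match_pattern(triples, pattern):
    """Return variable bindings for one triple pattern.

    Terms starting with '?' are variables; all other terms must match exactly.
    """
    results = []
    for triple in triples:
        bindings = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A repeated variable must bind to the same value each time.
                if bindings.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(bindings)
    return results


# Roughly the idea behind the SPARQL query:
#   SELECT ?book WHERE { ?book dc:creator <http://en.wikipedia.org/wiki/J._K._Rowling> }
by_rowling = match_pattern(TRIPLES, ("?book", "dc:creator", "wiki:J._K._Rowling"))
print(by_rowling)

all_creators = match_pattern(TRIPLES, ("?book", "dc:creator", "?author"))
print(all_creators)
```

A real SPARQL engine joins many such patterns, supports filters and optional parts, and runs against a triple store rather than a Python list.
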
Ontologies and OWL
• An ontology is an explicit description of things and their relations.
• OWL is a language for writing ontologies for the Web.
• OWL is built on top of RDF and RDFS and is commonly written in XML (RDF/XML).
• As a rough analogy, you can think of OWL as an object-oriented language that defines classes, hierarchies of classes, attributes, relations, etc.
• OWL is designed to support inference (subsumption and classification).
• OWL is more expressive than RDF and RDFS.

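To make "subsumption and classification" concrete, here is a minimal sketch that follows subclass links in a tiny hand-written class hierarchy. The class names (`Orange`, `Citrus`, `Fruit`, `Apple`) and the helper functions are hypothetical, and the code only computes a transitive closure. Actual OWL reasoners handle far richer constructs such as property restrictions, equivalence, and disjointness.

```python
# Direct "is a subclass of" assertions (hypothetical toy ontology).
SUBCLASS_OF = {
    "Orange": {"Citrus"},
    "Citrus": {"Fruit"},
    "Apple": {"Fruit"},
}


def superclasses(cls, subclass_of):
    """All classes reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def is_subsumed_by(sub, sup, subclass_of):
    """Subsumption: every instance of sub is also an instance of sup."""
    return sub == sup or sup in superclasses(sub, subclass_of)


def classify(asserted_class, subclass_of):
    """Classification: every class an individual of asserted_class belongs to."""
    return {asserted_class} | superclasses(asserted_class, subclass_of)


print(is_subsumed_by("Orange", "Fruit", SUBCLASS_OF))
print(sorted(classify("Orange", SUBCLASS_OF)))
```

Nobody stated that an orange is a fruit; the program derived it from two separate assertions. Inference of this kind is what lets agents combine data from different sources.
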
An Ontology Example
[Figure: fruit ontology. Source: http://www.sei.cmu.edu/isis/guide/gifs/fruit-ontology.gif]
• Visit http://protege.stanford.edu/ to learn about creating ontologies.

Friend of a Friend (FOAF)
RDF class Person
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Peter Parker</foaf:name>
    <foaf:gender>Male</foaf:gender>
    <foaf:title>Mr</foaf:title>
    <foaf:givenname>Peter</foaf:givenname>
    <foaf:family_name>Parker</foaf:family_name>
    <foaf:mbox_sha1sum>cf2f4bd069302febd8d7c26d803f63fa7f20bd82</foaf:mbox_sha1sum>
    <foaf:homepage rdf:resource="http://www.peterparker.com"/>
  </foaf:Person>
</rdf:RDF>

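The `foaf:mbox_sha1sum` property lets a profile identify a mailbox without publishing the address: in FOAF it holds the SHA-1 hash of the mailbox URI, including the `mailto:` prefix. A minimal sketch follows. The address `peter@example.org` is a made-up placeholder and is not the address behind the hash in the example above.

```python
import hashlib


def mbox_sha1sum(email):
    """SHA-1 hex digest of the mailbox URI (the 'mailto:' prefix is included)."""
    mailbox_uri = "mailto:" + email
    return hashlib.sha1(mailbox_uri.encode("utf-8")).hexdigest()


# Hypothetical address, used only to demonstrate the computation.
digest = mbox_sha1sum("peter@example.org")
print(digest)
```

Two FOAF documents that carry the same digest can be assumed to describe the same person, which is one simple way agents merge data from independent sources.
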
Repositories of SW data
• DBpedia: a community effort to extract structured information from Wikipedia
• UniProt: the universal protein resource, a central repository of protein data
• Bio2RDF: semantic web atlas of postgenomic knowledge
• GeoNames: geographical database

Semantic Web Search
A search engine for Semantic Web documents represented in RDF that provides services to software agents.

Semantic Applications
Source: http://www.readwriteweb.com/archives/10_semantic_apps_to_watch.php

Summary
• The Semantic Web is an ambitious vision with an uncertain future.
• Not all of the needed technologies are in place yet, but progress is steady.
• The biggest challenge is to convince people to make their data available in an annotated form (e.g., RDF).
• There are big research opportunities in the SemWeb:
  • automatically annotating data
  • creating and aligning ontologies
  • approximate and probabilistic reasoning
  • defining and implementing trust

Where to learn more
• W3C Semantic Web Activity: http://www.w3.org/2001/sw/
• Prof. James Hendler: http://www.cs.rpi.edu/~hendler/
• Prof. Steffen Staab: http://www.uni-koblenz.de/~staab/
• Resources:
  • Jena: a Semantic Web framework for Java. It provides a programmatic environment for RDF, RDFS, OWL, and SPARQL, and includes a rule-based inference engine. http://jena.sourceforge.net/
  • Yahoo! SearchMonkey: http://developer.yahoo.com/searchmonkey/
  • LinkingOpenData: http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData