The Semantic Web. Stefan Decker, Information Sciences Institute, University of Southern California. Outline: Semantic Web Overview; Vision, Challenges, Rationales; Semantic Web in SCEC. Semantic Web: coined by Tim Berners-Lee (1997)
The Semantic Web • Stefan Decker • Information Sciences Institute, University of Southern California
Outline • Semantic Web Overview • Vision, Challenges, Rationales • Semantic Web in SCEC
Semantic Web • Term coined by Tim Berners-Lee (1997) • "The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." • T. Berners-Lee, J. Hendler, O. Lassila, "The Semantic Web", Scientific American, May 2001
[Figure: scenario from "The Semantic Web", Scientific American, May 2001. Lucy's agent and Pete's agent arrange Mom's doctor's appointment by consulting the physician's agent, the insurance company, and rating and provider sites (required treatment, in-plan?, close-by?, specialist?), then schedule the appointment and the driving schedule.]
Means to Achieve the Vision • Explicit Ontologies • Needed to understand each other's data (e.g., a joint notion of what a schedule is) • Web Services • Required to actively interconnect systems (e.g., automatically make an appointment)
Technical challenges • Interoperability • Inaccurate, incomplete, heterogeneous data • Unreliable, ill-defined, evolving services • Natural language processing, data mining • make information explicit • Human-computer interaction • querying interfaces, visualization • Scalability • Subsecond performance
Social challenges • Standardization is hard • DublinCore • Bogus or inaccurate metadata • Physician rating, profile • Competition and commoditization • Economic incentive • Chicken and egg • Complexity: developers and users
Jump Starters • Machine Readable Data: • .org (human-edited directory) • .org (Music encyclopedia) • RSS (RDF Site Summary) • (embedded metadata) • CC/PP (Composite Capability/Preference Profiles) • P3P (Platform for Privacy Preferences)
Jump Starters • B2B Vocabulary Projects • PapiNet.org: Vocabulary for the Paper Industry • BPMI.org: Vocabulary for exchanging Business Process Models • XML-HR: Vocabularies for human resources (HR) • DMTF (Distributed Management Task Force): Vocabularies for managing enterprises • … • Research Vocabulary Projects • Gene Ontology Working Group • Earth Sciences • MathNet • …
How do we get there? • Research communities: DL, AI, DB, … • Standards bodies: W3C, OMG, … • Non-profit: US, EC, Japan • Industry: IBM, Nokia, HP, Microsoft(?), ... • Business.semanticweb.org
Non-profit • DARPA • "DARPA Agent Markup Language" • since Aug 2000 • NSF • Co-sponsored events (e.g., SWWS) • Further support in the loop • European Commission • "Semantic Web Technologies", Framework 6 • Japan • Interoperability Technology Association for Information Processing, Japan (INTAP) • Links: www.daml.org, www.semanticweb.org/SWWS, www.ontoweb.org, www.net.intap.or.jp/INTAP/
AI: “Add logic to the Web” • Assertions, rules • Agents • Interoperability • First-order logics • Ontologies, description logics • Logic programming, datalog • Problem-solving methods • … Distributed knowledge base
DB: “Everything is syntax” • Semistructured data • Web services • Interoperability • Data integration • Mediation, query rewriting • Model management • Conceptual modeling Conglomerate of distributed heterogeneous (semistructured) databases
Heterogeneous Data • Too many data formats/languages
Step 1 • Define uniform, underlying syntax • Lowest common denominator: labeled graphs (semi-structured data) -> RDF • [Figure: a relational database table Person (columns ID, F-name, L-name; entries include Stefan, Birgit, Decker) and a structured text record (e.g., vCard: begin:vcard, fn: Stefan, n: Decker;Stefan, end:vcard), each redrawn as a labeled graph with row nodes and a vcard1 node.]
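The idea of a common labeled-graph syntax can be sketched in a few lines of Python. This is only an illustration, not the deck's tooling: the node naming scheme (`Person/1`, `vcard1`), the helper names, and the sample row are assumptions made for the example.

```python
def rows_to_triples(table, rows, key="ID"):
    """Turn each relational row into (subject, predicate, object) edges."""
    triples = []
    for row in rows:
        subject = f"{table}/{row[key]}"      # hypothetical node naming scheme
        triples.append((subject, "type", table))
        for column, value in row.items():
            if column != key:
                triples.append((subject, column, value))
    return triples


def vcard_to_triples(subject, text):
    """Turn the 'name:value' lines of a vCard-like record into edges."""
    triples = []
    for line in text.strip().splitlines():
        name, _, value = line.partition(":")
        name, value = name.strip().lower(), value.strip()
        if name in ("begin", "end"):         # record delimiters carry no data
            continue
        triples.append((subject, name, value))
    return triples


# Sample data, assumed for illustration only.
person_rows = [{"ID": 1, "F-name": "Stefan", "L-name": "Decker"}]
vcard_text = """
begin:vcard
fn:Stefan
n:Decker;Stefan
end:vcard
"""

graph = rows_to_triples("Person", person_rows) + vcard_to_triples("vcard1", vcard_text)
for triple in graph:
    print(triple)
```

Both sources end up in one list of edges, which is the property the slide is after: downstream code only has to understand triples, not tables or vCards.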
XML • Containment, hierarchy • Adjacency (A followed by B) • Attributes (atomic values) • Opaque reference (IDREF) Good for serialization, poor for modeling relational semantics
Encoding of Information • Statement: The Creator of the Resource "http://www.w3.org/Home/Lassila" is Ora Lassila • As a graph: http://www.w3.org/Home/Lassila --Creator--> Ora Lassila • Endless encoding possibilities in XML:
<Creator> <uri>http://www.w3.org/Home/Lassila</uri> <name>Ora Lassila</name> </Creator>
<Document uri="http://www.w3.org/Home/Lassila"> <Creator>Ora Lassila</Creator> </Document>
<Document uri="http://www.w3.org/Home/Lassila" Creator="Ora Lassila"/>
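To make the encoding problem concrete, the following Python sketch parses three XML encodings of the Ora Lassila statement and shows that a consumer needs a separate branch for each one just to recover the same single fact. The function name `to_triple` and the bare predicate label `"Creator"` are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

URI = "http://www.w3.org/Home/Lassila"

# Three XML encodings of the same statement.
ENCODINGS = [
    f"<Creator><uri>{URI}</uri><name>Ora Lassila</name></Creator>",
    f'<Document uri="{URI}"><Creator>Ora Lassila</Creator></Document>',
    f'<Document uri="{URI}" Creator="Ora Lassila"/>',
]


def to_triple(xml_text):
    """Recover (resource, property, value); one branch per encoding style."""
    root = ET.fromstring(xml_text)
    if root.tag == "Creator":                # encoding 1: child elements only
        return (root.findtext("uri"), "Creator", root.findtext("name"))
    if "Creator" in root.attrib:             # encoding 3: attributes only
        return (root.get("uri"), "Creator", root.get("Creator"))
    # encoding 2: resource as attribute, value as child element
    return (root.get("uri"), "Creator", root.findtext("Creator"))


triples = {to_triple(text) for text in ENCODINGS}
print(triples)
```

The set collapses to one triple, which is the argument for agreeing on the triple model first and treating XML only as a serialization.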
Introduction to RDF • RDF (Resource Description Framework) • Beyond machine readable to machine understandable • RDF unites a wide variety of stakeholders: • Digital librarians, content-raters, privacy advocates, B2B industries, AI... • Significant (but less than XML) industrial momentum, led by W3C • RDF consists of two parts • RDF Model (a set of triples) • RDF Syntax (different XML serialization syntaxes) • RDF Schema for definition of vocabularies (simple ontologies) for RDF (and in RDF)
A Simple Example • Describing Resources • URIs: global OIDs, literals • Binary relationships between objects • Arcs (relationships) are first-class objects • Blank (anonymous) nodes • "Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila" • Structure • Resource (subject): http://www.w3.org/Home/Lassila • Property (predicate): http://www.schema.org/#Creator • Value (object): "Ora Lassila" • [Figure: node http://www.w3.org/Home/Lassila linked by the arc s:Creator to the literal Ora Lassila.]
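A minimal Python sketch of the subject/predicate/object structure follows. It is not an RDF library: the `Triple` type and the `match` helper are hypothetical names, and `None` is used as a wildcard by assumption.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str      # the resource being described
    predicate: str    # the property (itself named by a URI)
    object: str       # the value: another resource or a literal


def match(graph, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every argument that is not None."""
    return [
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


graph = {
    Triple(
        "http://www.w3.org/Home/Lassila",
        "http://www.schema.org/#Creator",
        "Ora Lassila",
    )
}

creators = match(graph, predicate="http://www.schema.org/#Creator")
print(creators[0].object)
```

Because the predicate is an ordinary value in the triple, it can be queried, and described by further triples, just like any other resource.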
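RDF • Graph-based universal syntax • [Figure: layered architecture with (agent) applications on top, an RDF layer (single data format, query and storage system) in the middle, and sources such as a scheduling service, insurance, ratings, and calendar below.] • Semantics in a global, open environment?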
Step 2: Ontologies • What is an Ontology? "An ontology is a specification of a conceptualization." (Tom Gruber, 1993) • Ontologies are social contracts • Agreed, explicit semantics • Understandable to outsiders • (Often) derived in a community process • Ontologies require Knowledge Representation • Is_a hierarchy, part of, attributes, axioms
RDF and Ontologies • Idea: Define an Ontology Language by defining predefined nodes and arcs • The Ontology Language itself is just an Ontology • Ontologies are used to tag data from sources
Step 2: Layers on Top of RDF • [Figure: rows of the Person table (ID, F-name, L-name) tagged with the class Person from an ontology, where Person subClassOf LivingThing.] • Tim Berners-Lee: "Axioms, Architecture and Aspirations", W3C all-working group plenary meeting, 28 February 2001
W3C Semantic Web Activity • Working Groups: RDF Core, Web Ontology • Advanced development: • Annotation (Annotea) • Access control • Calendaring • Collaboration • Logic • Rules • Workflows
RDF Core Working Group • Resource Description Framework (RDF) • Goals • Improve RDF abstract model and XML syntax according to implementors' feedback • Define precise semantics for RDF and RDF Schema • Clarify ties with XML family
Web Ontology Working Group • Standard definition language for ontologies (conceptual models) • Derived from Description Logics • But partial mapping to Database and Datalog possible -> (see Horrocks, Volz, Decker, Grosof: WWW2003) • Extension of RDF Schema and DAML+OIL • Class Expressions (Intersection, Union, Complement) • XML Schema Datatypes • Enumerations • Property Restrictions • Cardinality Constraints • Value Restrictions
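As a rough intuition for class expressions and cardinality constraints, the sketch below treats classes as sets of instances and checks a maximum-cardinality limit over a list of triples. All instance names, the property name, and the helper are assumptions for illustration. It is also a closed-world simplification: an ontology reasoner works under open-world semantics and may infer that two values denote the same thing instead of reporting a violation.

```python
# Class extensions as plain sets of (hypothetical) instance names.
domain = {"car1", "car2", "car3", "truck1"}
vans = {"car1", "car2"}
passenger_vehicles = {"car2", "car3"}

intersection = vans & passenger_vehicles    # members of both classes
union = vans | passenger_vehicles           # members of either class
complement = domain - vans                  # everything that is not a van


def max_cardinality_violations(triples, prop, limit):
    """List subjects that carry more than `limit` values for `prop`."""
    counts = {}
    for subject, predicate, _ in triples:
        if predicate == prop:
            counts[subject] = counts.get(subject, 0) + 1
    return sorted(s for s, n in counts.items() if n > limit)


# Hypothetical data: fault2 has two coordinate files, breaking a limit of 1.
triples = [
    ("fault1", "coord_file_URL", "http://example.org/f1.txt"),
    ("fault2", "coord_file_URL", "http://example.org/f2a.txt"),
    ("fault2", "coord_file_URL", "http://example.org/f2b.txt"),
]

violations = max_cardinality_violations(triples, "coord_file_URL", 1)
print(intersection, union, complement, violations)
```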
The Layer Cake • Tim Berners-Lee: "Axioms, Architecture and Aspirations", W3C all-working group plenary meeting, 28 February 2001 • [Figure: the Semantic Web layer cake, with layers marked as being in the research phase, the standardization phase, or the recommendation phase.]
Tasks within SCEC - CME • Towards an Earth Sciences Ontology: • Cataloging and Unification of Existing Databases • E.g., Fissures and Fault Activity Database • Building a Mediation Environment • Organizing a Community Process • Enriching of Web Services and Grid Infrastructure with Semantics • Service Discovery and Match Making
Fault Activity Database • Hand-Maintained within SCEC (Sue Perry) • Re-engineering of the Database Schemata <rdfs:Class rdf:about="&FAD_v1;AVG_RECURRENCE_INTERVAL" rdfs:label="AVG_RECURRENCE_INTERVAL"> <a:_slot_constraints rdf:resource="&FAD_v1;SCFADsep_02_00106"/> <rdfs:subClassOf rdf:resource="&rdfs;Resource"/> </rdfs:Class> <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT" rdfs:label="AVG_SLIP_PER_EVENT"> <rdfs:subClassOf rdf:resource="&rdfs;Resource"/> </rdfs:Class> <rdfs:Class rdf:about="&FAD_v1;AVG_SLIP_PER_EVENT_METHOD" rdfs:label="AVG_SLIP_PER_EVENT_METHOD"> <rdfs:subClassOf rdf:resource="&rdfs;Resource"/> </rdfs:Class> <rdf:Property rdf:about="&FAD_v1;CFM-A_coord_file_URL" a:maxCardinality="1" rdfs:label="CFM-A_coord_file_URL"> <rdfs:domain rdf:resource="&FAD_v1;FAULT"/> <rdfs:range rdf:resource="&rdfs;Literal"/> </rdf:Property>
Planned: Mediation Environment with RDF-based Rule Language • [Figure: applications on top; a mediation layer with an RDF-based rule language in the middle; Fault Activity Database, Fissures, and Grid Services as the sources below.]
Motivation: Why Rule Languages for the Web • Plethora of data available • Data needs to be adapted and combined • “Time to Market”: Faster to write rules than code • Data Transformation and Integration • Logic specification, not programming • Tabled evaluation/bottom-up evaluation • Semi-structured data • Multiple semantics (Relational Data, UML, ER, TopicMaps, DAML+OIL, XML-Schema, special purpose data models) • Distributed, heterogeneous sources
What’s Wrong With Existing Approaches? • Built-in semantics (e.g. SiLRI, RQL, DQL) • but: many RDF-based languages with different semantics (DAML+OIL, RDF Schema, UML/RDF, TopicMaps/RDF, DMTF, …) • For each language a specialized query language ????
TRIPLE: Language Overview • Native support for • Resources & namespaces • Abbreviations • Models (sets of RDF statements) • Reification • Rules with expressive bodies (full FOL syntax) • Inspired by F-Logic: • subject[predicate -> object] ("molecule")
Language Description I • Namespace and resource abbreviations: • rdf := "http://www.w3.org/1999/02/22-rdf-syntax-ns#". • isa := rdf:subClassOf. • Statements, triples, molecules: • subject[predicate -> object] • subject[p1 -> o1; p2 -> o2; ...] • s1[p1 -> s2[p2 -> o] ] • Models, model expressions, parameterized models: • s[p -> o]@m: "triple <s,p,o> in model m" • s[p -> o]@(m1 op m2): op is model intersection, union, or difference • s[p -> o]@sf(m1, X, Y): model named by a Skolem function
Language Description II • Reification: • stefan[believes -> <Ora[isAuthorOf -> homepage]> ] • Logical formulae: • usual logical connectives and quantifiers • all variables introduced via FORALL (or EXISTS) • Clauses: • facts: s[p1 -> o1; p2 -> o2; ...]. • rules: FORALL X s1[p1 -> X] <- s2[p2 -> X] ... . • Model blocks: • @model { clauses } • FORALL Mdl @model(Mdl) { clauses }
Example: Dublin Core
Namespace abbreviations: dc := "http://purl.org/dc/elements/1.0/". db := "http://www-db.stanford.edu/".
Model block with a fact: @db:documents { db:d_01_01[ dc:title -> TRIPLE; dc:creator -> "Stefan Decker"; dc:subject -> RDF; dc:subject -> triples; ... ]. }
Rule: FORALL N p(N)[ rdf:type -> xyz:Person; xyz:name -> N ] <- EXISTS D D[dc:creator -> N].
Query "find all names": FORALL N <- EXISTS P P[rdf:type -> xyz:Person; xyz:name -> N]@db:documents.
Result: N = "Stefan Decker"
[Figure: the document node db:d_01_01 with arcs dc:title (TRIPLE), dc:creator (Stefan Decker), and dc:subject (RDF, triples, ...), next to the derived Person node with arcs rdf:type and name (Stefan Decker).]
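The Dublin Core example can be imitated in plain Python to show what the rule and the query compute. This is a sketch, not a TRIPLE engine. It assumes the rule is read as "for every document D with dc:creator N, there is a person p(N) of type xyz:Person with xyz:name N", and it represents the Skolem term p(N) as a Python tuple.

```python
# Facts from the db:documents model block, as (subject, predicate, object).
documents = {
    ("db:d_01_01", "dc:title", "TRIPLE"),
    ("db:d_01_01", "dc:creator", "Stefan Decker"),
    ("db:d_01_01", "dc:subject", "RDF"),
    ("db:d_01_01", "dc:subject", "triples"),
}


def apply_person_rule(model):
    """Derive a person node for every dc:creator value found in the model."""
    derived = set(model)
    for _, predicate, name in model:
        if predicate == "dc:creator":
            person = ("p", name)             # stands in for the Skolem term p(N)
            derived.add((person, "rdf:type", "xyz:Person"))
            derived.add((person, "xyz:name", name))
    return derived


def find_all_names(model):
    """Query: names N of every P with rdf:type xyz:Person and xyz:name N."""
    persons = {s for s, p, o in model if p == "rdf:type" and o == "xyz:Person"}
    return sorted(o for s, p, o in model if p == "xyz:name" and s in persons)


names = find_all_names(apply_person_rule(documents))
print(names)  # ['Stefan Decker']
```

The rule turns a string-valued creator into a first-class person node, so later rules and queries can attach further statements to that person.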
Example: Specification of RDF Schema Semantics
Namespace abbreviations: rdf := 'http://www.w3.org/...rdf-syntax-ns#'. rdfs := 'http://www.w3.org/.../PR-rdf-schema-...#'.
Resource abbreviations: type := rdf:type. subPropertyOf := rdfs:subPropertyOf. subClassOf := rdfs:subClassOf.
Model block: FORALL Mdl @rdfschema(Mdl) {
"Copy" triples from Mdl: FORALL O,P,V O[P->V] <- O[P->V]@Mdl.
Transitivity of subClassOf: FORALL O,V O[subClassOf->V] <- EXISTS W (O[subClassOf->W] AND W[subClassOf->V]).
… }
Example: Cars Ontology with RDF Schema Semantics
@cars { xyz:MotorVehicle[rdfs:subClassOf -> rdfs:Resource]. xyz:PassengerVehicle[rdfs:subClassOf -> xyz:MotorVehicle]. xyz:Truck[rdfs:subClassOf -> xyz:MotorVehicle]. xyz:Van[rdfs:subClassOf -> xyz:MotorVehicle]. xyz:MiniVan[ rdfs:subClassOf -> xyz:Van; rdfs:subClassOf -> xyz:PassengerVehicle]. }
[Figure: class hierarchy with xyz:MotorVehicle at the top; xyz:Truck, xyz:Van, and xyz:PassengerVehicle below it; xyz:MiniVan below both xyz:Van and xyz:PassengerVehicle.]
Query on the plain model: FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@cars.
Result: X = xyz:Van, X = xyz:Truck, X = xyz:PassengerVehicle
Query with RDF Schema semantics: FORALL X <- X[rdfs:subClassOf -> xyz:MotorVehicle]@rdfschema(cars).
Result: X = xyz:Van, X = xyz:Truck, X = xyz:PassengerVehicle, X = xyz:MiniVan
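The difference between the two queries can be reproduced with a small Python sketch that applies only the transitivity rule for subClassOf. It is an illustration of the effect, not of how TRIPLE evaluates rules; the function names are assumptions.

```python
SUB = "rdfs:subClassOf"

# The @cars model as (subject, predicate, object) triples.
cars = {
    ("xyz:MotorVehicle", SUB, "rdfs:Resource"),
    ("xyz:PassengerVehicle", SUB, "xyz:MotorVehicle"),
    ("xyz:Truck", SUB, "xyz:MotorVehicle"),
    ("xyz:Van", SUB, "xyz:MotorVehicle"),
    ("xyz:MiniVan", SUB, "xyz:Van"),
    ("xyz:MiniVan", SUB, "xyz:PassengerVehicle"),
}


def subclass_closure(model):
    """Add (A sub C) whenever (A sub B) and (B sub C), until nothing changes."""
    closure = set(model)
    changed = True
    while changed:
        changed = False
        for a, p1, b in list(closure):
            for c, p2, d in list(closure):
                if p1 == SUB and p2 == SUB and b == c:
                    if (a, SUB, d) not in closure:
                        closure.add((a, SUB, d))
                        changed = True
    return closure


def subclasses_of(model, cls):
    """All X with X[rdfs:subClassOf -> cls] in the given model."""
    return sorted(s for s, p, o in model if p == SUB and o == cls)


plain = subclasses_of(cars, "xyz:MotorVehicle")
with_schema = subclasses_of(subclass_closure(cars), "xyz:MotorVehicle")
print(plain)        # ['xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
print(with_schema)  # ['xyz:MiniVan', 'xyz:PassengerVehicle', 'xyz:Truck', 'xyz:Van']
```

xyz:MiniVan appears only in the second result because its link to xyz:MotorVehicle is derived, not stated.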
Grid Computing and Web Services (ongoing) • Matchmaking between Jobs and Resources • Hard-coded in Globus Toolkit • Reengineering using an ontology- and rule-based solution • RDF and DMTF Vocabulary (www.dmtf.org)
<rdfs:Class rdf:ID="CIM_ComputerSystem">
<rdfs:subClassOf rdf:resource="#CIM_System"/>
<version><![CDATA["2.6.0"]]></version>
<rdfs:comment parseType="Literal"><![CDATA["A class derived from System that is a special collection of ManagedSystemElements. This collection provides compute capabilities and serves as aggregation point to associate one or more of the following elements: FileSystem, OperatingSystem, Processor and Memory (Volatile and/or NonVolatile Storage)."]]></rdfs:comment>
<rdfs:subClassOf> <daml:Restriction> <daml:toClass rdf:resource="#string"/> <daml:onProperty> <daml:DatatypeProperty rdf:ID="NameFormat"> <daml:toClass rdf:resource="http://www.w3.org/2001/XMLSchema#string"/> </daml:DatatypeProperty> </daml:onProperty> </daml:Restriction> </rdfs:subClassOf>
</rdfs:Class>
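What "rule-based matchmaking" could look like, in contrast to logic hard-coded in a scheduler, is sketched below in Python. Nothing here is Globus Toolkit or DMTF API: the attribute names (`os`, `cpus`, `memory_gb`), the sample nodes, and the rule list are all hypothetical and only illustrate keeping the matching conditions as data.

```python
# Hypothetical resource descriptions and job requirements.
resources = [
    {"name": "node-a", "os": "Linux", "cpus": 8, "memory_gb": 16},
    {"name": "node-b", "os": "Linux", "cpus": 2, "memory_gb": 4},
    {"name": "node-c", "os": "Solaris", "cpus": 16, "memory_gb": 32},
]
job = {"os": "Linux", "min_cpus": 4, "min_memory_gb": 8}

# Each rule is a separate condition; adding or removing one does not
# require touching the matching loop.
RULES = [
    lambda job, res: res["os"] == job["os"],
    lambda job, res: res["cpus"] >= job["min_cpus"],
    lambda job, res: res["memory_gb"] >= job["min_memory_gb"],
]


def matching_resources(job, resources, rules=RULES):
    """Return the names of resources that satisfy every rule for the job."""
    return [
        res["name"]
        for res in resources
        if all(rule(job, res) for rule in rules)
    ]


matches = matching_resources(job, resources)
print(matches)  # ['node-a']
```

With a shared vocabulary for the attributes, the same rules could be applied to resource descriptions published by different providers.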
Semantic Web and Earth Sciences • The Semantic Web field provides technologies for making vocabularies explicit and mediating data • Standards-based, many resources available • Editors, Rule Engines, APIs • Effort feeds back into other domains