Design and Implementation of An RDF Data Store
Ching-Long Yeh 葉慶隆
Department of Computer Science and Engineering
Tatung University
Email: chingyeh@cse.ttu.edu.tw
URL: http://www.cse.ttu.edu.tw/chingyeh

Outline
• Introduction to the Semantic Web
• Semantic Web Layered Architecture:
  • XML, RDF(S), DAML(-S)
• The Big Picture of the Semantic Web
• An Architecture of RDF Triple Data Store
• An RDF Parser in Prolog
• The Repository storage
• APIs and User Interface
• Semantic Community Portal
• Future Work and Conclusions
An RDF Data Store

Sources
• Some of the slides are drawn from the following sources:
• Harold Boley, Stefan Decker, and Michael Sintek, “Knowledge Markup and Resource Semantics,” IJCAI-01 Tutorial, http://www.ijcai-01.org/
• Anupriya Ankolekar, et al., “DAML-S: Semantic Markup for Web Services,” Proceedings of SWWS'01, the First Semantic Web Working Symposium, California, USA, July 30 - August 1, 2001.

Introduction to Semantic Web
• Facilities to put machine-understandable data on the Web are becoming a high priority for many communities.
• The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people.
• For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently.

Introduction to Semantic Web
• The Semantic Web is a vision: the idea of having data on the Web defined and linked in such a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications.
• See “W3C Semantic Web Activity,” by Marja-Riitta Koivunen, for more descriptions.

The Semantic Web Layered Architecture
Source: Tim Berners-Lee, “Axioms, Architecture and Aspirations,” W3C all-working group plenary meeting, 28 February 2001 (http://www.w3.org/2001/Talks/0228-tbl/slide5-0.html)

AI’s Chance
• Increasing demand for formalized knowledge on the Web: AI’s chance!
• XML- and RDF-based markup languages provide a 'universal' storage/interchange format for such Web-distributed knowledge representation.
[Figure labels: XML, Namespaces, DTDs, Stylesheets, CSS, XSLT, Transformations, Queries, XQL, XQuery, XML-QL, Rules, RuleML, HornML, Frames, RDF[S], SHOE, DAML, TopicMaps, Agents, Ontobroker, Acquisition, Protégé]

XML Fundamentals Source: http://www.ibiblio.org/xml/slides/sd2001east/fundamentals/XML_Fundamentals.html
What is XML?
• Extensible Markup Language
• A syntax for documents
• A Meta-Markup Language
• A Structural and Semantic language, not a formatting language
• Not just for Web pages

Extensible Markup Language
• Language
  • It has a grammar
  • It has a vocabulary (sort of)
  • It can be parsed by machines
• Markup Language
  • It says what things are; not what they do
  • It is not a programming language
  • It is not compiled
• Extensible
  • You can add words to the language

XML is a Meta Markup Language
• Not like HTML, troff, LaTeX
• Make up the tags you need as you need them
• The tags you create can be documented in a Document Type Definition (DTD)
• A meta syntax for domain-specific markup languages like MusicML, MathML, and XHTML

XML Applications
• A specific markup language that uses the XML meta-syntax is called an XML application
• Different XML applications have their own more constricted syntaxes and vocabularies within the broader XML syntax
• Further syntax can be layered on top of this; e.g. data typing through schemas

XML describes structure and semantics, not formatting
• XML documents form a tree
  • Document Object Model (DOM)
• Element and attribute names reflect the kind of the element
  • DTD, Schema
• Formatting can be added with a style sheet
  • Cascading Style Sheets (CSS)
  • Extensible Stylesheet Language (XSL)

XML Hypertext
• A Uniform Resource Identifier (URI) names or locates a resource
• An XLink defines connections between two or more documents identified by URIs
• XPath identifies particular nodes within a document
• An XPointer adds an XPath to a URI
• XBase defines the URI against which relative URIs are resolved
• XInclude embeds a document identified by a URI inside an XML document.

A Song Description in HTML
<dt>Hot Cop
<dd> by Jacques Morali, Henri Belolo, and Victor Willis
<ul>
  <li>Producer: Jacques Morali
  <li>Publisher: PolyGram Records
  <li>Length: 6:20
  <li>Written: 1978
  <li>Artist: Village People
</ul>

A Song Description in XML
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

Style Sheets Provide Formatting (CSS)
SONG {display: block; font-family: New York, Times New Roman, serif}
TITLE {display: block; font-size: 24pt; font-weight: bold; font-family: Helvetica, sans-serif}
COMPOSER {display: block}
PRODUCER {display: block}
YEAR {display: block}
PUBLISHER {display: block}
LENGTH {display: block}
ARTIST {display: block; font-style: italic}

Attaching Style Sheets to Documents
<?xml-stylesheet type="text/css" href="song.css"?>
<SONG>
  <TITLE>Hot Cop</TITLE>
  <COMPOSER>Jacques Morali</COMPOSER>
  <COMPOSER>Henri Belolo</COMPOSER>
  <COMPOSER>Victor Willis</COMPOSER>
  <PRODUCER>Jacques Morali</PRODUCER>
  <PUBLISHER>PolyGram Records</PUBLISHER>
  <LENGTH>6:20</LENGTH>
  <YEAR>1978</YEAR>
  <ARTIST>Village People</ARTIST>
</SONG>

An XSLT Stylesheet (Part 1)
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <html>
      <head><title>Song</title></head>
      <body>
        <xsl:apply-templates select="SONG"/>
      </body>
    </html>
  </xsl:template>

An XSLT Stylesheet (Part 2)
  <xsl:template match="SONG">
    <h1>
      <xsl:value-of select="TITLE"/> by the <xsl:value-of select="ARTIST"/>
    </h1>
    <ul>
      <li>Length: <xsl:value-of select="LENGTH"/></li>
      <li>Producer: <xsl:value-of select="PRODUCER"/></li>
      <li>Publisher: <xsl:value-of select="PUBLISHER"/></li>
      <li>Year: <xsl:value-of select="YEAR"/></li>
      <xsl:apply-templates select="COMPOSER"/>
    </ul>
  </xsl:template>
  <xsl:template match="COMPOSER">
    <li>Composer: <xsl:value-of select="."/></li>
  </xsl:template>
</xsl:stylesheet>

Transforming the Document
[Figure: the XML document and the XSL document (template rules) are fed to an XSLT processor (IE 5), which produces the output below]
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
    <title>Song</title>
  </head>
  <body>
    <h1>Hot Cop by the Village People </h1>
    <ul>
      <li>Length: 6:20</li>
      <li>Producer: Jacques Morali</li>
      <li>Publisher: PolyGram Records</li>
      <li>Year: 1978</li>
      <li>Composer: Jacques Morali</li>
      <li>Composer: Henri Belolo</li>
      <li>Composer: Victor Willis</li>
    </ul>
  </body>
</html>

A DTD for Songs
<!ELEMENT SONG (TITLE, COMPOSER+, PRODUCER*, PUBLISHER*, LENGTH?, YEAR?, ARTIST+)>
<!ELEMENT TITLE (#PCDATA)>
<!ELEMENT COMPOSER (#PCDATA)>
<!ELEMENT PRODUCER (#PCDATA)>
<!ELEMENT PUBLISHER (#PCDATA)>
<!ELEMENT LENGTH (#PCDATA)>
<!-- This should be a four digit year like "1999", not a two-digit year like "99" -->
<!ELEMENT YEAR (#PCDATA)>
<!ELEMENT ARTIST (#PCDATA)>

Well-formedness
• Rules:
  • Open and close all tags
  • Empty-element tags end with />
  • There is a unique root element
  • Elements may not overlap
  • Attribute values are quoted
  • < and & are only used to start tags and entities
  • Only the five predefined entity references are used
  • Plus more...

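Several of the rules above can be checked mechanically by any XML parser. The following is a minimal sketch using Python's standard-library parser; the helper name `is_well_formed` and the sample strings are illustrative assumptions, not part of the original slides.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as a well-formed XML document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# Hypothetical samples, each violating one rule from the slide.
good = "<SONG><TITLE>Hot Cop</TITLE><YEAR>1978</YEAR></SONG>"
overlapping = "<SONG><TITLE>Hot Cop</SONG></TITLE>"  # elements overlap
unquoted = "<SONG year=1978/>"                       # attribute value not quoted
two_roots = "<SONG/><SONG/>"                         # no unique root element
bare_ampersand = "<ARTIST>Tom & Jerry</ARTIST>"      # & used outside an entity

for sample in (good, overlapping, unquoted, two_roots, bare_ampersand):
    print(is_well_formed(sample), sample)
```

Only the first sample is accepted. Note that this checks well-formedness only; validity against a DTD needs a validating parser, which the Python standard library does not provide.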
Validity
• To be valid, an XML document must
  • Be well-formed
  • Have a Document Type Definition (DTD)
  • Comply with the constraints specified in the DTD

What Is XML Used for?
• Domain-Specific Markup Languages
  • XML in industrial applications: http://www.xml.org/xml/industry_industrysectors.jsp
• Self-Describing Data
  • Much data is lost due to format problems.
• Interchange of Data Among Applications
  • Electronic business: RosettaNet, ebXML

XML Namespaces
• XML namespaces are akin to namespaces, packages, and modules in programming languages
• Disambiguation of tag and attribute names from different XML applications (“spaces”) through different prefixes
• A prefix is separated from the local name by a “:”, obtaining prefix:name tags
• Namespaces constitute a layer on top of XML 1.0, since prefix:name is again a valid tag name and namespace bindings are ignored by some tools

Namespace Bindings
• A prefix is bound to a namespace URI by attaching an xmlns:prefix attribute to a prefixed element (prefix:name_1, ..., prefix:name_n) or to one of its ancestors
• The value of the xmlns:prefix attribute is a URI, which may or (unlike for DTDs!) may not point to a description of the namespace’s syntax
• An element can use bindings for multiple namespaces via attributes xmlns:prefix_1, ..., xmlns:prefix_m

Two-Namespace Example: Snail-Mail and Telecoms Address Parts
<mail:address xmlns:mail="http://www.deutschepost.de/"
              xmlns:tele="http://www.telekom.de/">
  <mail:name>Xaver M. Linde</mail:name>
  <mail:street>Wikingerufer 7</mail:street>
  <mail:town>10555 Berlin</mail:town>
  <mail:bill>12.50</mail:bill>
  <tele:phone>030/1234567</tele:phone>
  <tele:phone>030/1234568</tele:phone>
  <tele:fax>030/1234569</tele:fax>
  <tele:bill>76.20</tele:bill>
</mail:address>
• The two bill elements are disambiguated through the mail and tele prefixes

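To see the disambiguation at work, here is a small sketch that reads an abridged copy of the address document with Python's standard-library parser. The variable names (`ADDRESS`, `NS`, `mail_bill`, `tele_bill`) are assumptions made for illustration; the namespace URIs are the ones on the slide.

```python
import xml.etree.ElementTree as ET

# Abridged copy of the address document from the slide.
ADDRESS = """\
<mail:address xmlns:mail="http://www.deutschepost.de/"
              xmlns:tele="http://www.telekom.de/">
  <mail:name>Xaver M. Linde</mail:name>
  <mail:bill>12.50</mail:bill>
  <tele:phone>030/1234567</tele:phone>
  <tele:bill>76.20</tele:bill>
</mail:address>
"""

# Prefix-to-URI map used for lookups; the prefixes here are local to this
# script and need not match the prefixes used in the document.
NS = {
    "mail": "http://www.deutschepost.de/",
    "tele": "http://www.telekom.de/",
}

root = ET.fromstring(ADDRESS)

# The parser expands each prefixed name to {namespace-URI}local-name,
# so the two bill elements are distinct names.
mail_bill = root.find("mail:bill", NS).text
tele_bill = root.find("tele:bill", NS).text

print(root.tag)    # {http://www.deutschepost.de/}address
print(mail_bill)   # 12.50
print(tele_bill)   # 76.20
```

The parser identifies an element by its namespace URI plus local name, so the prefix itself is only a shorthand.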
Introduction to RDF
• RDF (Resource Description Framework)
  • Beyond machine-readable to machine-understandable
• RDF unites a wide variety of stakeholders:
  • Digital librarians, content-raters, privacy advocates, B2B industries, AI...
• Significant (but less than XML) industrial momentum, led by the W3C
• RDF consists of two parts
  • RDF Model (a set of triples)
  • RDF Syntax (different XML serialization syntaxes)
• RDF Schema for definition of Vocabularies (simple Ontologies) for RDF (and in RDF)
Source: Knowledge Markup and Resource Semantics, by Harold Boley, Stefan Decker, and Michael Sintek, IJCAI-01 Tutorial, http://www.ijcai-01.org/

RDF Data Model
• Resources
  • A resource is a thing you talk about (can reference)
  • Resources have URIs
  • RDF definitions are themselves Resources (linkage, see requirement 1)
• Properties
  • Slots; define relationships to other resources or atomic values
• Statements
  • “Resource has Property with Value”
  • (Values can be resources or atomic XML data)
• Similar to Frame Systems

A Simple Example
• Statement
  • “Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila”
• Structure
  • Resource (subject): http://www.w3.org/Home/Lassila
  • Property (predicate): http://www.schema.org/#Creator
  • Value (object): "Ora Lassila"
• Directed graph
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila

Another Example
• To add properties to Creator, point through an intermediate Resource.
[Figure: RDF graph]
  http://www.w3.org/Home/Lassila --s:Creator--> Person://fi/654645635
  Person://fi/654645635 --Name--> Ora Lassila
  Person://fi/654645635 --Email--> lassila@w3.org

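The “Resource has Property with Value” statements from the two examples above can be sketched as plain (subject, predicate, object) triples. This is a minimal illustration in Python, not the data store's actual design: the `Triple` type, the `match` helper, and the use of plain strings for both resources and literals are assumptions.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement: resource (subject), property (predicate), value (object)."""
    subject: str
    predicate: str
    obj: str


def match(store, subject=None, predicate=None, obj=None):
    """Return, sorted, the triples agreeing with every argument that is not None."""
    return sorted(
        t for t in store
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    )


PAGE = "http://www.w3.org/Home/Lassila"
PERSON = "Person://fi/654645635"  # the intermediate resource from the slide

store = {
    Triple(PAGE, "s:Creator", PERSON),
    Triple(PERSON, "Name", "Ora Lassila"),
    Triple(PERSON, "Email", "lassila@w3.org"),
}

# Follow the graph: page -> creator resource -> creator's name.
creator = match(store, subject=PAGE, predicate="s:Creator")[0].obj
name = match(store, subject=creator, predicate="Name")[0].obj
print(name)  # Ora Lassila
```

Treating unspecified positions as wildcards is one simple way to query such a set; a real store would also distinguish literal values from resource references.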
Collection Containers
• Multiple occurrences of the same PropertyType don’t establish a relation between the values
  • The Millers own a boat, a bike, and a TV set
  • The Millers need (a car or a truck)
  • (Sarah and Bob) bought a new car
• RDF defines three special Resources:
  • Bag: unordered values (rdf:Bag)
  • Sequence: ordered values (rdf:Seq)
  • Alternative: single value (rdf:Alt)
• Core RDF does not enforce ‘set’ semantics amongst values

Example: Bag
• The students in course 6.001 are Amy, Tim, John, Mary, and Sue
[Figure: RDF graph]
  /courses/6.001 --students--> bagid1
  bagid1 --rdf:type--> rdf:Bag
  bagid1 --rdf:_1--> /Students/Amy
  bagid1 --rdf:_2--> /Students/Tim
  bagid1 --rdf:_3--> /Students/John
  bagid1 --rdf:_4--> /Students/Mary
  bagid1 --rdf:_5--> /Students/Sue

Example: Alternative
• The source code for X11 may be found at ftp.x.org, ftp.cs.purdue.edu, or ftp.eu.net
[Figure: RDF graph]
  http://x.org/package/X11 --source--> altid
  altid --rdf:type--> rdf:Alt
  altid --rdf:_1--> ftp.x.org
  altid --rdf:_2--> ftp.cs.purdue.edu
  altid --rdf:_3--> ftp.eu.net

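The container examples above follow one pattern: a container resource gets an rdf:type and one numbered membership property (rdf:_1, rdf:_2, ...) per member. A rough sketch of that expansion in Python follows; the function name `container_triples` is hypothetical, and `bagid1` is the identifier used on the Bag slide.

```python
def container_triples(container_id, kind, members):
    """Expand a container into (subject, predicate, object) triples.

    `kind` is one of the three container classes named on the slides.
    """
    if kind not in ("rdf:Bag", "rdf:Seq", "rdf:Alt"):
        raise ValueError(f"unknown container type: {kind}")
    triples = [(container_id, "rdf:type", kind)]
    # Membership properties are numbered from 1: rdf:_1, rdf:_2, ...
    for index, member in enumerate(members, start=1):
        triples.append((container_id, f"rdf:_{index}", member))
    return triples


students = ["/Students/Amy", "/Students/Tim", "/Students/John",
            "/Students/Mary", "/Students/Sue"]

# The course points at the bag; the bag points at its members.
graph = [("/courses/6.001", "students", "bagid1")]
graph += container_triples("bagid1", "rdf:Bag", students)

for triple in graph:
    print(triple)
```

The numbering is present for all three container types; whether the order carries meaning depends on the type (it does for rdf:Seq, not for rdf:Bag).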
Statements About Statements
• Making statements about statements requires a process for transforming them into Resources
  • subject: the original resource
  • predicate: the original property
  • object: the original value
  • type: rdf:Statement

Statements About Statements
• Ralph Swick says that Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila.
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:a="http://description.org/schema/">
  <rdf:Description>
    <rdf:subject resource="http://www.w3.org/Home/Lassila" />
    <rdf:predicate resource="http://description.org/schema/Creator" />
    <rdf:object>Ora Lassila</rdf:object>
    <rdf:type resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Statement" />
    <a:attributedTo>Ralph Swick</a:attributedTo>
  </rdf:Description>
</rdf:RDF>

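Representation of a reified statement
[Figure: RDF graph; the unnamed node stands for the reified statement]
  (statement) --rdf:subject--> http://www.w3.org/Home/Lassila
  (statement) --rdf:predicate--> s:Creator
  (statement) --rdf:object--> Ora Lassila
  (statement) --rdf:type--> rdf:Statement
  (statement) --a:attributedTo--> Ralph Swick
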
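Reification as described above turns one statement into four triples about a new resource, to which further properties can then be attached. A minimal sketch in Python, under the assumption that triples are plain tuples and that `_:stmt1` is a made-up identifier for the statement resource (the slides leave it unnamed):

```python
def reify(statement_id, subject, predicate, obj):
    """Describe the statement (subject, predicate, obj) as a resource of type rdf:Statement."""
    return [
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", subject),
        (statement_id, "rdf:predicate", predicate),
        (statement_id, "rdf:object", obj),
    ]


# "Ora Lassila is the creator of the resource http://www.w3.org/Home/Lassila"
triples = reify(
    "_:stmt1",  # hypothetical identifier for the reified statement
    "http://www.w3.org/Home/Lassila",
    "http://description.org/schema/Creator",
    "Ora Lassila",
)

# "Ralph Swick says that ...": a statement about the reified statement.
triples.append(("_:stmt1", "a:attributedTo", "Ralph Swick"))

for triple in triples:
    print(triple)
```

Note that the reified description does not by itself assert the original statement; it only describes it.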
Statements About Statements
<rdf:RDF>
  <rdf:Description about="http://www.w3.org/Home/Lassila" bagID="D_001">
    <s:Creator>Ora Lassila</s:Creator>
    <s:Title>Ora's Home Page</s:Title>
  </rdf:Description>
  <rdf:Description aboutEach="#D_001">
    <a:attributedTo>Ralph Swick</a:attributedTo>
  </rdf:Description>
</rdf:RDF>

RDF Syntax I
• Data model does not enforce particular syntax
• Specification suggests many different syntaxes based on XML
• General form:
<rdf:RDF>
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
    <s:createdWith rdf:resource="http://www.w3c.org/amaya"/>
  </rdf:Description>
</rdf:RDF>
• rdf:Description starts an RDF description; its about attribute names the subject (OID)
• s:Creator and s:createdWith are properties
• Ora Lassila is a literal
• http://www.w3c.org/amaya is a resource (possibly another RDF description)

Resulting Graph
<rdf:RDF>
  <rdf:Description about="http://www.w3.org/Home/Lassila">
    <s:Creator>Ora Lassila</s:Creator>
    <s:createdWith rdf:resource="http://www.w3c.org/amaya"/>
  </rdf:Description>
</rdf:RDF>
[Figure: RDF graph]
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
  http://www.w3.org/Home/Lassila --s:createdWith--> http://www.w3c.org/amaya

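RDF Syntax II: Syntactic Varieties
<s:Homepage rdf:about="http://www.w3.org/Home/Lassila" s:Creator="Ora Lassila">
  <s:createdWith>
    <s:HTMLEditor rdf:about="http://www.w3c.org/amaya"/>
  </s:createdWith>
</s:Homepage>
• The element name s:Homepage carries the typing information; rdf:about names the subject (OID)
• s:Creator is an in-element property (written as an attribute); s:createdWith is a property written as a child element
[Figure: RDF graph]
  http://www.w3.org/Home/Lassila --rdf:type--> s:Homepage
  http://www.w3.org/Home/Lassila --s:Creator--> Ora Lassila
  http://www.w3.org/Home/Lassila --s:createdWith--> http://www.w3c.org/amaya
  http://www.w3c.org/amaya --rdf:type--> HTMLEditor
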
RDF Schema (RDFS)
• RDF just defines the data model
• Need for definition of vocabularies for the data model: an Ontology Language!
• The RDF Schema mechanism provides a basic type system for use in RDF models.
• The RDF Schema specification language is less expressive, but much simpler to implement, than full predicate calculus languages such as CycL and KIF.

Most Important Modeling Primitives
• Core Classes
  • Root class: rdfs:Resource
  • Metaclass: rdfs:Class
  • Literals: rdfs:Literal
• rdfs:subClassOf property
• Inherited from RDF: properties (slots)
  • rdfs:domain & rdfs:range
  • rdfs:label, rdfs:comment, etc.
• Inherited from RDF: InstanceOf (rdf:type)

Classes and Properties
[Figure: the set of Resources, containing the groups below]
• Classes
  • rdfs:Resource
  • rdfs:Class
  • rdf:Property
  • rdfs:ConstraintProperty
  • rdfs:Literal
• Property
  • rdf:type
  • rdfs:subClassOf
  • rdfs:subPropertyOf
  • rdfs:comment
  • rdfs:label
  • rdfs:seeAlso
  • rdfs:isDefinedBy
• ConstraintProperty
  • rdfs:domain
  • rdfs:range

Classes and Resources as Sets and Elements

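Reading classes as sets and resources as their elements, rdfs:subClassOf behaves like set inclusion: it is transitive, and an instance of a class is also an instance of every superclass. A hedged sketch of that inference in Python follows. The `ex:` class names and the two dictionaries are invented for illustration; only rdf:type and rdfs:subClassOf come from the slides.

```python
def superclasses(cls, sub_class_of):
    """Return every class reachable from `cls` by following rdfs:subClassOf links."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in sub_class_of.get(current, ()):
            if parent not in seen:  # guards against cycles in the hierarchy
                seen.add(parent)
                stack.append(parent)
    return seen


def types_of(resource, rdf_type, sub_class_of):
    """Return the declared rdf:type classes of `resource` plus all inherited ones."""
    direct = rdf_type.get(resource, set())
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, sub_class_of)
    return inferred


# Hypothetical schema: class -> set of direct superclasses (rdfs:subClassOf).
sub_class_of = {
    "ex:HTMLEditor": {"ex:Editor"},
    "ex:Editor": {"ex:Software"},
}

# Hypothetical instance data: resource -> set of declared classes (rdf:type).
rdf_type = {"http://www.w3c.org/amaya": {"ex:HTMLEditor"}}

print(sorted(types_of("http://www.w3c.org/amaya", rdf_type, sub_class_of)))
# ['ex:Editor', 'ex:HTMLEditor', 'ex:Software']
```

The traversal keeps a `seen` set so that it terminates even if the schema contains a subclass cycle.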
DARPA Agent Markup Language Program
• DARPA-funded research program (DARPA also funded the development of the ARPANET -> Internet)
• Focusing on building the foundation for the Semantic Web: http://www.daml.org
• Ontology Language DAML+OIL: Result of a Joint (European + US-American) Committee
• Rule Language in preparation

DAML+OIL
• Ontology Language DAML+OIL: Result of a Joint (European + US-American) Committee
• Extension of RDF Schema
  • Class Expressions (Intersection, Union, Complement)
  • XML Schema Datatypes
  • Enumerations
  • Property Restrictions
    • Cardinality Constraints
    • Value Restrictions

Example: Intersection & Synonyms
<daml:Class rdf:ID="TallMan">
  <daml:intersectionOf rdf:parseType="daml:collection">
    <daml:Class rdf:about="#TallThing"/>
    <daml:Class rdf:about="#Man"/>
  </daml:intersectionOf>
</daml:Class>
<daml:Class rdf:ID="HumanBeing">
  <daml:sameClassAs rdf:resource="#Person"/>
</daml:Class>