Facilities to put machine-understandable data on the Web are becoming a high priority for many communities. The Web can reach its full potential only if it becomes a place where data can be shared and processed by automated tools as well as by people. For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications. An Introduction to RDF
History
What is the Web, Really?
• Millions upon millions of computers all using the same communications protocols
Layers, top to bottom: HTML; HTTP; TCP/IP
HTML
<B><I>
<FONT FACE="Tahoma" SIZE=2>
<P ALIGN="CENTER">DEPARTMENT OF MATHEMATICS AND COMPUTER SCIENCE</P>
<P ALIGN="CENTER">SAN JOSE STATE UNIVERSITY</P>
</I><P ALIGN="CENTER">SPRING 2000 COLLOQUIUM SERIES, PART II</P>
</FONT>
<I><FONT SIZE=2>
<P ALIGN="CENTER">Each talk will be on a Thursday at 3:00 p.m. in MacQuarrie Hall 523 </P>
<P ALIGN="CENTER">Please join us for refreshments beforehand, at 2:30 p.m., in MacQuarrie Hall 210</P>
</I><P ALIGN="CENTER">Parking available in the Seventh Street Garage at South Seventh and San Salvador Streets, San Jose, CA</P>
</FONT>
<I><FONT FACE="Tahoma" SIZE=2>
</I></FONT><FONT SIZE=2>
<P>April 6		Zvezdelina Stankova-Frenkel, Mathematics, Mills College</P>
<I><P>From Desargues to Modern Algebraic Geometry</P>
</B></I></FONT><FONT FACE="Arial" SIZE=2>
<P>We will look at some classical plane geometry . . . mathematics. </P>
The Evolution of Web Technology
• HTML 1.0 became 2.0 became ... 4.0
• Cascading style sheets and other formatting and layout standards defined by W3C
• Proprietary technologies such as Shockwave and PDF invented
The Implicit Assumptions
• Point to point (direct) communication
• The primary task of a web server is to deliver information to a human who is asking for that information
• Key points: to a human, already asking for information
The First Business Opportunity
• “The Web is like mail-order”
• Put catalogs on the web
  • Easy to update
  • Easy to link in auxiliary information
    • “People who bought that also bought …”
    • Availability information
• In many cases, simply putting an “HTML front end” on existing systems
Leads to Another Opportunity
• Catalogs prime the pump
  • An easy-to-understand application that is compelling
  • Side effect: lots of information is now available on the internet
• How do we take advantage of it?
  • Automate existing processes
  • Enable new applications
HTML is a Problem
• It’s a markup language based on document structure
• Most tags are visual, about presentation
• HTML solves document-level navigation problems, for humans
• Lots of information encoded in images
• Fundamentally, the wrong idea.
eXtensible Markup Language (XML)
• Basically, a language for defining markup languages
• Key idea: separate data from presentation information
• Replace HTML with two things
  • A domain-specific markup language (defined in XML)
  • A map from that markup language to HTML (defined using XSL)
Split Data
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
    <DIVISION>
      <DIVISION_NAME>East</DIVISION_NAME>
      <TEAM>
        <TEAM_CITY>Atlanta</TEAM_CITY>
        <TEAM_NAME>Braves</TEAM_NAME>
        <PLAYER>
          <SURNAME>Malloy</SURNAME>
          <GIVEN_NAME>Marty</GIVEN_NAME>
          <POSITION>Second Base</POSITION>
          <GAMES>11</GAMES>
          <GAMES_STARTED>8</GAMES_STARTED>
          <AT_BATS>28</AT_BATS>
          <RUNS>3</RUNS>
          <HITS>5</HITS>
          <DOUBLES>1</DOUBLES>
          .....
Meaning!
From: The XML Bible by Harold
From Presentation
<HTML xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <HEAD>
    <TITLE>
      <xsl:for-each select="SEASON">
        <xsl:value-of select="YEAR"/>
      </xsl:for-each>
      Major League Baseball Statistics
    </TITLE>
  </HEAD>
  <BODY>
    <xsl:for-each select="SEASON">
      <H1 ALIGN="CENTER">
        <xsl:value-of select="YEAR"/>
        Major League Baseball Statistics
      </H1>
    </xsl:for-each>
  </BODY>
</HTML>
Formatting!
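The split between data and presentation can also be sketched outside XSL. The Python below is only an illustration, assuming a trimmed copy of the season document; the function name render_season is hypothetical and stands in for the XSL map to HTML.

```python
import xml.etree.ElementTree as ET
from html import escape

# Data only: a trimmed copy of the season document, with no formatting tags.
SEASON_XML = """
<SEASON>
  <YEAR>1998</YEAR>
  <LEAGUE>
    <LEAGUE_NAME>National League</LEAGUE_NAME>
  </LEAGUE>
</SEASON>
"""

def render_season(xml_text: str) -> str:
    """Presentation only: map the data document to an HTML heading."""
    season = ET.fromstring(xml_text)
    year = season.findtext("YEAR")
    heading = escape(f"{year} Major League Baseball Statistics")
    return f'<H1 ALIGN="CENTER">{heading}</H1>'

html_page = render_season(SEASON_XML)
print(html_page)
```

Changing the look of the page means editing render_season only; the season data is untouched.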
What is the Web, XML Version
• HTML (as XHTML) is a tag language, defined using XML
• One of many tag languages (and the likely target for XSL transformations)
Layers, top to bottom: XHTML alongside special-purpose tag languages; XML; HTTP; TCP/IP
XML Has Lots of Problems
• Everything bottoms out in strings
• DTDs provide simple structure at the level of “documents”
• Very simple inter-document structure
• No provisions for intra-document structure
• No support for versioning
The VISA DTD
<!ELEMENT Invoice (InvoiceHeader, InvoiceDetails+, InvoiceSummary)>
<!ATTLIST Invoice sectorUsageVersion CDATA #IMPLIED >
<!ELEMENT InvoiceHeader (InvoiceType, InvoiceStatus, TaxTreatment, DiscountTreatment?, InvoiceTreatment, InvoiceNumber, InvoiceDate, TaxPointDate?, Currency, Party, Party, Party*, Payment?, PONum?, DeliveryNoteNum?, Ref*, Date*, GenText*)>
<!ELEMENT InvoiceType EMPTY>
<!ATTLIST InvoiceType stdValue (380|381) "380" stdName (UNTDID:1001) "UNTDID:1001">
<!-- 380 = Invoice
     381 = Credit Note -->
<!ELEMENT InvoiceStatus EMPTY>
<!ATTLIST InvoiceStatus stdValue (9|10|53) "9" stdName (UNTDID:1225) "UNTDID:1225">
<!-- 9 = Original, 10 = Copy, 53 = Test -->
<!ELEMENT TaxTreatment EMPTY>
<!ATTLIST TaxTreatment stdValue (NIL|GIL|NLL|GLL|NON) "NLL" stdName (VISA:TAXT) "VISA:TAXT">
<!-- NIL = Line item net amounts, invoice level tax
     GIL = Line item gross amounts, invoice level tax
     NLL = Line item net amounts, line level tax
     GLL = Line item gross amounts, line level tax
     NON = Tax does not apply to this invoice -->
<!ELEMENT DiscountTreatment EMPTY>
<!ATTLIST DiscountTreatment stdValue (UN|UG|TN) "UG" stdName (VISA:DSCT) "VISA:DSCT">
<!-- UN = Line item unit price, net of discount
     UG = Line item unit price, gross of discount
     TN = Line item sub-total, net of discount
     TG = Line item sub-total, gross of discount. -->
<!ELEMENT InvoiceTreatment EMPTY>
<!ATTLIST InvoiceTreatment stdValue (P|EP|E) "P" stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim
     S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim
     E = Printed invoice suppressed since electronic master version used for tax reclaim -->
It Gets Worse
<!ATTLIST InvoiceTreatment stdValue (P|EP|E) "P" stdName (VISA:INVT) "VISA:INVT">
<!-- P = Invoice printed and given to purchaser, and then used for tax reclaim
     S = Printed, but printed invoice treated as supplemental invoice since electronic copy used for tax reclaim
     E = Printed invoice suppressed since electronic master version used for tax reclaim -->
<!ELEMENT InvoiceNumber (#PCDATA)>
<!-- String, 1..35 characters -->
<!ELEMENT InvoiceDate (#PCDATA)>
<!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->
<!ELEMENT TaxPointDate (#PCDATA)>
<!-- String, 1..19 Character DateTime (CCYY-MM-DDTHH:MM:SS) -->
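A rough Python illustration of "everything bottoms out in strings": the parser hands back plain text for InvoiceDate, and the date format documented only in the DTD comment has to be re-implemented by hand. The invoice fragment and its date value are made up for this sketch and are trimmed well below what the DTD requires.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical, heavily trimmed fragment; a real InvoiceHeader needs many more children.
INVOICE_XML = (
    "<InvoiceHeader>"
    "<InvoiceDate>2000-02-01T18:29:46</InvoiceDate>"
    "</InvoiceHeader>"
)

header = ET.fromstring(INVOICE_XML)

# The parser knows nothing about dates: the element content is just a str.
raw_date = header.findtext("InvoiceDate")
print(type(raw_date).__name__, raw_date)

# The CCYY-MM-DDTHH:MM:SS rule lives in a DTD comment, so each program
# must hard-code it again to get a usable value.
parsed_date = datetime.strptime(raw_date, "%Y-%m-%dT%H:%M:%S")
print(parsed_date.year, parsed_date.hour)
```

Nothing in the DTD stops a sender from writing "next Tuesday" in InvoiceDate; the mistake only surfaces when strptime raises.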
The Accompanying Prose
• The DTD is 4 pages
• The manual is 182 pages
“The aim of this Guide is to provide sufficient information about the XML Invoice Document to enable its implementation. It documents the file structure, the business usage of the elements, and all the elements and attributes in detail.”
RDF
Goal: The Semantic Web
• Different sites each maintain small amounts of information
• Sites need to refer to each other’s information with full semantic integrity
• Information is maintained by owners and referred to by other sites
• Information should be accessible, and coherent, in very small chunks
Needed: Precision
• Need the ability to specify things like dates, times, and monetary amounts
• Compile in those VISA comments
• The more of this we can do, the fewer programmer-hours are needed
• Ultimately, most web-based computation will not involve a browser
Needed: Granularity
• Saying things at the “page” level is too coarse-grained
• Small chunks of data are necessary
• The ability to aggregate them into larger chunks is important
Use Classes and Instances
• Objects are a natural way to represent information
• A web page can contain hundreds of instances, each with its own URI
• The hard part is figuring out how to do this in a way that works on the web
Start with Resources
• A resource is a thing you talk about (can reference)
• Everything is a resource
• Resources have URIs
How to say things in RDF
• Small set of canonical tags
• Use XML syntax to define vocabularies
• Information asserted via triples
• Assertions require three things:
  • Subject: What the assertion is about (always a resource)
  • Property: A property whose value is being asserted (always a resource)
  • Object: The value of the property (either a resource or a primitive value)
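The subject / property / object shape can be sketched as a plain data structure. This is a minimal illustration, not an RDF library: the Triple type and values_of helper are hypothetical names, the URIs reuse the MotorVehicle example from this deck, and representing resources as URI strings and primitive values as Python ints is a simplifying assumption.

```python
from typing import NamedTuple, Union

class Triple(NamedTuple):
    subject: str               # always a resource (a URI)
    prop: str                  # always a resource (a URI)
    obj: Union[str, int]       # a resource URI or a primitive value

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BILL = "http://www.grosso.org/rdfexample#"

# Two assertions about the same subject.
triples = {
    Triple(BILL + "MyChevy", RDF_TYPE, BILL + "MotorVehicle"),   # object is a resource
    Triple(BILL + "MyChevy", BILL + "rearSeatLegRoom", 47),      # object is a primitive
}

def values_of(subject: str, prop: str) -> list:
    """Return every object asserted for the given subject and property."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]

print(values_of(BILL + "MyChevy", BILL + "rearSeatLegRoom"))
```

Because each assertion is self-contained, a set of triples can grow one statement at a time without restructuring what is already there.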
Important Tags
• rdf:Description
• rdfs:Class
• rdfs:Property (named rdf:Property in the final RDF Schema recommendation)
• rdf:type
• rdfs:subClassOf
• rdfs:domain
• rdfs:range
Defining a Class
<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">
  <!-- MotorVehicle is declared as a class and as a subclass of Resource -->
  <rdf:Description rdf:ID="MotorVehicle">
    <rdf:type rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Class"/>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
  </rdf:Description>
</rdf:RDF>
An Instance of Motor Vehicle
<?xml version='1.0' encoding='ISO-8859-1'?>
<!-- Version Tue Feb 01 18:29:46 PST 2000 -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#"
         xmlns:rdfutil="http://www.w3.org/rdfutil#"
         xmlns:bill="http://www.grosso.org/rdfexample#">
  <!-- The type is given as a full URI; a bill: prefix is not expanded inside attribute values -->
  <rdf:Description rdf:ID="MyChevy">
    <rdf:type rdf:resource="http://www.grosso.org/rdfexample#MotorVehicle"/>
  </rdf:Description>
</rdf:RDF>
Resources Define Tags
<!-- Abbreviated forms: the class itself is used as the element name -->
<rdfs:Class rdf:ID="MotorVehicle">
  <rdfs:subClassOf rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Resource"/>
</rdfs:Class>
<bill:MotorVehicle rdf:ID="MyChevy"/>
Classes
• Object-oriented notion
• There are classes, arranged in a taxonomy (with subclass relationships)
• Instances can be instances of more than one class
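A small sketch of how a taxonomy and multiple class membership interact. The classes Truck and Antique are hypothetical additions for illustration, and the dictionaries stand in for rdfs:subClassOf and rdf:type statements; this is not how any particular RDF toolkit stores them.

```python
# Direct subclass links: class -> set of direct superclasses.
SUBCLASS_OF = {
    "MotorVehicle": {"Resource"},
    "Truck": {"MotorVehicle"},   # hypothetical class
    "Antique": {"Resource"},     # hypothetical class
}

# Type links: instance -> set of classes it is declared to belong to.
TYPES = {"MyChevy": {"Truck", "Antique"}}

def superclasses(cls):
    """Return every class reachable from cls by following subclass links."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in SUBCLASS_OF.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def is_instance(instance, cls):
    """True if instance is declared in cls or in any subclass of cls."""
    declared = TYPES.get(instance, set())
    return any(cls == d or cls in superclasses(d) for d in declared)

print(is_instance("MyChevy", "MotorVehicle"))
print(is_instance("MyChevy", "Antique"))
```

MyChevy belongs to two unrelated branches of the taxonomy at once, which a single-inheritance object model would not allow.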
Adding a Property
<!-- Long form -->
<rdf:Description rdf:ID="rearSeatLegRoom">
  <rdf:type rdf:resource="http://www.w3.org/TR/1999/PR-rdf-schema-19990303#Property"/>
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdf:Description>
<!-- Equivalent abbreviated form -->
<rdfs:Property rdf:ID="rearSeatLegRoom">
  <rdfs:domain rdf:resource="#MotorVehicle"/>
  <rdfs:range rdf:resource="http://www.w3.org/TR/xmlschema-2/#integer"/>
</rdfs:Property>
Setting Property Values
<!-- rdf:about refers to the existing MyChevy resource instead of declaring it again -->
<rdf:Description rdf:about="#MyChevy">
  <bill:rearSeatLegRoom>47</bill:rearSeatLegRoom>
</rdf:Description>
Properties
• Similar to fields (data members, attributes...)
• Big difference: they’re first class objects
  • Defined independently of classes
  • Asserted independently of classes
• Classes don’t come with a set of data members
• Other people (other pages) can assert properties about your classes and instances without your knowledge or permission
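One way to picture properties as first class objects: each property is its own record, created outside any class definition. In this sketch the Property type, the conforms helper, and the safetyRating property are hypothetical, and treating domain and range as simple checks is a simplification of what RDF Schema specifies.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Property:
    name: str
    domain: str         # class the subject is expected to belong to
    value_type: type    # Python type standing in for the declared range

# The class has no list of members; it is only a name that instances point to.
TYPES = {"MyChevy": "MotorVehicle"}

# Declared by the author of the MotorVehicle vocabulary.
rear_seat_leg_room = Property("rearSeatLegRoom", "MotorVehicle", int)

# Declared later by someone else, without editing MotorVehicle (hypothetical).
safety_rating = Property("safetyRating", "MotorVehicle", int)

def conforms(subject, prop, value):
    """Check a statement against the property's declared domain and range."""
    return TYPES.get(subject) == prop.domain and isinstance(value, prop.value_type)

print(conforms("MyChevy", rear_seat_leg_room, 47))
print(conforms("MyChevy", rear_seat_leg_room, "47"))
print(conforms("MyChevy", safety_rating, 5))
```

Adding safety_rating required no change to TYPES or to any class record, which is the contrast with fields in an object-oriented language.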
The Web of Knowledge
• Corning Fiberglass has a product catalog
• Home Appliances: defines things like “Blender”
• Sears has an on-line store that uses (and extends) both of these as standard vocabularies
The Web of Knowledge (continued)
• Public Opinion and Ratings Terminology
• Consumer Reports uses the product catalogs and attaches more information to them
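The layering described here amounts to merging statements from independent sources that share identifiers. The sketch below assumes each site publishes triples about the same URI; the URI, property names, and values are all invented for illustration.

```python
# One shared identifier, used by every source (hypothetical URI).
BLENDER = "urn:example:blender-x1"

# Statements published independently by three parties (all values invented).
catalog_triples = {(BLENDER, "type", "Blender")}
store_triples = {(BLENDER, "price", 39)}
review_triples = {(BLENDER, "rating", 4)}

# Merging is a set union: no source needs to know about the others.
merged = catalog_triples | store_triples | review_triples

def describe(subject):
    """Collect everything any source has asserted about one subject."""
    return {prop: obj for subj, prop, obj in merged if subj == subject}

print(describe(BLENDER))
```

The reviewer attaches a rating to the catalog entry simply by reusing its identifier, without the catalog owner's involvement.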
What is the Web, RDF Version
• Usually called “The Semantic Web”
Layers, top to bottom: Instances; Schema; RDF and RDF-Schema; XML; HTTP; TCP/IP
Further Information
• http://www.w3.org/RDF/
• http://www.w3.org/2001/sw/
• http://www.semanticweb.org/
• http://www.mozilla.org/rdf/doc/
• http://www.xml.com/pub/a/2001/01/24/rdf.html
• http://xml.coverpages.org/rdf.html
Programmatic Resources
• Protege (http://www-smi.stanford.edu/projects/protege)
• RDF DB (http://web1.guha.com/rdfdb/)
• Redland (http://www.redland.opensource.ac.uk/)
• Java API (http://www-db.stanford.edu/~melnik/rdf/api.html)
• Squish (http://swordfish.rdfweb.org/rdfquery/)
High Profile Uses
• Electric Power Industry (http://www.langdale.com.au/XMLCIM.html)
• DMOZ (http://www.dmoz.org/)
• Epinions (http://www.epinions.com)
• DAML (www.daml.org)