Semantic Web
Ivana Vujovic, ile@eunet.yu
Prof. Dr. Erich Neuhold, neuhold@ipsi.fhg.de
Dr. Peter Fankhauser, fankhaus@ipsi.fhg.de
Claudia Niederée, niederee@ipsi.fhg.de
Prof. Dr. Veljko Milutinovic, vm@etf.bg.ac.yu
http://galeb.etf.bg.ac.yu/~vm
Tutorial Structure (Overview) • Introduction to the Semantic Web • XML Technologies for the Semantic Web • Defining vocabularies with RDF • Ontologies and ontology languages • Challenges for the Semantic Web • References
World Wide Web - Today
• How do we find information on the Web?
• By giving keywords to a search engine, which returns pointers to web pages containing those keywords
• What if you want more?
World Wide Web - Today
[Diagram: information consumers, with their preferences, send information requests to search engines (e.g. Google) and information portals; these offer indexing, references, and collections over the content of information and service providers.]
Semantic Web - Vision
[Diagram: a user's request or task, together with user preferences and calendars, is interpreted by agents (communication, negotiation, planning, decisions, proofs). The agents work with semantically enriched information (S+) from information and service providers and rely on "trust" services (ratings, signatures, certificates).]
A Definition of the Semantic Web “Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” Tim Berners-Lee, James Hendler, Ora Lassila, The Semantic Web, Scientific American, May 2001
The WEB - Reality Check • „Extension of the current Web“ • Over 500 million users • More than 3 billion pages • „Well-defined meaning ?“ • Format Crisis (XML 1.0) ? • Schema Crisis (XML Schema) ? • Semantic Crisis (RDF) ? • „Computers and people work cooperatively“ • Effective? • Scalable? • Trust?
Why?
• To use the large amount of information on the Web more effectively
• To enable more advanced automated processing on the Web: machines can "understand" the content
• Intelligent browsers that help you find what you are looking for
• To derive new information from existing information (reasoning): a virtual global database
• Advanced applications and services become possible, e.g. in e-business, e-government, and e-learning
Examples • Context-awareness --linking based on the meaning of the information elements • Filtering -- you could rate the pages you visit, and this is later used for automatic general recommendations • Annotations -- you could add comments to the information on the Web, and these comments can be shown to other visitors • Privatization -- you can create your own database of information from the Web
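[Timeline diagram, from document exchange between humans and machines to machine-to-machine exchange:]
• 1985: SGML, HyTime (document exchange format)
• 1990: HTTP, HTML (foundation of the Web today)
• 2000: XML, RDF (self-describing documents)
• 2010: DAML+OIL, OWL (shared terminology, trusted Web resources)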
Building Blocks of the Semantic Web
• Metadata: data about data; labeling and structuring information in a document
• URI (Uniform Resource Identifier): a universal and unique name for any resource, e.g. http://www.something.com/one
Design Goal: Evolution Support
• Enable the combination of independently designed services, standards, vocabularies, etc.
• Build new techniques on top of old ones, without altering them
• Use description techniques that can develop along with the evolution of human understanding
• Partial understanding and transformability
Minimalist Design • Making it as simple as possible • Simplicity helps future evolution of Semantic Web
Inference
• Deriving new data from existing data
• Merging data repositories yields new information
• Allows the creation of more powerful applications (intelligent agents)
• Unfortunately, complete inference is possible only when the semantics is defined formally in a language (e.g. a first-order predicate logic language)
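As a rough illustration of deriving new data from merged repositories, the Python sketch below applies one hand-written rule (transitivity of a hypothetical `partOf` relation) to triples from two sources. The triples, the relation name, and the `infer_transitive` helper are assumptions made for this example only; real Semantic Web inference relies on formally defined languages, as the slide notes.

```python
def infer_transitive(triples, predicate):
    """Return the input triples plus everything implied by transitivity of `predicate`."""
    known = set(triples)
    changed = True
    while changed:  # repeat until no new triple can be derived (fixed point)
        changed = False
        for (a, p1, b) in list(known):
            for (c, p2, d) in list(known):
                if p1 == p2 == predicate and b == c and (a, predicate, d) not in known:
                    known.add((a, predicate, d))
                    changed = True
    return known


# Two independently maintained (hypothetical) repositories.
repo_a = {("Belgrade", "partOf", "Serbia")}
repo_b = {("Serbia", "partOf", "Europe")}

merged = infer_transitive(repo_a | repo_b, "partOf")
new_facts = merged - (repo_a | repo_b)
print(new_facts)
```

Neither repository alone contains the derived triple; it only appears once the two are merged and the rule is applied.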
Tutorial Structure • Introduction to the Semantic Web • XML Technologies for the Semantic Web • Defining vocabularies with RDF • Ontologies and ontology languages • Challenges for the Semantic Web • References
XML Technologies for the Semantic Web • Overview • XML Instances • XML Document Type Definition • XML Linking • XML Schema • XML Query Language
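What is an XML Document?
File format (instance):
  <?xml version="1.0"?>
  <a>
    <b id="x1">
      <c>David</c>
      <c>Marie</c>
    </b>
    <d/>
    <b id="x2">
      <c>John</c>
    </b>
  </a>
[Diagram: the same instance drawn as a tree structure (root a; children b with id=x1, d, and b with id=x2; the c elements hold David, Marie, and John), next to its schema (Document Type Definition, DTD), in which b and c are repeatable (*) and b carries the attribute id.]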
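The relation between the file format and the tree structure can be sketched with Python's standard `xml.etree.ElementTree` module. The `outline` helper is a hypothetical name introduced for this illustration; it simply prints one indented line per element.

```python
import xml.etree.ElementTree as ET

DOC = """<?xml version="1.0"?>
<a>
  <b id="x1">
    <c>David</c>
    <c>Marie</c>
  </b>
  <d/>
  <b id="x2">
    <c>John</c>
  </b>
</a>"""


def outline(element, depth=0):
    """Return one indented line per element, showing attributes and text."""
    text = (element.text or "").strip()
    label = element.tag
    if element.attrib:
        label += " " + " ".join(f"{k}={v}" for k, v in element.attrib.items())
    if text:
        label += ": " + text
    lines = ["  " * depth + label]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(DOC)
tree_lines = outline(root)
print("\n".join(tree_lines))
```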
XML (eXtensible Markup Language) • XML is a text-based metalanguage format for data exchange • Provides a pathway to transfer data easily between dissimilar applications and servers • Markup – identifies structures in the document (<p></p>) • DTD – specifies the structure of XML files • XML Schema – adds types to XML • XML Query – a typed query language for XML documents
The XML Stack
• Specific applications
• Standardized applications: XHTML, SVG, SMIL, P3P, MathML
• Layout: XSL, CSS
• Hyperlinks: XLink, XPointer
• Metadata: RDF, RDFS
• API: DOM, SAX
• Schemas: XSD, Namespaces
• Queries: XPath, XQuery
• XML 1.0, DTDs
• Locators (URI), Unicode
Overall Structure of an XML Document
• Anywhere in the XML document: comments, processing instructions, character references
• Prologue with document type declaration
• Document type definition (DTD): element, attribute, entity, parameter entity, parameter entity reference; external DTD subset; conditional sections
• Document element with namespace declaration
• Document: start tag, end tag, empty-element tag, PCDATA, CDATA sections, entity references
• Epilogue
Example of songs.xml
• Describing a song in songs.xml using music.dtd:
  <song>
    <title>Gipsy song</title>
    <artist>Vlatko Stefanovski</artist>
    <type class="ETHNO"/>
    <download class="YES"/>
    <comments/>
  </song>
• song is the parent element defined in music.dtd
• title, artist, type, download, and comments are child elements defined in music.dtd
XML Documents - Instances
• XML content is written between a start tag and an end tag
• In our example it would be <MUSIC>…</MUSIC>
• XML documents must be well formed:
  • Every document must have exactly one root element
  • Every start tag must have a corresponding end tag
  • Tags must not interleave: <a><b></a></b> is not allowed
• Other elements in an XML document consist of a start tag, the element content, and an end tag, as described in the related DTD
• An element with nothing between its start tag and end tag, or written as a single empty-element tag with a closing slash (<E/>), is an empty element
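The well-formedness rules can be tried out with any XML parser. The following is a minimal Python sketch using the standard library; `is_well_formed` and the sample strings are names and data made up for this illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "nested": "<a><b></b></a>",
    "interleaved": "<a><b></a></b>",
    "missing end tag": "<a><b></a>",
    "two roots": "<a/><b/>",
    "empty element": "<E/>",
}
results = {name: is_well_formed(text) for name, text in samples.items()}
print(results)
```

Note that this only checks well-formedness; it says nothing about validity against a DTD.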
Document type declaration and document element
External DTD declaration:
  <?xml version="1.0" encoding="ISO-8859-1"?>
  <!DOCTYPE test PUBLIC "-//Test GmbH//DTD test V1.0//EN" "http://www.test.de/test.dtd">
  <test> "test" is the document element </test>
Internal DTD declaration:
  <!DOCTYPE test [ <!ELEMENT test EMPTY> ]>
  <test/>
Mixed usage:
  <!DOCTYPE test SYSTEM "http://www.test.de/test.dtd" [
    <!ENTITY hello "hello world">
  ]>
  <test>&hello;</test>
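To see what an entity declared in the internal DTD subset does, the sketch below parses a reduced variant of the mixed-usage example with Python's standard parser. It keeps only the internal subset (the external `SYSTEM` identifier is left out so that nothing has to be fetched), which is an assumption of this illustration rather than part of the slide.

```python
import xml.etree.ElementTree as ET

# Internal subset only: the entity "hello" is declared inside the DOCTYPE.
MIXED = (
    '<!DOCTYPE test [<!ENTITY hello "hello world">]>'
    "<test>&hello;</test>"
)

root = ET.fromstring(MIXED)
print(root.tag, "->", root.text)
```

The parser replaces the entity reference `&hello;` with its declared replacement text before the application sees the element content.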
DTD Element Declarations (1)
  <!ELEMENT elementname (contentmodel)>
• Element content
  <!ELEMENT example ( a )>
• Text and mixed content
  <!ELEMENT example (#PCDATA)>
  <!ELEMENT example (#PCDATA | a)*>
• Empty element
  <!ELEMENT example EMPTY>
• Element with arbitrary content
  <!ELEMENT example ANY>
DTD Element Declarations (2)
• Sequence
  <!ELEMENT example ( a, b )>
• Alternative
  <!ELEMENT example ( a | b )>
• Optional (zero or one)
  <!ELEMENT example ( a )?>
• Optional and repeatable (zero or more)
  <!ELEMENT example ( a )*>
• Required and repeatable (one or more)
  <!ELEMENT example ( a )+>
• Parentheses can be used for grouping content models
DTD Attribute Declaration
  <!ATTLIST elementname attributename type restriction
                        attributename type restriction ...>
Possible restrictions:
• Required attribute: #REQUIRED
• Optional attribute: #IMPLIED
• Fixed attribute: #FIXED "value"
• Default value (e.g. for enumeration types): "value"
DTD Attributes: Types
• CDATA
  • Character string
  • <!ATTLIST example HREF CDATA #REQUIRED>
• Enumeration type
  • <!ATTLIST example selection ( yes | no | maybe ) "yes">
• ID, IDREF
  • ID serves as a unique key within a document
  • IDREF refers to a key
  • Referential integrity is checked by a validating parser
  • <!ATTLIST example identity ID #IMPLIED reference IDREF #IMPLIED>
Music.dtd
  <!ELEMENT song (title, artist, album?, type, format?, download, comments?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT artist (#PCDATA)>
  <!ELEMENT type EMPTY>
  <!ATTLIST type
    class (CLASSICAL | ROCK | POP | RAP | JAZZ | TECHNO | ETHNO) #REQUIRED>
  <!ELEMENT download EMPTY>
  <!ATTLIST download
    class (YES | NO) "YES">
  <!ELEMENT comments (#PCDATA)>
• song is the parent element; the elements listed in its content model are its child elements
• Attributes describe content: class on type and on download
• (YES | NO) is the list of values for download, with "YES" as the default
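A DTD content model is essentially a regular expression over child element names. The Python sketch below makes that concrete for the `song` content model; it handles only a flat sequence with optional (`?`) members and is not a DTD validator. `model_to_regex` and `matches_model` are hypothetical helper names introduced for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Content model copied from music.dtd: a sequence with some optional members.
SONG_MODEL = "title, artist, album?, type, format?, download, comments?"


def model_to_regex(model):
    """Translate a flat DTD sequence (names with optional '?') into a regex."""
    parts = []
    for item in (token.strip() for token in model.split(",")):
        if item.endswith("?"):
            parts.append(f"(?:{re.escape(item[:-1])},)?")
        else:
            parts.append(f"{re.escape(item)},")
    return re.compile("".join(parts))


def matches_model(element, pattern):
    """Check the order of an element's children against the compiled model."""
    child_names = "".join(child.tag + "," for child in element)
    return pattern.fullmatch(child_names) is not None


SONG = ET.fromstring(
    "<song><title>Gipsy song</title><artist>Vlatko Stefanovski</artist>"
    '<type class="ETHNO"/><download class="YES"/><comments/></song>'
)
BAD = ET.fromstring(
    "<song><artist>Vlatko Stefanovski</artist><title>Gipsy song</title></song>"
)

pattern = model_to_regex(SONG_MODEL)
song_ok = matches_model(SONG, pattern)
bad_ok = matches_model(BAD, pattern)
print(song_ok, bad_ok)
```

The first song omits the optional `album` and `format` children and still matches; the second has the children in the wrong order and lacks required ones, so it does not.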
XML Linking
• Simple link
• Extended link
• XPointer
• Link group
XPath
• A language for addressing parts of an XML document (elements, attributes, …)
• Select the title elements of the song elements of the catalog element, and all the artist elements in the document:
  /catalog/song/title | //artist
  • / selects a child element
  • // selects any matching element in the document
  • | combines several paths
• Select the title elements of all song elements of the catalog element whose download element has the value yes:
  /catalog/song[download='yes']/title
Also…
• Use * to select unknown XML elements: /catalog/*/artist
• Use @attribute_name to specify an attribute: //song[@type='classical']
• XPath expressions can be logical or arithmetical: /catalog/song[duration<5]
• XPath functions: count(), id(), last(), name(), concat(), string(), translate(), sum(), round(), false(), not(), …
  /catalog/song[last()]
• To select nodes from the XML document (in Internet Explorer), pass the path to selectNodes:
  xmlDoc.selectNodes("/catalog/song/title/text()")
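Several of these path expressions can be tried with Python's standard `xml.etree.ElementTree`, which implements only a subset of XPath (for instance, it does not support the `|` union operator or comparisons such as `duration<5`). The catalog below is made-up sample data, and paths are written relative to the `catalog` root element.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample catalog used only for this illustration.
CATALOG = ET.fromstring("""
<catalog>
  <song type="classical">
    <title>Air</title><artist>Bach</artist><download>yes</download>
  </song>
  <song type="ethno">
    <title>Gipsy song</title><artist>Vlatko Stefanovski</artist><download>no</download>
  </song>
</catalog>
""")

# /catalog/song/title
titles = [t.text for t in CATALOG.findall("./song/title")]
# //artist
artists = [a.text for a in CATALOG.findall(".//artist")]
# /catalog/song[download='yes']/title
downloadable = [t.text for t in CATALOG.findall("./song[download='yes']/title")]
# /catalog/*/artist
wildcard = [a.text for a in CATALOG.findall("./*/artist")]
# //song[@type='classical']
classical = [s.findtext("title") for s in CATALOG.findall(".//song[@type='classical']")]
# /catalog/song[last()]
last_song = [s.findtext("title") for s in CATALOG.findall("./song[last()]")]

print(titles, artists, downloadable, wildcard, classical, last_song)
```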
XPointer
• Locates portions of other XML documents (elements, attributes, …) without the need to place anchors inside those documents (as in HTML)
• More robust against changes in the target document
• URL + XPath:
  http://www.music.org/first.xml#xpointer(//song/title[1])
  • The part before # is the URL of the document we point into
  • The part inside xpointer(…) is the XPointer expression (XPath language)
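The "URL + XPath" composition can be made explicit with a few lines of Python. This is only a sketch of how such a reference decomposes; `split_xpointer` is a hypothetical helper, and a real XPointer processor handles considerably more (schemes, escaping, multiple pointer parts).

```python
def split_xpointer(uri):
    """Split 'URL#xpointer(expr)' into (document URL, expr)."""
    url, _, fragment = uri.partition("#")
    prefix, suffix = "xpointer(", ")"
    if not (fragment.startswith(prefix) and fragment.endswith(suffix)):
        raise ValueError("no xpointer() fragment found")
    return url, fragment[len(prefix):-len(suffix)]


document_url, expression = split_xpointer(
    "http://www.music.org/first.xml#xpointer(//song/title[1])"
)
print(document_url)
print(expression)
```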
XLink: Example for an Extended Link
  <?xml version="1.0"?>
  <product xml:link="extended" inline="false" title="Gipsy Songs" id="id12345">
    <locator href="desc.xml" role="Description"/>
    <locator href="img.gif" role="Image"/>
    <locator href="urn:rdbm: select price from products where id='id12345'" role="Price"/>
  </product>
• The product element groups three locator elements
• XLinks are first-class objects
XML Schema
• An XML Schema defines a class of XML documents
• Defines (explains) the datatypes, elements, and attributes
• Defines and catalogues vocabularies for classes of XML documents
• A document described by an XML schema can be called an instance of it (parallel to classes and objects in OOP)
• The schema language considerably extends the capabilities of XML 1.0 document type definitions (DTDs), most importantly with datatypes
Limitations of DTDs
  <!ELEMENT song (title, artist, album?, type, format?, download, comments?)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT artist (#PCDATA)>
  <!ELEMENT type EMPTY>
  <!ATTLIST type
    class (CLASSICAL | ROCK | POP | RAP | JAZZ | TECHNO | ETHNO) #REQUIRED>
  <!ELEMENT download EMPTY>
  <!ATTLIST download
    class (YES | NO) "YES">
  <!ELEMENT comments (#PCDATA)>
• Syntax: not XML
• Practically no reuse of content models
• Constructors: element set with content model
• Datatypes: essentially only "String"
XML Schema Components
• An XML Schema consists of a set of schema components
• There are three groups of components:
  • Primary components: simple type definitions, complex type definitions, attribute declarations, element declarations
  • Secondary components: attribute group definitions, identity-constraint definitions, model group definitions, notation declarations
  • "Helper" components: annotations, model groups, particles, wildcards, attribute uses
Example: song type definition
  <xsd:complexType name="song">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element name="artist" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="length" type="xsd:duration"/>
  </xsd:complexType>
• The xsd prefix denotes the XML Schema namespace
• [Diagram: a type declaration is a choice (<xsd:choice>) between a complex type and a simple type]
Simple types
• Simple types cannot have element content and cannot carry attributes
• Defined with the simpleType element
• Built-in types include string, int, unsignedInt, long, byte, token, decimal, float, double, time, duration, gMonth, Name, language, ID, ENTITY, NOTATION, NMTOKEN
• New simple types can be created:
  • by restriction from the built-in types (e.g. a range of values), using facets such as pattern and enumeration
  • as list types, derived from existing atomic types (with facets such as length, minLength, enumeration, …)
  • as union types
Complex types
• Complex types allow elements in their content and may carry attributes
• Defined using the complexType element
• Usually contain a set of element declarations, element references, and attribute declarations
Example.xsd
  <xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
    <xsd:annotation>
      <xsd:documentation xml:lang="en">
        Music album schema
      </xsd:documentation>
    </xsd:annotation>
    <xsd:element name="Musicalbum" type="MusicalbumType"/>
    <xsd:element name="comment" type="xsd:string"/>
    <xsd:complexType name="MusicalbumType">
      <xsd:sequence>
        <xsd:element name="title" type="xsd:string"/>
        <xsd:element name="contains" type="song"/>
        <xsd:element ref="comment" minOccurs="0"/>
        <xsd:element name="publisher" type="xsd:string"/>
      </xsd:sequence>
      <xsd:attribute name="PublishDate" type="xsd:date"/>
    </xsd:complexType>
    (the song type definition from the song example goes here)
  </xsd:schema>
• The xsd:annotation element carries the human-readable documentation of the schema
Global elements and attributes
• A global element or attribute is declared as a direct child of the schema element
• It can be referenced later in the schema by using the "ref" attribute:
  <xsd:element ref="comment" minOccurs="0"/>
• minOccurs is an occurrence constraint: here comment must appear at least 0 times, i.e. it is optional
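Because a schema is itself an XML document, it can be inspected with ordinary XML tools. The Python sketch below parses a shortened variant of Example.xsd (the `contains` and `publisher` elements and the annotation are left out for brevity) and lists the global element declarations and the references to them. It does not validate anything; the standard library has no XML Schema validator.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Shortened variant of Example.xsd, for illustration only.
SCHEMA = ET.fromstring(f"""
<xsd:schema xmlns:xsd="{XSD_NS}">
  <xsd:element name="Musicalbum" type="MusicalbumType"/>
  <xsd:element name="comment" type="xsd:string"/>
  <xsd:complexType name="MusicalbumType">
    <xsd:sequence>
      <xsd:element name="title" type="xsd:string"/>
      <xsd:element ref="comment" minOccurs="0"/>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
""")

ns = {"xsd": XSD_NS}

# Global declarations are direct children of <xsd:schema>.
global_elements = {
    e.get("name"): e.get("type") for e in SCHEMA.findall("xsd:element", ns)
}

# References point back to a global declaration by name;
# minOccurs defaults to 1 when it is not given.
references = [
    (e.get("ref"), e.get("minOccurs", "1"))
    for e in SCHEMA.iterfind(".//xsd:element[@ref]", ns)
]

print(global_elements)
print(references)
```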
Local elements and attributes
• Local elements and attributes are defined within the context of some (complex) type definition
• Elements (and attributes) with the same name may have different types in different contexts
• DTDs only allow for global elements and local attributes
• Example:
  <xsd:complexType name="personNameType">
    <xsd:sequence>
      <xsd:element name="first" type="xsd:string"/>
      <xsd:element name="last" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="competitionResultType">
    <xsd:sequence>
      <xsd:element name="first" type="personNameType"/>
      <xsd:element name="runnerUp" type="personNameType"/>
    </xsd:sequence>
  </xsd:complexType>
• The element <first> has two different types (and meanings) depending on its context
Reusability of schemas
• xs:include: includes a schema from another document (copy-paste)
  <xs:include schemaLocation="collection.xsd"/>
• xs:redefine: same as include, but additionally lets you redefine the included components
• xs:import: reuses definitions from other namespaces (a system of libraries)
  <xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="myxml.xsd"/>
  Now we can reference an external element from the imported namespace in our schema
XML query language
• XQuery is a language for querying XML data
• It is a fully compositional, functional, strongly typed language consisting of expressions
• Expressions return sequences of nodes, sequences of simple values, or mixed sequences of nodes and simple values
  for $x in (1 to 10) return ($x*2)
  evaluates to 2, 4, 6, 8, 10, 12, 14, 16, 18, 20
• Expressive power:
  • relationally complete (of course)
  • Turing complete (unfortunately)
• Four main constituents:
  • FLWR expressions (compare to SQL)
  • XPath expressions
  • Element construction
  • System-defined and user-defined functions
Comparison of XQuery and SQL
• XQuery:
  for $c in book, $o in magazine
  where $c/author_id = $o/author_id
  return $c/name
• SQL:
  select book.name
  from book, magazine
  where book.author_id = magazine.author_id
F(or)L(et)W(here)R(eturn) expressions
• The central control structure of XQuery
• FOR iterates over (several) sequences, producing tuples of iteration variable bindings
• LET binds interim results to variables
• WHERE filters the tuples, which can be used for joins
• RETURN realizes the body of the iteration started by FOR; it is typically used for constructing result elements
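For readers more familiar with general-purpose languages, the four clauses map roughly onto a loop with a filter. The Python sketch below mirrors the `(1 to 10)` expression and the book/magazine join; the dictionaries are made-up stand-ins for the XML data, and the correspondence is an analogy rather than XQuery semantics.

```python
# for $x in (1 to 10) return ($x * 2)
doubled = [x * 2 for x in range(1, 11)]

# Hypothetical data standing in for the book and magazine sequences.
books = [
    {"name": "XML Basics", "author_id": 1},
    {"name": "RDF Primer", "author_id": 2},
]
magazines = [{"name": "Web Weekly", "author_id": 2}]

joined = []
for c in books:                       # FOR: iterate, producing variable bindings
    for o in magazines:
        key = c["author_id"]          # LET: bind an interim result to a variable
        if key == o["author_id"]:     # WHERE: filter the tuples (here: a join)
            joined.append(c["name"])  # RETURN: build the result for this tuple

print(doubled)
print(joined)
```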
XQuery element constructors
• Element construction looks just like XML: a start tag, an end tag, and optional content in between
  <ele persid="{$id}">
    {$name}
    {$age}
  </ele>
• This constructs an element <ele> whose attribute @persid is filled with the value of the variable $id and whose content is filled with the nodes bound to the variables $name and $age
• XQuery uses XPath for querying nested structures:
  $album//song[@class="something"]
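The effect of the `<ele>` constructor can be imitated with Python's standard `xml.etree.ElementTree`. This is only an analogy to show what gets built; `build_ele` and the sample values are assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET


def build_ele(pers_id, name, age):
    """Mimic <ele persid="{$id}">{$name}{$age}</ele> with ElementTree."""
    ele = ET.Element("ele", {"persid": pers_id})
    ele.append(name)  # content nodes are inserted as children, in order
    ele.append(age)
    return ele


# Nodes that would be bound to $name and $age in the query.
name = ET.Element("name")
name.text = "John"
age = ET.Element("age")
age.text = "42"

result = ET.tostring(build_ele("x2", name, age), encoding="unicode")
print(result)
```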
Tutorial Structure • Introduction to the Semantic Web • XML Technologies for the Semantic Web • Defining vocabularies with RDF • Ontologies and ontology languages • Challenges for the Semantic Web • References