1 / 66

Semantic Web

Semantic Web. Semantic Web Architecture Dieter Fensel Federico Facca Jacek Kopecky. Where Are We?. Agenda. Introduction and motivation Technical solutions Semantic Web architecture Uniform Resource Identifier Extensible Markup Language Namespaces Extensions

Download Presentation

Semantic Web

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Semantic Web Semantic Web Architecture Dieter Fensel Federico Facca Jacek Kopecky

  2. Where Are We?

  3. Agenda • Introduction and motivation • Technical solutions • Semantic Web architecture • Uniform Resource Identifier • Extensible Markup Language • Namespaces • Extensions • Illustration by a large example • Summary • References

  4. Introduction AND MOTIVATION

  5. Web Architecture • The Semantic Web extends the Web • Web architecture principles • Simplicity • Orthogonality of specifications • Identification vs. interaction vs. representation • Better manageability, independent evolution • Extensibility • Extending and subsetting • Web of documents • Wealth of data • Mostly in HTML, XML – data islands

  6. Semantic Web • Semantic data • Powerful queries, inference, reasoning • Linked data • Data integration and reuse • Combining data from differen sources • Semantic automation • Mediation of different vocabularies • Data mining • Automated use of Web services

  7. Semantic Web Architecture • Formalized components and their relationships • What technologies make up Semantic Web • What are the dependencies between components • Roadmap for steps of developing the Semantic Web

  8. The Semantic Web architecture and its foundations TECHNICAL SOLUTION

  9. SemWeb Architecture: Requirements • Extensibility • Each layer should extend the previous one(s) • Support for data interchange • Using data from one source in other applications • Support for ontology description with different complexity • Including rules • Support for data query • Support for data provenance and trust evaluation see the Semantic Web Roadmap: http://www.w3.org/DesignIssues/Semantic.html

  10. Semantic Web Stack Adapted from http://en.wikipedia.org/wiki/Semantic_Web_Stack

  11. UNICODE, URI and XML • UNICODE is the standard international character set • Uniform Resource Identifiers (URIs) identify things and concepts • eXtensible Markup Language (XML) is a markup language used for data exchange

  12. RDF, RDFS and OWL • Resource Description Framework (RDF) is the HTML of the Semantic Web • Simple way to describe resources on the Web • Based on triples <subject, predicate, object> • Various serializations, including one based on XML • A simple ontology language (RDFS) • More in Lecture 3 • Web Ontology Language (OWL) isa more complex ontology language than RDFS • Layered language based on DL • Overcomes some RDF(S) limitations • More in Lecture 7

  13. SPARQL and Rule Languages • SPARQL • Query language for RDF triples • A protocol for querying RDF data over the Web • More in lecture 6 • Rule languages (esp. Rule Interchange Format RIF) • Extend ontology languages with proprietary axioms • Based on different types of logics • Description Logic • Logic Programming • More in lecture 8

  14. Logics, Proof and Trust • Unifying logic • Bring together the various ontology and rule languages • Common inferences, meaning of data • Proof • Explanation of inference results, data provenance • Trust • Trust that the system performs correctly • Trust that the system can explain what it is doing • Network of trust for data sources and services • Technology and user interface • Many open problems, topics for future research

  15. Lecture II: Foundations

  16. More than a-z, A-Z UNICODE

  17. Character Sets ASCII – 7 bit, 128 characters (a-z, A-Z, 0-9, punctuation) Extension code pages – 128 chars (ß, Ä, ñ, ø, Š, etc.) Different systems, many different code pages ISO Latin 1, CP1252 – Western languages (197 = Å) ISO Latin 2, CP1250 – East Europe (197 = Ĺ) Code page is an interpretation, not a property of text UNICODE $ Å Ĺ Æ ήك ⅝ ♥ Жญ U+0024 U+00C5 U+0139 U+00C6 U+03AE U+0643 U+215D U+2665 U+0416 U+0E0D

  18. UNICODE ISO standard About 100'000 characters, space for 1'000'000 Unique code points from U-0000 through U-FFFF to U-10FFFF Well-defined process for adding characters When dealing with any text, simply use UNICODE Character code charts: http://www.unicode.org/charts/ See also: http://www.tbray.org/talks/rubyconf2006.pdf http://tbray.org/ongoing/When/200x/2003/04/06/Unicode

  19. UNICODE Byte Encodings 21 bits per character  various encodings UTF-8: 1-4 bytes per character Optimized towards western alphabets ASCII-safe UTF-16: 2-4 bytes per character UTF-32: 4 bytes per character, simplest to work with Code Char UTF-8 UTF-16 UTF-32 U+0041 A 41 0041 00000041 U+00DF ß C39F 00DF 000000DF U+0E0D ญ E0B88D 0E0D 00000E0D U+10380 F0908E80 D800DF80 00010380

  20. URI: UNIFORM RESOURCE IDENTIFIERS How to identify things on the Web

  21. Identifier, Resource, Representation Taken from http://www.w3.org/TR/webarch/

  22. URIs Are Everywhere! http://www.flickr.com/photos/psd/2450160/

  23. URI Syntax • Examples • http://www.ietf.org/rfc/rfc3986.txt • mailto:John.Doe@example.com • news:comp.infosystems.www.servers.unix • telnet://melvyl.ucop.edu/ • URI Syntax scheme: [//authority] [/path] [?query] [#fragid] • The scheme distinguishes different kinds of URIs • Authority normally identifies a server • Path normally identifies a directory and a file • Query adds extra parameters • Fragment ID identifies a secondary resource

  24. URI Syntax cont’d • Reserved characters (like /:?#@$&+* ) • Many allowed characters • Rest percent-encoded from UTF-8 • http://google.com/search?q=technikerstra%C3%9Fe • IRI – Internationalized Resource Identifier • Allows whole UNICODE • Specifies transformation into URI – mostly UTF-8 encoding

  25. URI Schemes • Schemes partition the URI space into subspaces • Schemes can add or clarify properties of resources • Ownership (how authorities are formed) • Persistence (how stable the URIs should be) • Protocol (default access protocol) From http://www.iana.org/assignments/uri-schemes.html

  26. URLs and URNs • Locators (addresses) vs. Names • Pure URNs not easily dereferencable • URNs can be made dereferencable by infrastructure • URLs perceived as less persistent • URLs and URNs drifting towards middle ground • http://www.w3.org/DesignIssues/NameMyth.html • No point in making the distinction any more • Simply always use "URI" • If you didn't know about URLs and URNs, ignore this slide

  27. Fragment IDs in URIs • Fragment ID identifies asecondary resource • No direct path to representation • Interpretation of fragment IDs depends on media type • For example: http://example.com/file#foo • In HTML <a name="foo"> • In XML <element xml:id="foo"/> • No meaning in JPEG

  28. URIs: Main Points • Cool URIs don’t change • URIs can be (and are) scribbled on napkins • URIs don’t necessarily point to documents • A URI can identify anything, even you • Dereferencable URIs also good as names • URL – URN distinction obsolete

  29. How to exchange structured data on the Web XML: EXTENSIBLE MARKUP LANGUAGE

  30. eXtensible Markup Language Language for creating languages “Meta-language” XHTML is a language: HTML expressed in XML W3C Recommendation (standard) XML is, for the information industry, what the container is for international shipping For structured and semistructured data Main plus: wide support, interoperability Platform-independent Applying new tools to old data

  31. Structure of XML Documents Elements, attributes, content One root element in document Characters, child elements in content

  32. XML Element • Syntax <name>contents</name> • <name> is called the opening tag • </name>is called the closing tag • Examples <gender>Female</gender> <story>Once upon a time there was…. </story> • Element names case-sensitive

  33. Attributes to XML Elements • Name/value pairs, part of element contents • Syntax <name attribute_name="attribute_value">contents</name> • Values surrounded by single or double quotes • Example <temperature unit="F">64</temperature> <swearword language='fr'>con</swearword>

  34. Empty Elements Empty element: <name></name> This can be shortened: <name/> Empty elements may have attributes Example <grade value='A'/>

  35. Comments May occur anywhere in element contents or outside the root element Start with <!-- End with --> May not contain a double hyphen Comments cannot be nested Example:<element>content <!-- a comment, will be ignored in processing --></element><!-- comment outside the root element -->

  36. Nesting Elements Elements may contain other (child) elements The containing element is the parent element Elements must be properly nested Example with improper nesting: <b>bold <i>bold-italic</b> italic?</i> The above is not XML (not well-formed)

  37. Special Characters in XML < and > are obviously reserved in content Written as &lt; and &gt; Same for ' and " in attribute values Written as &apos; and &quot; Now & is also reserved Written as &amp; Any character: &#223; or &#xdf;  ß Decimal or hexa-decimal unicode code point Elements and attributes whose name starts with “xml”are also special

  38. Well-formed XML Syntactically correct XML <jsdfiew bnvyu="099430'693lkjds3!" /> Not well-formed (every line contains syntax errors) <a b='c"> <%$@$#%#!! /> <d e f=g/> &nbsp </h>

  39. Uses of XML Document mark-up – XHTML HTML is a language, so it can be expressed in XML Exchanged data Scalable vector graphics – SVG E-commerce – ebXML Messaging in general – SOAP And many more standards Internal data Databases Configuration files Etc.

  40. Why XML? For semistructured data: Loose but constrained structure Unspecified content length For structured data: Table(s) or similar rows Well-defined structure, data types Good interoperability But: requirements for quick access, processing

  41. XML Parsers Document Object Model (DOM) builder Creates an object model of XML document In-memory representation, random access DOM complex, simpler JDOM etc. Simple API for XML parsing (SAX) Views XML as stream of events el_start("date"), attribute("day", "10"), el_end("date") Calls your handler DOM builder can use SAX Pull parsers You call source of XML events

  42. What Parser To Use? Random access  DOM (JDOM) Well-formedness requirement DOM Or very good error handling, transactions Great speed  pull, SAX Streaming  pull, SAX Passing the stream around in the program  pull (not SAX) Easy prototyping  DOM

  43. XML LANGUAGES Using XML for concrete data, mixing different types of data

  44. XML Languages XML is a meta-language HTML is a language Or scalable vector picture format Or purchase order format Language should be formally described In textual documentation Grammar can be machine-processible DTD, XML Schema (out of scope of this class) A document that fits the language grammar is valid Naturally, it must be well-formed first

  45. No Registry of XML Languages <sign> Traffic sign? Positive vs. negative number sign? Sign-language element? Signature?

  46. Namespaces Avoiding conflicts, misunderstanding Giving a URI to an XML language http://traffic.example/inventory/ http://math.example/functions/ http://deaf.example/language/ http://security.example/digsig/ Combined as a string! <sign xmlns="http://example.com/math/functions/"> + </sign> xmlns attribute defines the namespace of an element So-called default namespace

  47. Combining XML Languages Potentially lot of URI repetition <container> <sign xmlns="http://example.com/math/functions/"> + </sign> <sign xmlns="http://example.com/security/digsig/"> fad87b6dfb875bfda4ce67… </sign> </container>

  48. Combining XML Languages 2 Named namespaces with prefixes Prefix part of name – QName (qualified name) But prefix is insignificant "a:sign" can be different signs in different places "a:sign" can be same as "b:sign" (and even just "sign") <container xmlns:math="http://example.com/math/functions/" xmlns:sec="http://example.com/security/digsig/" > <math:sign> + </math:sign> <sec:sign>fad87b6dfb875bfda4ce67…</sec:sign> </container>

  49. Example Document I • Namespace binding <?xml version='1.0' encoding='UTF-8'?> <order> <item code='BK123'> <name>Care and Feeding of Wombats</name> <desc xmlns:html='http://www.w3.org/1999/xhtml'> The <html:b>best</html:b> book ever written! </desc> </item> </order>

  50. Example Document II • Scope of namespace binding <?xml version='1.0' encoding='UTF-8'?> <order> <item code='BK123'> <name>Care and Feeding of Wombats</name> <desc xmlns:html='http://www.w3.org/1999/xhtml'> The <html:b>best</html:b> book ever written! </desc> </item> </order>

More Related