Semantic Web Andrejs Lesovskis
Agenda • Syntax and semantics • Introduction to Semantic Web • Semantic Web layers • Projects that use Semantic Web technologies
Syntax and semantics (1) • Syntax is the study of the rules governing the way words are combined to form sentences in a language. • In computer science it refers to the ways symbols may be combined to create well-formed programs in a language. • It defines the formal relations between the constituents of a language.
Syntax and semantics (2) • Semantics is the study of the meaning of linguistic expressions. The language can be a natural language, such as English or Navajo, or an artificial language, like a computer programming language. • Natural-language semantics is important in trying to make computers better able to deal directly with human languages.
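The distinction between the two slides above can be sketched in a programming language. This is a minimal illustration in Python, not part of the original deck; the helper names `is_well_formed` and `is_meaningful` are made up for this example. The first check looks only at syntax (can the text be parsed?), the second also asks whether the expression has a usable meaning when evaluated.

```python
import ast


def is_well_formed(source: str) -> bool:
    """Syntax check: does the text obey Python's grammar?"""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def is_meaningful(source: str) -> bool:
    """Semantics check (very rough): can the expression be evaluated?"""
    try:
        eval(source, {"__builtins__": {}})
        return True
    except Exception:
        return False


print(is_well_formed("1 + )"))      # broken syntax
print(is_well_formed("1 + 'two'"))  # valid syntax ...
print(is_meaningful("1 + 'two'"))   # ... but no sensible meaning
print(is_meaningful("1 + 2"))       # valid syntax and meaning
```

The expression `1 + 'two'` is the interesting case: it is well-formed, yet adding a number to a string has no defined meaning in Python.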
What is the Semantic Web? "The Semantic Web is not a separate Web but an extension of the current one (World Wide Web – WWW), in which information is given well-defined meaning, better enabling computers and people to work in cooperation. ... a web of data that can be processed directly and indirectly by machines." Tim Berners-Lee, James Hendler, and Ora Lassila.
Semantic Web and World Wide Web [Figure labels: applications, agents]
Resource integration [Figure labels: semantic annotations; shared ontology; Web resources, services, databases]
Resource integration [Figure labels: industrial and business processes; external resources; Web resources, services, databases; Web users; shared ontology; multimedia resources; Web agents/applications; mobile devices; machines and devices]
Semantic Web inventor Sir Timothy Berners-Lee is best known as the inventor of the World Wide Web. Berners-Lee is the director of the World Wide Web Consortium (W3C), which oversees the Web's continued development.
Semantic Web layers (2) • URI and Unicode • XML (eXtensible Markup Language) • RDF (Resource Description Framework) • Ontology • Logic • Proof • Trust • User interface and applications
Project OpenCalais (Thomson Reuters) • Thomson Reuters launched project Calais in January 2008. • Calais Web Service processes unstructured text (like news articles, blog postings, scientific papers, etc.) and it returns semantic metadata in RDF format. • Uses natural language processing and machine learning techniques to examine the text and locate the entities, facts, and events.
Project DBpedia.org (1) • DBpedia is a project that aims to extract structured content from the information created as part of the Wikipedia project ("infobox" tables). This structured information is then made available on the World Wide Web. • The DBpedia knowledge base allows users to query relationships and properties associated with Wikipedia resources, including links to other related datasets. • Technologies used: Scala, Java, Virtuoso Universal Server.
Project DBpedia.org (4) DBpedia project results: • Data extracted from 97 languages; • The English version of the DBpedia knowledge base currently describes 3.77 million things, including 764,000 persons, 573,000 places, 333,000 creative works, 192,000 organizations, 202,000 species and 5,500 diseases; • Contains more than 672 million RDF triples; • Tests show 87% precision; • A large multi-domain ontology was developed.
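The DBpedia slides describe a knowledge base made of RDF triples that can be queried for relationships and properties. As a rough sketch of that idea (not DBpedia's actual data or query interface), a triple can be modeled as a (subject, predicate, object) tuple and a query as a pattern with wildcards. The `http://example.org/` identifiers and the `match` helper below are placeholders invented for this illustration.

```python
# Placeholder namespace; these are NOT real DBpedia identifiers.
EX = "http://example.org/"

# Each triple is (subject, predicate, object).
triples = [
    (EX + "Riga", EX + "type", EX + "City"),
    (EX + "Riga", EX + "country", EX + "Latvia"),
    (EX + "Latvia", EX + "type", EX + "Country"),
]


def match(store, s=None, p=None, o=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


# Everything known about Riga:
for triple in match(triples, s=EX + "Riga"):
    print(triple)

# All resources typed as a city:
print([t[0] for t in match(triples, p=EX + "type", o=EX + "City")])
```

Real RDF stores answer such patterns through a query language such as SPARQL; the list comprehension here only mirrors the basic pattern-matching idea.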
RDF Site Summary (RSS) RSS (Really Simple Syndication) is a family of web feed formats used to publish frequently updated works (such as blog entries, news headlines, audio, and video) in a standardized format.
Really Simple Syndication (RSS)
<?xml version="1.0" encoding="UTF-8" ?>
<rss version="2.0">
  <channel>
    <title>RSS Title</title>
    <description>This is an example of an RSS feed</description>
    <link>http://www.someexamplerssdomain.com/main.html</link>
    <lastBuildDate>Mon, 06 Sep 2010 00:01:00 +0000</lastBuildDate>
    <pubDate>Sun, 06 Sep 2009 16:45:00 +0000</pubDate>
    <ttl>1800</ttl>
    <item>
      <title>Example entry</title>
      <description>Here is some text containing an interesting description.</description>
      <link>http://www.wikipedia.org/</link>
      <guid>unique string per item</guid>
      <pubDate>Sun, 06 Sep 2009 16:45:00 +0000</pubDate>
    </item>
  </channel>
</rss>
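Because an RSS feed is plain XML, a feed reader can be sketched with a standard XML parser. The example below is one possible approach using Python's standard library; the shortened feed is modeled on the slide above, and the function name `read_items` is an assumption made for this illustration.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Shortened feed modeled on the slide; only the fields used below are kept.
FEED = """<rss version="2.0">
  <channel>
    <title>RSS Title</title>
    <item>
      <title>Example entry</title>
      <link>http://www.wikipedia.org/</link>
      <pubDate>Sun, 06 Sep 2009 16:45:00 +0000</pubDate>
    </item>
  </channel>
</rss>"""


def read_items(feed_xml):
    """Return (title, link, publication year) for every <item> in the feed."""
    channel = ET.fromstring(feed_xml).find("channel")
    items = []
    for item in channel.findall("item"):
        # RSS 2.0 dates use the e-mail (RFC 822 style) date format.
        published = parsedate_to_datetime(item.findtext("pubDate").strip())
        items.append((item.findtext("title"), item.findtext("link"), published.year))
    return items


print(read_items(FEED))
```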
URI and Unicode • Unicode is a computing industry standard for the consistent encoding, representation and handling of text expressed in most of the world's writing systems. • URI (Uniform Resource Identifier) • URL (Uniform Resource Locator) • http://www.google.com • mailto:email@example.com • URN (Uniform Resource Name) • URN of the "Spider-Man" movie: urn:isan:0000-0000-9E59-0000-O-0000-0000-2 • URN of the "Science of Computer Programming" journal: urn:issn:0167-6423
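All of the identifiers on the slide above share the same outer shape, a scheme followed by a colon and a scheme-specific part. A small sketch with Python's standard `urllib.parse` module shows how they split; the helper name `describe` is made up for this example.

```python
from urllib.parse import urlsplit

identifiers = [
    "http://www.google.com",      # URL with a network location
    "mailto:email@example.com",   # scheme plus an address
    "urn:issn:0167-6423",         # URN: a name, not a location
]


def describe(uri):
    """Return (scheme, network location, path) of a URI."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc, parts.path


for uri in identifiers:
    print(describe(uri))
```

Only the `http` example carries a network location; for `mailto` and `urn` everything after the scheme ends up in the path component.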
XML (1) XML (eXtensible Markup Language) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. It uses tags for markup: <tag_name>data</tag_name>. Some of the XML-based languages: • Extensible Hypertext Markup Language (XHTML), • Really Simple Syndication (RSS), • Mathematical Markup Language (MathML), • GraphML, • Scalable Vector Graphics (SVG).
XML (2) Example:
<CATALOG>
  <CD>
    <TITLE></TITLE>
    <ARTIST></ARTIST>
    <COUNTRY></COUNTRY>
    <COMPANY></COMPANY>
    <YEAR></YEAR>
  </CD>
</CATALOG>
…
<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
</breakfast_menu>
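To show what "machine-readable" means in practice, the breakfast menu from the slide above can be read with a few lines of code. This is a minimal sketch using Python's standard `xml.etree.ElementTree` module; the function name `read_menu` and the conversion of price and calories to numbers are choices made for this illustration.

```python
import xml.etree.ElementTree as ET

MENU = """<breakfast_menu>
  <food>
    <name>Belgian Waffles</name>
    <price>$5.95</price>
    <description>two of our famous Belgian Waffles with plenty of real maple syrup</description>
    <calories>650</calories>
  </food>
</breakfast_menu>"""


def read_menu(menu_xml):
    """Return (name, price, calories) for every <food> element."""
    root = ET.fromstring(menu_xml)
    dishes = []
    for food in root.findall("food"):
        name = food.findtext("name")
        price = float(food.findtext("price").lstrip("$"))  # "$5.95" -> 5.95
        calories = int(food.findtext("calories"))
        dishes.append((name, price, calories))
    return dishes


print(read_menu(MENU))
```

Note that XML itself treats `$5.95` and `650` as plain text; the program has to know which elements hold numbers, which is one motivation for the schema languages covered later in the deck.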
Simple Object Access Protocol (SOAP) • SOAP Version 1.2 (SOAP) is a lightweight protocol intended for exchanging structured information in a decentralized, distributed environment. It uses XML technologies to define an extensible messaging framework providing a message construct that can be exchanged over a variety of underlying protocols. • SOAP 1.2 became a W3C Recommendation in 2003; its Second Edition was published in 2007.
SOAP example
POST /InStock HTTP/1.1
Host: www.example.org
Content-Type: application/soap+xml; charset=utf-8
Content-Length: 299
SOAPAction: "http://www.w3.org/2003/05/soap-envelope"

<?xml version="1.0"?>
<soap:Envelope xmlns:soap="http://www.w3.org/2003/05/soap-envelope">
  <soap:Header>
  </soap:Header>
  <soap:Body>
    <m:GetStockPrice xmlns:m="http://www.example.org/stock">
      <m:StockName>IBM</m:StockName>
    </m:GetStockPrice>
  </soap:Body>
</soap:Envelope>
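The envelope in the example above is ordinary namespaced XML, so it can be produced and read with a generic XML library. The sketch below builds the same message body and reads the stock name back; it does not send anything over HTTP, and the function names `build_request` and `read_stock_name` are assumptions made for this illustration rather than part of any SOAP toolkit.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://www.w3.org/2003/05/soap-envelope"
STOCK_NS = "http://www.example.org/stock"  # example namespace from the slide


def build_request(stock_name):
    """Build a SOAP 1.2 envelope asking for the price of one stock."""
    # Keep the prefixes used on the slide instead of generated ones.
    ET.register_namespace("soap", SOAP_NS)
    ET.register_namespace("m", STOCK_NS)
    envelope = ET.Element(f"{{{SOAP_NS}}}Envelope")
    ET.SubElement(envelope, f"{{{SOAP_NS}}}Header")
    body = ET.SubElement(envelope, f"{{{SOAP_NS}}}Body")
    request = ET.SubElement(body, f"{{{STOCK_NS}}}GetStockPrice")
    ET.SubElement(request, f"{{{STOCK_NS}}}StockName").text = stock_name
    return ET.tostring(envelope, encoding="unicode")


def read_stock_name(message):
    """Extract the requested stock name from an envelope."""
    root = ET.fromstring(message)
    prefixes = {"soap": SOAP_NS, "m": STOCK_NS}
    return root.findtext("soap:Body/m:GetStockPrice/m:StockName", namespaces=prefixes)


message = build_request("IBM")
print(message)
print(read_stock_name(message))
```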
Web Services Description Language (WSDL) • Web Services Description Language is an XML-based interface description language that is used for describing the functionality offered by a web service. • A WSDL description of a web service (also referred to as a WSDL file) provides a machine-readable description of how the service can be called, what parameters it expects, and what data structures it returns. • WSDL 2.0 became a W3C Recommendation in June 2007.
Simple Semantic Web Architecture and Protocol (2) The SSWAP architecture is based on the following five basic concepts: • Provider – corresponds to the organizations that own and publish resources; • Resource – arbitrary resources (for example, web pages, ontologies, or datasets), but they are primarily used to describe web services; • Graph – concept that describes transformations performed by the service; • Subject – input data that is given to the service; • Object – service execution result.
Document Type Definition (DTD) A Document Type Definition (DTD) is a set of markup declarations that define a document type for an SGML-family markup language (SGML, XML, HTML). DTDs are part of the XML 1.0 specification. Example:
DTD
<?xml version="1.0"?>
<!DOCTYPE bookstore [
  <!ELEMENT bookstore (name,topic+)>
  <!ELEMENT topic (name,book*)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT book (title,author)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
  <!ELEMENT isbn (#PCDATA)>
  <!ATTLIST book isbn CDATA "0">
]>
XML
<bookstore>
  <name>Mike's Store</name>
  <topic>
    <name>XML</name>
    <book isbn="111-111-111">
      <title>XML in Nutshell</title>
      <author>John Smith</author>
    </book>
  </topic>
</bookstore>
DTD elements • External DTD declaration: <!DOCTYPE doc_elem SYSTEM/PUBLIC dtd_addr> • <!DOCTYPE chapter SYSTEM "../dtds/chapter.dtd"> • <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML//EN" "../dtds/chapter.dtd"> • Element type declaration: <!ELEMENT name content_model> • Any content: <!ELEMENT description ANY> • Child elements: <!ELEMENT name (first_name, last_name)> • Parsed character data: <!ELEMENT first_name (#PCDATA)> • Empty (has no content): <!ELEMENT pays_on_time EMPTY>
DTD quantifiers
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE document [
  <!ELEMENT document (product | customer)+>
  <!ELEMENT product (company, info*)+>
  <!ELEMENT company (#PCDATA)>
  <!ELEMENT info (#PCDATA)>
  <!ELEMENT customer (first_name, last_name)>
  <!ELEMENT first_name (#PCDATA)>
  <!ELEMENT last_name (#PCDATA)>
]>
<document>
  <product>
    <company>Microsoft</company>
  </product>
</document>
Use of quantifiers • a+ (one or more) • a* (zero or more) • a? (zero or one) • a, b (a followed by b) • a | b (either a or b)
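DTD content models behave like regular expressions over child element names, which suggests a simple way to experiment with the quantifiers above. The sketch below is a deliberate simplification and not a DTD validator: the content models are translated to regular expressions by hand, attributes and text are ignored, and the names `CONTENT_MODELS`, `children_valid` and `document_valid` are invented for this illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hand-translated content models from the DTD on the slide. Every child tag is
# written as "tag," so that adjacent names cannot run together.
CONTENT_MODELS = {
    "document": r"(product,|customer,)+",   # (product | customer)+
    "product": r"(company,(info,)*)+",      # (company, info*)+
    "customer": r"first_name,last_name,",   # (first_name, last_name)
}


def children_valid(element):
    """Check one element's direct children against its content model."""
    model = CONTENT_MODELS.get(element.tag)
    if model is None:
        # Treated as (#PCDATA): no child elements allowed.
        return len(element) == 0
    sequence = "".join(child.tag + "," for child in element)
    return re.fullmatch(model, sequence) is not None


def document_valid(xml_text):
    """Check every element in the document."""
    root = ET.fromstring(xml_text)
    return all(children_valid(element) for element in root.iter())


good = "<document><product><company>Microsoft</company></product></document>"
bad = "<document><product><info>no company first</info></product></document>"
print(document_valid(good))
print(document_valid(bad))
```

The second document fails because `(company, info*)+` requires each group to start with a `company` element.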
DTD attributes • Attribute declaration template: <!ATTLIST element_name attribute_name type default_value ... attribute_name type default_value> • Example: <!ATTLIST customer type (good | bad) "good" language CDATA #FIXED "EN"> ... <customer type="good">
XML Schema XML Schema 1.0 was approved as a W3C Recommendation in 2001 and was the first separate schema language for XML to receive this status. A schema is an abstract collection of metadata that includes the following components: element and attribute declarations, and complex and simple type definitions. Schema definition example:
<?xml version="1.0" encoding="utf-8"?>
<xs:schema elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="ElementName"> ... </xs:element>
</xs:schema>
Reference to an XML Schema:
<?xml version="1.0" encoding="utf-8"?>
<ElementName xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="SimpleAddress.xsd"> ... </ElementName>
XML Schema example
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="employee" type="fullpersoninfo"/>
  <xsd:complexType name="personinfo">
    <xsd:sequence>
      <xsd:element name="firstname" type="xsd:string"/>
      <xsd:element name="lastname" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="fullpersoninfo">
    <xsd:complexContent>
      <xsd:extension base="personinfo">
        <xsd:sequence>
          <xsd:element name="address" type="xsd:string"/>
          <xsd:element name="city" type="xsd:string"/>
          <xsd:element name="country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>
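Since a schema is itself an XML document, the type extension in the example above can be inspected with an ordinary XML parser. The following sketch lists the child elements that an `employee` ends up with once `fullpersoninfo` inherits from `personinfo`. It assumes a schema without a target namespace and handles only the constructs used on the slide; the function name `element_names` is made up for this illustration, and no validation is performed.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/2001/XMLSchema}"

SCHEMA = """<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="employee" type="fullpersoninfo"/>
  <xsd:complexType name="personinfo">
    <xsd:sequence>
      <xsd:element name="firstname" type="xsd:string"/>
      <xsd:element name="lastname" type="xsd:string"/>
    </xsd:sequence>
  </xsd:complexType>
  <xsd:complexType name="fullpersoninfo">
    <xsd:complexContent>
      <xsd:extension base="personinfo">
        <xsd:sequence>
          <xsd:element name="address" type="xsd:string"/>
          <xsd:element name="city" type="xsd:string"/>
          <xsd:element name="country" type="xsd:string"/>
        </xsd:sequence>
      </xsd:extension>
    </xsd:complexContent>
  </xsd:complexType>
</xsd:schema>"""


def element_names(schema_root, type_name):
    """List the child element names of a complex type, inherited ones first."""
    for ctype in schema_root.findall(f"{XSD}complexType"):
        if ctype.get("name") != type_name:
            continue
        names = []
        scope = ctype
        extension = ctype.find(f"{XSD}complexContent/{XSD}extension")
        if extension is not None:
            # An extension starts with everything the base type declares.
            names += element_names(schema_root, extension.get("base"))
            scope = extension
        names += [e.get("name") for e in scope.findall(f"{XSD}sequence/{XSD}element")]
        return names
    raise KeyError(type_name)


root = ET.fromstring(SCHEMA)
print(element_names(root, "personinfo"))
print(element_names(root, "fullpersoninfo"))
```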
XML Schema elements • Simple elements don’t contain child elements or attributes: <xsd:element name="note" type="xsd:string"/> corresponds to <note>some text</note>. • Complex elements can contain child elements and/or attributes: <xsd:complexType name="fullpersoninfo"> <xsd:complexContent> <xsd:extension base="personinfo"> <xsd:sequence> <xsd:element name="address" type="xsd:string"/> …
Element types • Derived types • normalizedString, • token, • language, • NMTOKEN, NMTOKENS, • Name, NCName, • ID, IDREF, IDREFS, • ENTITY, ENTITIES, • integer, • nonPositiveInteger, • negativeInteger, • long, int, short, byte, • unsignedLong, • unsignedInt, • unsignedShort, • unsignedByte. • Primitive types • string, • boolean, • decimal, • float, • double, • duration, • dateTime, time, date, • gYearMonth, gYear, • gMonthDay, gDay, • gMonth, • hexBinary, base64Binary, • anyURI, • QName, • NOTATION.
Element occurrence indicators The minOccurs indicator specifies the minimum number of times an element can occur. If minOccurs is equal to 0, the element is optional.
<xs:element name="elem_name" type="xs:string" minOccurs="0"/>
The maxOccurs indicator specifies the maximum number of times an element can occur. If maxOccurs equals "unbounded", the element is allowed to appear an unlimited number of times.
<xs:element name="elem_name" type="xs:string" maxOccurs="10" minOccurs="0"/>
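The rule behind the two indicators above amounts to a range check on how often an element appears. A minimal sketch follows, with a made-up helper name `occurs_ok`; the defaults of 1 mirror the XML Schema defaults that apply when minOccurs or maxOccurs is left out.

```python
def occurs_ok(count, min_occurs=1, max_occurs=1):
    """Check an occurrence count against minOccurs/maxOccurs.

    Both indicators default to 1, as in XML Schema.
    max_occurs may be an integer or the string "unbounded".
    """
    if count < min_occurs:
        return False
    return max_occurs == "unbounded" or count <= int(max_occurs)


# minOccurs="0": the element is optional.
print(occurs_ok(0, min_occurs=0))
# maxOccurs="10" minOccurs="0": anything from 0 to 10 occurrences.
print(occurs_ok(10, min_occurs=0, max_occurs=10))
print(occurs_ok(11, min_occurs=0, max_occurs=10))
# maxOccurs="unbounded": no upper limit.
print(occurs_ok(1000, max_occurs="unbounded"))
```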
XML Schema attributes • Attribute declaration template: <xsd:attribute name="att_name" type="xsd:att_type" use="optional | required | prohibited"/> • Example: <xsd:attribute name="phone" type="xsd:string"/> ... <customer phone="111-1111111">
DTD vs XML Schema (1) DTD pros • Has been around longer than XML Schema; • Is part of the XML 1.0 specification. DTD cons • Uses a syntax different from XML; • Doesn’t support namespaces; • Offers a limited number of data types; • A DTD describes the whole document.
DTD vs XML Schema (2) XML Schema pros • Uses XML syntax (schemas themselves are XML documents); • Supports more data types and lets you define your own types; • A schema can define portions of the document. XML Schema cons • Pretty much none these days.