Linked Data Tutorial

February 16, 2012. Linked Data Tutorial. Tomáš Knap, Jindřich Mynarz, Martin Nečaský, Jakub Stárka. (Partially based on slides of Chris Bizer [9]).


  1. February 16, 2012 Linked Data Tutorial Tomáš Knap, Jindřich Mynarz, Martin Nečaský, Jakub Stárka (Partially based on slides of Chris Bizer [9])

  2. Motivation

  3. Motivational Scenario Basic data Public contracts Employees Departments Budget Expenses WWW page of the institution Business Register ÚFIS Buyer's Profile ISVZUS gov.cz • Data Consumer: Show me the suppliers of the public contracts of the Ministry of Finance (MF) in the Liberec region. Show me the data on Google Maps on an iPhone. For every public contract, I also want the aggregate of all payments made by MF, a link to the MF budget, and the responsible person. • Where can I get the data about public contracts, responsible persons, expenses, and the budget of MF? • How should I aggregate and link the data? • How can I view the data on a map?

  4. Current Common Practice Basic data Public contracts Employees Departments Budget Expenses WWW page of the institution Business Register ÚFIS Buyer's Profile ISVZUS gov.cz 3 – Expenses ? 2 – MF public contracts + employees 1 – MF public contracts ? Consumer did not discover ? Information integration is very time consuming, boring, and ineffective!

  5. Linked Data - Basics

  6. Linked Data • Set of best practices for publishing structured data on the Web in accordance with the general architecture of the Web • using Semantic Web technologies and standards • Semantic Web is the goal, Linked Data provides the means to reach the goal

  7. Linked Data Principles • Use URIs as names for things. • Use HTTP URIs so that people can look up those names. • When someone looks up a URI, provide useful RDF information. • Include RDF statements that link to other URIs so that they can discover related things. [Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006]

  8. Architecture of the Classic Web • Single global information space • Small set of simple standards: • HTTP URI • globally unique ID • retrieval mechanism • HTML as document format • Hyperlinks to connect everything • Applications work on top of the complete information space

  9. Web 2.0 APIs and Mashups • No single global dataspace • Shortcomings: • APIs have proprietary interfaces • No hyperlinks between data items within different APIs • Mashups are based on a fixed set of data sources • Web APIs slice the Web into Walled Gardens!

  10. Linked Data • Extend the Web with a single global dataspace • By using RDF to publish structured data on the Web • By setting links between data items within different data sources • Physically distributed, but behaves like a single dataspace

  11. RDF Data Model • Flexible graph-based data model [2] • HTTP URIs take the role of global primary keys. • pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri • dbpedia:Berlin = http://dbpedia.org/resource/Berlin
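
The role of HTTP URIs as global keys can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the `PREFIXES` table and the `expand` helper are hypothetical names, the `pd:` and `dbpedia:` namespaces are read off the two expansions on the slide, and the `foaf:based_near` statement is an assumed example triple.

```python
# Prefix table. The pd: and dbpedia: namespaces follow from the expansions
# shown on the slide; foaf: is the standard FOAF namespace.
PREFIXES = {
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'dbpedia:Berlin' into a full HTTP URI."""
    prefix, local_name = curie.split(":", 1)
    return PREFIXES[prefix] + local_name


# An RDF statement is a (subject, predicate, object) triple of such keys.
# The statement itself is an assumption made for illustration.
triple = tuple(expand(term) for term in ("pd:cygri", "foaf:based_near", "dbpedia:Berlin"))

for term in triple:
    print(term)
```

Because every term is a full URI, two datasets that both mention `http://dbpedia.org/resource/Berlin` refer to the same node of the graph without any further coordination.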

  12. Resolving URIs over the Web • The HTTP protocol brings together identification and retrieval
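
How identification and retrieval meet in one HTTP request can be sketched as follows. This is a hedged illustration rather than anything shown on the slides: the helper name `rdf_request` and the exact `Accept` value are assumptions, and the request is only built, not sent, so the sketch runs without network access.

```python
from urllib.request import Request


def rdf_request(uri: str) -> Request:
    """Build (but do not send) a GET request that asks for RDF instead of HTML."""
    # The fragment (#cygri) names a thing inside the document; it is never
    # sent to the server, so only the document URI is requested.
    document_uri = uri.split("#", 1)[0]
    # Content negotiation: prefer Turtle, fall back to RDF/XML.
    accept = "text/turtle, application/rdf+xml;q=0.9"
    return Request(document_uri, headers={"Accept": accept})


req = rdf_request("http://richard.cyganiak.de/foaf.rdf#cygri")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing `req` to `urllib.request.urlopen` would perform the actual lookup; whether a given server honours the `Accept` header depends on how that server is configured.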

  13. Following Links deeper into the Web

  14. Pubby – Linked Data Browser http://dbpedia.org/page/Český_Krumlov

  15. Properties of the Web of Linked Data • Global, distributed dataspace built on a simple set of standards • RDF, URIs, HTTP • Entities are connected by links • creating a global data graph that spans data sources • enables the discovery of new data sources • Data coexistence • Everyone can publish data to the Web of Linked Data • Everyone can express their personal view on things

  16. Linked Data Deployment on the Web: Is it real?

  17. W3C Linking Open Data Project • Grassroots community effort to • Publish existing open license datasets as Linked Data on the Web • Interlink things between different data sources

  18. Linked Data Cloud 2007

  19. Linked Data Cloud 2009

  20. Linked Data Cloud 2011 http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.pdf http://thedatahub.org/

  21. More Statistics http://stats.lod2.eu/stats

  22. Uptake in Governmental Domain • The EU is publishing Linked Data • EuroStat • http://estatwrap.ontologycentral.com/ • National efforts • The Government is releasing public data • http://data.gov.uk/ • Lots of initiatives in Great Britain • Budget in Germany • http://bund.offenerhaushalt.de/ • Open Data in Catalonia • http://opendata.gencat.cat/en/dades-obertes.html

  23. Data.gov.uk http://data.gov.uk/organogram/cabinet-office

  24. Linked Data Applications • Linked Data Browsers

  25. Search Engines - Sig.ma http://sig.ma

  26. Mashups – Public Contracts On the Map http://gd.projekty.ms.mff.cuni.cz:2021/new/map.html

  27. Mashups – Crime, Transport, Education http://apps.seme4.com/see-uk/

  28. Other Applications • Browsers: • Disco Hyperdata Browser • http://www4.wiwiss.fu-berlin.de/rdf_browser/ • OpenLink RDF Browser • http://ode.openlinksw.com/ • Search Engines • Falcons • http://ws.nju.edu.cn/falcons/ • Watson • http://watson.kmi.open.ac.uk/WatsonWUI/ • Mashups

  29. Linked Data Applications - Summary Linked Data Mashups Search Engines Linked Data Browsers

  30. Publishing Linked Data

  31. Publishing Tasks – Bizer 38 • 1. Make data available as RDF via HTTP • Requires ways to serialize RDF data model • 2. Set RDF links pointing at other data sources • 3. Make your data self-descriptive

  32. RDF/XML • W3C Recommendation, 2004 [2]

  33. Turtle Syntax @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> . @prefix dataModel: <http://www.w3.org/2000/10/swap/pim/contact#> . @prefix myContact: <http://www.w3.org/People/EM/contact#> . myContact:me rdf:type dataModel:Person ; dataModel:fullName "Eric Miller" ; dataModel:mailbox <mailto:em@w3.org> ; dataModel:personalTitle "Dr." . • W3C Team Submission, 2011 [4]
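
The `;` shorthand in the Turtle example groups several predicate-object pairs under one subject. A minimal Python sketch of that grouping, assuming a hypothetical helper `to_turtle` and reusing the prefixed names from the slide as plain strings:

```python
# Predicate-object pairs for one subject, taken from the slide's example.
pairs = [
    ("rdf:type", "dataModel:Person"),
    ("dataModel:fullName", '"Eric Miller"'),
    ("dataModel:mailbox", "<mailto:em@w3.org>"),
    ("dataModel:personalTitle", '"Dr."'),
]

# Written out in full, the shorthand stands for four separate triples.
triples = [("myContact:me", predicate, obj) for predicate, obj in pairs]


def to_turtle(subject, predicate_objects):
    """Render one subject with ';' between pairs and a single final '.'."""
    body = " ;\n    ".join(f"{p} {o}" for p, o in predicate_objects)
    return f"{subject} {body} ."


turtle = to_turtle("myContact:me", pairs)
print(turtle)
```

Only the last pair is followed by `.`; ending an earlier pair with `.` would close the statement and leave the following pairs without a subject.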

  34. RDFa • A way to directly add RDF to XHTML pages • Provides new attributes to handle additional markup • W3C Recommendation, 2008 [5] • HTML is not extendable • most RDFa parsers will recognize RDFa attributes in any version of HTML

  35. RDFa • Provides new attributes to handle additional markup and reuses existing ones • about, resource, … • href, src, … • Can be used with any supported element; preferred: • span, div (in the body) • a (linking element) • meta, link (in the header)

  36. RDFa Example • XHTML page http://example.com/alice/posts/42 • Original XHTML code All content on this site is licensed under <a href="http://cc.org/licenses/by/3.0/"> a Creative Commons License </a>. • XHTML + RDFa All content on this site is licensed under <a rel="cc:license" href="http://cc.org/licenses/by/3.0/"> a Creative Commons License </a>. • RDF triples distilled from XHTML+RDFa <http://example.com/alice/posts/42> cc:license <http://cc.org/licenses/by/3.0/> .
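
How a triple is distilled from the `rel` and `href` attributes can be sketched with Python's standard HTML parser. This is a deliberately narrow illustration, not a conforming RDFa processor: the class name `RelLinkExtractor` is hypothetical, and the sketch ignores `about`, `property`, prefix resolution, and every other RDFa rule.

```python
from html.parser import HTMLParser


class RelLinkExtractor(HTMLParser):
    """Collect (page, rel, href) triples from elements carrying rel and href."""

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri  # the page itself is the subject
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "rel" in attributes and "href" in attributes:
            self.triples.append(
                (self.page_uri, attributes["rel"], attributes["href"])
            )


page = (
    "All content on this site is licensed under "
    '<a rel="cc:license" href="http://cc.org/licenses/by/3.0/">'
    "a Creative Commons License</a>."
)

extractor = RelLinkExtractor("http://example.com/alice/posts/42")
extractor.feed(page)
for subject, predicate, obj in extractor.triples:
    print(f"<{subject}> {predicate} <{obj}> .")
```

The plain XHTML link from the slide, which has no `rel` attribute, yields no triple under this sketch, which is the difference the example is meant to show.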

  37. RDF store + Linked Data Interface • Virtuoso + pubby

  38. D2R server • A way to publish data held in relational databases as Linked Data • Requests from the Web are rewritten into SQL queries via the mapping • On-the-fly translation • Eliminates the need to replicate the data into a dedicated RDF triple store
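
The idea of rewriting a Web request into an SQL query through a mapping can be sketched with an in-memory SQLite database. This is an assumption-laden illustration and not D2R Server's actual mapping language or API: the `employee` table, the `example.org` namespaces, the `MAPPING` dictionary, and the `describe` function are all hypothetical.

```python
import sqlite3

BASE = "http://example.org/resource/"   # hypothetical resource namespace
VOCAB = "http://example.org/vocab#"     # hypothetical vocabulary namespace

# Mapping: table -> (key column, {column: RDF property}).
MAPPING = {
    "employee": ("id", {"name": VOCAB + "name", "department": VOCAB + "department"}),
}


def describe(conn, table, key):
    """Translate a request for one resource into SQL and return its triples."""
    key_column, columns = MAPPING[table]
    # Table and column names come from the trusted mapping; the key is bound
    # as a query parameter.
    sql = f"SELECT {', '.join(columns)} FROM {table} WHERE {key_column} = ?"
    row = conn.execute(sql, (key,)).fetchone()
    if row is None:
        return []
    subject = f"{BASE}{table}/{key}"
    return [(subject, prop, value) for prop, value in zip(columns.values(), row)]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, department TEXT)")
conn.execute("INSERT INTO employee VALUES (1, 'Alice', 'Finance')")

triples = describe(conn, "employee", 1)
for triple in triples:
    print(triple)
```

The triples are produced at request time from the current rows, so no separate copy of the data has to be kept in a triple store.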

  39. Publishing Tasks 1. Make data available as RDF via HTTP 2. Set RDF links pointing at other data sources 3. Make your data self-descriptive

  40. 2. Set RDF links <http://dbpedia.org/resource/Berlin> owl:sameAs <http://sws.geonames.org/2950159> . • There are tools to help you generate links • Silk [6]
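
Link generation can be sketched as matching resources from two datasets and emitting `owl:sameAs` statements. The sketch below is a naive stand-in and does not reproduce how Silk works: the function names are hypothetical, the labels attached to the two URIs are assumed, and matching on a normalized label alone would produce false links on real data.

```python
def normalize(label: str) -> str:
    """Lower-case a label and collapse whitespace before comparing."""
    return " ".join(label.lower().split())


def same_as_links(source, target):
    """Emit owl:sameAs statements for resources whose labels match."""
    index = {normalize(label): uri for uri, label in target.items()}
    links = []
    for uri, label in source.items():
        match = index.get(normalize(label))
        if match is not None:
            links.append(f"<{uri}> owl:sameAs <{match}> .")
    return links


# URI -> label; the labels are assumptions made for this illustration.
source = {"http://dbpedia.org/resource/Berlin": "Berlin"}
target = {
    "http://sws.geonames.org/2950159": " berlin ",
    "http://example.org/places/paris": "Paris",
}

links = same_as_links(source, target)
print("\n".join(links))
```

Real link discovery usually combines several properties (for example label, coordinates, and type) with similarity thresholds instead of a single exact match.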

  41. Publishing Tasks 1. Make data available as RDF via HTTP 2. Set RDF links pointing at other data sources 3. Make your data self-descriptive

  42. 3. Make your data self-descriptive • Increase the usefulness of your data and ease data integration • Aspects of self-descriptiveness • 1. Reuse terms from common vocabularies • 2. Enable clients to retrieve the schema • 3. Publish schema mappings for proprietary terms • 4. Metadata • Provide provenance metadata • Provide licensing metadata • Provide data-set-level metadata using voiD

  43. About Vocabularies • We have to be able to define the meaning of the subject, properties • Vocabularies, e.g. Public contracts ontology

  44. Public Contracts Ontology http://purl.org/procurement/public-contracts#

  45. RDFS • RDFS = RDF Schema • W3C recommendation • http://www.w3.org/TR/rdf-schema/ • Vocabulary for RDF • Definition of classes • is:Student rdf:type rdfs:Class • Definition of properties • is:name rdf:type rdf:Property • Domains and ranges of properties • is:name rdfs:domain is:Student • is:name rdfs:range xsd:string
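
What a domain declaration lets a client conclude can be sketched as a single inference rule: if a property has an `rdfs:domain`, every subject using that property is an instance of the domain class. The schema triples below come from the slide; the instance `is:student42`, its name, and the function `infer_domain_types` are hypothetical, and the sketch covers only the domain rule, not RDFS entailment as a whole.

```python
schema = [
    ("is:Student", "rdf:type", "rdfs:Class"),
    ("is:name", "rdf:type", "rdf:Property"),
    ("is:name", "rdfs:domain", "is:Student"),
    ("is:name", "rdfs:range", "xsd:string"),
]

# Hypothetical instance data that uses the property but states no type.
data = [("is:student42", "is:name", '"Jan Novák"')]


def infer_domain_types(triples):
    """Apply the rdfs:domain rule once and return only the new type triples."""
    domains = {s: o for s, p, o in triples if p == "rdfs:domain"}
    inferred = {
        (s, "rdf:type", domains[p]) for s, p, o in triples if p in domains
    }
    return inferred - set(triples)


new_triples = infer_domain_types(schema + data)
print(new_triples)
```

Nothing in the data says that `is:student42` is a student; the type follows from the schema alone, which is why a retrievable schema makes data more self-descriptive.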

  46. OWL • OWL = Web Ontology Language • W3C recommendation • http://www.w3.org/TR/owl2-overview/ • Ontologies • More complex constructs • Class or property equivalences • Cardinality restrictions • …

  47. 3. Make your data self-descriptive • Increase the usefulness of your data and ease data integration • Aspects of self-descriptiveness • 1. Reuse terms from common vocabularies • 2. Enable clients to retrieve the schema • 3. Publish schema mappings for proprietary terms • 4. Metadata • Provide provenance metadata • Provide licensing metadata • Provide data-set-level metadata using voiD

  48. 3.1 Reuse Terms from Common Vocabularies • Common vocabularies • Friend-of-a-Friend for describing people and their social network • SIOC for describing forums and blogs • SKOS for representing topic taxonomies • Organization Ontology for describing the structure of organizations • GoodRelations provides terms for describing products and business entities • Music Ontology for describing artists, albums, and performances • Review Vocabulary provides terms for representing reviews • Common sources of identifiers (URIs) for real-world objects • LinkedGeoData and Geonames: locations • GeneID and UniProt: life science identifiers • DBpedia: wide range of things

  49. 3.2 Enable Clients to Retrieve the Schema • Clients can resolve the URIs that identify vocabulary terms in order to get their RDFS or OWL definitions. • If we discover the following statement in the data: <http://opendata.cz/data/p6/contract/ocz_art_5161> <http://purl.org/procurement/public-contracts#awardDate> "2011-11-11"^^<http://www.w3.org/2001/XMLSchema#date> ; • we resolve the URI of the predicate and get its RDFS or OWL definition

  50. 3.3 Publish Schema Mappings pc:Tender a owl:Class ; rdfs:subClassOf gr:Offering . pc:AwardCriterion a owl:Class ; owl:equivalentClass loted:AwardCriteria . • Simple mappings: • rdfs:subClassOf, rdfs:subPropertyOf • owl:equivalentClass, owl:equivalentProperty • Complex mappings – R2R [7]
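
How a consumer might apply such simple mappings can be sketched as propagating instance types along `rdfs:subClassOf` and `owl:equivalentClass`. The two mappings are the ones on the slide; the instances `ex:tender1` and `ex:criterion1` and the function `materialize_types` are hypothetical, and property mappings and complex R2R mappings are outside this sketch.

```python
mappings = [
    ("pc:Tender", "rdfs:subClassOf", "gr:Offering"),
    ("pc:AwardCriterion", "owl:equivalentClass", "loted:AwardCriteria"),
]

# Hypothetical instance data: (instance, class) pairs.
types = {
    ("ex:tender1", "pc:Tender"),
    ("ex:criterion1", "loted:AwardCriteria"),
}


def materialize_types(types, mappings):
    """Add every type implied by subclass and class-equivalence mappings."""
    implied = {}
    for left, relation, right in mappings:
        if relation in ("rdfs:subClassOf", "owl:equivalentClass"):
            implied.setdefault(left, set()).add(right)
        if relation == "owl:equivalentClass":
            # Equivalence holds in both directions.
            implied.setdefault(right, set()).add(left)

    result = set(types)
    frontier = list(types)
    while frontier:
        instance, cls = frontier.pop()
        for target in implied.get(cls, ()):
            if (instance, target) not in result:
                result.add((instance, target))
                frontier.append((instance, target))
    return result


all_types = materialize_types(types, mappings)
for pair in sorted(all_types):
    print(pair)
```

After the mappings are applied, an application that only understands GoodRelations can still find `ex:tender1`, because it is now also typed as `gr:Offering`.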
