
An Introduction to Linked Data, Its Applications and Challenges

An Introduction to Linked Data, Its Applications and Challenges. Samad Paydar samad.paydar@stu-mail.um.ac.ir WTLab Research Group Ferdowsi University of Mashhad. 2nd October 2009. Outline. The Web of Documents vs. the Web of Data Linked Data Linking Open Data Project


Presentation Transcript


  1. An Introduction to Linked Data, Its Applications and Challenges Samad Paydar samad.paydar@stu-mail.um.ac.ir WTLab Research Group Ferdowsi University of Mashhad 2nd October 2009

  2. Outline • The Web of Documents vs. the Web of Data • Linked Data • Linking Open Data Project • Linked Data Technology Stack • Linking Data Applications • Outlook • Similar Developments • Challenges

  3. The Web of Documents vs. the Web of Data

  4. The Web of Documents • Traditional Web, Hypertext Web • Analogy • A global filesystem • Designed for • Human consumption • Primary objects • Documents • Links • Untyped • Between documents (or parts of documents) • Degree of structure in objects • Fairly low • Semantics of content and links • Implicit

  5. The Web of Documents

  6. The Web of Documents : Challenges • The Web has radically altered the way people share knowledge • By lowering the barrier to publishing and accessing documents • But the same has not happened for applications and data • Traditionally, data on the Web is published in formats such as HTML tables, CSV or XML files, … • Much of the structure and semantics of the data is sacrificed.

  7. The Web of Documents : Challenges • Data integration • “Show me all the publications from Semantic Web-related conferences in 2007” • Querying across data sources • “Which WWW2008 papers have been written by people from companies of less than 100 people?” • Note that all the data required to answer the above questions might be available on the Web.

  8. The Web of Data • Analogy • A global data space • Designed for • Machines first, humans later • Primary objects • Things (descriptions of things) • Links • Typed • Between things • Degree of structure in objects • High • Semantics of content and links • Explicit

  9. The Web of Linked Data

  10. Linked Data

  11. Linked Data • Is about using the Web to create typed links between data from different sources • Refers to data published on the Web in such a way that • It is machine-readable • Its meaning is explicitly defined • It is linked to other datasets • It can be linked to from external datasets

  12. Linked Data and Web of Data • The term Linked Data refers to a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the last three years, leading to the creation of a global data space containing billions of assertions - the Web of Data.

  13. Properties of the Web of Data • It is generic • Can contain any type of data • Data about anything • Anyone can publish data • No constraints on choice of vocabularies • Entities are connected by RDF links

  14. A Taste of Linked Data

  15. A Taste of Linked Data

  16. A Taste of Linked Data

  17. Linked Data

  18. Linking Open Data Project

  19. LOD Project • Linking Open Data Project • A community project • Founded in January 2007 • Supported by W3C Semantic Web Education and Outreach Group • Goal: to bootstrap the Web of Data by identifying existing datasets that are available under open licenses, converting them to RDF (according to Linked Data principles), interlinking them with other datasets, and publishing them on the Web

  20. LOD Cloud : May 2007

  21. LOD Cloud • The image shows only datasets that are published based on Linked Data principles and are interlinked with at least one other dataset in the cloud • Each circle represents a dataset • Size of the circle corresponds to the number of triples • Arrows represent the links between datasets • Thickness of arrows indicates the number of links between datasets • Some datasets act as hubs • E.g. DBpedia, Geonames, …

  22. DBpedia • Extracts structured information from Wikipedia and makes it available on the Web under an open license

  23. Geonames • Contains over eight million geographical names • 6.5 million unique features • 2.2 million populated places and 1.8 million alternate names • features categorized into one out of nine feature classes • further subcategorized into one out of 645 feature codes

  24. Geonames

  25. LOD Cloud : July 2007

  26. LOD Cloud : August 2007

  27. LOD Cloud : November 2007

  28. LOD Cloud : February 2008

  29. LOD Cloud : March 2009

  30. LOD Cloud : July 2009

  31. LOD Cloud • Content of the cloud is diverse • Data about geographic locations, people, companies, books, scientific publications, films, music, TV programs, genes, proteins, … • Some statistics • The Web of Data currently consists of 4.7 billion RDF triples, interlinked by around 142 million RDF links (May 2009)

  32. A Programmer’s Point of View • Semantic technologies like Linked Data decouple applications from data through the use of a simple, abstract data model • Any application that understands the model can consume any data source published based on the model
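
The decoupling described on this slide can be illustrated with a minimal sketch. It assumes data arrives as plain (subject, predicate, object) tuples; the `describe` function, the `ex:` identifiers and both sample datasets are hypothetical and exist only for illustration. The point is that one function, written against the triple model alone, works unchanged on unrelated sources.

```python
from collections import defaultdict

# A triple is (subject, predicate, object). All identifiers below are
# made-up placeholders, not real vocabulary terms.
Triple = tuple[str, str, str]


def describe(triples: list[Triple], subject: str) -> dict[str, list[str]]:
    """Collect every predicate/object pair known about one subject."""
    description = defaultdict(list)
    for s, p, o in triples:
        if s == subject:
            description[p].append(o)
    return dict(description)


# Two unrelated, hypothetical data sources that share only the triple model.
films = [
    ("ex:film/77", "ex:title", "Pulp Fiction"),
    ("ex:film/77", "ex:year", "1994"),
]
places = [
    ("ex:place/berlin", "ex:name", "Berlin"),
]

# The same function consumes both sources without source-specific code.
film_description = describe(films, "ex:film/77")
place_description = describe(places, "ex:place/berlin")
```

A real application would more likely rely on an RDF library than on raw tuples; the sketch only shows why a shared model removes the need for per-source parsing logic.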

  33. Don’t Miss books • To really feel it, I recommend to study

  34. Linked Data Technology Stack

  35. Linked Data Technology Stack

  36. Linked Data Principles • Berners-Lee, 2006 • Use URIs as names for things • Use HTTP URIs so that people can look up those names • When someone looks up a URI, provide useful information • Include links to other URIs, so that they can discover more things

  37. URI: Uniform Resource Identifier • “URI provides a simple and extensible means for identifying a resource” RFC 3986 • URL: for documents and other entities that can be located on the Web • URI is a more generic means to identify any entity existing in the world

  38. HTTP • Provides URI dereferencing: a simple mechanism for retrieving • Resources that can be serialized as a stream of bytes • E.g. a picture of a dog • Descriptions of entities that cannot themselves be sent across the network • E.g. the dog itself
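
As a rough sketch of URI dereferencing, the snippet below prepares an HTTP GET request that asks for an RDF description of a resource through the `Accept` header. The helper names `build_rdf_request` and `dereference` are hypothetical, and the choice of `application/rdf+xml` is an assumption: whether a given server honours it depends on that server. The URI is the DBpedia one used elsewhere in these slides. Only the request is built here; sending it requires network access.

```python
from urllib.request import Request, urlopen


def build_rdf_request(uri: str) -> Request:
    """Prepare a GET request that asks for an RDF description of `uri`."""
    # Assumption: the server supports content negotiation for RDF/XML.
    return Request(uri, headers={"Accept": "application/rdf+xml"})


def dereference(uri: str, timeout: float = 10.0) -> bytes:
    """Send the request and return the raw response body (needs network)."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.read()


# Build, but do not send, a request for a URI that names a thing (a film).
request = build_rdf_request(
    "http://dbpedia.org/resource/Pulp_Fiction_%28film%29"
)
```

Calling `dereference(request.full_url)` would then return whatever description the server chooses to provide, which matches the "provide useful information" principle rather than returning the thing itself.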

  39. RDF • HTML provides a means to structure and link documents • RDF provides a generic, graph-based data model to structure and link data that describes things • A triple [subject, predicate, object] • Subject: a URI • Object: a URI or a string literal • Predicate: a URI
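
To make the triple model concrete, here is a small sketch that stores a graph as a list of (subject, predicate, object) tuples and queries it with a wildcard pattern. The `match` function and the `ex:` identifiers are hypothetical; this is not the API of any RDF library, just an illustration of how a graph reduces to triples.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]


def match(
    triples: list[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> Iterator[Triple]:
    """Yield triples matching the pattern; None acts as a wildcard."""
    for triple in triples:
        if (
            (s is None or triple[0] == s)
            and (p is None or triple[1] == p)
            and (o is None or triple[2] == o)
        ):
            yield triple


# Hypothetical graph: subjects and predicates are URI-like names,
# objects are either URI-like names or string literals.
graph = [
    ("ex:paper/1", "ex:author", "ex:person/alice"),
    ("ex:paper/1", "ex:title", "Linked Data in Practice"),
    ("ex:paper/2", "ex:author", "ex:person/alice"),
    ("ex:person/alice", "ex:name", "Alice"),
]

# All papers written by ex:person/alice.
papers_by_alice = [s for s, _, _ in match(graph, p="ex:author", o="ex:person/alice")]
```

Because the object of one triple can be the subject of another, following `ex:person/alice` from a paper to her name is a second `match` call, which is what makes the model graph-based.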

  40. RDF Link • RDF links take the form of RDF triples, where the subject of the triple is a URI reference in the namespace of one dataset, while the object of the triple is a URI reference in the other • S: http://data.linkedmdb.org/resource/film/77 • P: http://www.w3.org/2002/07/owl#sameAs • O: http://dbpedia.org/resource/Pulp_Fiction_%28film%29 • RDF links allow client applications to navigate between data sources to discover additional data
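
The definition above can be checked mechanically: a triple is an RDF link when its subject and object fall in the namespaces of two different datasets. The sketch below uses the triple from this slide; the `is_rdf_link` helper, the namespace prefixes chosen for each dataset and the second, title triple (with its `example.org` predicate) are assumptions added for contrast.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Assumed namespace prefixes for the two datasets.
LINKEDMDB_NS = "http://data.linkedmdb.org/"
DBPEDIA_NS = "http://dbpedia.org/"


def is_rdf_link(triple: tuple[str, str, str], source_ns: str, target_ns: str) -> bool:
    """True if the subject is in source_ns and the object is in target_ns."""
    subject, _, obj = triple
    return subject.startswith(source_ns) and obj.startswith(target_ns)


triples = [
    # The RDF link shown on the slide.
    (
        "http://data.linkedmdb.org/resource/film/77",
        OWL_SAME_AS,
        "http://dbpedia.org/resource/Pulp_Fiction_%28film%29",
    ),
    # Hypothetical ordinary triple: the object is a literal, not a URI
    # in another dataset, so it is not an RDF link.
    (
        "http://data.linkedmdb.org/resource/film/77",
        "http://example.org/vocab/title",
        "Pulp Fiction",
    ),
]

links = [t for t in triples if is_rdf_link(t, LINKEDMDB_NS, DBPEDIA_NS)]

# A client can follow each link's object URI to reach the other dataset.
targets = [obj for _, _, obj in links]
```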

  41. RDFS / OWL • Provide a basis for creating vocabularies that can be used to describe entities in the world and how they are related

  42. Linked Data • Linked Data employs • HTTP URIs to identify resources • HTTP Protocol to retrieve resources • RDF data model to represent resources • Therefore, it is built on the general architecture of the Web

  43. Linking Data Applications

  44. Current Applications • Numerous efforts are underway to research and build applications that exploit this Web of data. At present, these efforts can be broadly classified into three categories: • Linked Data browsers • Linked Data search engines and indexes • Domain-specific Linked Data applications

  45. Linked Data Applications • Linked Data Browsers • Browse things, not just documents • Browse and navigate between data • E.g. Disco, Tabulator, Marbles

  46. Data about Berlin on DBpedia is linked to data about Berlin on Geonames

  47. Linked Data Search Engines and Indexes • Crawl Linked Data from the Web and provide query capabilities over aggregated data • Human-oriented • E.g. Falcon, SWSE • Application-oriented • E.g. Swoogle, Watson

  48. Domain-Specific Applications • Revyu • DBpedia Mobile • Talis Aspire • BBC Programmes and BBC Music
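
A toy sketch of the aggregate-then-query idea follows. It assumes the crawl has already produced triples per source; the `aggregate` and `search` functions, the dataset names and the `ex:` identifiers are all hypothetical, and real engines such as those named above work very differently at scale.

```python
from collections import defaultdict

Triple = tuple[str, str, str]


def aggregate(sources: dict[str, list[Triple]]) -> dict[Triple, set[str]]:
    """Merge triples from several sources, recording where each came from."""
    provenance: dict[Triple, set[str]] = defaultdict(set)
    for source_name, triples in sources.items():
        for triple in triples:
            provenance[triple].add(source_name)
    return provenance


def search(provenance: dict[Triple, set[str]], keyword: str) -> list[str]:
    """Return subjects with any object value containing the keyword."""
    keyword = keyword.lower()
    return sorted({s for (s, _, o) in provenance if keyword in o.lower()})


# Hypothetical crawl results from two independent datasets.
sources = {
    "dataset-a": [("ex:a/berlin", "ex:label", "Berlin")],
    "dataset-b": [
        ("ex:b/berlin-place", "ex:name", "Berlin"),
        ("ex:b/berlin-place", "ex:country", "Germany"),
    ],
}

index = aggregate(sources)
hits = search(index, "berlin")
```

Keeping the source of each triple is a deliberate choice in this sketch: once data from many publishers is merged, a consumer usually needs to know who asserted what.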

  49. Revyu

  50. DBpedia Mobile • Uses DBpedia, Revyu, and Flickr
