A Gentle Introduction to Linked Data • Ralph LeVan, Sr. Research Scientist, OCLC Research
What is Linked Data? • Simple answer: • “The term Linked Data is used to describe a method of exposing, sharing, and connecting data via dereferenceable URIs on the Web.” – Wikipedia (emphasis added)
What is Linked Data? • Opaque answer: • Linked Data is a protocol component of the Semantic Web, using URIs, Real World Objects, Content Negotiation and RDF documents to share information on the web. • My goal is to try to make that statement a little less opaque.
Why Linked Data? (Aren’t we sharing our data already?) • The simplest answer is that Linked Data is a way to share data that we’ve always been eager to share, but in a non-library-centered exchange format. • Up until now, our choices for sharing our data have been MARC (not very popular with the Web community) or Dublin Core (not very semantically rich). Linked Data, using RDF as a description syntax, provides a framework for sharing semantically rich data in a Web-friendly way. (Remember, this is the Semantic Web we’re talking about.)
The Semantic Web Dream • “The semantic web is a vision of information that is understandable by computers, so computers can perform more of the tedious work involved in finding, combining, and acting upon information on the web.” – Wikipedia • “I have a dream for the Web [in which computers] become capable of analyzing all the data on the Web – the content, links, and transactions between people and computers.” • – Tim Berners-Lee
The Semantic Web Challenge • Finding a web-acceptable forum to develop the markup syntax that lets us expose the semantic richness of MARC-21 and the structural richness of AACR2 • The forum is the W3C Library Linked Data Incubator Group • The syntax is the Resource Description Framework (RDF)
RDF • RDF is a general framework for describing things • A statement in RDF has three parts: subject, predicate, object • The subject and predicate must be URIs • The object can be a value or a URI • <contact:Person rdf:about="http://www.w3.org/People/EM/contact#me"> <contact:fullName>Eric Miller</contact:fullName> </contact:Person>
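The subject, predicate, object structure can be made concrete with a small Python sketch that spells out the statements carried by the RDF/XML fragment above as plain tuples. The `contact:` namespace URI is an assumption (it is the one the W3C RDF Primer binds to that prefix), and `Triple` is a hypothetical helper type, not part of any RDF library.

```python
from typing import NamedTuple

# Namespace URIs. RDF is the standard RDF namespace; CONTACT is an assumption
# (the namespace the W3C RDF Primer binds to the "contact:" prefix).
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CONTACT = "http://www.w3.org/2000/10/swap/pim/contact#"


class Triple(NamedTuple):
    subject: str      # a URI
    predicate: str    # a URI
    obj: str          # a URI or a literal value
    obj_is_uri: bool  # True when obj is a URI, False when it is a literal


me = "http://www.w3.org/People/EM/contact#me"

triples = [
    # The typed element <contact:Person rdf:about="..."> states the subject's type.
    Triple(me, RDF + "type", CONTACT + "Person", True),
    # The nested <contact:fullName> element has a literal value as its object.
    Triple(me, CONTACT + "fullName", "Eric Miller", False),
]

# Print each statement in an N-Triples-like form.
for t in triples:
    obj = f"<{t.obj}>" if t.obj_is_uri else f'"{t.obj}"'
    print(f"<{t.subject}> <{t.predicate}> {obj} .")
```

Both statements share the same subject URI; only the second has a plain value rather than a URI as its object.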
Why URIs? • In the old days, we’d assign a control number, like an LCCN, to something important and associate our data with that control number. • URIs are the modern equivalent of a control number, with several bonuses • The domain name tells you who created the URI • Since the URI is usually an HTTP URL, there’s often something to click through to
URLs: Pages or Concepts? • When is a URL a reference to a concept and when is it just a pointer to a page? • <rdf:Description rdf:about="http://example.org/ralph"> • <rrl:likes rdf:resource="http://www.oclc.org/identities"/> • </rdf:Description> • Does that statement mean I like that web page or I like that service? • How do we tell the difference?
Real World Objects • A rule has been agreed upon by the W3C on how to tell concept URLs from page URLs • Concept URLs, called “Real World Objects,” will return an HTTP status of 303 (See Other) with the URL of a page in the Location header • That 303 redirect is the clue to Semantic Web applications that they had a pointer to a concept, not just a page
Content Negotiation • How does a Semantic Web application find the RDF we’re making? It asks for RDF automatically, whenever it follows a link. • HTTP applications (e.g., browsers) can tell servers what form of a page they would prefer to get back: HTML, PDF, JPEG… They do that by sending an Accept header along with their request for the page. • When servers decide what to send back based on that Accept header, we call that Content Negotiation.
Content Negotiation • A browser going to http://viaf.org/viaf/12345679/ would send the Accept header “*/*” (meaning that it would take anything) and typically get HTML back. • A Semantic Web application going to that exact same URL would send the Accept header “application/rdf+xml” and would get RDF back.
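The server's side of that exchange can be sketched in a few lines of Python. This is a simplified illustration, not how VIAF is implemented: the `AVAILABLE` list and the `negotiate` function are hypothetical, and the sketch ignores the q-value weights that real Accept headers may carry, simply honoring the order in which the client lists media types.

```python
from typing import List, Optional

# Representations this hypothetical server can produce, in its own order of preference.
AVAILABLE = ["text/html", "application/rdf+xml", "text/xml"]


def negotiate(accept_header: str, available: List[str] = AVAILABLE) -> Optional[str]:
    """Return the media type to serve, or None if nothing acceptable is available.

    Simplification: q-values are ignored; the client's listed order is used.
    """
    for item in accept_header.split(","):
        # Drop parameters such as ";q=0.8" and normalize case.
        media_range = item.split(";")[0].strip().lower()
        if not media_range:
            continue
        if media_range == "*/*":
            # The client will take anything: serve the server's default.
            return available[0]
        if media_range.endswith("/*"):
            prefix = media_range[:-1]  # e.g. "text/*" becomes "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
        elif media_range in available:
            return media_range
    return None


print(negotiate("*/*"))                  # what a browser-like client gets
print(negotiate("application/rdf+xml"))  # what a Semantic Web client gets
```

The same URL yields a different representation depending only on the Accept header, which is the whole point of content negotiation.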
What is Linked Data? • Linked Data is a component of the Semantic Web, using URIs, Real World Objects, Content Negotiation and RDF documents to share information on the web. • Hopefully, that’s a little less opaque now. Maybe some examples would help.
Real World Objects in VIAF • http://viaf.org/viaf/12345679 is a Real World Object. How do we know this? Because clicking on that URL gets us a 303 redirect to http://viaf.org/viaf/12345679/ • curl --include http://viaf.org/viaf/12345679 • HTTP/1.1 303 See Other • Server: Apache-Coyote/1.1 • Location: /viaf/12345679/
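The way an application might act on that response can be sketched in Python. The `classify` function below is a hypothetical helper that applies the 303 rule to a status code and Location header; it uses the values from the curl output above as input rather than making a live request, since what the server returns today may differ.

```python
from typing import Optional, Tuple
from urllib.parse import urljoin


def classify(request_url: str, status: int,
             location: Optional[str]) -> Tuple[str, Optional[str]]:
    """Apply the 303 rule to a response.

    Returns ("concept", page_url) for a 303 with a Location header,
    ("page", None) for a 200, and ("other", None) for anything else.
    """
    if status == 303 and location:
        # Location may be relative (as in the VIAF response), so resolve it
        # against the URL that was requested.
        return "concept", urljoin(request_url, location)
    if status == 200:
        return "page", None
    return "other", None


kind, page_url = classify("http://viaf.org/viaf/12345679", 303, "/viaf/12345679/")
print(kind, page_url)
```

With the headers shown above, the URL without the trailing slash is classified as a concept, and the page describing it is the URL with the trailing slash.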
Content Negotiation in VIAF • curl --include --header "Accept: */*" http://viaf.org/viaf/12345679/ • curl --include --header "Accept: text/xml" http://viaf.org/viaf/12345679/ • curl --include --header "Accept: application/rdf+xml" http://viaf.org/viaf/12345679/
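The same three requests could be prepared from a program instead of the command line. The sketch below uses Python's standard `urllib.request` and only builds the requests; `build_request` is a hypothetical helper, and `*/*` is used as the wildcard media range that means "anything". Nothing is sent over the network here.

```python
from urllib.request import Request

URL = "http://viaf.org/viaf/12345679/"


def build_request(media_type: str) -> Request:
    """Build a GET request for URL that asks for the given media type."""
    return Request(URL, headers={"Accept": media_type})


reqs = [build_request(m) for m in ("*/*", "text/xml", "application/rdf+xml")]

for req in reqs:
    print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))

# To actually send one, pass it to urllib.request.urlopen(req). That needs
# network access, and the server's current responses may not match the slides.
```

Only the Accept header differs between the three requests; the URL is identical each time.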
RDF • <rdf:RDF> • <viaf:NameAuthorityCluster rdf:about="viaf/12345679"> … • <viaf:EstablishedHeading • rdf:about="viaf/12345679/#Mozziconacci,+Jean-François"> • <viaf:NameAuthority • rdf:about="viaf/12345679/#LC%7Cn++93057547">… • <foaf:Person rdf:about="viaf/12345679/#foaf:Person">… • <skos:Concept rdf:about="viaf/12345679/#skos:Concept">… • </rdf:RDF>
What is Linked Data? • Linked Data is a component of the Semantic Web, using URIs, Real World Objects, Content Negotiation and RDF documents to share information on the web.
Links & Questions • Semantic Web: http://en.wikipedia.org/wiki/Semantic_Web • RDF: http://www.jenitennison.com/blog/node/135 • Linked Data: http://www.w3.org/TR/cooluris
Thank You • Ralph LeVan, Senior Research Scientist, OCLC Research, Dublin, OH • levan@oclc.org • OCLC Research TAI CHI Webinar Series: http://www.oclc.org/research/events/taichi.htm