Towards a semantic web Philip Hider
This talk • The Semantic Web vision • Scenarios • Standards • Semantic Web & RDA
Web 1.0, 2.0, 3.0 • Internet to WWW (Web 1.0) • Web 1.0 allows people to navigate the Internet easily, through hyperlinks • Web 2.0 allows people to collaborate more on the Web • Web 3.0 allows computers to find and use the data contained in Web documents • Web 3.0 = the Semantic Web vision
The Semantic Web vision • It will allow computers to make sense of the content of Web documents, so that they can find and use this data independently • Basis of SW already developed, with standards such as XML and RDF • Like Web 1.0, it represents a bottom-up, distributed approach
How would it work? • Computers would be able to identify and ‘understand’ particular data in a Web document according to the metadata associated with that data • Metadata could be inside or outside the document • Computers (agents) would then be able to relate that data to other data in other documents (or the same document) according to specified schemas, ontologies and rules • They could then independently integrate data and process information according to tasks set by their human users
A Semantic Web scenario • User asks ‘Trip Agent’ to purchase the ‘best’ deal for a trip to New Zealand with date range x, family members y, time of day z, etc. etc. • ‘Trip agent’ searches the Web for flights and accommodation, and is able to look up databases and specify conditions according to what it ‘knows’ about user’s preferences
Semantic Web scenario • Agent is able to ‘understand’ the deals available on different websites by integrating data from different sources, e.g. looking up geographic information systems (how far from the sea, shops, etc.), weather forecasts, family members’ calendars, etc., and ultimately suggesting the optimal combination of flight, hotel, tours, etc.
Another scenario • User asks if the latest Stephen King book is available in a nearby library, can’t remember what it’s called • ‘Library Agent’ searches the Web for nearby libraries with books by ‘Stephen King’, finds a few different Stephen Kings, confirms with user which Stephen King, then identifies the latest novel via the official Stephen King website, but chooses the second-nearest library (by car) which holds it because of availability/format/library opening hours, etc.
What do SW agents need? • Information about the data, i.e. metadata, in a machine-readable format • Including a shared understanding of the structure of that metadata and its relationship to other knowledge structures (ontologies) • Some clever programming
Standards for the Semantic Web • Resource Description Framework (RDF) • Uniform Resource Identifiers (URIs) • XML • Unicode • Schemas (such as XML schemas) • Ontologies written in e.g. OWL • Rules written in RIF, etc. • SPARQL
Resource Description Framework • W3C standard • A model used to structure resource descriptions • Can be used to structure data about any kind of resource • could be a book, or a car, or a flight ticket, or an experiment, etc. • Based on ‘triples’, i.e. Resource – Property – Value (Subject – Predicate – Object)
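The triple model above can be sketched in a few lines of Python. This is only an illustration of the Subject – Predicate – Object idea, not an RDF implementation; the `urn:example:` identifiers and the author name are hypothetical, while the Dublin Core and FOAF property URIs are the published ones.

```python
from typing import NamedTuple

DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"

class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    value: str      # the value (a literal or another resource's identifier)

# Hypothetical data: the urn:example: identifiers are made up for illustration.
TRIPLES = [
    Triple("urn:example:book-a", DC + "title", "Book A"),
    Triple("urn:example:book-a", DC + "creator", "urn:example:person-1"),
    Triple("urn:example:person-1", FOAF + "name", "A. N. Author"),
]

def values_of(subject, predicate):
    """Return every value recorded for the given subject and predicate."""
    return [t.value for t in TRIPLES
            if t.subject == subject and t.predicate == predicate]

# Follow a link from one description to another: book -> creator -> name.
creator = values_of("urn:example:book-a", DC + "creator")[0]
print(values_of(creator, FOAF + "name"))
```

Because a value can itself be the identifier of another resource, descriptions held in different places can be joined simply by following identifiers.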
Uniform Resource Identifiers (URIs) • For example, URLs • And ISBNs • People don’t have them yet • OCLC working on ‘work identifiers’ • Properties and some values are referenced as part of particular schemas, ontologies, etc.
eXtensible Markup Language (XML) • Another W3C standard • More flexible than HTML, XHTML • Can be used to encode any data • Data can be in the same Web document or another document • Can be used to express RDF, i.e. RDF/XML • RDF/XML basis for metadata structures such as schemas and ontologies
Schemas • Standardised structures of resource description that define property elements in a taxonomic way • Mostly based on a particular domain, e.g. pertaining to bibliographic data, or geospatial data, or flight booking data, or used car data, etc.
Schemas • Two main groups of schemas – XML schemas and RDFS (RDF schemas) • Superseding Document Type Definitions (DTDs) • Specific well-known schemas include • Dublin Core • ONIX • RSS
Some metadata encoded in RDF/XML
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>
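To show what "machine-readable" means in practice, here is a minimal sketch that reads the RDF/XML above with Python's standard `xml.etree.ElementTree` module and lists the simple literal-valued triples. It is not a full RDF parser: it deliberately skips nested descriptions such as `foaf:primaryTopic`, and the function name `literal_triples` is an illustrative choice. A dedicated RDF library would be the usual tool for real work.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

RDF_XML = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <rdf:Description rdf:about="http://en.wikipedia.org/wiki/Tony_Benn">
    <dc:title>Tony Benn</dc:title>
    <dc:publisher>Wikipedia</dc:publisher>
    <foaf:primaryTopic>
      <foaf:Person>
        <foaf:name>Tony Benn</foaf:name>
      </foaf:Person>
    </foaf:primaryTopic>
  </rdf:Description>
</rdf:RDF>
"""

def literal_triples(xml_text):
    """Return (subject, property URI, literal value) for simple properties only."""
    root = ET.fromstring(xml_text)
    triples = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        subject = desc.get(f"{{{RDF_NS}}}about")
        for prop in desc:
            # Skip properties whose value is a nested description.
            if len(prop) == 0 and prop.text and prop.text.strip():
                # ElementTree tags look like "{namespace}local"; join them into a URI.
                property_uri = prop.tag[1:].replace("}", "")
                triples.append((subject, property_uri, prop.text.strip()))
    return triples

for triple in literal_triples(RDF_XML):
    print(triple)
```

The namespace prefixes (`dc:`, `foaf:`) expand to full URIs, which is what lets an agent recognise that `dc:title` here means the same property as `dc:title` in any other document.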
Ontologies • More sophisticated than schemas, formalising more complex relationships between elements • Also usually domain-specific • Use extra languages, such as OWL, on top of RDF/XML etc. • Ontologies give more scope for agents to be ‘clever’ • Dublin Core can be expressed as an ontology or a schema
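One simple way an ontology lets an agent be ‘clever’ is by supporting inference over class hierarchies. The sketch below is a toy illustration in plain Python, not OWL: the class names and the `urn:example:` identifier are hypothetical, and real ontology languages support far richer relationships than this single subclass chain.

```python
# Hypothetical class hierarchy: each class points to its parent class.
SUBCLASS_OF = {"Novel": "Book", "Book": "Resource"}

# Hypothetical assertion: the only fact stated directly about this resource.
INSTANCE_OF = {"urn:example:book-a": "Novel"}

def classes_of(resource):
    """Return the asserted class followed by every class inferred from it."""
    result = []
    cls = INSTANCE_OF.get(resource)
    while cls is not None:
        result.append(cls)
        cls = SUBCLASS_OF.get(cls)  # walk up the hierarchy
    return result

print(classes_of("urn:example:book-a"))
```

The description only says the resource is a Novel, yet an agent searching for Books can still find it, because the relationship between the two classes is stated once in the ontology rather than in every record.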
What about MARC? • MARC files are rather flat and do not readily define relationships between elements • But can be expressed as an XML schema, i.e. MARCXML • MODS is a lite version of MARCXML • Mappings between MARCXML and other schemas (e.g. DC)
Mappings • Lots of them! • Between different schemas, ontologies, languages, etc. • AKA crosswalks • By UKOLN, LC, OCLC, etc. etc. • The more standards and adaptations, the more crosswalks
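At its simplest, a crosswalk is a lookup table from the elements of one schema to those of another. The following sketch assumes a tiny, illustrative subset of a MARC-to-Dublin-Core mapping at the field-tag level; it is not a complete or official crosswalk, and the sample record is made up.

```python
# Illustrative subset only: MARC field tag -> Dublin Core element.
MARC_TO_DC = {
    "245": "title",
    "100": "creator",
    "650": "subject",
}

def crosswalk(marc_record):
    """Convert a {tag: [values]} record into a {DC element: [values]} record."""
    dc_record = {}
    for tag, values in marc_record.items():
        element = MARC_TO_DC.get(tag)
        if element is None:
            continue  # no mapping defined, so this data is dropped
        dc_record.setdefault(element, []).extend(values)
    return dc_record

# Hypothetical record for illustration.
record = {
    "245": ["Apples of the world"],
    "100": ["Smith, J."],
    "650": ["Apples"],
    "300": ["200 p."],
}
print(crosswalk(record))
```

The unmapped field shows why crosswalks are usually lossy: a richer schema rarely fits entirely into a simpler one, and each additional standard adds more such tables to maintain.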
Value sets • Resource – Property – Value • Schemas and ontologies may point to particular value sets, e.g. Book A has a Subject called DCterms:LCSH Apples, where Apples is a value in the set of values known as LCSH • In other words, they may point to controlled vocabularies
SKOS • Simple Knowledge Organization System • SW standard for expressing controlled vocabularies such as subject thesauri • http://www.w3.org/2004/02/skos • Might promote use of LCSH, etc.
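A controlled vocabulary expressed along SKOS lines gives each concept a preferred label, optional alternative labels and links to broader concepts. The sketch below models that in plain Python; the field names echo `skos:prefLabel`, `skos:altLabel` and `skos:broader`, but the concept identifiers and the two-term vocabulary are hypothetical and are not real LCSH data.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Concept:
    pref_label: str                                       # cf. skos:prefLabel
    alt_labels: List[str] = field(default_factory=list)   # cf. skos:altLabel
    broader: Optional[str] = None                         # cf. skos:broader (a concept id)

# Hypothetical two-concept vocabulary.
VOCAB = {
    "c1": Concept("Fruit"),
    "c2": Concept("Apples", alt_labels=["Apple"], broader="c1"),
}

def lookup(term):
    """Return the id of the concept whose preferred or alternative label matches."""
    wanted = term.lower()
    for concept_id, concept in VOCAB.items():
        labels = [concept.pref_label] + concept.alt_labels
        if wanted in (label.lower() for label in labels):
            return concept_id
    return None

def broader_labels(concept_id):
    """Return the preferred labels of all broader concepts, nearest first."""
    labels = []
    parent = VOCAB[concept_id].broader
    while parent is not None:
        labels.append(VOCAB[parent].pref_label)
        parent = VOCAB[parent].broader
    return labels

concept_id = lookup("apple")
print(VOCAB[concept_id].pref_label, broader_labels(concept_id))
```

An agent could use the alternative labels to map a user's wording onto the preferred heading, and the broader links to widen a search that returns too little.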
Semantic Web & cataloguing • More sophisticated use of library catalogues if they can be understood by Semantic Web agents • Library resources more likely to be used in conjunction with non-library web resources • SW about agents using cataloguing, not replacing cataloguing
Semantic Web & RDA • RDA is therefore aligning itself with DC and RDF • RDA elements mapped to DC, ONIX, etc. • DCMI/RDA Task Group • RDA-DC application profile • http://dublincore.org/dcmirdataskgroup
Prospects for SW • Examples of Semantic Web developments: http://www.w3.org/2001/sw/sweo/public/UseCases • A lot of standards now in place, technology not so much of an issue • With RDA, bibliographic domain ripe for SW take-up
Thank you. phider@csu.edu.au