
Bluffers Guide to The Semantic Web

Bluffers Guide to The Semantic Web. Data wants to be free. Frank van Harmelen, CS Department, Vrije Universiteit Amsterdam. Semantics as your saviour? Outline: the general idea (a Web of Data); what must be done to realise this; how far away is this; next steps, do's and don'ts.


Presentation Transcript


  1. Bluffers Guide to The Semantic Web: Data wants to be free. Frank van Harmelen, CS Department, Vrije Universiteit Amsterdam

  2. Semantics as your saviour?

  3. Outline • The general idea: a Web of Data • What must be done to realise this • How far away is this • Next steps, do's, don'ts

  4. The Scientist's Problem (in fact Everybody's Problem): too much unintegrated data • from a variety of incompatible sources • no standard naming convention • each with a custom browsing and querying mechanism (no common interface) • and poor interaction with other data sources

  5. What are the Data Sources? • Flat Files • URLs • Proprietary Databases • Public Databases • Spreadsheets • Emails • Maps • …

  6. In which disciplines? • Archeology • Chemistry • Genomics, proteomics, ... (bio/life-sciences) • Communication science • Social history • Linguistics • Bio-diversity • Environmental sciences (climate studies) • Libraries (KB), archives (Sound & Vision) • Geo? • ... Callouts on the slide: a new database each month; one dataset per site; historical data; laymen data; international data (for the first time)

  7. Outline • The general idea: a Web of Data • What must be done to realise this • How far away is this • Next steps, do's, don'ts

  8. The Current Web of text and pictures: linked web pages, written by people, written for people, used only by people. Examples: a web page in English about Frank; another web page about Frank; a page about the Vrije Universiteit; a page about LarKC; a page about Stefano. Many of these pages already come from data that is usable by computers, but we can't link the data. The Future Web of Data: linked data, usable by computers, useful for people.

  9. Which Semantic Web? • Version 1: "Enrichment of the current Web" • recipe: annotate and classify web content • enable better search & browse, ...

  10. Which Semantic Web? • Version 2: "Semantic Web as Web of Data" (TBL) • recipe: expose databases on the web, use RDF, integrate • meta-data from expressing DB schema semantics in machine-interpretable ways • enable integration and unexpected re-use

  11. Outline • The general idea: a Web of Data • What must be done to realise this • How far away is this • Next steps, do's, don'ts

  12. Machine-accessible meaning (what it's like to be a machine). META-DATA diagram: the tags <treatment>, <name>, <symptoms>, <drug>, <disease> and <drugadministration>, connected by relations labelled IS-A and alleviates

  13. What is meta-data? (diagram labels: name, symptoms, disease, drug, administration) • it's just data • it's data describing other data • it's meant for machine consumption

  14. Required are: • a standard syntax (so meta-data can be recognised as such) • one or more shared vocabularies (so data producers and data consumers all speak the same language) • lots of resources with meta-data attached • mechanisms for attribution and trust

  15. 1. A standard syntax. Semantic Web data model: RDF = things & relations between things
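
The "things & relations between things" model can be sketched as a list of (subject, predicate, object) triples with a simple pattern match. This is a minimal illustration, not an RDF library: the `Triple` type, the `match` helper and the `ex:` identifiers are all made up for the example.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: a thing, a relation, and another thing (or a value)."""
    subject: str
    predicate: str
    obj: str


def match(triples, s=None, p=None, o=None):
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


# Hypothetical identifiers, loosely following the drug/symptom slide.
graph = [
    Triple("ex:aspirin", "ex:isA", "ex:Drug"),
    Triple("ex:aspirin", "ex:alleviates", "ex:headache"),
    Triple("ex:headache", "ex:isA", "ex:Symptom"),
]

about_aspirin = match(graph, s="ex:aspirin")
drugs = [t.subject for t in match(graph, p="ex:isA", o="ex:Drug")]
print(about_aspirin)
print(drugs)
```

Leaving a position as `None` acts as a wildcard, which is roughly how a variable behaves in a triple pattern query.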

  16. RDF Triples in Life Sciences

  17. RDF Triples in Geo: <rdf:RDF> <geo:Point> <geo:lat>55.701</geo:lat> <geo:long>12.552</geo:long> </geo:Point> </rdf:RDF> Remember: RDF = simple model for data. As a graph: an anonymous geo:Point node with two outgoing edges, geo:lat to 55.701 and geo:long to 12.552
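
To make the XML-to-triples reading concrete, here is a rough sketch that turns the snippet on this slide into triples using only the Python standard library. It is a toy reader for this one shape of document, not a general RDF/XML parser. The `xmlns` declarations are an assumption (the slide omits them; the ones used are the usual W3C RDF and WGS84 geo namespaces), and the `_:b0` blank-node label is invented for the example.

```python
import xml.etree.ElementTree as ET

# The slide's snippet, with namespace declarations added (assumed, not on the slide).
RDF_XML = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
  <geo:Point>
    <geo:lat>55.701</geo:lat>
    <geo:long>12.552</geo:long>
  </geo:Point>
</rdf:RDF>"""

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def full_name(tag):
    """ElementTree spells qualified names as '{namespace}local'; join the parts."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def to_triples(xml_text):
    """Read typed nodes with literal-valued properties into triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for index, node in enumerate(root):
        subject = f"_:b{index}"  # anonymous node, like the unnamed point on the slide
        triples.append((subject, RDF_TYPE, full_name(node.tag)))
        for prop in node:
            triples.append((subject, full_name(prop.tag), prop.text.strip()))
    return triples


triples = to_triples(RDF_XML)
for triple in triples:
    print(triple)
```

The five lines of XML become three statements about one anonymous point: its type, its latitude and its longitude.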

  18. RDF Schema: vocabulary for data types • Classes + subclass hierarchy (rivers are waterways) • Properties + subproperty hierarchy (father-of implies parent-of) • Domain of properties (X capital-of Y ⇒ X has-type city) • Range of properties (X capital-of Y ⇒ Y has-type country). Simple standardised inferences
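
The four kinds of inference on this slide can be sketched as a small forward-chaining loop over triples. This is an illustration under simplifying assumptions, not a complete RDFS reasoner: identifiers are plain prefixed strings, only these four rules are applied, and all `ex:` names are hypothetical.

```python
def rdfs_closure(triples):
    """Apply subclass, subproperty, domain and range rules until nothing new appears."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # x type C, C subClassOf D  =>  x type D
                if p == "rdf:type" and p2 == "rdfs:subClassOf" and o == s2:
                    new.add((s, "rdf:type", o2))
                # x P y, P subPropertyOf Q  =>  x Q y
                if p2 == "rdfs:subPropertyOf" and p == s2:
                    new.add((s, o2, o))
                # x P y, P domain C  =>  x type C
                if p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))
                # x P y, P range C  =>  y type C
                if p2 == "rdfs:range" and p == s2:
                    new.add((o, "rdf:type", o2))
        if new <= facts:
            return facts
        facts |= new


# Hypothetical data mirroring the slide's four examples.
given = [
    ("ex:Rhine", "rdf:type", "ex:River"),
    ("ex:River", "rdfs:subClassOf", "ex:Waterway"),
    ("ex:Jan", "ex:fatherOf", "ex:Piet"),
    ("ex:fatherOf", "rdfs:subPropertyOf", "ex:parentOf"),
    ("ex:Amsterdam", "ex:capitalOf", "ex:Netherlands"),
    ("ex:capitalOf", "rdfs:domain", "ex:City"),
    ("ex:capitalOf", "rdfs:range", "ex:Country"),
]

closure = rdfs_closure(given)
inferred = sorted(closure - set(given))
for triple in inferred:
    print(triple)
```

The nested loop is quadratic in the number of facts, which is fine for a handful of triples; real triple stores index the data instead.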

  19. OWL: richer vocabulary for data types • Things RDF Schema cannot express: Description Logic SHOIN(D); equality, disjunction, negation; min/max number restrictions; inverse, symmetric, transitive properties; and much more… Complex standardised inferences. Example: every country has precisely one capital. Inference: TheHague ≠ A'dam & A'dam = capital ⇒ TheHague ≠ capital. Use: integrity checks after data-merging
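
The "precisely one capital" example can be sketched as an integrity check run after merging two sources. This is a hand-written check for one at-most-one-value property, not an OWL reasoner. The function name, the `ex:` identifiers and the two sources are hypothetical. The explicit set of pairs known to be different matters: without it, two values for such a property would simply be taken as two names for the same thing, since OWL does not assume that different names denote different things.

```python
def functional_conflicts(triples, prop, different):
    """Find subjects with two values for `prop` that are known to be distinct.

    `different` is a set of frozenset pairs, each asserting value1 != value2.
    """
    values = {}
    conflicts = []
    for s, p, o in triples:
        if p != prop:
            continue
        seen = values.setdefault(s, [])
        for other in seen:
            if frozenset((other, o)) in different:
                conflicts.append((s, other, o))
        if o not in seen:
            seen.append(o)
    return conflicts


# Two hypothetical sources that disagree about the capital.
source_a = [("ex:Netherlands", "ex:hasCapital", "ex:Amsterdam")]
source_b = [("ex:Netherlands", "ex:hasCapital", "ex:TheHague")]
different = {frozenset(("ex:Amsterdam", "ex:TheHague"))}

conflicts = functional_conflicts(source_a + source_b, "ex:hasCapital", different)
print(conflicts)
```

Each source is consistent on its own; the contradiction only appears once the data is merged, which is the point of the slide's last line.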

  20. Web of Data: anybody can say anything about anything • All identifiers are URLs (= on the Web) • Allows total decoupling of data, vocabulary and meta-data (different owners & locations). Diagram labels: x, T, <prince>, and the meta-data statement [<x> IsOfType <T>]
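
One way to picture the decoupling: three parties each publish a few triples, and because they all use URLs as identifiers, the sets can simply be unioned. The sketch below assumes made-up `example.org` and `example.net` hosts for the data and the vocabulary; only the `rdf:type`, `rdfs:subClassOf` and `rdfs:label` URLs are the standard W3C ones.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Hypothetical identifiers, each owned by a different party.
X = "http://data.example.org/id/x"
T = "http://vocab.example.net/City"
PLACE = "http://vocab.example.net/Place"

data = {(X, RDFS_LABEL, "Amsterdam")}   # published by the data owner
vocabulary = {(T, RDFS_SUBCLASS, PLACE)}  # published by the vocabulary owner
meta_data = {(X, RDF_TYPE, T)}            # [<x> IsOfType <T>], published by anyone

# Shared URLs are the only glue needed to merge the three sources.
merged = data | vocabulary | meta_data

types_of_x = [o for s, p, o in merged if s == X and p == RDF_TYPE]
superclasses = [
    o for s, p, o in merged
    if s in types_of_x and p == RDFS_SUBCLASS
]
print(types_of_x)
print(superclasses)
```

None of the three publishers needed to know about the others; the link from `x` to `Place` only exists in the merged graph.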

  21. 2. Shared vocabularies (BioMed) • MeSH: Medical Subject Headings, National Library of Medicine; 22,000 descriptions • EMTREE: commercial (Elsevier), drugs and diseases; 45,000 terms, 190,000 synonyms • UMLS: integrates 100 different vocabularies • SNOMED: 200,000 concepts, College of American Pathologists • Gene Ontology: 15,000 terms in molecular biology • NCBI Cancer Ontology: 17,000 classes (about 1M definitions) • Geo?

  22. Outline • The general idea: a Web of Data • What must be done to realise this • How far away is this • Next steps, do's, don'ts

  23. How far away is this? • Stable data formats & standardised inferences • Lots of shared vocabularies (+ ways to convert them) • Lots of data sources (+ ways to convert them) • Lots of tools: convert, construct, edit (data, vocabularies); store, search, query, reason; interlink; visualise; ...

  24. How far away is this? Not very far away! The rapidly growing Linked Open Data cloud already holds many billions of facts & rules: • every book sold by Amazon • (almost) any CD ever recorded • life-science databases • basic facts on every country on the planet • hierarchical dictionaries (UK, FR, NL) • common sense rules & facts (100,000s) • scientific bibliographies • names of artists & art works (10,000s) • geographic names (millions) • encyclopedia. It gets bigger every month

  25. Example use-case: bbc.co.uk/music/artists • Content is BBC + LOD • Use an ontology as basis for the site • Serve data back out as RDF • “The Web is becoming our content management platform”
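
"Serve data back out as RDF" could look roughly like the sketch below, which writes triples in the line-based N-Triples format. It is not how the BBC site is implemented. The artist URL and the `MusicArtist` class are hypothetical `example.org` names, the FOAF `name` property is a real vocabulary term, and treating every object that starts with `http` as a URL is a shortcut made for the example.

```python
def term(value):
    """Render an object as <IRI> or as a quoted, escaped plain literal."""
    if value.startswith("http://") or value.startswith("https://"):
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return f'"{escaped}"'


def to_ntriples(triples):
    """One statement per line: <subject> <predicate> object ."""
    return "".join(f"<{s}> <{p}> {term(o)} .\n" for s, p, o in triples)


ARTIST = "http://example.org/artists/42"  # hypothetical artist page
artist_triples = [
    (ARTIST, "http://xmlns.com/foaf/0.1/name", "Example Band"),
    (
        ARTIST,
        "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
        "http://example.org/ontology/MusicArtist",  # hypothetical class
    ),
]

output = to_ntriples(artist_triples)
print(output, end="")
```

The same triples that drive the human-readable page can be handed to other sites in this form, which is what lets the web act as a shared content platform.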

  26. Outline • The general idea: a Web of Data • What must be done to realise this • How far away is this • Next steps, do's, don'ts

  27. Next steps • learn / get access to some basic technology • hunt for shared vocabularies (try to avoid building them) • wrap legacy data sources (your own, from others) • link wrapped sources • publish linked data on the web • make noise • reconstruct some old results • produce new results • get famous. Can you get famous by sharing data? A little semantics goes a long way: • in-use systems in communication science, KB, Beeld & Geluid, Europeana • papers in oncology, in communication science • dedicated conferences in chemistry, earth-sciences, life-sciences, humanities • funding opportunities in humanities, social sciences, life sciences

  28. Questions & discussion. Contact: Frank.van.Harmelen@cs.vu.nl, http://www.cs.vu.nl/~frankh/popularising.html
