Linked Data: Survey of Adoption
Aidan Hogan
Day 1 Session 1: …
Linked Data… …so, what’s out there? …on the Web… …now… …
The Web of Data!
August 2007, November 2007, February 2008, March 2008, September 2008, March 2009, July 2009, September 2010
Images from: http://richard.cyganiak.de/2007/10/lod/ (Cyganiak, Jentzsch)
Publications • Media • User-generated • Government • Cross-Domain • Geographic • Life sciences
Anatomy of the LOD cloud: cross-domain
• Freebase
  • ~300 million triples
  • General knowledge
  • User contributed
  • Acquired by Google
• OpenCalais
  • ~4.5 million triples
  • Thomson Reuters export
• OpenCyc
  • ~2 million triples
  • Upper ontology concepts
• DBpedia
  • ~1 billion triples
  • Exports from Wikipedia
  • Central hub
• Yago
  • ~19 million triples
  • Smaller/more precise data from Wikipedia
• WordNet
  • ~4.5 million triples
  • Synonyms, etc.
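To get a feel for the relative sizes quoted on this slide, the approximate triple counts can be tallied in a few lines of Python. This is only a sketch: the figures are the rounded estimates from the slide, and the name `cross_domain_triples` is made up for illustration.

```python
# Approximate triple counts as quoted on the slide (rounded estimates).
# The variable names are illustrative and not part of any dataset API.
cross_domain_triples = {
    "Freebase": 300_000_000,
    "OpenCalais": 4_500_000,
    "OpenCyc": 2_000_000,
    "DBpedia": 1_000_000_000,
    "Yago": 19_000_000,
    "WordNet": 4_500_000,
}

# Sum all datasets and find the one with the most triples.
total = sum(cross_domain_triples.values())
largest = max(cross_domain_triples, key=cross_domain_triples.get)
share = cross_domain_triples[largest] / total

print(f"total: ~{total / 1e9:.2f} billion triples")
print(f"largest: {largest} ({share:.0%} of the cross-domain total)")
```

On these rounded figures, DBpedia alone accounts for roughly three quarters of the cross-domain triples.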
Anatomy of the LOD cloud: user-generated
• semanticweb.org
  • ~50 thousand triples
  • SemWeb related topics
  • Semantic MediaWiki!
• Revyu
  • ~20 thousand triples
  • User contributed reviews
• FlickrWrappr
  • ~56 million triples
  • Exports from photo site
• DogFood
  • ~200 thousand triples
  • SemWeb conferences and papers
• RDF ohloh
  • ~700 thousand triples
  • Exports from open-source development site
Anatomy of the LOD cloud: publications
Library Exports • Academic Publications
• DBLP
  • ~28 million triples
  • Computer science publications
• ePrints
  • ~8.4 million triples
  • ePrints exporter
Anatomy of the LOD cloud: life-sciences
• Drug Bank
  • ~800 thousand triples
  • Detailed pharmacology for FDA-approved drugs
• Sider
  • ~200 thousand triples
  • Drug side-effects
• DailyMed
  • ~200 thousand triples
  • Detailed drug info from NLM
• Diseasome
  • ~91 thousand triples
  • Disorders and disease
• LinkedCT
  • ~7 million triples
  • Clinical trials info
• UniProt
  • 100s of millions of triples
  • Info on proteins and sequences
• PubMed
  • 800 million triples
  • HCLS publications
Anatomy of the LOD cloud: geographical
• GeoNames
  • ~100 million triples
  • 10 million places with lat, long, population, subdivisions, post-codes, etc.
• 2000 U.S. Census
  • ~1 billion triples
  • Population statistics per geographical location
• Linked Sensor Data
  • ~1 billion triples
  • Sensor observations from 20 thousand weather observatories
• Linked GeoData
  • ~3 billion triples
  • OpenStreetMap geolocations
Anatomy of the LOD cloud: governmental
• UK Legislation
  • ~2 billion triples
  • UK primary and secondary legislation info
• NASA
  • ~100 thousand triples
  • Spacecraft, star catalogues, etc.
• EuroStat
  • ~5 million triples
  • Various statistics for EU countries
• UK Postcodes
  • ~27 million triples
  • Every UK postcode
• GovTrack
  • ~13 million triples
  • US Congress bills, sponsorship, voting records
Anatomy of the LOD cloud: media
• Music (Various)
  • 100s of millions of triples
  • MySpace
  • AudioScrobbler
  • MusicBrainz
  • discogs
  • LastFM
• Poképédia
  • ~115 thousand triples
  • Everything you ever wanted to know about Pokémon (but were afraid to ask)
• BBC Programmes
  • ~60 million triples
  • Extensive info on BBC TV and radio programmes
• New York Times
  • ~400 thousand triples
  • Extensive news vocabulary and cat. schemes
• Linked Movie Database
  • ~6 million triples
  • Movie database
  • Open (smaller) version of IMDb
…TODO
• General statistics
• Licensing
• SPARQL endpoints
• Problems…