Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann Freie Universität Berlin, Universität Leipzig
Querying Wikipedia like a Database
Domain-specific data • Images • Infoboxes • Title • Description • Languages • Web links • Categorization
Infobox Extraction
dbpedia:Albert_Einstein p:name "Albert Einstein"
dbpedia:Albert_Einstein p:birth_place dbpedia:Ulm
dbpedia:Albert_Einstein p:birth_date "1879-03-14"
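A minimal sketch of how infobox fields could be turned into triples of the shape shown on this slide. The wikitext snippet, the `infobox_to_triples` function and the line-based parsing are illustrative assumptions, not the actual DBpedia extraction code.

```python
import re

# Hypothetical, simplified infobox wikitext; real infoboxes are far messier.
WIKITEXT = """{{Infobox scientist
| name = Albert Einstein
| birth_place = [[Ulm]]
| birth_date = 1879-03-14
}}"""

# One "| key = value" field per line.
FIELD = re.compile(r"^\|\s*(\w+)\s*=\s*(.+?)\s*$", re.MULTILINE)
# A value that consists of exactly one wiki link, e.g. [[Ulm]] or [[Ulm|city]].
LINK = re.compile(r"^\[\[([^\]|]+)(?:\|[^\]]*)?\]\]$")


def infobox_to_triples(page_title, wikitext):
    """Return one (subject, predicate, object) triple per infobox field."""
    subject = "dbpedia:" + page_title.replace(" ", "_")
    triples = []
    for key, value in FIELD.findall(wikitext):
        link = LINK.match(value)
        if link:
            # A wiki link becomes a reference to another resource.
            obj = "dbpedia:" + link.group(1).strip().replace(" ", "_")
        else:
            # Everything else is kept as a plain literal.
            obj = '"' + value + '"'
        triples.append((subject, "p:" + key, obj))
    return triples


for triple in infobox_to_triples("Albert Einstein", WIKITEXT):
    print(" ".join(triple))
```

Links are mapped to resources and all other values to literals, which mirrors the difference between `p:birth_place dbpedia:Ulm` and the quoted name and date values.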
Structuring Wikipedia's Knowledge • Structuring actual data, not modeling the world • Bound to Wikipedia templates; parsers handle template values based on rules (property splitting, merging, transformation)
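The three rule types named on this slide (property splitting, merging, transformation) could look roughly like the following. The function names, input formats and example values are hypothetical; they only illustrate the idea of rule-based handling of template values.

```python
from datetime import datetime


def split_values(raw, separator=","):
    """Splitting: one template value holding several items becomes a list."""
    return [part.strip() for part in raw.split(separator) if part.strip()]


def merge_date_parts(year, month, day):
    """Merging: several template properties are combined into one value."""
    return f"{int(year):04d}-{int(month):02d}-{int(day):02d}"


def transform_date(raw):
    """Transformation: a date such as '14 March 1879' is normalised to ISO 8601.

    Assumes English month names and a day-month-year layout.
    """
    return datetime.strptime(raw, "%d %B %Y").date().isoformat()


print(split_values("physics, philosophy"))
print(merge_date_parts("1879", "3", "14"))
print(transform_date("14 March 1879"))
```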
DBpedia Ontology • DBpedia Ontology built from scratch • 170 classes, 900 properties
Class Hierarchy "Select all TV Episodes …"
Template Mapping • Class: TV Episode (Work) • Wikipedia templates: Television Episode, UK Office Episode, Simpsons Episode, DoctorWhoBox
Template Mapping • Infobox Cricketer, Infobox HistoricCricketer, Infobox RecentCricketer, Infobox Old Cricketer, Infobox CricketerBiography => Class Cricketer (Athlete)
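The two template-mapping slides can be read as a lookup table from template names to ontology classes, plus a superclass relation. The sketch below uses only the names given on the slides; the dictionary layout and the `classes_for_template` helper are assumptions made for illustration.

```python
# Template-to-class mapping, using the template and class names from the slides.
TEMPLATE_TO_CLASS = {
    "Television Episode": "TV Episode",
    "UK Office Episode": "TV Episode",
    "Simpsons Episode": "TV Episode",
    "DoctorWhoBox": "TV Episode",
    "Infobox Cricketer": "Cricketer",
    "Infobox HistoricCricketer": "Cricketer",
    "Infobox RecentCricketer": "Cricketer",
    "Infobox Old Cricketer": "Cricketer",
    "Infobox CricketerBiography": "Cricketer",
}

# Direct superclass of each class, as given in parentheses on the slides.
SUPERCLASS = {"TV Episode": "Work", "Cricketer": "Athlete"}


def classes_for_template(template):
    """Return the mapped class followed by its superclasses, most specific first."""
    cls = TEMPLATE_TO_CLASS.get(template)
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS.get(cls)
    return chain


print(classes_for_template("Simpsons Episode"))
print(classes_for_template("Infobox Old Cricketer"))
```

Because several differently named templates map to one class, a query for all TV episodes or all cricketers does not need to know which template a given article uses.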
People: Actors, Athlete, Journalist, MusicalArtist, Politician, Scientist, Writer
Places: Airport, City, Country, Island, Mountain, River
Organisations: Band, Company, Educational Institution, Radio Station, Sports Team
Event: Convention, Military Conflict, Music Event, Sport Event
Work: Book, Broadcast, Film, Software, Television
More structured data • Categories in SKOS • Intra-wiki links • Disambiguation • Redirects • Links to images (and Flickr) • Links to external web pages
Multilingual Abstracts • English: 2,613,000 • German: 391,000 • French: 383,000 • Dutch: 284,000 • Polish: 256,000 • Italian: 286,000 • Spanish: 226,000 • Japanese: 199,000 • Portuguese: 246,000 • Swedish: 144,000 • Chinese: 101,000
DBpedia as Linked Data Hub
Semantic Web “My document can point at your document on the Web, but my database can't point at something in your database without writing special purpose code. The Semantic Web aims at fixing that.” Prof. James Hendler
Web of Documents [Diagram: search engines and web browsers access HTML documents A, B, C and D over HTTP; the documents are connected by hyperlinks.]
Web of Data [Diagram: Linked Data mashups, Linked Data browsers and search engines access data sources A, B, C, D and E over HTTP; the things they describe are connected by data links.]
Linked Data • Use URIs as names for things. • Use HTTP URIs so that people can look up those names. • When someone looks up a URI, provide useful information. • Include links to other URIs, so that they can discover more things.
Wikipedia article URI: http://en.wikipedia.org/wiki/Madrid
DBpedia resource URI: http://dbpedia.org/resource/Madrid
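The Madrid example suggests that a DBpedia resource URI reuses the title part of the English Wikipedia article URL. A small sketch of that mapping follows; the `dbpedia_uri` helper is hypothetical, and the rule is inferred from the single example on the slide rather than from a specification.

```python
from urllib.parse import urlsplit

WIKIPEDIA_PREFIX = "/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"


def dbpedia_uri(wikipedia_url):
    """Map an English Wikipedia article URL to the matching DBpedia resource URI."""
    parts = urlsplit(wikipedia_url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(WIKIPEDIA_PREFIX):
        raise ValueError("not an English Wikipedia article URL: " + wikipedia_url)
    # Keep the article title exactly as it appears in the URL path.
    return DBPEDIA_RESOURCE + parts.path[len(WIKIPEDIA_PREFIX):]


print(dbpedia_uri("http://en.wikipedia.org/wiki/Madrid"))
```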
Music, Online Activities, Publications, Geographic, Cross-Domain, Life Sciences
4.5 billion triples, 180 million data links
Use Cases • Data source for web applications • Querying Wikipedia like a database • Tag web content with concepts instead of free-text tags • Vocabulary and semantic backbone for enterprise linked data integration
DBpedia as data source • Embed structured information from Wikipedia into your web applications • Build (mobile) maps applications using DBpedia data about places • Display multilingual titles & descriptions in 15 languages
SPARQL Endpoint http://dbpedia.org/sparql
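A hedged sketch of how a client might send a query to this endpoint using only the Python standard library. It relies on the standard SPARQL protocol (`query` parameter, JSON results requested via the `Accept` header); the class URI in the example query is an assumption and should be checked against the ontology before use.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

ENDPOINT = "http://dbpedia.org/sparql"

# Assumed class URI for TV episodes; verify it before relying on it.
QUERY = """
SELECT ?episode WHERE {
  ?episode a <http://dbpedia.org/ontology/TelevisionEpisode> .
} LIMIT 10
"""


def build_request(query, endpoint=ENDPOINT):
    """Build an HTTP GET request that follows the SPARQL protocol."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the values bound to ?episode (needs network access)."""
    with urlopen(build_request(query, endpoint), timeout=30) as response:
        results = json.load(response)
    return [row["episode"]["value"] for row in results["results"]["bindings"]]


# Building the request needs no network; call run_query(QUERY) to execute it.
print(build_request(QUERY).full_url)
```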
Annotating Documents • Use DBpedia concepts to annotate documents instead of free-text tags • Named Entity Extraction systems already use DBpedia URIs (OpenCalais, Muddy Boots) • Social bookmarking with DBpedia URIs as tags: www.faviki.com
"Apple"
http://dbpedia.org/resource/Apple_Inc.
http://dbpedia.org/resource/Apple_(fruit)
http://dbpedia.org/resource/Apple_Records
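The "Apple" example shows why a URI is a better tag than a free-text label: one label can stand for several concepts. A toy sketch of choosing among the three candidate URIs by context follows. The keyword sets and the overlap scoring are made up for illustration and are not how any of the named services work.

```python
# Candidate URIs for the label "Apple", taken from the slide.
# The context keywords per candidate are invented for this example.
CANDIDATES = {
    "http://dbpedia.org/resource/Apple_Inc.": {"iphone", "mac", "company", "software"},
    "http://dbpedia.org/resource/Apple_(fruit)": {"fruit", "tree", "orchard", "juice"},
    "http://dbpedia.org/resource/Apple_Records": {"beatles", "label", "album", "music"},
}


def disambiguate(text, candidates=CANDIDATES):
    """Return the URI whose keywords overlap most with the text, or None if none overlap."""
    words = set(text.lower().split())
    best_uri, best_score = None, 0
    for uri, keywords in candidates.items():
        score = len(words & keywords)
        if score > best_score:
            best_uri, best_score = uri, score
    return best_uri


print(disambiguate("Apple released a new iPhone and updated its Mac software"))
```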
Annotating Documents • BBC editors tag news articles with DBpedia concepts • DBpedia Lookup Service: http://lookup.dbpedia.org