Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann

Presentation Transcript


  1. Georgi Kobilarov, Chris Bizer, Sören Auer, Jens Lehmann • Freie Universität Berlin, Universität Leipzig

  2. Querying Wikipedia like a Database

  3. Domain-specific Data • Images • Infoboxes • Title • Description • Languages • Web Links • Categorization

  4. Infobox Extraction • dbpedia:Albert_Einstein p:name "Albert Einstein" • dbpedia:Albert_Einstein p:birth_place dbpedia:Ulm • dbpedia:Albert_Einstein p:birth_date "1879-03-14"
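
The extraction step on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the DBpedia extraction framework itself: the function name `extract_infobox_triples`, the simplified sample wikitext, and the rule "a value that is a single wiki link becomes a resource, everything else becomes a literal" are all assumptions made for the example.

```python
import re

def extract_infobox_triples(subject, wikitext):
    """Turn `| key = value` infobox lines into (subject, predicate, object) triples."""
    triples = []
    for line in wikitext.splitlines():
        match = re.match(r"\s*\|\s*([^=|]+?)\s*=\s*(.+?)\s*$", line)
        if not match:
            continue  # not an infobox key/value line
        key, value = match.groups()
        predicate = "p:" + key.replace(" ", "_")
        # Assumption: a value that is exactly one [[wiki link]] names a resource.
        link = re.fullmatch(r"\[\[([^\]|]+)(?:\|[^\]]*)?\]\]", value)
        if link:
            obj = "dbpedia:" + link.group(1).strip().replace(" ", "_")
        else:
            obj = '"' + value + '"'  # everything else is kept as a plain literal
        triples.append((subject, predicate, obj))
    return triples

# Simplified, hypothetical wikitext; real infoboxes use nested templates for dates.
SAMPLE = """{{Infobox scientist
| name = Albert Einstein
| birth_place = [[Ulm]]
| birth_date = 1879-03-14
}}"""

triples = extract_infobox_triples("dbpedia:Albert_Einstein", SAMPLE)
for triple in triples:
    print(" ".join(triple))
```

Each printed line has the same subject, property, object shape as the triples on the slide.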

  5. Property Synonyms

  6. Structuring Wikipedia's Knowledge • Structuring actual data, not modeling the world • Bound to Wikipedia templates; parsers handle template values based on rules (property splitting, merging, transformation)
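
The three kinds of rules named on this slide (merging, splitting, transformation) might look roughly like the sketch below. The synonym table, the separators, and the date pattern are hypothetical choices made for illustration; the slides do not show the actual rules.

```python
import re

# Hypothetical synonym table: several template keys merge into one property.
SYNONYMS = {
    "birthplace": "birth_place",
    "place_of_birth": "birth_place",
}

MONTHS = {name: number for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"], start=1)}

def normalize_property(key):
    """Merging: map synonymous template keys onto a single property name."""
    key = key.strip().lower().replace(" ", "_")
    return SYNONYMS.get(key, key)

def split_values(value):
    """Splitting: break a multi-valued field on commas, 'and', or <br> tags."""
    parts = re.split(r",|\band\b|<br\s*/?>", value)
    return [part.strip() for part in parts if part.strip()]

def to_iso_date(value):
    """Transformation: rewrite 'day Month year' as an ISO date, else return None."""
    match = re.fullmatch(r"(\d{1,2}) (\w+) (\d{4})", value.strip())
    if not match or match.group(2) not in MONTHS:
        return None
    day, month, year = int(match.group(1)), MONTHS[match.group(2)], int(match.group(3))
    return f"{year:04d}-{month:02d}-{day:02d}"

print(normalize_property("Place of birth"))
print(split_values("Physics, Philosophy and Mathematics"))
print(to_iso_date("14 March 1879"))
```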

  7. DBpedia Ontology • DBpedia Ontology built from scratch • 170 classes, 900 properties

  8. No living things

  9. Class Hierarchy • “Select all TV Episodes …”

  10. Template Mapping • Class: TV Episode (Work) • Wikipedia Templates: Television Episode, UK Office Episode, Simpsons Episode, DoctorWhoBox

  11. Template Mapping • Infobox Cricketer • Infobox HistoricCricketer • Infobox RecentCricketer • Infobox Old Cricketer • Infobox CricketerBiography => Class: Cricketer (Athlete)
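
Mapping many templates onto one class is what makes a request such as "select all TV Episodes" answerable. The sketch below assumes a plain dictionary for the mapping and a one-parent class hierarchy. The template and class names come from the slides, the Athlete to Person link is read off the "People" slide, and the page titles and function names are hypothetical.

```python
# Template name -> ontology class, as listed on the two mapping slides.
TEMPLATE_TO_CLASS = {
    "Television Episode": "TV Episode",
    "UK Office Episode": "TV Episode",
    "Simpsons Episode": "TV Episode",
    "DoctorWhoBox": "TV Episode",
    "Infobox Cricketer": "Cricketer",
    "Infobox HistoricCricketer": "Cricketer",
    "Infobox RecentCricketer": "Cricketer",
    "Infobox Old Cricketer": "Cricketer",
    "Infobox CricketerBiography": "Cricketer",
}

# Class -> parent class. "Person" as the parent of Athlete is an assumption.
PARENT = {"TV Episode": "Work", "Cricketer": "Athlete", "Athlete": "Person"}

def class_chain(cls):
    """Return the class followed by all of its ancestors."""
    chain = [cls]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain

def select_all(pages, wanted_class):
    """Select page titles whose template maps to wanted_class or a subclass of it."""
    selected = []
    for title, template in pages:
        cls = TEMPLATE_TO_CLASS.get(template)
        if cls is not None and wanted_class in class_chain(cls):
            selected.append(title)
    return selected

# Hypothetical pages, given as (title, template used on the page).
PAGES = [
    ("Example episode A", "Simpsons Episode"),
    ("Example episode B", "DoctorWhoBox"),
    ("Example player", "Infobox Old Cricketer"),
]

print(select_all(PAGES, "TV Episode"))
print(select_all(PAGES, "Athlete"))
```

Without the mapping, the same selection would need one condition per template name.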

  12. People • Actors • Athlete • Journalist • MusicalArtist • Politician • Scientist • Writer

  13. Places • Airport • City • Country • Island • Mountain • River

  14. Organisations • Band • Company • Educational Institution • Radio Station • Sports Team

  15. Event • Convention • Military Conflict • Music Event • Sport Event

  16. Work • Book • Broadcast • Film • Software • Television

  17. More structured data • Categories in SKOS • Intra-wiki links • Disambiguation • Redirects • Links to images (and Flickr) • Links to external web pages

  18. Data about 2.6 million “things”

  19. 274 million pieces of information (RDF triples)

  20. Multilingual Abstracts • English: 2,613,000 • German: 391,000 • French: 383,000 • Dutch: 284,000 • Polish: 256,000 • Italian: 286,000 • Spanish: 226,000 • Japanese: 199,000 • Portuguese: 246,000 • Swedish: 144,000 • Chinese: 101,000

  21. DBpedia as Linked Data Hub

  22. Semantic Web “My document can point at your document on the Web, but my database can't point at something in your database without writing special purpose code. The Semantic Web aims at fixing that.” Prof. James Hendler

  23. Web of Documents [Diagram: search engines and web browsers access HTML documents A to D over HTTP; the documents are connected by hyperlinks]

  24. Web of Data [Diagram: Linked Data mashups, Linked Data browsers, and search engines access data sources A to E over HTTP; the things they describe are connected by data links]

  25. Linked Data • Use URIs as names for things. • Use HTTP URIs so that people can look up those names. • When someone looks up a URI, provide useful information. • Include links to other URIs so that they can discover more things. • Wikipedia article URI: http://en.wikipedia.org/wiki/Madrid • DBpedia resource URI: http://dbpedia.org/resource/Madrid
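
The Madrid example suggests a simple naming rule: the DBpedia resource URI reuses the article title from the English Wikipedia URL. A minimal sketch of that rule follows. The function name is hypothetical, and the sketch only assumes the pattern visible in the two URIs on the slide; it does not handle redirects, other language editions, or special characters.

```python
from urllib.parse import urlsplit

WIKI_PREFIX = "/wiki/"
DBPEDIA_RESOURCE = "http://dbpedia.org/resource/"

def wikipedia_to_dbpedia(article_url):
    """Derive a DBpedia resource URI from an English Wikipedia article URL."""
    parts = urlsplit(article_url)
    if parts.netloc != "en.wikipedia.org" or not parts.path.startswith(WIKI_PREFIX):
        raise ValueError("not an English Wikipedia article URL: " + article_url)
    title = parts.path[len(WIKI_PREFIX):]  # e.g. "Madrid"
    return DBPEDIA_RESOURCE + title

print(wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Madrid"))
```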

  26. HTTP URIs

  27. Music • Online Activities • Publications • Geographic • Cross-Domain • Life Sciences

  28. 4.5 billion triples 180 million data links

  29. Use Cases

  30. Use Cases • Data source for web applications • Querying Wikipedia like a database • Tag web content with concepts instead of free-text tags • Vocabulary and semantic backbone for enterprise linked data integration

  31. DBpedia as data source • Embed structured information from Wikipedia into your web applications • Build (mobile) maps applications using DBpedia data about places • Display multilingual titles & descriptions in 15 languages

  32. DBpedia Mobile

  33. SPARQL Endpoint: http://dbpedia.org/sparql
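
A query can be sent to this endpoint as an ordinary HTTP GET request with the query text in the `query` parameter, as the SPARQL protocol specifies. The sketch below only builds the request and does not send it. The sample query, the `build_request` helper, and the choice of JSON results are assumptions made for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"

# Sample query: list some properties and values of the Madrid resource.
QUERY = """
SELECT ?property ?value WHERE {
  <http://dbpedia.org/resource/Madrid> ?property ?value .
} LIMIT 10
"""

def build_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})

request = build_request(QUERY)
print(request.full_url)
# To run the query, pass `request` to urllib.request.urlopen and parse the JSON body.
```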

  34. Wikipedia Query

  35. Annotating Documents • Use DBpedia concepts to annotate documents instead of free-text tags • Named entity extraction systems already use DBpedia URIs (OpenCalais, Muddy Boots) • Social bookmarking with DBpedia URIs as tags: www.faviki.com

  36. “Apple” • http://dbpedia.org/resource/Apple_Inc. • http://dbpedia.org/resource/Apple_(fruit) • http://dbpedia.org/resource/Apple_Records
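
The "Apple" example shows why a URI works better as a tag than a free-text label: one label has several candidate concepts, while each URI names exactly one. A small sketch of tagging by URI follows. The in-memory candidate table, the helper names, and the document identifiers are hypothetical; a real application would obtain candidates from a lookup service.

```python
# Candidate concepts for an ambiguous label, taken from the slide.
CANDIDATES = {
    "apple": [
        "http://dbpedia.org/resource/Apple_Inc.",
        "http://dbpedia.org/resource/Apple_(fruit)",
        "http://dbpedia.org/resource/Apple_Records",
    ],
}

def candidate_uris(label):
    """Return the known concept URIs for a free-text label (case-insensitive)."""
    return CANDIDATES.get(label.strip().lower(), [])

def tag(index, doc_id, uri):
    """Record that a document is tagged with one specific concept URI."""
    index.setdefault(uri, set()).add(doc_id)

index = {}
company, fruit, _records = candidate_uris("Apple")
tag(index, "doc-about-phones", company)    # hypothetical document ids
tag(index, "doc-about-orchards", fruit)

# A search by URI returns only documents about that concept.
print(sorted(index[company]))
```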

  37. Annotating Documents • BBC editors tag news articles with DBpedia concepts • DBpedia Lookup Service: http://lookup.dbpedia.org
