
Christian Bizer, Richard Cyganiak, Freie Universität Berlin

W3C Workshop on RDF Access to Relational Databases, 25-26 October 2007, Boston, MA, USA. D2RQ: Lessons Learned. Christian Bizer, Richard Cyganiak, Freie Universität Berlin.


Presentation Transcript


  1. W3C Workshop on RDF Access to Relational Databases, 25-26 October 2007, Boston, MA, USA • D2RQ: Lessons Learned • Christian Bizer, Richard Cyganiak, Freie Universität Berlin

  2. The D2RQ Platform • 2002: D2R MAP • dumps relational databases as RDF • based on an expressive declarative mapping language • 2004: D2RQ • RDQL/SPARQL to SQL query rewriting • Jena and Sesame API • 2006: D2R Server • SPARQL and Linked Data access over the Web • Tested with Oracle, MySQL, and PostgreSQL • Should work with any SQL-92 compatible database • GNU GPL license, 4600 downloads (150 per month)

  3. Outline • D2RQ Mapping Language • D2RQ Architecture and Interfaces • Areas for Future Community Work • RDF Access to Relational Databases • The Web Perspective

  4. The D2RQ Mapping Language • Declarative language for expressing mappings between a given RDF schema and a given relational database schema.

  5. Class Map • Table: Author • Mapping: map:Author_ClassMap a d2rq:ClassMap; d2rq:class foaf:Person; d2rq:uriPattern "/people/@@Author.ID@@". • Resulting triple: http://www4.wiwiss.fu-berlin/d2rServer/people/12 rdf:type foaf:Person .

  6. Property Bridge • Table: Author • Mapping: map:email_PropertyBridge a d2rq:PropertyBridge; d2rq:belongsToClassMap map:Author_ClassMap; d2rq:property foaf:name; d2rq:pattern "@@Author.first@@ @@Author.last@@". • Resulting triple: http://www4.wiwiss.fu-berlin/d2rServer/people/12 foaf:name "Chris Bizer" .
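
The class map and property bridge on slides 5 and 6 both rely on `@@Table.column@@` placeholders. The following is a minimal Python sketch of how such patterns could be expanded for one database row. It is an illustration only, not D2RQ's actual implementation; the helper names `expand` and `triples_for_author` and the dictionary-based row format are assumptions made for this example.

```python
import re

# Matches placeholders of the form @@Table.column@@ as used on the slides.
PLACEHOLDER = re.compile(r"@@(\w+)\.(\w+)@@")


def expand(pattern, row):
    """Replace every @@Table.column@@ placeholder with the row's value."""
    return PLACEHOLDER.sub(
        lambda m: str(row[f"{m.group(1)}.{m.group(2)}"]), pattern
    )


def triples_for_author(row, base="http://www4.wiwiss.fu-berlin/d2rServer"):
    """Hypothetical helper: apply the class map and the property bridge."""
    # Class map: d2rq:uriPattern yields the subject URI, d2rq:class its type.
    subject = base + expand("/people/@@Author.ID@@", row)
    # Property bridge: d2rq:pattern yields the literal value of foaf:name.
    name = expand("@@Author.first@@ @@Author.last@@", row)
    return [
        (subject, "rdf:type", "foaf:Person"),
        (subject, "foaf:name", name),
    ]


# One row of the Author table, keyed by "Table.column" (assumed format).
row = {"Author.ID": 12, "Author.first": "Chris", "Author.last": "Bizer"}
triples = triples_for_author(row)
for triple in triples:
    print(triple)
```

Keying the row by `Table.column` keeps the placeholder lookup trivial; a real engine would instead derive the needed columns from the patterns and select only those.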

  7. Joins • Tables: Papers, Author, Rel_Authors_Papers • Mapping: map:author_PropertyBridge a d2rq:PropertyBridge; d2rq:belongsToClassMap :PeopleClassMap; d2rq:property dc:creator; d2rq:refersToClassMap :PapersClassMap; d2rq:join "Author.ID=Rel_Authors_Papers.AuthorID"; d2rq:join "Rel_Authors_Papers.PaperID=Papers.ID". • Resulting triple: http://www4.wiwiss.fu-berlin/d2rServer/docs/312 dc:creator http://www4.wiwiss.fu-berlin/d2rServer/people/12 .
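
To make the role of the two `d2rq:join` conditions concrete, here is a small Python sketch that turns them into the WHERE clause of a SQL query and builds the `dc:creator` triple from the result. It is a sketch under assumptions, not D2RQ's query rewriter: the table layout, the sample rows (including the paper title), and the `/docs/` and `/people/` URI patterns are inferred from the example triples on the slides, and an in-memory SQLite database stands in for the real one.

```python
import sqlite3

# In-memory stand-in for the relational database (assumed schema and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Author (ID INTEGER PRIMARY KEY, first TEXT, last TEXT);
    CREATE TABLE Papers (ID INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE Rel_Authors_Papers (AuthorID INTEGER, PaperID INTEGER);
    INSERT INTO Author VALUES (12, 'Chris', 'Bizer');
    INSERT INTO Papers VALUES (312, 'Hypothetical paper title');
    INSERT INTO Rel_Authors_Papers VALUES (12, 312);
""")

# The two d2rq:join conditions from the slide become WHERE conditions.
joins = [
    "Author.ID = Rel_Authors_Papers.AuthorID",
    "Rel_Authors_Papers.PaperID = Papers.ID",
]
sql = (
    "SELECT Papers.ID, Author.ID "
    "FROM Author, Rel_Authors_Papers, Papers "
    "WHERE " + " AND ".join(joins)
)

# Build one dc:creator triple per joined row, as in the slide's example.
base = "http://www4.wiwiss.fu-berlin/d2rServer"
creator_triples = [
    (f"{base}/docs/{paper_id}", "dc:creator", f"{base}/people/{author_id}")
    for paper_id, author_id in conn.execute(sql)
]
print(creator_triples)
```

Each join condition links two tables along the path from the author to the paper, so the link table never has to appear in the RDF output.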

  8. Other Features of the Mapping Language • Conditional mappings • Value translation tables • Extensible with arbitrary value translation functions • Performance hints
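
Two of these features, conditional mappings and value translation tables, can be illustrated with a short Python sketch. Everything in it is hypothetical: the `status` and `published` columns, the `ex:` class names, and the choice to emit no triple for an untranslatable value are assumptions for this example, not details from the talk.

```python
# Hypothetical rows of a Papers table; column names are illustrative only.
papers = [
    {"ID": 312, "status": "J", "published": 1},
    {"ID": 313, "status": "C", "published": 0},
    {"ID": 314, "status": "X", "published": 1},
]

# Value translation table: database code -> RDF value (assumed vocabulary).
STATUS_TABLE = {"J": "ex:JournalArticle", "C": "ex:ConferencePaper"}


def map_paper(row):
    """Apply a condition and a translation table to one row."""
    # Conditional mapping: only rows satisfying the condition yield triples.
    if not row["published"]:
        return []
    # Translation table: in this sketch, unknown codes produce no triple.
    rdf_value = STATUS_TABLE.get(row["status"])
    if rdf_value is None:
        return []
    return [(f"/docs/{row['ID']}", "rdf:type", rdf_value)]


mapped = [triple for row in papers for triple in map_paper(row)]
print(mapped)
```

Here row 313 is filtered out by the condition and row 314 by the missing table entry, so only paper 312 is mapped.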

  9. D2RQ Architecture and Interfaces

  10. Performance and Limitations • Performance is fine with databases containing a few million records. • Dumps, Linked Data, and the HTML interface are usually no problem. • Simple SPARQL queries are usually fine. • Complex SPARQL queries (OPTIONAL, FILTER, LIMIT) are sometimes slow. • This is due to limitations of the implementation and will improve with future releases. • Limitations • No support for Named Graphs • Read-only: no support for CREATE/DELETE/UPDATE • No support for inference

  11. Areas for Future Community Work

  12. RDF Access to Relational Databases • With Virtuoso, DartGrid, SPASQL, SquirrelRDF, Relational.OWL, D2RQ, and … there are various suitable solutions around. • Compare the Expressivity of Mapping Languages • People need weird mappings and fixups for database design anti-patterns. • We need an accepted mapping benchmark which reflects this. • First approach: THALIA testbed. • Compare the Performance of the different Implementations • We need an accepted performance benchmark.

  13. Future Community Work seen from the Web Perspective • Mapping relational databases to RDF is a local problem and its technical realization matters little from the Web perspective. • What people really want are • expressive and fast queries • over an integrated view • on an unbounded number of data sources (the Web) • expressed via simple user interfaces. • We should aim at providing answers to the well-known, but hard data integration questions arising from this scenario.

  14. Testbed: The Linking Open Data Cloud

  15. Federation versus Replication • Virtual Integration via SPARQL Query Federation • DARQ (HU Berlin) • Complicated and slow. • Materialized Integration via Crawling • Zitgist (Zitgist), SWSE (DERI), Swoogle (UMBC), Watson (Open University) • Fast, but requires huge RDF repositories. • Worked for HTML, worked for RSS, so why not for RDF? • Materialization On-the-Fly • Crawl only data that is needed while answering the query. • Semantic Web Client Library (FU Berlin), SWIC (University of London) • Works, but is really slow. • [Figure: data sources such as DBpedia, Geonames, SIOC, and FOAF connected by RDF links]

  16. Data Source Discovery and Description • Registry-based Discovery • Registries collect links or data source descriptions. • Example: Ping the Semantic Web • Work on data source descriptions • DARQ, SADDLE • Link-based Discovery • Discovering RDF data by following RDF Links. • Worked fine on the classic HTML Web, so why not for the Semantic Web?
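
Link-based discovery amounts to a graph traversal that starts from a seed URI and follows outgoing RDF links. The Python sketch below shows the idea as a breadth-first crawl. It makes several assumptions: the `WEB` dictionary is a hypothetical in-memory stand-in for dereferencing URIs over HTTP, the `example.org` URIs are placeholders, and `discover` is an illustrative helper rather than part of any of the tools named in the talk.

```python
from collections import deque

# Hypothetical stand-in for the Web: URI -> URIs it links to via RDF links.
WEB = {
    "http://example.org/a": ["http://example.org/b", "http://example.org/c"],
    "http://example.org/b": ["http://example.org/c", "http://example.org/d"],
    "http://example.org/c": [],
    "http://example.org/d": ["http://example.org/a"],
}


def discover(seed, fetch_links, max_docs=100):
    """Breadth-first discovery of data sources by following RDF links."""
    seen = {seed}          # URIs already queued, to avoid revisiting cycles
    queue = deque([seed])
    order = []             # URIs in the order they were dereferenced
    while queue and len(order) < max_docs:
        uri = queue.popleft()
        order.append(uri)
        for target in fetch_links(uri):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return order


found = discover("http://example.org/a", lambda uri: WEB.get(uri, []))
print(found)
```

The `max_docs` bound matters in practice because the link graph of the Web is effectively unbounded; a registry-based approach avoids the crawl but depends on sources being registered.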

  17. Schema Mapping • Still no clear answers to: • How to express mappings between different RDF vocabularies? • How to publish and search for such mappings on the Web? • RDF Schema and OWL are insufficient in practice to express mappings. • Maybe the upcoming Rule Interchange Format (RIF) could provide a solution?

  18. Conclusion • We should look at which parts of the Semantic Web puzzle are still missing to make RDF-based data integration work at Web scale! • This talk is online at http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Bizer-Cyganiak-D2RQ-slides.pdf
