W3C Workshop on RDF Access to Relational Databases
25-26 October, 2007, Boston, MA, USA
D2RQ: Lessons Learned
Christian Bizer, Richard Cyganiak
Freie Universität Berlin
The D2RQ Platform
• 2002: D2R MAP
  • dump relational databases as RDF
  • based on an expressive declarative mapping language
• 2004: D2RQ
  • RDQL/SPARQL to SQL query rewriting
  • Jena and Sesame API
• 2006: D2R Server
  • SPARQL, Linked Data access over the Web
• Tested with Oracle, MySQL, and PostgreSQL
• Should work with any SQL-92 compatible database
• GNU GPL license, 4600 downloads (150 per month)
Outline
• D2RQ Mapping Language
• D2RQ Architecture and Interfaces
• Areas for Future Community Work
  • RDF Access to Relational Databases
  • The Web Perspective
The D2RQ Mapping Language
• Declarative language to express mappings between a given RDF schema and a given relational database schema.
Class Map
Table: Author

map:Author_ClassMap a d2rq:ClassMap;
    d2rq:class foaf:Person;
    d2rq:uriPattern "/people/@@Author.ID@@".

Resulting triple:
http://www4.wiwiss.fu-berlin/d2rServer/people/12 rdf:type foaf:Person .
Property Bridge
Table: Author

map:email_PropertyBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap map:Author_ClassMap;
    d2rq:property foaf:name;
    d2rq:pattern "@@Author.first@@ @@Author.last@@".

Resulting triple:
http://www4.wiwiss.fu-berlin/d2rServer/people/12 foaf:name "Chris Bizer" .
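How d2rq:uriPattern and d2rq:pattern are filled from a database row can be sketched roughly as follows. This is a minimal illustration in Python, not D2RQ's actual implementation: the helper expand_pattern, the dictionary-based row, and the way the base URI is prepended are all assumptions made for the example, and URL-encoding of values is left out.

```python
import re

# Matches placeholders of the form @@Table.column@@.
PLACEHOLDER = re.compile(r"@@([A-Za-z_]\w*)\.([A-Za-z_]\w*)@@")


def expand_pattern(pattern, row, base_uri=""):
    """Hypothetical helper: replace each @@Table.column@@ with the row's value."""

    def lookup(match):
        table, column = match.group(1), match.group(2)
        return str(row[f"{table}.{column}"])

    return base_uri + PLACEHOLDER.sub(lookup, pattern)


# One row of the Author table, keyed by qualified column name (assumed layout).
row = {"Author.ID": 12, "Author.first": "Chris", "Author.last": "Bizer"}

# Base URI taken from the example triples on the slides.
BASE = "http://www4.wiwiss.fu-berlin/d2rServer"

person_uri = expand_pattern("/people/@@Author.ID@@", row, base_uri=BASE)
full_name = expand_pattern("@@Author.first@@ @@Author.last@@", row)

print(person_uri, "rdf:type", "foaf:Person", ".")
print(person_uri, "foaf:name", f'"{full_name}"', ".")
```

The class map contributes the subject URI and the rdf:type triple; the property bridge reuses the same subject and produces a literal from two columns.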
Joins
Tables: Papers, Author, Rel_Authors_Papers

map:author_PropertyBridge a d2rq:PropertyBridge;
    d2rq:belongsToClassMap :PeopleClassMap;
    d2rq:property dc:creator;
    d2rq:refersToClassMap :PapersClassMap;
    d2rq:join "Author.ID=Rel_Authors_Papers.AuthorID";
    d2rq:join "Rel_Authors_Papers.PaperID=Papers.ID".

Resulting triple:
http://www4.wiwiss.fu-berlin/d2rServer/docs/312 dc:creator http://www4.wiwiss.fu-berlin/d2rServer/people/12 .
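The two d2rq:join conditions correspond to an ordinary SQL join over the link table. The sketch below uses Python's sqlite3 module with an in-memory database to show the idea. The table contents, the reduced column set, and the "/docs/" and "/people/" URI layout are assumptions inferred from the example triple; the SQL that D2RQ itself generates may look different.

```python
import sqlite3

BASE = "http://www4.wiwiss.fu-berlin/d2rServer"

# In-memory stand-in for the relational database (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Author (ID INTEGER PRIMARY KEY);
    CREATE TABLE Papers (ID INTEGER PRIMARY KEY);
    CREATE TABLE Rel_Authors_Papers (AuthorID INTEGER, PaperID INTEGER);
    INSERT INTO Author VALUES (12);
    INSERT INTO Papers VALUES (312);
    INSERT INTO Rel_Authors_Papers VALUES (12, 312);
    """
)

# The WHERE clause repeats the two d2rq:join conditions verbatim.
rows = conn.execute(
    """
    SELECT Papers.ID, Author.ID
    FROM Author, Rel_Authors_Papers, Papers
    WHERE Author.ID = Rel_Authors_Papers.AuthorID
      AND Rel_Authors_Papers.PaperID = Papers.ID
    """
).fetchall()

# Build one dc:creator triple per joined row.
triples = [
    (f"{BASE}/docs/{paper_id}", "dc:creator", f"{BASE}/people/{author_id}")
    for paper_id, author_id in rows
]

for triple in triples:
    print(*triple, ".")
```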
Other Features of the Mapping Language
• Conditional mappings
• Value translation tables
• Extensible with arbitrary value translation functions
• Performance hints
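The idea behind conditional mappings and value translation tables can be illustrated with a small Python sketch. It does not use D2RQ's own syntax; the column names, the database codes, the "ex:" terms, and the helper map_rows are all hypothetical.

```python
# Hypothetical translation table: database code -> RDF term.
TRANSLATION_TABLE = {
    "J": "ex:JournalArticle",
    "C": "ex:ConferencePaper",
}


def map_rows(rows, condition, table):
    """Yield rdf:type triples for rows that pass the condition.

    Rows whose code has no entry in the translation table are skipped.
    """
    for row in rows:
        if not condition(row):  # conditional mapping: ignore this row
            continue
        rdf_value = table.get(row["type"])  # value translation
        if rdf_value is not None:
            yield (f"/docs/{row['id']}", "rdf:type", rdf_value)


# Made-up sample rows from a Papers table.
papers = [
    {"id": 312, "type": "J", "published": 1},
    {"id": 313, "type": "C", "published": 0},
    {"id": 314, "type": "X", "published": 1},
]

# Only published papers are mapped (the condition is an assumption).
result = list(map_rows(papers, lambda row: row["published"] == 1, TRANSLATION_TABLE))
print(result)
```

Paper 313 is dropped by the condition and paper 314 by the missing table entry, so only paper 312 is mapped.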
Performance and Limitations
• Performance is fine with databases containing a few million records.
  • Dumps, Linked Data, and the HTML interface are usually no problem.
  • Simple SPARQL queries are usually fine.
  • Complex SPARQL queries (OPTIONAL, FILTER, LIMIT) are sometimes slow.
    • Due to limitations of the implementation. Will improve with future releases.
• Limitations
  • No support for Named Graphs
  • Read only. No support for CREATE/DELETE/UPDATE
  • No support for inference
RDF Access to Relational Databases
• With Virtuoso, DartGrid, SPASQL, SquirrelRDF, Relational.OWL, D2RQ, and … there are various suitable solutions around.
• Compare the Expressivity of Mapping Languages
  • People need weird mappings and fixups for database design anti-patterns.
  • We need an accepted mapping benchmark which reflects this.
  • First approach: THALIA testbed.
• Compare the Performance of the different Implementations
  • We need an accepted performance benchmark.
Future Community Work seen from the Web Perspective
• Mapping relational databases to RDF is a local problem and its technical realization matters little from the Web perspective.
• What people really want are
  • expressive and fast queries
  • over an integrated view
  • on an unbounded number of data sources (the Web)
  • expressed via simple user interfaces.
• We should aim at providing answers to the well-known, but hard data integration questions arising from this scenario.
Federation versus Replication
[Figure: data sets such as DBpedia, geonames, SIOC, and FOAF connected by RDF links]
• Virtual Integration via SPARQL Query Federation
  • DARQ (HU Berlin)
  • Complicated and slow.
• Materialized Integration via Crawling
  • Zitgist (Zitgist), SWSE (DERI), Swoogle (UMBC), Watson (Open University)
  • Fast, but requires huge RDF repositories.
  • Worked for HTML, worked for RSS, so why not for RDF?
• Materialization On-the-Fly
  • Crawl only data that is needed while answering the query.
  • Semantic Web Client Library (FU Berlin), SWIC (University of London)
  • Works, but is really slow.
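Materialization on-the-fly, as described above, can be sketched as a query loop that dereferences only the URIs it needs. The Python example below replaces the Web with an in-memory dictionary so that it runs offline. The URIs, the data, the function names, and the fixed query shape ("names of everyone a person knows") are assumptions for illustration, not the API of any of the systems named on the slide.

```python
# Hypothetical stand-in for the Web: URI -> triples returned on dereferencing.
WEB = {
    "ex:alice": [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "foaf:knows", "ex:bob"),
    ],
    "ex:bob": [
        ("ex:bob", "foaf:name", "Bob"),
        ("ex:bob", "foaf:knows", "ex:carol"),
    ],
    "ex:carol": [("ex:carol", "foaf:name", "Carol")],
    "ex:dave": [("ex:dave", "foaf:name", "Dave")],
}


def dereference(uri):
    """Simulated HTTP lookup; a real client would fetch and parse RDF here."""
    return WEB.get(uri, [])


def names_of_friends(start, max_fetches=10):
    """Answer 'names of everyone start knows', fetching only what is needed."""
    fetched = [start]
    names = []
    for subject, predicate, obj in dereference(start):
        if subject == start and predicate == "foaf:knows":
            if len(fetched) >= max_fetches:
                break
            fetched.append(obj)  # follow the RDF link on demand
            for s2, p2, o2 in dereference(obj):
                if s2 == obj and p2 == "foaf:name":
                    names.append(o2)
    return names, fetched


names, fetched = names_of_friends("ex:alice")
print(names, fetched)
```

Only two of the four documents are fetched for this query. The price is one round trip per followed link, which is consistent with the slide's remark that the approach works but is slow.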
Data Source Discovery and Description
• Registry-based Discovery
  • Registries collect links or data source descriptions.
  • Example: Ping the Semantic Web
  • Work on data source descriptions
    • DARQ, SADDLE
• Link-based Discovery
  • Discovering RDF data by following RDF Links.
  • Worked fine on the classic HTML Web, so why not for the Semantic Web?
Schema Mapping
• Still no clear answers to:
  • How to express mappings between different RDF vocabularies?
  • How to publish and search for such mappings on the Web?
• RDF Schema and OWL are insufficient in practice to express mappings.
• Maybe the upcoming Rule Interchange Format (RIF) could provide a solution?
Conclusion
• We should look at which parts of the Semantic Web puzzle are still missing to make RDF-based data integration work at Web scale!

This talk is online at http://sites.wiwiss.fu-berlin.de/suhl/bizer/pub/Bizer-Cyganiak-D2RQ-slides.pdf