A Little Bit of History
Timeline (tracks on the slide: Data Engineering, Programming):
• RDF: 1999
• DAML started: 2000
• RDFS: 2004
• OWL: 2004
• SKOS first version: 2008
• N3 first version: 2008
• SPARQL 1.0: 2008
• OWL 2: 2009
• GeoSPARQL: 2012
• SPARQL 1.1: 2013
• JSON-LD: 2014
Programming the Semantic Web
Linyun Fu, June 3, 2014
Adapted from Steffen Staab’s keynote at ESWC 2014
This presentation contains program code, engineering, speculation and worse. It may be viewed as offensive by some viewers.
Semantic Web Programming
Hypothesis: Semantic Web data is wonderful, but programming with the Semantic Web has changed too little since 2000 and still is a mess.
We promise flexibility, but code is still hard to maintain.
(Diagram labels: RDF file, Linked Data, Ontology, SPARQL endpoint)
There is a mismatch between data engineering and programming approaches
(Diagram labels: RDF file, Linked Data, Ontology, SPARQL endpoint, “Outside” Data Mgmt, “Inside” Data Mgmt, Eclipse, ..., Visual Studio)
Searching, finding, reusing all the complex data structures that we have
“Outside”:
• Understanding by the developer
• “Search + Code”
• “Browse + Code”
• Triple-object mapping
• Code generation
• Process of federation
• LD vs. endpoint
• Models of federation
Abstraction layers that facilitate a developer’s life
(Diagram labels: Eclipse, “Inside” Data Mgmt, “Outside” Data Mgmt, ..., Visual Studio)
Programming with Data: What does it cost?
Both “Inside” and “Outside”:
Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode
Ctool: Costs for building t many tools, shared; almost free
Clearn: Costs for learning how to use technology per developer d
Cdeu: Costs for data engineering/understanding s sources
Cmap: Costs for mapping data structure for s sources to objects
Ccode: Actual costs for accessing/manipulating data n times
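The cost formula on this slide can be evaluated directly. The following is a minimal sketch, not part of the original talk: the class name, field names, and all numbers are made up for illustration, and the units are arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostModel:
    """Unit costs from the slide's formula (arbitrary units, e.g. person-hours)."""
    c_tool: float   # Ctool: building one tool (shared, so often close to 0)
    c_learn: float  # Clearn: one developer learning one tool
    c_deu: float    # Cdeu: data engineering/understanding per source
    c_map: float    # Cmap: mapping one source's data structures to objects
    c_code: float   # Ccode: one data access/manipulation in code

    def total(self, t: int, d: int, s: int, n: int) -> float:
        """Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode."""
        return (t * self.c_tool
                + d * t * self.c_learn
                + s * self.c_deu
                + s * self.c_map
                + n * self.c_code)


# Hypothetical numbers, chosen only to exercise the formula.
model = CostModel(c_tool=0.0, c_learn=8.0, c_deu=4.0, c_map=6.0, c_code=0.5)
small = model.total(t=2, d=3, s=2, n=10)     # setup costs dominate
large = model.total(t=2, d=3, s=2, n=1000)   # n*Ccode dominates
print(small, large)
```

With these made-up values the setup terms (learning, understanding, mapping) outweigh the coding term for small n, while n*Ccode takes over as n grows.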
SWOT of Semantic Web Programming (“Outside”)
Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode
Annotations on the slide:
• weakness
• threat
• strength/weakness
• as good as RelDB is not good enough!
• opportunity
• opportunity
• Strong in flexibility
• Somewhat weak in performance
• Not a strength, yet!
Intermediate conclusion
Minimize costs for setup:
Clearn: Costs for learning how to use technology per developer d
Cdeu: Costs for data engineering/understanding s sources
Cmap: Costs for mapping data structure for s sources to objects
Minimize costs for core programming:
Ccode: Actual costs for accessing/manipulating data n times
• Costs for learning and understanding constitute a threat!
• Need to be overcome!
We need flexible code to match flexible data structures • i.e., Domain-specific languages for Semantic Web Programming • XML programming example • Why Jena is not good enough
XML programming example: LINQ to XML
    // C#; requires: using System.Xml.Linq;
    XElement contacts =
        new XElement("Contacts",
            new XElement("Contact",
                new XElement("Name", "Patrick Hines"),
                new XElement("Phone", "206-555-0144"),
                new XElement("Address",
                    new XElement("Street1", "123 Main St"),
                    new XElement("City", "Mercer Island"),
                    new XElement("State", "WA"),
                    new XElement("Postal", "68042")
                )
            )
        );
The Jena Approach: the Jamendo ontology
Task: List all records for each music artist
From artists to songs
Observations
• SPARQL queries are strings
• Results are strings
• Requires good understanding of the data source
RDF typing is lost
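The observations above can be illustrated without any RDF library. This is a minimal sketch, not the Jena code from the talk: the response is made-up sample data laid out like the W3C SPARQL JSON results format, no endpoint is contacted, the function name is hypothetical, and the use of foaf:made to link artists to records is an assumption about the Jamendo data.

```python
import json
from typing import Dict, List

# To the host language a SPARQL query is only a string: nothing checks the
# prefixes, property names, or variable names. It is shown here, never sent.
QUERY = """
PREFIX mo:   <http://purl.org/ontology/mo/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?artist ?record WHERE {
  ?artist a mo:MusicArtist ;
          foaf:made ?record .
}
"""

# Made-up response in the SPARQL JSON results layout.
RESPONSE = json.loads("""
{
  "head": {"vars": ["artist", "record"]},
  "results": {"bindings": [
    {"artist": {"type": "uri", "value": "http://example.org/artist/1"},
     "record": {"type": "uri", "value": "http://example.org/record/10"}},
    {"artist": {"type": "uri", "value": "http://example.org/artist/1"},
     "record": {"type": "uri", "value": "http://example.org/record/11"}},
    {"artist": {"type": "uri", "value": "http://example.org/artist/2"},
     "record": {"type": "uri", "value": "http://example.org/record/20"}}
  ]}
}
""")


def records_by_artist(response: dict, artist_var: str = "artist",
                      record_var: str = "record") -> Dict[str, List[str]]:
    """Group record IRIs by artist IRI; variables are matched as plain strings."""
    grouped: Dict[str, List[str]] = {}
    for row in response["results"]["bindings"]:
        grouped.setdefault(row[artist_var]["value"], []).append(row[record_var]["value"])
    return grouped


grouped = records_by_artist(RESPONSE)
print(grouped)

# A misspelt variable name is not caught until the lookup actually runs.
try:
    records_by_artist(RESPONSE, artist_var="artsit")
    typo_detected_at = "never"
except KeyError:
    typo_detected_at = "run time"
print(typo_detected_at)
```

Everything that comes back is a plain string keyed by a plain string, so the fact that one column holds artists and the other records exists only in the developer's head.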
Related Work on RDF Access
Static Typing
• Errors detected before execution
• Misspelling discovered by compiler!
• Anecdote: 2nd place because of misspelt var.
• Static types are a form of documentation
• Less knowledge about data source required
• Better IDE integration / autocompletion
Code generation
• Sommer
• Winter
• OntoMDE
Dynamic Typing
• E.g. ActiveRDF (Oren et al. 2007)
• “convention over configuration”
• Dynamic metaprogramming allows for slick code
Criticism
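The trade-off between the two styles can be sketched in a few lines. This is an illustration only, not the API of ActiveRDF, Sommer, Winter, or OntoMDE: both classes and the sample data are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

# Made-up property values for one resource, keyed by property local name.
DATA = {"name": ["Some Artist"], "made": ["http://example.org/record/10"]}


class DynamicResource:
    """Dynamic style: attribute names are resolved against the data at run time."""

    def __init__(self, properties: Dict[str, List[str]]):
        self._properties = properties

    def __getattr__(self, name: str) -> List[str]:
        # Called only when normal attribute lookup fails.
        try:
            return self._properties[name]
        except KeyError:
            raise AttributeError(name) from None


@dataclass
class MusicArtist:
    """Static style: fields are declared, so an IDE or type checker knows them."""
    name: List[str]
    made: List[str]


dynamic = DynamicResource(DATA)
static = MusicArtist(**DATA)
print(dynamic.name, static.made)

try:
    dynamic.nmae  # misspelling: fails only when this line is executed
except AttributeError as err:
    typo_error = str(err)
print(typo_error)

# The same misspelling on the declared class (static.nmae) can be reported by
# a static type checker or an IDE before the program is run at all.
```

The dynamic class needs no declarations and adapts to whatever properties the data carries; the declared class trades that flexibility for errors found before execution and for autocompletion.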
Node Path Query Language: Using Autocompletion
• Exploration of classes
• Exploration of relations
• Querying for instances
Node Path Query Language: Using Autocompletion
• Exploration of classes
• Exploration of relations
Node Path Query Language: Query Formulation
• Exploration of classes
• Exploration of relations
• Querying for instances
Type: set of mo:MusicArtist
No definition or declaration needed
Node Path Query Language for Code Development
One language to bind them all:
• Exploration of classes
• Exploration of relations
• Querying for instances
• Developing code with queries
All translated into SPARQL queries at:
• Development time
• Type inference at compile time (but also as part of IDE)
• Querying again at run time
Node Path Query Language for Code Development
• Exploration of classes
• Exploration of relations
• Querying for instances
• Developing code with queries
• Developing code with new classes
All translated into SPARQL queries at:
• Development time
• Run time update
• Persistence!
NPQL
NPQL (Node Path Query Language)
• Intensional Queries: describing RDF classes and properties for reuse in IDE and in host language metaprogramming
• Extensional Queries: class instances and property instances
• Compilation to SPARQL for reuse of existing endpoints
Ongoing discussion about details of NPQL
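The idea of compiling a node path to SPARQL can be sketched as follows. This is not NPQL syntax or the LITEQ implementation, whose details the slides leave open: the NodePath class, its methods, and the generated variable names are hypothetical, and only the extensional case (selecting instances) is shown.

```python
from dataclasses import dataclass
from typing import Tuple

PREFIXES = {
    "mo": "http://purl.org/ontology/mo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


@dataclass(frozen=True)
class NodePath:
    """Hypothetical path: start at a class, then follow properties in order."""
    start_class: str
    properties: Tuple[str, ...] = ()

    def follow(self, prop: str) -> "NodePath":
        """Return a new path extended by one property step."""
        return NodePath(self.start_class, self.properties + (prop,))

    def to_sparql(self) -> str:
        """Extensional query: select the nodes reached at the end of the path."""
        header = [f"PREFIX {p}: <{iri}>" for p, iri in PREFIXES.items()]
        patterns = [f"?n0 a {self.start_class} ."]
        for i, prop in enumerate(self.properties):
            patterns.append(f"?n{i} {prop} ?n{i + 1} .")
        last = f"?n{len(self.properties)}"
        body = "\n".join("  " + p for p in patterns)
        return "\n".join(header) + f"\nSELECT DISTINCT {last} WHERE {{\n{body}\n}}"


# Everything made by music artists (assuming foaf:made links artists to records).
query = NodePath("mo:MusicArtist").follow("foaf:made").to_sparql()
print(query)
```

Because the path is a structured value rather than a string, a tool can inspect each step, which is what makes autocompletion and type inference over it conceivable.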
LITEQ
NPQL (Node Path Query Language)
• Intensional Queries
• Extensional Queries
• Compilation to SPARQL
LITEQ (Language Integrated Types, Extensions and Queries)
• Implementation of NPQL as F# Type Provider in Visual Studio
• Autocompletion using NPQL queries
• Automatic typing of extensional query results by intensional queries
Cost savings
Ctotal = t*Ctool + d*t*Clearn + s*Cdeu + s*Cmap + n*Ccode
Ctool: open source
Clearn: not free, though autocompletion reduces cognitive load
Cdeu: not free, understanding the RDF schema from your IDE
Cmap: 0
Ccode: a lot less than for dotNetRDF (Apache Jena?!!), a little bit more than for a fictitious perfect object model
Halstead metrics for different tasks:
Conventional Semantic Web programming approaches waste up to 50% of your efforts!
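For readers unfamiliar with Halstead metrics, the standard definitions can be computed as below. This is a minimal sketch: the slides do not say how the tokens were counted for the talk's measurements, so splitting a program into operators and operands is left to the caller, and the function name is made up.

```python
import math
from collections import Counter
from typing import Dict, Iterable


def halstead(operators: Iterable[str], operands: Iterable[str]) -> Dict[str, float]:
    """Compute the standard Halstead measures from token occurrences."""
    ops, opnds = Counter(operators), Counter(operands)
    n1, n2 = len(ops), len(opnds)                      # distinct operators/operands
    N1, N2 = sum(ops.values()), sum(opnds.values())    # total occurrences
    if n1 == 0 or n2 == 0:
        raise ValueError("need at least one operator and one operand")
    vocabulary = n1 + n2
    length = N1 + N2
    volume = length * math.log2(vocabulary)
    difficulty = (n1 / 2) * (N2 / n2)
    effort = difficulty * volume
    return {"vocabulary": vocabulary, "length": length, "volume": volume,
            "difficulty": difficulty, "effort": effort}


# Tokens of the statement `x = a + b`, split by hand.
metrics = halstead(operators=["=", "+"], operands=["x", "a", "b"])
print(metrics)
```

Effort is the product of difficulty and volume, so code that needs fewer distinct constructs and fewer tokens for the same task scores lower.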
Speculation 1
Speculation 1: Using ontologies and RDF schemata, we can develop more efficiently using the right tools!
(Chart: Ctotal over n, as in n*Ccode; curves: RelDB/XML coding efforts, SemWeb coding efforts; annotations: “Applies to small n”, “Diff. costs for learning”)
Halstead metrics for different tasks:
If someone gives me a perfect RDF-to-OO mapping for free, then I will not care about whether it is RDF or RelDB underneath!
Speculation 2
Speculation 2: For large programmes, our tools need to offer better support to reduce setup costs!
“Perfect” object model shields developer from database idiosyncrasies
(Chart: Ctotal over n, as in n*Ccode; curves: RelDB/XML coding efforts, SemWeb coding efforts; annotation: “Diff. costs for setup”)
Semantic Web: Make Developers More Productive
(Diagram labels: RDF file, Linked Data, Ontology, SPARQL endpoint, “Outside” Data Mgmt, “Inside” Data Mgmt, Eclipse, ..., Visual Studio)
What I Think • New programming languages vs. code generation + existing PLs • Mismatches between RDF and OO: Oren et al 2007, Saathoff et al 2009 • Only some ontologies can be perfectly mapped to OO classes • A killer PL to come • “Data programmability” • Model • Format • …?
References
• C. Saathoff, S. Scheglmann, S. Schenk. Winter: Mapping RDF to POJOs revisited.
• E. Oren, R. Delbru, S. Gerke, A. Haller, S. Decker. ActiveRDF: object-oriented semantic web programming. WWW 2007: 817-824.
• S. Scheglmann, A. Scherp, S. Staab. Declarative Representation of Programming Access to Ontologies. In: 9th Extended Semantic Web Conference (ESWC2012), Heraklion, Greece, May 27-31, 2012.
• W. Cook, A. Ibrahim. Integrating Programming Languages & Databases: What’s the Problem? http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.66.7169&rep=rep1&type=pdf