SPARQL for the Pragmatic Ontologist. The Washington Semantic Web Meetup 2009-02-12. Daniel Yacob Mekonnen Semantic Solutions Architect, TopQuadrant. SPARQL by Application. Q: What is the SQL of RDF?. SPARQL!. Q: What is the XSLT of RDF?. SPARQL!. Q: What is the JavaDoc of RDF?. SPARQL!.
SPARQL for the Pragmatic Ontologist. The Washington Semantic Web Meetup, 2009-02-12. Daniel Yacob Mekonnen, Semantic Solutions Architect, TopQuadrant
SPARQL by Application
Q: What is the SQL of RDF? SPARQL!
Q: What is the XSLT of RDF? SPARQL!
Q: What is the JavaDoc of RDF? SPARQL!
Q: What is the Lint of RDF? SPARQL!
Q: What is the Unit Test of RDF? SPARQL!
“It’s in the way that you use it” - Eric Clapton
“The query was just the beginning” - Hey, I said that!
Thinking SPARQL
• To think in SPARQL, think in patterns
  • graph patterns, or shapes of data
  • think in graphs
  • a similar mindset to thinking in regular expressions
• The pattern language is N3
  • “Notation 3”
  • a language for expressing RDF
  • learn N3 as you learn SPARQL
SPARQL Vocabulary
• ASK • BASE • BOUND • CONSTRUCT • DATATYPE • DESCRIBE • DISTINCT • false • FILTER • FROM • FROM NAMED • GRAPH • isIRI • isLITERAL • isURI • LANG • LANGMATCHES • LIMIT • OFFSET • OPTIONAL • ORDER BY • PREFIX • REDUCED • REGEX • sameTerm • SELECT • STR • true • UNION • WHERE
Is SPARQL READ-ONLY? Yes. SPARQL 1.0 is READ-ONLY.
Terms I’ve used at least once in 2 years. Terms I use regularly.
SPARQL Vocabulary (SPARQL 2 Candidates): Jena ARQ Extensions
• CLEAR • COUNT • CREATE • DELETE • DELETE DATA • DROP • GROUP BY • HAVING • INSERT • INSERT DATA • INTO • MODIFY • SERVICE • SILENT
These are read-write extensions to SPARQL. Supported by TBC. Gruff is read-only.
Terms I’ve used at least once in the last year. Terms I use regularly.
Exercise Dataset
• 7 ontologies altogether:
  • 1 ontology of airport codes: IATA code, city, state, country.
  • 1 ontology of national capitals: nation, capital city, with latitude and longitude.
  • 5 ontologies that work as a group, providing country, city and US state relationships.
• These 3 groupings are disjoint, but we can link the data with SPARQL!
• Queries are executed over ALL ontologies.
Simple Select
“Find all country instances and their name properties”, or more simply: “Find the names of all countries”
Two triples, each read as subject, predicate, object:
  ?country rdf:type country:Country
  ?country country:name ?name
• “?” indicates a variable, something unknown.
• ?country (subject): an unknown instance.
• rdf:type and country:name (predicates): properties.
• country:Country (object): a class.
• ?name (object): an unknown property value; “Italy” is an example of a property value.
Simple Select
“Find the names of all countries”
‘?’ is the sigil in SPARQL:
  SELECT ?country ?name
  WHERE {
    ?country rdf:type country:Country .
    ?country country:name ?name
  }
‘$’ is also legal (sigil to taste):
  SELECT $country $name
  WHERE {
    $country rdf:type country:Country .
    $country country:name $name
  }
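Simple Select
“Find the names of all countries”, written five ways:
Full form:
  SELECT ?country ?name
  WHERE {
    ?country rdf:type country:Country .
    ?country country:name ?name
  }
“a” as shorthand for rdf:type:
  SELECT ?country ?name
  WHERE {
    ?country a country:Country .
    ?country country:name ?name
  }
WHERE keyword omitted and the type pattern dropped (the same answers only if nothing but countries has a country:name):
  SELECT ?country ?name { ?country country:name ?name }
“;” to repeat the subject:
  SELECT ?country ?name
  WHERE { ?country a country:Country ; country:name ?name }
All shorthands together:
  SELECT ?country ?name { ?country a country:Country ; country:name ?name }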
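To make "think in patterns" concrete, here is a minimal Python sketch of how a basic graph pattern could be matched against a set of triples. It is a toy illustration only: the sample triples, the `ex:` identifiers and the `match()` helper are assumptions made for this example and are not how any real SPARQL engine is implemented.

```python
# Toy matcher for SPARQL-style triple patterns (illustrative, not a real engine).
# The sample data and the ex: identifiers are made up for this sketch.
TRIPLES = [
    ("ex:IT", "rdf:type", "country:Country"),
    ("ex:IT", "country:name", "Italy"),
    ("ex:FR", "rdf:type", "country:Country"),
    ("ex:FR", "country:name", "France"),
    ("ex:Rome", "rdf:type", "city:City"),
    ("ex:Rome", "city:name", "Rome"),
]


def is_var(term):
    """Terms starting with '?' are variables, as in SPARQL."""
    return term.startswith("?")


def match_triple(pattern, triple, binding):
    """Extend `binding` so that `pattern` equals `triple`, or return None."""
    extended = dict(binding)
    for p_term, t_term in zip(pattern, triple):
        if is_var(p_term):
            # Bind the variable, or check it against its existing value.
            if extended.setdefault(p_term, t_term) != t_term:
                return None
        elif p_term != t_term:
            return None  # a constant in the pattern does not match
    return extended


def match(patterns, triples):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_triple(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings


# SELECT ?country ?name
# WHERE { ?country rdf:type country:Country . ?country country:name ?name }
rows = match(
    [("?country", "rdf:type", "country:Country"),
     ("?country", "country:name", "?name")],
    TRIPLES,
)
names = sorted(row["?name"] for row in rows)
print(names)
```

Because `?country` appears in both patterns, a row survives only when the same resource satisfies both, which is why the city named "Rome" is not returned.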
Ordered Select
“Find the states and their cities”
  SELECT ?state ?city
  WHERE {
    ?cityR city:name ?city ;
           uscity:state ?stateR .
    ?stateR state:name ?state
  }
Ordered Select
“Find the states and their cities, sorted by state, then city name”
  SELECT ?state ?city
  WHERE {
    ?cityR city:name ?city ;
           uscity:state ?stateR .
    ?stateR state:name ?state
  }
  ORDER BY ?state ?city
Ordered Select
“Find the states and their cities, sorted by ascending state name, then by descending city name”
  SELECT ?state ?city
  WHERE {
    ?cityR city:name ?city ;
           uscity:state ?stateR .
    ?stateR state:name ?state
  }
  ORDER BY ASC(?state) DESC(?city)
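For readers more at home in a general-purpose language, the effect of ORDER BY ASC(?state) DESC(?city) can be sketched in Python as a two-key sort. The rows below are made-up sample data, and this only illustrates the ordering, not how a SPARQL engine sorts internally.

```python
# Made-up result rows standing in for ?state / ?city bindings.
rows = [
    {"state": "Texas", "city": "Austin"},
    {"state": "Ohio", "city": "Akron"},
    {"state": "Texas", "city": "Houston"},
    {"state": "Ohio", "city": "Toledo"},
]

# sorted() is stable, so sort by the secondary key first (city, descending),
# then by the primary key (state, ascending).
by_city_desc = sorted(rows, key=lambda r: r["city"], reverse=True)
ordered = sorted(by_city_desc, key=lambda r: r["state"])

for row in ordered:
    print(row["state"], row["city"])
```

The first sort key listed in ORDER BY is the primary one; later keys only break ties among rows that agree on the earlier keys.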
Ordered Select
Compactified expression: found in the wild, but best to avoid; it is more cryptic and error-prone.
  SELECT ?state ?city
  WHERE {
    ?cityR city:name ?city ;
           uscity:state [ state:name ?state ]
  }
  ORDER BY ASC(?state) DESC(?city)
Select with OPTIONAL
  SELECT ?state ?capital ?airport
  WHERE {
    ?stateR state:name ?state ;
            state:capital ?capitalR .
    ?capitalR city:name ?capital
    OPTIONAL {
      ?airportR airports:city ?capital ;
                airports:airport ?airport
    }
  }
?capital is the bridge across ontologies.
Select with UNION
  SELECT ?state
  WHERE {
    {
      ?stateR state:borderstate usstate:AL, usstate:TN .
      ?stateR state:name ?state .
    }
    UNION
    {
      ?stateR state:borderstate usstate:ID .
      ?stateR state:name ?state .
    }
  }
A comma between objects means the subject and predicate are the same.
Select with FILTER
“Find all states that do NOT border another state”
  SELECT ?state
  WHERE {
    ?stateR state:name ?state .
    OPTIONAL { ?stateR state:borderstate ?borderState . }
    FILTER ( !bound(?borderState) )
  }
FILTER means “keep”. With !bound() we effectively look for the absence of a resource.
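The OPTIONAL plus !bound() idiom is essentially a left join followed by a test for the missing column. A rough Python sketch follows, using a small hand-picked sample rather than the exercise dataset; the `left_join_borders` helper is hypothetical.

```python
# Small sample: state resources with names, plus border relationships.
state_names = {
    "usstate:AL": "Alabama",
    "usstate:TN": "Tennessee",
    "usstate:AK": "Alaska",
    "usstate:HI": "Hawaii",
}
borders = [("usstate:AL", "usstate:TN"), ("usstate:TN", "usstate:AL")]


def left_join_borders(names, border_pairs):
    """Mimic: ?stateR state:name ?state . OPTIONAL { ?stateR state:borderstate ?b }"""
    rows = []
    for state_r, name in names.items():
        neighbours = [b for s, b in border_pairs if s == state_r]
        if neighbours:
            rows.extend({"state": name, "borderState": b} for b in neighbours)
        else:
            rows.append({"state": name})  # ?borderState stays unbound
    return rows


rows = left_join_borders(state_names, borders)

# FILTER ( !bound(?borderState) ): keep only rows where the variable is unbound.
isolated = [row["state"] for row in rows if "borderState" not in row]
print(isolated)
```

Without the OPTIONAL, states that have no state:borderstate triple would drop out of the results before the filter could ever see them.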
Select with FILTER
“Find all states, and cities, where the city name begins with the letter ‘Y’ ”
  SELECT ?state ?city
  WHERE {
    ?cityR uscity:state ?stateR .
    ?cityR city:name ?city .
    ?stateR state:name ?state
    FILTER( regex( xsd:string(?city), "^Y") )
  }
Cast untyped literals into the datatype needed by the function.
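The same filter can be tried out in Python, whose `re.search` is unanchored in the same way as SPARQL's regex(): the pattern may match anywhere unless it is anchored with "^". The rows are made-up sample data.

```python
import re

# Made-up (state, city) rows standing in for query results.
rows = [
    ("Arizona", "Yuma"),
    ("New York", "Yonkers"),
    ("New York", "Albany"),
    ("Ohio", "Youngstown"),
    ("New York", "New York"),
]

# FILTER( regex(?city, "^Y") ): "^" anchors the match at the start of the string.
starts_with_y = re.compile(r"^Y")
kept = [(state, city) for state, city in rows if starts_with_y.search(city)]

# Without the anchor, any city containing an upper-case "Y" is kept.
unanchored = [city for _state, city in rows if re.search(r"Y", city)]

print(kept)
print(unanchored)
```

Dropping the anchor is an easy mistake to make: the city "New York" passes the unanchored test but not the anchored one.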
Performance Notes
• Filter early, near the top of an expression: if a condition is not met, the pattern match aborts immediately.
• Use OPTIONAL sparingly; it can slow down a query, sometimes drastically.
• Use ASK when you do not need the results of the match: ASK terminates when the first match is found and returns a boolean.
• DISTINCT and ORDER BY may also be slow over large result sets.
• Use LIMIT and OFFSET with large result sets.
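Two of these notes, ASK stopping at the first match and LIMIT/OFFSET paging, can be sketched with Python generators. This is an analogy only: the city list and the `examined` counter are invented for the illustration, and real engines may evaluate queries quite differently.

```python
from itertools import islice

cities = ["Yakima", "Albany", "Yonkers", "York", "Yuma", "Boston"]
examined = []  # records which rows the matcher actually looked at


def starts_with_y(name):
    return name.startswith("Y")


def matches(rows, predicate):
    """Lazily yield matching rows so a consumer can stop early."""
    for row in rows:
        examined.append(row)
        if predicate(row):
            yield row


# ASK: any() stops consuming the generator at the first match.
has_y_city = any(True for _ in matches(cities, starts_with_y))
rows_checked_by_ask = len(examined)

# ORDER BY ?city LIMIT 2 OFFSET 1: sorting needs every match, then we slice.
page = list(islice(sorted(matches(cities, starts_with_y)), 1, 3))

print(has_y_city, rows_checked_by_ask, page)
```

The sketch also hints at why ORDER BY can be slow: sorting has to see every match before the first page can be returned, whereas the ASK-style check stopped after one row.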
Select with DISTINCT
DISTINCT removes duplicate result rows.
  SELECT DISTINCT ?state ?city
  WHERE {
    ?person uscity:address ?address .
    ?address uscity:city ?city .
    ?city uscity:state ?state
  }
Duplicates explained: ?person is not in the SELECT list, but many, many people will live in a ?city. Without DISTINCT, one SELECT result is returned per graph pattern match, so one ?state ?city row would appear per ?person!
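The duplicate effect is easy to reproduce outside SPARQL. In this Python sketch (made-up people and cities), every pattern match yields one row, and projecting away the person leaves repeated (state, city) pairs until they are de-duplicated.

```python
# One tuple per graph pattern match: (?person, ?city, ?state). Sample data only.
pattern_matches = [
    ("alice", "Austin", "Texas"),
    ("bob", "Austin", "Texas"),
    ("carol", "Akron", "Ohio"),
]

# SELECT ?state ?city: the person is projected away, so duplicates appear.
projected = [(state, city) for _person, city, state in pattern_matches]

# SELECT DISTINCT ?state ?city: drop repeated rows, keeping first-seen order.
distinct = list(dict.fromkeys(projected))

print(projected)
print(distinct)
```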
CONSTRUCT Query
• Has anyone thought it strange that we query a graph, with a graph (WHERE clause), but then process a tabular result set?
• CONSTRUCT allows us to specify a graph to return.
• CONSTRUCTed graphs are not inserted into the queried graph, but some tools allow the constructed graph to be asserted.
• CONSTRUCT is extremely useful for transforming graphs.
Now the bad news… Not all SPARQL implementations are created equal
CONSTRUCT Query
  CONSTRUCT { ?countryR owl:sameAs ?nationR }
  WHERE {
    ?countryR a country:Country ;
              country:name ?country .
    ?nationR a capitals:Nation ;
             rdfs:label ?country ;
  }
Linkage (equating) of resources. ?country bridges two ontologies in the same graph.
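As a rough analogy, the owl:sameAs linkage is a join on the shared name followed by emitting new triples. The Python sketch below uses invented resource identifiers and a hypothetical `construct_same_as` helper; it illustrates the idea rather than a SPARQL implementation.

```python
# Invented sample data: resource -> name/label in two separate ontologies.
countries = {"country:ITALY": "Italy", "country:FRANCE": "France"}
nations = {"capitals:Italy": "Italy", "capitals:Japan": "Japan"}


def construct_same_as(country_names, nation_labels):
    """Emit (country, owl:sameAs, nation) wherever name and label are equal."""
    by_label = {}
    for nation_r, label in nation_labels.items():
        by_label.setdefault(label, []).append(nation_r)
    return {
        (country_r, "owl:sameAs", nation_r)
        for country_r, name in country_names.items()
        for nation_r in by_label.get(name, [])
    }


new_triples = construct_same_as(countries, nations)
print(new_triples)
```

The join is on exact string equality, so names that differ in case or spelling between the two ontologies would not be linked. The constructed triples are returned as a new graph; the inputs are left unchanged.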
CONSTRUCT Query
Construct resource linkages across the 3 ontology groups.
  CONSTRUCT {
    ?airportR wswm:country ?countryR ;
              wswm:city ?cityR ;
              wswm:state ?stateR
  }
  WHERE {
    ?airportR a airports:AirportCode ;
              airports:city ?city ;
              airports:country ?country .
    {
      # Link to International City Instances
      ?cityR a capitals:Capital ;
             rdfs:label ?city .
      ?countryR a capitals:Nation ;
                rdfs:label ?country ;
                capitals:capital ?cityR .
    }
    UNION
    {
      # Link to US City Instances
      LET ( ?countryR := country:UNITED-STATES )
      ?airportR airports:state ?state .
      ?stateR state:code ?state .
      ?cityR city:name ?city ;
             uscity:state ?stateR
    }
  }
The data is now linked and we can find geo coordinates for airports (in capital cities only).
Start comments anywhere with “#”; a comment runs to the end of the line.
Use LET() to assign variables (an extension, not part of SPARQL 1.0).
SPARQL 2
DELETE: use to remove unwanted triples and clean up an ontology.
  DELETE { ?s country:cia ?o }
  WHERE { ?s country:cia ?o }
SPARQL 2
DELETE and INSERT: use together, in sequence, to move content.
  INSERT { ?state rdfs:label ?name }
  WHERE { ?state state:name ?name }

  DELETE { ?state state:name ?name }
  WHERE { ?state state:name ?name }
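The order of the two statements matters, which a small Python sketch over a set of triples makes visible. The graph contents and the two helper functions are assumptions made for the illustration.

```python
# A tiny graph as a set of (subject, predicate, object) triples. Sample data only.
graph = {
    ("usstate:AL", "state:name", "Alabama"),
    ("usstate:AL", "state:code", "AL"),
}


def insert_labels(g):
    """INSERT { ?state rdfs:label ?name } WHERE { ?state state:name ?name }"""
    return g | {(s, "rdfs:label", o) for s, p, o in g if p == "state:name"}


def delete_names(g):
    """DELETE { ?state state:name ?name } WHERE { ?state state:name ?name }"""
    return {t for t in g if t[1] != "state:name"}


# Insert first, then delete: the name is moved to rdfs:label.
moved = delete_names(insert_labels(graph))

# Delete first, then insert: nothing is left for the INSERT to match.
wrong_order = insert_labels(delete_names(graph))

print(sorted(moved))
print(sorted(wrong_order))
```

Running the DELETE first would discard the state:name triples before the INSERT's WHERE clause could match them, so the names would be lost rather than moved.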
Advanced Topics
• Named Graphs: useful when the same namespaces are used in multiple graphs.
  SELECT ?state ?births
  FROM NAMED <http://some.org/births.owl>
  WHERE {
    ?stateR a state:State ;
            state:name ?state ;
            stats:birthRate ?births
  }
Graph names are automatic with TBC. With Gruff the names must be specified at import time into the AllegroGraph triplestore.
Note: in standard SPARQL, FROM NAMED only makes a graph available to GRAPH patterns; patterns outside a GRAPH block are matched against the default graph (set with FROM).
Advanced Topics
  SELECT ?state ?births ?deaths
  WHERE {
    ?stateR a state:State ;
            state:name ?state .
    GRAPH <http://some.org/births.owl> { ?stateR stats:birthRate ?births }
    GRAPH <http://other.org/deaths.owl> { ?stateR stats:deathRate ?deaths }
  }
Advanced Topics
The graph can also be a variable.
  SELECT ?state ?deaths ?graph
  WHERE {
    ?stateR a state:State ;
            state:name ?state .
    GRAPH ?graph { ?stateR stats:deathRate ?deaths }
  }
Advanced Topics
Custom FILTER, LET and property functions bound to Java code.
  SELECT ?town ?temperature
  WHERE {
    ?town a geo:Town ;
          geo:zipcode ?zipcodePlus4 .
    ?zipcode my:shortenZip ( ?zipcodePlus4 ) .
    LET ( ?temperatureC := my:weather( ?zipcode ) )
    FILTER( my:toFahrenheit(?temperatureC) > 100 )
  }
• my:shortenZip: a user-defined function to shorten a 9-digit zip code to a 5-digit zip code.
• my:weather: a user function to get the current temperature at a zip code.
• my:toFahrenheit: a user function to convert Celsius to Fahrenheit.
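The slide binds these functions to Java code, which is not shown. Purely to illustrate what such helpers might compute, here is a Python sketch; the function bodies, the lookup table standing in for a live weather service, and the sample towns are all assumptions.

```python
def shorten_zip(zip_plus4):
    """Reduce a 9-digit ZIP+4 code (with or without a hyphen) to its first 5 digits."""
    digits = "".join(ch for ch in zip_plus4 if ch.isdigit())
    return digits[:5]


def to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


# Hypothetical stand-in for my:weather: a fixed table instead of a live lookup.
SAMPLE_TEMPERATURES_C = {"85001": 43.0, "98101": 18.0}


def weather(zipcode):
    return SAMPLE_TEMPERATURES_C[zipcode]


# Made-up (?town, ?zipcodePlus4) rows.
towns = [("Phoenix", "85001-0001"), ("Seattle", "981011234")]

# FILTER( my:toFahrenheit(?temperatureC) > 100 )
hot_towns = [
    town
    for town, zip_plus4 in towns
    if to_fahrenheit(weather(shorten_zip(zip_plus4))) > 100
]
print(hot_towns)
```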
TopBraid Suite Advanced Product Training
• TopBraid Suite offers extensive SPARQL support
• In-depth SPARQL
  • full Jena ARQ and TBC extensions
  • using SPARQL in applications
• Next-generation SPARQL
  • SPARQLMotion Scripts
  • SPIN (SPARQL Inferencing Notation)
SPARQLMotion
SPARQLMotion is designed to tie the data lifecycle into an automated process, run by the SPARQLMotion Engine:
• Import (spreadsheets, DBs, XML)
• Processing (editing, querying, transforming)
• Export (converting, browsing, visualizing)
SPARQLMotion Language
• Scripts consist of modules
• Modules have a type (e.g. LoadXML)
• The output of one module is the input to its successors
• Branching and merging are supported
• Drag & drop modules to create a script step, fill out required attributes, connect
SPARQLMotion Script
• Scripts define data processing steps
  • properties define relationships between modules
  • ‘next’ means the result triples from one module are sent to the next
• Graph view
  • shows the processing pipeline
• Form view
  • modules are instances of (SPARQLMotion) classes
SPIN
• SPIN - SPARQL Inferencing Notation
  • define constraints and inference rules on Semantic Web models
  • http://spinrdf.org
• Specification for representing SPARQL with RDF
  • RDF syntax for SPARQL queries
• Modeling vocabulary
  • constraints, constructors, rules
  • templates, functions
• Standard Modules Library
  • small set of frequently needed SPARQL queries
SPIN Syntax
• Problem: SPARQL is represented as a string
  • can add it as a property value to an ontology
  • ...à la that query library
  • but what if you wanted to associate a query with a specific resource?

  # must be at least 18 years old
  ASK WHERE {
    ?this my:age ?age .
    FILTER (?age >= 18) .
  }

Query represented as RDF nodes (not for human consumption):
  [ a sp:Ask ;
    rdfs:comment "must be at least 18 years old"^^xsd:string ;
    sp:where (
      [ sp:object sp:_age ;
        sp:predicate my:age ;
        sp:subject spin:_this ]
      [ a sp:Filter ;
        sp:expression [ sp:arg1 sp:_age ;
                        sp:arg2 18 ;
                        a sp:ge ] ]
    )
  ]
Using SPIN and Composer-ME
• Incremental inferencing (Composer)
  • compute values when editing an ontology
  • can turn incremental inferencing on and off, run on command, etc.
• Calculate the value of a property based on other properties
  • age of a person as the difference between today's date and the person's birthday
  • computation is declared in a SPIN property
• Perform constraint checking with closed-world semantics
  • e.g. raise inconsistency flags when currently available information does not fit specified integrity constraints
  • constraints specified in SPIN
• Isolate a set of rules to be executed under certain conditions
  • e.g. class constructors
  • initialize certain values when a resource is first created, or to drive interactive applications
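The age example above is a derived property: its value follows from the birthday and the current date. The computation itself might look like the Python sketch below; the `age_on` function is hypothetical and says nothing about how SPIN or Composer actually declare or evaluate it.

```python
from datetime import date


def age_on(birthday, today):
    """Whole years elapsed between `birthday` and `today`."""
    years = today.year - birthday.year
    # Subtract one if this year's birthday has not been reached yet.
    if (today.month, today.day) < (birthday.month, birthday.day):
        years -= 1
    return years


# Evaluated for the date of the meetup rather than the real "today",
# so the result is reproducible.
meetup_day = date(2009, 2, 12)
print(age_on(date(1990, 6, 15), meetup_day))
```

Passing the reference date explicitly keeps the calculation repeatable; an incremental inferencer would rerun it whenever the birthday value is edited.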
SPIN Modeling Vocabulary
• Constraints
  • link classes with SPARQL ASK or CONSTRUCT queries
  • e.g. link Parents to the query:
      ASK WHERE {
        ?this my:age ?age .
        FILTER (?age < 18) .
      }
  • when true, the constraint is violated
  • (note ‘<’ not ‘>=’)
  • CONSTRUCT will generate triples when the constraint is violated
• Rules
  • link SPARQL CONSTRUCT queries to instances of classes
  • ...and all subclasses
  • apply the rule to all instances
• Constructors
  • same as Rules
  • but applied only when an instance is created
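The "true means violated" convention can be confusing, so here is a small Python sketch of the ASK constraint being evaluated once per instance with ?this bound to it. The instance data and the `ask_under_18` helper are invented for the illustration and do not reflect how a SPIN engine is implemented.

```python
# Invented instances of the constrained class, with their properties.
people = {
    "my:alice": {"my:age": 34},
    "my:bob": {"my:age": 15},
    "my:carol": {},  # no my:age triple at all
}


def ask_under_18(this, data):
    """ASK WHERE { ?this my:age ?age . FILTER (?age < 18) }"""
    age = data[this].get("my:age")
    # No age triple means the pattern does not match, so ASK is false.
    return age is not None and age < 18


# The constraint is violated for every instance where the ASK query is true.
violations = [resource for resource in people if ask_under_18(resource, people)]
print(violations)
```

An instance with no my:age value is not reported, because the ASK pattern simply fails to match; flagging missing ages would need a separate constraint.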
SPIN Meta-Modeling Vocabulary
• Reusable SPARQL queries
  • can use in other contexts
  • particularly Rules and Constraints
  • encapsulate SPARQL query templates
• Templates
  • create a template class with a SPARQL query
  • create an instance of the template
• Functions
  • define new functions to be used in FILTER or LET clauses
SPIN Standard Modules Library
• Set of frequently used SPARQL queries
• Functions
  • spl:hasValue
  • spl:hasValueOfType
  • spl:instanceOf
  • spl:objectCount
• Templates
  • spl:Argument
  • spl:Attribute
  • spl:ConstructDefaultValues
Examples:
  spl:hasValueOfType(rdfs:Class, rdfs:label, xsd:string) # true
  spl:hasValueOfType(rdfs:Class, rdfs:label, xsd:int) # false
Additional Information
• SPARQL Guide: http://www.dajobe.org/2005/04-sparql/
• SPARQL FAQ: http://thefigtrees.net/lee/sw/sparql-faq
• Jena (ARQ) Property & Filter Functions: http://jena.sourceforge.net/ARQ/library-propfunc.html and http://jena.sourceforge.net/ARQ/library-function.html
• My Delicious Bookmarks (sparql and semantic tags): http://del.icio.us/yacob
• SPARQLPedia, a query repository: http://sparqlpedia.org/
• SPARQLMotion, scripting of SPARQL: http://sparqlmotion.org/
• SPIN, stored procedures for SPARQL: http://spinrdf.org/