European Environment Agency and Linked Environment Data, and how we are implementing SEIS
Søren Roug
The current situation
• Find dataset
• Download it
• Import it
• Clean it
• Create chart
Vision statement
If SEIS were only about making data public and nothing more, we wouldn’t get much benefit! We want to eliminate all steps but the last! ...And we’re going to use Linked Data technology to do it.
Solution to the data format problem
• In addition to the HTML for human eyes, we’re asking for a new format called RDF that machines can understand
• It is a modernisation of CSV, Excel and all the other data dump formats
• This is all we ask a producer to provide... and some metadata
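To make "a modernisation of CSV" concrete, here is a minimal sketch of how one CSV table could be turned into RDF triples (N-Triples syntax). The namespace, column names and values are invented for illustration and do not come from any EEA dataset; the escaping is deliberately minimal.

```python
import csv
import io

# Hypothetical namespace and sample data, invented for illustration only.
BASE = "http://example.org/dataset/"
SAMPLE_CSV = "station,year,no2\nDK001,2007,21.5\nDK002,2007,18.0\n"


def csv_to_ntriples(text, key_column, base=BASE):
    """Emit one N-Triples line per cell: the row key becomes the subject,
    the column name becomes the predicate, the cell value a plain literal."""
    lines = []
    for row in csv.DictReader(io.StringIO(text)):
        subject = f"<{base}row/{row[key_column]}>"
        for column, value in row.items():
            if column == key_column:
                continue  # the key is already encoded in the subject URL
            predicate = f"<{base}schema#{column}>"
            # Minimal escaping: backslashes and double quotes only.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} {predicate} "{escaped}" .')
    return lines


triples = csv_to_ntriples(SAMPLE_CSV, "station")
for line in triples:
    print(line)
```

Each row gets its own URL, so other datasets can link to it, which a plain CSV dump does not allow.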
No more searching on foreign sites
• The remote nodes provide lists of their datasets
• These lists are called manifests or semantic sitemaps
• They are also in RDF format
• Metadata uses controlled vocabulary URLs. Use any of GEMET / AgroVoc / DBPedia / EuroVoc / UMTHES; we have created equivalence links between them
• The manifests are loaded into our Linked Data search engine
Downloading made easy! Click on the title to see if it is in the database
Downloading made easy Seconds later...
Status
• EEA has deployed two search engines, Content Registry and Semantic Data Service, that import all lists and all data
• Content Registry is for Reportnet deliveries
• Semantic Data Service is for published datasets
• We have created RDF of several datasets: Reportnet, GEMET, EUNIS, ROD, ITIS, NUTS, NACE etc.
• We can also load Eurostat SDMX data via the LATC project
Example of SPARQL query
Future prospects for the European otter (from Reportnet)

PREFIX art17: <http://rdfdata.eionet.europa.eu/art17/ontology/>
PREFIX eea: <http://rdfdata.eionet.europa.eu/eea/ontology/>

SELECT ?country ?region ?future
WHERE {
  [] art17:forSpecies <http://eunis.eea.europa.eu/species/1435>;
     art17:hasRegionalReport ?report.
  ?report art17:conclusion_future ?future;
          art17:forCountry ?curl;
          art17:region ?bgregion.
  ?bgregion eea:name ?region.
  ?curl eea:name ?country
}
ORDER BY ?country ?region
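A query like the otter example is typically sent to an endpoint over HTTP. The sketch below only builds the request, following the SPARQL 1.1 Protocol convention of a GET with a `query` parameter; it does not send anything. The endpoint URL is a placeholder, not the address of Content Registry or Semantic Data Service.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical endpoint; substitute the real SPARQL endpoint URL.
ENDPOINT = "http://example.org/sparql"

QUERY = """PREFIX art17: <http://rdfdata.eionet.europa.eu/art17/ontology/>
PREFIX eea: <http://rdfdata.eionet.europa.eu/eea/ontology/>
SELECT ?country ?region ?future
WHERE {
  [] art17:forSpecies <http://eunis.eea.europa.eu/species/1435>;
     art17:hasRegionalReport ?report.
  ?report art17:conclusion_future ?future;
          art17:forCountry ?curl;
          art17:region ?bgregion.
  ?bgregion eea:name ?region.
  ?curl eea:name ?country
}
ORDER BY ?country ?region
"""


def build_sparql_request(endpoint, query):
    """Build (but do not send) a GET request carrying the query,
    asking for results in the standard SPARQL JSON results format."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(ENDPOINT, QUERY)
print(request.full_url[:60] + "...")
```

Passing `request` to `urllib.request.urlopen` would execute the query against a live endpoint.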
Comparing data: Where do EUNIS and ITIS not agree on naming?

PREFIX e: <http://eunis.eea.europa.eu/rdf/species-schema.rdf#>
PREFIX itis: <http://eunis.eea.europa.eu/rdf/schema.rdf#>
PREFIX dwc: <http://rs.tdwg.org/dwc/terms/>

SELECT ?eunisname ?eunisauthor ?itisname ?itisauthor ?usage
WHERE {
  ?eunisurl e:validName 1;
            e:sameSynonym ?itisurl;
            e:binomialName ?eunisname;
            dwc:scientificNameAuthorship ?eunisauthor.
  ?itisurl itis:nameUsage "invalid", ?usage;
           itis:completename ?itisname;
           itis:hasAuthor ?auurl.
  ?auurl itis:shortAuthor ?itisauthor
}
Water use per NUTS level 2 in 2007: Top 20
Combination of two Eurostat SDMX datasets
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX e: <http://ontologycentral.com/2009/01/eurostat/ns#>
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX g: <http://eurostat.linked-statistics.org/ontologies/geographic.rdf#>
PREFIX dataset: <http://eurostat.linked-statistics.org/data/>

SELECT ?nuts2
       (SUM(xsd:decimal(?obsvalue)) AS ?population)
       ?wateruse
       (xsd:decimal(?wateruse)*1000000/SUM(xsd:decimal(?obsvalue)) AS ?percapita)
WHERE {
  ?observation qb:dataset dataset:demo_r_pjanaggr3 ;
               e:time <http://eurostat.linked-statistics.org/dic/time#2007>;
               e:age <http://eurostat.linked-statistics.org/dic/age#TOTAL>;
               e:sex <http://eurostat.linked-statistics.org/dic/sex#T>;
               e:geo ?ugeo;
               sdmx-measure:obsValue ?obsvalue.
  ?ugeo g:hasParentRegion ?parent.
  ?parent g:code ?nuts2.
  ?wuregion qb:dataset dataset:env_n2_wu ;
            e:geo ?parent;
            e:cons <http://eurostat.linked-statistics.org/dic/cons#W18_2_7_2>;
            e:time <http://eurostat.linked-statistics.org/dic/time#2007>;
            sdmx-measure:obsValue ?wateruse.
}
GROUP BY ?nuts2 ?wateruse
ORDER BY DESC(?percapita)
LIMIT 20
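The arithmetic in this query can be followed step by step in a small sketch: population figures are summed from child regions up to their parent region, then water use is scaled by 1,000,000 (the factor used in the query; the slides do not state the units) and divided by that total. The region codes and numbers below are invented, not Eurostat data.

```python
from collections import defaultdict
from decimal import Decimal

# Invented sample data: (child region, parent region, population)
population_rows = [
    ("XX11", "XX1", "300000"),
    ("XX12", "XX1", "200000"),
    ("YY11", "YY1", "1000000"),
]
# Invented water use per parent region, in the dataset's own unit.
water_use = {"XX1": "50", "YY1": "40"}


def per_capita(population_rows, water_use, scale=Decimal(1000000), limit=20):
    """Mimic GROUP BY + SUM, the per-capita division, ORDER BY DESC and LIMIT."""
    totals = defaultdict(Decimal)  # Decimal() starts each total at 0
    for _child, parent, value in population_rows:
        totals[parent] += Decimal(value)
    results = []
    for parent, use in water_use.items():
        population = totals.get(parent)
        if population:  # skip regions with no (or zero) population
            results.append((parent, population, Decimal(use) * scale / population))
    results.sort(key=lambda row: row[2], reverse=True)
    return results[:limit]


result = per_capita(population_rows, water_use)
for region, population, value in result:
    print(region, population, value)
```

Regions that appear in only one of the two inputs drop out, which matches the join behaviour of the two graph patterns in the query.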
PREFIX qb: <http://purl.org/linked-data/cube#>
PREFIX e: <http://ontologycentral.com/2009/01/eurostat/ns#>
PREFIX sdmx-measure: <http://purl.org/linked-data/sdmx/2009/measure#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX g: <http://eurostat.linked-statistics.org/ontologies/geographic.rdf#>
PREFIX dataset: <http://eurostat.linked-statistics.org/data/>

SELECT ?country ?year ?population ?ghgtotal
       (xsd:decimal(?ghgtotal)*1000/(xsd:decimal(?population)) AS ?percapita)
FROM <http://eurostat.linked-statistics.org/data/demo_pjanbroad.rdf>
FROM <http://eurostat.linked-statistics.org/data/env_air_gge.rdf>
FROM <http://semantic.eea.europa.eu/home/roug/eurostatdictionaries.rdf>
WHERE {
  ?popobs qb:dataset dataset:demo_pjanbroad ;
          e:time ?uyear;
          e:freq <http://eurostat.linked-statistics.org/dic/freq#A>;
          e:age <http://eurostat.linked-statistics.org/dic/age#TOTAL>;
          e:sex <http://eurostat.linked-statistics.org/dic/sex#T>;
          e:geo ?ucountry;
          sdmx-measure:obsValue ?population.
  ?ghgobs qb:dataset dataset:env_air_gge ;
          e:geo ?ucountry;
          e:time ?uyear;
          e:airsect <http://eurostat.linked-statistics.org/dic/airsect#TOT_X_5>;
          sdmx-measure:obsValue ?ghgtotal.
  ?ucountry skos:prefLabel ?country.
  ?uyear skos:prefLabel ?year
}
ORDER BY ?country ?year
The end
Søren Roug
European Environment Agency
Soren.Roug@eea.europa.eu