Explore the evolution and implementation of ACI's Linked Open Data initiative, including data classifications, project objectives, implemented solutions, and the live demo of the ACI Linked Open Data Portal. Learn about ACI's journey towards Level 5 Linked Open Data and the use of W3C standards.
ACI Linked Open Data Rome - March 6th, 2019
Table of Contents • Open Data classifications • Starting point in ACI • Project objectives • Implemented solution • Evolution of Open Data • ACI Linked Open Data Portal • Published data • Live demo
Open Data classifications according to Tim Berners-Lee • Level 1: Data is available on the web (in any format), but to be classified as Open Data it must have an open license. • Level 2: Data must be available in a structured format that can be interpreted by software (e.g. a Microsoft Excel file as opposed to a scanned image of a spreadsheet table). • Level 3: Data is in a structured format and, moreover, this format is not proprietary (in the above example, CSV is a better format than Microsoft Excel in so far as it is not subject to a license). • Level 4: As well as meeting the above criteria, data makes use of open standards as defined by W3C (such as RDF and SPARQL) to identify objects, so that others may reference your resources. • Level 5: Data complies with all the above criteria and also contains links to external data in order to provide further context to your own information.
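The classification is cumulative: each level presupposes all the levels below it. A minimal Python sketch of that rule follows. The `DatasetProfile` type and its field names are hypothetical, introduced only for illustration; they are not part of any ACI or W3C specification.

```python
from dataclasses import dataclass


@dataclass
class DatasetProfile:
    """Hypothetical description of a published dataset (illustration only)."""
    open_license: bool        # on the web under an open license
    machine_readable: bool    # structured format that software can interpret
    non_proprietary: bool     # e.g. CSV rather than Microsoft Excel
    uses_w3c_standards: bool  # RDF/SPARQL, identifiers others can reference
    links_to_external: bool   # links to external data for further context


def star_level(d: DatasetProfile) -> int:
    """Return the highest level reached; a level counts only if all lower ones are met."""
    criteria = [
        d.open_license,
        d.machine_readable,
        d.non_proprietary,
        d.uses_w3c_standards,
        d.links_to_external,
    ]
    level = 0
    for met in criteria:
        if not met:
            break  # the first unmet criterion caps the level
        level += 1
    return level
```

Under these assumptions, a CSV file with an open license but no RDF identifiers and no external links would be rated at level 3.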
Starting point of Open Data in ACI • Open Data published in the Autoritratto section of ACI’s corporate website belongs to level 3: the data is structured, has an open license, and is in a non-proprietary format. • At times the data is not presented in a homogeneous structure, since each dataset carries its own specific structure. • The data is accompanied by a “Code of regulations for electronic access to the data and services contained in the institutional website of ACI” and by a “Catalog of data in Open format”, both of which are supplied as PDF documents. • Currently no structured metadata is released to accompany the published statistics.
Project objectives • Create a dedicated web portal and publish the Open Data for the benefit of any external users. • Increase the value of the published open data to reach level 5 (Linked Open Data) by adopting the established standards laid out by W3C, such as RDF, and by connecting, by way of links, to relevant external datasets. • Use advanced mechanisms to represent and navigate the data by publishing an ontology for the domains of PRA and Taxes, together with all related documentation for its interpretation. • Implement automatic semantic annotation services for interpreting the data with respect to the ontology.
Implemented solution • Level 1: Ontology: the implemented ontology comprises 8 modules, which in turn reflect the division of the domains of interest into logically interconnected areas. The modules are: Vehicles; Formalities; Entities; Levies; Possession & Ownership; Events & Conditions; Territories; Statistics. • Level 2: Mapping ontologies and data sources: specifies the relationship that exists between the data sources in an organization’s IT infrastructure and the elements of an ontology (classes and their properties). Currently the data is extracted directly from the relational databases, then annotated semantically using the predicates of the ontology, and finally encoded in RDF format. • Level 3: Data sources: comprises any repository in which data of interest to the organization resides.
Evolution of Open Data – four stars • RDF: by virtue of the semantic annotations, any extracted RDF will maintain all connections between the dataset records and the ontological terminology. • URI: a dataset in RDF format which uses URIs to identify objects does so unambiguously and makes them linkable to other datasets, whether locally or on the web. • Published resources: an implementation mechanism that manages pages documenting all resources in a dataset, so that every resource URI has its own page.
Evolution of Open Data – five stars • Anything published with links to resources belonging to other datasets is known as “5-star” open data and is considered to be richer and of higher quality. • The significant types of links between open datasets are: • Relational links: these are connections between resources that belong to different datasets. • Identity links: these are special links that connect two URIs to indicate that they refer to the same resource (entity).
Evolution of Open Data – five stars • Relational links: use of elements taken from the W3C standard ontological vocabularies, linked to elements of the ACI domain ontology. • Importing and using terms from standard vocabularies simplifies the understanding of an ontology, reduces the risk of redundancy in the terms used, and promotes interoperability with other ontologies. The ontological modules which make use of these links are most notably the statistical and the territory modules, where elements of the following ontologies have been imported: • GeoNames ( http://www.geonames.org ) for the description of geographical locations. • Data Cube ( https://www.w3.org/TR/vocab-data-cube ) for the representation of multidimensional statistical data. • Dublin Core ( http://dublincore.org ) for the description of digital material accessible via the web.
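To illustrate how imported Data Cube terms might describe one statistical value, the sketch below emits a single observation as N-Triples. Only the `qb:` terms (`Observation`, `dataSet`) come from the W3C Data Cube vocabulary. The `http://example.org/aci/` namespace, the dimension and measure names, and the figure used in the usage note are assumptions for illustration, not ACI's actual ontology or data.

```python
QB = "http://purl.org/linked-data/cube#"  # W3C Data Cube namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD = "http://www.w3.org/2001/XMLSchema#"
EX = "http://example.org/aci/"  # hypothetical namespace, not ACI's real one


def observation_triples(obs_id, dataset_id, territory_id, year, count):
    """Describe one statistical value as a qb:Observation in N-Triples."""
    obs = f"<{EX}observation/{obs_id}>"
    return [
        # The observation is typed with the imported Data Cube class.
        f"{obs} <{RDF_TYPE}> <{QB}Observation> .",
        # qb:dataSet attaches the observation to its dataset.
        f"{obs} <{QB}dataSet> <{EX}dataset/{dataset_id}> .",
        # Dimensions (hypothetical property names): where and when.
        f"{obs} <{EX}dimension/territory> <{EX}territory/{territory_id}> .",
        f'{obs} <{EX}dimension/year> "{year}"^^<{XSD}gYear> .',
        # Measure (hypothetical property name): the value itself.
        f'{obs} <{EX}measure/firstRegistrations> "{count}"^^<{XSD}integer> .',
    ]
```

Calling `observation_triples("o1", "first-registrations", "roma", 2018, 100)` with these invented values yields five triples: one type statement, one dataset link, two dimensions, and one measure.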
Evolution of Open Data – five stars • Identity links: create associations between data from ACI and external datasets, enriching resources with attributes not present in ACI data. • In the case of geographical locations, for example, it is possible to create links to GeoNames URIs to add useful information such as latitude and longitude. This allows a developer who uses ACI data to easily add georeferencing, which is of vital importance in many applications. • It is necessary to identify the vocabularies and datasets to which the entities present in the data are to be connected. Given the nature of the information assets belonging to ACI, the "natural" candidates were: • GeoNames ( http://www.geonames.org ) for geographical locations. • ISPRA ( http://dati.isprambiente.it ) for information regarding the environment.
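An identity link can be sketched as a single triple between a local URI and an external one. The example below assumes `owl:sameAs` as the linking predicate, which is the usual choice in Linked Data; the slides do not name the predicate ACI uses. The `http://example.org/aci/territory/` namespace is hypothetical, and any GeoNames identifier passed in should be checked against GeoNames before use.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
EX_TERRITORY = "http://example.org/aci/territory/"  # hypothetical local namespace
GEONAMES = "https://sws.geonames.org/"              # GeoNames resource URI prefix


def identity_link(local_id: str, geonames_id: int) -> str:
    """Build one N-Triples line stating that two URIs denote the same place."""
    return f"<{EX_TERRITORY}{local_id}> <{OWL_SAME_AS}> <{GEONAMES}{geonames_id}/> ."


def same_as_targets(lines, subject_uri):
    """Collect the external URIs that a subject is linked to via owl:sameAs."""
    targets = []
    for line in lines:
        parts = line.split()
        # A URI-only triple splits into subject, predicate, object and the final dot.
        if (
            len(parts) == 4
            and parts[0] == f"<{subject_uri}>"
            and parts[1] == f"<{OWL_SAME_AS}>"
        ):
            targets.append(parts[2].strip("<>"))
    return targets
```

A developer consuming the data could call `same_as_targets` on a downloaded dataset and then look up latitude and longitude at the returned GeoNames URIs.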
ACI Linked Open Data Portal – technical notes • The ACI Linked Open Data Portal was built through a collaboration between ACI, the Sapienza University of Rome and OKKAM, a spin-off of the University of Trento. It relies on technologies provided by La Sapienza and OKKAM for the creation of the portal, for the definition of the ontology and the mapping specifications, and for the creation of the published datasets. • Specifically: • The portal was created using the Mastro Studio and Mastro systems of La Sapienza. • The ontology was created using Eddy, La Sapienza’s development environment for ontologies. • The identity links in the published datasets were defined through OKKAM’s Entity Name System (ENS).
ACI Linked Open Data Portal – architecture • [Diagram. Labels: web client; client requests; Apache Web Server + PHP; WildFly Application Server; SOAP Web Service invoked via JavaScript; SOAP Web Service invoked via PHP]
ACI Linked Open Data Portal – Mastro Studio and Mastro • Mastro Studio makes use of automatic reasoning services provided by Mastro, through a web-service system, to facilitate the publication and navigation of the ontology and the resources in the datasets. • Mastro Studio is a web application based on DKAN, an Open Data CMS. It provides the following features through a suite of custom modules: • inspect the ontology in either OWL or Graphol format (Graphol is a visual language for OWL ontologies, similar to E-R diagrams). • browse the ontology documentation in wiki format (complete with hypertext links). • navigate all resources in the published datasets. • download any dataset in CSV or RDF format.
ACI Linked Open Data Portal – producing the datasets • The datasets published on the ACI Linked Open Data portal for navigation and downloading are created through the following process: • direct extraction of the statistical data from ACI's relational data sources. • transformation of the extracted data into RDF format, simultaneously creating the resource identifiers (URIs) for all the resources in the dataset. • semantic annotation of the URIs through links to the terms of the ontology, which provide a formal definition. • enrichment of the datasets through the addition of identity links which connect the resources in the ACI datasets to those published by other organizations, notably GeoNames and ISPRA.
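The four steps above can be sketched end to end in Python. This is a simplified illustration and not the Mastro-based implementation: the table and column names, the `http://example.org/aci/` namespaces, the ontology class and property names, and the sample rows are all assumptions. An in-memory SQLite database stands in for ACI's relational sources, and `owl:sameAs` is assumed as the identity-link predicate.

```python
import sqlite3

BASE = "http://example.org/aci/"           # hypothetical base for minted URIs
ONTO = "http://example.org/aci/ontology#"  # hypothetical ontology namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
XSD = "http://www.w3.org/2001/XMLSchema#"


def demo_connection():
    """Stand-in relational source with made-up rows (not real ACI figures)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE first_registrations (territory TEXT, year INTEGER, total INTEGER)"
    )
    conn.executemany(
        "INSERT INTO first_registrations VALUES (?, ?, ?)",
        [("roma", 2018, 100), ("roma", 2019, 120)],
    )
    return conn


def extract(conn):
    """Step 1: read the statistical rows directly from the relational source."""
    query = (
        "SELECT territory, year, total FROM first_registrations "
        "ORDER BY territory, year"
    )
    return conn.execute(query).fetchall()


def to_rdf(rows):
    """Step 2: mint a URI for each record and emit its values as N-Triples."""
    triples = []
    for territory, year, total in rows:
        record = f"<{BASE}record/{territory}-{year}>"
        triples.append(f"{record} <{ONTO}territory> <{BASE}territory/{territory}> .")
        triples.append(f'{record} <{ONTO}year> "{year}"^^<{XSD}gYear> .')
        triples.append(f'{record} <{ONTO}total> "{total}"^^<{XSD}integer> .')
    return triples


def annotate(rows):
    """Step 3: link each minted URI to the ontology class that defines it."""
    triples = []
    for territory, year, _ in rows:
        record = f"<{BASE}record/{territory}-{year}>"
        triples.append(f"{record} <{RDF_TYPE}> <{ONTO}FirstRegistration> .")
        triples.append(f"<{BASE}territory/{territory}> <{RDF_TYPE}> <{ONTO}Territory> .")
    return triples


def enrich(rows, external_links):
    """Step 4: add identity links for territories with a known external URI."""
    territories = sorted({territory for territory, _, _ in rows})
    return [
        f"<{BASE}territory/{t}> <{OWL_SAME_AS}> <{external_links[t]}> ."
        for t in territories
        if t in external_links
    ]


def build_dataset(conn, external_links):
    """Run the four steps in order and drop duplicate triples, keeping order."""
    rows = extract(conn)
    triples = to_rdf(rows) + annotate(rows) + enrich(rows, external_links)
    return list(dict.fromkeys(triples))
```

Calling `build_dataset(demo_connection(), links)`, where `links` maps a territory code to an external URI, returns the N-Triples lines for the sample rows. Separating steps 2 and 3 into value triples and type triples is a simplification made here for readability.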
Published data – first registration and deregistration • The Automobile Club d'Italia (ACI) provides a representation of Italy’s entire fleet of vehicles in an open format, through an articulated synthesis of the data taken from the governing body’s archives. • Of all vehicle transactions, the most important are first registration and final deregistration, which together determine the trend of the circulating fleet. Their most salient aspects are highlighted, namely fuel system and Euro class, both of which are indicators of the environmental impact of the vehicles. • First registrations of new vehicles subdivided by territorial body • First registrations of new vehicles subdivided by territorial entity and fuel systems • Deregistrations by way of decommissioning subdivided by territorial body • Deregistrations by way of decommissioning subdivided by territorial body and by Euro class
Conclusion • The end result is the publication of the ACI Linked Open Data Portal • http://lod.aci.it