
Introduction to Open Data: A generic approach

Iraklis Varlamis, Harokopio University of Athens (varlamis@hua.gr). Open data: a growing trend among scholars, government bodies and organizations to share data outputs, codebooks and software.



Presentation Transcript

  1. Introduction to Open Data: A generic approach
     • Iraklis Varlamis
     • Harokopio University of Athens
     • varlamis@hua.gr
     • 2nd SemaGrow Hackathon (in conjunction with IRSS14)

  2. Open Data
     • A growing trend among scholars, government bodies and organizations to share data outputs, codebooks and software.
     • Open Data flow
     • Publish data in a machine-readable format!
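As a rough illustration of what "machine-readable" can mean in practice, the sketch below serializes the same records as CSV and as JSON using only the Python standard library. The records, field names and values are made up for the example and do not come from the presentation.

```python
import csv
import io
import json

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"municipality": "Athens", "year": 2013, "accidents": 120},
    {"municipality": "Patras", "year": 2013, "accidents": 45},
]


def to_csv(rows):
    """Serialize a list of dicts to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def to_json(rows):
    """Serialize the same rows as a JSON array of objects."""
    return json.dumps(rows, indent=2, ensure_ascii=False)


csv_text = to_csv(records)
json_text = to_json(records)
```

Either output can be parsed by a program without scraping, which is the point of the slide; which format to prefer depends on the consumers of the data.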

  3. Open data value
     • Open Data lifecycle
     • Publish data and keep it updated!

  4. Increase open data value
     • Sources: Agency A, Agency B, Organization C, Organization D
     • Collect & aggregate into a data repository
     • Serve through a single endpoint
     • Aggregate & combine data!
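The collect, aggregate and serve pipeline on this slide could be sketched as a small in-memory repository. The class name `DataRepository`, its methods and the sample rows are assumptions made for this example; a real deployment would sit behind a database and a web API.

```python
class DataRepository:
    """Toy in-memory repository that aggregates rows from several sources."""

    def __init__(self):
        self._records = []

    def collect(self, source, rows):
        """Store each row and remember which source it came from."""
        for row in rows:
            self._records.append({**row, "source": source})

    def query(self, **filters):
        """Single access point: return rows matching all given field values."""
        return [
            record
            for record in self._records
            if all(record.get(key) == value for key, value in filters.items())
        ]


# Hypothetical sources and rows, for illustration only.
repo = DataRepository()
repo.collect("agency_a", [{"topic": "transport", "value": 1}])
repo.collect(
    "organization_c",
    [{"topic": "transport", "value": 2}, {"topic": "health", "value": 3}],
)

transport_rows = repo.query(topic="transport")
```

Keeping the `source` field on every row lets consumers trace an aggregated record back to the agency that published it.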

  5. Data aggregation issues: Speak the same language!
     • Different sources use different notation
     • Data from multiple sources may be inconsistent
     • Each source may use a different identifier for the same concept
     • Concept descriptions may differ or even contradict each other
     • We need a common way to describe data
     • We need common data description schemata
     • It is good to have an ontology in order to validate data
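Two of the issues listed above, different identifiers for the same concept and contradicting values, can be illustrated with a minimal sketch. The identifier mapping, the `example.org` URI and the population figures are all invented for the example.

```python
# Hypothetical mapping from (source, local identifier) to one shared identifier.
ID_MAP = {
    ("agency_a", "ATH"): "http://example.org/place/athens",
    ("agency_b", "Athina"): "http://example.org/place/athens",
}


def normalize(source, record):
    """Rewrite a source-specific record so that it uses the shared identifier."""
    common_id = ID_MAP[(source, record["id"])]
    return {"id": common_id, "population": record["population"]}


def find_conflicts(records):
    """Return the identifiers for which sources report different values."""
    seen = {}
    conflicts = set()
    for record in records:
        key = record["id"]
        if key in seen and seen[key] != record["population"]:
            conflicts.add(key)
        seen.setdefault(key, record["population"])
    return conflicts


# Made-up values chosen so that the two sources disagree.
merged = [
    normalize("agency_a", {"id": "ATH", "population": 100}),
    normalize("agency_b", {"id": "Athina", "population": 120}),
]
conflicts = find_conflicts(merged)
```

Only after both records share an identifier does the inconsistency become detectable, which is why a common description scheme comes before validation.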

  6. Common way to describe data
     • Resource Description Framework (RDF)
       • A data model for metadata (similar to the E-R or relational model)
     • Each concept
       • is a resource (subject)
       • has several aspects (predicates)
       • and values for these aspects (objects)
     • Data expressed as graphs
       • Resources are identified by a URI
       • Values are either simple literals or URIs
     • Data aggregation → merge graphs on the common nodes (URIs)
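The subject, predicate, object model and the "merge on common URIs" idea can be sketched without any RDF library by treating a graph as a set of triples. The `http://example.org/` namespace and every triple below are placeholders, not real data; a production system would more likely use a dedicated RDF store.

```python
EX = "http://example.org/"  # placeholder namespace for the example

# Two small graphs from hypothetical sources; both talk about the same URI.
graph_a = {
    (EX + "athens", EX + "name", "Athens"),
    (EX + "athens", EX + "locatedIn", EX + "greece"),
}
graph_b = {
    (EX + "athens", EX + "name", "Athens"),  # duplicate edge, kept only once
    (EX + "athens", EX + "type", EX + "City"),
    (EX + "greece", EX + "name", "Greece"),
}

# Merging graphs is a set union: nodes with the same URI become one node.
merged = graph_a | graph_b


def describe(graph, subject):
    """Collect all (predicate, object) pairs known for one subject URI."""
    return {(p, o) for s, p, o in graph if s == subject}


athens_facts = describe(merged, EX + "athens")
```

Because both sources use the same URI for Athens, the union automatically attaches the facts from each source to a single node.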

  7. Common data description schemas
     • Let's agree on the predicates
     • We need machine-readable ontologies, taxonomies or vocabularies
       • FOAF (Friend of a Friend): Agent, Person, name, title, familyName, givenName, knows, etc.
       • DC (Dublin Core Schema): Title, Creator, Subject, Description, Publisher, Contributor, etc.
       • SIOC (Semantically-Interlinked Online Communities)
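To show what agreeing on predicates looks like, the sketch below describes this talk and its author with FOAF and Dublin Core terms and prints the triples in an N-Triples-like form. The two vocabulary namespaces are the published ones; the subject URIs under `example.org` are invented, and the serializer is a simplified illustration rather than a complete N-Triples writer.

```python
FOAF = "http://xmlns.com/foaf/0.1/"
DC = "http://purl.org/dc/elements/1.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical resource URIs, made up for this example.
author = "http://example.org/person/varlamis"
talk = "http://example.org/talk/intro-open-data"

triples = [
    (author, RDF_TYPE, FOAF + "Person"),
    (author, FOAF + "name", "Iraklis Varlamis"),
    (talk, DC + "title", "Introduction to Open Data"),
    (talk, DC + "creator", author),
]


def to_ntriples(rows):
    """Render triples one per line; objects are URIs or plain string literals."""
    lines = []
    for s, p, o in rows:
        if o.startswith("http://"):
            obj = "<" + o + ">"
        else:
            escaped = o.replace("\\", "\\\\").replace('"', '\\"')
            obj = '"' + escaped + '"'
        lines.append("<%s> <%s> %s ." % (s, p, obj))
    return "\n".join(lines)


ntriples_text = to_ntriples(triples)
```

Any consumer that knows FOAF and Dublin Core can interpret `name`, `title` and `creator` here without asking the publisher what the fields mean.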

  8. Query Open Data
     • Every endpoint is a database
     • Query the databases and aggregate the query results (RDF triples: edges from the graphs)
     • Send a SPARQL query to each endpoint
     • Query endpoints and merge results
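A minimal sketch of "query endpoints and merge results" follows, assuming each endpoint accepts the query as a `query` URL parameter and can return the SPARQL JSON results format. The endpoint URL passed in is up to the caller, and the generic query and helper names are assumptions for this example.

```python
import json
import urllib.parse
import urllib.request

# Generic example query; real queries would target specific predicates.
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 10"


def build_request(endpoint_url, query):
    """Build a GET request that asks the endpoint for JSON results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def fetch_bindings(endpoint_url, query):
    """Run the query against one endpoint and return its result rows."""
    request = build_request(endpoint_url, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["results"]["bindings"]


def merge_bindings(*binding_lists):
    """Union the (s, p, o) rows from several endpoints, dropping duplicates."""
    merged = set()
    for bindings in binding_lists:
        for row in bindings:
            merged.add(
                (row["s"]["value"], row["p"]["value"], row["o"]["value"])
            )
    return merged
```

In use, `merge_bindings(fetch_bindings(url_a, QUERY), fetch_bindings(url_b, QUERY))` would yield one combined set of triples, where `url_a` and `url_b` stand for whichever endpoints are being aggregated.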

  9. In the real world
     • Most organizations "publish" data on their web sites
       • Unformatted or semi-formatted data (HTML, PDF)
       • Data scraping is needed
     • Some of them publish data in a machine-readable format
       • xls, xml files
     • Only a few offer APIs
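When data is only available as an HTML page, scraping can look roughly like the sketch below, which pulls table cells out of markup with the standard library parser. The sample HTML and its values are made up, and real pages usually need more defensive handling (nested tables, malformed markup).

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Invented sample page fragment, for illustration only.
SAMPLE_HTML = (
    "<table><tr><th>Region</th><th>Accidents</th></tr>"
    "<tr><td>Attica</td><td> 12 </td></tr></table>"
)


def scrape_table(html):
    """Return the table content as a list of rows of cell strings."""
    scraper = TableScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.rows


rows = scrape_table(SAMPLE_HTML)
```

The amount of code needed just to recover two rows is a fair argument for publishing the same data as CSV, JSON or RDF in the first place.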

  10. In Greece
      • Data.gov.gr – Public Data Catalog
      • Openarchives.gr – Greek publications
      • Statistics.gr – Hellenic Statistical Authority
      • Geodata.gov.gr – Public geospatial data
      • opengeodata.gr – Open geospatial data
      • astynomia.gr/opendata/ – Accident-related data
      • Other datasets: Wikipedia, Europeana, GeoNames, WikiTravel, LinkedGeoData, YAGO2s, Freebase, FactForge, etc.
      • datasets@eellak.gr:
        1) https://docs.google.com/spreadsheets/d/1X9qFojnUbk1RkFWQ8653n2IxjjRewtCcEPScfNAyrqU/edit#gid=0
        2) http://mycontent.ellak.gr/?s=datasets&x=0&y=0

  11. More APIs
      • Open data cloud: www.opendatacloud.gr/
      • Data Extraction Tool: deixto.com/
      • Open data portal: open-data.okfn.gr/
      • Portals
        • Registry of Research Data Repositories: www.re3data.org/
        • EU Open Data portal: https://open-data.europa.eu/en/data/
        • World Bank data: http://data.worldbank.org/
        • http://publicdata.eu/
        • http://oad.simmons.edu/oadwiki/Data_repositories

  12. Roadmap

  13. Contribute at all levels!
      • Source: http://www.lorax.gr/
      • Source: https://www.peterkrantz.com/2012/publishing-open-data-api-design/

  14. Thank you! Questions?
