1 / 17

PHDD – An RDF Vocabulary for the Physical Data Description Work in Progress

PHDD – An RDF Vocabulary for the Physical Data Description Work in Progress. Joachim Wackerow and Thomas Bosch (both GESIS – Leibniz Institute for the Social Sciences). What is it ?. Description of the physical properties of a data file . Focus on most common format types

urian
Download Presentation

PHDD – An RDF Vocabulary for the Physical Data Description Work in Progress

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PHDD – An RDF Vocabulary for the Physical Data DescriptionWork in Progress Joachim Wackerow and Thomas Bosch (both GESIS – Leibniz Institute for the Social Sciences)

  2. Whatisit? • Description ofthephysicalpropertiesof a datafile. • Focus on mostcommonformattypes • Rectangularformat • Character-separatedvalues(CSV) orfixed-recordlength

  3. Character-separated Values Header row

  4. Fixed Record-length Format

  5. Motivation • Data.gov andsimilarinitiatives providedata in CSV formatorsimilar • W3C Government Linked Data Working Group Charter • “The mission … is to provide standards and other information which help governments around the world publish their data as effective and usable Linked Data using Semantic Web technologies.” • Machine-actionability - intendedforprogramuse

  6. Example at data.gov

  7. ExistingApproaches • CSV on the Web Working Group Charter (W3C) • Linkage • Linked CSV (JeniTennison) • CSV linked data (Quoderat) • Description • Common Format and MIME Type for Comma-Separated Values (CSV) Files, RFC 4180 • csv: a vocabulary for describing CSV files (Rurik Thomas Greenall, Norwegian University at Trondheim) • Representations • URI design for RDF conversion of CSV-based data (Tim Lebo, Gregory Todd Williams) • CSV2RDF Application (Ivan Ermilov, Sören Auer, Claus Stadler) Research results: • Intendedfor different purposes • Description approaches not sufficient

  8. PHDD – First Ideas

  9. PHDD – UML Model

  10. PHDD - Overview General approachis not reallynew, just a completesetofpropertiesforthemostcommoncases. Structure • Table – therectangulardatafile [disco::DataFile, dcat::Distribution] • TableStructure - commonproperties plus specificonesfordelimitedandfixedcolumns • Column - commonproperties plus specificonesfordelimitedandfixedcolumns [disco::Variable]

  11. Table Structure

  12. Column Description

  13. Usage Scenarios Data DDI XML Data Data Provider PHDD Discovery DCAT Description Program User Transformation Analysis Search

  14. Relationshipto DDI XML • Mapping to DDI XML Specifications • DDI Codebook 2.* • approx. half ofthepropertiesof PHDD • DDI Lifecycle 3.* • almost all propertiesof PHDD

  15. Relationshipof DDI Specifications XML Schema OWL/RDF PHDD DDI Codebook 2.* Discovery DDI Lifecycle 3.* XKOS DDI 4 Model Future XML Schema Representation OWL/RDF Representation

  16. Acknowledgements • Contributionsby • Larry Hoyle (Institute for Policy & Social Research, University of Kansas) • Richard Cyganiak(DERI - Digital Enterprise Research Institute )

  17. Further Information • Development repositoryof PHDD • https://github.com/linked-statistics/physical-data-description • DDI Alliance RDF Vocabularies • http://www.ddialliance.org/Specification/RDF

More Related