ENSURE Linked Data Registry. PRELIDA Workshop 2013. Robert Sharpe, Tessella. Agenda. Archives, libraries and representation information Previous “technical registries”: Potted History Issues ENSURE linked data technical registry: What’s different? Why we hope it should succeed?
ENSURE Linked Data Registry PRELIDA Workshop 2013 Robert Sharpe, Tessella
Agenda • Archives, libraries and representation information • Previous “technical registries”: • Potted History • Issues • ENSURE linked data technical registry: • What’s different? • Why we hope it should succeed? • Conclusions and feedback…
Archives, libraries & representation information • Hold descriptive / cataloguing information for centuries: • Helps determine context and makes things unambiguous: • E.g., census records • Frequency, type of information • Professions • Parish boundaries • Includes references to other sources / archives • A “representation information network” of “linked data” • With the advent of digital material: • Need information on formats, rendering software etc. • Look to add a “Technical Registry”
Technical Registries: Potted History 1/2 • PRONOM: • Started in 2001 • On-line from 2005 • “File format registry” • In fact, holds more… • Planets Core Registry (2008) • Holds even more entities • Both: • Database-based • Web-based GUI • Issues: • Partially populated • Hard to add new entities • Hard to synchronise
Technical Registries: Potted History 2/2 • Move to linked data: • Linked Data PRONOM • UDFR • … • Issues: • Partially populated • Hard to add new entities • Partial projects: enough to be used? • Hard for people to query: SPARQL, but not via a simple GUI • Complex provenance
What’s different? • ENSURE Linked Data Technical Registry: • Fewer entities, more fully populated: • Expand later • Start with the synchronisation issue • Good querying and user interface: • Human Search / Browse • Human View / Edit • Simple view of provenance • Long-term commitment: • Will integrate with SDB/Preservica • 20+ organisations will use it
Data Model • Keep it simple: • Things actually used • Things actually populated • Add more if and when needed • Format: • ID, Name, Version, Description • Release Date, Withdrawn Date • Internal Signature, External Signature • Relationships • Not: • Assessments, Risk scores • Documents, Reference files, Agents • Intellectual Property • Technical Environments • XCDL, XCEL • Types, Faceting • Complex provenance
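The Format fields listed on this slide can be pictured as a small record type. The following is only a minimal sketch: the slide gives field names, not types, so the Python types, the `Relationship` helper, the choice to hold signatures as lists of strings, and every value in the example record are assumptions made for illustration, not the registry's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Relationship:
    """Hypothetical link from one format to another."""
    kind: str        # assumed label, e.g. "is-later-version-of"
    target_id: str   # ID of the related format


@dataclass
class Format:
    """The Format fields named on the slide; all types are assumed."""
    id: str
    name: str
    version: str
    description: str
    release_date: Optional[date] = None
    withdrawn_date: Optional[date] = None
    internal_signatures: List[str] = field(default_factory=list)
    external_signatures: List[str] = field(default_factory=list)
    relationships: List[Relationship] = field(default_factory=list)


# Made-up record for illustration only; not a real registry entry.
example = Format(
    id="example/2",
    name="Example Document Format",
    version="2.0",
    description="Illustrative record only.",
    release_date=date(2001, 1, 1),
    internal_signatures=["45 58 4D"],   # invented byte pattern
    external_signatures=[".exm"],       # invented file extension
    relationships=[Relationship("is-later-version-of", "example/1")],
)
```

Everything in the "Not" list (risk scores, agents, technical environments and so on) is simply absent from the record, which is the point of the slide: new fields would be added only if and when they are needed.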
Allow view / edit • Needs to be simple and user friendly • Not clear whether it can then expand with the model without effort
Provenance • Blocks of information: • Format, Software, Property, Pathway • Who made a change to a format, when, and based on what information? • Need provenance of the block, not of each item • Store every change: • Rollback • Diff • In fact, this makes synchronisation easy: • Receive an update and detect the change
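Block-level provenance with a full change history can be sketched as below. This is an illustrative outline under stated assumptions, not the ENSURE implementation: the `Revision` and `BlockHistory` names, the snapshot-per-change storage, and the sample data are all hypothetical. Each revision records who changed the block, when, and on what basis, and the same comparison that powers diff is reused to detect whether an incoming update actually changes anything.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Revision:
    who: str               # person or system that made the change
    when: str              # ISO 8601 timestamp, kept as text for simplicity
    basis: str             # what information the change was based on
    data: Dict[str, Any]   # full snapshot of the block after the change


class BlockHistory:
    """Every change to one block (e.g. one Format), oldest first."""

    def __init__(self) -> None:
        self.revisions: List[Revision] = []

    def current(self) -> Dict[str, Any]:
        return dict(self.revisions[-1].data) if self.revisions else {}

    def record(self, who: str, when: str, basis: str,
               data: Dict[str, Any]) -> bool:
        """Store a new snapshot; return False if nothing changed."""
        if self.revisions and self.revisions[-1].data == data:
            return False
        self.revisions.append(Revision(who, when, basis, dict(data)))
        return True

    def diff(self, old: int, new: int) -> Dict[str, Tuple[Any, Any]]:
        """Map each changed field to its (old, new) pair of values."""
        a, b = self.revisions[old].data, self.revisions[new].data
        return {k: (a.get(k), b.get(k))
                for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}

    def rollback(self, index: int, who: str, when: str) -> None:
        """Restore an earlier snapshot as a new revision; nothing is erased."""
        self.record(who, when, f"rollback to revision {index}",
                    self.revisions[index].data)


# Hypothetical usage with invented names and values.
history = BlockHistory()
history.record("alice", "2013-01-01T00:00:00Z", "vendor specification",
               {"name": "Example Format", "version": "1.0"})
history.record("bob", "2013-02-01T00:00:00Z", "registry update",
               {"name": "Example Format", "version": "1.1"})
changed = history.diff(0, 1)
# Receiving an identical update from another registry is a no-op.
duplicate = history.record("sync", "2013-03-01T00:00:00Z", "remote registry",
                           {"name": "Example Format", "version": "1.1"})
history.rollback(0, "alice", "2013-04-01T00:00:00Z")
```

Storing whole snapshots per block rather than per-field edits keeps the provenance view simple, and a rollback is just one more recorded change, so the history itself is never rewritten.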
Conclusions • Simple, Usable • Synchronised (as needed) • Provenance held (simply) • Expandable (with limited but not zero effort) • Being built now • Should be complete by December • Will be integrated into a working repository and thus used • Will need to iterate from there… • Comments and ideas welcome