B2SAFE: Practical Policies
Willem Elbers (MPI-TLA)
2nd EUDAT Conference
Date: 27 October 2013
Policy Collecting
• Collected natural-language policy descriptions from several projects and institutes
  • (EUDAT core communities), IN2P3, REPLIX, PanData, ILL
• Total of 53 policies
• Classified according to type (hard/soft) and risk
• First selection round:
  • Integrity checking
  • Format conversion
Integrity
• Use the EUDAT Persistent Identifier Structure
• Verify that:
  • At least n replicas of the digital object exist
  • The replica checksums match the checksum of the repository of record (RoR) object
• Requirements:
  • Original Digital Object as input
  • Reference to RoR
  • Checksum stored for each replica
    • We can’t rely on on-demand checksum computation
    • Checksum must have a timestamp associated
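The timestamp requirement can be illustrated with a minimal sketch. It assumes a stored record per replica and a maximum acceptable checksum age. The names `ChecksumRecord`, `stale_records` and `max_age`, and the idea of an age threshold, are illustrative assumptions and not taken from the slides.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ChecksumRecord:
    """Hypothetical stored record: one checksum plus its timestamp per PID."""
    pid: str
    checksum: str
    computed_at: datetime


def stale_records(records, max_age, now):
    """Return the records whose stored checksum is older than max_age."""
    return [record for record in records if now - record.computed_at > max_age]


# Example with made-up PIDs and dates (assumption: a 7-day threshold).
now = datetime(2013, 10, 27, tzinfo=timezone.utc)
records = [
    ChecksumRecord("example/1", "aa", datetime(2013, 10, 26, tzinfo=timezone.utc)),
    ChecksumRecord("example/2", "aa", datetime(2013, 9, 1, tzinfo=timezone.utc)),
]
outdated = stale_records(records, timedelta(days=7), now)
```

Because the checksum is stored rather than computed on demand, the timestamp is what tells a verifier how recent the stored value is.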
Integrity
• Three policies needed
  • Policy to compute checksum
    • Running at the EUDAT data centers
    • At a set periodic interval
  • Policy to verify the number of replicas
  • Policy to verify the replica checksums
• Visualization of the results for the community data managers
• Integration into the Data Policy Manager
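The checksum computation policy could look roughly like the sketch below. The slides do not name a hash algorithm or an implementation language, so SHA-256, Python's `hashlib`, and the function name `compute_checksum` are assumptions. Periodic scheduling at the data center is left out.

```python
import hashlib
from datetime import datetime, timezone


def compute_checksum(path, chunk_size=1024 * 1024):
    """Hash a physical file in chunks and return (hex digest, UTC timestamp).

    SHA-256 is an assumption; the slides do not specify the algorithm.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest(), datetime.now(timezone.utc)
```

Returning the timestamp together with the digest keeps the two values paired when they are stored for a replica.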
Integrity
[Diagram: two verification workflows over the PID records of a digital object]
• Check number of replicas: Fetch all replica locations → Count # replicas → Compare count(min, actual)
• Check replica integrity: Fetch all replica locations → For each replica → Compare checksum(PIDror, PIDcurrent)
• PID records shown: 1839/abc (EUDAT, checksum 29db...279b4a, 1.10.13 00:00); 456/abc, 789/abc, …/abc (checksum 29db...279b4a each; timestamps 1.10.13 02:00, 5.10.13 05:00, 6.10.13 10:00)
• Checksum recalculation on the physical file
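The two workflows can be sketched as plain functions, assuming the replica locations and their stored checksums have already been fetched into a dictionary. The function names, the in-memory dictionary and the PIDs in the example are hypothetical; the actual lookup against the PID service is not shown.

```python
def check_replica_count(replica_checksums, minimum):
    """Compare count(min, actual): True when enough replicas are registered."""
    return len(replica_checksums) >= minimum


def check_replica_integrity(ror_checksum, replica_checksums):
    """Return the PIDs of replicas whose stored checksum differs from the RoR's."""
    return sorted(
        pid
        for pid, checksum in replica_checksums.items()
        if checksum != ror_checksum
    )


# Example with made-up PIDs and checksums.
ror_checksum = "abc123"
replicas = {"example/1": "abc123", "example/2": "abc123", "example/3": "ffff00"}
enough = check_replica_count(replicas, minimum=3)
corrupted = check_replica_integrity(ror_checksum, replicas)
```

Returning the list of mismatching PIDs, rather than a single pass/fail flag, gives a visualization for data managers something concrete to display.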
Format Conversion
• Domains: format verification versus format conversion
• On ingest, the RoR is responsible for format verification and conversion
• Over time a format can become obsolete
  • A format obsolescence strategy must be in place
• Where does the conversion take place?
  • The end user converts the data manually before or during ingestion (out of EUDAT scope)
  • The conversion is part of an ingestion workflow (can be in EUDAT scope)
  • A format becomes obsolete over time and all replicas in that format must be converted (in EUDAT scope)
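For the third case, selecting the replicas to convert could be sketched as follows. This assumes a table mapping each obsolete format to a target format; the table, the function name `plan_conversions` and the format names are illustrative assumptions, and the conversion itself is not shown.

```python
def plan_conversions(replica_formats, obsolete_to_target):
    """Map each replica stored in an obsolete format to its target format."""
    return {
        pid: obsolete_to_target[fmt]
        for pid, fmt in replica_formats.items()
        if fmt in obsolete_to_target
    }


# Example with made-up PIDs and format names.
replica_formats = {
    "example/1": "old-format",
    "example/2": "current-format",
    "example/3": "old-format",
}
plan = plan_conversions(replica_formats, {"old-format": "new-format"})
```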
Questions
Thank you for your attention