
B2SAFE: Practical Policies

B2SAFE: Practical Policies. Willem Elbers (MPI-TLA), 2nd EUDAT Conference. Date: 27 October 2013. Policy Collecting. Collected natural-language policy descriptions from several projects and institutes (EUDAT core communities), IN2P3, REPLIX, PanData, ILL. Total of 53 policies.


Presentation Transcript


  1. B2SAFE: Practical Policies Willem Elbers (MPI-TLA) 2nd EUDAT Conference Date: 27 October 2013

  2. Policy Collecting • Collected natural-language policy descriptions from several projects and institutes • (EUDAT core communities), IN2P3, REPLIX, PanData, ILL • Total of 53 policies • Classified according to type (hard/soft) and risk • First selection round: • Integrity checking • Format conversion

  3. Integrity • Use the EUDAT Persistent Identifier Structure • Verify that: • At least n replicas of the digital object exist • The replica checksums match the checksum of the repository of record (RoR) object • Requirements: • Original Digital Object as input • Reference to RoR • Checksum stored for each replica • We can't rely on on-demand checksum computation • Each checksum must have an associated timestamp
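
The requirement that each replica carries a stored, timestamped checksum can be sketched as follows. This is only an illustration: the `ChecksumRecord` structure, the function names, and the choice of SHA-256 are assumptions, since the slides do not name a record layout or a hash algorithm.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ChecksumRecord:
    """Checksum stored alongside a replica (hypothetical structure)."""
    pid: str               # persistent identifier of the replica
    checksum: str          # hex digest stored at computation time
    computed_at: datetime  # timestamp associated with the checksum


def compute_record(pid: str, data: bytes, now: datetime) -> ChecksumRecord:
    # SHA-256 is an assumption; the slides do not specify the algorithm.
    return ChecksumRecord(pid, hashlib.sha256(data).hexdigest(), now)


def is_stale(record: ChecksumRecord, now: datetime, max_age: timedelta) -> bool:
    # A record older than max_age is due for recomputation by a periodic job.
    return now - record.computed_at > max_age


if __name__ == "__main__":
    t0 = datetime(2013, 10, 1, tzinfo=timezone.utc)
    record = compute_record("456/abc", b"example payload", t0)
    print(record.pid, record.computed_at.isoformat())
    print("stale after 8 days:", is_stale(record, t0 + timedelta(days=8), timedelta(days=7)))
```

Storing the timestamp next to the digest lets a verifier decide whether a stored value is recent enough to trust without recomputing it on demand.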

  4. Integrity • Three policies needed • Policy to compute checksum • Running at the EUDAT data centers • At a set periodic interval • Policy to verify the number of replicas • Policy to verify the replica checksums • Visualization of the results for the community data managers • Integration into the Data Policy Manager

  5. Integrity (diagram) • Check number of replicas: fetch all replica locations, count # replicas, compare count(min, actual) • Check replica integrity: fetch all replica locations, then for each replica compare checksum(PIDror, PIDcurrent) • Example values shown: 1839/abc (EUDAT) with checksum 29db...279b4a, 1.10.13 00:00; replicas 456/abc, 789/abc, …/abc, each with checksum 29db...279b4a, timestamped 1.10.13 02:00, 5.10.13 05:00, 6.10.13 10:00 • Checksum recalculation on the physical file
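
The two verification checks on this slide (replica count and replica checksum comparison) might look like the minimal sketch below. The `Replica` type, the function names, and the in-memory list standing in for a PID lookup are hypothetical, and the checksum strings are placeholder values, not real digests.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replica:
    pid: str
    checksum: str  # stored checksum, not recomputed on demand


def has_enough_replicas(replicas: List[Replica], minimum: int) -> bool:
    # Corresponds to "compare count(min, actual)".
    return len(replicas) >= minimum


def mismatched_replicas(ror_checksum: str, replicas: List[Replica]) -> List[str]:
    # Corresponds to "compare checksum(PIDror, PIDcurrent)" for each replica.
    return [r.pid for r in replicas if r.checksum != ror_checksum]


if __name__ == "__main__":
    ror_checksum = "c0ffee"  # placeholder for the repository-of-record checksum
    replicas = [
        Replica("456/abc", "c0ffee"),
        Replica("789/abc", "badbad"),  # deliberately corrupted for the demo
    ]
    print("enough replicas:", has_enough_replicas(replicas, minimum=2))
    print("mismatches:", mismatched_replicas(ror_checksum, replicas))
```

Keeping the two checks as separate functions mirrors the slides, which treat them as separate policies that can run on different schedules.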

  6. Format Conversion • Domains: format verification versus format conversion • On ingest the RoR is responsible for format verification and conversion • Over time a format can become obsolete • A format obsolescence strategy must be in place • Where does the conversion take place? • The end user converts the data manually before or during ingestion (out of EUDAT scope) • The conversion is part of an ingestion workflow (can be in EUDAT scope) • A format becomes obsolete after some time and all replicas in that format must be converted (in EUDAT scope)
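
For the last case, where a format becomes obsolete and every replica in it must be converted, the selection step could be sketched roughly like this. The function name, the two mappings, and the format labels are all hypothetical; the slides do not describe how obsolete formats or their replacements are recorded.

```python
from typing import Dict


def replicas_to_convert(
    replica_formats: Dict[str, str],  # replica PID -> current format (hypothetical)
    obsolete: Dict[str, str],         # obsolete format -> replacement format (hypothetical)
) -> Dict[str, str]:
    """Return replica PID -> target format for every replica in an obsolete format."""
    return {
        pid: obsolete[fmt]
        for pid, fmt in replica_formats.items()
        if fmt in obsolete
    }


if __name__ == "__main__":
    formats = {"456/abc": "old-format", "789/abc": "current-format"}
    plan = replicas_to_convert(formats, {"old-format": "new-format"})
    print(plan)
```

The sketch only builds a conversion plan; the conversion itself and the follow-up checksum update are left out.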

  7. Questions Thank you for your attention
