Data Integrity: An Alternate View
Dick Simpson
PDSMC, 29 November 2006
Level 3 Requirements Applicable to “Data Integrity”
No such Level 3 Requirement

Keys to Data Integrity
• SPACE
• TIME
Use Cases
• UC-1 Data Delivery: SPACE
• UC-2 Data Distribution: SPACE
• UC-3 Data Transfer to Deep Archive: SPACE
• UC-4 Data Node Termination: SPACE
• UC-5 Archive Integrity: TIME
• UC-6 Media Migration: SPACE (and TIME?)
• UC-7 Recover Data from Deep Archive: SPACE
• UC-8 Data Transfer Between PDS Nodes: SPACE
Level 4 Requirements
[Table: requirement text not recovered; annotation column only]
follows from 2.5.2; restates 3.2.3; variation on 4.1.2; restates 3.2.3; restates 3.2.3; restates 3.2.3; restates 3.2.3; follows from 3.2.3
Level 4 Requirements (continued)
[Table: requirement text not recovered; annotation column only]
follows from 3.2.3; restates 4.1.2; variation on 3.2.3; restates 4.1.2; variation on 3.2.3; part of 3.2.3; variation on 3.2.3
“SPACE”: Moving Data from A to B
Parties in the diagram: EXTERNAL MANIFEST REPOSITORY; PROVIDER or REQUESTOR
• t0: send list
• t1: validate data (“checksum” and list) [diagram labels: if delivery / if request]
• t2: transfer data
• t3: validate data (“checksum” and list) [diagram labels: if request / if delivery]
• t4: acknowledge or repeat
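The t0 to t4 sequence above can be sketched in code. This is only an illustrative sketch, not the procedure the slides prescribe: SHA-256 stands in for the “checksum”, a local directory copy stands in for the transfer, and the names `build_manifest`, `validate`, and `transfer` are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file path (relative to root) to its checksum."""
    return {
        p.relative_to(root).as_posix(): file_checksum(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def validate(root: Path, manifest: dict[str, str]) -> list[str]:
    """Return the names that are missing, unexpected, or altered under root."""
    current = build_manifest(root)
    return sorted(
        name
        for name in set(manifest) | set(current)
        if manifest.get(name) != current.get(name)
    )


def transfer(source: Path, dest: Path, manifest: dict[str, str],
             max_attempts: int = 3) -> bool:
    """Run steps t1..t4; `manifest` is the list already sent at t0."""
    if validate(source, manifest):          # t1: validate before sending
        return False
    for _ in range(max_attempts):
        # t2: transfer data (a local copy is an assumption of this sketch)
        shutil.copytree(source, dest, dirs_exist_ok=True)
        if not validate(dest, manifest):    # t3: validate after receipt
            return True                     # t4: acknowledge
    return False                            # t4: repeats exhausted
```

Calling `transfer(src, dst, build_manifest(src))` returns `True` only when the destination matches the list sent at t0, so the acknowledgement at t4 is tied to the manifest rather than to the copy operation itself.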
“TIME”: Checking Data at tA vs tB
• MANIFEST REPOSITORY: inventory, including something like checksums
• files
• tA: original list and files
• tB: current list and files
• Comparison between tA and tB: same? same?
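The tA versus tB comparison can be illustrated with a small sketch. It assumes each inventory is a mapping from file name to a checksum string; the `IntegrityReport` type and `compare_manifests` function are hypothetical names, not part of any PDS tool.

```python
from dataclasses import dataclass, field


@dataclass
class IntegrityReport:
    """Differences between the original (tA) and current (tB) inventories."""
    missing: list[str] = field(default_factory=list)   # in tA, gone at tB
    added: list[str] = field(default_factory=list)     # new at tB
    changed: list[str] = field(default_factory=list)   # checksum differs

    @property
    def intact(self) -> bool:
        """True when list and files are the same at tA and tB."""
        return not (self.missing or self.added or self.changed)


def compare_manifests(original: dict[str, str],
                      current: dict[str, str]) -> IntegrityReport:
    """Compare two name-to-checksum inventories taken at different times."""
    names_a, names_b = set(original), set(current)
    return IntegrityReport(
        missing=sorted(names_a - names_b),
        added=sorted(names_b - names_a),
        changed=sorted(n for n in names_a & names_b
                       if original[n] != current[n]),
    )
```

Separating "missing", "added", and "changed" mirrors the two "same?" questions in the diagram: the first two concern the list, the third concerns the files.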
Requirements
• Define manifests (file and directory lists, or equivalents)
  • Master manifest(s) for total holdings
  • Ephemeral manifests for transfers
  • Foolproof procedures for protecting, updating, and checking manifest(s)
• Adopt a checksum or equivalent system
  • Procedures for marking and validating files
  • Procedures for exchanging these identifiers
• Method(s) for exchanging Manifests, Checksums, and Files with very low error rates
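One possible shape for a manifest that can itself be checked is sketched below. Everything here is an assumption made for illustration: the JSON layout, the field names `entries` and `manifest_sha256`, the choice of SHA-256, and the function names are hypothetical, and the slides do not specify any of them.

```python
import hashlib
import json
from pathlib import Path


def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_entries(root: Path) -> list[dict]:
    """List every file under root with its relative path and checksum."""
    return [
        {"path": p.relative_to(root).as_posix(), "sha256": file_checksum(p)}
        for p in sorted(root.rglob("*"))
        if p.is_file()
    ]


def _digest_entries(entries: list[dict]) -> str:
    """Checksum of the entry list itself, over a canonical JSON encoding."""
    canonical = json.dumps(entries, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal_manifest(entries: list[dict]) -> str:
    """Serialize the manifest together with a checksum of its own entries."""
    document = {"entries": entries,
                "manifest_sha256": _digest_entries(entries)}
    return json.dumps(document, indent=2)


def manifest_is_sealed(text: str) -> bool:
    """Check that a stored manifest still matches its own checksum."""
    document = json.loads(text)
    return document["manifest_sha256"] == _digest_entries(document["entries"])
```

The same structure could serve as a master manifest for total holdings or as an ephemeral manifest for a single transfer. The self-checksum only detects accidental damage to the manifest; it does not by itself make the procedure foolproof.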
Related Issues
• Type of control/management
  • central vs local control (or totally ad hoc?)
  • getting everyone synchronized
  • automation vs human participation
• Schedule for internal integrity checks
• Role of mirror site
• Contingency planning: what if something actually does go wrong?
• Optional information for manifest(s)
  • file size, creation date
  • log all transfers to/from NSSDC
  • statistics on transfers?
  • statistics on storage devices, comm links?
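If transfers were logged and summarized as suggested above, a minimal version might look like the following sketch. The log format (one JSON record per line), the record fields, and the function names `log_transfer` and `transfer_statistics` are all hypothetical choices made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def log_transfer(log_path: Path, source: str, destination: str,
                 file_count: int, total_bytes: int, ok: bool) -> dict:
    """Append one transfer record to a line-oriented JSON log."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "source": source,
        "destination": destination,
        "file_count": file_count,
        "total_bytes": total_bytes,
        "ok": ok,
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record) + "\n")
    return record


def transfer_statistics(log_path: Path) -> dict:
    """Summarize the log: number of transfers, failures, files, and bytes."""
    with log_path.open(encoding="utf-8") as handle:
        records = [json.loads(line) for line in handle if line.strip()]
    return {
        "transfers": len(records),
        "failed": sum(1 for r in records if not r["ok"]),
        "files": sum(r["file_count"] for r in records),
        "bytes": sum(r["total_bytes"] for r in records),
    }
```

An append-only log keeps each record independent, so a partial write can damage at most the last line.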