5 topics • Fedora • NGDA project activities • Two study ideas • MODIS • Preservation as series-of-handoffs
Fedora— what • Repository system • Features • basic management functionality • programmatic APIs • object model & XML representation thereof • storage subsystem abstraction • inter-object relationships, versions, … • Active user community
Fedora— why • Avoid re-inventing wheel • Value in describing our work as profiles of, additions to a base repository • intellectual value • practical value: contribute back to Fedora community • Generic Fedora-ADL connection
Fedora— to do • Define “long-term preservation” profile • Additions to Fedora • richer model of semantic definitions • support for geospatial data types • Archivas storage driver
NGDA project activities • Considerations for long-term preservation • Best practices • Collection development, prioritization, and scope • Architectural and economic models • Rights issues
[Diagram labels: RESEARCH; DEVELOPMENT; ADL federation; UCSB, Stanford, other …; prototype archives …; multiple levels of information preservation: bits, semantics, viewability; geospatial format/product registry]
Study #1 • How much is necessary? Capturable? Archivable? Affordable?
[Diagram labels: 2005, 2105; object (data + metadata) at each date; computing platform, semantics, terminology, provenance, provider quality, appropriate usage, community; object, migrate, environment, capture, environment]
Study #2 • Survey people who have used (or tried to use) old geospatial data • What information did they need? • What was missing? • What would have been useful?
MODIS • “Moderate Resolution Imaging Spectroradiometer” • http://modis.gsfc.nasa.gov/
MODIS challenges • Size • 2 petabytes; growing 1 TB/day • not backed up • HDF file format • large, complex, long history • controlled/managed by NCSA • source of funding • actual format is undocumented • accessed through NCSA-provided software libraries • reverse engineerable in principle, not in practice
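To put the size figures above in perspective, here is a rough back-of-the-envelope projection. It is only a sketch: it assumes the 1 TB/day growth rate stays constant, uses decimal units (1 PB = 1000 TB), and counts 365 days per year. None of these assumptions come from the slides, and the function name `projected_size_pb` is made up for illustration.

```python
# Rough linear projection of archive size from the slide's figures.
# Assumptions (not from the slides): constant growth, 1 PB = 1000 TB,
# 365 days per year.

TB_PER_PB = 1000  # decimal units, an assumption


def projected_size_pb(years, start_pb=2.0, growth_tb_per_day=1.0,
                      days_per_year=365):
    """Return the projected archive size in petabytes after `years`."""
    added_tb = growth_tb_per_day * days_per_year * years
    return start_pb + added_tb / TB_PER_PB


if __name__ == "__main__":
    for years in (1, 10, 100):
        print(f"after {years:>3} years: {projected_size_pb(years):.1f} PB")
```

Under these assumptions the archive grows by about a third of a petabyte per year, so the volume added over a decade exceeds the starting 2 PB.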
MODIS challenges • Raw data format • documented, but not publicly releasable • includes satellite controls • Attitude/ephemeris data format • “hard to find” • Other format issues • packets, nested formats, ...
MODIS challenges • Calibration/processing algorithms • key to data interpretation • documentation: • initially described by “algorithm theoretical basis document” (ATBD) • rapidly outdated, never updated • journal articles • Fortran/C source code is definitive • certain lookup tables re-calibrated monthly • moving to on-demand computation of products
MODIS challenges • Other, related efforts • NASA committee(s) on long-term storage • NASA’s transition of operations to NOAA • CLASS
MODIS • Questions for you: • Is this within NGDA’s scope? • Is this within LoC’s scope?
Preservation as series of handoffs • Chris Rusbridge • no such thing as a 100-year guarantee* • “impossible perfection” • instead, a series of 10-year guarantees • Jim Frew • “store-and-forward” model • analogous to Internet • Greg’s conclusion • handoff/migration ability is key *except by LoC?
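One way to see why handoff ability matters is to count the handoffs a 100-year span implies. The sketch below is illustrative only: the per-handoff success probability is a hypothetical parameter, not a figure from the talk, and handoffs are assumed to succeed or fail independently. The function names are invented for this example.

```python
import math


def custody_periods(horizon_years=100, guarantee_years=10):
    """Number of consecutive guarantee periods needed to cover the horizon."""
    return math.ceil(horizon_years / guarantee_years)


def survival_probability(per_handoff_success, horizon_years=100,
                         guarantee_years=10):
    """Chance that every handoff succeeds, assuming independent handoffs.

    `per_handoff_success` is a hypothetical value supplied by the caller.
    """
    periods = custody_periods(horizon_years, guarantee_years)
    handoffs = max(periods - 1, 0)  # one handoff between adjacent periods
    return per_handoff_success ** handoffs


if __name__ == "__main__":
    print("periods:", custody_periods())
    for p in (0.99, 0.95, 0.90):  # hypothetical success rates
        print(f"per-handoff success {p:.2f}: "
              f"overall {survival_probability(p):.3f}")
```

Ten 10-year guarantees imply nine handoffs, so even a small per-handoff failure rate compounds over the century.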