Principles for Sustainable Data Curation
Steven Worley
Computational and Information Systems Laboratory, NCAR
Can Research Library Repositories Benefit from the Federal Lab Experience?
Topics
• My perspective – Research Data Archive @ NCAR
• Principles for Sustainable Data Curation
    • Stable Funding
    • Knowledgeable Staff
    • Robust Digital Storage
    • Protection from Loss
    • Data and Metadata Format
    • Partnerships
• Data Management Evolution
ARL, Leadership Fellows
My perspective – Research Data Archive @ NCAR
Core Data Categories
• Operational and Reanalysis Model Outputs
• Meteorological and Oceanographic Observations
• Remote Sensing Observations
• Topography, Bathymetry, Vegetation, and Land Use
My perspective – Research Data Archive @ NCAR
Purposes
• Support climate and weather research at NCAR and UCAR universities
• Extend data service worldwide
Basic Metrics
• Established in the 1960s
• 600+ datasets, 4M+ files
• 70+ datasets growing on daily to monthly schedules
[Diagram; surviving labels: Management (Supervision, Guidance, Integrity); Curation and Stewardship (Archiving, Metadata, Data Integrity, Preservation); Access (Data, Assistance, Feedback); Users (US, International); Requests and Needs]
Sustainable Curation – Stable Funding
Permits:
• Flexibility
• Evolution of data management to meet expectations
• Holistic approach – not driven by narrowly defined projects
• Ability to take advantage of unplanned opportunities
• Necessary to keep the collection viable for the long term
Sustainable Curation – Knowledgeable Staff
Data domain knowledge enables staff to:
• Understand data and do integrity checks
• Choose data organization to fit the science discipline
• Design appropriate access systems and do consulting
Consistent staffing levels nurture:
• Professionals dedicated to best practices
• Human-based knowledge, which should not be underestimated
Sustainable Curation – Robust Digital Storage
Keep pace with digital media evolution:
• Expect data migration every 2-5 years
    • Tape, disk capacity, etc.
• Plan, test, and implement migration carefully
    • Mistakes are irrecoverable!
    • Rely heavily on knowledgeable staff
Why evolve?
• Users expect more data with faster access
• Media will eventually fail
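One way to act on "plan, test, and implement migration carefully" is to verify every copied file by checksum before the old media is retired. The following is a minimal sketch, not the RDA's actual procedure; the function names `sha256_of` and `migrate_file` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_file(source: Path, destination: Path) -> str:
    """Copy source to destination, then confirm both checksums match.

    Hypothetical helper: raises OSError on a mismatch so that the
    caller never retires the source copy after a bad transfer.
    """
    expected = sha256_of(source)
    destination.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, destination)  # copies content and timestamps
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"checksum mismatch after copying to {destination}")
    return expected
```

In this sketch the source file is never deleted; removing it would be a separate, deliberate step taken only after `migrate_file` returns without error.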
Sustainable Curation – Protection from Loss
Create backup data and test disaster recovery
Why?
• Physical failures
    • Environmental: power outage, fire, flood, …
    • Hardware: disk system failure, tape degradation
• Poor curation practices
    • Metadata loss
    • Accidental data overwrites and deletions
Solutions
• Store backup at a separate physical location
• Treat metadata and data as equals and keep them coupled together
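The advice to couple metadata and data can be checked mechanically. The sketch below assumes a hypothetical convention in which each data file `name` has a sidecar metadata file `name.meta.json` in the same directory; the slides do not say how the RDA stores its metadata, so the suffix and the function name `find_unpaired` are illustrative only.

```python
from pathlib import Path
from typing import List, Tuple

# Assumed sidecar naming convention, not taken from the slides.
METADATA_SUFFIX = ".meta.json"


def find_unpaired(archive_dir: Path) -> Tuple[List[str], List[str]]:
    """Return (data files lacking metadata, metadata files lacking data).

    Paths are reported relative to archive_dir, sorted alphabetically.
    """
    data_names = set()
    metadata_targets = set()
    for path in archive_dir.rglob("*"):
        if not path.is_file():
            continue
        relative = path.relative_to(archive_dir).as_posix()
        if relative.endswith(METADATA_SUFFIX):
            # Record which data file this sidecar claims to describe.
            metadata_targets.add(relative[: -len(METADATA_SUFFIX)])
        else:
            data_names.add(relative)
    missing_metadata = sorted(data_names - metadata_targets)
    orphaned_metadata = sorted(
        target + METADATA_SUFFIX for target in metadata_targets - data_names
    )
    return missing_metadata, orphaned_metadata
```

Running such an audit on both the primary archive and the off-site backup would flag metadata loss or accidental deletions on either side before they go unnoticed.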
Sustainable Curation – Protection from Loss
[Figure; surviving label: RDA: 40%]
Sustainable Curation – Data and Metadata Format
Formats are a serious consideration because:
• Data access must be maintained for the long term
How?
• Insist that data and metadata are in standard formats
• Avoid computer OS-dependent formats
• Worry about application-driven formats
    • E.g.: .xls, .xlsx, .doc, .docx, .ppt, .pptx, etc.
• Challenge: scientists are reluctant to help
• Curator's nightmare: never-ending data and metadata format diversity
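A simple screening step at ingest can surface the application-driven formats listed above. This is a minimal sketch; the extension set is copied from the slide, while the function name `flag_application_formats` and the idea of screening by file extension alone are assumptions for illustration.

```python
from pathlib import Path

# Extensions named on the slide as application-driven formats.
APPLICATION_DRIVEN = {".xls", ".xlsx", ".doc", ".docx", ".ppt", ".pptx"}


def flag_application_formats(filenames):
    """Return the filenames whose extension is in APPLICATION_DRIVEN.

    The comparison is case-insensitive, so "notes.DOCX" is flagged too.
    """
    return [
        name
        for name in filenames
        if Path(name).suffix.lower() in APPLICATION_DRIVEN
    ]


submitted = ["station_obs.nc", "notes.DOCX", "readme.txt", "budget.xls"]
print(flag_application_formats(submitted))  # ['notes.DOCX', 'budget.xls']
```

Extension checks are only a first filter; a file can carry a standard-looking name and still hold an application-specific payload.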
Sustainable Curation – Partnerships
Science productivity is enhanced by partnerships
• Open sharing of data and metadata
• Relies heavily on standards
• No one archive or repository can do it all
• But users need and want it all
• Cost savings through sharing
Data Management Evolution – Person-centric, 1960s to 1990s
Data Management Evolution – Metadata-centric, 1990s to 2010s
Summary: For Research Library Repositories
Sustainable Data Curation
• Robust Digital Storage
• Knowledgeable Staff
• Stable Funding
• Protection from Loss
• Data/Metadata Format
• Partnerships
Research Data Archive @ NCAR
http://dss.ucar.edu/
worley@ucar.edu