Measurement Data Archive – Integration Effort mda.doregistry / GEC11 July 2011

Measurement Data Archive – Integration Effort
http://mda.doregistry.org/
GEC11, July 2011
Giridhar Manepalli
Corporation for National Research Initiatives
http://www.cnri.reston.va.us/

Measurement Data Archive: Status
• Deployed a prototype of a measurement data archive that includes a temporary storage space, aka workspace
• A hierarchical storage system that allows making collections of objects
• Mints a persistent identifier that resolves to data
• Indexes metadata to support queries and data discovery
• Supports SFTP, SCP, SMB, REST, and Web-based interfaces into the system
• Early adopters in GENI:
  • OnTimeMeasure - Ohio State University
  • INSTOOLS - University of Kentucky

Success Criteria for an Archive
• An archive cannot be just a store-and-retrieve service. An ecosystem surrounding the archive is needed to motivate communities to use it.
• Visualization, policy enforcement, dissemination, etc. are examples of services an archive could provide.
• To build such an ecosystem, a basic understanding of what we store is necessary:
  • #1: Data Model. How do you define a data object? (Not how it is serialized, e.g., databases, file systems, etc.) Do we need a data-agnostic archive? Do we manage relationships across data objects?
    • Too many storage systems failed because of the lack of a proper data model.
  • #2: Metadata. What constitutes a metadata record? How is it associated with a data object?
    • Lack of metadata results in a pile of bytes in an archive. Building an ecosystem of services on a pile of bytes is impossible.
  • #3: API. How is data (and metadata) pushed into an archive? What are the end-point definitions and data structures?
    • #1 and #2 are more important than #3.
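One way to make questions #1 and #2 concrete is to write down a candidate data object and see what it forces you to decide. The sketch below is only an illustration, not the archive's actual model: the class name DataObject, its fields, the relation name "derivedFrom", and the identifier strings are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class DataObject:
    """Hypothetical data object: opaque bytes plus a metadata record."""

    identifier: str  # persistent identifier; the format here is made up
    payload: bytes   # the archive treats this as opaque (data agnostic)
    metadata: Dict[str, str] = field(default_factory=dict)
    # Each relationship is a (relation name, target identifier) pair.
    relations: List[Tuple[str, str]] = field(default_factory=list)

    def relate(self, relation: str, target: "DataObject") -> None:
        """Record a typed link from this object to another object."""
        self.relations.append((relation, target.identifier))


# Two illustrative objects: raw samples and a summary derived from them.
raw = DataObject("example/0001", b"10,12,11\n", {"title": "Raw samples"})
summary = DataObject("example/0002", b"mean=11\n", {"title": "Summary"})
summary.relate("derivedFrom", raw)
```

Keeping the payload opaque and putting everything the archive can reason about into the metadata and relations fields is one possible answer to the "data agnostic" question; a project that needs the archive to understand its payloads would choose differently.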

Integration: Next Steps
• Step #1: Define a data object.
  • Is data just a series of bytes? Or do we pack X, Y, & Z into it?
  • Are relationships across objects required or not? (Not nice-to-have, but are they required?)
  • Do we have data visibility criteria? Permissions, etc.
• Step #2: Validate the metadata recommendation.
  • Projects should generate a few metadata records with these goals:
    • To identify which elements are required, which are optional, and which are unnecessary.
    • To capture different profiles of data. Perhaps some elements are needed for one class of data, and other elements are needed for another class of data.
      • This may result in a few profiles. Although unlimited profiles are hard to manage, a limited number will result in fewer optional fields.
    • To validate the suggested controlled vocabulary for some of the elements, and to identify vocabulary where it is missing. Controlled vocabulary brings some order into metadata and discovery.
• Step #3: Identify the API.
  • What end-points and data structures are reasonable for a given project? REST+XML, XML-RPC, etc.
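Step #2 can be tried out mechanically once a project has a few sample records. The sketch below checks a record against a profile that lists required elements, optional elements, and a controlled vocabulary. Everything named here is hypothetical: the Profile class, the validate function, the element names, and the vocabulary terms are invented for illustration and are not taken from the metadata recommendation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Profile:
    """Hypothetical metadata profile for one class of data."""

    name: str
    required: Set[str]
    optional: Set[str] = field(default_factory=set)
    # Controlled vocabulary: element name -> allowed values.
    vocabulary: Dict[str, Set[str]] = field(default_factory=dict)


def validate(record: Dict[str, str], profile: Profile) -> List[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems: List[str] = []
    present = set(record)
    for element in sorted(profile.required - present):
        problems.append(f"missing required element: {element}")
    for element in sorted(present - profile.required - profile.optional):
        problems.append(f"element not in profile: {element}")
    for element, allowed in sorted(profile.vocabulary.items()):
        if element in record and record[element] not in allowed:
            problems.append(f"value outside vocabulary for {element}: {record[element]}")
    return problems


# Illustrative profile and record; all names and values are made up.
profile = Profile(
    name="active-measurement",
    required={"title", "creator", "measurement_type"},
    optional={"description"},
    vocabulary={"measurement_type": {"active", "passive"}},
)
record = {"title": "Ping run", "measurement_type": "actve", "units": "ms"}
problems = validate(record, profile)
```

Running a handful of real records through a check like this shows quickly which elements every project supplies, which ones only one class of data needs, and where the vocabulary has gaps.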
