DMAC Data Integration
What is it really? Why does it seem frozen in place? How do we get it moving?
Steve Hankin (NOAA/PMEL)
DMAC = Data Management and Communications subsystem of the US Integrated Ocean Observing System (IOOS)
Part 1. A Short Digression (begging your indulgence …)
• What’s new in the Observing System Monitoring Center (OSMC)
OCO Annual Review
Under the hood …
• Metadata feeds from NOAAPort & GODAE
• GODAE QC fields to be added next …
• A feed from NCEP?
• Goal:
  • Compare QC strategies.
  • Compare GTS filters and feeds.
Part 2. DMAC Data Integration (DMAC = Data Management and Communications subsystem of IOOS)
Just what is DMAC “data integration”? (And what is it not?)
Start with a taxonomy through examples …
What is it really? Why does it seem frozen in place? How do we get it moving?
The concept of “integration” in DMAC
An analogy: the electric power grid
• Energy goes in. Energy comes out.
• Providers do not target specific consumers.
• They just adhere to standards (60 Hz).
• Consumers are not aware of specific providers.
DMAC integration is a “data grid”.
Is the analogy simplistic? It appears so until you refine your concept of data: data must always be tightly bound to its metadata.
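The grid analogy can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `Dataset`, `DataGrid`, and the three required metadata keys are hypothetical names invented here, not anything defined by the DMAC Plan. The sketch shows two things from the slide: data never travels without its metadata, and consumers ask for a variable, not for a provider.

```python
from dataclasses import dataclass

# Assumed minimal "standard" that every provider must meet (hypothetical).
REQUIRED_KEYS = {"variable", "units", "source"}


@dataclass
class Dataset:
    """Values that always travel together with the metadata describing them."""
    values: tuple
    metadata: dict


class DataGrid:
    """Providers publish to the grid; consumers request by variable only."""

    def __init__(self):
        self._datasets = []

    def publish(self, dataset):
        # Reject anything that does not adhere to the shared standard.
        missing = REQUIRED_KEYS - set(dataset.metadata)
        if missing:
            raise ValueError(f"metadata missing required keys: {sorted(missing)}")
        self._datasets.append(dataset)

    def request(self, variable):
        # The consumer never names a provider; the grid resolves the match.
        return [d for d in self._datasets if d.metadata["variable"] == variable]


# Made-up example data from two made-up providers.
grid = DataGrid()
grid.publish(Dataset((20.1, 20.3), {"variable": "temperature", "units": "degC", "source": "provider_a"}))
grid.publish(Dataset((35.0,), {"variable": "salinity", "units": "PSU", "source": "provider_b"}))
temperature = grid.request("temperature")
```

A dataset published without units or a source is refused at the door, which is the software equivalent of a generator that does not run at 60 Hz.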
The DMAC Plan (2004) is built around a “data grid” concept (a.k.a. “data commons”)
• Uniform services (standards) to interconnect existing systems
• “Do no harm”
• Existing standards are inadequate
An implementation plan, not a specification (240 pages)
How far have we progressed?
How far has DMAC progressed since 2004?
Honest answer: barely at all. Why?
• Formulation choices in the DMAC Plan
• Political chaos
• Community social structure
How do we overcome each of these obstacles?
Obstacle 1: Formulation choices in the Plan
The DMAC Plan has detailed milestones, but they are not sufficiently tangible, e.g. “publish a community standard for [xxx]”.
Solution: Reformulate the Plan as a sequence of tasks that each provide tangible benefits.
Obstacle 2: Political chaos
• Dumb, bad-luck timing (post 9/11) and
• Interagency coordination failures
lead to
• Negligible direct funding (just enough for “volunteer” meetings)
(Note: millions have been made available that generated additional demand for DMAC guidance.)
Solution: Better marketing. Map out a Plan that can be marketed to government managers.
Obstacle 3: Community social structure
The diminutive nation of Science Data Management lies nestled among three neighbors:
• IT Infrastructure
• Computer Science
• Science Research
Each is larger and more powerful and imposes its viewpoint on our small nation.
[Diagram labels: Science Research; Computer Science; Data Mgmt; IT Infrastructure]
Obstacle 3: Community social structure
3. Science/Research viewpoint: “Reduce complexity by limiting the number of variables to be considered initially.”
But data management challenges are largely independent of data content.
Analogy: would it reduce complexity in designing an ocean glider if it only had to measure temperature?
Data management simplifies by reducing the number of data structures (a.k.a. “data models”).
Proposal: Build the DMAC integration framework as a collection of Virtual Data Assembly Centers (“V-DACs”) by data structure.
To be developed one by one:
• Grids (models, satellites, climatologies)
• Time series
• Surface Tracks
• Vertical Profiles and Sections
• …, Scatters, Swaths, Radials, Polygons, …
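The argument for organizing by data structure instead of by variable can be made concrete with a toy catalog. Everything in this sketch is made up for illustration: the dataset names, the catalog layout, and the `group_by` helper are assumptions, not part of the DMAC Plan. It pivots the same catalog two ways so the two groupings can be compared.

```python
from collections import defaultdict

# Made-up catalog entries: (dataset name, data structure, variable).
CATALOG = [
    ("mooring_a", "time series", "temperature"),
    ("mooring_a", "time series", "salinity"),
    ("tide_gauge_b", "time series", "sea level"),
    ("float_c", "vertical profile", "temperature"),
    ("float_c", "vertical profile", "salinity"),
    ("model_d", "grid", "temperature"),
    ("model_d", "grid", "currents"),
]


def group_by(catalog, index):
    """Group dataset names by one catalog column (1 = structure, 2 = variable)."""
    groups = defaultdict(set)
    for entry in catalog:
        groups[entry[index]].add(entry[0])
    return dict(groups)


by_structure = group_by(CATALOG, 1)  # one candidate V-DAC per key
by_variable = group_by(CATALOG, 2)   # what a per-variable split would need

for structure, names in sorted(by_structure.items()):
    print(f"{structure} V-DAC serves: {sorted(names)}")
```

In this toy catalog, adding a new variable to `mooring_a` adds a key to `by_variable` but leaves `by_structure` unchanged: the time series protocol already covers it.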
Imagine the V-DAC for time series data
• bricks-and-mortar time series “curator” (funded)
• standard protocol(s) (“web services”)
• one access point
• multiple variables
[Diagram labels: Time series V-DAC; Meta-data; time series protocol; NDBC; U. Hawaii Sea Level Center; OceanSites; NODC; TAO; BATS]
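One way to picture “one access point” over several existing archives is a thin facade that speaks a single protocol to every provider. The sketch below is a guess at the shape, not a description of any real service: `TimeSeriesProvider`, `InMemoryProvider`, `TimeSeriesVDAC`, the provider names, and all sample values are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Iterable, NamedTuple, Protocol


class Sample(NamedTuple):
    time: datetime
    variable: str
    value: float
    provider: str


class TimeSeriesProvider(Protocol):
    """The assumed standard protocol every participating archive implements."""

    def fetch(self, variable: str, start: datetime, end: datetime) -> Iterable[Sample]: ...


class InMemoryProvider:
    """Stand-in for a real archive; it just filters a list of made-up samples."""

    def __init__(self, samples):
        self._samples = list(samples)

    def fetch(self, variable, start, end):
        return [s for s in self._samples
                if s.variable == variable and start <= s.time <= end]


class TimeSeriesVDAC:
    """Single access point: fans a request out and merges results by time."""

    def __init__(self, providers):
        self._providers = list(providers)

    def fetch(self, variable, start, end):
        merged = []
        for provider in self._providers:
            merged.extend(provider.fetch(variable, start, end))
        return sorted(merged, key=lambda s: s.time)


# Made-up samples from two made-up providers.
t0 = datetime(2008, 1, 1)
provider_a = InMemoryProvider([
    Sample(t0 + timedelta(hours=2), "temperature", 20.5, "provider_a"),
    Sample(t0 + timedelta(hours=2), "salinity", 35.1, "provider_a"),
])
provider_b = InMemoryProvider([
    Sample(t0 + timedelta(hours=1), "temperature", 19.8, "provider_b"),
])
vdac = TimeSeriesVDAC([provider_a, provider_b])
result = vdac.fetch("temperature", t0, t0 + timedelta(days=1))
```

The existing archives keep their own storage (“do no harm”); only the `fetch` contract is shared, and the same call works for any variable a provider holds.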
Also fund a metadata development activity:
• Data discovery
• Controlled vocabularies
• Data lineage
• Geo-referencing
• Instrument characterizations
• Quality control
How do we build an ocean temperature V-DAC?
A single place to access all ocean temperature data
[Diagram labels: Temperature V-DAC; Time series V-DAC; Profiles V-DAC; Grids V-DAC; each with Meta-data]
The virtues of this approach:
• Reductionism: One protocol at a time
• A concrete deliverable at every step
• Unites communities of interest (integration)
But can we market the idea to management? (Who has the ability to carry the message to management?)
The science community has a strong voice. (Much stronger than DM.)
Discussion (Thank you)