MBARI Data Management Initiatives. John Graybeal, Information Applications Group Lead. Monterey Bay Aquarium Research Institute. Established in 1987. David and Lucile Packard Foundation. MBARI Location. Santa Cruz. MBARI. Monterey. Monterey Canyon.
MBARI Data Management Initiatives John Graybeal Information Applications Group Lead
Monterey Bay Aquarium Research Institute Established in 1987 David and Lucile Packard Foundation
MBARI Location [Map labels: Santa Cruz, MBARI, Monterey, Monterey Canyon]
Monterey Ocean Observing System • Suitable for deep ocean or coastal studies • low power, long term moorings and benthic nodes • low bandwidth communication links to shore • Configurable, re-deployable instruments and platforms (using ships and ROVs) • Smart nodes on deployed platforms • some on-board data processing • facilitate autonomous event detection • perform on-board calculations/detections • handle responses from shore
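The "autonomous event detection" bullet above can be illustrated with a minimal sketch. It assumes a simple running-mean threshold detector; the function name, window size, threshold, and sample readings are all hypothetical and are not taken from the MOOS software.

```python
from collections import deque


def detect_events(samples, window=5, threshold=3.0):
    """Return (index, value) pairs for samples that deviate from the mean
    of the previous `window` samples by more than `threshold`.

    Hypothetical illustration only; not the MOOS detection algorithm.
    """
    recent = deque(maxlen=window)  # holds the last `window` samples
    events = []
    for index, value in enumerate(samples):
        # Only test once a full window of history is available.
        if len(recent) == window:
            baseline = sum(recent) / window
            if abs(value - baseline) > threshold:
                events.append((index, value))
        recent.append(value)
    return events


# Made-up readings with one spike at index 6.
readings = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0, 15.0, 10.0]
events = detect_events(readings)
print(events)
```

A detector this small fits the low-power constraint: it keeps only a short window of samples in memory and would need to send a message to shore only when an event is found.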
MOOS Concept of Operations [Diagram labels: MBARI Autonomous Underwater Vehicle (AUV), Mooring, Benthic Node]
Data Management Challenge • Large number of data sources • Large variety of data sources • Dynamic systems • Data sources may appear and disappear • Devices & platforms reconfigured often • Interactions from shore and ship • No standard data format • Data can be instrument ‘native’ • New sources coming on-line all the time • Streams or files, automated or manual
Example: Video and Images • 14 years, up to 300 dives/year • 14000 video tapes, 10000 hours • 47000 frame grabs… => 900,000 annotations • How to manage this valuable repository? • Advanced annotation system • Detailed knowledge base of concepts • Easy-to-use querying tool
Notes About SSDS: The Shore Side Data System • A MOOS Development Project • Goals: low cost, flexible, expandable, reliable • Future systems beyond MOOS (e.g., MARS) • Now in 3rd year, deploying initial elements • Key Tenets of SSDS Development • Iterative development: improve it as we go • Test with real data: new and archival • Build for change: use modular interfaces
Shore Side Data System: Requirements Overview • Ingest data in any described format and save it • Capture, publish data descriptions (metadata) • Provide standards-based access to data • Raw data, and other common digital formats • APIs for common visualization and analysis tools • User-oriented web interfaces, quick-look plots • Merge data (different sources & time intervals) • Support data visualization & quality control • Provide data access security as needed
Shore Side Data System: User Requirements • Raw data via device ID pages? (sort of limited) • Standard plots like the OASIS quality-controlled ones? • Access data from applications via DODS URLs? • Matlab, Ingrid, Live Access Server, Excel, IDV, Ferret • And hopefully, Ocean Data View • Access data via returned data files (e.g., ASCII CSV w/headers) opened within desktop applications? • Excel, ArcView, Ocean Data View • Delivery of data directly into an application? • Ability to subset data, for example by time window? • Ability to merge data from different data sets?
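The last two requirements above (subsetting by time window, merging different data sets) can be sketched as follows. This is a minimal illustration using plain Python lists of records. The function names, field names, and sample values are assumptions made for the example, not part of SSDS.

```python
from datetime import datetime


def subset_by_time(records, start, end):
    """Keep records whose timestamp falls in the half-open window [start, end)."""
    return [r for r in records if start <= r["time"] < end]


def merge_on_time(left, right):
    """Merge two record lists, keeping only timestamps present in both."""
    right_by_time = {r["time"]: r for r in right}
    merged = []
    for record in left:
        match = right_by_time.get(record["time"])
        if match is not None:
            merged.append({**record, **match})
    return merged


# Hypothetical sample data from two instruments.
ctd = [
    {"time": datetime(2003, 5, 1, 0), "temperature_c": 12.1},
    {"time": datetime(2003, 5, 1, 1), "temperature_c": 12.3},
    {"time": datetime(2003, 5, 1, 2), "temperature_c": 12.0},
]
fluorometer = [
    {"time": datetime(2003, 5, 1, 1), "chlorophyll": 0.8},
    {"time": datetime(2003, 5, 1, 2), "chlorophyll": 0.9},
]

window = subset_by_time(ctd, datetime(2003, 5, 1, 1), datetime(2003, 5, 1, 3))
merged = merge_on_time(window, fluorometer)
```

Real instrument streams rarely share exact timestamps, so a practical merge would likely need nearest-neighbor matching or interpolation. Exact matching is used here only to keep the sketch short.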
Data Management at MBARI: SSDS Efforts • Infrastructure/model development • Ontologies • Metadata schema • Metadata entry/correction/annotation • User interfaces • Data processing • Visualizations • Federated access to MBARI data/metadata
More MBARI SSDS Tasks • Legacy data migration • OASIS, expd etc., Samples, Waypoints, ? • New data sources • MTM II, AUV Sonar, CIMT, … • Outreach (integrating non-SSDS projects) • Documentation • NEPTUNE • Education • Operational support
MOOS/SSDS Architecture (shows data flow) [Diagram labels: Ocean Side, Shore Side, Deployed Platform, Devices, Applications, Communications, Portal, Shore Side Data System (User Tools), Archiving, Data Tracking, Data Presentation, Applications/Interfaces, User]
Metadata Approach (Credit: Dan Davis) • XML suitable for MOOS metadata • Enables use of many other tools/software • But it looks a little user-unfriendly • Use XML-driven GUI technology to build forms for creating and displaying metadata • Users don’t have to read XML directly • It’s there and easy to access if they want it • Bind XML metadata to each device through its puck
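As an illustration of why XML "enables use of many other tools/software", the sketch below reads a small device description with Python's standard library. The element and attribute names and all values are invented for the example; the actual MOOS metadata schema is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical device metadata; NOT the real MOOS/SSDS schema.
METADATA_XML = """
<device id="1234" name="CTD">
  <manufacturer>Example Instruments</manufacturer>
  <variable name="temperature" units="degC"/>
  <variable name="conductivity" units="S/m"/>
</device>
"""


def summarize_device(xml_text):
    """Extract the fields a form might display from a device description."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "name": root.get("name"),
        "manufacturer": root.findtext("manufacturer"),
        # Map each variable name to its units.
        "variables": {v.get("name"): v.get("units") for v in root.findall("variable")},
    }


summary = summarize_device(METADATA_XML)
```

A form-based GUI could show and edit exactly these fields, so users never have to read the XML itself.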
Metadata stored in puck interface • During pre-deployment instrument configuration and test, the sensor driver and associated metadata are stored in compact flash memory in the puck [Diagram labels: Sensor, serial interface, Puck, serial interface, to host computer]
Metadata User Form Design • The user interface designer uses the schema to build a form for creating, displaying, and accessing metadata instances • There may be different forms for different users (e.g., scientific, system, and operational) to create and display the metadata of interest
Instrument Configuration • Metadata forms are used during device configuration to create the metadata that is entered into the device puck • Similarly, metadata forms are used during configuration of other system elements, such as platforms and even communication links. This metadata is maintained in system nodes.
SSDS—Metadata (Device) • The data source. • SSDS tracks: • Software or hardware source • Unique identifier • Manufacturer information • References to documentation
SSDS—Metadata (Deployment) • ‘Deployment’ information. • SSDS tracks: • Where the data was collected. • When it was collected. • What other data was used. • Relation to other deployments
SSDS—Metadata (DataContainer) • References to the data. • SSDS tracks: • The data storage location. • How to access this data. • The deployment that produced this data.
SSDS—Metadata (Records) • Format and contents of a DataContainer. • SSDS tracks: • The contents of a data set. • The data format (to allow parsing by software). • Descriptive info like units, scale, …
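The four metadata slides above (Device, Deployment, DataContainer, Records) suggest a small linked data model. The sketch below is one way to express it, with a parser driven by the record description. All class names, field names, and sample values are assumptions for illustration, and the delimited-text format is a simplification. None of this is the actual SSDS schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Device:
    device_id: str            # unique identifier
    manufacturer: str
    documentation: List[str] = field(default_factory=list)


@dataclass
class Deployment:
    device: Device
    location: str             # where the data was collected
    start_time: str           # when it was collected


@dataclass
class RecordVariable:
    name: str
    units: str
    scale: float = 1.0        # multiply raw values by this factor


@dataclass
class RecordDescription:
    delimiter: str
    variables: List[RecordVariable]


@dataclass
class DataContainer:
    uri: str                  # the data storage location
    deployment: Deployment    # the deployment that produced the data
    description: RecordDescription


def parse_record(line: str, description: RecordDescription) -> Dict[str, float]:
    """Parse one delimited text record using only its description."""
    raw_fields = line.strip().split(description.delimiter)
    if len(raw_fields) != len(description.variables):
        raise ValueError("record does not match its description")
    return {
        var.name: float(raw) * var.scale
        for var, raw in zip(description.variables, raw_fields)
    }


# Hypothetical example values.
ctd = Device("1234", "Example Instruments")
deployment = Deployment(ctd, "Monterey Bay", "2003-05-01T00:00:00Z")
description = RecordDescription(
    delimiter=",",
    variables=[
        RecordVariable("temperature", "degC"),
        RecordVariable("pressure", "dbar", scale=0.5),
    ],
)
container = DataContainer("file:///data/ctd_1234.csv", deployment, description)
record = parse_record("12.5,40\n", container.description)
```

Because the parser takes the description as data, a new instrument format would need only a new description and no new parsing code. This is one reading of "ingest data in any described format".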
Metadata and Access: Catalogs and Repositories • View From the Shore • Many data registries and models • GDC, OBIS, EarthRef, NVODS, … • Many standards • Communications protocols: SOAP, OPeNDAP, OBIS, … • Metadata formats (MIF, XML, NGDC, NetCDF…) • Metadata ontologies and efforts • NGDC, MarineXML, ESRI, Metadata Wranglers • Conclusion: Watch, Learn, Try (Iterate)
SSDS Data Access • Desktop Application: HOOVES • Data File Service • Quick Look • Metadata Access (and Validation) • Metadata Editing • Networked API: Servlet / JSP Pages • Application API (NetCDF): OPeNDAP • Web Access (NetCDF): Live Access Server • Archived Files: Direct Access (?)
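For the OPeNDAP entry above, a client typically requests a subset by appending a constraint expression to the dataset URL. The sketch below builds such a URL in the DAP2 style (`variable[start:stride:stop]` with a response suffix such as `.ascii`). The host, file name, and variable name are hypothetical, and the helper is an illustration rather than part of SSDS or of any OPeNDAP client library.

```python
def opendap_subset_url(dataset_url, variable, start, stop, stride=1, response="ascii"):
    """Build a DAP2-style URL requesting one index range of one variable.

    Hypothetical helper for illustration; it does not contact any server.
    """
    if start < 0 or stop < start or stride < 1:
        raise ValueError("invalid index range")
    return f"{dataset_url}.{response}?{variable}[{start}:{stride}:{stop}]"


# Example with a made-up server and variable name.
url = opendap_subset_url("http://example.org/opendap/mooring.nc", "temperature", 0, 99)
print(url)
```

Tools such as Matlab, Ferret, or a Live Access Server would issue equivalent requests through their own OPeNDAP clients, so only the requested slice travels over the network.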
Prime Areas for Collaboration • Infrastructure/model development • Ontologies • Metadata schema • Metadata entry/correction/annotation • User interfaces • Data processing • Visualizations • Federated access to data/metadata • Documentation
IAG Team • Kevin Gomes • John Graybeal • Mike McCann • Brian Schlining • Rich Schramm • And, a Mystery Guest (To Be Determined) Science Representative to SSDS • John Ryan