
DATA SHARING ISSUES, METADATA, ARCHIVES, AND COMPREHENSION

This article explores the issues, metadata standards, and data formats in the Earthquake Engineering community. It discusses the uses of data and metadata, the benefits of standardization, and barriers to overcome. The article also proposes a metadata design and discusses philosophical issues related to data sharing.

hreed


Presentation Transcript


  1. DATA SHARING ISSUES, METADATA, ARCHIVES, AND COMPREHENSION • Urgency • NEESgrid (www.neesgrid.org) schedule: • Characterize the Earthquake Engineering community use of data and metadata: January 2002. • Distribute preliminary metadata standards: May 2002. • Publish standards for data and metadata models and representations by September 2002 • (Prudhomme and Mish, 2001). • Consortium Developer of NEES www.nees.org • Working groups on data issues: looking for interested volunteers

  2. Identify/define uses of data and metadata • To help me remember what I did last time • To permit other researchers to duplicate a test • Real-time remote PI interaction • To allow numerical simulation • Interactive decision making during the experiment • Years after the test • Automated control of the experiment • Visualization • Research and education, sponsors • Data search/query filter • Artificial intelligence, inverse/system identification • Software sharing through a common interface (OpenSees)

  3. Use of data • Data search/query filter • Artificial intelligence, inverse/system identification • Software sharing through a common interface (OpenSees)

  4. Experience of geotech community • CWRU database on element tests • VELACS USC • COSMOS and IRIS • PEER structures databases UCSD, UW • UCD cgm.engr.ucdavis.edu

  5. Other community examples • Atmosphere/ocean research: NCAR, NOAA, Navy • Example of a flux vector interchanged between programs • User-specific API to interface with a “black box” • CORBA – Common Object Request Broker Architecture. A spec for an “object” that may be accessed from many platforms – Java, Fortran, etc. • Fluid flow • Visualization code runs with solver • OpenGL • Generic flux vector • Connection of mismatched meshes (regular and scattered) • Meshing experimental data with numerical data

  6. Data use and format • Think ahead for uses • Needs assessment • Format changes • Visualization of large data sets is demanding • What is data? • Format • Access tools: input and output • Don’t store data twice just because it is in a different format (calibration?)

  7. Formats, coding • Oracle • Flat ASCII • XML

  8. What are the benefits of standardization? • Knowledge of the data format at one facility is transferable to others. • E.g., numerical simulation of tests at CWRU, UCD. • Training of experimenters may be transferable. • User interfaces to databases may be sharable, so we may not each have to develop the interfaces independently. • Search, query, automated I/O, visualization, ...

  9. Barriers to standardization and how to overcome them • Need a “killer app” that assumes a standard • The gap between Civil Engineering and Information Technology.

  10. “Killer App” features • To help me remember what I did last time: automated metadata documentation • To permit other researchers to duplicate test • Real-time remote PI interaction: teleparticipation • To allow numerical simulation • Interactive decision making during experiment • Years after the test • Automated control of the experiment

  11. “Killer App” features (2) • Visualization • Data search/query/access/filter • Web portal for all of the above?

  12. Metadata Design • Determine the structure of metadata to optimize: • Intuitive query language • Readable by computers and humans • Completeness without redundancy • Flexibility and evolution • Curation by NEES SI and Consortium • Write code: XML document type definitions

  13. Strawman metadata structure • Project Identifiers • Catalog of Materials, Objects, Sensors and Apparatus • Sequence of Model Test Events and Measurements • Sensor Channel Gain Lists (1) • Image Data • Control Data Files
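A minimal sketch of how the six strawman sections might be held in one record, assuming Python dataclasses. The class name `ModelTestMetadata`, the field names, and the container types are illustrative assumptions; the slides name only the six section headings.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ModelTestMetadata:
    """Hypothetical container; one field per strawman section, in slide order."""
    project_identifiers: dict = field(default_factory=dict)       # Section 1
    catalog: dict = field(default_factory=dict)                   # Section 2
    event_sequence: list = field(default_factory=list)            # Section 3
    sensor_channel_gain_lists: list = field(default_factory=list) # Section 4
    image_data: list = field(default_factory=list)                # Section 5
    control_data_files: list = field(default_factory=list)        # Section 6


record = ModelTestMetadata()
# Serial number taken from the XML example slide; the "sensors" key is assumed.
record.catalog["sensors"] = ["PCB3245"]

# List the section names in declaration order.
section_names = [f.name for f in fields(record)]
print(section_names)
```

Keeping each section as a separate field is one way to aim for "completeness without redundancy": a sensor is described once in the catalog and only referenced by serial number elsewhere.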

  14. Discussion Items • Philosophical issues related to the culture of data sharing? • Data producer should get first shot at publication • How long should we allow a data generator to ponder before other people can have access? • How do we publish electronic data? • Give academic credit to data publishers

  15. XML
      <ModelTest>
        <Catalog>
          <Sensors>
            <Sensor SN="PCB3245">
              <Type>Piezoelectric Accelerometer</Type>
              <Manufacturer>PCB</Manufacturer>
              <Model>352</Model>
              <CalibrationDate>092899</CalibrationDate>
              <Sensitivity Unit="mV/g">100</Sensitivity>
              <Range>50g</Range>
              <SensorData>http://www.pcb.com/pcb3245</SensorData>
            </Sensor>
          </Sensors>
        </Catalog>
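As a sketch of what "readable to computers" could mean for the fragment above, the following reads the sensor entry with Python's standard `xml.etree.ElementTree`. The function name `read_sensors` and the output dictionary keys are assumptions, and a closing `</ModelTest>` tag is supplied in the embedded copy so that the fragment parses as a complete document.

```python
import xml.etree.ElementTree as ET

# Copy of the slide's fragment. The closing </ModelTest> tag is supplied
# here (an assumption) so the text parses as a complete document.
FRAGMENT = """
<ModelTest>
  <Catalog>
    <Sensors>
      <Sensor SN="PCB3245">
        <Type>Piezoelectric Accelerometer</Type>
        <Manufacturer>PCB</Manufacturer>
        <Model>352</Model>
        <CalibrationDate>092899</CalibrationDate>
        <Sensitivity Unit="mV/g">100</Sensitivity>
        <Range>50g</Range>
        <SensorData>http://www.pcb.com/pcb3245</SensorData>
      </Sensor>
    </Sensors>
  </Catalog>
</ModelTest>
"""


def read_sensors(xml_text):
    """Return one dict per <Sensor> element, keyed by its SN attribute."""
    root = ET.fromstring(xml_text.strip())
    sensors = {}
    for node in root.iter("Sensor"):
        sensitivity = node.find("Sensitivity")
        sensors[node.get("SN")] = {
            "type": node.findtext("Type", default="").strip(),
            "sensitivity": float(sensitivity.text),
            "sensitivity_unit": sensitivity.get("Unit"),
            "data_url": node.findtext("SensorData", default="").strip(),
        }
    return sensors


sensors = read_sensors(FRAGMENT)
print(sensors["PCB3245"])
```

Storing the unit as an attribute, as the slide does, lets a reader of the file check units before using the number.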

  16. There must be nice interfaces to complex data structures. Automatic metadata generator should do most of the work. TEDS (Transducer Electronic Data Sheets), SCEDS, automated geometry definition will make the job do-able.

  17. Discussion Items • At what metadata level do we refer to other archives instead of re-archiving? Example: • Accelerometer amplifier gain for each test event archive • Accelerometer calibration in the test archive • Date and method of calibration in facility archive • Cross-axis sensitivity at manufacturers archive
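One way to picture referring to other archives instead of re-archiving is a lookup that walks from the most specific archive to the most general. This is a sketch under stated assumptions: the level names, the `resolve` function, the field names, and the gain value are hypothetical; only the four-level example (event, test, facility, manufacturer) comes from the slide.

```python
# Hypothetical archive levels, ordered from most specific to most general,
# following the slide's example.
ARCHIVE_LEVELS = ["event", "test", "facility", "manufacturer"]


def resolve(field_name, archives):
    """Return (level, value) from the first level that stores field_name."""
    for level in ARCHIVE_LEVELS:
        record = archives.get(level, {})
        if field_name in record:
            return level, record[field_name]
    raise KeyError(field_name)


# Illustrative contents only. The gain value is made up; the sensitivity,
# calibration date, and URL echo the XML example slide.
archives = {
    "event": {"amplifier_gain": 10.0},
    "test": {"sensitivity_mV_per_g": 100.0},
    "facility": {"calibration_date": "092899"},
    "manufacturer": {"sensor_data_url": "http://www.pcb.com/pcb3245"},
}

print(resolve("amplifier_gain", archives))
print(resolve("sensor_data_url", archives))
```

Each item is stored once, at the level where it changes, and the other levels reach it by reference.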

  18. Strawman metadata hierarchy • Section 1 of the outline in Table 1 contains metadata associated with the research project. • Section 2 is a catalog of physical objects used to construct or test the model. This includes: apparatus used to test the model, passive materials and markers that are placed in the model, and sensors that are used in the model tests.

  19. Strawman metadata hierarchy • Section 5 describes image data. This could include photographs, video camera data, and/or engineering drawings of configuration. • Section 6 describes the data required to control the experiment. This could determine the location of a CPT sounding, the rate of penetration of a penetrometer, or command files to control a shaker.

  20. Strawman metadata hierarchy • Section 3 describes sequencing of events. A sequence can be the measurement of the location of an object, or an event involving activation of an actuator or a penetrometer sounding.  • Section 4 includes the sensor-channel-gain lists; this documents which sensors are plugged into which amplifier channels, and also includes the sequence in which the sensor data was recorded, and parameters that define gains and filters.
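A minimal sketch of how an entry in a sensor-channel-gain list might be used to convert a recorded voltage to engineering units. It assumes a simple linear model (recorded mV = gain × sensitivity × acceleration in g); the class `ChannelEntry`, its fields, the channel number, and the gain value are hypothetical, and filters are ignored.

```python
from dataclasses import dataclass


@dataclass
class ChannelEntry:
    """Hypothetical row of a sensor-channel-gain list."""
    sensor_sn: str               # which sensor is plugged in
    channel: int                 # amplifier channel number
    gain: float                  # amplifier gain for this event
    sensitivity_mV_per_g: float  # sensor sensitivity from the catalog


def to_g(reading_mV, entry):
    """Convert an amplified reading in mV to acceleration in g.

    Assumes reading = gain * sensitivity * acceleration (linear, no filter).
    """
    return reading_mV / (entry.gain * entry.sensitivity_mV_per_g)


# Sensitivity of 100 mV/g is from the XML example slide; channel 3 and
# gain 10 are illustrative values.
entry = ChannelEntry("PCB3245", 3, 10.0, 100.0)
accel_g = to_g(500.0, entry)
print(accel_g)
```

Because gain can change from event to event while sensitivity stays with the sensor, keeping the two in separate places matches the split between Section 4 and the Section 2 catalog.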

  21. • CAD of geometry and instrument location numbers • Printable version of report (pdf) describing experiment and automatically generated data time histories • Excel spreadsheets of metadata • ASCII data files of sensor readings during about 90 simulated earthquakes (about 1 MB each)

  22. Excel spreadsheet describing calibration factors, amplifier channel numbers, gains, data file format, ...

  23. Event BV, page 3 of pdf document - semiautomatic plot generation using MathCAD program, central vertical array of accelerometer data

  24. NEES Collaboratory [Diagram; labels: System Integrator; NEESgrid; Simulation/Experimental Facilities; Site A; Site B; Site C; Othersite 1; Othersite 2; Earthquake Researchers; Educators & Students; Professional Engineers; Other Practitioners; NEES Consortium; NEES Consortium Development; Site Council]

  25. [Network diagram; labels: Imaging; 3D visualization machine_1; 3D visualization machine_2; NEES SGI 16 processor parallel computer; SGI image processor; GSR_1; GSR_2; HDTV camera; Environmental monitoring; OXC (three nodes); 80 PC cluster; UC Davis Research Network; Prototype OLS router (two); To Sacramento, Merced; To Berkeley, Santa Cruz]
