Data Issues for (Engineering) CyberInfrastructure. (What I have learnt from the NEESgrid Data/MetaData Efforts in the past 6 months). Kincho H. Law Professor of Civil and Environmental Engineering (Structural Engineering and Engineering Informatics) Stanford University June 11, 2004
Data Issues for (Engineering) CyberInfrastructure (What I have learnt from the NEESgrid Data/MetaData Efforts in the past 6 months) Kincho H. Law Professor of Civil and Environmental Engineering (Structural Engineering and Engineering Informatics) Stanford University June 11, 2004 NSF ITR/ENG/CI Meeting
Outline of Presentation • Overview of NEES and NEESgrid • (Courtesy of Bill Spencer, Jr. of UIUC) • NEESgrid’s Data and Metadata Efforts • (Courtesy of Chuck Severance of U. Mich) • More on Data Issues
George E. Brown Jr. Network for Earthquake Engineering Simulation (NEES)
• System Integration: University of Illinois at Urbana-Champaign, $10,000,000
• Fast Hybrid Testing Laboratory: University of Colorado, Boulder, $1,983,553
• Multi-Axial Subassemblage Testing: University of Minnesota, Twin Cities, $6,472,049
• Modular Simulation Laboratory System: University of Illinois at Urbana-Champaign, $2,958,011
• Permanently Instrumented Field Sites: Brigham Young University, $1,944,423
• Dual (relocatable) Shake Tables and High Performance Actuators: State University of New York, University at Buffalo, $6,160,785 and $4,379,865
• Tsunami Wave Basin: Oregon State University, $4,775,832
• Three (two relocatable) Shake Tables: University of Nevada, Reno, $4,398,450
• Geotechnical Centrifuge: Rensselaer Polytechnic Institute, $2,380,579
• Lifeline Testing Facility: Cornell University, $2,072,716
• Geotechnical Centrifuge: University of California, Davis, $4,614,294
• Multi-directional Testing Facility: Lehigh University, $2,593,317
• Consortium Development: Consortium of Universities for Research in Earthquake Engineering, $1,999,907
• Field Testing Equipment: University of Texas, Austin, $2,937,036
• Reconfigurable Reaction Wall: University of California, Berkeley, $4,268,323
• Large Uniaxial Shake Table: University of California, San Diego, $5,890,000
• Field Testing Equipment: University of California, Los Angeles, $2,652,761
NEES Resources
• Enabling collaboration (laboratory experiments, field tests, simulations)
• Enabling data sharing, archiving and access by researchers (and public)
Diagram labels: Remote Users (Faculty, Students, Practitioners); Remote Users (K-12 Faculty and Students); Instrumented Structures and Sites; Simulation Tools Repository; High-Performance Network(s); Laboratory Equipment; Field Equipment; Curated Data Repository; Leading Edge Computation; Global Connections (FY 2005 – FY 2014)
Vision of NEES (Collaboratory) NEESgrid
To connect and facilitate experimentation/simulation (virtual collaboratory) in earthquake engineering for the US and the world NEESgrid CyberInfrastructure • Tele-Control Services and APIs • Tele-Observation and Data Visualization • E-Notebook • Streaming data services • DAQ and related services • Data and Metadata Services • Remote Collaboration and Visualization tools and services • Core Grid Services, deployment efforts, packaging • Simulation Component
Data/Metadata Working Group Members
Objective: To define a common approach and tools to enhance the sharing, access and utilization of the NEESgrid data repository
• Jean-Pierre Bardet, University of Southern California
• Jennifer Swift, University of Southern California
• Andrei Reinhorn, State University of New York, Buffalo (Data Sharing and Archiving Committee, DSAC)
• Ken Ferschweiler, Northwest Alliance for Computational Science and Engineering, Oregon State Univ.
• Lelli Van Den Einde, University of California at San Diego
• Gokhan Pekcan, University of Nevada, Reno (Coordinator of the Task Group)
• Hank Ratzesberger, University of California, Santa Barbara
• Chuck Severance, University of Michigan
• Bill Spencer, University of Illinois, Urbana-Champaign (PI of NEES’s System Integration Project)
• Jim Eng, University of Michigan
• Jun Peng, Stanford University
• Kincho H. Law, Stanford University
NEESgrid Data – Core Elements Data Acquisition NEESpop Data/MD Ingest Tools API Local Repository Grid and Web Services Data Teamlets API Workstation NEESdata Data tools Central Repository Data Teamlets API Data viewers
Boxology Data Models E-Notebook Central Repository NEES Grid Data Approach Local Repository Experiment Management Data Acquisition Data Analysis Experiment Monitoring
Overall Data Modeling Efforts NEES Site Site A Site B Site C Specifications Database Equipment People Equipment People ProjectDescription Trials Experiments Experiments Trials Domain Tsunami Shake Table Centrifuge Geotech Specific Specimen Specimen Specimen Specimen models Common Units Sensors Elements Descriptions Data / Data Data Data Observations
Multiple Models Site Site Model Project Model Proj Person Facility Exp Equipment Trial Specimen Notebook Sensor Element Element Chapter Entry
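The Project model on this slide can be sketched as nested records. This is only an illustrative sketch: the class names follow the slide's labels (Project, Experiment, Trial, Sensor), but the fields, the nesting, and the sample values are assumptions, not the NEESgrid schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Sensor:
    name: str   # hypothetical identifier, e.g. "accel-01"
    units: str  # measurement units, e.g. "g"


@dataclass
class Trial:
    label: str
    sensors: List[Sensor] = field(default_factory=list)


@dataclass
class Experiment:
    title: str
    specimen: str
    trials: List[Trial] = field(default_factory=list)


@dataclass
class Project:
    name: str
    site: str  # assumed cross-model reference to an entry in the Site model
    people: List[str] = field(default_factory=list)
    experiments: List[Experiment] = field(default_factory=list)

    def sensor_names(self) -> List[str]:
        """Collect sensor names across every trial of every experiment."""
        return [s.name
                for e in self.experiments
                for t in e.trials
                for s in t.sensors]


# Made-up sample data, for illustration only.
project = Project(
    name="Demo shake table project",
    site="Site A",
    people=["PI"],
    experiments=[Experiment(
        title="Frame test",
        specimen="two-story frame",
        trials=[Trial("trial-1",
                      [Sensor("accel-01", "g"), Sensor("lvdt-01", "mm")])],
    )],
)
print(project.sensor_names())
```

Keeping `site` as a plain reference rather than an embedded object mirrors the idea of separate Site and Project models that point at each other.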
Prototype Data Model (Shake table test) SiteSpecific ExperimentalSetup ProjectRelated ExperimentalElement DataElement (Protégé 2.0)
Data Models • Data models are described in RDF (Resource Description Framework) • Local repository supports multiple data models with cross-model relationships defined • Project Browser, Notebook Browser, Site Specification Database Browser are being developed
Repo Data Load RDF
<owl:ObjectProperty rdf:ID="hasPublications">
  <rdfs:domain>
    <owl:Class>
      <owl:unionOf rdf:parseType="Collection">
        <owl:Class rdf:about="#Project"/>
        <owl:Class rdf:about="#Task"/>
      </owl:unionOf>
    </owl:Class>
  </rdfs:domain>
  <rdfs:range rdf:resource="#Publications"/>
</owl:ObjectProperty>
Configure Models RDF/OWL Configure
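To make the hasPublications fragment concrete, the sketch below reads it with Python's standard XML parser and reports the property's domain classes and range. The enclosing rdf:RDF root and the function name `describe_properties` are assumptions added for illustration; a real repository would more likely use a dedicated RDF/OWL toolkit.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
NS = {"rdf": RDF, "rdfs": RDFS, "owl": OWL}

# The slide's fragment, wrapped in an assumed rdf:RDF root so that the
# namespace prefixes are declared and the text parses as XML.
DOC = f"""
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:ObjectProperty rdf:ID="hasPublications">
    <rdfs:domain>
      <owl:Class>
        <owl:unionOf rdf:parseType="Collection">
          <owl:Class rdf:about="#Project"/>
          <owl:Class rdf:about="#Task"/>
        </owl:unionOf>
      </owl:Class>
    </rdfs:domain>
    <rdfs:range rdf:resource="#Publications"/>
  </owl:ObjectProperty>
</rdf:RDF>
"""


def describe_properties(xml_text):
    """Map each owl:ObjectProperty ID to its named domain classes and range."""
    root = ET.fromstring(xml_text)
    result = {}
    for prop in root.findall("owl:ObjectProperty", NS):
        name = prop.get(f"{{{RDF}}}ID")
        # Named classes anywhere under rdfs:domain; the anonymous union
        # class has no rdf:about attribute and is skipped.
        domain = [c.get(f"{{{RDF}}}about")
                  for c in prop.findall("rdfs:domain//owl:Class", NS)
                  if c.get(f"{{{RDF}}}about")]
        range_el = prop.find("rdfs:range", NS)
        rng = range_el.get(f"{{{RDF}}}resource") if range_el is not None else None
        result[name] = {"domain": domain, "range": rng}
    return result


print(describe_properties(DOC))
```

Read this way, the fragment says that a Project or a Task may have Publications.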
SiteSpecific ExperimentalSetup DAQ ProjectRelated C D ExperimentalElement DataElement NEESgrid Experiment Data Flow Project Browser Data Ingestion Experiment Control Data Model NEESGrid Data Repository Data Turbine Stored Viewer Streaming Viewer DAQ Disk
Data Extraction For Analysis Time Step Channel xyz Start Time Step 1 End Time Step 9999 Data Extraction NEES Data Repository Pseudo-Dynamic Continuous Export Auto Export DT Main System
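The extraction parameters on this slide (a channel, a start time step, an end time step) suggest a simple windowed query. The sketch below is a hypothetical illustration over in-memory records; the record layout, the function name `extract_channel`, and the sample values are assumptions, not the NEESgrid repository API.

```python
from typing import Dict, List, Tuple


def extract_channel(records: List[Dict[str, float]],
                    channel: str,
                    start_step: int,
                    end_step: int) -> List[Tuple[int, float]]:
    """Return (step, value) pairs for one channel, both bounds inclusive."""
    if start_step > end_step:
        raise ValueError("start_step must not exceed end_step")
    return [(int(r["step"]), r[channel])
            for r in records
            if channel in r and start_step <= r["step"] <= end_step]


# Made-up data: ten time steps with two channels named "xyz" and "abc".
records = [{"step": i, "xyz": i * 0.5, "abc": -float(i)} for i in range(1, 11)]

window = extract_channel(records, "xyz", 3, 5)
print(window)
```

Asking for steps 1 through 9999, as on the slide, would simply return every available step for the channel, since steps outside the recorded range are ignored.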
More on Data Issues….. Current Developments and Tools • Help experimenters and researchers generate data and ingest it into the repository • Allow browsing (and limited searching) of data/information about a project or experiment • Support the “No data generated by NEES would be lost” policy But ….. “A data repository is only as good as what can be done with the data.”
Data Life Cycle Data Consumption Data Production Data Management • For how long? • What format? (standardization, usability) • Heterogeneous data types (text, CAD, video, streams) • Raw, processed, derived, curated data • Non-proprietary data format and software (what to archive?) • What to preserve? Everything? Selectively? Need: data policies and short/long term plans for the cyberinfrastructure community (not just the cyberinfrastructure but the community it serves!)
Consumers for the NEESgrid Data Repository: • Experimenters – about generating reports, analysis of results • Practitioners – about specific results for a specific experiment on a specific component (beam-column joint) • Researchers – about prior works for developing new ideas (new, novel active damping/sensing devices) • Colleagues in other fields – data interoperability and sharing (IRIS’s seismological data sets) • Faculty – developing teaching materials • Students – discovering new knowledge, mining archived data • K-12 audience – about earthquakes and our civil infrastructure systems
? ? NEES Consortium OAIS (Open Archival Information System) Functional Model Current Focus and Deliverables http://ssdoo.gsfc.nasa.gov/nost/isoas/
Capture Data/Metadata Throughout Data Lifecycle Data Models Experiment Prep Experiment Management Data Monitoring Data Analysis Data Publishing Data Curation Data Discovery and Reuse
What have I learnt about NEESgrid (CI) data issues in the last 6 months? • The amount of data (in a wide variety of formats) to be generated will be overwhelming – standardization is hard but some standardization is necessary • Expectations of what the data models and repository can support are very high (and may not be easy to meet in the short term) • Putting data in a computer (meaningfully) IS NOT an easy problem
What have I learnt about NEESgrid (CI) data issues in the last 6 months? • So much to learn among different disciplines – engineering, digital library, digital archive and preservation, library and information science, information and communication technologies, …. • Many related ongoing works: SkyServer, DSpace, ICPSR, IRIS, …..
What have I learnt about NEESgrid (CI) data issues in the last 6 months? • Strong leadership and management team (in addition to technology developers) • Dedication of community participants from the domain of concern • Policies and guidelines need to be well defined by the community and implemented
Rapidly Deployable CI and Data Problem Source: A Proposal on “Innovative Use of IT in Post-Disaster Investigation,” EERI, Feb 2004
Acknowledgments • NEES’s Data and Metadata Task Group • NEES’s Data Sharing and Archiving Committee • Joy Pauschke of NSF, Bill Spencer of UIUC, Chuck Severance of U. Michigan, Joe Futrelle of UIUC, Gokhan Pekcan of U. Nevada, Reno, Jun Peng of Stanford University, and many others ….. • Any opinions expressed in this presentation are those of the presenter and do not necessarily reflect the views of NEESgrid, the National Science Foundation, or his collaborators.
Thank You Comments and Suggestions