An Integrated Environmental Observatory Cyberenvironment
Barbara Minsker
Director, Environmental Engineering, Science, & Hydrology Group, National Center for Supercomputing Applications;
Principal Investigator and co-Director, CLEANER Project Office;
Associate Professor, Dept of Civil & Environ. Engineering;
University of Illinois, Urbana, IL, USA
November 16, 2006
National Center for Supercomputing Applications
Environmental Cyberinfrastructure Demonstration (ECID) Project
• NSF Office of Cyberinfrastructure is funding NCSA and SDSC to:
  • Work with leading edge communities to develop cyberinfrastructure to support science and engineering
  • Incorporate successful prototypes into a persistent cyberinfrastructure
• NCSA’s primary focus: Cyberenvironments
• As part of this effort, the ECID project, led by Jim Myers & Barbara Minsker, is working with the WATERS community and CUAHSI Hydrologic Information System (HIS) project to create a prototype cyberenvironment for environmental observatories
• Driven by requirements gathering and close community collaborations
Requirements Gathering
• Interviews at conferences and meetings (Tom Finholt and staff, U. of Michigan)
• Usability studies (CET, Wentling group)
• Community survey (Finholt group)
  • AEESP and CUAHSI surveyed in 2006 as proxies for environmental engineering and hydrology communities
  • 313 responses out of 600 surveys mailed (52.2% response rate)
• Key findings are driving ECID cyberenvironment development
What is the single most important obstacle to using data from different sources?
• Nonstandard/inconsistent units/formats
• Metadata problems
• Other obstacles
Shows a need for an integrated cyberenvironment with provenance.
• 55% concerned about insufficient credit for shared data
• N=278
What three software packages do you use most frequently in your work?
• *Other:
  • MS Word
  • MS PowerPoint
  • Statistics applications (e.g., Stata, R, S-Plus)
  • SigmaPlot
  • PHREEQC
  • MathCAD
  • FORTRAN compiler
  • Mathematica
  • GRASS GIS
  • Groundwater models
  • Modflow
Majority are not using high-end computational tools.
Factors influencing technology adoption
Ease of use, good support, and new capabilities are essential.
What are the three most compelling factors that would lead you to collaborate with another person in your field?
Community seeks collaborations to gain different expertise.
Environmental CI Architecture: Research Services
[Diagram: the research process mapped onto integrated CI services and supporting technology]
• Research Process: Create Hypothesis, Obtain Data, Analyze Data &/or Assimilate into Model(s), Link &/or Run Analyses &/or Model(s), Discuss Results, Publish
• Integrated CI: Data Services, Workflows & Model Services, Knowledge Services, Meta-Workflows, Collaboration Services, Digital Library
• Marked regions: HIS Project Focus; ECID Project Focus: Cyberenvironments
Cyberenvironments
• Couple traditional desktop computing environments with the resources and capabilities of a national cyberinfrastructure
• Provide unprecedented ability to access, integrate, automate, and manage complex, collaborative projects across disciplinary and geographical boundaries
• ECID is demonstrating how cyberenvironments can:
  • Support observatory sensor and event management, workflow and scientific analyses, and knowledge networking, including provenance information to track data from creation to publication
  • Provide collaborative environments where scientists, educators, and practitioners can acquire, share, and discuss data and information
ECID CyberEnvironment Components
• CyberCollaboratory: Collaborative Portal
• CI-KNOW: Network Browser/Recommender
• CyberIntegrator: Exploratory Workflow Integration
• CUAHSI HIS Data Services
• Tupelo Metadata Services
• Single Sign-On (SSO) Security (coming)
• Community Event Management/Processing
CyberCollaboratory
• The CyberCollaboratory is a web portal to allow sharing of information and ideas across the community.
• Currently being used by the CLEANER Project Office
To check out the public view of CyberCollaboratory, create an account at http://cleaner.ncsa.uiuc.edu/cybercollab
CyberIntegrator
• Studying complex environmental systems requires:
  • Coupling analyses and models
  • Real-time, automated updating of analyses and modeling with diverse tools
• CyberIntegrator is a prototype technology to support exploratory modeling and analysis of complex systems. It integrates the following tools to date:
  • Excel
  • Im2Learn image processing and mining tools, including ArcGIS image loading
  • D2K data mining
  • Java codes, including event management tools
• Additional tools will be added, based on high-priority needs of beta users. Some options: ArcGIS, OpenMI model integration, Fortran codes, Matlab, Kepler, generic tools for running executables and Web services
CyberIntegrator Architecture
Example of CyberIntegrator Use:
Carrie Gibson created a fecal coliform prediction model in ArcGIS using Model Builder that predicts annual average concentrations.
Ernest To rewrote the model as a macro in Excel to perform Monte Carlo simulation to predict median and 90th percentile values.
CyberIntegrator’s goal: Reduce manual labor in linking these tools, visualizing the results, and updating in real time.
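The Monte Carlo step in the example above can be sketched in Python instead of an Excel macro. This is only an illustration: the slide does not give the model's equations, so the lognormal streamflow distribution, the fixed daily load, and the names `simulate_concentrations` and `summarize` are all assumptions.

```python
import random
import statistics


def simulate_concentrations(n_trials, load_cfu_per_day, flow_mu, flow_sigma, seed=0):
    """Draw streamflows from a lognormal distribution (an assumption) and
    convert a fixed bacterial load into a concentration for each draw."""
    rng = random.Random(seed)  # seeded so repeated runs give the same draws
    concentrations = []
    for _ in range(n_trials):
        flow_m3_per_day = rng.lognormvariate(flow_mu, flow_sigma)
        # load / flow gives CFU per m^3; 1 m^3 holds 10,000 units of 100 mL
        cfu_per_100ml = load_cfu_per_day / flow_m3_per_day / 10_000
        concentrations.append(cfu_per_100ml)
    return concentrations


def summarize(concentrations):
    """Return the median and 90th percentile of the simulated values."""
    median = statistics.median(concentrations)
    # quantiles(n=10) returns nine cut points; the last is the 90th percentile
    p90 = statistics.quantiles(concentrations, n=10)[-1]
    return median, p90


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to make the sketch run.
    values = simulate_concentrations(10_000, 1e12, flow_mu=11.0, flow_sigma=0.8)
    print(summarize(values))
```

Reporting the median and 90th percentile rather than a single annual average is what the Monte Carlo rewrite adds: each trial yields one plausible concentration, and the summary statistics describe the spread across trials.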
Real-Time Simulation of Copano Bay TMDL with CyberIntegrator
[Workflow diagram: CyberIntegrator calls an Excel Executor and an Im2Learn Executor and passes data between them]
Inputs:
• USGS Daily Streamflows (web services)
• Shapefiles for Copano Bay
Steps:
1. Streamflows to Distributions (Excel)
2. Fecal Coliform Concentrations Model (Excel)
3. Load Shapefiles (Im2Learn)
4. Geo-reference and Visualize Results (Im2Learn)
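The four-step chain on this slide can be sketched as a list of named steps whose outputs feed later steps. This is a hypothetical illustration, not CyberIntegrator's API: `Step`, `run_workflow`, `COPANO_STEPS`, and the stand-in functions are invented here, and nothing in the sketch calls Excel, Im2Learn, or the USGS web services.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Step:
    name: str
    executor: str            # label only, e.g. "Excel" or "Im2Learn"
    inputs: List[str]        # names of values this step reads
    output: str              # name under which the result is stored
    func: Callable[..., Any]


def run_workflow(steps: List[Step], data: Dict[str, Any]) -> Dict[str, Any]:
    """Run steps in order, passing named results from one step to the next."""
    results = dict(data)
    for step in steps:
        args = [results[name] for name in step.inputs]
        results[step.output] = step.func(*args)
    return results


# Stand-in functions: placeholders for the real Excel and Im2Learn tools.
def flows_to_distribution(flows):
    return {"mean": sum(flows) / len(flows), "min": min(flows), "max": max(flows)}


def concentration_model(distribution):
    # Placeholder relationship, not the actual fecal coliform model.
    return {"median_conc": 1000.0 / distribution["mean"]}


def load_shapefiles(path):
    return {"layer": path}


def georeference(concentration, layer):
    return {"layer": layer["layer"], "median_conc": concentration["median_conc"]}


COPANO_STEPS = [
    Step("Streamflows to Distributions", "Excel",
         ["streamflows"], "distribution", flows_to_distribution),
    Step("Fecal Coliform Concentrations Model", "Excel",
         ["distribution"], "concentration", concentration_model),
    Step("Load Shapefiles", "Im2Learn",
         ["shapefile_path"], "layer", load_shapefiles),
    Step("Geo-reference and Visualize Results", "Im2Learn",
         ["concentration", "layer"], "map", georeference),
]
```

Calling `run_workflow(COPANO_STEPS, {"streamflows": [...], "shapefile_path": "..."})` reruns the whole chain whenever new streamflow values arrive, which is the kind of manual relinking the slide says CyberIntegrator aims to remove.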
Sensor Anomaly Detection Scenario
[Diagram components: CC Bay Sensor Monitor Page, CCBay Sensor Map, Anomaly Detector 1, Anomaly Detector 2, Event Manager, Dashboard, CI-KNOW Network, CyberIntegrator; flows of sensor data and anomalies]
• Listens for data events & creates event when anomaly discovered.
• User subscribes to anomaly detector workflows.
• Alerts user to anomaly detection, along with other events (logged-in users, new documents, etc.)
• Shares workflow to server.
• Sensor map shows nearby related sensors so user can check data.
• Anomaly detector is faulty. CI-KNOW recommends alternate anomaly detector from Chesapeake Bay observatory.
• CyberIntegrator loads recommended workflow. User adjusts parameters to CCBay Sensor.
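The listen, detect, and alert pattern in this scenario can be sketched with a small in-process publish/subscribe hub. Everything here is an assumption for illustration: the class names, the topic names `sensor.data` and `sensor.anomaly`, and the rolling-window z-score rule are not taken from the ECID software, which the slides describe only at the diagram level.

```python
from collections import defaultdict
from statistics import mean, stdev


class EventManager:
    """Tiny in-process publish/subscribe hub (stand-in for a message broker)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        for callback in list(self._subscribers[topic]):
            callback(event)


class AnomalyDetector:
    """Publishes an anomaly event when a reading lies more than `threshold`
    standard deviations from the mean of the previous `window` readings."""

    def __init__(self, events, window=5, threshold=3.0):
        self.events = events
        self.window = window
        self.threshold = threshold
        self.history = []
        events.subscribe("sensor.data", self.on_data)

    def on_data(self, reading):
        recent = self.history[-self.window:]
        if len(recent) == self.window:
            mu, sigma = mean(recent), stdev(recent)
            # A flat window (sigma == 0) is skipped in this simple sketch.
            if sigma > 0 and abs(reading["value"] - mu) > self.threshold * sigma:
                self.events.publish("sensor.anomaly", reading)
        self.history.append(reading["value"])


if __name__ == "__main__":
    events = EventManager()
    # The "dashboard" here is just a print subscriber.
    events.subscribe("sensor.anomaly", lambda r: print("ALERT:", r))
    AnomalyDetector(events)
    for value in [10.0, 10.1, 9.9, 10.2, 9.8, 25.0, 10.0]:
        events.publish("sensor.data", {"sensor": "ccbay-1", "value": value})
```

Because detectors and dashboards only share topic names, a faulty detector can be swapped for a recommended alternative by subscribing a different object to the same data topic, without changing the subscribers that receive alerts.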
Demonstrations…
[Architecture diagram: Cyberenvironment Technologies. Recoverable labels:]
• CyberDashboard Desktop Application: Raw Data and Anomaly Subscription (JMS)
• JMS Broker (ActiveMQ 4.0.1): Data Subscriptions; Data and Anomaly Subscriptions; Anomaly Publication
• Sensor Page Reference: CyberCollaboratory URL
• Workflow Reference: CyberIntegrator Workflow URL
• Workflow Service: CyberIntegrator Workflow
• CyberIntegrator: Workflow Publication/Retrieval Web Services (SOAP)
• CI-KNOW: Recommender Network Web Service (SOAP)
• ECID Managed Data/Metadata (Tupelo, RDBMS): Provenance, User Subscriptions, Workflow Templates, Semantic Content, Event Topics
• Flows: Metadata, Data, Anomalies
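The provenance element of the managed metadata can be illustrated with a minimal record of which artifact was derived from which. This is a sketch under stated assumptions: the subject-predicate-object layout, the predicate names `generatedBy` and `derivedFrom`, and the `ProvenanceStore` class are hypothetical and do not represent Tupelo's schema or API.

```python
from collections import namedtuple

Triple = namedtuple("Triple", "subject predicate obj")


class ProvenanceStore:
    """In-memory list of provenance statements (hypothetical layout)."""

    def __init__(self):
        self.triples = []

    def record(self, output, workflow_step, inputs):
        """Record that `output` was produced by `workflow_step` from `inputs`."""
        self.triples.append(Triple(output, "generatedBy", workflow_step))
        for source in inputs:
            self.triples.append(Triple(output, "derivedFrom", source))

    def lineage(self, artifact):
        """Return every upstream artifact, nearest first, without duplicates."""
        seen, queue = [], [artifact]
        while queue:
            current = queue.pop(0)
            for t in self.triples:
                if (t.subject == current and t.predicate == "derivedFrom"
                        and t.obj not in seen):
                    seen.append(t.obj)
                    queue.append(t.obj)
        return seen


if __name__ == "__main__":
    store = ProvenanceStore()
    # Artifact names are illustrative only.
    store.record("flow_distribution", "Streamflows to Distributions",
                 ["usgs_daily_streamflows"])
    store.record("coliform_estimate", "Fecal Coliform Concentrations Model",
                 ["flow_distribution"])
    store.record("result_map", "Geo-reference and Visualize Results",
                 ["coliform_estimate", "copano_bay_shapefiles"])
    print(store.lineage("result_map"))
```

Walking the `derivedFrom` links back from a published result is one way to track data from creation to publication and to identify whose shared data contributed to it.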
ECID & WATERS Testbeds
• Two technologies in the Cyberenvironment are ready for beta testing: CyberCollaboratory and CyberIntegrator
• We invite WATERS testbed projects to become beta testers:
  • Use the beta software starting January 1st. We will work with you to create CyberIntegrator executors for your tools (do it yourself with our open source code or we’ll do it)
  • Provide feedback to our developers on usability and functionality
  • Ongoing software modifications will be made in response to feedback
• To date, 4 projects agreed to participate in beta testing
  • 3 WATERS testbeds: Corpus Christi Bay, Chesapeake Bay, Utah
  • UNESCO IHE researchers
• Interested in joining the beta testing?
  • See Luigi Marini for more details, or e-mail him at lmarini@ncsa.uiuc.edu
Conclusions
• The ECID Cyberenvironment demonstrates the benefits of end-to-end integration of cyberinfrastructure and desktop tools, including:
  • HIS-type data services
  • Workflow
  • Event management
  • Provenance and knowledge management, and
  • Collaboration
  for supporting environmental researchers, educators, and outreach partners (e.g., policy makers)
• This creates a powerful system for linking observatory operations with flexible, investigator-driven research in a community framework (i.e., the national network).
  • Workflow and knowledge management support testing hypotheses across observatories
  • Provenance supports QA/QC and rewards for community contributions in an automated fashion.
Acknowledgments
• Contributors:
  • NCSA ECID team (Peter Bajcsy, Noshir Contractor, Steve Downey, Joe Futrelle, Hank Green, Rob Kooper, Yong Liu, Luigi Marini, Jim Myers, Mary Pietrowicz, Tim Wentling, York Yao, Inna Zharnitsky)
  • Corpus Christi Bay Testbed team (PIs: Jim Bonner, Ben Hodges, David Maidment, Paul Montagna)
• Funding sources:
  • NSF grants BES-0414259, BES-0533513, and SCI-0525308
  • Office of Naval Research grant N00014-04-1-0437