Cyberenvironments: Adaptive Middleware for Scientific Cyberinfrastructure
ARM’07
Jim Myers, Bob McGrath
jimmyers@ncsa.uiuc.edu
National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign

Outline
• What’s Changing in Science?
• What Role should Cyberinfrastructure (CI) play?
• Requirements and Design for Cyberenvironments: Adaptive/Reflective Techniques
• Some Examples
• Conclusions

How is Science Changing?
• Quantitative Modeling and Simulation
• Better Data (e.g. Higher Signal to Noise)
• More Data (e.g. High Throughput)
• Closer ties between research and application
• Investigation of subtle, non-linear, multi-dimensional phenomena
• Statistical analysis of complex systems

Supporting the Research Lifecycle…
[Figure: research lifecycle diagram. Labels: Standards/Best practice; Algorithms/Services; Reference Data; Apply; Curate; Engineering Views; Gap Analysis; Analyze; Publish; Provenance; Annotation; Experiment Design; Project Execution. Inset labels: Dq1, Dq2, Valid Range.]

‘Amdahl’s Law’ for Scientific Progress!
[Figure labels: Data production; Processing power; Data transfer/storage; Data discovery; Translation; Experiment setup; Group coordination; Tool integration; Training; Feature Extraction; Data interpretation; Acceptance of new models/tools; Dissemination of best practices; Interdisciplinary communication.]

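The slide title can be read as an analogy to Amdahl's Law from parallel computing: accelerating only some steps of the research process (say, data production and processing power) gives a limited overall gain when the remaining steps dominate elapsed time. A minimal sketch of that arithmetic follows; the 30% fraction and the 100x factor are illustrative assumptions, not figures from the talk.

```python
def overall_speedup(accelerated_fraction: float, factor: float) -> float:
    """Amdahl's law: speedup of the whole when only part of it gets faster."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


# Hypothetical project: 30% of elapsed time is data production and processing;
# the other 70% is discovery, translation, coordination, training, and so on.
# Making the 30% part 100x faster barely moves the total.
print(round(overall_speedup(0.30, 100.0), 2))  # prints 1.42
```

Even an infinitely fast accelerated part would cap the gain at 1 / 0.7, roughly 1.43x, under these assumed fractions.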
CI versus the Literature/Out of Band Processes?
• Higher Fidelity, Multiple Levels of Description
• Custom Views
• Actionable, Faster, Automatable
• But software is rigid relative to text…
• CI must be built before the parts are done
• It must be evolvable by independent parties
• It must enable coordination without central control
• It must allow science to evolve / progress (no fixed domain model)
• Researchers/educators must be able to work in multiple communities/value chains (across CI projects)
• It must convey knowledge as well as tools to end users
• It must align the interests of CI funders, developers, providers, users, …

Key Cyberenvironment Design Concepts
• Explicit Separation of How from What:
  • Content (type, global IDs, …) and Conceptual Context (metadata…)
  • Process (workflow, provenance, …)
  • Virtual Organizations/Social Networks (policies, resources, semantics, translation)
  • GUI Integration (portals, rich clients, …)
  • …
• Ability to pass information through components that don’t understand the details (everything is data)…
…e-Science, Semantic Grid, Cyberenvironments, Web 2.0 …
…intelligence at the edges…

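The "everything is data" point can be illustrated with a component that acts on the one thing it understands and forwards everything else untouched. This is only a sketch under assumptions: `DataItem`, `add_checksum`, the `urn:example:` identifier, and the metadata keys are hypothetical names chosen for illustration, not part of any system described in the talk.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class DataItem:
    """Content plus a global ID and open-ended metadata (hypothetical shape)."""
    global_id: str
    content: bytes
    metadata: Dict[str, Any] = field(default_factory=dict)


def add_checksum(item: DataItem) -> DataItem:
    """Add a SHA-256 entry; pass all other metadata through unchanged."""
    metadata = dict(item.metadata)  # copy every key, understood or not
    metadata["sha256"] = hashlib.sha256(item.content).hexdigest()
    return DataItem(item.global_id, item.content, metadata)


# The component knows nothing about "fragility_model", yet the entry survives.
original = DataItem(
    global_id="urn:example:dataset-1",
    content=b"hypothetical payload",
    metadata={"fragility_model": "hypothetical-v1"},
)
stamped = add_checksum(original)
print(sorted(stamped.metadata))  # prints ['fragility_model', 'sha256']
```

Because the component copies unknown keys rather than filtering to a fixed schema, downstream tools at the edges can still interpret them.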
Examples: MAEviz (Consequence-Based Risk Management for Seismic Events)
Mid-America Earthquake Center
• Engineering View of MAE Center Research
• Portal-based Collaboration Environment
• Distributed Data/metadata Sources
• Multi-disciplinary Collaboration
[Figure labels: Hazard Definition; Inventory Selection; Fragility Models; Damage Prediction; Decision Support.]

Examples: CyberIntegrator
• Exploratory workflow (macro-recording)
• Simple integration with Matlab, Excel, Fortran, etc.
• Provenance tracking
• Distributed, shared data access (HIS, WebDAV, …)
• Remote Execution
• Workflow/model publication
• Metadata and Annotation of data, modules, workflows

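The combination of exploratory workflow (macro-recording) and provenance tracking listed above might look roughly like the following sketch. It is not CyberIntegrator's design or API: `ExploratorySession`, `scale`, and `clip` are hypothetical names, and the sketch assumes a simple linear chain in which each step consumes the previous step's output.

```python
from typing import Any, Callable, Dict, List, Tuple


class ExploratorySession:
    """Runs tools interactively while recording what was done."""

    def __init__(self) -> None:
        self.provenance: List[Dict[str, Any]] = []
        self._steps: List[Tuple[Callable[..., Any], Dict[str, Any]]] = []

    def run(self, tool: Callable[..., Any], data: Any, **params: Any) -> Any:
        """Execute one step and record the tool, parameters, input, and output."""
        result = tool(data, **params)
        self._steps.append((tool, params))
        self.provenance.append(
            {"tool": tool.__name__, "params": params, "input": data, "output": result}
        )
        return result

    def as_workflow(self) -> Callable[[Any], Any]:
        """Turn the recorded steps into a reusable function (linear chain only)."""
        steps = list(self._steps)

        def workflow(data: Any) -> Any:
            for tool, params in steps:
                data = tool(data, **params)
            return data

        return workflow


def scale(values: List[float], factor: float = 1.0) -> List[float]:
    return [v * factor for v in values]


def clip(values: List[float], upper: float = 0.0) -> List[float]:
    return [min(v, upper) for v in values]


session = ExploratorySession()
scaled = session.run(scale, [1.0, 2.0, 3.0], factor=2.0)
clipped = session.run(clip, scaled, upper=5.0)
workflow = session.as_workflow()      # the explored steps, now replayable
print(workflow([1.0, 4.0]))           # prints [2.0, 5.0]
```

The recorded `provenance` list answers "how was this result produced?", while `as_workflow` corresponds loosely to publishing the explored steps for reuse on new data.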
Examples: CyberCollaboratory Portal
• Group Spaces
• Library, discussion, announcements, wiki, …
• Simplified invitation
• Email integration
• Provenance tracking/social network analysis
• …

Content & VO Aware
[Figure labels: Secure Enterprise Data; Desktop; Public Reference Data; Data/Metadata; Check VO and personal preferences; Translate; Virtual Data (from Recipes).]

Process Aware
[Figure labels: Publish/Discover; Process Capture; Execute; Retrieve Data; Retrieve Code.]

Dynamic
[Figure labels: New Third-Party Analyses (Forms, Visualizations); Compare, Contrast, Validate; Auto-update; MAEviz; GIS; Workflow; Data; Eclipse RCP Plug-in Framework.]

Social/Conceptual Context
• Capture of Interactions in Portal and in the Literature
• Capture of Annotations/Associations
• Provide Browsing and Recommender Interfaces

What do Cyberenvironments/CI for scientific discourse have to do with ARM?
• Thesis: the principles of ARM are critical design patterns for viable CEs
• Abstract services
  • NSF CI, Grid: resource management, authentication, etc.
  • Support for science process (e.g., virtual organizations)
  • RCP and other component frameworks for composing software
• Expose metadata
  • Generic content management
  • Generic process management
  • Open metadata using RDF
• Instrumentation
  • Universal capture of provenance, annotation

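"Open metadata using RDF" means metadata is expressed as subject, predicate, object statements that generic tools can store and query without knowing a domain's vocabulary. The sketch below models triples as plain Python tuples rather than using an RDF library. The Dublin Core and FOAF namespace URIs are the published ones; everything under `http://example.org/` (the dataset, the person, the `fragilityCurve` property) is a hypothetical placeholder.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core element set
FOAF = "http://xmlns.com/foaf/0.1/"       # Friend of a Friend vocabulary
EX = "http://example.org/"                # hypothetical namespace

triples: Set[Triple] = {
    (EX + "dataset/42", DC + "title", "Hypothetical building inventory"),
    (EX + "dataset/42", DC + "creator", EX + "person/alice"),
    (EX + "person/alice", FOAF + "name", "Alice Example"),
    # A domain-specific statement that generic tools need not understand:
    (EX + "dataset/42", EX + "fragilityCurve", EX + "curve/7"),
}


def about(store: Set[Triple], subject: str) -> List[Triple]:
    """All statements about a subject, whatever vocabulary they use."""
    return sorted(t for t in store if t[0] == subject)


def objects(store: Set[Triple], subject: str, predicate: str) -> List[str]:
    """Values of one property of a subject."""
    return sorted(o for s, p, o in store if s == subject and p == predicate)


print(objects(triples, EX + "dataset/42", DC + "title"))
print(len(about(triples, EX + "dataset/42")))  # prints 3
```

A generic content manager can call `about` to display or forward every statement, while a domain tool picks out only the predicates it recognizes.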
A Reflective Model
• VO manager separate from App and CI developers
• Can move from local to grid/web solutions w/o app changes
• Semantic middleware as scalable communication layer…
• Open Provenance Model, FOAF, DC, … as common conventions
[Figure labels: What needs to be done; Which component(s) can do the work?; What does the component need to know?; Where can the information be found?; What can the component add to the story?]

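One way to picture the questions in this reflective model is a registry in which each component declares what it does, what it needs to know, and what it adds, so that the choice of component is made from metadata at run time. This is a sketch under assumptions, not the middleware described in the talk: `Component`, `candidates`, `perform`, and the damage-prediction entries are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List

Context = Dict[str, object]


@dataclass
class Component:
    name: str
    does: str                          # what needs to be done
    needs: FrozenSet[str]              # what the component needs to know
    adds: FrozenSet[str]               # what it can add to the story
    run: Callable[[Context], Context]


def candidates(task: str, registry: List[Component], context: Context) -> List[Component]:
    """Which components can do the work, given what is currently known?"""
    known = set(context)
    return [c for c in registry if c.does == task and c.needs <= known]


def perform(task: str, registry: List[Component], context: Context) -> Context:
    """Run the first suitable component and merge what it adds into the context."""
    options = candidates(task, registry, context)
    if not options:
        raise LookupError(f"no component can perform {task!r} with the known information")
    chosen = options[0]
    return {**context, **chosen.run(context), "performed_by": chosen.name}


def estimate(ctx: Context) -> Context:
    return {"damage": f"estimate for {ctx['inventory']} under {ctx['hazard']}"}


registry = [
    Component("grid_damage", "predict_damage",
              frozenset({"hazard", "inventory", "grid_credential"}),
              frozenset({"damage"}), estimate),
    Component("local_damage", "predict_damage",
              frozenset({"hazard", "inventory"}),
              frozenset({"damage"}), estimate),
]

context = {"hazard": "hypothetical scenario", "inventory": "hypothetical bridge list"}
local_result = perform("predict_damage", registry, context)
grid_result = perform("predict_damage", registry, {**context, "grid_credential": "token"})
print(local_result["performed_by"], grid_result["performed_by"])
# prints: local_damage grid_damage
```

The calling code is identical in both cases; only the available information changes, which is the sense in which an application could move from a local to a grid solution without modification.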
Conclusions
• Building Cyberenvironments and supporting scientific discourse is critical for scientific efficiency and competitiveness.
• Abstract management of data, process/provenance, social, and conceptual contexts solves real socio-technical problems in science and engineering research.
• Our experience building Cyberenvironments on these principles is demonstrating their potential to support systems science and evolving research.
• E-Science, semantic web/grid, content management, and Web 2.0 are all driving in this direction, but their impact has not been well articulated in terms of value to science researchers.

Acknowledgments
The authors wish to acknowledge the contributions of many CI researchers to the concepts and systems discussed here, with specific recognition of members of NCSA’s Cyberenvironments Directorate. The National Center for Supercomputing Applications is funded by the US National Science Foundation under Grant No. SCI-0438712. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.