Advanced Cyberinfrastructure: An Engine for Competitiveness Steve Meacham National Science Foundation CASC Workshop September 6, 2006
Outline • NSF CI Vision • High-End Computing Portfolio • Data, COVO and LWD • The TeraGrid
What is cyberinfrastructure? • Cyberinfrastructure for Science and Engineering Research and Education • Is the integration of the components of information technology necessary to advance the frontiers of scientific and engineering knowledge • Is the use of information technology to integrate research and education • Makes possible new modes of experimentation, observation, modeling, analysis, and collaboration • Is built with contributions from experts in many fields, e.g. computer science, engineering, and social science • Examples of CI components: optical, wired electrical, and wireless networking; simulation tools; high-performance computing; data analysis tools; data curation; tele-operation and tele-presence; visualization hardware and software; semantic mediation and query tools; digital workflows; middleware and high-performance system software; portal technology; virtual organizations and gateways; …
Strategic Plan (FY 2006 – 2010) • Ch. 1: Call to Action • Strategic plans for: Ch. 2: High Performance Computing; Ch. 3: Data, Data Analysis & Visualization; Ch. 4: Collaboratories, Observatories and Virtual Organizations; Ch. 5: Learning & Workforce Development • http://www.nsf.gov/dir/index.jsp?org=OCI
Principal components • HPC: High-Performance Computing • Data: Data, Data Analysis, and Visualization • COVO: Collaboratories, Observatories and Virtual Organizations • LWD: Learning and Workforce Development
Inside NSF • Cyberinfrastructure Council • Office of Cyberinfrastructure • Directorate Cyberinfrastructure Working Groups • Directorate Cyberinfrastructure Programs
Office of Cyberinfrastructure • Dan Atkins, Office Director • José Muñoz, Deputy Office Director • Judy Hayden, Joann Alquisa, Priscilla Bezdek, Mary Daley, Irene Lombardo • Data POC: Chris Greer • HPC POC: Steve Meacham • COVO POC: Kevin Thompson • LWD POC: Miriam Heller • Program staff: Chris Greer, Miriam Heller, Fillia Makedon, Steve Meacham, Vittal Rao, Frank Scioli, Kevin Thompson
CI Budgets HPC hardware acquisitions, O&M, and user support as a fraction of NSF’s overall CI budget
Examples of FY07 Areas of Emphasis • Leadership-class HPC system acquisition • Data- and collaboration-intensive software services • Confidentiality protection and user-friendly access for major social and behavioral science data collections • National STEM Digital Library (NSDL) supporting learners at all levels • CI-TEAM, preparing undergraduates, graduate students, postdocs and faculty to use cyberinfrastructure in research and education • Support for the Protein Data Bank (PDB), the international repository for information about the structure of biological macromolecules, and the Arctic Systems Sciences (ARCSS) Data Coordination Center
Principal components • High-Performance Computing (HPC) • Data, Data Analysis, and Visualization • Collaboratories, Observatories and Virtual Organizations • Learning and Workforce Development
HEC-enabled science and engineering • Impacts in many research fields • E.g. model economies; analysis of multi-sensor astronomical data; linguistic analysis; QCD & HEP analysis; cosmology; role of dark matter; chemistry; materials science; engineering; geoscience; climate; biochemistry; systems biology; ecosystem dynamics; genomics; proteomics; epidemiology; agent-based models of societies to test policy impacts; optimization; multi-scale, multi-science models, e.g. environment + social science, Earth system models, earthquake + structural engineering, … • Transforming industry • Aircraft manufacturing; pharmaceuticals; engineering (including nano- & bio-); oil exploration; entertainment; automobile manufacturing; new industries based on information mining, … • Part of the American Competitiveness Initiative
Why invest in HEC? • High-performance computing as a tool of research is becoming ever more important across a widening range of research areas • An inexorable trend over the past few decades • Shows no sign of stopping • Current examples, future examples • Understanding life • Understanding matter • Understanding the environment • Understanding society
Why invest in HEC? Understanding life Satellite tobacco mosaic virus, P. Freddolino et al. Aldehyde dehydrogenase, T. Wymore and S. Brown Imidazole glycerol phosphate synthase, R. Amaro et al.
Why invest in HEC? Understanding matter I. Shipsey
Why invest in HEC? Understanding the environment K. Droegemeier et al. CCSM
Why invest in HEC? Understanding society MoSeS: A dynamical simulation of the UK population. http://www.ncess.ac.uk/nodes/moses/BirkinMoses.pdf M. Birkin et al. John Q Public: A computational model that simulates how voters' political opinions fluctuate during a campaign. S.-Y. Kim, M. Lodge, C. Taber.
Magnetic Nanocomposites: Wang (PSC) [Images: Fe0.5Pt0.5 random alloy; L10-FePt nanoparticle] • Direct quantum mechanical simulation on Cray XT3. • Goal: nano-structured material with potential applications in high-density data storage: 1 particle/bit. • Need to understand the influence of these nanoparticles on each other. • A petascale problem: realistic simulations for nanostructures of ~50 nm (~5M atoms). • LSMS, the locally self-consistent multiple scattering method, is a linear-scaling ab initio electronic structure method (Gordon Bell prize winner). • Achieves as high as 81% of peak performance on the Cray XT3. • Wang (PSC), Stocks, Rusanu, Nicholson, Eisenbach (ORNL), Faulkner (FAU)
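To put the ~5M-atom figure in context, here is a back-of-the-envelope sketch. The spherical-particle assumption and the approximate L10 FePt lattice constants (a ≈ 0.385 nm, c ≈ 0.371 nm, 4 atoms per cell) are illustrative assumptions, not values from the slide:

```python
import math

# Rough estimate of the atom count in a ~50 nm FePt nanoparticle.
# Assumed values (not from the slide): L1_0 FePt tetragonal cell with
# a ~ 0.385 nm, c ~ 0.371 nm, 4 atoms per cell, spherical particle.
a_nm, c_nm = 0.385, 0.371
atoms_per_cell = 4
diameter_nm = 50.0

cell_volume = a_nm * a_nm * c_nm                          # nm^3 per unit cell
particle_volume = (4.0 / 3.0) * math.pi * (diameter_nm / 2.0) ** 3

atoms = particle_volume / cell_volume * atoms_per_cell
print(f"~{atoms / 1e6:.1f} million atoms")                # ~4.8 million
```

Under these assumptions the estimate comes out near 5 million atoms, consistent with the scale quoted above.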
VORTONICS: Boghosian (Tufts) • Physical challenges: reconnection and dynamos • Vortical reconnection governs establishment of steady state in Navier-Stokes turbulence • Magnetic reconnection governs heating of the solar corona • The astrophysical dynamo problem: the exact mechanism and space/time scales are unknown and represent important theoretical challenges • Computational challenges: enormous problem sizes, memory requirements, and long run times • Requires relaxation on a space-time lattice of 5-15 terabytes • Uses geographically distributed domain decomposition (GD3): DTF, TCS, Lonestar • Real-time visualization at UC/ANL • TeraGrid staff: Insley (UC/ANL), O’Neal (PSC), Guiang (TACC) [Image: homogeneous turbulence driven by a force of Arnold-Beltrami-Childress (ABC) form]
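A rough sketch of what a 5-15 TB lattice implies for problem size. The per-site storage count below is an illustrative assumption, not a figure from the project:

```python
# Rough sizing of a space-time relaxation lattice that needs 5-15 TB.
# Assumption (illustrative only, not from the project): each lattice site
# stores ~25 double-precision variables (fields plus work arrays).
bytes_per_site = 25 * 8                    # assumed doubles per site x 8 bytes

for total_tb in (5, 15):
    sites = total_tb * 1e12 / bytes_per_site
    side = round(sites ** (1.0 / 3.0))     # edge of an equivalent cubic lattice
    print(f"{total_tb:2d} TB -> ~{sites:.1e} sites (~{side}^3 lattice)")
```

Under these assumptions the memory budget corresponds to a lattice of very roughly 3,000^3 to 4,000^3 sites, which is why the domain must be decomposed across several TeraGrid systems.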
TeraShake / CyberShake: Olsen (SDSU), Okaya (USC) • Largest and most detailed earthquake simulation of the southern San Andreas fault. • Calculation of physics-based probabilistic hazard curves for Southern California using full waveform modeling. • Computation and data analysis at multiple TeraGrid sites. • Workflow tools automate the very large number of programs and files that must be managed. • TeraGrid staff: Cui (SDSC), Reddy (GIG/PSC) [Images: major earthquakes on the San Andreas Fault, 1680-present: 1680 M 7.7, 1857 M 7.8, 1906 M 7.8; simulation of magnitude 7.7 seismic wave propagation on the San Andreas Fault, 47 TB data set]
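As a minimal illustration of the workflow pattern the slide refers to, and not the actual SCEC/TeraGrid tooling, a dependency-ordered runner can be sketched as follows (all task and script names are hypothetical):

```python
# Minimal sketch of a dependency-ordered workflow, in the spirit of the
# workflow tools mentioned above (task names and scripts are hypothetical).
from graphlib import TopologicalSorter

# Each task maps to a command line; edges say which tasks must finish first.
tasks = {
    "mesh":     "python build_mesh.py",       # hypothetical pre-processing
    "simulate": "python run_wave_prop.py",    # hypothetical wave propagation
    "hazard":   "python compute_hazard.py",   # hypothetical hazard curves
}
deps = {"simulate": {"mesh"}, "hazard": {"simulate"}}

# Dry run: emit the commands in a valid execution order; a real runner
# would also launch each command and track the files it produces.
for name in TopologicalSorter(deps).static_order():
    print(f"{name}: {tasks[name]}")
```

The value of such tooling at TeraShake scale is that the ordering, retries, and bookkeeping for thousands of program runs and files are handled automatically rather than by hand.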
Searching for New Crystal Structures: Deem (Rice) • Searching for new 3-D zeolite crystal structures in crystallographic space • Requires 10,000s of serial jobs through TeraGrid • Using MyCluster/GridShell to aggregate the computational capacity of the TeraGrid to accelerate the search • TeraGrid staff: Walker (TACC) and Cheeseman (Purdue)
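The zeolite search is a many-task workload: tens of thousands of independent serial jobs. As a hedged sketch of that fan-out pattern (not the MyCluster/GridShell interface itself; the executable name and options are hypothetical), one might generate one self-contained batch script per candidate search:

```python
# Sketch of fanning out many independent serial jobs (names are hypothetical;
# this is not the MyCluster/GridShell interface described on the slide).
from pathlib import Path

job_dir = Path("jobs")
job_dir.mkdir(exist_ok=True)

# The slide describes 10,000s of such jobs; a small number is used here.
n_candidates = 100
for seed in range(n_candidates):
    script = job_dir / f"zeolite_{seed:05d}.sh"
    script.write_text(
        "#!/bin/sh\n"
        f"./search_zeolite --seed {seed} --out results/{seed:05d}.cif\n"
    )
print(f"wrote {n_candidates} job scripts to {job_dir}/")
```

Tools like MyCluster/GridShell then take care of scheduling such independent tasks onto whatever TeraGrid capacity is available.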
HEC Program Elements • Acquisitions • Track 1 - Petascale • Track 2 - Mid-range supercomputers • Operations • HEC System Software Development • Compilers, fault-tolerant OS, fault-survivability tools, system status monitoring, file-systems, PSEs, … • HEC Petascale Application Development • Scalable math libraries, scalable algorithms, data exploration tools, performance profiling and prediction, large application development • Coordinated with other agencies
Acquisition Strategy [Diagram: science and engineering capability (logarithmic scale) vs. fiscal year, FY06-FY10, showing Track 1 system(s) at the top, Track 2 systems in the middle, and Track 3 (typical university HPC systems) at the base]
Track 2 Acquisitions • Individual systems - provide capabilities beyond those obtainable with university or state funds • Collectively, as part of TeraGrid - provide a diverse HPC portfolio to meet the HPC needs of the academic research community • Annual competition: roughly $30M/year for acquisition costs • O&M costs via a TeraGrid RP award • Primary selection criterion: impact on science and engineering research
Track 1 Acquisition (FY07-10) • A system that will permit revolutionary science and engineering research • Capable of delivering large numbers of cycles and large amounts of memory to individual problems • Capable of sustaining at least 10^15 arithmetic ops/second on a range of interesting problems • A very large amount of memory and a very capable I/O system • An architecture that facilitates scaling of codes • Robust system software with fault tolerance and fault prediction features • Robust program development tools that simplify code development • A single physical system in a single location
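To make the 10^15 ops/second target concrete, a rough scaling estimate follows; the per-core peak rate and sustained efficiency are illustrative assumptions, not specifications of any proposed system:

```python
# Rough illustration of what sustaining 10^15 ops/s implies.
# Assumed numbers (not system specifications): ~10 Gflop/s peak per core,
# ~20% sustained efficiency on real applications.
target_ops = 1e15            # sustained arithmetic operations per second
peak_per_core = 10e9         # assumed peak flop/s per core
efficiency = 0.20            # assumed sustained fraction of peak

cores_needed = target_ops / (peak_per_core * efficiency)
print(f"~{cores_needed:,.0f} cores")   # ~500,000 cores under these assumptions
```

The point of the exercise is that sustained petascale performance requires codes that scale to hundreds of thousands of cores, which is why scalability, fault tolerance, and program development tools appear alongside raw capability in the requirements above.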
Track 1 Acquisition (FY07-10) Examples of research problems: • The origin and nature of intermittency in turbulence • The interaction of radiative, dynamic and nuclear physics in stars • The dynamics of the Earth’s coupled carbon, nitrogen and hydrologic cycles • Heterogeneous catalysis on semiconductor and metal surfaces • The properties and instabilities of burning plasmas and investigation of magnetic confinement techniques • The formation of planetary nebulae • The interaction of attosecond laser pulse trains with polyatomic molecules • The mechanisms of reactions involving large bio-molecules and bio-molecular assemblages • The structure of large viruses • The interactions between clouds, weather and the Earth’s climate
HPC Operations • Track 1 & 2 • O&M for projected useful life awarded with acquisition funds • O&M approach assessed in review process • HPCOPS • An opportunity for universities without Track 1 or 2 funding that can leverage other funding to acquire large HPC systems • Will provide a contribution to O&M in return for provision of HPC resources to the national S&E community • These will be TeraGrid RP awards, aligned with the TeraGrid time frame • Expect to be highly competitive • Funding opportunity this year (Nov 28, 2006); do not anticipate a similar competition next year • Possible third model? • Provide a contribution to acquisition costs if the institution picks up O&M
Principal components • Data: Data, Data Analysis, and Visualization • HPC: High-Performance Computing • COVO: Collaboratories, Observatories and Virtual Organizations • LWD: Learning and Workforce Development
Strategic Plans for Data, COVO and LWD (FY 2006 – 2010) • Data CI: - Investments will continue to be prioritized by science and engineering research and education needs - S&E data generated with NSF funds will be accessible & usable - Data CI includes tools to manage, locate, access, manipulate, and analyze data, mechanisms to maintain confidentiality, and tools to facilitate creation and management of metadata - Data CI will involve strong, international, inter-agency and public-private partnerships • Challenges include: - Managing and analyzing very large datasets - Managing, analyzing, and using streaming data - Developing tools to permit research using confidential data • COVO and LWD: To appear (August)
The growth of observatories and virtual organizations • Observatories: based on the ability to federate data sets and data streams; some include instrument control, event detection and response, and some degree of virtualization. Examples: NVO, OOI, EarthScope, NEON, GEOSS • Virtual organizations: a geographically dispersed community with common interests that uses cyberinfrastructure to integrate a variety of digital resources into a common working environment • Supporting technologies: portals, workflows, data analysis, models, streaming data, event detection, instrument/observatory control, networking, authentication/authorization, digital libraries, …
IRNC: International Research Network Connections • Components • TransPAC2 (U.S. – Japan and beyond) • GLORIAD (U.S. – China – Russia – Korea) • TransLight/PacificWave (U.S. – Australia) • TransLight/StarLight (U.S. – Europe) • WHREN (U.S. – Latin America)
CI-TEAM • A Foundation-wide effort to foster CI training and workforce development • Started FY05 ($2.5M), focused on demonstration projects • Anticipated funding in FY06: $10M, small and large activities • FY05: 70 projects (101 proposals) received; 11 projects funded • Broadening participation in CI • Alvarez (FIU) – CyberBridges • Crasta (VA Tech) – Project-Centric Bioinformatics • Fortson (Adler) – CI-Enabled 21st C. Astronomy Training for HS Science Teachers • Fox (IU) – Bringing MSI Faculty into CI & e-Science Communities • Gordon (OhSU) – Leveraging CI to Scale-up a Computational Science u/g Curriculum • Panoff (Shodor) – Pathways to Cyberinfrastructure: CI through Computational Science • Takai (SUNY Stony Brook) – High School Distributed Search for Cosmic Ray • Developing & implementing resources for CI workforce development • DiGiano (SRI) – Cybercollaboration between Scientists and Software Developers • Figueiredo (U FL) – In-VIGO/Condor-G Middleware for Coastal and Estuarine CI Training • Regli (Drexel) – CI for Creation and Use of Multi-disciplinary Engineering Models • Simpson (PSU) – CI-based Engineering Repositories for Undergraduates (CIBER-U)
TeraGrid Offers: • Common user environments • Pooled community support expertise • Targeted consulting services (ASTA) • Science gateways • A portfolio of architectures • Exploring: • A security infrastructure that uses campus authentication systems • A lightweight, service-based approach to enable campus grids to federate with TeraGrid
TeraGrid: What is It? • Integration of services provided by grid technologies • Distributed, open architecture • GIG responsible for integration: • Software integration (including the common software stack, CTSS) • Base infrastructure (security, networking, and operations) • User support • Community engagement (including the Science Gateways activities) • 9 Resource Providers (with separate awards): • PSC, TACC, NCSA, SDSC, ORNL, Indiana, Purdue, Chicago/ANL, NCAR • Several other institutions participate in TeraGrid as sub-awardees of the GIG • New sites may join as Resource Partners • TeraGrid: • Provides a unified user environment to support high-capability, production-quality cyberinfrastructure services for science and engineering research • Provides new S&E opportunities by making possible new ways of using distributed resources and services • Examples of services include: • HPC • Data collections • Visualization servers • Portals
Science Gateways • Specific examples of Virtual Organizations • Built to serve communities of practice by bringing together a variety of resources in a customized portal • Examples include: • NanoHub • NEES • LEAD • SCEC Earthworks Project • NVO • http://www.teragrid.org/programs/sci_gateways/
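As a minimal sketch of the gateway pattern, a portal front end collects domain-level parameters from a community user and translates them into a job on back-end HPC resources. Everything below is hypothetical and illustrative; it is not the implementation of any gateway listed here:

```python
# Minimal sketch of the science-gateway pattern: a portal maps a domain-level
# request onto a batch job on a back-end HPC resource.
# All names are hypothetical; this is not any listed gateway's code.
from dataclasses import dataclass

@dataclass
class GatewayRequest:
    user: str            # portal identity of the community member
    application: str     # domain code the gateway exposes (hypothetical)
    parameters: dict     # domain-level inputs collected by the web form

def to_batch_script(req: GatewayRequest) -> str:
    """Render the request as a batch script for a back-end resource."""
    args = " ".join(f"--{k}={v}" for k, v in sorted(req.parameters.items()))
    return (
        "#!/bin/sh\n"
        f"# submitted via gateway on behalf of {req.user}\n"
        f"./{req.application} {args}\n"
    )

print(to_batch_script(GatewayRequest("jdoe", "nanowire_sim", {"length_nm": 50})))
```

The design point is that the community user never sees the batch system or the resource provider directly; the gateway handles authentication, job construction, and data movement on their behalf.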
Science Gateways Biology and Biomedicine Science Gateway Computational Chemistry Grid (GridChem) Computational Science and Engineering Online (CSE-Online) GEON (GEOsciences Network) GIScience Gateway (GISolve) Grid Analysis Environment (GAE) Linked Environments for Atmospheric Discovery (LEAD) National Virtual Observatory (NVO) Network for Computational Nanotechnology and nanoHUB Network for Earthquake Engineering Simulation (NEES) Neutron Science Instrument Gateway Open Life Sciences Gateway Open Science Grid (OSG) SCEC Earthworks Project Special PRiority and Urgent Computing Environment (SPRUCE) TeraGrid Visualization Gateway The Telescience Project
NSF Cyberinfrastructure • Goal: to create and maintain a powerful, stable, persistent, and widely accessible cyberinfrastructure to enable the work of science and engineering researchers and educators across the nation.