
NSF’s Evolving Cyberinfrastructure Program


Presentation Transcript


  1. NSF’s Evolving Cyberinfrastructure Program Guy Almes <galmes@nsf.gov> Office of Cyberinfrastructure Oklahoma Supercomputing Symposium 2005 Norman 5 October 2005

  2. Overview • Cyberinfrastructure in Context • Existing Elements • Organizational Changes • Vision and High-performance Computing planning • Closing thoughts

  3. Cyberinfrastructure in Context • Due to the research university’s mission: • each university wants a few people from each key research specialty • therefore, research colleagues are scattered across the nation / world • Enabling their collaborative work is key to NSF

  4. Traditionally, there were two approaches to doing science: • theoretical / analytical • experimental / observational • Now the use of aggressive computational resources has led to a third approach • in silico simulation / modeling
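To make the third approach concrete, here is a minimal, purely illustrative Python sketch of an in silico experiment: integrating a toy growth model numerically instead of observing it in a lab. The model and all parameter values are hypothetical, chosen only to show the pattern of simulation-based science.

```python
# Purely illustrative "in silico experiment": integrate a toy growth
# model numerically rather than observing it in a lab. The model and
# its parameters are hypothetical; only the pattern matters.

def simulate(r=0.5, K=1000.0, n0=10.0, dt=0.01, steps=2000):
    """Forward-Euler integration of logistic growth dN/dt = rN(1 - N/K)."""
    n, trajectory = n0, [n0]
    for _ in range(steps):
        n += dt * r * n * (1.0 - n / K)
        trajectory.append(n)
    return trajectory

if __name__ == "__main__":
    traj = simulate()
    print(f"population after {len(traj) - 1} steps: {traj[-1]:.1f}")
```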

  5. Cyberinfrastructure Vision A new age has dawned in scientific and engineering research, pushed by continuing progress in computing, information, and communication technology, and pulled by the expanding complexity, scope, and scale of today’s challenges. The capacity of this technology has crossed thresholds that now make possible a comprehensive “cyberinfrastructure” on which to build new types of scientific and engineering knowledge environments and organizations and to pursue research in new ways and with increased efficacy. [NSF Blue Ribbon Panel report, 2003]

  6. Historical Elements • Supercomputer Center program from 1980s • NCSA, SDSC, and PSC leading centers ever since • NSFnet program of 1985-95 • connect users to (and through) those centers • 56 kb/s to 1.5 Mb/s to 45 Mb/s within ten years • Sensors: telescopes, radars, environmental, but treated in an ad hoc fashion • Middleware: of growing importance, but historically underestimated

  7. [Timeline, FY ’85 through ’08: Supercomputer Centers program (PSC, NCSA, SDSC, JvNC, CTC); Partnerships for Advanced Computational Infrastructure (Alliance, NCSA-led; NPACI, SDSC-led); ITR Projects; Terascale Computing Systems; discipline-specific CI projects; ETF Management & Operations; core support for NCSA and SDSC; with the Hayes, Branscomb, PITAC, and Atkins reports as milestones along the way]

  8. Explicit Elements • Advanced Computing • Variety of strengths, e.g., data-intensive, compute-intensive • Advanced Instruments • Sensor networks, weather radars, telescopes, etc. • Advanced Networks • Connecting researchers, instruments, and computers together in real time • Advanced Middleware • Enable the potential sharing and collaboration • Note the synergies!

  9. CRAFT: A normative example – Sensors + network + HEC [Diagram: data flows among Univ. of Oklahoma, NCSA and PSC, Internet2, the UCAR Unidata Project, and the National Weather Service]

  10. Current Projects within OCI • Office of Cyberinfrastructure • HEC + X • Extensible Terascale Facility (ETF) • International Research Network Connections • NSF Middleware Initiative • Integrative Activities: Education, Outreach & Training • Social and Economic Frontiers in Cyberinfrastructure

  11. TeraGrid: One Component • A distributed system of unprecedented scale • 30+ TF, 1+ PB, 40 Gb/s net • Unified user environment across resources • User software environment • User support resources • Integrated new partners to introduce new capabilities • Additional computing, visualization capabilities • New types of resources: data collections, instruments • Built a strong, extensible Team • Created an initial community of over 500 users, 80 PIs • Created User Portal in collaboration with NMI courtesy Charlie Catlett

  12. Key TeraGrid Resources • Computational • very tightly coupled clusters • LeMieux and Red Storm systems at PSC • tightly coupled clusters • Itanium2 and Xeon clusters at several sites • data-intensive systems • DataStar at SDSC • memory-intensive systems • Maverick at TACC and Cobalt at NCSA • experimental • MD-Grape system at Indiana and BlueGene/L at SDSC

  13. Online and Archival Storage • e.g., more than a PB online at SDSC • Data Collections • numerous • Instruments • Spallation Neutron Source at Oak Ridge • Purdue Terrestrial Observatory

  14. TeraGrid DEEP Examples • Aquaporin mechanism: animation pointed to by the 2003 Nobel chemistry prize announcement. Klaus Schulten, UIUC • Atmospheric modeling: Kelvin Droegemeier, OU • Reservoir modeling: Joel Saltz, OSU • Lattice-Boltzmann simulations: Peter Coveney, UCL; Bruce Boghosian, Tufts • Groundwater/flood modeling: David Maidment, Gordon Wells, UT Advanced Support for TeraGrid Applications: • TeraGrid staff are “embedded” with applications to create • Functionally distributed workflows • Remote data access, storage and visualization • Distributed data mining • Ensemble and parameter sweep run and data management courtesy Charlie Catlett
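As a rough illustration of the ensemble and parameter-sweep run management mentioned on this slide, the sketch below fans a hypothetical model out over a grid of parameters. Real TeraGrid workflows dispatched members to remote sites via grid middleware; here a local process pool stands in for those resources, and run_member and its parameters are invented for the example.

```python
# Hypothetical sketch of ensemble / parameter-sweep run management.
# A local process pool stands in for remote TeraGrid resources.
import itertools
from concurrent.futures import ProcessPoolExecutor

def run_member(params):
    """Stand-in for one ensemble member (e.g., one forecast run)."""
    resolution_km, perturbation = params
    # ... a real workflow would launch the model and stage its data ...
    return f"res={resolution_km} km, pert={perturbation:+.2f}: done"

if __name__ == "__main__":
    # sweep a 3x3 grid of (resolution, initial-condition perturbation)
    sweep = list(itertools.product([4, 2, 1], [-0.1, 0.0, 0.1]))
    with ProcessPoolExecutor(max_workers=4) as pool:
        for line in pool.map(run_member, sweep):
            print(line)
```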

  15. Cyberresources: Key NCSA Systems • Distributed Memory Clusters • Dell (3.2 GHz Xeon): 16 Tflops • Dell (3.6 GHz EM64T): 7 Tflops • IBM (1.3/1.5 GHz Itanium2): 10 Tflops • Shared Memory Clusters • IBM p690 (1.3 GHz Power4): 2 Tflops • SGI Altix (1.5 GHz Itanium2): 6 Tflops • Archival Storage System • SGI/Unitree (3 petabytes) • Visualization System • SGI Prism (1.6 GHz Itanium2 + GPUs) courtesy NCSA
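A note on how peak figures like the 16 Tflops Dell cluster above are derived: peak performance is simply processor count × clock rate × floating-point operations per cycle. The back-of-envelope sketch below reproduces the number under an assumed processor count; the slide does not give the exact configuration.

```python
# Peak-performance arithmetic: procs x clock (GHz) x flops per cycle.
# The processor count is an assumption chosen to match ~16 Tflops;
# a 3.2 GHz Xeon of that era retires 2 double-precision flops/cycle.
procs, ghz, flops_per_cycle = 2500, 3.2, 2
peak_tflops = procs * ghz * flops_per_cycle / 1000.0
print(f"peak: {peak_tflops:.1f} Tflops")   # -> peak: 16.0 Tflops
```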

  16. Cyberresources: Recent Scientific Studies at NCSA • Weather Forecasting • Computational Biology • Molecular Science • Earth Science courtesy NCSA

  17. Computing: One Size Doesn’t Fit All • Trade-offs: interconnect fabric, processing power, memory, I/O [Diagram: processors (P) and memory (M) connected by an interconnect] courtesy SDSC

  18. Computing: One Size Doesn’t Fit All [Chart: applications plotted by compute capability (increasing FLOPS) vs. data capability (increasing I/O and storage). The SDSC Data Science Environment — e.g., SCEC simulation and visualization, ENZO simulation and visualization, NVO, CIPRes, EOL, climate, out-of-core 3D + time simulation — occupies the data-intensive region, where extreme I/O can’t be done on the Grid (I/O exceeds WAN). Distributed-I/O-capable codes (e.g., CFD, protein folding) and the traditional HEC environment (e.g., CPMD, QCD) are more compute-intensive; campus, departmental, and desktop computing sits at the low end of both axes.] courtesy SDSC
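The “I/O exceeds WAN” point lends itself to a back-of-envelope check: if a run streams results faster than the wide-area network can carry them, the computation has to sit next to its storage. The rates below are illustrative assumptions, not figures from the slide.

```python
# Back-of-envelope version of the "I/O exceeds WAN" argument above.
# Both rates are illustrative assumptions.
sim_output_gbs = 10.0            # assumed aggregate write rate, GB/s
wan_gbps       = 40.0            # TeraGrid-era backbone, gigabits/s
wan_gbs        = wan_gbps / 8.0  # ~5 GB/s of payload at best

if sim_output_gbs > wan_gbs:
    print(f"output ({sim_output_gbs:.0f} GB/s) exceeds the WAN "
          f"({wan_gbs:.0f} GB/s): compute must sit next to storage")
else:
    print("the wide-area network can carry this run's I/O")
```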

  19. SDSC Resources DATA ENVIRONMENT • 1 PByte SAN • 6 PB StorageTek tape library • DB2, Oracle, MySQL • Storage Resource Broker • HPSS • 72-CPU Sun Fire 15K • 96-CPU IBM p690s Support for community data collections and databases Data management, mining, analysis, and preservation COMPUTE SYSTEMS • DataStar • 2,396 Power4+ pes • IBM p655 and p690 • 4 TB total memory • Up to 2 GB/s I/O to disk • TeraGrid Cluster • 512 Itanium2 pes • 1 TB total memory • Intimidata • Early IBM BlueGene/L • 2,048 PowerPC pes • 128 I/O nodes SCIENCE and TECHNOLOGY STAFF, SOFTWARE, SERVICES • User Services • Application/Community Collaborations • Education and Training • SDSC Synthesis Center • Community SW, toolkits, portals, codes courtesy SDSC

  20. Pittsburgh Supercomputing Center “Big Ben” System • Cray XT3, based on Sandia’s Red Storm system • Working with Cray, SNL, ORNL • Approximately 2000 compute nodes • 1 GB memory/node • 2 TB total memory • 3D toroidal mesh • 10 Teraflops • MPI latency: < 2 µs (neighbor), < 3.5 µs (full system) • Bisection BW: 2.0/2.9/2.7 TB/s (x,y,z) • Peak link BW: 3.84 GB/s • 400 sq. ft. floor space • < 400 KW power • Now operational • NSF award in Sept. 2004 • Oct. 2004: Cray announced the XT3, a commercial version of Red Storm courtesy PSC
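Latency figures like the “< 2 µs (neighbor)” above are conventionally measured with an MPI ping-pong microbenchmark. Below is a minimal sketch; mpi4py is used here purely for brevity and is not something the slide names.

```python
# Minimal MPI ping-pong latency microbenchmark (mpi4py), the standard
# technique behind figures like "< 2 us (neighbor)" quoted above.
# Run with: mpiexec -n 2 python pingpong.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
buf = np.zeros(1, dtype="b")   # 1-byte message
reps = 10000

comm.Barrier()                 # synchronize before timing
t0 = MPI.Wtime()
for _ in range(reps):
    if rank == 0:
        comm.Send(buf, dest=1)
        comm.Recv(buf, source=1)
    else:
        comm.Recv(buf, source=0)
        comm.Send(buf, dest=0)
t1 = MPI.Wtime()

if rank == 0:
    # each rep is one round trip; one-way latency is half of it
    print(f"one-way latency: {(t1 - t0) / reps / 2 * 1e6:.2f} us")
```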

  21. I-Light, I-Light2, and the TeraGrid Network Resource courtesy IU and PU

  22. Purdue, Indiana Contributions to the TeraGrid • The Purdue Terrestrial Observatory portal to the TeraGrid will deliver GIS data from IU and real-time remote sensing data from the PTO to the national research community • Complementary large facilities, including large Linux clusters • Complementary special facilities, e.g., Purdue NanoHub and Indiana University MD-GRAPE systems • Indiana and Purdue Computer Scientists are developing new portal technology that makes use of the TeraGrid (GIG effort) courtesy IU and PU

  23. New Purdue RP resources • 11 teraflops Community Cluster • (being deployed) 1.3 PB tape robot • Non-dedicated resources (opportunistic), defining a model for sharing university resources with the nation courtesy IU and PU

  24. PTO, Distributed Datasets for Environmental Monitoring courtesy IU and PU

  25. TeraGrid as Integrative Technology • A likely key to ‘all’ foreseeable NSF HPC capability resources • Working with OSG and others to encompass both capability and capacity resources even more broadly • Anticipate requests for new RPs • Slogans: • Learn once, execute anywhere • The whole is more than the sum of its parts

  26. TeraGrid as a Set of Resources • TeraGrid gives each RP an opportunity to shine • Balance: • value of innovative/peculiar resources vs. value of slogans • Opportunistic resources, SNS, Grapes as interesting examples • Note the stress on the allocation process

  27. 2005 IRNC Awards • Awards • TransPAC2 (U.S. – Japan and beyond) • GLORIAD (U.S. – China – Russia – Korea) • TransLight/Pacific Wave (U.S. – Australia) • TransLight/StarLight (U.S. – Europe) • WHREN (U.S. – Latin America) • Example use: Open Science Grid involving partners in the U.S. and Europe, mainly supporting high-energy physics research based on the LHC

  28. NSF Middleware Initiative (NMI) • Program began in 2001 • Purpose: To design, develop, deploy and support a set of reusable and expandable middleware functions that benefit many science and engineering applications in a networked environment • Program encourages open source development • Program funds mainly development, integration, deployment and support activities

  29. Example NMI-funded Activities • GridShib – integrating Shibboleth campus attribute services with Grid security infrastructure mechanisms • UWisc Build and Test facility – community resource and framework for multi-platform build and test of grid software • Condor – mature distributed computing system installed on thousands of CPU “pools” comprising tens of thousands of CPUs
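For readers unfamiliar with Condor, work enters a pool through a submit description file. The example below is hypothetical: it queues 100 instances of an assumed executable, analyze, one per numbered input file, and Condor matches each job to an idle machine.

```
# sweep.sub -- hypothetical Condor submit description
universe   = vanilla
executable = analyze
arguments  = input.$(Process)
output     = out.$(Process)
error      = err.$(Process)
log        = sweep.log
queue 100
```

Submitted with condor_submit sweep.sub; the $(Process) macro expands to 0 through 99, giving each job its own input and output files.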

  30. Organizational Changes • Office of Cyberinfrastructure • formed on 22 July 2005 • had been a division within CISE • Cyberinfrastructure Council • chair is NSF Director; members are ADs • Vision Document started • HPC Strategy chapter drafted • Advisory Committee for Cyberinfrastructure

  31. Cyberinfrastructure Components • High Performance Computing Tools & Services • Data Tools & Services • Collaboration & Communication Tools & Services • Education & Training

  32. Vision Document Outline • Call to Action • Strategic Plans for … • High Performance Computing • Data • Collaboration and Communication • Education and Workforce Development • Complete document by 31 March 2006

  33. Strategic Plan for High Performance Computing • Covers 2006-2010 period • Enable petascale science and engineering by creating a world-class HPC environment • Science-driven HPC Systems Architectures • Portable Scalable Applications Software • Supporting Software • Inter-agency synergies will be sought

  34. Coming HPC Solicitation • There will be a solicitation issued this month • One or more HPC systems • One or more RPs • Role of TeraGrid • Process driven by science user needs • Addressing confusion about capacity vs. capability • Workshops • Arlington, 9 September • Lisle, 20-21 September

  35. HPC Platforms (2000-2005), within the ETF integrating framework [Chart grouping platforms as tightly coupled, commodity, and I/O-intensive: TCS LeMieux 6 TF; Marvel 0.3 TF; Red Storm 10 TF; Purdue Cluster 1.7 TF; Cray-Dell Xeon Cluster 6.4 TF; IBM Cluster 0.2 TF; IBM DataStar 10.4 TF; Dell Xeon Cluster 16.4 TF; SGI SMP system 6.6 TF; Condor Pool 0.6 TF; IBM Itanium Cluster 3.1 TF; IBM Itanium Cluster 8 TF]

  36. Cyberinfrastructure Vision NSF will lead the development and support of a comprehensive cyberinfrastructure essential to 21st century advances in science and engineering.
