
Review of NCAR. Al Kellie, SCD Director. November 01, 2001


Presentation Transcript


  1. Review of NCAR. Al Kellie, SCD Director. November 01, 2001

  2. Outline of Presentation
  • Introduction to UCAR / NCAR / SCD
  • Overview of divisional activities
  • Research data sets (Worley)
  • Mass Storage System (Harano)
  • Extracting model performance (Hammond)
  • Visualization & Earth System GRiD (Middleton)
  • Computing RFP (ARCS)

  3. Outline of Presentation
  • Introduction
  • Overview of three divisional aspects
  • Computing RFP (ARCS)

  4. University Corporation for Atmospheric Research (organization chart)
  • Member Institutions → Board of Trustees → President: Richard Anthes
  • Corporate Affairs: Jack Fellows, VP; Finance & Administration: Katy Schmoll, VP
  • NCAR: Tim Killeen, Director
  • UCAR Programs (Jack Fellows, Director): Information Infrastructure Technology & Applications (IITA) – Richard Chinman; Cooperative Program for Operational Meteorology Education and Training (COMET) – Timothy Spangler; Constellation Observing System for Meteorology Ionosphere Climate (COSMIC) – Bill Kuo; Digital Library for Earth System Science (DLESE) – Mary Marlino; GPS Science and Technology Program (GST) – Randolph Ware; Unidata – David Fulker; Visiting Scientists Programs (VSP) – Meg Austin; Joint Office for Science Support (JOSS) – Karyn Sawyer
  • NCAR divisions and programs: Atmospheric Chemistry Division (ACD) – Daniel McKenna; Atmospheric Technology Division (ATD) – David Carlson; Advanced Study Program (ASP) – Al Cooper; Climate & Global Dynamics Division (CGD) – Maurice Blackmon; Environmental & Societal Impacts Group (ESIG) – Robert Harriss; High Altitude Observatory (HAO) – Michael Knölker; Mesoscale & Microscale Meteorological Division (MMM) – Robert Gall; Research Applications Programs (RAP) – Brant Foote; Scientific Computing Division (SCD) – Al Kellie
  • Chart dated 12/07/98; a legend marks units belonging to the President’s Office

  5. NCAR Organization
  • UCAR: Rick Anthes; UCAR Board of Trustees
  • NCAR: Tim Killeen, Director; Associate Director: Steve Dickson; ISS: K. Kelly; B&P: R. Brasher
  • Atmospheric Chemistry: Dan McKenna
  • Climate & Global Dynamics: Maurice Blackmon
  • Mesoscale & Microscale Meteorology: Bob Gall
  • High Altitude Observatory: Michael Knolker
  • Scientific Computing: Al Kellie
  • Atmospheric Technology: Dave Carlson
  • Research Applications: Brant Foote
  • ESIG: Bob Harriss
  • ASP: Al Cooper

  6. NCAR at a Glance
  • 41 years; 850 staff, 135 scientists
  • $128M budget for FY2001
  • 9 divisions and programs
  • Research tools, facilities, and visitor programs for the NSF and university communities

  7. Total FY2001 funding: $128M

  8. NCAR Peer-Reviewed Publications

  9. NCAR Visitors

  10. Where did SCD come from?
  • 1959 “Blue Book”: “There are four compelling reasons for establishing a National Institute for Atmospheric Research”
  • Reason 2: the requirement for facilities and technological assistance beyond those that can properly be made available at individual universities

  11. SCD Mission: Enable the best atmospheric & related research, no matter where the investigator is located, through the provision of high-performance computing technologies and related services.

  12. SCIENTIFIC COMPUTING DIVISION (staff counts in parentheses)
  • Director’s Office: Al Kellie, Director (12)
  • Data Support: Roy Jenne (9) – data archives, data catalogs, user assistance
  • Computational Science: Steve Hammond (8) – algorithmic software development, model performance research, science collaboration, frameworks, standards & benchmarking
  • High Performance Systems: Gene Harano (13) – supercomputer systems, mass storage systems
  • User Support Section: Ginger Caldwell (21) – training/outreach/consulting, digital information, allocations & account management, database applications, site licenses
  • Operations and Infrastructure Support: Aaron Andersen (18) – distributed servers & workstations, operations room, facility management & reporting
  • Networking Engineering & Telecommunications: Marla Meehl (25) – LAN, MAN, WAN, dial-up access, network infrastructure
  • Visualization & Enabling Technologies: Don Middleton (12) – data access, data analysis, visualization
  • Budget ($K): base $24,874; UCAR $4,027; outside $2,020; overhead $1,063

  13. Computing Services for Research
  • SCD operates two distinct computational facilities: one for climate simulations and one for the university community.
  • Governance of these SCD resources is in the hands of the users, through two external allocation committees.
  • Computing leverages a common infrastructure for access, networking, data storage & analysis, research data sets, and support services including software development and consulting.

  14. Climate Simulation Laboratory (CSL)
  • The CSL is a national, multi-agency, special-use computing facility for climate system modeling in support of the U.S. Global Change Research Program (USGCRP).
  • It serves priority projects that require very large amounts of computer time.
  • CSL resources are available to individual U.S. researchers, with a preference for research teams, regardless of sponsorship.
  • An inter-agency panel selects the projects that use the CSL.

  15. Community Facility The Community Facility is used primarily by university-based NSF grantees and NCAR Scientists. Community resources are allocated evenly between NCAR and the university community. NCAR resources are allocated by the NCAR Director to the various NCAR divisions. University resources are allocated by the SCD Advisory Panel. Open to areas of atmospheric and related sciences.

  16. Distribution of Compute Resources

  17. History of Supercomputing at NCAR (timeline chart, 1960–2001): from the CDC 3600, 6600, and 7600 through the Cray 1-A (S/N 3 and S/N 14), Cray X-MP/4, Y-MP/2, Y-MP/8 and Y-MP/8I, C90/16, J90/16, J90/20, J90se/24, T3D/64 and T3D/128, and CCC Cray 3/4; the TMC CM2/8192 and CM5/32; the IBM RS/6000 cluster and SP1/8; the HP SPP-2000/64; the SGI Origin2000/128; Beowulf/16 and Compaq ES40/36 clusters; and the IBM SP systems (SP/32, SP/64, SP/296, SP/604, SP/1308). The chart distinguishes production machines, non-production machines, and systems currently in production.

  18. STK 9940 units #4 and #5 (2001)

  19. NCAR Wide Area Connectivity
  • OC3 (155 Mbps) to the Front Range GigaPop; OC12 (622 Mbps) on 1/1/2002
  • OC3 to AT&T commodity Internet
  • OC3 to C&W commodity Internet
  • OC3 to Abilene (OC12 on 1/1/2002)
  • OC3 to the vBNS+
  • OC12 (622 Mbps) to the University of Colorado at Boulder: intra-site research and back-up link to the FRGP
  • OC12 to NOAA/NIST in Boulder: intra-site research and UUNET commodity Internet
  • Dark-fiber metropolitan area network at GigE (1000 Mbps) to other NCAR campus sites
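To put these link rates in perspective, here is a small back-of-the-envelope Python sketch of how long a bulk data transfer would take at each rate; the 100 GB dataset size and the assumption of full line-rate utilization with no protocol overhead are illustrative assumptions, not figures from the presentation.

```python
# Illustrative arithmetic only: best-case transfer times at the nominal
# line rates quoted on the slide, assuming 100% utilization and no
# protocol overhead (both unrealistic in practice).

LINK_RATES_MBPS = {
    "OC3": 155,      # megabits per second
    "OC12": 622,
    "GigE": 1000,
}

def transfer_hours(dataset_gb: float, rate_mbps: float) -> float:
    """Minimum hours to move dataset_gb gigabytes at rate_mbps."""
    bits = dataset_gb * 8e9                # GB -> bits (decimal units)
    return bits / (rate_mbps * 1e6) / 3600.0

if __name__ == "__main__":
    dataset_gb = 100.0                     # hypothetical output volume
    for name, rate in LINK_RATES_MBPS.items():
        print(f"{name:>5}: {transfer_hours(dataset_gb, rate):6.2f} h for {dataset_gb:.0f} GB")
```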

  20. TeraGrid Wide Area Network (map): StarLight international optical peering point (see www.startap.net); Abilene backbone via Chicago and Indianapolis (Abilene NOC); DTF backbone linking Denver, Los Angeles, San Diego, Urbana (NCSA/UIUC), ANL, Univ of Chicago, UIC, and Ill Inst of Tech; OC-48 (2.5 Gb/s, Abilene), multiple 10 GbE (Qwest), multiple 10 GbE (I-WIRE dark fiber), multiple carrier hubs. Solid lines in place and/or available by October 2001; dashed I-WIRE lines planned for summer 2002.

  21. ARCS Synopsis (credit: Tom Engel)

  22. ARCS RFP Overview: Best Value Procurement
  • Technical evaluation
  • Delivery schedule
  • Production disruption
  • Allocation ready state
  • Infrastructure
  • Maintenance
  • Cost impact (i.e., existing equipment)
  • Past performance of bidders
  • Business proposal review
  • Other considerations: invitation to partner

  23. ARCS Procurement
  • Production-level availability, robust batch capacity, operational sustainability and support
  • Integrated software engineering and development environment
  • High-performance execution of existing applications
  • Additionally, an environment conducive to development of next-generation models

  24. Workload profile context
  • Jobs using > 32 nodes: 0.4% of workload; average 44 nodes (176 PEs)
  • Jobs using < 32 nodes: 99.6% of workload; average 6 nodes (24 PEs)
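The node and PE figures above are mutually consistent if each node has 4 processors; the quick check below makes that arithmetic explicit (the 4 PEs/node figure is inferred from the slide's own numbers, not stated on it).

```python
# Check that the slide's average node counts and PE counts agree,
# assuming 4 processors (PEs) per node -- inferred from 44 * 4 = 176
# and 6 * 4 = 24, not stated explicitly on the slide.

PES_PER_NODE = 4

workload = {
    "> 32 nodes": {"share_pct": 0.4,  "avg_nodes": 44},
    "< 32 nodes": {"share_pct": 99.6, "avg_nodes": 6},
}

for label, w in workload.items():
    pes = w["avg_nodes"] * PES_PER_NODE
    print(f"{label:>11}: {w['share_pct']:5.1f}% of workload, "
          f"avg {w['avg_nodes']} nodes = {pes} PEs")
```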

  25. ARCS – The Goal
  • A production-level, high-performance computing system providing for both capability and capacity computing
  • A stable and upwardly compatible system architecture, user environment, and software engineering & development environments
  • Initial equipment: at least double current capacity at NCAR
  • Long term: achieve 1 TFLOPS sustained by 2005

  26. ARCS – The Process
  • SCD began technical requirements draft Feb 2000
  • RFP process (including scientific reps from NCAR divisions, UCAR Contracts, & external review panel) formally began Mar 2000; RFP released Nov 2000
  • Offeror proposal reviews, BAFOs, & supplemental proposals Jan–May 2001
  • Technical evaluations, performance projections, risk assessment, etc. Feb–Jun 2001
  • SCD recommendation for negotiations 21 Jun; NCAR/UCAR acceptance of recommendation 25 Jun
  • Negotiations 24–26 Jul; technical Ts&Cs completed 14 Aug
  • Contract submitted to the NSF 01 Oct
  • NSF approval 5 Oct … joint press release the week of SC01

  27. ARCS RFP Technical Attributes
  • Hardware (processors, nodes, memory, disk, interconnect, network, HIPPI)
  • Software (OS, user environment, filesystems, batch subsystem)
  • System administration, resource management, user limits, accounting, network/HIPPI, security
  • Documentation & training
  • System maintenance & support services
  • Facilities (power, cooling, space)

  28. Major Requirements
  • Critical resource ratios: disk 6 bytes/peak-FLOP, with 64+ MB/s single-stream and 2+ GB/s bandwidth, sustainable; memory 0.4 bytes/peak-FLOP
  • “Full-featured” product set (cluster-aware compilers, debuggers, performance tools, administrative tools, monitoring)
  • Hardware & software stability
  • Hardware & software vendor support & responsiveness (on-site, call center, development organization, escalation procedures)
  • Resource allocation (processor(s), node(s), memory, disk; user limits & disk quotas)
  • Batch subsystem and NCAR job scheduler (BPS)
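The critical-resource ratios turn directly into minimum disk and memory sizes once a peak compute rate is fixed; the sketch below works the arithmetic for a hypothetical 2 TFLOPS-peak machine (the 2 TFLOPS figure is an illustrative assumption, not a number from the RFP).

```python
# Work the RFP's critical-resource ratios through for a hypothetical
# peak rate. The 2 TFLOPS peak is an assumed example; the ratios
# (6 bytes/peak-FLOP disk, 0.4 bytes/peak-FLOP memory) are from the slide.

DISK_BYTES_PER_PEAK_FLOP = 6.0
MEM_BYTES_PER_PEAK_FLOP = 0.4

def required_capacities(peak_tflops: float):
    """Return (disk_TB, memory_TB) implied by the resource ratios."""
    peak_flops = peak_tflops * 1e12
    disk_tb = DISK_BYTES_PER_PEAK_FLOP * peak_flops / 1e12
    mem_tb = MEM_BYTES_PER_PEAK_FLOP * peak_flops / 1e12
    return disk_tb, mem_tb

if __name__ == "__main__":
    disk_tb, mem_tb = required_capacities(2.0)   # hypothetical 2 TFLOPS peak
    print(f"2 TFLOPS peak -> >= {disk_tb:.1f} TB disk, >= {mem_tb:.1f} TB memory")
```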

  29. ARCS – Benchmarks (1)
  • Kernels (Hammond, Harkness, Loft)
    • Single processor: COPY, IA, XPOSE, SHAL, RADABS, ELEFUNT, STREAMC
    • Multi-processor shared memory: PSTREAM
    • Message-passing performance: XPAIR, BISECT, XGLOB, COMMS[1,2,3], STRIDED[1,2], SYNCH, ALLGATHER
  • Parallel shared-memory applications
    • CCM3.10.16 (T42 30-day & T170 1-day) – CGD, Rosinski
    • WRF prototype (b_wave 5-day) – MMM, Michalakes
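The single-processor kernels above are memory- and compute-bound microbenchmarks; as a rough illustration of what a copy-bandwidth kernel such as COPY or STREAMC measures, here is a minimal NumPy sketch of the general technique (this is not the actual ARCS benchmark code).

```python
# Minimal illustration of a STREAM-style copy-bandwidth measurement,
# in the spirit of the COPY / STREAMC kernels; not the ARCS benchmark code.
import time
import numpy as np

N = 50_000_000                     # array length (~400 MB per array)
a = np.random.random(N)
b = np.empty_like(a)

best = float("inf")
for _ in range(5):                 # take the best of several trials
    t0 = time.perf_counter()
    np.copyto(b, a)                # the "copy" kernel: b[i] = a[i]
    best = min(best, time.perf_counter() - t0)

bytes_moved = 2 * a.nbytes         # one read + one write per element
print(f"copy bandwidth ~ {bytes_moved / best / 1e9:.2f} GB/s")
```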

  30. ARCS – Benchmarks (2)
  • Parallel (MPI & hybrid) models
    • CCM3.10.16 (T42 30-day & T170 1-day) – CGD, Rosinski
    • MM5 3.3 (t3a 6-hr & “large” 1-hr) – MMM, Michalakes
    • POP 1.0 (medium & large) – CGD, Craig
    • MHD3D (medium & large) – HAO, Fox
    • MOZART2 (medium & large) – ACD, Walters
    • PCM 1.2 (T42) – CGD, Craig
    • WRF prototype (b_wave 5-day) – MMM, Michalakes
  • System tests
    • HIPPI – SCD, Merrill
    • I/O-tester – SCD, Anderson
    • Network – SCD, Mitchell
  • Batch workload (SCD, Engel) includes: 2 I/O-tester, 4 hybrid MM5 3.3 large, 2 hybrid MM5 3.3 t3a, 2 POP 1.0 (medium & large), CCM3.10.16 T170, MOZART2 medium, PCM 1.2 T42, 2 MHD3D (medium & large), WRF prototype
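The message-passing kernels on the previous slide (e.g., XPAIR, COMMS) and the MPI models listed here all stress point-to-point communication; a minimal mpi4py ping-pong in that spirit is sketched below (again, an illustrative sketch, not the ARCS benchmark code).

```python
# Minimal mpi4py ping-pong, in the spirit of the point-to-point
# message-passing kernels; not the ARCS benchmark code.
# Run with, e.g.:  mpirun -np 2 python pingpong.py
import time
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

nbytes = 1 << 20                       # 1 MiB message
buf = np.zeros(nbytes, dtype=np.uint8)
reps = 100

comm.Barrier()
t0 = time.perf_counter()
for _ in range(reps):
    if rank == 0:
        comm.Send(buf, dest=1, tag=0)
        comm.Recv(buf, source=1, tag=0)
    elif rank == 1:
        comm.Recv(buf, source=0, tag=0)
        comm.Send(buf, dest=0, tag=0)
elapsed = time.perf_counter() - t0

if rank == 0:
    one_way = elapsed / (2 * reps)     # average one-way time per message
    print(f"1 MiB one-way time {one_way * 1e6:.1f} us, "
          f"bandwidth {nbytes / one_way / 1e9:.2f} GB/s")
```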

  31. Risks
  • Vendor ability to meet commitments
  • Hardware (processor architecture, clock speed boosts, memory architecture)
  • Software (OS, filesystems, processor-aware compilers/libraries, third-party tools)
  • Service, support, responsiveness
  • Vendor stability (product set, financial)
  • Vendor promises vs. reality

  32. Past Performance
  • Hardware & software: SCD/NCAR experience; other customers’ experience
  • “Missed promises”:
    • Vendor X: ~2-year slip, product line changes
    • Vendor Y: on target
    • Vendor Z: ~1.5-year slip, product line changes

  33. Other Considerations
  • “Blue Light” project: invitation to develop models for an exploratory supercomputer
  • Invitation to a development partnership; offer for an industrial partnership
  • 256 TFLOPS peak, 8 TB memory, 200 TB disk on 64k nodes; true MPP with torus interconnect
  • Node: 64 GFLOPS, 128 MB memory, 32 KB L1 cache, 4 MB L2 cache
  • Columbia, LLNL, SDSC, Oak Ridge

  34. ARCS Award
  • IBM was chosen to supply the NCAR Advanced Research Computing System (ARCS), which will exceed the articulated purpose and goals
  • A world-class system to provide reliable production supercomputing to the NCAR Community and Climate Simulation Laboratory
  • A phased introduction of new, state-of-the-art computational, storage, and communications technologies through the life of the contract (3–5 years)
  • First equipment delivered Friday, 5 October

  35. ARCS Timetable

  36. ARCS Capacities (minimum). Negotiated capability commitments may require installation of additional capacity.

  37. ARCS Commitments
  • Minimum model capability commitments: blackforest upgrade 1.0x (defines ‘x’); bluesky 3.1x; bluesky upgrade 4.6x
  • Failure to meet these commitments will result in IBM installing additional computational capacity
  • Improved user environment functionality, support, and problem-resolution response
  • Early access to new hardware & software technologies
  • NCAR’s participation in IBM’s “Blue Light” exploratory supercomputer project (PFLOPs)
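The model-capability commitments are all multiples of the upgraded blackforest system's sustained performance ('x'); the sketch below shows how they convert to absolute sustained capability for a hypothetical value of x (the 0.2 TFLOPS sustained baseline is an assumption chosen only to make the arithmetic concrete, not a contract figure).

```python
# Convert the relative capability commitments into absolute sustained
# performance for an assumed baseline. The 0.2 TFLOPS sustained value of
# 'x' is a hypothetical example; the multipliers are from the slide.

x_sustained_tflops = 0.2            # assumed: blackforest upgrade = 1.0x

commitments = {
    "blackforest upgrade": 1.0,
    "bluesky":             3.1,
    "bluesky upgrade":     4.6,
}

for system, multiple in commitments.items():
    print(f"{system:>20}: {multiple:.1f}x = "
          f"{multiple * x_sustained_tflops:.2f} TFLOPS sustained")
```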

  38. Proposed Equipment - IBM (table). † Federation switch (2400 MB/s, 4 µs) option in 2H03.

  39. ARCS Roadmap

  40. Thank you all for attending CAS 2001
