The Swiss Initiative for High-Performance Computing and Networking. Neil Stringfellow, Associate Director CSCS. Centro Svizzero di Calcolo Scientifico (CSCS), Swiss National Supercomputing Center. Established in 1991 by the Swiss Government as an autonomous unit of ETH Zurich
The Swiss Initiative for High-Performance Computing and Networking • Neil Stringfellow, Associate Director CSCS
Centro Svizzero di Calcolo Scientifico (CSCS): Swiss National Supercomputing Center
• Established in 1991 by the Swiss Government as an autonomous unit of ETH Zurich
• Located in Manno, near Lugano
• Highly qualified, internationally recognized staff (41 FTE)
• Develops, promotes, and provides leading-edge high-performance computing services to the Swiss research community
• 400 users working on 50 projects (status 2009)
• Hosts and operates, on behalf of MeteoSwiss, the supercomputer for operational weather forecasts (8 simulations per day; first country in Europe to run high-resolution weather forecasts)
The national HPCN strategy
Issues
• HPC is a key requirement for leadership science as well as for a knowledge-based society and industry
• International competition in HPC is accelerating (USA, Japan, Germany, France, UK, Spain, China, and India)
• Economy of scale is a basis of HPC
Answers
• Installation of a Petaflop/s computer by 2011/2012
• Construction of a new CSCS building
• Creation of a Swiss competence network to connect existing application areas and reach out to new ones
Stimulus Package
• Swiss Stimulus Package: 700 Million CHF
• 3.5 Million for Building Planning
• 10 Million for New Machine
• 3 Million for HPC Education
• 2% of all Stimulus Money went to CSCS!
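The "2%" figure on the stimulus slide can be checked with a few lines of arithmetic. The sketch below assumes that the three CSCS amounts are in million CHF, like the package total (the slide states the currency only for the total); the variable names are illustrative.

```python
# Amounts as quoted on the slide; the unit (million CHF) is assumed
# to be the same for the three CSCS items and the package total.
allocations_mchf = {
    "building planning": 3.5,
    "new machine": 10.0,
    "HPC education": 3.0,
}
stimulus_total_mchf = 700.0

# Sum the CSCS items and express them as a share of the whole package.
cscs_total_mchf = sum(allocations_mchf.values())
cscs_share_percent = 100.0 * cscs_total_mchf / stimulus_total_mchf

print(f"CSCS total: {cscs_total_mchf} million CHF")
print(f"Share of stimulus package: {cscs_share_percent:.1f}%")
```

Under these assumptions the three items add up to 16.5 million CHF, a little over 2% of the package, which is consistent with the rounded figure on the slide.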
Cray XT5 – Monte Rosa
• 14,752 processor cores
• 1844 eight-way nodes
• 2 AMD 2.4 GHz “Shanghai” Opterons per node
• Upgrade underway to 2.4 GHz “Istanbul”
• Peak performance 141 Tflop/s
• Linpack 117 Tflop/s
• Peak will be 212 Tflop/s after upgrade
• 29 Terabytes of memory
• 16 Gigabytes per node
• 2 Gbytes per processor core
• 287 Terabytes of scratch file system
• ~ capable of 12 GB/s sustained write bandwidth
• 23rd on Top500 list in June 2009
• 4th most powerful system in Europe
• Already at 90% utilisation
• ~ 30% of jobs require > 50% of machine
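The core counts, peak performance, and memory figures for Monte Rosa follow from the node configuration. The sketch below reproduces them under two assumptions that are not stated on the slide: each Opteron core retires 4 double-precision floating-point operations per clock cycle, and the "Shanghai" and "Istanbul" parts have 4 and 6 cores per socket respectively (consistent with the eight-way nodes and the hex-core upgrade mentioned elsewhere in the deck). All names are illustrative.

```python
# Node configuration as given on the slide.
nodes = 1844
sockets_per_node = 2
clock_ghz = 2.4
memory_per_node_gb = 16

# Assumption (not on the slide): 4 double-precision flops per core per cycle.
flops_per_cycle = 4


def peak_tflops(cores_per_socket):
    """Return (total cores, theoretical peak in Tflop/s) for a socket type."""
    cores = nodes * sockets_per_node * cores_per_socket
    peak = cores * clock_ghz * flops_per_cycle / 1000.0  # Gflop/s -> Tflop/s
    return cores, peak


shanghai_cores, shanghai_peak = peak_tflops(4)  # quad-core (assumed)
istanbul_cores, istanbul_peak = peak_tflops(6)  # six-core (assumed)
total_memory_tb = nodes * memory_per_node_gb / 1000.0
linpack_efficiency = 117.0 / shanghai_peak  # measured Linpack / theoretical peak

print(f"Shanghai: {shanghai_cores} cores, peak {shanghai_peak:.1f} Tflop/s")
print(f"Istanbul: {istanbul_cores} cores, peak {istanbul_peak:.1f} Tflop/s")
print(f"Memory: {total_memory_tb:.1f} TB")
print(f"Linpack efficiency: {linpack_efficiency:.0%}")
```

With these assumptions the calculation gives about 141.6 and 212.4 Tflop/s and about 29.5 TB, matching the slide's 141 Tflop/s, 212 Tflop/s, and 29 Terabytes.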
Pillars of the 21st-century scientific method
Theory (since antiquity) and simulation (since Metropolis, Teller, von Neumann, Fermi, ... 1940s) combined with experiment (since Galilei & Newton)
Excellence in science requires leadership in all three areas: theory, experiment, and simulation
Invest in algorithms or computer hardware? (source: David Landau, UGA)
Simulations are necessary for scientific investigations to cope effectively with complex systems • Science is about discovery and understanding - those who come first get the credit • Simulations that use high-performance computing (HPC) have the competitive edge • Leadership in science requires leadership in simulation and leadership in HPC in particular
Role of science in Switzerland: why we are well positioned to make leading contributions to HPC • Switzerland puts a high value on scientific research and education and on maintaining international leadership in science and engineering • The density of internationally recognized computational scientists in Switzerland is very high, even when compared to the USA • Stable funding and flat hierarchies in Switzerland and particularly at ETH allow for a pragmatic, solution-oriented, and nimble response to new challenges and opportunities
Computational Science in Switzerland
The density of internationally recognized computational scientists in Switzerland is very high
Figure: institutions shown under the labels "Top 200 in Shanghai List" and "CSCS User Community": University of Zurich, University of Basel, University of Bern, University of Geneva, ETH Zurich, EPF Lausanne, EMPA, Paul Scherrer Institute
INSTITUTE FOR ATMOSPHERIC AND CLIMATE SCIENCE, ETH ZURICH, Switzerland / CSCS Swiss National Supercomputing Center
Predicting the frequency of severe weather events in a changing climate: high-resolution simulations are crucial
Potential vorticity streamers are intrusions of stratospheric air into the troposphere. They affect various atmospheric processes, such as heavy precipitation over the Alps. ECHAM-HAM high-resolution simulations reliably capture the frequency at which potential vorticity streamers occur. Low-resolution simulations underestimate their occurrence. (Master's thesis, A. Béguin, ETH Zurich, 2009)
Figure: absolute streamer occurrence on 330 K during winter (DJF). Panels: 2.8 x 2.8 (T42L19), 1.9 x 1.9 (T63L31), 1.1 x 1.1 (T106L31), reference data (ERA40, 1 x 1)
Europe in ECHAM-HAM
High resolution is required to:
1) provide boundary conditions for a nested regional model
2) compare the model with regional-scale observational data
For example, Italy, the Alps, or Denmark are missing at low resolution (2.8 x 2.8).
Figure: land/sea distribution and terrain height (scale: sea, 0 m, 500 m, 1000 m, 1500 m, 2000 m). Panels: 2.8 x 2.8 (T42), 1.9 x 1.9 (T63), 1.1 x 1.1 (T106)
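Why resolution is such an issue for Switzerland
Figure: model grid spacings of 70 km, 35 km, 8.8 km, 2.2 km, and 0.55 km, annotated with relative computational cost factors of 1X, 100X, 10,000X, and 1,000,000X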
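Why finer grids become expensive so quickly can be illustrated with a common rule of thumb. The sketch below assumes that cost grows with the cube of the horizontal refinement factor (two horizontal dimensions plus a proportionally shorter time step, with the number of vertical levels held fixed). This exponent is an assumption for illustration, not the model behind the cost factors on the slide, and `relative_cost` is a hypothetical helper.

```python
def relative_cost(coarse_km, fine_km, exponent=3):
    """Rough cost ratio of a fine grid relative to a coarse one.

    Assumes cost ~ (coarse_km / fine_km) ** exponent; the default
    exponent of 3 is a rule-of-thumb assumption, not a measured value.
    """
    if coarse_km <= 0 or fine_km <= 0:
        raise ValueError("grid spacings must be positive")
    return (coarse_km / fine_km) ** exponent


# Grid spacings shown on the slide, compared against the coarsest one.
grid_spacings_km = [70, 35, 8.8, 2.2, 0.55]
for dx in grid_spacings_km:
    print(f"{dx:>5} km: ~{relative_cost(70, dx):,.0f}x the cost of 70 km")
```

Under this assumption, each fourfold refinement (for example from 2.2 km to 0.55 km) multiplies the cost by about 64; a larger exponent, for instance when vertical resolution is refined as well, raises the factor further.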
High-resolution cloud-resolving regional climate simulations: towards improved simulations of the water cycle in a changing climate
The Alpine area is very vulnerable to changes in the water cycle such as droughts, heat waves, and floods. Current projections of future changes in summer precipitation are highly uncertain.
Advantages of cloud-resolving climate models: (1) better representation of the land surface, (2) explicit representation of heavy precipitation (e.g. thunderstorms).
Better representation of the daily cycle of precipitation in summer periods (Hohenegger et al. 2008, MZ).
Figure panels: cloud resolving @ 2.2 km; state-of-the-art @ 25 km
Terrain height in the regional climate model at different resolutions
Importance of HPC for modelling other Natural Hazards in Switzerland
Application areas shown: Climate and Weather, Avalanches, Earthquakes, Engineering, Energy, Astrophysics
• In 1356 Basel was destroyed by an earthquake.
• We now know that large earthquakes are more frequent than previously thought.
• Earthquake modelling is important for planning nuclear power plant safety.
Selected application areas for simulation-based science and engineering in Switzerland
• Climate and Weather
• Biomedical
• Engineering
• Energy
• Materials science
• Chemistry/Pharmaceutical
• Astrophysics
• and many others
Simulations require a high-performance computing ecosystem
Tiers of the ecosystem: Leadership; Capability computing at regional/national centers; Local/institutional capacity computing
1. Prior to 2004: VASP code developed on workstations and clusters; runs on about 100 processors
2. Scale-out, 2005: algorithm and implementation adapted for leadership systems
3. Leadership runs, 2006-2007: production runs on leadership Cray XT3/4 system (~5000 processors)
4. Large simulations since 2008: continue large simulations on capability systems
Strategic goals
• In order to sustain a leading position in science, Switzerland has to develop leadership in HPC to support simulations, one of the three pillars of modern science
• Sustainable implementation of the HPC ecosystem in Switzerland, which includes the national supercomputing center and institutional computing facilities, as well as effective mapping of models and methods onto modern HPC hardware
• Establish strong relationships with leadership computing facilities around the world
• Develop key components of HPC in Switzerland:
  • Method and algorithm development
  • Programming models, languages, and architectures for HPC
  • Sustained operations of national and institutional HPC systems
The ecosystem in numbers (peak performance)
Figure: peak performance by tier (Leadership; Capability computing at regional/national centers; Local/institutional capacity computing) for 2009 (today) and 2011 (planned), with steps of 5x marked between levels
• Leadership: Jaguar @ ORNL/LCF (200 XT5 cabinets); Sequoia (BG/Q) @ LLNL; LCF3: Argonne or Oak Ridge
• National: Rosa @ CSCS, only 20 XT5 cabinets => 210 TFlop/s (infrastructure limited)
• Local/institutional: EPFL ~60 TFlop/s, UZH ~60 TFlop/s, ETHZ ~70 TFlop/s
• Performance levels marked: 20 PFlop/s, 4 PFlop/s, 1.5 PFlop/s, 800 TFlop/s, 300 TFlop/s, 60 TFlop/s
• Annotation: Will require new building infrastructure at CSCS. Think about this now!
Elements of the Swiss HPCN Initiative
• Swiss Platform for HP2C (2009-12):
  • Simulation capabilities that make effective use of next-generation supercomputers
  • Establish HPC in CSE programs at Swiss universities
• Hardware Phase I (2009-11):
  • Upgrade Cray XT system at CSCS to the maximum possible within current infrastructure
• Develop new building infrastructure by 2012:
  • State-of-the-art infrastructure that supports a machine footprint about a factor of 10 larger than today
• Hardware Phase II (2012-15):
  • Goal for CSCS is to host systems with 20-25% of the performance of the largest leadership system in the world
Experiences with upgrade in 2009
• Implemented in record time!
  • March: financing, decision & placement of order
  • February through April: site preparations
  • May: installation
  • June: early users & acceptance
  • July: part of CSCS user program
• CSCS at maximum of current building capacity
  • Current power usage 1.9 MW (99% of capacity)
  • Running at maximum cooling capacity (frequent system shut-down in summer)
  • Abandon memory upgrade in fall 2009
  • No room to further grow computer systems in the future
New building planned in Lugano
• Area (1500 m^2)
• Power & cooling ~ 10 MW
• Proximity to academic institution
• Facilitate seamless changes in computer hardware
• Extensible
Learning from the Oak Ridge experience: covering all aspects of the simulation system
Example based on ORNL’s early science teams that run on the first petaflop/s systems
Figure: layers of the simulation system: Simulations; Models, Methods, & Implementation; Map to Hardware; System operation; System design
Actors shown: Users; Physics (chemistry, ...); Comp. mathematics; Computer Science; Application software vendors; Computer Center; Hardware vendor
Distributing the tasks in Switzerland: CSE at universities; Applied research CSCS & USI; Systems research CS Dept. & vendors; CSCS’s (HPC Centers) traditional role
The Swiss platform for High-Performance and High-Productivity Computing (HP2C)
• Develop simulation capabilities that will make effective use of supercomputing platforms in 2012-14
• Implement the “networking” part of the HPCN strategy
• Core program in computational mathematics and problem-oriented computer science (jointly between CSCS & University of Lugano)
• About 10-15 domain science sub-projects at Swiss universities with ~3 “embedded” HPC developers per project
• Explore future hardware architectures with industry (Cray, IBM, other) and leading laboratories (ORNL, NERSC, others)
• Develop HPC components of computational science and engineering curricula at Swiss universities
  • Already established: CSE at ETH, U. Basel
  • Currently under development: CSE @ USI, EPFL, UZH
  • Reach out to other universities
Projects have to face “brutal facts of HPC”
• Massive concurrency: applications will have to put up with millions (billions) of threads
• Less and (relatively) slower memory per thread: memory considerations should be an integral part of complexity analysis
• Only slow improvements in inter-processor and inter-thread communications; remember that the speed of light is constant!
• Stagnant I/O subsystems: you don’t want to limit progress in simulation capabilities to the rate of progress in long-term storage technologies
• Resilience and fault tolerance: resilience towards failure of individual components; the (energy) cost of error detection and correction is non-negligible
Expected research priorities of projects
• Significant problems that require orders of magnitude more computer power than what is available today
• Significant re-engineering of algorithms and refactoring of codes; scientific progress cannot be limited by legacy software
• Consider emerging parallel programming models: multiple levels of parallelism, PGAS, DARPA HPCS languages, heterogeneous nodes (consider CPU + accelerator)
• Revisit workflows, in particular to avoid I/O
Timeline
• Letters of intent were due August 15, 2009
• Project proposals were due September 30, 2009
• Review and decision-making process in October/November 2009
• Tier 1 projects start in Dec. 2009/Jan. 2010
• Tier 2 projects start ca. spring/summer 2010
CSCS service portfolio
• Business Services
• Administration
• Human resources
• Finance
• Building Infrastructure
• IT Infrastructure
• National Supercomputing Service
• HPC Systems
• System programming
• Resource allocation
• User support
• User education & training
• Short- to medium-term application support
• Scientific Computing
• Long-term application development support
• Data analysis & visualisation
• Experimental HPC systems
• Research Computing Collocation Service
• MeteoSwiss
• CHIPP
• Other hosting mandates
Category labels: Internal support services; Core business: Academic HPC service and HPC research; Technology transfer
Figure: CSCS timeline, 2005 to 2013. Milestones shown:
• New procurement: Cray XT3, 1’100 processors
• Upgrade: Cray XT3, 1’664 proc.
• Dual-core upgrade: Cray XT3, 3’328 cores
• Upgrade: Cray XT5, 14’752 cores
• Hex-core upgrade: 22’128 cores
• “Final” upgrade: Cray XT5
• Begin construction of new building
• New building
• Procurement of next-generation supercomputer (HPCN initiative)
• High-risk & high-impact projects of the HP2C platform (www.hp2c.ch)