Application Scalability and High Productivity Computing
Nicholas J. Wright, John Shalf, Harvey Wasserman
Advanced Technologies Group, NERSC/LBNL
NERSC: the National Energy Research Scientific Computing Center
• Mission: accelerate the pace of scientific discovery by providing high performance computing, information, data, and communications services for all DOE Office of Science (SC) research.
• The production computing facility for DOE SC.
• Part of the Berkeley Lab Computing Sciences Directorate, alongside the Computational Research Division (CRD) and ESnet.
NERSC is the Primary Computing Center for the DOE Office of Science
• NERSC serves a large population: over 3,000 users, 400 projects, and 500 codes.
• NERSC serves the DOE SC mission:
  • allocated by DOE program managers
  • not limited to the largest-scale jobs
  • not open to non-DOE applications
• Strategy: science first
  • requirements workshops by office
  • procurements based on science codes
  • partnerships with vendors to meet science requirements
NERSC Systems for Science
• Large-scale computing systems
  • Franklin (NERSC-5): Cray XT4; 9,532 compute nodes; 38,128 cores; ~25 Tflop/s on applications; 356 Tflop/s peak
  • Hopper (NERSC-6): Cray XE6; Phase 1: Cray XT5, 668 nodes, 5,344 cores; Phase 2: 1.25 Pflop/s peak (late-2010 delivery)
• Clusters (140 Tflop/s total)
  • Carver: IBM iDataPlex cluster
  • PDSF (HEP/NP): ~1K-core throughput cluster
  • Magellan cloud testbed: IBM iDataPlex cluster
  • GenePool (JGI): ~5K-core throughput cluster
• NERSC Global Filesystem (NGF): uses IBM's GPFS; 1.5 PB capacity; 5.5 GB/s of bandwidth
• Analytics: Euclid (512 GB shared memory); Dirac GPU testbed (48 nodes)
• HPSS archival storage: 40 PB capacity; 4 tape libraries; 150 TB disk cache
NERSC Roadmap
[Figure: peak Tflop/s vs. year, tracking the Top500 trend: Franklin (N5): 101 TF peak, 19 TF sustained; Franklin (N5) + QC: 352 TF peak, 36 TF sustained; Hopper (N6): >1 PF peak; NERSC-7: 10 PF peak; NERSC-8: 100 PF peak; NERSC-9: 1 EF peak]
• Users expect a 10x improvement in capability every 3-4 years.
• How do we ensure that users' performance follows this trend and their productivity is unaffected?
Hardware Trends: The Multicore Era
• Moore's Law continues unabated.
• Power constraints mean that core counts, not clock speeds, will now double every 18 months.
• Memory capacity is not doubling at the same rate, so GB/core will decrease.
• Power is the leading design constraint.
(Figure courtesy of Kunle Olukotun, Lance Hammond, Herb Sutter, and Burton Smith)
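To make the GB/core bullet concrete, a back-of-the-envelope illustration; the 3-year memory-capacity doubling period is an assumption made for the arithmetic, not a figure from the slide:

\[
\left.\frac{\mathrm{GB}}{\mathrm{core}}\right|_{t+3\,\mathrm{yr}} = \frac{2 \times \mathrm{GB}(t)}{4 \times \mathrm{cores}(t)} = \frac{1}{2}\left.\frac{\mathrm{GB}}{\mathrm{core}}\right|_{t}
\]

That is, if cores double every 18 months (x4 in 3 years) while capacity merely doubles, memory per core halves roughly every 3 years.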
…and the power costs will still be staggering
• Roughly $1M per megawatt per year, and that is with CHEAP power!
(From Peter Kogge, DARPA Exascale Study)
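The $1M-per-megawatt-year figure is simple arithmetic once an electricity price is assumed; taking roughly $0.11/kWh as an illustrative "cheap power" rate:

\[
1\,\mathrm{MW} \times 8760\,\mathrm{h/yr} \times \$0.11/\mathrm{kWh} \approx \$0.96\mathrm{M\ per\ year}
\]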
Changing Notion of "System Balance"
• If you pay 5% more to double the FPUs and get a 10% performance improvement, it is a win (despite lowering your percentage of peak performance).
• If you pay 2x more for memory bandwidth (in power or cost) and get only 35% more performance, it is a net loss (even though percent of peak looks better).
• Real example: we could give up ALL of the flops to improve memory bandwidth by 20% on the 2018 system.
• We have a fixed budget.
• Sustained-to-peak flop rate is the wrong metric if flops are cheap.
• Balance means balancing your checkbook and balancing your power budget.
• This requires application co-design to make the right trade-offs.
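One simplified way to read the first two bullets is as performance gained per unit of cost added, treating the quoted cost increase as applying to the whole (fixed) budget:

\[
\frac{1.10}{1.05} \approx 1.05 > 1 \quad (\text{a win}), \qquad \frac{1.35}{2.0} \approx 0.68 < 1 \quad (\text{a loss})
\]

Under a fixed budget, a ratio above 1 is a win and a ratio below 1 is a loss, regardless of what it does to percent of peak.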
Summary: Technology Trends
• Number of cores: increasing rapidly; flops will be "free".
• Memory capacity per core: decreasing.
• Memory bandwidth per core: decreasing.
• Network bandwidth per core: decreasing.
• I/O bandwidth: not keeping pace with compute.
Navigating Technology Phase Transitions
[Figure: the NERSC roadmap (peak Tflop/s vs. year, Top500 trend) annotated with programming-model eras: Franklin (N5) and Franklin + QC: COTS/MPP + MPI; Hopper (N6) and NERSC-7: COTS/MPP + MPI (+ OpenMP); NERSC-8: GPU (CUDA/OpenCL) or manycore (BG/Q, R); NERSC-9 (exascale): + ???]
Application Scalability
How can a user continue to be productive in the face of these disruptive technology trends?
Sources of Workload Information
• Documents: 2005 DOE Greenbook; 2006-2010 NERSC Plan; LCF studies and reports; workshop reports; 2008 NERSC assessment
• Allocations analysis
• User discussions
New Model for Collecting Requirements
• Joint DOE program office / NERSC workshops, modeled after the ESnet method
• Two workshops per year
• Describe science-based needs over 3-5 years
• Case-study narratives
• First workshop is BER, May 7-8
Numerical Methods at NERSC (caveat: survey data from ERCAP requests)
Application Trends
[Figure: two schematic performance-vs-processors curves, one for weak scaling and one for strong scaling]
• Weak scaling: time to solution is often a non-linear function of problem size.
• Strong scaling: latency or the serial fraction will get you in the end.
• Adding features to models: a "new" form of weak scaling.
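The strong-scaling bullet is Amdahl's law and the weak-scaling case is Gustafson's law; with parallel fraction f of the work on P processors (standard formulas, not taken from the slide):

\[
S_{\mathrm{strong}}(P) = \frac{1}{(1-f) + f/P} \;\longrightarrow\; \frac{1}{1-f} \ \ (P \to \infty), \qquad S_{\mathrm{weak}}(P) = (1-f) + fP
\]

Even at f = 0.99 the strong-scaling speedup saturates at 100x, which is why the serial fraction "gets you in the end", whereas weak scaling grows the problem with P and keeps efficiency high.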
Develop Best Practices in Multicore Programming
The NERSC/Cray Programming Models "Center of Excellence" combines:
• LBNL strength in languages, tuning, and performance analysis
• Cray strength in languages, compilers, and benchmarking
Goals:
• Immediate: training material for Hopper users on hybrid OpenMP/MPI
• Long term: input into the exascale programming model
Develop Best Practices in Multicore Programming (continued)
Conclusions so far:
• Mixed OpenMP/MPI saves significant memory.
• The impact on running time varies with the application.
• One MPI process per socket is often a good configuration.
Next, run on Hopper:
• 12 vs. 6 cores per socket
• Gemini vs. SeaStar interconnect
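A minimal sketch of the hybrid MPI + OpenMP pattern discussed above: MPI decomposes the problem across processes (e.g., one per socket), and OpenMP threads share that process's memory, which is where the memory savings come from. The problem, sizes, and names below are illustrative, not taken from NERSC training material.

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided;
    /* MPI_THREAD_FUNNELED: only the thread that called MPI_Init_thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);

    int rank, nprocs;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* e.g., one MPI process per socket */
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    /* MPI-level decomposition: each process owns a contiguous chunk of the index space. */
    const long N = 12000000;                 /* hypothetical total problem size */
    long chunk = N / nprocs;
    long begin = (long)rank * chunk;
    long end   = (rank == nprocs - 1) ? N : begin + chunk;

    double local_sum = 0.0;

    /* OpenMP threads share this process's memory, so per-process buffers
       are not replicated once per core as they would be with flat MPI. */
    #pragma omp parallel reduction(+:local_sum)
    {
        int tid = omp_get_thread_num();
        int nth = omp_get_num_threads();
        for (long i = begin + tid; i < end; i += nth)
            local_sum += 1.0 / (1.0 + (double)i);
    }

    /* Inter-socket communication stays in MPI. */
    double global_sum = 0.0;
    MPI_Reduce(&local_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum = %f (from %d MPI processes x %d OpenMP threads)\n",
               global_sum, nprocs, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}

On Hopper-class Cray systems such a code would typically be launched with one MPI process per socket and OMP_NUM_THREADS set to the number of cores in that socket; the exact launcher (aprun) options depend on the machine.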
Co-Design: Eating Our Own Dog Food
Inserting Scientific Apps into the Hardware Development Process
• Research Accelerator for Multi-Processors (RAMP)
• Simulate hardware before it is built!
• Break the slow feedback loop for system designs.
• Enables tightly coupled hardware/software/science co-design (not possible with the conventional approach).
Summary
• Disruptive technology changes are coming.
• By exploring:
  • new programming models (and revisiting old ones)
  • hardware/software co-design
• we hope to ensure that scientists' productivity remains high!