National Energy Research Scientific Computing Center (NERSC)
Established 1974; first unclassified supercomputer center.
Today’s mission: Accelerate scientific discovery at the DOE Office of Science through high performance computing and extreme data analysis.
NERSC: Production Computing for the DOE Office of Science
• Diverse workload:
  • 4,500 users, 600 projects
  • 700 codes; 100s of users daily
• Allocations controlled primarily by DOE:
  • 80% DOE Annual Production awards (ERCAP):
    • From 10K hours to ~10M hours
    • Proposal-based; DOE chooses
  • 10% DOE ASCR Leadership Computing Challenge
  • 10% NERSC reserve (“NISE”)
DOE View of Workload
[Figure: NERSC 2013 Allocations By DOE Office]
Science View of Workload
[Figure: NERSC 2013 Allocations By Science Area]
NERSC High End Computing and Storage Capabilities
• Large-Scale Computing Systems
  • Hopper (NERSC-6): Early Cray Gemini System
    • 6,384 compute nodes, 153,216 cores
    • 144 Tflop/s on applications; 1.3 Pflop/s peak
  • Edison (NERSC-7): Early Cray Aries System (2013)
    • Over 200 Tflop/s on applications; 2 Pflop/s peak
    • 333 TB of memory, 6.4 PB of disk
• Midrange (275 Tflop/s peak)
  • Carver: IBM iDataPlex cluster; 10,740 cores; 132 TF
  • PDSF (HEP/NP): ~2,300-core cluster; 30 TF
  • Genepool (JGI): ~8,200-core cluster; 113 TF; 2.1 PB Isilon File System
• Analytics & Testbeds
  • IBM x3850: 1 TB, 2 TB nodes
  • Dirac: 50 Nvidia GPU nodes
  • Jesup: IBM iDataPlex; Data Analytics; HTC
• NERSC Global Filesystem (NGF)
  • Uses IBM’s GPFS
  • 8.5 PB capacity
  • 15 GB/s of bandwidth
• HPSS Archival Storage
  • 240 PB capacity
  • 5 tape libraries
  • 200 TB disk cache
JGI Historical Usage
Repo: m342, PI: Eddy Rubin
Repo: m1045, PI: Victor Markowitz
*Note: % charged may differ from usage/allocation due to refunds
Why are we giving hours back?
2012
JGI’s Available Hours: 31,536,000 (GP1) + 4,905,600 (highmem) + 20,000,000 (NERSC) = 56,441,600
• System Instability: JGI clusters were consolidated into Genepool; the system went through a period of instability (June-Sept 2012), and users relied more heavily on other systems and checkpointing.
• Some analysis jobs couldn’t run on Genepool: Hadoop-style jobs and MPP work (RAxML, Ray) needed to be run on Carver/Hopper.
2013
JGI’s Available Hours: 31,536,000 (GP1) + 30,835,200 (GP2) + 6,937,920 (highmem) + 20,000,000 (NERSC) = 89,309,120
• System Stability Improvements: Major improvements to the scheduler, file systems and user workload have improved Genepool stability/availability; fewer jobs are being rerun.
• System Expansion / Configuration: Genepool doubled in compute power at the end of 2012; nodes are configured for both MPP and traditional JGI workloads (e.g. Hadoop jobs now run here).
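The available-hour totals above can be re-derived with a few lines of Python. This is only a sanity-check sketch: the hour figures are copied from the slide, while the implied core counts rest on an assumption not stated in the slide, namely that each cluster figure equals cores × 8,760 hours (a non-leap year of continuous availability).

```python
HOURS_PER_YEAR = 24 * 365  # 8,760; assumption: non-leap year, always available

# Hour figures copied from the slide.
available_hours = {
    2012: {"GP1": 31_536_000, "highmem": 4_905_600, "NERSC": 20_000_000},
    2013: {"GP1": 31_536_000, "GP2": 30_835_200,
           "highmem": 6_937_920, "NERSC": 20_000_000},
}

# Total hours available to JGI per year.
totals = {year: sum(parts.values()) for year, parts in available_hours.items()}

# Implied core counts for the cluster partitions (assumption: hours = cores * 8,760).
# The NERSC figure is an allocation, not a cluster, so it is skipped.
implied_cores = {
    year: {name: hours / HOURS_PER_YEAR
           for name, hours in parts.items() if name != "NERSC"}
    for year, parts in available_hours.items()
}

for year in sorted(totals):
    print(year, f"{totals[year]:,}", implied_cores[year])
```

Under that assumption every cluster figure divides evenly by 8,760, which suggests the hours were computed from core counts rather than measured.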
Not everything can run on Hopper/Edison/Carver
This analysis is critical to JGI’s mission and CANNOT currently run on Hopper or Carver:
High-throughput, automated, slot-scheduled jobs/pipelines; require large memory nodes, local disk, external database access
• Illumina pipeline
• SMRTPortal
• Fungal annotation pipeline
• RQC pipeline
• Jigsaw
• Large memory assemblies
• IMG production runs
This analysis CAN be, and has been, done on Hopper or Carver:
Single large analysis runs, codes that can run at scale, jobs that tolerate long queue waits
• Phylogenetic Tree Reconstruction
• USEARCH
• HMMER
• BLAST (inefficient)
• Metagenome Assembly Research (MPI-based assemblers like Ray)
• Hadoop
Goal: Determine which workflows could migrate to Hopper/Edison (e.g. Jigsaw)
Goal: Improve efficiency (particularly I/O) in existing workflows
Genepool is sufficient today, but what about next year?
Goal: Accurately predict that given X compute nodes, we can analyze Y samples over the course of Z months.
• Step 1: Collect data
  • On Genepool (jobs run by program, type of analysis, time to complete, queue wait time) - procmon
  • On file systems (amount of data created per job, access patterns of the job) - NGF scripts
• Step 2: Analyze the data
  • Predict compute time needed per sample sequenced; define acceptable queue wait and turnaround times - MATH
  • Predict space needed per project - MATH
• Step 3: Sanity check predictions
  • Add columns to LIMS system giving PMs the ability to enter predictions for compute and disk space needs
IN PROGRESS 2013-14 2014
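The X-nodes, Y-samples, Z-months goal can be illustrated with a very simple capacity model. This is a minimal sketch, not the method used at JGI or NERSC: the class name `CapacityModel`, its parameters, and all example numbers are hypothetical, and a real model would take its per-sample cost from the job data collected in Step 1.

```python
import math
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # 8,760 hours per year / 12; an approximation


@dataclass
class CapacityModel:
    """Hypothetical linear model: usable core-hours divided by cost per sample."""
    cores_per_node: int           # hypothetical hardware parameter
    core_hours_per_sample: float  # hypothetical; would come from measured job data
    utilization: float = 0.8      # hypothetical fraction of core-hours doing useful work

    def samples(self, nodes: int, months: float) -> float:
        """Y: samples that X nodes can analyze in Z months."""
        usable = (nodes * self.cores_per_node * HOURS_PER_MONTH
                  * months * self.utilization)
        return usable / self.core_hours_per_sample

    def nodes_needed(self, samples: float, months: float) -> int:
        """X: whole nodes needed to analyze Y samples in Z months."""
        per_node = (self.cores_per_node * HOURS_PER_MONTH
                    * months * self.utilization)
        return math.ceil(samples * self.core_hours_per_sample / per_node)


# Example with made-up numbers: 8-core nodes, 500 core-hours per sample.
model = CapacityModel(cores_per_node=8, core_hours_per_sample=500.0)
print(model.samples(nodes=100, months=12))
print(model.nodes_needed(samples=10_000, months=12))
```

The model ignores queue wait time, memory limits and disk space, all of which the steps above list as things to measure, so it gives an upper bound on throughput rather than a prediction.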
High-Impact Results on Day One
NERSC’s users started running production codes immediately on Edison.
• 408 M MPP hours delivered in 2013 through Oct. 16.
• Top projects: carbon sequestration, artificial photosynthesis, complex novel materials, cosmic background radiation analysis
• Edison is very similar to Hopper, but with 2-5 times the performance per core on most codes.
[Figure: NERSC 8 Benchmark Performance]
Edison is the premier production computing platform for DOE Office of Science
• 5,200 compute nodes
• 124.5K processing cores
• 333 terabytes memory
• 2.4 petaflops peak
• 530 TB/s memory bandwidth
• 11 TB/s global bandwidth
• 1.3 MW per PF
• 6.4 PB storage @ 140 TB/s
• New Cray XC30 with Intel Ivy Bridge processors and Aries interconnect
• Designed to support HPC and data-intensive work
• Performs 2-4x Hopper per node on real applications
• Outstanding scalability for massively parallel apps
• Easy adoption for users: runs current apps unmodified
• Ambient cooled for extreme energy efficiency
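The system-wide Edison figures can be turned into rough per-node and per-core numbers. The sketch below uses only the values quoted on the slide and makes two assumptions: decimal units (1 TB = 1,000 GB), and "1.3 MW per PF" read as power per peak petaflop.

```python
# System-wide figures quoted on the slide.
nodes = 5200
cores = 124_500        # "124.5K processing cores"
memory_tb = 333
peak_pflops = 2.4
mem_bw_tb_s = 530
mw_per_pf = 1.3        # assumption: measured against peak petaflops

# Derived per-node and per-core figures (decimal units assumed).
cores_per_node = cores / nodes
memory_gb_per_node = memory_tb * 1000 / nodes
mem_bw_gb_s_per_node = mem_bw_tb_s * 1000 / nodes
peak_gflops_per_core = peak_pflops * 1e6 / cores
power_mw_at_peak = mw_per_pf * peak_pflops

print(f"cores per node:        {cores_per_node:.1f}")
print(f"memory per node:       {memory_gb_per_node:.1f} GB")
print(f"memory bandwidth/node: {mem_bw_gb_s_per_node:.1f} GB/s")
print(f"peak per core:         {peak_gflops_per_core:.1f} Gflop/s")
print(f"power at peak:         {power_mw_at_peak:.2f} MW")
```

The results come out near 24 cores and 64 GB of memory per node, so the rounded totals on the slide are consistent with each other.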
