
Real Science at the Petascale

Radhika S. Saksena¹, Bruce Boghosian², Luis Fazendeiro¹, Owain A. Kenway, Steven Manos¹, Marco Mazzeo¹, S. Kashif Sadiq¹, James L. Suter¹, David Wright¹ and Peter V. Coveney¹ — 1. Centre for Computational Science, UCL, UK; 2. Tufts University, Boston, USA.




Presentation Transcript


1. Real Science at the Petascale — Radhika S. Saksena¹, Bruce Boghosian², Luis Fazendeiro¹, Owain A. Kenway, Steven Manos¹, Marco Mazzeo¹, S. Kashif Sadiq¹, James L. Suter¹, David Wright¹ and Peter V. Coveney¹. 1. Centre for Computational Science, UCL, UK; 2. Tufts University, Boston, USA

2. Contents
• New era of petascale resources
• Scientific applications at the petascale:
  - Unstable periodic orbits in turbulence
  - Liquid crystalline rheology
  - Clay-polymer nanocomposites
  - HIV drug resistance
  - Patient-specific haemodynamics
• Conclusions

3. New era of petascale machines
• Ranger (TACC) - NSF-funded Sun cluster
  - 0.58 petaflops theoretical peak: ~10 times HECToR (59 Tflops), and "bigger" than all other TeraGrid resources combined
  - Linpack speed 0.31 petaflops; 123 TB memory
  - Architecture: 82 racks; 1 rack = 4 chassis; 1 chassis = 12 nodes
  - 1 node = Sun Blade x6420 (four quad-core AMD Opteron processors, i.e. 16 cores per node)
  - 3,936 nodes = 62,976 cores (see the back-of-envelope sketch below)
• Intrepid (ALCF) - DOE-funded Blue Gene/P
  - 0.56 petaflops theoretical peak
  - 163,840 cores; 80 TB memory
  - Linpack speed 0.45 petaflops
  - "Fastest" machine available for open science and third overall¹
1. http://www.top500.org/lists/2008/06
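As a minimal sanity check of the quoted figures, the sketch below recomputes Ranger's core count and theoretical peak; the 2.3 GHz clock and 4 double-precision flops per cycle per core are assumptions, not numbers stated on the slide.

```python
# Back-of-envelope check of Ranger's core count and theoretical peak.
nodes = 3936
cores_per_node = 4 * 4          # four quad-core Opterons per Sun Blade x6420
clock_hz = 2.3e9                # assumed core clock (not stated on the slide)
flops_per_cycle = 4             # assumed double-precision flops per cycle

cores = nodes * cores_per_node
peak_flops = cores * clock_hz * flops_per_cycle

print(f"{cores} cores, ~{peak_flops / 1e15:.2f} Pflop/s theoretical peak")
# -> 62976 cores, ~0.58 Pflop/s, matching the figures quoted for Ranger
```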

4. New era of petascale machines
• US firmly committed to a path to the petascale (and beyond)
• NSF: Ranger (5 years, $59 million award)
• University of Tennessee, to build a system with just under 1 PF peak performance ($65 million, 5-year project)¹
• "Blue Waters" will come online in 2011 at NCSA ($208 million grant), using IBM technology, to deliver 10 Pflops peak performance (~200K cores, 10 PB of disk)
1. http://www.nsf.gov/news/news_summ.jsp?cntn_id=109850

5. New era of petascale machines
• We wish to do new science at this scale, not just incremental advances
• Applications that scale linearly up to tens of thousands of cores (large system sizes, many time steps): capability computing at the petascale
• High throughput for "intermediate scale" applications (in the 128-512 core range)

6. Intercontinental HPC grid environment
[Diagram: UK NGS (Leeds, Manchester, Oxford, RAL), HPCx and HECToR linked via lightpaths and the AHE to the US TeraGrid (NCSA, SDSC, PSC, TACC (Ranger), ANL (Intrepid)) and DEISA]
• Massive data transfers
• Advanced reservation / co-scheduling
• Emergency / pre-emptive access

7. Lightpaths - dedicated 1 Gb UK/US network
• JANET Lightpath is a centrally managed service which supports large research projects on the JANET network by providing end-to-end connectivity, from hundreds of Mbit/s up to whole fibre wavelengths (10 Gbit/s)
• Typical usage:
  - Dedicated 1 Gb network connecting to national and international HPC infrastructure
  - Shifting TB-scale datasets between the UK and US (see the rough transfer-time sketch below)
  - Real-time visualisation
  - Interactive computational steering
  - Cross-site MPI runs (e.g. between NGS2 Manchester and NGS2 Oxford)
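To give a feel for the "shifting TB datasets" use case, here is a rough transfer-time estimate over the dedicated 1 Gbit/s lightpath; the ~70% link efficiency is an assumed figure standing in for protocol and disk overheads, not a measurement.

```python
# Rough transfer-time estimate for a terabyte-scale dataset over a 1 Gbit/s lightpath.
dataset_bytes = 1e12            # 1 TB
link_bps = 1e9                  # 1 Gbit/s dedicated lightpath
efficiency = 0.7                # assumed achievable fraction of line rate

seconds = dataset_bytes * 8 / (link_bps * efficiency)
print(f"~{seconds / 3600:.1f} hours per TB")   # -> ~3.2 hours per TB
```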

8. Advanced reservations
• Plan in advance to have access to the resources: the process of reserving multiple resources for use by a single application (an illustrative co-allocation sketch follows this slide)
  - HARC¹ - Highly Available Resource Co-Allocator
  - GUR² - Grid Universal Remote
• Can reserve the resources for the same time:
  - Distributed MPIg/MPICH-G2 jobs
  - Distributed visualization
  - Booking equipment (e.g. visualization facilities)
• Or for some coordinated set of times:
  - Computational workflows
  - Urgent computing and pre-emptive access (SPRUCE)
1. http://www.realitygrid.org/middleware.shtml#HARC
2. http://www.ncsa.uiuc.edu/UserInfo/Resources/Hardware/TGIA64LinuxCluster/Doc/coschedule.html
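The sketch below illustrates the co-allocation idea behind tools such as HARC and GUR: find a time window that is free on every resource at once, e.g. for a cross-site MPIg run. The slot data and the find_common_window() helper are hypothetical stand-ins, not the HARC or GUR APIs.

```python
# Toy co-allocation: find the earliest window free on all resources at once.
from datetime import datetime, timedelta
from itertools import product

# Hypothetical free slots (start, end) advertised by each resource's scheduler.
free_slots = {
    "Ranger":   [(datetime(2008, 9, 1, 8), datetime(2008, 9, 1, 20))],
    "Intrepid": [(datetime(2008, 9, 1, 12), datetime(2008, 9, 2, 0))],
}

def find_common_window(slots_by_resource, duration):
    """Earliest window of length `duration` free on all resources, or None."""
    best = None
    for combo in product(*slots_by_resource.values()):
        start = max(s for s, _ in combo)
        end = min(e for _, e in combo)
        if end - start >= duration and (best is None or start < best[0]):
            best = (start, start + duration)
    return best

print(find_common_window(free_slots, timedelta(hours=6)))
# -> a 6-hour window both machines could commit to (12:00-18:00 on 1 Sep 2008)
```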

9. Advanced reservations
• Also available via the HARC API - can be easily built into Java applications
• Deployed on a number of systems:
  - LONI (ducky, bluedawg, zeke, neptune IBM p5 clusters)
  - TeraGrid (NCSA and SDSC IA64 clusters, Lonestar, Ranger(?))
  - HPCx
  - North West Grid (UK)
  - UK National Grid Service (NGS) - Manchester, Oxford, Leeds

10. Application Hosting Environment
• Middleware which simplifies access to distributed resources and manages workflows
• Wrestling with middleware can't be a limiting step for scientists: the AHE hides the complexities of the 'grid' from the end user
• Applications are stateful Web services
• An application can consist of a coupled model, a parameter sweep, a steerable application, or a single executable (see the illustrative workflow sketch below)
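A purely illustrative sketch of the prepare/stage/submit/monitor/fetch cycle that hosting middleware such as the AHE takes off the user's hands; the AppClient class and its methods are hypothetical stand-ins, not the AHE interface.

```python
# Hypothetical client for a hosted, stateful application instance.
import time

class AppClient:
    def __init__(self, app_name, resource):
        self.app_name, self.resource, self.state = app_name, resource, "prepared"

    def stage_in(self, files):
        print(f"staging {files} to {self.resource}")

    def submit(self):
        self.state = "running"

    def poll(self):
        # A real client would query the hosting service; here we pretend it finished.
        self.state = "done"
        return self.state

    def fetch_output(self, dest):
        print(f"retrieving results to {dest}")

job = AppClient("hemelb", resource="Ranger")
job.stage_in(["geometry.dat", "config.xml"])
job.submit()
while job.poll() != "done":
    time.sleep(60)
job.fetch_output("./results")
```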

11. HYPO4D¹ (Hydrodynamic periodic orbits in 4D)
• Scientific goal: to identify and characterize unstable periodic orbits in turbulent fluid flow, from which time averages can be computed exactly
• Uses the lattice-Boltzmann method: highly scalable (linear scaling up to at least 33K cores on Intrepid and close to linear up to 65K); a minimal sketch of an LB update step follows this slide
[Scaling plots: a) Ranger, b) Intrepid + Surveyor (Blue Gene/P)]
1. L. Fazendeiro et al., "A novel computational approach to turbulence", AHM08
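For readers unfamiliar with the method, here is a minimal single-phase D2Q9 lattice-Boltzmann collide-and-stream step in Python/NumPy. The production codes (HYPO4D, LB3D, HemeLB) are 3D, multi-component and MPI-parallel, so this is only a sketch of the basic idea; lattice size, relaxation time and boundary treatment are illustrative choices.

```python
# Minimal 2D (D2Q9) lattice-Boltzmann BGK step: collide, then stream (periodic).
import numpy as np

nx, ny, tau = 64, 64, 0.8                      # lattice size and relaxation time
c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],
              [1,1],[-1,1],[-1,-1],[1,-1]])    # D2Q9 velocity set
w = np.array([4/9] + [1/9]*4 + [1/36]*4)       # lattice weights
f = np.ones((9, nx, ny)) * w[:, None, None]    # distributions at rest

def equilibrium(rho, ux, uy):
    cu = c[:, 0, None, None]*ux + c[:, 1, None, None]*uy
    usq = ux**2 + uy**2
    return w[:, None, None] * rho * (1 + 3*cu + 4.5*cu**2 - 1.5*usq)

def lb_step(f):
    rho = f.sum(axis=0)
    ux = (c[:, 0, None, None] * f).sum(axis=0) / rho
    uy = (c[:, 1, None, None] * f).sum(axis=0) / rho
    f = f - (f - equilibrium(rho, ux, uy)) / tau          # collide (BGK)
    for i in range(9):                                    # stream with periodic wrap
        f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    return f

for _ in range(100):
    f = lb_step(f)
```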

12. HYPO4D¹ (Hydrodynamic periodic orbits in 4D)
• Novel approach to turbulence studies: efficiently parallelizes in both time and space
• The algorithm is extremely memory-intensive: full spacetime trajectories are numerically relaxed to the nearby minimum (an unstable periodic orbit); a toy version of this relaxation is sketched below
• Ranger is an ideal resource for this work (123 TB of RAM)
• During the early-user period, millions of time steps for different systems were simulated and then compared for similarities: ~9 TB of data
1. L. Fazendeiro et al., "A novel computational approach to turbulence", AHM08
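The spacetime relaxation described above can be illustrated with a toy 1D map: treat the whole closed loop of states as the unknown and minimise the residual of the dynamics around the loop, so that the converged loop is a periodic orbit. The logistic map, loop length and learning rate below are illustrative stand-ins; the real algorithm relaxes full 3D lattice-Boltzmann spacetime trajectories.

```python
# Toy spacetime relaxation: minimise sum_t (x_{t+1} - F(x_t))^2 over a closed loop.
import numpy as np

def F(x):             # logistic map, standing in for one time step of the flow
    return 4.0 * x * (1.0 - x)

def dF(x):            # its derivative, needed for the gradient of the residual
    return 4.0 - 8.0 * x

T = 2                                   # number of states in the closed loop
rng = np.random.default_rng(0)
x = rng.uniform(0.2, 0.8, size=T)       # initial guess for the whole loop

lr = 1e-3
for _ in range(200_000):
    resid = np.roll(x, -1) - F(x)              # x_{t+1} - F(x_t), cyclically
    grad = np.roll(resid, 1) - resid * dF(x)   # dC/dx_t (up to a factor of 2)
    x = np.clip(x - lr * grad, 0.0, 1.0)       # keep the loop in the map's domain

print(x, "max residual:", np.abs(resid).max())  # a (possibly trivial) periodic orbit
```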

13. LB3D¹
• LB3D: a three-dimensional lattice-Boltzmann solver for multi-component fluid dynamics, in particular amphiphilic systems
• Mature code, 9 years in development; it has been extensively used on the US TeraGrid, UK NGS, HECToR and HPCx machines
• Largest model simulated to date is 2048³ lattice sites (needs Ranger; see the memory estimate below)
1. R. S. Saksena et al., "Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals", AHM08
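A rough memory estimate suggests why the 2048³ run needs a machine of Ranger's size; the D3Q19 velocity set, double precision and three fluid components are assumptions made for the estimate, not figures taken from the slides.

```python
# Rough memory estimate for a 2048^3 multi-component lattice-Boltzmann run.
sites = 2048 ** 3                 # lattice sites
velocities = 19                   # assumed D3Q19 distributions per component
components = 3                    # assumed, e.g. water, oil, amphiphile
bytes_per_value = 8               # double precision

tb = sites * velocities * components * bytes_per_value / 1e12
print(f"~{tb:.0f} TB for the distribution functions alone")   # -> ~4 TB
# before work arrays, halos and a second copy of the lattice for streaming
```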

14. Cubic Phase Rheology Results¹
• Recent results include the tracking of large-time-scale defect dynamics on 1024³-lattice-site systems; only possible on Ranger, due to the sustained core count and disk storage requirements
• Regions of high stress magnitude are localized in the vicinity of defects
[Figure: 256³-lattice-site gyroidal system with multiple domains]
1. R. S. Saksena et al., "Petascale lattice-Boltzmann simulations of amphiphilic liquid crystals", AHM08

15. LAMMPS¹
• Fully-atomistic simulations of clay-polymer nanocomposites on Ranger
• More than 85 million atoms simulated
• Clay mineral studies with ~3 million atoms: 2-3 orders of magnitude larger than any previous study
• Prospects: to include the edges of the clay (rather than periodic boundaries) and to run realistic-sized models of at least 100 million atoms (~2 weeks wall clock using 4,096 cores)
1. J. Suter et al., "Grid-Enabled Large-Scale Molecular Dynamics of Clay Nano-materials", AHM08

16. HIV-1 drug resistance¹
• Goal: to study the effect of anti-retroviral inhibitors (targeting proteins in the HIV lifecycle, such as the viral protease and reverse-transcriptase enzymes)
• High-end computational power to confer clinical decision support
• On Ranger, up to 100 replicas (configurations) simulated for the first time, in some cases going to 100 ns
• 3.5 TB of trajectory and free energy analysis (a minimal averaging sketch follows this slide)
• 6 microseconds of simulation in four weeks
• AHE-orchestrated workflows
[Figure: binding free energy differences compared with experimental results for wildtype and MDR proteases with the inhibitors LPV and RTV, using 10 ns trajectories]
1. K. Sadiq et al., "Rapid, Accurate and Automated Binding Free Energy Calculations of Ligand-Bound HIV Enzymes for Clinical Decision Support using HPC and Grid Resources", AHM08
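Schematically, end-point binding free energy estimates of the kind summarised here average G(complex) − G(protein) − G(ligand) over trajectory frames. The sketch below uses made-up per-frame energies, and the authors' exact free energy protocol is not specified on the slide.

```python
# Schematic per-frame averaging of an end-point binding free energy estimate.
import statistics

# Hypothetical per-frame energies (kcal/mol) extracted from an MD trajectory.
frames = [
    {"complex": -5120.3, "protein": -4875.6, "ligand": -210.4},
    {"complex": -5118.9, "protein": -4873.1, "ligand": -211.0},
    {"complex": -5121.7, "protein": -4876.2, "ligand": -210.1},
]

dg_per_frame = [f["complex"] - f["protein"] - f["ligand"] for f in frames]
dg_bind = statistics.mean(dg_per_frame)
dg_err = statistics.stdev(dg_per_frame) / len(dg_per_frame) ** 0.5
print(f"Delta G_bind ~ {dg_bind:.1f} +/- {dg_err:.1f} kcal/mol")
```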

17. GENIUS project¹
• Grid Enabled Neurosurgical Imaging Using Simulation (GENIUS)
• Scientific goal: to perform real-time, patient-specific medical simulation
• Combines blood flow simulation with clinical data
• Fitting the computational time scale to the clinical time scale:
  - Capture the clinical workflow
  - Get results which will influence clinical decisions: 1 day? 1 week?
  - GENIUS: 15 to 30 minutes
1. S. Manos et al., "Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation", AHM08

18. GENIUS project¹
• Blood flow is simulated using the lattice-Boltzmann method (HemeLB)
• A parallel ray tracer performs real-time in situ visualization
• Sub-frames are rendered on each MPI rank and composited before being sent over the network to a lightweight viewing client (a minimal compositing sketch follows this slide)
• Adding volume rendering reduces the scalability of the fluid solver because of the global communications it requires
• Even so, datasets are rendered at more than 30 frames per second (1024² pixel resolution)
1. S. Manos et al., "Surgical Treatment for Neurovascular Pathologies Using Patient-specific Whole Cerebral Blood Flow Simulation", AHM08
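A minimal sketch of the sub-frame compositing described above, using mpi4py: each rank renders the pixels it owns and the partial images are combined before rank 0 ships the frame to the viewing client. The mpi4py dependency and the "disjoint strips summed by reduction" compositing rule are assumptions for illustration, not a description of HemeLB's renderer.

```python
# Run with e.g.: mpirun -n 4 python composite.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

width, height = 1024, 1024
frame = np.zeros((height, width), dtype=np.float32)

# Each rank "renders" only its own horizontal strip of the image.
rows = np.array_split(np.arange(height), size)[rank]
frame[rows, :] = rank + 1.0          # stand-in for ray-traced pixel values

# Composite: because the strips are disjoint, a sum reduction assembles the frame.
full_frame = np.zeros_like(frame)
comm.Reduce(frame, full_frame, op=MPI.SUM, root=0)

if rank == 0:
    print("assembled frame, mean pixel value:", full_frame.mean())
    # here the frame would be compressed and sent to the lightweight client
```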

19. CONCLUSIONS
• A wide range of scientific research activities was presented, all making effective use of the new petascale resources available in the USA
• These demonstrate the emergence of new science that is not possible without access to resources at this scale
• Some existing techniques still hold, however: MPI, for example, with several of these applications scaling linearly up to at least tens of thousands of cores
• Future prospects: we are well placed to move onto the next machines coming online in the US and Japan

20. Acknowledgements
• JANET / David Salmon
• NGS staff
• TeraGrid staff
• Simon Clifford (CCS)
• Jay Boisseau (TACC)
• Lucas Wilson (TACC)
• Pete Beckman (ANL)
• Ramesh Balakrishnan (ANL)
• Brian Toonen (ANL)
• Prof. Nicholas Karonis (ANL)
