Hiroshi Nakashima Academic Center for Computing and Media Studies Kyoto University

Combining the Power of Computer and Computational Sciences to Fly to Peta-Scale: a Case Study. Special thanks to: Y. Omura & H. Usui (RISH, Kyoto U.).

Presentation Transcript


  1. Combining the Power of Computer and Computational Sciences to Fly to Peta-Scale: a Case Study
     Hiroshi Nakashima, Academic Center for Computing and Media Studies, Kyoto University
     Special thanks to: Y. Omura & H. Usui (RISH, Kyoto U.)

  2. Contents
     • Introduction: Combining CS² Power
       • Why Need to Fly to Peta-Scale?
       • What Kind of Power to Be Combined?
     • Case Study: Plasma Simulation on DM Systems
       • Why Plasma Simulation?
       • Why for DM Systems?
       • How for DM Systems?
       • How Efficient?
     • Fly from Case Study
       • Took off Successfully?
       • How Can We Fly Higher?
     • Conclusions

  4. Why Need to Combine CS² Power?
     Fly to Peta: How High? (1/2)
     T2K Open Supercomputer in Kyoto: Rpeak/Rmax = 61.2/50.5 TFLOPS (#34)
     • system: node-group x 70 + 288p-sw x 6 + ...
     • node group: node x 6 + 24p-sw x 2
     • node: (socket + mem.bank) x 4 + IB x 4
     • socket: core x 4 + L3
     • core: (mul+add) x 2 + (L1+L2)
     • already large enough (16 x 416 nodes = 6656 cores)
     • already layered deeply and complicatedly enough (core → socket → node → node-group → system)

  5. Why Need to Combine CS² Power?
     Fly to Peta: How High? (2/2)
     A peta-scale system should be:
     • much larger (1,000,000 cores ≈ 6656 x 150)
     • much more deeply and complicatedly layered (core → core-group → socket → socket-group → node → node-group → node-supergroup → system)

  6. Why Need to Combine CS² Power?
     Fly to Peta: How High? (2/2)
     BTW, how large is Peta?
     • 1 petameter > 0.1 light-year
     • 1 petasecond > 30 million years
     • 1 petakilogram > 1/2 x Deimos
     • 1 petahertz > violet
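The scale figures on the "How High?" slides can be checked with a few lines of arithmetic. This is a minimal sketch: the core counts and the factor of 150 come from the slides, while the length of a year (a Julian year of 365.25 days) is an assumption made here for the conversion.

```python
# Scale arithmetic for the "How High?" slides.
# Core counts and the x150 factor are from the slides; the Julian year
# (365.25 days) is an assumption used only for the unit conversion.
CORES_PER_NODE = 16
T2K_NODES = 416
PETA_SCALE_FACTOR = 150

t2k_cores = CORES_PER_NODE * T2K_NODES
peta_cores = t2k_cores * PETA_SCALE_FACTOR

SECONDS_PER_YEAR = 365.25 * 24 * 3600
peta_seconds_in_years = 1e15 / SECONDS_PER_YEAR

print(f"T2K cores:        {t2k_cores}")
print(f"peta-scale cores: {peta_cores}")
print(f"1 petasecond:     {peta_seconds_in_years / 1e6:.1f} million years")
```

The product 6656 x 150 lands just under one million cores, which is why the slide treats the two as approximately equal.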

  7. Why Need to Combine CS² Power?
     What Are Combined to Fly?
     Computational scientists have deep knowledge of:
     • physics, chemistry, biology, ...
     • their own problems, algorithms, programs, ...
     • (sometimes) their own supercomputers
     and (often?) have Nature/Science papers.
     Computer scientists have deep knowledge of:
     • a wide variety of computers, software, tools, ...
     • a wide variety of algorithms, techniques, tricks, ...
     • (sometimes) a few scientific problems
     but never dream of authoring a Nature/Science paper.
     Combined, they get:
     • a much more efficient way to fully exploit peta-scale computing power
     • computational scientists: more Nature/Science papers and a chance to win a Nobel Prize
     • computer scientists: a chance to co-author a Nature/Science paper and to attend the Nobel Prize ceremony

  9. Case Study: Plasma Simulation on DM
     Why Plasma Simulation?
     A big user group of plasma simulation insisted that our new system should include a power- and money-hungry subsystem for their memory-hungry SM-parallel (shared-memory) application: large-scale shared-memory nodes (128 cores, 1 TB, 1.28 TFLOPS).
     I failed to persuade them to build an Open-Supercomputer-only system. So I swore revenge on them by coding a much more efficient DM-parallel (distributed-memory) program to run on the Open Supercomputer.

  10. Case Study: Plasma Simulation on DM
      Why for DM Systems? (1/2)
      Simulate the movement of a large number of (e.g. > 1 billion) charged particles in a large-scale (e.g. 2000 x 2000 x 2000 grid) electromagnetic field (e.g. magnetosphere).

  11. Case Study: Plasma Simulation on DM
      Why for DM Systems? (2/2)
      • particle parallelization (only): very simple, esp. on SM
      • #particle: memory short in SM
      • #grid-point: memory short even in DM
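A rough memory estimate illustrates why particle-only parallelization runs short of memory even on DM systems: every node must hold the whole field. The sketch below uses the grid and particle counts from the slides. The storage layout (6 double-precision field components per grid point, 6 double-precision values per particle) and the node count of 64 are assumptions for illustration, not figures from the talk.

```python
# Rough per-node memory estimate (a sketch under assumed storage layouts).
# Assumptions: 6 field components per grid point (E and B vectors),
# 6 values per particle (position and velocity), 8 bytes per value.

def field_bytes(grid_points, components=6, bytes_per_value=8):
    """Bytes needed to store the whole electromagnetic field."""
    return grid_points * components * bytes_per_value

def particle_bytes(particles, values_per_particle=6, bytes_per_value=8):
    """Bytes needed to store all particles."""
    return particles * values_per_particle * bytes_per_value

def per_node_particle_parallel(grid_points, particles, nodes):
    """Particle-only parallelization: each node replicates the full field."""
    return field_bytes(grid_points) + particle_bytes(particles) // nodes

def per_node_domain_decomposed(grid_points, particles, nodes):
    """Domain decomposition with evenly spread particles (idealized)."""
    return (field_bytes(grid_points) + particle_bytes(particles)) // nodes

GRID_POINTS = 2000 ** 3      # 2000 x 2000 x 2000 grid, from the slides
PARTICLES = 1_000_000_000    # 1 billion particles, from the slides
NODES = 64                   # hypothetical node count

GIB = 2 ** 30
replicated = per_node_particle_parallel(GRID_POINTS, PARTICLES, NODES)
decomposed = per_node_domain_decomposed(GRID_POINTS, PARTICLES, NODES)
print(f"particle-parallel only: {replicated / GIB:.1f} GiB per node")
print(f"domain decomposed:      {decomposed / GIB:.1f} GiB per node")
```

Under these assumptions the replicated field dominates the per-node footprint no matter how many nodes share the particles, which motivates decomposing the space itself.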

  12. Case Study: Plasma Simulation on DM
      How for DM Systems? (1/3)
      OhHelp: One-handed Help
      [Figure: 4 x 4 grid of subspaces labeled 00 to 33, shown once as primary subspaces and once as secondary subspaces]
      • uniform block decomposition
      • well-balanced: #particle-in-subspace ≤ #p / #nodes x (1 + α) → simulate primary particles → neighboring comm. only
      • each node helps another node having a dense subspace
      • balanced #particles
      • balanced subspace size
      • simple boundary comp/comm
      • well-balanced → stable secondary space assignment (ss ass.)
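The primary-subspace step can be sketched as follows, assuming the balance condition reads "particles per subspace at most (#p / #nodes) x (1 + tolerance)". The function names, the 2D domain, and the tolerance parameter `alpha` are hypothetical choices made for this illustration and are not taken from the OhHelp code.

```python
from collections import Counter

def primary_subspace(position, domain_size, blocks):
    """Return the block index tuple of the subspace that contains `position`
    under a uniform block decomposition."""
    return tuple(
        min(int(x / size * n), n - 1)  # clamp particles sitting on the upper edge
        for x, size, n in zip(position, domain_size, blocks)
    )

def is_well_balanced(counts, n_nodes, alpha):
    """True if no subspace holds more than (total / n_nodes) * (1 + alpha) particles."""
    limit = sum(counts.values()) / n_nodes * (1 + alpha)
    return all(c <= limit for c in counts.values())

# Four particles spread over a 2 x 2 decomposition of a 4.0 x 4.0 domain.
spread = [(0.5, 3.9), (1.2, 0.1), (2.7, 2.2), (3.5, 1.0)]
spread_counts = Counter(primary_subspace(p, (4.0, 4.0), (2, 2)) for p in spread)

# The same number of particles, all crowded into one corner.
crowded = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4)]
crowded_counts = Counter(primary_subspace(p, (4.0, 4.0), (2, 2)) for p in crowded)

print(is_well_balanced(spread_counts, n_nodes=4, alpha=0.1))
print(is_well_balanced(crowded_counts, n_nodes=4, alpha=0.1))
```

When the check passes, each node only needs its own primary particles and neighbor communication. When it fails, the helping scheme on the following slides applies.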

  13. Case Study: Plasma Simulation on DM
      How for DM Systems? (2/3)
      • Secondary Space Assignment
        • give p even if becoming less than average
        • get from somebody afterward
        • move p from heaviest to lightest so that lightest has av. #p
      [Figure: per-node #p relative to av. #p; node labels in extraction order: 33 00 32 01 30 10 13 03 23 20 31 02 11 21 12 22]
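One way to read the assignment rule on this slide is as a greedy pairing: the lightest node becomes the helper of the heaviest node and takes just enough particles to reach the average, and the heaviest node stays in the pool even if it drops below average. The sketch below follows that reading. It is an interpretation of the slide, not the OhHelp implementation, and the function and variable names are hypothetical.

```python
def assign_secondaries(primary_counts):
    """Greedy sketch: pair the lightest node with the heaviest one until
    every node carries the average load.

    Returns (helps, load), where helps[helper] = (helped_node, particles_taken)
    and load is the final per-node particle count.
    """
    average = sum(primary_counts.values()) / len(primary_counts)
    load = dict(primary_counts)
    pool = set(load)          # nodes that have not yet been given a secondary
    helps = {}
    while len(pool) > 1:
        # Ties are broken by node label so the result is deterministic.
        lightest = min(pool, key=lambda k: (load[k], k))
        heaviest = max(pool, key=lambda k: (load[k], k))
        moved = average - load[lightest]   # fill the lightest up to the average
        helps[lightest] = (heaviest, moved)
        load[heaviest] -= moved            # may drop below average; fixed later
        load[lightest] = average
        pool.remove(lightest)              # each node helps at most one other
    return helps, load

counts = {"00": 8, "01": 1, "10": 9, "11": 6}   # hypothetical particle counts
helps, load = assign_secondaries(counts)
for helper, (helped, moved) in sorted(helps.items()):
    print(f"node {helper} helps node {helped}, taking {moved:g} particles")
```

In this example node 10 gives 5 particles to node 01 and falls below the average, then receives help from node 00 afterward, which mirrors the "give even if becoming less than average, get from somebody afterward" bullets.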

  14. Case Study: Plasma Simulation on DM
      How for DM Systems? (3/3)
      • Well-Balancing Check with Primary/Secondary Tree
        • check recursively from leaves to root
        • OK if no overflow detected
      [Figure: primary/secondary tree; node labels in extraction order: 33 00 32 01 30 10 13 03 23 20 31 02 11 21 12 22]
      Node with children:
      • must have all primaries not covered by children
      • cover secondaries up to well-balancing limit
      Leaf node:
      • must have all primaries
      • cover secondaries up to well-balancing limit
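The leaves-to-root check can be sketched as a post-order traversal in which each node reports how much spare capacity it can offer its parent. This is a minimal illustration of the bullets above, assuming that a node's children are the nodes helping it and that the well-balancing limit is a single per-node particle cap. The names `spare_capacity`, `helpers`, and `limit` are hypothetical.

```python
def spare_capacity(node, primaries, helpers, limit):
    """Return how many of its parent's particles `node` can still take,
    or None if an overflow is detected in the subtree rooted at `node`."""
    covered = 0
    for child in helpers.get(node, []):
        child_spare = spare_capacity(child, primaries, helpers, limit)
        if child_spare is None:
            return None                 # overflow somewhere below
        covered += child_spare
    # Primaries that the children cannot cover must stay on this node.
    own = max(primaries[node] - covered, 0)
    if own > limit:
        return None                     # overflow at this node
    return limit - own                  # room left for the parent's particles

def is_well_balanced(root, primaries, helpers, limit):
    """True if no overflow is detected anywhere in the tree."""
    return spare_capacity(root, primaries, helpers, limit) is not None

# Hypothetical tree: 01 helps 10, 10 helps 00, 00 helps 11 (the root).
helpers = {"11": ["00"], "00": ["10"], "10": ["01"]}
balanced = {"00": 8, "01": 1, "10": 9, "11": 6}
skewed = {"00": 20, "01": 1, "10": 1, "11": 2}

print(is_well_balanced("11", balanced, helpers, limit=7))
print(is_well_balanced("11", skewed, helpers, limit=7))
```

A failed check would signal that the current secondary assignment no longer fits and needs to be rebuilt, while a passing check lets the existing assignment stay in place.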

  15. Case Study: Plasma Simulation on DM
      How Efficient?
      • performance @ 16-128 proc on HPC2500
      • T2K Open Supercomputer, 4 nodes (64 cores)
      [Figure: performance plots with series labeled balanced, unbalanced, and original; factor labels in extraction order: x11.71, x4.02, x8.76, x10.7, x1.66, x3.20]

  17. Fly from Case Study
      Took off Successfully?
      Plasma simulation group now:
      • appreciates OhHelp and the Open Supercomputer (but has not published Nature/Science papers yet)
      • is planning to port codes to the Open Supercomputer.
      • hopes for our help in recoding a variety of simulators.
      We supercomputer guys now:
      • are happy with accomplishing the revenge.
      • are generously pursuing cooperative research with them (hoping at least to have an SC paper)
      • but cannot find time to do everything they want.

  18. Fly from Case Study
      How Can We Fly Higher?
      • Plasma guys have a wide variety of simulators.
      • Other guys have other varieties of other simulators.
      • We supercomputer guys have OhHelp, which needs to be adapted to each simulator by modifying not only itself but also the simulator.
      • Expectedly we will find other computer-scientific tricks for other types of simulators.
      Parallelization Method Library:
      • generated from a method skeleton and an AP-specific stub
      • linked to simulators

  19. Conclusions
      • Flying to peta-scale needs CS² collaboration
        • offering various (non-numerical) tricks from computer guys.
        • taking the opportunity, from computational guys, to play in larger, real-world application fields.
      • Took off from OhHelp
        • simple but efficient load balancing for plasma simulations.
        • (non-numerical) computer-scientific tricks can greatly improve numerical simulations.
        • fly higher by parallelization method libraries.
      • Other ways to elevate
        • adaptation of linear equation solvers to applications w.r.t. memory layout.
        • parallel script programming language for large parameter space exploration.
