Meeting Challenges in Extreme-Scale Electromagnetic Modeling of Next Generation Accelerators using ACE3P*

Cho Ng, Arno Candel, Lixin Ge, Kwok Ko, Lie-Quan Lee, Zenghai Li, Vineet Rawat, Greg Schussman, Liling Xiao (SLAC); Esmond Ng, Ichitaro Yamazaki (LBNL); Quikai Lu, Mark Shephard (RPI)

ABSTRACT: The past decade has seen tremendous advances in electromagnetic modeling for accelerator applications through the use of high-performance computing on state-of-the-art supercomputers. With the support of the DOE SciDAC computing initiative, ACE3P, a comprehensive set of parallel electromagnetic codes based on the finite-element method, has been developed to tackle the most computationally challenging problems in accelerator R&D. Complemented by collaborative efforts in computational science, these powerful tools have enabled complex systems to be modeled in large-scale simulations with unprecedented detail and accuracy. This paper summarizes the efforts in scalable eigen- and linear solvers, in parallel adaptive meshing algorithms, and in visualization of large datasets to meet the challenges of electromagnetic modeling at the extreme scale for advancing the design of next-generation accelerators.

Challenges in Electromagnetic Modeling
• Supported by DOE's HPC initiatives Grand Challenge, SciDAC-1 (2001-2007), and SciDAC-2 (2007-2012), SLAC has developed ACE3P, a comprehensive set of parallel electromagnetic codes based on the high-order finite-element method
• Advances in computational science are essential to tackle computationally challenging problems of large, complex systems in accelerator applications
• Collaborations with SciDAC CETs and Institutes on:
  - Linear solvers and eigensolvers (TOPS)
  - Parallel adaptive mesh refinement and parallel meshing (ITAPS)
  - Partitioning and load balancing (CSCAPES & ITAPS)
  - Visualization (IUSV)
• Goal is virtual prototyping of accelerator structures

Scalable Solvers
• Development of a hybrid linear solver
• Goal: balance between computational and memory requirements
  - Exploits techniques from sparse direct methods in computing incomplete factorizations, which are then used as preconditioners for an iterative method
• Numerically stable hybrid solver based on domain decomposition: apply direct methods to the interior domains and a preconditioned iterative method to the interfaces (see the sketch following this section)
• Any number of CPUs can be assigned to each interior domain
  - Larger domains lead to larger aggregated memory and smaller interfaces
  - Smaller interfaces lead to faster convergence
[Figures: strong scalability on Franklin (speedup vs. number of cores); schematic of the matrix]
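As a concrete illustration of the domain-decomposition idea above, the following is a minimal serial sketch of a hybrid direct/iterative solve with a Schur complement on the interface unknowns. It is written with SciPy purely for exposition; the function name hybrid_solve and the index arrays are assumptions, and it is not the ACE3P/TOPS implementation, which factors many interior subdomains concurrently and preconditions the interface solve.

```python
# Minimal serial sketch of a hybrid direct/iterative solve via a Schur
# complement on the interface unknowns (illustration only, not the ACE3P code).
import numpy as np
import scipy.sparse.linalg as spla

def hybrid_solve(A, b, interior, interface):
    """A: square scipy.sparse matrix (CSR); interior/interface: index arrays."""
    I, B = np.asarray(interior), np.asarray(interface)

    # Split A into interior and interface blocks.
    A_II = A[I][:, I].tocsc()
    A_IB = A[I][:, B].tocsc()
    A_BI = A[B][:, I].tocsc()
    A_BB = A[B][:, B].tocsc()

    # Direct method on the interior block (in practice one factorization per
    # subdomain, done in parallel; a single factorization stands in here).
    lu = spla.splu(A_II)

    # Interface Schur complement S = A_BB - A_BI A_II^{-1} A_IB, applied
    # matrix-free so S is never formed explicitly.
    def schur_matvec(x_B):
        return A_BB @ x_B - A_BI @ lu.solve(A_IB @ x_B)

    S = spla.LinearOperator((len(B), len(B)), matvec=schur_matvec)

    # Iterative (Krylov) solve on the interface; a preconditioner built from
    # incomplete factors would be passed via the M argument in practice.
    g = b[B] - A_BI @ lu.solve(b[I])
    x_B, info = spla.gmres(S, g)
    if info != 0:
        raise RuntimeError(f"GMRES did not converge (info={info})")

    # Back-substitute for the interior unknowns.
    x_I = lu.solve(b[I] - A_IB @ x_B)

    x = np.zeros(A.shape[0], dtype=x_B.dtype)
    x[I], x[B] = x_I, x_B
    return x
```

The memory/convergence trade-off noted in the bullets is visible here: a larger interior block means a heavier direct factorization but a smaller interface system for GMRES to converge on.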
Refinements & Parallel Meshing
• Moving window for wakefield computation of short bunches (Talk by L.-Q. Lee)
• Refinement by higher order p or a finer mesh
[Figure: a dipole mode in an ILC cryomodule consisting of 8 superconducting RF cavities]
• Parallel meshing: CLIC two-beam accelerator structure; Cornell ERL vacuum chamber transition
  - 160M elements meshed in 10 minutes using 64 processors (Presentation at Vis Nite)

Visualization using ParaView
• Field distribution in complex structures: PEP-X undulator taper and ILC SRF cavity coupler
• Pseudo Green's function using s = 0.5 mm and s = 3 mm: 8 hours w/ 12000 cores and 15 hours w/ 6000 cores on Jaguar
• 45 hours w/ 4096 cores on Jaguar; 15 TByte of data
• Field distribution in a moving window: 5 hours w/ 18000 cores on Jaguar; 16 TByte of data
• A multi-file NetCDF format is designed to remove the synchronized parallel-writing bottleneck (see the sketch following this section)
• Preliminary testing has shown the success and efficacy of using the format
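To illustrate the multi-file output idea, here is a minimal sketch in which each MPI rank writes its own NetCDF file, so no synchronized parallel write is required. The file names, variable names, and sizes are assumptions for illustration, not the actual ACE3P output schema.

```python
# Minimal sketch of file-per-rank NetCDF output: each MPI rank writes its own
# file independently, avoiding a synchronized/collective write.
import numpy as np
from mpi4py import MPI
from netCDF4 import Dataset

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Stand-in for this rank's share of the field solution
# (n_local elements x 3 components).
n_local = 1000
e_field = np.zeros((n_local, 3))

ds = Dataset(f"fields_{rank:05d}.nc", "w", format="NETCDF4")
ds.createDimension("elem", n_local)
ds.createDimension("comp", 3)
var = ds.createVariable("Efield", "f8", ("elem", "comp"))
var[:, :] = e_field

# Record enough metadata to reassemble the global picture at read time.
ds.mpi_rank = rank
ds.mpi_size = size
ds.close()
```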

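For completeness, a rough pvpython sketch of batch-rendering a field distribution is shown below; the input file, array name, and pipeline are assumptions and do not reproduce the actual ParaView workflow used for these datasets.

```python
# Rough pvpython sketch of batch rendering a field distribution with ParaView
# (run with pvpython/pvbatch). File name and array name are assumptions.
from paraview.simple import (OpenDataFile, Show, ColorBy,
                             GetActiveViewOrCreate, ResetCamera,
                             Render, SaveScreenshot)

reader = OpenDataFile("fields_merged.vtu")   # assumed file assembled from per-rank output
view = GetActiveViewOrCreate("RenderView")

display = Show(reader, view)                 # add the dataset to the render view
ColorBy(display, ("POINTS", "Efield"))       # color by the assumed field array
display.RescaleTransferFunctionToDataRange(True)

ResetCamera(view)
Render(view)
SaveScreenshot("efield.png", view)
```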
* Work supported by the U.S. Department of Energy under contract DE-AC02-76SF00515.