Performance Engineering Research Institute (PERI). Patrick H. Worley Computing and Computational Science Directorate Computer Science and Mathematics. Performance engineering: Enabling petascale science. Petascale computing is about delivering performance to scientists.
Performance Engineering Research Institute (PERI)
Patrick H. Worley
Computing and Computational Science Directorate
Computer Science and Mathematics
Performance engineering: Enabling petascale science
Petascale computing is about delivering performance to scientists.
Maximizing performance is getting harder:
• Systems are more complicated
  • O(100K) processors
  • Multi-core with SIMD extensions
• Scientific software is more complicated
  • Multi-disciplinary and multi-scale
PERI addresses this challenge in three ways:
• Model and predict application performance
• Assist SciDAC scientific code projects with performance analysis and tuning
• Investigate novel strategies for automatic performance tuning
Figures: BeamBeam3D accelerator modeling; POP model of El Nino; IBM BlueGene at LLNL; Cray XT3 at ORNL
SciDAC-1 Performance Evaluation Research Center (PERC): 2001–2006
Initial goals:
• Develop performance-related tools and methodologies for
  • Benchmarking
  • Analysis
  • Modeling
  • Optimization
Second phase:
• In the last two years, added emphasis on optimizing performance of SciDAC applications, including
  • Community Climate System Model
  • Plasma Microturbulence Project (GYRO, GS2)
  • Omega3P accelerator model
Some lessons learned
• Performance portability is critical
  • Codes outlive computing systems
  • Scientists can’t publish that they ported and optimized code
• Most computational scientists are not interested in performance tools
  • They want performance experts to work with them
  • Such experts are not “scalable,” i.e., they are a limited resource and introduce yet another bottleneck in optimizing code
SciDAC-2 Performance Engineering Research Institute (PERI)
• Evaluating architectures and algorithms in preparation for move to petascale
• Providing guidance in automatic tuning
• Providing near-term impact on the performance optimization of SciDAC applications
• Long-term research goal to improve performance portability
• Relieving the performance optimization burden from scientific programmers
Engaging SciDAC software developers
Application Engagement, Application Liaisons, Tiger Teams
• Work directly with DOE computational scientists
• Ensure successful performance porting of scientific software
• Focus PERI research on real problems
• Build long-term personal relationships between PERI researchers and scientific code teams
• Focus on DOE’s highest priorities
  • SciDAC-2
  • INCITE
Figure: Community Atmosphere Model Performance Evolution (IBM p690 cluster, T42L26 benchmark). Simulation years per day versus number of processors (1 to 256). Series: V2.0, original (2001) settings; V2.0, load balanced, MPI-only; V2.0, load bal., MPI/OpenMP; load bal., MPI/OpenMP, improved dyn. and land; load bal., MPI/OpenMP, improved dyn., land, and physics. Panel labels: Optimizing arithmetic kernels; Maximizing scientific throughput.
Performance modeling
Modeling is critical for automation of tuning:
• Guidance to the developer
  • New algorithms, systems, etc.
• Need to know where to focus effort
  • Where are the bottlenecks?
• Need to know when we are done
  • How fast should we expect to go?
• Predictions for new systems
Obvious needed improvements:
• Greater accuracy and bounds on error
• Reduced cost (human/system)
Modeling efforts contribute to procurements and other activities beyond PERI automatic tuning.
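The two modeling questions on this slide ("Where are the bottlenecks?" and "How fast should we expect to go?") can be illustrated with a very small bound-based model. The sketch below is not PERI's modeling methodology. It assumes a roofline-style lower bound, in which a kernel can run no faster than the slower of its compute time and its memory-traffic time. The `Machine` and `Kernel` types, the function names, and all numbers are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    peak_flops: float     # hypothetical peak rate, floating-point operations per second
    mem_bandwidth: float  # hypothetical sustained memory bandwidth, bytes per second


@dataclass
class Kernel:
    flops: float        # floating-point operations the kernel performs
    bytes_moved: float  # bytes it must move to and from main memory


def lower_bound_seconds(kernel: Kernel, machine: Machine) -> float:
    """Roofline-style bound: the kernel cannot beat its slower resource."""
    compute_time = kernel.flops / machine.peak_flops
    memory_time = kernel.bytes_moved / machine.mem_bandwidth
    return max(compute_time, memory_time)


def bottleneck(kernel: Kernel, machine: Machine) -> str:
    """Report which resource sets the bound ("where to focus effort")."""
    compute_time = kernel.flops / machine.peak_flops
    memory_time = kernel.bytes_moved / machine.mem_bandwidth
    return "memory" if memory_time > compute_time else "compute"


def fraction_of_bound(measured_seconds: float, kernel: Kernel, machine: Machine) -> float:
    """Ratio of the bound to the measured time; values near 1.0 suggest tuning is done."""
    return lower_bound_seconds(kernel, machine) / measured_seconds


# Illustrative numbers only: a DAXPY-like kernel on one million doubles
# (2 flops per element; two 8-byte reads and one 8-byte write per element).
machine = Machine(peak_flops=10e9, mem_bandwidth=5e9)
daxpy = Kernel(flops=2e6, bytes_moved=24e6)

print(bottleneck(daxpy, machine))
print(lower_bound_seconds(daxpy, machine))
print(fraction_of_bound(0.0096, daxpy, machine))
```

A model this coarse gives only a rough target. The slide's call for greater accuracy and bounds on error is exactly what such a sketch lacks.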
Automatic performance tuning of scientific code
Long-term goals for PERI
• Automate the process of tuning software to maximize its performance
• Reduce the performance portability challenge facing computational scientists
• Address the problem that performance experts are in short supply
• Build upon forty years of human experience and recent success with linear algebra libraries
Automatic tuning flowchart
Stages (in slide order): Source code, Triage, Analysis, Transformations, Code generation, Code selection, Application assembly, Training runs, Production execution, Runtime adaptation
Guidance:
• Measurements
• Models
• Hardware information
• Sample input
• Annotations
• Assertions
Other elements: Domain-specific code generation, External software, Runtime performance data, Persistent database
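A few boxes of the flowchart (code variants, training runs on sample input, code selection, and a persistent database) can be sketched as a toy empirical search. This is a minimal illustration under stated assumptions, not the PERI system. The kernel `blocked_sum_of_squares`, the helpers `time_variant` and `tune`, the candidate block sizes, and the in-memory dictionary standing in for the persistent database are all hypothetical.

```python
import time


def blocked_sum_of_squares(data, block):
    """Hypothetical tunable kernel: process the input in chunks of `block` elements."""
    total = 0.0
    for start in range(0, len(data), block):
        total += sum(x * x for x in data[start:start + block])
    return total


def time_variant(func, data, param, repeats=3):
    """Training run: return the best wall-clock time over a few repetitions."""
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        func(data, param)
        best = min(best, time.perf_counter() - t0)
    return best


def tune(func, sample_input, candidates, database, key):
    """Code selection: pick the fastest candidate and record it in the database.

    If the database already holds a result for `key`, reuse it and skip the
    training runs entirely.
    """
    if key in database:
        return database[key]["param"]
    timings = {p: time_variant(func, sample_input, p) for p in candidates}
    best_param = min(timings, key=timings.get)
    database[key] = {"param": best_param, "seconds": timings[best_param]}
    return best_param


# A plain dict stands in for the persistent database in this sketch.
db = {}
sample = [float(i) for i in range(20000)]
chosen = tune(blocked_sum_of_squares, sample, [64, 512, 4096], db, "sum_of_squares")
print("selected block size:", chosen)
```

Timing each candidate on sample input is the simplest possible search strategy. The flowchart's guidance inputs (models, hardware information, annotations) would serve to prune or steer such a search rather than enumerate every variant.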
PERI portal
www.peri-scidac.org
• System integration issues:
  • Automatic tuning is common goal of multiple research activities internal to PERI
  • No hope of actually integrating them (e.g., Open64, SUIF, and ROSE compilers) into one system in the near future
• Instead, PERI will bring up a Web portal that will be our interface to application developers:
  • There will often be “a performance engineer behind the curtain”
  • Goal is research demonstration of capability
The Team
• Argonne National Lab: Paul Hovland, Dinesh Kaushik, Boyana Norris
• Lawrence Berkeley National Lab: David Bailey, Daniel Gunter, Katherine Yelick
• Lawrence Livermore National Lab: Bronis de Supinski, Daniel Quinlan
• Oak Ridge National Lab: Jeffrey Vetter, Patrick Worley
• Rice University: John Mellor-Crummey
• University of California - San Diego: Laura Carrington, Allan Snavely
• University of Maryland: Jeffrey Hollingsworth
• University of North Carolina: Rob Fowler, Daniel Reed, Ying Zhang
• University of Southern California: Jacqueline Chame, Mary Hall, Bob Lucas (P.I.)
• University of Tennessee: Jack Dongarra, Shirley Moore
Contact
Patrick H. Worley
Oak Ridge National Laboratory
(865) 574-3128
worleyph@ornl.gov

Fred Johnson
DOE Program Manager
Office of Advanced Scientific Computing Research
DOE Office of Science