Explore the evolution of Petaflop computing architectures, software challenges, and emerging applications in classic MPP systems. Delve into problem architectures, data assimilation, data grids, and complex system simulations. Discover the potential for innovative solutions in this dynamic field.
Application Requirements: Petaflop Computing
Geoffrey Fox
Computer Science, Informatics, Physics
Indiana University, Bloomington IN 47404
gcf@indiana.edu
Petaflop Studies • Recent Livermore Meeting on Processor in Memory Systems • http://www.epcc.ed.ac.uk/direct/newsletter4/petaflops.html 1999 • http://www.cacr.caltech.edu/pflops2/ 1999 • Several earlier special sessions and workshops • Feb. '94: Pasadena Workshop on Enabling Technologies for Petaflops Computing Systems • March '95: Petaflops Workshop at Frontiers '95 • Aug. '95: Bodega Bay Workshop on Applications • PETA online: http://cesdis.gsfc.nasa.gov/petaflops/peta.html • Jan. '96: NSF Call for 100 TF "Point Designs" • April '96: Oxnard Petaflops Architecture Workshop (PAWS) on Architectures • June '96: Bodega Bay Petaflops Workshop on System Software
Crude Classification • Classic Petaflop MPP • Latency 1 to 10 microseconds • Single (petaflop) machine • Tightly coupled problems • Classic Petaflop Grid • Network latency 10–100 milliseconds or greater • Computer latency <1 millisecond (routing time) • e.g. 100 networked 10-teraflop machines • Only works for loosely coupled modules • Bandwidth is not usually the “hard problem” • Current studies are largely science and engineering (see the grain-size sketch below)
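As a rough illustration of why only loosely coupled modules survive Grid latencies, the following back-of-envelope sketch estimates how much computation each message must amortize on an MPP versus a Grid. The per-node flop rate and the 10% overhead target are assumptions for illustration, not numbers from the slides.

```python
# Back-of-envelope grain-size estimate: how much computation each message
# must amortize to keep latency overhead below a target fraction.
# Illustrative numbers only (node speed and latencies are assumptions).

NODE_FLOPS = 1e10          # assumed 10 gigaflop/s per node
OVERHEAD = 0.10            # tolerate 10% communication overhead

def min_flops_per_message(latency_s):
    """Computation each message must hide so that latency <= OVERHEAD * compute time."""
    return NODE_FLOPS * latency_s / OVERHEAD

for name, latency in [("Classic MPP (5 microseconds)", 5e-6),
                      ("Grid (50 milliseconds)", 50e-3)]:
    print(f"{name}: >= {min_flops_per_message(latency):.1e} flops per message")
```

With these assumed numbers the Grid case needs roughly four orders of magnitude more computation per message, which is why tightly coupled problems stay on a single MPP.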
Styles in “Problem Architectures” I • Classic Engineering and Scientific Simulation: FEM, Particle Dynamics, Moments, Monte Carlo • CFD, Cosmology, QCD, Chemistry, ….. • Work ∝ Memory^(4/3) (see the scaling sketch below) • Needs a classic low-latency MPP • Classic Loosely Coupled Grid: Ocean–Atmosphere, Wing–Engine–Fuselage–Electromagnetics–Acoustics • Few-way functional parallelism • ASCI • Generate Data – Analyse Data – Visualize is a “3-way” Grid • Classic MPP, or a few-way distribution of not-so-big MPPs
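The Work ∝ Memory^(4/3) rule follows from explicit 3D simulations: memory scales with the number of grid points N, while the number of time steps scales with the linear resolution N^(1/3). A minimal numerical sketch, with a purely illustrative cost per point per step:

```python
# Work ~ Memory^(4/3) for an explicit 3D simulation: memory ~ N grid points,
# time steps ~ N^(1/3) (the linear resolution), so total work ~ N^(4/3).
# The flops-per-point constant is purely illustrative.

def work_estimate(n_points, flops_per_point_per_step=100.0):
    steps = n_points ** (1.0 / 3.0)      # time steps scale with linear size
    return flops_per_point_per_step * n_points * steps

for n_points in (1e9, 8e9):              # 8x the memory = 2x the resolution
    print(f"N = {n_points:.0e}: work ~ {work_estimate(n_points):.2e} flops")
# Memory grows 8x while work grows 8**(4/3) = 16x.
```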
Classic MPP Software Issues • Large-scale parallel successes mainly use MPI • MPI is low level and the initial effort is “hard”, but it is • Portable • Packageable as libraries like PETSc • Scalable to very large machines • Good to have higher-level interfaces • The DoE Common Component Architecture (CCA) “packaging modules” will work at coarse grain size • Can build HPF/Fortran90 parallel arrays (I extend this with HPJava), but it is hard to support general complex data structures • We should restart parallel computing research • Note the Grid is set up (see tomorrow) as a set of Web services – this is a totally message-based model (as is CCA) • Run-time compilation can inline a SOAP message to an MPI message or a Java method call • Opposite model – Message Passing is the high-level description (see the message-passing sketch below)
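As a concrete reminder of the message-passing style the slide refers to, here is a minimal ring exchange using the mpi4py binding (an assumption; the slide names only MPI itself). Launch it with something like `mpiexec -n 4 python ring.py`.

```python
# Minimal MPI-style message passing: each rank exchanges a token with its
# neighbours around a ring.  Uses mpi4py purely for illustration.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

token = {"origin": rank}            # small Python object sent as a message
dest = (rank + 1) % size            # right-hand neighbour
src = (rank - 1) % size             # left-hand neighbour

# sendrecv avoids the deadlock a naive blocking send/recv ring could cause.
received = comm.sendrecv(token, dest=dest, source=src)
print(f"rank {rank} of {size} received a token from rank {received['origin']}")
```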
Styles in “Problem Architectures” II • Data Assimilation: combination of sophisticated (parallel) algorithms and a real-time fit to data (see the toy update below) • Environment: Climate, Weather, Ocean • Target tracking • A growing number of applications (in earth science) • Classic low-latency MPP with good I/O • Of growing importance due to “Moore’s law applied to sensors” and large investment in new instruments by NASA, NSF, …… • Need improved algorithms to avoid being overwhelmed by data
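To make the “fit model to streaming data” idea concrete, here is a toy one-variable Kalman-style assimilation step. It is only a sketch of the blending idea, not any operational climate or tracking scheme, and all the numbers are invented.

```python
# Toy scalar data-assimilation step: blend a model forecast with a noisy
# observation, weighting by their variances (a one-variable Kalman update).

def assimilate(forecast, forecast_var, obs, obs_var):
    gain = forecast_var / (forecast_var + obs_var)   # Kalman gain
    analysis = forecast + gain * (obs - forecast)    # corrected state
    analysis_var = (1.0 - gain) * forecast_var       # reduced uncertainty
    return analysis, analysis_var

state, var = 15.0, 4.0            # e.g. a temperature forecast with variance 4
for obs in (16.2, 15.8, 16.0):    # stream of sensor observations (variance 1)
    state, var = assimilate(state, var, obs, obs_var=1.0)
    print(f"analysis = {state:.2f}, variance = {var:.3f}")
```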
Styles in “Problem Architectures” III • Data Deluge Grid: massive distributed data analyzed in “embarrassingly parallel” fashion (see the sketch below) • Virtual Observatory • Medical Image Databases (e.g. Mammography) • Genomics (distributed gene analysis) • Particle Physics Accelerator (100 PB by 2010) • Classic Distributed Data Grid • Corresponds to the fields X-Informatics (X = Bio, Laboratory, Chemistry, …) • See http://www.grid2002.org • Underlies the e-Science initiative in the UK • Industrial applications include health, equipment monitoring (Rolls Royce generates gigabytes of data per engine flight), transactional databases • DoD sees this in places like Aberdeen Proving Ground (Test and Evaluation)
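A minimal sketch of the embarrassingly parallel pattern behind these data grids: each chunk of data is analysed independently and only small summaries are gathered. The file names and the `analyse_one` routine are hypothetical placeholders, not a real analysis code.

```python
# Embarrassingly parallel analysis: every chunk (image, gene, event batch...)
# is processed independently, so the work fans out across whatever
# processors or Grid nodes are available with no inter-task communication.
from multiprocessing import Pool

DATA_FILES = [f"chunk_{i:04d}.dat" for i in range(1000)]   # hypothetical inputs

def analyse_one(path):
    # ... open the file, run the analysis, return a small summary ...
    return {"file": path, "interesting": hash(path) % 100 < 5}

if __name__ == "__main__":
    with Pool() as pool:
        results = pool.map(analyse_one, DATA_FILES)        # independent tasks
    kept = [r for r in results if r["interesting"]]
    print(f"selected {len(kept)} of {len(results)} chunks")
```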
Styles in “Problem Architectures” IV • Complex Systems: Simulations of sets of often “non-fundamental” entities with phenomenological or idealized interactions. Often multi-scale and “systems of systems” and can be “real Grids”; data-intensive simulation • Critical or National Infrastructure Simulations (power grid) • Biocomplexity (molecules, proteins, cells, organisms) • Geocomplexity (grains, faults, fault systems, plates) • Semantic Web (simulated) and Neural Networks • Exhibit phase transitions, emergent network structure (small worlds) • Data used in equations of motion as well as “initial conditions” (data assimilation) • Several fields (e.g. biocomplexity) are immature and not currently using major MPP time • Could set a computational Grid to catch a real Grid but many cases will need a real petaflop MPP
Styles in “Problem Architectures” V • Although problems are hierarchical and multi-scale, it is not obvious that one can use a Grid (putting each subsystem on a different Grid node), as the ratio of Grid latency to MPP latency is typically 10^4 or more and most algorithms can’t accommodate this • X-Informatics is the data (information) aspect of field X; X-Complexity integrates mathematics, simulation, and data • Military simulations (using HLA/RTI from DMSO) are of this style • Entities in a complex system could be vehicles or forces • Or packets in a network simulation • T&E DoD Integrated Modeling and Testing (IMT) is also of this data-intensive simulation style
Societal Scale Applications • Environment: Climate, Weather, Earthquakes • Health: Epidemics • Critical Infrastructure: • Electrical Power • Water, Gas, Internet (all the real Grids) • Wild Fire (weather + fire) • Transportation – TRANSIMS from Los Alamos • All parallelize well due to geometric structure • Military: HLA/RTI (DMSO) • HLA/RTI usually uses event-driven simulations, but the future could be “classic time-stepped simulations”, as these appear to work in many cases IF you define them at fine enough grain size • “Homeland Security?”
Data Intensive Requirements • Grid-like: accelerator, satellite, sensor data from distributed resources • Particle Physics – all parts of the process essentially independent – 10^12 events giving 10^16 bytes of data per year • Happy with tens of thousands of PCs at ALL stages of the analysis • Size reduction as one proceeds through the different stages • Need to select “interesting data” at each stage (see the pipeline sketch below) • Data Assimilation: start with Grid-like gathering of data (similar in size to particle physics) and reduce its size by a factor of 1000 • Note particle physics doesn’t reduce data size but maintains the embarrassingly parallel structure • Size reduction probably determined by computer realism as much as by algorithms • Tightly coupled analysis combining data and PDEs/coupled ODEs
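The “select interesting data at each stage” point can be illustrated with a staged reduction pipeline. The stage names and keep-fractions below are assumptions chosen for illustration, not any experiment's real trigger or skim rates; only the 10^12 events per year figure comes from the slide.

```python
# Staged selection pipeline: each stage keeps only the "interesting"
# fraction of events, so the volume to process shrinks as analysis proceeds.
# Stage names and reduction factors are illustrative assumptions.

stages = [("trigger",        1e-2),   # keep ~1% of raw events
          ("reconstruction", 1.0),    # same events, smaller records
          ("physics skim",   1e-1),   # keep ~10% of reconstructed events
          ("final analysis", 1e-1)]

events = 1e12                          # ~10^12 events per year (from the slide)
for name, keep_fraction in stages:
    events *= keep_fraction
    print(f"after {name:15s}: {events:.1e} events remain")
```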
Particle Physics Web Services • A Service is just a “computer process” running on a (geographically distributed) machine with a “message-based” I/O model • It has input and output ports – data comes from users, raw data sources or other services • Big services are built hierarchically from “basic” services (see the composition sketch below) • Each service invokes a “CPU Farm”
[Diagram: Web services including Accelerator Data WS, Physics Model WS, Detector Model WS, ML Fit WS, Calibration WS, Data Analysis WS, PWA WS, Monte Carlo WS, Experiment Management WS and Visualization WS]
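A minimal sketch of the message-based service picture, modelling each service as a Python callable that consumes and produces dictionary “messages”. The service names follow the diagram, but the bodies and the dummy fit result are placeholders, not a real physics code or Web service stack.

```python
# Message-based services: each "service" takes a message (dict) on its input
# port and emits a message on its output port; bigger services are composed
# hierarchically from basic ones.

def calibration_ws(msg):
    return {**msg, "calibrated": True}

def ml_fit_ws(msg):
    return {**msg, "fit": {"mass": 0.77, "width": 0.15}}   # dummy fit result

def data_analysis_ws(msg):
    # A composite service: invokes "basic" services in turn, passing messages.
    msg = calibration_ws(msg)
    msg = ml_fit_ws(msg)
    return msg

raw = {"source": "accelerator-data-ws", "events": "..."}   # message from a data service
print(data_analysis_ws(raw))
```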
[Diagram: a ~10^4-wide Particle Physics Teraflop Analysis Portal contrasted with a ~10^4-wide Earth Science Petaflop MPP for data assimilation]
Components • USArray – US Seismic Array: a continental-scale seismic array to provide a coherent 3-D image of the lithosphere and deeper Earth • SAFOD – San Andreas Fault Observatory at Depth: a borehole observatory across the San Andreas Fault to directly measure the physical conditions under which earthquakes occur • PBO – Plate Boundary Observatory: a fixed array of strainmeters and GPS receivers to measure real-time deformation on a plate boundary scale • InSAR – Interferometric Synthetic Aperture Radar: images of tectonically active regions providing spatially continuous strain measurements over wide geographic areas.
EarthScope Integration • Structural Representation • Structural Geology Field Investigations • Seismic Imaging (USArray) • Gravity and Electromagnetic Surveying • Kinematic (Deformational) Representation • Geologic Structures • Geochronology • Geodesy (PBO and InSAR) • Earthquake Seismology (ANSS) • Behavioral (Material Properties) Representation • Subsurface Sampling (SAFOD) • Seismic Wave Propagation • Structures + Deformation + Material Properties = Process (community models) → Prediction
Facilities in Support of Science: a Facility vs. an NSF Science Program
a Facility – Data for Science and Education
• Funding and Management: NSF Major Research Equipment Account • Internal NSF process • Interagency collaboration • Cooperative Agreement funding • Community-based management • MRE – $172 M / 5 years
• Product – Data: Science-appropriate • Community-driven • Hazards and resources emphasis • Cutting-edge technology • Free and open access
an NSF Science Program – Fundamental Advances in Geoscience
• Funding and Management: Science driven & research based • Peer reviewed • Individual investigator • Collaborative / multi-institutional • Operations – $71 M / 10 years • Science – $13 M / year
• Product – Scientific Results: Multi-disciplinary trend • Cross-directorate encouragement • Fundamental research and applications • Education and human resources
San Andreas Fault Observatory at Depth
PBO – A Two-Tiered Deployment of Geodetic Instrumentation • A backbone of ~100 sparsely distributed continuous GPS receivers to provide a synoptic view of the entire North American plate boundary deformation zone. • Clusters of GPS receivers and strainmeters to be deployed in areas requiring greater spatial and temporal resolution, such as fault systems and magmatic centers (775 GPS units & 200 strainmeters).
[Figure: site-specific irregular scalar measurements (e.g. PBO) versus constellations for plate-boundary-scale vector measurements; examples include ice sheets (Greenland), volcanoes (Long Valley, CA), topography at 1 km, stress change (Northridge, CA) and earthquakes (Hector Mine, CA)]
Computational Pathway for Seismic Hazard Analysis
[Diagram: a Unified Structural Representation (faults, fault zone structure, velocity structure), together with paleoseismicity, historical seismicity and regional strain, feeds an Earthquake Forecast Model; the chain FSM → RDM → AWM → SRM leads to ground motions and intensity measures, up to a full fault system dynamics simulation]
• FSM = Fault System Model • RDM = Rupture Dynamics Model • AWM = Anelastic Wave Model • SRM = Site Response Model
Seismic Hazard Map
Can it all be done with a Grid? For particle physics – yes. For data-intensive simulations – no.
Societal Scale Applications Issues • Need to overlay with Decision Support, as the problems are often optimization problems supporting tactical or strategic decisions • Verification and Validation, as the dynamics are often not fundamental • Related to the ASCI dream – physics-based stewardship • Some of the new areas, like Biocomplexity and Geocomplexity, are quite primitive and have not even moved to today’s parallel machines • Crisis Management links infrastructure simulations to a collaborative peer-to-peer Grid
Interesting Optimization Applications • Military Logistics Problems such as Manpower Planning for Distributed Repair/Maintenance Systems • Multi-Tiered, Multi-Modal Transportation Systems • Gasoline Supply Chain Model • Multi-level Distribution Systems • Supply Chain Manufacturing Coordination Problems • Retail Assortment Planning Problems • Integrated Supply Chain and Retail Promotion Planning • Large-scale Production Scheduling Problems • Airline Planning Problems • Portfolio Optimization Problems
Decision Application Object Framework
[Diagram: architecture layers – Process Models and Decision Analysis; an Object Space (data structures, distributed application scripting); Multi-Purpose Tools (Parameter Estimation, Output Analysis); Mathematical Programming Models (LP, IP, NLP); Generic Routines (Simulated Annealing, Genetic Algorithms, other algorithms); Grid Infrastructure; HPC Resources]
• Support policy optimization and simulation of complex systems • whose time evolution can be modeled through a set of agents independently engaging in evolution and planning phases, each of which is efficiently parallelizable, • in mathematically sound ways • that also support computational scaling (see the annealing sketch below)
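As an example of the “Generic Routines” box, here is a minimal, generic simulated-annealing loop. The objective and move functions are toy stand-ins, not part of the framework the slide describes.

```python
# Generic simulated annealing: accept improving moves always, worsening
# moves with a Boltzmann probability that shrinks as the temperature cools.
import math
import random

def simulated_annealing(objective, neighbour, x0, t0=1.0, cooling=0.995, steps=5000):
    x, best = x0, x0
    t = t0
    for _ in range(steps):
        candidate = neighbour(x)
        delta = objective(candidate) - objective(x)
        if delta < 0 or random.random() < math.exp(-delta / t):
            x = candidate
            if objective(x) < objective(best):
                best = x
        t *= cooling
    return best

# Toy usage: minimise a bumpy one-dimensional cost function.
cost = lambda x: (x - 3.0) ** 2 + math.sin(5 * x)
move = lambda x: x + random.uniform(-0.5, 0.5)
print(f"best x ~ {simulated_annealing(cost, move, x0=0.0):.3f}")
```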
Intrinsic Computational Difficulties • Large-scale simulations of complex systems • Typically modeled in terms of networks of interacting agents with incoherent, asynchronous interactions • They lack the global time synchronization that provides the natural parallelism exploited in data-parallel applications such as fluid dynamics or structural mechanics • Currently, the interactions between agents are modeled by event-driven methods that cannot be parallelized effectively (see the event-loop sketch below) • But increased performance (using machines like the TeraGrid) needs massive parallelism • Need new approaches for large system simulations
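A minimal discrete-event loop of the kind the slide refers to, with toy agents and invented scheduling: a single global priority queue must be processed strictly in time order, and handling one event may schedule new events at irregular future times, which is what blocks straightforward parallelism.

```python
# Toy discrete-event simulation: one global queue of (time, agent, event)
# tuples processed strictly in time order, so events cannot safely be
# handled concurrently.  Agents and events here are placeholders.
import heapq
import random

events = [(random.uniform(0, 10), agent, "update") for agent in range(5)]
heapq.heapify(events)

while events:
    time, agent, kind = heapq.heappop(events)
    # Handling an event may schedule another at an irregular future time.
    if time < 10 and random.random() < 0.5:
        heapq.heappush(events, (time + random.expovariate(1.0), agent, kind))
    print(f"t = {time:5.2f}: agent {agent} handles {kind}")
```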
Los Alamos SDS Approach • Networks of particles and (partial differential equation) grid points interact “instantaneously”, and simulations reduce to iterating calculate/communicate phases: “calculate the next positions/values at a given time or iteration number” (massively parallel) and then update • Complex systems are made of agents evolving with irregular time steps (cars stopping at traffic lights, crashing, sitting in the garage while the driver sleeps, ..). This lack of global time synchronization stops the natural parallelism of the old approach • SDS combines iterative local planning with massively parallel updates (see the sketch below) • The method seems general ……
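A toy sketch of the time-stepped alternative: every agent plans its next state against a common snapshot (a perfectly parallel “calculate” phase), then all states are committed together (the “update” phase). The agent model here is an invented placeholder, not the actual Los Alamos SDS code.

```python
# Time-stepped agent simulation: a data-parallel planning phase over a
# shared snapshot, followed by a global update, iterated over time steps.
import random

positions = [random.uniform(0.0, 1.0) for _ in range(8)]   # toy agent states

def plan(i, snapshot):
    # Local planning uses only the current global snapshot; it is
    # independent per agent, so this loop could run massively in parallel.
    return snapshot[i] + random.uniform(-0.01, 0.01)

for step in range(100):
    proposals = [plan(i, positions) for i in range(len(positions))]  # calculate
    positions = proposals                                            # update/communicate

print([f"{p:.3f}" for p in positions])
```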