Particle Advection Defined & Usage. Particle advection is a foundational visualization algorithm. Advecting particles creates integral curves. Courtesy C. Garth, Kaiserslautern.
Particle advection is a foundational visualization algorithm • Advecting particles creates integral curves
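As a minimal sketch of what "advecting a particle" means, the code below integrates a seed point through a velocity field with fourth-order Runge-Kutta steps and records the resulting integral curve. The slides do not name an integrator; RK4, the function names, and the circular test field are all assumptions made for illustration.

```python
import math

def rk4_step(velocity, p, h):
    """Advance point p by one fourth-order Runge-Kutta step of size h."""
    def add(a, b, s):
        return tuple(ai + s * bi for ai, bi in zip(a, b))
    k1 = velocity(p)
    k2 = velocity(add(p, k1, h / 2))
    k3 = velocity(add(p, k2, h / 2))
    k4 = velocity(add(p, k3, h))
    return tuple(
        pi + h / 6 * (a + 2 * b + 2 * c + d)
        for pi, a, b, c, d in zip(p, k1, k2, k3, k4)
    )

def advect(velocity, seed, h, n_steps):
    """Return the integral curve as a list of points, starting at seed."""
    curve = [seed]
    for _ in range(n_steps):
        curve.append(rk4_step(velocity, curve[-1], h))
    return curve

def rotation(p):
    """Hypothetical test field v(x, y) = (-y, x): circular flow about the origin."""
    x, y = p
    return (-y, x)

# Seed at (1, 0); the exact integral curve is the unit circle.
curve = advect(rotation, (1.0, 0.0), 0.01, 100)
```

Each step depends on the position produced by the previous one, which is why a single particle's steps cannot be computed in parallel.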
Particle advection is a foundational visualization algorithm Courtesy C. Garth, Kaiserslautern
Particle advection is a foundational visualization algorithm Streamlines in the “fish tank”
Particle advection is a foundational visualization algorithm Poincaré analysis of a tokamak Image credit: A. Sanderson, Univ. Utah
Particle advection is a foundational visualization algorithm Stream surface around a tokamak
Lagrangian Methods • Visualize manifolds of maximal stretching in a flow, as indicated by dense particles • Finite-Time Lyapunov Exponent (FTLE) Courtesy Garth
Lagrangian Methods • Visualize manifolds of maximal stretching in a flow, as indicated by dense particles • Forward in time: FTLE+ indicates divergence • Backward in time: FTLE− indicates convergence www.vacet.org Courtesy Garth
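The slides do not state a formula for FTLE; the block below gives the commonly used definition as a reference. The notation is an assumption: \(\Phi_{t_0}^{t_0+T}\) is the flow map that takes a particle seeded at \(\mathbf{x}\) at time \(t_0\) to its position after advecting for time \(T\), \(C\) is the Cauchy-Green deformation tensor, and \(\lambda_{\max}\) is its largest eigenvalue.

```latex
\[
  C(\mathbf{x}) = \left(\nabla \Phi_{t_0}^{t_0+T}(\mathbf{x})\right)^{\top}
                  \nabla \Phi_{t_0}^{t_0+T}(\mathbf{x}),
  \qquad
  \sigma_{t_0}^{T}(\mathbf{x}) = \frac{1}{|T|}
      \ln \sqrt{\lambda_{\max}\bigl(C(\mathbf{x})\bigr)}
\]
```

Taking \(T > 0\) (forward advection) gives FTLE+, and \(T < 0\) (backward advection) gives FTLE−. Evaluating the flow map on a grid needs one advected particle per grid point, which is where the particle counts become large.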
Parallel Particle Advection: How to Load Balance? • N particles (P1, P2, …, PN), M Processing Elements/PEs (PE1, …, PEM) • (Processing Element = instance of your program) • Each particle takes a variable number of steps, S1, S2, …, SN • Total number of steps is ΣSi • We cannot do less work than this (ΣSi) • Goal: Distribute the ΣSi steps over M PEs such that the problem finishes in minimal time
Parallel Particle Advection: How to Load Balance? • Goal: Distribute the ΣSi steps over M PEs such that the problem finishes in minimal time • Sounds sort of like a bin-packing problem, but… • Particles can move from PE to PE (i.e., work can move from bin to bin) • Path of particle is data dependent and unknown a priori (i.e., we don’t know Si beforehand) • Big data significantly complicates this picture… • … data may not be readily available, introducing starvation
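To make the goal concrete, the sketch below computes a simple lower bound on finish time and compares it with a static round-robin assignment of particles to PEs. It assumes every step costs one time unit and ignores I/O and communication; the step counts are hypothetical, since in practice the Si are not known beforehand.

```python
import math

def makespan_lower_bound(steps, num_pes):
    """Lower bound on finish time, in units of one advection step.

    Two things limit us: the total work spread perfectly over all PEs,
    and the longest single particle, whose steps must run one after another.
    """
    total = sum(steps)
    return max(math.ceil(total / num_pes), max(steps))

def static_makespan(steps, num_pes):
    """Finish time when particle i is pinned to PE i % num_pes (round-robin)."""
    load = [0] * num_pes
    for i, s in enumerate(steps):
        load[i % num_pes] += s
    return max(load)

# Hypothetical step counts: two long particles among many short ones.
steps = [1000, 10, 10, 10, 1000, 10, 10, 10]
bound = makespan_lower_bound(steps, 4)   # 1000
static = static_makespan(steps, 4)       # 2000: both long particles land on PE 0
```

Here the static assignment takes twice as long as the bound because it cannot react to step counts it only discovers at run time.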
Advecting particles • Decomposition of large data set into blocks on filesystem • What is the right strategy for getting particle and data together?
Strategy: load blocks necessary for advection Decomposition of large data set into blocks on filesystem Go to filesystem and read block
Strategy: load blocks necessary for advection Decomposition of large data set into blocks on filesystem
“Parallelize over Particles” • Basic idea: particles are partitioned over PEs, blocks of data are loaded as needed. • Positives: • Indifferent to data size: a serial program can process data of any size • Trivial parallelization (partition particles over processors) • Negative: • Redundant I/O (both across PEs and within a PE) is a significant problem.
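The redundant I/O can be illustrated with a small simulation. In this sketch each PE owns a fixed set of particles and keeps a bounded LRU cache of blocks; a counter stands in for filesystem reads. The cache policy, the capacity, and the block paths are assumptions for illustration, not details from the slides.

```python
from collections import OrderedDict

class BlockCache:
    """LRU cache of data blocks for one PE; counts reads from the filesystem."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()
        self.loads = 0

    def get(self, block_id):
        if block_id in self.blocks:
            self.blocks.move_to_end(block_id)    # mark as most recently used
        else:
            self.loads += 1                      # stands in for a filesystem read
            self.blocks[block_id] = object()
            if len(self.blocks) > self.capacity:
                self.blocks.popitem(last=False)  # evict least recently used
        return self.blocks[block_id]

def over_particles(paths, num_pes, capacity):
    """paths[i] is the sequence of block ids particle i visits (hypothetical input)."""
    caches = [BlockCache(capacity) for _ in range(num_pes)]
    for i, path in enumerate(paths):
        cache = caches[i % num_pes]   # particle i stays on one PE for its whole life
        for block_id in path:
            cache.get(block_id)
    return [c.loads for c in caches]

# Four particles that all circulate through the same three blocks.
paths = [[0, 1, 2, 0], [0, 1, 2, 0], [0, 1, 2, 0], [0, 1, 2, 0]]
loads = over_particles(paths, num_pes=2, capacity=2)   # [7, 7]
```

Only three distinct blocks exist, yet the two PEs perform 14 reads between them: each PE reads every block (redundancy across PEs) and re-reads blocks it has already evicted (redundancy within a PE).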
“Parallelize over data” strategy: parallelize over blocks and pass particles (diagram: blocks assigned to PE1, PE2, PE3, PE4)
“Parallelize over Data” • Basic idea: data is partitioned over PEs, particles are communicated as needed. • Positives: • Only load data one time • Great for in situ processing • Negative: • Starvation!
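Starvation can be seen in a round-based simulation. The sketch below assumes one block per PE, one particle processed per PE per round, and particles handed off at the end of each round; these simplifications and the paths are hypothetical.

```python
from collections import deque

def over_data(paths, num_blocks):
    """Block b lives on PE b. paths[i] lists the blocks particle i visits, in order.

    Each round, every PE advects the particle at the head of its queue through
    its own block, then hands the particle to the PE owning the next block.
    Returns (rounds, idle_slots).
    """
    queues = [deque() for _ in range(num_blocks)]
    for i, path in enumerate(paths):
        queues[path[0]].append((i, 0))    # (particle id, position in its path)
    rounds = idle = 0
    while any(queues):
        handoffs = []
        for q in queues:
            if not q:
                idle += 1                 # this PE holds data but has no particle
                continue
            pid, k = q.popleft()
            if k + 1 < len(paths[pid]):
                handoffs.append((pid, k + 1))
        for pid, k in handoffs:           # deliver after the round finishes
            queues[paths[pid][k]].append((pid, k))
        rounds += 1
    return rounds, idle

# Four particles, all seeded in block 0 and all flowing through blocks 1, 2, 3.
paths = [[0, 1, 2, 3]] * 4
rounds, idle = over_data(paths, num_blocks=4)   # (7, 12)
```

Every block is loaded exactly once, but 12 of the 28 PE-rounds are spent idle because the particles are concentrated in one part of the data at a time.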
Contracts are needed to enable these different processing techniques • Parallelize-over-seeds • One execution per block • Only limited data available at one time • Parallelize-over-data • One execution total • Entire data set available at one time
Both parallelization schemes have serious flaws. • Two approaches: • Parallelize over particles • Parallelize over data • Hybrid algorithms
The master-slave algorithm is an example of a hybrid technique. • Uses “master-slave” model of communication • One process has unidirectional control over other processes • Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles. • Nice property for production visualization tools. • (Implemented in VisIt) D. Pugmire, H. Childs, C. Garth, S. Ahern, G. Weber, “Scalable Computation of Streamlines on Very Large Datasets.” SC09, Portland, OR, November, 2009
Master-Slave Hybrid Algorithm (diagram: 16 processes, P0–P15, of which 4 are labeled Master and 12 are labeled Slave) • Divide processors into groups of N • Uniformly distribute seed points to each group • Master: • Monitor workload • Make decisions to optimize resource utilization • Slaves: • Respond to commands from Master • Report status when work complete
Master Process Pseudocode
Master() {
    while ( ! done ) {
        if ( NewStatusFromAnySlave() ) {
            commands = DetermineMostEfficientCommand()
            for cmd in commands
                SendCommandToSlaves( cmd )
        }
    }
}
What are the possible commands?
Commands that can be issued by master • Assign / Loaded Block • Assign / Unloaded Block • Handle OOB / Load • Handle OOB / Send • OOB = out of bounds
Commands that can be issued by master • Assign / Loaded Block: slave is given a particle that is contained in a block that is already loaded
Commands that can be issued by master • Assign / Unloaded Block: slave is given a particle and loads the block
Commands that can be issued by master • Handle OOB / Load (diagram: Master sends “Load” to Slave): slave is instructed to load a block. The particle advection in that block can then proceed.
Commands that can be issued by master • Handle OOB / Send (diagram: Master sends “Send to J” to Slave): slave is instructed to send the particle to another slave, Slave J, that has loaded the block
Master Process Pseudocode
Master() {
    while ( ! done ) {
        if ( NewStatusFromAnySlave() ) {
            commands = DetermineMostEfficientCommand()
            for cmd in commands
                SendCommandToSlaves( cmd )
        }
    }
}
Driving factors… • You don’t want PEs to sit idle • (The problem with parallelize-over-data) • You don’t want PEs to spend all their time doing I/O • (The problem with parallelize-over-seeds) • You want to do the “least amount” of I/O that will prevent “excessive” idleness.
The heuristics behind the heuristics • Rules: • If no PE has the block, then load it! • If a PE does have the block… • If it is not busy, then give it the particle • If it is busy, then wait “a while”. • If you’ve waited a “long time”, then load it yourself.
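The rules above can be sketched as a single decision function. This is an illustration only, not the VisIt implementation: the function name, its arguments, the `WAIT` outcome, and the `patience` threshold (standing in for the slide's "long time") are all assumptions.

```python
from enum import Enum

class Command(Enum):
    LOAD = "load the block yourself"
    SEND = "send the particle to a slave that holds the block"
    WAIT = "keep the particle and ask again later"

def decide(block_id, loaded_on, busy, waited, patience):
    """Pick a command for a particle that needs block_id.

    loaded_on: dict mapping block id -> set of slave ids holding that block
    busy:      set of slave ids that are currently busy
    waited:    how long this particle has already waited
    patience:  hypothetical threshold for "a long time"
    Returns (Command, target slave id or None).
    """
    holders = loaded_on.get(block_id, set())
    if not holders:
        return Command.LOAD, None       # no PE has the block: load it
    free = sorted(holders - busy)
    if free:
        return Command.SEND, free[0]    # a holder is not busy: give it the particle
    if waited >= patience:
        return Command.LOAD, None       # waited a long time: load it yourself
    return Command.WAIT, None           # all holders busy: wait a while

# Block 7 is held by slaves 2 and 3; slave 2 is busy, so the particle goes to 3.
choice = decide(7, {7: {2, 3}}, busy={2}, waited=0, patience=5)
```

Raising `patience` trades more idle time for less I/O, which is the balance the driving factors describe.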
Master-slave in action (diagram: slaves S0–S4 annotated with step number and action: “0: Read”, “1: Pass”, “1: Read”)
Algorithm Test Cases • Core collapse supernova simulation • Magnetic confinement fusion simulation • Hydraulic flow simulation
Workload distribution in parallelize-over-particles: too much I/O
Workload distribution in supernova simulation (figure: three panels, parallelization by Data, Particles, and Hybrid; colored by PE doing integration)
Astrophysics Test Case: Total time to compute 20,000 streamlines (figure: two plots, Uniform Seeding and Non-uniform Seeding; seconds vs. number of procs; curves: Particles, Hybrid, Data)
Astrophysics Test Case: Number of blocks loaded (figure: two plots, Uniform Seeding and Non-uniform Seeding; blocks loaded vs. number of procs; curves: Data, Hybrid, Particles)
Summary: Master-Slave Algorithm • First ever attempt at a hybrid algorithm for particle advection • Algorithm adapts during runtime to avoid pitfalls of parallelize-over-data and parallelize-over-particles. • Nice property for production visualization tools. • Implemented inside VisIt visualization and analysis package.
Emerging I/O Configuration (figure: Common I/O Configuration vs. Emerging I/O Configuration)
Simulation Codes (figure labels, first diagram: Simulation Code, Checkpoint/Restart, Node, Global File System; second diagram: Simulation Code, Checkpoint/Restart, SSD, Node)
Parallelize-over-Seeds Algorithm (figure labels: Integration, I/O, Streamlines, Global File System, Memory Cache, Node)
Parallelize-over-Seeds Algorithm with the Extended Memory Hierarchy (figure labels: Integration, I/O, Streamlines, Global File System, Memory Cache, Local File System, Node)
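One way to picture the extended hierarchy is as a tiered lookup: memory cache first, then the node-local file system, and only then the global file system. The sketch below is an assumption about how such a lookup could be organized, with an LRU memory cache and unbounded local storage; the class and its names are hypothetical.

```python
from collections import Counter, OrderedDict

class TieredBlockStore:
    """Hypothetical block store with a memory cache backed by node-local storage."""
    def __init__(self, memory_capacity):
        self.memory = OrderedDict()   # small and fast: LRU cache in RAM
        self.local = set()            # larger: node-local file system (e.g. SSD)
        self.memory_capacity = memory_capacity
        self.reads = Counter()        # how many fetches each tier served

    def fetch(self, block_id):
        """Return the name of the tier that served block_id."""
        if block_id in self.memory:
            self.memory.move_to_end(block_id)
            tier = "memory"
        else:
            tier = "local" if block_id in self.local else "global"
            self.local.add(block_id)              # keep a copy on the node
            self.memory[block_id] = True
            if len(self.memory) > self.memory_capacity:
                self.memory.popitem(last=False)   # evict least recently used
        self.reads[tier] += 1
        return tier

store = TieredBlockStore(memory_capacity=2)
tiers = [store.fetch(b) for b in [0, 1, 2, 0, 1, 2, 2]]
# tiers: three "global" reads, then three "local" reads, then one "memory" hit
```

In this trace, blocks evicted from memory are re-read from local storage rather than from the global file system, so the global file system is touched only once per block.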
Goal • Efficient code for a variety of particle advection based techniques • Cognizant of use cases with >>10K particles. • Need handling of every particle, every evaluation to be efficient. • Want to support diverse flow techniques: flexibility/extensibility is key. • Fit within data flow network design (i.e. a filter)
Motivating examples of system • FTLE • Stream surfaces • Streamline • Dynamical Systems (e.g. Poincaré Maps) • Statistics based analysis • + more