Anton, a Special-Purpose Machine for Molecular Dynamics Simulation. By David E. Shaw et al. Presented by Bob Koutsoyannis.
The Anton Legacy • Anton van Leeuwenhoek, “Father of Microscopy” • First to see bacteria and other microorganisms • Objective: improve the tools available to scientists to further our understanding of organisms and diseases
Anton the Machine • Specialized, massively parallel machine being built to improve molecular dynamics (MD) simulations. • In development, to be completed by 2009 • The simulated biological system is spatially distributed among many nodes connected in a 3D torus. • MD-specific hardware • Novel parallel algorithms
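The 3D torus mentioned above can be sketched in a few lines. This is only an illustration of torus connectivity: the 8 x 8 x 8 shape is an assumption (chosen because it gives the 512 nodes used in the memory estimate later in the talk), and `torus_neighbors` is a hypothetical helper, not part of any Anton software.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of `node` in a 3D torus.

    Each axis wraps around, so a node on the "edge" still has a
    neighbor in every direction.
    """
    x, y, z = node
    nx, ny, nz = dims
    return [
        ((x + 1) % nx, y, z), ((x - 1) % nx, y, z),
        (x, (y + 1) % ny, z), (x, (y - 1) % ny, z),
        (x, y, (z + 1) % nz), (x, y, (z - 1) % nz),
    ]


DIMS = (8, 8, 8)  # assumed shape: 8 * 8 * 8 = 512 nodes
corner = torus_neighbors((0, 0, 0), DIMS)
```

Because of the wraparound links, the corner node (0, 0, 0) is directly connected to (7, 0, 0), (0, 7, 0) and (0, 0, 7), so no node is more than a few hops from any other.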
Molecular Dynamics Simulation • Models the motions and interactions of molecular systems • Proteins • Cell Membranes • DNA • (atomic level simulations)
Motivation • Life Saving… • Used to visualize biochemical phenomena that cannot be seen in lab experiments. • Protein folding • Protein-protein interactions • Protein-drug interactions • Key for developing drugs
What makes one MD simulator better than the next? • Time Scale • Being able to simulate the interaction between molecules for more than a nanosecond. • Problem Size • Why is a millisecond of simulation out of reach of our current technology? • Consider 200,000 molecules • 10^12 time steps to simulate a millisecond (about one femtosecond per step) • Each time step requires intense arithmetic computation on all 200,000 molecules
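The arithmetic behind the time-scale problem can be checked directly. The step count and simulated time come from the slide; the throughput figure `ASSUMED_STEPS_PER_SEC` is a made-up number used only to show how the wall-clock estimate is formed.

```python
SIM_TIME_S = 1e-3          # one millisecond of simulated time (from the slide)
NUM_STEPS = 10**12         # time steps needed (from the slide)

# Implied time step: 1e-3 s / 1e12 steps = 1e-15 s, i.e. one femtosecond.
dt = SIM_TIME_S / NUM_STEPS

# Hypothetical throughput, NOT a measured figure for any real machine.
ASSUMED_STEPS_PER_SEC = 1000.0

wall_seconds = NUM_STEPS / ASSUMED_STEPS_PER_SEC
wall_years = wall_seconds / (365 * 24 * 3600)
```

Even at the assumed thousand steps per second, the run would take roughly three decades of wall-clock time, which is why per-step latency matters so much.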
What makes one MD simulator better than the next? • Other Projects Addressing MD Sims • Folding@Home • Network of 200,000 PCs • Large sample for independent molecular sims • But no millisecond simulations • FASTRUN, MDGRAPE, MD Engine • Good with larger molecular system sims • Have strong arithmetic units • Still limited by communication bottlenecks
MD Simulator Requirements • Force Calculation • (getting an idea of the level of computation needed) • Molecular mechanics force fields are used to model the total potential energy (PE) of a system. • Input: X, Y, Z coordinates • Output: force quantities • (Figure labels: M1, M2)
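As a rough picture of "coordinates in, forces out", here is a sketch of one pairwise term. The slide does not name a functional form; a Lennard-Jones interaction is assumed here purely as a familiar example of a non-bonded term, and `epsilon` and `sigma` are illustrative parameters.

```python
def lj_force(ri, rj, epsilon=1.0, sigma=1.0):
    """Lennard-Jones force on particle i due to particle j.

    Potential: U(r) = 4*eps*((sigma/r)**12 - (sigma/r)**6)
    Force on i: F = 24*eps/r**2 * (2*(sigma/r)**12 - (sigma/r)**6) * (ri - rj)
    """
    d = [a - b for a, b in zip(ri, rj)]        # vector from j to i
    r2 = sum(c * c for c in d)                 # squared distance
    s6 = (sigma * sigma / r2) ** 3             # (sigma/r)**6
    scale = 24.0 * epsilon * (2.0 * s6 * s6 - s6) / r2
    return tuple(scale * c for c in d)


# Two particles one sigma apart: the force on i is repulsive (points away from j).
f_on_i = lj_force((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
f_on_j = lj_force((0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

The force on j is the exact opposite of the force on i, so each pair only needs to be evaluated once and applied to both particles.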
MD Simulator Requirements • Force Calculation • (getting an idea of the level of computation needed) • For every time step, the force fields must be updated. • FFT, Convolution, Inverse FFT (Computationally expensive operations) • For 200,000 molecules/step… • 1) Need a huge number of arithmetic processing elements
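The "FFT, convolution, inverse FFT" sequence can be shown on a toy problem. The sketch below is a 1D, power-of-two, pure-Python version written only to make the three steps visible; a real MD code would use a 3D transform over a grid, and the function names here are hypothetical.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the exponent sign is flipped; the caller divides by n.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def circular_convolve(f, g):
    """Circular convolution: FFT both inputs, multiply pointwise, inverse FFT."""
    n = len(f)
    product = [x * y for x, y in zip(fft(f), fft(g))]
    return [v.real / n for v in fft(product, invert=True)]


# Convolving with a unit impulse at index 1 shifts the signal by one slot.
shifted = circular_convolve([1.0, 2.0, 3.0, 4.0], [0.0, 1.0, 0.0, 0.0])
```

The pointwise multiply in the middle is cheap; the two transforms dominate the cost, and in a parallel machine they also require data from every node.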
MD Simulator Requirements • Integration • (getting an idea of the level of computation needed) • For every time step, updates of atomic positions and velocities must be made. • Global actions and Constraints must be enforced on the entire system (temperature, pressure, optimizations.)
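A minimal sketch of the per-step position and velocity update follows. The slide does not say which integrator is used; velocity Verlet is assumed here because it is a common choice in MD, and the one-dimensional spring force is only a stand-in for a real force field.

```python
def velocity_verlet_step(x, v, force, mass, dt):
    """Advance position x and velocity v by one time step dt."""
    a = force(x) / mass
    v_half = v + 0.5 * dt * a          # half-step velocity update
    x_new = x + dt * v_half            # full-step position update
    a_new = force(x_new) / mass        # forces at the new position
    v_new = v_half + 0.5 * dt * a_new  # finish the velocity update
    return x_new, v_new


def spring_force(x):
    """Toy force: unit spring pulling toward the origin."""
    return -x


x, v = 1.0, 0.0
for _ in range(1000):
    x, v = velocity_verlet_step(x, v, spring_force, mass=1.0, dt=0.01)
energy = 0.5 * v * v + 0.5 * x * x     # starts at 0.5
```

Checking that the total energy stays close to its starting value is a simple way to watch for the kind of energy drift discussed at the end of the talk.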
MD Simulator Requirements • Parallelization • (getting an idea of the level of computation needed) • For every time step, every atom must interact with every other atom within its cutoff radius. • 2) Need a lot of inter-processor communication that scales well.
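The cutoff rule above can be sketched as a brute-force pair search. Periodic boundaries (the minimum-image convention) are an assumption made for this illustration, and the all-pairs loop is deliberately naive; it shows which pairs interact, not how a parallel machine would find them.

```python
import math


def min_image_dist(a, b, cell):
    """Distance between a and b in a periodic cell (minimum-image convention)."""
    d2 = 0.0
    for ai, bi, length in zip(a, b, cell):
        d = (ai - bi) % length        # separation folded into [0, length)
        d = min(d, length - d)        # take the shorter way around
        d2 += d * d
    return math.sqrt(d2)


def pairs_within_cutoff(positions, cell, cutoff):
    """Return index pairs (i, j), i < j, closer than the cutoff radius."""
    n = len(positions)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if min_image_dist(positions[i], positions[j], cell) <= cutoff
    ]


atoms = [(0.5, 0.0, 0.0), (9.5, 0.0, 0.0), (5.0, 5.0, 5.0)]
close = pairs_within_cutoff(atoms, cell=(10.0, 10.0, 10.0), cutoff=1.5)
```

Atoms 0 and 1 sit on opposite faces of the cell but are only one unit apart through the boundary, so they form the single interacting pair. On a distributed machine, pairs like this are exactly the ones that force neighboring nodes to exchange data.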
MD Simulator Requirements • Parallelization • (getting an idea of the level of computation needed) • The whole system is broken down into boxes (processing nodes) • Each node handles the bonded interactions within its box • NT (neutral territory) method for non-bonded interactions (much more common). • NT method for atom migration
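The box decomposition can be sketched as a mapping from atom position to owning node. This shows only the "home box" assignment; it does not implement the NT method itself, and the box length, grid shape, and function names are illustrative assumptions.

```python
from collections import defaultdict


def home_box(pos, box_len, dims):
    """Map a position to the (i, j, k) index of the box (node) that owns it.

    Indices wrap around, so positions outside the cell fold back in.
    """
    return tuple(int(c // box_len) % n for c, n in zip(pos, dims))


def decompose(positions, box_len, dims):
    """Group atom indices by the box that owns them."""
    boxes = defaultdict(list)
    for atom_id, pos in enumerate(positions):
        boxes[home_box(pos, box_len, dims)].append(atom_id)
    return dict(boxes)


positions = [(0.5, 0.5, 0.5), (1.5, 0.2, 0.2), (-0.1, 0.0, 0.0)]
owners = decompose(positions, box_len=1.0, dims=(4, 4, 4))
```

Atom migration then amounts to re-running the mapping after positions change and handing any atom whose home box differs to its new owner.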
Why Specialized Hardware? • 1) Need a huge number of arithmetic processing elements • 2) Need a lot of inter-processor communication that scales well • 3) Memory is not an issue • With 25,000 atoms (64 bytes each), total = 1.6 MB; over 512 nodes that is about 3.1 KB/node, which is smaller than most L1 caches • (Figure labels: Memory, Communication, Computation needs)
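The memory estimate is easy to reproduce. The atom count, bytes per atom, and node count come from the slide; the 32 KiB L1 size is an assumed, typical figure used only for comparison.

```python
ATOMS = 25_000
BYTES_PER_ATOM = 64
NODES = 512

total_bytes = ATOMS * BYTES_PER_ATOM       # 1,600,000 bytes = 1.6 MB
bytes_per_node = total_bytes / NODES       # 3125.0 bytes, roughly 3 KB

ASSUMED_L1_BYTES = 32 * 1024               # assumption: a typical L1 data cache
fits_in_l1 = bytes_per_node < ASSUMED_L1_BYTES
```

With about 3 KB of particle state per node, capacity is not the constraint; arithmetic throughput and communication are.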
Why Specialized Hardware? • Consider Moore’s Law at 10X improvement in 5 years vs. Anton’s 1000X in 1 year. • Can great discoveries wait? • Can use custom pipelines with more precision and faster datapath logic in less silicon area. • Tailored ISAs for geometric calculations • Programmability to accommodate various force fields and integration algorithms • Dedicated memory for each particle to accumulate forces
Communication Latency • Low-latency, high-bandwidth network within and between ASICs. • Push-based communication with counters (reduces waiting). • Set of autonomous Direct Memory Access (DMA) engines allowing for greater overlap of communication and computation. • Admission control features • (Figure labels: "Updating force field"; "This node may update for them")
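The idea of push-based communication with counters can be sketched in software. This is only an analogy for what the slide describes as a hardware mechanism: `ReceiveBuffer` and its methods are hypothetical names, and the real design details are not given in the talk.

```python
class ReceiveBuffer:
    """Receiver-side buffer that senders write into without being asked.

    The receiver knows in advance how many packets to expect and only
    checks a counter, instead of sending requests and waiting for replies.
    """

    def __init__(self, expected):
        self.expected = expected   # number of packets that must arrive
        self.count = 0             # incremented on every push
        self.data = []

    def push(self, packet):
        """Called by a sender: deliver data and bump the counter."""
        self.data.append(packet)
        self.count += 1

    def ready(self):
        """True once every expected packet has arrived."""
        return self.count >= self.expected


buf = ReceiveBuffer(expected=3)
buf.push(("node A", 1.0))
buf.push(("node B", 2.0))
ready_after_two = buf.ready()
buf.push(("node C", 3.0))
ready_after_three = buf.ready()
```

Because no request message is ever sent, the round trip of a pull-style exchange disappears and the receiver can keep computing until the counter reaches the expected value.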
Subsystems of Anton • High-Throughput Interaction Subsystem (HTIS) • Flexible Subsystem • Communication Subsystem • Memory Subsystem
High-Throughput Interaction Subsystem • Executes non-bonded MD interaction calculations (charge spreading & force interpolation) • Accumulates forces on each particle as data streams through. • ICB (interaction control block) controls the flow of data through the HTIS, provides programmable ISA extensions, and acts as a buffering, prefetching, synchronization, and write-back controller
Flexible Subsystem • Initiates force computation phase • Calculates bonded force terms • Force correction terms • All integration tasks • Constraint calculations (temperature & pressure) • Position and velocity updates • Atom migration • All maintenance activities (boot, diagnostics, self-test, loading sims, switching contexts, logging, checkpointing, error reporting).
Flexible Subsystem • General-purpose core with caches • Remote Access Unit: autonomous data transfers • Geometry Cores: bonded MD calculations • Correction Pipeline: computes force correction terms • Racetrack: local, internal interconnect for flexible subsystem components • Ring Interface Unit: lets the flexible subsystem transfer packets to/from the communication subsystem.
Communications Subsystem • Routing: 48-bit address space • 16-bit node identifier, 32 bits of address per node • Flow control. Memory Subsystem • Provides access to ASIC DRAM • Supports accumulation and synchronization
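The 48-bit address split can be illustrated with simple bit operations. The field widths come from the slide; placing the node identifier in the high bits is an assumption, and `pack_address` / `unpack_address` are hypothetical helpers.

```python
NODE_BITS = 16    # node identifier width (from the slide)
LOCAL_BITS = 32   # per-node address width (from the slide)


def pack_address(node_id, local_addr):
    """Combine a node id and a local address into one 48-bit value.

    Assumed layout: node id in the upper 16 bits, local address below it.
    """
    if not 0 <= node_id < (1 << NODE_BITS):
        raise ValueError("node_id does not fit in 16 bits")
    if not 0 <= local_addr < (1 << LOCAL_BITS):
        raise ValueError("local_addr does not fit in 32 bits")
    return (node_id << LOCAL_BITS) | local_addr


def unpack_address(addr):
    """Split a 48-bit address back into (node_id, local_addr)."""
    return addr >> LOCAL_BITS, addr & ((1 << LOCAL_BITS) - 1)


addr = pack_address(0x1234, 0xDEADBEEF)
node_id, local_addr = unpack_address(addr)
```

Sixteen bits of node identifier allow up to 65,536 nodes, each with a 4 GiB local address range.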
Simulation Evaluations • 500X vs. NAMD • 80-100X vs. Desmond • 100X vs. Blue Matter
Accuracy and Efficiency • Increasing the simulated system size increases efficiency. • Force error is measured as relative RMS force error • Energy drift
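A relative RMS force error can be computed as sketched below. The slide does not spell out the formula; the definition assumed here is the RMS of the force error divided by the RMS of the reference force, and the sample vectors are made up for illustration.

```python
import math


def relative_rms_force_error(forces, reference):
    """RMS force error divided by RMS reference force.

    Both arguments are sequences of (fx, fy, fz) tuples, one per atom.
    The 1/N factors cancel, so plain sums are used.
    """
    err2 = sum(
        (f - r) ** 2
        for fv, rv in zip(forces, reference)
        for f, r in zip(fv, rv)
    )
    ref2 = sum(r * r for rv in reference for r in rv)
    return math.sqrt(err2 / ref2)


reference = [(3.0, 4.0, 0.0)]            # illustrative "exact" force
approx = [(3.0, 4.0, 0.5)]               # illustrative computed force
error = relative_rms_force_error(approx, reference)
```

Here the reference force has magnitude 5 and the error has magnitude 0.5, so the relative error is 0.5 / 5.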