1.08k likes | 1.33k Views
Physics of Information Technology. Crystalline Computation. Norm Margolus. Crystalline Computation. Spatial Computers Architectures and algorithms for large-scale spatial computation Emulating Physics Finite-state, locality, invertibility, and conservation laws Physical Worlds
E N D
Physics of Information Technology Crystalline Computation Norm Margolus
Crystalline Computation Spatial Computers • Architectures and algorithms for large-scale spatial computation Emulating Physics • Finite-state, locality, invertibility, and conservation laws Physical Worlds • Incorporating comp-universality at small and large scales see: http://people.csail.mit.edu/nhm/cc.pdf
Spatial Computers • Finite-state computations on lattices are interesting • Crystalline computers will one day have highest performance • Today’s ordinary computers are terrible at these computations
Tom Toffoli and I wanted to see CA-TV Stimulated by new rev rules Rule = a few TTL chips! Later, rule = SRAM LUT LGA’s needed more “serious” machine! CAMs 1 to 6
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice example //* Four direction lattice gas *// 2D-square-lattice {512 512} create-bit-planes {N S E W} define-step shift-planes {(N,0,-1) (S,0,1) (E,1,0) (W,-1,0)} for-each-site if {N==S && E==W} {xchng(N,E); xchng(S,W)} endif end-site end-step
Lattice Gas Machines Better for modeling: • Flexible neighborhoods • Incorporate physics Better for hardware: • Easier communication • Flexible parallelism The lattice gas programming model: alternate steps of movement and interaction.
CAM-8 node • maps directly onto programming model (bit fields) • 16-bit lattice sites (use internal dimension to get more) • each node handles a chunk of a larger space
CAM-8: discrete hydrodynamics Six direction LGA flow past a half-cylinder. System is 2Kx1K.
Crystallization Crystallization using irreversible forces (Jeff Yepez, AFOSR)
CAM-8: materials simulation 512x512x64 spin system, 6up/sec w rendering 256x256x256 reaction-diffusion (Kapral)
Physical sim Logical sim Wavefront sim Microprocessor at right runs at 1KHz CAM-8: logic simulation A simple logic simulation in a gate-array-like lattice dynamics. Wavefront simulation (R. Agin)
CAM-8: porous flow • Used MRI data from a rock (Fontainebleau sandstone) • Simulation of flow of oil through wet sandstone agrees with experiment within 10% • 3D LGA’s used for chemical reactions, other complex fluids 64x64x64 simulation (.5mm/side)
Local dynamics can be used to find and mark features Bit maps can be “continuously” rotated using 3 shears CAM-8: image processing (Toffoli)
25 Million 16-bit site-updates/sec 8 node prototype arbitrary shifts of all bit fields at no extra cost limited by mem bandwidth still 100x PC! CAM-8 node (1988 technology) 6MB/sec x 16 DRAMs + 16 ASICs 25 MHz 64Kx16
100 x faster per DRAM chip No custom hardware Needed fast FPGA interface to DRAM FPGA node design (1997) RDRAM FPGA 600MB/sec 1 RDRAM + 1 FPGA 150 MHz 64Kx16 pipelined
Embedded DRAM (2000) 4 Mbit Block of DRAM: 20 Blocks of DRAM (80 Mbits total): 2K 40K 2K 2K 2K bits / 40 ns 40K bits / 40 ns = 1 Tbit/sec per chip! Each chip would be 1000x faster than 128-DRAM CAM-8!
SPACERAM: A general purpose crystalline lattice computer • Large scale prob with spatial regularity • 1 Tbit/sec ≈ 1010 32-bit mult-accum / sec (sustained) for finite difference • Many image processing and rendering algorithms have spatial regularity • Brute-force physical simulations (complex MD, fields, etc.) • Bit-mapped 3D games and virtual reality (simpler and more direct) • Logic simulation (physical, wavefront) • Pyramid, multigrid and multiresolution computations
The Task: Large-scale brute force computations with a crystalline-lattice structure The Opportunity: Exploit row-at-a-time access in DRAM to achieve speed and size. The Challenge: Dealing efficiently with memory granularity, commun, processing The Trick: Data movement by layout and addressing The Result: Essentially perfect use of memory and communications bandwidth, without buffering SPACERAM . 18µm CMOS, 10 Mbytes DRAM, 1 Tbit/sec to mem, 215 mm2, 20W http://people.csail.mit.edu/nhm/isca.pdf
Lattice computation model 1D example: the rows are bit-fieldsand columns sites A B Shift bit-fields periodically and uniformly (may be large) A B Process sites independently and identically (SIMD)
Mapping the lattice into memory A A B B A A B B
Mapping the lattice into memory CAM-8 approach: group adjacent into memory words A B Shifts can be long but re-aligning messy chaining A B Can process word-wide (CAM-8 did site-at-a-time)
Mapping the lattice into memory CAM-8: Adjacent bits of bit-field are stored together = + + +
Mapping the lattice into memory SPACERAM: bits stored as evenly spread skip samples. = + + +
… module #00 … module #01 … module #02 … … module #03 … module #19 PE0 PE63 The SPACERAM chip
Virtual Processor Space-efficient parallelism Static routing (systolic) Lattice Gas Machines Simpler & more general Ideal for physical modeling Virtual SIMD programming General Lattice Computer Perfect memory embedding Finer-grained virtual-SIMD Computing Crystals Architecture in software Easier to design/build/test Hardware Evolution
Crystalline Computation Spatial Computers • Architectures and algorithms for large-scale spatial computation Emulating Physics • Finite-state, locality, invertibility, and conservation laws Physical Worlds • Incorporating comp-universality at small and large scales see: http://people.csail.mit.edu/nhm/cc.pdf
Why emulate physics? • Comp must adapt to microscopic physics • Comp models may help us understand nature • Rich dynamics • Start with locality: Cellular Automata
Conway’s “Game of Life” In each 3x3 neighborhood, count the ones, not including the center: If total = 2:center unchanged If total = 3:center becomes 1 Else:center becomes 0 256x256 region of a larger grid. Glider gun inserted near middle.
Conway’s “Game of Life” • Captures physical locality and finite-state But, • Not reversible (doesn’t map well onto microscopic physics) • No conservation laws (nothing like momentum or energy) • No interesting large-scale behavior 256x256 region of a larger grid. About 1500 steps later.
Reversibility & other conservations • Reversibility is conservation of information • Why does exact conservation seem hard? The same information is visible at multiple positions For rev, one nth of the neighbor info must be left at the center
Adding conservations • With traditional CA’s, conservations are a non-local property of the dynamics. • Simplest solution: redefine CA’s so that conservation is a manifestly local property • CA regular computation in space & time • Regular in space: repeated structure • Regular in time: repeated sequence of steps
Diffusion rule Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot). Use 2x2 blockings. Use solid blocks on even time steps, use dotted blocks on odd steps.
Diffusion rule Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot).
Diffusion rule Even steps:rotate cw or ccw a b c a b d or c d d b a c Odd steps:rotate cw or ccw a b c a b d or c d d b a c We “randomly” choose to rotate blocks 90-degrees cw or ccw (we actually use a fixed sequence of choices for each spot).
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Use 2x2 blockings. Use solid blocks on even time steps, use dotted blocks on odd steps.
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks.
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Even step: update solid blocks
TM Gas rule Even steps:rotate cw a b c a c d d b Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc Odd step: update dotted blocks
a b c a c d d b TM Gas rule Even steps:rotate cw Odd steps:rotate ccw a b b d c d a c Except:2 ones on diag, nc