Computing Media and Languages for Space-Oriented Computations (opening remarks)
Big Idea: Matter Computes
• Our physical world implements computations
• Double implication:
  • Computing landscape determined by the laws of the physical world
  • Understand our physical world in terms of the computation it performs
  • Control our physical world by programming the computation it performs
Convergence of Concerns
• Dealing with space as a physical issue when implementing modern computing components and systems: DSM ICs, sublithographic effects
• Realizing shapes/behaviors/properties using computation: distributed robotics, programmable matter
• Programming physical systems: self-assembly, protein networks, …
Viewpoint
• Traditional/mainstream abstractions, models, algorithms, and languages have not adequately dealt with spatial issues
  • Either as an optimization or as a computational goal
• Now have several communities approaching this from different perspectives
Monday
• 9:00am Opening Remarks
• 9:15am DeHon – Spatial Computing
• 11:15am Coore – Amorphous Computing
• 12:15pm LUNCH
• 1:30pm Goldstein – Programmable Matter
• 2:30pm Gruau – Blob Computing
• 3:30pm Coffee/Cake
• 4:00pm Giavitto – Data Structures as Space
Challenges and Opportunities for Spatial Computing
André DeHon <andre@acm.org>
Message
• Opportunity
  • Large and capable computing systems
  • Continued scaling, primarily in spatial capacity
  • Performance capabilities from parallelism
  • Dynamically (re)programmable/adaptive
• Spatial challenges
  • Distance = delay
  • Communications take up space and energy
• Demands
  • New models/abstractions/algorithms
Outline
• Scaling
• Spatial vs. temporal computation
• Grounding spatial examples: FPGAs, nanoPLA
• Spatial challenges
  • Scaling, interconnect delay and requirements
  • Defects, faults, lifetime effects
• Opportunities: capacity, parallelism, scaling, adaptation
• Why not C, VHDL, …
• Design patterns
• System architectures
Spatial Capacity Scaling Continues
• ~Tλ² (10¹² λ²) by the end of the silicon roadmap (another decade): over a billion gates
• Molecular scale promises two orders of magnitude more than that
• All of this is 2D; we still have the third dimension to exploit [paper at NanoNets 2006 in 2 weeks]
Implication
• Qualitatively not in the same world we were in from 1945 to 1985
• Orders-of-magnitude shift in resources suggests dramatic changes in strategy
• We have been on an exponential curve, but up to about 1990 we were shrinking the same kind of computers down to one chip
Example
• Compute: Y = Ax² + Bx + C
Temporal Implementation
• Single operator, reused in time
• Store instructions; store intermediates
• Communication across time
• One cycle per operation
Spatial Implementation
• One operator for every operation
• Instruction per operator
• Communication in space
• Computation in a single cycle
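To make the temporal/spatial contrast concrete, here is a minimal C sketch (not from the talk; the function names and the cycle accounting in the comments are illustrative). The temporal version models a single shared operator reused across cycles with stored intermediates; the spatial version stands in for a circuit with one operator per operation that settles in a single cycle.

#include <stdio.h>

/* Temporal: one multiplier and one adder reused; each statement models
 * one cycle, with intermediates held in registers t1..t4. */
int poly_temporal(int A, int B, int C, int x) {
    int t1 = x * x;    /* cycle 1: the single multiplier */
    int t2 = A * t1;   /* cycle 2: multiplier reused */
    int t3 = B * x;    /* cycle 3: multiplier reused */
    int t4 = t2 + t3;  /* cycle 4: the single adder */
    return t4 + C;     /* cycle 5: adder reused */
}

/* Spatial: in hardware all five operators exist at once as a
 * combinational circuit; a single C expression only suggests this. */
int poly_spatial(int A, int B, int C, int x) {
    return A * x * x + B * x + C;  /* one operator per operation */
}

int main(void) {
    printf("%d %d\n", poly_temporal(1, 2, 3, 4), poly_spatial(1, 2, 3, 4));
    return 0;
}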
Conventional Processors
• Have a temporal organization
• Large instruction and data memory per active processing element
• Economize on instruction memory with a word-wide SIMD organization (one instruction drives a w-wide row of ops)
Conventional Processors
• Economize on area: pack a large computation into a small area
• Store the description of the computation compactly; reuse a small number of processing elements in time
• Trade time for area
• Absolutely the right thing for 1985 silicon (and for pre-integrated-circuit machines)
Early Challenge
• How do I make my large program fit on an economical computer?
• What can I compute with 10K vacuum tubes?
• How does it fit in caches that hold 100 instructions, in a 64K address space?
• Heavy sequentialization was a good engineering solution for 1945–1990
Field-Programmable Gate Array (FPGA)
• K-LUT (typical K=4); LUT = Look-Up Table
• Compute block: LUT with optional output flip-flop
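As an illustration (a hypothetical sketch, not vendor code): a K-input LUT is nothing more than a 2^K-bit truth table addressed by its inputs, so a 4-LUT's entire "instruction" is 16 configuration bits.

#include <stdint.h>
#include <stdio.h>

/* A 4-LUT: the 16-bit truth table is the LUT's configuration;
 * bit i of the table is the output when the inputs encode i. */
typedef struct {
    uint16_t truth_table;
} lut4;

static int lut4_eval(lut4 l, int a, int b, int c, int d) {
    int index = (d << 3) | (c << 2) | (b << 1) | a;
    return (l.truth_table >> index) & 1;
}

int main(void) {
    /* Configure as a 4-input AND: only input pattern 1111 (index 15)
     * produces a 1. */
    lut4 and4 = { .truth_table = (uint16_t)(1u << 15) };
    printf("%d %d\n", lut4_eval(and4, 1, 1, 1, 1),
                      lut4_eval(and4, 1, 0, 1, 1));
    return 0;
}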
Field-Programmable Gate Arrays
• Have a spatial computing organization
• Small instruction area per active operator: pack more computation on the die
• Bit-level control: use more of the available ops
Field-Programmable Gate Arrays
• Put more area into computing
• Have more compute elements per die; support more computation per cycle
• Trade area for time
• With more capacity, more applications fit spatially
• More appropriate for 2000 and beyond
Component Example (each a single die in 0.35 μm)
• XC4085XL-09 (FPGA): 3,136 CLBs, 4.6 ns cycle, 682 bit ops/ns
• Alpha (1996): 2 × 64b ALUs, 2.3 ns cycle, 55.7 bit ops/ns
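The density figures follow directly from the listed resources and cycle times, assuming the slide counts one bit operation per CLB per cycle:

\frac{3136\ \text{bit ops}}{4.6\ \text{ns}} \approx 682\ \frac{\text{bit ops}}{\text{ns}},
\qquad
\frac{2 \times 64\ \text{bit ops}}{2.3\ \text{ns}} \approx 55.7\ \frac{\text{bit ops}}{\text{ns}}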
[Figure: empirical raw density comparison; computational density (ALU bit ops per λ²·s) plotted over time]
Spatial Computing
• Enabled by high capacity; has a density advantage
• Now have sufficient capacity to hold a large range of interesting problems
• 100,000 bit-level operators on a single chip, with more on the way
• Can exploit the kinds of capacities now becoming available
Today's FPGAs
• 100,000s of LUTs
• Embedded blocks: many small distributed memories, megabits of memory
• Data rates ~10 Tb/s (10–100× over a μP)
• Operate at 100s of MHz
• Easily scale up spatially (step and repeat)
Simple Nanowire-Based PLA
• NOR-NOR = AND-OR PLA logic [FPGA 2004]
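One quick way to see the NOR-NOR = AND-OR identity is to brute-force it. This sketch (illustrative only) assumes complemented literals are available at the first NOR plane and a complemented output at the second, and checks that the cascade computes the sum-of-products a·b + c·d on all 16 input combinations.

#include <stdio.h>

static int nor(int x, int y) { return !(x | y); }

int main(void) {
    for (int v = 0; v < 16; v++) {
        int a = v & 1, b = (v >> 1) & 1, c = (v >> 2) & 1, d = (v >> 3) & 1;
        int p1  = nor(!a, !b);   /* first NOR plane on complemented literals: a AND b */
        int p2  = nor(!c, !d);   /* first NOR plane: c AND d */
        int sop = !nor(p1, p2);  /* complemented second plane: p1 OR p2 */
        if (sop != ((a && b) || (c && d))) { puts("mismatch"); return 1; }
    }
    puts("NOR-NOR matches AND-OR on all 16 inputs");
    return 0;
}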
Tile into Arrays [FPGA 2005]
nanoPLA Capacity
• 10 μm × 5 μm subarrays: millions on a single-layer, modest die
• 100 product terms per subarray
• Include memory blocks
• Stack in 3D
Interconnect Challenge
• With 100,000 processing elements cooperating on a task (achievable with today's FPGAs), the elements must communicate
• Interconnect becomes dominant: area, delay, energy
• Interconnect replaces memory as the communication medium, yet is less heavily studied
Large Memories
• Build a larger memory
• Simple model: multiplex together more cells
Delay vs. Memory (1)
• How does delay grow with memory size N?
• Tmem = Tdecode + Tcell + Tmux
• Tmux = log₄(N) · Tmux4 (depth of a tree of 4-input muxes)
Delay vs. Memory (2)
• Tmem = O(log N): does this make sense for large N? Speed of light?
• Tmem = Tlogic + Twire
• 2D memory: Twire = O(√N)
• Tmem = C₁·log(N) + C₂·√N
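A toy calculation makes the crossover visible; the constants C1 and C2 below are assumed purely to show the trend, not measured. The logarithmic mux-tree term dominates for small N, while the O(√N) wire term dominates for large N.

#include <math.h>
#include <stdio.h>

int main(void) {
    const double C1 = 1.0, C2 = 0.01;  /* assumed constants, not measured */
    for (double n = 1e3; n <= 1e9; n *= 1e3) {
        double t_logic = C1 * log2(n);  /* mux-tree depth term */
        double t_wire  = C2 * sqrt(n);  /* 2D wire-length term */
        printf("N=%.0e  logic=%6.1f  wire=%8.1f\n", n, t_logic, t_wire);
    }
    return 0;
}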
Chips >> Cycles
• Chips are growing
• Gate delays are shrinking
• Wire delays aren't scaling down
• It will take many cycles to cross a chip
Clock Cycle Radius
• Radius of logic one can reach in one cycle (at 45 nm): about 10 PE widths
• A few hundred PEs within that radius
• Chip side: 600–700 PEs, so 400–500 thousand PEs per chip
• 100s of cycles to cross the chip
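Back-of-envelope arithmetic behind the slide's round numbers, taking the PE pitch as the unit of distance (an assumption about the accounting):

\pi r^2 \approx \pi \cdot 10^2 \approx 3 \times 10^2 \ \text{PEs within one cycle-radius},
\qquad
650^2 \approx 4.2 \times 10^{5} \ \text{PEs per chip},
\qquad
650 / 10 = 65 \ \text{cycle-radii per edge-to-edge crossing}

With routing overhead and round trips, chip-crossing communication lands in the 100s of cycles.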
Communication Expensive
• What if we just built a crossbar? Interconnect area scales as N²
• Must exploit the typical locality in a design to reduce area
• Rent's Rule: IO = c·N^p, with 0.5 ≤ p ≤ 0.75 typical
• How well can we engineer low p?
• Where does this show up in algorithm/computation design?
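A small sketch (constants assumed for illustration) shows how strongly the Rent exponent p controls interconnect demand compared with the N² wiring of a full crossbar:

#include <math.h>
#include <stdio.h>

int main(void) {
    const double c = 1.0;    /* assumed Rent constant */
    const double n = 1e5;    /* ~100,000 PEs, as above */
    const double p[] = {0.5, 0.75};
    for (int i = 0; i < 2; i++)
        printf("p=%.2f  IO = c*N^p = %.0f\n", p[i], c * pow(n, p[i]));
    printf("crossbar wires ~ N^2 = %.0e\n", n * n);
    return 0;
}

For N = 100,000 this gives roughly 316 IOs at p = 0.5 versus about 5,600 at p = 0.75, against 10^10 crossbar crosspoints: engineering a low p is worth orders of magnitude.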
Optimizing
• Must exploit physical locality (placement)
• Reduce the wire requirement (reduce p)
• Reduce the distance traveled over wires: a new meaning for spatial locality
• Interconnect must show up in our design, in run-time management, and in algorithms
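To illustrate what placement buys (all data here made up): the total Manhattan wirelength of the same 4-net ring drops sharply when communicating cells are placed adjacently.

#include <stdio.h>
#include <stdlib.h>

typedef struct { int x, y; } pos;

/* Sum of Manhattan distances over all two-point nets. */
static int wirelength(const pos p[], const int nets[][2], int n) {
    int total = 0;
    for (int i = 0; i < n; i++) {
        pos a = p[nets[i][0]], b = p[nets[i][1]];
        total += abs(a.x - b.x) + abs(a.y - b.y);
    }
    return total;
}

int main(void) {
    int nets[4][2] = { {0,1}, {1,2}, {2,3}, {3,0} };  /* a ring of 4 cells */
    pos good[4] = { {0,0}, {0,1}, {1,1}, {1,0} };     /* ring laid out as a square */
    pos bad[4]  = { {0,0}, {3,3}, {0,3}, {3,0} };     /* cells scattered */
    printf("good placement: %d\n", wirelength(good, nets, 4));  /* 4 */
    printf("bad placement:  %d\n", wirelength(bad, nets, 4));   /* 18 */
    return 0;
}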
Clock Cycle Scaling Has Ended
[Figure: probability distribution of path delay, old vs. new technology, relative to the clock cycle]
• Up to ~2000, we scaled down the clock cycle
• Architecture scaling: fewer gates per clock; now down to ~10 gates/clock
• Energy-limited computation: could run a few devices faster, but not all of them
• Variation at the nanoscale diminishes clock frequencies
• Future scaling is spatial
Atomic-Scale Physical Effects
• As our devices approach the atomic scale, we must deal with statistical effects governing the placement and behavior of individual atoms and electrons
Three Atomic-Scale "Problems"
• Defects: manufacturing imperfections
  • Occur before operation; persistent
  • Shorts, breaks, bad contacts
• Faults
  • Occur during operation; transient
  • A node's value flips: crosstalk, ionizing particles, bad timing, tunneling, thermal noise
• Operational/lifetime defects
  • Parts become bad during the operational lifetime: fatigue, electromigration, burnout, …
  • Or become slower: NBTI, hot-carrier effects