Future of parallel computing: issues and directions. Laxmikant Kale, CS433, Spring 2000
At the crossroads • Sequential computers are much faster than they used to be • Question: is that adequate for most applications? • Bimodal distribution of applications • Are there new applications needing more power? • Are they economically important? • Likely answer: we need parallel machines, but... • Software crisis?
Hardware advances • CMOS: Moore’s law continues? • Power dissipation? • Current processors are hotter than a hot plate • Extrapolation: nuclear reactor! • Something has to give… • Device physics limitations • But new technologies are “around the corner” • Nanotubes • Biological computing (using organic biomolecules) • Quantum computing • ??
Performance challenges • Latency tolerance • Scalability: • Can we utilize 10,000 processors on one problem? • Machines with 10,000 processing elements (PEs) are (almost) here already (about 9,000) • 100,000? A million? • Increasing gap between processor and memory performance • Probably the most challenging issue today
Processors in Memory • PIM, also known as intelligent RAM (IRAM) • Idea: embed processors in memory • SRAM? DRAM? • Read the IRAM paper by Patterson et al. • IEEE Micro, April 1997. Will be on the course web page
Explicit management of caches • Explicit management: • Let the user program decide what stays in cache and when • Instructions for data movement • Problem: limited buffer space • Solution: handle overflow in software? • Let compilers be smart enough to use this control • Notice: data-driven objects have an advantage here as well: • schedulers can prefetch the relevant data into cache
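The ideas on this slide can be made concrete with a toy model. The sketch below is not from the lecture: `ManagedBuffer`, `run_data_driven`, the oldest-first eviction policy, and the example tasks are all hypothetical names and choices made for illustration. It assumes a small fast buffer that the program fills with explicit `load` calls, that spills to main memory in software when it overflows, and a scheduler for data-driven objects that prefetches the data of the next queued task while the current one runs.

```python
from collections import deque


class ManagedBuffer:
    """Toy model of a small, explicitly managed fast memory (hypothetical)."""

    def __init__(self, capacity, memory):
        self.capacity = capacity   # number of items the buffer can hold
        self.memory = memory       # dict standing in for slow main memory
        self.resident = {}         # items currently held in the buffer
        self.spills = 0            # overflows that were handled in software

    def load(self, key):
        """Explicit data-movement instruction: bring `key` into the buffer."""
        if key in self.resident:
            return
        if len(self.resident) >= self.capacity:
            # Overflow handled in software: write back the oldest resident item.
            victim = next(iter(self.resident))
            self.memory[victim] = self.resident.pop(victim)
            self.spills += 1
        self.resident[key] = self.memory[key]

    def release(self, key):
        """The program declares `key` no longer needed; write it back."""
        if key in self.resident:
            self.memory[key] = self.resident.pop(key)


def run_data_driven(tasks, buffer):
    """Run (key, work) tasks in order, prefetching the next task's data.

    Returns the results and the number of demand misses, i.e. tasks whose
    data was not already in the buffer when they were scheduled.
    """
    queue = deque(tasks)
    results = []
    demand_misses = 0
    while queue:
        key, work = queue.popleft()
        if key not in buffer.resident:
            demand_misses += 1
            buffer.load(key)
        if queue:
            # The scheduler knows what runs next, so it can prefetch its data.
            buffer.load(queue[0][0])
        results.append(work(buffer.resident[key]))
        buffer.release(key)
    return results, demand_misses


memory = {"a": 1, "b": 2, "c": 3}
tasks = [(key, lambda value: value * 10) for key in ("a", "b", "c")]
results, demand_misses = run_data_driven(tasks, ManagedBuffer(2, memory))
print(results, demand_misses)
```

In this model only the first task takes a demand miss; every later task finds its data already resident because the scheduler loaded it ahead of time. A real machine would overlap the prefetch with computation, which this sequential sketch does not attempt to show.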
New machines on the horizon • What novel architectures are expected in the near future? • IBM’s Blue Gene • Japan’s Earth Simulator
Blue Gene • Deep Blue: • chess-playing machine • special-purpose hardware • Blue Gene: • a prototype may exist by 2002-03 • petaflop machine • 32 processors on a chip, sharing only 16 MB • 64 chips on a board • boards in a cabinet, multiple cabinets: • 1 million processors, each running at one gigaflop • Multithreaded (8-way) • Cut-through routing
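The Blue Gene figures on this slide can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and reading "8-way" as eight hardware threads per processor is an assumption, since the slide does not say what the multithreading applies to.

```python
# Figures quoted on the slide.
PROCESSORS_PER_CHIP = 32
CHIPS_PER_BOARD = 64
SHARED_MEMORY_MB_PER_CHIP = 16
TARGET_PROCESSORS = 1_000_000
FLOPS_PER_PROCESSOR = 10**9      # one gigaflop each
THREADS_PER_PROCESSOR = 8        # assumption: "8-way" means per processor

processors_per_board = PROCESSORS_PER_CHIP * CHIPS_PER_BOARD

# Ceiling division: boards needed to reach at least one million processors.
boards_needed = -(-TARGET_PROCESSORS // processors_per_board)

# Aggregate peak rate; 10**15 flop/s is one petaflop.
peak_flops = TARGET_PROCESSORS * FLOPS_PER_PROCESSOR

# On-chip memory available to each processor if shared evenly.
memory_kb_per_processor = SHARED_MEMORY_MB_PER_CHIP * 1024 // PROCESSORS_PER_CHIP

hardware_threads = TARGET_PROCESSORS * THREADS_PER_PROCESSOR

print(processors_per_board)      # 2048
print(boards_needed)             # 489
print(peak_flops == 10**15)      # True
print(memory_kb_per_processor)   # 512
print(hardware_threads)          # 8000000
```

Two things stand out: each processor gets only 512 KB of the shared on-chip memory, and under the per-processor threading assumption the machine would have to keep about eight million threads busy at once.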
Slightly older machine: Tera • Basic motivation: • Shared memory programming without worrying about remote memory access latency • Hardware support for threads • A context switch on each instruction is possible • Large number of threads • Needs strong compiler support • Aside: • Petaflops workshop, PITAC report... • Need 10-million-way parallelism to attain a petaflop
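Why a large number of threads hides memory latency can be seen with a simple textbook-style model, sketched below. It is not taken from the lecture or from Tera documentation: the function names and the cycle counts are assumptions chosen for illustration. The model assumes every thread alternates between `run_cycles` of computation and a memory access that takes `latency_cycles`, and that switching between threads costs nothing, as hardware thread support is meant to allow.

```python
def utilization(threads, run_cycles, latency_cycles):
    """Fraction of cycles the processor does useful work (idealized model)."""
    busy = threads * run_cycles
    period = run_cycles + latency_cycles
    return min(1.0, busy / period)


def threads_to_hide_latency(run_cycles, latency_cycles):
    """Smallest thread count that keeps the processor fully busy."""
    period = run_cycles + latency_cycles
    return -(-period // run_cycles)   # ceiling division


# Assumed numbers: one instruction of work, then a 99-cycle memory access.
print(utilization(1, 1, 99))            # a single thread is idle almost always
print(threads_to_hide_latency(1, 99))   # threads needed for full utilization
print(utilization(100, 1, 99))          # with that many threads
```

Under these assumed numbers a single thread keeps the processor busy 1% of the time, while 100 ready threads keep it fully busy. That is the trade the slide describes: the programmer stops worrying about latency, but the compiler and the application must expose enough threads.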
Earth Simulator • Vector processor as the basic unit • Very high-performance processors, for limited function • Huge building to house the machine
HTMT • Hybrid Technology MultiThreaded machine • Very fast processors: • Running at 4 kelvin! • 256 Gflops per processor? • Thousands of processors (to get to petaflops) • Multithreading and explicit control of memory hierarchies • New memory hierarchies • Remained in the design stage
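The "thousands of processors" estimate follows directly from the quoted per-processor rate. A minimal check, taking the tentative 256 Gflops figure from the slide at face value (the variable names are illustrative):

```python
PETAFLOP = 10**15                       # target: 10**15 flop/s
FLOPS_PER_PROCESSOR = 256 * 10**9       # 256 Gflops, as quoted on the slide

# Ceiling division: processors needed to reach at least one petaflop.
processors_needed = -(-PETAFLOP // FLOPS_PER_PROCESSOR)

print(processors_needed)   # 3907
```

So roughly four thousand such processors would reach a petaflop, compared with the one million gigaflop processors planned for Blue Gene.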
Exciting times ahead! • Bigger and faster machines radically impact: • Science and Engineering • Operations Research • Artificial Intelligence! • And of course: Better Games