Stanford Streaming Supercomputer (SSS) Spring Quarter Wrapup Meeting. Bill Dally, Computer Systems Laboratory Stanford University June 11, 2002. Review – What is the SSS Project About?.
Stanford Streaming Supercomputer (SSS) Spring Quarter Wrapup Meeting Bill Dally, Computer Systems Laboratory, Stanford University June 11, 2002
Review – What is the SSS Project About? • Exploit streams to give 100x improvement in performance/cost for scientific applications vs. ‘cluster’ supercomputers • From 100 GFLOPS PCs to TFLOPS single-board computers to PFLOPS supercomputers • Use layered programming system to simplify development and tuning of applications • Application-specific frameworks and libraries • Stream languages • Streaming virtual machine • Demonstrate feasibility of above in year 1 • Run real applications on simulated hardware • Identify bottlenecks • Build a prototype and demonstrate CITS applications in years 2-6
The big picture • VLSI technology enables us to put TeraOPS on a chip • Conventional general-purpose architecture cannot exploit this • The problem is bandwidth • Streams expose locality and concurrency • Perform operations in record order (not operation order, as with vectors) • Enables compiler optimization at a larger scale than scalar processing • A stream architecture achieves high arithmetic intensity • Intensity = arithmetic rate/bandwidth • Bandwidth hierarchy, compound stream operations • A Streaming Supercomputer is feasible • 100 GFLOPS (64-b) on a chip, 1 TFLOPS single-board computer, PFLOPS systems
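The difference between record order and operation order, and its effect on arithmetic intensity, can be illustrated with a toy cost model. This is a minimal sketch under stated assumptions, not the SSS architecture or its simulator: the function names, the four example operations, and the rule that each memory access costs one word are all hypothetical choices made for illustration.

```python
# Toy cost model (an assumption for illustration, not the SSS design):
# every load or store of a stream element moves one word to or from memory,
# and every applied operation counts as one arithmetic op.

def operation_order(xs, ops):
    """Vector style: apply each operation to the whole stream in turn."""
    arith = 0
    mem_words = 0
    data = list(xs)
    for op in ops:
        out = []
        for v in data:
            out.append(op(v))
            arith += 1
            mem_words += 2  # intermediate stream is loaded and stored each pass
        data = out
    return data, arith, mem_words


def record_order(xs, ops):
    """Stream style: apply all operations to one record before the next."""
    arith = 0
    mem_words = 0
    out = []
    for v in xs:
        mem_words += 1      # load the record once
        for op in ops:
            v = op(v)       # intermediates stay local, no memory traffic
            arith += 1
        out.append(v)
        mem_words += 1      # store the final result once
    return out, arith, mem_words


def intensity(arith, mem_words):
    """Arithmetic intensity: arithmetic ops per word of memory bandwidth."""
    return arith / mem_words


# Hypothetical compound kernel made of four simple operations.
ops = [lambda v: v * 2.0, lambda v: v + 1.0, lambda v: v * v, lambda v: v - 3.0]
xs = [float(i) for i in range(8)]

vec_out, vec_arith, vec_mem = operation_order(xs, ops)
rec_out, rec_arith, rec_mem = record_order(xs, ops)

print("operation order intensity:", intensity(vec_arith, vec_mem))
print("record order intensity:   ", intensity(rec_arith, rec_mem))
```

Both orderings perform the same arithmetic and produce the same results; under this model only the memory traffic differs, which is why fusing operations into a compound stream operation raises intensity. Real vector machines can reduce the gap with techniques such as chaining, which this sketch deliberately ignores.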
Outreach • TST Meeting May 13-14 • SSS Project well received • Sierra visit April 30 • Outreach plans for summer • DOE Headquarters • Labs • Industrial partners • Intel, IBM, Sun, HP, Cray, Nvidia • Start visits this fall • Other application areas
EE482C – Streaming Architecture • 11 Class Projects • Irregular Streams (2) – caches and SRF indexing • Aspect ratio • Compiling Brook • Streams on legacy architectures • Mapping to multiple nodes • Communication Scheduling • Stencils • Vectors • Cellular Automata • Viterbi Decoding
Three Major Thrusts • Software • Brook language • Virtual machines • ‘Compilers’ to map Brook to VM to streaming and other hardware • OS/Run-time system • Hardware • Specification • Stream caching and support for multi-dim streams • ‘Aspect ratio’ thread vs data parallelism • Global mechanisms & Memory system • Simulation • SSS Simulator • Prototyping on Imagine • Applications • Fluids • StreamFLO • Model PDEs • Molecular dynamics • Microbenchmarks/Stress tests
Software goals for SQ02 • Accomplishments • Metacompiler parses Brook • Multidimensional features in Brook • Apps coded in Brook • Central source repository • Mapping analyses enumerated and mapped interaction kernel of StreamMD • Overall • End-to-end (Brook -> SVM -> SSS) demonstration [all] • Put in place release process • Brook • Feature lock, all features needed for two apps [Ian] • Hints [Mattan] • Metacompilation • Compile Brook to SVM [Ben C.] • SVM • SVM specification, prototype C implementation, develop and run test suite [Francois] • Instrumented version of SVM [Francois] • Mapping • StreamMD running on Imagine [Mattan] • Enumerate known algorithms and research problems [Mattan] • Implement minimum mapping tool [Mattan]
Software Goals for Summer 02 • Fill in at meeting • SQ Accomplishments • Brook to StreamC (manual to KernelC) runs on Imagine (unoptimized, subset) • Version 2 SVM Specification • Brook features a lot closer • Metacompilation of Brook to BRT and StreamC • Compilation document first draft • Summer Goals • Brook • Bug fixes and changes to facilitate compilation [Ian, Mattan] • SVM • Specification, Simulator, and run-time [Francois] • Compilation • Identify framework permitting analysis [Mattan] • Translate to SVM [Mattan] • Compile kernels [Jayanth] • See Mattan’s e-mail • Run-time • Scalar processor multi-node support [Mattan] • Memory management etc… • Issues • Critical path SVM implementation • Build long-term compiler framework • Leverage Imagine compilation techniques • Run-time system
Hardware goals for SQ02 • Accomplishments • Completed strawman architecture • Initial bandwidth analysis of StreamMD • Architecture • Reconcile multidimensional language features with architecture [Tim] • Simulator • Define simulation results needed for October [all] • Single-node simulator [Ben S.] • Multi-node definition and simulator [Jung Ho] • Apps on Simulator • Map StreamFLO and StreamMD to SSS and analyze bandwidth [Mattan] • Point studies • Aspect ratio (TP vs ILP vs DP) [Ben], conditionals [Ujval], stream caching [Tim], global mechanisms [Mattan]…
Hardware Goals for Summer 02 • SQ Accomplishments • Revised strawman • Ran key StreamMD kernels on Imagine • Cache study, indexable SRF studies • Summer Goals • Architecture specification • Fix bugs [All] • Support for multi-node scalar arch [Mattan] • Simulator • Modify Imagine simulator to match strawman [Jung Ho] • Application studies • Run StreamMD and StreamFLO on strawman simulator [Mattan] • Point studies • Conditional study, aspect ratio study • Issues • Coherency • Finalize cache/SRF architecture • Finalize remote ops • Support for reductions across nodes • Scalar architecture – multi-node
Application goals for SQ02 • Accomplishments • FFT microbenchmark • Solvers • Incompressible fluid flow running (all 3 PDE types) [Eran] • Hints • Application hints into Brook [all] • Microbenchmarks • PCA [Ian, Anand]
Application Goals for Summer 02 • SQ Accomplishments • 2 PDE types completed – smoke movie • Ungridded StreamMD • StreamFLO underway • Summer Goals • Finite element “miniapp” [Tim] • Investigate Sierra [Tim] • Sparse and Dense stress codes M*V [Tim] • Complete and run StreamFLO [Fatica, Ian] • Complete and run gridded StreamMD [Eric, Ian] • Run StreamFLO and gridded StreamMD on simulators and collect numbers [Mattan] • Issues • Follow up with Yates LLNL • Sweep3D – Sn Radiation Transport
Summer SSS Meetings • We should meet at least every other week over the summer • Every other Tuesday at 11? • Schedule on web page soon