
Stanford Streaming Supercomputer (SSS) Spring Quarter Wrapup Meeting

Stanford Streaming Supercomputer (SSS) Spring Quarter Wrapup Meeting. Bill Dally, Computer Systems Laboratory, Stanford University, June 11, 2002. Review – What is the SSS Project About?


Presentation Transcript


  1. Stanford Streaming Supercomputer (SSS) Spring Quarter Wrapup Meeting • Bill Dally, Computer Systems Laboratory, Stanford University • June 11, 2002

  2. Review – What is the SSS Project About? • Exploit streams to give 100x improvement in performance/cost for scientific applications vs. ‘cluster’ supercomputers • From 100 GFLOPS PCs to TFLOPS single-board computers to PFLOPS supercomputers • Use layered programming system to simplify development and tuning of applications • Application specific frameworks and libraries • Stream languages • Streaming virtual machine • Demonstrate feasibility of above in year 1 • Run real applications on simulated hardware • Identify bottlenecks • Build a prototype and demonstrate CITS applications in years 2-6

  3. Architecture of SSS

  4. A layered software system simplifies stream programming

  5. The big picture • VLSI technology enables us to put TeraOPS on a chip • Conventional general-purpose architecture cannot exploit this • The problem is bandwidth • Streams expose locality and concurrency • Perform operations in record (not operation as with vector) order • Enables compiler optimization at a larger scale than scalar processing • A stream architecture achieves high arithmetic intensity • Intensity = arithmetic rate/bandwidth • Bandwidth hierarchy, compound stream operations • A Streaming Supercomputer is feasible • 100 GFLOPS (64-b) on a chip, 1 TFLOPS single-board computer, PFLOPS systems
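The two claims on this slide, record-order execution and arithmetic intensity as arithmetic rate over bandwidth, can be sketched with a toy calculation. This is a minimal illustration, not the SSS design: the kernel `a*b + c`, the function names, and the memory-traffic model (every array access counts as one word moved, and operation order writes its intermediate array back to memory) are all assumptions made for the example.

```python
# Toy model only: kernel, names, and traffic accounting are assumptions,
# not taken from the SSS project.

def operation_order(a, b, c):
    """Vector-style order: each operation sweeps the whole array, so the
    intermediate array t makes a round trip through memory."""
    n = len(a)
    words_moved = 0
    t = [x * y for x, y in zip(a, b)]    # load a, load b, store t
    words_moved += 3 * n
    out = [u + z for u, z in zip(t, c)]  # load t, load c, store out
    words_moved += 3 * n
    return out, words_moved

def record_order(a, b, c):
    """Stream-style order: all operations for one record finish before the
    next record starts, so the intermediate stays local."""
    out = []
    words_moved = 0
    for x, y, z in zip(a, b, c):
        t = x * y            # intermediate never leaves the "register"
        out.append(t + z)
        words_moved += 4     # load x, y, z; store one result
    return out, words_moved

def intensity(flops, words_moved):
    """Arithmetic intensity: arithmetic operations per word of memory traffic."""
    return flops / words_moved

n = 1000
a = [float(i) for i in range(n)]
b = [2.0] * n
c = [1.0] * n

v_out, v_words = operation_order(a, b, c)
r_out, r_words = record_order(a, b, c)
flops = 2 * n  # one multiply and one add per record

print(intensity(flops, v_words), intensity(flops, r_words))
```

Both orders compute the same result with the same number of arithmetic operations; only the traffic differs (6 words per record versus 4 under this model), which is what raises the intensity. Fusing longer chains of operations per record widens the gap further.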

  6. Outreach • TST Meeting May 13-14 • SSS Project well received • Sierra visit April 30 • Outreach plans for summer • DOE Headquarters • Labs • Industrial partners • Intel, IBM, Sun, HP, Cray, Nvidia • Start visits this fall • Other application areas

  7. EE482C – Streaming Architecture • 11 Class Projects • Irregular Streams (2) – caches and SRF indexing • Aspect ratio • Compiling Brook • Streams on legacy architectures • Mapping to multiple nodes • Communication Scheduling • Stencils • Vectors • Cellular Automata • Viterbi Decoding

  8. Three Major Thrusts • Software • Brook language • Virtual machines • ‘Compilers’ to map Brook to VM to streaming and other hardware • OS/Run-time system • Hardware • Specification • Stream caching and support for multi-dim streams • ‘Aspect ratio’ thread vs data parallelism • Global mechanisms & Memory system • Simulation • SSS Simulator • Prototyping on Imagine • Applications • Fluids • StreamFLO • Model PDEs • Molecular dynamics • Microbenchmarks/Stress tests

  9. Software goals for SQ02 • Accomplishments • Metacompiler parses Brook • Multidimensional features in Brook • Apps coded in Brook • Central source repository • Mapping analyses enumerated and mapped interaction kernel of StreamMD • Overall • End-to-end (Brook -> SVM -> SSS) demonstration [all] • Put in place release process • Brook • Feature lock, all features needed for two apps [Ian] • Hints [Mattan] • Metacompilation • Compile Brook to SVM [Ben C.] • SVM • SVM specification, prototype C implementation, develop and run test suite [Francois] • Instrumented version of SVM [Francois] • Mapping • StreamMD running on Imagine [Mattan] • Enumerate known algorithms and research problems [Mattan] • Implement minimum mapping tool [Mattan]

  10. Software Goals for Summer 02 • Fill in at meeting • SQ Accomplishments • Brook to StreamC (manual to KernelC) runs on Imagine (unoptimized, subset) • Version 2 SVM Specification • Brook features a lot closer • Metacompilation of Brook to BRT and StreamC • Compilation document first draft • Summer Goals • Brook • Bug fixes and changes to facilitate compilation [Ian, Mattan] • SVM • Specification, Simulator, and run-time [Francois] • Compilation • Identify framework permitting analysis [Mattan] • Translate to SVM [Mattan] • Compile kernels [Jayanth] • See Mattan’s e-mail • Run-time • Scalar processor multi-node support [Mattan] • Memory management etc… • Issues • Critical path SVM implementation • Build long-term compiler framework • Leverage Imagine compilation techniques • Run-time system

  11. Hardware goals for SQ02 • Accomplishments • Completed strawman architecture • Initial bandwidth analysis of StreamMD • Architecture • Reconcile multidimensional language features with architecture [Tim] • Simulator • Define simulation results needed for October [all] • Single-node simulator [Ben S.] • Multi-node definition and simulator [Jung Ho] • Apps on Simulator • Map StreamFLO and StreamMD to SSS and analyze bandwidth [Mattan] • Point studies • Aspect ratio (TP vs ILP vs DP) [Ben], conditionals [Ujval], stream caching [Tim], global mechanisms [Mattan]…

  12. Hardware Goals for Summer 02 • SQ Accomplishments • Revised strawman • Ran key StreamMD kernels on Imagine • Cache study, indexable SRF studies • Summer Goals • Architecture specification • Fix bugs [All] • Support for multi-node scalar arch [Mattan] • Simulator • Modify Imagine simulator to match strawman [Jung Ho] • Application studies • Run StreamMD and StreamFLO on strawman simulator [Mattan] • Point studies • Conditional study, aspect ratio study • Issues • Coherency • Finalize cache/SRF architecture • Finalize remote ops • Support for reductions across nodes • Scalar architecture – multi-node

  13. Application goals for SQ02 • Accomplishments • FFT microbenchmark • Solvers • Incompressible fluid flow running (all 3 PDE types) [Eran] • Hints • Application hints into Brook [all] • Microbenchmarks • PCA [Ian, Anand]

  14. Application Goals for Summer 02 • SQ Accomplishments • 2 PDE types completed – smoke movie • Ungridded StreamMD • StreamFLO underway • Summer Goals • Finite element “miniapp” [Tim] • Investigate Sierra [Tim] • Sparse and Dense stress codes M*V [Tim] • Complete and run StreamFLO [Fatica, Ian] • Complete and run gridded StreamMD [Eric, Ian] • Run StreamFLO and gridded StreamMD on simulators and collect numbers [Mattan] • Issues • Follow up with Yates LLNL • Sweep3D – Sn Radiation Transport

  15. Summer SSS Meetings • We should meet at least every other week over the summer • Every other Tuesday at 11? • Schedule on web page soon
