1 / 5

ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor

ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor. Architecture, Algorithms and Benchmark Results. Dairsie Latimer dairsie@clearspeed.com Simon McIntosh-Smith simon@clearspeed.com Ron Bell ron.bell@awe.co.uk Stephen Hudson stephen.hudson@awe.co.uk.

kin
Download Presentation

ClearSpeed’s CS301: The World’s First Commercially-Available Stream Processor

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. ClearSpeed’s CS301:The World’s First Commercially-Available Stream Processor Architecture, Algorithms and Benchmark Results Dairsie Latimer dairsie@clearspeed.com Simon McIntosh-Smith simon@clearspeed.com Ron Bell ron.bell@awe.co.uk Stephen Hudson stephen.hudson@awe.co.uk

  2. A Better Approach: ClearSpeed’s CS301 Processor • Multi-threaded Array Processing • Programmed in high-level languages • Hardware multi-threading • Enables simultaneous data streaming and computation for latency tolerance • Run-time extensible instruction set • Array of Processors Elements • PEs are VLIW cores • Flexible data parallel processing • Built-in PE fault tolerance, resiliency • High performance, low power • 10 GFLOPS/Watt • Multiple high bandwidth I/O channels

  3. CS301 Based Development Board • 50 GFLOPS peak @ 10W maximum • 200K FFTs/s (1K complex single precision IEEE754) • Up to 1GB DRAM for local processing • Single slot width full-size PCI card • In evaluation use since early 2004

  4. Chemistry codes: DLPOLY (Molecular Dynamics) • Owned by UK Daresbury Laboratory • Widely used within AWE, also academia & industry • 91% of CPU time in 5 small routines • One calls the other 4 to compute forces on all atoms • Forces called once per time step • Small amount of data returned by forces from CS to host • Calculation for each atom is independent ClearSpeed AWE Joint Investigation Matrix Multiply Benchmark (SGEMM) • CS301 single precision code started at ~20% efficiency • AWE/CS code restructuring gave 12 GFLOPS – 47% • Performance verified by AWE on CS301 hardware • “Avebury” significantly increases this performance

  5. Extremely Cool, Extremely Fast Intrigued? Visit our poster outside

More Related