
Parallel Processing with PlayStation3



Presentation Transcript


  1. Parallel Processing with PlayStation3 Lawrence Kalisz

  2. Cell Processor: History, Architecture • Parallel Programming: Install Linux, Examples • PS3 Cluster Applications: Examples Topics

  3. Created by Sony, Toshiba, and IBM (STI) • 400 Engineers • ½ Billion Dollars PS3 Cell Processor: History

  4. PS3 Cell Processor: Architecture

  5. PS3 Cell Processor: Architecture

  6. Power Processing Element (PPE) • Synergistic Processing Element (SPE) • Element Interconnect Bus (EIB) • Memory System • Network Card & Graphics Card PS3 Cell Processor: Architecture

  7. PPE handles operating system and control tasks • 64-bit Power Architecture with VMX • In-order, 2-way hardware simultaneous multi-threading (SMT) • 32 KB L1 cache (I & D) and 512 KB L2 Power Processor Element

  8. Specialized high-performance core • Three main components • SPU: Synergistic Processing Unit • LS: local store memory • MFC: Memory Flow Controller, manages data in and out of the SPE • The SPU can only access (load & store) data in the SPE local store • On the PS3, 7 of the 8 SPEs are enabled and 1 of those is reserved for the system, leaving 6 for applications Synergistic Processing Element

  9. SPU needs data • 1. SPU initiates MFC request for data • 2. MFC requests data from memory • 3. Data is copied to local store • 4. SPU can access data from local store • SPU operates on data then copies data from local store back to memory in a similar process SPE: Data IN and OUT Steps
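The data-in and data-out steps above can be mimicked in plain Python. This is only a simulation: it assumes a 256 KB local store, and the names `dma_get`, `dma_put`, and `process_in_chunks` are hypothetical stand-ins for the MFC transfers that a real SPU program would issue.

```python
LOCAL_STORE_SIZE = 256 * 1024  # assumed SPE local store capacity in bytes


def dma_get(main_memory, offset, size):
    """Steps 1-3: copy `size` bytes from main memory into a local-store buffer."""
    if size > LOCAL_STORE_SIZE:
        raise ValueError("transfer does not fit in the local store")
    return bytearray(main_memory[offset:offset + size])


def dma_put(main_memory, offset, local_buffer):
    """Copy the local-store buffer back to main memory."""
    main_memory[offset:offset + len(local_buffer)] = local_buffer


def process_in_chunks(main_memory, chunk_size, kernel):
    """Stream main memory through the local store one chunk at a time."""
    for offset in range(0, len(main_memory), chunk_size):
        size = min(chunk_size, len(main_memory) - offset)
        local = dma_get(main_memory, offset, size)
        for i in range(len(local)):  # step 4: compute touches local data only
            local[i] = kernel(local[i])
        dma_put(main_memory, offset, local)


memory = bytearray(range(10))
process_in_chunks(memory, chunk_size=4, kernel=lambda b: (b * 2) % 256)
print(list(memory))
```

The point of the sketch is that the kernel never reads `memory` directly; everything it touches has first been copied into the local buffer, which is why data has to be processed in chunks that fit the local store.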

  10. SPE: Data IN and OUT Steps

  11. Physically overlaps all processor elements • Central arbiter supports up to 3 concurrent transfers per ring • Two-stage, dual round-robin arbiter • Each port supports concurrent 16 B in and 16 B out data paths • Ring topology is transparent to the element data interface • Each EIB data port supports 25.6 GB/s each way Element Interconnect Bus
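The per-port figure on this slide can be checked with a line of arithmetic. The sketch below assumes the bus moves 16 bytes per bus cycle and runs at 1.6 GHz, half of a 3.2 GHz core clock; the clock value is an assumption that is not stated on the slide.

```python
PORT_WIDTH_BYTES = 16   # from the slide: 16 B in and 16 B out per port
EIB_CLOCK_MHZ = 1600    # assumption: half of a 3.2 GHz core clock


def port_bandwidth_gb_per_s(width_bytes, clock_mhz):
    """Peak one-way bandwidth of one port in GB/s (1 GB = 10**9 bytes)."""
    return width_bytes * clock_mhz / 1000


one_way = port_bandwidth_gb_per_s(PORT_WIDTH_BYTES, EIB_CLOCK_MHZ)
print(one_way)       # peak per direction
print(2 * one_way)   # in and out combined
```

These are peak numbers for a single port; sustained throughput depends on how transfers are arbitrated across the rings.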

  12. PS3 Cell: Parallel Programming

  13. Working Linux distros: • Fedora Core 5 • Yellow Dog Linux 5.0 • Gentoo PowerPC 64 edition • Debian • Supporting software: Open MPI (for use with a cluster) and IBM’s Cell SDK PS3 Cell: Parallel Programming

  14. Cell performance ~10x better than a general-purpose processor (GPP) for media and other applications that can take advantage of its SIMD capability • PPE performance is comparable to traditional GPP performance • SPE performance is mostly the same as, or better than, a GPP with SIMD • Performance scales with the number of SPEs PS3 Cell: Parallel Programming

  15. Programming becomes an exercise in partitioning, mapping (layout), routing (communication), and scheduling PS3 Cell: Parallel Programming
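A minimal sketch of those four concerns, using Python threads as stand-ins for SPEs. The worker count of 6, the sum-of-squares kernel, and the function names are all assumptions made for illustration; Python threads show the structure only and do not give the speedup real SPEs would.

```python
from concurrent.futures import ThreadPoolExecutor

NUM_WORKERS = 6  # assumption: six SPEs available to applications


def partition(items, parts):
    """Partitioning: split items into `parts` contiguous, near-equal chunks."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks


def worker(chunk):
    """Stand-in kernel: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def run(items):
    chunks = partition(items, NUM_WORKERS)
    # Mapping and scheduling: one chunk is handed to each worker.
    with ThreadPoolExecutor(max_workers=NUM_WORKERS) as pool:
        partials = list(pool.map(worker, chunks))
    # Routing: partial results travel back to the coordinator and are combined.
    return sum(partials)


print(run(list(range(100))))
```

Contiguous chunks are the simplest layout; a kernel with uneven cost per item might be better served by smaller chunks handed out dynamically.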

  16. AI Backgammon player PS3 Cell: Parallel Programming

  17. AI Backgammon player • 1M board evaluations in ~3 seconds (6 SPEs) • Data parallel implementation, linear speedup PS3 Cell: Parallel Programming
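The numbers on this slide imply a per-SPE evaluation rate. The sketch below works it out and, taking the slide's linear-speedup claim at face value, estimates runtimes for other SPE counts; those estimates are extrapolations, not measurements.

```python
EVALUATIONS = 1_000_000  # from the slide
ELAPSED_S = 3.0          # from the slide (approximate)
NUM_SPES = 6             # from the slide

total_rate = EVALUATIONS / ELAPSED_S   # evaluations per second, all SPEs
per_spe_rate = total_rate / NUM_SPES   # evaluations per second, one SPE


def estimated_time(num_spes):
    """Estimated runtime in seconds, assuming perfectly linear speedup."""
    return EVALUATIONS / (per_spe_rate * num_spes)


for n in (1, 2, 3, 6):
    print(n, round(estimated_time(n), 1))
```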

  18. SPU programs are designed and written to work together but are compiled independently • Separate compiler and toolchain (ppu-gcc and spu-gcc) • Produces small ELF image for each program that can be embedded in PPU program PS3 Cell: Parallel Programming

  19. BLUE-STEEL PS3 Cell: Parallel Programming

  20. BLUE-STEEL • Full ray tracer running on each SPE • Data parallel implementation • ://www.youtube.com/watch?v=C3ARXUSKXAM&feature=player_detailpage PS3 Cell: Parallel Programming

  21. BLUE-STEEL • A Solution to the rendering equation • Triangle Rasterization – Fast – possible in real time on a single core – Inaccurate or tedious for global effects such as shadows, reflection, refraction, or global illumination • Ray Tracing – Slow – unless done on multiple cores – Accurate and natural shadows, reflection, and refraction PS3 Cell: Parallel Programming

  22. BLUE-STEEL • Build a fast ray tracer from the ground up to take advantage of multiple cores. – 6 accessible cores for rendering PS3 Cell: Parallel Programming

  23. Ray Tracing • Shoot a ray through each pixel on the screen • Check for intersections with each object in the scene • Keep the closest intersection PS3 Cell: Parallel Programming
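The intersection step can be sketched for the simplest primitive, a sphere. This is an illustrative Python version, not BLUE-STEEL's code; the function names and the two-sphere scene are made up for the example, and ray directions are assumed to be normalized.

```python
import math


def intersect_sphere(origin, direction, center, radius):
    """Return the nearest positive hit distance along the ray, or None.

    `direction` is assumed to be a unit vector.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = 2.0 * sum(d * v for d, v in zip(direction, oc))
    c = sum(v * v for v in oc) - radius * radius
    disc = b * b - 4.0 * c
    if disc < 0.0:
        return None  # the ray misses the sphere
    root = math.sqrt(disc)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if t > 1e-9:
            return t
    return None


def closest_hit(origin, direction, spheres):
    """Test every sphere and keep the closest intersection as (t, index)."""
    best = None
    for index, (center, radius) in enumerate(spheres):
        t = intersect_sphere(origin, direction, center, radius)
        if t is not None and (best is None or t < best[0]):
            best = (t, index)
    return best


spheres = [((0.0, 0.0, 10.0), 1.0), ((0.0, 0.0, 5.0), 1.0)]
print(closest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), spheres))
```

A full renderer would call `closest_hit` once per pixel, and because each pixel is independent, rows or tiles of the image can be handed to different cores.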

  24. Ray Tracing • Shade each point according to the material of the object, as well as the lights in the scene • Cast rays for shadows, reflection, and refraction PS3 Cell: Parallel Programming
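A minimal sketch of the shading step, assuming a Lambertian (diffuse) material, a single point light, and sphere-only geometry. It casts a shadow ray toward the light and leaves out reflection and refraction rays. All names here are hypothetical and not taken from BLUE-STEEL.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def scale(a, s):
    return tuple(x * s for x in a)


def hit_sphere(origin, direction, center, radius):
    """Nearest positive hit distance for a unit-length direction, or None."""
    oc = sub(origin, center)
    b = 2.0 * dot(direction, oc)
    disc = b * b - 4.0 * (dot(oc, oc) - radius * radius)
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in ((-b - root) / 2.0, (-b + root) / 2.0):
        if t > 1e-6:
            return t
    return None


def shade(point, normal, albedo, light_pos, spheres):
    """Diffuse brightness at `point`, or 0.0 if the light is blocked."""
    to_light = sub(light_pos, point)
    distance = math.sqrt(dot(to_light, to_light))
    light_dir = scale(to_light, 1.0 / distance)
    # Shadow ray: nudge the origin off the surface to avoid self-intersection.
    start = tuple(p + 1e-4 * n for p, n in zip(point, normal))
    for center, radius in spheres:
        t = hit_sphere(start, light_dir, center, radius)
        if t is not None and t < distance:
            return 0.0  # an object sits between the point and the light
    return albedo * max(0.0, dot(normal, light_dir))


top = (0.0, 1.0, 5.0)     # top of a unit sphere centered at (0, 0, 5)
up = (0.0, 1.0, 0.0)      # surface normal at that point
light = (0.0, 10.0, 5.0)
print(shade(top, up, 0.8, light, [((0.0, 0.0, 5.0), 1.0)]))
print(shade(top, up, 0.8, light, [((0.0, 0.0, 5.0), 1.0), ((0.0, 5.0, 5.0), 1.0)]))
```

The first call lights the point directly from above; the second adds a sphere between the point and the light, so the shadow ray is blocked and the result is 0.0.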

  25. BLUE-STEEL PS3 Cell: Parallel Programming

  26. Air Force • Folding@home • PS3 Gravity Grid • LACAL Student Cluster PS3: Cluster Applications

  27. http://groups.csail.mit.edu/cag/ps3/ • http://impact.asu.edu/cse520fa07/lec19-PS3-cell-tutorial.pdf • http://www.youtube.com/watch?v=VxaLmS7XPiI • http://en.wikipedia.org/wiki/PlayStation_3_cluster • http://www.netlib.org/utk/people/JackDongarra/PAPERS/scop3.pdf References

  28. Any Questions?
