1. Array-of-Structures (AoS) vs. Structure-of-Arrays (SoA) Steve Carroll
March 02, 2010
2. Outline of Talk What is Stream Processing?
What is the problem?
Why is Stream Processing relevant to HPC?
What are Array of Structures and Structure of Arrays?
Why use SoA?
What can be done?
What is being done?
3. Stream Processing Computing paradigm allowing use of multiple functional units
Performance increase in areas such as
4. Problem Definition Stream programming requires a new way of thinking about data
This way of thinking differs from the traditional one
Programmers must learn a new style
5. Example One use of stream processing is displaying a particle system
E.g. smoke, fire, water spray, dust, etc.
Traditional way of thinking is to use an array to hold the particles
Bad for performance
Better to use Structure-of-Arrays (SoA) format
6. Example (cont.) Array-of-Structures (AoS) Format
7. Example (cont.)
8. Example (cont.) Structure-of-Arrays (SoA) Format
9. Example (cont.)
10. Example (cont.) AoS
mParticleArray[i]->position.x
SoA
mParticleArray->x[i]
SoA requires different thinking than AoS
Why?
Stream programming encourages decoupling computation and memory access [4], [5]
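The two access patterns on this slide can be sketched in C++ as follows. This is a minimal illustration, not the code from [4]: the `Vec3`, `ParticleAoS`, and `ParticlesSoA` types, the velocity fields, and the `advance_aos` / `advance_soa` functions are hypothetical names introduced here, and particles are stored by value rather than through pointers as in `mParticleArray[i]->position.x`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 3-component vector; all names here are illustrative.
struct Vec3 { float x, y, z; };

// AoS: the fields of one particle sit next to each other in memory.
struct ParticleAoS {
    Vec3 position;
    Vec3 velocity;
};

// SoA: each field gets its own contiguous array.
struct ParticlesSoA {
    std::vector<float> x, y, z;     // positions
    std::vector<float> vx, vy, vz;  // velocities
    explicit ParticlesSoA(std::size_t n)
        : x(n), y(n), z(n), vx(n), vy(n), vz(n) {}
    std::size_t size() const { return x.size(); }
};

// AoS update: consecutive x values are sizeof(ParticleAoS) bytes apart,
// so every access strides over the fields that are not needed.
void advance_aos(std::vector<ParticleAoS>& p, float dt) {
    for (std::size_t i = 0; i < p.size(); ++i) {
        p[i].position.x += p[i].velocity.x * dt;
        p[i].position.y += p[i].velocity.y * dt;
        p[i].position.z += p[i].velocity.z * dt;
    }
}

// SoA update: each loop walks two unit-stride float arrays.
void advance_soa(ParticlesSoA& p, float dt) {
    const std::size_t n = p.size();
    // The arrays are public, so check that they still have equal length.
    assert(p.y.size() == n && p.z.size() == n);
    assert(p.vx.size() == n && p.vy.size() == n && p.vz.size() == n);
    for (std::size_t i = 0; i < n; ++i) p.x[i] += p.vx[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.y[i] += p.vy[i] * dt;
    for (std::size_t i = 0; i < n; ++i) p.z[i] += p.vz[i] * dt;
}
```

Both functions compute the same result. The difference is only in how memory is traversed: the SoA loops touch one field at a time, which is the kind of separation between memory access and computation that the slide refers to.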
11. Stream Processing and HPC “Can 2000 grad students complete a PhD’s worth of research in 1 day? Most likely not! One must design novel ways to utilize large scale resources in efficient ways.” [6]
3D graphics is a huge workload, consisting of
12. Stream Processing and HPC (cont.) Assigned paper
Ma, W.-C., Yang, C.-L. Using Intel streaming SIMD extensions for 3D geometry processing. In Advances in Multimedia Information Processing – PCM, 2002.
Using SIMD-FP alone achieves close to 3X speedup for graphics using Intel SSE
Arranging vertices in a layout favorable to SIMD raises the speedup to as much as 4X
13. Issues Organizing data in SIMD format has significant overhead
Conventional approach is AoS
Intel suggests the SoA approach
14. Issues (cont.)
15. Issues (cont.) AoS requires more instructions than SoA
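One way to see where the extra AoS instructions come from is to compute the squared length of four vertices at a time with SSE intrinsics. The sketch below is an assumption-laden illustration rather than the code measured in [3]: the padded `VertexAoS` struct and both function names are hypothetical, the vertex count is assumed to be a multiple of four, and the code only builds on x86 or x86-64.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics; x86/x86-64 only

// Hypothetical vertex padded to four floats so one SSE load covers it.
struct VertexAoS { float x, y, z, w; };

// SoA: every load already holds the same component of four vertices.
void length_sq_soa(const float* x, const float* y, const float* z,
                   float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        const __m128 vx = _mm_loadu_ps(x + i);
        const __m128 vy = _mm_loadu_ps(y + i);
        const __m128 vz = _mm_loadu_ps(z + i);
        const __m128 sum = _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy)),
            _mm_mul_ps(vz, vz));
        _mm_storeu_ps(out + i, sum);
    }
}

// AoS: each load holds x, y, z, w of ONE vertex, so the four registers
// must be transposed before the same arithmetic can run.
void length_sq_aos(const VertexAoS* v, float* out, std::size_t n) {
    assert(n % 4 == 0);
    for (std::size_t i = 0; i < n; i += 4) {
        __m128 r0 = _mm_loadu_ps(reinterpret_cast<const float*>(v + i + 0));
        __m128 r1 = _mm_loadu_ps(reinterpret_cast<const float*>(v + i + 1));
        __m128 r2 = _mm_loadu_ps(reinterpret_cast<const float*>(v + i + 2));
        __m128 r3 = _mm_loadu_ps(reinterpret_cast<const float*>(v + i + 3));
        // After the transpose: r0 = four x's, r1 = four y's, r2 = four z's.
        // The macro expands to extra shuffle/unpack instructions.
        _MM_TRANSPOSE4_PS(r0, r1, r2, r3);
        const __m128 sum = _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(r0, r0), _mm_mul_ps(r1, r1)),
            _mm_mul_ps(r2, r2));
        _mm_storeu_ps(out + i, sum);
    }
}
```

The arithmetic is identical in both versions. The AoS path adds one more load per group of four vertices plus the transpose, which is overhead the SoA layout avoids by construction.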
16. Current Work SoA results in better performance
17. Current Work (cont.) [2] maps stream programming code to general purpose CPUs
It is believed that thinking in terms of stream programming is itself a benefit
[9], [10] studied design issues
[11] evaluated the MMX technology
[12] studied performance of SIMD on 3D geometry
18. Current Work (cont.) Memory management is an important related issue
19. Current Work (cont.) Stream Processors in current use / research
Imagine (Stanford University, 1996)
SSE3 (Intel, 2004)
Cell Broadband Engine Architecture (STI, 2005)
Storm-1 (Stream Processors, Inc., 2007)
System S (IBM, 2007)
GPUs (ATI, NVIDIA, present)
Merrimac (Stanford University, present)
20. Paper Conclusion Traditional AoS data structures boost performance 2.7X to 3X [3]
SoA data structures boost performance 3.1X to 3.3X [3]
SoA with prefetching boosts performance 3.6X to 3.9X [3]
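A rough idea of what "SoA with prefetching" can look like in code is sketched below. It assumes that prefetching means software prefetch through the SSE `_mm_prefetch` intrinsic; the slides do not say how [3] implemented it. The function name `sum_length_sq` and the prefetch distance `kAhead` are hypothetical, and the distance is an untuned guess.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE intrinsics; x86/x86-64 only

// Sum of squared vertex lengths over SoA arrays, four vertices per step.
// n is assumed to be a multiple of 4.
float sum_length_sq(const float* x, const float* y, const float* z,
                    std::size_t n) {
    assert(n % 4 == 0);
    const std::size_t kAhead = 64;  // prefetch distance in elements (a guess)
    __m128 acc = _mm_setzero_ps();
    for (std::size_t i = 0; i < n; i += 4) {
        // Ask for data that will be needed a few iterations from now,
        // staying inside the arrays.
        if (i + kAhead < n) {
            _mm_prefetch(reinterpret_cast<const char*>(x + i + kAhead), _MM_HINT_T0);
            _mm_prefetch(reinterpret_cast<const char*>(y + i + kAhead), _MM_HINT_T0);
            _mm_prefetch(reinterpret_cast<const char*>(z + i + kAhead), _MM_HINT_T0);
        }
        const __m128 vx = _mm_loadu_ps(x + i);
        const __m128 vy = _mm_loadu_ps(y + i);
        const __m128 vz = _mm_loadu_ps(z + i);
        acc = _mm_add_ps(acc, _mm_add_ps(
            _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy)),
            _mm_mul_ps(vz, vz)));
    }
    // Horizontal sum of the four accumulator lanes.
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    return lanes[0] + lanes[1] + lanes[2] + lanes[3];
}
```

Because each SoA array is read with unit stride, the address needed a few iterations ahead is trivial to compute, which is part of why prefetching pairs naturally with this layout. Whether it helps on a given machine depends on the hardware prefetcher and on the chosen distance, so it should be measured rather than assumed.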
Layout of data in memory is important for performance
21. Conclusion & Comments Coding programs in a streaming style “can improve performance on today’s machines and smooth the way for significant performance improvements with the deployment of streaming architectures” [2]
Stream programming forces the programmer to think about memory accesses and compute operations separately
22. Stream processors can benefit numerous types of problems if data structures are kept in mind
23. References [1] Gordon, M. I. et al. A stream compiler for communication-exposed architectures. In Proceedings of the 10th International Conference on Architectural Support for Programming Languages and Operating Systems, October 05 - 09, 2002.
[2] Gummaraju, J. and Rosenblum, M. 2005. Stream Programming on General-Purpose Processors. In Proceedings of the 38th Annual IEEE/ACM International Symposium on Microarchitecture, November 12 - 16, 2005.
[3] Ma, W. and Yang, C.-L. 2002. Using Intel Streaming SIMD Extensions for 3D Geometry Processing. Advances in Multimedia Information Processing, PCM 2002.
[4] Creating a Particle System with Streaming SIMD Extensions
http://software.intel.com/en-us/articles/creating-a-particle-system-with-streaming-simd-extensions/
24. References [5] Thies, W., Karczmarek, M. and Amarasinghe, S. StreamIt: A language for streaming applications. In International Conference on Compiler Construction, April 2002.
[6] Houston, M., General Purpose Computation on Graphics Processors (GPGPU). Stanford University, Public Talks, 2007.
[7] Folding@Home Official Stats
http://fah-web.stanford.edu/cgi-bin/main.py?qtype=osstats
[8] I. Buck, “Brook Specification v0.2,” merrimac.stanford.edu/brook/brookspec-v0.2.pdf, October 2003.
[9] Tremblay, M. et al. VIS speeds new media processing. In IEEE Micro, 16(4):10-20, 1996.
[10] Raman, S.K., Pentkovski, V. and Keshava, J. Implementing Streaming SIMD Extensions on the Pentium III processor. In IEEE Micro, 20(4):47-57, 2000.
25. References [11] Bhargava, R. et al. Evaluating MMX technology using DSP and multimedia applications. In ACM/IEEE International Symposium on Microarchitecture, 1998.
[12] Yang, C.-L., Sano, B. and Lebeck, A.R. Exploiting instruction level parallelism in geometry processing for three dimensional graphics applications. In ACM/IEEE International Symposium on Microarchitecture, 1998.