Parallel Computing in the Mainstream Sanjeev Kumar Intel Corporation PPoPP’08 Panel: Where Will All the Threads Come From?
Our Work
1. RMS Applications
2. Architectural Enhancements
3. Language & Compiler Improvements
• Can we use chips with a large number of cores?
• Are there applications with large computational requirements?
• Do these applications scale? What does it take?
• How can the parallelization process be simplified?
Enabling Parallel Computing in the Mainstream
RMS (Recognition, Mining, and Synthesis) Applications
A taxonomy of a large class of emerging applications
Game Physics | Movie Physics | Ray Tracing Based Rendering | Financial Analytics
Direct Illumination only → Adding Indirect Illumination
Image Courtesy: Siggraph 2004 paper by Dreamworks
Image Processing | Virtual Worlds (Second Life) | Vision | Video/Data Mining
Public Benchmarks
• PARSEC Benchmark Suite
  – Joint effort between Princeton and Intel
  – Available at http://parsec.cs.princeton.edu/
  – Includes some RMS applications
• Tutorial on PARSEC at ISCA’08
What have we learned?
• Are there applications with large computational requirements?
• Do these applications scale?
• What does it take to scale these applications?
• How can the parallelization process be simplified?
Large Computational Requirements?
[Chart: GFLOPS required by Foreground Estimation, Ray Tracing, Fluid Dynamics, and Asset Liability Management]
Yes
Do these applications scale?
Yes
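The slide answers with a bare "Yes"; a standard way to reason about the ceiling on any such claim (not from the talk itself) is Amdahl's law, which bounds the speedup a many-core chip can deliver once any serial fraction remains:

```python
def amdahl_speedup(serial_fraction, cores):
    """Upper bound on speedup when `serial_fraction` of the work
    cannot be parallelized (Amdahl's law)."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)

# Even 5% serial work caps a 64-core chip well below 64x:
print(round(amdahl_speedup(0.05, 64), 1))  # → 15.4
```

This is why the "yes" on this slide is non-trivial: the applications must keep their serial fractions tiny to scale to large core counts.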
What does it take to scale applications?
• Lots of programming effort
  – In a large number of cases, this suffices
• Plus algorithmic changes
  – Usually involves a one-time performance hit
• Plus architectural enhancements
  – Support for fine-grained concurrency [ ISCA’07 ]
  – Support for atomic vector operations [ ISCA’08 ]
  – Increased bandwidth, bigger caches (stacked DRAM)
  – Support for efficient data movement
  – …
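To illustrate why hardware support for fine-grained concurrency matters, here is a hedged sketch (my own toy example, not code from the talk) of a histogram kernel: many threads increment shared bins, and every one of those tiny updates must be made atomic in software, which is exactly the per-update overhead the cited architectural proposals aim to remove:

```python
import threading

def histogram(data, bins, workers=4):
    # Shared bins updated concurrently by several threads.
    counts = [0] * bins
    lock = threading.Lock()

    def work(chunk):
        for v in chunk:
            # Each fine-grained update needs a software lock (or an
            # atomic); this synchronization cost is what hardware
            # support for fine-grained concurrency targets.
            with lock:
                counts[v % bins] += 1

    step = (len(data) + workers - 1) // workers
    threads = [threading.Thread(target=work, args=(data[i * step:(i + 1) * step],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counts

print(histogram(list(range(8)), 4))  # → [2, 2, 2, 2]
```

With one coarse lock the kernel serializes; per-bin locks or hardware atomics trade that bottleneck for per-update overhead, which is the tension the slide's bullet points to.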
How can parallelization be simplified?
• No silver bullet in sight
  – The problem is harder than in scientific computing
  – Every approach has its limitations
• Better languages, maybe even domain-specific ones
• Parallelized & optimized libraries
• Use compilers to automate optimizations
• Big performance gains still require “close to the metal” programming
• Tools for correctness checking
• Tools for understanding performance
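The "parallelized libraries" bullet can be sketched with Python's standard `concurrent.futures` (my choice of illustration, not the talk's): the library owns thread creation, scheduling, and joining, so application code stays a plain map over independent items. Note that in CPython, threads only give real CPU parallelism when the per-element kernel releases the GIL; the sketch shows the programming model, not raw performance:

```python
from concurrent.futures import ThreadPoolExecutor

def transform(pixel):
    # Stand-in for a per-element kernel, e.g. a filter stage in an
    # image-processing pipeline like those on the earlier slides.
    return min(255, pixel * 2)

def parallel_map(data, workers=4):
    # The library hides all thread management; the application sees
    # only a map over independent elements.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, data))

print(parallel_map([10, 100, 200]))  # → [20, 200, 255]
```

This is the simplification the slide argues for: the parallelism lives in a well-tested library rather than in every application.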
RMS Application Taxonomy
• Computer Vision: Face Detection, Body Tracking
• Physical Simulation: CFD; Face, Cloth; Rigid Body
• (Financial) Analytics: Option Pricing, Portfolio Mgmt
• Rendering: Global Illumination
• Data Mining: Text Index, Cluster/Classify, Machine Learning, Media Synth
Underlying kernels:
• PDE, Collision Detection, LCP, FIMI, NLP
• SVM Classification, SVM Training, IPM (LP, QP), K-Means
• Level Set, Filter/Transform, Particle Filtering, Text Indexer, Fast Marching Method, Monte Carlo
Solvers and primitives:
• Direct Solver (Cholesky), Krylov Iterative Solvers (PCG), Basic Iterative Solvers (Jacobi, GS, SOR), Non-Convex Method
• Basic geometry primitives (partitioning structures, primitive tests)
• Basic matrix primitives (dense/sparse, structured/unstructured)
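One of the basic iterative solvers named in this taxonomy, Jacobi, is small enough to sketch (a minimal illustration, not the talk's implementation). Its relevance here is that each component update reads only the previous iterate, so all n updates per sweep are independent — the source of the kernel's parallelism:

```python
def jacobi(A, b, iterations=50):
    """Basic Jacobi iteration for A x = b. Every x[i] update in a
    sweep depends only on the previous iterate, so the n updates
    can all run in parallel."""
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        x = [(b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
             for i in range(n)]
    return x

# Diagonally dominant system with exact solution x = [1, 2]:
A = [[4.0, 1.0], [1.0, 3.0]]
b = [6.0, 7.0]
print([round(v, 3) for v in jacobi(A, b)])  # → [1.0, 2.0]
```

Gauss-Seidel and SOR (also listed) converge faster but reintroduce dependences between updates within a sweep, a classic parallelism-vs-convergence trade-off.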
Conclusion
• A large number of applications have
  – High computational requirements
  – Lots of parallelism
• Challenge: Simplify parallelization