A High-Performance Scalable Graphics Architecture
Daniel R. McLachlan, Director, Advanced Graphics Engineering, SGI
Growth in Model Sizes
[Chart: Worldwide Production of Information, in exabytes (0 to 200), 1997 to 2006. Source: Gartner]
Images courtesy of Parametric Technology Corporation, Photodisc, and Magic Earth, LLC
Problems Are Getting Increasingly Complex Over Time
• Vehicle models: bumper; bumper, hood, engine, wheels; entire car
• Crash simulation: crash dummy; e-crash dummy; organ damage
Images courtesy of EAI; SCI Institute; NLM; Theoretical Biophysics Group of the Beckman Institute at UIUC; Livermore Software Technology Corporation
The Complexity of the Simple
• Potato chips
• Diapers
Images courtesy of Procter & Gamble
Performance Gap
Graphics Cards Are Outpacing PC Architecture and Bandwidth
[Graph based on relative scale]
Addressing Real Needs
Visualization
• Extreme resolution
• Absolute visual quality
• VAN
• Solving complex problems
• Dense data sets
Performance Graphics Clusters
• Low cost
• Fast simple polygons
• Single screen image quality
[Timeline: 1992 to 2003]
Visualization Breaks the Cognitive Barrier for Better Decisions
Images courtesy of Advantage CFD; SCI Institute; NLM; Theoretical Biophysics Group of the Beckman Institute at UIUC; Laboratory for Atmospheres, NASA Goddard Space Flight Center; Donghoon Shin, Art Center College of Design; Nvidia Corporation; ATI Technologies, Inc.; and Nintendo Co., Ltd.
Cluster Comparison
Pros
• Cheap
• Industry standard
• High display list performance
• Good for "embarrassingly parallel" problems
• Can potentially scale to 1000s of processors
Cons
• Cumbersome to program
• High administration costs
• Few applications for visualization
• Difficult to scale for large problems
• Difficult to dynamically load balance
• Lack of software productivity tools
• Often requires data replication
• Reliability
• Limited to 2GB memory space
The Benefits of Shared Memory
Traditional Clusters
• Commodity interconnect
• Each node has its own memory and its own OS
• 1-2 CPUs per node
SGI® NUMAflex™
• Fast NUMAflex™ interconnect
• Global shared memory
• < 64 CPUs per node
What is shared memory?
• All nodes operate on one large shared memory space, instead of each node having its own small memory space
Shared memory is high-performance
• All nodes can access one large memory space efficiently, so complex communication and data passing between nodes aren't needed
• Big data sets fit entirely in memory; less disk I/O is needed
Shared memory is cost-effective and easy to deploy
• It requires less memory per node, because large problems can be solved in big shared memory
• Simpler programming means lower tuning and maintenance costs
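The shared-memory idea above can be sketched in a few lines of Python. This is an illustration only, not SGI code: threads stand in for nodes, a single `array` stands in for the global shared memory, and the names `shared_sum` and `num_workers` are hypothetical.

```python
import threading
from array import array

def shared_sum(data, num_workers=4):
    """Sum `data` with several workers that all read the same buffer in place."""
    partial = [0.0] * num_workers                    # one result slot per worker
    chunk = (len(data) + num_workers - 1) // num_workers

    def worker(i):
        start = i * chunk
        end = min(start + chunk, len(data))
        total = 0.0
        for k in range(start, end):                  # index the shared buffer; no copy is made
            total += data[k]
        partial[i] = total

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

data = array("d", range(1000))                       # stands in for a large in-memory data set
print(shared_sum(data))
```

Each worker receives only an index range, never a copy of the data. On a cluster with separate memory per node, the same job would first have to send each slice to its node over the interconnect.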
How SGI® Onyx® Enables the Role
System at a Glance
[Diagram labels]
• Scalable Interaction
• Scalable Graphics I/O
• Appropriate Delivery
• Scalable Data
• SGI Onyx
• Compositor
• Network
• Scalable Compute and Large Memory
• Large Data Sets
• Scalable Graphics
• Scalable Disk I/O
• Scalable Rendering
• Scalable Resolution
Silicon Graphics® Onyx4™ UltimateVision™
Changing the Application Paradigm
• Moving from a fixed rendering path…
• …to a scalable and programmable rendering path.
[Diagram labels: Geometry; Application accelerators]
Images courtesy of Pratt and Whitney Canada and Magic Earth, LLC
Scaling: A Shift in Pipe Paradigm
1. Screen-based decomposition
2. Eye-based decomposition
3. Time-based decomposition
4. Data-based decomposition
Even more powerful in combination
• All modes can be used separately or combined in any number of ways
Visible Human public data set. Data courtesy of DaimlerChrysler. Images courtesy of MAK.
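The slide names the four decomposition modes without defining them, so the following Python sketch rests on assumed interpretations: screen-based splits the image into strips, eye-based assigns stereo eyes, time-based hands out frames round-robin, and data-based partitions the data set. All function names are hypothetical.

```python
def screen_decomposition(width, height, num_pipes):
    """Split the screen into vertical strips, one per pipe, as (x0, y0, x1, y1)."""
    bounds = [width * i // num_pipes for i in range(num_pipes + 1)]
    return [(bounds[i], 0, bounds[i + 1], height) for i in range(num_pipes)]

def eye_decomposition(num_pipes):
    """Alternate pipes between the left and right eye of a stereo pair."""
    return ["left" if i % 2 == 0 else "right" for i in range(num_pipes)]

def time_decomposition(frame, num_pipes):
    """Round-robin: return the index of the pipe that renders this frame."""
    return frame % num_pipes

def data_decomposition(num_items, num_pipes):
    """Split a data set (for example, volume slices) into contiguous index ranges."""
    bounds = [num_items * i // num_pipes for i in range(num_pipes + 1)]
    return [range(bounds[i], bounds[i + 1]) for i in range(num_pipes)]

print(screen_decomposition(1280, 1024, 4))
print(eye_decomposition(2))
print(time_decomposition(5, 4))
print(data_decomposition(10, 3))
```

Because each function only decides who renders what, the modes compose: for instance, two groups of pipes could be split by eye, and each group then split by screen strip.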
Compositor Flexibility
• Multi-Tier Composition
  • Composite the output of multiple compositors, e.g., first layer does 2D composition, second layer does anti-aliasing
• Visual Serving
  • Composited output sent to workstations for viewing and/or editing
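A toy Python sketch of the two-tier example on this slide, under stated assumptions: images are small lists of grayscale rows, "2D composition" is taken to mean joining tiles side by side, and "anti-aliasing" is taken to mean averaging several jittered renderings. The function names are hypothetical, and the real compositor is hardware, not software like this.

```python
def compose_2d(left_tile, right_tile):
    """Tier 1: join two tiles side by side, row by row."""
    return [l + r for l, r in zip(left_tile, right_tile)]

def antialias(frames):
    """Tier 2: average several same-sized frames pixel by pixel."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(cols)]
            for y in range(rows)]

# Two slightly jittered renderings, each produced as a left tile and a right tile.
frame_a = compose_2d([[0, 0], [0, 0]], [[1, 1], [1, 1]])
frame_b = compose_2d([[0, 1], [0, 1]], [[1, 1], [1, 1]])

# The second tier consumes the outputs of the first tier.
final = antialias([frame_a, frame_b])
print(final)
```

The edge pixel that differs between the two renderings comes out at 0.5, which is the smoothing effect the second tier is there to provide.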
Silicon Graphics® Onyx4™ UltimateVision™
System Architecture
[Diagram labels]
• Optional 8GB RAM
• Standard I/O or 2 Graphics Pipes
• CPU, CPU
• Memory Controller
• SGI® NUMA scalability
• CPU, CPU
• 2 Graphics Pipes
Conclusion
• Silicon Graphics® Onyx4™ UltimateVision™
  • Solving bigger and more complex problems
• World's most scalable visualization system
  • Up to 32 GPUs in a single-system-image (SSI) architecture
• World-leading computational capability
  • Up to 64 CPUs per node, scalable to 1024 processors
• Solves system bandwidth limitations of PCs and clusters
  • Up to 8 NUMAlink 3 connections to a single shared memory pool
• New-generation programmable graphics architecture
  • OpenGL Shading Language