GPU Programming Overview
Summer 2005
류승택
Introduction
• GPGPU (General-Purpose Computation on GPUs)
  • The first commodity, programmable parallel architecture
  • GPU evolution driven by the computer game market
• Advantage of data-parallelism
  • GPUs are >10x faster than CPUs for appropriate problems
• Advantage of commodity
  • GPUs are inexpensive
  • GPUs are ubiquitous
    • Desktops, laptops, PDAs, cell phones
• Achieving this speedup
  • Requires a large amount of GPU-specific knowledge
Motivation • Challenge Statement • GPGPU signifies the dawn of the desktop parallel computing age
Real-time Rendering • Graphics hardware enables real-time rendering • Real-time means a display rate of more than 10 images per second • 3D Scene = Collection of 3D primitives (triangles, lines, points) • Image = Array of pixels
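Bus Interface
• ISA (Industry Standard Architecture)
  • Bus interface in use since the XT and AT era of the 1980s
  • Theoretical maximum transfer rate of 16 MB/s
  • Causes a severe bottleneck for peripherals
  • Now used only for devices where speed matters little, such as sound cards and modems
• PCI (Peripheral Component Interconnect)
  • Parallel connection
  • Successor to ISA for connecting peripherals
  • Slots are smaller than ISA slots, and devices can share IRQs
  • The common 32-bit, 33 MHz version delivers 133 MB/s; the 64-bit, 66 MHz version delivers about 533 MB/s
  • Most peripherals use the PCI interface
• (Figure labels: PCI, AGP, and ISA slots)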
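Bus Interface
• AGP (Accelerated Graphics Port)
  • Parallel connection dedicated to the graphics card
  • Developed by Intel
  • Based on PCI, but with more than twice the transfer rate
  • Runs at a base clock of 66 MHz
  • AGP = 2 x PCI (AGP 2x = 2 x AGP)
  • AGP 1x peaks at about 266 MB/s
  • AGP 2x peaks at about 533 MB/s
  • Intended for 3D graphics cards
• PCIe (PCI Express)
  • Serial connection (cheap, scalable)
  • Up to 8.0 GB/s of bandwidth on an x16 link (PCIe x16 = 2 x AGP 8x)
  • Intel, ATI, and NVIDIA, the leaders of the worldwide graphics market, have firmly endorsed this new standard as the next-generation graphics interface
  • AGP, which was created because of the limits of PCI and was the unrivaled interface for graphics processing units (GPUs), is now being replaced by PCI Express
• (Figure labels: PCI, PCIe x1, and PCIe x16 slots; GeForce 7800 GTX (PCIe x16))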
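The peak rates quoted for the parallel buses follow from one product: bytes per transfer times clock rate times transfers per clock. The sketch below reproduces that arithmetic. The function name is hypothetical, and the nominal clocks of 33.33 MHz and 66.67 MHz are an assumption; with clocks rounded to 33 and 66 MHz the results come out slightly lower (for example, 4 bytes x 66 MHz = 264 MB/s).

```python
def peak_bandwidth_mb_s(bus_bits, clock_mhz, transfers_per_clock=1):
    """Theoretical peak of a parallel bus in MB/s (1 MB = 10**6 bytes)."""
    bytes_per_transfer = bus_bits / 8
    return bytes_per_transfer * clock_mhz * transfers_per_clock


# Nominal clocks (assumed): 33.33 MHz for PCI, 66.67 MHz for 66 MHz PCI and AGP.
pci_32 = peak_bandwidth_mb_s(32, 33.33)     # 32-bit PCI at 33 MHz
pci_64 = peak_bandwidth_mb_s(64, 66.67)     # 64-bit PCI at 66 MHz
agp_1x = peak_bandwidth_mb_s(32, 66.67)     # AGP 1x: one transfer per clock
agp_2x = peak_bandwidth_mb_s(32, 66.67, 2)  # AGP 2x: two transfers per clock

for name, value in [("PCI 32/33", pci_32), ("PCI 64/66", pci_64),
                    ("AGP 1x", agp_1x), ("AGP 2x", agp_2x)]:
    print(f"{name}: {value:.0f} MB/s")
```

These are theoretical ceilings only; sustained throughput on a real bus is lower because of arbitration and protocol overhead.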
PC Graphics Software Architecture • The application, 3D API and driver are written in C or C++ • The vertex and pixel programs are written in a high-level shading language • (Cg, DirectX HLSL, OpenGL Shading Language) • Pushbuffer: Contains the commands to be executed on the GPU
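GPU Fundamentals: The Graphics Pipeline
• A simplified graphics pipeline
• Note that pipe widths vary
• Many caches, FIFOs, and so on not shown
• Diagram stages: Application (CPU) => Transform => Rasterizer => Shade => Video Memory (Textures), with the GPU stages configured by the Graphics State
• Diagram data flow: Vertices (3D) => Transformed, Lit Vertices (2D) => Fragments (pre-pixels) => Final pixels (Color, Depth)
• Render-to-texture: final pixels can be written back to video memory as textures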
Stream Program => GPU • A stream is a sequence of data (could be numbers, colors, RGBA vectors,…)
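One way to picture a stream program is as a single function applied independently to every element of a stream. The sketch below runs on the CPU in plain Python purely for illustration. The term "kernel" and the names `run_kernel` and `brighten` are assumptions introduced here, not taken from the slide.

```python
def run_kernel(kernel, stream):
    """Apply the same function independently to every stream element.

    No element depends on another, so a GPU could process them in parallel.
    """
    return [kernel(element) for element in stream]


def brighten(rgba):
    """Hypothetical kernel: double the color channels and clamp to 1.0."""
    r, g, b, a = rgba
    return (min(r * 2.0, 1.0), min(g * 2.0, 1.0), min(b * 2.0, 1.0), a)


# A stream of RGBA vectors, one of the element types the slide mentions.
pixels = [(0.1, 0.2, 0.3, 1.0), (0.6, 0.5, 0.4, 1.0)]
result = run_kernel(brighten, pixels)
print(result)
```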
GPU Fundamentals: The Modern Graphics Pipeline
• Programmable vertex processor!
• Programmable pixel processor!
• Diagram stages: Application (CPU) => Vertex Processor => Rasterizer => Fragment (Pixel) Processor => Video Memory (Textures), with the GPU stages configured by the Graphics State
• Diagram data flow: Vertices (3D) => Transformed, Lit Vertices (2D) => Fragments (pre-pixels) => Final pixels (Color, Depth)
• Render-to-texture: final pixels can be written back to video memory as textures
GPU Pipeline: Transform • Vertex Processor (multiple operate in parallel) • Transform from “world space” to “image space” • Compute per-vertex lighting
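The transform step can be sketched as a 4x4 matrix multiply on a homogeneous vertex, followed by a perspective divide and a mapping to pixel coordinates. This is a minimal CPU illustration, not a description of any particular hardware. The names `mat_vec`, `transform_vertex`, and `PROJ`, the toy projection matrix, and the 640x480 viewport are all assumptions; per-vertex lighting is left out.

```python
def mat_vec(m, v):
    """Multiply a 4x4 matrix (list of rows) by a 4-component vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def transform_vertex(matrix, position, width, height):
    """Map a 3D position to image-space pixel coordinates plus depth."""
    x, y, z, w = mat_vec(matrix, [position[0], position[1], position[2], 1.0])
    ndc_x, ndc_y = x / w, y / w            # perspective divide
    sx = (ndc_x + 1.0) * 0.5 * width       # [-1, 1] -> [0, width]
    sy = (1.0 - ndc_y) * 0.5 * height      # flip y: image origin is top-left
    return (sx, sy, z / w)


# Hypothetical projection for a camera looking down -z: it sets w = -z,
# so points farther away shrink toward the image center.
PROJ = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, -1.0, 0.0],
]

center = transform_vertex(PROJ, (0.0, 0.0, -2.0), 640, 480)
corner = transform_vertex(PROJ, (1.0, 1.0, -2.0), 640, 480)
print(center, corner)
```

Each vertex is transformed independently of the others, which is why several vertex processors can work in parallel.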
GPU Pipeline: Rasterizer • Rasterizer • Convert geometric rep. (vertex) to image rep. (fragment) • Fragment = image fragment • Pixel + associated data: color, depth, stencil, etc. • Interpolate per-vertex quantities across pixels
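The interpolation of per-vertex quantities can be illustrated with barycentric weights: each covered pixel receives a weighted blend of the three vertex values. The sketch below is a simple CPU version that tests every pixel center against the triangle. Barycentric weighting is a common technique assumed here for illustration; the slide does not say how the hardware interpolates. The names `edge` and `rasterize` are hypothetical.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(v0, v1, v2, c0, c1, c2, width, height):
    """Return ((x, y), color) fragments for pixels whose center is covered."""
    area = edge(v0, v1, v2)
    fragments = []
    if area == 0:                      # degenerate triangle covers nothing
        return fragments
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)     # sample at the pixel center
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 >= 0 and w1 >= 0 and w2 >= 0:
                # Blend the per-vertex colors with the barycentric weights.
                color = tuple(w0 * a + w1 * b + w2 * c
                              for a, b, c in zip(c0, c1, c2))
                fragments.append(((x, y), color))
    return fragments


# Right triangle with red, green, and blue corners on a 4x4 pixel grid.
fragments = rasterize((0, 0), (4, 0), (0, 4),
                      (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0),
                      4, 4)
print(len(fragments), fragments[0])
```

The same weights would interpolate any other per-vertex quantity, such as depth or texture coordinates.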
GPU Pipeline: Shade • Fragment Processors (multiple in parallel) • Compute a color for each pixel • Optionally read colors from textures (images)
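A fragment program can be thought of as a small function that runs once per fragment and returns a color, optionally reading a texture on the way. The sketch below models that on the CPU with a nearest-neighbor texture lookup multiplied by an interpolated color. The names `sample_nearest`, `shade_fragment`, and `CHECKER` are hypothetical, and texture coordinates are assumed to lie in [0, 1].

```python
def sample_nearest(texture, u, v):
    """Read the texel nearest to (u, v); texture is a list of rows."""
    height = len(texture)
    width = len(texture[0])
    x = min(int(u * width), width - 1)    # clamp so u == 1.0 stays in range
    y = min(int(v * height), height - 1)
    return texture[y][x]


def shade_fragment(texture, uv, interpolated_color):
    """Hypothetical fragment program: texel modulated by the fragment color."""
    texel = sample_nearest(texture, uv[0], uv[1])
    return tuple(t * c for t, c in zip(texel, interpolated_color))


# 2x2 checkerboard texture of RGB texels.
CHECKER = [
    [(1.0, 1.0, 1.0), (0.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)],
]

lit = shade_fragment(CHECKER, (0.25, 0.25), (1.0, 0.5, 0.25))
dark = shade_fragment(CHECKER, (0.75, 0.25), (1.0, 0.5, 0.25))
print(lit, dark)
```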
1995-1998: Texture Mapping and Z-Buffer • PCI: Peripheral Component Interconnect • 3dfx’s Voodoo
1998: Multitexturing • AGP: Accelerated Graphics Port • NVIDIA’s TNT, ATI’s Rage
Multitexturing Light Mapping
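Light mapping is a typical use of multitexturing: a base color texture and a precomputed light map are combined per pixel. The sketch below multiplies the two texels channel by channel, which is one common combine mode and is assumed here for illustration. The names `modulate` and `apply_light_map` are hypothetical, and both textures are assumed to have the same size.

```python
def modulate(base_texel, light_texel):
    """Combine two texels by channel-wise multiplication."""
    return tuple(b * l for b, l in zip(base_texel, light_texel))


def apply_light_map(base, light):
    """Modulate every texel of the base texture by the light map."""
    return [[modulate(b, l) for b, l in zip(base_row, light_row)]
            for base_row, light_row in zip(base, light)]


base = [[(1.0, 0.5, 0.0), (1.0, 0.5, 0.0)]]   # orange surface
light = [[(1.0, 1.0, 1.0), (0.5, 0.5, 0.5)]]  # fully lit, then half lit
lit = apply_light_map(base, light)
print(lit)
```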
1999-2000: Transform and Lighting • Register Combiner: Offers many more texture/color combinations • NVIDIA’s GeForce 256 and GeForce2, ATI’s Radeon 7500
Environment Mapping
2001: Programmable Vertex Shader • A programmable processor for any per-vertex computation • Z-Cull: Predicts which fragments will fail the Z test and discards them • Texture Shader: Offers more texture addressing and operations • NVIDIA’s GeForce3 and GeForce4 Ti, ATI’s Radeon 8500
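The Z test that Z-Cull anticipates can be sketched as a per-pixel depth comparison: a fragment is kept only if it is closer than what the depth buffer already holds. The code below shows the plain test on the CPU, not the hardware prediction itself. The name `depth_test` is hypothetical, and the convention that a smaller depth means closer is an assumption.

```python
def depth_test(zbuffer, colorbuffer, fragments):
    """Write fragments that pass the Z test; return how many were discarded."""
    discarded = 0
    for (x, y), depth, color in fragments:
        if depth < zbuffer[y][x]:          # closer than the stored depth
            zbuffer[y][x] = depth
            colorbuffer[y][x] = color
        else:                              # hidden behind an earlier fragment
            discarded += 1
    return discarded


# A 1x1 image: depth starts at infinity (nothing drawn yet).
zbuffer = [[float("inf")]]
colorbuffer = [[None]]
incoming = [((0, 0), 0.5, "red"), ((0, 0), 0.8, "blue"), ((0, 0), 0.2, "green")]
discarded = depth_test(zbuffer, colorbuffer, incoming)
print(colorbuffer[0][0], zbuffer[0][0], discarded)
```

Discarding a fragment before it is shaded saves the cost of running the fragment program on a pixel that would never be visible.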
2002-2003: Programmable Pixel Shader • A programmable processor for any per-pixel computation • MRT: Multiple Render Target • NVIDIA’s GeForce FX, ATI’s Radeon 9600 to 9800
Shader: Static vs. Dynamic flow control • Static flow control • Condition varies per batch of triangles • Dynamic flow control • Condition varies per vertex or pixel • Full flow control • Static and dynamic flow control
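The difference between the two kinds of flow control is where the branch condition comes from. The sketch below contrasts them on the CPU: in the static case the condition is fixed for a whole batch, so every pixel takes the same path; in the dynamic case each pixel evaluates its own condition. The function names and the darkening operation are hypothetical, chosen only to make the branch visible.

```python
def shade_static(pixels, darken):
    """Static flow control: `darken` is constant for the whole batch."""
    if darken:                              # decided once, outside the loop
        return [p * 0.5 for p in pixels]
    return list(pixels)


def shade_dynamic(pixels, threshold):
    """Dynamic flow control: the branch depends on each pixel's own value."""
    return [p * 0.5 if p > threshold else p for p in pixels]


batch = [0.2, 0.8]
static_result = shade_static(batch, darken=True)
dynamic_result = shade_dynamic(batch, threshold=0.5)
print(static_result, dynamic_result)
```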
2004: Shader Model 3.0 and 64-bit Color Support • PCIe: Peripheral Component Interconnect Express • NVIDIA’s GeForce 6800
Fixed-function pipeline
• 3D Application or Game => 3D API Commands => 3D API: OpenGL or Direct3D
• CPU-GPU Boundary (AGP/PCIe)
• GPU Command & Data Stream => GPU Front End => Vertex Index Stream => Primitive Assembly => Assembled Primitives => Rasterization and Interpolation => Pixel Location Stream => Raster Operations => Pixel Updates => Frame Buffer
• Pre-transformed Vertices => Programmable Vertex Processor => Transformed Vertices
• Pre-transformed Fragments => Programmable Fragment Processor => Transformed Fragments
Programmable pipeline
• 3D Application or Game => 3D API Commands => 3D API: OpenGL or Direct3D
• CPU-GPU Boundary (AGP/PCIe)
• GPU Command & Data Stream => GPU Front End => Vertex Index Stream => Primitive Assembly => Assembled Primitives => Rasterization and Interpolation => Pixel Location Stream => Raster Operations => Pixel Updates => Frame Buffer
• Pre-transformed Vertices => Programmable Vertex Processor => Transformed Vertices
• Pre-transformed Fragments => Programmable Fragment Processor => Transformed Fragments
Real-time Tone Mapping • The image is entirely computed in 64-bit color and tone-mapped for display • 64-bit color = one 16-bit floating-point value per channel (R, G, B, A) • Tone Mapping: maps an HDRI (High Dynamic Range Image) to a low dynamic range device • Figure: low- to high-exposure images of the same scene
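Both ideas on this slide can be checked with a few lines of arithmetic. The sketch below first confirms that four 16-bit floats make a 64-bit pixel, then compresses high dynamic range values into 8-bit display values. The slide does not name a tone-mapping operator, so the curve x / (1 + x) used here is an assumption chosen for simplicity, as are the names `tone_map` and `exposure`.

```python
import struct

# Four half-precision floats (R, G, B, A) occupy 8 bytes, i.e. 64 bits.
BITS_PER_PIXEL = struct.calcsize("<4e") * 8


def tone_map(hdr_rgb, exposure=1.0):
    """Map unbounded HDR channel values to 8-bit display values."""
    ldr = []
    for channel in hdr_rgb:
        scaled = max(channel, 0.0) * exposure  # exposure scales scene brightness
        mapped = scaled / (1.0 + scaled)       # compresses [0, inf) into [0, 1)
        ldr.append(int(mapped * 255 + 0.5))    # round to the nearest 8-bit level
    return tuple(ldr)


hdr_pixel = (0.0, 1.0, 3.0)
low_exposure = tone_map(hdr_pixel)
high_exposure = tone_map(hdr_pixel, exposure=3.0)
print(BITS_PER_PIXEL, low_exposure, high_exposure)
```

Raising the exposure brightens the result while bright values still stay below the display maximum, which mirrors the low- to high-exposure series of one scene.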
2005: NVIDIA GeForce 7800
• NVIDIA SLI (Scalable Link Interface) Technology
  • Dramatically scales performance by allowing two graphics cards to be run in parallel
• 64-Bit Floating Point Texture Filtering and Blending
• Designed for PCI Express x16
• API Support
  • Complete DirectX support, including the latest version of Microsoft DirectX 9.0 Shader Model 3.0
  • Full OpenGL support, including OpenGL 2.0
Radiosity • A visual effect that shows how light bounces off of some objects and contributes to the final lighting of another object NVIDIA Demo: Mad Mod Mike
The Future
• Unified general programming model at primitive, vertex and pixel levels
• Scary amount of:
  • Floating point horsepower
  • Video memory
  • Bandwidth between system and video memory
• Lower chip costs and power requirements to make 3D graphics hardware ubiquitous
  • Automotive (gaming, navigation, head-up displays)
  • Home (remotes, media center, automation)
  • Mobile (PDAs, cell phones)
GPU Programming
• Low-level language
  • Assembler-like
  • Best performance
  • Platform-dependent
  • Vertex programming, fragment programming
  • Ex) OpenGL extensions, DirectX 9
• High-level shading language
  • Easier programming
  • Easier code reuse
  • Easier debugging
  • Easy to read
  • Ex) Cg, HLSL, GLSL
GPU Programming
• Low-level language
  • OpenGL extensions
    • GL_ARB_vertex_program, GL_ARB_fragment_program
  • DirectX 9
    • Vertex Shader 2.0, Pixel Shader 2.0
• High-level shading language
  • Cg
    • “C for Graphics”, by NVIDIA
  • HLSL
    • “High-Level Shading Language”, part of DirectX 9 (Microsoft)
  • GLSL
    • “OpenGL 2.0 Shading Language”, proposal by 3Dlabs
• HLSL and Cg are much more similar to each other than they are to GLSL
Reference
• Course Notes
  • EG2004
  • SIGGRAPH2004
  • VIS2004
• David Luebke, General-Purpose Computation on Graphics Hardware
• Daniel Weiskopf, Basic of GPU-Based Programming
• Cyril Zeller, Introduction to the Hardware Graphics Pipeline
• Randy Fernando, Programming the GPU
• Suresh Venkatasubramanian, GPU Programming and Architecture
• GPGPU (http://www.gpgpu.org/)
• GPU Programming http://euclid.uits.iupui.edu/wiki/index.php/GPU_Programming
• Shader::Tech http://www.shadertech.com/
• Nvidia Developer http://developer.nvidia.com/object/gpu_programming_guide.html
• GPGPU Developer Resources