A Crash Course on Programmable Graphics Hardware Li-Yi Wei 2005 at Tsinghua University, Beijing
The evolution of graphics hardware • SGI Origin 3400 • NVIDIA GeForce 7800
7 years of graphics • accelenation.com/?doc=123&page=1
Ray tracing • General & flexible • Intuitive • Global illumination • Hard to accelerate
Polygonal graphics pipeline • Local computation • Easy to accelerate • Not general • Unintuitive
Graphics hierarchy • Layered approach • Like network layers • Encapsulation • Easy programming • Driver optimization • Driver workaround • Driver simulation • Protection • Hardware error check
Overview • Graphics pipeline • Only high level overview (so you can program), not necessarily real hardware • GPU programming
Application • Mostly on CPU • High level work • User interface • Control • Simulation • Physics • Artificial intelligence
Host • Gatekeeper of GPU • Command processing • Error checking • State management • Context switch
Geometry • Vertex processor • Primitive assembly • Clip & cull • Viewport transform
Vertex Processor • Processes one vertex at a time • No information on other vertices • Programmable • Transformation • Lighting
Transformation • Global to eye coordinate system
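As a rough illustration of the transformation step, the sketch below multiplies a homogeneous point by a 4x4 matrix. The helper names `mat_vec` and `translation`, and the camera placement (at world position (0, 0, 5), no rotation), are assumptions made for this example and do not come from the slides.

```python
def mat_vec(m, v):
    """Multiply a 4x4 row-major matrix by a 4-component column vector."""
    return [sum(m[r][c] * v[c] for c in range(4)) for r in range(4)]


def translation(tx, ty, tz):
    """Build a 4x4 translation matrix."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


# Assumed camera: sits at (0, 0, 5) in world space with no rotation,
# so the world-to-eye transform is a translation by (0, 0, -5).
world_to_eye = translation(0.0, 0.0, -5.0)

p_world = [1.0, 2.0, 0.0, 1.0]          # homogeneous point, w = 1
p_eye = mat_vec(world_to_eye, p_world)  # same point in eye coordinates
```

A real vertex program would typically apply a combined model-view-projection matrix in the same way, one vertex at a time.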
Lighting • Diffuse • Specular
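The slide only names the two terms. As one possible reading, the sketch below computes a Lambertian diffuse term and a Blinn-Phong specular term for a single vertex. The choice of Blinn-Phong, the function name `shade`, and the default `shininess` value are assumptions for illustration.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]


def shade(normal, to_light, to_eye, shininess=32.0):
    """Return (diffuse, specular) intensities for one vertex.

    All three vectors point away from the surface point and must be non-zero.
    """
    n = normalize(normal)
    l = normalize(to_light)
    e = normalize(to_eye)

    # Diffuse: cosine of the angle between normal and light direction.
    diffuse = max(dot(n, l), 0.0)
    if diffuse == 0.0:
        return 0.0, 0.0  # light is behind the surface: no highlight either

    # Specular (Blinn-Phong, an assumed model): use the half vector
    # between the light and eye directions.
    half = [a + b for a, b in zip(l, e)]
    if dot(half, half) == 0.0:
        return diffuse, 0.0  # light and eye directions are exactly opposite
    specular = max(dot(n, normalize(half)), 0.0) ** shininess
    return diffuse, specular
```

Calling `shade([0, 0, 1], [0, 0, 1], [0, 0, 1])` gives full diffuse and specular intensity, since the light and viewer both sit directly along the normal.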
Transform & Light on Vertex Processor • A sequence of assembly instructions • (more on this later)
Primitive Assembly • Assemble individual vertices into triangle (or line or point) • Performance implication • A triangle is ready only when all 3 vertices are • Vertex coherence & caching
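To make the vertex coherence point concrete, here is a small simulation of an indexed triangle list passing through a post-transform vertex cache. The FIFO replacement policy, the cache sizes, and the function names are assumptions for this sketch; real hardware may behave differently.

```python
from collections import deque


def assemble_triangles(indices):
    """Group a flat index list into triangles of three indices each."""
    return [tuple(indices[i:i + 3]) for i in range(0, len(indices) - 2, 3)]


def count_vertex_runs(indices, cache_size):
    """Count vertex-processor invocations with a FIFO cache of transformed vertices."""
    cache = deque(maxlen=cache_size)  # oldest entry is evicted automatically
    runs = 0
    for idx in indices:
        if idx not in cache:
            runs += 1          # cache miss: the vertex must be processed
            cache.append(idx)
    return runs


# Two triangles sharing the edge (1, 2).
indices = [0, 1, 2, 2, 1, 3]
triangles = assemble_triangles(indices)
runs_with_cache = count_vertex_runs(indices, cache_size=4)
runs_without_cache = count_vertex_runs(indices, cache_size=0)
```

With the cache, the shared vertices are processed once; without it, every index triggers a vertex-processor run, and each triangle waits for all three.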
Clipping & Culling • Backface culling • Remove triangles facing away from view • Eliminate ½ of the triangles in theory • Clipping against view frustum • Triangles may become quadrilaterals
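Backface culling can be sketched in screen space with the sign of the triangle's area. The convention that counter-clockwise triangles are front-facing is an assumption here (graphics APIs let the application choose), and the function names are hypothetical.

```python
def signed_area(a, b, c):
    """Signed area of a 2D triangle; positive when a, b, c run counter-clockwise."""
    return 0.5 * ((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1]))


def is_backfacing(a, b, c):
    """Assumes counter-clockwise winding means front-facing.

    Zero-area (degenerate) triangles are also rejected.
    """
    return signed_area(a, b, c) <= 0.0


front = is_backfacing((0, 0), (1, 0), (0, 1))  # counter-clockwise: kept
back = is_backfacing((0, 0), (0, 1), (1, 0))   # clockwise: culled
```

For a closed mesh, roughly half of the triangles face away from the viewer, which is where the "½ in theory" figure comes from.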
Viewport transform • From floating point range [-1, 1] x [-1, 1] to integer range [0, width-1] x [0, height-1]
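A minimal sketch of this mapping, assuming x maps to a column index in [0, width-1], y maps to a row index in [0, height-1], and results are rounded to the nearest integer. Real hardware also handles pixel centers, sub-pixel precision, and a possible y flip, none of which is modeled here.

```python
import math


def viewport_transform(x_ndc, y_ndc, width, height):
    """Map normalized coordinates in [-1, 1] to integer pixel coordinates."""
    px = math.floor((x_ndc + 1.0) * 0.5 * (width - 1) + 0.5)   # round to nearest
    py = math.floor((y_ndc + 1.0) * 0.5 * (height - 1) + 0.5)
    return px, py


lower_left = viewport_transform(-1.0, -1.0, 640, 480)
upper_right = viewport_transform(1.0, 1.0, 640, 480)
```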
Rasterization • Convert primitives (triangles, lines) into pixels • Barycentric coordinate • Attribute interpolation
Attribute interpolation • Interpolation • Barycentric
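Barycentric interpolation can be sketched with edge functions: each weight is the area of the sub-triangle opposite a vertex, divided by the full triangle area. The function names and the sample triangle are assumptions for illustration.

```python
def edge(a, b, p):
    """Twice the signed area of triangle (a, b, p)."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def barycentric(p, a, b, c):
    """Barycentric weights of point p with respect to triangle (a, b, c)."""
    area = edge(a, b, c)
    return edge(b, c, p) / area, edge(c, a, p) / area, edge(a, b, p) / area


def interpolate(weights, attr_a, attr_b, attr_c):
    """Blend per-vertex attributes (e.g. RGB colors) with barycentric weights."""
    wa, wb, wc = weights
    return [wa * x + wb * y + wc * z for x, y, z in zip(attr_a, attr_b, attr_c)]


a, b, c = (0, 0), (4, 0), (0, 4)
weights = barycentric((1, 1), a, b, c)
# Red at a, green at b, blue at c.
color = interpolate(weights, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
```

A point is inside the triangle exactly when all three weights are non-negative, so the same computation can serve as a coverage test during rasterization.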
Perspective correct interpolation • Figure: correct vs. incorrect interpolation
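The difference between the two results can be sketched along a single edge. Perspective-correct interpolation linearly interpolates attribute/w and 1/w in screen space and then divides. The names below and the sample w values are assumptions for this example.

```python
def lerp(a, b, t):
    return a + (b - a) * t


def affine_interp(u0, u1, t):
    """Incorrect for perspective views: interpolates the attribute directly in screen space."""
    return lerp(u0, u1, t)


def perspective_interp(u0, w0, u1, w1, t):
    """Correct: interpolate u/w and 1/w linearly in screen space, then divide."""
    u_over_w = lerp(u0 / w0, u1 / w1, t)
    one_over_w = lerp(1.0 / w0, 1.0 / w1, t)
    return u_over_w / one_over_w


# Edge from a near vertex (w = 1) to a far vertex (w = 3),
# sampled at the screen-space midpoint t = 0.5.
wrong = affine_interp(0.0, 1.0, 0.5)
right = perspective_interp(0.0, 1.0, 1.0, 3.0, 0.5)
```

The screen-space midpoint lies closer to the near vertex in 3D, so the correct value is biased toward the near vertex's attribute. When both vertices share the same w, the two methods agree.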
Fragment processor • Fragment: corresponds to a single pixel and includes color, depth, and sometimes texture-coordinate values • Compute color and depth for each pixel • Most interesting part of GPU
Texture • Optional (though hard to avoid) • Cache data • Hide latency from FB • Sampling/filtering • I told you this last time
ROP (Raster Operation) • Write to framebuffer • Comparison • Z, stencil, alpha, window
Framebuffer • Storing buffers and textures • Connect to display • Characteristics • Size • Bandwidth • Latency
Conceptual programming model • Inputs (read-only) • Attributes • Constants • Textures • Registers (read-write) • Used by shader • Outputs (write-only) • Attributes
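As one example of the comparisons performed at this stage, the sketch below applies a depth (Z) test before writing a fragment. The "less than" comparison, the clear depth of 1.0, and the function name `rop_write` are assumptions; stencil, alpha, and window tests are not modeled.

```python
def rop_write(color_buf, depth_buf, x, y, color, depth):
    """Write a fragment only if it is closer than what the buffers already hold."""
    if depth < depth_buf[y][x]:      # assumed comparison: less-than
        depth_buf[y][x] = depth
        color_buf[y][x] = color
        return True
    return False                     # fragment is hidden and discarded


width, height = 2, 2
color_buf = [[(0, 0, 0)] * width for _ in range(height)]
depth_buf = [[1.0] * width for _ in range(height)]  # cleared to the far plane

rop_write(color_buf, depth_buf, 0, 0, (255, 0, 0), 0.8)  # far red: written
rop_write(color_buf, depth_buf, 0, 0, (0, 255, 0), 0.3)  # nearer green: overwrites
rop_write(color_buf, depth_buf, 0, 0, (0, 0, 255), 0.5)  # behind green: rejected
```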
Simple example • HPOS: position • COL0: diffuse color • MOV o[HPOS], v[HPOS]; • MOV o[COL0], v[COL0];
More complex example • o[COL0] = v[COL0] + constant*v[HPOS]; • MOV o[HPOS], v[HPOS]; • MOV R0, v[COL0]; • MAD R0, v[HPOS], c[0], R0; • MOV o[COL0], R0;
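To check what the instruction sequence above computes, it can be emulated on the CPU. This is only a sketch: registers are modeled as 4-component lists, `mov` and `mad` are hypothetical helpers mirroring the MOV and MAD instructions, and the input values are made up.

```python
def mov(src):
    """MOV dst, src: copy a 4-component register."""
    return list(src)


def mad(a, b, c):
    """MAD dst, a, b, c: component-wise a * b + c."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def vertex_program(v, c):
    """Emulate the four instructions; v = input attributes, c = constants."""
    o = {}
    o["HPOS"] = mov(v["HPOS"])      # MOV o[HPOS], v[HPOS];
    r0 = mov(v["COL0"])             # MOV R0, v[COL0];
    r0 = mad(v["HPOS"], c[0], r0)   # MAD R0, v[HPOS], c[0], R0;
    o["COL0"] = mov(r0)             # MOV o[COL0], R0;
    return o


v = {"HPOS": [1.0, 2.0, 3.0, 1.0], "COL0": [0.25, 0.25, 0.25, 1.0]}
c = [[0.5, 0.5, 0.5, 0.0]]
out = vertex_program(v, c)
```

The output color equals `v[COL0] + c[0] * v[HPOS]` per component, matching the formula on the slide; the inputs are only read, the temporary `r0` is read and written, and the outputs are only written, as in the conceptual programming model.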
High-level shading language • Writing assembly is • Painful • Not portable • Not optimizable • High-level shading languages solve these • Cg, HLSL
Applications • Too many of them for me to describe here • The only way to learn is to try programming • Useless for you even if I try to describe • Look at developer websites • NVIDIA, ATI, GPGPU
Homework • Try to program GPU! • Even without NVIDIA GPU, you can download the emulator • Stanford course on graphics hardware • http://www.graphics.stanford.edu/courses/cs448a-01-fall/ • History of graphics hardware • 7 years of graphics • accelenation.com/?doc=123&page=1