Introduction to Programmable Hardware
Traditional Graphics Pipeline
• transform & lighting (per vertex operations)
• setup rasterizer (per primitive operation)
• texture blending (per fragment operation)
• frame-buffer anti-aliasing
Programmable features
• Vertex Programming
• Pixel Shader
  • Texture shader
  • Register combiner
• Based on nVIDIA architecture
Vertex Program (cont’d)
• Vertex programming offers a programmable T&L unit
• User-defined vertex processing gives the programmer total control of vertex processing
[Diagram: pipeline of transform & lighting, setup rasterizer, texture blending, frame-buffer anti-aliasing, with user-defined vertex processing at the transform & lighting stage]
Vertex Program (cont’d)
[Diagram: vertex program, followed by setup rasterizer, texture blending, and frame-buffer anti-aliasing; the vertex program sits at the transform & lighting stage]
Vertex Program (cont’d)
• Vertex program
  • Assembly language interface to the T&L unit
  • GPU instruction set to perform all vertex math
  • Reads an untransformed, unlit vertex
  • Creates a transformed vertex
  • Optionally:
    • Lights a vertex
    • Creates texture coordinates
    • Creates fog coordinates
    • Creates point sizes
Create Vertex Program
• Programs (assembly) are defined inline as character strings

    /* Transform position v[0] by the matrix rows held in c[0]..c[3],
       then copy the color attribute v[3] to the output color. */
    static const GLubyte vpgm[] =
        "!!VP1.0 "
        "DP4 o[HPOS].x, c[0], v[0]; "
        "DP4 o[HPOS].y, c[1], v[0]; "
        "DP4 o[HPOS].z, c[2], v[0]; "
        "DP4 o[HPOS].w, c[3], v[0]; "
        "MOV o[COL0], v[3]; "
        "END";
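The program above amounts to four dot products and one copy. The following is a minimal CPU-side sketch of that arithmetic in plain C, not driver or GPU code; the names `vec4`, `vertex_out`, `dp4`, and `run_vpgm` are hypothetical and chosen only for illustration.

```c
/* A four-component register, like the quad-float registers of the vertex unit. */
typedef struct { float x, y, z, w; } vec4;

/* Hypothetical record holding only the two outputs this program writes. */
typedef struct { vec4 hpos; vec4 col0; } vertex_out;

/* DP4: four-component dot product. */
float dp4(vec4 a, vec4 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z + a.w * b.w;
}

/* Emulates the program: c[0..3] are constant registers,
   v0 is the input position, v3 is the input color. */
vertex_out run_vpgm(const vec4 c[4], vec4 v0, vec4 v3) {
    vertex_out o;
    o.hpos.x = dp4(c[0], v0);  /* DP4 o[HPOS].x, c[0], v[0]; */
    o.hpos.y = dp4(c[1], v0);  /* DP4 o[HPOS].y, c[1], v[0]; */
    o.hpos.z = dp4(c[2], v0);  /* DP4 o[HPOS].z, c[2], v[0]; */
    o.hpos.w = dp4(c[3], v0);  /* DP4 o[HPOS].w, c[3], v[0]; */
    o.col0 = v3;               /* MOV o[COL0], v[3];         */
    return o;
}
```

If c[0]..c[3] hold the rows of a 4x4 matrix, the four DP4 instructions together compute a matrix-vector product, which is why one row per constant register is a convenient layout.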
Programming Model
• Vertex source: v[0] … v[15] (16x4 registers)
• Program constants: c[0] … c[95] (96x4 registers)
• Temporary registers: R0 … R11 (12x4 registers)
• Vertex program: up to 128 instructions
• Vertex output: o[HPOS], o[COL0], o[COL1], o[FOGP], o[PSIZ], o[TEX0] … o[TEX7] (15x4 registers)
• All registers are quad floats
Instruction Set: The ops
• 17 instructions total
  • MOV, MUL, ADD, MAD, DST
  • DP3, DP4
  • MIN, MAX, SLT, SGE
  • RCP, RSQ, LOG, EXP, LIT
  • ARL
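To give a feel for a few of these ops, here is a small sketch in plain C of MAD, DP3, SLT, and SGE acting on four-component registers. It is an illustration only: the type `vec4` and the `op_*` function names are hypothetical, and write masks, swizzles, and negation are left out.

```c
/* A four-component register ("quad float"). */
typedef struct { float x, y, z, w; } vec4;

/* MAD: component-wise multiply-add, a * b + c. */
vec4 op_mad(vec4 a, vec4 b, vec4 c) {
    vec4 r = { a.x * b.x + c.x, a.y * b.y + c.y,
               a.z * b.z + c.z, a.w * b.w + c.w };
    return r;
}

/* DP3: three-component dot product, replicated to every component. */
vec4 op_dp3(vec4 a, vec4 b) {
    float d = a.x * b.x + a.y * b.y + a.z * b.z;
    vec4 r = { d, d, d, d };
    return r;
}

/* SLT: per component, 1.0 where a < b, otherwise 0.0. */
vec4 op_slt(vec4 a, vec4 b) {
    vec4 r = { a.x < b.x ? 1.0f : 0.0f, a.y < b.y ? 1.0f : 0.0f,
               a.z < b.z ? 1.0f : 0.0f, a.w < b.w ? 1.0f : 0.0f };
    return r;
}

/* SGE: per component, 1.0 where a >= b, otherwise 0.0. */
vec4 op_sge(vec4 a, vec4 b) {
    vec4 r = { a.x >= b.x ? 1.0f : 0.0f, a.y >= b.y ? 1.0f : 0.0f,
               a.z >= b.z ? 1.0f : 0.0f, a.w >= b.w ? 1.0f : 0.0f };
    return r;
}
```

SLT and SGE produce 0.0 or 1.0 rather than a boolean, so their results can be multiplied into other values to select between alternatives without a branch instruction.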
Pixel Shader
• User-defined per-pixel shading
[Diagram: pipeline of transform & lighting, setup rasterizer, texture blending, frame-buffer anti-aliasing, with the pixel shader providing the per-pixel stage]
Texture Mapping/Blending
• Traditional OpenGL texture mapping/blending
[Diagram: vertex colors -> Gouraud shading -> fragment color; fragment color and texture coordinate -> texture unit -> blend colors -> fragment color output]
Multitexturing
• An optional extension of OpenGL 1.2
[Diagram: fragment color input -> a chain of four texture unit / blend colors stages -> fragment color output]
Texture Compositing
• OpenGL 1.2
[Diagram: texture fetching supplies Tex0 and Tex1; fragment color -> Texture Environment 0 -> Texture Environment 1 -> specular color sum (specular color) -> fog application (fog color/factor)]
Compositing Operator
• Choice of 5 set functions for RGB and alpha
  • Ct: texture color; At: texture alpha
  • Cf: incoming fragment color; Af: incoming fragment alpha
  • Cc: color assigned to GL_TEXTURE_ENV_COLOR
• Post-environment specular color addition and fog application
Pixel Shader (cont’d)
• Based on nVIDIA’s GF3/4 architecture
• Texture shader
  • 4 texture units
  • 23 different texture shader operations
    • Conventional (1D, 2D, 3D, texture rectangle, cube map)
    • Special case (none, pass through, cull fragment)
    • Dependent texture fetches (the result of one texture lookup affects the texture coords for a subsequent unit)
    • Dependent texture fetches with dot product (and optional reflection) calculations
• Register combiners
  • 8 stages (general combiners) on GeForce3/4
  • Per-stage constants
Pixel Shader
• Based on nVIDIA’s GF3/4 architecture
• Texture shader + register combiner
[Diagram: fragment color input -> texture shader (texture units 0 to 3, each running a texture program) -> register combiner -> fragment color output]
Texture Shader
• Texture program example: conventional 2D texture
  • Tex #: i
  • Texture coords (S,T,R,Q): (Si,Ti,Ri,Qi)
  • Shader operation: Texture 2D
  • Texture fetch: at (Si/Qi, Ti/Qi)
  • Bound texture target/format: 2D, any format
  • Output color: (R,G,B,A)
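The fetch location in the 2D example is the projective divide of S and T by Q. A minimal sketch of that step in plain C follows; `texcoord2` and `project_2d` are hypothetical names, and the texel lookup and filtering that would follow are not modeled.

```c
/* 2D lookup location after the projective divide. */
typedef struct { float s, t; } texcoord2;

/* Conventional 2D texture: the lookup happens at (S/Q, T/Q); R is not used.
   The caller is assumed to pass a non-zero Q. */
texcoord2 project_2d(float S, float T, float R, float Q) {
    (void)R;  /* unused by the 2D operation */
    texcoord2 tc = { S / Q, T / Q };
    return tc;
}
```

With Q equal to 1 the coordinates pass through unchanged, which is the common non-projective case.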
Texture Shader (cont’d)
• Texture program example: pass through
  • Texture coords (S,T,R,Q): (Si,Ti,Ri,Qi)
  • Shader operations: R = Clamp0to1(Si), G = Clamp0to1(Ti), B = Clamp0to1(Ri), A = Clamp0to1(Qi)
  • Texture fetch: none
  • Bound texture target/format: none
  • Output color: (R,G,B,A)
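The pass-through operation involves no texture at all: each texture coordinate is clamped and becomes a color channel. A short sketch in plain C, with `rgba` and `pass_through` as hypothetical names:

```c
/* Output color of a texture shader stage. */
typedef struct { float r, g, b, a; } rgba;

/* Clamp a value to the range [0, 1]. */
float clamp0to1(float x) {
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

/* Pass through: (S,T,R,Q) becomes (R,G,B,A) after clamping, with no fetch. */
rgba pass_through(float s, float t, float r, float q) {
    rgba out = { clamp0to1(s), clamp0to1(t), clamp0to1(r), clamp0to1(q) };
    return out;
}
```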
Texture Shader (cont’d)
• Texture program example: dependent texture
  • Tex 0
    • Texture coords (S,T,R,Q): app specific
    • Shader operations: texture specific
    • Texture fetch: texture specific
    • Bound texture target/format: any type, unsigned RGB[A]
    • Output: R0G0B0A0
  • Tex 1
    • Texture coords (S,T,R,Q): ignored
    • Shader operations: none
    • Texture fetch: at (A0,R0)
    • Bound texture target/format: 2D, RGBA
    • Output: R1G1B1A1
Register Combiner
• GeForce 2 (only 2 general combiner stages)
[Diagram: texture fetching fills a register set that holds Texture 0, Texture 1, fragment color, specular color, fog color/factor, and Spare 0. General Combiner 0 and General Combiner 1 each take 4 RGB inputs and 4 alpha inputs and produce 3 RGB outputs and 3 alpha outputs. The final combiner takes 6 RGB inputs and 1 alpha input.]
Register Combiner (cont’d)
• Register-based programming
• All textures and colors are available for each and every texture blending stage
• 8 stages of blending in hardware, plus specular and fog
  • Note that GeForce3 has 8 combiners and 4 textures
• Signed color arithmetic
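Diagram of a General Combiner
[Diagram: RGB portion: inputs A, B, C, D are read from RGB or alpha registers through input mappings; the RGB function computes A op1 B, C op2 D, and AB op3 CD; the results pass through RGB scale/bias into the next combiner’s RGB registers. Alpha portion: inputs A, B, C, D are read from alpha or blue registers through input mappings; the alpha function computes AB, CD, and AB op4 CD; the results pass through alpha scale/bias into the next combiner’s alpha registers.]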
General Combiner Input Registers
[Diagram: the general combiner diagram, repeated]
The Register Set
• Primary (diffuse) color
  • initialized to RGBA of fragment’s primary color
• Secondary (specular) color
  • initialized to RGB of fragment’s secondary/specular color
  • alpha not initialized
• Texture 0 and Texture 1 colors
  • initialized to fragment’s filtered RGBA texel from numbered texture unit
  • not initialized if numbered texture unit is disabled or non-existent
• Spare 0 and Spare 1
  • Alpha of Spare 0 is initialized to alpha of Texture 0 color (if enabled)
  • RGB of Spare 0 and all of Spare 1 are not initialized
• Fog
  • RGB is current fog color
  • alpha is fragment’s fog factor (only available in final combiner)
  • read-only
• Constant color 0 and Constant color 1
  • initialized to user-defined RGBA value
  • read-only
• Zero
  • constant, read-only value of zero
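The initialization rules above can be written out as a data structure. This is a rough sketch in plain C under stated assumptions: `register_set`, `fragment_inputs`, and `init_register_set` are hypothetical names, and registers the slide calls "not initialized" are zero-filled here only so that the sketch is deterministic; real hardware makes no such promise.

```c
typedef struct { float r, g, b, a; } rgba;

/* The combiner register set as listed on the slide. */
typedef struct {
    rgba primary, secondary;
    rgba texture0, texture1;
    rgba spare0, spare1;
    rgba fog;                   /* read-only in the combiners */
    rgba constant0, constant1;  /* read-only in the combiners */
    rgba zero;                  /* read-only, always zero     */
} register_set;

/* Hypothetical bundle of per-fragment and user-defined values. */
typedef struct {
    rgba primary, secondary;
    rgba texel0, texel1;
    int tex0_enabled, tex1_enabled;
    rgba fog_color;
    float fog_factor;
    rgba constant0, constant1;
} fragment_inputs;

register_set init_register_set(const fragment_inputs *in) {
    register_set rs = {0};  /* zero-fill stands in for "not initialized" */

    rs.primary = in->primary;          /* RGBA of primary color */
    rs.secondary.r = in->secondary.r;  /* RGB only; alpha is left alone */
    rs.secondary.g = in->secondary.g;
    rs.secondary.b = in->secondary.b;

    if (in->tex0_enabled) {
        rs.texture0 = in->texel0;
        rs.spare0.a = in->texel0.a;    /* Spare 0 alpha from Texture 0 alpha */
    }
    if (in->tex1_enabled) {
        rs.texture1 = in->texel1;
    }

    rs.fog.r = in->fog_color.r;        /* RGB is the current fog color */
    rs.fog.g = in->fog_color.g;
    rs.fog.b = in->fog_color.b;
    rs.fog.a = in->fog_factor;         /* alpha is the fog factor */

    rs.constant0 = in->constant0;
    rs.constant1 = in->constant1;
    return rs;
}
```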
General Combiner Input Mappings
[Diagram: the general combiner diagram, repeated]
General Combiner RGB Function
[Diagram: the general combiner diagram, repeated]
General Combiner RGB Functions
• Dot / Dot / Discard: outputs A • B and C • D
• Dot / Mult / Discard: outputs A • B and CD
• Mult / Dot / Discard: outputs AB and C • D
• Mult / Mult / Mux: outputs AB, CD, and mux(AB, CD)
• Mult / Mult / Sum: outputs AB, CD, and AB + CD
• mux(AB, CD) selects AB or CD according to how Spare0[Alpha] compares with ½
• Dot products on RGB registers replicate the scalar result to all three components:
  A • B = (A[red] * B[red] + A[green] * B[green] + A[blue] * B[blue],
           A[red] * B[red] + A[green] * B[green] + A[blue] * B[blue],
           A[red] * B[red] + A[green] * B[green] + A[blue] * B[blue])
• Multiplication on RGB registers is component-wise:
  AB = (A[red] * B[red], A[green] * B[green], A[blue] * B[blue])
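The dot product, multiplication, and sum on RGB registers can be sketched in a few lines of plain C. The names `rgb`, `rgb_dot`, `rgb_mul`, and `rgb_sum` are hypothetical; input mappings, scale/bias, and the mux case are left out of this sketch.

```c
/* An RGB register; components may be negative (signed color arithmetic). */
typedef struct { float r, g, b; } rgb;

/* A • B: scalar dot product replicated into all three components. */
rgb rgb_dot(rgb a, rgb b) {
    float d = a.r * b.r + a.g * b.g + a.b * b.b;
    rgb out = { d, d, d };
    return out;
}

/* AB: component-wise multiplication. */
rgb rgb_mul(rgb a, rgb b) {
    rgb out = { a.r * b.r, a.g * b.g, a.b * b.b };
    return out;
}

/* AB + CD: the third output in Mult / Mult / Sum mode. */
rgb rgb_sum(rgb a, rgb b, rgb c, rgb d) {
    rgb ab = rgb_mul(a, b);
    rgb cd = rgb_mul(c, d);
    rgb out = { ab.r + cd.r, ab.g + cd.g, ab.b + cd.b };
    return out;
}
```

Because the dot product lands in every component, a result such as a per-pixel N • L term can be multiplied directly with a color in a later stage.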
Diagram of the Final Combiner (OpenGL only)
[Diagram: RGB portion: inputs A, B, C, D, E, F are read from RGB or alpha registers through input mappings; a multiplier forms EF, and a color sum unit forms Spare0 + secondary color, clamped to [0, 1]; both are available as RGB inputs; RGB out = AB + (1-A)C + D. Alpha portion: one input, read from an alpha or blue register through an input mapping, gives alpha out.]
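The final combiner's RGB output, AB + (1-A)C + D, is a blend of B and C weighted by A, with D added on top. A minimal sketch in plain C follows; `rgb`, `final_channel`, and `final_combiner_rgb` are hypothetical names, and input mappings and any clamping of the result are not modeled.

```c
typedef struct { float r, g, b; } rgb;

/* One channel of the final combiner: a*b + (1 - a)*c + d. */
float final_channel(float a, float b, float c, float d) {
    return a * b + (1.0f - a) * c + d;
}

/* RGB out = AB + (1-A)C + D, evaluated per component. */
rgb final_combiner_rgb(rgb A, rgb B, rgb C, rgb D) {
    rgb out = {
        final_channel(A.r, B.r, C.r, D.r),
        final_channel(A.g, B.g, C.g, D.g),
        final_channel(A.b, B.b, C.b, D.b)
    };
    return out;
}
```

One way to read the formula: with A set to a fog factor, B to the shaded color, and C to the fog color, the first two terms form a fog blend, and D can carry an additive term such as specular.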