Pixel Shader
• Based on nVIDIA’s GF3/4 architecture
• Texture shader + register combiner
[Diagram: fragment color input; texture shader block containing texture units 0 to 3, each with its own texture program; register combiner; fragment color output]
Texture Shader “Bridge”
• Interpolated texture coordinate sets (s0,t0,r0,q0), (s1,t1,r1,q1), (s2,t2,r2,q2), (s3,t3,r3,q3): 32-bit IEEE floating-point per component
• RGBA colors (R0,G0,B0,A0), (R1,G1,B1,A1), (R2,G2,B2,A2), (R3,G3,B3,A3): 8-bit [0,1] or [-1,1) fixed-point per component
[Diagram: texture shader and texture fetch units, with state, bridging the coordinate sets and the colors]
Texture Shaders
• Provide a superset of conventional OpenGL texture addressing
• Five main categories of shader operations
  • Conventional textures: 1D, 2D, texture rectangle, cube map
  • Special modes: none, pass through, cull fragment
  • Direct dependent textures: dependent AR, dependent GB, offset, offset scaled
  • Dot product (DP) dependent textures: DP 2D, DP texture rectangle, DP cube map, DP reflect cube map, DP diffuse cube map
  • Depth replace operations
Conventional Textures
• Texture program: 2D texture mapping

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| i | (Si,Ti,Ri,Qi) | (Si/Qi, Ti/Qi) | Texture 2D | 2D, any format | (R,G,B,A) |
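The shader operation on this slide is a projective divide of S and T by Q. A minimal Python sketch of that arithmetic follows; the function name `project_2d` is hypothetical, and the sketch models only the coordinate math, not the texture fetch or the hardware.

```python
def project_2d(s, t, r, q):
    """Return the (s/q, t/q) pair that addresses the bound 2D texture.

    r is accepted so the signature mirrors the (S,T,R,Q) coordinate set,
    but a 2D lookup does not use it.
    """
    return (s / q, t / q)


# Hypothetical coordinate set for illustration.
print(project_2d(1.0, 0.5, 0.0, 2.0))
```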
Texture Rectangle
• Texture program: texture rectangle (texture dimensions need not be powers of 2)

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| i | (Si,Ti,Ri,Qi) | (Si/Qi, Ti/Qi) | Texture Rectangle | Texture rectangle, any format | (R,G,B,A) |
Texture Cube Map

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| i | (Si,Ti,Ri) | U = (Si, Ti, Ri) | Texture Cube Map, addressed by U | Cube map, any format | (R,G,B,A) |
Special Modes
• Texture program: pass through

| Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|
| (Si,Ti,Ri,Qi) | R = Clamp0to1(Si); G = Clamp0to1(Ti); B = Clamp0to1(Ri); A = Clamp0to1(Qi) | None | None | (R,G,B,A) |
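The pass-through program turns the texture coordinates directly into a color by clamping each one to [0, 1]. A small Python sketch of that mapping follows; `clamp0to1` mirrors the slide's Clamp0to1, while `pass_through` is a hypothetical name chosen for illustration.

```python
def clamp0to1(x):
    """Clamp a value to the closed range [0, 1]."""
    return max(0.0, min(1.0, x))


def pass_through(s, t, r, q):
    """Map (S,T,R,Q) to (R,G,B,A) by clamping each coordinate to [0, 1]."""
    return (clamp0to1(s), clamp0to1(t), clamp0to1(r), clamp0to1(q))


# Out-of-range coordinates are clamped; in-range ones pass unchanged.
print(pass_through(-0.5, 0.25, 1.5, 1.0))
```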
Special Modes
• Texture program: none

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| i | Ignored | R = 0; G = 0; B = 0; A = 0 | None | None | (R,G,B,A) |
Special Modes
• Texture program: Cull Fragment
• Cull the fragment based upon sign of texture coords
  • each tex coord (STRQ) has its own settable condition
  • each of the 4 conditions is set to one of the following:
    • GL_GEQUAL (tex coord ≥ 0): pass iff positive or zero
    • GL_LESS (tex coord < 0): pass iff negative
  • all four tex coords are tested
  • if any of the four fail, the fragment is rejected
• Texture output for passing fragments is (0,0,0,0)
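The cull test can be sketched in Python as follows. The string constants stand in for the OpenGL enums of the same name, and `cull_fragment` and its `None` return value for rejected fragments are hypothetical conventions for this sketch only.

```python
GL_GEQUAL = "GL_GEQUAL"  # pass iff the coordinate is >= 0
GL_LESS = "GL_LESS"      # pass iff the coordinate is < 0


def cull_fragment(coords, conditions):
    """Apply one condition per texture coordinate (S, T, R, Q).

    Returns the texture output (0,0,0,0) when all four tests pass,
    or None when any test fails and the fragment is rejected.
    """
    for value, condition in zip(coords, conditions):
        if condition == GL_GEQUAL:
            passed = value >= 0.0
        elif condition == GL_LESS:
            passed = value < 0.0
        else:
            raise ValueError("unknown cull condition: %r" % (condition,))
        if not passed:
            return None
    return (0.0, 0.0, 0.0, 0.0)


conditions = (GL_GEQUAL, GL_GEQUAL, GL_LESS, GL_GEQUAL)
print(cull_fragment((1.0, 0.0, -2.0, 3.0), conditions))  # passes
print(cull_fragment((1.0, 0.0, 2.0, 3.0), conditions))   # R fails GL_LESS
```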
Cull Fragment Applications • Per-fragment clip planes • Up to 4 clip planes per texture unit • 16 clip planes maximum • Non-planar clipping approaches also possible • Vertex programs can compute a distance to a point or line and use that interpolated distance for clipping
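One way to picture these applications is to compute a signed value per vertex and let the cull test reject fragments where it goes negative. The Python sketch below shows a plane distance and a distance-to-point variant. All function names are hypothetical, interpolation across the primitive is left out, and the "radius minus distance" formulation is an assumption made for illustration.

```python
import math


def plane_coord(plane, point):
    """Signed distance-like value a*x + b*y + c*z + d for plane (a, b, c, d)."""
    a, b, c, d = plane
    x, y, z = point
    return a * x + b * y + c * z + d


def sphere_coord(center, radius, point):
    """Positive inside a sphere around center, negative outside (assumed form)."""
    dx, dy, dz = (p - c for p, c in zip(point, center))
    return radius - math.sqrt(dx * dx + dy * dy + dz * dz)


def survives(coords):
    """Model a cull test where every coordinate uses the 'tex coord >= 0' condition."""
    return all(value >= 0.0 for value in coords)


# Clip against the plane y = 0 and a sphere of radius 5 around the origin.
point = (3.0, 4.0, 0.0)
coords = (plane_coord((0.0, 1.0, 0.0, 0.0), point),
          sphere_coord((0.0, 0.0, 0.0), 5.0, point))
print(coords, survives(coords))
```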
Cull Fragment Examples [Figures: clipping a model to two texture shader clip planes; clipping a 3D grid of cubes based on distance from a point]
Dependent Texture Shaders • Take results of one texture, use them for addressing subsequent texture • Simple dependent textures (single stage) • Dependent alpha-red • Dependent green blue • Offset texture 2D • Offset texture 2D scaled
Dependent Alpha-Red Texturing

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | App specific | Texture specific | Texture specific | Any type, unsigned RGBA | R0G0B0A0 |
| 1 | Ignored | None | (A0,R0) | 2D RGBA | R1G1B1A1 |
Dependent Green-Blue Texturing

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | App specific | Texture specific | Texture specific | Any type, unsigned RGB[A] | R0G0B0A0 |
| 1 | Ignored | None | (G0,B0) | 2D RGBA | R1G1B1A1 |
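Both dependent modes reuse two components of the stage 0 color as the 2D coordinates for stage 1. A rough Python sketch follows. The nearest-neighbor `sample_nearest` helper, the list-of-rows texture layout, and the function names are all assumptions for illustration and say nothing about how the hardware filters.

```python
def sample_nearest(texture, u, v):
    """Nearest-neighbor lookup in a texture stored as rows of RGBA tuples.

    u selects the column and v the row; both are clamped to the texture edge.
    """
    height = len(texture)
    width = len(texture[0])
    x = min(width - 1, max(0, int(u * width)))
    y = min(height - 1, max(0, int(v * height)))
    return texture[y][x]


def dependent_ar(stage0_rgba, texture1):
    """Address stage 1 with (A0, R0) from the stage 0 result."""
    r0, _g0, _b0, a0 = stage0_rgba
    return sample_nearest(texture1, a0, r0)


def dependent_gb(stage0_rgba, texture1):
    """Address stage 1 with (G0, B0) from the stage 0 result."""
    _r0, g0, b0, _a0 = stage0_rgba
    return sample_nearest(texture1, g0, b0)


# Hypothetical 2x2 stage 1 texture and stage 0 color (R, G, B, A).
texture1 = [[(0.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, 1.0)],
            [(0.0, 1.0, 0.0, 1.0), (1.0, 1.0, 0.0, 1.0)]]
stage0 = (0.75, 0.25, 0.25, 0.25)
print(dependent_ar(stage0, texture1))
print(dependent_gb(stage0, texture1))
```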
Offset Texture 2D • Use previous lookup (a signed 2D offset) to perturb the texture coordinates of a subsequent (non-projective) 2D texture lookup • Signed 2D offset is transformed by user-defined 2x2 matrix (shown in the following diagrams as constants k0-k3) • This 2x2 constant matrix allows for arbitrary rotation/scaling of offset vector • Offset defined in DS/DT texture
Offset Texture 2D

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | (S0,T0,R0,Q0) | (S0/Q0, T0/Q0) | Texture 2D | 2D DSDT | (0,0,0,0) |
| 1 | (S1,T1) | S1' = S1 + k0*ds + k2*dt; T1' = T1 + k1*ds + k3*dt | (S1', T1') | 2D, any format | R1G1B1A1 |

• The (ds,dt) value fetched by texture 0 feeds the shader operation of texture 1
• k0, k1, k2 and k3 define a constant 2x2 floating-point matrix set by glTexEnv
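The coordinate perturbation on this slide can be written out directly. The Python sketch below follows the slide's formulas for S1' and T1'; the function name `offset_texture_2d` and the tuple layout of `k` are hypothetical.

```python
def offset_texture_2d(s1, t1, ds, dt, k):
    """Perturb (s1, t1) by the signed offset (ds, dt) transformed by k.

    k is (k0, k1, k2, k3), the constant 2x2 matrix from the slide:
        s1' = s1 + k0*ds + k2*dt
        t1' = t1 + k1*ds + k3*dt
    """
    k0, k1, k2, k3 = k
    return (s1 + k0 * ds + k2 * dt, t1 + k1 * ds + k3 * dt)


# Identity matrix: the offset is added unchanged.
print(offset_texture_2d(0.5, 0.5, 0.25, -0.25, (1.0, 0.0, 0.0, 1.0)))
# Uniform scale by 0.5: the offset is halved before being added.
print(offset_texture_2d(0.5, 0.5, 0.25, -0.25, (0.5, 0.0, 0.0, 0.5)))
```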
Offset Texture 2D Scale • Same as Offset Texture 2D, except that subsequent (non-projective) 2D texture RGB output is scaled • Scaling factor is the MAG component (from previous texture) scaled/biased by user-defined constants (kscale and kbias in the following diagrams) • Alpha component is NOT scaled
Offset Texture 2D Scale

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | (S0,T0,R0,Q0) | (S0/Q0, T0/Q0) | Texture 2D | 2D DSDT_Mag | (0,0,0,0) |
| 1 | (S1,T1) | S1' = S1 + k0*ds + k2*dt; T1' = T1 + k1*ds + k3*dt | (S1', T1') | 2D RGBA | (R*M, G*M, B*M, A), where M = kscale * mag + kbias |

• The (ds,dt,mag) value fetched by texture 0 feeds the shader operation and output scaling of texture 1
• k0, k1, k2 and k3 define a constant 2x2 floating-point matrix set by glTexEnv
• kscale and kbias define a constant floating-point scale/bias set by glTexEnv
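The extra step in the scaled variant is the RGB scaling by M. A short Python sketch of that step follows; the name `scale_offset_output` is hypothetical, the coordinate perturbation is the same as in plain Offset Texture 2D and is omitted here, and no clamping is applied because the slide does not specify any.

```python
def scale_offset_output(rgba, mag, kscale, kbias):
    """Scale the fetched RGB by M = kscale * mag + kbias; alpha is not scaled."""
    m = kscale * mag + kbias
    r, g, b, a = rgba
    return (r * m, g * m, b * m, a)


# Hypothetical texel and constants for illustration.
print(scale_offset_output((1.0, 0.5, 0.25, 1.0), 0.5, 1.0, 0.25))
```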
Dot Product Dependent Texture Shaders • Take results of one texture, perform 2 or 3 dot products with it and incoming texture coordinates, then use results for addressing subsequent texture(s) • Multiple contiguous stages, not including source texture
Dot Product Texture 2D

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | App specific | Texture specific | Texture specific | Any type, signed RGB[A] | R0G0B0 |
| 1 | (S1,T1,R1) | Ux = [S1,T1,R1] • [R0,G0,B0] | None | None | (0,0,0,0) |
| 2 | (S2,T2,R2) | Uy = [S2,T2,R2] • [R0,G0,B0] | (Ux,Uy) | 2D RGBA | R2G2B2A2 |
Dot Product Texture 2D Application • High-quality bump-mapping • 2D HILO texture stores normals • Per-fragment tangent-space normal, N • Vertex program supplies tangent-space light (L) and half-angle (H) vectors in (s,t,r) texture coordinates • Two dot products compute • Diffuse L dot N • Specular H dot N • Illumination stored in 2D texture accessed by L dot N and H dot N • Excellent specular appearance
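The two dot products described on this slide can be sketched in Python. Here N stands for the normal fetched by the first stage, and L and H arrive as the (s,t,r) coordinates of the next two stages. The function names are hypothetical, and the lookup into the illumination texture itself is left out.

```python
def dot3(a, b):
    """Three-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def bump_lookup_coords(normal, light, half_angle):
    """Return (L dot N, H dot N), the pair that addresses the illumination texture."""
    return (dot3(light, normal), dot3(half_angle, normal))


# Hypothetical tangent-space vectors for one fragment.
n = (0.0, 0.0, 1.0)
l = (0.0, 0.6, 0.8)
h = (0.0, 0.0, 1.0)
print(bump_lookup_coords(n, l, h))
```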
Dot Product Texture 2D Bump Mapping [Figures: HILO normal map; bump mapping the Holy Grail]
Dot Product Texture 3D

| Tex # | Texture Coords (S,T,R,Q) | Shader Operations | Texture Fetch | Bound Texture Target/Format | Output Color |
|---|---|---|---|---|---|
| 0 | App specific | Texture specific | Texture specific | Any type, signed RGB[A] | R0G0B0 |
| 1 | (S1,T1,R1) | S = [S1,T1,R1] • [R0,G0,B0] | None | None | (0,0,0,0) |
| 2 | (S2,T2,R2) | T = [S2,T2,R2] • [R0,G0,B0] | None | None | (0,0,0,0) |
| 3 | (S3,T3,R3) | R = [S3,T3,R3] • [R0,G0,B0] | (S,T,R) | 3D RGBA | R3G3B3A3 |
Texture Shader Precision • Interpolated texture coordinates are IEEE 32-bit floating-point values • Texture projections, dot products, texture offset, post-texture offset scaling, reflection vector, and depth replace division computations are performed in IEEE 32-bit floating-point • HILO texture components are filtered as 16-bit values • DSDT, MAG, intensity, and color components are filtered as 8-bit values
Register Combiner
• GeForce 2 (only 2 general combiner stages)
[Diagram: a register set (fragment color, specular color, fog color/factor, texture 0, texture 1, spare 0), with texture 0 and texture 1 supplied by texture fetching; general combiners 0 and 1, each with 4 RGB inputs, 4 alpha inputs, 3 RGB outputs and 3 alpha outputs; a final combiner with 6 RGB inputs and 1 alpha input]
Register Combiner (cont’d) • Register-based programming • All textures and colors available for each and every texture blending stage • 8 Stages of blending in hardware, plus specular and fog • Note that GeForce3 has 8 combiners, and 4 textures. • Signed color arithmetic
Diagram of a General Combiner [Diagram: two parallel portions. RGB portion: input RGB and alpha registers, input mappings (A, B, C, D), RGB function (A op1 B, C op2 D, AB op3 CD), RGB scale/bias, next combiner’s RGB registers. Alpha portion: input alpha and blue registers, input mappings (A, B, C, D), alpha function (AB, CD, AB op4 CD), alpha scale/bias, next combiner’s alpha registers.]
General Combiner Input Registers [Diagram: general combiner diagram, repeated]
The Register Set
• Primary (diffuse) color
  • initialized to RGBA of fragment’s primary color
• Secondary (specular) color
  • initialized to RGB of fragment’s secondary/specular color
  • alpha not initialized
• Texture 0 and Texture 1 colors
  • initialized to fragment’s filtered RGBA texel from numbered texture unit
  • not initialized if numbered texture unit is disabled or non-existent
• Spare 0 and Spare 1
  • Alpha of Spare 0 is initialized to alpha of Texture 0 color (if enabled)
  • RGB of Spare 0 and all of Spare 1 are not initialized
• Fog
  • RGB is current fog color
  • alpha is fragment’s fog factor (only available in final combiner)
  • read-only
• Constant color 0 and Constant color 1
  • initialized to user-defined RGBA value
  • read-only
• Zero
  • constant, read-only value of zero
General Combiner Input Mappings [Diagram: general combiner diagram, repeated]
General Combiner RGB Function [Diagram: general combiner diagram, repeated]
General Combiner RGB Functions

| Function | First output | Second output | Third output |
|---|---|---|---|
| Dot / Dot / Discard | A • B | C • D | discarded |
| Dot / Mult / Discard | A • B | CD | discarded |
| Mult / Dot / Discard | AB | C • D | discarded |
| Mult / Mult / Mux | AB | CD | mux(AB, CD) |
| Mult / Mult / Sum | AB | CD | AB + CD |

• mux(AB, CD) = (Spare0[Alpha] ½) ? AB : CD
• Dot products on RGB registers: A • B = (d, d, d), where d = A[red] * B[red] + A[green] * B[green] + A[blue] * B[blue]
• Multiplication on RGB registers: AB = (A[red] * B[red], A[green] * B[green], A[blue] * B[blue])
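The dot, multiply and sum functions on this slide can be sketched in Python. The function names are hypothetical, `None` is used here to mark a discarded third output, and the mux function is left out of the sketch. Scale/bias and clamping happen in a later stage and are not modeled.

```python
def mult(a, b):
    """Component-wise product of two RGB triples."""
    return tuple(x * y for x, y in zip(a, b))


def dot(a, b):
    """Dot product replicated into all three RGB components."""
    d = sum(x * y for x, y in zip(a, b))
    return (d, d, d)


def mult_mult_sum(a, b, c, d):
    """Mult / Mult / Sum: returns (AB, CD, AB + CD)."""
    ab, cd = mult(a, b), mult(c, d)
    return ab, cd, tuple(x + y for x, y in zip(ab, cd))


def dot_dot_discard(a, b, c, d):
    """Dot / Dot / Discard: returns (A.B, C.D, None)."""
    return dot(a, b), dot(c, d), None


a = (1.0, 0.5, 0.0)
b = (0.5, 0.5, 0.5)
print(mult_mult_sum(a, b, a, b))
print(dot_dot_discard(a, b, a, b))
```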
General Combiner Alpha Function [Diagram: general combiner diagram, repeated]
General Combiner Alpha Functions

| Function | First output | Second output | Third output |
|---|---|---|---|
| Mult / Mult / Mux | AB | CD | mux(AB, CD) |
| Mult / Mult / Sum | AB | CD | AB + CD |

• mux(AB, CD) = (Spare0[alpha] ½) ? AB : CD
General Combiner Scale and Bias [Diagram: general combiner diagram, repeated]
General Combiner Scale and Bias Options
• Scale and bias operation is defined as: ClampNegativeOneToOne(Scale * (x + Bias)), or equivalently max(min(Scale * (x + Bias), 1), -1)
• Scale options: scale by ½, no scale, scale by 2, scale by 4
• Bias options: no bias, bias by –½
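The scale and bias formula on this slide translates directly into code. The Python sketch below uses a hypothetical function name and checks the arguments against the options the slide lists; whether every scale/bias pairing is legal in hardware is not something this sketch tries to model.

```python
ALLOWED_SCALES = (0.5, 1.0, 2.0, 4.0)
ALLOWED_BIASES = (0.0, -0.5)


def scale_and_bias(x, scale=1.0, bias=0.0):
    """Compute max(min(scale * (x + bias), 1), -1) for one component."""
    if scale not in ALLOWED_SCALES:
        raise ValueError("scale must be one of %r" % (ALLOWED_SCALES,))
    if bias not in ALLOWED_BIASES:
        raise ValueError("bias must be one of %r" % (ALLOWED_BIASES,))
    return max(min(scale * (x + bias), 1.0), -1.0)


print(scale_and_bias(0.75, scale=2.0, bias=-0.5))  # (0.75 - 0.5) * 2
print(scale_and_bias(1.0, scale=4.0))              # clamped at the top
print(scale_and_bias(0.0, scale=4.0, bias=-0.5))   # clamped at the bottom
```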
General Combiner Output Registers [Diagram: general combiner diagram, repeated]
General Combiner Output Registers • Up to six outputs can be specified per general combiner: • three RGB outputs (A op1 B, C op2 D, AB op3 CD) written to RGB portion of writable registers • three Alpha outputs (AB, CD, AB op4 CD) written to Alpha portion of writable registers • RGB outputs must be written to distinct registers (that is, two outputs cannot be written to one register) • Alpha outputs must be written to distinct registers • Any output can be discarded • Those RGB functions performing dot products must discard the third result (Dot/Dot/Discard, Dot/Mult/Discard, Mult/Dot/Discard)
Diagram of the Final Combiner (OpenGL only) [Diagram: RGB portion: input RGB and alpha registers plus the other available RGB inputs pass through input mappings to A, B, C, D and to the EF multiplier inputs E and F; the RGB function AB + (1-A)C + D produces RGB out. A color sum unit adds Spare0 and the secondary color, with an optional clamp to [0, 1]. Alpha portion: input alpha and blue registers pass through an input mapping to produce alpha out.]
Final Combiner Input Registers [Diagram: final combiner diagram, repeated]
Final Combiner Input Mappings [Diagram: final combiner diagram, repeated]
Final Combiner EF Multiplier [Diagram: final combiner diagram, repeated]
Final Combiner EF Multiplier • Multiplication on RGB registers: EF = (E[red] * F[red], E[green] * F[green], E[blue] * F[blue])
Final Combiner Color Sum and Optional Clamp [Diagram: final combiner diagram, repeated]
Final Combiner Color Sum and Optional Clamp
• The color sum unit computes Clamp(Spare0) + Clamp(SecondaryColor), followed by an optional clamp to [0, 1]
• Inputs to this unit are hardwired to the Spare0 and SecondaryColor registers
• Each input to the color sum unit undergoes an unsigned identity mapping before addition
• Unsigned identity = max(0, x), clamping negative input values to 0
• Addition on RGB registers: Spare0 + SecondaryColor = (Spare0[red] + SecondaryColor[red], Spare0[green] + SecondaryColor[green], Spare0[blue] + SecondaryColor[blue])
• Output range for the sum is [0, 2]
• Optional clamping unit clamps the sum to [0, 1]
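A small Python sketch of the color sum unit follows, using the unsigned identity mapping and the optional clamp described on this slide. The function names and the `clamp` flag are hypothetical.

```python
def unsigned_identity(x):
    """Clamp negative input values to 0, as the slide defines it."""
    return max(0.0, x)


def color_sum(spare0_rgb, secondary_rgb, clamp=True):
    """Add the mapped Spare0 and secondary color, then optionally clamp to [0, 1]."""
    total = tuple(unsigned_identity(a) + unsigned_identity(b)
                  for a, b in zip(spare0_rgb, secondary_rgb))
    if clamp:
        total = tuple(min(1.0, c) for c in total)
    return total


spare0 = (0.75, -0.5, 1.0)
secondary = (0.5, 0.25, 1.0)
print(color_sum(spare0, secondary, clamp=False))  # components may reach 2
print(color_sum(spare0, secondary))               # clamped to [0, 1]
```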
Final Combiner RGB Input Set [Diagram: final combiner diagram, repeated]
Final Combiner RGB Input Set • All registers available to all RGB function inputs (A, B, C and D) • Result of EF multiplier also available to all RGB function inputs • Result of (possibly clamped) color sum available to B, C and D function inputs (but not A) • Neither the result of the EF multiplier nor the color sum is available to the Alpha portion
Final Combiner RGB Function [Diagram: final combiner diagram, repeated; the RGB function is AB + (1-A)C + D]
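The final combiner's RGB function, AB + (1-A)C + D, can be sketched per component in Python along with the EF multiplier. The function names are hypothetical, and the fog-style input assignment in the usage lines is only an illustrative assumption about how the inputs might be chosen.

```python
def ef_multiply(e, f):
    """Component-wise product EF of two RGB triples."""
    return tuple(x * y for x, y in zip(e, f))


def final_combiner_rgb(a, b, c, d):
    """Evaluate AB + (1 - A)C + D for each RGB component."""
    return tuple(ai * bi + (1.0 - ai) * ci + di
                 for ai, bi, ci, di in zip(a, b, c, d))


# Illustrative blend: A weights B against C, and D adds nothing.
a = (0.25, 0.25, 0.25)
b = (1.0, 0.0, 0.0)
c = (0.5, 0.5, 0.5)
d = (0.0, 0.0, 0.0)
print(final_combiner_rgb(a, b, c, d))
print(ef_multiply((0.5, 1.0, 0.0), (0.5, 0.5, 1.0)))
```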