
GPU Shading and Rendering



Presentation Transcript


  1. GPU Shading and Rendering

  2. GPU Shading and Rendering: Introduction
     Marc Olano, UMBC

  3. GPU
     • GPU: Graphics Processing Unit
     • Designed for real-time graphics
     • Present in almost every PC
     • Increasing realism and complexity
     [Image: America's Army]

  4. GPU computation
     [Pipeline diagram labels: CPU; Texture / Buffer; Vertex; Geometry; Fragment; Displayed Pixels]

  5. Low-level code

     !!ARBvp1.0
     # Transform the normal to view space
     TEMP Nv,Np;
     DP3 Nv.x,state.matrix.modelview.invtrans.row[0],vertex.normal;
     DP3 Nv.y,state.matrix.modelview.invtrans.row[1],vertex.normal;
     DP3 Nv.z,state.matrix.modelview.invtrans.row[2],vertex.normal;
     MAD Np,Nv,{.9,.9,.9,0},{0,0,0,1};
     # screen position from vertex
     TEMP Vp;
     DP4 Vp.x, state.matrix.mvp.row[0], vertex.position;
     DP4 Vp.y, state.matrix.mvp.row[1], vertex.position;
     DP4 Vp.z, state.matrix.mvp.row[2], vertex.position;
     DP4 Vp.w, state.matrix.mvp.row[3], vertex.position;
     […]
     # interpolate
     MAD Np, Np, -vertex.color.x, Np;
     MAD result.position, Vp, vertex.color.x, Np;
     END
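The instructions on this slide can be read as ordinary vector arithmetic. The following is a minimal CPU-side sketch in Python, not shader code: it assumes DP3/DP4 are 3- and 4-component dot products and MAD is a component-wise multiply-add, so that four DP4s against the matrix rows amount to one matrix-vector product. The helper names and the sample matrix are hypothetical and chosen only for illustration.

```python
def dp3(a, b):
    """Three-component dot product, as DP3 computes."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def dp4(a, b):
    """Four-component dot product, as DP4 computes."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def mad(a, b, c):
    """Component-wise multiply-add, as MAD computes: a * b + c."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def transform_position(matrix_rows, position):
    """One DP4 per matrix row gives a full 4x4 matrix-vector product."""
    return [dp4(row, position) for row in matrix_rows]


def transform_normal(invtrans_rows, normal):
    """One DP3 per row; w is padded with 0 here because MAD zeroes it anyway."""
    return [dp3(row, normal) for row in invtrans_rows[:3]] + [0.0]


# Hypothetical inputs: a matrix that translates by (1, 2, 3).
mvp = [[1, 0, 0, 1], [0, 1, 0, 2], [0, 0, 1, 3], [0, 0, 0, 1]]
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

vp = transform_position(mvp, [1.0, 1.0, 1.0, 1.0])
nv = transform_normal(identity, [0.0, 0.0, 1.0])
# MAD Np,Nv,{.9,.9,.9,0},{0,0,0,1}: scale xyz by 0.9 and force w to 1.
np_ = mad(nv, [0.9, 0.9, 0.9, 0.0], [0.0, 0.0, 0.0, 1.0])
```

Writing it this way shows why the assembly needs one instruction per output component, which is the verbosity the high-level version on the next slide removes.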

  6. High-level code

     void main() {
         vec4 Kin = gl_Color;  // key input

         // screen position from vertex, texture and normal
         vec4 Vp = ftransform();
         vec4 Tp = vec4(gl_MultiTexCoord0.xy*1.8-.9, 0,1);
         vec4 Np = vec4(nn*.9,1);

         // interpolate between Vp, Tp and Np
         gl_Position = Vp;
         gl_Position = mix(Tp,gl_Position,pow(1.-Kin.x,8.));
         gl_Position = mix(Np,gl_Position,pow(1.-Kin.y,8.));

         // copy to output
         gl_TexCoord[0] = gl_MultiTexCoord0;
         gl_TexCoord[1] = Vp;
         gl_TexCoord[3] = Kin;
     }
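The interpolation step in this shader can be checked on the CPU. The sketch below is a Python emulation, assuming the standard GLSL definition mix(x, y, a) = x*(1-a) + y*a; the function name keyed_position and the sample vectors are hypothetical. It suggests that a key value of 0 leaves the vertex at Vp, while a key value of 1 moves it fully to Tp or Np.

```python
def mix(x, y, a):
    """GLSL-style linear blend per component: x * (1 - a) + y * a."""
    return [xi * (1.0 - a) + yi * a for xi, yi in zip(x, y)]


def keyed_position(vp, tp, np_, key_x, key_y):
    """Blend between Vp, Tp and Np in the same order as the shader."""
    pos = vp
    pos = mix(tp, pos, (1.0 - key_x) ** 8)   # key_x near 1 pulls toward Tp
    pos = mix(np_, pos, (1.0 - key_y) ** 8)  # key_y near 1 pulls toward Np
    return pos


# Hypothetical sample positions.
vp = [1.0, 2.0, 3.0, 1.0]
tp = [0.5, -0.5, 0.0, 1.0]
np_ = [0.0, 0.0, 0.9, 1.0]

at_vertex = keyed_position(vp, tp, np_, 0.0, 0.0)
at_texture = keyed_position(vp, tp, np_, 1.0, 0.0)
at_normal = keyed_position(vp, tp, np_, 0.0, 1.0)
```

The eighth power makes the blend weight fall off quickly, so even small key values move the vertex most of the way toward the alternate position.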

  7. Non-real time vs. Real time
     Not real-time:
     • Developed from general CPU code
     • Seconds to hours per frame
     • 1000s of lines
     • “Unlimited” computation, texture, memory, …
     Real-time:
     • Developed from fixed-function hardware
     • Tens of frames per second
     • 1000s of instructions
     • Limited computation, texture, memory, …

  8. Non-real time vs. Real-time
     [Diagram, non-real time: Application; Displacement, Surface, Light, Volume, Atmosphere, Imager; Displayed Pixels]
     [Diagram, real-time: Application; Texture / Buffer, Vertex, Geometry, Fragment; Displayed Pixels]

  9. History (not real-time)
     • Testbed [Whitted and Weimer 1981]
     • Shade Trees [Cook 1984]
     • Image Synthesizer [Perlin 1985]
     • RenderMan [Hanrahan and Lawson 1990]
     • Multi-pass RenderMan [Peercy et al. 2000]
     • GPU acceleration [Wexler et al. 2005]

  10. History (real-time)
      • Custom HW [Olano and Lastra 1998]
      • Multi-pass standard HW [Peercy et al. 2000]
      • Register combiners [NVIDIA 2000]
      • Vertex programs [Lindholm et al. 2001]
      • Compiling to mixed HW [Proudfoot et al. 2001]
      • Fragment programs
      • Standardized languages
      • Geometry shaders [Blythe 2006]

  11. Choices
      • OS: Windows, Mac, Linux
      • API: DirectX, OpenGL
      • Language: HLSL, GLSL, Cg, …
      • Compiler: DirectX, OpenGL, Cg, ASHLI
      • Runtime: CgFX, ASHLI, OSG (& others), sample code

  12. Major Commonalities
      • Vertex & Fragment/Pixel
      • C-like, if/while/for
      • Structs & arrays
      • Float + small vector and matrix
      • Swizzle & mask (a.xyz = b.xxw)
      • Common math & shading functions
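The swizzle-and-mask example on this slide, a.xyz = b.xxw, can be emulated on the CPU to show what it does. This is a small Python sketch under the assumption that a swizzle reads source components in the order named and a write mask overwrites only the destination components named; the helpers swizzle and masked_assign are hypothetical and are not part of any shading language API.

```python
COMPONENTS = "xyzw"


def swizzle(vec, pattern):
    """Read components of vec in the order named, e.g. 'xxw' -> [x, x, w]."""
    return [vec[COMPONENTS.index(c)] for c in pattern]


def masked_assign(dest, mask, values):
    """Return a copy of dest with only the components named in mask replaced."""
    if len(mask) != len(values):
        raise ValueError("mask and values must have the same length")
    out = list(dest)
    for c, v in zip(mask, values):
        out[COMPONENTS.index(c)] = v
    return out


a = [0.0, 0.0, 0.0, 9.0]
b = [1.0, 2.0, 3.0, 4.0]

# Emulates a.xyz = b.xxw: x, y, z are overwritten; a.w keeps its old value.
a = masked_assign(a, "xyz", swizzle(b, "xxw"))
```

In the shading languages listed earlier this is a single statement, with no helper calls.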

  13. GPU Parallelism
      Pipeline
      [Diagram labels: Texture / Buffer, Vertex, Geometry, Fragment]

  14. GPU Parallelism
      Pipeline
      SPMD Parallel: Fragment Stream
      [Diagram labels: Texture / Buffer, Vertex, Geometry, Fragment]

  15. GPU Parallelism
      SPMD Parallel: Fragment Stream
      SIMD Parallel: 2x2 Block
      [Diagram labels: Fragment, Fragment, Fragment, Fragment]

  16. GPU Parallelism
      SIMD Parallel: 2x2 Block
      Pipeline (NVIDIA)
      [Diagram labels: Fragment (x4); Texture Unit, Shader Unit, Shader Unit, Branch Unit, Fog; L1 Cache, L2 Cache]

  17. GPU Parallelism
      Pipeline (NVIDIA)
      Vector Parallel: Limited MIMD
      [Diagram labels: Texture Unit, Shader Unit (ALU, ALU), Shader Unit (ALU, ALU), Branch Unit, Fog; L1 Cache, L2 Cache]

  18. Managing GPU Programming
      • Simplified computational model
      • Bonus: consistent as hardware changes
      • All stages SIMD
      • Explicit 4-element SIMD vectors
      • Fixed conversion / remapping between each stage
      [Diagram labels: Vertex (stream), Geometry (stream), Fragment (array), Buffer]

  19. Vertex
      • One element in / one out
      • NO communication
      • Can select fragment address
      [Diagram labels: Vertex (stream), Geometry (stream), Fragment (array), Buffer]

  20. Geometry
      • More next (Blythe talk)
      • One element in / 0 to ~100 out
      • Limited by hardware buffer sizes
      • Like vertex:
        • NO communication
        • Can select fragment address
      [Diagram labels: Vertex (stream), Geometry (stream), Fragment (array), Buffer]

  21. Fragment
      • Biggest computational resource
      • One element in / 0 – 1 out
      • Cannot change destination address
      • I am element x,y in an array, what is my value?
      • Effectively no communication
      • Conditionals expensive
      • Better if block coherence
      [Diagram labels: Vertex (stream), Geometry (stream), Fragment (array), Buffer]
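Why conditionals are expensive, and why block coherence helps, can be illustrated with a toy cost model. The Python sketch below is an assumption-laden illustration, not a description of any particular GPU: it assumes fragments execute in 2x2 SIMD blocks, and that a block pays for a branch side whenever at least one of its four fragments takes that side. The function names and the cost numbers are hypothetical.

```python
def block_cost(takes_branch, cost_then, cost_else):
    """Cost of one SIMD block; takes_branch holds one bool per fragment."""
    cost = 0
    if any(takes_branch):
        cost += cost_then      # at least one lane needs the 'then' side
    if not all(takes_branch):
        cost += cost_else      # at least one lane needs the 'else' side
    return cost


def image_cost(mask, cost_then, cost_else):
    """Sum block costs over the 2x2 blocks of a 2D bool mask (even dimensions)."""
    total = 0
    for y in range(0, len(mask), 2):
        for x in range(0, len(mask[0]), 2):
            block = [mask[y + dy][x + dx] for dy in (0, 1) for dx in (0, 1)]
            total += block_cost(block, cost_then, cost_else)
    return total


T, F = True, False
# Both masks send 8 of 16 fragments down the 'then' side.
coherent = [[T, T, F, F], [T, T, F, F], [T, T, F, F], [T, T, F, F]]
checkerboard = [[T, F, T, F], [F, T, F, T], [T, F, T, F], [F, T, F, T]]

coherent_cost = image_cost(coherent, cost_then=10, cost_else=2)
divergent_cost = image_cost(checkerboard, cost_then=10, cost_else=2)
```

Under this model the checkerboard mask costs twice as much as the coherent one, even though the same number of fragments takes each side of the branch.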

  22. Program / Multiple Passes
      • Communication
        • None in one pass
        • Arbitrary read addresses between passes
      • Data layout
        • No persistent per-processor memory
        • No penalty to change
      [Diagram labels: Vertex (stream), Geometry (stream), Fragment (array), Buffer]
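The communication rules on this slide can be sketched as a multi-pass sum on the CPU. The Python below is a minimal emulation under stated assumptions: within a pass each output element is computed independently from a read-only input buffer, and between passes the next kernel may read any address of the previous output. The names run_pass, pair_sum and reduce_sum are hypothetical, and a real implementation would render into textures or buffers instead of Python lists.

```python
def run_pass(src, out_len, kernel):
    """One pass: element i is computed independently from the read-only src."""
    return [kernel(src, i) for i in range(out_len)]


def pair_sum(src, i):
    """Kernel: gather two elements written by the previous pass and add them."""
    a = src[2 * i]
    b = src[2 * i + 1] if 2 * i + 1 < len(src) else 0.0
    return a + b


def reduce_sum(data):
    """Sum data by repeated halving passes; returns (total, number of passes)."""
    if not data:
        return 0.0, 0
    buf = list(data)
    passes = 0
    while len(buf) > 1:
        # The output of this pass becomes the input of the next one.
        buf = run_pass(buf, (len(buf) + 1) // 2, pair_sum)
        passes += 1
    return buf[0], passes


total, passes = reduce_sum([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
```

No element ever reads a value written in the same pass, so all communication happens through the buffer handed from one pass to the next.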

  23. Multiple passes
      • GPGPU
      • Non-local effects
        • Shadow maps
        • Texture space
      • Precomputation
        • Fix some degrees of freedom
        • Factor into functions of 1-3D
        • Project input or output into another space
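Precomputing a function of one variable into a table is one simple reading of "factor into functions of 1-3D". The Python sketch below assumes the table plays the role of a 1D texture and that reads are linearly interpolated, roughly as a filtered texture fetch would be. The helper names, the table size of 256, and the choice of (1 - k)^8 as the tabulated function are illustrative assumptions.

```python
def build_table(func, size):
    """Precompute func at size evenly spaced points on [0, 1] (size >= 2)."""
    return [func(i / (size - 1)) for i in range(size)]


def lookup(table, u):
    """Linearly interpolated read at u in [0, 1], clamped at the ends."""
    u = min(max(u, 0.0), 1.0)
    pos = u * (len(table) - 1)
    i = min(int(pos), len(table) - 2)  # keep i + 1 inside the table
    t = pos - i
    return table[i] * (1.0 - t) + table[i + 1] * t


# Done once, ahead of time, instead of evaluating the power per element.
falloff = build_table(lambda k: (1.0 - k) ** 8, 256)

approx = lookup(falloff, 0.3)
exact = (1.0 - 0.3) ** 8
```

The trade is memory and a small interpolation error in exchange for replacing per-element arithmetic with a single read.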

  24. GPU Shading and Rendering
