Explore the implementation and analysis of a GPU ray tracer with a focus on ray tracing, programmable pipelines, scene storage, future prospects, and more. Learn about accelerating ray shooting, uniform grids, space subdivision, and more with detailed explanations and examples. Discover how to map rays to fragments, feed back render targets as textures, and encode scenes in textures for efficient rendering. Evaluate scene performance, FPS rates, and lessons learned, along with pros and cons of the technique. Get insights on future developments, hardware support, and potential enhancements to the GPU ray tracing process.
Implementing and Analyzing a GPU Ray Tracer
Kristóf Ralovich, Budapest University of Technology and Economics
Agenda
- Quick overview of ray tracing
- Programmable pipeline
- Storing the scene
- GPU ray tracing
- Results
- Future work
Concept of ray tracing
[Figure labels: light, occluder, viewport, camera, reflective object, reflected ray, shadow ray, diffuse object, primary ray]
Accelerating ray shooting
- uniform grid space subdivision
- 3D DDA traversal [Amanatides & Woo]
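The grid traversal named above can be sketched on the CPU as follows. This is a minimal illustration in the style of the Amanatides & Woo 3D DDA, not the presenter's shader code; the function name `dda_traverse` and its parameters are hypothetical, and the ray is assumed to start inside the grid.

```python
import math

def dda_traverse(origin, direction, grid_min, cell_size, grid_res, max_steps=1024):
    """Return the voxel indices visited by a ray, in order (3D DDA sketch).

    Assumes the ray origin lies inside the grid; all names are illustrative.
    """
    # Voxel that contains the ray origin.
    voxel = [int(math.floor((origin[a] - grid_min[a]) / cell_size[a])) for a in range(3)]
    step = [0, 0, 0]
    t_max = [math.inf] * 3    # ray parameter at the next voxel boundary, per axis
    t_delta = [math.inf] * 3  # ray parameter needed to cross one voxel, per axis
    for a in range(3):
        d = direction[a]
        if d > 0:
            step[a] = 1
            boundary = grid_min[a] + (voxel[a] + 1) * cell_size[a]
            t_max[a] = (boundary - origin[a]) / d
            t_delta[a] = cell_size[a] / d
        elif d < 0:
            step[a] = -1
            boundary = grid_min[a] + voxel[a] * cell_size[a]
            t_max[a] = (boundary - origin[a]) / d
            t_delta[a] = -cell_size[a] / d
    visited = []
    for _ in range(max_steps):
        # Stop as soon as the ray leaves the grid.
        if any(voxel[a] < 0 or voxel[a] >= grid_res[a] for a in range(3)):
            break
        visited.append(tuple(voxel))
        # Advance along the axis whose boundary is reached first.
        axis = t_max.index(min(t_max))
        voxel[axis] += step[axis]
        t_max[axis] += t_delta[axis]
    return visited
```

Each iteration needs only one comparison of the three tMax values and one addition, which is part of why the traversal maps well to a short fragment program.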
Pipeline > computing stage
- programmable pipeline: vertex processor -> rasterization -> fragment processor -> render target (on/off-screen), with access to texture memory
- rendering pass = kernel execution: draw a full-screen quad so that the screen is covered by rasterized fragments
- fragment program = kernel
- 1:1 mapping of rays to fragments
- fragments are written to render target(s)
- render targets can be fed back as textures
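The "render target fed back as texture" idea can be imitated on the CPU with plain nested lists. The sketch below is only an analogy under stated assumptions: `run_pass` and `ping_pong` are hypothetical names, a "texture" is a list of rows, and the kernel is an ordinary Python function called once per texel, mirroring the 1:1 mapping of rays to fragments.

```python
def run_pass(kernel, source):
    """One rendering pass: invoke the kernel once per texel (fragment).

    The kernel may read anywhere in `source` but writes only its own texel,
    like a fragment program writing to a render target.
    """
    height, width = len(source), len(source[0])
    target = [[None] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            target[y][x] = kernel(source, x, y)
    return target

def ping_pong(kernel, texture, passes):
    """Feed each pass's render target back in as the next pass's input texture."""
    for _ in range(passes):
        texture = run_pass(kernel, texture)
    return texture
```

For example, `ping_pong(lambda src, x, y: src[y][x] + 1, [[0, 1], [2, 3]], 3)` applies an increment kernel three times while never reading and writing the same buffer within a pass.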
Scene encoded in textures
- 3D RGB grid texture, one texel per voxel (vox0 ... voxK): R = ptr, G = cnt, B = prox cld.
  Example texels (R G B): 0 3 0 | 3 2 0 | 0 0 0 | 5 4 0
- 2D LUMINANCE triangle list texture, one triID per texel (L); each voxel references its own run of the list
  Example: 9 0 2 3 0 2 4 7 8
- 2x 3D RGB texture for triangle data (tri0 ... triN), 4 slices each:
  v1 xyz, v2 xyz, v3 xyz, col rgb
  n1 xyz, n2 xyz, n3 xyz, refl rc
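The pointer/count layout from the slide can be sketched with flat Python lists. This is an illustration only: `encode_grid` and `triangles_in_voxel` are hypothetical names, the third (B) channel is omitted, and empty voxels are assumed to store ptr = 0, which matches the example values on the slide.

```python
def encode_grid(voxel_tris):
    """Flatten per-voxel triangle lists into a grid table and a triangle list.

    grid[i] is a (ptr, cnt) pair: start offset and length of voxel i's run
    inside tri_list. Empty voxels get (0, 0) by assumption.
    """
    grid = []
    tri_list = []
    for ids in voxel_tris:
        ptr = len(tri_list) if ids else 0
        grid.append((ptr, len(ids)))
        tri_list.extend(ids)
    return grid, tri_list

def triangles_in_voxel(grid, tri_list, voxel_index):
    """Look up the triangle IDs referenced by one voxel."""
    ptr, cnt = grid[voxel_index]
    return tri_list[ptr:ptr + cnt]
```

Encoding `[[9, 0, 2], [3, 0], [], [2, 4, 7, 8]]` yields the grid entries (0, 3), (3, 2), (0, 0), (5, 4) and the list 9 0 2 3 0 2 4 7 8, the same numbers as in the slide's example.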
GPU ray tracer (ray generation)
- stage output in render targets: ray origins, ray directions
- stages: ray generator -> scene AABB hit? (masking) and 3D DDA initialization -> traverse + intersect -> shading
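A ray generator produces one origin and one direction per pixel. The sketch below assumes a simple pinhole camera at `eye` looking down the negative z axis with y up; the slides do not specify the camera model, so the function name, the `fov_y_deg` parameter and the pixel-center convention are all assumptions.

```python
import math

def generate_ray(px, py, width, height, eye, fov_y_deg=60.0):
    """Return (origin, unit direction) of the primary ray through pixel (px, py).

    Assumed camera: pinhole at `eye`, looking along -z, y up, pixel (0, 0)
    in the top-left corner. These conventions are illustrative only.
    """
    aspect = width / height
    half_h = math.tan(math.radians(fov_y_deg) / 2)
    # Map the pixel center to [-1, 1] in both image axes.
    ndc_x = (px + 0.5) / width * 2 - 1
    ndc_y = 1 - (py + 0.5) / height * 2
    d = (ndc_x * half_h * aspect, ndc_y * half_h, -1.0)
    length = math.sqrt(sum(c * c for c in d))
    return eye, tuple(c / length for c in d)
```

On the GPU the two results would be written to two render targets, one holding origins and one holding directions, with the pixel coordinates supplied by the rasterizer.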
GPU ray tracer (initialization)
- stage: scene AABB hit? (masking) and 3D DDA initialization
- stage output in render targets, per ray:
  3D DDA tMax (xyz)
  current voxel (xyz) + finished? flag
  hit record: barycentric u, v + ray parameter t + triID
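The "scene AABB hit?" test can be sketched with the standard slab method. This is an assumed implementation, since the slides only name the test; `ray_aabb` is a hypothetical function that returns the entry and exit ray parameters, or `None` for rays that miss the scene and would be masked out.

```python
import math

def ray_aabb(origin, direction, box_min, box_max):
    """Slab test: return (t_near, t_far) if the ray hits the box, else None.

    t_near is clamped to 0 so that rays starting inside the box are handled.
    """
    t_near, t_far = 0.0, math.inf
    for a in range(3):
        if direction[a] == 0.0:
            # Ray is parallel to this slab: it must start between the planes.
            if origin[a] < box_min[a] or origin[a] > box_max[a]:
                return None
            continue
        t0 = (box_min[a] - origin[a]) / direction[a]
        t1 = (box_max[a] - origin[a]) / direction[a]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
        if t_near > t_far:
            return None
    return t_near, t_far
```

The entry point `origin + t_near * direction` is a natural place to compute the starting voxel and the initial tMax values for the 3D DDA.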
GPU ray tracer (trav. + isec. 1)
- stage: traverse + intersect
- stage output in render targets, per ray:
  3D DDA tMax (xyz) + number of processed triangles
  current voxel (xyz) + finished? flag
  hit record: barycentric u, v + ray parameter t + triID
- pass repeated until all rays are finished (halting condition determined by occlusion query)
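The halting logic can be illustrated with a toy host-side loop. In the sketch below, counting the rays that still have work left stands in for the occlusion query result; `run_until_finished` and its inputs are hypothetical and model each ray only by the number of passes it still needs.

```python
def run_until_finished(steps_needed, max_passes=1000):
    """Repeat the traverse + intersect pass until no ray is active.

    steps_needed[i] is the (made-up) number of passes ray i requires.
    Returns the number of passes executed.
    """
    remaining = list(steps_needed)
    passes = 0
    while passes < max_passes:
        # Stand-in for the occlusion query: how many rays are unfinished?
        active = sum(1 for r in remaining if r > 0)
        if active == 0:
            break
        # One pass: every unfinished ray advances by one step.
        remaining = [r - 1 if r > 0 else r for r in remaining]
        passes += 1
    return passes
```

The total pass count equals the work of the slowest ray, so a few long rays keep the whole screen-sized pass running even when most rays are already finished.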
GPU ray tracer (trav. + isec. 2)
- stage: traverse + intersect
- stage output in render targets, per ray:
  hit record: barycentric u, v + ray parameter t + triID
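A hit record of the form (u, v, t, triID) is what a Möller-Trumbore style ray-triangle test naturally produces. The slides do not say which intersection routine was used, so the sketch below is an assumption; `intersect_triangle` and `closest_hit` are hypothetical names.

```python
def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: return (u, v, t) or None if there is no hit."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    p = cross(direction, e2)
    det = dot(e1, p)
    if abs(det) < eps:            # ray is parallel to the triangle plane
        return None
    inv = 1.0 / det
    s = sub(origin, v0)
    u = dot(s, p) * inv
    if u < 0.0 or u > 1.0:
        return None
    q = cross(s, e1)
    v = dot(direction, q) * inv
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, q) * inv
    if t <= eps:                  # intersection lies behind the ray origin
        return None
    return u, v, t

def closest_hit(origin, direction, triangles):
    """Return the nearest hit record (u, v, t, tri_id), or None."""
    best = None
    for tri_id, (v0, v1, v2) in enumerate(triangles):
        hit = intersect_triangle(origin, direction, v0, v1, v2)
        if hit is not None and (best is None or hit[2] < best[2]):
            best = (hit[0], hit[1], hit[2], tri_id)
    return best
```

Four values per ray fit into a single RGBA texel, which is consistent with storing the record in one render target.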
GPU ray tracer (shading)
- stage: shading
- stage output in render targets, per ray:
  accumulated color values
  hit record: barycentric u, v + ray parameter t + triID
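One way a shading stage can use the hit record is to interpolate the stored vertex normals with the barycentric coordinates and apply a diffuse term. The slides do not give the shading model, so the Lambert term, the function name `shade_diffuse`, and the convention that u weights the second vertex and v the third are assumptions.

```python
import math

def shade_diffuse(u, v, normals, color, light_dir):
    """Lambert shading at a hit point described by barycentric (u, v).

    normals: the three vertex normals (n1, n2, n3) of the hit triangle.
    light_dir: unit vector pointing from the surface towards the light.
    """
    w = 1.0 - u - v
    n1, n2, n3 = normals
    # Barycentric interpolation of the vertex normals.
    n = tuple(w * n1[i] + u * n2[i] + v * n3[i] for i in range(3))
    length = math.sqrt(sum(c * c for c in n))
    n = tuple(c / length for c in n)
    lambert = max(0.0, sum(n[i] * light_dir[i] for i in range(3)))
    return tuple(c * lambert for c in color)
```

The vertex normals and the color would come from the triangle data textures, looked up with the triID from the hit record.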
Evaluated scenes
Scene              Grid          Triangles   FPS
Knight             14x16x14      636         2.6 - 9.7
Torus Knot         16x16x16      1024        1.5 - 8.4
Bunny (low poly)   20x20x20      1764        3.3 - 11.6
Stanford Bunny     128x128x128   69451       0.8 - 3.0
Experiences: lessons learned
Pros
- ray tracing and the GPU are both parallel
- uniform grid traversal is fast and simple
Cons
- abstraction over graphics API
- no stack for recursion
- grid is not adaptive
Future
- SM 4.0 GPU: integer arithmetic, geom. feedback
- other RASs on GPU: KD-tree [Foley & Sugerman 2005], BVH [Thrane & Simonsen 2005], geometry images [Carr et al. 2006]
- better API to HW: CUDA
- different HW: Cell
Thank you for your attention! Questions?