GPU-Assisted Path Tracing

GPU-Assisted Path Tracing. Matthias Boindl, Christian Machacek. Institute of Computer Graphics and Algorithms, Vienna University of Technology. Motivation: why path tracing? Physically based; nature provides the reference image; parallelizable; sublinear in #objects; conceptually simple.

Presentation Transcript


  1. GPU-Assisted Path Tracing. Matthias Boindl, Christian Machacek. Institute of Computer Graphics and Algorithms, Vienna University of Technology

  2. Motivation: Why Path Tracing? • Physically based • Nature provides the reference image • Parallelizable • Sublinear in #objects • Conceptually simple • Can lead to a clean implementation • But: fast implementation on GPUs not trivial

  3. Outline • Path tracing intro • Main steps of the algorithm • Mapping the algorithm to the GPU • How to organize code into kernels • When to launch kernels • How to pass data between kernels • Acceleration structures • Focus on bounding volume hierarchies Christian Machacek

  4. Path Tracing Intro • Like ray tracing, except it… • …supports arbitrary BRDFs • …is stochastic: at each bounce, the new direction is decided randomly • Convergence video From Pharr, Humphreys: PBRT, 2nd ed. (2010)

  5. Path Tracing Pseudocode
     while image not converged
         r = new ray from eye through next pixel
         do
             i = closest intersection of r with scene
             if no i: break
             if i is on a light source: c = c + throughput * emission
             randomly pick new direction and create reflected ray r
             evaluate BRDF at i
             update throughput
         while path throughput high enough
     From Pharr, Humphreys: PBRT, 2nd ed. (2010)
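
The inner loop of the pseudocode can be sketched in Python as follows. This is a minimal sketch, not the presenters' implementation: the `Hit` record, the scalar (grayscale) throughput, the `min_throughput` cutoff and the "furnace" toy scene in which every ray hits the same emissive surface are all assumptions made for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Hit:
    emission: float  # radiance emitted at the hit point (0 for non-lights)
    albedo: float    # toy stand-in for BRDF * cosine / pdf at the hit point


def trace_path(intersect: Callable[[random.Random], Optional[Hit]],
               rng: random.Random,
               min_throughput: float = 1e-3) -> float:
    """Follow one path and return the radiance it carries back to the eye."""
    radiance = 0.0
    throughput = 1.0
    while True:
        hit = intersect(rng)              # closest intersection of r with scene
        if hit is None:                   # no intersection: the path ends
            break
        radiance += throughput * hit.emission  # zero unless the hit is a light
        # A real tracer samples a new direction and evaluates the BRDF here;
        # the toy scene folds both steps into a single albedo factor.
        throughput *= hit.albedo
        if throughput < min_throughput:   # "while path throughput high enough"
            break
    return radiance


def furnace(rng: random.Random) -> Hit:
    """Hypothetical scene: every ray hits a surface with emission 1, albedo 0.5."""
    return Hit(emission=1.0, albedo=0.5)


estimate = trace_path(furnace, random.Random(0))
print(estimate)  # close to the analytic value 1 / (1 - 0.5) = 2
```

Cutting paths at a fixed throughput threshold drops a small amount of energy; Russian roulette, mentioned on the Logic Kernel slide, is the usual unbiased alternative.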

  7. Megakernel Execution Divergence From Bikker (2013)

  8. Solution: Wavefront Path Tracing • Separate, specialized kernels • Keep a pool of ~1 million paths alive • Work for the next stage goes into kernel-specific, compact queues (= 4 MB index arrays) https://mediatech.aalto.fi/~samuli/
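
The queue idea can be illustrated with a small CPU-side sketch. The stage names, the `build_queues` and `run_wavefront_step` helpers and the 32-bit index size are assumptions for illustration; on a GPU each queue would be filled with atomic appends and each kernel would run one thread per queue entry.

```python
from collections import defaultdict

POOL_SIZE = 1 << 20                         # ~1 million paths kept alive
BYTES_PER_INDEX = 4                         # assumed 32-bit path indices
queue_bytes = POOL_SIZE * BYTES_PER_INDEX   # size of one full index queue


def build_queues(next_stage):
    """Compact the path pool into one index queue per stage.

    next_stage[i] names the kernel that path i needs next.
    """
    queues = defaultdict(list)
    for path_index, stage in enumerate(next_stage):
        queues[stage].append(path_index)
    return queues


def run_wavefront_step(next_stage, kernels):
    """Run every specialized kernel once, each over its own compact queue."""
    queues = build_queues(next_stage)
    for stage, kernel in kernels.items():
        for path_index in queues.get(stage, []):
            # The kernel returns the stage this path has to visit next.
            next_stage[path_index] = kernel(path_index)
    return queues


# Hypothetical stage names, loosely following the kernels in the backup slides.
stages = ["new_path", "material_diffuse", "material_glass", "material_diffuse"]
kernels = {
    "new_path": lambda i: "extension_ray",
    "material_diffuse": lambda i: "extension_ray",
    "material_glass": lambda i: "extension_ray",
}
queues = run_wavefront_step(stages, kernels)
print(queue_bytes // (1024 * 1024), "MiB per queue")
```

Because every kernel only touches the paths in its own queue, threads that run together execute the same code, which is the point of splitting the megakernel.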

  9. Results • Performance • Execution times (ms / 1M path segments) Christian Machacek

  10. Limitations and Possible Improvements • Higher memory requirements (+200 MB) • Kernel launch overhead • Dynamic parallelism on GK110 • Use an outer scheduling kernel • No CPU round trip • Launch independent stages side-by-side • CUDA streams • So kernels with little work don’t hog the GPU Christian Machacek

  11. Acceleration Structures • Find nearest intersection in O(log N) • Space partitioning vs. object partitioning • Hybrid methods exist Matthias Boindl

  12. Performance • For interactive rendering, compromise • Traversal performance (build quality) • Construction/Update time • Update or rebuild from scratch • Adapt to GPU environment • Memory architecture • Parallel execution Matthias Boindl

  13. State of the Art • Tero Karras and Timo Aila. 2013. Fast parallel construction of high-quality bounding volume hierarchies. In Proceedings of the 5th High-Performance Graphics Conference (HPG '13). ACM, New York, NY, USA, 89-99. Matthias Boindl

  14. Close the Performance Gap Matthias Boindl

  15. Basic Idea • Fast construction of simple BVH • Generate leaf for each triangle • Reduce SAH cost by modifying tree Matthias Boindl
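
The SAH cost that the tree modifications try to reduce can be computed as in the sketch below. The `Node` layout and the cost constants `c_inner` and `c_tri` are assumptions chosen for illustration, not values taken from the slides.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]
Box = Tuple[Vec3, Vec3]  # (min corner, max corner)


def surface_area(box: Box) -> float:
    (x0, y0, z0), (x1, y1, z1) = box
    dx, dy, dz = x1 - x0, y1 - y0, z1 - z0
    return 2.0 * (dx * dy + dy * dz + dz * dx)


def union(a: Box, b: Box) -> Box:
    lo = tuple(min(p, q) for p, q in zip(a[0], b[0]))
    hi = tuple(max(p, q) for p, q in zip(a[1], b[1]))
    return (lo, hi)


@dataclass
class Node:
    box: Box
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    triangles: int = 0  # number of triangles stored if this is a leaf


def sah_cost(node: Node, root_area: float,
             c_inner: float = 1.2, c_tri: float = 1.0) -> float:
    """Expected traversal cost: each node is weighted by the probability
    that a random ray hitting the root also hits that node."""
    p_hit = surface_area(node.box) / root_area
    if node.left is None:  # leaf: pay for the triangle tests
        return c_tri * p_hit * node.triangles
    return (c_inner * p_hit
            + sah_cost(node.left, root_area, c_inner, c_tri)
            + sah_cost(node.right, root_area, c_inner, c_tri))


# Two unit cubes side by side, one triangle each (toy data).
leaf_a = Node(box=((0, 0, 0), (1, 1, 1)), triangles=1)
leaf_b = Node(box=((1, 0, 0), (2, 1, 1)), triangles=1)
root = Node(box=union(leaf_a.box, leaf_b.box), left=leaf_a, right=leaf_b)
cost = sah_cost(root, surface_area(root.box))
print(cost)
```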

  16. Treelets • Allow local tree modification • A, B, C, F are leaves; D, E, G are internal nodes Matthias Boindl

  17. Treelet Construction • Find root: parallel bottom-up traversal • Start with leaves • Use an atomic counter at each internal node • Ensures all children have been processed • Build treelet • Add both children • Pick children with highest surface area • Fixed size: 7 leaf nodes Matthias Boindl
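
The "build treelet" step can be sketched sequentially as follows. This is a simplified illustration under stated assumptions: the `Node` class with a precomputed `area` field and the `form_treelet` helper are hypothetical, and the parallel bottom-up traversal with atomic counters is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)  # compare nodes by identity, not by field values
class Node:
    area: float                     # surface area of the node's bounding box
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def form_treelet(root: Node, max_leaves: int = 7):
    """Grow a treelet below `root` by repeatedly expanding its largest leaf.

    `root` must be an internal node. Returns (internal nodes, treelet leaves).
    """
    internal = [root]
    leaves = [root.left, root.right]          # add both children of the root
    while len(leaves) < max_leaves:
        # Only treelet leaves that are internal BVH nodes can be expanded.
        candidates = [n for n in leaves if n.left is not None]
        if not candidates:
            break
        biggest = max(candidates, key=lambda n: n.area)
        leaves.remove(biggest)                # it becomes an internal node
        internal.append(biggest)
        leaves += [biggest.left, biggest.right]
    return internal, leaves


def complete_tree(depth: int, area: float) -> Node:
    """Toy BVH: complete binary tree whose node area halves at every level."""
    if depth == 0:
        return Node(area)
    return Node(area, complete_tree(depth - 1, area / 2),
                complete_tree(depth - 1, area / 2))


internal, leaves = form_treelet(complete_tree(3, 8.0))
print(len(internal), "internal nodes,", len(leaves), "treelet leaves")
```

A treelet with n leaves always has n - 1 internal nodes, so the fixed size of 7 leaves gives 6 internal nodes to rearrange.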

  18. Rearrange Treelet • Minimize treelet root node surface area • Naive implementation: test each permutation • Better: dynamic programming • Caching of best intermediate results • Start with leaves, then pairs, then triplets, … • Suboptimal subtree construction avoided • Parallelizable as well Matthias Boindl
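
The dynamic programming step can be sketched with bitmasks over the treelet leaves. The sketch assumes that the quantity being minimized is the summed surface area of the treelet's internal nodes plus fixed per-leaf costs (an SAH-style cost); `optimize_treelet`, its parameters and the toy boxes are illustrative names and data, not the presenters' code.

```python
def optimize_treelet(leaf_boxes, leaf_costs, c_inner=1.0):
    """Find the cheapest binary tree over the given treelet leaves.

    leaf_boxes: list of ((x0, y0, z0), (x1, y1, z1)) bounding boxes.
    leaf_costs: cost of the subtree hanging below each treelet leaf.
    Returns (optimal cost, split), where split[mask] is the leaf subset
    that goes into one child of the node covering `mask`.
    """
    n = len(leaf_boxes)
    full = (1 << n) - 1

    # Surface area of the union box of every subset of leaves.
    area = [0.0] * (full + 1)
    for mask in range(1, full + 1):
        members = [leaf_boxes[i] for i in range(n) if (mask >> i) & 1]
        lo = [min(b[0][k] for b in members) for k in range(3)]
        hi = [max(b[1][k] for b in members) for k in range(3)]
        d = [hi[k] - lo[k] for k in range(3)]
        area[mask] = 2.0 * (d[0] * d[1] + d[1] * d[2] + d[2] * d[0])

    cost = [0.0] * (full + 1)
    split = [0] * (full + 1)
    for i in range(n):
        cost[1 << i] = leaf_costs[i]

    # Increasing numeric order visits every subset after all of its own
    # subsets, so the cached results are ready (leaves, pairs, triplets, ...).
    for mask in range(1, full + 1):
        if mask & (mask - 1) == 0:
            continue                      # single leaf, already initialized
        low = mask & -mask                # fix the lowest leaf on one side
        rest = mask ^ low                 # to skip mirrored partitions
        best, best_part = float("inf"), 0
        s = (rest - 1) & rest             # largest proper subset of rest
        while True:
            part = low | s
            c = cost[part] + cost[mask ^ part]
            if c < best:
                best, best_part = c, part
            if s == 0:
                break
            s = (s - 1) & rest
        cost[mask] = c_inner * area[mask] + best
        split[mask] = best_part
    return cost[full], split


# Four unit cubes along the x axis: two close pairs that are far apart.
boxes = [((x, 0.0, 0.0), (x + 1.0, 1.0, 1.0)) for x in (0.0, 1.0, 10.0, 11.0)]
best_cost, split = optimize_treelet(boxes, leaf_costs=[6.0] * 4)
print(best_cost, bin(split[0b1111]))
```

For the toy input the optimum groups each close pair under its own node. With 7 leaves there are only 127 non-empty subsets, so the whole table is small enough to fill cooperatively on a GPU.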

  19. Results • Gap closed Matthias Boindl

  20. Results • Speed/Quality tradeoff Matthias Boindl

  21. Conclusion • Use specialized kernels • Lower execution divergence • (Better use of instruction cache) • (Fewer registers used simultaneously) • Construct acceleration structures quickly • But not too quickly Matthias Boindl

  22. Thanks for your attention! Institute of Computer Graphics and Algorithms Vienna University of Technology

  23. Results • Speed/Quality tradeoff Matthias Boindl

  24. Logic Kernel • Does not need a queue, operates on all paths • If shadow ray was unblocked, add light contribution • Find material or light source the ray hits • Place path into proper material queue • Russian roulette • If path terminated, accumulate to image • Place path into new path queue • Sample light sources (aka next event estimation) Christian Machacek
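
The Russian roulette step can be sketched as below. The survival probability (the path throughput, clamped to `max_survival`) is a common heuristic assumed here for illustration; the slides do not say which one is used. The uniform random number `u` is passed in so the function stays deterministic.

```python
def russian_roulette(throughput: float, u: float, max_survival: float = 0.95):
    """Randomly terminate a path without biasing the estimate.

    throughput: current (scalar) path throughput.
    u: uniform random number in [0, 1).
    Returns the rescaled throughput if the path survives, else None.
    """
    p_survive = min(max_survival, throughput)
    if u >= p_survive:
        return None                   # path terminated: accumulate to image
    # Survivors are boosted by 1 / p_survive, so the expected throughput
    # p_survive * (throughput / p_survive) equals the original throughput.
    return throughput / p_survive


print(russian_roulette(0.5, 0.25))    # survives
print(russian_roulette(0.5, 0.75))    # terminated
```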

  25. New Path Kernel • Generate a new image-space sample • Generate camera ray • Place it into extension ray cast queue • Initialize path state • Throughput • Pixel position • etc. Christian Machacek
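
A CPU-side sketch of what the new path kernel does for one free slot in the path pool. The `PathState` fields, the pinhole camera model and the `new_path` signature are assumptions for illustration.

```python
import math
import random
from dataclasses import dataclass
from typing import Tuple


@dataclass
class PathState:
    pixel: Tuple[int, int]        # pixel the finished path is accumulated to
    throughput: float = 1.0       # scalar here; RGB in a real renderer
    radiance: float = 0.0
    bounces: int = 0


def camera_ray(px: float, py: float, width: int, height: int,
               fov_deg: float = 60.0):
    """Direction of a pinhole-camera ray through image position (px, py).

    The camera sits at the origin and looks down the negative z axis.
    """
    aspect = width / height
    scale = math.tan(math.radians(fov_deg) / 2.0)
    x = (2.0 * px / width - 1.0) * aspect * scale
    y = (1.0 - 2.0 * py / height) * scale
    inv_len = 1.0 / math.sqrt(x * x + y * y + 1.0)
    return (x * inv_len, y * inv_len, -inv_len)


def new_path(slot, width, height, rng, pool, rays, extension_queue):
    """Start a fresh path in pool[slot] and queue its camera ray."""
    px = rng.random() * width     # new image-space sample
    py = rng.random() * height
    pixel = (min(int(px), width - 1), min(int(py), height - 1))
    pool[slot] = PathState(pixel=pixel)              # initialize path state
    rays[slot] = camera_ray(px, py, width, height)   # generate camera ray
    extension_queue.append(slot)                     # hand over to ray cast


pool, rays, extension_queue = {}, {}, []
new_path(3, 640, 480, random.Random(0), pool, rays, extension_queue)
print(pool[3], rays[3], extension_queue)
```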

  26. Material Kernels • Generate incoming direction • Evaluate light contribution based on light sample generated in the logic kernel • We haven’t cast the shadow ray yet! • For MIS: p(light sample) from the BSDF • Discard BSDF stack • Queue • extension ray • (shadow ray) Christian Machacek
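
The MIS bullet says the material kernel needs the BSDF's pdf for the direction of the light sample. A sketch of how the two pdfs could be combined, assuming the power heuristic with exponent 2 (the slides do not name the heuristic) and scalar radiance:

```python
def power_heuristic(pdf_a: float, pdf_b: float) -> float:
    """MIS weight for a sample drawn with pdf_a when the same direction
    could also have been produced by another strategy with pdf_b."""
    a2, b2 = pdf_a * pdf_a, pdf_b * pdf_b
    return a2 / (a2 + b2) if a2 + b2 > 0.0 else 0.0


def light_sample_contribution(throughput, bsdf_value, cos_theta,
                              emission, pdf_light, pdf_bsdf):
    """Contribution of the light sample, to be added later only if the
    shadow ray turns out to be unblocked."""
    if pdf_light <= 0.0:
        return 0.0
    weight = power_heuristic(pdf_light, pdf_bsdf)
    return throughput * bsdf_value * cos_theta * emission * weight / pdf_light


w_light = power_heuristic(0.5, 0.25)
w_bsdf = power_heuristic(0.25, 0.5)
print(w_light, w_bsdf)
```

The two weights for the same direction sum to one, which is what keeps the combined estimator unbiased.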

  27. Ray Cast Kernels • Extension rays • Find first intersection against scene geometry • Store hit data into path state • Shadow rays • Blocked or not? Christian Machacek
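
The two kinds of ray cast differ in what they have to find: extension rays need the closest hit, shadow rays only need to know whether anything lies in front of the light. A brute-force sketch over spheres, with no acceleration structure; the sphere scene and all function names are assumptions for illustration.

```python
import math


def hit_sphere(origin, direction, center, radius, t_min=1e-4):
    """Smallest t > t_min with origin + t * direction on the sphere, or None.

    `direction` is assumed to be normalized.
    """
    oc = [o - c for o, c in zip(origin, center)]
    b = sum(d * e for d, e in zip(direction, oc))
    c = sum(e * e for e in oc) - radius * radius
    disc = b * b - c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for t in (-b - root, -b + root):
        if t > t_min:
            return t
    return None


def closest_hit(origin, direction, spheres):
    """Extension ray: return (sphere index, distance) of the first hit."""
    best_id, best_t = None, math.inf
    for i, (center, radius) in enumerate(spheres):
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and t < best_t:
            best_id, best_t = i, t
    return best_id, best_t


def is_blocked(origin, direction, max_t, spheres):
    """Shadow ray: stop at the first occluder closer than the light."""
    for center, radius in spheres:
        t = hit_sphere(origin, direction, center, radius)
        if t is not None and t < max_t:
            return True
    return False


spheres = [((0.0, 0.0, -5.0), 1.0), ((0.0, 0.0, -10.0), 1.0)]
origin, direction = (0.0, 0.0, 0.0), (0.0, 0.0, -1.0)
print(closest_hit(origin, direction, spheres))
print(is_blocked(origin, direction, 3.0, spheres))
```

The shadow ray can return as soon as any occluder is found, which is why it is usually cheaper than an extension ray.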
