Interactive Rendering With Coherent Ray Tracing

Interactive Rendering With Coherent Ray Tracing. Eurographics 2001. Wald, Slusallek, Benthin, Wagner. Comp 238, UNC-CH, September 10, 2001. Joshua Stough.


Presentation Transcript


  1. Interactive Rendering With Coherent Ray Tracing • Eurographics 2001 • Wald, Slusallek, Benthin, Wagner • Comp 238, UNC-CH, September 10, 2001 • Joshua Stough

  2. The Gist • The authors present “a highly optimized implementation of a ray tracer that improves performance by more than an order of magnitude compared to currently available ray tracers … makes better use of computational resources … and better exploits image and object space coherence.”

  3. Organization • Why Ray Tracing over Rasterization? • An Optimized Ray Tracing Implementation • Code structure, Caching, Coherence • Intersections • Volume Traversal (Memory Layout, Overhead) • Performance of the Ray Tracing Engine

  4. Why Ray Tracing Over Raster? • Automatic Occlusion Culling • Logarithmic Complexity in number of scene primitives • Flexible sampling – allows for more effective use of time • Efficient Shading – “avoids computation for invisible geometry” • Shader Programming – direct use versus pipeline model • More Physically Correct – and can use the same approximations • “Trivially Parallel” – though initial resources required are higher • Coherence

  5. “Coherence is the key to efficiency.” • Basic (Recursive Tree) Ray Tracer lacks concern for: • Modern CPU design – pipeline execution • Caching to hide low bandwidth and high latency on main memory • Instead, “pay particular attention to:” • Caching – efficient/aligned data structures, traversing mechanisms • Pipelining • Parallel execution possibilities • “We show that even today the performance of a software ray tracer on a single PC can challenge dedicated rasterization hardware for complex environments.”

  6. An Optimized Ray Tracing Implementation • Reducing Code Complexity • Optimizing cache usage • Reducing memory bandwidth • Prefetching Data • And with SIMD/SSE: • Ray intersections • Scene traversal • Shading

  7. Code Complexity • Few conditionals, tight inner loops • Axis-aligned BSP tree – iterative algorithm possible • Triangles only – reduces branches • Shading less important – done once, versus 40-50 traversals and 5-10 intersections per ray

  8. Caching • Performance bound by bandwidth, not CPU speed • BSP traversal has a low computation-to-bandwidth ratio • Fetching an entire cache line • Carefully lay out data • Data together only if used together (geometry vs. shading) • Separate read-only (preprocessing) data from read-write (mailboxes) • Hide latency with prefetching

  9. Ray-Triangle Intersection • Compute distance to plane (defined by triangle) along ray • If distance is within current interval for testing (via BSP): • Compute hit point • Project into an axis-aligned plane (largest angle to normal) • Compute barycentric coordinates of the hit point in 2D • Data alignment – two 2D edge equations, plane equation for distance, tag for projection axis = 9 floats + tag. Padded to 48 bytes (memory tradeoff).
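The steps on this slide can be sketched in C++ as follows. This is a minimal scalar illustration, not the authors' SSE code: the names `TriAccel`, `makeTriAccel`, and `intersect` are hypothetical, the field order is an assumption, and the two padding floats are simply one way to reach the 48 bytes the slide mentions. The plane equation is scaled so that the normal's dominant component is 1, which is what lets nine floats plus a tag suffice.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical 48-byte triangle record: 9 floats + projection tag + padding.
struct TriAccel {
    float nu, nv, nd;   // plane equation, scaled so the normal's k component is 1
    std::int32_t k;     // projection axis: dominant axis of the triangle normal
    float bu, bv, bd;   // 2D edge equation giving barycentric beta
    float cu, cv, cd;   // 2D edge equation giving barycentric gamma
    float pad[2];       // padding up to 48 bytes
};
static_assert(sizeof(TriAccel) == 48, "record should occupy 48 bytes");

// Precompute the record from vertices a, b, c (assumes a non-degenerate triangle).
TriAccel makeTriAccel(const float a[3], const float b[3], const float c[3]) {
    float e1[3], e2[3], n[3];
    for (int i = 0; i < 3; ++i) { e1[i] = b[i] - a[i]; e2[i] = c[i] - a[i]; }
    n[0] = e1[1] * e2[2] - e1[2] * e2[1];
    n[1] = e1[2] * e2[0] - e1[0] * e2[2];
    n[2] = e1[0] * e2[1] - e1[1] * e2[0];
    int k = 0;                                   // pick the dominant normal axis
    if (std::fabs(n[1]) > std::fabs(n[k])) k = 1;
    if (std::fabs(n[2]) > std::fabs(n[k])) k = 2;
    const int u = (k + 1) % 3, v = (k + 2) % 3;  // the two axes kept after projection
    const float det = n[k];                      // equals e1[u]*e2[v] - e1[v]*e2[u]
    assert(det != 0.0f);
    TriAccel t = {};
    t.k = k;
    t.nu = n[u] / det;
    t.nv = n[v] / det;
    t.nd = (n[0] * a[0] + n[1] * a[1] + n[2] * a[2]) / det;
    t.bu =  e2[v] / det;  t.bv = -e2[u] / det;  t.bd = (a[v] * e2[u] - a[u] * e2[v]) / det;
    t.cu = -e1[v] / det;  t.cv =  e1[u] / det;  t.cd = (a[u] * e1[v] - a[v] * e1[u]) / det;
    return t;
}

// Returns true and fills dist/beta/gamma if the ray hits within (tMin, tMax).
bool intersect(const TriAccel& t, const float org[3], const float dir[3],
               float tMin, float tMax, float& dist, float& beta, float& gamma) {
    const int k = t.k, u = (k + 1) % 3, v = (k + 2) % 3;
    const float denom = dir[k] + t.nu * dir[u] + t.nv * dir[v];
    if (denom == 0.0f) return false;             // ray parallel to the plane
    // Step 1: distance to the triangle's plane along the ray.
    const float f = (t.nd - org[k] - t.nu * org[u] - t.nv * org[v]) / denom;
    // Step 2: reject early if outside the current test interval.
    if (!(f > tMin && f < tMax)) return false;
    // Steps 3-4: hit point, projected onto the (u, v) plane.
    const float hu = org[u] + f * dir[u];
    const float hv = org[v] + f * dir[v];
    // Step 5: barycentric coordinates from the two 2D edge equations.
    const float bta = hu * t.bu + hv * t.bv + t.bd;
    if (bta < 0.0f) return false;
    const float gma = hu * t.cu + hv * t.cv + t.cd;
    if (gma < 0.0f || bta + gma > 1.0f) return false;
    dist = f; beta = bta; gamma = gma;
    return true;
}
```

The early exit on the distance test matters because most candidate triangles fail it, so the hit point and barycentric work is skipped for them.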

  10. CPU Cost of Ray-Triangle Test (in CPU cycles)

              Bary. C Code   Plücker SSE   Bary. SSE   Speed-Up
        Min        78             77           22         3.5
        Max       148            123           41         3.7

      • 41 cycles ~ 20M ray-triangle intersections/sec • SSE requires bundling four rays at a time.
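A quick consistency check, using only the two numbers quoted on this slide: 41 cycles per test at 20M tests per second corresponds to a clock rate of roughly

```latex
41\ \frac{\text{cycles}}{\text{test}} \times 20 \times 10^{6}\ \frac{\text{tests}}{\text{s}}
  = 8.2 \times 10^{8}\ \frac{\text{cycles}}{\text{s}} \approx 820\ \text{MHz}
```

The slides do not state the CPU clock, so this is an implied figure rather than a reported one.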

  11. The Bundling of Four Rays at Once • Better than four triangles per ray • Requires new traversal algorithm • Potential overhead • Primary rays versus shadow rays
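To make the bundling idea concrete, here is a small portable sketch of a four-ray packet stored in structure-of-arrays form, with a plane-distance step applied to all four lanes. The names `RayPacket4` and `planeDistance4` are hypothetical, and plain loops stand in for SSE: on SSE hardware each loop body would map to a handful of four-wide instructions, with the returned bitmask playing the role of a comparison mask. It is not the authors' code.

```cpp
#include <cassert>

// Four rays in structure-of-arrays layout: lane i holds ray i.
struct RayPacket4 {
    float ox[4], oy[4], oz[4];   // origins
    float dx[4], dy[4], dz[4];   // directions
};

// Distance from each ray origin to the plane n.p = d, measured along the ray.
// Writes one distance per lane and returns a bitmask of the lanes whose
// distance lies strictly inside (tMin, tMax).
int planeDistance4(const RayPacket4& r, const float n[3], float d,
                   float tMin, float tMax, float tOut[4]) {
    int mask = 0;
    for (int i = 0; i < 4; ++i) {
        const float denom = n[0] * r.dx[i] + n[1] * r.dy[i] + n[2] * r.dz[i];
        const float numer = d - (n[0] * r.ox[i] + n[1] * r.oy[i] + n[2] * r.oz[i]);
        if (denom == 0.0f) {          // ray parallel to the plane: lane inactive
            tOut[i] = tMax;
            continue;
        }
        tOut[i] = numer / denom;
        if (tOut[i] > tMin && tOut[i] < tMax) mask |= 1 << i;
    }
    return mask;
}
```

If the mask comes back as zero, the whole packet can skip the rest of the triangle test; otherwise the remaining steps run for all four lanes and the mask decides which results are kept. That wasted work on inactive lanes is the potential overhead the slide refers to.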

  12. BSP Traversal • Before, 2x-3x more time spent than on intersections • Axis Aligned BSP Tree • Only 2 binary decisions – efficient in parallel • Any ray traverses a child node => All four traverse in parallel • Algorithm • Maintain current ray segment [near, far] • Calculate distance to splitting plane • Three cases • Update segments and traverse children if necessary
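The algorithm outlined on this slide can be illustrated with a single-ray, iterative traversal of an axis-aligned BSP tree. This is a simplified sketch under stated assumptions: the `Node`, `Ray`, and `traverse` names and the node layout are hypothetical, it handles one ray rather than the four-ray bundle described in the paper, and it records every leaf visited instead of stopping at the first hit.

```cpp
#include <cassert>
#include <vector>

// Simplified node (not the compact 8-byte layout): children are adjacent.
struct Node {
    bool leaf;
    int axis;      // inner node: split axis (0, 1, or 2)
    float split;   // inner node: split coordinate
    int left;      // inner node: index of left child; right child is left + 1
    int leafId;    // leaf: payload identifier
};

struct Ray { float org[3]; float dir[3]; };

// Returns leaf ids in front-to-back order for the ray segment [tNear, tFar].
std::vector<int> traverse(const std::vector<Node>& nodes, const Ray& ray,
                          float tNear, float tFar) {
    struct Entry { int node; float tNear, tFar; };
    std::vector<Entry> stack;     // explicit stack replaces recursion
    std::vector<int> visited;
    assert(!nodes.empty());
    int cur = 0;                  // root is node 0
    for (;;) {
        const Node& n = nodes[cur];
        if (n.leaf) {
            visited.push_back(n.leafId);
            if (stack.empty()) return visited;
            const Entry e = stack.back();
            stack.pop_back();
            cur = e.node; tNear = e.tNear; tFar = e.tFar;
            continue;
        }
        const float o = ray.org[n.axis], d = ray.dir[n.axis];
        // The child on the origin's side of the plane is visited first.
        const bool leftFirst = (o < n.split) || (o == n.split && d <= 0.0f);
        const int first  = leftFirst ? n.left : n.left + 1;
        const int second = leftFirst ? n.left + 1 : n.left;
        if (d == 0.0f) { cur = first; continue; }   // never crosses the plane
        const float tPlane = (n.split - o) / d;     // distance to splitting plane
        if (tPlane > tFar || tPlane <= 0.0f) {
            cur = first;                            // case 1: near child only
        } else if (tPlane < tNear) {
            cur = second;                           // case 2: far child only
        } else {
            const Entry e = {second, tPlane, tFar}; // case 3: both children,
            stack.push_back(e);                     // far child deferred
            cur = first;
            tFar = tPlane;
        }
    }
}
```

For a bundle of four rays, the same three-way decision is made per ray and the packet descends into a child if any of its rays needs that child, which is why incoherent rays raise the overhead.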

  13. BSP Tree Memory Layout • Caching and prefetching in mind • 1 child node pointer, node type flag, split coordinate • = 8 bytes/node = 4 nodes/cache line. • Aligned children • Memory bandwidth reduced by 4x. • Possible Overhead • Incoherent rays = high overhead • Worst case = no worse than normal
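One way to fit a child pointer, a type flag, and a split coordinate into 8 bytes is sketched below. The exact bit assignment is an assumption for illustration (the slide does not specify it), and the names `KdNode`, `makeInner`, `makeLeaf`, `isLeaf`, `splitAxis`, and `nodeIndex` are hypothetical. Storing both children next to each other is what allows a single index to address them. Four 8-byte nodes per cache line, as stated on the slide, implies 32-byte cache lines.

```cpp
#include <cassert>
#include <cstdint>

// Compact node: 4-byte split coordinate + 4-byte packed word = 8 bytes.
struct KdNode {
    float split;          // split coordinate (unused for leaves in this sketch)
    std::uint32_t data;   // bits 0-1: 0/1/2 = split axis, 3 = leaf
                          // bits 2-31: index of first child, or of the leaf's item list
};
static_assert(sizeof(KdNode) == 8, "node should occupy 8 bytes");

const std::uint32_t kLeafTag = 3u;

KdNode makeInner(int axis, float split, std::uint32_t firstChild) {
    assert(axis >= 0 && axis < 3);
    assert(firstChild < (1u << 30));      // index must fit in 30 bits
    KdNode n;
    n.split = split;
    n.data = (firstChild << 2) | static_cast<std::uint32_t>(axis);
    return n;
}

KdNode makeLeaf(std::uint32_t itemList) {
    assert(itemList < (1u << 30));
    KdNode n;
    n.split = 0.0f;
    n.data = (itemList << 2) | kLeafTag;
    return n;
}

bool isLeaf(const KdNode& n) { return (n.data & 3u) == kLeafTag; }

// Valid only for inner nodes.
int splitAxis(const KdNode& n) { return static_cast<int>(n.data & 3u); }

// Inner node: index of the left child (right child is this value + 1).
// Leaf: index of the leaf's item list.
std::uint32_t nodeIndex(const KdNode& n) { return n.data >> 2; }
```

Keeping the flag in the low bits of the index word means a traversal step touches only this one 8-byte record before deciding where to go next.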

  14. Performance of the Ray Tracer

  15. Considerations • 11-15x Performance Increase! • RTRT on 256MB RAM, others on 1GB! BUT • Difference in features • Others not limited to triangles • Others did not target performance

  16. Comparison With Raster Hardware

  17. Miscellaneous • Reflections/Shadows • Coherence less likely • Hot spots, but same hacks as in raster (environment maps). • Linear scaling for rasterization vs. logarithmic for ray tracing => higher scene complexity favors ray tracing • Gallery: http://graphics.cs.uni-sb.de/%7Ewald/Publications/EG2001_IRCRT/Gallery.html
