N-Buffers for efficient depth map query

Xavier Décoret, Artis, GRAVIR/IMAG, INRIA. Context: real-time rendering, where visibility culling quickly rejects what is not visible, i.e. what won’t affect any pixel in the final image. Many methods are available [COCSD02, PT02], among them occlusion maps.

Presentation Transcript


  1. N-Buffers for efficient depth map query • Xavier Décoret, Artis, GRAVIR/IMAG, INRIA

  2. Context • Real-time rendering • Visibility culling • quickly reject what’s not visible, i.e. what won’t affect any pixel in the final image • Many methods available [COCSD02, PT02]

  3. Occlusion maps • Select potential occluders [LG95,KCCO00] • project and rasterize them • store distance to closest one at each pixel • Z buffer / occlusion map / depth map • Traverse potential occludees • project and rasterize them • test visibility of each fragment • depth comparison against depth map - use bounding volumes - do it hierarchically
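The occludee test described on this slide can be sketched as follows. This is a minimal illustration, not the author's code: the function name `is_occluded`, the list-of-rows depth map and the inclusive `(x0, y0, x1, y1)` bounding box are assumptions made for the example. It tests every pixel under the bounding box, which is the per-pixel cost the later slides try to reduce.

```python
def is_occluded(depth_map, bbox, zmin):
    """Return True if every pixel under bbox holds a closer occluder.

    depth_map[y][x] is the distance to the closest occluder at that pixel.
    bbox is (x0, y0, x1, y1), inclusive pixel bounds of the occludee's
    projected bounding volume. zmin is the occludee's nearest depth.
    """
    x0, y0, x1, y1 = bbox
    for y in range(y0, y1 + 1):
        for x in range(x0, x1 + 1):
            # The occludee may be in front of the occluder stored here,
            # so it cannot be rejected.
            if zmin <= depth_map[y][x]:
                return False
    return True


depth_map = [
    [0.5, 0.5, 1.0],
    [0.5, 0.5, 1.0],
    [1.0, 1.0, 1.0],
]
print(is_occluded(depth_map, (0, 0, 1, 1), 0.7))  # fully behind the 0.5 occluder
print(is_occluded(depth_map, (0, 0, 2, 1), 0.7))  # overlaps pixels at depth 1.0
```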

  4. Optimizations • Reduce number of pixels tested • Hierarchical Z Buffer [ZMHH97] • Lazy Occlusion Grid [HTP01] • Summed Area Tables [HW99] • Use hardware Z buffer • implemented for hidden face removal • with optimizations [Mor00, AMN03] • exposed through Occlusion Queries

  5. Occlusion queries • # of pixels passing z test if some geometry were rendered in current framebuffer • Hardware-assisted culling [HSLM02,BWPP04] • Other applications [TPK01] • culling & clamping of shadow volumes [LWGM04] • LOD selection [ASVNB00]

  6. Motivation for N-Buffers • Query depth map within GPU • Advantages • reduce communication with CPU • allow geometry to be discarded/optimized on GPU • Constraints • limited # of operations • complex data structures unavailable • no pointers and lists • “complex” algorithms prohibited • branching and indirections costly

  7. Task at hand • For a given object, find the maximum depth covered by its projection • Depth map accessed as a texture • Lookups give information at one pixel • We need information over a region • Use texture to encode depth over a region • proximity grids

  8. The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i • various neighborhoods/sizes possible • we choose squares • with lower left corner on texel • with size 2^i × 2^i

  9. The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i (figure: depth map, level 0)

  10. The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i • that texel stores maximum depth within that region (figure: depth map, levels 0 and 1)

  11. The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i • that texel stores maximum depth within that region (figure: depth map, levels 0 to 2)

  12. The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i • that texel stores maximum depth within that region (figure: depth map, levels 0 to 3)

  13. The datastructure • Like an image pyramid but... • all levels have same resolution • level 0 (depth map) can have any dimensions • not limited to power of 2 • # of levels is log of largest dimension • but we might build only the first levels
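To make the sizing concrete, here is a small sketch of the level count and storage cost. The helper names `num_levels` and `texel_count` are hypothetical, and rounding the logarithm up (so that the top level's square spans the largest dimension) is an assumption; the slide only says the count is the log of the largest dimension.

```python
def num_levels(width, height):
    """Levels above level 0 so that 2^n covers the largest dimension.

    Assumption: the log is rounded up. (largest - 1).bit_length() equals
    ceil(log2(largest)) for any integer largest >= 1.
    """
    largest = max(width, height)
    return (largest - 1).bit_length()


def texel_count(width, height, levels_built):
    """Total texels stored: every level keeps the resolution of level 0."""
    return (levels_built + 1) * width * height


print(num_levels(640, 480))      # levels for a non-power-of-two depth map
print(texel_count(640, 480, 4))  # storage when only the first 4 levels are built
```

Building only the first few levels bounds the memory cost, at the price of only answering queries for small screen-space regions.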

  14. Construction • Level i+1 obtained from level i (figure: levels 0, 1, 2)

  15. Construction • Level i+1 obtained from level i (figure: levels 0, 1, 2)

  16. Construction • Can be done on the GPU • render scene offscreen (standard z-buffer) • copy depth to texture L[0] • for i = 1 to n • setup fragment program • render a quad • covering viewport • with unit texcoords • with fragment program • copy depth to texture L[i]
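The construction loop above can be emulated on the CPU as a sketch. The slides do not spell out what the fragment program computes; the version below assumes the natural recurrence, where a texel of level i+1 is the maximum of four texels of level i offset by 2^i, and assumes clamp-to-edge addressing at the borders. The name `build_n_buffers` is hypothetical.

```python
def build_n_buffers(depth_map, n):
    """Return levels L[0..n]; L[i][y][x] is the max depth in the
    2^i x 2^i square whose lower-left corner is texel (x, y),
    truncated at the map border."""
    height, width = len(depth_map), len(depth_map[0])
    levels = [[row[:] for row in depth_map]]  # L[0] is the depth map itself
    for i in range(n):
        prev = levels[-1]
        step = 1 << i  # side of the squares already stored in level i
        cur = []
        for y in range(height):
            y2 = min(y + step, height - 1)  # clamp to edge (assumption)
            row = []
            for x in range(width):
                x2 = min(x + step, width - 1)
                # Four level-i squares tile the level-(i+1) square.
                row.append(max(prev[y][x], prev[y][x2],
                               prev[y2][x], prev[y2][x2]))
            cur.append(row)
        levels.append(cur)
    return levels


depth_map = [
    [1, 2, 3],
    [4, 9, 6],
    [7, 8, 5],
]
levels = build_n_buffers(depth_map, 2)
for level in levels:
    print(level)
```

Each pass reads only the previous level and does a fixed amount of work per texel, which is why the steps after the first depend only on resolution.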

  20. Construction • Similar to matrix reduction... • Buck and Purcell, GPU Gems, p 626 • ...but we keep full resolution • gives us locality

  21. Construction • Complexity • first step depends on scene complexity • other steps depend only on resolution • Computation cost • ~10ms for 640x480 • no read back • measured on a GeForce FX 6800

  22. Query • Naive approach (figure, repeated on slides 22 to 28: top view and viewport, levels 0 to 5)

  23. Query • Naive approach • project occludee

  24. Query • Naive approach • project occludee • get screen space bbox • extents + zmin

  25. Query • Naive approach • project occludee • get screen space bbox • extents + zmin • get bounding neighborhood (2^5 × 2^5 in the figure)

  26. Query • Naive approach • project occludee • get screen space bbox • extents + zmin • get bounding neighborhood (2^5 × 2^5 in the figure) • do one lookup • in matching level • at lower left corner • gives zmax

  27. Query • Naive approach • project occludee • get screen space bbox • extents + zmin • get bounding neighborhood (2^5 × 2^5 in the figure) • do one lookup • in matching level • at lower left corner • compare zmin and zmax

  28. Query • Naive approach • Overly conservative • (bvolume of occludee) • screenspace bbox • bounding neighborhood (2^5 × 2^5 in the figure) • Need a tighter coverage
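The naive query, and why it is overly conservative, can be sketched as below. All names are hypothetical, the levels are built by brute force only to keep the example short, and returning "not occluded" when the matching level was not built is an assumption.

```python
def region_max(depth_map, x, y, size):
    """Max depth in the size x size square with lower-left corner (x, y),
    truncated at the map border (brute-force reference)."""
    return max(max(row[x:x + size]) for row in depth_map[y:y + size])


def build_levels(depth_map, n):
    """Levels 0..n, all at the resolution of the depth map."""
    h, w = len(depth_map), len(depth_map[0])
    return [[[region_max(depth_map, x, y, 1 << i) for x in range(w)]
             for y in range(h)] for i in range(n + 1)]


def naive_is_occluded(levels, x0, y0, width, height, zmin):
    """One lookup in the level whose square bounds the screen-space bbox."""
    level = (max(width, height) - 1).bit_length()  # smallest i with 2^i >= side
    if level >= len(levels):
        return False  # level not built: conservatively treat as visible
    zmax = levels[level][y0][x0]  # lookup at the lower-left corner
    return zmin > zmax


depth_map = [
    [0.2, 0.2, 0.2, 0.9],
    [0.2, 0.2, 0.2, 0.9],
    [0.2, 0.2, 0.2, 0.9],
    [0.9, 0.9, 0.9, 0.9],
]
levels = build_levels(depth_map, 2)
print(naive_is_occluded(levels, 0, 0, 2, 2, 0.5))  # 2x2 bbox matches level 1 exactly
print(naive_is_occluded(levels, 0, 0, 3, 3, 0.5))  # 3x3 bbox is rounded up to 4x4
```

In the second call the 3x3 box lies entirely over depth 0.2, yet the 4x4 bounding neighborhood also picks up the 0.9 texels, so the occludee is not rejected.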

  29. 4 tiles coverage • depth max in region ≥ depth max in sub-region • z ≤ zmax (figure: screen-space bbox, 2^5 × 2^5 bounding neighborhood, 2^4 × 2^4 tiles)

  31. 4 tiles coverage • depth max in region ≥ depth max in sub-region • z ≤ zmax = max(z1, z2, z3, z4) (figure: screen-space bbox, 2^5 × 2^5 bounding neighborhood, 2^4 × 2^4 tiles)
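A sketch of the four-tile lookup follows. The slides give the idea but not the exact placement of the tiles; this version assumes one tile anchored at each corner of the bounding box, in the largest level whose square is no larger than the box's longest side, so the tiles overlap and together cover the box. Function names are hypothetical and the levels are built by brute force for brevity.

```python
def region_max(depth_map, x, y, size):
    """Max depth in the size x size square with lower-left corner (x, y),
    truncated at the map border (brute-force reference)."""
    return max(max(row[x:x + size]) for row in depth_map[y:y + size])


def build_levels(depth_map, n):
    h, w = len(depth_map), len(depth_map[0])
    return [[[region_max(depth_map, x, y, 1 << i) for x in range(w)]
             for y in range(h)] for i in range(n + 1)]


def four_tile_zmax(levels, x0, y0, width, height):
    """Conservative max depth over the bbox using four lookups.

    Returns None when the required level was not built.
    """
    k = max(width, height).bit_length() - 1  # largest k with 2^k <= longest side
    if k >= len(levels):
        return None
    s = 1 << k
    # Second tile in each axis is pushed against the far edge of the bbox;
    # since the longest side is < 2s, the two tiles overlap and cover it.
    x1 = max(x0, x0 + width - s)
    y1 = max(y0, y0 + height - s)
    level = levels[k]
    return max(level[y0][x0], level[y0][x1], level[y1][x0], level[y1][x1])


depth_map = [
    [0.2, 0.2, 0.2, 0.9],
    [0.2, 0.2, 0.2, 0.9],
    [0.2, 0.2, 0.2, 0.9],
    [0.9, 0.9, 0.9, 0.9],
]
levels = build_levels(depth_map, 2)
print(four_tile_zmax(levels, 0, 0, 3, 3))  # four 2x2 tiles cover the 3x3 bbox
print(levels[2][0][0])                     # the single 4x4 lookup, for comparison
```

For the 3x3 box the four tiles see only the 0.2 texels, while the single bounding-neighborhood lookup also includes the 0.9 border.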

  32. 4 tiles coverage • 5 ways of covering with 4 squares • Measure of the gain on over-conservativity

  33. Applications Occlusion culling Particles Shadow volume clamping

  35. Occlusion Culling • N-Buffer vs. Occlusion Queries • walkthrough in city-like scene • occluders at frame n = visible at frame n-1 • Measured the number of depth tests • testing each building • using a hierarchy of bounding volumes

  36. Occlusion Culling • Occlusion queries are faster • hardware implementation, available API • N-Buffers penalized • computation of 4 tiles coverage on CPU • use of glReadPixels to query levels • Occlusion queries can be interleaved with rendering [BWPP04]

  37. Occlusion Culling • # of depth tests smaller with N-Buffers • 4 tests/occludee << nb of pixels rasterized • N-Buffers always benefit from hierarchy • testing A cheaper than testing children(A) • not the case for OQ

  38. Occlusion Culling • # of depth tests smaller with N-Buffers • 4 tests/occludee << nb of pixels rasterized • N-Buffers always benefit from hierarchy • testing A cheaper than testing children(A) • not the case for OQ • a parent rasterizes n pixels and its children n1 and n2, with n > n1 + n2
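The benefit of a bounding-volume hierarchy with a fixed-cost test can be sketched as below. This is an illustration only: the `Node` class, the `is_hidden` callback and the test counter are hypothetical, and the callback stands in for whatever visibility test is used (for N-Buffers, four lookups per node regardless of its screen size).

```python
class Node:
    """A bounding volume with optional children (hypothetical structure)."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def visible_leaves(node, is_hidden, counter):
    """Collect leaves not rejected by is_hidden, counting tests performed."""
    counter[0] += 1
    if is_hidden(node):
        return []  # the whole subtree is culled with a single test
    if not node.children:
        return [node.name]
    result = []
    for child in node.children:
        result.extend(visible_leaves(child, is_hidden, counter))
    return result


# Block A is hidden, so its two buildings are never tested individually.
scene = Node("city", [
    Node("block A", [Node("building 1"), Node("building 2")]),
    Node("building 3"),
])
hidden = {"block A"}
tests = [0]
print(visible_leaves(scene, lambda n: n.name in hidden, tests))
print(tests[0])  # number of visibility tests performed
```

With a constant cost per test, rejecting a parent is always cheaper than testing its children one by one, which matches the claim on the slide.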

  39. Hardware implementation? • Extra memory to store levels • Dedicated component for level updates • not all levels? • lazy updates? • Faster than OQ for large objects • Fixed (4) number of operations • simple implementation • good for parallelism

  40. Applications Occlusion culling Particles Shadow volume clamping

  41. Particles • Particle rendered using ARB_point_sprite • no need to compute quad on CPU • Particle animated within GPU • up to a million particles in real-time

  42. Particles • Particle rendered using ARB_point_sprite • no need to compute quad on CPU • Particle animated within GPU • up to a million particles in real-time • How to cull unseen particles? • cannot use OQ!

  43. Particles • Using N-Buffers • for 16x16 point sprites • compute the first 4 levels only • do one texture lookup in vertex program • Not implementable yet • vertex program lookups require LUMINANCE_FLOAT32_ATI • N-Buffers require DEPTH_COMPONENT
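The single-lookup particle test can be emulated on the CPU as a sketch. It assumes that level 4 (16x16 squares) is the level matching a 16x16 sprite, that the sprite is centred on the particle, and that the lookup position is clamped to the map; the names `next_level` and `particle_hidden` are hypothetical, and this does not reproduce an actual vertex program.

```python
def next_level(prev, step):
    """Level i+1 from level i, with step = 2^i and clamp-to-edge borders."""
    h, w = len(prev), len(prev[0])
    return [[max(prev[y][x],
                 prev[y][min(x + step, w - 1)],
                 prev[min(y + step, h - 1)][x],
                 prev[min(y + step, h - 1)][min(x + step, w - 1)])
             for x in range(w)] for y in range(h)]


def particle_hidden(level4, cx, cy, z, sprite=16):
    """One lookup at the sprite's lower-left corner in the 16x16 level."""
    h, w = len(level4), len(level4[0])
    x = min(max(cx - sprite // 2, 0), w - 1)
    y = min(max(cy - sprite // 2, 0), h - 1)
    return z > level4[y][x]


# 32x32 depth map: an occluder at depth 0.3 fills the left half.
depth_map = [[0.3 if x < 16 else 1.0 for x in range(32)] for y in range(32)]
levels = [depth_map]
for i in range(4):  # only the first 4 levels are needed
    levels.append(next_level(levels[-1], 1 << i))

print(particle_hidden(levels[4], 8, 8, 0.5))   # sprite fully behind the occluder
print(particle_hidden(levels[4], 12, 8, 0.5))  # sprite spills past the occluder
```

Because every sprite has the same size, the level is known in advance and the test is one lookup per particle.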

  44. Applications Occlusion culling Particles Shadow volume clamping

  45. Shadow volumes clamping • Ignore unseen or fully shadowed casters • Clamp shadow volume to shadowed area [LWGM04]

  46. Shadow volumes clamping • From light’s view, what part of the (visible) scene does a shadow volume encompass? (figure: light, camera, scene)

  48. The litmap • Light view of what’s seen by viewer (figure: light’s view, camera’s view)

  50. Shadow volumes clamping • From light’s view, what part of the (visible) scene does a shadow volume encompass? • Minimum/maximum depth covered by a shadow caster
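A minimum/maximum depth query can be sketched by keeping two sets of levels, one built with min and one with max. This is an assumption about how such a query could be organised, not a description of the author's implementation; it ignores how texels not seen by the viewer are encoded in the litmap, and the names `build_levels` and `depth_range` are hypothetical.

```python
def build_levels(depth_map, n, op):
    """Levels 0..n where op (min or max) combines four texels of the
    previous level, with clamp-to-edge borders."""
    levels = [[row[:] for row in depth_map]]
    for i in range(n):
        prev, step = levels[-1], 1 << i
        h, w = len(prev), len(prev[0])
        levels.append([[op(prev[y][x],
                           prev[y][min(x + step, w - 1)],
                           prev[min(y + step, h - 1)][x],
                           prev[min(y + step, h - 1)][min(x + step, w - 1)])
                        for x in range(w)] for y in range(h)])
    return levels


def depth_range(min_levels, max_levels, x0, y0, width, height):
    """Conservative (zmin, zmax) over a caster's bbox, four lookups each.

    Assumes the level for the bbox size has been built.
    """
    k = max(width, height).bit_length() - 1
    s = 1 << k
    xs = (x0, max(x0, x0 + width - s))
    ys = (y0, max(y0, y0 + height - s))
    lo = min(min_levels[k][y][x] for y in ys for x in xs)
    hi = max(max_levels[k][y][x] for y in ys for x in xs)
    return lo, hi


litmap = [
    [0.4, 0.5, 0.9, 0.9],
    [0.6, 0.7, 0.9, 0.9],
    [0.9, 0.9, 0.9, 0.9],
    [0.1, 0.9, 0.9, 0.9],
]
min_levels = build_levels(litmap, 2, min)
max_levels = build_levels(litmap, 2, max)
print(depth_range(min_levels, max_levels, 0, 0, 2, 2))
```

The returned interval always encloses the true depth range under the box, so clamping a shadow volume to it stays conservative.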
