N-Buffers for efficient depth map query

Xavier Décoret, Artis, GRAVIR/IMAG INRIA

Context • Real-time rendering • Visibility culling • quickly reject what’s not visible (what won’t affect any pixel in final image) • Many methods available [COCSD02,PT02]

Occlusion maps • Select potential occluders [LG95,KCCO00] • project and rasterize them • store distance to closest one at each pixel • Z buffer / occlusion map / depth map • Traverse potential occludees • project and rasterize them • test visibility of each fragment • depth comparison against depth map • use bounding volumes • do it hierarchically

Optimizations • Reduce number of pixels tested • Hierarchical Z Buffer [ZMHH97] • Lazy Occlusion Grid [HTP01] • Summed Area Tables [HW99] • Use hardware Z buffer • implemented for hidden face removal • with optimizations [Mor00, AMN03] • exposed through Occlusion Queries

Occlusion queries • # of pixels passing z test if some geometry were rendered in current framebuffer • Hardware-assisted culling [HSLM02,BWPP04] • Other applications [TPK01] • culling & clamping of shadow volumes [LWGM04] • LOD selection [ASVNB00]

Motivation for N-Buffers • Query depth map within GPU • Advantages • reduces communication with CPU • allows discarding/optimizing geometry on GPU • Constraints • limited # of operations • complex data structures unavailable • no pointers or lists • “complex” algorithms prohibited • branching and indirections costly

Task at hand • For a given object, find the maximum depth covered by its projection • Depth map accessed as a texture • Lookups give information at one pixel • We need information over a region • Use texture to encode depth over a region • proximity grids

The datastructure • Sequence of depth maps (levels) • At level i a texel stores maximum depth in a neighborhood of size i • various neighborhoods/sizes possible • we choose squares • with lower left corner on texel • with size 2^i x 2^i

[Figure: the depth map and levels 0, 1, 2 and 3; at each level, the highlighted texel stores the maximum depth within the highlighted region]

The datastructure • Like an image pyramid but... • all levels have same resolution • level 0 (depth map) can have any dimensions • not limited to power of 2 • # of levels is log2 of largest dimension • but we might build only the first levels
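The definition above can be written down directly as a brute-force reference. This is only a sketch for checking other implementations, not the construction used in the talk. The names `max_level` and `level_reference` are illustrative, the depth map is assumed to be a list of rows indexed as `depth[y][x]` with y increasing upwards, and texels falling outside the map are simply ignored (an assumption, since the slides do not say how borders are handled).

```python
def max_level(width, height):
    """Index of the last useful level: smallest k with 2^k >= max(width, height)."""
    return (max(width, height) - 1).bit_length()


def level_reference(depth, i):
    """Brute-force level i of an N-Buffer.

    Texel (x, y) receives the maximum of `depth` over the 2^i x 2^i square
    whose lower-left corner is (x, y). Texels outside the map are ignored
    (assumed border policy).
    """
    h, w = len(depth), len(depth[0])
    s = 1 << i  # side of the square neighborhood
    return [[max(depth[v][u]
                 for v in range(y, min(y + s, h))
                 for u in range(x, min(x + s, w)))
             for x in range(w)]
            for y in range(h)]
```

Every level has the same resolution as the depth map, and level 0 is the depth map itself. The cost grows with the square size, which is why an incremental construction is preferable in practice.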

Construction • Level i+1 obtained from level i [Figure: level 0, level 1, level 2]

Construction • Can be done on the GPU • render scene offscreen (standard z-buffer) • copy depth to texture L[0] • for i = 1 to n • setup fragment program • render a quad • covering viewport • with unit texcoords • with fragment program • copy depth to texture L[i]
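The loop above can be mimicked on the CPU to see what each pass computes. The slides do not spell out the fragment program, so the four-sample rule below is an assumption: a 2^(i+1) x 2^(i+1) square is the union of four 2^i x 2^i squares offset by 2^i, hence a texel of level i+1 is taken as the maximum of four texels of level i. The function name `build_nbuffer` and the clamp-to-edge border handling are likewise illustrative choices.

```python
def build_nbuffer(depth, n):
    """Return the list of levels L[0..n] for a depth map given as depth[y][x].

    Assumed rule: L[i+1](x, y) is the max of L[i] sampled at (x, y),
    (x + 2^i, y), (x, y + 2^i) and (x + 2^i, y + 2^i).
    """
    h, w = len(depth), len(depth[0])
    levels = [[row[:] for row in depth]]  # L[0] is a copy of the depth map
    for i in range(n):
        prev = levels[-1]
        s = 1 << i  # offset between the four samples
        nxt = []
        for y in range(h):
            # Clamp sample coordinates to the map edge (assumed border policy,
            # comparable to clamp-to-edge texture addressing).
            ys = min(y + s, h - 1)
            row = []
            for x in range(w):
                xs = min(x + s, w - 1)
                row.append(max(prev[y][x], prev[y][xs],
                               prev[ys][x], prev[ys][xs]))
            nxt.append(row)
        levels.append(nxt)
    return levels
```

Each pass reads only the previous level and touches every texel once, which matches the claim that the cost of these steps depends on resolution rather than scene complexity.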

Construction • Similar to matrix reduction... • Buck and Purcell, GPU Gems, p 626 • ...but we keep full resolution • gives us locality

Construction • Complexity • first step depends on scene complexity • other steps depend only on resolution • Computation cost • ~10ms for 640x480 on GeForce FX 6800 • no read back

Query • Naive approach • project occludee • get screen space bbox • extents + zmin • get bounding neighborhood (2^5 x 2^5 in the example) • do one lookup to get zmax • in matching level • at lower left corner • compare zmin and zmax [Figure: top view of the viewport with levels 0 to 5]
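The naive query can be sketched as follows, assuming the levels are available as `levels[i][y][x]` and that larger depth means farther from the viewer. The helper names are illustrative, and the bbox is given by its lower-left texel `(x0, y0)` and its extents `w` x `h` in texels.

```python
def bounding_level(w, h):
    """Smallest level k whose 2^k x 2^k square contains a w x h bbox."""
    return (max(w, h) - 1).bit_length()


def naive_zmax(levels, x0, y0, w, h):
    """One lookup: matching level, at the lower-left corner of the bbox."""
    return levels[bounding_level(w, h)][y0][x0]


def naive_is_hidden(levels, x0, y0, w, h, zmin):
    """The occludee is hidden if its nearest depth lies behind every occluder
    depth found in the bounding neighborhood."""
    return zmin > naive_zmax(levels, x0, y0, w, h)
```

The lookup returns the maximum over the whole bounding square, which can be up to almost twice as wide as the bbox in each direction. A single far texel outside the bbox but inside that square is enough to make the test fail.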

Query • Naive approach • Overly conservative • (bvolume of occludee) • screenspace bbox • bounding neighborhood (2^5 x 2^5) • Need a tighter coverage [Figure: top view of the viewport with levels 0 to 5]

4 tiles coverage • depthmax in region ≥ depthmax in sub-region • z = max(z1, z2, z3, z4) ≤ zmax [Figure: screenspace bbox inside its 2^5 x 2^5 bounding neighborhood, covered by four 2^4 x 2^4 tiles]
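A minimal sketch of the four-lookup query, under the same assumptions as before (`levels[i][y][x]`, larger depth is farther, bbox given by lower-left texel and extents). It places one tile of the level below the bounding level in each corner of the bbox. This is only one possible arrangement and the function name is hypothetical. For non-square bboxes the tiles may stick out along the short axis, so the result stays conservative but is not always exact.

```python
def four_tile_zmax(levels, x0, y0, w, h):
    """Max depth over four overlapping square tiles that cover the bbox."""
    # One level below the bounding level: tiles of side s with 2*s >= max(w, h).
    k = max((max(w, h) - 1).bit_length() - 1, 0)
    s = 1 << k
    # Offset of the second column/row of tiles; 0 when a single tile
    # already spans the bbox along that axis.
    dx = max(w - s, 0)
    dy = max(h - s, 0)
    lvl = levels[k]
    z1 = lvl[y0][x0]
    z2 = lvl[y0][x0 + dx]
    z3 = lvl[y0 + dy][x0]
    z4 = lvl[y0 + dy][x0 + dx]
    return max(z1, z2, z3, z4)
```

Because the four tiles lie inside the bounding neighborhood, the value returned never exceeds the single-lookup zmax, and for a square bbox it equals the exact maximum over the bbox.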

4 tiles coverage • 5 ways of covering with 4 squares • Measure of the gain on over-conservativity

Applications • Occlusion culling • Particles • Shadow volume clamping

Occlusion Culling • N-Buffer vs. Occlusion Queries • walkthrough in city-like scene • occluders at frame n = visible at frame n-1 • Measured the number of depth tests • testing each building • using a hierarchy of bounding volumes

Occlusion Culling • Occlusion queries are faster • hardware implementation, available API • N-Buffers penalized • computation of 4 tiles coverage on CPU • use of glReadPixels to query levels • Occlusion queries can be interleaved with rendering [BWPP04]

Occlusion Culling • # of depth tests smaller with N-Buffers • 4 tests/occludee << nb of pixels rasterized • N-Buffers always benefit from hierarchy • testing A cheaper than testing children(A) • not the case for OQ [Figure: a bounding volume rasterizing n pixels and its two children rasterizing n1 and n2 pixels, with n > n1 + n2]

Hardware implementation? • Extra memory to store levels • Dedicated component for level updates • not all levels? • lazy updates? • Faster than OQ for large objects • Fixed (4) number of operations • simple implementation • good for parallelism

Particles • Particles rendered using ARB_point_sprite • no need to compute quad on CPU • Particles animated within GPU • up to a million particles in real-time • How to cull unseen particles? • cannot use OQ!

Particles • Using N-Buffers • for 16x16 point sprites • compute 4 first levels only • do one texture lookup in vertex program • Not implementable yet • vertex program lookups require LUMINANCE_FLOAT32_ATI • N-Buffers require DEPTH_COMPONENT
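The single lookup per particle can be illustrated on the CPU. This sketch assumes that "the 4 first levels" means levels 1 through 4, so that a level-4 texel spans exactly the 16x16 footprint of a sprite. It also assumes integer texel coordinates for the sprite centre, larger depth meaning farther, and a clamp at the map border. The function name and parameters are hypothetical, and this is not vertex program code.

```python
def particle_may_be_visible(level4, cx, cy, z, half=8):
    """Conservative visibility test for a 16x16 sprite centred on (cx, cy).

    `level4[y][x]` holds the max depth of the 16x16 square whose lower-left
    corner is (x, y); `z` is the particle depth.
    """
    h, w = len(level4), len(level4[0])
    # Lower-left corner of the sprite footprint, clamped to the map
    # (assumed border policy; the clamped square still covers the visible
    # part of the sprite).
    x = min(max(cx - half, 0), w - 1)
    y = min(max(cy - half, 0), h - 1)
    # Hidden only if the particle lies behind the farthest occluder depth
    # found anywhere in its footprint.
    return z <= level4[y][x]
```

A particle for which the function returns False is behind every occluder texel in its footprint and could be discarded before rasterization.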

Shadow volumes clamping • Ignore unseen or fully shadowed casters • Clamp shadow volume to shadowed area [LWGM04]

Shadow volumes clamping • From light’s view, what part of the (visible) scene does a shadow volume encompass? [Figure: light, camera and scene]

The litmap • Light view of what’s seen by viewer [Figure: light’s view and camera’s view]

Shadow volumes clamping • From light’s view, what part of the (visible) scene does a shadow volume encompass? • Minimum/maximum depth covered by a shadow caster
