Render Cache
John Tran
CS851 - Interactive Ray Tracing
February 5, 2003
Main Points
• Layering interactive display processes over high-quality but slow renderers is an increasingly popular idea
• The render cache stores recent shading results from the underlying renderer as colored 3D points in a fixed-size cache
• For each frame, those points are reprojected onto the new image plane and the result is filtered
• Implemented completely in software!
• (Self-proclaimed addictive interface)
Separating the Renderer from the Display Process
• See Figure 1, paper 1
• Reduces the frame rate’s dependence on the speed of the renderer
• The display process provides interactive feedback to the user even when the renderer itself is too slow
• The display process caches previous frame results in the form of 3D colored points (the “Render Cache”)
• Can use any renderer (ray tracing, path tracing…)
• The only requirement is that the renderer can efficiently compute individual rays or pixels
Problems with Reprojection
• The pixel-to-point mapping is not a bijection: some pixels receive several points, others none
• The mismatch grows the faster the camera moves
• Occlusion errors
• Non-diffuse shading
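To make the non-bijection concrete, here is a minimal sketch of reprojection with a z-buffer. The pinhole camera model, the `focal` parameter, and the function names are illustrative assumptions, not the paper's implementation.

```python
import math


def project(points, width, height, focal):
    """Project camera-space points onto a width x height image.

    points: iterable of (x, y, z, color), camera looking down +z (assumed).
    Returns (zbuf, collisions): zbuf maps (px, py) -> (depth, color) for the
    nearest point at each pixel; collisions counts points that landed on an
    already occupied pixel.
    """
    zbuf = {}
    collisions = 0
    for x, y, z, color in points:
        if z <= 0:
            continue  # behind the camera
        px = math.floor(width / 2 + focal * x / z)
        py = math.floor(height / 2 + focal * y / z)
        if not (0 <= px < width and 0 <= py < height):
            continue  # outside the view
        if (px, py) in zbuf:
            collisions += 1
            if z >= zbuf[(px, py)][0]:
                continue  # an equally near or nearer point already owns this pixel
        zbuf[(px, py)] = (z, color)
    return zbuf, collisions


def count_holes(zbuf, width, height):
    """Number of pixels that no cached point projected onto."""
    return width * height - len(zbuf)
```

A nonzero `collisions` count and a nonzero hole count for the same frame are the two sides of the mapping failing to be one-to-one.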
Image Generation
• Project the render cache points onto the new image plane
• The transform is specified by the application
• Depth culling uses a 3x3 grid around each pixel: compute the average depth and compare the pixel’s depth to that average
• Smoothing filter: 3x3 weighted grid (weights 4, 2, 1)
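The depth cull and smoothing steps could be sketched roughly as follows. The relative tolerance `tol`, the grayscale float colors, and the weight layout (4 at the center, 2 for edge neighbors, 1 for corners) are assumptions made for illustration; the slides only give the 3x3 size and the weights 4, 2, 1.

```python
# Assumed layout of the (4, 2, 1) weights: center 4, edges 2, corners 1.
WEIGHTS = [[1, 2, 1],
           [2, 4, 2],
           [1, 2, 1]]


def depth_cull(depth, tol=0.1):
    """Return a validity mask for a depth image.

    depth[y][x] is None where no point landed. A pixel is culled when its
    depth exceeds the 3x3 neighborhood average by more than tol (an assumed
    relative threshold), i.e. it is probably a surface showing through.
    """
    h, w = len(depth), len(depth[0])
    valid = [[depth[y][x] is not None for x in range(w)] for y in range(h)]
    for y in range(h):
        for x in range(w):
            if depth[y][x] is None:
                continue
            near = [depth[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))
                    if depth[j][i] is not None]
            avg = sum(near) / len(near)
            if depth[y][x] > avg * (1 + tol):
                valid[y][x] = False
    return valid


def smooth(color, valid):
    """Weighted 3x3 average over valid pixels; None where nothing is valid."""
    h, w = len(color), len(color[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = wsum = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    j, i = y + dy, x + dx
                    if 0 <= j < h and 0 <= i < w and valid[j][i]:
                        wt = WEIGHTS[dy + 1][dx + 1]
                        total += wt * color[j][i]
                        wsum += wt
            if wsum > 0:
                out[y][x] = total / wsum
    return out
```

Because the filter averages only valid pixels, it also fills small holes left by culling or by missing points.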
Where to sample?
• Assume the renderer is slow relative to the frame dimensions (it cannot resample every pixel each frame)
• A priority image decides where new samples go
• Priority is based on a pixel’s age (the oldest pixels are updated soonest)
• Priority for pixels that received a point is set from the values in the render cache; for the others it is interpolated from neighbors
• Pixels with no valid neighbors receive the maximum priority
• If a pixel’s priority exceeds a threshold, that pixel is requested as a sample from the renderer
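A minimal sketch of the priority image, under the assumption that priority is simply the point's age in frames clamped to a hypothetical `MAX_PRIORITY`, and that interpolation is a plain average over the 3x3 neighborhood:

```python
MAX_PRIORITY = 255  # assumed ceiling; the slides do not give a value


def priority_image(age, max_priority=MAX_PRIORITY):
    """Build a priority image from per-pixel point ages.

    age[y][x] is the age (in frames) of the point shown at that pixel,
    or None if no point landed there.
    """
    h, w = len(age), len(age[0])
    prio = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if age[y][x] is not None:
                prio[y][x] = min(age[y][x], max_priority)
                continue
            # No point here: interpolate from neighbors that have one.
            near = [min(age[j][i], max_priority)
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))
                    if age[j][i] is not None]
            prio[y][x] = sum(near) / len(near) if near else max_priority
    return prio


def sample_requests(prio, threshold):
    """Pixels (x, y) whose priority exceeds the threshold, in scan order."""
    return [(x, y)
            for y, row in enumerate(prio)
            for x, p in enumerate(row)
            if p > threshold]
```

Raising or lowering `threshold` is one way to match the number of requests to what the renderer can deliver per frame.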
The Render Cache
• Fixed size, slightly larger than the number of pixels to be displayed
• A new sample that resamples an existing point overwrites that point in the cache
• Replacement is not a true LRU; instead, the oldest point in a group of 8 is replaced, and the groups are visited in round-robin order
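The replacement policy described above might look like the following sketch. The class name, the slot layout, and the treatment of empty slots as infinitely old are assumptions for illustration.

```python
class RenderCache:
    """Fixed-size point cache with approximate LRU replacement."""

    GROUP = 8  # points examined per replacement, as stated on the slide

    def __init__(self, capacity):
        if capacity % self.GROUP:
            raise ValueError("capacity must be a multiple of GROUP")
        self.slots = [None] * capacity  # each slot: [age, payload] or None
        self.cursor = 0                 # start of the next group to examine

    def tick(self):
        """Age every cached point by one frame."""
        for slot in self.slots:
            if slot is not None:
                slot[0] += 1

    def _age(self, i):
        # Empty slots count as infinitely old so they are filled first.
        return float("inf") if self.slots[i] is None else self.slots[i][0]

    def insert(self, payload):
        """Store a new point, replacing the oldest point of the current group."""
        group = range(self.cursor, self.cursor + self.GROUP)
        self.cursor = (self.cursor + self.GROUP) % len(self.slots)
        victim = max(group, key=self._age)
        self.slots[victim] = [0, payload]
        return victim

    def resample(self, index, payload):
        """A resample of an existing point overwrites that point in place."""
        self.slots[index] = [0, payload]
```

Only 8 ages are compared per insertion, so the cost stays constant regardless of cache size, at the price of sometimes evicting a point that is not the globally oldest.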
Mismatch ratio
• Number of pixels in a frame / number of new samples produced by the renderer per frame
• The Render Cache is best at handling high mismatch ratios (up to 64)
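As a quick worked example of the definition, the helper names below are hypothetical; the arithmetic follows directly from the ratio as stated.

```python
import math


def mismatch_ratio(width, height, samples_per_frame):
    """Pixels in a frame divided by new samples produced per frame."""
    return (width * height) / samples_per_frame


def frames_to_refresh(width, height, samples_per_frame):
    """Minimum number of frames before every pixel could be resampled once."""
    return math.ceil(width * height / samples_per_frame)


# A 256x256 frame has 65536 pixels; a renderer delivering 1024 samples
# per frame therefore runs at a mismatch ratio of 64.
print(mismatch_ratio(256, 256, 1024))
print(frames_to_refresh(256, 256, 1024))
```

At a ratio of 64, most of what is on screen at any moment comes from reprojected cache points rather than fresh samples.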
Implementation and Results
• Used a 195 MHz R10000 in an SGI Origin 2000
• 256x256 at 14 fps
• Uniprocessors would have to split time between the display process and the renderer
• Best for multiprocessors
• Implemented with shared memory, although message passing is mentioned as a possibility
• Used machines with 1 to 60 processors
2002 Update
• Predictive sampling
• Split projection/tiled z-buffer for memory coherence
• Prefilter
• SIMD optimization
Predictive sampling
• Use predicted camera parameters
• Project the cached points onto a lower-resolution image
• Create sample requests for any pixel of that image onto which no point projected
• If that yields too many requests for this frame, use every nth one
• Relies on the application to provide the predicted camera
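The request-generation step can be sketched as below, assuming the projection under the predicted camera has already produced a boolean coverage mask for the low-resolution image. The `covered` mask, the `budget` parameter, and the function name are illustrative assumptions.

```python
def predictive_requests(covered, budget):
    """Sample requests for low-resolution pixels no cached point reached.

    covered[y][x] is True where some point projected under the predicted
    camera. budget is the (positive) number of requests allowed this frame.
    """
    holes = [(x, y)
             for y, row in enumerate(covered)
             for x, hit in enumerate(row)
             if not hit]
    if len(holes) > budget:
        n = -(-len(holes) // budget)  # ceiling division: smallest stride that fits
        holes = holes[::n]            # keep every nth request
    return holes
```

Taking every nth request keeps the selection spread across the hole list instead of spending the whole budget on its first entries.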
Tiled z-buffer
• Break up the frame into tiles for memory coherence
• Sort-first
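One plausible reading of the tiling idea is a two-pass scheme: first bucket the projected points by screen tile, then resolve depth one tile at a time so each pass touches a small, contiguous part of the z-buffer. The sketch below follows that reading; the tile size and function names are assumptions, and the actual 2002 implementation may differ.

```python
def bin_by_tile(projected, tile=8):
    """Pass 1: bucket projected points (px, py, depth, color) by screen tile."""
    tiles = {}
    for px, py, depth, color in projected:
        tiles.setdefault((px // tile, py // tile), []).append(
            (px, py, depth, color))
    return tiles


def resolve_tiles(tiles):
    """Pass 2: z-buffer each tile independently and merge the results."""
    image = {}
    for key in sorted(tiles):
        zbuf = {}  # small per-tile buffer; this is where locality pays off
        for px, py, depth, color in tiles[key]:
            if (px, py) not in zbuf or depth < zbuf[(px, py)][0]:
                zbuf[(px, py)] = (depth, color)
        image.update(zbuf)
    return image
```

Tiles never overlap, so merging the per-tile results needs no further depth comparison.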
Image Prefilter
• Adds a 7x7 uniform filter alongside the 3x3 filter
• Handles sparse regions better
• The larger prefilter is only used where the 3x3 filter fails
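The fallback from the small window to the large one might be sketched like this. For brevity both windows use uniform weights and grayscale float colors, which is a simplifying assumption (the 3x3 filter in the original system is weighted).

```python
def window_average(color, valid, x, y, radius):
    """Uniform average of valid pixels in a (2*radius+1)^2 window, or None."""
    h, w = len(color), len(color[0])
    vals = [color[j][i]
            for j in range(max(0, y - radius), min(h, y + radius + 1))
            for i in range(max(0, x - radius), min(w, x + radius + 1))
            if valid[j][i]]
    return sum(vals) / len(vals) if vals else None


def prefilter(color, valid):
    """Try the 3x3 window first; fall back to 7x7 only where it finds nothing."""
    h, w = len(color), len(color[0])
    out = [[None] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            value = window_average(color, valid, x, y, 1)      # 3x3
            if value is None:
                value = window_average(color, valid, x, y, 3)  # 7x7 fallback
            out[y][x] = value
    return out
```

Pixels that even the 7x7 window cannot fill stay `None` and would have to wait for new samples.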
More speedups
• Point evictions
• Previously, points stayed in the cache until they were overwritten
• Now a point can be evicted if it gets too old/stale
• SIMD optimizations
Conclusions
• The new version can render 512x512 faster than the old version rendered 256x256
• Half of this speedup is probably due to SIMD