220 likes | 320 Views
Using wavelets on the XBOX360. Mike Boulton Senior Software Engineer Rare/MGS mboulton@microsoft.com. For current and future games San Francisco, GDC 2008. Introduction. Fixed compression methods such as DXT are becoming antiquated
E N D
Using wavelets on the XBOX360 Mike Boulton Senior Software Engineer Rare/MGS mboulton@microsoft.com For current and future games San Francisco, GDC 2008
Introduction • Fixed compression methods such as DXT are becoming antiquated • Distribution of data “density” becoming less uniform as complexity increases • Why can’t compression efforts be focused where it matters most? • Wavelets can help!
What are wavelets? • Wavelets are mathematical functions formed from scaled and translated copies of a few basis functions • The advantage is their ability to localize functions in both frequency and space • Fourier just frequency • “Point basis” just space • There are lots of common wavelet bases
Why 2D Haar? • This talk will focus on 2D Haar wavelets • Haar is the simplest basis set • In general, this means more basis terms and/or blocky artefacts • But also easiest to implement • “Blocky” nature of bases is less of a problem for operations such as integration • What do the wavelets look like?
2D Haar continued... • Three basis wavelet functions, plus ‘solid’ scaling function • White represents +1, black -1
The wavelet advantage • Local coverage allows windowed changes • Compared with spherical harmonics, which have global cover • This means that local changes only involve local bases • Mip-map chains not required for image decompression • Truly variable compression • Can focus efforts where it counts
Usage examples • Real-time shader image decompression • Lighting • Static shadow maps • Displacement maps over large areas • Easy dynamic texture packing • Geometry representations • ...many more!
Talk focus • Real-time shader image decompression • Real-time on XBOX360 GPU • 500Hz+ for full-screen 720p monochrome decompression • Double-product integration for relighting • Real-time (ish!) on XBOX360 GPU • Other applications • E.G. geometry representation • If time!
Real-time image decompression • Image broken up into 16x16 texel blocks • Each block compressed into wavelet sub-tree • Texture at 1/16th resolution stores scaling coefficient and offset to start of sub-tree • Why? • Decompression performance should be independent of main image resolution • Each sub-tree fits inside a texture cache tile, so no traversal cache thrashing • Can unroll traversal loop in shader for perf. gain
Real-time image decompression continued... • Given a (u,v) coordinate, pixel shader traverses appropriate sub-tree for final value • Mip-map chain not required • Obtained as by-product of traversal • Intermediate memory not required • All decompression performed directly in the pixel shader • Dynamic predication works well • Good vector coherency if minification avoided
Real-time image decompression continued... • How is the wavelet tree represented? • Stored breadth-first in a line texture • Multiple ways to pack the data • Example: ARGB8 • {r, g, b} stores windowed wavelet basis coefficients • {a} stores linear offset to child of node, if one exists • If no child exists, we jump to an “empty” node • Wait for the rest of the pixel vector to finish • Can use wider data formats • E.g. U16, F32
Real-time image decompression continued... Uncompressed with mipmaps ~2MB @ 1578fps
Real-time image decompression continued... Wavelet compressed, cut-off 0.05, 249KB @ 579fps
Real-time image decompression continued... Wavelet compressed, cut-off 0.08, 157KB @ 622fps
Real-time image decompression continued... • Notice how areas of high contrast have their detail preserved, and areas of lower contrast are “smoothed” out • Can use a different notion of importance!
Real-time image decompression continued... • Has advantages over fixed DXT-like compression schemes: • Can be lossless in areas you need it to be • Particularly important for “data” textures, such as SH coefficients • Will do a much better job for images with areas of low contrast • Again, higher-order SH terms are a good example • But most surface-parameterised data is likely to be like this • Can use “focus” ability to increase the area of parameterisation
Double-product integration • Another good application is relighting • Work by [Ng, Ramamoorthi and Hanrahan] • Diffuse BRDF • We represent both the transfer function (with cosine) and the environment as wavelet trees in a texture
Double-product integration continued... • Here we need to traverse only the intersection of both trees • So if one is simple, should get good performance • Parallel GPU traversal needs to know how to “jump over” bits of either tree not contained in the intersection • If a node has a child, store linear offset to sibling or ancestor (could be root) • Then we can jump over whole child branch if the other tree has no children at that point
Double-product integration continued... • Performance is a function of the size of the intersection between both trees • Plus other factors such as cache behaviour • Loads of other interesting ideas • Transfer function wavelet trees have a tendency to be quite similar in localised patches • Clustering scheme would help further • Could store “representative” tree for each patch, and then a (smaller) “residual” tree per texel of patch
Double-product integration continued... • (Video)
Other applications • Wavelets can be used to define surface deformations • As a “dynamic” displacement map • Can keep memory overhead constant by removing oldest high-frequency terms for new deformation terms • Windowed nature of wavelets makes applying localised deformations simple • Can easily modify e.g. moment of inertia