Interactive Time-Dependent Tone Mapping Using Programmable Graphics Hardware

Nolan Goodnight • Rui Wang • Cliff Woolley • Greg Humphreys • University of Virginia

HDR and Tone Mapping • [Figure: compressed vs. clamped to [0,1]]

Advances in graphics hardware • Physically-based rendering on the GPU (Purcell et al., 2003) • High dynamic range texture mapping (Debevec et al., 2001)

System Overview • Interactive tone mapping system for an OpenGL application • [Diagram labels: application, callback, tone mapping system, HDR image, LDR image, frame buffer, display]

Interface to the application • tmInitialize(); // Initialize the system • tmEnable(); // Retarget GL calls • (draw geometry) • tmCompress(); // Compress output • tmDisable(); // Restore app context • [Diagram labels: tone mapping system, application]

Choosing a tone mapping operator • Photographic Tone Reproduction for Digital Images (Reinhard et al., 2002) • Global operator is a simple transfer function • [Plot: transfer function, axis ticks 0 and 1]
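For reference, the global transfer function of Reinhard et al. (2002) can be sketched on the CPU. This is a minimal Python sketch, not the shader used in this system. The function names and the key value of 0.18 are illustrative assumptions.

```python
def scale_luminance(lum, log_avg, key=0.18):
    """Scale world luminance so the log-average luminance maps to `key`.

    `key` = 0.18 is an assumed middle-grey default, not a value from the slides.
    """
    return key * lum / log_avg


def global_operator(scaled, white=None):
    """Global transfer function of Reinhard et al. (2002).

    With white=None this is L / (1 + L), which maps [0, inf) into [0, 1).
    With a finite white point, scaled luminances at or above `white`
    map to 1 or more and are expected to be clamped by the caller.
    """
    if white is None:
        return scaled / (1.0 + scaled)
    return scaled * (1.0 + scaled / (white * white)) / (1.0 + scaled)
```

A scaled luminance of 1.0 lands at mid-range (0.5), and very bright values approach but never reach 1 unless a white point is given.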

Choosing a tone mapping operator • Local operator • Digital analog of ‘dodging’ and ‘burning’ • [Figure: center-surround]
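The center-surround test can be illustrated for a single pixel. The sketch below follows the general form of the local operator in Reinhard et al. (2002): the neighborhood grows while the normalized difference between a center and a surround blur stays under a threshold. It assumes the Gaussian-blurred luminances are already computed. The function names and the defaults for `key`, `phi`, and `eps` are assumptions for illustration, not values taken from these slides.

```python
def select_scale(responses, scales, key=0.18, phi=8.0, eps=0.05):
    """Pick the index of the blur level to use as this pixel's local average.

    responses[i] is the Gaussian-blurred luminance at scales[i];
    the surround for level i is level i + 1.
    """
    chosen = 0
    for i in range(len(responses) - 1):
        center, surround = responses[i], responses[i + 1]
        # Normalized center-surround difference at this level.
        activity = (center - surround) / (
            (2.0 ** phi) * key / scales[i] ** 2 + center
        )
        if abs(activity) >= eps:
            break  # strong contrast: stop growing the neighborhood
        chosen = i
    return chosen


def local_operator(lum, responses, scales):
    """Compress one pixel using the selected local average (dodge or burn)."""
    local_avg = responses[select_scale(responses, scales)]
    return lum / (1.0 + local_avg)
```

In a flat region the largest available level is chosen. Near a strong edge the search stops at the smallest level, which limits halo artifacts.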

Why use this tone mapping operator? • Global operator is simple and fast to compute • Only one global computation • We can dynamically choose the number of zones

Variable number of zones • [Figure: 3 zones]

System block diagram

Implementation • Target architecture: ATI Radeon 9800 (R350) • Data storage: floating-point off-screen buffers (pbuffers); multiple rendering surfaces (GL_AUXi) • Algorithms: ARB fragment and vertex assembly; generate fragments with image-sized quads • Data representation: vector vs. scalar organization

Global operator block diagram

Implementation: global operator • Simple luminance transform • Store luminance and log luminance in separate channels • [Diagram labels: HDR image, luminance, log luminance, mipmap reduction, LDR image, single buffer]

Implementation: global operator • Single rendering surface • [Diagram labels: HDR image, luminance, log luminance, mipmap reduction, log luminance channel, log average luminance, LDR image, single buffer]

Implementation: global operator • [Diagram labels: HDR image (texture 0), operator shader, luminance, log luminance, texture 1, texture 2, mipmap reduction, LDR image, single buffer]
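The luminance, log luminance, and mipmap reduction steps named on these slides can be mimicked on the CPU. The sketch below is an assumption-laden stand-in and not the GPU implementation. It uses Rec. 709 luminance weights, a small `delta` to avoid log(0), and a square power-of-two image, none of which are stated in the slides.

```python
import math


def luminance(r, g, b):
    """Luminance from linear RGB (Rec. 709 weights, an assumption)."""
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def reduce_once(img):
    """Average each 2x2 block, like stepping down one mipmap level."""
    h, w = len(img), len(img[0])
    return [
        [
            (img[y][x] + img[y][x + 1] + img[y + 1][x] + img[y + 1][x + 1]) / 4.0
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def log_average(lum, delta=1e-6):
    """Log-average luminance of a square, power-of-two image.

    Repeated 2x2 averaging of the log-luminance channel leaves its mean
    in the final 1x1 level; exponentiating gives the log average.
    """
    level = [[math.log(delta + v) for v in row] for row in lum]
    while len(level) > 1:
        level = reduce_once(level)
    return math.exp(level[0][0])
```

Because every level averages equally sized blocks, the 1x1 level is the mean of all log-luminance values, so the result is the geometric mean of the image luminance (up to `delta`).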

Local operator block diagram

Implementation: GPU-based convolutions • Transform n-vector product into multiple 4-vector products
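The decomposition of an n-vector product into 4-vector products can be shown with plain arithmetic. In this minimal Python sketch, `dot4` stands in for one four-component dot product (such as the `DP4` instruction in ARB assembly). The function names and the zero padding of the tail are assumptions for illustration.

```python
def dot4(a, b):
    """Stand-in for a single 4-component dot product."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3]


def dot_by_fours(a, b):
    """Compute an n-vector dot product as a sum of 4-vector dot products."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    pad = (-len(a)) % 4  # zero-pad so the length is a multiple of 4
    a = list(a) + [0.0] * pad
    b = list(b) + [0.0] * pad
    return sum(dot4(a[i:i + 4], b[i:i + 4]) for i in range(0, len(a), 4))
```

An 11-tap kernel row, for example, becomes three 4-vector products, with the last one padded by a single zero.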

Vectorizing the luminance • Output 4 pixels at the same time • Useful for expensive algorithms • Requires a conversion back to scalar form • [Figure: stacked domain]

Vectorizing the luminance • A simple method for luminance vectorization: • Preserves spatial locality
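The figure for this slide did not survive, so the following is only one plausible reading of the "stacked" layout: the four quadrants of the scalar luminance image are packed into the four channels of a half-size image. The quadrant layout and the function names are assumptions, not details confirmed by the slides.

```python
def stack_quadrants(img):
    """Pack an H x W scalar image into an (H/2) x (W/2) image of 4-tuples.

    Each tuple holds the pixels at the same offset in the top-left,
    top-right, bottom-left, and bottom-right quadrants.
    """
    h, w = len(img), len(img[0])
    if h % 2 or w % 2:
        raise ValueError("image dimensions must be even")
    hh, hw = h // 2, w // 2
    return [
        [
            (img[y][x], img[y][x + hw], img[y + hh][x], img[y + hh][x + hw])
            for x in range(hw)
        ]
        for y in range(hh)
    ]


def unstack_quadrants(stacked):
    """Convert the stacked image back to scalar form."""
    hh, hw = len(stacked), len(stacked[0])
    out = [[0.0] * (2 * hw) for _ in range(2 * hh)]
    for y in range(hh):
        for x in range(hw):
            tl, tr, bl, br = stacked[y][x]
            out[y][x] = tl
            out[y][x + hw] = tr
            out[y + hh][x] = bl
            out[y + hh][x + hw] = br
    return out
```

Under this layout, pixels that are neighbors inside a quadrant remain neighbors within the same channel, which is one way the packing can preserve spatial locality.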

GPU-based convolutions • [Diagram: Pass 1 + Pass 2 + Pass 3 over the stacked image]

GPU-based convolutions • Compute multiple 4-vector products per pass • Fewer shader and texture switches • Single render pass • [Diagram: terms summed over the stacked image]

GPU-based convolutions • Advantages: • Handles large kernels • Efficient memory access • No transform back to scalar values • Timings for a 512 x 512 image: • 11 x 11 kernel: ~6 ms • 21 x 21 kernel: ~10 ms • 41 x 41 kernel: ~16 ms

System block diagram

Calculating adaptation zones on the GPU • [Diagram labels: filtered luminance, Buffer 0, Buffer 1, FRONT, BACK]

Performance: global operator • [Plot: frames per second vs. image size]

Performance: local operator • [Plot: frames per second vs. number of zones]

Performance comparison: CPU vs. GPU

Results: Accuracy • Comparison with CPU: 512 x 512 image

False-color zone images

Images generated at ~30 Hz • Clamped to [0,1] • Compressed: 2 zones
