3DTV 2.0
• Their goals: prepare for the next generation of stereoscopic 3DTV applications
  • by improving capture and display technology,
  • by building new applications such as 3D media sharing and gaming,
  • and by studying the 3D viewing/using/playing experience through lab experiments and in people’s natural environment.
• My goal/task: stereo view interpolation for a single-lens stereo camera
  • small baseline
  • adjustment of stereo base and convergence parameters
• M1-M24 (Q1 2011 - Q4 2012)
Requirements
• Project 3DTV requirements:
  • Stereo, small baseline
  • HD (1920 x 1080) cinematic input (and output)
  • Real-time, which to me means at least 15 fps, preferably 25-30 fps
  • High quality, or at least decent ‘preview’ quality
• Project Fine requirements:
  • Multi-view, wide(r) baseline
  • Sports scenes (football) input:
    • Players look alike
    • Uniform background (green pitch, …)
• Speed vs quality?
Algorithm Core
• Truncated separable approximation to an isotropic Laplacian kernel.
• Advantages:
  • Large support windows
  • Separable: efficient GPU implementation
    • fewer costly texture fetches needed
  • Boundary-guided, less foreground-fattening
• Cost = min(UL, UR, BL, BR, L, U, R, B, F)
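The aggregation on this slide can be sketched on the CPU in plain Python. This is not the GPU implementation, only an illustration under stated assumptions: the isotropic Laplacian kernel is taken to be exp(-distance/gamma), its separable approximation is the product of two 1D kernels exp(-|k|/gamma) truncated at `radius`, and the nine terms of the Cost formula are read as normalised aggregates over four quadrant windows (UL, UR, BL, BR), four half windows (L, U, R, B) and the full window (F). All function and parameter names are hypothetical.

```python
import math

def laplacian_taps(radius, gamma):
    """1D truncated Laplacian weights exp(-k/gamma) for k = 0..radius."""
    return [math.exp(-k / gamma) for k in range(radius + 1)]

def pass_1d(seq, taps):
    """One separable pass: weighted sums over offsets -r..0, 0..r and -r..r.
    Samples outside the sequence are skipped."""
    n, r = len(seq), len(taps) - 1
    neg, pos, full = [], [], []
    for i in range(n):
        a = sum(taps[k] * seq[i - k] for k in range(r + 1) if i - k >= 0)
        b = sum(taps[k] * seq[i + k] for k in range(r + 1) if i + k < n)
        neg.append(a)
        pos.append(b)
        full.append(a + b - taps[0] * seq[i])  # centre was counted twice
    return neg, pos, full

def transpose(img):
    return [list(col) for col in zip(*img)]

def partial_windows(img, taps):
    """Nine partial-window sums from one horizontal and one vertical pass.
    Keys are vertical letter + horizontal letter: 'UL', 'BR', ...;
    'FL' is the slide's L, 'FR' is R, 'UF' is U, 'BF' is B, 'FF' is F."""
    rows = [pass_1d(row, taps) for row in img]
    horiz = {name: [r[i] for r in rows] for i, name in enumerate("LRF")}
    out = {}
    for hname, himg in horiz.items():
        cols = [pass_1d(col, taps) for col in transpose(himg)]
        for j, vname in enumerate("UBF"):
            out[vname + hname] = transpose([c[j] for c in cols])
    return out

def boundary_guided_cost(cost, radius=2, gamma=1.5):
    """Per pixel, the minimum over the nine normalised partial windows."""
    taps = laplacian_taps(radius, gamma)
    ones = [[1.0] * len(cost[0]) for _ in cost]
    sums = partial_windows(cost, taps)
    norms = partial_windows(ones, taps)  # weight totals, border-aware
    h, w = len(cost), len(cost[0])
    return [[min(sums[k][y][x] / norms[k][y][x] for k in sums)
             for x in range(w)] for y in range(h)]

# A raw cost slice with a sharp boundary between columns 2 and 3.
edge = [[0, 0, 0, 10, 10, 10] for _ in range(4)]
aggregated = boundary_guided_cost(edge)
```

Next to the boundary, the full window mixes both sides, while the left-only window stays on one side; taking the minimum is what limits foreground fattening in this reading. The two 1D passes also show why separability reduces the number of fetches per pixel compared with a full 2D window.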
Results
• For project Fine
• Movies…
Limitations (1)
• On an NVIDIA GeForce 8800 GTX (yes, I know, old hardware by now)
  • 800x600, #50: 11 fps
  • 1024x768, #50: 7 fps
  • 1920x1080, #50: 3 fps
  • 800x600, #100: 6 fps
  • 1024x768, #100: 4 fps
  • 1920x1080, #100: 2 fps
• Implementation ‘shortcuts’ due to GPU architecture limitations
  • Texture units, render targets, RGBA textures, …
• Speed vs quality: limited depth/disparity range.
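To put these frame rates next to the 15 fps real-time minimum from the Requirements slide, the arithmetic can be sketched as below. The sketch assumes that "#50" and "#100" denote the number of depth/disparity hypotheses evaluated per pixel; the slide does not say so explicitly, and the function names are hypothetical.

```python
# (width, height, hypotheses, fps) as listed on the slide.
# ASSUMPTION: "#50" / "#100" is the number of depth/disparity hypotheses.
MEASUREMENTS = [
    (800, 600, 50, 11),
    (1024, 768, 50, 7),
    (1920, 1080, 50, 3),
    (800, 600, 100, 6),
    (1024, 768, 100, 4),
    (1920, 1080, 100, 2),
]

def evaluations_per_second(width, height, hypotheses, fps):
    """Pixel-hypothesis cost evaluations processed per second."""
    return width * height * hypotheses * fps

def required_speedup(measured_fps, target_fps=15):
    """Factor by which a measured frame rate must grow to hit the target."""
    return target_fps / measured_fps

for width, height, hypotheses, fps in MEASUREMENTS:
    rate = evaluations_per_second(width, height, hypotheses, fps)
    print(f"{width}x{height} #{hypotheses}: {rate / 1e6:.0f} M evaluations/s, "
          f"needs {required_speedup(fps):.1f}x for 15 fps")
```

Under that reading, the 1920x1080 runs would need roughly a 5x (#50) to 7.5x (#100) speedup to reach 15 fps.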
Limitations (4)
• Slow on memory accesses:
  • Lots of ‘random’ memory accesses
    • vs rectified stereo
  • Needs a (too) large convolution kernel
    • to prevent mismatches caused by homogeneous background
• Written in OpenGL and Cg (graphics pipeline)
  • high texture memory access latency
  • no (or limited) random framebuffer writes
• CUDA:
  • Low-level control over memory access
  • Fewer architecture-specific limitations
• CUDA vs Cg: on average 30% speedup on a GeForce 8800 GTX
Next (1)
• Locally Adaptive Support-Weights
  • High computational intensity for multi-view input: computes a different convolution kernel with adaptive weights for each (Vi, Vj) image pair.
• Yoon et al., Locally Adaptive Support-Weight Approach for Visual Correspondence Search
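A minimal sketch of the support-weight idea for a single rectified image pair follows. Each neighbour q of pixel p gets a weight exp(-(colour difference / gamma_c + spatial distance / gamma_p)), and the aggregated cost is the weighted mean of truncated per-pixel differences, using the product of the weights from both images. Assumptions made here for illustration: colour difference is Euclidean distance in RGB (the cited paper works in CIELab), and the parameter values, image layout (rows of RGB tuples) and function names are arbitrary and hypothetical.

```python
import math

def support_weight(img, p, q, gamma_c, gamma_p):
    """Weight of neighbour q for centre p: similar colour and proximity."""
    (px, py), (qx, qy) = p, q
    dc = math.dist(img[py][px], img[qy][qx])  # colour distance (RGB here)
    dg = math.dist(p, q)                      # spatial distance in pixels
    return math.exp(-(dc / gamma_c + dg / gamma_p))

def raw_cost(left, right, x, y, d, trunc):
    """Truncated sum of absolute colour differences at disparity d."""
    a, b = left[y][x], right[y][x - d]
    return min(sum(abs(u - v) for u, v in zip(a, b)), trunc)

def aggregated_cost(left, right, x, y, d, radius=1,
                    gamma_c=10.0, gamma_p=10.0, trunc=40.0):
    """Support-weighted cost for left pixel (x, y) at disparity d.
    The caller must keep x - d inside the image."""
    h, w = len(left), len(left[0])
    num = den = 0.0
    for qy in range(max(0, y - radius), min(h, y + radius + 1)):
        for qx in range(max(0, x - radius), min(w, x + radius + 1)):
            if not 0 <= qx - d < w:
                continue  # neighbour has no counterpart in the right image
            wt = (support_weight(left, (x, y), (qx, qy), gamma_c, gamma_p)
                  * support_weight(right, (x - d, y), (qx - d, qy),
                                   gamma_c, gamma_p))
            num += wt * raw_cost(left, right, qx, qy, d, trunc)
            den += wt
    return num / den

# Synthetic pair: the left image is the right image shifted by one pixel.
right = [[(10 * x + 3 * y, 0, 0) for x in range(6)] for y in range(3)]
left = [[right[y][max(x - 1, 0)] for x in range(6)] for y in range(3)]
costs = [aggregated_cost(left, right, 3, 1, d) for d in (0, 1, 2)]
```

Because the weights depend on the colours of both images and on the candidate disparity, nothing can be precomputed once and reused across image pairs, which is consistent with the cost concern noted on the slide for multi-view input.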
Next (2)
• ‘Refocus’ stereo feed by adjusting:
  • stereo base (inter-camera/pupil distance)
  • convergence distance (camera/pupil angle)
• Immersive Teleconferencing with Natural 3D Stereoscopic Eye Contact Using GPU Computing, 3D Stereo Media 2009
• Biological-Aware Stereoscopic Rendering in Free Viewpoint Technology using GPU Computing, 3DTV-CON 2010
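One common way to model both adjustments on a rectified pair is to remap disparities: scaling the stereo base scales every disparity, and moving the convergence plane subtracts the disparity that should end up at zero parallax. The sketch below illustrates that model on a single scanline. It is an assumption, not the method of the cited papers: convergence is modelled as a horizontal shift rather than a camera angle, larger disparity is taken to mean nearer, a right-view pixel is taken to sit at x minus the disparity, and all names are hypothetical.

```python
def remap_disparity(d, base_scale, zero_plane):
    """Disparity after scaling the stereo base by base_scale and placing
    the disparity value zero_plane at zero parallax."""
    return base_scale * (d - zero_plane)

def render_scanline(colors, disparities, base_scale, zero_plane):
    """Forward-warp one left-view scanline into the adjusted right view.
    Nearer pixels (larger original disparity) win conflicts; positions
    that receive no pixel stay None (disocclusion holes)."""
    n = len(colors)
    out = [None] * n
    best = [float("-inf")] * n
    for x, (c, d) in enumerate(zip(colors, disparities)):
        tx = round(x - remap_disparity(d, base_scale, zero_plane))
        if 0 <= tx < n and d > best[tx]:
            out[tx], best[tx] = c, d
    return out

# Background "a", "b" at disparity 0; foreground "c", "d" at disparity 2.
original = render_scanline(["a", "b", "c", "d"], [0, 0, 2, 2], 1.0, 0)
half_base = render_scanline(["a", "b", "c", "d"], [0, 0, 2, 2], 0.5, 0)
```

Halving the base halves the foreground shift, so less background is covered; the None entries are the holes that occlusion handling would have to fill.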
Future
• Occlusion handling
• Temporal aggregation
• (Approximate) depth information: e.g. time of flight, Kinect, …
• SLI: multi-GPU
• …
FAD? (Food And Discussion)