Automatic Scene Inference for 3D Object Compositing. Kevin Karsch (UIUC), Kalyan Sunkavalli, Sunil Hadap, Nathan Carr, Hailin Jin, Rafael Fonte, Michael Sittig, and David Forsyth. SIGGRAPH 2014
What is this system? • Image editing system with drag-and-drop object insertion: place objects in 3D and relight them • Fully automatically recovers a comprehensive 3D scene model: geometry, illumination, diffuse albedo, and camera parameters • Works from a single low dynamic range (LDR) image
Existing problems • It is currently the artist's job to create photorealistic effects by reasoning about the physical space: lighting, shadows, and perspective all have to match • This requires the camera parameters, scene geometry, surface materials, and sources of illumination
State-of-the-art • http://www.popularmechanics.com/technology/digital/visual-effects/4218826 • http://en.wikipedia.org/wiki/The_Adventures_of_Seinfeld_%26_Superman
What this system cannot handle • Works best when scene lighting is diffuse; it therefore generally works better indoors than outdoors • Errors in geometry, illumination, or materials may be prominent in the result • Does not handle object insertion behind existing scene elements
Contribution • Illumination inference: recovers a full lighting model including light sources not directly visible in the photograph • Depth estimation: combines data-driven depth transfer with geometric reasoning about the scene layout
How to do this • Need: geometry, illumination, and surface reflectance • Even though the estimates are coarse, the composites still look realistic, because even large changes in lighting are often imperceptible to viewers
Indoor/outdoor scene classification • K-nearest-neighbor matching of GIST features • Indoor dataset: NYUv2 • Outdoor dataset: Make3D • Different training images and classifiers are then used depending on whether the scene is indoor or outdoor
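A minimal sketch of this step, assuming GIST descriptors for the query and training images are already computed elsewhere (the value of k below is illustrative, not the paper's setting):

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def classify_scene(query_gist, train_gists, train_labels, k=5):
    """Label a photo 'indoor' or 'outdoor' by k-NN voting over GIST features.

    train_gists  -- (N, D) array of GIST descriptors from NYUv2 and Make3D
    train_labels -- length-N array of 'indoor' / 'outdoor' tags
    """
    knn = KNeighborsClassifier(n_neighbors=k)
    knn.fit(train_gists, train_labels)
    return knn.predict(np.asarray(query_gist).reshape(1, -1))[0]
```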
Single image reconstruction • Camera parameters and geometry • The focal length f, camera center (cx, cy), and extrinsic parameters are computed from three orthogonal vanishing points detected in the scene
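For intuition, the standard single-view result behind this: with square pixels and zero skew, two orthogonal vanishing points v1, v2 and principal point p satisfy (v1 - p) · (v2 - p) = -f². A sketch of that core relation only (the paper uses all three vanishing points):

```python
import numpy as np

def focal_from_vps(v1, v2, principal_point):
    """Estimate focal length f from two orthogonal vanishing points.

    Orthogonality of the corresponding scene directions implies
    (v1 - p) . (v2 - p) + f^2 = 0 for principal point p.
    """
    p = np.asarray(principal_point, dtype=float)
    d = np.dot(np.asarray(v1) - p, np.asarray(v2) - p)
    if d >= 0:
        raise ValueError("vanishing points inconsistent with orthogonality")
    return np.sqrt(-d)
```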
Surface materials • Per-pixel diffuse albedo and shading are recovered with the Color Retinex method
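A rough sketch of the Retinex idea, not the paper's exact implementation: image gradients with a strong chromaticity change are attributed to albedo, the rest to shading; recovering the albedo image from the kept gradients then needs a Poisson-style solve, omitted here. The threshold is illustrative:

```python
import numpy as np

def reflectance_gradient_mask(rgb, thresh=0.075):
    """rgb: (H, W, 3) linear image in [0, 1]. Returns boolean masks
    marking horizontal/vertical gradients classified as albedo edges."""
    log_im = np.log(np.clip(rgb, 1e-4, 1.0))
    # Remove the brightness component so only chromaticity changes remain
    chroma = log_im - log_im.mean(axis=2, keepdims=True)
    dx = np.diff(chroma, axis=1)
    dy = np.diff(chroma, axis=0)
    return (np.linalg.norm(dx, axis=2) > thresh,
            np.linalg.norm(dy, axis=2) > thresh)
```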
Data-driven depth estimation • Database: RGB-D (image + depth) training pairs • Appearance cues for correspondences: multi-scale SIFT features • Geometric information about the scene is incorporated as well
Data-driven depth estimation • The depth map is obtained by minimizing four energy terms: Et (depth transfer), Em (Manhattan world), Eo (surface orientation), and E3s (spatial smoothness in 3D)
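Written out, the objective over the depth map D plausibly takes the form below; the λ weights balancing the terms are the paper's choice and are assumed here:

```latex
\min_{D} \; E_t(D) \;+\; \lambda_m E_m(D) \;+\; \lambda_o E_o(D) \;+\; \lambda_{3s} E_{3s}(D)
```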
Visible sources • Segment the image into superpixels • Compute features for each superpixel: location in the image, plus the 340 features used in Make3D • Train a binary classifier on annotated data to predict whether or not a superpixel is emitting/reflecting a significant amount of light
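A hedged sketch of this stage; the per-superpixel feature extraction is assumed done elsewhere, and logistic regression stands in for whatever binary classifier the authors trained:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_light_classifier(features, labels):
    """features: (N, D) per-superpixel descriptors; labels: 1 = light source."""
    clf = LogisticRegression(max_iter=1000)
    clf.fit(features, labels)
    return clf

def light_superpixels(clf, features, prob=0.5):
    """Return indices of superpixels predicted to emit/reflect light."""
    return np.where(clf.predict_proba(features)[:, 1] > prob)[0]
```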
Out-of-view sources • Data-driven: annotated SUN360 panorama dataset • Assumption: if two photographs are similar, then the illumination environment beyond the photographed region will be similar as well
Out-of-view sources • Features: geometric context, orientation maps, spatial pyramids, HSV histograms, and the output of the light classifier • Matching scores: histogram intersection and per-pixel inner product • Similarity metric for IBLs: how similar the rendered canonical objects look • Ranking function: trained with 1-slack linear SVM-ranking optimization
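Two of the matching scores are simple to write down; a sketch, assuming normalized histograms and feature maps of equal shape:

```python
import numpy as np

def histogram_intersection(h1, h2):
    """Intersection score between two normalized histograms (e.g. HSV)."""
    return float(np.minimum(h1, h2).sum())

def pixelwise_inner_product(a, b):
    """Per-pixel inner product between two equally sized feature maps."""
    return float((np.asarray(a) * np.asarray(b)).sum())
```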
Relative intensities of the light sources • Intensity estimation through rendering: intensities are adjusted until a rendered version of the scene matches the original image • Humans cannot distinguish among a range of illumination configurations, suggesting that a whole family of lighting conditions produces the same perceptual response • The system therefore simply chooses the lighting configuration that renders fastest
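Because light transport is linear in source intensity, one plausible reading of "adjust until the render matches" is: render the scene once per source at unit intensity, then solve for non-negative per-source weights by least squares. A sketch under that assumption (the paper's actual matching is more involved):

```python
import numpy as np
from scipy.optimize import nnls

def solve_light_intensities(unit_renders, target):
    """unit_renders: list of (H, W, 3) renders, one per light source,
    each at unit intensity; target: the original (H, W, 3) photo.
    Returns one non-negative intensity weight per light source."""
    A = np.stack([r.ravel() for r in unit_renders], axis=1)  # pixels x lights
    weights, _residual = nnls(A, target.ravel())
    return weights
```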
Physically grounded image editing • Drag-and-drop insertion • Lighting adjustment • Synthetic depth-of-field
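As an example of the last item, a minimal synthetic depth-of-field sketch using the recovered depth map: blend a sharp and a blurred copy by a per-pixel circle-of-confusion weight. A real implementation would blur in depth layers; the focus and strength parameters here are illustrative:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def synthetic_dof(image, depth, focus_depth, strength=3.0):
    """image: (H, W, 3); depth: (H, W) recovered depth map."""
    blurred = np.stack([gaussian_filter(image[..., c], sigma=strength)
                        for c in range(image.shape[-1])], axis=-1)
    # Circle of confusion grows with distance from the focal plane
    coc = np.clip(np.abs(depth - focus_depth) / depth.max(), 0.0, 1.0)
    return image * (1 - coc[..., None]) + blurred * coc[..., None]
```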
User study • Real object in a real scene vs. inserted object in a real scene • Synthetic object in a synthetic scene vs. inserted object in a synthetic scene • The system produces perceptually convincing results