
Generalized Deep Image to Image Regression

Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis. Spotlight at CVPR17. The talk presents the Recursively Branched Deconvolutional Network (RBDN), an architecture for “Generic Image to Image Regression.”


Presentation Transcript


  1. Generalized Deep Image to Image Regression. Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis. Spotlight at CVPR17.

  2. Recursively Branched Deconvolutional Network
     • RBDN is a powerful architecture for “Generic Image to Image Regression.”
     • It features 2 core components:
       • Recursive Branches (Head): Develop an early multi-context image representation by concatenating activations at multiple scales.
       • Main Branch (Backbone): Takes the multi-context representation as input and adds non-linearity via a series of convolutions + activation functions, without any downsampling.
     • Unlike classification architectures (e.g. VGG, ResNet), RBDN offers:
       • End-to-end preservation of local correspondences.
       • Learnable upsampling with highly efficient parameter sharing.
       • A network that can choose context vs. locality based on the task.
       • A fully convolutional design (can process variable-sized inputs).
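The two components on slide 2 can be illustrated with a small NumPy sketch. This is not the authors' implementation: fixed average pooling and nearest-neighbour upsampling stand in for the learned convolutions and learnable upsampling, the main branch uses 1x1 convolutions only, and the names `avg_pool2`, `upsample2`, `branch`, `multi_context` and `main_branch` are hypothetical. It only shows how recursive branching yields a multi-scale representation at full input resolution, and why the result works for any input size divisible by 2^depth.

```python
import numpy as np

def avg_pool2(x):
    """2x2 average pooling on a (C, H, W) array; H and W must be even."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample2(x):
    """2x nearest-neighbour upsampling (stand-in for learnable upsampling)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def branch(x, depth):
    """Recursive branch: pool, recurse at the coarser scale, upsample, concatenate."""
    if depth == 0:
        return x
    coarse = branch(avg_pool2(x), depth - 1)
    return np.concatenate([x, upsample2(coarse)], axis=0)

def multi_context(x, depth=3):
    """Head: returns C * (depth + 1) channels at the input resolution."""
    return branch(x, depth)

def main_branch(features, weights):
    """Backbone: 1x1 convolutions + ReLU at full resolution, no downsampling."""
    for w in weights:  # each w has shape (out_channels, in_channels)
        features = np.maximum(np.einsum("oc,chw->ohw", w, features), 0.0)
    return features
```

Because every branch is merged back at the resolution it started from, the output keeps the spatial size of the input, which is the property the slide calls end-to-end preservation of local correspondences.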

  3. Experiments

  4. Relighting Results
     • Input: Face image under arbitrary lighting conditions.
     • Desired Output: Face image under a fixed (ambient) lighting.
     Surprisingly, a model trained only on frontal faces in the constrained CMU-Multipie dataset generalizes extremely well to unconstrained faces in Janus CS0, featuring a wide range of pose, illumination, occlusion and affordance (hat, glasses, scarf, etc.) variations.

  5. Denoising Results
     • Input: Noisy image with white Gaussian noise (WGN) of unknown standard deviation.
     • Desired Output: Clean image.
     Other approaches train a different model for each noise level, while we train a single model designed to handle all noise levels. Despite this, we outperform the others at almost all noise levels.

  6. A single RBDN model handles a wide range of noise levels.
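A single model for all noise levels implies that training pairs are corrupted with a noise standard deviation that varies from sample to sample. The sketch below shows one way to generate such pairs and to score a result with PSNR. The sampling range `sigma_range`, the uniform sampling, and the function names are assumptions for illustration; the slides do not state the range or the evaluation code that was used.

```python
import numpy as np

def add_wgn(clean, rng, sigma_range=(8.0, 50.0)):
    """Corrupt an image with white Gaussian noise of a randomly drawn sigma.

    sigma_range is a hypothetical choice, on a 0-255 intensity scale.
    """
    sigma = rng.uniform(*sigma_range)
    noisy = clean + rng.normal(0.0, sigma, size=clean.shape)
    return noisy, sigma

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    mse = np.mean((reference - estimate) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```

Drawing a fresh sigma for every training image means the network never sees the noise level as an input, which matches the "unknown standard deviation" setting described on slide 5.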

  7. Colorization Results
     • Input: Grayscale image. Desired Output: RGB image.
     • 1st row: Ground truth (GT); 2nd row: Grayscale input; 3rd to 5th rows: Other approaches.
     • Last row: RBDN output.
     RBDN is trained with a softmax cross-entropy loss with class rebalancing, which results in very colorful colorizations.
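A class-rebalanced softmax cross-entropy loss can be sketched as follows. The slide does not say how the class weights are computed, so inverse-frequency weighting is an assumption here, as is the framing of each pixel as a classification over quantized color bins; `inverse_frequency_weights` and `rebalanced_softmax_xent` are hypothetical names.

```python
import numpy as np

def inverse_frequency_weights(targets, num_classes):
    """Weight each class by the inverse of its empirical frequency.

    Classes that never occur get weight 0 (they cannot contribute to the loss).
    """
    counts = np.bincount(targets, minlength=num_classes).astype(float)
    return np.where(counts > 0, counts.sum() / np.maximum(counts, 1.0), 0.0)

def rebalanced_softmax_xent(logits, targets, class_weights):
    """Weighted softmax cross entropy.

    logits: (N, K) scores per pixel; targets: (N,) integer class ids;
    class_weights: (K,) per-class weights.
    """
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    nll = -log_probs[np.arange(len(targets)), targets]
    w = class_weights[targets]
    return float((w * nll).sum() / w.sum())
```

Up-weighting rare classes keeps the loss from being dominated by the most frequent, desaturated colors, which is one plausible reading of why rebalancing leads to more colorful outputs.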

  8. Face RGB2Depth Results (Janus CS3 dataset)
     • Input: RGB face (variable size). Desired Output: Depth image.
     Surprisingly, RBDN trained using SIRFS as ground truth ends up outperforming SIRFS itself!

  9. Face RGB2Depth Results (FRGC dataset)
     • Both RBDN and SIRFS depth estimates were computed for 101 FRGC images.
     • FRGC has its own ground-truth depth against which the estimates are compared.
     • RBDN outperforms SIRFS on a majority of the images and has a significantly lower average depth distance: 25.58, compared to 46.47 for SIRFS.

  10. Conclusion
     • RBDN works extremely well for almost any Generic Image to Image Regression problem.
     • RBDN overcomes the context-locality tradeoff inherent in almost every other DCNN architecture.
     • RBDN has a very high capacity: a single RBDN denoising model outperforms noise-specific competitor approaches at almost all noise levels.
     • RBDN trained with a softmax cross-entropy loss with class rebalancing produces very colorful colorizations.
     • RBDN shows excellent transfer learning abilities:
       • Relighting RBDN trained on constrained frontal faces generalizes to unconstrained faces with arbitrary pose/illumination variations!
       • RBDN trained using SIRFS ground truth outperforms SIRFS!
     • RBDN is fully convolutional and can handle variable-sized images during inference.
