Generalized Deep Image to Image Regression. Venkataraman Santhanam, Vlad I. Morariu, Larry S. Davis. Spotlight in CVPR17.
Recursively Branched Deconvolutional Network
• RBDN is a powerful architecture for “Generic Image to Image Regression.”
• It features 2 core components:
  • Recursive Branches (Head): Develop an early multi-context image representation by concatenating activations at multiple scales.
  • Main Branch (Backbone): Takes the multi-context representation as input and adds non-linearity via a series of convolutions and activation functions, without any downsampling.
• Unlike classification architectures (e.g., VGG, ResNet), RBDN offers:
  • End-to-end preservation of local correspondences.
  • Learnable upsampling with highly efficient parameter sharing.
  • A network that can choose context vs. locality based on the task.
  • A fully convolutional design (can process variable-sized inputs).
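The head/backbone split described above can be sketched on a 1-D signal. This is a toy illustration only, not the authors' network: `avg_pool`, `upsample`, `branch`, and `main_branch` are hypothetical names, average pooling and nearest-neighbour upsampling stand in for the learned pooling and learnable upsampling, and a single pointwise mix plus ReLU stands in for the series of convolutions in the main branch.

```python
def avg_pool(x, k=2):
    """Downsample a 1-D signal by averaging non-overlapping windows of size k."""
    return [sum(x[i:i + k]) / k for i in range(0, len(x) - k + 1, k)]


def upsample(x, k=2):
    """Nearest-neighbour upsampling (fixed stand-in for learnable upsampling)."""
    return [v for v in x for _ in range(k)]


def branch(channels, depth):
    """Recursive branch: pool, recurse at the coarser scale, upsample back,
    and concatenate the result with the current-scale channels.

    `channels` is a list of equal-length 1-D signals. The returned list has
    the same signal length as the input but more channels (one per scale).
    """
    if depth == 0:
        return channels
    pooled = [avg_pool(c) for c in channels]
    coarser = branch(pooled, depth - 1)        # multi-scale features, half resolution
    restored = [upsample(c) for c in coarser]  # back to the current resolution
    return channels + restored                 # channel-wise concatenation


def main_branch(channels, weights, bias=0.0):
    """Pointwise (1x1) mix of channels followed by ReLU, with no downsampling,
    so the output keeps the input's length."""
    n = len(channels[0])
    return [
        max(0.0, bias + sum(w * c[i] for w, c in zip(weights, channels)))
        for i in range(n)
    ]


signal = [float(i) for i in range(8)]
features = branch([signal], depth=3)           # 4 channels, each of length 8
output = main_branch(features, weights=[0.25] * 4)
```

Because nothing in the sketch hard-codes the signal length, the same functions run on any input whose length is divisible by 2 ** depth, which mirrors the fully convolutional property claimed in the list.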
Relighting Results
• Input: Face image under arbitrary lighting conditions.
• Desired Output: Face image under fixed (ambient) lighting.
Surprisingly, a model trained only on frontal faces in the constrained CMU Multi-PIE dataset generalizes extremely well to unconstrained faces in Janus CS0, which feature a wide range of pose, illumination, occlusion, and affordance (hat, glasses, scarf, etc.) variations.
Denoising Results
• Input: Noisy image corrupted by white Gaussian noise (WGN) of unknown standard deviation.
• Desired Output: Clean image.
Other approaches train a separate model for each noise level, while we train a single model designed to handle all noise levels. Despite this, we outperform the others at almost all noise levels.
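One way to set up training data for a single noise-blind model is to draw a fresh noise level for every sample, as sketched below. The sigma range, the flat-list image representation, and the names `add_wgn`, `make_training_pair`, and `psnr` are assumptions for illustration; the slides do not state the range or the evaluation code used.

```python
import math
import random


def add_wgn(clean, sigma, rng):
    """Add zero-mean white Gaussian noise with standard deviation sigma."""
    return [p + rng.gauss(0.0, sigma) for p in clean]


def make_training_pair(clean, sigma_range, rng):
    """Sample a noise level uniformly from sigma_range (an assumed range)
    and return (noisy, clean). The sampled sigma is deliberately not
    returned, so the model never sees the noise level."""
    sigma = rng.uniform(*sigma_range)
    return add_wgn(clean, sigma, rng), clean


def psnr(clean, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB, a common denoising metric."""
    mse = sum((a - b) ** 2 for a, b in zip(clean, estimate)) / len(clean)
    return float("inf") if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


rng = random.Random(0)
clean = [128.0] * 1000                      # toy "image" flattened to a list
noisy, target = make_training_pair(clean, sigma_range=(5.0, 50.0), rng=rng)
```

Noise-specific baselines would instead fix `sigma` per model; here every call can produce a different noise level for the same network.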
Colorization Results
• Input: Grayscale image. Desired Output: RGB image.
• 1st row: ground truth (GT); 2nd row: grayscale input; 3rd to 5th rows: other approaches.
• Last row: RBDN output.
RBDN is trained with a softmax cross entropy loss with class rebalancing, which results in very colorful colorizations.
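A class-rebalanced softmax cross entropy can be sketched as follows. The inverse-frequency weighting here is an assumption chosen for simplicity; the slides do not give the exact rebalancing formula, and `rebalancing_weights` and `rebalanced_cross_entropy` are hypothetical names.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rebalancing_weights(class_freqs):
    """Inverse-frequency weights (assumed scheme), normalised so that the
    expected weight under class_freqs equals 1."""
    inv = [1.0 / f for f in class_freqs]
    norm = sum(f * w for f, w in zip(class_freqs, inv))
    return [w / norm for w in inv]


def rebalanced_cross_entropy(logits, target, weights):
    """Cross entropy for one pixel, scaled by the weight of its true color class."""
    probs = softmax(logits)
    return -weights[target] * math.log(probs[target])


# Toy two-bin example: a common desaturated bin and a rare saturated bin.
weights = rebalancing_weights([0.8, 0.2])
loss_common = rebalanced_cross_entropy([0.0, 0.0], target=0, weights=weights)
loss_rare = rebalanced_cross_entropy([0.0, 0.0], target=1, weights=weights)
```

Under this weighting, a mistake on the rare bin costs more than the same mistake on the common bin, which pushes predictions away from the dull, frequent colors.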
Face RGB2Depth Results (Janus CS3 dataset)
• Input: RGB face (variable size). Desired Output: Depth image.
Surprisingly, RBDN trained using SIRFS as ground truth ends up outperforming SIRFS itself!
Face RGB2Depth Results (FRGC dataset)
• Both RBDN and SIRFS depth estimates are computed for 101 FRGC images.
• FRGC has its own ground truth depth, against which the estimates are compared.
• RBDN outperforms SIRFS on a majority of the images and has a significantly lower average depth distance: 25.58, compared to 46.47 for SIRFS.
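The two summary statistics quoted above (per-image win rate and average distance) can be computed from per-image distances as sketched below. The slides do not define the depth distance itself, so the function takes precomputed distances as input; `compare_methods` is a hypothetical helper, and the numbers in the usage lines are made-up toy values, not the FRGC results.

```python
def compare_methods(dist_a, dist_b):
    """Summarise two methods from per-image distances to ground truth
    (lower is better). Returns the mean distance of each method and the
    fraction of images on which method A is strictly better."""
    if len(dist_a) != len(dist_b) or not dist_a:
        raise ValueError("need two non-empty lists of equal length")
    wins_a = sum(1 for a, b in zip(dist_a, dist_b) if a < b)
    return {
        "mean_a": sum(dist_a) / len(dist_a),
        "mean_b": sum(dist_b) / len(dist_b),
        "win_rate_a": wins_a / len(dist_a),
    }


# Toy values for illustration only.
summary = compare_methods([1.0, 2.0, 3.0], [2.0, 1.0, 6.0])
```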
Conclusion
• RBDN works extremely well for almost any Generic Image to Image Regression problem.
• RBDN overcomes the context-locality tradeoff inherent in almost every other DCNN architecture.
• RBDN has a very high capacity: a single RBDN denoising model outperforms noise-specific competitor approaches at almost all noise levels.
• RBDN trained with a softmax cross entropy loss with class rebalancing produces very colorful colorizations.
• RBDN shows excellent transfer learning abilities:
  • Relighting RBDN trained on constrained frontal faces generalizes to unconstrained faces with arbitrary pose/illumination variations!
  • RBDN trained using SIRFS ground truth outperforms SIRFS!
• RBDN is fully convolutional and can handle variable-sized images during inference.