
DAVANet : Stereo Deblurring with View Aggregation



  1. DAVANet: Stereo Deblurring with View Aggregation Haitong Shi 5.6

  2. DAVANet - Depth Awareness and View Aggregation • Introduction • Motivation • Network Architecture and Losses • Stereo Blur Dataset • Experiments

  3. Introduction A stereo camera is a type of camera with two or more lenses and a separate image sensor or film frame for each lens. • Stereo image deblurring has rarely been discussed • Dynamic scene deblurring from a single blurry image is a highly ill-posed task • Proposes a novel depth-aware and view-aggregated stereo deblurring network named DAVANet • Proposes a large-scale multi-scene stereo blurry image dataset

  4. Motivation (i) Depth information can provide a helpful prior for estimating spatially-varying blur kernels. (ii) The varying information in corresponding pixels across the two stereo views can help blur removal. Figure: depth-varying and view-varying blur. (a, b) are the stereo blurry images, (c, d) are the motion trajectories in terms of optical flow, which model the blur kernels, and (e, f) are the estimated disparities.

  5. Motivation 1. Depth-Varying Blur 2. View-Varying Blur. Figure: (a) shows depth-varying blur due to relative translation parallel to the image plane; (b) and (c) show view-varying blur due to relative translation along the depth direction and to rotation.

  6. Network Architecture The overall structure of the stereo deblurring network DAVANet: DeblurNet performs single-image deblurring, DispBiNet performs bidirectional disparity estimation, and the depth and two-view information from the two sub-networks are integrated in FusionNet. A rough sketch of this layout follows below.
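
As a rough illustration of this three-branch layout, here is a hedged PyTorch sketch. The slide only shows a block diagram, so the sub-network internals, call signatures, and the assumption that DispBiNet returns both disparities plus an intermediate feature are placeholders, not the paper's definitions.

```python
import torch.nn as nn

class DAVANetSketch(nn.Module):
    """Hypothetical skeleton of the DAVANet pipeline described above."""

    def __init__(self, deblur_encoder, deblur_decoder, dispbinet, fusionnet):
        super().__init__()
        self.deblur_encoder = deblur_encoder  # shared between the two views
        self.deblur_decoder = deblur_decoder
        self.dispbinet = dispbinet            # bidirectional disparity estimation
        self.fusionnet = fusionnet            # depth- and view-aware fusion

    def forward(self, img_l, img_r):
        # Single-image deblurring features for each view (shared weights)
        feat_l = self.deblur_encoder(img_l)
        feat_r = self.deblur_encoder(img_r)
        # Bidirectional disparities plus an intermediate feature (assumed API)
        disp_l, disp_r, disp_feat = self.dispbinet(img_l, img_r)
        # Aggregate depth and two-view information per view
        fused_l = self.fusionnet(feat_l, feat_r, img_l, img_r, disp_l, disp_feat)
        fused_r = self.fusionnet(feat_r, feat_l, img_r, img_l, disp_r, disp_feat)
        # Decode fused features into the deblurred stereo pair
        return self.deblur_decoder(fused_l), self.deblur_decoder(fused_r)
```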

  7. Network Architecture – DeblurNet & DispBiNet

  8. Network Architecture – Context Module Built from dilated convolutions, the Context Module fuses richer hierarchical context information that benefits both blur removal and disparity estimation. The four dilation rates are set to 1, 2, 3, and 4 (see the sketch below).
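
A minimal PyTorch sketch of such a module, assuming four parallel 3x3 convolutions with dilation rates 1, 2, 3, 4 whose outputs are concatenated and fused by a 1x1 convolution; the channel width and the activation function are placeholders, not values from the paper.

```python
import torch
import torch.nn as nn

class ContextModule(nn.Module):
    """Parallel dilated convolutions fusing multi-rate context features."""

    def __init__(self, channels=128):
        super().__init__()
        # One branch per dilation rate; padding = rate keeps spatial size
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=rate, dilation=rate)
            for rate in (1, 2, 3, 4)
        ])
        # 1x1 convolution fuses the concatenated multi-rate features
        self.fuse = nn.Conv2d(4 * channels, channels, kernel_size=1)
        self.act = nn.LeakyReLU(0.1, inplace=True)

    def forward(self, x):
        multi_rate = [self.act(branch(x)) for branch in self.branches]
        return self.fuse(torch.cat(multi_rate, dim=1))

# Usage: out = ContextModule(128)(torch.randn(1, 128, 64, 64))
```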

  9. Network Architecture - Fusion Network Inputs: features from the DeblurNet encoder, the original stereo images, the estimated disparity of the left view, and features of the second-to-last layer of DispBiNet. A soft gate map ranging from 0 to 1 weights the aggregation, where ⊙ denotes element-wise multiplication.

  10. Network Architecture - Fusion Network Input: the components listed on the previous slide (figure). A sketch of the gated aggregation follows below.
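
To make the gated aggregation concrete, here is a minimal PyTorch sketch assuming the fusion works as the slides describe: the right-view features are warped to the left view with the estimated disparity, a small gate network (its layout is a guess) predicts the soft gate map in [0, 1], and the two views are blended element-wise. The names `GatedFusion` and `warp_by_disparity` are illustrative, not from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def warp_by_disparity(feat_r, disp_l):
    """Warp right-view features (B, C, H, W) to the left view using a
    horizontal disparity map (B, 1, H, W) and bilinear sampling."""
    b, _, h, w = feat_r.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    xs = xs.float().to(feat_r.device) - disp_l.squeeze(1)  # shift by disparity
    ys = ys.float().to(feat_r.device).expand(b, -1, -1)
    # Normalize sampling coordinates to [-1, 1] for grid_sample
    grid = torch.stack((2 * xs / (w - 1) - 1, 2 * ys / (h - 1) - 1), dim=-1)
    return F.grid_sample(feat_r, grid, align_corners=True)

class GatedFusion(nn.Module):
    """Soft-gated two-view aggregation: gate * warped_right
    + (1 - gate) * left, with element-wise multiplication."""

    def __init__(self, channels=128):
        super().__init__()
        self.gatenet = nn.Sequential(
            nn.Conv2d(2 * channels, channels, 3, padding=1),
            nn.LeakyReLU(0.1, inplace=True),
            nn.Conv2d(channels, 1, 3, padding=1),
            nn.Sigmoid(),  # soft gate map in [0, 1]
        )

    def forward(self, feat_l, feat_r, disp_l):
        warped_r = warp_by_disparity(feat_r, disp_l)
        gate = self.gatenet(torch.cat((feat_l, warped_r), dim=1))
        return gate * warped_r + (1 - gate) * feat_l
```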

  11. Losses • Deblurring Losses 1. MSE loss between the deblurred output and the sharp ground truth 2. Perceptual loss on features from the conv3-3 layer (j = 15), where Φ_j denotes the features from the j-th convolution layer within the pretrained VGG-19 network. A sketch of both terms follows below.
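
A minimal PyTorch sketch of these two terms, assuming the standard formulation: pixel-wise MSE plus MSE between VGG-19 conv3-3 activations of the deblurred and sharp images. The loss weight `lambda_p` is a placeholder, not the paper's value.

```python
import torch.nn as nn
from torchvision import models

class DeblurLoss(nn.Module):
    """MSE content loss + perceptual loss on VGG-19 conv3-3 features."""

    def __init__(self, lambda_p=0.01):
        super().__init__()
        vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1)
        # Slice the feature stack through the conv3-3 activation (slide: j = 15)
        self.vgg_feat = nn.Sequential(*list(vgg.features.children())[:16]).eval()
        for p in self.vgg_feat.parameters():
            p.requires_grad = False  # VGG is a fixed feature extractor
        self.mse = nn.MSELoss()
        self.lambda_p = lambda_p

    def forward(self, deblurred, sharp):
        content = self.mse(deblurred, sharp)
        perceptual = self.mse(self.vgg_feat(deblurred), self.vgg_feat(sharp))
        return content + self.lambda_p * perceptual
```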

  12. Losses • Disparity Estimation Loss The difference between the estimated disparities and the ground truth, weighted by a mask map that removes the invalid and occluded regions, summed over the S scales of the network. A sketch follows below.
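
A sketch of this masked multi-scale loss in PyTorch; the L1 penalty and the nearest-neighbor downscaling of the ground truth are assumptions where the slide does not specify.

```python
import torch.nn.functional as F

def disparity_loss(pred_disps, gt_disp, valid_mask):
    """Masked disparity loss summed over the network's output scales.

    pred_disps : list of (B, 1, Hs, Ws) predictions, one per scale
    gt_disp    : (B, 1, H, W) ground-truth disparity
    valid_mask : (B, 1, H, W) float map of valid, non-occluded pixels
    """
    total = 0.0
    for pred in pred_disps:
        # Downscale ground truth and mask to this prediction's resolution.
        # A full implementation would also rescale disparity magnitudes
        # with the resolution change.
        gt = F.interpolate(gt_disp, size=pred.shape[-2:], mode="nearest")
        m = F.interpolate(valid_mask, size=pred.shape[-2:], mode="nearest")
        # Average the masked L1 error over the valid pixels
        total = total + (m * (pred - gt).abs()).sum() / m.sum().clamp(min=1)
    return total
```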

  13. Stereo Blur Dataset • 1. Use the ZED stereo camera (frame rate 60 fps) to capture data • 2. Increase the video frame rate to 480 fps using a fast and high-quality frame interpolation method (https://arxiv.org/abs/1708.01692) • 3. Average a varying number (17, 33, or 49) of successive frames to generate blur of different sizes (a sketch of this step follows below) • 135 diverse real-world sequences of dynamic scenes • 20,637 blurry-sharp stereo image pairs with the corresponding bidirectional disparities and mask maps • 98 training sequences (17,319 samples) and 37 testing sequences (3,318 samples)
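
The blur-synthesis step in that pipeline can be sketched in a few lines of NumPy. Averaging in approximately linear intensity with a gamma of 2.2 is a common convention and an assumption here, since the slide does not state the color space.

```python
import numpy as np

def synthesize_blur(frames_480fps, num_frames=33):
    """Average a number (17, 33, or 49) of successive high-frame-rate
    frames to synthesize one blurry image.

    frames_480fps : uint8 array of shape (N, H, W, 3), N >= num_frames
    """
    frames = frames_480fps[:num_frames].astype(np.float64) / 255.0
    linear = frames ** 2.2                      # approximate inverse sRGB gamma
    blurred = linear.mean(axis=0) ** (1 / 2.2)  # average, then re-encode
    return (blurred * 255.0).round().astype(np.uint8)
```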

  14. Experiments - Training • Pretrain DeblurNet on the presented dataset and DispBiNet on a subset (10,806 samples) of the FlyingThings3D dataset • Finetune DispBiNet fully on the Stereo Blur Dataset until convergence • Jointly train the overall network on the Stereo Blur Dataset

  15. Experiments • On the proposed Stereo Blur Dataset

  16. Experiments • On the GOPRO dataset

  17. Effectiveness of the Disparity (c) feed two exactly identical images into the proposed network (d) do not warp features from the other view in the FusionNet

  18. Ablation Study • Context Module, depth awareness, and view aggregation • Replace the Context Module of DeblurNet with a one-path convolution block with the same number of layers • Remove the disparity loss of DispBiNet • Substitute … with …

  19. Conclusion • Advantages: depth awareness and view aggregation; accuracy, speed, and model size • Disadvantages: ablation study?

  20. THANKS!
