
Visual Computing Theory and Engineering

Explore the subtopics of low-level visual computing, such as image estimation, de-noising, restoration, and super resolution techniques. Delve into the theories and engineering concepts behind these topics.


Presentation Transcript


  1. Visual Computing Theory and Engineering • Topic: Low Level • Group 1

  2. Topic: Low-level • Group Members (Group 1): • Zhaohui Che 车朝晖, Wei Li 李伟, Chunlei Cai 蔡春磊, Lin Chen 陈琳, Yexun Zhang 张烨珣, Gaolei Li 李高磊 • Hui Chen 陈卉, Ce Zheng 郑策, Minsi Wang 王敏思, Wenhan Zhu 朱文瀚 • Subtopics: • Image estimation • Image de-noising • Image restoration • Super resolution

  3. Outline • ‘Low-level’ and ‘High-level’ • Subtopics of ‘Low-level’: • Image estimation • Image de-noising • Image restoration • Super resolution • Summary

  4. Low-level & High-level • Low-level[1]: • Low-level image processing is mainly concerned with extracting descriptions from images (descriptions that are usually represented as images themselves). • There may be multiple, largely independent descriptions, such as edge fragments, spots, reflectance, line fragments, etc. • High-level: • High-level vision deals with features we can directly see and recognize, such as object classification, recognition, segmentation and so on. [1] Low-level image processing http://homepages.inf.ed.ac.uk/rbf/CVonline/LOCAL_COPIES/MARBLE/low/low.htm

  5. Subtopics of ‘Low-level’: • Image estimation • Image de-noising • Image restoration • Super resolution

  6. Image Estimation • Generative Image Modeling Using Spatial LSTMs • Zhaohui Che 车朝晖 • Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs • Gaolei Li 李高磊

  7. Generative Image Modeling Using Spatial LSTMs Zhaohui Che 车朝晖

  8. MCGSM: mixture of conditional Gaussian scale mixtures • We give the pixels an ordering and specify the distribution of each pixel conditioned on its parent pixels. • SLSTM: spatial long short-term memory • Core part: memory units c(ij) and hidden units h(ij). For each location (i,j), the sLSTM computes c(ij) and h(ij) from the memory and hidden states of the neighbors above and to the left.

  9. σ is the logistic sigmoid function, ⊙ indicates a pointwise product, and T(A,b) is an affine transformation which depends only on the network parameters A and b. The gating units i(ij) and o(ij) determine which memory units are affected by the inputs through g(ij), and which memory states are written to the hidden units h(ij). RIDE (recurrent image density estimator): uses pixels in a much larger region for prediction, and nonlinearly transforms the pixels before applying the MCGSM.
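A minimal numpy sketch of one sLSTM state update, following the description above. The concatenation order fed to T(A,b) and the layout of the five gate blocks (g, o, i, plus two forget gates for the left and upper memory states, which are not named on the slide) are assumptions made for illustration, not the paper's exact parameterization.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def slstm_step(x_ctx, h_left, h_up, c_left, c_up, A, b):
    """One spatial LSTM update at location (i, j).

    x_ctx  : causal-neighborhood pixels for (i, j) (vector)
    h_left : hidden state h(i, j-1);  h_up : hidden state h(i-1, j)
    c_left : memory state c(i, j-1);  c_up : memory state c(i-1, j)
    A, b   : parameters of the affine transformation T(A, b)
    The layout of the five gate blocks below is an illustrative assumption.
    """
    d = h_left.size
    z = A @ np.concatenate([x_ctx, h_left, h_up]) + b   # T(A,b)(...)
    g = np.tanh(z[0:d])           # proposed content g(ij)
    o = sigmoid(z[d:2*d])         # output gate o(ij)
    i = sigmoid(z[2*d:3*d])       # input gate i(ij)
    f_c = sigmoid(z[3*d:4*d])     # forget gate for the left memory state
    f_r = sigmoid(z[4*d:5*d])     # forget gate for the upper memory state
    c = g * i + c_left * f_c + c_up * f_r   # new memory units c(ij)
    h = np.tanh(c * o)                      # new hidden units h(ij)
    return c, h
```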

  10. Depth and surface normal estimation from monocular images using regression on deep features and hierarchical CRFs Gaolei Li 李高磊

  11. Problems • Depth and surface normal estimation from single monocular color images

  12. Related Methods • Conditional random fields (CRFs). • Regression on deep convolutional neural network (DCNN).

  13. Basic Architecture

  14. Steps • Depth regression with CNNs. • Refining the results via hierarchical CRF.
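The refinement step can be pictured as smoothing the CNN's coarse depth regression under a CRF-style quadratic energy: a data term anchoring each pixel to the regressed depth, plus a color-weighted smoothness term. The sketch below is a deliberately simplified, pixel-level stand-in for the paper's hierarchical CRF; the Gaussian color weight, the Jacobi solver, and all parameter values are illustrative assumptions.

```python
import numpy as np

def refine_depth(d_cnn, image, lam=5.0, sigma=0.1, n_iter=200):
    """Refine a coarse CNN depth map d_cnn (H, W) using the colour image (H, W, 3).

    Minimizes E(d) = sum_p (d_p - d_cnn_p)^2 + lam * sum_{p~q} w_pq (d_p - d_q)^2
    over 4-connected neighbours with simple Jacobi iterations.
    """
    H, W = d_cnn.shape
    d = d_cnn.copy()

    def sim(a, b):
        # colour-similarity weight between neighbouring pixels (assumed Gaussian)
        return np.exp(-np.sum((a - b) ** 2, axis=-1) / (2.0 * sigma ** 2))

    w_up = np.zeros((H, W))
    w_up[1:, :] = sim(image[1:, :], image[:-1, :])
    w_down = np.zeros((H, W))
    w_down[:-1, :] = sim(image[:-1, :], image[1:, :])
    w_left = np.zeros((H, W))
    w_left[:, 1:] = sim(image[:, 1:], image[:, :-1])
    w_right = np.zeros((H, W))
    w_right[:, :-1] = sim(image[:, :-1], image[:, 1:])
    denom = 1.0 + lam * (w_up + w_down + w_left + w_right)

    for _ in range(n_iter):
        nb = np.zeros_like(d)                       # weighted sum of neighbour depths
        nb[1:, :] += w_up[1:, :] * d[:-1, :]
        nb[:-1, :] += w_down[:-1, :] * d[1:, :]
        nb[:, 1:] += w_left[:, 1:] * d[:, :-1]
        nb[:, :-1] += w_right[:, :-1] * d[:, 1:]
        d = (d_cnn + lam * nb) / denom              # per-pixel closed-form Jacobi step
    return d
```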

  15. Principles and Implementation

  16. Experimental Results • NYU2 data set

  17. Make3D data set

  18. Performance Evaluation

  19. Conclusions • In this paper, we have presented a new and common framework for depth and surface normal estimation from single monocular images, which consists of regression using deep CNNs and refining via a hierarchical CRF. With this simple framework, we have achieved promising results for both tasks of depth and surface normal estimation. In the future, we plan to investigate different data augmentation methods to improve the performance in handling real-world image transformations. • Furthermore, we plan to explore the use of deeper CNNs. Our preliminary results show that improved depth estimation can be obtained with VGGNet, compared with AlexNet. In addition, the effect of joint depth and semantic class estimation with deep CNN features also deserves attention.

  20. Image de-noising • Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal • Ce Zheng 郑策 • Patch Group Based Nonlocal Self-Similarity Prior Learning for Image De-noising • Wenhan Zhu 朱文瀚

  21. Learning a Convolutional Neural Network for Non-uniform Motion Blur Removal Ce Zheng 郑策

  22. Introduction • Image deblurring aims at recovering a sharp image from a blurry image caused by camera shake, object motion or defocus. • Estimate the probabilities of motion kernels at the patch level using a convolutional neural network (CNN). • Fuse the patch-based estimations into a dense field of motion kernels using a Markov random field (MRF) model. • Effectively estimate the spatially varying motion kernels, which makes it possible to remove the motion blur well.

  23. CNN for Motion Blur Estimation • First predict the probabilities of different motion kernels for each image patch. • Then estimate dense motion blur kernels for the whole image using a Markov random field model enforcing motion smoothness. (Figure: representation of the motion blur kernel by a motion vector, and generation of the motion kernel candidates.)
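Since each candidate kernel corresponds to a motion vector (a length and an orientation), a candidate set can be generated by rasterizing centred line segments. The sketch below is a generic construction with assumed discretization steps and kernel size; it does not reproduce the paper's exact candidate set.

```python
import numpy as np

def motion_kernel(length, angle_deg, size=25):
    """Linear motion blur kernel for a motion vector (length, orientation),
    built by rasterizing a centred line segment and normalizing it."""
    k = np.zeros((size, size))
    c = size // 2
    theta = np.deg2rad(angle_deg)
    n = max(int(np.ceil(length)), 1)
    for t in np.linspace(-length / 2.0, length / 2.0, 2 * n + 1):
        x = int(round(c + t * np.cos(theta)))
        y = int(round(c + t * np.sin(theta)))
        if 0 <= x < size and 0 <= y < size:
            k[y, x] += 1.0
    return k / k.sum()

# Candidate set: a no-blur kernel plus a grid of motion lengths and orientations.
# The ranges below are illustrative, not the paper's settings.
candidates = [motion_kernel(1, 0)]
candidates += [motion_kernel(l, a) for l in range(3, 25, 2)
                                   for a in range(0, 180, 30)]
```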

  24. Patch-level Motion Kernel Estimation by CNN • The network is composed of 6 convolutional and fully connected layers, outputting the probability of each candidate motion kernel through a soft-max layer. (Figures: structure of the CNN for motion kernel prediction; motion kernel estimation on a rotated patch.)
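For concreteness, here is one way such a patch-level classifier could be laid out in PyTorch: a few convolutional layers followed by fully connected layers and a soft-max over the candidate kernels. The filter sizes, channel counts, 30x30 patch size, and the number of candidates are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class MotionKernelCNN(nn.Module):
    """Patch-level classifier over candidate motion kernels (soft-max output)."""

    def __init__(self, n_candidates=73):        # number of candidates is an assumed value
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=7), nn.ReLU(), nn.MaxPool2d(2),   # 30 -> 24 -> 12
            nn.Conv2d(96, 256, kernel_size=5), nn.ReLU(),                  # 12 -> 8
            nn.Conv2d(256, 256, kernel_size=3), nn.ReLU(), nn.MaxPool2d(2) # 8 -> 6 -> 3
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(256 * 3 * 3, 1024), nn.ReLU(),
            nn.Linear(1024, 1024), nn.ReLU(),
            nn.Linear(1024, n_candidates),       # logits over candidate kernels
        )

    def forward(self, patch):                    # patch: (B, 3, 30, 30)
        logits = self.classifier(self.features(patch))
        return torch.softmax(logits, dim=1)      # probability of each candidate kernel
```

During training one would typically feed the raw logits to a cross-entropy loss; the soft-max output here mirrors the per-patch probability maps consumed by the MRF stage.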

  25. Dense Motion Field Estimation by MRF • The left of (b) shows four blurry patches cropped from (a). Each color map on the right of (b) shows the probabilities of motion kernels at different motion lengths and orientations estimated for each blurry patch by the CNN. Note that the high-probability regions are local in each map. (c) shows the final motion kernel estimation. (Figure: examples of motion kernel probabilities.)

  26. Dense Motion Field Estimation by MRF • (Figure: example of non-uniform motion kernel estimation. (b) Estimation using only the unary term of Eqn. (6), i.e., choosing the motion kernel with the highest confidence for each pixel. (c) Estimation using the full model of Eqn. (6) with the motion smoothness constraint. (d) Ground-truth motion blur.)
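The fusion step can be read as an energy minimization: a unary cost derived from the CNN probabilities plus a pairwise cost that penalizes neighboring pixels whose motion vectors differ. The sketch below minimizes a simplified energy with iterated conditional modes (ICM) as a stand-in; it is not the paper's Eqn. (6) or its optimizer.

```python
import numpy as np

def fuse_motion_field(unary, motions, lam=1.0, n_iter=5):
    """Dense motion-kernel labelling by minimizing a simple MRF energy
    E(L) = sum_p unary[p, L_p] + lam * sum_{p~q} ||m(L_p) - m(L_q)||^2 with ICM.

    unary   : (H, W, K) negative log-probabilities from the CNN
    motions : (K, 2) motion vectors (dx, dy) of the K candidate kernels
    """
    H, W, K = unary.shape
    labels = unary.argmin(axis=2)                   # unary-only initialization
    diff = motions[:, None, :] - motions[None, :, :]
    pair = lam * np.sum(diff ** 2, axis=2)          # (K, K) pairwise label cost
    for _ in range(n_iter):
        for y in range(H):
            for x in range(W):
                cost = unary[y, x].copy()
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < H and 0 <= nx < W:
                        cost += pair[:, labels[ny, nx]]   # smoothness w.r.t. neighbours
                labels[y, x] = cost.argmin()
    return labels
```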

  27. Experiments Figure 9 presents four examples with strongly non-uniform motion blur captured for scenes with complex depth layers. The first three examples are real-captured blurry images, and the final example is a synthetic blurry image. All these examples show that our CNN-based approach can effectively predict the spatially varying motion kernels.

  28. Conclusion In this paper, we have proposed a novel CNN-based non-uniform motion deblurring approach. We learn an effective CNN for estimating motion kernels from local patches. Using an MRF model, we are able to well predict the non-uniform motion blur field. This leads to state-of-the-art motion deblurring results.

  29. Patch Group Based Nonlocal Self-Similarity Prior Learning for Image Denoising Wenhan Zhu

  30. Purpose • Background: there is no explicit NSS prior model learned from natural images for image restoration. • Major work: in this paper, the authors propose to learn explicit NSS models from natural images and apply the learned prior models to noisy images for high-performance denoising. • They develop a patch group (PG) based NSS prior learning scheme to improve the performance of image restoration.

  31. Flowchart of PGPD • Flowchart of the proposed patch group based prior learning and image de-noising framework.

  32. Patch group • A PG is formed by grouping the M patches most similar to a reference patch, denoted by {x1, ..., xM}. • The mean vector of this PG is μ = (1/M) Σ xm. • Each x̄m = xm − μ is the group-mean-subtracted patch vector, and the set of these vectors is the mean-subtracted PG used for prior learning.
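A minimal numpy sketch of this PG construction: find the M patches most similar to a reference patch, take their mean, and subtract it. The Euclidean patch distance and the parameter names follow common block-matching practice and are assumptions, not the paper's exact settings.

```python
import numpy as np

def build_patch_group(patches, ref_idx, M=10):
    """Form a patch group (PG) around a reference patch and subtract the group mean.

    patches : (N, d) matrix of vectorized patches from a local search window
    ref_idx : index of the reference patch
    M       : number of most similar patches kept in the group
    Returns the PG mean vector and the group-mean-subtracted patches.
    """
    ref = patches[ref_idx]
    dist = np.sum((patches - ref) ** 2, axis=1)   # Euclidean patch distance
    group = patches[np.argsort(dist)[:M]]         # the M most similar patches
    mu = group.mean(axis=0)                       # PG mean vector
    return mu, group - mu                         # mean-subtracted PG
```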

  33. Algorithm

  34. Results: • Compare the proposed PGPD algorithm with BM3D, EPLL, LSSC, NCSR and WNNM. • PGPD has higher PSNR values than BM3D, LSSC, EPLL and NCSR, and is only slightly inferior to WNNM. However, PGPD is much more efficient than WNNM. • In summary, the proposed PGPD method demonstrates powerful de-noising ability quantitatively and qualitatively, and it is highly efficient.

  35. Image Restoration • Conformal and Low-Rank Sparse Representation for Image Restoration • Yexun Zhang 张烨珣

  36. Conformal and Low-Rank Sparse Representation for Image Restoration Yexun Zhang 张烨珣

  37. Objective: Image Restoration • Method: obtaining an appropriate dictionary is the key point when sparse representation is applied to computer vision or image processing problems such as image restoration. • Opportunity: many existing dictionary learning methods handle training samples individually and miss the relationships between samples, which results in dictionaries with redundant atoms but poor representation ability.

  38. Sparse Representation: a model which assumes that there exists a dictionary that can reconstruct the signals. Each signal can be represented by a sparse linear combination of atoms in the dictionary. • How to obtain the dictionary? Either use an analytically predefined dictionary, or learn the dictionary from training samples.
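To make the model concrete, the sketch below computes a sparse code α for a signal x over a given dictionary D (x ≈ Dα) using the generic ISTA algorithm for ℓ1-regularized least squares. This is a standard illustration of sparse representation, not the solver used in the paper.

```python
import numpy as np

def sparse_code(x, D, lam=0.1, n_iter=200):
    """Sparse coding of x over dictionary D by minimizing
    0.5 * ||x - D @ alpha||^2 + lam * ||alpha||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    alpha = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ alpha - x)       # gradient of the data-fidelity term
        z = alpha - grad / L
        alpha = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-thresholding
    return alpha
```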

  39. Early dictionary learning methods • K-SVD algorithm • Focuses on the reconstruction power of the dictionary • Depends on a large training dataset • Fixed dictionary size, leading to dictionary redundancy • Later methods add discrimination constraints • They still train samples individually or only consider the discrimination between classes • They ignore the local relationships between samples and the structure of the data manifold, leading to redundant dictionaries with poor representation ability

  40. Consider the relationships • Local perspective: • Local samples have similar features and form a local subspace, reflecting the affinities between each other. • Global perspective: • Samples with similar features are linearly related, and thus they lie in a low-dimensional latent space. Key: how to embed these two relationships into dictionary learning and sparse representation?

  41. The paper’s work • Conformal and Low-rank Sparse Representation (CLRSR) • The conformal property is introduced by preserving the angles of the local geometry formed by neighboring samples in the feature space. • Imposing a low-rank constraint on the coefficient matrix can lead to more faithful subspaces and capture the global structure of the data.

  42. The inner structures of the data can be better modelled through Conformal Eigenmaps, which project data from a high-dimensional space to a low-dimensional manifold while preserving the angles formed by neighboring samples. • These angle relationships are called the conformal property. • Enforce the coefficient matrix to be low-rank to capture the global structure of the data. • The samples extracted from images/videos are related to each other, so they lie on low-rank subspaces. • Samples with similar features will have similar sparse representations, resulting in similar coding coefficients. Therefore, the coefficient matrix A is expected to be low-rank.
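The standard tool for imposing such a low-rank constraint is singular value thresholding, the proximal operator of the nuclear norm; solvers for models of this kind typically apply it to the coefficient matrix at each iteration. The sketch below shows only this generic building block, not the full CLRSR optimization.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: proximal operator of the nuclear norm,
    used to push the coefficient matrix A toward low rank."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)   # shrink singular values toward zero
    return (U * s_shrunk) @ Vt            # lower-rank approximation of A
```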

  43. Super Resolution • Convolutional Sparse Coding for Image Super-resolution • Lin Chen 陈琳, Chunlei Cai 蔡春磊 • Deep Networks for Image Super-Resolution with Sparse Prior • Minsi Wang 王敏思 • Bidirectional Recurrent Convolutional Networks for Multi-Frame Super-Resolution • Wei Li 李伟 • Video Super-Resolution via Deep Draft-Ensemble Learning • Hui Chen 陈卉

  44. Convolutional Sparse Coding for Image Super-resolution (2015 IEEE International Conference on Computer Vision) Reporters: Chunlei Cai 蔡春磊, Lin Chen 陈琳

  45. Some Fundamental Concepts • Definitions: • Super-resolution (SR): the purpose of super-resolution is to reconstruct a high-resolution (HR) image from a single low-resolution (LR) image or a sequence of LR images. • Sparse coding (SC): sparse representation encodes a signal vector x as a linear combination of a few atoms in a dictionary D, i.e., x ≈ Dα, where α is the sparse coding vector.

  46. Related Work • Single Image Super-Resolution (SISR) methods: • Most SISR methods utilize prior knowledge on image patches and can be grouped into three categories: example-based methods, mapping-based methods, and sparse coding based methods. • What is the disadvantage of these methods? They partition the image into overlapped patches and process each patch separately, so they ignore the consistency of pixels in overlapped patches, which is a strong constraint for image reconstruction.

  47. The Contributions of this Paper • Propose a convolutional sparse coding (CSC) based SR method: compared with conventional sparse coding methods, which process each overlapped patch independently, the global decomposition strategy in CSC is more suitable for image reconstruction. • Train a sparse mapping function: to take full advantage of the feature maps generated by the convolutional coding, the paper utilizes the feature-space information to train a sparse mapping function. Such a mechanism reduces the number of filters used to decompose the LR input image. • Better experimental results: experiments on commonly used test images show that the proposed method achieves SR results that are very competitive with the state-of-the-art methods, not only in PSNR but also in visual quality.

  48. Details About the Proposed Algorithm • Flowchart: in order to obtain sparser feature maps, the paper decomposes the LR image into a smooth component and a residual component before SR. The smooth component is simply enlarged by the bi-cubic interpolator, and the proposed CSC-SR model is applied to the residual component.
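A minimal sketch of this pre-processing step, assuming a Gaussian low-pass filter for the smooth/residual split (the slide does not specify the exact filter); the CSC-based SR model would then operate on the returned residual.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def decompose_and_upscale(lr_image, scale=2, sigma=1.5):
    """Split a grayscale LR image into smooth + residual components and enlarge
    only the smooth part with bi-cubic interpolation."""
    smooth = gaussian_filter(lr_image, sigma=sigma)   # low-frequency (smooth) component
    residual = lr_image - smooth                      # high-frequency residual for CSC-SR
    smooth_hr = zoom(smooth, scale, order=3)          # bi-cubic enlargement of the smooth part
    return smooth_hr, residual
```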

  49. Details About the Proposed Algorithm • Algorithm • The corresponding learning models

  50. Experimental Results • Convergence Analysis: in most experiments, the algorithm converges within 10 iterations.
