Presenter : 張庭豪

Image Denoising and Inpainting with Deep Neural NetworksJunyuanXie, Linli Xu, EnhongChenSchool of Computer Science and TechnologyUniversity of Science and Technology of China 2012 Presenter: 張庭豪

Outline • Introduction • Model Description • Experiments • Discussion • Conclusion

Introduction • Image denoisingand inpainting are common image restoration problems • This paper focuses on image denoising and blind inpainting. • Image denoising problems arise whenan image is corrupted by additive white Gaussian noise, whereas image inpainting problems occur when some pixel values are missing or when we want to remove more sophisticated patterns, like superimposed text or other objects, from the image. • Image inpainting methods can be divided into two categories: non-blind inpainting and blind inpainting.

Introduction • In non-blind inpainting, the regions that need to be filled in are provided to the algorithm a priori, whereas in blind inpainting, no information about the locations of the corrupted pixels is given and the algorithm must automatically identify the pixels that require inpainting. • The state-of-the-art non-blind inpainting algorithms can perform very well on removing text, doodle, or even very large objects • Blind inpainting, however, is a much harder problem. • In this paper, we propose to combine the advantageous “sparse” and “deep” principles of sparse coding and deep networks to solve the image denoising and blind inpainting problems.

Introduction • We employ DA(Denoising Auto-encoders) to perform pre-training in our method because it naturally lends itself to denoisingand inpainting tasks. DA is a two-layer neural network that tries to reconstruct the original input from a noisy version of it. • A series of DAs can be stacked to form a deep network called Stacked Denoising Auto-encoders (SDA) by using the hidden layer activation of the previous layer as input of the next layer. • SDA is widely used for unsupervised pre-training and feature learning . In these settings, only the clean data is provided while the noisy version of it is generated during training by adding random Gaussian or Salt-and-Pepper noise to the clean data. • After training of one layer, only the clean data is passed on to the network to produce the clean training data for the next layer while the noisy data is discarded. The noisy training data for the next layer is similarly constructed by randomly corrupting the generated clean training data.

Introduction • For the image denoising and inpainting tasks, however, the choices of clean and noisy input are natural: they are set to be the desired image after denoising or inpainting and the observed noisy image respectively. • Therefore, we propose a new training scheme that trains the DA to reconstruct the clean image from the corresponding noisy observation. • After training of the first layer, the hidden layer activations of both the noisy input and the clean input are calculated to serve as the training data of the second layer.

Model Description • We briefly give preliminaries about Denoising Auto-encoder (DA), which is a fundamental building block of our proposed method. • Problem Formulation : • Assuming x is the observed noisy image and y is the original noise free image, we can formulate the image corruption process as:

Model Description • where is an arbitrary stochastic corrupting process that corrupts the input. Then, the denoisingtask’s learning objective becomes: • From this formulation, we can see that the task here is to find a function f that best approximates . • DenoisingAuto-encoder : • Let yibe the original data for i = 1, 2, … , N and xibe the corrupted version of corresponding yi.

Model Description • is the sigmoid activation function, hiis the hidden layer activation, is an approximation of yiand represents the weights and biases. • DA can be trained with various optimization methods to minimize the reconstruction loss: • After finish training a DA, we can move on to training the next layer by using the hidden layer activation of the first layer as the input of the next layer. This is called Stacked denoisingautoencoder(SDA).

Model Description • Stacked Sparse DenoisingAuto-encoders : • In this section, we will describe the structure and optimization objective of the proposed model Stacked Sparse Denoising Auto-encoders (SSDA). • In the training phase, the model is supplied with both the corrupted noisy image patches xi, for i= 1, 2,…,N, and the original patches yi. • After training, SSDA will be able to reconstruct the corresponding clean image given any noisy observation.

Model Description • To combine the virtues of sparse coding and neural networks and avoid over-fitting, we train a DA to minimize the reconstruction loss regularized by a sparsity-inducing term: Where • is the average activation of the hidden layer. • We regularize the hidden layer representation to be sparse by choosing small so that the KL-divergenceterm will encourage the mean activation of hidden units to be small. • Hence the hidden units will be zero most of the time and achieve sparsity.

Model Description • After training of the first DA, we use h(yi) and h(xi) as the clean and noisy input respectively for the second DA. • We then initialize a deep network with the weights obtained from K stacked DAs. The network has one input layer, one output and 2K - 1 hidden layers . • The entire network is then trained using the standard back-propagation algorithm to minimize the following objective: • Here we removed the sparsity regularization because the pre-trained weights will serve as regularization to the network.

Experiments • We narrow our focus down to denoising and inpainting of grey-scale images. • We use a set of natural images collected from the web as our training set and standard testing images as the testing set. • We create noisy images from clean training and testing images by applying the function to them. • We employ Peak Signal to Noise Ratio (PSNR) to quantify denoising results: • where is the mean squared error. • PSNR is one of the standard indicators used for evaluating image denoising results.

Experiments • Denoising White Gaussian Noise • We first corrupt images with additive white Gaussian noise of various standard deviations. • For the proposed method, one SSDA model is trained for each noise level. • We set K to 2 for all cases because adding more layers may slightly improve the performance but require much more training time. • Numerical results showed that differences between the three algorithms are statistical insignificant.

Experiments • A visual comparison is shown. Bayes Least Squares- Gaussian Scale Mixture

Experiments • We find that SSDA gives clearer boundary and restores more texture details than KSVD and BLS-GSM although the PSNR scores are close. • This indicates that although the reconstruction errors averaged over all pixels are the same, SSDA is better at denoisingcomplex regions.

Experiments • Image Inpainting • For the image inpainting task, we test our model on the text removal problem. • Both the training and testing set compose of images with super-imposed text of various fonts and sizes from 18-pix to 36-pix. • Due to the lack of comparable blind inpainting algorithms, We compare our method to the non-blind KSVD inpaintingalgorithm • We find that SSDA is able to eliminate text of small fonts completely while text of larger fonts is dimmed. • The proposed method, being blind, generates results comparable to KSVD’s even though KSVD is a non-blind algorithm.

Experiments

Experiments • Inspired by SSDA’s ability to learn different features when trained on denoising different noise patterns, we argue that training denoising auto-encoders with noise patterns that fit to specific situations can also improve the performance of unsupervised feature learning. • We compare the quality of the learned features by feeding them to SVMs and comparing the corresponding classification accuracy. • We find that the highest classification accuracy on each type of • noise is achieved by the DA trained to remove that type of noise.

Discussion • Unlike models relying on structural priors,ourmethod’s denoising ability comes from learning. • Therefore SSDA’s ability to denoise and inpaint images is mostly the result of training. • Whereas models that rely on structural priors usually have very limited scope of applications, our model can be adapted to other tasks more conveniently.

Conclusion • In this paper, we present a novel approach to image denoising and blind inpainting that combines sparse coding and deep neural networks pre-trained with denoising auto-encoders. • We propose a new training scheme for DA that makes it possible to denoise and inpaint images within a unified framework. • In the experiments, our method achieves performance comparable to traditional linear sparse coding algorithm on the simple task of denoising additive white Gaussian noise. • In our future work, we would like to explore the possibility of adapting the proposed approach to various other applications such as denoising and inpainting of audio and video.

Presenter : 張庭豪