Advancing Artistic Style with Convolutional Neural Networks

Project on “A Neural Algorithm of Artistic Style” by Convolutional Neural Networks
Yeqi Wang, Shuyang Gu

Recap: What we had at the midterm
• Read various papers to understand exactly how CNNs work and which algorithms they use, and dug deeper into some details of the algorithm
• I tried to implement all the methods by following the tutorial guide for the neural style approach
• Ran in CPU mode (not using CUDA), image size: 256×256
• Average iteration time: 5 min per 100 iterations

Iteration count: 100 – 400 iterations

What we hope to do:
• Keep digging into the parameters inside the CNN, and try modifying parts of the Caffe model layout or its parameters
• Try modifying the loss function that trades off style against content
• Try to shorten the iteration time through modifications such as changing the input image or changing the convolution computation method
• Add some supervised learning features?

Digging deeper into the CNN model
• How does the loss function work in a CNN?
• How does each layer work during propagation?
• Some tricks we use in the CNN.

Style loss: For each layer, we have a “style loss”, which is more important than the content loss

• Gram matrix of a layer's feature maps as the “feature” -> style loss in layer l
• Global feature
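The Gram-matrix style loss named above can be sketched in NumPy as follows. This is a minimal illustration, assuming the per-layer formulation from the Gatys et al. paper the project is based on (squared Gram difference scaled by 1 / (4 N² M²), with N channels and M spatial positions); the function names `gram_matrix` and `style_loss_layer` are hypothetical and are not taken from the project's Caffe code.

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of one layer's feature maps.

    features: array of shape (C, H, W), i.e. C feature maps of size H x W.
    Returns a (C, C) matrix of inner products between flattened feature maps.
    """
    c, h, w = features.shape
    flat = features.reshape(c, h * w)  # one row per feature map
    return flat @ flat.T

def style_loss_layer(gen_features, style_features):
    """Style loss for a single layer l (assumed Gatys-style normalization)."""
    c, h, w = gen_features.shape
    g = gram_matrix(gen_features)    # Gram matrix of the generated image
    a = gram_matrix(style_features)  # Gram matrix of the style image
    return np.sum((g - a) ** 2) / (4.0 * c ** 2 * (h * w) ** 2)

# The Gram matrix sums over all spatial positions, so shuffling the
# positions leaves it unchanged. This is the sense in which it acts
# as a global feature.
rng = np.random.default_rng(0)
feats = rng.standard_normal((3, 4, 4))
perm = rng.permutation(16)
shuffled = feats.reshape(3, 16)[:, perm].reshape(3, 4, 4)
print(np.allclose(gram_matrix(feats), gram_matrix(shuffled)))
```

Because the spatial layout is discarded, the loss compares which features co-occur rather than where they occur, which is why it captures texture and style rather than content.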

Fine-tuning the CNN layer configuration
• In the original design of the style loss layers, there is a single style loss layer in each layer
• Tried adding more style loss layers in each layer (after each ReLU layer)
• Max pooling -> average pooling

Original

Details:
• We can see some differences in the details of the generated image after the style layer configuration change (even when all other parameters remain the same)

Max pooling -> Average pooling
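To make the difference between the two pooling modes concrete, here is a small NumPy sketch of non-overlapping 2x2 pooling on a single feature map. The helper `pool2x2` is hypothetical and written only for illustration; in the project itself the switch would be made in the Caffe model definition.

```python
import numpy as np

def pool2x2(x, mode="max"):
    """Non-overlapping 2x2 pooling (stride 2) on a 2-D feature map."""
    h, w = x.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    # Split the map into 2x2 blocks: blocks[i, a, j, b] == x[2*i + a, 2*j + b]
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))   # keep the strongest activation
    if mode == "avg":
        return blocks.mean(axis=(1, 3))  # keep the mean activation
    raise ValueError("mode must be 'max' or 'avg'")

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [0, 0, 1, 1],
              [0, 4, 1, 1]], dtype=float)

print(pool2x2(x, "max"))  # [[4. 8.] [4. 1.]]
print(pool2x2(x, "avg"))  # [[2.5 6.5] [1.  1. ]]
```

Max pooling passes on only one value per block, while average pooling lets every position in the block contribute to the output and to the gradient.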

Loss function change

Distribution of W
• We tried a Gaussian distribution as the simplest choice (Assumption: the middle layers will give a better style construction)
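One way to read the Gaussian choice for W is as a set of per-layer style weights that peak at the middle layer. The sketch below follows that reading; the centering on the middle layer index, the `sigma` parameter, the normalization to sum to 1, and the layer names are all assumptions made for illustration and are not stated in the slides.

```python
import numpy as np

def gaussian_layer_weights(num_layers, sigma=1.0):
    """Per-layer weights shaped like a Gaussian centered on the middle layer."""
    idx = np.arange(num_layers)
    center = (num_layers - 1) / 2.0            # assumed: peak at the middle layer
    w = np.exp(-0.5 * ((idx - center) / sigma) ** 2)
    return w / w.sum()                         # assumed: weights sum to 1

# Hypothetical style layers; the names follow the usual VGG convention.
layers = ["conv1_1", "conv2_1", "conv3_1", "conv4_1", "conv5_1"]
for name, w in zip(layers, gaussian_layer_weights(len(layers))):
    print(name, round(float(w), 3))
```

The total style loss would then be the weighted sum of the per-layer style losses, so the middle layers dominate and the first and last layers contribute least.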

Gradient descent of weights with Gaussian distribution

What we want from “art style”

Tricks in CNN
• Eliminate sizing headaches (tips/tricks):
- start with an image that has a power-of-2 size
- for conv layers, use stride 1, filter size 3x3, and pad the input with a border of zeros (1 spatially). This makes it so that [W1,H1,D1] -> [W1,H1,D2] (i.e. spatial size exactly preserved)
- for pool layers, use pool size 2x2 (more = worse)
(slide from Fei-Fei Li & Andrej Karpathy)
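The sizing rule can be checked with the standard output-size formula for convolution and pooling, (W - F + 2P) / S + 1. The sketch below is plain Python; the helper names are hypothetical, and the pooling stride of 2 is an assumption (the slide only gives the 2x2 pool size).

```python
def conv_output_size(w, f=3, stride=1, pad=1):
    """Spatial output size of a conv layer: (W - F + 2P) / S + 1."""
    return (w - f + 2 * pad) // stride + 1

def pool_output_size(w, f=2, stride=2):
    """Spatial output size of a pooling layer without padding."""
    return (w - f) // stride + 1

size = 256  # power-of-2 input, as used in the midterm runs
for _ in range(4):
    size = conv_output_size(size)  # 3x3, stride 1, pad 1: size is preserved
    size = pool_output_size(size)  # 2x2 pool, stride 2: size is halved
    print(size)                    # 128, 64, 32, 16
```

With a power-of-2 input, every halving is exact, so no layer has to round or crop its output.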

Thanks
