The relation between biological vision and computer vision Principles of Back-Propagation Prof. Bart ter Haar Romeny
How does this actually work? Deep Learning Convolutional Neural Networks. In AlexNet (Alex Krizhevsky, 2012): error backpropagation. ImageNet challenge: 1.4 million images, 1000 classes, accuracy 75% → 94%. A typical big deep NN has (hundreds of) millions of connections: weights. Convolution, ReLU, max pooling, convolution, convolution, etc.
A numerical example of backpropagation on a simple network, from Prakash Jay, Senior Data Scientist @FractalAnalytics: https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c
Approach • Build a small neural network as defined in the architecture on the right. • Initialize the weights and biases randomly. • Fix the input and output. • Forward-pass the inputs and calculate the cost. • Compute the gradients and errors. • Backpropagate and adjust the weights and biases accordingly. We initialize the network randomly:
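A minimal sketch of this random initialization in code (the 3-3-3-3 layer sizes, the NumPy initialization scale, and the input values are assumptions for illustration; only the 3-class target [1.0, 0.0, 0.0] is given on a later slide):

import numpy as np

rng = np.random.default_rng(0)

# Assumed architecture for illustration: 3 inputs, a ReLU hidden layer,
# a sigmoid hidden layer, and a 3-class softmax output (3-3-3-3),
# matching the 3-element output vectors on the following slides.
sizes = [3, 3, 3, 3]

# Random initialization of one weight matrix and one bias row per layer.
W = [rng.normal(scale=0.5, size=(m, n)) for m, n in zip(sizes[:-1], sizes[1:])]
b = [rng.normal(scale=0.5, size=(1, n)) for n in sizes[1:]]

# Fixed input (placeholder values) and target output from the later slide.
x = np.array([[0.1, 0.2, 0.7]])
y = np.array([[1.0, 0.0, 0.0]])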
Forward pass layer 1: Matrix operation: ReLU operation: Example:
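A sketch of this step in code, assuming the x, W, b variables initialized above:

def relu(z):
    # ReLU operation: element-wise max(0, z)
    return np.maximum(0.0, z)

# Matrix operation: z1 = x·W1 + b1, followed by the ReLU operation.
z1 = x @ W[0] + b[0]
a1 = relu(z1)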
Forward pass layer 2: Matrix operation: Sigmoid operation: Example:
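The same pattern for layer 2, assuming a1 from the previous step:

def sigmoid(z):
    # Sigmoid operation: 1 / (1 + exp(-z)), element-wise
    return 1.0 / (1.0 + np.exp(-z))

# Matrix operation: z2 = a1·W2 + b2, followed by the sigmoid operation.
z2 = a1 @ W[1] + b[1]
a2 = sigmoid(z2)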
Forward pass output layer: Matrix operation: Softmax operation: Example: [0.19858, 0.28559, 0.51583]
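A sketch of the output layer, assuming a2 from layer 2 (with the slide's weights this gives the example vector above; with the random initialization here the numbers will differ):

def softmax(z):
    # Softmax operation: exponentiate (shifted for numerical stability) and normalize.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# Matrix operation: z3 = a2·W3 + b3, followed by the softmax operation.
z3 = a2 @ W[2] + b[2]
p = softmax(z3)   # slide example: [0.19858, 0.28559, 0.51583]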
Analysis: • The actual output should be [1.0, 0.0, 0.0], but we got [0.19858, 0.28559, 0.51583]. • To calculate the error, let us use cross-entropy. • Error = -(1·log(0.19858) + 0·log(1-0.19858) + 0·log(0.28559) + 1·log(1-0.28559) + 0·log(0.51583) + 1·log(1-0.51583)) = 2.67818 We are done with the forward pass. We know the error of the first iteration (we will do this numerous times). Now let us study the backward pass.
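The slide's error measure written out as code, for the predicted vector p and target y defined above:

def cross_entropy(p, y):
    # Per-class cross-entropy as on this slide:
    # E = -sum_i [ y_i * log(p_i) + (1 - y_i) * log(1 - p_i) ]
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

error = cross_entropy(p, y)   # 2.67818 for the slide's example values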
A chain of functions: From Rohan Kapur: https://ayearofai.com/rohan-lenny-1-neural-networks-the-backpropagation-algorithm-explained-abf4609d4f9d
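For the three-layer network of the example above, this chain of functions can be written as (a sketch, with the notation assumed to match the forward-pass slides):

p = \operatorname{softmax}\!\big(W_3\,\sigma\!\big(W_2\,\operatorname{ReLU}(W_1 x + b_1) + b_2\big) + b_3\big)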
For gradient descent, the derivative of this function with respect to some arbitrary weight (for example w1) is calculated by applying the chain rule. For a simple error measure (p = predicted, a = actual):
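Spelled out, assuming the common squared-error form for the simple error measure:

E = \tfrac{1}{2}(p - a)^2, \qquad
\frac{\partial E}{\partial w_1} = \frac{\partial E}{\partial p}\cdot\frac{\partial p}{\partial w_1}, \qquad
\frac{\partial E}{\partial p} = p - a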
Important derivatives: Sigmoid: ReLU: SoftMax:
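Written out, these derivatives take the standard forms:

\sigma'(x) = \sigma(x)\big(1 - \sigma(x)\big), \qquad
\operatorname{ReLU}'(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}, \qquad
\frac{\partial s_i}{\partial x_j} = s_i\,(\delta_{ij} - s_j) \quad \text{for } s = \operatorname{softmax}(x)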
Two slides ago, we saw that:
Going one more layer backwards, we can determine the gradients for the earlier weights in the same way. And finally, we update the weights and iterate until convergence:
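A compact sketch of the backward pass and gradient-descent update for the network sketched above. The learning rate eta is an assumed value, and the error is taken here as the categorical cross-entropy -Σ yᵢ log pᵢ, whose gradient through the softmax simplifies to p - y (slightly different from the per-class error measure on the earlier slide):

eta = 0.01   # learning rate (assumed value)

for step in range(1000):
    # Forward pass, as on the previous slides.
    z1 = x @ W[0] + b[0]; a1 = relu(z1)
    z2 = a1 @ W[1] + b[1]; a2 = sigmoid(z2)
    z3 = a2 @ W[2] + b[2]; p = softmax(z3)

    # Backward pass, layer by layer (chain rule).
    d3 = p - y                              # dE/dz3 for softmax + categorical cross-entropy
    d2 = (d3 @ W[2].T) * a2 * (1.0 - a2)    # through the sigmoid layer
    d1 = (d2 @ W[1].T) * (z1 > 0)           # through the ReLU layer

    # Gradient-descent update of all weights and biases.
    W[2] -= eta * a2.T @ d3;  b[2] -= eta * d3
    W[1] -= eta * a1.T @ d2;  b[1] -= eta * d2
    W[0] -= eta * x.T @ d1;   b[0] -= eta * d1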
Numerical example in great detail by Prakash Jay on Medium.com: • https://medium.com/@14prakash/back-propagation-is-very-simple-who-made-it-complicated-97b794c97e5c
Deeper reading: • https://eli.thegreenplace.net/2016/the-softmax-function-and-its-derivative • https://eli.thegreenplace.net/2018/backpropagation-through-a-fully-connected-layer/