230 likes | 432 Views
Deep Learning. Supervised Learning. Works well if we have right features Domains like computer vision, audio processing, and natural language processing requires feature engineering. Feature Engineering is tough job Manually finding right features does not scale well. What?.
E N D
Supervised Learning • Works well if we have right features • Domains like computer vision, audio processing, and natural language processing requires feature engineering. • Feature Engineering is tough job • Manually finding right features does not scale well
What? • Learn better features. • That are sparse • Effective How? • Motivated by small part of brain neocortex • In all mammals, it is involved in "higher functions" such as sensory perception, generation of motor commands, spatial reasoning, conscious thought and language.
Big Picture object models object parts (combination of edges) edges pixels
Neural Network where is called the activation function.
Back propagation Objective function Update rule for weights and biases for given layer and given training sample Batch update rule for given layer and cumulated over all training samples
Auto-encoders and Sparsity • Back propagation for Unsupervised learning with • Learn an approximation to the identity function. It is trivial, what we can achieve • limit number of hidden nodes. • if we impose a sparsity constraint on the hidden units be the average activation of hidden unit (averaged over the training set).
Auto-encoder and Sparsity • enforce the constraint • where is a sparsity parameter, typically a small value close to zero • (say ) • This can be done by adding one more term in objective function Now the objective function becomes
What is learned by auto-encoder? • We will try to find what image activates most a particular hidden node? • To achieve this for a particular ithhidden node, we construct image by setting jth pixel by
Unsupervised feature learning with a neural network x1 x1 x2 x2 • Autoencoder. • Network is trained to output the input (learn identify function). • Trivial solution unless: • Constrain number of units in Layer 2 (learn compressed representation), or • Constrain Layer 2 to be sparse. x3 x3 a1 x4 x4 x5 x5 a2 +1 x6 x6 a3 Layer 2 Layer 3 +1 Layer 1
Unsupervised feature learning with a neural network x1 x1 x2 x2 a1 x3 x3 a2 x4 x4 a3 x5 x5 +1 x6 x6 Layer 2 Layer 3 +1 Layer 1
Unsupervised feature learning with a neural network x1 x2 a1 x3 a2 x4 a3 x5 +1 New representation for input. x6 Layer 2 +1 Layer 1
Unsupervised feature learning with a neural network x1 x2 a1 x3 a2 x4 a3 x5 +1 x6 Layer 2 +1 Layer 1
Unsupervised feature learning with a neural network x1 x2 a1 b1 x3 a2 b2 x4 a3 b3 x5 +1 +1 x6 Train parameters so that , subject to bi’s being sparse. +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 x3 a2 b2 x4 a3 b3 x5 +1 +1 x6 Train parameters so that , subject to bi’s being sparse. +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 x3 a2 b2 x4 a3 b3 x5 +1 +1 x6 Train parameters so that , subject to bi’s being sparse. +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 x3 a2 b2 x4 a3 b3 x5 +1 +1 New representation for input. x6 +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 x3 a2 b2 x4 a3 b3 x5 +1 +1 x6 +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 c1 x3 a2 b2 c2 x4 a3 b3 c3 x5 +1 +1 +1 x6 +1
Unsupervised feature learning with a neural network x1 x2 a1 b1 c1 x3 a2 b2 c2 x4 a3 b3 c3 x5 New representation for input. +1 +1 +1 x6 +1 Use [c1, c3, c3] as representation to feed to learning algorithm.
References • http://ufldl.stanford.edu/wiki