Deep Learning Tutorial Mitesh M. Khapra IBM Research India (Ideas and material borrowed from Richard Socher's tutorial @ ML Summer School 2014, Yoshua Bengio's tutorial @ ML Summer School 2014 & Hugo Larochelle's lecture videos & slides)
Roadmap • What? • Why? • How? • Where?
Roadmap • What are Deep Neural Networks? • Why? • How? • Where?
Roadmap • What are Deep Neural Networks? • Why should I be interested in Deep Learning? • How? • Where?
Roadmap • What are Deep Neural Networks? • Why should I be interested in Deep Learning? • How do I make a Deep Neural Network work? • Where?
Roadmap • What are Deep Neural Networks? • Why should I be interested in Deep Learning? • How do I train a Deep Neural Network? • Where?
Roadmap • What are Deep Neural Networks? • Why should I be interested in Deep Learning? • How do I train a Deep Neural Network? • Where can I find additional material?
A typical machine learning example: data → feature extraction → feature vector → label. Hand-crafted features might include the number of positive words, number of negative words, length of review, author name, bag of words, etc.
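To make the hand-crafted feature step concrete, here is a minimal Python sketch (not from the tutorial) that computes a few such features for a review; the word lists and the extract_features helper are illustrative assumptions, and the bag-of-words and author-name features are omitted for brevity.

```python
# Illustrative hand-crafted feature extraction for a movie review.
# The word lists and function name below are assumptions for this sketch.
POSITIVE_WORDS = {"good", "great", "excellent", "enjoyable"}
NEGATIVE_WORDS = {"bad", "boring", "terrible", "dull"}

def extract_features(review):
    """Turn raw text into a fixed-length feature vector."""
    tokens = review.lower().split()
    num_pos = sum(t in POSITIVE_WORDS for t in tokens)  # number of positive words
    num_neg = sum(t in NEGATIVE_WORDS for t in tokens)  # number of negative words
    length = len(tokens)                                # length of the review
    return [num_pos, num_neg, length]

print(extract_features("a great and enjoyable movie never boring"))  # -> [2, 1, 7]
```

A classifier is then trained on these feature vectors to predict the label (e.g., positive or negative review).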
So, where does deep learning fit in? • Machine Learning: hand-crafted features, optimize weights to improve prediction • Representation Learning: automatically learn features • Deep Learning: automatically learn multiple levels of features (from Richard Socher's tutorial @ ML Summer School, Lisbon)
The basic building block: the single artificial neuron
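As a rough illustration of the computation a single artificial neuron performs (a weighted sum of its inputs plus a bias, passed through a nonlinearity), here is a minimal numpy sketch; the sigmoid activation and the example numbers are assumptions made for illustration.

```python
import numpy as np

# A single artificial neuron: weighted sum of inputs plus a bias,
# passed through a nonlinear activation (here the logistic sigmoid).
def neuron(x, w, b):
    pre_activation = np.dot(w, x) + b              # a = w . x + b
    return 1.0 / (1.0 + np.exp(-pre_activation))   # h = sigmoid(a), in (0, 1)

x = np.array([0.5, -1.2, 3.0])   # input vector
w = np.array([0.8, 0.1, -0.4])   # one weight per input
b = 0.2                          # bias
print(neuron(x, w, b))
```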
Okay, so what can I use it for? • For binary classification problems, by treating the neuron's output as the predicted class (e.g., thresholding it) • Works when the data is linearly separable (image from Hugo Larochelle's slides)
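For example, a single neuron can learn the linearly separable AND function using the classic perceptron update rule; the sketch below is an illustration under that assumption, not code from the tutorial.

```python
import numpy as np

# Perceptron-style training on the (linearly separable) AND function:
# predict 1 when w . x + b > 0, and 0 otherwise.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])           # AND labels

w, b, lr = np.zeros(2), 0.0, 0.1
for _ in range(20):                  # a few passes over the data suffice here
    for xi, yi in zip(X, y):
        pred = 1 if np.dot(w, xi) + b > 0 else 0
        w += lr * (yi - pred) * xi   # perceptron update rule
        b += lr * (yi - pred)

print([1 if np.dot(w, xi) + b > 0 else 0 for xi in X])  # -> [0, 0, 0, 1]
```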
What are its limitations? • Fails when the data is not linearly separable… (images from Hugo Larochelle's slides) • …unless the input is suitably transformed
A neural network for XOR (a multi-layered neural network) Wait…, are you telling me that I will always have to meditate on the data and then decide the transformation/network? No, definitely not. The XOR example is only to give the intuition. The key takeaway is that by adding more layers you can make the data separable. Let's spend some more time understanding this…
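As a concrete illustration of the point above, here is a small numpy sketch of a two-layer network that computes XOR; the hard-threshold activation and the hand-picked weights are assumptions (one of many choices that work), chosen so the hidden layer maps the inputs into a space where a single output neuron can separate them.

```python
import numpy as np

def step(a):
    return (a > 0).astype(float)       # hard-threshold activation

# Hand-wired two-layer network for XOR: the hidden layer computes
# roughly OR(x1, x2) and AND(x1, x2); the output fires when OR is on
# but AND is off, which is exactly XOR.
def xor_net(x):
    W1 = np.array([[1.0, 1.0],         # hidden unit 1 ~ OR
                   [1.0, 1.0]])        # hidden unit 2 ~ AND
    b1 = np.array([-0.5, -1.5])
    h = step(W1 @ x + b1)              # hidden layer transforms the input...
    w2 = np.array([1.0, -1.0])         # ...so one output neuron can separate it
    b2 = -0.5
    return step(w2 @ h + b2)

for x in ([0, 0], [0, 1], [1, 0], [1, 1]):
    print(x, "->", xor_net(np.array(x, dtype=float)))   # 0, 1, 1, 0
```

In practice the weights would be learned by training; the point is only that the extra layer makes the XOR data separable.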
Capacity of a multi-layer network (graphs from Pascal Vincent's slides)
Capacity of a multi-layer network (image from Pascal Vincent’s slides)
Capacity of a multi-layer network In particular, we can find a separator for the XOR problem (images from Pascal Vincent's slides) • Universal Approximation Theorem (Hornik, 1991): "a single hidden layer neural network with a linear output unit can approximate any continuous function arbitrarily well, given enough hidden units"
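To illustrate the spirit of the theorem, the following numpy sketch fits sin(x) with a single hidden layer of tanh units and a linear output unit, trained by plain gradient descent; the layer size, learning rate and iteration count are arbitrary assumptions made for the example.

```python
import numpy as np

# Fit sin(x) with one hidden layer (tanh) and a linear output unit.
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200).reshape(-1, 1)
y = np.sin(x)

H = 20                                   # number of hidden units
W1, b1 = rng.normal(0, 1, (1, H)), np.zeros(H)
W2, b2 = rng.normal(0, 1, (H, 1)), np.zeros(1)

lr = 0.01
for _ in range(5000):
    h = np.tanh(x @ W1 + b1)             # hidden layer
    pred = h @ W2 + b2                   # linear output unit
    err = pred - y
    # Squared-error gradient (up to a constant factor), backpropagated through both layers.
    gW2, gb2 = h.T @ err / len(x), err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    gW1, gb1 = x.T @ dh / len(x), dh.mean(0)
    W1, b1, W2, b2 = W1 - lr * gW1, b1 - lr * gb1, W2 - lr * gW2, b2 - lr * gb2

print("mean squared error:", float(np.mean(err ** 2)))  # shrinks as training proceeds
```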
Let's take a minute here… If "a single hidden layer neural network" is enough, then why go deeper? Hand-crafted representations vs. automatically learned representations
Multiple layers = multiple levels of features But why would I be interested in learning multiple levels of representations? Let's see where the motivation comes from…
The brain analogy: Layer 1, Layer 2 and Layer 3 representations capture increasingly abstract structure, from parts such as the nose, mouth and eyes up to the whole face (idea from Hugo Larochelle's slides)
YAWN!!! Enough with the brain tampering. Just tell me: why should I be interested in Deep Learning? ("Show me the money")
Used in a wide variety of applications (from Y. Bengio's MLSS 2014 slides)
Industrial-scale success stories: Speech Recognition, Object Recognition, Face Recognition, Cross-Language Learning, Machine Translation, Text Analytics. Dramatic improvements reported in some cases. (Disclaimer: some nodes and edges may be missing due to limited public knowledge.)
Some more success stories (from Y. Bengio's MLSS 2014 slides)
Let me see if I understand this correctly… • Speech Recognition, Machine Translation, etc. are more than 50 years old • Single artificial neurons have been around for more than 50 years • So has Deep Learning also been around for 50+ years? No, even deep neural networks have been around for many, many years, but prior to 2006 training deep nets was unsuccessful
So what has changed since 2006? • New methods for unsupervised pre-training have been developed • More efficient parameter estimation methods • Better understanding of model regularization • Faster machines and more data help DL more than other algorithms (from Y. Bengio's MLSS 2014 slides)
Recap: the single artificial neuron
Switching to slides corresponding to lecture 2 from Hugo Larochelle’s course http://info.usherbrooke.ca/hlarochelle/neural_networks/content.html
Some pointers to additional material • http://deeplearning.net/ • http://info.usherbrooke.ca/hlarochelle/neural_networks/content.html