Comp 5013: Deep Learning Architectures
Daniel L. Silver, March 2014
Y. Bengio - McGill
• 2009 Deep Learning Tutorial
• 2013 Deep Learning towards AI
• Deep Learning of Representations (Y. Bengio): http://www.youtube.com/watch?v=4xsVFLnHC_0
Deep Belief RBM Networks with Geoff Hinton
• Learning layers of features by stacking RBMs (a sketch follows below): http://www.youtube.com/watch?v=VRuQf3DjmfM
• Discriminative fine-tuning in DBNs: http://www.youtube.com/watch?v=-I2pgcH02QM
• What happens during fine-tuning? http://www.youtube.com/watch?v=yxMeeySrfDs
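The stacking idea in the first video can be captured in a few dozen lines. Below is a minimal NumPy sketch (class and function names are my own, not from Hinton's code): each Bernoulli RBM is trained with one step of contrastive divergence (CD-1), and its hidden activations become the training data for the RBM above it.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Bernoulli-Bernoulli RBM trained with one step of contrastive divergence (CD-1)."""
    def __init__(self, n_visible, n_hidden, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.b_v = np.zeros(n_visible)   # visible biases
        self.b_h = np.zeros(n_hidden)    # hidden biases

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.b_h)

    def visible_probs(self, h):
        return sigmoid(h @ self.W.T + self.b_v)

    def cd1_update(self, v0, lr=0.1):
        # Positive phase: hidden activations driven by the data
        h0 = self.hidden_probs(v0)
        # Negative phase: one Gibbs step (sample hiddens, reconstruct, re-infer)
        h_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_probs(h_sample)
        h1 = self.hidden_probs(v1)
        # Approximate gradient of the log-likelihood
        self.W   += lr * (v0.T @ h0 - v1.T @ h1) / len(v0)
        self.b_v += lr * (v0 - v1).mean(axis=0)
        self.b_h += lr * (h0 - h1).mean(axis=0)

def train_stack(data, layer_sizes, epochs=10):
    """Greedy layer-wise pre-training: each RBM learns features of the layer below."""
    rbms, layer_input = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(layer_input.shape[1], n_hidden)
        for _ in range(epochs):
            rbm.cd1_update(layer_input)
        rbms.append(rbm)
        layer_input = rbm.hidden_probs(layer_input)  # feed features upward
    return rbms
```

Discriminative fine-tuning (the second video) would then treat the stacked weights as the initialization of an ordinary feed-forward network and train it with backpropagation.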
Deep Belief RBM Networks with Geoff Hinton
• Learning handwritten digits: http://www.cs.toronto.edu/~hinton/digits.html
• Modeling real-valued data (G. Hinton): http://www.youtube.com/watch?v=jzMahqXfM7I
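For real-valued data, the standard change described in the second video is to replace the binary visible units with linear (Gaussian) ones. A one-method extension of the sketch above, assuming the inputs are standardized to zero mean and unit variance:

```python
class GaussianRBM(RBM):
    """Gaussian-Bernoulli RBM for real-valued inputs (e.g. pixel intensities).
    Assumes the data are standardized to zero mean and unit variance."""
    def visible_probs(self, h):
        # Linear visible units: the reconstruction is the Gaussian mean,
        # with no sigmoid squashing
        return h @ self.W.T + self.b_v
```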
Deep Learning Architectures
• Consider the problem of trying to classify hand-written digits.
Deep Learning Architectures
[Figure: Hinton's digit-recognition network. Images of digits 0-9 (28 x 28 pixels) feed 500 neurons (low-level features), then 500 neurons (higher-level features), topped by 2000 top-level artificial neurons that also connect to the 10 label units.]
• Neural network trained on 40,000 examples
• Learns to label / recognize images and to generate images from labels
• Probabilistic in nature
• Demo (a sketch of the wiring follows below)
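Reusing `RBM` and `train_stack` from the sketch above, the slide's architecture might be wired up as follows. The random stand-in data and the training schedule are placeholders; Hinton's demo trains on real MNIST digits.

```python
import numpy as np

# Layer sizes from the slide: 28x28 pixel images -> 500 low-level feature
# units -> 500 higher-level feature units, plus a 2000-unit top layer that
# also sees the 10 digit labels.
n_pixels, n_labels = 28 * 28, 10
rng = np.random.default_rng(1)

# Stand-in binary data (the real demo trains on 40,000 MNIST digits)
images = (rng.random((1000, n_pixels)) < 0.1).astype(float)
labels = np.eye(n_labels)[rng.integers(0, n_labels, 1000)]

# Greedy pre-training of the two 500-unit feature layers
rbms = train_stack(images, [500, 500])

# Propagate the data upward to get 500-dimensional features
features = images
for rbm in rbms:
    features = rbm.hidden_probs(features)

# Top-level associative memory: a joint RBM over features + labels,
# which is what lets the model both recognize and generate digits
top = RBM(500 + n_labels, 2000)
top.cd1_update(np.hstack([features, labels]))
```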
Deep Convolutional Networks
• Intro: http://www.deeplearning.net/tutorial/lenet.html#lenet
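Before following the LeNet tutorial, it may help to see its two core operations, convolution and pooling, in plain NumPy. This is a sketch of the operations only, not the tutorial's Theano code:

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D convolution (really cross-correlation, as in most conv nets)."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(fmap, size=2):
    """Non-overlapping max pooling, halving each spatial dimension."""
    H, W = fmap.shape
    return fmap[:H - H % size, :W - W % size] \
        .reshape(H // size, size, W // size, size).max(axis=(1, 3))

image = np.random.default_rng(0).random((28, 28))
kernel = np.random.default_rng(1).standard_normal((5, 5))
feature_map = np.tanh(conv2d(image, kernel))   # 24x24 after a 5x5 kernel
pooled = max_pool(feature_map)                 # 12x12 after 2x2 pooling
```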
ML and Computing Power
Andrew Ng's work on deep learning networks (ICML-2012):
• Problem: learn to recognize human faces, cats, etc. from unlabeled data
• Dataset of 10 million images; each image has 200 x 200 pixels
• 9-layer locally connected neural network (1B connections); a sketch of a locally connected layer follows below
• Parallel algorithm; 1,000 machines (16,000 cores) for three days

Quoc V. Le, Marc'Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng. Building High-level Features Using Large Scale Unsupervised Learning. ICML 2012: 29th International Conference on Machine Learning, Edinburgh, Scotland, June 2012.
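"Locally connected" means convolution-like local receptive fields but without weight sharing: each output location has its own filter. A hedged sketch of one such layer; the 8x8 patch and stride are illustrative choices, not the paper's actual receptive-field sizes:

```python
import numpy as np

def locally_connected(image, weights, patch=8, stride=8):
    """Locally connected layer: like a convolution, but each output location
    has its OWN filter (no weight sharing across positions)."""
    n = (image.shape[0] - patch) // stride + 1
    out = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            region = image[i*stride:i*stride+patch, j*stride:j*stride+patch]
            out[i, j] = np.sum(region * weights[i, j])  # filter untied per (i, j)
    return out

rng = np.random.default_rng(0)
image = rng.random((200, 200))                  # one 200x200 pixel input
n = (200 - 8) // 8 + 1                          # 25 output locations per side
weights = 0.01 * rng.standard_normal((n, n, 8, 8))
hidden = np.maximum(0, locally_connected(image, weights))
```

Because the filters are untied, the parameter count grows with image area rather than staying constant, which is part of why the network needed a billion connections and 16,000 cores.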
ML and Computing Power
Results:
• A face detector that is 81.7% accurate
• Robust to translation, scaling, and rotation
Further results:
• 15.8% accuracy in recognizing 20,000 object categories from ImageNet
• A 70% relative improvement over the previous state of the art
Deep Belief Convolutional Networks
• Deep Belief convolutional network demo (JavaScript); runs well under Google Chrome: https://www.jetpac.com/deepbelief
Google and Deep Learning Architectures
• http://www.youtube.com/watch?v=JBtfRiGEAFI
• http://www.technologyreview.com/news/524026/is-google-cornering-the-market-on-deep-learning/
Cloud-Based ML - Google
• https://developers.google.com/prediction/
Additional References
• http://deeplearning.net
• http://en.wikipedia.org/wiki/Deep_learning
• Coursera course, Neural Networks for Machine Learning: https://class.coursera.org/neuralnets-2012-001/lecture
• ML: Hottest Tech Trend in next 3-5 Years: http://www.youtube.com/watch?v=b4zr9Zx5WiE
• Geoff Hinton's homepage: https://www.cs.toronto.edu/~hinton/
Challenges & Open Questions
• Stability-plasticity problem: how do we integrate new knowledge with old?
• No loss of new knowledge
• No loss of prior knowledge
• Efficient methods of storage and recall
• ML methods that can retain learned knowledge would be a step toward "common knowledge" representation – a "Big AI" problem
Challenges & Open Questions
• Practice makes perfect!
• A lifelong machine learning (LML) system must be capable of learning from examples of tasks over a lifetime
• Practice should increase model accuracy and overall domain knowledge
• How can this be done?
• Research important to AI, psychology, and education
Challenges & Open Questions
• Scalability: often a difficult but important challenge
• Must scale with increasing:
  • Number of inputs and outputs
  • Number of training examples
  • Number of tasks
  • Complexity of tasks and size of hypothesis representation
• Preferably, growth should be linear in each
Never-Ending Language Learner (NELL)
• Carlson et al. (2010)
• Each day, extracts information from the web to populate a growing knowledge base of language semantics
• Learns to perform this task better than on the previous day
• Uses an MTL (multiple task learning) approach in which a large number of different semantic functions are trained together (see the sketch below)
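The core MTL idea, many output "heads" trained against one shared representation so that every task's error shapes the common features, can be sketched as a single shared hidden layer with per-task logistic outputs. Names and sizes here are illustrative, not NELL's actual coupled-training machinery:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# One hidden layer shared by all tasks; one logistic output head per task
n_in, n_hidden, n_tasks = 50, 20, 5
W_shared = 0.1 * rng.standard_normal((n_in, n_hidden))
W_task = 0.1 * rng.standard_normal((n_tasks, n_hidden))

def train_step(x, y, W_shared, W_task, lr=0.1):
    """One gradient step on the summed cross-entropy of all task heads.
    Every task's error flows into W_shared, so the tasks jointly shape a
    common internal representation -- the core of MTL."""
    h = np.tanh(x @ W_shared)         # shared features
    p = sigmoid(h @ W_task.T)         # per-task predictions, shape (batch, n_tasks)
    err = p - y                       # gradient of cross-entropy w.r.t. logits
    grad_task = (err.T @ h) / len(x)
    grad_shared = (x.T @ ((err @ W_task) * (1 - h ** 2))) / len(x)
    return W_shared - lr * grad_shared, W_task - lr * grad_task

# Toy batch: 32 examples, a binary target for each of the 5 tasks
x = rng.random((32, n_in))
y = (rng.random((32, n_tasks)) < 0.5).astype(float)
W_shared, W_task = train_step(x, y, W_shared, W_task)
```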