
Deep Learning


Presentation Transcript


  1. Deep Learning • Disclaimer: this PPT is modified from Dr. Hung-yi Lee's course slides: http://speech.ee.ntu.edu.tw/~tlkagk/courses_ML17.html

  2. Deep learning attracts a lot of attention. • I believe you have already seen many exciting results. • Deep learning trends at Google. Source: SIGMOD 2016 / Jeff Dean

  3. Ups and downs of Deep Learning • 1958: Perceptron (linear model) • 1969: Perceptron shown to have limitations • 1980s: Multi-layer perceptron • Not significantly different from today's DNNs • 1986: Backpropagation • Usually more than 3 hidden layers did not help • 1989: 1 hidden layer is "good enough", so why go deep? • 2006: RBM (Restricted Boltzmann Machine) initialization • 2009: GPUs (graphical processing units) • 2011: Starts to become popular in speech recognition • 2012: Wins the ILSVRC image competition • 2015.2: Image recognition surpasses human-level performance • 2016.3: AlphaGo beats Lee Sedol • 2016.10: Speech recognition systems as good as humans

  4.-7. Figures; source: https://pdfs.semanticscholar.org/presentation/c139/cf0377dddf95d8a268652dd9c51631279400.pdf

  8. Introduction to deep learning • Dr. Xiaogang Wang: check out the PPT, pages 5-28: http://www.ee.cuhk.edu.hk/~xgwang/deep_learning_isba.pdf • https://www.enlitic.com/?fbclid=IwAR3H3RRnw_BK756MJt-l34C3w4QgVMZfcf8cDrPaxBxajo-oRxJh6nEnUoc • https://automatedinsights.com/customer-stories/associated-press/?fbclid=IwAR0H8Pq-o-bbTFcTff8hS6fz0hO1osIvOS-e3pZC66y9Zn7s-Ok0iWJ7Q5c

  9. Three Steps for Deep Learning • Step 1: define a set of functions (a Neural Network) • Step 2: goodness of function • Step 3: pick the best function • Deep Learning is so simple ……

  10. Neural Network • A Neural Network is built from "Neurons" • Different connections lead to different network structures • Network parameters: all the weights and biases in the "neurons"

  11. Fully Connected Feedforward Network • Example: input (1, -1); the first neuron has weights 1, -2 and bias 1, the second has weights -1, 1 and bias 0 • Passing the weighted sums 4 and -2 through the Sigmoid Function gives outputs 0.98 and 0.12
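To make the arithmetic concrete, here is a minimal NumPy sketch of this single layer; the input, weights, and biases are the numbers from the slide, and the output matches its 0.98 / 0.12:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.array([1.0, -1.0])        # input from the slide
W = np.array([[1.0, -2.0],       # weights into the first neuron
              [-1.0, 1.0]])      # weights into the second neuron
b = np.array([1.0, 0.0])         # biases

a = sigmoid(W @ x + b)           # weighted sums (4, -2) -> activations
print(a)                         # ~[0.98, 0.12]
```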

  12. Fully Connected Feedforward Network • The same computation repeated over several layers: with input (1, -1), the activations are (0.98, 0.12) after layer 1, (0.86, 0.11) after layer 2, and the final outputs are (0.62, 0.83)

  13. Fully Connected Feedforward Network • With a different input (0, 0), the same network outputs (0.51, 0.85) • This is a function: input vector in, output vector out • Given a network structure, we have defined a function set

  14. Fully Connected Feedforward Network • Structure: Input Layer → Hidden Layers (Layer 1, Layer 2, …, Layer L) → Output Layer (y1, y2, …, yM) • Each node in the hidden and output layers is a neuron

  15. Deep = Many hidden layers • AlexNet (2012): 8 layers, 16.4% error • VGG (2014): 19 layers, 7.3% error • GoogleNet (2014): 22 layers, 6.7% error • Source: http://cs231n.stanford.edu/slides/winter1516_lecture8.pdf

  16. Deep = Many hidden layers • Residual Net (2015): 152 layers (a special structure), 3.57% error, versus AlexNet (2012) 16.4%, VGG (2014) 7.3%, GoogleNet (2014) 6.7% • For comparison, Taipei 101 has 101 layers • Ref: https://www.youtube.com/watch?v=dxB6299gpvI

  17. Matrix Operation • A layer is a matrix operation: a = σ(Wx + b) • With W = [[1, -2], [-1, 1]], x = (1, -1), b = (1, 0): σ(Wx + b) = σ((4, -2)) = (0.98, 0.12)

  18. Review: Example • Q: Completely write out the matrix operations for the fully connected feedforward network of slide 12 (input (1, -1), final outputs (0.62, 0.83)).

  19. Neural Network • Layer by layer: a1 = σ(W1 x + b1) • a2 = σ(W2 a1 + b2) • …… • y = σ(WL aL-1 + bL) • The outputs are y1, y2, …, yM

  20. Neural Network • Putting the layers together: y = f(x) = σ(WL … σ(W2 σ(W1 x + b1) + b2) … + bL) • Use parallel computing techniques to speed up the matrix operations
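A minimal NumPy sketch of this nested matrix form; the layer sizes and the random weights below are illustrative assumptions, not the lecture's example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, weights, biases):
    # y = sigma(W_L ... sigma(W_2 sigma(W_1 x + b_1) + b_2) ... + b_L)
    a = x
    for W, b in zip(weights, biases):
        a = sigmoid(W @ a + b)   # one layer: matrix multiply, add bias, activate
    return a

rng = np.random.default_rng(0)
sizes = [2, 3, 3, 2]             # input dim, two hidden layers, output dim (assumed)
weights = [rng.normal(size=(m, n)) for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]

print(forward(np.array([1.0, -1.0]), weights, biases))
```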

  21. Matryoshka doll https://en.wikipedia.org/wiki/Matryoshka_doll

  22. Output Layer as Multi-Class Classifier • The hidden layers act as a feature extractor, replacing hand-crafted feature engineering • The last fully-connected layer is called the "output layer"; for classification it represents the class scores • Passing the scores through a Softmax turns the output layer into a multi-class classifier
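A small sketch of the Softmax step: it maps raw class scores to values in (0, 1) that sum to 1. The scores below are illustrative:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())      # subtract the max for numerical stability
    return e / e.sum()

scores = np.array([3.0, 1.0, -3.0])
print(softmax(scores))           # ~[0.88, 0.12, 0.00]
```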

  23. Example Application • Input: a 16 x 16 = 256-dimensional vector, one dimension per pixel (ink → 1, no ink → 0) • Output: y1, …, y10, where each dimension represents the confidence that the image is a particular digit • e.g. "is 1": 0.1, "is 2": 0.7, …, "is 0": 0.2, so the image is "2"

  24. Example Application • Handwritten Digit Recognition • What is needed is a function: input a 256-dim vector, output a 10-dim vector (y1 = "is 1", y2 = "is 2", …, y10 = "is 0") • The machine (a Neural Network) reads the image and outputs "2"

  25. Example Application • Input Layer → Hidden Layers (Layer 1, Layer 2, …, Layer L) → Output Layer (y1, …, y10) • The network defines a function set containing the candidate functions for Handwritten Digit Recognition • You need to decide the network structure so that a good function is in your function set.
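As one possible way to write down such a structure, here is a minimal Keras sketch of a 256-in, 10-out network; the two hidden layers of 500 sigmoid neurons are an assumption for illustration, not something the slides prescribe:

```python
import tensorflow as tf

# Candidate structure for handwritten digit recognition:
# 256-dim input -> two assumed hidden layers -> 10-dim softmax output.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(256,)),
    tf.keras.layers.Dense(500, activation="sigmoid"),
    tf.keras.layers.Dense(500, activation="sigmoid"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.summary()
```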

  26. FAQ • Q: How many layers? How many neurons for each layer? Answer: trial and error, plus intuition • Q: Can the structure be automatically determined? E.g. Evolutionary Artificial Neural Networks • Q: Can we design the network structure? E.g. Convolutional Neural Network (CNN)

  27. Three Steps for Deep Learning • Step 1: define a set of functions (a Neural Network) • Step 2: goodness of function • Step 3: pick the best function • Deep Learning is so simple ……

  28. Loss for an Example • Given a set of parameters, the network (with a Softmax output) produces y1, …, y10 • For a training image of "1", the target is the one-hot vector (1, 0, …, 0) • The loss is the Cross Entropy between the network output and the target

  29. Total Loss • For all training data x1, x2, x3, …, xN with network outputs y1, y2, y3, …, yN, the total loss L is the sum of the per-example losses • Find a function in the function set that minimizes the total loss L • Equivalently, find the network parameters that minimize the total loss L
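A small sketch of this loss in NumPy: cross entropy between a softmax output and a one-hot target for each example, summed over the training set. The outputs and targets below are made up for illustration:

```python
import numpy as np

def cross_entropy(y, t):
    # loss for one example: -sum_i t_i * log(y_i)
    return -np.sum(t * np.log(y))

# Pretend network outputs for three training examples and their one-hot targets
ys = [np.array([0.7, 0.2, 0.1]),
      np.array([0.1, 0.8, 0.1]),
      np.array([0.3, 0.3, 0.4])]
ts = [np.array([1.0, 0.0, 0.0]),
      np.array([0.0, 1.0, 0.0]),
      np.array([0.0, 0.0, 1.0])]

total_loss = sum(cross_entropy(y, t) for y, t in zip(ys, ts))
print(total_loss)   # find the parameters that make this as small as possible
```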

  30. Three Steps for Deep Learning • Step 1: define a set of functions (a Neural Network) • Step 2: goodness of function • Step 3: pick the best function • Deep Learning is so simple ……

  31. Gradient Descent • Randomly pick initial values for the parameters • For each parameter w, compute the gradient ∂L/∂w and update w ← w - η ∂L/∂w • e.g. 0.2 → 0.15, -0.1 → 0.05, 0.3 → 0.2, ……

  32. Gradient Descent • Repeat the compute-gradient-and-update step: e.g. 0.2 → 0.15 → 0.09, -0.1 → 0.05 → 0.15, 0.3 → 0.2 → 0.10, ……
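A minimal sketch of this update loop on a toy loss (the quadratic loss, its target, and the learning rate below are assumptions, chosen only so the gradient is easy to write down; the initial parameter values echo the slide's 0.2, -0.1, 0.3):

```python
import numpy as np

def grad(theta, target):
    # analytic gradient of the toy loss L(theta) = sum((theta - target)**2)
    return 2.0 * (theta - target)

theta = np.array([0.2, -0.1, 0.3])    # initial parameter values
target = np.array([0.0, 0.5, -0.2])   # hypothetical optimum of the toy loss
eta = 0.1                             # learning rate

for step in range(100):
    theta = theta - eta * grad(theta, target)   # w <- w - eta * dL/dw

print(theta)   # converges toward `target`
```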

  33. Gradient Descent • This is the "learning" of machines in deep learning …… • Even AlphaGo uses this approach • This is what people imagine …… Actually ….. I hope you are not too disappointed :p

  34. Backpropagation • Backpropagation: an efficient way to compute the gradients ∂L/∂w in a neural network • libdnn: developed by 周伯威, a student at National Taiwan University • Ref: http://speech.ee.ntu.edu.tw/~tlkagk/courses/MLDS_2015_2/Lecture/DNN%20backprop.ecm.mp4/index.html
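A minimal backpropagation sketch for a one-hidden-layer sigmoid network with softmax output and cross-entropy loss; the input, target, layer sizes, and random weights are illustrative assumptions, and the chain-rule steps mirror the standard derivation rather than the lecture's worked example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

x = np.array([1.0, -1.0])                   # input
t = np.array([0.0, 1.0, 0.0])               # one-hot target
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(3, 2)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 3)), np.zeros(3)

# Forward pass
z1 = W1 @ x + b1
a1 = sigmoid(z1)
z2 = W2 @ a1 + b2
y = softmax(z2)
loss = -np.sum(t * np.log(y))

# Backward pass (chain rule)
dz2 = y - t                                 # dL/dz2 for softmax + cross entropy
dW2 = np.outer(dz2, a1)                     # dL/dW2
db2 = dz2
da1 = W2.T @ dz2
dz1 = da1 * a1 * (1 - a1)                   # multiply by the sigmoid derivative
dW1 = np.outer(dz1, x)                      # dL/dW1
db1 = dz1

# One gradient-descent step would then be, e.g., W2 -= eta * dW2, and so on.
```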

  35. Three Steps for Deep Learning • Step 1: define a set of functions (a Neural Network) • Step 2: goodness of function • Step 3: pick the best function • Deep Learning is so simple ……
