Artificial Intelligence is transforming the world. Deep Learning, an integral part of this new Artificial Intelligence paradigm, is becoming one of the most sought-after skills. Learn more about Deep Learning and its evolution.
Lecture Series: AI is the New Electricity
Deep Learning - SCOPING, EVOLUTION & FUTURE TRENDS
Dr. Chiranjit Acharya, AILABS Academy, J-3, GP Block, Sector V, Salt Lake City, Kolkata, West Bengal 700091
Presented at AILABS Academy, Kolkata, on April 18th, 2018
Confidential, unpublished property of aiLabs. Do not duplicate or distribute. Use and distribution limited solely to authorized personnel. (c) Copyright 2018
A Journey into Deep Learning ▪Cutting-edge technology ▪Has garnered traction in both industry and academia ▪Achieves near-human-level performance in many pattern recognition tasks ▪Excels in ▪structured, relational data ▪unstructured rich-media data such as image, video, audio and text AILABS (c) Copyright 2018 2
A Journey into Deep Learning ▪What is Deep Learning? Where is the “deepness”? ▪Where does Deep Learning come from? ▪What are the models and algorithms of Deep Learning? ▪What is the trajectory of evolution of Deep Learning? ▪What are the future trends of Deep Learning? AILABS (c) Copyright 2018 3
A Journey into Deep Learning AILABS (c) Copyright 2018 4
Artificial Intelligence Holy Grail of AI Research ▪Understanding the neuro-biological and neuro-physical basis of human intelligence ▪science of intelligence ▪Building intelligent machines which can think and act like humans ▪engineering of intelligence AILABS (c) Copyright 2018 5
Artificial Intelligence Facets of AI Research ▪knowledge representation ▪reasoning ▪natural language understanding ▪natural scene understanding AILABS (c) Copyright 2018 6
Artificial Intelligence Facets of AI Research ▪natural speech understanding ▪problem solving ▪perception ▪learning ▪planning AILABS (c) Copyright 2018 7
Machine Learning Basic Doctrine of Learning ▪learning from examples Outcome of Learning ▪rules of inference for some predictive task ▪embodiment of the rules = model ▪model is an abstract computing device •kernel machine, decision tree, neural network AILABS (c) Copyright 2018 8
Machine Learning Connotations of Learning ▪process of generalization ▪discovering nature/traits of data ▪unraveling patterns and anti-patterns in data AILABS (c) Copyright 2018 9
Machine Learning Connotations of Learning ▪knowing distributional characteristics of data ▪identifying causal effects and propagation ▪identifying non-causal co-variations & correlations AILABS (c) Copyright 2018 10
Machine Learning Design Aspects of Learning System ▪ Choose the training experience ▪ Choose exactly what is to be learned, i.e. the target function / machine ▪ Choose objective function & optimality criteria ▪ Choose a learning algorithm to infer the target function from the experience. AILABS (c) Copyright 2018 11
Learning Work Flow ▪Stage 1: Feature Extraction, Feature subset selection, Feature Vector Representation ▪Stage 2: Training / Testing Set Creation and Augmentation ▪Stage 3: Training the Inference Machine ▪Stage 4: Running the Inference Machine on Test Set ▪Stage 5: Stratified Sampling and Validation AILABS (c) Copyright 2018 12
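A minimal sketch of stages 2 through 5 in Python, assuming the feature vectors of stage 1 are already available; the scikit-learn estimator, the random data, and the 70/30 split are illustrative placeholders, not part of the lecture's pipeline:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import precision_score, recall_score

X = np.random.rand(1000, 20)             # Stage 1: feature vectors (assumed already extracted)
y = np.random.randint(0, 2, size=1000)   # placeholder labels

# Stage 2: training / testing set creation (stratified to preserve class ratios)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0)

model = DecisionTreeClassifier().fit(X_train, y_train)   # Stage 3: train the inference machine
y_pred = model.predict(X_test)                           # Stage 4: run it on the test set

# Stage 5: validation with precision and recall
print("precision:", precision_score(y_test, y_pred))
print("recall:   ", recall_score(y_test, y_pred))
```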
Feature Extraction / Selection [Diagram: a domain expert and a knowledge engineer derive cognitive elements (low-level, mid-level and high-level parts, plus additional descriptors) from the corpus, and a sparse coder converts these into a sparse representation.] AILABS (c) Copyright 2018 13
Training Set Augmentation [Diagram: a random sampler draws samples from the sparse representation, a reviewer labels them, and they are merged with the existing training set to form the augmented training set.] AILABS (c) Copyright 2018 14
Training and Prediction / Recognition [Diagram: an adaptive learner fits the prediction / recognition model on the training set; the model is then applied to the unlabelled residual corpus to produce the predicted / recognized corpus.] AILABS (c) Copyright 2018 15
Sampling, Validation & Convergence [Diagram: a stratified sampler draws sub-samples from the predicted corpus, a human reviewer labels them, and a precision & recall calculator checks for convergence; if not converged, the loop goes back to training set augmentation, otherwise relevance scoring ends.] AILABS (c) Copyright 2018 16
Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts) ▪ "A logical calculus of the ideas immanent in nervous activity" ▪ simple artificial “neurons” could be made to perform basic logical operations such as AND, OR and NOT ▪ known as Linear Threshold Gate ▪ NO learning AILABS (c) Copyright 2018 17
Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts)
Inputs $x_1, \dots, x_n$ with weights $w_{1j}, \dots, w_{nj}$ and bias $b_j$ feed neuron $j$:
$s_j = \sum_{i=1}^{n} w_{ij}\, x_i + b_j, \qquad y_j = f(s_j)$
where $f$ is a hard threshold: $f(s_j) = 1$ if $s_j \ge 0$, else $0$.
AILABS (c) Copyright 2018 18
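A minimal sketch of such a linear threshold gate in Python; the hand-chosen weights and thresholds below are illustrative (the model itself involves no learning):

```python
import numpy as np

def ltg(x, w, b):
    """McCulloch-Pitts linear threshold gate: fires (1) iff w.x + b >= 0."""
    return 1 if np.dot(w, x) + b >= 0 else 0

# Hand-chosen weights reproduce basic logic gates.
AND = lambda x1, x2: ltg([x1, x2], w=[1, 1], b=-2)   # fires only when both inputs are 1
OR  = lambda x1, x2: ltg([x1, x2], w=[1, 1], b=-1)   # fires when at least one input is 1
NOT = lambda x1:     ltg([x1],     w=[-1],  b=0)     # fires when the input is 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, "AND:", AND(x1, x2), "OR:", OR(x1, x2), "NOT x1:", NOT(x1))
```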
Evolution of Connectionist Models 1957: Perceptron model (Rosenblatt)
▪ invention of learning rules inspired by ideas from neuroscience
▪ if $\sum_i \mathrm{input}_i \cdot \mathrm{weight}_i > \mathrm{threshold}$, output $= +1$; if $\sum_i \mathrm{input}_i \cdot \mathrm{weight}_i < \mathrm{threshold}$, output $= -1$
▪ learns to classify input into two output classes
▪ Sigmoid transfer function: boundedness, graduality ($y \to 1$ as $x \to \infty$, $y \to 0$ as $x \to -\infty$)
AILABS (c) Copyright 2018 19
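A minimal sketch of the perceptron rule on a small, linearly separable toy set; the data, learning rate, and number of passes are illustrative assumptions:

```python
import numpy as np

# Toy linearly separable data: class +1 if x1 + x2 > 1, else -1.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [2, 2], [2, 0]], dtype=float)
y = np.array([-1, -1, -1, 1, 1, 1])

w = np.zeros(2)
threshold = 0.0        # treated as a learnable bias below
lr = 0.1

for _ in range(20):                        # a few passes over the data
    for xi, target in zip(X, y):
        out = 1 if np.dot(w, xi) > threshold else -1
        if out != target:                  # classic perceptron update on mistakes
            w += lr * target * xi
            threshold -= lr * target

print("weights:", w, "threshold:", threshold)
```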
Evolution of Connectionist Models 1943: Artificial neuron model (McCulloch & Pitts)
The same neuron, now with a sigmoid transfer function:
$s_j = \sum_{i=1}^{n} w_{ij}\, x_i + b_j, \qquad y_j = f(s_j) = \frac{1}{1 + e^{-s_j}}$
AILABS (c) Copyright 2018 20
Evolution of Connectionist Models 1960s: Delta Learning Rule (Widrow & Hoff)
▪ Define the error as the squared residuals summed over all training cases: $E = \frac{1}{2}\sum_n \left(y_n - \hat{y}_n\right)^2$
▪ Now differentiate to get error derivatives for the weights: $\frac{\partial E}{\partial w_i} = \sum_n \frac{\partial \hat{y}_n}{\partial w_i}\,\frac{\partial E}{\partial \hat{y}_n} = -\sum_n x_{i,n}\left(y_n - \hat{y}_n\right)$
▪ The batch delta rule changes the weights in proportion to their error derivatives summed over all training cases: $\Delta w_i = -\varepsilon\,\frac{\partial E}{\partial w_i}$
AILABS (c) Copyright 2018 21
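The batch delta rule, sketched for a simple linear unit ŷ = w·x; the toy data and the learning rate ε are illustrative choices:

```python
import numpy as np

# Toy regression data for a linear unit y_hat = X @ w (generated with w_true = [1, 2]).
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.0, 1.0]])
y = np.array([5.0, 4.0, 9.0, 2.0])

w = np.zeros(2)
eps = 0.05                                 # learning rate epsilon

for _ in range(200):
    y_hat = X @ w
    grad = -(X.T @ (y - y_hat))            # dE/dw summed over all training cases
    w = w - eps * grad                     # batch delta rule: delta w = -eps * dE/dw

print("learned weights:", w)               # approaches [1, 2]
```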
Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons ▪ Marvin Minsky & Seymour Papert: Perceptrons ▪ Unless input categories are linearly separable, a perceptron cannot learn to discriminate between them. ▪ Unfortunately, it appeared that many important categories were not linearly separable. AILABS (c) Copyright 2018 22
Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons Perceptrons are good at linear classification, but ... [Figure: two classes of points in the (x1, x2) plane that a single straight decision boundary can separate.] AILABS (c) Copyright 2018 23
Evolution of Connectionist Models 1969: Minsky's objection to Perceptrons
Perceptrons are incapable of simple nonlinear classification like XOR:
X1 X2 | Output
 0  0 |   0
 0  1 |   1
 1  0 |   1
 1  1 |   0
[Figure: the four XOR points in the (x1, x2) plane; no single straight line separates the 0-outputs from the 1-outputs.]
AILABS (c) Copyright 2018 24
Universal Approximation Theorem Existential Version (Kolmogorov) ▪ There exists a finite combination of superposition and addition of continuous functions of a single variable which can approximate any continuous, multivariate function on compact subsets of R^d. Constructive Version (Cybenko) ▪ The standard multilayer feed-forward network with a single hidden layer, containing a finite number of hidden neurons, is a universal approximator among continuous functions on compact subsets of R^d, under mild assumptions on the activation function. AILABS (c) Copyright 2018 25
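In symbols, the constructive version states (a sketch of the standard formulation, with σ a sigmoidal activation, N the number of hidden neurons, K the compact subset, and f the continuous target function):

```latex
G(x) \;=\; \sum_{i=1}^{N} \alpha_i \,\sigma\!\left(w_i^{\top} x + b_i\right),
\qquad
\sup_{x \in K} \bigl| \, G(x) - f(x) \, \bigr| \;<\; \varepsilon
```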
Evolution of Connectionist Models 1986: Backpropagation for Multi-Layer Perceptrons (Rumelhart, Hinton & Williams) ▪ solution to Minsky's objection regarding perceptron's limitation ▪ nonlinear classification is achieved by fully connected, multilayer, feedforward networks of perceptrons (MLP) ▪ MLP can be trained by backpropagation ▪ Two-pass algorithm ▪ forward propagation of activation signals from input to output ▪ backward propagation of error derivatives from output to input AILABS (c) Copyright 2018 26
Evolution of Connectionist Models 1986: Backpropagation for Multi-Layer Perceptrons (Rumelhart, Hinton & Williams) [Figure: a fully connected feedforward network with input layer x1 ... xN, a hidden layer, and output layer y1 ... yM.] AILABS (c) Copyright 2018 27
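A minimal two-pass backpropagation sketch: a small sigmoid MLP trained on the XOR task from the earlier slide; the hidden-layer width, learning rate, and iteration count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# XOR training set (not linearly separable, so a hidden layer is required).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # hidden -> output
lr = 1.0

for _ in range(10000):
    # Forward pass: propagate activations from input to output.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # Backward pass: propagate error derivatives from output to input.
    d_out = (out - y) * out * (1 - out)     # dE/d(pre-activation) at the output layer
    d_h = (d_out @ W2.T) * h * (1 - h)      # dE/d(pre-activation) at the hidden layer

    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())                 # should approach [0, 1, 1, 0]
```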
Machine Learning Example Handwriting Digit Recognition [Figure: an image of a handwritten digit is fed into the machine, which outputs "2".] AILABS (c) Copyright 2018 29
Handwriting Digit Recognition [Figure: a 16 x 16 image (16 x 16 = 256; colored pixel → 1, no color → 0) is flattened into inputs x1 ... x256; the network produces outputs y1 ... y10, e.g. y1 = 0.1 ("is 1"), y2 = 0.7 ("is 2"), y10 = 0.2 ("is 0"), so the image is read as "2".] Each output represents the confidence of a digit. AILABS (c) Copyright 2018 30
Example Application Handwriting Digit Recognition [Figure: the 256 pixel inputs x1 ... x256 pass through the machine to outputs y1 ... y10, and the highest-confidence digit, here "2", is reported.] AILABS (c) Copyright 2018 31
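A sketch of the shapes involved: a 16 x 16 image flattened into 256 inputs and mapped to 10 confidence scores; the weights below are random placeholders, not a trained recognizer:

```python
import numpy as np

rng = np.random.default_rng(0)

image = (rng.random((16, 16)) > 0.5).astype(float)   # colored pixel -> 1, no color -> 0
x = image.reshape(256)                               # 16 x 16 = 256 input features

# Untrained placeholder weights, just to show the 256 -> 10 mapping.
W, b = rng.normal(size=(256, 10)) * 0.01, np.zeros(10)
scores = x @ W + b
confidence = np.exp(scores) / np.exp(scores).sum()   # one confidence per digit

print("predicted digit:", int(np.argmax(confidence)))
```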
Evolution of Connectionist Models 1989: Convolutional Neural Network (LeCun) [Figure: a deep feedforward network with input layer x1 ... xN, hidden Layers 1 through L, and output layer y1 ... yM.] Deep means many hidden layers. AILABS (c) Copyright 2018 32
Convolutional Neural Network ▪ Input can have very high dimension; using a fully-connected neural network would need a large number of parameters. ▪ CNNs are a special type of neural network whose hidden units are only connected to a local receptive field, so the number of parameters needed by CNNs is much smaller. ▪ Example: 200x200 image a) fully connected: 40,000 hidden units => 1.6 billion parameters b) CNN: 5x5 kernel (filter), 100 feature maps => 2,500 parameters AILABS (c) Copyright 2018 33
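The parameter counts from the example, spelled out (bias terms ignored, as on the slide):

```python
# Fully connected layer: every one of the 200 x 200 = 40,000 pixels
# connects to each of the 40,000 hidden units.
fc_params = 200 * 200 * 40_000
print(fc_params)          # 1,600,000,000 -> about 1.6 billion parameters

# Convolutional layer: each of the 100 feature maps shares one 5 x 5 kernel,
# regardless of the image size.
cnn_params = 5 * 5 * 100
print(cnn_params)         # 2,500 parameters
```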
Convolution Operation [Figure: the kernel is applied to one patch of the input at a time.] AILABS (c) Copyright 2018 34
Convolution Operation in CNN ▪ Input: an image (2-D array): x ▪ Convolution kernel (2-D array of learnable parameters): w ▪ Feature map (2-D array of processed data): s ▪ Convolution operation in 2-D domains: $s(i, j) = (x * w)(i, j) = \sum_m \sum_n x(m, n)\, w(i - m, j - n)$ AILABS (c) Copyright 2018 35
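A minimal loop-based sketch of this operation; the toy image and kernel are illustrative, and note that most deep learning libraries implement the unflipped variant (cross-correlation), which only relabels the learned weights:

```python
import numpy as np

def conv2d(x, w):
    """Valid 2-D convolution of image x with kernel w, producing feature map s."""
    kh, kw = w.shape
    H, W = x.shape
    w_flipped = w[::-1, ::-1]                      # flip the kernel -> true convolution
    s = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(s.shape[0]):
        for j in range(s.shape[1]):
            s[i, j] = np.sum(x[i:i + kh, j:j + kw] * w_flipped)
    return s

x = np.arange(25, dtype=float).reshape(5, 5)       # toy 5x5 "image"
w = np.array([[1.0, 0.0], [0.0, -1.0]])            # toy 2x2 kernel
print(conv2d(x, w))                                # 4x4 feature map
```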
Convolution Filters AILABS (c) Copyright 2018 36
Convolution Operation with Filters AILABS (c) Copyright 2018 37
Convolution Layers [Figure: a convolution layer maps input channels to output feature maps.] AILABS (c) Copyright 2018 38
3 Stages of a Convolutional Layer [Figure: convolution stage, nonlinear (detector) stage, and pooling stage.] AILABS (c) Copyright 2018 39
Non Linear Stage [Figure: graphs of the Tanh(x) and ReLU activation functions.] AILABS (c) Copyright 2018 40
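The two nonlinearities named on the slide, written as simple numpy functions with a few sample values:

```python
import numpy as np

def tanh(x):
    """Saturating nonlinearity: squashes inputs into (-1, 1)."""
    return np.tanh(x)

def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes out the rest."""
    return np.maximum(0.0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print(tanh(z))   # approx [-0.964 -0.462  0.     0.462  0.964]
print(relu(z))   # [0.  0.  0.  0.5 2. ]
```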
Evolution of Connectionist Models 2006: Deep Belief Networks (Hinton), Stacked Auto-Encoders (Bengio) [Figure: a deep feedforward network with input layer x1 ... xN, hidden Layers 1 through L, and output layer y1 ... yM.] Deep means many hidden layers. AILABS (c) Copyright 2018 41
Deep Learning Traditional pattern recognition models use hand-crafted features and a relatively simple trainable classifier. [Figure: input → hand-crafted feature extractor → "simple" trainable classifier → output.] This approach has the following limitations: ▪ It is very tedious and costly to develop hand-crafted features ▪ The hand-crafted features are usually highly dependent on one application and cannot be transferred easily to other applications AILABS (c) Copyright 2018 42
Deep Learning Deep learning = representation learning: it seeks to learn hierarchical representations (i.e. features) automatically through a multi-stage feature learning process. [Figure: low-level features → mid-level features → high-level features → trainable classifier → output; feature visualization of a convolutional net trained on ImageNet (Zeiler and Fergus, 2013).] AILABS (c) Copyright 2018 43
Learning Hierarchical Representations [Figure: low-level features → mid-level features → high-level features → trainable classifier → output, with increasing level of abstraction.] Hierarchy of representations with increasing level of abstraction; each stage is a kind of trainable nonlinear feature transformation. ▪ Image recognition: pixel → edge → motif → part → object ▪ Text: character → word → word group → clause → sentence → story AILABS (c) Copyright 2018 44
Pooling Common pooling operations: ▪ Max pooling: report the maximum output within a rectangular neighborhood. ▪ Average pooling: report the average output of a rectangular neighborhood (possibly weighted by the distance from the central pixel). AILABS (c) Copyright 2018 45
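A minimal sketch of non-overlapping max and average pooling over 2 x 2 neighborhoods; the pool size and the toy feature map are illustrative:

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over size x size neighborhoods of a 2-D feature map."""
    H, W = x.shape
    x = x[:H - H % size, :W - W % size]                 # drop any ragged border
    blocks = x.reshape(H // size, size, W // size, size)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

fmap = np.arange(16, dtype=float).reshape(4, 4)
print(pool2d(fmap, mode="max"))    # [[ 5.  7.] [13. 15.]]
print(pool2d(fmap, mode="avg"))    # [[ 2.5  4.5] [10.5 12.5]]
```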
CIFAR-10 [Figure: sample images from the CIFAR-10 dataset.] AILABS (c) Copyright 2018 46
Deep CNN on CIFAR-10 [Figure: a deep convolutional network applied to CIFAR-10.] AILABS (c) Copyright 2018 47
Future Trends ▪ A different and wider range of problems is being addressed ▪ natural language understanding ▪ natural scene understanding ▪ natural speech understanding ▪ Feature learning is being investigated at a deeper level ▪ Manifold learning ▪ Reinforcement learning ▪ Integration with other paradigms of machine learning AILABS (c) Copyright 2018 50