Patterson: Chap 2, Foundations of Neural Networks and Deep Learning
Dr. Charles Tappert
The information here, although greatly condensed, comes almost entirely from the chapter content.
Neural Networks • Neural networks are models that share some properties of animal brains: simple units working in parallel with no centralized control unit • A network's architecture is defined by the number of layers, the number of neurons per layer, and the types of connections between layers • The most well-known and easiest to understand is the feed-forward multilayer NN • Given enough neuron units, a feed-forward multilayer NN can represent any function
Neural Networks • The feed-forward multilayer NN
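To make the figure concrete, here is a minimal NumPy sketch of a forward pass through a feed-forward multilayer network; the layer sizes, random weights, and sigmoid activations are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical architecture: 3 inputs -> 4 hidden units -> 2 outputs
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(2, 4)), np.zeros(2)

def forward(x):
    h = sigmoid(W1 @ x + b1)      # hidden-layer activations
    return sigmoid(W2 @ h + b2)   # output-layer activations

print(forward(np.array([0.5, -1.0, 2.0])))
```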
Neural Networks: The Perceptron • Definition: The simple single-layer perceptron is a linear model used for binary classification
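As an illustration of this definition, here is a sketch of the classic perceptron learning rule on a toy linearly separable dataset; the data, epoch count, and learning rate are hypothetical, not from the chapter.

```python
import numpy as np

def perceptron_train(X, y, epochs=10, lr=1.0):
    """Classic perceptron rule; labels y are in {-1, +1}."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (w @ xi + b) <= 0:   # misclassified: update
                w += lr * yi * xi
                b += lr * yi
    return w, b

# Toy linearly separable data
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -1.5], [-2.0, -0.5]])
y = np.array([1, 1, -1, -1])
w, b = perceptron_train(X, y)
print(np.sign(X @ w + b))   # should match y
```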
Neural Networks: The Perceptron • Artificial neuron for a multilayer perceptron
Neural Networks: The Perceptron • Fully connected multilayer feed-forward perceptron
Neural Networks: Training Neural Networks • A well-trained ANN has weights that amplify the signal and dampen the noise • Bigger weights mean tighter correlations between the signal and the network's outcome • The process of learning is the process of adjusting the weights, making some smaller and some larger, thereby allocating significance to some information and minimizing the rest
Neural Networks: Training Neural Networks • Backpropagation learning • Similar to the perceptron learning algorithm • Compute the output of a training sample with a forward pass through the system • If the output does not match the training label, adjust the weights, working backward from the output layer to the input layer • Mathematical details in Appendix D
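The sketch below shows one backpropagation step for a small two-layer sigmoid network with a squared-error loss; the sizes, input, label, and learning rate are illustrative assumptions (the chapter's full derivation is in Appendix D).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One backprop step for a 2-layer network (hypothetical sizes)
rng = np.random.default_rng(1)
W1, W2 = rng.normal(size=(4, 3)), rng.normal(size=(1, 4))
x = np.array([0.5, -1.0, 2.0])
t = np.array([1.0])            # training label
lr = 0.1

# Forward pass
h = sigmoid(W1 @ x)            # hidden activations
o = sigmoid(W2 @ h)            # network output

# Backward pass: error signals from output layer back to input layer
delta_o = (o - t) * o * (1 - o)             # dE/dz at the output
delta_h = (W2.T @ delta_o) * h * (1 - h)    # dE/dz at the hidden layer

# Weight updates
W2 -= lr * np.outer(delta_o, h)
W1 -= lr * np.outer(delta_h, x)
```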
Neural Networks: Activation Functions • Activation functions are used to propagate the output of one layer's nodes to the next layer • The activation functions for the hidden units introduce nonlinearities, which are necessary for solving most problems
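For reference, here are the standard definitions of three common activation functions and their derivatives (the derivatives are what backpropagation uses); this is textbook material rather than code from the chapter.

```python
import numpy as np

# Common activation functions (standard definitions)
def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tanh(z):
    return np.tanh(z)

def relu(z):
    return np.maximum(0.0, z)

# Their derivatives, used during backpropagation
def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1 - s)

def tanh_prime(z):
    return 1 - np.tanh(z) ** 2

def relu_prime(z):
    return (z > 0).astype(float)
```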
Neural Networks: Loss Functions • Loss functions quantify how close a given neural network is to the trained ideal network • Looking for the ideal state is equivalent to finding parameters that minimize the loss • Thus, loss functions reframe the training problem as an optimization problem
Neural Networks: Loss Functions • Mean squared error loss function • Negative log likelihood loss function (M classes)
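The slide's formulas do not survive in this text; assuming the usual conventions, where $y$ is the target and $\hat{y}$ the network output over $N$ samples, the standard forms are:

$$L_{\text{MSE}} = \frac{1}{N}\sum_{i=1}^{N}\left(y_i - \hat{y}_i\right)^2$$

$$L_{\text{NLL}} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij}\,\log \hat{y}_{ij}$$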
Neural Networks: Hyperparameters • Hyperparameters are tuning parameters that control how the optimization process runs • Hyperparameter selection focuses on ensuring that the model neither underfits nor overfits the training dataset
Neural Networks: Hyperparameters • Learning rate • The amount by which you adjust parameters: it scales the size of the steps (updates), i.e., how much of the gradient to use • A large error and a steep gradient combine with the learning rate to produce a large step • A large learning rate is often used in the initial portion of training, and a smaller learning rate later as the system approaches the global minimum
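A minimal sketch of how the learning rate scales each step, using a one-parameter quadratic loss; the function, starting point, and rate are illustrative assumptions.

```python
# Basic gradient descent: step size = learning_rate * gradient
learning_rate = 0.1
theta = 5.0                        # parameter to optimize
for step in range(100):
    grad = 2 * theta               # gradient of the loss f(theta) = theta**2
    theta -= learning_rate * grad  # larger rate -> larger step
print(theta)                       # approaches the minimum at 0
```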
Neural Networks: Hyperparameters • Regularization • Counters the effects of out-of-control parameters by minimizing parameter size over time • Basically controls overfitting • Usually weighted by the coefficient lambda (λ) • Big training data is the ultimate regularizer
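One common form is L2 regularization (weight decay), sketched below with hypothetical values; adding λ·‖w‖² to the loss contributes 2·λ·w to the gradient, shrinking weights toward zero over time.

```python
import numpy as np

lam = 0.01                             # regularization coefficient (lambda)
w = np.array([3.0, -2.0])
data_grad = np.array([0.5, -0.1])      # stand-in for the data-loss gradient
grad = data_grad + 2 * lam * w         # regularized gradient
w -= 0.1 * grad                        # update: weights decay toward zero
```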
Neural Networks: Hyperparameters • Momentum • Helps the learning algorithm get out of spots in the search space where it could become stuck • Helps the updater find gullies leading toward minima • Helps produce better-quality models
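A sketch of the standard momentum update, with illustrative hyperparameter values: a velocity term accumulates past gradients, so the updater can coast through flat spots and follow gullies toward a minimum.

```python
# Momentum update on a one-parameter quadratic loss (illustrative values)
learning_rate, momentum = 0.1, 0.9
theta, velocity = 5.0, 0.0
for step in range(100):
    grad = 2 * theta                                     # gradient of f(theta) = theta**2
    velocity = momentum * velocity - learning_rate * grad
    theta += velocity                                    # step along accumulated velocity
```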
Neural Networks: Hyperparameters • Sparsity • The sparsity hyperparameter recognizes that for some inputs only a few features are relevant • For example, for a network that classifies a million images, any single image activates only a limited number of features
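The slide describes sparse inputs; one standard, related way to encourage sparsity in a model (not necessarily the chapter's method) is an L1 penalty, whose subgradient λ·sign(w) pushes small weights toward exactly zero. All values below are illustrative.

```python
import numpy as np

lam, lr = 0.1, 0.1
w = np.array([0.05, -0.02, 3.0])
data_grad = np.zeros_like(w)               # pretend the data loss is flat here
w -= lr * (data_grad + lam * np.sign(w))   # small weights step toward zero
```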