Optimal Architectures for Efficient Neural Networks

Neural Networks, Chapter 6 • Joost N. Kok • Universiteit Leiden

Feedforward networks

Feedforward Networks

NETtalk

Feedforward Networks • A network that pronounces English text • 7 x 29 input units • 1 hidden layer with 80 hidden units • 26 output units encoding phonemes • Trained on 1024 words with context • Produces intelligible speech after 10 training epochs

Feedforward Networks • Functionally equivalent to DECtalk • Rule-based DECtalk is the result of a decade of effort by many linguists • NETtalk learns from examples and requires no linguistic knowledge

Back-Propagation

Back-Propagation • Initialize the weights to small random values • Choose a pattern and apply it to the input layer • Propagate the signal forwards through the network • Compute the deltas for the output layer

Back-Propagation • Compute the deltas for the preceding layers by propagating the errors backwards • Update all the connections • Go back to the second step for the next pattern
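The steps listed on the two slides above can be sketched as follows. This is a minimal illustration, not the course's own code: it assumes sigmoid units, a squared-error cost, a single hidden layer, and a bias input appended to each layer. The names train_backprop and predict and the parameters eta, n_hidden, and epochs are hypothetical.

```python
import numpy as np

def sigmoid(h):
    return 1.0 / (1.0 + np.exp(-h))

def train_backprop(X, T, n_hidden=3, eta=0.5, epochs=2000, seed=0):
    """Incremental back-propagation for one hidden layer (assumed setup)."""
    rng = np.random.default_rng(seed)
    n_in, n_out = X.shape[1], T.shape[1]
    # Initialize the weights to small random values (last row = bias weights).
    W1 = rng.uniform(-0.5, 0.5, (n_in + 1, n_hidden))
    W2 = rng.uniform(-0.5, 0.5, (n_hidden + 1, n_out))
    for _ in range(epochs):
        for mu in rng.permutation(len(X)):
            # Choose a pattern and apply it to the input layer.
            x = np.append(X[mu], 1.0)
            # Propagate the signal forwards through the network.
            v = np.append(sigmoid(x @ W1), 1.0)
            o = sigmoid(v @ W2)
            # Compute the deltas for the output layer.
            delta_out = (T[mu] - o) * o * (1.0 - o)
            # Propagate the errors backwards to get the hidden-layer deltas.
            delta_hid = (W2[:-1] @ delta_out) * v[:-1] * (1.0 - v[:-1])
            # Update all the connections, then continue with the next pattern.
            W2 += eta * np.outer(v, delta_out)
            W1 += eta * np.outer(x, delta_hid)
    return W1, W2

def predict(X, W1, W2):
    Xb = np.hstack([X, np.ones((len(X), 1))])
    Vb = np.hstack([sigmoid(Xb @ W1), np.ones((len(X), 1))])
    return sigmoid(Vb @ W2)
```

Calling train_backprop on a small Boolean data set such as OR and thresholding predict at 0.5 gives a quick sanity check; harder targets may need more hidden units or epochs.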

Navigation of a Car • Carnegie-Mellon • 30 x 32 pixel image • 8 x 32 range finder • 29 hidden units, 45 output units • 1200 simulated road images, 40 training cycles • 5 km/h

Backgammon • Score from –100 to +100 • 3000 examples • 459 inputs • Two hidden layers of 24 nodes • Neurogammon vs. Gammontool: wins 59 percent of games • Without precomputed features: 41 percent • Without noise: 45 percent

Parity Problem • Output is on if an odd number of inputs is on • [Figure: network diagram with labelled weights and thresholds]
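To make the parity definition concrete, here is a sketch of one textbook way to compute parity with threshold units. It is not necessarily the network drawn on the slide: it assumes N hidden units, where hidden unit j switches on when at least j inputs are on, and output weights that alternate between +1 and -1. The name parity_net is hypothetical.

```python
import numpy as np

def step(h):
    """Threshold unit: 1 if the net input is positive, else 0."""
    return (np.asarray(h) > 0).astype(int)

def parity_net(x):
    """Return 1 if an odd number of the binary inputs in x is on, else 0."""
    x = np.asarray(x)
    n = x.size
    # Hidden unit j (j = 1..n) is on when at least j inputs are on.
    thresholds = np.arange(1, n + 1) - 0.5
    hidden = step(x.sum() - thresholds)
    # Output weights alternate +1, -1, +1, ... so the sum over the
    # active hidden units is 1 for an odd count and 0 for an even count.
    out_weights = np.array([1 if j % 2 == 0 else -1 for j in range(n)])
    return int(hidden @ out_weights - 0.5 > 0)
```

For two inputs this reduces to the XOR function: parity_net([1, 0]) is on and parity_net([1, 1]) is off.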

Back-Propagation

Back-Propagation • The update rule is local • Incremental weight updating vs. batch mode • Momentum: accelerates the long-term trend by a factor of 1/(1 - α), where α is the momentum parameter
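The momentum bullet can be illustrated with a short sketch. It assumes the usual form of the rule, in which each weight change is the plain gradient step plus a fraction alpha of the previous change; the names momentum_step, eta, and alpha are hypothetical. Under a constant gradient the step size settles at 1/(1 - alpha) times the plain step, which is the long-term acceleration.

```python
def momentum_step(w, grad, prev_dw, eta=0.1, alpha=0.9):
    """One gradient-descent update with momentum (assumed standard form)."""
    # New change = plain gradient step + alpha times the previous change.
    dw = -eta * grad + alpha * prev_dw
    return w + dw, dw

# With a constant gradient the change converges to -eta * grad / (1 - alpha).
w, dw = 0.0, 0.0
for _ in range(500):
    w, dw = momentum_step(w, grad=1.0, prev_dw=dw)
```

In incremental mode this function would be called after every pattern; in batch mode it would be called once per epoch with the gradient summed over all patterns.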

Back-Propagation • Adaptive parameters
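One common way to adapt the learning rate is sketched below. The slide does not give its own rule, so this is an assumption: the rate grows by a constant a when the cost has fallen for k consecutive steps, and shrinks by a fraction b when the cost has just risen. The names adapt_learning_rate, a, b, and k are hypothetical.

```python
def adapt_learning_rate(eta, delta_E_history, a=0.01, b=0.1, k=3):
    """Return an adjusted learning rate from recent cost changes (assumed scheme)."""
    recent = delta_E_history[-k:]
    # Cost decreased consistently: increase the rate additively.
    if len(recent) == k and all(d < 0 for d in recent):
        return eta + a
    # Cost just increased: decrease the rate geometrically.
    if delta_E_history and delta_E_history[-1] > 0:
        return eta * (1.0 - b)
    # Otherwise leave the rate unchanged.
    return eta
```

The additive increase and multiplicative decrease keep the rate from growing too quickly while still letting it recover after an overshoot.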

Feedforward Networks • Process Modeling and Control • Machine Diagnostics • Portfolio Management • Target Recognition • Medical Diagnosis • Credit Rating

Feedforward Networks • Targeted Marketing • Voice Recognition • Financial Forecasting • Quality Control • Intelligent Searching • Fraud Detection

Optimal Network Architectures • Optimization • Use as few units as possible: • Reduces computational cost and training time • Improves generalization • Search through the space of possible architectures, for example using Back-Propagation and Evolutionary Algorithms

Optimal Network Architectures • Construct or modify architecture • Start with too many nodes and take some away • Start with too few and add some more

Optimal Network Architectures • Pruning and weight decay

Optimal Network Architectures • Small weights decay more rapidly than large ones:
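A decay rule with this property can be sketched as follows. It assumes a penalty term proportional to w^2 / (1 + w^2) added to the cost, which gives each weight its own decay factor eps = gamma * eta / (1 + w^2)^2; the names decay_weights, eta, and gamma are hypothetical.

```python
import numpy as np

def decay_weights(W, eta=0.1, gamma=0.1):
    """Shrink each weight by its own factor (assumed penalty w^2 / (1 + w^2))."""
    # The factor is close to gamma * eta for small weights and
    # falls off quickly for large ones.
    eps = gamma * eta / (1.0 + W ** 2) ** 2
    return (1.0 - eps) * W
```

Applied after every back-propagation update, this pushes weights that are not needed towards zero, after which they can be pruned.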

Optimal Network Architectures • We want to remove units: use the same decay factor for all connections feeding unit i:
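The unit-level variant can be sketched in the same way. This again assumes a particular form, in which the factor for unit i depends on the summed squared weights feeding that unit; the name decay_units and the layout W[i, j] (weight from unit j into unit i) are hypothetical.

```python
import numpy as np

def decay_units(W, eta=0.1, gamma=0.1):
    """Shrink all weights feeding unit i by one shared factor (assumed form)."""
    # One factor per row: units with little total input weight decay fastest.
    total = (W ** 2).sum(axis=1, keepdims=True)
    eps = gamma * eta / (1.0 + total) ** 2
    return (1.0 - eps) * W
```

Because every connection into a unit shrinks together, a unit that contributes little fades out as a whole and can be removed.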

Optimal Network Architectures • Start with a small network and gradually grow it to the appropriate size • Boolean function from N binary inputs to a single binary output

Optimal Network Architectures

Optimal Network Architectures • Choose hidden units such that • Same output for all remaining patterns with one target • Opposite output for at least one of the remaining patterns with opposite target and remove these patterns • Linearly separable problem

Optimal Network Architectures • [Figure: patterns labelled + and -]

Optimal Network Architectures • Do the best we can with a single node • Correct its mistakes with two further nodes • One for wrongly on patterns • One for wrongly off patterns • Each additional unit reduces the number of incorrectly classified patterns by at least one

Optimal Network Architectures

Optimal Network Architectures • Faithful representation: two patterns with different targets should have different representations • Master unit: does as well as possible on the task • Ancillary units: added to obtain faithful representation
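The faithfulness condition above can be checked directly. The sketch below assumes each pattern's hidden-layer representation is given as a sequence of unit outputs; the name is_faithful is hypothetical.

```python
def is_faithful(representations, targets):
    """True if no two patterns with different targets share a representation."""
    seen = {}
    for rep, target in zip(representations, targets):
        key = tuple(rep)
        # Remember the first target seen for this representation and
        # fail as soon as a different target maps to the same one.
        if seen.setdefault(key, target) != target:
            return False
    return True

# XOR targets for the inputs (0,0), (0,1), (1,0), (1,1).
targets = [0, 1, 1, 0]
# A single master unit computing OR is not faithful ...
master_only = [(0,), (1,), (1,), (1,)]
# ... but adding one ancillary unit computing AND makes it so.
with_ancillary = [(0, 0), (1, 0), (1, 0), (1, 1)]
```

Here is_faithful(master_only, targets) is False because (0,1) and (1,1) share a representation but have different targets, while is_faithful(with_ancillary, targets) is True.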
