Neural Networks, Chapter 6. Joost N. Kok, Universiteit Leiden
Feedforward Networks • NETtalk: a network to pronounce English text • 7 x 29 input units • 1 hidden layer with 80 hidden units • 26 output units encoding phonemes • Trained on 1024 words with context • Produces intelligible speech after 10 training epochs
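The layer sizes on this slide fix the number of connections in the network. The short sketch below works out the arithmetic. It assumes full connectivity between adjacent layers and one bias per hidden and output unit; the slide does not state either, so the totals are illustrative only.

```python
# Layer sizes taken from the slide.
WINDOW = 7          # input positions in the context window
SYMBOLS = 29        # input units per position
N_HIDDEN = 80
N_OUTPUT = 26

n_input = WINDOW * SYMBOLS

# Assumption: every unit connects to every unit in the next layer.
n_weights = n_input * N_HIDDEN + N_HIDDEN * N_OUTPUT

# Assumption: one bias (threshold) per hidden and output unit.
n_biases = N_HIDDEN + N_OUTPUT

print("input units:", n_input)
print("weights:", n_weights)
print("weights plus biases:", n_weights + n_biases)
```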
Feedforward Networks • NETtalk is functionally equivalent to DECtalk • The rule-based DECtalk is the result of a decade of effort by many linguists • NETtalk learns from examples and requires no linguistic knowledge
Back-Propagation • Initialize the weights to small random values • Choose a pattern and apply it to the input layer • Propagate the signal forwards through the network • Compute the deltas for the output layer
Back-Propagation • Compute the deltas for the preceding layers by propagating the errors backwards • Update all the connections • Go back to the second step for the next pattern
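The steps listed on the two Back-Propagation slides can be sketched in plain Python as follows. This is a minimal illustration, not the lecture's own implementation: the single hidden layer, sigmoid units, squared-error deltas, learning rate `eta`, and the function name `train_backprop` are all assumptions chosen for brevity.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_backprop(patterns, n_hidden=2, eta=0.5, epochs=2000, seed=0):
    """Train a one-hidden-layer network with a single output unit.

    patterns is a list of (inputs, target) pairs. The last entry of each
    weight row is a bias weight attached to a constant input of 1.0.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])

    # Step 1: initialize the weights to small random values.
    w_hid = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        xb = list(x) + [1.0]
        hid = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
        hb = hid + [1.0]
        out = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
        return xb, hid, hb, out

    for _ in range(epochs):
        # Step 2: choose a pattern (here simply in order) and apply it.
        for x, target in patterns:
            # Step 3: propagate the signal forwards.
            xb, hid, hb, out = forward(x)

            # Step 4: delta for the output layer.
            delta_out = (target - out) * out * (1.0 - out)

            # Step 5: deltas for the preceding layer, propagated backwards
            # through the output weights (before they are updated).
            delta_hid = [delta_out * w_out[j] * hid[j] * (1.0 - hid[j])
                         for j in range(n_hidden)]

            # Step 6: update all the connections.
            for j in range(n_hidden + 1):
                w_out[j] += eta * delta_out * hb[j]
            for j in range(n_hidden):
                for i in range(n_in + 1):
                    w_hid[j][i] += eta * delta_hid[j] * xb[i]
            # Step 7: the loop continues with the next pattern.

    return lambda x: forward(x)[3]


# Usage: learn the logical OR of two inputs.
or_patterns = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
predict = train_backprop(or_patterns)
```

The weights are updated after every pattern, which corresponds to incremental rather than batch updating.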
Navigation of a Car • Carnegie-Mellon • 30 x 32 pixel image • 8 x 32 range finder • 29 hidden units, 45 output units • 1200 simulated road images, 40 training cycles • 5 km/h
Backgammon • Score from –100 to +100 • 3000 examples • 459 inputs • Two hidden layers of 24 nodes • Neurogammon vs. Gammontool: 59 percent • Without precomputed features: 41 percent • Without noise: 45 percent
Parity Problem • Output is on if an odd number of inputs is on • [Figure: example network, with weight and threshold labels 0.5, 1, -2, 0.5, 0.5, 1, 1, 1, 1]
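For two inputs, parity is the XOR function, which a single threshold unit cannot compute but a network with one hidden unit can. The sketch below uses a standard hand-chosen set of weights and thresholds; they are an assumption for illustration and are not a reconstruction of the figure on the slide.

```python
def step(x):
    """Threshold unit: on (1) if the net input is positive, else off (0)."""
    return 1 if x > 0 else 0


def parity2(x1, x2):
    # Hidden unit fires only when both inputs are on (threshold 1.5).
    hidden = step(x1 + x2 - 1.5)
    # Output sums the inputs and is strongly inhibited by the hidden unit
    # (weight -2, threshold 0.5), so it is on for exactly one active input.
    return step(x1 + x2 - 2 * hidden - 0.5)


def parity(bits):
    """Reference definition: on if an odd number of inputs is on."""
    return sum(bits) % 2


for a in (0, 1):
    for b in (0, 1):
        print(a, b, parity2(a, b))
```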
Back-Propagation • The update rule is local • Incremental (per-pattern) weight updating vs. batch mode • Momentum: add a fraction α of the previous weight change to each update, which accelerates the long-term trend by a factor 1/(1 - α)
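The effect of momentum on the long-term trend can be checked numerically. The sketch below assumes the common update in which each weight change is the plain gradient step plus a fraction `alpha` of the previous change; the names `eta`, `alpha`, and `momentum_step_size` are illustrative. With a constant gradient, the step size settles at the plain step divided by (1 - alpha).

```python
def momentum_step_size(gradient, eta=0.1, alpha=0.9, n_steps=200):
    """Return the weight change after n_steps under a constant gradient."""
    delta = 0.0
    for _ in range(n_steps):
        # New change = gradient-descent step + alpha * previous change.
        delta = -eta * gradient + alpha * delta
    return delta


plain = -0.1 * 1.0                      # step without momentum
with_momentum = momentum_step_size(1.0)
print(with_momentum / plain)            # approaches 1 / (1 - alpha)
```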
Back-Propagation • Adaptive parameters
Feedforward Networks • Process Modeling and Control • Machine Diagnostics • Portfolio Management • Target Recognition • Medical Diagnosis • Credit Rating
Feedforward Networks • Targeted Marketing • Voice Recognition • Financial Forecasting • Quality Control • Intelligent Searching • Fraud Detection
Optimal Network Architectures • Optimization • Use as few units as possible: • Reduce computational cost and training time • Improve generalization • Search through the space of possible architectures, for example using Back-Propagation and Evolutionary Algorithms
Optimal Network Architectures • Construct or modify architecture • Start with too many nodes and take some away • Start with too few and add some more
Optimal Network Architectures • Pruning and weight decay
Optimal Network Architectures • Small weights decay more rapidly than large ones:
Optimal Network Architectures • To remove whole units, use the same decay rate for all connections feeding unit i:
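The formulas for these two weight-decay variants did not survive in the text of the slides. The sketch below therefore assumes one common choice, a decay rate of the form gamma / (1 + w^2)^2 per weight, and gamma / (1 + sum of squared incoming weights)^2 per unit. The function names and the value of `gamma` are hypothetical.

```python
def decay_rate(w, gamma=0.1):
    """Assumed per-weight decay rate: large for small w, small for large w."""
    return gamma / (1.0 + w * w) ** 2


def decay_weight(w, gamma=0.1):
    """One decay step for a single weight."""
    return (1.0 - decay_rate(w, gamma)) * w


def decay_unit(incoming, gamma=0.1):
    """One decay step with a shared rate for all weights feeding one unit.

    The rate depends on the total squared weight into the unit, so a unit
    whose incoming weights are all small fades out as a whole.
    """
    norm_sq = sum(w * w for w in incoming)
    rate = gamma / (1.0 + norm_sq) ** 2
    return [(1.0 - rate) * w for w in incoming]


small, large = 0.1, 3.0
print(decay_weight(small) / small)   # relative shrinkage of a small weight
print(decay_weight(large) / large)   # a large weight is left almost unchanged
```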
Optimal Network Architectures • Start with a small network and gradually grow it to an appropriate size • Boolean function from N binary inputs to a single binary output
Optimal Network Architectures • Choose each hidden unit such that it gives • The same output for all remaining patterns with one target • The opposite output for at least one of the remaining patterns with the opposite target; remove these patterns • What remains is a linearly separable problem
Optimal Network Architectures • [Figure: patterns labelled + and -]
Optimal Network Architectures • Do the best we can with a single node • Correct its mistakes with two further nodes • One for wrongly on patterns • One for wrongly off patterns • Each additional unit reduces the number of incorrectly classified patterns by at least one
Optimal Network Architectures • Faithful representation: two patterns with different targets should have different representations • Master unit: does as well as possible on the task • Ancillary units: added to obtain faithful representation
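The faithfulness condition is easy to test mechanically. The sketch below is an illustration only: the function name `is_faithful` and the tuple encoding of a layer's activations are assumptions. It reports whether any two patterns with different targets share the same representation in a layer.

```python
def is_faithful(representations, targets):
    """Return True if no two patterns with different targets collide.

    representations: one sequence of unit outputs per pattern.
    targets: the desired output for each pattern, in the same order.
    """
    seen = {}
    for rep, target in zip(representations, targets):
        key = tuple(rep)
        if key in seen and seen[key] != target:
            return False          # same representation, different targets
        seen.setdefault(key, target)
    return True


# Two patterns with the same target may share a representation...
print(is_faithful([(1, 0), (1, 0), (0, 1)], [1, 1, 0]))
# ...but patterns with different targets may not.
print(is_faithful([(1, 0), (1, 0)], [1, 0]))
```

When the check fails, that is the situation in which ancillary units would be added to the layer.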