CAP6938 Neuroevolution and Developmental Encoding: Neural Network Weight Optimization
Dr. Kenneth Stanley
September 6, 2006
Review
• Remember, the values of the weights and the topology determine the functionality
• Given a topology, how are weights optimized?
• Weights are just parameters on a structure
[Figure: network diagram with each connection weight marked "?"]
Two Cases
• Output targets are known
• Output targets are not known
[Figure: network with inputs X1, X2, hidden nodes H1, H2, outputs out1, out2, and weights w11, w21, w12]
Decision Boundaries
• OR is linearly separable
• Linearly separable problems do not require hidden nodes (nonlinearities)
OR function:
Input      Output
 1   1       1
 1  -1       1
-1   1       1
-1  -1      -1
[Figure: the four OR inputs plotted as + and - points, next to a single unit with a bias input]
Decision Boundaries
• XOR is not linearly separable
• Requires at least one hidden node
XOR function:
Input      Output
 1   1      -1
 1  -1       1
-1   1       1
-1  -1      -1
[Figure: the four XOR inputs plotted as + and - points, next to a unit with a bias input]
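The two truth tables can be checked with a small brute-force search. This is only a sketch: it assumes a single threshold unit that outputs 1 when `w1*x1 + w2*x2 + bias > 0` and -1 otherwise, and it tries a small, arbitrarily chosen set of candidate weights. The names `threshold_unit` and `find_separating_weights` are illustrative, not from the slides.

```python
from itertools import product

# Bipolar truth tables as given on the slides: (x1, x2) -> target
OR_TABLE = {(1, 1): 1, (1, -1): 1, (-1, 1): 1, (-1, -1): -1}
XOR_TABLE = {(1, 1): -1, (1, -1): 1, (-1, 1): 1, (-1, -1): -1}


def threshold_unit(w1, w2, bias, x1, x2):
    """Single unit with no hidden nodes: a linear decision boundary."""
    return 1 if w1 * x1 + w2 * x2 + bias > 0 else -1


def find_separating_weights(table, candidates=(-2, -1, 0, 1, 2)):
    """Return the first (w1, w2, bias) that reproduces the table, or None."""
    for w1, w2, bias in product(candidates, repeat=3):
        if all(threshold_unit(w1, w2, bias, x1, x2) == target
               for (x1, x2), target in table.items()):
            return (w1, w2, bias)
    return None


if __name__ == "__main__":
    print("OR :", find_separating_weights(OR_TABLE))
    print("XOR:", find_separating_weights(XOR_TABLE))
```

The search finds weights for OR and none for XOR. For XOR the failure is not an artifact of the candidate set: the two inputs with target 1 require `bias > 0`, while the two with target -1 require `bias <= 0`, so no single linear boundary works.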
Hebbian Learning
• Change weights based on correlation of connected neurons
• Learning rules are local
• Simple Hebb Rule: [equation not preserved]
• Works best when relevance of inputs to outputs is independent
• Simple Hebb Rule lets weights grow without bound
• Can be made incremental: [equation not preserved]
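A minimal sketch of a Hebb-style update, assuming the common textbook form `w_i += rate * x_i * y` (the slide's own equation is not preserved here). It also assumes, for illustration only, that the target is used as the postsynaptic activity and that the bias is treated as a weight on a constant input of 1. The function names are hypothetical.

```python
OR_PAIRS = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]


def hebb_update(weights, inputs, output, rate=1.0):
    """One local Hebb step: each weight grows with the product x_i * y."""
    return [w + rate * x * output for w, x in zip(weights, inputs)]


def train_hebb(pairs, rate=1.0):
    """Single pass over the pairs; the last weight is the bias (input fixed at 1)."""
    weights = [0.0] * (len(pairs[0][0]) + 1)
    for inputs, target in pairs:
        weights = hebb_update(weights, list(inputs) + [1], target, rate)
    return weights


def classify(weights, inputs):
    total = sum(w * x for w, x in zip(weights, list(inputs) + [1]))
    return 1 if total > 0 else -1


if __name__ == "__main__":
    print(train_hebb(OR_PAIRS))  # [2.0, 2.0, 2.0]

    # Repeating the same correlated pattern keeps growing the weight.
    w = [0.0]
    for _ in range(100):
        w = hebb_update(w, [1], 1, rate=0.1)
    print(w)
```

One pass over the OR table yields weights that classify all four patterns. The loop at the end shows the unbounded growth noted above: nothing in the rule stops the weight from increasing as long as the activity stays correlated.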
More Complex Local Learning Rules
• Hebbian Learning with a maximum magnitude:
  • Excitatory: [equation not preserved]
  • Inhibitory: [equation not preserved]
• Second terms are decay terms: forgetting
  • Happens when presynaptic node does not affect postsynaptic node
• Other rules are possible
• Videos: watch the connections change
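The slide's equations are not preserved, so the following is only an illustrative rule with the same two ingredients: a growth term that shrinks as the weight approaches a maximum magnitude, and a second, decay term. The specific form, the parameter names (`w_max`, `rate`, `decay`), and the choice to trigger decay when the postsynaptic node is active while the presynaptic node is silent are all assumptions. Only an excitatory connection is sketched; activities are assumed to lie in [0, 1].

```python
def bounded_hebb_update(w, pre, post, w_max=1.0, rate=0.1, decay=0.05):
    """Assumed excitatory rule: bounded growth minus a forgetting term.

    growth:     rate * (w_max - w) * pre * post        (correlated activity)
    forgetting: decay * w * (1 - pre) * post           (post fires without pre)
    """
    growth = rate * (w_max - w) * pre * post
    forgetting = decay * w * (1.0 - pre) * post
    return w + growth - forgetting


if __name__ == "__main__":
    w = 0.0
    for _ in range(200):  # correlated activity: weight saturates near w_max
        w = bounded_hebb_update(w, pre=1.0, post=1.0)
    print("after correlated activity:", w)

    for _ in range(10):   # presynaptic node silent: weight is forgotten
        w = bounded_hebb_update(w, pre=0.0, post=1.0)
    print("after uncorrelated activity:", w)
```

Unlike the simple Hebb rule, the weight here stays between 0 and `w_max` as long as `rate` and `decay` are at most 1.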
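Perceptron Learning
• Will converge on correct weights when the problem is linearly separable
• Single layer learning rule: [equation not preserved]
• Rule is applied until boundary is learned
[Figure: single-layer unit with a bias input]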
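A sketch of single-layer perceptron learning, assuming the common error-driven form `w_i += rate * (target - output) * x_i` with a bias updated the same way (the slide's exact equation is not preserved). The rule is applied until an entire pass produces no mistakes or an epoch limit is hit; `max_epochs` and the function names are illustrative choices.

```python
OR_PAIRS = [((1, 1), 1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]
XOR_PAIRS = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]


def predict(weights, bias, inputs):
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else -1


def train_perceptron(pairs, rate=0.1, max_epochs=100):
    """Return (weights, bias, converged)."""
    weights = [0.0] * len(pairs[0][0])
    bias = 0.0
    for _ in range(max_epochs):
        mistakes = 0
        for inputs, target in pairs:
            error = target - predict(weights, bias, inputs)
            if error != 0:
                mistakes += 1
                # Move the boundary toward the misclassified example.
                weights = [w + rate * error * x for w, x in zip(weights, inputs)]
                bias += rate * error
        if mistakes == 0:  # boundary learned
            return weights, bias, True
    return weights, bias, False


if __name__ == "__main__":
    print("OR :", train_perceptron(OR_PAIRS))
    print("XOR:", train_perceptron(XOR_PAIRS))
```

On OR the loop stops with `converged` set to `True`. On XOR it always runs out of epochs, because no single-layer boundary classifies all four patterns.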
Backpropagation
• Designed for at least one hidden layer
• First, activation propagates to outputs
• Then, errors are computed and assigned
• Finally, weights are updated
• Sigmoid is a common activation function
Notation: x's are inputs, z's are hidden units, y's are outputs, t's are targets, v's are layer 1 weights, w's are layer 2 weights
[Figure: network with inputs X1, X2; hidden units z1, z2; outputs y1, y2 with targets t1, t2; layer 1 weights v11, v12, v21, v22; layer 2 weights w11, w12, w21, w22]
Backpropagation Algorithm
• Initialize weights
• While stopping condition is false, for each training pair:
  • Compute outputs by forward activation
  • Backpropagate error:
    • For each output unit, error (target minus output, times slope)
    • Weight correction (learning rate times error times hidden output)
    • Send error back to hidden units
    • Calculate error contribution for each hidden unit: [equation not preserved]
    • Weight correction
  • Adjust weights by adding weight corrections
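The steps above can be sketched for a network with one hidden layer and one output, using the x, z, y, t, v, w notation from the slides. Several choices here are assumptions: tanh (a bipolar sigmoid) as the activation so that the -1/1 XOR targets can be used directly, four hidden units, a fixed number of epochs as the stopping condition, and the standard hidden-unit error (output error times connecting weight times slope).

```python
import math
import random

XOR_PAIRS = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]


def forward(x, v, v_bias, w, w_bias):
    """Forward activation: inputs x -> hidden z -> output y."""
    z = [math.tanh(v_bias[j] + sum(x[i] * v[i][j] for i in range(len(x))))
         for j in range(len(v_bias))]
    y = math.tanh(w_bias + sum(z[j] * w[j] for j in range(len(z))))
    return z, y


def sum_squared_error(pairs, v, v_bias, w, w_bias):
    return sum((t - forward(x, v, v_bias, w, w_bias)[1]) ** 2 for x, t in pairs)


def train(pairs, n_hidden=4, rate=0.1, epochs=5000, seed=0):
    rng = random.Random(seed)
    n_in = len(pairs[0][0])
    # Initialize weights to small random values.
    v = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_in)]
    v_bias = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_bias = rng.uniform(-0.5, 0.5)
    initial = sum_squared_error(pairs, v, v_bias, w, w_bias)

    for _ in range(epochs):  # stopping condition: fixed epoch count (assumption)
        for x, t in pairs:
            z, y = forward(x, v, v_bias, w, w_bias)
            # Output error: target minus output, times slope (tanh' = 1 - y^2).
            delta_out = (t - y) * (1.0 - y * y)
            # Send error back: each hidden unit's share, times its own slope.
            delta_hid = [delta_out * w[j] * (1.0 - z[j] ** 2) for j in range(n_hidden)]
            # Adjust weights by adding the corrections.
            for j in range(n_hidden):
                w[j] += rate * delta_out * z[j]  # rate * error * hidden output
                v_bias[j] += rate * delta_hid[j]
                for i in range(n_in):
                    v[i][j] += rate * delta_hid[j] * x[i]
            w_bias += rate * delta_out

    final = sum_squared_error(pairs, v, v_bias, w, w_bias)
    return (v, v_bias, w, w_bias), initial, final


if __name__ == "__main__":
    params, initial, final = train(XOR_PAIRS)
    print("sum squared error before:", initial, "after:", final)
    for x, t in XOR_PAIRS:
        print(x, t, forward(x, *params)[1])
```

The hidden-unit errors are computed before any weight is changed, so every correction in a step uses the same forward pass. A different seed or fewer hidden units can leave the network stuck at a poor solution.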
Example Applications
• Anything with a set of examples and known targets
• XOR
• Character recognition
• NETtalk: reading English aloud
• Failure prediction
• Disadvantage: can become trapped in local optima
Output Targets Often Not Available (Stone, Sutton, and Kuhlmann 2005)
One Approach: Value Function Reinforcement Learning
• Divide the world into states and actions
• Assign values to states
• Gradually learn the most promising states and actions
[Figure: grid world of state values with Start and Goal marked; the goal state has value 1 and every other state has value 0]
Learning to Navigate
[Figure: four snapshots of the grid-world state values at T=1, T=56, T=350, and T=703. At first only the goal state has value 1; a value of 0.5 then appears next to it; in the later snapshots values of 0.9 and 1 extend along a path between Start and Goal]
How to Update State/Action Values
• Q-learning rule: [equation not preserved]
• Exploration increases Q-values' accuracy
• The best actions to take in different states become known
• Works only in Markovian domains
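A sketch of tabular Q-learning, assuming the standard update `Q(s,a) += rate * (r + gamma * max_a' Q(s',a') - Q(s,a))` since the slide's own equation is not preserved. The environment is a made-up five-state corridor (not the grid from the slides) with a reward of 1 for entering the goal. Exploration is purely random, which Q-learning tolerates because it learns about the greedy policy regardless of how actions are chosen.

```python
import random

N_STATES = 5            # hypothetical corridor: states 0..4, goal at the right end
GOAL = N_STATES - 1
LEFT, RIGHT = 0, 1


def step(state, action):
    """Deterministic, Markovian transition: next state depends only on (state, action)."""
    next_state = max(0, state - 1) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


def q_learning(episodes=500, rate=0.5, gamma=0.9, max_steps=1000, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]  # q[state][action]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            action = rng.choice((LEFT, RIGHT))  # explore at random
            next_state, reward = step(state, action)
            # The goal is terminal, so nothing further can be earned from it.
            best_next = 0.0 if next_state == GOAL else max(q[next_state])
            q[state][action] += rate * (reward + gamma * best_next - q[state][action])
            state = next_state
            if state == GOAL:
                break
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(q_learning()):
        print(state, round(left, 3), round(right, 3))
```

After training, moving right has the higher Q-value in every non-goal state, and the values fall off by a factor of `gamma` per step away from the goal.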
Backprop In RL
• The state/action table can be estimated by a neural network
• The target learned by the network is the Q-value: [equation not preserved]
[Figure: a neural network (NN) takes Action and State_description as inputs and outputs Value]
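A sketch of how a network can stand in for the table, assuming the usual Q-learning target `r + gamma * max_a' Q(s',a')` (the slide's equation is not preserved). The class `TinyQNetwork`, its single tanh hidden layer, its linear output, and the encoding of the input as state description followed by a one-hot action are all illustrative assumptions.

```python
import math
import random


def q_target(reward, next_q_values, gamma=0.9, terminal=False):
    """Target for Q(s, a): reward plus discounted best value of the next state."""
    if terminal or not next_q_values:
        return reward
    return reward + gamma * max(next_q_values)


class TinyQNetwork:
    """One hidden layer (tanh), one linear output; bias weights stored last."""

    def __init__(self, n_inputs, n_hidden=3, seed=0):
        rng = random.Random(seed)
        self.v = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
                  for _ in range(n_inputs + 1)]
        self.w = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _hidden(self, x):
        xb = list(x) + [1.0]
        return [math.tanh(sum(xi * row[j] for xi, row in zip(xb, self.v)))
                for j in range(len(self.w) - 1)]

    def predict(self, x):
        z = self._hidden(x) + [1.0]
        return sum(zj * wj for zj, wj in zip(z, self.w))

    def train_step(self, x, target, rate=0.05):
        """One backprop step toward the given Q-value target."""
        xb = list(x) + [1.0]
        z = self._hidden(x)
        y = sum(zj * wj for zj, wj in zip(z + [1.0], self.w))
        delta_out = target - y  # linear output, so the slope is 1
        delta_hid = [delta_out * self.w[j] * (1.0 - z[j] ** 2) for j in range(len(z))]
        for j, zj in enumerate(z + [1.0]):
            self.w[j] += rate * delta_out * zj
        for i, xi in enumerate(xb):
            for j, dj in enumerate(delta_hid):
                self.v[i][j] += rate * dj * xi


if __name__ == "__main__":
    net = TinyQNetwork(n_inputs=4)
    state, action = [0.0, 1.0], [1.0, 0.0]  # hypothetical encodings
    target = q_target(reward=1.0, next_q_values=[0.2, 0.5])
    print("before:", net.predict(state + action), "target:", target)
    net.train_step(state + action, target)
    print("after :", net.predict(state + action))
```

In a full learner, `next_q_values` would come from calling `predict` on the next state paired with each available action, so the network supplies its own training targets.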
Next Week: Evolutionary Computation
• EC does not require targets
• EC can be a kind of RL
• EC is policy search
• EC is more than RL
For 9/11: Mitchell ch. 1 (pp. 1-31) and ch. 2 (pp. 35-80)
Note: Section 2.3 is "Evolving Neural Networks"