Neural Networks
References:
• “Artificial Intelligence for Games”
• “Artificial Intelligence: A New Synthesis”
History
• In the ’70s it was *THE* approach to AI.
• Since then, not so much.
  • Although it has crept back into some apps.
• It still has its uses.
• Many different types of neural nets:
  • (Multi-layer) feed-forward (perceptron)
  • Hebbian
  • Recurrent
Overview
• Givens:
  • A set of n training cases (each represented by an m-bit “binary string”)
  • The “correct” p-bit answer for each training case.
    • Supervised vs. unsupervised
• Process:
  • Train a neural network to “learn” the training cases.
• Using the neural network:
  • When exposed to new (grey) inputs, it should pick the closest match.
  • Better than decision trees in this respect.
• Basically: a classifier.
Example Application #1
• Modeling an enemy AI.
• The AI can do one of these 4 actions:
  • Go for cover
  • Shoot nearest enemy
  • Run away
  • Idle
• Training cases:
  • 50 sample play-throughs with:
    • position of player (16 bits)
    • health of player (6 bits)
    • ammo-state of player (8 bits)
  • One of the following:
    • Supervised training: the action the AI *should* have taken.
    • Unsupervised training: a way to measure success (based on player/AI state)
Example Application #2
• Training cases:
  • Eight 8x8-pixel images (black/white) (64 bits each)
  • For each, a classification number (3 bits each)
  • This is supervised learning.
• Use:
  • After training, feed it some images similar to one of the training cases (but not the same)
  • It *should* return the best classification number.
Perceptrons
• Modeled after a single neuron.
• Components:
  • Dendrites (Input)
  • Axon (Output)
  • Soma (Activation Function)
Perceptron Activation functions
• Notation:
  • $I$: the vector of all input values. For us, input values are 0 or 1.
  • $W$: the vector of weight values (same size as $I$)
    • Positive or negative, no limit.
    • I usually initialize them to values between -1.0 and +1.0.
  • $\Sigma = \sum_i I_i W_i$: the total (weighted) input to the perceptron.
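Perceptron Activation functions, cont.
• Square function: $f(\Sigma) = 1$ if $\Sigma \ge \delta$, otherwise $0$ (δ is the threshold)
• Sigmoid function (the one we’ll use): $f(\Sigma) = \frac{1}{1 + e^{-(\Sigma - \delta)}}$
• To convert the output of the sigmoid to a "crisp" value (when this is an output from the nnet), count a value >= 0.5 as 1 and anything lower as 0.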
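The two activation functions and the "crisp" conversion can be sketched in a few lines of Python. This is only an illustration: it assumes the threshold forms "output 1 when Σ >= δ" and "sigmoid of (Σ - δ)", and the input, weight and δ values are made up for the example.

```python
import math

def weighted_sum(inputs, weights):
    """Total weighted input (Sigma) to the perceptron."""
    return sum(i * w for i, w in zip(inputs, weights))

def square(total, delta):
    """Square (step) activation: fires once the total reaches the threshold."""
    return 1 if total >= delta else 0

def sigmoid(total, delta):
    """Sigmoid activation, shifted so that total == delta gives exactly 0.5."""
    return 1.0 / (1.0 + math.exp(-(total - delta)))

def crisp(value):
    """Turn a sigmoid output into a 0/1 answer."""
    return 1 if value >= 0.5 else 0

# Hypothetical values, chosen only for illustration.
inputs = [1, 0, 1]
weights = [0.5, -0.3, 0.4]
delta = 0.6

total = weighted_sum(inputs, weights)
print(total, square(total, delta), sigmoid(total, delta), crisp(sigmoid(total, delta)))
```

Note that the crisp sigmoid answer and the square function agree: both switch from 0 to 1 exactly where Σ reaches δ.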
Perceptron Activation functions, cont.
• Augmented input vectors
  • Append:
    • A 1 to the input vector
    • -δ to the weight vector
  • Calculate Σ as before
  • Square function becomes: $f(\Sigma) = 1$ if $\Sigma \ge 0$, otherwise $0$
  • Sigmoid function (the one we’ll use) becomes: $f(\Sigma) = \frac{1}{1 + e^{-\Sigma}}$
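A quick way to convince yourself that augmenting the vectors changes nothing is to compute the output both ways. The sketch below assumes the un-augmented sigmoid is applied to (Σ - δ); the function names and the sample weights are hypothetical.

```python
import math
from itertools import product

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def perceptron_with_threshold(inputs, weights, delta):
    """Explicit threshold: subtract delta from the weighted sum."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return sigmoid(total - delta)

def perceptron_augmented(inputs, weights, delta):
    """Augmented form: a constant 1 input whose weight is -delta."""
    aug_inputs = list(inputs) + [1.0]
    aug_weights = list(weights) + [-delta]
    total = sum(i * w for i, w in zip(aug_inputs, aug_weights))
    return sigmoid(total)

# Hypothetical weights and threshold; every 2-bit input gives the same answer.
weights, delta = [0.8, -0.5], 0.3
for bits in product([0, 1], repeat=2):
    a = perceptron_with_threshold(bits, weights, delta)
    b = perceptron_augmented(bits, weights, delta)
    print(bits, round(a, 4), round(b, 4))
```

The payoff of the augmented form is that the threshold becomes just another weight, so training can adjust it with the same rule as every other weight.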
Examples
• Two inputs, one output
  • AND
  • OR
  • XOR
    • Problem!
    • Can’t be done with a single perceptron
[Figure: a single perceptron with inputs I0 and I1 plus a constant 1.0 input, weights w0, w1 and w2 = -δ, and output O0]
Why not XOR?
• The first two examples are linearly separable
• In higher dimensions, the “dividing line” is a hyper-plane, not a line.
[Figure: a line dividing the OUTPUT=1 region from the OUTPUT=0 region; its position is set by the threshold δ]
AND and OR
[Figure: dividing lines for the AND and OR perceptrons, labelled δ=0.75 and δ=0.4]
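As a sketch of how one perceptron handles both gates, the snippet below uses the square function with augmented vectors. The weights of 0.5 on each input are an assumption (the slide only gives the thresholds), as is pairing δ=0.75 with AND and δ=0.4 with OR.

```python
def step_perceptron(inputs, weights, delta):
    """Square-function perceptron using an augmented input vector."""
    aug_inputs = list(inputs) + [1.0]
    aug_weights = list(weights) + [-delta]
    total = sum(i * w for i, w in zip(aug_inputs, aug_weights))
    return 1 if total >= 0 else 0

ASSUMED_WEIGHTS = [0.5, 0.5]  # hypothetical: not given in the slides

def and_gate(i0, i1):
    return step_perceptron([i0, i1], ASSUMED_WEIGHTS, delta=0.75)

def or_gate(i0, i1):
    return step_perceptron([i0, i1], ASSUMED_WEIGHTS, delta=0.4)

for i0 in (0, 1):
    for i1 in (0, 1):
        print(i0, i1, "AND:", and_gate(i0, i1), "OR:", or_gate(i0, i1))
```

With the same weights, only the threshold differs: AND needs both inputs to push the sum past 0.75, while one input is enough to pass 0.4 for OR.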
XOR
• You can’t draw a line to separate the “True”s from the “False”s
Multi-layer perceptron networks
• We can’t solve XOR with one perceptron, but we can solve it with 3.
  • The 0th one in the first layer looks for [0, 1]
  • The 1st one in the first layer looks for [1, 0]
  • The only one in the second layer ORs the outputs of the first layer’s perceptrons.
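XOR NNet
[Figure: a two-layer network. Inputs I0 and I1 plus a constant 1.0 feed Perceptron00 and Perceptron01 through weights W000, W010, W020, W100, W110 and W120. Their outputs plus a constant 1.0 feed Perceptron10 through weights W001, W011 and W021, which produces output O0.]
• Notation: Wijh = weight to layer h, from input #j to perceptron #i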
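Here is a minimal sketch of the three-perceptron XOR network using the square function. The weight values are hand-picked assumptions (the slides do not list them); they are stored so that `W[h][i][j]` matches the Wijh notation, with the last weight in each row multiplying the constant 1.0 input.

```python
def step(total):
    return 1 if total >= 0 else 0

# W[h][i][j]: weight to layer h, from input j to perceptron i (hypothetical values).
W = [
    [   # layer 0
        [-1.0,  1.0, -0.5],   # perceptron 0: fires only for [0, 1]
        [ 1.0, -1.0, -0.5],   # perceptron 1: fires only for [1, 0]
    ],
    [   # layer 1
        [ 1.0,  1.0, -0.5],   # perceptron 0: ORs the two hidden outputs
    ],
]

def xor_net(i0, i1):
    values = [i0, i1]
    for layer in W:
        augmented = values + [1.0]  # constant input for the threshold weight
        values = [step(sum(w * x for w, x in zip(row, augmented))) for row in layer]
    return values[0]

for i0 in (0, 1):
    for i1 in (0, 1):
        print(i0, i1, "->", xor_net(i0, i1))
```

Each hidden perceptron draws one dividing line; the output perceptron combines the two half-planes, which is what a single perceptron cannot do.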
Weight matrices + Feed-Forward
• For compactness, represent the weights for layer N as a matrix (one row per perceptron, one column per input, including the constant 1.0 input).
Feed Forward Example
• Feed Forward is how you use a Nnet.
• Layer sizes: Input (5), Hidden1 (4), Hidden2 (3), Output (2)
• Steps ("+ [1.0]" appends the constant 1.0 input):
  • Hidden1Input = W0 * (I)
  • Hidden1Output = f(Hidden1Input)
  • Hidden2Input = W1 * (Hidden1Output + [1.0])
  • Hidden2Output = f(Hidden2Input)
Feed Forward Example, cont.
• Starting from Hidden2Output:
  • OutputInput = Hidden2Output + [1.0]
  • OutputOutput = W2 * OutputInput
  • Final output = f(OutputOutput)
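The feed-forward steps can be sketched as a loop over weight matrices. This is an illustration only: the weights are random (seeded so the run is repeatable), the input bits are made up, and the code assumes a constant 1.0 is appended to the input of every layer, including the first.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def make_layer(n_inputs, n_perceptrons, rng):
    """One row per perceptron; the extra column is the threshold weight."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_perceptrons)]

def feed_forward(layers, inputs):
    values = list(inputs)
    for weights in layers:
        augmented = values + [1.0]  # append the constant input
        values = [sigmoid(sum(w * x for w, x in zip(row, augmented)))
                  for row in weights]
    return values

rng = random.Random(0)          # fixed seed, purely for repeatability
sizes = [5, 4, 3, 2]            # Input, Hidden1, Hidden2, Output
layers = [make_layer(sizes[k], sizes[k + 1], rng) for k in range(len(sizes) - 1)]

output = feed_forward(layers, [1, 0, 1, 1, 0])
print([len(layer) for layer in layers], [len(layer[0]) for layer in layers])
print(output)
```

Under these assumptions W0 is 4x6, W1 is 3x5 and W2 is 2x4: each matrix has one more column than the layer feeding it.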
Training intro
• Feed Forward is how you use a Nnet.
• But we usually won’t have the weights initially
  • So we’ll need to train it (i.e. adjust weights)
• Let’s look first at training a single perceptron.
• Error-correction procedure: $W \leftarrow W + c \cdot Err \cdot f'(\Sigma) \cdot X$
  • $Err = d - o$
  • $W$ is the weight vector for this perceptron
  • $X$ is the input vector being shown to the perceptron
  • $c$ is a “learning rate” (I use 0.1 – 1.0 for mine)
  • $d$ is the desired output for input $X$
  • $o$ is the actual output for input $X$ (get this by Feed-Fwd)
  • $f'$ is the derivative of the activation function (the gradient). For sigmoids: $f' = f \cdot (1 - f)$
Training intro, cont.
• $Err = d - o$
• Observations:
  • If d == o, no change in W
  • d – o lies between -1 and +1.
    • Negative means the output was too high. Decrease weight.
    • Positive means the output was too low. Increase weight.
  • The f’ term indicates how “changeable” the value is
    • f’ near 0 (when f is nearly 1 or nearly 0) means it’s “locked-in”
    • f’ near 0.25 (when f is near 0.5) means it’s very changeable.
Training intro, cont.
• You repeatedly show examples, X
• Weight values will stabilize at “correct” values.
  • *IF* the answers are linearly separable (like AND; unlike XOR)
  • If they aren’t, it will at least minimize the error.
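To see the difference between separable and non-separable cases, here is a rough sketch that trains one sigmoid perceptron on AND and then on XOR. It assumes the update W += c * (d - o) * f' * X with augmented inputs; the starting weights, learning rate and number of passes are arbitrary choices for the illustration.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def output(weights, inputs):
    return sigmoid(sum(w * x for w, x in zip(weights, inputs)))

def train(cases, weights, c, epochs):
    """Repeatedly show every case and nudge the weights after each one."""
    weights = list(weights)
    for _ in range(epochs):
        for inputs, desired in cases:
            o = output(weights, inputs)
            step = c * (desired - o) * o * (1.0 - o)  # c * Err * f'
            weights = [w + step * x for w, x in zip(weights, inputs)]
    return weights

def crisp_outputs(weights, cases):
    return [1 if output(weights, inputs) >= 0.5 else 0 for inputs, _ in cases]

# Each input ends with the constant 1 (augmented input vector).
AND_CASES = [([0, 0, 1], 0), ([0, 1, 1], 0), ([1, 0, 1], 0), ([1, 1, 1], 1)]
XOR_CASES = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 0)]

and_weights = train(AND_CASES, [0.3, 0.7, 0.0], c=0.5, epochs=5000)
xor_weights = train(XOR_CASES, [0.3, 0.7, 0.0], c=0.5, epochs=5000)

print("AND:", crisp_outputs(and_weights, AND_CASES))
print("XOR:", crisp_outputs(xor_weights, XOR_CASES))
```

No choice of weights can make the XOR run answer all four cases correctly, because the crisp output of a single perceptron is always a straight-line split of the input space.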
Training example
• Update rule: $W \leftarrow W + c \cdot Err \cdot f'(\Sigma) \cdot X$, with $Err = d - o$ and $c = 0.1$
• 2 inputs, one output.
• Weights initially [0.3, 0.7, 0.0]
• Scenario #1: In=[0, 1, 1]. DesiredOutput=0
  • Feed forward. Σ=0.7. f(0.7)=0.668. f’(0.7)=0.222. Err=-0.668
  • W += 0.1 * (0 – 0.668) * 0.222 * [0, 1, 1]
  • W is now [0.3, 0.685, -0.015]
  • Feed forward. Σ=0.685-0.015=0.670. f(0.670)=0.662. f'(0.670)=0.224. Err=-0.662
  • If we repeat this 299 more times: Err=-0.032
• Scenario #2: In=[0, 1, 1]. DesiredOutput=1
  • Feed forward. Σ=0.7. f(0.7)=0.668. f'(0.7)=0.222. Err=0.332
  • W += 0.1 * (1 – 0.668) * 0.222 * [0, 1, 1]
  • W is now [0.3, 0.707, 0.007]
  • Feed forward. Σ=0.715. f(0.715)=0.671. f'(0.715)=0.221. Err=0.329
  • If we repeat this 299 more times: Err=0.055
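The first update in each scenario can be checked with a few lines of Python. This sketch assumes the rule W += c * (d - o) * f' * X with a sigmoid activation and c = 0.1; `train_step` is a name made up for the example.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(weights, inputs, desired, c):
    """One error-correction update; returns the new weights and the error."""
    total = sum(w * x for w, x in zip(weights, inputs))
    out = sigmoid(total)
    err = desired - out
    slope = out * (1.0 - out)  # f' for a sigmoid
    new_weights = [w + c * err * slope * x for w, x in zip(weights, inputs)]
    return new_weights, err

start = [0.3, 0.7, 0.0]
sample = [0, 1, 1]

w1, err1 = train_step(start, sample, desired=0, c=0.1)
w2, err2 = train_step(start, sample, desired=1, c=0.1)

print([round(w, 3) for w in w1], round(err1, 3))  # [0.3, 0.685, -0.015] -0.668
print([round(w, 3) for w in w2], round(err2, 3))  # [0.3, 0.707, 0.007] 0.332
```

The weight on the first input never moves because that input is 0, while the threshold weight moves along with the second weight because its input is always 1.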
Training example, cont.
• Basically, we’re adjusting weights:
  • We subtract from the weight if the error is negative and we have a 1 for input
    • Negative error == output is too high.
  • We add to the weight if the error is positive and we have a 1 for input.
    • Positive error == output is too low.
  • If the input is 0, leave the weight unchanged.
• The threshold (δ) is a weight too, where the input is always 1.
  • So even if the inputs are all 0, we could still change the threshold.
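Training Hidden Layers
• Recall: updating weights feeding into an output perceptron (terms are re-arranged):
  • $Err = d - o$
  • $W \leftarrow W + c \cdot Err \cdot f'(\Sigma) \cdot X$
• Updating weights feeding into a hidden perceptron:
  • $Err = \sum_k \left( w_k \cdot err_k \cdot f'(\Sigma_k) \right)$
  • $W \leftarrow W + c \cdot Err \cdot f'(\Sigma) \cdot X$
  • $w_k$ is the weight from the node we're feeding into to the kth node in the next layer up.
  • $err_k$ is the error of the kth node in the next layer up.
  • $\Sigma_k$ is the total weighted input of the kth node in the next layer up.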
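A minimal sketch of one back-propagation update is given below. It uses the standard textbook rule as an assumption about what the slide's missing equations were: an output node's error is d - o, and a hidden node's error is the sum over the next layer of (weight * that node's error * that node's f'). The tiny 2-2-1 network, its starting weights and the training case are all made up.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(layers, inputs):
    """Return the outputs of every layer; entry 0 is the raw input."""
    outputs = [list(inputs)]
    for weights in layers:
        augmented = outputs[-1] + [1.0]
        outputs.append([sigmoid(sum(w * x for w, x in zip(row, augmented)))
                        for row in weights])
    return outputs

def backprop_step(layers, inputs, desired, c):
    """One update of all weights (in place), working from the output layer down."""
    outputs = forward(layers, inputs)
    errs = [d - o for d, o in zip(desired, outputs[-1])]
    for h in range(len(layers) - 1, -1, -1):
        layer_in = outputs[h] + [1.0]
        # error times f' for each perceptron in layer h
        deltas = [e * o * (1.0 - o) for e, o in zip(errs, outputs[h + 1])]
        # errors for the layer below, using the weights before they change
        errs = [sum(layers[h][k][j] * deltas[k] for k in range(len(deltas)))
                for j in range(len(outputs[h]))]
        for k, row in enumerate(layers[h]):
            for j in range(len(row)):
                row[j] += c * deltas[k] * layer_in[j]

# Hypothetical 2-2-1 network; the last weight in each row is the threshold weight.
layers = [
    [[0.2, -0.4, 0.1], [0.7, 0.3, -0.2]],
    [[0.5, -0.6, 0.3]],
]
case_in, case_out = [1, 0], [1]

before = (case_out[0] - forward(layers, case_in)[-1][0]) ** 2
for _ in range(50):
    backprop_step(layers, case_in, case_out, c=0.5)
after = (case_out[0] - forward(layers, case_in)[-1][0]) ** 2
print(before, after)
```

Showing the same case repeatedly shrinks its squared error; real training cycles through all the training cases instead of just one.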
Back-propagation Example • [See Handout]