Artificial Intelligence in Game Design
N-Grams and Decision Tree Learning
Predicting Player Actions
Type of game where this works best:
• Player has a choice between a few possible actions
  • Attack left
  • Attack right
• Character can take a simultaneous counteraction
• Each player action has a “correct” counteraction
  • Attack left → defend left
  • Attack right → defend right
• Goal: Character should learn to “anticipate” the current player action based on past actions
Probabilistic Approach
• Keep track of the last n actions taken by the player
  • The size of n is the “window” of character memory
• Compute the probability of the next player action based on these
• Base the decision about the counteraction on that probability
• Example: Last 10 player actions: L L R L R L L L R L
  • Estimated probabilities of next action: Left = 7/10 = 0.7, Right = 3/10 = 0.3
Probabilistic Actions
• Simple majority approach: since left attack has the highest probability, defend left
• Problem: Character will take the same action for long periods
  • Too predictable!
• Example: Player’s next 4 attacks are from the right, but left attack is still the majority player action in the last 10 moves, so the character keeps choosing defend left:
    player:    L L R L R L L L R L R R R R
    character:                     L L L L
  • Character looks very stupid!
Probabilistic Actions
• Probabilistic approach: Choose each counteraction with the same probability as the corresponding player action
  • Biased towards recent player actions
  • Far less predictable
  • Player may notice that the character is changing tactics, without being aware of how this is done
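A minimal sketch of this probability-matched choice in Python, assuming a two-action left/right game; the class and method names are illustrative, not from the slides:

import random
from collections import Counter, deque

class ProbabilisticDefender:
    """Defend on each side with the same probability as attacks
    on that side in a sliding window of recent player actions."""

    def __init__(self, window_size=10):
        self.window = deque(maxlen=window_size)   # last n player actions

    def observe(self, attack):
        self.window.append(attack)                # oldest action drops out

    def choose_defense(self):
        if not self.window:
            return random.choice(["L", "R"])      # no history yet: guess
        counts = Counter(self.window)
        sides = list(counts)
        weights = [counts[s] for s in sides]
        # Weighted random pick: defend left with P(left attack), etc.
        return random.choices(sides, weights=weights)[0]

defender = ProbabilisticDefender(window_size=10)
for attack in "LLRLRLLLRL":
    defender.observe(attack)
print(defender.choose_defense())   # "L" about 70% of the time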
Window Size
• Key question: What is a good value for n, the size of the “window” used to determine probabilities?
  L L R L R L L L R L R R L
• Can be too small (n = 2, for example): character has no “memory” of past player actions
• Can be too large (n = 20, for example): too slow to react to changes in player tactics
  L L L L L L L L L L L L R R R R R R R R
• No best solution; will need to experiment
N-Grams
• Conditional probabilities based on a sequence of user actions
• Example:
  • “The last two player attacks were left and then right”
  • “What has the player done next, the last times they attacked left then right?”
  • After the last 4 left-then-right attacks:
    • Attacked right 3 times
    • Attacked left 1 time
  • Conclusion: Player has a 75% chance of attacking right next
N-Grams Example
• Example:
  • Window of memory = last 12 actions
  • Base the decision on the last two actions taken by the player (past sequence length = 2)
• Goal: Determine what action the player is likely to take next, given the last two actions L R
• Previous actions: L L R L R R L L R L L R ?
• Previous cases of L R:
  • Followed by L twice
  • Followed by R once
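A small sketch of the lookup described above, assuming actions are single characters and a past sequence length of 2; the function names are my own:

from collections import Counter

def ngram_counts(history, seq_len=2):
    """Count which action followed each past sequence of length seq_len."""
    counts = {}
    for i in range(len(history) - seq_len):
        seq = tuple(history[i:i + seq_len])
        following = history[i + seq_len]
        counts.setdefault(seq, Counter())[following] += 1
    return counts

def predict_next(history, seq_len=2):
    """Predict the most likely next action given the last seq_len actions."""
    counts = ngram_counts(history, seq_len)
    recent = tuple(history[-seq_len:])
    if recent not in counts:
        return None                    # this sequence never seen before
    return counts[recent].most_common(1)[0][0]

history = list("LLRLRRLLRLLR")         # the 12-action window above
print(predict_next(history))           # "L": L R was followed by L 2 of 3 times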
N-Grams and Sequence Size
• The number of statistics to keep grows exponentially with the length of the past sequences
  • Number of possible actions = a
  • Length of past sequences = L
  • Number of possible configurations of past sequences = a^L
  • For example, a = 2 and L = 3 gives 2^3 = 8 possible sequences, but a = 10 and L = 3 gives 1000
• L must be small (no more than 2 or 3)
Storing N-Grams
• Algorithm:
  • Keep statistics for all possible action strings of length L
  • Based on the “window” of past actions
• Example: L L L L R R L L R L L R
  • For L = 2, this window gives: L L followed by L twice and by R 3 times; L R followed by L once and by R once; R L followed by L twice; R R followed by L once
N-Grams and Updating
• When the player takes a new action:
  • Add an instance for that action and update the statistics
  • Remove the oldest action from the list and update the statistics
  • Move the “window”
• Example: L L L L R R L L R L L R + L
  • The new action L is added; the first L is now outside the “window”
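One way to implement the moving window incrementally, so that each new action updates only the two affected statistics instead of recomputing everything; the class design is an assumption of this sketch, not from the slides:

from collections import Counter, deque

class NGramWindow:
    """Keep n-gram statistics over a moving window of player actions."""

    def __init__(self, window_size=12, seq_len=2):
        self.window = deque()
        self.window_size = window_size
        self.seq_len = seq_len
        self.counts = {}        # sequence tuple -> Counter of next actions

    def record(self, action):
        self.window.append(action)
        w = list(self.window)
        # Add an instance: the seq_len actions before `action` now predict it.
        if len(w) > self.seq_len:
            seq = tuple(w[-self.seq_len - 1:-1])
            self.counts.setdefault(seq, Counter())[action] += 1
        # Move the window: evict the oldest action, and forget the one
        # statistic whose sequence started with it.
        if len(w) > self.window_size:
            old_seq = tuple(w[:self.seq_len])
            self.counts[old_seq][w[self.seq_len]] -= 1
            self.window.popleft()

tracker = NGramWindow()
for a in "LLLLRRLLRLLR":       # fill the 12-action window from the example
    tracker.record(a)
tracker.record("L")            # new action: the first L leaves the window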
Decision Trees
Simple tree traversal:
• Node = question
• Branch to follow = answer
• Leaf = final action to take
Can create dynamically from a set of examples
• Example:
  Hit points < 5?
  • yes → Obstacle between myself and player?
    • yes → hide
    • no → run
  • no → Within one unit of player?
    • yes → attack
    • no → Path to player clear?
      • yes → run towards player
      • no → run sideways
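A minimal sketch of this traversal, with the example tree above encoded directly; the state representation (a dict of flags) and all names here are illustrative assumptions:

class Leaf:
    def __init__(self, action):
        self.action = action

class Node:
    def __init__(self, question, yes, no):
        self.question = question        # function: game state -> True/False
        self.yes, self.no = yes, no     # subtrees for each answer

def decide(tree, state):
    """Follow answers from the root down to a leaf's action."""
    while isinstance(tree, Node):
        tree = tree.yes if tree.question(state) else tree.no
    return tree.action

# The example tree above, encoded directly:
tree = Node(lambda s: s["hp"] < 5,
            Node(lambda s: s["obstacle"], Leaf("hide"), Leaf("run")),
            Node(lambda s: s["near_player"], Leaf("attack"),
                 Node(lambda s: s["path_clear"],
                      Leaf("run towards player"), Leaf("run sideways"))))

print(decide(tree, {"hp": 12, "obstacle": False,
                    "near_player": False, "path_clear": True}))
# -> "run towards player"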
Decision Tree Learning
• Based on a set of training examples:
  • Attribute values describing the current situation (from the world or from the character state)
  • Target action the character should take on the basis of those attribute values
• Example: Estimating driving time
  • Attributes:
    • Hour of departure: 8, 9, or 10
    • Weather: sunny, cloudy, or rainy
    • Accident on route: yes or no
    • Stalled car on route: yes or no
  • Desired action: Commute time: short, medium, or long
Decision Tree Building

node BuildTree(examples[] E) {
    if (all examples in E have same target action T) {
        return new leaf with action T
    } else {
        choose “best” attribute A to create branches for examples E
        set question at this node to A
        for (all possible values V for attribute A) {
            create branch with value V
            E_V = all examples in E that have value V for attribute A
            attach BuildTree(E_V) to that branch
        }
    }
}
Decision Tree Building
• Start the tree by calling BuildTree(all examples) to create the root
• Example: Suppose Hour were chosen at the root:
  Root: Examples 1 – 13 (4 short, 2 medium, 7 long)
  • Hour = 10 → Examples 3, 6, 7, 10, 11 (4 short, 0 medium, 1 long) → another question
  • Hour = 9 → Examples 4, 5, 8, 9, 13 (0 short, 2 medium, 3 long) → another question
  • Hour = 8 → Examples 1, 2, 12 (0 short, 0 medium, 3 long) → leaf: Long
ID3 Decision Tree Learning
• Choose the attribute with the highest information gain
• Based on the definition of entropy from information theory
• Entropy over examples E with target actions T:
  Entropy(E) = -Σ_T (|E_T| / |E|) log2(|E_T| / |E|)
  where E_T is the set of examples in E with target action T
• Example: entropy for the training set (4 short examples, 2 medium examples, 7 long examples) =
  -(4/13) log2(4/13) - (2/13) log2(2/13) - (7/13) log2(7/13) ≈ 1.42
ID3 Decision Tree Learning
• Expected entropy remaining after attribute A is used:
  • Sum of the entropies of the child nodes down each branch V, weighted by the fraction of examples down that branch:
    Σ_V (|E_V| / |E|) Entropy(E_V)
• Information gain for attribute A = that value subtracted from the original entropy:
  Gain(A) = Entropy(E) - Σ_V (|E_V| / |E|) Entropy(E_V)
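A compact runnable sketch of ID3 as described above, combining the BuildTree pseudocode with the entropy and information-gain formulas; representing examples as dicts with an "action" key is my own convention, and the tiny data set is invented for illustration (shaped like the Black & White example later in the deck):

import math
from collections import Counter

def entropy(examples):
    """Entropy(E) = -sum over actions T of (|E_T|/|E|) * log2(|E_T|/|E|)."""
    counts = Counter(e["action"] for e in examples)
    return -sum((n / len(examples)) * math.log2(n / len(examples))
                for n in counts.values())

def split(examples, attribute):
    """Partition the examples by their value for the given attribute."""
    groups = {}
    for e in examples:
        groups.setdefault(e[attribute], []).append(e)
    return groups

def gain(examples, attribute):
    """Information gain = entropy before the split - expected entropy after."""
    remainder = sum(len(g) / len(examples) * entropy(g)
                    for g in split(examples, attribute).values())
    return entropy(examples) - remainder

def build_tree(examples, attributes):
    actions = Counter(e["action"] for e in examples)
    if len(actions) == 1 or not attributes:   # pure node, or no attributes left
        return actions.most_common(1)[0][0]   # leaf: the (majority) action
    best = max(attributes, key=lambda a: gain(examples, a))
    rest = [a for a in attributes if a != best]
    return (best, {value: build_tree(group, rest)
                   for value, group in split(examples, best).items()})

# Tiny invented data set in the spirit of the Black & White example:
examples = [
    {"allegiance": "friendly", "defense": "weak",   "action": "no attack"},
    {"allegiance": "friendly", "defense": "weak",   "action": "no attack"},
    {"allegiance": "friendly", "defense": "medium", "action": "no attack"},
    {"allegiance": "friendly", "defense": "strong", "action": "no attack"},
    {"allegiance": "enemy",    "defense": "weak",   "action": "attack"},
    {"allegiance": "enemy",    "defense": "medium", "action": "attack"},
    {"allegiance": "enemy",    "defense": "strong", "action": "no attack"},
]
print(build_tree(examples, ["allegiance", "defense"]))
# ('allegiance', {'friendly': 'no attack',
#                 'enemy': ('defense', {'weak': 'attack', 'medium': 'attack',
#                                       'strong': 'no attack'})})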
Using ID3 in Games
• ID3 is efficient enough for on-line learning
  • Execution time is proportional to the number of examples
• Best applied to games if:
  • There are few possible actions for the NPC to choose from (attack, retreat, defend)
  • Key attributes have few possible values (or continuous values are partitioned into predefined ranges)
  • There is an easy way to determine the desired actions the NPC should take, based on a significant number of player actions
• May require tweaking the rules of the game!
Black & White Game
• Player is given a “creature” at the beginning of the game
• Creature is “trained” by observing player actions in different situations
  • For example, which other armies to attack in which circumstances
• Later in the game (after the creature grows), the creature takes those same actions
Black & White Game
• Sample training set: examples pairing attribute values (such as an army’s allegiance and defense) with the action the player took (attack or no attack)
Black & White Game
• Tree created from the examples:
  Allegiance?
  • friendly → No attack
  • enemy → Defense?
    • weak → Attack
    • medium → Attack
    • strong → No attack
Black & White Game
• Nature of the game is suited to decision tree learning:
  • Large number of examples created by player actions
  • Known target actions, based on player actions
  • Small number of possible actions and attribute values
  • Game specifically designed around the algorithm!