Machine Learning

Machine Learning: A Quick Look • Sources: • Artificial Intelligence – Russell & Norvig • Artificial Intelligence – Luger • By: Héctor Muñoz-Avila

What Is Machine Learning? “Logic is not the end of wisdom, it is just the beginning” --- Spock [Figure: over time, a System interacting with a Game performs Action1, its Knowledge is changed, and it then performs Action2]

Learning: The Big Picture • Two forms of learning: • Supervised: both the input and the correct output of the learning component can be perceived (for example, an experienced player acting as a friendly teacher) • Unsupervised: there is no hint about the correct answers for the learning component (for example, finding clusters in data)

Offline vs. Online Learning • Online: during gameplay • Adapt to player tactics • Avoid repetition of mistakes • Requirements: computationally cheap, effective, robust, fast learning (Spronck 2004) • Offline: between the end of one game and the start of the next • Devise new tactics • Discover exploits

Classification (According to the language representation) • Symbolic • Version Spaces • Decision Trees • Explanation-Based Learning • … • Sub-symbolic • Reinforcement Learning • Connectionist • Evolutionary

Version Space • Idea: learn a concept from a group of instances, some positive and some negative • Example: • target: obj(Size,Color,Shape) • Size = {large, small} • Color = {red, white, blue} • Shape = {ball, brick, cube} • Instances: • +: obj(large,white,ball), obj(small,blue,ball) • −: obj(small,red,brick), obj(large,blue,cube) • Two extreme (tentative) solutions bound the concept space: • too general: obj(X,Y,Z) • in between: … obj(X,Y,ball), obj(large,Y,ball), obj(small,Y,ball) … • too specific: obj(large,white,ball), obj(small,blue,ball)

How Version Space Works • [Figure: two panels of + and − instances, one labeled “If we consider only positives” and the other “If we consider positives and negatives”] • What is the role of the negative instances? To help prevent over-generalization
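A minimal sketch of the version space example, assuming a hypothesis is a tuple in which `None` plays the role of an unbound variable such as X or Y. The helper names `covers`, `generalize` and `most_specific` are illustrative and do not come from the slides. The sketch computes the most specific generalization of the two positive instances and checks that it excludes both negatives, whereas the fully general obj(X,Y,Z) does not.

```python
# None stands for an unbound variable (X, Y, Z in the slides).
POSITIVES = [("large", "white", "ball"), ("small", "blue", "ball")]
NEGATIVES = [("small", "red", "brick"), ("large", "blue", "cube")]


def covers(hypothesis, instance):
    """True if every bound position of the hypothesis equals the instance's value."""
    return all(h is None or h == v for h, v in zip(hypothesis, instance))


def generalize(hypothesis, instance):
    """Minimally generalize the hypothesis so that it also covers the instance."""
    return tuple(h if h == v else None for h, v in zip(hypothesis, instance))


def most_specific(positives):
    """Start from the first positive and generalize over the remaining ones."""
    hypothesis = positives[0]
    for instance in positives[1:]:
        hypothesis = generalize(hypothesis, instance)
    return hypothesis


specific = most_specific(POSITIVES)   # corresponds to obj(X, Y, ball)
too_general = (None, None, None)      # corresponds to obj(X, Y, Z)

print(specific)
print([covers(specific, n) for n in NEGATIVES])
print([covers(too_general, n) for n in NEGATIVES])
```

Without the negative instances nothing would rule out the hypothesis `too_general`, which is the over-generalization the slide warns about.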

Explanation-Based Learning • [Figure: sequences of blocks-world states with blocks A, B and C, some of them marked “?”] • Can we avoid making this error again?

Explanation-Based Learning (2) • [Figure: blocks-world states with blocks A, B and C, some of them marked “?”] • Possible rule: if the initial state is this and the final state is this, don’t do that • More sensible rule: don’t stack anything above a block if the block has to be free in the final state

Motivation # 1: Analysis Tool • Suppose that a gaming company has a database of runs with a beta version of the game: lots of data • How can the company’s developers use this data to figure out good strategies for their AI?

Motivation # 1: Analysis Tool (cont’d) • Games data → Decision Tree induction → “if built center hall & has built 4 workers then build defense tower”

The Knowledge Base in Expert Systems • A knowledge base consists of a collection of IF-THEN rules: • if built center hall & has built 4 workers then build defense tower • if built center hall & mine then upgrade center hall • Knowledge bases of expert systems contain hundreds and sometimes even thousands of such rules • Frequently, rules contradict each other and/or overlap
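A minimal sketch of such a knowledge base, assuming each rule is a pair of a condition set and an action and that the game state is a set of facts. The function name `applicable_actions` and the string encoding of facts are illustrative choices, not part of any real expert-system shell.

```python
# Each rule pairs a set of conditions (IF) with an action (THEN).
RULES = [
    ({"built center hall", "has built 4 workers"}, "build defense tower"),
    ({"built center hall", "mine"}, "upgrade center hall"),
]


def applicable_actions(facts, rules=RULES):
    """Return the action of every rule whose conditions all hold in `facts`."""
    return [action for conditions, action in rules if conditions <= facts]


state = {"built center hall", "has built 4 workers", "mine"}
print(applicable_actions(state))
```

In this state both rules apply at once, which is a small instance of the overlap mentioned above; a real system would need some policy for choosing among them.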

Sample Expert System in Games: Age of Empires http://www.youtube.com/watch?v=GEbnqc82lew
(defrule
    (current-age == dark-age)
    (building-type-count-total mining-camp > 0)
    (not (research-available feudal-age))
=>
    (set-strategic-number sn-food-gatherer-percentage 52)
    (set-strategic-number sn-wood-gatherer-percentage 35)
    (set-strategic-number sn-gold-gatherer-percentage 13)
    (set-strategic-number sn-stone-gatherer-percentage 0)
    (disable-self)
)

Main Drawback of Expert Systems: The Knowledge Acquisition Bottle-Neck • The main problem of expert systems is that acquiring knowledge from human specialists is a difficult, cumbersome and long activity • KB = Knowledge Base • KA = Knowledge Acquisition

Motivation # 2: Avoid Knowledge Acquisition Bottle-Neck • GASOIL is an expert system for designing gas/oil separation systems stationed offshore • The design depends on multiple factors, including: • proportions of gas, oil and water, flow rate, pressure, density, viscosity, temperature and others • Building that system by hand would have taken 10 person-years • It took only 3 person-months using inductive learning! • GASOIL saved BP millions of dollars

Motivation # 2: Avoid Knowledge Acquisition Bottle-Neck (cont’d) • KB = Knowledge Base • KA = Knowledge Acquisition • IDT = Induced Decision Trees

Example of a Decision Tree • [Figure: Boolean decision tree rooted at Patrons? (none, some, full), with further tests on WaitEstimate? (0-10, 10-30, 30-60, >60), Alternate?, Hungry?, Reservation?, Fri/Sat?, Bar? and Raining?; every leaf is labeled yes or no]

Definition of a Decision Tree • A decision tree is a tree where: • The leaves are labeled with classifications (if the classifications are “yes” and “no”, the tree is called a Boolean tree) • The non-leaf nodes are labeled with attributes • The arcs out of a node labeled with an attribute A are labeled with the possible values of A
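The definition maps directly onto a small data structure. The sketch below is one possible encoding, assuming Python dataclasses; the class names `Leaf` and `Node`, the function `classify`, and the two-level tree itself are hypothetical (only the attribute names Patrons and Hungry come from the slides).

```python
from dataclasses import dataclass, field
from typing import Dict, Union


@dataclass
class Leaf:
    classification: str          # e.g. "yes" or "no" for a Boolean tree


@dataclass
class Node:
    attribute: str               # attribute tested at this non-leaf node
    # One arc per possible value of the attribute, leading to a subtree or a leaf.
    branches: Dict[str, Union["Node", Leaf]] = field(default_factory=dict)


def classify(tree, example):
    """Follow the arc matching the example's value until a leaf is reached."""
    while isinstance(tree, Node):
        tree = tree.branches[example[tree.attribute]]
    return tree.classification


# Hypothetical two-level tree used only to exercise the structure.
tree = Node("Patrons", {
    "none": Leaf("no"),
    "some": Leaf("yes"),
    "full": Node("Hungry", {"yes": Leaf("yes"), "no": Leaf("no")}),
})

print(classify(tree, {"Patrons": "full", "Hungry": "no"}))
```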

Induction • Databases: what are the data that match this pattern? • Induction: what is the pattern that matches these data? • [Figure: a database query goes from pattern to data; induction goes from data to pattern]

Induction of Decision Trees • Objective: find a concise decision tree that agrees with the examples • The guiding principle we are going to use is Ockham’s razor: the most likely hypothesis is the simplest one that is consistent with the examples • Problem: finding the smallest decision tree is NP-complete • However, with simple heuristics we can find a small (approximately minimal) decision tree

Induction of Decision Trees: Algorithm • Initially all examples are in the same group • 1. Select the attribute that makes the most difference (i.e., for each value of the attribute, most of the examples are either positive or negative) • 2. Group the examples according to their value for the selected attribute • 3. Repeat from step 1 within each group (recursive call)
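The steps above can be sketched as a short recursive function. This is only one reading of "makes the most difference": an attribute is scored by how many examples agree with the majority label of their group. The function names and the six training examples are hypothetical; entropy-based information gain is a common alternative score.

```python
from collections import Counter


def split(examples, attribute):
    """Group the examples according to their value for the attribute."""
    groups = {}
    for attrs, label in examples:
        groups.setdefault(attrs[attribute], []).append((attrs, label))
    return groups


def majority(examples):
    """Most common label among (attributes, label) pairs."""
    return Counter(label for _, label in examples).most_common(1)[0][0]


def score(examples, attribute):
    """Count the examples that agree with the majority label of their group."""
    return sum(
        Counter(label for _, label in group).most_common(1)[0][1]
        for group in split(examples, attribute).values()
    )


def induce(examples, attributes):
    """Return a label (leaf) or an (attribute, {value: subtree}) pair."""
    labels = {label for _, label in examples}
    if len(labels) == 1 or not attributes:
        return majority(examples)
    best = max(attributes, key=lambda a: score(examples, a))
    rest = [a for a in attributes if a != best]
    return (best, {value: induce(group, rest)
                   for value, group in split(examples, best).items()})


# Hypothetical examples: should the creature attack (True) or not (False)?
EXAMPLES = [
    ({"allegiance": "friendly", "defense": "weak"}, False),
    ({"allegiance": "friendly", "defense": "strong"}, False),
    ({"allegiance": "friendly", "defense": "medium"}, False),
    ({"allegiance": "enemy", "defense": "weak"}, True),
    ({"allegiance": "enemy", "defense": "strong"}, False),
    ({"allegiance": "enemy", "defense": "medium"}, True),
]

print(induce(EXAMPLES, ["allegiance", "defense"]))
```

On this data the sketch tests allegiance first, because it separates the examples more cleanly than defense, and only then splits the enemy group by defense.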

Example

IDT: Example • Let’s compare two candidate attributes: Patrons and Type. Which is the better attribute? • Patrons? • none: x7(-), x11(-) • some: x1(+), x3(+), x6(+), x8(+) • full: x4(+), x12(+), x2(-), x5(-), x9(-), x10(-) • Type? • french: x1(+), x5(-) • italian: x6(+), x10(-) • thai: x4(+), x8(+), x2(-), x11(-) • burger: x3(+), x12(+), x7(-), x9(-)
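One common way to make this comparison concrete is information gain, the reduction in entropy obtained by splitting on an attribute. The slides do not name a measure, so using entropy here is an assumption. The sketch uses only the counts of positive and negative examples per attribute value from the slide; the function names are illustrative.

```python
from math import log2


def entropy(pos, neg):
    """Entropy in bits of a group with `pos` positive and `neg` negative examples."""
    total = pos + neg
    return -sum(c / total * log2(c / total) for c in (pos, neg) if c)


def information_gain(groups):
    """Entropy of the whole set minus the weighted entropy after the split."""
    pos = sum(p for p, _ in groups.values())
    neg = sum(n for _, n in groups.values())
    remainder = sum((p + n) / (pos + neg) * entropy(p, n)
                    for p, n in groups.values())
    return entropy(pos, neg) - remainder


# (positives, negatives) per attribute value, counted from the slide.
PATRONS = {"none": (0, 2), "some": (4, 0), "full": (2, 4)}
TYPE = {"french": (1, 1), "italian": (1, 1), "thai": (2, 2), "burger": (2, 2)}

print(round(information_gain(PATRONS), 3))
print(round(information_gain(TYPE), 3))
```

Under this measure Patrons gains roughly 0.54 bits while Type gains nothing, since every Type group is still half positive and half negative. That makes Patrons the better attribute.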

Decision Trees in Gaming http://www.youtube.com/watch?v=HMdOyUp5Rvk • Black & White, developed by Lionhead Studios, and released in 2001 • Used to predict a player’s reaction to a certain creature’s action • In this model, a greater feedback value means the creature should attack • This is done by inducing a decision tree

Decision Trees in Black & White • Should your creature attack a town?

Decision Trees in Black & White • Allegiance? • Friendly: -1.0 • Enemy: Defense? • Weak: 0.4 • Medium: 0.1 • Strong: -0.3 • Note that this decision tree does not even use the tribe attribute

Decision Trees in Black & White • Now suppose we don’t want the entire decision tree, but just the 2 highest feedback values • We can create a Boolean expression, such as ((Allegiance = Enemy) ^ (Defense = Weak)) v ((Allegiance = Enemy) ^ (Defense = Medium))
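A minimal sketch of how such an expression could be read off a tree. It assumes the tree is stored as nested (attribute, {value: subtree}) pairs with feedback values at the leaves, and that the leaf values are Friendly -1.0, Weak 0.4, Medium 0.1 and Strong -0.3. The function names are illustrative and are not taken from the game's implementation.

```python
# Nested (attribute, {value: subtree}) pairs; leaves are feedback values.
TREE = ("Allegiance", {
    "Friendly": -1.0,
    "Enemy": ("Defense", {"Weak": 0.4, "Medium": 0.1, "Strong": -0.3}),
})


def paths(tree, conditions=()):
    """Yield (feedback, conditions) for every root-to-leaf path."""
    if not isinstance(tree, tuple):
        yield tree, conditions
        return
    attribute, branches = tree
    for value, subtree in branches.items():
        yield from paths(subtree, conditions + ((attribute, value),))


def top_k_expression(tree, k):
    """Disjunction of the conjunctions leading to the k highest feedback leaves."""
    best = sorted(paths(tree), key=lambda p: p[0], reverse=True)[:k]
    clauses = [" ^ ".join(f"({a} = {v})" for a, v in conds) for _, conds in best]
    return " v ".join(f"({c})" for c in clauses)


print(top_k_expression(TREE, 2))
```

Each root-to-leaf path becomes a conjunction of attribute tests, and the selected paths are joined by disjunction, which reproduces the expression on the slide for k = 2.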

Classification (According to the language representation) • Symbolic • Version Space • Decision Trees • Explanation-Based Learning • … • Sub-symbolic • Reinforcement Learning • Connectionist • Evolutionary • Next class
