Concepts of Game Theory I
What are Multi-Agent Systems?
[Figure: diagram labelled with agent, interaction, organisational relationship, spheres of influence, and environment]
A Multi-Agent System Contains:
A number of agents that
• interact through communication
• are able to act in an environment
• have different “spheres of influence” (which may coincide)
• will be linked by other (organisational) relationships.
Utilities of agents (1)
• Assume that we have just two agents: $AG = \{i, j\}$
• Agents are assumed to be self-interested:
  • they have preferences over environmental states
Utilities of agents (2)
• Assume that there is a set of “outcomes” that agents have preferences over: $\Omega = \{\omega_1, \omega_2, \ldots\}$
  • Example: odd-or-even game (an alternative to heads-or-tails): $\Omega = \{(0,0), \ldots, (0,5), (1,0), \ldots, (1,5), \ldots, (5,0), \ldots, (5,5)\}$
• These preferences are captured by utility functions: $u_i : \Omega \to \mathbb{R}$ and $u_j : \Omega \to \mathbb{R}$
  • Example: odd-or-even game (an alternative to heads-or-tails):
    $u_{even}((0,0)) = 1$, $u_{even}((0,1)) = 0$, $u_{even}((0,2)) = 1$, …
    $u_{odd}((0,0)) = 0$, $u_{odd}((0,1)) = 1$, $u_{odd}((0,2)) = 0$, …
  • Or, more simply:
    $u_{even}((m,n)) = 1$ if $m + n$ is an even number; otherwise 0
    $u_{odd}((m,n)) = 0$ if $m + n$ is an even number; otherwise 1
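The odd-or-even example can be written out directly. This is a minimal sketch, assuming each player picks a whole number from 0 to 5; the names `OUTCOMES`, `u_even` and `u_odd` are chosen here for illustration.

```python
from itertools import product

# Outcome set: every pair (m, n) with m and n between 0 and 5.
OUTCOMES = list(product(range(6), repeat=2))


def u_even(outcome):
    """Utility of the 'even' player: 1 if m + n is even, otherwise 0."""
    m, n = outcome
    return 1 if (m + n) % 2 == 0 else 0


def u_odd(outcome):
    """Utility of the 'odd' player: 0 if m + n is even, otherwise 1."""
    return 1 - u_even(outcome)


if __name__ == "__main__":
    print(len(OUTCOMES))                   # 36 outcomes
    print(u_even((0, 2)), u_odd((0, 2)))   # 1 0
```

Exactly one of the two players scores 1 in every outcome, which matches the closed-form definitions on the slide.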
Utilities of agents (3)
• Utility functions lead to preference orderings over outcomes:
  $\omega \succeq_i \omega'$ means $u_i(\omega) \geq u_i(\omega')$
  $\omega \succeq_j \omega'$ means $u_j(\omega) \geq u_j(\omega')$
• But what is utility?
• In some domains, utility is analogous to money; e.g. we could have a relationship like this:
[Figure: plot of utility against money]
Agent Encounters
• To investigate agent encounters we need a model of the environment in which agents act:
  • agents simultaneously choose an action to perform;
  • the actions they select will result in an outcome $\omega \in \Omega$;
  • the actual outcome depends on the combination of actions.
• Assume each agent has just two possible actions it can perform:
  • C (“cooperate”)
  • D (“defect”).
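The State Transformer Function
• Let’s formalise environment behaviour as: $\tau : Ac_i \times Ac_j \to \Omega$
• Some possibilities:
  • The environment is sensitive to the actions of both agents:
    $\tau(D,D) = \omega_1$, $\tau(D,C) = \omega_2$, $\tau(C,D) = \omega_3$, $\tau(C,C) = \omega_4$
  • Neither agent has influence on the environment:
    $\tau(D,D) = \tau(D,C) = \tau(C,D) = \tau(C,C) = \omega_1$
  • The environment is controlled by agent j:
    $\tau(D,D) = \omega_1$, $\tau(D,C) = \omega_2$, $\tau(C,D) = \omega_1$, $\tau(C,C) = \omega_2$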
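The three state transformer functions above can be sketched as lookup tables. This is an illustrative sketch only: the outcome labels `"w1"` to `"w4"` stand for the outcomes on the slide, keys are assumed to be (action of i, action of j), and the helper `influences` is a hypothetical name introduced here.

```python
from itertools import product

ACTIONS = ("C", "D")

# Keys are (action of i, action of j); values are outcome labels.
tau_both = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
tau_none = {pair: "w1" for pair in product(ACTIONS, repeat=2)}
tau_j_only = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}


def influences(tau, agent):
    """Return True if the agent (0 for i, 1 for j) can change the outcome
    by changing only its own action, for some fixed action of the other agent."""
    other = 1 - agent
    for fixed in ACTIONS:
        reachable = set()
        for own in ACTIONS:
            pair = [None, None]
            pair[agent] = own
            pair[other] = fixed
            reachable.add(tau[tuple(pair)])
        if len(reachable) > 1:
            return True
    return False


if __name__ == "__main__":
    for name, tau in [("both", tau_both), ("none", tau_none), ("j only", tau_j_only)]:
        print(name, influences(tau, 0), influences(tau, 1))
```

In the third table the outcome depends only on the second component of the action pair, which is why only agent j is reported as having influence.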
Rational Action (1)
• Suppose an environment in which both agents can influence the outcome, with these utility functions:
  $u_i(\omega_1) = 1$, $u_i(\omega_2) = 1$, $u_i(\omega_3) = 4$, $u_i(\omega_4) = 4$
  $u_j(\omega_1) = 1$, $u_j(\omega_2) = 4$, $u_j(\omega_3) = 1$, $u_j(\omega_4) = 4$
• Including choices made by the agents:
  $u_i(D,D) = 1$, $u_i(D,C) = 1$, $u_i(C,D) = 4$, $u_i(C,C) = 4$
  $u_j(D,D) = 1$, $u_j(D,C) = 4$, $u_j(C,D) = 1$, $u_j(C,C) = 4$
Rational Action (2)
• Then, the preferences of agent i are:
  $(C,C) \succeq_i (C,D) \succ_i (D,C) \succeq_i (D,D)$
• “C” is the rational choice for i:
  • Agent i prefers outcomes that arise through C over all outcomes that arise through D.
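The ordering for agent i can be derived mechanically from its utilities. The sketch below assumes the utilities from the “Rational Action (1)” slide, keyed by (action of i, action of j); `preference_chain` is a hypothetical helper, and `>=` and `>` stand in for weak and strict preference.

```python
# Utilities of agent i, keyed by (action of i, action of j).
U_I = {("C", "C"): 4, ("C", "D"): 4, ("D", "C"): 1, ("D", "D"): 1}


def preference_chain(u):
    """Rank outcomes from most to least preferred and join them with
    '>' (strictly preferred) or '>=' (tied, so only weakly preferred)."""
    ranked = sorted(u, key=u.get, reverse=True)  # stable: ties keep dict order
    parts = ["({},{})".format(*ranked[0])]
    for prev, cur in zip(ranked, ranked[1:]):
        parts.append(">" if u[prev] > u[cur] else ">=")
        parts.append("({},{})".format(*cur))
    return " ".join(parts)


if __name__ == "__main__":
    print(preference_chain(U_I))  # (C,C) >= (C,D) > (D,C) >= (D,D)
```

The two best-ranked outcomes both arise when i plays C, which is the observation behind calling C the rational choice.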
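Pay-off Matrices
• We can characterise this scenario (and similar scenarios) as a pay-off matrix, where each cell gives $(u_i, u_j)$:

                  i defects   i cooperates
  j defects       (1, 1)      (4, 1)
  j cooperates    (1, 4)      (4, 4)

• Agent i is the column player
• Agent j is the row player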
Dominant Strategies
• Given any particular strategy s (either C or D) for agent i, there will be a number of possible outcomes
• $s_1$ dominates $s_2$ if every outcome possible by i playing $s_1$ is preferred over every outcome possible by i playing $s_2$
• A rational agent will never play a strategy that is dominated by another strategy
• However, there isn’t always a unique strategy that dominates all other strategies…
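The dominance test as stated on this slide can be sketched as follows. It assumes that “preferred” means strictly preferred, and it reuses agent i’s utilities from the “Rational Action (1)” slide; `outcomes_through` and `dominates` are names introduced here.

```python
ACTIONS = ("C", "D")

# Utilities of agent i, keyed by (action of i, action of j).
U_I = {("C", "C"): 4, ("C", "D"): 4, ("D", "C"): 1, ("D", "D"): 1}


def outcomes_through(u, strategy):
    """Utilities of every outcome that can arise when i plays `strategy`."""
    return [u[(strategy, other)] for other in ACTIONS]


def dominates(u, s1, s2):
    """s1 dominates s2 if the worst outcome through s1 still beats
    the best outcome through s2 (strict preference is an assumption)."""
    return min(outcomes_through(u, s1)) > max(outcomes_through(u, s2))


if __name__ == "__main__":
    print(dominates(U_I, "C", "D"))  # True
    print(dominates(U_I, "D", "C"))  # False
```

Comparing the worst case of one strategy with the best case of the other is equivalent to comparing every pair of outcomes. Many textbooks use a weaker test that compares the two strategies separately for each action of the other agent.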
Nash Equilibrium
• Two strategies $s_1$ and $s_2$ are in Nash equilibrium if:
  • under the assumption that agent i plays $s_1$, agent j can do no better than play $s_2$; and
  • under the assumption that agent j plays $s_2$, agent i can do no better than play $s_1$.
• Neither agent has any incentive to deviate from a Nash equilibrium!
• Unfortunately:
  • Not every interaction has a Nash equilibrium in pure strategies
  • Some interactions have more than one Nash equilibrium…
John Forbes Nash, Jr: http://www.math.princeton.edu/jfnj/
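The two conditions above translate into a brute-force search over action pairs. This is a sketch that only considers pure strategies; `pure_nash_equilibria` is a hypothetical helper, and the two example games reuse utilities from the earlier slides.

```python
from itertools import product


def pure_nash_equilibria(actions_i, actions_j, u_i, u_j):
    """Return every pair (a_i, a_j) from which neither agent can gain
    by changing only its own action."""
    equilibria = []
    for a_i, a_j in product(actions_i, actions_j):
        i_cannot_improve = all(u_i(a_i, a_j) >= u_i(alt, a_j) for alt in actions_i)
        j_cannot_improve = all(u_j(a_i, a_j) >= u_j(a_i, alt) for alt in actions_j)
        if i_cannot_improve and j_cannot_improve:
            equilibria.append((a_i, a_j))
    return equilibria


# Game from the "Rational Action" slides, keyed by (action of i, action of j).
U_I = {("D", "D"): 1, ("D", "C"): 1, ("C", "D"): 4, ("C", "C"): 4}
U_J = {("D", "D"): 1, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 4}


def u_i_cd(a, b):
    return U_I[(a, b)]


def u_j_cd(a, b):
    return U_J[(a, b)]


# Odd-or-even game: i is taken to be the 'even' player (an assumption).
def u_even(m, n):
    return 1 if (m + n) % 2 == 0 else 0


def u_odd(m, n):
    return 1 - u_even(m, n)


if __name__ == "__main__":
    print(pure_nash_equilibria(("C", "D"), ("C", "D"), u_i_cd, u_j_cd))  # [('C', 'C')]
    print(pure_nash_equilibria(range(6), range(6), u_even, u_odd))       # []
```

The odd-or-even game illustrates the first caveat: whoever is losing in a given outcome can always switch parity, so no pair of pure strategies is stable.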
Competitive and Zero-Sum Interactions
• When preferences of agents are diametrically opposed we have strictly competitive scenarios
• Zero-sum encounters have utilities which sum to zero: $\forall \omega \in \Omega,\ u_i(\omega) + u_j(\omega) = 0$
• Zero sum implies strictly competitive
• Zero-sum encounters in real life are very rare
• However, people tend to act in many scenarios as if they were zero sum.
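Both properties can be checked by enumeration. In this sketch, “diametrically opposed” is formalised as: i strictly prefers one outcome to another exactly when j strictly prefers the reverse. That formalisation is an assumption, as is the rescaled win/lose variant of the odd-or-even game with utilities +1 and -1.

```python
from itertools import product

OUTCOMES = list(product(range(6), repeat=2))  # odd-or-even outcomes (m, n)


def is_zero_sum(outcomes, u_i, u_j):
    """True if the two utilities sum to zero in every outcome."""
    return all(u_i(w) + u_j(w) == 0 for w in outcomes)


def is_strictly_competitive(outcomes, u_i, u_j):
    """True if i strictly prefers a to b exactly when j strictly prefers b to a."""
    return all((u_i(a) > u_i(b)) == (u_j(b) > u_j(a))
               for a in outcomes for b in outcomes)


# Utilities as on the slides: 1 for the winner, 0 for the loser.
def u_even(w):
    return 1 if sum(w) % 2 == 0 else 0


def u_odd(w):
    return 1 - u_even(w)


# Hypothetical rescaling: +1 for the winner, -1 for the loser.
def v_even(w):
    return 1 if sum(w) % 2 == 0 else -1


def v_odd(w):
    return -v_even(w)


if __name__ == "__main__":
    print(is_zero_sum(OUTCOMES, u_even, u_odd), is_strictly_competitive(OUTCOMES, u_even, u_odd))
    print(is_zero_sum(OUTCOMES, v_even, v_odd), is_strictly_competitive(OUTCOMES, v_even, v_odd))
```

With the 1/0 utilities the game is strictly competitive but its utilities sum to one, so it is not zero-sum as defined above; the rescaled version satisfies both checks.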
The Prisoner’s Dilemma
• Two people are collectively charged with a crime
  • Held in separate cells
  • No way of meeting or communicating
• They are told that:
  • if one confesses and the other does not, the confessor will be freed, and the other will be jailed for three years;
  • if both confess, both will be jailed for two years;
  • if neither confesses, both will be jailed for one year.
Albert W. Tucker
The Prisoner’s Dilemma Pay-Off Matrix
• Defect = confess; Cooperate = not confess
• Numbers in the pay-off matrix are not years in jail
  • They capture how good an outcome is for the agents
  • The shorter the jail term, the better
• The utilities thus are:
  $u_i(D,D) = 2$, $u_i(D,C) = 5$, $u_i(C,D) = 0$, $u_i(C,C) = 3$
  $u_j(D,D) = 2$, $u_j(D,C) = 0$, $u_j(C,D) = 5$, $u_j(C,C) = 3$
• The preferences are:
  $(D,C) \succ_i (C,C) \succ_i (D,D) \succ_i (C,D)$
  $(C,D) \succ_j (C,C) \succ_j (D,D) \succ_j (D,C)$
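The Prisoner’s Dilemma Pay-Off Matrix
Each cell gives $(u_i, u_j)$; Defect = confess, Coop = not confess:

                  i defects   i cooperates
  j defects       (2, 2)      (0, 5)
  j cooperates    (5, 0)      (3, 3)

• Top left: both defect, and both get a pay-off of 2.
• Top right: i cooperates and j defects; i gets the sucker’s pay-off of 0, while j gets 5.
• Bottom left is the opposite: i defects and j cooperates, so i gets 5 and j gets 0.
• Bottom right: reward for mutual cooperation, a pay-off of 3 each.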
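The dilemma itself can be seen by computing each agent’s best response. This is a minimal sketch using the utilities listed on the previous slide, keyed by (action of i, action of j); the helper names are chosen here for illustration.

```python
ACTIONS = ("C", "D")

# Keys are (action of i, action of j); values are (u_i, u_j).
PAYOFF = {
    ("D", "D"): (2, 2),
    ("D", "C"): (5, 0),
    ("C", "D"): (0, 5),
    ("C", "C"): (3, 3),
}


def best_response_i(action_j):
    """Action of i that maximises u_i when j's action is fixed."""
    return max(ACTIONS, key=lambda a: PAYOFF[(a, action_j)][0])


def best_response_j(action_i):
    """Action of j that maximises u_j when i's action is fixed."""
    return max(ACTIONS, key=lambda a: PAYOFF[(action_i, a)][1])


if __name__ == "__main__":
    for other in ACTIONS:
        print(other, best_response_i(other), best_response_j(other))
    print(PAYOFF[("D", "D")], PAYOFF[("C", "C")])  # (2, 2) (3, 3)
```

Defecting is each agent’s best response whatever the other does, so both end up at (D,D) with a pay-off of 2 each, even though mutual cooperation would give each of them 3.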