Multiagent Systems Solution Concepts for Normal Form (Matrix) Games
Solution Concepts • Solution concepts define “interesting” outcomes in a multiagent game, e.g.: • Pareto-optimal outcomes • Pure-strategy Nash equilibrium • Mixed-strategy Nash equilibrium • Maxmin strategy profile • Minmax strategy profile • Important questions • Do they exist? • How can they be computed?
Zero-Sum Games • Zero-sum games are games in which the payoffs of all agents sum to 0 for every strategy profile. • Any game in which a positive affine transformation of the utility functions, u’ = k·u + l with k > 0, yields a zero-sum game can itself be treated as a zero-sum game. • A 2-player zero-sum game is completely antagonistic: • u1 = −u2
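As a minimal sketch of the definition above, the following Python snippet checks the zero-sum property for Matching Pennies and shows a small constant-sum game (an assumed example, not from the slides) becoming zero-sum under an affine shift:

```python
# Sketch: verifying the zero-sum property for Matching Pennies, and turning
# an assumed constant-sum game into a zero-sum game with an affine shift.
matching_pennies = {
    ("H", "H"): (1, -1), ("H", "T"): (-1, 1),
    ("T", "H"): (-1, 1), ("T", "T"): (1, -1),
}
assert all(u1 + u2 == 0 for u1, u2 in matching_pennies.values())

# A constant-sum game (payoffs always sum to 2) becomes zero-sum under the
# transformation u' = k*u + l with k = 1 and l = -1 applied to both players.
constant_sum = {("A", "A"): (2, 0), ("A", "B"): (0, 2),
                ("B", "A"): (1, 1), ("B", "B"): (2, 0)}
zero_sum = {a: (u1 - 1, u2 - 1) for a, (u1, u2) in constant_sum.items()}
assert all(u1 + u2 == 0 for u1, u2 in zero_sum.values())
```

The affine shift rescales utilities without changing any player's preferences, which is why the transformed game has the same strategic structure.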
Nash Equilibria in 2-Player Zero-Sum Games • In 2-player zero-sum games, Nash equilibria coincide with maxmin and minmax strategies • A solution can be found by searching through the possible supports of the mixed strategies • Exponential complexity • The maxmin solution can be formulated as a linear program • By the minimax theorem, agent 1’s maxmin value equals the value agent 2’s minmax strategy holds agent 1 to (the value of the game)
Linear Programming • Linear programs are optimization problems with a linear objective function subject to linear inequality constraints, e.g.: maximize Σi ci·xi subject to Σi aji·xi ≤ bj for all j, xi ≥ 0 • The xi are the variables to solve for
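As a concrete toy instance, the sketch below solves a tiny assumed LP by enumerating the vertices of the feasible region (a bounded LP attains its optimum at a vertex). This is only practical for toy problems; real solvers use simplex or interior-point methods:

```python
from itertools import combinations

# Sketch: maximize x1 + 2*x2 subject to x1 + x2 <= 4, x1 <= 2, x1, x2 >= 0.
# Each constraint row (a1, a2, b) means a1*x1 + a2*x2 <= b.
constraints = [(1, 1, 4), (1, 0, 2), (-1, 0, 0), (0, -1, 0)]

def intersect(c1, c2):
    """Solve the 2x2 system where both constraint boundaries hold with equality."""
    (a, b, e), (c, d, f) = c1, c2
    det = a * d - b * c
    if abs(det) < 1e-12:
        return None  # parallel boundaries, no unique intersection
    return ((e * d - b * f) / det, (a * f - e * c) / det)

def feasible(x):
    return all(a * x[0] + b * x[1] <= c + 1e-9 for a, b, c in constraints)

# Candidate vertices: feasible intersections of pairs of constraint boundaries.
vertices = [p for c1, c2 in combinations(constraints, 2)
            if (p := intersect(c1, c2)) and feasible(p)]
best = max(vertices, key=lambda p: p[0] + 2 * p[1])
print(best)  # the optimal vertex of the feasible polytope
```

The optimum here is x1 = 0, x2 = 4 with objective value 8.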
2-Player Zero-Sum Game • In 2-player zero-sum games, maxmin and minmax can be expressed as linear programs • maxmin for agent 1 or minmax for agent 2: maximize v subject to Σj xj·u1(a1j, a2k) ≥ v for every action a2k of agent 2, Σj xj = 1, xj ≥ 0, where xj is the probability agent 1 assigns to action a1j
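For a 2x2 zero-sum game this linear program can be solved by hand: the interior solution equalizes the payoff of the opponent's two columns, and the only other candidates are the pure strategies. A minimal sketch, using an illustrative payoff matrix (not one from the slides):

```python
# Sketch: maxmin for a 2x2 zero-sum game, solved by hand instead of as a
# general linear program.  A is the row player's payoff matrix (illustrative).
A = [[3, -1],
     [-2, 4]]

def worst_case(p, A):
    """Row player's expected payoff against the column player's best reply,
    when the row player mixes [p, 1-p] over her two actions."""
    return min(p * A[0][j] + (1 - p) * A[1][j] for j in range(2))

# Candidates: both pure strategies, plus the mix that equalizes the payoff
# of the two columns (the interior LP solution, if one exists in [0, 1]).
candidates = [0.0, 1.0]
denom = A[0][0] - A[1][0] - A[0][1] + A[1][1]
if denom != 0:
    p_star = (A[1][1] - A[1][0]) / denom
    if 0 <= p_star <= 1:
        candidates.append(p_star)

p = max(candidates, key=lambda q: worst_case(q, A))
print(p, worst_case(p, A))  # maxmin strategy and the value of the game
```

For this matrix the maxmin strategy is p = 0.6 and the value of the game is 1.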
2-Player Zero-Sum Game • minmax for agent 1 or maxmin for agent 2: minimize v subject to Σk yk·u1(a1j, a2k) ≤ v for every action a1j of agent 1, Σk yk = 1, yk ≥ 0, where yk is the probability agent 2 assigns to action a2k
Maxmin Strategies for General Games • maxmin strategies can be computed for general-sum games • A maxmin strategy depends only on the agent’s own utility function • Computation by re-designing the game • Construct a zero-sum game G’ by keeping the agent’s payoffs and replacing the other agent’s payoffs with their negation (u2’ = −u1) • The maxmin solution for the agent in the modified game G’ is the same as in the original game • A maxmin strategy in a general-sum game is not necessarily a Nash equilibrium
Nash Equilibria for 2-Player Non-Zero-Sum Games • For non-zero-sum games, linear programming no longer works because there is no single objective function • Computing Nash equilibria in general-sum games is PPAD-complete; no polynomial-time algorithm is known • Known algorithms are worst-case exponential • 2-player general-sum games can be formulated as Linear Complementarity Problems (LCP) • An LCP is a constraint satisfaction problem without an objective function • The formulation considers both players’ utility functions
LCP Formulation for 2-Player Non-Zero-Sum Games • The LCP combines the best-response constraints of both agents and adds complementarity constraints: for every action a1j of player 1, Σk u1(a1j, a2k)·yk + rj = v1 with slack rj ≥ 0 and probability xj ≥ 0, plus the complementarity condition xj·rj = 0 (an action is either in the support or has positive slack), and symmetric constraints for player 2
Nash Equilibria for 2-Player Non-Zero-Sum Games • The LCP for 2-player games can be solved, e.g., using the Lemke-Howson algorithm • Moves along the edges of a labeled graph in strategy space • Worst-case exponential • Another approach is to search the space of supports • Determine which sets of actions could yield mixed Nash equilibria • Compute the corresponding probabilities and verify that the result is an equilibrium
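The support-search idea can be sketched for a 2x2 game, where the only candidate supports are the four pure profiles and the full support. A minimal sketch, using Battle of the Sexes as an assumed example (not a game from the slides):

```python
from itertools import product

# Sketch: support enumeration for a 2x2 general-sum game (Battle of the
# Sexes).  p = probability the row player plays row 0; q = probability the
# column player plays column 0.
U1 = [[2, 0], [0, 1]]  # row player's payoffs
U2 = [[1, 0], [0, 2]]  # column player's payoffs

def u1(p, q):
    return (p * q * U1[0][0] + p * (1 - q) * U1[0][1]
            + (1 - p) * q * U1[1][0] + (1 - p) * (1 - q) * U1[1][1])

def u2(p, q):
    return (p * q * U2[0][0] + p * (1 - q) * U2[0][1]
            + (1 - p) * q * U2[1][0] + (1 - p) * (1 - q) * U2[1][1])

def is_equilibrium(p, q, eps=1e-9):
    # No player may gain by deviating to either of her pure strategies.
    return (u1(p, q) >= max(u1(0, q), u1(1, q)) - eps
            and u2(p, q) >= max(u2(p, 0), u2(p, 1)) - eps)

# Pure supports: all four pure strategy profiles.
candidates = [(p, q) for p, q in product([0.0, 1.0], repeat=2)]

# Full support: solve both players' indifference conditions.
d1 = U1[0][0] - U1[0][1] - U1[1][0] + U1[1][1]
d2 = U2[0][0] - U2[1][0] - U2[0][1] + U2[1][1]
if d1 != 0 and d2 != 0:
    q = (U1[1][1] - U1[0][1]) / d1  # makes the row player indifferent
    p = (U2[1][1] - U2[1][0]) / d2  # makes the column player indifferent
    if 0 <= p <= 1 and 0 <= q <= 1:
        candidates.append((p, q))

equilibria = [(p, q) for p, q in candidates if is_equilibrium(p, q)]
print(equilibria)  # two pure equilibria and one fully mixed equilibrium
```

Battle of the Sexes has three equilibria: the two pure coordination outcomes and the mixed equilibrium (p, q) = (2/3, 1/3).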
Domination and Strategy Removal • Strategies of a player can be compared by domination • si strictly dominates si’ for player i if ui(si, s−i) > ui(si’, s−i) for all strategy profiles s−i of the other players • si weakly dominates si’ for player i if ui(si, s−i) ≥ ui(si’, s−i) for all s−i, with strict inequality for at least one s−i • si very weakly dominates si’ for player i if ui(si, s−i) ≥ ui(si’, s−i) for all s−i
Domination and Nash Equilibria • A dominant strategy is a strategy that dominates all of the player’s other strategies • A strategy profile consisting of dominant strategies for all players must be a Nash equilibrium
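The classic illustration is the Prisoner's Dilemma, sketched below with an assumed payoff matrix (any matrix with the same ordering of payoffs works):

```python
# Sketch: dominant strategies in the Prisoner's Dilemma (assumed payoffs;
# 0 = Cooperate, 1 = Defect).
U1 = [[3, 0], [5, 1]]  # row player's payoffs
U2 = [[3, 5], [0, 1]]  # column player's payoffs

# Defect strictly dominates Cooperate for both players:
assert all(U1[1][c] > U1[0][c] for c in range(2))
assert all(U2[r][1] > U2[r][0] for r in range(2))

# Hence (Defect, Defect) is a Nash equilibrium: neither player can gain
# by a unilateral deviation to Cooperate.
assert U1[1][1] >= U1[0][1] and U2[1][1] >= U2[1][0]
```

Note that the dominant-strategy equilibrium here is not Pareto-optimal: mutual cooperation would give both players more.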
Iterated Removal of Dominated Strategies • A strictly dominated strategy is never played in any Nash equilibrium • All strictly dominated strategies can therefore be removed while still preserving the solutions of the game • Iterated removal of dominated strategies repeats the removal process until no dominated strategies remain
Iterated Removal of Dominated Strategies • M is dominated by [0.5:U; 0.5:D] for player 1 • R is dominated by L for player 2
Iterated Removal of Dominated Strategies • Iterated removal preserves Nash equilibria • Strict dominance preserves all equilibria • Weak or very weak dominance preserves at least one equilibrium, and the removal order can influence which equilibria survive • Iterated removal can be used as a preprocessing step for Nash equilibrium computation
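The removal process can be sketched in a few lines. The 3x3 game below is an illustrative example, not the game from the slides, and only domination by pure strategies is checked (finding mixed dominators, as in the [0.5:U; 0.5:D] example, would require a linear program):

```python
# Sketch: iterated removal of strictly dominated pure strategies on an
# assumed 3x3 game; removal here cascades down to a single profile.
U1 = [[4, 1, 0],   # row player's payoffs
      [3, 2, 1],
      [2, 0, 3]]
U2 = [[3, 1, 0],   # column player's payoffs
      [2, 1, 0],
      [1, 2, 1]]

rows, cols = {0, 1, 2}, {0, 1, 2}

def dominated_row(r):
    # r is strictly dominated if some other remaining row beats it
    # against every remaining column.
    return any(all(U1[s][c] > U1[r][c] for c in cols)
               for s in rows if s != r)

def dominated_col(c):
    return any(all(U2[r][s] > U2[r][c] for r in rows)
               for s in cols if s != c)

changed = True
while changed:
    changed = False
    for r in list(rows):
        if len(rows) > 1 and dominated_row(r):
            rows.discard(r)
            changed = True
    for c in list(cols):
        if len(cols) > 1 and dominated_col(c):
            cols.discard(c)
            changed = True
print(sorted(rows), sorted(cols))  # surviving strategies of each player
```

For this matrix the removals alternate between the players until only the profile (row 0, column 0) survives, which is therefore the unique Nash equilibrium.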
Correlated Equilibria • In coordination games in particular, mixed-strategy Nash equilibria often perform poorly • The independent randomization frequently ends up in low-payoff outcomes • What could solve this?
Correlated Equilibria • A solution is to correlate the players’ random choices • The correlating device has to be outside the control and insight of each player, or they could exploit it and change their strategy • A central random variable with a common, known distribution sends a private signal to each player • Each signal is correlated with the other signals but does not determine the other players’ signals
Correlated Equilibria • A correlated equilibrium is a tuple (v, π, σ) • v is a vector of random variables with domains D • π is the joint distribution of v • σ is a vector of mappings from D to actions in A • For every agent i and every alternative mapping σi’: Σd π(d)·ui(σ(d)) ≥ Σd π(d)·ui(σi’(di), σ−i(d−i))
Correlated Equilibria • For every Nash equilibrium there exists a corresponding correlated equilibrium • If the joint distribution is decoupled into independent randomizations, one per player, and each mapping simply plays the action indicated by its signal, the correlated equilibrium reduces to a Nash equilibrium • Not every correlated equilibrium is a Nash equilibrium
Computing Correlated Equilibria • Linear programming constraints for a CE: variables p(a) for every joint action a, with p(a) ≥ 0, Σa p(a) = 1, and for every player i and every pair of actions ai, ai’: Σa−i p(ai, a−i)·[ui(ai, a−i) − ui(ai’, a−i)] ≥ 0 • Objective function can be chosen freely, e.g. social welfare • The result is not necessarily a Nash equilibrium
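The sketch below checks the CE constraints for a given distribution in the game of Chicken (a standard textbook payoff choice, assumed here); finding a welfare-maximizing CE would feed these same linear constraints into an LP solver:

```python
from fractions import Fraction as F

# Sketch: checking the linear CE constraints for the game of Chicken
# (assumed payoffs; actions: 0 = Dare, 1 = Chicken).
U = [[[0, 7], [2, 6]],   # row player:    U[0][a_row][a_col]
     [[0, 2], [7, 6]]]   # column player: U[1][a_row][a_col]

# Candidate correlated distribution over joint actions: a mediator never
# recommends (Dare, Dare) and recommends the other three profiles equally.
p = {(0, 0): F(0), (0, 1): F(1, 3), (1, 0): F(1, 3), (1, 1): F(1, 3)}

def is_correlated_eq(p, U):
    # A player told to play a must not gain by deviating to b, for all a, b.
    for a in range(2):
        for b in range(2):
            if sum(p[(a, c)] * (U[0][a][c] - U[0][b][c]) for c in range(2)) < 0:
                return False  # row player prefers deviating
            if sum(p[(c, a)] * (U[1][c][a] - U[1][c][b]) for c in range(2)) < 0:
                return False  # column player prefers deviating
    return True

assert is_correlated_eq(p, U)
welfare = sum(prob * (U[0][a][b] + U[1][a][b]) for (a, b), prob in p.items())
print(welfare)  # the CE's social welfare
```

This CE achieves social welfare 10, better than the symmetric mixed Nash equilibrium of this game (welfare 28/3), illustrating why correlation helps in coordination games.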
Computing Correlated Equilibria • CE are easier to compute than Nash equilibria • A CE involves a single joint randomization, whereas an NE involves a product of independent probabilities • The analogous constraints for an NE contain products of the players’ probabilities; this makes them nonlinear, so no linear program applies
Computing Nash Equilibria • No algorithm is known that computes Nash equilibria for n-player general-sum games in polynomial time • A number of iterative algorithms provide good approximations • Multiagent learning can lead to efficient approximate solutions