The Price of Stochastic Anarchy
Load Balancing on Unrelated Machines
• n players, each with a job to run, chooses one of m machines to run it on.
• Each player's goal is to minimize her job's finish time.
• NOTE: the finish time of a job is equal to the load on the machine where the job is run.
Unbounded Price of Anarchy in the Load Balancing Game on Unrelated Machines
• Price of Anarchy (POA) measures the cost of having no central authority.
• Let an optimal assignment under centralized authority be one in which the makespan is minimized.
• POA = (makespan at worst Nash) / (makespan at OPT)
• Bad POA instance: 2 players and 2 machines (L and R), with costs:

        L   R
  job 1 δ   1
  job 2 1   δ

• OPT here (job 1 on L, job 2 on R) costs δ.
• The worst Nash (job 1 on R, job 2 on L) costs 1: neither player will deviate, since moving would mean sharing a machine for a load of 1 + δ.
• Price of Anarchy: 1/δ, which is unbounded as δ → 0.
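As a quick sanity check, here is a minimal Python sketch (the helper names are illustrative, not from the paper) that enumerates all four assignments of this instance, identifies the Nash equilibria, and computes the Price of Anarchy:

```python
from itertools import product

def poa_for_instance(delta=0.01):
    # cost[job][machine]: job 1 is fast on L, job 2 is fast on R
    cost = {1: {"L": delta, "R": 1.0}, 2: {"L": 1.0, "R": delta}}

    def makespan(assignment):
        # load on a machine = total cost of the jobs assigned to it
        loads = {"L": 0.0, "R": 0.0}
        for job, machine in assignment.items():
            loads[machine] += cost[job][machine]
        return max(loads.values())

    def is_nash(assignment):
        # no player can lower her finish time by moving alone
        for job, machine in assignment.items():
            here = sum(cost[j][machine] for j, m in assignment.items() if m == machine)
            for other in ("L", "R"):
                if other == machine:
                    continue
                there = cost[job][other] + sum(
                    cost[j][other] for j, m in assignment.items()
                    if m == other and j != job)
                if there < here:
                    return False
        return True

    profiles = [dict(zip((1, 2), p)) for p in product("LR", repeat=2)]
    opt = min(makespan(a) for a in profiles)
    worst_nash = max(makespan(a) for a in profiles if is_nash(a))
    return worst_nash / opt

print(poa_for_instance(0.01))  # approximately 1/delta = 100
```

The worst Nash (job 1 on R, job 2 on L) has makespan 1 while OPT has makespan δ, so the ratio blows up as δ shrinks.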
Drawbacks of Price of Anarchy
• A solution characterization with no road map: if there is more than one Nash, we don't know which one will be reached.
• Strong assumptions must be made about the players: e.g., fully informed and fully convinced of one another's "rationality."
• Nash equilibria are sometimes very brittle, making POA results feel overly pessimistic.
Evolutionary Game Theory
"I dispense with the notion that people fully understand the structure of the games they play, that they have a coherent model of others' behavior, that they can make rational calculations of infinite complexity, and that all of this is common knowledge. Instead I postulate a world in which people base their decisions on limited data, use simple predictive models, and sometimes do unexplained or even foolish things." – P. Young, Individual Strategy and Social Structure, 1998
• Young (1993) specified a model of adaptive play that allows us to predict which solutions will be chosen in the long run by self-interested decision-making agents with limited information and resources.
Adaptive Play Example
• In each round of play, each player uses some simple, reasonable dynamics to decide which strategy to play. E.g.:
• Imitation dynamics: sample s of the last mem strategies I played, and play the strategy whose average payoff was highest (breaking ties uniformly at random).
• Best response dynamics: sample the other player's realized strategy in s of the last mem rounds, assume this sample represents the probability distribution of what the other player will play next round, and play a strategy that is a best response (minimizes my expected cost). A minimal code sketch of this rule follows below.
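A minimal Python sketch of the best-response rule, assuming the 2×2 instance above with δ = 0.01 (the names COST and best_response are illustrative, not from the paper):

```python
import random
from collections import Counter

# cost[player][machine] for the running example, delta = 0.01
COST = {1: {"L": 0.01, "R": 1.0}, 2: {"L": 1.0, "R": 0.01}}

def best_response(me, opponent_history, s):
    # Sample s of the opponent's last mem plays and treat the sample
    # frequencies as a prediction of what she will play next round.
    sample = random.sample(opponent_history, s)
    freq = Counter(sample)

    def expected_cost(machine):
        # My finish time is the load on my machine: my own cost, plus
        # the opponent's cost there if she lands on the same machine.
        p_same = freq[machine] / s
        return COST[me][machine] + p_same * COST[3 - me][machine]

    return min(("L", "R"), key=expected_cost)

# e.g., player 1 best-responds to a sample of player 2's last mem = 4 plays
print(best_response(1, ["L", "L", "R", "L"], s=3))
```

Imitation dynamics is sketched later, in the end-to-end simulation after the recap slide.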
Adaptive Play Example: a Markov Process
• Let mem = 4, so each player remembers the last 4 rounds.
• If s = 3, each player randomly samples three past plays from the memory, and picks the strategy among them that worked best (yielded the highest payoff).
• A state of the process is the pair of players' memories (equivalently, the history of the last mem rounds of joint play), so there are 2^8 = 256 total states in the state space.
Absorbing Sets of the Markov Process
• An absorbing set is a set of states that are all reachable from one another, but cannot reach any states outside of the set.
• In our example, we have 4 absorbing sets: the four states in which each player's memory has locked onto a single machine, including OPT and the bad Nash.
• But which state we end up in depends on our initial state. Hence we perturb our Markov process as follows: during each round, each player, with probability ε, does not use imitation dynamics, but instead chooses a machine at random.
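To make this concrete, here is a self-contained Python sketch (all names illustrative) that builds the 256-state imitation-dynamics process from the previous slide and finds its absorbing sets by reachability; under these dynamics it should report the four single-assignment states:

```python
from itertools import product, combinations

DELTA, MEM, S = 0.01, 4, 3
COST = {1: {"L": DELTA, "R": 1.0}, 2: {"L": 1.0, "R": DELTA}}

def realized_cost(player, joint):
    # finish time = load on my machine that round
    mine, other = joint[player - 1], joint[2 - player]
    return COST[player][mine] + (COST[3 - player][mine] if mine == other else 0.0)

def possible_plays(player, history):
    # union over all C(4,3) samples of the machines that can win
    # (lowest average realized cost in the sample; ties all possible)
    plays = set()
    for sample in combinations(history, S):
        avg = {}
        for joint in sample:
            avg.setdefault(joint[player - 1], []).append(realized_cost(player, joint))
        best = min(sum(v) / len(v) for v in avg.values())
        plays |= {m for m, v in avg.items() if sum(v) / len(v) == best}
    return plays

# a state is the history of the last MEM rounds of joint play
states = list(product(product("LR", repeat=2), repeat=MEM))  # 4^4 = 256
succ = {st: {st[1:] + ((m1, m2),)
             for m1 in possible_plays(1, st)
             for m2 in possible_plays(2, st)}
        for st in states}

def closure(st):
    # all states reachable from st with positive probability
    seen, stack = {st}, [st]
    while stack:
        for nxt in succ[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

# a state lies in an absorbing set iff everything it can reach can reach it back
reach = {st: closure(st) for st in states}
absorbing = {frozenset(reach[st]) for st in states
             if all(st in reach[y] for y in reach[st])}
print(len(absorbing), "absorbing sets")  # expect the 4 single-assignment states
```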
Stochastic Stability
• The perturbed process has only one big absorbing set (any state is reachable from any other state).
• Hence we have a unique stationary distribution μ_ε (where μ_ε P_ε = μ_ε).
• The probability distribution μ_ε is the time-average asymptotic frequency distribution of P_ε.
• A state z is stochastically stable if lim_{ε→0} μ_ε(z) > 0.
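To see stochastic stability numerically, here is a small numpy sketch with a toy 3-state perturbed chain. The resistances R[i][j] (the number of mistakes needed to move from state i to state j) are illustrative numbers, not derived from the load-balancing chain:

```python
import numpy as np

# hypothetical resistances between three absorbing states
R = np.array([[0, 1, 2],
              [2, 0, 1],
              [2, 2, 0]], dtype=float)

def perturbed_chain(eps, R):
    # P_eps(i, j) = eps**R[i][j] off the diagonal; the diagonal absorbs
    # the leftover probability so each row sums to 1
    P = eps ** R
    np.fill_diagonal(P, 0.0)
    np.fill_diagonal(P, 1.0 - P.sum(axis=1))
    return P

def stationary(P):
    # left eigenvector for eigenvalue 1, normalized to a distribution
    w, v = np.linalg.eig(P.T)
    pi = np.real(v[:, np.argmax(np.real(w))])
    return pi / pi.sum()

for eps in (0.1, 0.01, 0.001):
    print(eps, np.round(stationary(perturbed_chain(eps, R)), 4))
# As eps -> 0 the mass concentrates on the state(s) with minimum
# stochastic potential: exactly the stochastically stable states.
```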
Finding Stochastically Stable States
• Theorem (Young, 1993): The stochastically stable states are those states contained in the absorbing sets of the unperturbed process that have minimum stochastic potential.
• The stochastic potential of an absorbing set is the cost of a minimum spanning tree rooted there, in the graph whose nodes are the absorbing sets and whose edge weights are the resistances (numbers of mistakes) needed to move from one absorbing set to another.
• (Figure: the rooted spanning-tree computation over the four absorbing sets of our example; OPT has minimum stochastic potential, so OPT is stochastically stable.)
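The theorem suggests a direct computation. Here is a brute-force Python sketch over rooted in-trees; the resistance matrix is illustrative, since the exact edge values in the slide's figure are not recoverable:

```python
from itertools import product

# hypothetical resistances r[i][j]: mistakes needed to move from
# absorbing state i to absorbing state j (illustrative values)
r = [[0, 1, 2, 2],
     [3, 0, 1, 2],
     [2, 2, 0, 1],
     [1, 3, 2, 0]]
n = len(r)

def reaches_root(v, assign, root):
    seen = set()
    while v != root:
        if v in seen:
            return False  # cycled without hitting the root
        seen.add(v)
        v = assign[v]
    return True

def stochastic_potential(root):
    # Brute force over in-trees: every map parent[v] for v != root is a
    # candidate; keep the cheapest whose parent chains all reach root.
    best = float("inf")
    others = [v for v in range(n) if v != root]
    for parents in product(range(n), repeat=len(others)):
        assign = dict(zip(others, parents))
        if any(v == p for v, p in assign.items()):
            continue  # a self-loop can never reach the root
        if all(reaches_root(v, assign, root) for v in others):
            best = min(best, sum(r[v][p] for v, p in assign.items()))
    return best

pots = [stochastic_potential(root) for root in range(n)]
print(pots, "-> stochastically stable state:", pots.index(min(pots)))
```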
Recap: Adaptive Play Model
• Assume the game is played repeatedly by players with limited information and resources.
• Use a decision rule (aka "learning behavior" or "selection dynamics") to model how each player picks her strategy for each round.
• This yields a Markov process where the states represent fixed-size histories of game play.
• Add noise: players make "mistakes" with some small positive probability, and don't always behave according to the prescribed dynamics.
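Putting the recap together, here is a compact end-to-end simulation sketch of the perturbed imitation dynamics on the 2×2 instance; the parameter values (DELTA, MEM, S, EPS) are illustrative choices, and in the long run play should concentrate on OPT:

```python
import random
from collections import Counter

DELTA, MEM, S, EPS = 0.01, 4, 3, 0.05
COST = {1: {"L": DELTA, "R": 1.0}, 2: {"L": 1.0, "R": DELTA}}

def finish_time(player, joint):
    # my machine's load that round: my cost, plus the other job's cost
    # if it ran on the same machine
    mine, other = joint[player - 1], joint[2 - player]
    return COST[player][mine] + (COST[3 - player][mine] if mine == other else 0.0)

def imitate(player, history):
    # sample S of my last MEM plays; repeat the machine whose average
    # realized finish time was lowest (ties broken uniformly at random)
    sample = random.sample(history, S)
    avg = {}
    for joint in sample:
        avg.setdefault(joint[player - 1], []).append(finish_time(player, joint))
    best = min(sum(v) / len(v) for v in avg.values())
    return random.choice([m for m, v in avg.items() if sum(v) / len(v) == best])

random.seed(0)
history = [tuple(random.choice("LR") for _ in range(2)) for _ in range(MEM)]
visits = Counter()
for _ in range(200_000):
    # with probability EPS a player "makes a mistake" and plays at random
    play = tuple(random.choice("LR") if random.random() < EPS
                 else imitate(p, history) for p in (1, 2))
    history = history[1:] + [play]
    visits[play] += 1
print(visits.most_common())  # ('L', 'R'), i.e. OPT, should dominate
```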
Stochastic Stability
• The states in the perturbed Markov process with positive probability in the long run are the stochastically stable states (SSS).
• In our paper, we define the Price of Stochastic Anarchy (PSA) to be PSA = (makespan at worst SSS) / (makespan at OPT).
PSA for Load Balancing
• Recall the bad instance: POA = 1/δ (unbounded).
• But the bad Nash in this case is not an SSS. In fact, OPT is the only SSS here, so PSA = 1 in this instance.
• Our main result: for the game of load balancing on unrelated machines, while POA is unbounded, PSA is bounded. Specifically, we show PSA ≤ m · Fib^(n)(mn+1), which is m times the (mn+1)th n-step Fibonacci number.
• We also exhibit instances of the game where PSA > m. Altogether: Ω(m) ≤ PSA ≤ m · Fib^(n)(mn+1), where m is the number of machines and n is the number of jobs/players.
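To get a feel for the upper bound, here is a small helper that computes n-step Fibonacci numbers (the seed and indexing convention below is an assumption; conventions vary):

```python
def nstep_fib(n, k):
    # k-th n-step Fibonacci number: each term is the sum of the previous
    # n terms, seeded here with n-1 zeros followed by a single 1
    seq = [0] * (n - 1) + [1]
    while len(seq) < k:
        seq.append(sum(seq[-n:]))
    return seq[k - 1]

def psa_upper_bound(n, m):
    # the slide's bound: m times the (mn+1)-th n-step Fibonacci number
    return m * nstep_fib(n, m * n + 1)

for n, m in [(2, 2), (3, 3), (4, 4)]:
    print(n, m, psa_upper_bound(n, m))  # grows quickly with n and m
```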
Closing Thoughts
• In the game of load balancing on unrelated machines, we found that while POA is unbounded, PSA is bounded.
• Indeed, in the bad POA instances for many games, the worst Nash equilibria are not stochastically stable.
• Finding the PSA of these games is an interesting open question that may yield very illuminating results.
• PSA allows us to determine the relative stability of equilibria, distinguishing those that are brittle from those that are more robust, giving us a more informative measure of the cost of having no central authority.
Conjecture
• You might notice in this game that if players could coordinate or form a team, they would play OPT.
• Instead of being unbounded, the strong price of anarchy is O(m) [AFM2007].
• We conjecture that PSA is also O(m), i.e., that a linear price of anarchy can be achieved without player coordination.