Belief Learning in an Unstable Infinite Game. Paul J. Healy, CMU
Overview • Issue #1: Infinite Games • Issue #2: Unstable Game • Issue #3: Belief Learning
Issue #1: Infinite Games • Typical learning model: • Finite set of strategies • Strategies get weight based on 'fitness' • Bells & whistles: experimentation, spillovers… • But many important games have infinite strategy spaces • Duopoly, public goods (PG), bargaining, auctions, war of attrition… • Is quality of fit sensitive to the chosen grid size? • These models don't use the structure of the strategy space
Previous Work • Effect of grid size on fit quality: • Arifovic & Ledyard • Groves-Ledyard mechanisms • Convergence failure of reinforcement learning with |S| = 51 • Strategy space structure: • Roth & Erev AER '99 • Quality-of-fit/error measures • What's the right metric space? • Closeness in probabilities or closeness in strategies?
Issue #2: Unstable Game • Usually predicting convergence rates • Example: p–beauty contests • Instability: • Toughest test for learning models • Most statistical power
Previous Work • Chen & Tang '98 • Walker mechanism & an unstable Groves-Ledyard mechanism • Reinforcement > fictitious play > equilibrium • Healy '06 • 5 PG mechanisms; predicting convergence or not • Feltovich '00 • Unstable finite Bayesian game • Fit varies by game and by error measure
Issue #3: Belief Learning • If subjects are forming beliefs, measure them! • Method 1: Direct elicitation • Incentivized guesses about opponents' strategies s−i • Method 2: Inference from payoff table usage • Tracking payoff 'lookups' may inform our models
Previous Work • Nyarko & Schotter '02 • Subjects best respond to their stated beliefs • Stated beliefs not too accurate • Costa-Gomes, Crawford & Broseta '01 • Mouselab to identify types • About how players solve games, not about learning
This Paper • Pick an unstable infinite game • Give subjects a calculator tool & track its usage • Elicit beliefs in some sessions • Fit models to the data in the standard way • Study the formation of "beliefs" • "Beliefs" inferred from calculator tool usage • "Beliefs" from elicited guesses
The Game • Walker’s PG mechanism for 3 players • Added a ‘punishment’ parameter
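For reference, a minimal sketch of the standard Walker (1981) mechanism the game is built on, assuming the textbook form with unit cost q of the public good and indices taken mod n; the punishment term scaled by γ that the paper adds is not shown on the slide and is omitted here.

```python
# Standard Walker (1981) mechanism (textbook form; the gamma-punishment
# modification used in the experiment is not reproduced on the slide).

def walker_outcome(s, q=1.0):
    """Given announcements s = [s_1, ..., s_n], return (public good level, taxes)."""
    n = len(s)
    y = sum(s)                                            # public good level
    taxes = [(q / n + s[(i + 1) % n] - s[(i + 2) % n]) * y
             for i in range(n)]                           # personalized Lindahl-style prices
    return y, taxes

y, taxes = walker_outcome([2.5, 2.5, 2.5])                # at the stated PSNE s_i* = 2.5
print(y, taxes, sum(taxes))                               # y = 7.5; taxes sum to q*y (budget balance)
```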
Parameters & Equilibrium • v_i(y) = b_i·y − a_i·y² + c_i • Pareto optimum: y = 7.5 • Unique PSNE: s_i* = 2.5 • Punishment parameter γ = 0.1 • Purpose: dynamics not too wild, payoffs rarely negative • Guessing payoff: 10 − |g_L − s_L|/4 − |g_R − s_R|/4 (see the sketch below) • Game payoffs: Pr(payoff < 50) = 8.9%, Pr(payoff > 100) = 71%
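A minimal sketch of the per-period payoff pieces named on this slide; the valuation parameters below are placeholders, since the b_i, a_i, c_i actually used are not given here.

```python
# Payoff pieces from the slide.  The valuation parameters are placeholders,
# not the experiment's actual values.

def valuation(y, b_i, a_i, c_i):
    """Quasilinear valuation of public good level y: v_i(y) = b_i*y - a_i*y**2 + c_i."""
    return b_i * y - a_i * y**2 + c_i

def guess_payoff(g_left, g_right, s_left, s_right):
    """Belief-elicitation payoff: 10 - |g_L - s_L|/4 - |g_R - s_R|/4."""
    return 10 - abs(g_left - s_left) / 4 - abs(g_right - s_right) / 4

print(guess_payoff(2.5, 2.5, 2.5, 2.5))   # exact guesses earn the full 10 points
print(guess_payoff(0.0, 5.0, 2.5, 2.5))   # 10 - 2.5/4 - 2.5/4 = 8.75
```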
Choice of Grid Size S = [-10,10]
Properties of the Game • Best response: • BR dynamics: unstable • One eigenvalue of the best-response map is +2 (outside the unit circle), which drives the instability (see the check below)
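A small illustration of why an eigenvalue of +2 implies instability of discrete best-response dynamics s(t+1) = BR(s(t)); the Jacobian matrix below is a made-up placeholder, not the mechanism's actual best-response map.

```python
# Stability check for linear(ized) best-response dynamics: the fixed point is
# unstable whenever some eigenvalue of the BR Jacobian exceeds 1 in modulus.
# The matrix J here is purely illustrative, not the game's actual BR map.
import numpy as np

J = np.array([[0.0, 1.0, 1.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 0.0]])         # hypothetical d BR_i / d s_j

eigenvalues = np.linalg.eigvals(J)
print(np.round(eigenvalues, 3))          # contains +2
print(np.max(np.abs(eigenvalues)) > 1)   # True -> BR dynamics diverge
```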
Design • PEEL Lab, U. Pittsburgh • All sessions • 3-player groups, 50 periods • Fixed groups and ID#s for all periods • Payoffs etc. were common information • No explicit public-good framing • Calculator always available • 5-minute 'warm-up' with the calculator • Sessions 1-6 • Guess s_L and s_R (the neighbors' strategies) • Sessions 7-13 • Baseline: no guesses
Does Elicitation Affect Choice? • Total Variation: • No significant difference (p=0.745) • No. of Strategy Switches: • No significant difference (p=0.405) • Autocorrelation (predictability): • Slightly more without elicitation • Total Earnings per Session: • No significant difference (p=1) • Missed Periods: • Elicited: 9/300 (3%) vs. Not: 3/350 (0.8%)
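One way such session-level treatment comparisons are often run (the slide does not say which test produced its p-values, so take this only as an illustration with invented placeholder numbers):

```python
# Illustrative nonparametric comparison of a session-level statistic across
# the elicitation and baseline treatments.  All numbers are placeholders and
# the slide does not specify which test generated the reported p-values.
from scipy.stats import mannwhitneyu

total_variation_elicited = [14.2, 11.8, 13.5, 12.9, 15.1, 13.0]        # sessions 1-6 (placeholder)
total_variation_baseline = [13.7, 12.4, 14.9, 13.3, 12.1, 14.0, 13.8]  # sessions 7-13 (placeholder)

stat, p_value = mannwhitneyu(total_variation_elicited, total_variation_baseline)
print(p_value)   # a large p-value means no detectable treatment difference
```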
Does Play Converge? • [Figures: average |s_i − s_i*| per period; average |y − y°| per period]
Accuracy of Beliefs • Guesses become more accurate over time • [Figure: average ||ŝ_−i − s_−i(t)|| per period, for elicited guesses and for calculator inputs]
Model 1: Parametric EWA • δ : weight on strategy actually played • φ : decay rate of past attractions • ρ : decay rate of past experience • A(0): initial attractions • N(0): initial experience • λ : response sensitivity to attractions
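A minimal sketch of one parametric EWA updating step in the Camerer-Ho form, written for a finite strategy grid; the variable names and the payoff callback are mine, not the paper's.

```python
# One EWA update (Camerer & Ho 1999 form): attractions decay at rate phi,
# experience decays at rate rho, unplayed strategies get delta-weighted
# hypothetical payoffs, and choices follow a logit rule with sensitivity lam.
import numpy as np

def ewa_step(A, N, played_j, s_other, payoff, delta, phi, rho, lam):
    """Return updated attractions, experience weight, and next-period choice probs."""
    N_new = rho * N + 1.0
    A_new = np.empty_like(A)
    for j in range(len(A)):
        weight = delta + (1.0 - delta) * (j == played_j)   # actual vs. hypothetical payoff weight
        A_new[j] = (phi * N * A[j] + weight * payoff(j, s_other)) / N_new
    probs = np.exp(lam * A_new)
    return A_new, N_new, probs / probs.sum()
```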
Model 1’: Self-Tuning EWA • N(0) = 1 • Replace δ and φ with deterministic functions:
STEWA: Setup • Only remaining free parameters: λ and A(0) • λ will be estimated • The 5-minute 'calculator time' gives A(0) • A(0): average payoff from each subject's calculator trials
STEWA: Fit • Likelihoods are numerically 'zero' for all λ • Guess: lots of near misses in the predictions • Alternative measure: quadratic scoring rule (QSR) • Best fit: λ = 0.04 (previous studies: λ > 4) • Suggests attractions are very concentrated
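For reference, the quadratic scoring rule in its common quadratic-loss form; whether the QSR* numbers on the next slide use this exact normalization or its higher-is-better mirror image is not recoverable from the slides.

```python
# Quadratic scoring rule (quadratic-loss form): 0 when all predicted mass sits
# on the strategy actually played, 2 at worst.  The paper's normalization may
# differ; this is only the textbook version.
import numpy as np

def quadratic_score(probs, played_index):
    """sum_j (p_j - 1{j = played})^2 for one period's prediction."""
    indicator = np.zeros_like(probs)
    indicator[played_index] = 1.0
    return float(np.sum((probs - indicator) ** 2))

probs = np.full(21, 1.0 / 21)                 # uniform prediction on a 21-point grid
print(round(quadratic_score(probs, 12), 3))   # about 0.952
```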
STEWA: Adjustment Attempts • The problem: near misses in the strategy space, not in time • Suggests altering δ (the weight on hypothetical payoffs) • Original specification: QSR* = 1.193 @ λ* = 0.04 • δ = 0.7 (p-beauty estimate): QSR* = 1.056 @ λ* = 0.03 • δ = 1 (belief model): QSR* = 1.082 @ λ* = 0.175 • δ(k,t) = % of best-response payoff: QSR* = 1.077 @ λ* = 0.06 • Altering φ: • 1/8 weight on surprises: QSR* = 1.228 @ λ* = 0.04
STEWA: Other Modifications • Equal initial attractions: worse • Smoothing (see the sketch below) • Takes advantage of the strategy space structure • λ spreads probability across all strategies evenly • Smoothing spreads probability to nearby strategies • Smoothed attractions • Smoothed probabilities • But… no improvement in QSR* or λ*! • Tentative conclusion: STEWA is either not broken, or can't be fixed…
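A minimal sketch of the "smoothed probabilities" idea: spill each strategy's probability onto neighboring points of the grid. The Gaussian kernel and bandwidth are my own assumptions, not the paper's specification.

```python
# Smoothing choice probabilities across nearby strategies on the grid, in
# contrast to the logit parameter lambda, which spreads mass evenly.  Kernel
# shape and bandwidth are illustrative assumptions only.
import numpy as np

def smooth_probs(probs, grid, bandwidth=1.0):
    """Redistribute each strategy's probability to nearby grid points."""
    kernel = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / bandwidth) ** 2)
    kernel /= kernel.sum(axis=0, keepdims=True)   # each column sums to one
    return kernel @ probs

grid = np.arange(-10.0, 11.0)                     # the integer grid {-10, ..., 10}
probs = np.zeros(len(grid)); probs[12] = 1.0      # all mass on strategy s = 2
print(smooth_probs(probs, grid).round(3))         # mass now spread around s = 2
```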
Other Standard Models • Nash Equilibrium • Uniform Mixed Strategy (‘Random’) • Logistic Cournot BR • Deterministic Cournot BR • Logistic Fictitious Play • Deterministic Fictitious Play • k-Period BR
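As an example of how the belief-based entries in this list work, a rough sketch of logistic fictitious play: beliefs are the empirical frequencies of opponents' past profiles, and the player logit-best-responds to expected payoffs (Cournot BR is the special case that puts all weight on last period's profile). The payoff callback and λ are placeholders.

```python
# Logistic fictitious play on a finite strategy grid (illustrative sketch).
import numpy as np

def logistic_fp_probs(opponent_history, strategies, payoff, lam):
    """opponent_history: list of observed opponent profiles (tuples);
    payoff(s, profile) -> my payoff from playing s against that profile."""
    profiles, counts = np.unique(np.array(opponent_history), return_counts=True, axis=0)
    beliefs = counts / counts.sum()                       # empirical belief over profiles
    expected = np.array([np.dot(beliefs, [payoff(s, tuple(p)) for p in profiles])
                         for s in strategies])
    probs = np.exp(lam * expected)                        # logit response to expected payoffs
    return probs / probs.sum()
```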
“New” Models • Best respond to stated beliefs (S1-S6 only) • Best respond to calculator entries • Issue: how to aggregate calculator usage? • Decaying average of input • Reinforcement based on calculator payoffs • Decaying average of payoffs
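One simple way to implement the "decaying average of input" aggregation, with an illustrative decay rate rather than an estimated one:

```python
# Exponentially decaying average of a subject's calculator entries, so that
# more recent entries carry more weight.  The decay rate is a placeholder.
import numpy as np

def decaying_average(entries, decay=0.8):
    """entries: calculator inputs in chronological order."""
    entries = np.asarray(entries, dtype=float)
    weights = decay ** np.arange(len(entries) - 1, -1, -1)   # newest entry gets weight 1
    return float(np.sum(weights * entries) / np.sum(weights))

print(decaying_average([0.0, 5.0, 2.0, 3.0]))   # pulled toward the later entries
```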
Model Comparisons • [Comparison table not reproduced] • * Estimates on the grid of integers {−10, −9, …, 9, 10} • In = periods 1-35; Out = periods 36-end
The “Take-Homes” • Methodological issues • Infinite strategy space • Convergence vs. Instability • Right notion of error • Self-Tuning EWA fits best. • Guesses & calculator input don’t seem to offer any more predictive power… ?!?!