LEVEL-K MODELS AND DECISION NOISE Miguel A. Costa-Gomes, University of Aberdeen, Scotland, U.K. (partly based on joint work with Vincent Crawford, Nagore Iriberri and Bruno Broseta) April 2012
OVERVIEW I) INTRODUCTION: LEVEL-K MODEL II) LEVEL-K MODEL: ADJUSTMENT OF PLAYERS’ BELIEFS III) LEVEL-K MODEL: MODELING L0 IV) LEVEL-K MODEL: MODELING NOISE V) TRAINING SUBJECTS AND NOISE
INTRODUCTION: THE LEVEL-K MODEL Guessing game (Bosch-Domenech, Montalvo, Nagel and Satorra (2002)): n subjects; the target is 2/3 of the mean of the subjects’ guesses. Uniform random play has mean 50, so: 2/3 x 50 = 33.3 (L1); 2/3 x 33.3 = 22.2 (L2); 2/3 x 22.2 = 14.8 (L3); … ; 0.
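The iteration on this slide can be sketched in a few lines of Python. This is only an illustration: the function name and its arguments are hypothetical, and it assumes each level best responds to the level just below it while ignoring the effect of its own guess on the mean.

```python
def level_k_guesses(p=2/3, l0_mean=50.0, max_level=3):
    """Return the guesses of L1..L{max_level}, each best responding to the level below."""
    guesses = []
    belief = l0_mean  # L0 anchor: uniform random play with mean 50, as on the slide
    for _ in range(max_level):
        belief = p * belief  # best response: p times the believed mean guess
        guesses.append(belief)
    return guesses


if __name__ == "__main__":
    for k, guess in enumerate(level_k_guesses(), start=1):
        print(f"L{k}: {guess:.1f}")
```

Raising `max_level` shows the guesses shrinking toward 0, the limit indicated at the end of the slide.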
INTRODUCTION: THE LEVEL-K MODEL The level-k (also known as level-n) model has its origins in the experimental literature, in the articles by Nagel (1995, AER) and Stahl and Wilson (SW; 1994, JEBO; 1995, GEB). These authors hypothesize, and present empirical evidence, that human subjects use decision rules or “rules of thumb” that anchor beliefs in random behavior (invoking a uniform prior, i.e., the principle of insufficient reason), which SW refer to as “L0”, and then adjust them via iterated best responses. These rules of thumb (or “types”) are often referred to as Lk. The three main modeling issues about the level-k model are: 1) How to model the adjustment of players’ beliefs via iterated best responses? Do beliefs reflect a homogeneous or a heterogeneous population? 2) How to model the anchor of players’ beliefs, the L0 rule? Should it always be random behavior? 3) How to model decision noise?
LEVEL-K MODEL: ADJUSTMENT OF PLAYERS’ BELIEFS 1) How to model the adjustment of players’ beliefs via iterated best responses? Do a player’s beliefs reflect a homogeneous or a heterogeneous population? Different researchers take different views on how players adjust their beliefs via iterated best responses. Nagel (1995) discusses the Lk rule as best responding solely to the Lk-1 rule, while SW model the Lk rule as best responding to lower rules (i.e., L2 best responds to L1 and L0). Camerer, Ho and Chong’s (2004) “Cognitive Hierarchy” (CH) model explicitly says that an Lk rule best responds to the distribution of all lower rules, from L0 to Lk-1. While the former interpretation allows for an a priori prediction for each of the rules, the latter does not, unless one makes assumptions about parameter values, e.g., the proportions of the different rules in the population. The former is also computationally simpler, and thus less demanding from a cognitive point of view. Finally, the former interpretation allows each subject’s data to be considered separately from other subjects’ data, while the latter does not always allow it.
LEVEL-K MODEL: ADJUSTMENT OF PLAYERS’ BELIEFS 1) How to model the adjustment of players’ beliefs via iterated best responses? Homogeneous vs. heterogeneous population? Modeling and empirical evidence: Lk best responds to Lk-1: Nagel (1995), Costa-Gomes, Crawford and Broseta (2001), Costa-Gomes and Crawford (2006). Lk best responds to L0,…,Lk-1: Stahl and Wilson (1994,1995), Camerer, Ho and Chong (2004).
LEVEL-K MODEL: MODELING L0 The anchor of players’ beliefs, i.e. L0, can be interpreted more generally as a “strategically naïve initial assessment of others’ likely responses”, as described in Crawford, Costa-Gomes and Iriberri (2012). This view is broader than the initial view that the anchor is uniform random behavior, as was the case in the one-shot games of Nagel (1995), SW (1994, 1995) and Costa-Gomes, Crawford and Broseta (2001), where context, incomplete information, and communication do not matter. How to define L0 when there is communication of intentions? One possibility is to favor its literal interpretation: an L0 sender is truthful and an L0 receiver is credulous. This implies that L1 receivers are credulous, while L1 senders might or might not lie. Ellingsen and Östling (2010) consider that L0 receivers can also be uniform random. The difference between the two approaches is minimal, because L1 receivers are still credulous; only the incentives of L1 senders to lie change. Ellingsen and Östling (2010) add that players have a lexicographic preference for the truth when payoffs are tied.
LEVEL-K MODEL: MODELING L0 How to define L0 when there is incomplete information? Crawford and Iriberri (2007a) propose that in addition to an anchor based on random behavior, there is an anchor that relies on truthfulness. Thus, two families of Lk rules appear: i) one anchored in an L0 rule that randomizes uniformly in a way that is typically (but not always) independent of its private information, random L0 (in an IPV auction randomizes uniformly over the set of signals, i.e., values; in a CV auction randomizes uniformly over the set of feasible values given its signal). ii) one anchored in an L0 rule that is truthful, and communicates its private information, truthful L0, i.e., in an IPV auction bids its signal. In sender-receiver games, it assumes the L0 sender is truthful, with L0 receivers assumed to be credulous. (See also Cai and Wang (2006)).
LEVEL-K MODEL: MODELING L0 How to define L0 when the context is relevant, as is the case when labels are salient? Crawford and Iriberri (2007b) propose that in such settings L0 deviates from uniform randomness by favoring salient locations. In a constant-sum game with four strategies arranged in a row, A, B, A, A, Crawford and Iriberri (2007b) say that the salient locations are either B or one of the end As, which are assumed to be equally salient and treated equally by both players, thus requiring the addition of one binary parameter. In O’Neill’s (1987) card-matching game, players simultaneously choose one of four cards: A, 2, 3, J. Crawford and Iriberri (2007b) say that the salient locations are A and J, as they are both face cards and end locations, and add one binary parameter. How to model L0 when there is a tension between label salience and payoff salience? Crawford, Gneezy and Rottenstreich (2008) assume that L0 responds to both kinds of salience, with a “payoffs bias” that favors payoff over label salience.
LEVEL-K MODEL: MODELING NOISE I sometimes refer to two different approaches to data fitting: models of individual play and models of aggregate play. Models of individual play use one subject’s decisions in a series of games, and assume that his decisions follow one of K Lk rules, each with some probability. Models of aggregate play pool the subjects’ decisions (usually from just one game, although possibly from a series of games) and assume that their decisions follow one of K Lk rules, each with some probability.
LEVEL-K MODEL: MODELING NOISE How to model deviations from a rule’s predicted decision? i) Uniform error rate: The subject follows the rule with a fixed probability and otherwise makes an error. There are two different approaches to modeling such errors: a) when he errs, he plays each of his decisions (including the rule’s predicted decision) with equal probability; or b) when he errs, he plays each of his decisions (excluding the rule’s predicted decision) with equal probability.
LEVEL-K MODEL: MODELING NOISE i) Uniform error rate: a) when he errs, he plays each of his decisions (including the rule’s predicted decision) with equal probability (Example: Costa-Gomes, Crawford and Broseta (2001)); b) when he errs, he plays each of his decisions (excluding the rule’s predicted decision) with equal probability.
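A minimal sketch of the two uniform-error variants may help to compare them. The notation is an assumption, since the slide's formulas were not recovered: `eps` stands for the error probability, so the rule is followed with probability `1 - eps`, and the function name is hypothetical.

```python
def uniform_error_probs(n_actions, predicted, eps, include_predicted=True):
    """Choice probabilities over actions 0..n_actions-1 under a uniform error rate.

    With probability 1 - eps the subject plays the rule's predicted action;
    with probability eps (an assumed symbol) he errs uniformly.
    """
    if include_predicted:
        # Variant (a): errors are spread over all actions, the predicted one included
        probs = [eps / n_actions] * n_actions
    else:
        # Variant (b): errors are spread over the other actions only
        probs = [eps / (n_actions - 1)] * n_actions
        probs[predicted] = 0.0
    probs[predicted] += 1.0 - eps
    return probs


if __name__ == "__main__":
    print(uniform_error_probs(4, predicted=0, eps=0.2, include_predicted=True))
    print(uniform_error_probs(4, predicted=0, eps=0.2, include_predicted=False))
```

Under variant (a) the predicted action is played more often than `1 - eps`, because some "errors" land on it; under variant (b) it is played with probability exactly `1 - eps`.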
LEVEL-K MODEL: MODELING NOISE 3) How to model decision noise? ii) Logistic choice function: The subject’s decision is governed by a logistic choice function. Therefore, the probability with which he takes the rule’s predicted action depends on how much larger its expected payoff is than the expected payoffs of the other actions, and is therefore not constant across games. (Example: Stahl and Wilson (1994, 1995)) Remarks: The subject follows the rule’s predicted decision with a different probability across games, more often when all other actions’ expected payoffs are much lower and when the number of other available actions is smaller. A deviation is more likely to lead to an action with a higher than a lower expected payoff.
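The logistic choice function can be illustrated as follows. This is a generic logit sketch, not necessarily the exact specification on the slide: the `precision` parameter and the function name are assumptions, and the expected payoffs are taken as given by the rule's beliefs.

```python
import math


def logit_choice_probs(expected_payoffs, precision):
    """Logit choice probabilities: proportional to exp(precision * expected payoff)."""
    top = max(expected_payoffs)
    # Subtracting the maximum leaves the probabilities unchanged and avoids overflow
    weights = [math.exp(precision * (u - top)) for u in expected_payoffs]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    payoffs = [10.0, 8.0, 2.0]  # hypothetical expected payoffs of three actions
    for precision in (0.0, 0.5, 2.0):
        print(precision, [round(p, 3) for p in logit_choice_probs(payoffs, precision)])
```

With `precision = 0` play is uniform; as `precision` grows, probability concentrates on the action with the highest expected payoff, and among the remaining actions the better ones are always more likely than the worse ones.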
LEVEL-K MODEL: MODELING NOISE iii) “Spike-logit” error structure: In each game (g), a subject (i) makes his rule’s decision exactly with a given probability, and otherwise makes errors that follow a logistic distribution over the rest of his feasible actions. (Example: Costa-Gomes and Crawford (2006)) Remarks: The subject follows the rule’s predicted decision with the same probability across games, independently of the other decisions’ expected payoffs. However, a deviation is more likely to lead to a decision with a higher than a lower expected payoff.
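A discrete-action sketch of the spike-logit structure is given below. It rests on several assumptions: the action set is finite (whereas guesses in the guessing games lie in an interval), `eps` denotes the probability of an error, `precision` is the logit precision, and the function name is hypothetical.

```python
import math


def spike_logit_probs(expected_payoffs, predicted, eps, precision):
    """Spike of mass 1 - eps on the predicted action; logit errors over the rest."""
    others = [a for a in range(len(expected_payoffs)) if a != predicted]
    top = max(expected_payoffs[a] for a in others)
    # Logit weights over the non-predicted actions only
    weights = {a: math.exp(precision * (expected_payoffs[a] - top)) for a in others}
    total = sum(weights.values())
    probs = [0.0] * len(expected_payoffs)
    probs[predicted] = 1.0 - eps  # the spike: same probability in every game
    for a in others:
        probs[a] = eps * weights[a] / total
    return probs


if __name__ == "__main__":
    payoffs = [10.0, 8.0, 2.0]  # hypothetical expected payoffs
    print([round(p, 3) for p in spike_logit_probs(payoffs, 0, eps=0.3, precision=0.5)])
```

Setting `precision = 0` gives a spike-uniform model, and `eps = 1` removes the spike altogether.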
LEVEL-K MODEL: MODELING NOISE There are other ways to add “extra” decision noise to the model. One of them is by explicitly allowing the subject to be an L0, in which case another rule is added to the set of rules, . , where and (Example: Stahl and Wilson (1994)) Remarks: Improves fit when is rule specific, i.e., , lowering .
LEVEL-K MODEL: MODELING NOISE When Lk rules are modeled as best responding to the distribution of all lower rules, from L0 to Lk-1, with the distribution being inferred from the data, each rule’s predicted action is not defined a priori, but depends on that distribution. This gives extra flexibility to the model, and might allow rules’ predicted actions to move towards the actions most played by the subjects. However, this requires that the choices of several subjects are considered together.
LEVEL-K MODEL: MODELING NOISE When Lk rules are modeled as best responding to the distribution of all lower rules, from L0 to Lk-1, with the distribution being inferred from the data, each rule’s predicted action is not defined a priori, but depends on that distribution. The Cognitive Hierarchy model (CHC 2004) is a one-parameter example of this approach, in which the frequencies of the rules follow a Poisson distribution. (In practice k is truncated at an integer smaller than 6.) A level-k player has an accurate guess about the relative proportions of players who are doing less thinking than he is.
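The belief structure of the Cognitive Hierarchy model can be sketched as follows. The parameter name `tau` for the Poisson mean and the function names are assumptions; the sketch computes only beliefs, not the best responses to them.

```python
import math


def poisson_frequencies(tau, max_level):
    """Poisson frequencies of levels 0..max_level (not renormalized)."""
    return [math.exp(-tau) * tau ** k / math.factorial(k) for k in range(max_level + 1)]


def ch_beliefs(tau, k):
    """Beliefs of a level-k player (k >= 1) over levels 0..k-1.

    The player knows the true relative proportions of the lower levels,
    so his beliefs are the Poisson frequencies renormalized over 0..k-1.
    """
    if k < 1:
        raise ValueError("level 0 does not form beliefs")
    freqs = poisson_frequencies(tau, k - 1)
    total = sum(freqs)
    return [f / total for f in freqs]


if __name__ == "__main__":
    for k in range(1, 4):
        print(k, [round(b, 3) for b in ch_beliefs(1.5, k)])
```

A level-1 player believes everyone else is level 0, as in the level-k model; the two approaches differ from level 2 upwards, where beliefs put weight on every lower level.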
LEVEL-K MODEL: MODELING NOISE A few years later CHC added payoff-sensitive errors to their Cognitive Hierarchy model (CHC 2008). Note: In this formulation you do not need level 0 players to avoid the zero-likelihood problem.
LEVEL-K MODEL: MODELING NOISE CHC also used a few other formulations, namely allowing the frequencies of the rules to be a set of parameters inferred from the data (rather than assuming they are given by the Poisson distribution). Note: They seem not to have tried the specification that does not have payoff-sensitive errors.
LEVEL-K MODEL: MODELING NOISE When the action set has a natural order, i.e., actions are numbers, other modeling choices have been used to introduce noise. For example, instead of assuming that a rule’s predicted action is unique and subject either to uniform errors across all actions or to payoff-sensitive errors, Bosch-Domenech, Montalvo, Nagel and Satorra (2010) propose that a rule’s action follows a distribution (a beta distribution, or a truncated normal) whose mean coincides with the rule’s usual predicted action and whose other parameters are chosen to maximize the fit to the data. Other modeling choices have been used to circumvent the need to introduce noise, e.g., Haruvy, Stahl and Wilson (2001). They assume that an Lk player’s belief about his opponent’s choices is not a point in the (c-1)-dimensional simplex, but is drawn from a subset of the (c-1)-dimensional simplex, which leads the Lk player’s predicted distribution of choices to be non-degenerate. (This gets rid of the need to introduce error as long as no dominated actions are played.)
LEVEL-K MODEL: TRAINING SUBJECTS AND NOISE We impose specifications for noise on data and infer the parameter values that produce the best fit. An alternative is to recruit people who are rewarded for playing according to our rules of thumb, and investigate their “errors” (Costa-Gomes and Crawford, in progress).
UCSD + York
               L3     L2     L1     D2     D1     Eq.
# Subjects     18     27     25     19     30     29
% Compliance   0.844  0.914  0.800  0.553  0.623  0.703

               EQ.    EQ. EF  EQ. EC  EQ. BR  EQ. ID
# Subjects     66     18      14      18      16
% Compliance   0.797  0.920   0.826   0.861   0.613
EXPERIMENTAL DESIGN TWO-PERSON GUESSING GAMES Each player has a lower and an upper limit (both strictly positive), but players are not required to guess between their limits. Guesses outside a player’s limits are automatically adjusted up to his lower limit or down to his upper limit, as necessary. Each player also has a target, and his payoff increases with the closeness of his adjusted guess to his target times the other’s adjusted guess.
EXPERIMENTAL DESIGN TWO-PERSON GUESSING GAMES Example – Game g: Suppose i guesses 400 --> j guesses 200 --> i guesses 300 --> j guesses 150 --> i guesses 225 (adjusted to 300). Equilibrium is (i guesses 300, j guesses 150) Player i’s payoff is
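The chain of guesses in this example can be reproduced with a short sketch. The game's parameters were not recovered with the slide, so the values below are an assumption: player i has target 1.5 and limits [300, 900], and player j has target 0.5 and limits [100, 900], which are consistent with every number in the example. All function and variable names are hypothetical.

```python
def adjust(guess, lower, upper):
    """Move a guess up to the lower limit or down to the upper limit if needed."""
    return min(max(guess, lower), upper)


def best_response(player, other_adjusted_guess):
    """Adjusted guess closest to the player's target times the other's adjusted guess."""
    return adjust(player["target"] * other_adjusted_guess, player["lower"], player["upper"])


# Assumed parameters (not stated on the slide, chosen to match its numbers)
PLAYER_I = {"target": 1.5, "lower": 300, "upper": 900}
PLAYER_J = {"target": 0.5, "lower": 100, "upper": 900}


def iterate_best_responses(start_i, max_rounds=20):
    """Alternate best responses from i's starting guess until i's guess stops changing."""
    guess_i = adjust(start_i, PLAYER_I["lower"], PLAYER_I["upper"])
    guess_j = best_response(PLAYER_J, guess_i)
    for _ in range(max_rounds):
        new_i = best_response(PLAYER_I, guess_j)
        if new_i == guess_i:
            break
        guess_i = new_i
        guess_j = best_response(PLAYER_J, guess_i)
    return guess_i, guess_j


if __name__ == "__main__":
    print(iterate_best_responses(400))
```

Starting from 400, the path is 400, 200, 300, 150, and then 225 adjusted to 300, so the iteration stops at the equilibrium (300, 150) stated on the slide.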
EXPERIMENTAL DESIGN TWO-PERSON GUESSING GAMES Within this structure, we vary the targets and limits independently across players and games, with targets either both less than one, both greater than one, or mixed (the targets in previous guessing experiments varied only across treatments, or not at all). LOWER AND UPPER LIMITS: (α) [100,500]; (β) [100,900]; (γ) [300,500]; (δ) [300,900]. TARGETS: (1) 0.5; (2) 0.7; (3) 1.3; (4) 1.5. Example (δ4β1):
EXPERIMENTAL DESIGN TWO-PERSON GUESSING GAMES Example (δ4β1)
EXPERIMENTAL DESIGN FEATURES OF TWO-PERSON GUESSING GAMES Reducing n to two allows us to focus on the central strategic problem of predicting the guesses of other players who view themselves as a non-negligible part of one’s own environment. It also eliminates a player’s need to predict how his guess affects an average. Having player-specific limits and targets moves equilibrium guesses away from the boundaries, allowing clearer inferences. Our games are not zero-sum, and have more than two possible payoffs. Consequently, players’ best responses to their beliefs are usually unique within their limits. Deviations are costly.
EXPERIMENTAL DESIGN TYPES
L1 – plays the best response to beliefs that assign equal probabilities to the other's actions.
L2 – plays the best response to L1.
L3 – plays the best response to the best response to L1.
D1 – does one round of deleting actions that are dominated by pure actions and then best responds to a uniform prior over the other's remaining actions.
D2 – does two rounds of deleting actions that are dominated by pure actions and then best responds to a uniform prior over the other's remaining actions.
EQUILIBRIUM – always plays his equilibrium action.
SOPHISTICATED – plays the best response to the probabilities of his partner's actions, which we estimate from the observed frequencies (depends on data).
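The L1, L2 and L3 types can be computed recursively in a two-person guessing game, as in the sketch below. It assumes, as an illustration only, that L1 guesses its target times the midpoint of the other player's limits (the mean of a uniform belief) and that every guess is adjusted to the player's own limits; the parameters are those of the game labeled δ4β1 elsewhere in these slides, and all names are hypothetical.

```python
def clamp(x, lower, upper):
    """Adjust a guess to lie within the player's limits."""
    return min(max(x, lower), upper)


def level_k_guess(k, me, other):
    """Guess of type Lk (k >= 1) for player `me` facing player `other`."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if k == 1:
        # L1: uniform beliefs over the other's limits, whose mean is the midpoint
        believed_guess = (other["lower"] + other["upper"]) / 2
    else:
        # Lk: believes the other player is L(k-1)
        believed_guess = level_k_guess(k - 1, other, me)
    return clamp(me["target"] * believed_guess, me["lower"], me["upper"])


# Game delta-4 / beta-1: limits [300, 900] with target 1.5 vs. limits [100, 900] with target 0.5
PLAYER_I = {"target": 1.5, "lower": 300, "upper": 900}
PLAYER_J = {"target": 0.5, "lower": 100, "upper": 900}

if __name__ == "__main__":
    for k in (1, 2, 3):
        print(f"L{k}: i guesses {level_k_guess(k, PLAYER_I, PLAYER_J)}, "
              f"j guesses {level_k_guess(k, PLAYER_J, PLAYER_I)}")
```

The D1, D2, Equilibrium and Sophisticated types need more structure (iterated dominance, a fixed point, or estimated frequencies) and are left out of this sketch.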
EXPERIMENTAL DESIGN Computer Interface
EXPERIMENTAL DESIGN Click to open a box
EXPERIMENTAL DESIGN Click to close the box previously opened
EXPERIMENTAL DESIGN Entering a guess
EXPERIMENTAL DESIGN DATA - Guesses (Actions). - Look-ups (box, gaze time - amount of time the box was "open"). Camerer, Johnson, Rymon, and Sen (1993) and Johnson, Camerer, Rymon, and Sen (2002) pioneered the use of MouseLab data (actions and information search) in games by studying backward induction in extensive-form alternating-offers bargaining games in which subjects could look up the sizes of the “pies” to be divided in each period. Costa-Gomes, Crawford, and Broseta (2001) used MouseLab to study two-person matrix games with unique equilibria. Several other papers study other kinds of games and single-person decision problems.
EXPERIMENTAL DESIGN FEATURES OF THE DESIGN Varying targets and limits within an intuitive structure facilitates teaching subjects the rules and allows them to concentrate on predicting others’ guesses and identifying best responses, which greatly reduces the noisiness typical of initial responses to games. Types’ predicted guesses are more separated than in earlier studies, which mostly used payoff-matrix games in which players have at most 5 possible actions. In particular, the Lk and Dk types are not separated in previous guessing-game experiments, and are only weakly separated in other experiments. Allowing subjects to search within the intuitive common structure of guessing games makes mental models of others easy to express as functions of the targets and limits, as we will see next. As will also be seen next, the L1 type, which seems to best describe most subjects’ actions in earlier studies, requires subjects to look up elements of the game about the other player (the lower and upper limits). In matrix games, L1 subjects only need to look up their own payoffs.
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS Counting “Beans” Maximum-Likelihood Error-rate Analysis Specification Test (Overfitting, Omitted types) SEARCH-ONLY ANALYSIS Maximum-Likelihood Error-rate Analysis GUESSES AND SEARCH ANALYSIS Maximum-Likelihood Error-rate Analysis Specification Test (Overfitting)
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS Counting “Beans” 43 of 88 subjects made between 7 and 16 of some type’s exact (within 0.5) guesses: 20, 12, and 3 conformed most closely to L1, L2, and L3, respectively, and 8 to Eq. Maximum-Likelihood Error-rate Analysis: We use a simple “spike-logit” error structure in which, in each game (g), a subject (i) makes his type’s guess exactly with a given probability, and otherwise makes errors that follow a logistic distribution over the rest of the interval between his limits. Expected payoff of guess given type-k’s beliefs:
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS Maximum-Likelihood Error-rate Analysis: The probability density function of an error (a guess that differs from type k’s predicted guess) follows a logistic distribution over the rest of the interval between the player’s lower and upper limits. We assume that, given type, errors in guesses are independent across games. Subject i’s guesses-related log-likelihood is:
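The formulas on this slide were not recovered, but a spike-logit log-likelihood of the kind described can be written as below. All notation is an assumption: $\varepsilon$ is the error rate, $\lambda$ the precision, $S_g^k(x)$ the expected payoff of guess $x$ in game $g$ given type $k$'s beliefs, $U_{ig}^k$ the part of subject $i$'s interval of feasible guesses not counted as an exact type-$k$ guess, $N_i^k$ the set of games in which subject $i$ makes type $k$'s exact guess (with $n_i^k$ its size), and $G$ the number of games.

```latex
% Error density: logit over the non-exact part of the interval (assumed notation)
d_g^k(x;\lambda) \;=\;
  \frac{\exp\!\left[\lambda\, S_g^k(x)\right]}
       {\int_{U_{ig}^k} \exp\!\left[\lambda\, S_g^k(z)\right]\,dz},
  \qquad x \in U_{ig}^k .

% Log-likelihood of subject i's guesses given type k, using independence across games
\ln L_i(k,\varepsilon,\lambda) \;=\;
  n_i^k \,\ln(1-\varepsilon)
  \;+\; \left(G - n_i^k\right)\ln \varepsilon
  \;+\; \sum_{g \notin N_i^k} \ln d_g^k\!\left(x_{ig};\lambda\right).
```

The first term collects the games with exact guesses (the spike), and the remaining terms collect the games with errors, weighted by how payoff-costly each error is under type $k$'s beliefs.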
SOME FINDINGS GUESSES-ONLY ANALYSIS Maximum-Likelihood Error-rate Analysis (cont.): Estimated types: 43 L1, 20 L2, 3 L3, 5 D1, 14 Eq., and 3 Sop. Hypothesis tests: The hypothesis of no spike is rejected for all but 7 subjects, so the spike is necessary. The hypothesis of payoff-insensitive (uniform) errors is rejected for 34 subjects. Thus, the logit model’s payoff-sensitive errors significantly improve the fit over a spike-uniform model for 34 of the 88 subjects (about 39%). The joint hypothesis, which corresponds to a random model of guesses within our specification, is rejected at the 5% level for all but 10 subjects (6 L1, 2 D1, 1 Eq., and 1 Sop.).
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS “SPECIFICATION TEST” Type estimates could be sensitive to our a priori specification of possible types, which might err by omitting relevant types and/or by overfitting through the inclusion of empirically irrelevant ones. The test is based on the idea of a pseudotype: a pseudotype is constructed from one subject’s guesses in the 16 games. Since we have 88 subjects, we have 88 pseudotypes. (Not all are different, since if two subjects’ guesses coincide in all games, their pseudotypes coincide.) We can then take some other subject’s guesses in the 16 games, and compute the likelihood of that subject’s guesses given a pseudotype’s (predicted) guesses. (Not the same subject, since, obviously, the likelihood of a subject’s guesses given the pseudotype based on his own guesses is 1!)
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS “SPECIFICATION TEST” The test compares the likelihood of our type estimate to the likelihoods of analogous estimates based on 87 pseudotypes. Omitting – Suppose we had omitted a relevant type, say L2. The pseudotypes of subjects now estimated to be L2 would then outperform the non-L2 types estimated for them, and would also make approximately the same (L2) guesses. We define a cluster as a group of 2 or more subjects such that: (i) each subject’s pseudotype has higher likelihood than the estimated type for each other subject in the group; (ii) subjects’ pseudotypes make “sufficiently similar guesses”. Finding a cluster should lead us to diagnose an omitted type, and studying the common elements of its subjects’ guesses may help to reveal its decision rule. We find 5 clusters with 3, 2, 2, 2, and 3 subjects, respectively (see Table IX).
ECONOMETRIC MODEL GUESSES-ONLY ANALYSIS “SPECIFICATION TEST” Overfitting – For a subject’s estimated type to be a credible explanation of his behavior, it should perform at least as well against the pseudotypes as it would, on average, at random. For a pseudotype to have higher likelihood than our estimated type, it must come first among our 7 types plus itself, which at random has probability 1/8. At random, the subject’s estimated type would thus have higher likelihood than all but an expected number of 10.75 pseudotypes. (15 type estimates are ruled out on the basis of overfitting – 10 L1, 2 L2, 1 D1, 1 Eq., and 1 Sop., of which 4 L1, 1 D1, 1 Eq., and 1 Sop. were already ruled out on the basis of a random model of guesses.)
SOME FINDINGS SUMMARY OF REVISED GUESSES-ONLY ANALYSIS Combining our guesses-only estimates with our statistical tests, we say that a guesses-only type estimate appears reliable if:
(i) it does significantly better at the 5% level than a random model of guesses within our specification;
(ii) it has higher likelihood than all but at most a random number of pseudotypes;
(iii) it is not a member of any cluster.
By these criteria, 58 of our 88 subjects’ guesses-only type estimates appear reliable.
SOME FINDINGS SUMMARY OF REVISED GUESSES-ONLY ANALYSIS
L1: 43 (27 reliably identified, other 16 may be spurious - 5 in clusters + 11 no better than random guesses and/or pseudotypes).
L2: 20 (17 reliably identified, other 3 may be spurious - 1 in clusters + 2 no better than random pseudotypes).
L3: 3 (1 reliably identified, other 2 may be spurious - in clusters).
D1: 5 (1 reliably identified, other 4 may be spurious - 2 in clusters + 2 no better than random guesses and/or pseudotypes).
Eq.: 14 (11 reliably identified, other 3 may be spurious - 2 in clusters + 1 no better than random guesses and/or pseudotypes).
Sop.: 3 (1 reliably identified, other 2 may be spurious - no better than random guesses and/or pseudotypes).
Our findings are close to previous estimates from other kinds of games.
ECONOMETRIC MODEL Each type is associated with algorithms that describe how to process information about targets and limits into guesses. We use these algorithms as models of each subject’s cognition. We infer a type’s minimal search implications from plausible algorithms for identifying its ideal guess, under conservative assumptions about how cognition affects search, as in our previous paper. Standard assumptions imply that a type looks up all freely available information that may affect its beliefs, and guesses its target times the mean of its beliefs. The algorithms comprise basic operations and other operations. We assume that basic operations are associated with adjacent look-ups, which can appear in any order but cannot be separated; other operations can appear in any order, and can be separated (although we report the most natural order, we do not insist on it).
ECONOMETRIC MODEL L1: Ideal Guess; Search Implications
ECONOMETRIC MODEL L2 – 1st Step: Her Guess; Search Implications
ECONOMETRIC MODEL L2 – 2nd Step: My Guess; Search Implications
ECONOMETRIC MODEL SEARCH-ONLY ANALYSIS Example of a Look-up Sequence: {ai, pi, pj, ai, bi, aj, pi, bj, …} L2’s Look-up Sequence: Our econometric analysis quantifies compliance with a type’s search implications as the density of the type’s relevant look-up sequence in the subject’s look-up sequence. Since subjects vary widely in where the relevant look-ups tend to be located in their sequences, we filter out some idiosyncratic noise using a binary nuisance parameter called style (“early” or “late”), assumed constant across games. If the style is “early”, we start at the beginning of the subject’s look-up sequence, and continue until we obtain the type’s complete relevant sequence. Example: {ai, pi, pj, [ai, bi], aj, pi, bj, …} (L2’s relevant sequence has length six, and the first complete sequence is obtained after eight look-ups, so compliance is 6/8 = 0.75).
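The "early" compliance measure in this example can be sketched as follows. This is a simplified illustration, not the estimation code: it assumes that L2's relevant sequence for player i consists of the adjacent pair [ai, bi] plus the single look-ups pj, aj, bj and pi (the slide's own statement of L2's sequence was not recovered), and it returns 0.0 when the relevant sequence is never completed. All names are hypothetical.

```python
def early_compliance(lookups, adjacent_pairs, singles):
    """Density of the relevant look-ups in the shortest prefix that contains them all.

    adjacent_pairs: pairs that must appear next to each other (in either order).
    singles: look-ups that may appear anywhere in the prefix.
    """
    relevant_length = 2 * len(adjacent_pairs) + len(singles)
    for n in range(1, len(lookups) + 1):
        prefix = lookups[:n]
        pairs_found = all(
            any({prefix[t], prefix[t + 1]} == set(pair) for t in range(len(prefix) - 1))
            for pair in adjacent_pairs
        )
        singles_found = all(s in prefix for s in singles)
        if pairs_found and singles_found:
            return relevant_length / n
    return 0.0  # relevant sequence never completed (a simplifying convention)


if __name__ == "__main__":
    sequence = ["ai", "pi", "pj", "ai", "bi", "aj", "pi", "bj"]
    print(early_compliance(sequence, [("ai", "bi")], ["pj", "aj", "bj", "pi"]))
```

For the slide's example sequence the relevant look-ups are complete only after the eighth look-up, so the sketch returns 6/8 = 0.75. A "late" style would apply the same idea starting from the end of the sequence.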
ECONOMETRIC MODEL SEARCH-ONLY ANALYSIS We discretize compliance into three categories. For each type k, style s, and compliance category c, the model has a probability that subject i has type-k style-s compliance c in any given game, and we count the number of games for which subject i has type-k style-s compliance c. Subject i’s information-search-related log-likelihood is the sum, over compliance categories, of the number of games in each category times the log of that category’s probability. 58 out of 71 Baseline subjects’ style estimates are “early”, 10 are “late”, and 3 are ties.
ECONOMETRIC MODEL GUESSES AND SEARCH ANALYSIS A subject’s type and style determine his information search and guess, each with error. We assume that, given type and style, errors in search and guesses are independent of each other and across games. Subject i’s information search and guesses log-likelihood is therefore the sum of his guesses-related and search-related log-likelihoods. The model has 6 parameters per subject: error rate, precision, type, style, and two independent probabilities.
ECONOMETRIC MODEL GUESSES AND SEARCH ANALYSIS Our subjects’ type estimates change when search is taken into account, for one of two reasons: For some subjects there is a tension between guesses-only and search-only type estimates, resolved in favor of a type other than the guesses-only estimate. For other subjects the type estimate based on guesses only has 0 search compliance in 8 or more games, and is therefore ruled out by a priori constraints. When the guesses-and-search type estimate differs from the guesses-only estimate, we favor the former but require it to pass the analogs of the guesses-only criteria. We say that a guesses-and-search type estimate is reliable if:
(i) it does significantly better at the 5% level than a random model of guesses and search within our specification;
(ii) the guesses-only part of its likelihood is higher than that of all but at most a random number of pseudotypes;
(iii) it is not a member of any cluster.