AMBIGUITY AND UNCERTAINTY IN ELLSBERG AND SHACKLE. Marcello Basili and Carlo Zappia, Department of Economics, University of Siena. FUR XII, Luiss, Rome, 23-26 June 2006.
INTRODUCTION • George L. S. Shackle and Daniel Ellsberg represent two main positions among the critics of what was, in their time, the emerging mainstream in modern decision theory, as represented by Savage’s Foundations of Statistics (1954) • Although both opposed maximisation of (subjective) expected utility as a criterion for choice under uncertainty, at first sight their theoretical enterprises appear to have few points in common
Ellsberg (1961) introduced the notion of “ambiguity” to refer to situations in which, due to lack of information, there is uncertainty about the probabilities of events. • Ellsberg aimed to work in the footsteps of Knight (1921) and his notion of “unmeasurable uncertainty,” but started from the analysis of actual decisions in urn problems • Though confounding and characterised by information perceived as scanty, urn problems can be represented through an exhaustive list of the possible states of the world.
Shackle (1949 and 1955) insisted that the notion of uncertainty could not be reduced either to aleatory or to subjective probability • Shackle rejected the use of probability measures in decision theory on the basis that the context of “crucial” entrepreneurial decisions is characterized by the fact that the list of possible states of the world known to the entrepreneur is not exhaustive • Keynesian authors have long referred to Shackle’s theory as the only formalised alternative to the mainstream
This paper argues that Ellsberg’s and Shackle’s frameworks for discussing the limits of the (subjective) probabilistic approach to decision theory are not as different as they may appear. • Both Ellsberg and Shackle can be understood as major contributors to what today is called the non-additive probability approach to decisions under uncertainty • To stress the common elements in their theories, Keynes’s Treatise on Probability provides an essential starting point (as recognised by Ellsberg in his 1962 Ph.D. thesis, published only as late as 2001) • Keynes’s rejection of well-defined probability functions and of maximisation as a guide to human conduct is shown to imply a reconsideration of what probability theory can encompass, a reconsideration which inspired both Ellsberg and Shackle
KEYNES’S VIEW • Keynes interpreted probability differently from chance or frequency. Probability is a logical relation between two sets of propositions • The measurement of probabilities involves two magnitudes: the probability of an argument and the weight of the argument • Measurement of probability means comparison of the arguments, for such a comparison is “theoretically possible, whether or not we are actually competent in every case to make”
Keynes was well aware that the probabilities of two quite different arguments can be incomparable. • Probabilities can be compared if they belong to the same series, that is, if they “belong to a single set of magnitudes measurable in terms of a common unit” • Probabilities are incomparable if they belong to two different arguments and one of them is not (weakly) included in the other
Keynes’s idea about comparability is represented by the diagram on page 42 of the Treatise [diagram not reproduced here]
On the horizontal axis there is the scale of probability, ranging from 0 to 1: every point on this axis can be compared to other points, because there exists a numerical representation of the decision-maker’s degree of belief about a logical proposition • But in the plane depicted in the diagram there are also other paths, starting from 0 and ending at 1 but not lying on the straight line between the extremes • Each point on every non-linear path identifies what Keynes calls a non-numerical probability • From the point of view of the modern theory of decision, the Keynesian paths are nothing but distorted probabilities, that is, contractions or expansions of prior linear probabilities
The second relevant aspect in the measurement of an argument is, in Keynes’s view, the weight of argument • Keynes maintained that the weight of an argument is correlated to its relevance, but is independent of its probability. For a long time, the prominent interpretation was to relate the weight of argument to the notion of a second-order probability distribution • Keynes proposed a precise way to calculate it, when he stated that the weight is: “the balance between the absolute amount of the relevant knowledge and of the relevant ignorance respectively”
In the language of modern decision theory (after Schmeidler and Dow and Werlang), given an event A with complement $A^c$ and a (possibly non-additive) measure v, the relevant ignorance can be defined as $1 - v(A) - v(A^c)$, that is, the difference between complete knowledge and the probability of the occurrence of the event plus the probability of its complement (negation of the event) • The weight of argument is then represented by $1 - [1 - v(A) - v(A^c)] = v(A) + v(A^c)$, that is, by the difference between what Keynes called the absolute amount of the relevant knowledge and the absolute amount of relevant ignorance
If v is an additive measure (probability), the relevant ignorance is zero and the weight of argument is 1. • If v is a non-additive measure (convex capacity), the relevant ignorance is different from zero and the weight of argument belongs to the interval zero-one • This interpretation of Keynes’s thought entails that the significance of the weight of argument emerges only when the decision-maker is not endowed with a unique additive probability measure and does not behave as an expected utility maximizer
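A minimal numerical sketch of the two limiting cases on this slide may help. It assumes that the relevant ignorance of an event A is 1 − v(A) − v(Aᶜ), as the verbal definition states, and that the weight of argument is one minus that ignorance; the second formula is an interpretation consistent with the cases above, not one taken from the slides. The function names and the numbers are hypothetical.

```python
from fractions import Fraction as F

def relevant_ignorance(v_A, v_not_A):
    """Complete knowledge (1) minus the belief in A and in its complement."""
    return 1 - v_A - v_not_A

def weight_of_argument(v_A, v_not_A):
    """Assumed here to be 1 minus the relevant ignorance."""
    return 1 - relevant_ignorance(v_A, v_not_A)

# Additive measure: v(A) + v(not A) = 1, so ignorance is 0 and weight is 1.
additive = (F(3, 10), F(7, 10))
# Convex capacity (hypothetical numbers): v(A) + v(not A) < 1.
convex = (F(3, 10), F(1, 2))

for name, (v_A, v_not_A) in [("additive", additive), ("convex", convex)]:
    print(name, relevant_ignorance(v_A, v_not_A), weight_of_argument(v_A, v_not_A))
```

Exact fractions are used only to keep the arithmetic free of rounding noise.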
The third aspect of Keynes’s theory instrumental to our argument emerges in Chapter 26 of the Treatise, dealing with the application of probability to conduct • Keynes maintained that “mathematical expectation, of goods or advantage are not always numerically measurable, and hence that even if a meaning can be given to the sum of a series of non-numerical mathematical expectations, not every pair of such sums are numerically comparable in respect of more and less” • As a result he (implicitly) rejected expected utility maximisation from the outset
ELLSBERG’S APPROACH • Ellsberg stressed that his findings imply that no unique additive probability can account for the choices of unrepentant violators of the sure-thing principle • “under most circumstances of decision-making there may remain a sizeable subset Y° of distributions … that still seem ‘reasonably acceptable’ … that do not contradict your (‘vague’) opinions [and that] may yet be large, particularly when relevant information is perceived as scanty, unreliable, contradictory, ambiguous”
In 1961 Ellsberg had recommended as a solution to the paradox a weighted average of the expectation of the most reliable (“best guess”) probability distribution and the maximin solution • Accordingly the decision rule adopted by Ellsberg was to associate with each act x the following index: ρ E(x) + (1-ρ) min(x) • ρ is a parameter that measures confidence: it weighs the additive distribution that serves as a best estimate against all the other possible probability distributions assumed to be reasonable by the decision-maker under ambiguity (as in Hodges and Lehmann 1952)
In the thesis of 1962 Ellsberg retained the idea of a set of distributions over the states of the world, but now applied the Hurwicz criterion to the restricted set of possible distributions. • Hurwicz (1951) had proposed to select the minimum and the maximum payoff of each given action x, and then to associate with each action the following index: α max(x) + (1-α) min(x), where the parameter α measures the individual’s optimism (this is better known as the Arrow-Hurwicz criterion) • The new index Ellsberg proposed, called the restricted Bayes/Hurwicz criterion, is: ρ E(x) + (1-ρ) [α max(x) + (1-α) min(x)]
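The two indexes can be sketched numerically as follows. This is an illustration under stated assumptions, not Ellsberg's own computation: E(x) is read as the expected payoff under the best-guess distribution, while min(x) and max(x) are read as the lowest and highest expected payoffs over the restricted set of distributions. The three-colour urn, the payoffs, and the names `ellsberg_index`, `best_guess` and `dists` are hypothetical.

```python
from fractions import Fraction as F

def expected(act, dist):
    """Expected payoff of an act (one payoff per state) under one distribution."""
    return sum(p * x for p, x in zip(dist, act))

def ellsberg_index(act, best_guess, dists, rho, alpha=F(0)):
    """rho * E(x) + (1 - rho) * [alpha * max(x) + (1 - alpha) * min(x)].

    With alpha = 0 this reduces to the 1961 rule rho * E(x) + (1 - rho) * min(x).
    """
    values = [expected(act, d) for d in dists]
    hurwicz = alpha * max(values) + (1 - alpha) * min(values)
    return rho * expected(act, best_guess) + (1 - rho) * hurwicz

# Hypothetical urn: states (red, black, yellow), with P(red) = 1/3 known.
best_guess = (F(1, 3), F(1, 3), F(1, 3))
dists = [(F(1, 3), F(0), F(2, 3)), best_guess, (F(1, 3), F(2, 3), F(0))]

acts = {
    "red": (1, 0, 0),
    "black": (0, 1, 0),
    "red_or_yellow": (1, 0, 1),
    "black_or_yellow": (0, 1, 1),
}

rho = F(1, 2)  # assumed confidence in the best-guess distribution
for name, act in acts.items():
    print(name, ellsberg_index(act, best_guess, dists, rho))
```

With ρ = 1/2 and α = 0 the index ranks red above black and black-or-yellow above red-or-yellow, the pattern of choices usually associated with the paradox.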
SHACKLE’S APPROACH • A decision-maker, typically an entrepreneur, has to choose among alternative “sequels” on the basis of two elements: the possible gains and losses embedded in a sequel, called face-values, and a valuation of the “possibility” of the gains and losses, called potential surprise • The list of gains and losses is not complete so (additive) probability theory cannot be applied
Potential surprise can be considered as a degree of disbelief, or implausibility, of the hypothesis that supports the sequel. It can account for a “residual hypothesis” (i.e. an unanticipated event) • When the decision maker has to choose among alternative sequels, she re-considers the face-values of each sequel in the light of their degree of potential surprise. • Finally, Shackle defined a function σ, called “degree of stimulus” (or “ascendancy function”), in order to select two values, the focus-gain and the focus-loss, through which sequels are ranked
After rejecting von Neumann-Morgenstern expected utility maximization and considering Wald’s maximin too conservative, Shackle presented his decision rule for the ranking of sequels as follows: • “there is surely a third [criterion] which is more plausible, and of more general analytical power, than either of the two former, namely, that he [the decision-maker] will take into account both the ‘best possible’ and the ‘worst possible’ outcome of each course of action and make these pairs of outcomes the basis of his decision” • So Shackle’s criterion generalizes Hurwicz’s (better: Hurwicz is but a special case of Shackle!)
COMPARISON BETWEEN SHACKLE AND ELLSBERG • Shackle’s decision problem shares a crucial feature with the decision problem in which the Ellsberg paradox emerges. Both decision-making problems refer to an epistemic state that can be represented by the notion of ambiguity, which stands in the region between the two extremes of complete ignorance and risk • At first the two problems seem quite different since, in Ellsberg, the question is one of ambiguous probabilities, but with a complete list of all possible events, while in Shackle, the question is whether an exhaustive list of possible events can be provided at all.
But this difference concerns the degree rather than the quality of the uncertainty faced by the decision-maker • Both Shackle and Ellsberg assume that the decision-maker has partial knowledge about the future states of the world, and faces ambiguity in the sense of our definition, that is, awareness of the unreliability of a unique, additive probability distribution • What is relevant, as a result, is the representation of the small world in which the decision-maker has to act. From this viewpoint Shackle’s and Ellsberg’s scenarios are analogous; both of them are misrepresentations of the hypothetical grand world in which all future states of the world are (potentially) completely described
The grand world is the complete list of states which are of concern to an individual. The small world is a construction derived from a certain partition of the grand world into events, which constitute the states of the small world • A state in the small world is an event in the grand world. A consequence in the small world is an act in the grand world • Savage claimed that subjective expected utility theory should be applied only to small worlds. In fact, it is only in small worlds that all possibilities can be exhaustively enumerated in advance, and all implications of all possibilities explored in detail
Savage added that a small world is a microcosm if the probability of each state in the small world equals the probability of the corresponding event in the grand world • But if small worlds are microcosms, the decision-maker is supposed to be able to enumerate exhaustively all possibilities in advance, and to explore all consequences in detail, though she works exclusively in a practical setting called the small world. • Thereby it is as if she had a sort of “divine” knowledge of the outside world
What happens if the decision maker is not endowed with “divine knowledge” and the set of states of the small world that is relevant to her is a misspecified representation, because: • there are possible missing states which are accounted for in the grand world (Shackle’s view) • the decision-maker is unable to split an event in the small world into its constitutive states in the grand world (Ellsberg’s view)?
Shackle and Ellsberg talk of decision-making problems located in Savage’s small world. But of course their small worlds are not microcosms • To be precise, in both scenarios the decision-maker is unable to enumerate the mutually exclusive possible future states of the world. Roughly speaking, the decision-maker has only a coarse representation (partition) of the set of states of the world • The non-additive probability approach provides the formal context for interpreting Ellsberg and Shackle
Depending on the epistemic condition of the individual, the beliefs on the grand world may have a non-additive representation. In fact, if the individual transfers a likelihood assigned to an event in the small world to an event in the grand world, the implication is that she is unable to distribute beliefs across the elements of the grand world. • The non-additivity of subjective probability measures becomes an expression of the limits of the decision-maker’s understanding of the possibilities of the world, as well as of her awareness of these limits (Mukerji 1997). • Hence it is legitimate to assume that an individual with a perception of the grand world as fuzzy, incomplete, or vague behaves as if she had a non-additive prior rather than a well-defined probability
But the analogy between Ellsberg’s and Shackle’s problems and non-additive probability can be stressed as regards decision rules as well, by making reference to the modern developments of non-additive probability theory as representing multiple priors • On the one hand, Gilboa and Schmeidler (1989) proposed to consider an individual who has opinions about the likelihood of different states, but is not able to assign exact probabilities to them. According to their theory the decision maker has a convex set of subjective probabilities, which expresses the range of probabilities she considers possible. • Since the subjective probability is not unique there is a set of expected utilities for each action. Gilboa and Schmeidler then propose the following criterion: an action a is preferred to b if and only if the minimum possible value of the expected utility of a is greater than the minimum possible value of the expected utility of b.
Following Gärdenfors and Sahlin (1982), they called this criterion maximin expected utility (MEU). If the set of probabilities consists only of a single probability distribution, maximin expected utility coincides with subjective expected utility. If it consists of all possible probability distributions it coincides with Wald’s maximin. • On the other hand, Schmeidler showed that the proper integral for a capacity is the Choquet integral and proposed that decision-makers behave as if they maximise the Choquet integral of their utility function.
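The maximin criterion and its two limiting cases can be sketched as follows. This is an illustration on a finite state space with hypothetical utilities; the names `meu` and `meu_prefers` are not from the source, and the set of priors is represented only by its extreme points, which is enough because expected utility is linear in the prior.

```python
from fractions import Fraction as F

def expected_utility(utilities, prior):
    """Expected utility of an action (one utility per state) under one prior."""
    return sum(p * u for p, u in zip(prior, utilities))

def meu(utilities, priors):
    """Minimum expected utility over the given set of priors."""
    return min(expected_utility(utilities, p) for p in priors)

def meu_prefers(a, b, priors):
    """True if action a is strictly preferred to action b under maximin EU."""
    return meu(a, priors) > meu(b, priors)

act = (5, 1, 3)  # hypothetical utilities in three states

# Limiting case 1: a single prior, so MEU is ordinary expected utility.
single_prior = [(F(1, 3), F(1, 3), F(1, 3))]

# Limiting case 2: all distributions. Their extreme points are the degenerate
# distributions, so the minimum is the worst utility (Wald's maximin).
all_priors_vertices = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

print(meu(act, single_prior))
print(meu(act, all_priors_vertices))
```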
The Choquet integral with respect to a capacity had been known in the mathematical literature since Choquet (1954) as a generalisation of the Lebesgue integral to a non-additive measure. • Schmeidler applied the concept to decision under uncertainty and used the Choquet integral with respect to the non-additive measure as a generalisation of the mathematical expectation used in expected utility models. This new procedure is usually referred to as Choquet expected utility (CEU). • Furthermore Schmeidler provided an axiomatic foundation for CEU by means of a representation theorem that, in the same vein as Savage’s approach, made it possible to identify the non-additive probability uniquely, and the utility function up to a positive linear transformation.
There is a close relationship between MEU and CEU that needs emphasising. • Gilboa and Schmeidler (1994) showed that there is an isomorphism between a non-additive probability measure and a convex set of additive probability measures. • If the convex capacity that represents the decision-maker’s beliefs has a non-empty core of additive measures, then the CEU with respect to the capacity equals the maximin expected utility over the set of additive measures, that is, the subjective expected utility with respect to the least favourable probability distribution in the core.
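The equality between CEU and the minimum expected utility over the core can be checked on a small example. The sketch below assumes a two-state space and a hypothetical convex capacity; the function names and the numbers are illustrative, and the two extreme points of the core are written out by hand for this specific capacity rather than computed.

```python
from fractions import Fraction as F

def choquet(utility, capacity):
    """Choquet integral of a utility profile on a finite state space.

    States are ranked from best to worst; each utility is weighted by the
    increase in capacity when its state joins the set of better states.
    """
    ranked = sorted(utility, key=utility.get, reverse=True)
    total, previous, upper = 0, 0, set()
    for state in ranked:
        upper.add(state)
        current = capacity[frozenset(upper)]
        total += utility[state] * (current - previous)
        previous = current
    return total

def min_expected_utility(utility, priors):
    """Lowest expected utility over a list of additive priors."""
    return min(sum(p[s] * utility[s] for s in utility) for p in priors)

# Hypothetical convex capacity: v({s1}) + v({s2}) <= v({s1, s2}) = 1.
capacity = {
    frozenset(): F(0),
    frozenset({"s1"}): F(3, 10),
    frozenset({"s2"}): F(1, 2),
    frozenset({"s1", "s2"}): F(1),
}
# Core of this capacity: p(s1) >= 3/10 and p(s2) >= 1/2; its two extreme points.
core_vertices = [
    {"s1": F(3, 10), "s2": F(7, 10)},
    {"s1": F(1, 2), "s2": F(1, 2)},
]

for utility in ({"s1": 10, "s2": 4}, {"s1": 4, "s2": 10}):
    print(choquet(utility, capacity), min_expected_utility(utility, core_vertices))
```

In both cases the two numbers coincide, and the least favourable prior changes with the ranking of the outcomes.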
MEU suggests that the non-additive probability is the lower bound of what the “real” additive probability might be, and concentrates on the worst cases, regardless of any consideration relative to the individual’s degree of confidence in her probability assessment. • This limitation is overcome by a generalized version of MEU called α-maxmin expected utility (α-MEU). In this theory a crucial role is played by the parameter α ∈ [0,1], which expresses, exactly as in Arrow-Hurwicz, the decision maker’s ambiguity attitude. • The α-MEU emphasises the decision maker’s degree of ambiguity perception, and involves a valuation of the quality of the decision maker’s information.
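A minimal sketch of an α-maxmin evaluation follows. It assumes the convention in which α weights the worst expected utility and 1 − α the best. That convention is an assumption of this sketch, and it is the reverse of the Hurwicz index quoted earlier in these slides, where α multiplies the maximum. The function name, priors and utilities are hypothetical.

```python
from fractions import Fraction as F

def alpha_meu(utilities, priors, alpha):
    """alpha * (worst expected utility) + (1 - alpha) * (best expected utility)."""
    values = [sum(p * u for p, u in zip(prior, utilities)) for prior in priors]
    return alpha * min(values) + (1 - alpha) * max(values)

# Hypothetical three-colour urn: states (red, black, yellow), P(red) = 1/3.
urn_priors = [(F(1, 3), F(0), F(2, 3)), (F(1, 3), F(2, 3), F(0))]
bet_on_black = (0, 1, 0)

# alpha = 1 gives back the maximin expected utility.
print(alpha_meu(bet_on_black, urn_priors, 1))
# An intermediate alpha mixes the worst and best cases.
print(alpha_meu(bet_on_black, urn_priors, F(1, 2)))

# With all degenerate distributions as priors, the index weights the worst
# and best payoffs themselves, as in a Hurwicz-type rule.
vertices = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
print(alpha_meu((5, 1, 3), vertices, F(1, 4)))
```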
CONCLUDING REMARKS • Shackle and Ellsberg talk of decision-making problems located in Savage’s small world. Their small worlds are such that, in both cases, the decision-maker is unable to enumerate the mutually exclusive possible future states of the world • Shackle and Ellsberg were substantially referring to a model of ambiguity aversion currently known as the maxmin expected utility model (MEU) and Choquet expected utility (CEU)
Recently a slightly different decision rule, the α-MEU approach, has been proposed as an outcome of these developments • Ghirardato, Maccheroni and Marinacci (2004) assume that the decision-maker’s ambiguity is expressed by a set C of additive probability distributions (multiple priors) and the parameter α represents her ambiguity attitude, that is, an act f with utility u is evaluated by $\alpha \min_{p \in C} E_p[u(f)] + (1-\alpha) \max_{p \in C} E_p[u(f)]$
The α-MEU preference model is a generalization of Hurwicz’s maxmin functional in which the parameter α is constant • Although Shackle did not explicitly set the functional of the criterion he was proposing, it is straightforward to note that it overlaps with the α-MEU functional. • As a result both Shackle’s and Ellsberg’s approaches can be considered as variations of Hurwicz’s maxmin expected utility criterion (α-MEU) • On these grounds the ambiguity surrounding the decision maker in Ellsberg’s urn experiment can be deemed analogous to the uncertainty faced by Shackle’s entrepreneur taking unique decisions