This presentation explores different methods for assigning probabilities and introduces the concept of similarity-weighted frequencies for objective probabilities. It also discusses the axiomatization of probabilities and the role of similarity in empirical analysis.
Empirical Similarity and Objective Probabilities Joint works of subsets of A. Billot, G. Gayer, I. Gilboa, O. Lieberman, A. Postlewaite, D. Samet, D. Schmeidler
What is the probability that… • This coin will come up Heads? • My car will be stolen tonight? • I will survive the operation? • War will erupt over the next year?
Methods for assigning probabilities • The “classical” – Laplace’s Principle of Insufficient Reason • “Objective” – empirical frequencies • “Subjective” – degree of belief - Observe that the first two rely on a primitive notion of similarity
The subjective approach • Beautiful and axiomatically based • Problems: • In many situations, preferences are not complete until probabilities are assessed. • Says nothing about the formation of beliefs and allows for beliefs we would consider ridiculous. (Bayesian updating only aggravates the problem) • In "Rationality of Belief" and "Is It Always Rational to Satisfy Savage's Axioms?" w/ Postlewaite and Schmeidler, we argue that the Savage axioms are neither necessary nor sufficient for rationality
Our goal • To extend the definition of empirical frequencies so that they cover a larger domain of applications • To retain the claim to objectivity • By doing this we hope to get “objective probabilities” in more applications, but by no means in all!
Similarity-weighted frequencies – Set-up The data: $(x_i, y_i)_{i=1}^{n}$, where $x_i \in \mathbb{R}^{d}$ and $y_i \in \{0,1\}$. We are asked about the probability that $y_{n+1} = 1$ for a new data point $x_{n+1}$.
Similarity-weighted frequencies – Formula Choose a similarity function $s : \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}_{++}$. Given observations $(x_i, y_i)_{i=1}^{n}$ and a new data point $x_{n+1}$, estimate $\Pr(y_{n+1} = 1)$ by $$\hat{y}_{n+1} = \frac{\sum_{i \le n} s(x_i, x_{n+1})\, y_i}{\sum_{i \le n} s(x_i, x_{n+1})}.$$
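A minimal computational sketch of this formula; the toy data, the function name similarity_weighted_frequency, and the exponential similarity used in the example are illustrative choices, not taken from the papers.

```python
import numpy as np

def similarity_weighted_frequency(X, y, x_new, similarity):
    """Similarity-weighted frequency estimate of P(y_new = 1).

    X: (n, d) array of past characteristics x_i
    y: (n,) array of past outcomes in {0, 1}
    x_new: (d,) characteristics of the new data point
    similarity: function mapping (x_i, x_new) to a positive weight
    """
    weights = np.array([similarity(x_i, x_new) for x_i in X])
    return float(weights @ y / weights.sum())

# Illustrative use with an exponential similarity s(x, x') = exp(-||x - x'||)
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))
y = (X[:, 0] + rng.normal(scale=0.5, size=50) > 0).astype(float)
s = lambda a, b: np.exp(-np.linalg.norm(a - b))
print(similarity_weighted_frequency(X, y, np.array([0.3, -0.1]), s))
```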
Similarity-weighted frequencies – Interpretation • Special cases of $s$: • If $s$ is constant: the plain empirical frequency — an estimate of the expectation (in fact, "repeated experiment" is always a matter of subjective judgment of equal similarity) • If $s(x_i, x_{n+1}) = 1_{\{x_i = x_{n+1}\}}$: the conditional frequency — an estimate of the conditional expectation • Useful when precise updating leaves us with a sparse database • Akin to interpolation • But not to extrapolation!
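A small numerical illustration of these two special cases, on made-up data: constant weights reproduce the plain empirical frequency, and indicator weights reproduce the conditional frequency given $x = x_{n+1}$.

```python
import numpy as np

# Made-up data: outcomes y and a single binary characteristic x.
X = np.array([[0.0], [0.0], [1.0], [1.0], [1.0]])
y = np.array([1.0, 0.0, 1.0, 1.0, 0.0])
x_new = np.array([1.0])

# Constant similarity -> the plain empirical frequency of y (here 3/5).
w_const = np.ones(len(y))
print(w_const @ y / w_const.sum())

# Indicator similarity (1 iff x_i == x_new) -> conditional frequency given x = x_new (here 2/3).
w_ind = (X[:, 0] == x_new[0]).astype(float)
print(w_ind @ y / w_ind.sum())
```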
Axiomatization – Setup A finite set of observations (case types) $C = \{1, \dots, m\}$. A database is a multi-set of observations, i.e., a counting function $I : C \to \mathbb{Z}_{+}$ recording how many times each case type appears. We will refer to a database as a sequence or a multi-set interchangeably.
Axiomatization I: Observables • A state space $\Omega = \{1, 2, \dots, s\}$ • Fix a new data point $x$ • Databases $I \in \mathbb{Z}_{+}^{C}$, $I \ne 0$ • A probability assignment function $p : \mathbb{Z}_{+}^{C} \setminus \{0\} \to \Delta(\Omega)$
The combination axiom [Figure: two databases I and J over case types 1, …, m, and their combination I + J, are each mapped by p into the simplex Δ(Ω) over the states of the world Ω = {1, 2, …, s}; p(I + J) lies between p(I) and p(J).]
The combination axiom • Formally: for all databases $I, J$, $$p(I + J) = \lambda\, p(I) + (1 - \lambda)\, p(J)$$ for some $\lambda \in (0, 1)$ (which may depend on $I$ and $J$)
Theorem I • The combination axiom holds, and not all the $\{p(I)\}$ are collinear, if and only if • For each case type $c \in C$ there are $p_c \in \Delta(\Omega)$, not all collinear, and $s_c > 0$ such that, for every database $I$, $$p(I) = \frac{\sum_{c \in C} I(c)\, s_c\, p_c}{\sum_{c \in C} I(c)\, s_c}$$ • In "Probabilities as Similarity-Weighted Frequencies" w/ Billot, Samet, Schmeidler
Probability = frequency in perspective [Figure: the frequency vector F = (F1, F2, F3) of the case types in database I is mapped into Δ(Ω) as p(F) = p(I), proportional to F1 s1 p1 + F2 s2 p2 + F3 s3 p3 — each case type pulls the probability toward its own p_c with weight proportional to F_c s_c.]
What about a single dimension? • The perspective only works for at least three states ($|\Omega| \ge 3$), since it requires non-collinear probability vectors • Evidently, probability is also interesting with two states
Axiomatization II – Observables Fix a new datapoint $x$. For each database $I$, we assume a binary relation $\succsim_I$ on $[0, 1]$. $p \succsim_I q$ is interpreted as "given database $I$, and the new datapoint $x$, $p$ is a more likely estimate of the probability than is $q$"
Axioms • Weak order: $\succsim_I$ is complete and transitive • Combination: $p \succsim_I q$ and $p \succsim_J q$ imply $p \succsim_{I+J} q$, and $p \succ_I q$ and $p \succsim_J q$ imply $p \succ_{I+J} q$ • Archimedean: $p \succ_I q$ implies that for every database $J$ there is a $k \ge 1$ s.t. $p \succ_{kI + J} q$
Axioms – cont. • Averaging: if all the $x_i$'s are constant over $I$, then $\succsim_I$ ranks values by their proximity to the empirical frequency of $\{y_i = 1\}$ in $I$
Theorem II The axioms hold iff there exists a function $s : \mathbb{R}^{d} \times \mathbb{R}^{d} \to \mathbb{R}_{++}$ such that, for every database $I$, $\succsim_I$ ranks values by their proximity to $$\hat{y} = \frac{\sum_{i} s(x_i, x)\, y_i}{\sum_{i} s(x_i, x)},$$ where the sum runs over the observations $(x_i, y_i)$ in $I$ and $x$ is the new datapoint. The function $s$ is unique up to multiplication by a positive number. • In "Empirical Similarity" w/ Lieberman and Schmeidler
Exponential similarity – Axiomatization Generic notation: the prediction lies between the smallest and largest components of the vector of observed values – hence a similarity-weighted average • Shift: the prediction is invariant to a common translation of all data points and the new point • Ray Monotonicity: the weight of an observation decreases in its distance from the new point along a ray
Exponential similarity – Axiomatization (cont.) • Symmetry • Ray Invariance • Self-Relevance
Theorem III The axioms hold iff there exists a norm $\|\cdot\|$ such that $$s(x, x') = e^{-\|x - x'\|}$$ • Satisfies "multiplicative transitivity": $s(x, z) \ge s(x, y)\, s(y, z)$ • In "Exponential Similarity" w/ Billot and Schmeidler
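A one-line check of this property, assuming "multiplicative transitivity" refers to the inequality $s(x, z) \ge s(x, y)\, s(y, z)$: for the exponential form it follows directly from the triangle inequality of the norm.

```latex
% Multiplicative transitivity of s(x, x') = e^{-\|x - x'\|}, via the triangle inequality:
\[
  s(x, z) = e^{-\|x - z\|}
          \;\ge\; e^{-\left(\|x - y\| + \|y - z\|\right)}
          = e^{-\|x - y\|}\, e^{-\|y - z\|}
          = s(x, y)\, s(y, z),
\]
% with equality whenever y lies on the segment between x and z.
```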
The Similarity – whence? • In "Empirical Similarity" w/ Lieberman and Schmeidler we propose an empirical approach: • Estimate the similarity function from the data • A parametrized approach: consider a certain functional form • Choose a criterion to measure goodness of fit • Find the best parameters
A functional form • Consider a weighted Euclidean distance $$d_w(x, x') = \sqrt{\sum_{j=1}^{d} w_j (x_j - x'_j)^2}, \qquad w_j \ge 0,$$ and a similarity that decreases in it (e.g., $s_w(x, x') = e^{-d_w(x, x')}$)
Selection criteria • Find weights $w$ that would minimize the sum of squared errors $\sum_{t} (y_t - \hat{y}^{\,w}_t)^2$, where $\hat{y}^{\,w}_t$ is the similarity-weighted estimate of $y_t$ computed from the other observations • Or: round off $\hat{y}^{\,w}_t$ to get a prediction $\tilde{y}^{\,w}_t \in \{0, 1\}$ • and then minimize the number of misclassifications $\sum_{t} (y_t - \tilde{y}^{\,w}_t)^2$ (see the sketch below)
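A sketch of this selection procedure under stated assumptions: it uses the weighted Euclidean distance from the previous slide, an exponential similarity, a leave-one-out squared-error criterion, and an off-the-shelf optimizer; the toy data and function names are illustrative, not the papers' implementation.

```python
import numpy as np
from scipy.optimize import minimize

def loo_estimates(w, X, y):
    """Leave-one-out similarity-weighted estimates of each y_t under weights w."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2 * w).sum(axis=2)  # weighted squared distances
    S = np.exp(-np.sqrt(d2))          # exponential similarity (an assumed functional form)
    np.fill_diagonal(S, 0.0)          # exclude each observation from its own estimate
    return S @ y / S.sum(axis=1)

def sse(log_w, X, y):
    """Sum of squared errors; weights parametrized as exp(log_w) to keep them positive."""
    return float(((y - loo_estimates(np.exp(log_w), X, y)) ** 2).sum())

# Toy data (illustrative only)
rng = np.random.default_rng(1)
X = rng.normal(size=(80, 3))
y = (X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=80) > 0).astype(float)

res = minimize(sse, x0=np.zeros(3), args=(X, y), method="Nelder-Mead")
w_hat = np.exp(res.x)
print("fitted weights:", w_hat)

# Alternative criterion: round off to 0/1 predictions and count misclassifications.
y_tilde = (loo_estimates(w_hat, X, y) >= 0.5).astype(float)
print("misclassifications:", int(np.abs(y - y_tilde).sum()))
```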
How objective is it? • Modeling choices that can affect the “probability”: • Choice of X’s and of sample • Choice of functional form • Choice of goodness of fit criterion • As usual, objectivity may be an unattainable ideal • But it doesn’t mean we shouldn’t try.
Statistical inference • In "Empirical Similarity" w/ Lieberman and Schmeidler we also develop statistical inference tools for our estimation procedure • Assume that the data were generated by a DGP of the similarity-weighted type (each $y$ is a similarity-weighted average of the other observations, plus noise; see below) • Estimate the similarity function from the data • Perform statistical inference
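One way to write such a DGP, in the spirit of the similarity-weighted formula above; the exact specification in the paper may differ in details (e.g., the error distribution, or whether the sum runs over past observations only).

```latex
\[
  Y_t \;=\; \frac{\sum_{i \neq t} s_w(x_i, x_t)\, Y_i}{\sum_{i \neq t} s_w(x_i, x_t)} \;+\; \varepsilon_t ,
  \qquad \varepsilon_t \ \text{i.i.d. with } \mathbb{E}[\varepsilon_t] = 0 .
\]
```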
Statistical inference – cont. • Estimate the weights $w$ by maximum likelihood • Test hypotheses of the form $w_j = 0$ (variable $j$ is irrelevant to the similarity) • Predict out-of-sample by the maximum likelihood estimators (via the similarity-weighted average formula)
Failures of the combination axiom • Integration of induction and deduction • Learning the parameter of a coin • Linear regression • The approach is limited to case-to-case induction, generalizing empirical frequencies
Failures of the combination axiom – cont. • Second-order induction • Learning the similarity function • In particular, the axiom doesn't allow the similarity function to become more concentrated for large databases • Combination must be restricted to periods of "no learning"
Future Directions Integrate empirical similarity with: • Bayesian networks – to capture Bayesian reasoning such as a chain of conditional probabilities. • Logistic regression – to allow the identification of trends.
How close is rationality to objectivity? • Rationality – behaving in a way that doesn't lead to regret or embarrassment when faced with an analysis of one's own choices. • Objectivity – has to do with the ability to convince others. • Both "rational" and "objective" have to do with reasoning and convincing.