Learn about the theory and application of discrete choice modeling in econometrics, including binary choice models and estimation techniques.
Econometrics Chengyuan Yin School of Mathematics
Econometrics 23. Discrete Choice Modeling
A Microeconomics Platform • Consumers Maximize Utility (!!!) • Fundamental Choice Problem: Maximize U(x1,x2,…) subject to prices and budget constraints • A Crucial Result for the Classical Problem: • Indirect Utility Function: V = V(p,I) • Demand System of Continuous Choices • The Integrability Problem: Utility is not revealed by demands
Theory for Discrete Choice • Theory is silent about discrete choices • Translation to discrete choice • Existence of well defined utility indexes: Completeness of rankings • Rationality: Utility maximization • Axioms of revealed preferences • Choice sets and consideration sets – consumers simplify choice situations • Implication for choice among a set of discrete alternatives • Commonalities and uniqueness • Does this allow us to build “models?” • What common elements can be assumed? • How can we account for heterogeneity? • Revealed choices do not reveal utility, only rankings which are scale invariant
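Choosing Between Two Alternatives • Modeling the Binary Choice: $U_{i,suv} = \alpha_{suv} + \beta P_{suv} + \gamma_{suv}\,\text{Income} + \varepsilon_{i,suv}$ and $U_{i,sed} = \alpha_{sed} + \beta P_{sed} + \gamma_{sed}\,\text{Income} + \varepsilon_{i,sed}$ • Chooses SUV: $U_{i,suv} > U_{i,sed}$, i.e., $U_{i,suv} - U_{i,sed} > 0$ • $(\alpha_{suv} - \alpha_{sed}) + \beta(P_{suv} - P_{sed}) + (\gamma_{suv} - \gamma_{sed})\,\text{Income} + \varepsilon_{i,suv} - \varepsilon_{i,sed} > 0$ • $\varepsilon_i > -[\alpha + \beta(P_{suv} - P_{sed}) + \gamma\,\text{Income}]$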
What Can Be Learned from the Data? (A Sample of Consumers, i = 1,…,N) • Are the attributes “relevant?” • Predicting behavior • Individual • Aggregate • Analyze changes in behavior when • attributes change
Application • 210 Commuters Between Sydney and Melbourne • Available modes = Air, Train, Bus, Car • Observed: • Choice • Attributes: Cost, terminal time, other • Characteristics: Household income • First application: Fly or Other
Binary Choice Data
Choose Air   Gen.Cost   Term Time   Income
1.0000        86.000     25.000     70.000
.00000        67.000     69.000     60.000
.00000        77.000     64.000     20.000
.00000        69.000     69.000     15.000
.00000        77.000     64.000     30.000
.00000        71.000     64.000     26.000
.00000        58.000     64.000     35.000
.00000        71.000     69.000     12.000
.00000        100.00     64.000     70.000
1.0000        158.00     30.000     50.000
1.0000        136.00     45.000     40.000
1.0000        103.00     30.000     70.000
.00000        77.000     69.000     10.000
1.0000        197.00     45.000     26.000
.00000        129.00     64.000     50.000
.00000        123.00     64.000     70.000
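An Econometric Model • Choose to fly iff $U_{fly} > 0$ • $U_{fly} = \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income} + \varepsilon$ • $U_{fly} > 0 \Rightarrow \varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})$ • Probability model: For any person observed by the analyst, $\text{Prob(fly)} = \text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$ • Note the relationship between the unobserved $\varepsilon$ and the outcome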
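The probability statement on this slide can be reduced to a CDF evaluated at the index when the distribution of $\varepsilon$ is symmetric about zero (as in the normal and logistic cases introduced later in the deck). A short derivation, where the shorthand $a$ for the index is a label introduced here for brevity:

```latex
\begin{aligned}
a &= \alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income},\\
\text{Prob(fly)} &= \text{Prob}[\varepsilon > -a] = 1 - F(-a),\\
1 - F(-a) &= F(a) \quad \text{if } f(\varepsilon) = f(-\varepsilon).
\end{aligned}
```

For an asymmetric distribution such as the extreme value case, only the first two lines apply.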
Modeling Approaches • Nonparametric – “relationship” • Minimal Assumptions • Minimal Conclusions • Semiparametric – “index function” • Stronger assumptions • Robust to model misspecification (heteroscedasticity) • Still weak conclusions • Parametric – “Probability function and index” • Strongest assumptions – complete specification • Strongest conclusions • Possibly less robust. (Not necessarily)
Nonparametric P(Air)=f(Income)
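One way a nonparametric relationship of this kind could be computed is kernel regression. The sketch below is an assumption about the method (the slide does not say how its figure was produced): it applies a Nadaraya-Watson estimator with a Gaussian kernel to the 16-row excerpt shown in the Binary Choice Data slide, not the full 210 observations, and the bandwidth of 15 is an arbitrary illustrative choice.

```python
import math

# (income, chose_air) pairs copied from the 16-row Binary Choice Data excerpt
data = [(70, 1), (60, 0), (20, 0), (15, 0), (30, 0), (26, 0), (35, 0), (12, 0),
        (70, 0), (50, 1), (40, 1), (70, 1), (10, 0), (26, 1), (50, 0), (70, 0)]

def kernel_prob(income, sample, bandwidth=15.0):
    """Nadaraya-Watson estimate of P(Air | income) with a Gaussian kernel.

    bandwidth is a hypothetical smoothing choice, not taken from the slides.
    """
    weights = [math.exp(-0.5 * ((income - x) / bandwidth) ** 2) for x, _ in sample]
    weighted_choices = sum(w * y for w, (_, y) in zip(weights, sample))
    return weighted_choices / sum(weights)

for inc in (15, 35, 55, 70):
    print(inc, round(kernel_prob(inc, data), 3))
```

The estimate makes no functional-form assumption, which is why its conclusions are limited to the shape of the relationship.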
Semiparametric • MSCORE: Find b so that the sample sum of sign(b’x) * sign(y) is maximized, with y coded -1/+1 • Klein and Spady: Find b to maximize a semiparametric likelihood of G(b’x)
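The maximum score criterion can be sketched as a function of a candidate b. This is a minimal illustration under assumptions: the helper names are hypothetical, the rows use only a constant and terminal time from a few lines of the data excerpt, and no search over b is performed.

```python
def sign(v):
    """Return +1 for non-negative values and -1 otherwise."""
    return 1 if v >= 0 else -1

def max_score(b, rows):
    """Sum over observations of sign(b'x) * (+1 if y == 1 else -1).

    Each correctly predicted sign adds +1, each miss adds -1.
    """
    total = 0
    for y, x in rows:
        index = sum(bk * xk for bk, xk in zip(b, x))
        total += sign(index) * (1 if y == 1 else -1)
    return total

# (y, x) with x = (1, terminal time); rows taken from the data excerpt
rows = [(1, (1, 25)), (0, (1, 69)), (0, (1, 64)), (1, (1, 30)), (1, (1, 45))]

print(max_score((100.0, -2.0), rows))   # index = 100 - 2 * time
print(max_score((50.0, -1.0), rows))    # same b at half the scale
print(max_score((-100.0, 2.0), rows))   # reversed signs
```

The first two calls give the same score because only the sign of the index matters, which is why a normalization of b is needed.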
Klein and Spady Semiparametric Note necessary normalizations. Coefficients are not very meaningful.
Logit vs. MScore • Logit fits worse • MScore fits better, coefficients are meaningless
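Parametric Model Estimation • How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$? • It’s not regression • The technique of maximum likelihood • $\text{Prob}[y=1] = \text{Prob}[\varepsilon > -(\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma\,\text{Income})]$, $\text{Prob}[y=0] = 1 - \text{Prob}[y=1]$ • Requires a model for the probability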
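To make the likelihood idea concrete, the sketch below evaluates a probit log likelihood at trial parameter values. It assumes a normal distribution for the unobserved term, uses six rows from the data excerpt, and takes the probit estimates reported later in the deck as the trial values; the function names are hypothetical, and no maximization is carried out.

```python
import math

def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# (air, cost, time, income): six rows copied from the Binary Choice Data excerpt
rows = [(1, 86, 25, 70), (0, 67, 69, 60), (0, 77, 64, 20), (0, 69, 69, 15),
        (1, 158, 30, 50), (1, 136, 45, 40)]

def probit_loglik(params, rows):
    """Sum of log Prob[y=1] for fliers and log Prob[y=0] for everyone else."""
    alpha, b_cost, b_time, gamma = params
    loglik = 0.0
    for y, cost, time, income in rows:
        p = norm_cdf(alpha + b_cost * cost + b_time * time + gamma * income)
        loglik += math.log(p) if y == 1 else math.log(1.0 - p)
    return loglik

trial = (0.43877183, 0.01256304, -0.04778261, 0.01442242)
print(probit_loglik(trial, rows))                  # at the reported estimates
print(probit_loglik((0.0, 0.0, 0.0, 0.0), rows))   # every probability is 0.5
```

Maximum likelihood searches for the parameter vector that makes this sum as large as possible over the full sample.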
Completing the Model: $F(\varepsilon)$ • The distribution • Normal: PROBIT, natural for behavior • Logistic: LOGIT, allows “thicker tails” • Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice • Does it matter? • Yes, large difference in estimates • Not much, quantities of interest are more stable.
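The three candidate distributions can be compared directly at the same index value. The sketch below is illustrative only; in particular, the extreme value CDF is written as exp(-exp(-z)), which is one common parameterization and an assumption about the form the software behind these slides uses.

```python
import math

def probit_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def logit_cdf(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def extreme_value_cdf(z):
    """Type I extreme value CDF, exp(-exp(-z)) (assumed parameterization)."""
    return math.exp(-math.exp(-z))

for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(z, round(probit_cdf(z), 4), round(logit_cdf(z), 4),
          round(extreme_value_cdf(z), 4))
```

The normal and logistic CDFs both equal 0.5 at zero and are symmetric, while the extreme value CDF is not, which is the asymmetry the slide refers to.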
Estimated Binary Choice (Probit) Model
+---------------------------------------------+
| Binomial Probit Model                       |
| Maximum Likelihood Estimates                |
| Dependent variable                 MODE     |
| Weighting variable                 None     |
| Number of observations              210     |
| Iterations completed                  6     |
| Log likelihood function       -84.09172     |
| Restricted log likelihood     -123.7570     |
| Chi squared                    79.33066     |
| Degrees of freedom                    3     |
| Prob[ChiSqd > value] =         .0000000     |
| Hosmer-Lemeshow chi-squared =  46.96547     |
| P-value=  .00000 with deg.fr. =       8     |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 Constant      .43877183       .62467004      .702    .4824
 GC            .01256304       .00368079     3.413    .0006   102.647619
 TTME         -.04778261       .00718440    -6.651    .0000   61.0095238
 HINC          .01442242       .00573994     2.513    .0120   34.5476190
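The chi-squared statistic in this output is the likelihood ratio statistic, twice the difference between the unrestricted and restricted log likelihoods. The sketch below recomputes it from the printed values; the result (about 79.3306) agrees with the reported 79.33066 up to rounding of the printed log likelihoods. The likelihood ratio index on the last line is a standard summary measure added here for illustration and does not appear in the output.

```python
loglik = -84.09172             # Log likelihood function
loglik_restricted = -123.7570  # Restricted log likelihood (constant only)

# Likelihood ratio statistic, compared with chi-squared with 3 degrees of freedom
lr_statistic = 2.0 * (loglik - loglik_restricted)

# Likelihood ratio index (McFadden pseudo R-squared); not part of the output above
lr_index = 1.0 - loglik / loglik_restricted

print(round(lr_statistic, 4))
print(round(lr_index, 4))
```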
Estimated Binary Choice Models
                  LOGIT                  PROBIT               EXTREME VALUE
Variable    Estimate    t-ratio    Estimate     t-ratio    Estimate     t-ratio
Constant    1.78458     1.40591    0.438772     0.702406   1.45189      1.34775
GC          0.0214688   3.15342    0.012563     3.41314    0.0177719    3.14153
TTME       -0.098467   -5.9612    -0.0477826   -6.65089   -0.0868632   -5.91658
HINC        0.0223234   2.16781    0.0144224    2.51264    0.0176815    2.02876
Log-L      -80.9658               -84.0917                -76.5422
Log-L(0)   -123.757               -123.757                -123.757
Effect on Predicted Probability of an Increase in Income: $\alpha + \beta_1\,\text{Cost} + \beta_2\,\text{Time} + \gamma(\text{Income} + 1)$ ($\gamma$ is positive)
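Marginal Effects in Probability Models • $\text{Prob[Outcome]} = \text{some } F(\alpha + \beta_1\,\text{Cost} \ldots)$ • “Partial effect” = $\partial F(\alpha + \beta_1\,\text{Cost} \ldots)/\partial x$ (derivative) • Partial effects are derivatives • Result varies with model • Logit: $\partial F(\alpha + \beta_1\,\text{Cost} \ldots)/\partial x = \text{Prob} \times (1 - \text{Prob}) \times \beta$ • Probit: $\partial F(\alpha + \beta_1\,\text{Cost} \ldots)/\partial x = \text{Normal density} \times \beta$ • Scaling usually erases model differences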
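The two formulas on this slide can be evaluated at the sample means using the probit and logit estimates and the means of X reported in the estimation slides. The sketch below is illustrative (the function and dictionary names are hypothetical); the probit income effect it produces should come out close to the .004539 that the elasticity slide reports as the estimated marginal effect.

```python
import math

# Means of X and coefficient estimates copied from the estimation output
means = {"GC": 102.647619, "TTME": 61.0095238, "HINC": 34.5476190}
probit = {"Constant": 0.43877183, "GC": 0.01256304,
          "TTME": -0.04778261, "HINC": 0.01442242}
logit = {"Constant": 1.78458, "GC": 0.0214688,
         "TTME": -0.098467, "HINC": 0.0223234}

def index_at_means(coefs):
    """Index function evaluated at the sample means."""
    return coefs["Constant"] + sum(coefs[k] * means[k] for k in means)

def probit_effects(coefs):
    """Normal density at the index, times each slope coefficient."""
    z = index_at_means(coefs)
    density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return {k: density * coefs[k] for k in means}

def logit_effects(coefs):
    """Prob * (1 - Prob) at the index, times each slope coefficient."""
    p = 1.0 / (1.0 + math.exp(-index_at_means(coefs)))
    return {k: p * (1.0 - p) * coefs[k] for k in means}

print(probit_effects(probit))
print(logit_effects(logit))
```

Although the raw logit and probit coefficients differ by a sizable factor, the partial effects computed this way are much closer to each other.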
Marginal Effects for Binary Choice • Logit • Probit
Estimated Marginal Effects Logit Probit Extreme Value
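Marginal Effect for a Dummy Variable • $\text{Prob}[y_i = 1 \mid x_i, d_i] = F(\beta' x_i + \gamma d_i)$ = conditional mean • Marginal effect of d: $\text{Prob}[y_i = 1 \mid x_i, d_i = 1] - \text{Prob}[y_i = 1 \mid x_i, d_i = 0]$ • Logit: $\Lambda(\beta' x_i + \gamma) - \Lambda(\beta' x_i)$, where $\Lambda$ is the logistic CDF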
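The effect of a dummy variable is a difference in probabilities rather than a derivative. The sketch below shows the computation for a logit model; the index value and the dummy coefficient are hypothetical numbers chosen for illustration, not estimates from the deck.

```python
import math

def logistic(z):
    """Logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))

def dummy_effect(index_without_dummy, gamma):
    """P[y=1 | x, d=1] - P[y=1 | x, d=0] for a logit model."""
    return logistic(index_without_dummy + gamma) - logistic(index_without_dummy)

def derivative_at_zero(index_without_dummy, gamma):
    """Derivative-based approximation evaluated at d = 0, for comparison."""
    p = logistic(index_without_dummy)
    return p * (1.0 - p) * gamma

base_index = -1.0   # hypothetical value of beta'x
gamma = 0.5         # hypothetical coefficient on the dummy variable

print(dummy_effect(base_index, gamma))
print(derivative_at_zero(base_index, gamma))
```

The two numbers are close but not equal, which is why the output for a dummy variable reports the discrete difference.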
(Marginal) Effect – Dummy Variable: HighIncm = 1(Income > 50)
+-------------------------------------------+
| Partial derivatives of probabilities with |
| respect to the vector of characteristics. |
| They are computed at the means of the Xs. |
| Observations used are All Obs.            |
+-------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Characteristics in numerator of Prob[Y = 1]
 Constant     .4750039483         .23727762        2.002    .0453
 GC           .3598131572E-02     .11354298E-02    3.169    .0015   102.64762
 TTME        -.1759234212E-01     .34866343E-02   -5.046    .0000   61.009524
 Marginal effect for dummy variable is P|1 - P|0.
 HIGHINCM     .8565367181E-01     .99346656E-01     .862    .3886   .18571429
Computing Effects • Compute at the data means? • Simple • Inference is well defined • Average the individual effects • More appropriate? • Asymptotic standard errors. (Not done correctly in the literature – terms are correlated!) • Is testing about marginal effects meaningful?
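Elasticities • Elasticity = $\partial \log \text{Prob} / \partial \log x = (\partial \text{Prob} / \partial x) \times (x / \text{Prob})$ • How to compute standard errors? • Delta method • Bootstrap • Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.) • Bootstrap model estimation?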
Estimated Income Elasticity for Air Choice Model
+------------------------------------------+
| Results of bootstrap estimation of model.|
| Model has been reestimated   25 times.   |
| Statistics shown below are centered      |
| around the original estimate based on    |
| the original full sample of observations.|
| Result is ETA     =     .71183           |
| bootstrap samples have 840 observations. |
| Estimate  RtMnSqDev  Skewness  Kurtosis  |
|   .712      .266      -.779     2.258    |
| Minimum =     .125  Maximum =    1.135   |
+------------------------------------------+
Mean Income = 34.55, Mean P = .2716, Estimated ME = .004539, Estimated Elasticity = 0.5774.
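The elasticity in the last line of this slide is the marginal effect scaled by the ratio of mean income to mean probability. A short check using only the numbers printed on the slide:

```python
mean_income = 34.55         # Mean Income
mean_prob = 0.2716          # Mean P
marginal_effect = 0.004539  # Estimated ME

# Elasticity = (dProb/dx) * (x / Prob), evaluated at the reported means
elasticity = marginal_effect * mean_income / mean_prob
print(round(elasticity, 4))  # 0.5774
```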
Odds Ratio – Logit Model Only Effect Measure? “Effect of a unit change in the odds ratio.”
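The reason this measure is specific to the logit model can be seen in a short derivation. The income example uses the logit HINC estimate from the estimated models table; the symbols follow the earlier slides.

```latex
\begin{aligned}
\text{Prob} &= \frac{\exp(\beta' x)}{1 + \exp(\beta' x)}
\quad\Rightarrow\quad
\frac{\text{Prob}}{1 - \text{Prob}} = \exp(\beta' x),\\[4pt]
\frac{\text{Odds}(x_k + 1)}{\text{Odds}(x_k)} &= \exp(\beta_k),
\qquad \text{e.g. } \exp(0.0223234) \approx 1.0226 \text{ for HINC}.
\end{aligned}
```

Under this reading, a one-unit increase in income multiplies the odds of flying by about 1.023, regardless of the values of the other variables.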
Ordered Outcomes • E.g.: Taste test, credit rating, course grade • Underlying random preferences: Mapping to observed choices • Strength of preferences • Censoring and discrete measurement • The nature of ordered data
Modeling Ordered Choices • Random Utility: $U_{it} = \alpha + \beta' x_{it} + \gamma_i' z_{it} + \varepsilon_{it} = a_{it} + \varepsilon_{it}$ • Observe outcome j if utility is in region j • Probability of outcome = probability of cell: $\Pr[Y_{it} = j] = F(\mu_j - a_{it}) - F(\mu_{j-1} - a_{it})$
Health Care Satisfaction (HSAT) Self administered survey: Health Care Satisfaction? (0 – 10) Continuous Preference Scale
Effects in the Ordered Probability Model • Assume $\beta_k$ is positive and $x_k$ increases, so $\beta' x$ increases and $\mu_j - \beta' x$ shifts to the left for all 5 cells • Prob[y=0] decreases • Prob[y=1] decreases: the mass shifted out is larger than the mass shifted in • Prob[y=2] decreases for the same reason • Prob[y=3] increases • Prob[y=4] increases • When $\beta_k > 0$, an increase in $x_k$ decreases Prob[y=0] and increases Prob[y=J]. Intermediate cells are ambiguous, but there is only one sign change in the marginal effects from 0 to 1 to … to J
Ordered Probability Model for Health Satisfaction
+---------------------------------------------+
| Ordered Probability Model                   |
| Dependent variable                 HSAT     |
| Number of observations            27326     |
| Underlying probabilities based on Normal    |
| Cell frequencies for outcomes               |
|  Y Count Freq   Y Count Freq   Y Count Freq |
|  0   447 .016   1   255 .009   2   642 .023 |
|  3  1173 .042   4  1390 .050   5  4233 .154 |
|  6  2530 .092   7  4231 .154   8  6172 .225 |
|  9  3061 .112  10  3192 .116                |
+---------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 Index function for probability
 Constant     2.61335825       .04658496    56.099    .0000
 FEMALE       -.05840486       .01259442    -4.637    .0000   .47877479
 EDUC          .03390552       .00284332    11.925    .0000   11.3206310
 AGE          -.01997327       .00059487   -33.576    .0000   43.5256898
 HHNINC        .25914964       .03631951     7.135    .0000   .35208362
 HHKIDS        .06314906       .01350176     4.677    .0000   .40273000
 Threshold parameters for index
 Mu(1)         .19352076       .01002714    19.300    .0000
 Mu(2)         .49955053       .01087525    45.935    .0000
 Mu(3)         .83593441       .00990420    84.402    .0000
 Mu(4)        1.10524187       .00908506   121.655    .0000
 Mu(5)        1.66256620       .00801113   207.532    .0000
 Mu(6)        1.92729096       .00774122   248.965    .0000
 Mu(7)        2.33879408       .00777041   300.987    .0000
 Mu(8)        2.99432165       .00851090   351.822    .0000
 Mu(9)        3.45366015       .01017554   339.408    .0000
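The cell probabilities implied by these estimates can be computed at the sample means from the formula Pr[Y=j] = F(mu_j - a) - F(mu_{j-1} - a). The sketch below assumes the usual normalization that the lowest threshold is zero when the index contains a constant (the output lists only Mu(1) to Mu(9)); variable names are hypothetical.

```python
import math

def norm_cdf(z):
    """Standard normal CDF; handles +/- infinity."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Constant, FEMALE, EDUC, AGE, HHNINC, HHKIDS: estimates and means from the output
coefs = [2.61335825, -0.05840486, 0.03390552, -0.01997327, 0.25914964, 0.06314906]
means = [1.0, 0.47877479, 11.3206310, 43.5256898, 0.35208362, 0.40273000]

# Thresholds: mu_0 = 0 (assumed normalization), then Mu(1)..Mu(9) as reported
mu = [0.0, 0.19352076, 0.49955053, 0.83593441, 1.10524187, 1.66256620,
      1.92729096, 2.33879408, 2.99432165, 3.45366015]

a = sum(b * x for b, x in zip(coefs, means))   # index at the means
cuts = [-math.inf] + mu + [math.inf]           # 12 cut points bound 11 cells

probs = [norm_cdf(cuts[j + 1] - a) - norm_cdf(cuts[j] - a) for j in range(11)]

for j, p in enumerate(probs):
    print(j, round(p, 4))
```

These are probabilities for an individual at the sample means, so they need not match the sample cell frequencies listed in the output.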
Ordered Probability Effects
+----------------------------------------------------+
| Marginal effects for ordered probability model     |
| M.E.s for dummy variables are Pr[y|x=1]-Pr[y|x=0]  |
| Names for dummy variables are marked by *.         |
+----------------------------------------------------+
+---------+--------------+----------------+--------+---------+----------+
|Variable | Coefficient  | Standard Error |b/St.Er.|P[|Z|>z] | Mean of X|
+---------+--------------+----------------+--------+---------+----------+
 These are the effects on Prob[Y=00] at means.
 *FEMALE       .00200414       .00043473     4.610    .0000   .47877479
 EDUC         -.00115962      .986135D-04  -11.759    .0000   11.3206310
 AGE           .00068311      .224205D-04   30.468    .0000   43.5256898
 HHNINC       -.00886328       .00124869    -7.098    .0000   .35208362
 *HHKIDS      -.00213193       .00045119    -4.725    .0000   .40273000
 These are the effects on Prob[Y=01] at means.
 *FEMALE       .00101533       .00021973     4.621    .0000   .47877479
 EDUC         -.00058810      .496973D-04  -11.834    .0000   11.3206310
 AGE           .00034644      .108937D-04   31.802    .0000   43.5256898
 HHNINC       -.00449505       .00063180    -7.115    .0000   .35208362
 *HHKIDS      -.00108460       .00022994    -4.717    .0000   .40273000
 ... repeated for all 11 outcomes
 These are the effects on Prob[Y=10] at means.
 *FEMALE      -.01082419       .00233746    -4.631    .0000   .47877479
 EDUC          .00629289       .00053706    11.717    .0000   11.3206310
 AGE          -.00370705       .00012547   -29.545    .0000   43.5256898
 HHNINC        .04809836       .00678434     7.090    .0000   .35208362
 *HHKIDS       .01181070       .00255177     4.628    .0000   .40273000
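For a continuous variable, the effect on each cell probability is the difference of two normal densities times the coefficient. The sketch below applies this to EDUC using the ordered probit estimates and means from the previous slide; it assumes the lowest threshold is normalized to zero, and the names are hypothetical. Under these assumptions the first and last effects come out close to the -.00115962 and .00629289 reported above, and the signs change exactly once across the 11 cells.

```python
import math

def norm_pdf(z):
    """Standard normal density; evaluates to 0.0 at +/- infinity."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

# Constant, FEMALE, EDUC, AGE, HHNINC, HHKIDS: estimates and means from the output
coefs = [2.61335825, -0.05840486, 0.03390552, -0.01997327, 0.25914964, 0.06314906]
means = [1.0, 0.47877479, 11.3206310, 43.5256898, 0.35208362, 0.40273000]
mu = [0.0, 0.19352076, 0.49955053, 0.83593441, 1.10524187, 1.66256620,
      1.92729096, 2.33879408, 2.99432165, 3.45366015]   # mu_0 = 0 assumed

a = sum(b * x for b, x in zip(coefs, means))
cuts = [-math.inf] + mu + [math.inf]
beta_educ = coefs[2]

# dPr[Y=j]/dx_k = [f(mu_{j-1} - a) - f(mu_j - a)] * beta_k
effects = [(norm_pdf(cuts[j] - a) - norm_pdf(cuts[j + 1] - a)) * beta_educ
           for j in range(11)]

signs = [1 if e > 0 else -1 for e in effects]
sign_changes = sum(1 for s, t in zip(signs, signs[1:]) if s != t)

for j, e in enumerate(effects):
    print(j, round(e, 8))
print("sign changes:", sign_changes)
```

The effects sum to zero across cells because the probabilities must continue to sum to one.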
Multinomial Choice Among J Alternatives • Random Utility Basis: $U_{itj} = \alpha_{ij} + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$, with i = 1,…,N; j = 1,…,J(i); t = 1,…,T(i) • Maximum Utility Assumption: Individual i will choose alternative j in choice setting t iff $U_{itj} > U_{itk}$ for all $k \neq j$ • Underlying assumptions • Smoothness of utilities • Axioms: Transitive, Complete, Monotonic
Utility Functions • The linearity assumption and curvature • The choice set • Deterministic and random components: The “model” • Generic vs. alternative specific components • Attributes and characteristics • Coefficients • Part worths • Alternative specific constants • Scaling
The Multinomial Logit (MNL) Model • Independent extreme value (Gumbel): • $F(\varepsilon_{itj}) = \exp(-\exp(-\varepsilon_{itj}))$ (random part of each utility) • Independence across utility functions • Identical variances (means absorbed in constants) • Same parameters for all individuals (temporary) • Implied probabilities for observed outcomes
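Under the independent Gumbel assumption, the maximum utility rule leads to the familiar closed form for the choice probabilities. The expression below is the standard multinomial logit result written in the notation of the random utility slide, with the individual subscripts on the parameters dropped because the parameters are assumed common across individuals; the log likelihood uses an indicator $d_{itj}$ equal to 1 when alternative j is chosen.

```latex
\begin{aligned}
\text{Prob}[\text{choice}_{it} = j]
  &= \frac{\exp(\alpha_j + \beta' x_{itj} + \gamma_j' z_{it})}
          {\sum_{k=1}^{J(i)} \exp(\alpha_k + \beta' x_{itk} + \gamma_k' z_{it})},\\[4pt]
\log L &= \sum_{i}\sum_{t}\sum_{j} d_{itj}\,
          \log \text{Prob}[\text{choice}_{it} = j].
\end{aligned}
```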
Specifying Probabilities • Choice-specific attributes (X) vary by choice and multiply generic coefficients, e.g., TTME, GC • Generic characteristics (Income, constants) must be interacted with choice-specific constants (else they fall out of the probability) • Estimation by maximum likelihood; $d_{ij} = 1$ if person i chooses j
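The point about generic characteristics falling out of the probability can be checked numerically. In the sketch below every number is hypothetical (the coefficients, the attribute values, and the income level are illustrative, not estimates from the deck). It shows that adding the same income term to every utility leaves the multinomial logit probabilities unchanged, while interacting income with the air alternative changes them.

```python
import math

def mnl_probs(utilities):
    """Multinomial logit probabilities: exp(U_j) / sum_k exp(U_k)."""
    shift = max(utilities)                       # subtract max for stability
    exp_u = [math.exp(u - shift) for u in utilities]
    total = sum(exp_u)
    return [e / total for e in exp_u]

# Hypothetical generic coefficients on cost and terminal time
beta_cost, beta_time = -0.01, -0.05
# Hypothetical (cost, terminal time) for air, train, bus, car
attributes = [(90.0, 60.0), (60.0, 35.0), (55.0, 40.0), (70.0, 0.0)]
income = 40.0

base = [beta_cost * c + beta_time * t for c, t in attributes]

# Same income coefficient in every utility: cancels in the ratio
generic = [u + 0.02 * income for u in base]

# Income interacted with the air constant only: does not cancel
interacted = [u + (0.02 * income if j == 0 else 0.0) for j, u in enumerate(base)]

print([round(p, 4) for p in mnl_probs(base)])
print([round(p, 4) for p in mnl_probs(generic)])
print([round(p, 4) for p in mnl_probs(interacted)])
```

The first two lines of output are identical, which is why a characteristic such as income can only be identified through alternative-specific terms.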
Observed Data • Types of Data • Individual choice • Market shares • Frequencies • Ranks • Attributes and Characteristics • Choice Settings • Cross section • Repeated measurement (panel data)