Discrete Choice Modeling

William Greene Stern School of Business New York University Discrete Choice Modeling

Stated Preference and Revealed Preference Data

Revealed and Stated Preference Data • Pure RP Data • Market (ex-post, e.g., supermarket scanner data) • Individual observations • Pure SP Data • Contingent valuation • (?) Validity • Combined (Enriched) RP/SP • Mixed data • Expanded choice sets

Panel Data • Repeated Choice Situations • Typically RP/SP constructions (experimental) • Accommodating “panel data” • Multinomial Probit [Marginal, impractical] • Latent Class • Mixed Logit

Application Survey sample of 2,688 trips, 2 or 4 choices per situation Sample consists of 672 individuals Choice based sample Revealed/Stated choice experiment: Revealed: Drive,ShortRail,Bus,Train Hypothetical: Drive,ShortRail,Bus,Train,LightRail,ExpressBus Attributes: Cost –Fuel or fare Transit time Parking cost Access and Egress time

Each person makes four choices from a choice set that includes either two or four alternatives. The first choice is the RP between two of the RP alternatives The second-fourth are the SP among four of the six SP alternatives. There are ten alternatives in total.

Revealed Preference Data • Advantage: Actual observations on actual behavior • Disadvantage: Limited range of choice sets and attributes – does not allow analysis of switching behavior.

Stated Preference Data • Pure hypothetical – does the subject take it seriously? • No necessary anchor to real market situations • Vast heterogeneity across individuals

Pooling RP and SP Data Sets - 1 • Enrich the attribute set by replicating choices • E.g.: • RP: Bus,Car,Train (actual) • SP: Bus(1),Car(1),Train(1) Bus(2),Car(2),Train(2),… • How to combine?

An Underlying Random Utility Model

Nested Logit Approach Mode RP Car Train Bus SPCar SPTrain SPBus Use a two level nested model, and constrain three SP IV parameters to be equal.

Enriched Data Set – Vehicle Choice Choosing between Conventional, Electric and LPG/CNG Vehicles in Single-Vehicle Households David A. Hensher William H. Greene Institute of Transport Studies Department of Economics School of Business Stern School of Business The University of Sydney New York University NSW 2006 Australia New York USA September 2000

Fuel Types Study • Conventional, Electric, Alternative • 1,400 Sydney Households • Automobile choice survey • RP + 3 SP fuel classes • Nested logit – 2 level approach – to handle the scaling issue

Attribute Space: Conventional

Attribute Space: Electric

Attribute Space: Alternative

A Generalized Mixed Logit Model

Generalized Multinomial Choice Model

Estimation in Willingness to Pay Space Both parameters in the WTP calculation are random.

--------+-------------------------------------------------- Variable| Coefficient Standard Error b/St.Er. P[|Z|>z] --------+-------------------------------------------------- |Random parameters in utility functions QUAL| -.32668*** .04302 -7.593 .0000 1.01373 renormalized PRICE| 1.00000 ......(Fixed Parameter)...... -11.80230 renormalized |Nonrandom parameters in utility functions FASH| 1.14527*** .05788 19.787 .0000 1.4789 not rescaled ASC4| .84364*** .05554 15.189 .0000 .0368 not rescaled |Heterogeneity in mean, Parameter:Variable QUAL:AGE| .05843 .04836 1.208 .2270 interaction terms QUA0:AGE| -.11620 .13911 -.835 .4035 PRIC:AGE| .23958 .25730 .931 .3518 PRI0:AGE| 1.13921 .76279 1.493 .1353 |Diagonal values in Cholesky matrix, L. NsQUAL| .13234*** .04125 3.208 .0013 correlated parameters CsPRICE| .000 ......(Fixed Parameter)...... but coefficient is fixed |Below diagonal values in L matrix. V = L*Lt PRIC:QUA| .000 ......(Fixed Parameter)...... |Heteroscedasticity in GMX scale factor sdMALE| .23110 .14685 1.574 .1156 heteroscedasticity |Variance parameter tau in GMX scale parameter TauScale| 1.71455*** .19047 9.002 .0000 overall scaling, tau |Weighting parameter gamma in GMX model GammaMXL| .000 ......(Fixed Parameter)...... |Coefficient on PRICE in WTP space form Beta0WTP| -3.71641*** .55428 -6.705 .0000 new price coefficient S_b0_WTP| .03926 .40549 .097 .9229 standard deviation | Sample Mean Sample Std.Dev. Sigma(i)| .70246 1.11141 .632 .5274 overall scaling |Standard deviations of parameter distributions sdQUAL| .13234*** .04125 3.208 .0013 sdPRICE| .000 ......(Fixed Parameter)...... --------+-------------------------------------------------- Estimated Model for WTP

Mixed Logit Approaches • Pivot SP choices around an RP outcome. • Scaling is handled directly in the model • Continuity across choice situations is handled by random elements of the choice structure that are constant through time • Preference weights – coefficients • Scaling parameters • Variances of random parameters • Overall scaling of utility functions

Experimental Design

SP Study Using WTP Space

Rank Data and Best/Worst

Rank Data and Exploded Logit Alt 1 is the best overall Alt 3 is the best amongremaining alts 2,3,4,5 Alt 5 is the best among remaining alts 2,4,5 Alt 2 is the best among remaining alts 2,4 Alt 4 is the worst.

Exploded Logit

Best Worst • Individual simultaneously ranks best and worst alternatives. • Prob(alt j) = best = exp[U(j)] / mexp[U(m)] • Prob(alt k) = worst = exp[-U(k)] / mexp[-U(m)]

Choices

Best

Worst

Uses the result that if U(i,j) is the lowest utility, -U(i,j) is the highest.

Nested Logit Approach.

Nested Logit Approach – Different Scaling for Worst 8 choices are two blocks of 4. Best in one brance, worst in the second branch

Discrete Choice Modeling