WEMBA, Regression Analysis. Market Intelligence Julie Edell Britton Session 9 October 9, 2009. Today’s Agenda. Announcements WEMBA C Multiple Regression Conjoint & dummy variable regression Multiple R2Y|X1, X2 vs. r2Y|X(i) Uncorrelated predictors Correlated predictors
WEMBA, Regression Analysis
Market Intelligence
Julie Edell Britton
Session 9, October 9, 2009
Today’s Agenda
• Announcements
• WEMBA C
• Multiple Regression
• Conjoint & dummy variable regression
• Multiple $R^2_{Y|X_1,X_2}$ vs. $r^2_{Y|X_i}$
• Uncorrelated predictors
• Correlated predictors
• Promotion analysis
• Course Evaluations
Announcements • Submit Nestle Contadina slides by 8 am tomorrow, Sat., 10/10
A Model of School Choice
[Diagram labels: Become a Fuqua Student; Values; Perceptions; Individual Differences & Constraints]
• Assumes that behavior is driven by differences in:
  • Values (importance of key attributes)
  • Perceptions (Duke and competition on key attributes)
  • Individual differences & constraints (travel, cost, etc.)
The Funnel
[Diagram: Do not attend Information Session / Attend Information Session → Opt Out or Apply → Selected Out or Admitted → Opt Out or Matriculate]
The Analysis Approach
• Sample groups that differ in behavior
• Compare the groups on relevant dimensions:
  • Perceptions
  • Values
  • Individual Differences & Constraints
• Infer that any differences found between groups are partly responsible for differences in behavior
Disproportionate Stratified Random Sample
[Funnel diagram with sample sizes, in slide order: Do not attend Information Session; Attend Information Session; n = 26 (of 60, 52%); n = 30 (of 60, 50%); Not Apply; Not Apply; n = 26 (of 158, 16.5%); n = 24 (of 173, 14%); Apply; Selected Out; n = 56; Admitted; n = 12 (of 56, 25%); n = 42 (of 56, 75%); Opt Out; Matriculate]
What Drives Application to Duke?
• Compare Applied to Did Not Apply
  • Perceptions (Duke – Competitor)
  • Constraints
    • Financial assistance program
    • % Cost paid by employer
    • Time to travel to Duke
• Can’t use attendance at Information Session as a factor here without going to population data
• What did you learn?
What Drives Application to Duke?
* = chi-square test statistic
What Drives Acceptance?
• Compare those who accepted to those who did not accept, conditional on having applied
  • Perceptions (Duke – Competitor)
  • Constraints
    • Financial assistance program
    • % Cost paid by employer
    • Time to travel to Duke
• Can consider info session attendance here
• What did you learn?
What Drives Acceptance?
* = chi-square test statistic
Acceptance by Sponsorship
% paid by company: Student = 53.5, NonStudent = 31.1
The Impact of Info Sessions?
• Compare those who attended to those who did not attend on perceptions of Duke
• What did you learn?
The Impact of Info Sessions?
About half the perceptual factors (9 of 17) were in the wrong direction! Information sessions don’t seem to be doing much good at all.
* The only positive evidence is that, in the overall population, the probability of applying was 16.5% for those who attended an info session vs. 9.5% for those who did not, χ² = 8.70, p < .005.
* No significant effect on matriculation / acceptance: 59.5% of those who attended accepted vs. 39.7% of those who did not attend, χ² = 2.41, p = .11
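
The p-values quoted for the two χ² statistics can be sanity-checked from the statistics alone. The sketch below assumes each test is a 2×2 comparison (attended vs. did not attend, by outcome) with 1 degree of freedom, which the slide does not state; the helper name `chi2_pvalue_df1` is made up for this illustration.

```python
import math


def chi2_pvalue_df1(chi2_stat: float) -> float:
    """Upper-tail p-value for a chi-square statistic with 1 degree of freedom.

    With 1 df, a chi-square variable is the square of a standard normal Z,
    so P(X > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(chi2_stat / 2.0))


# Statistics as reported on the slide; 1 df is an assumption.
p_apply = chi2_pvalue_df1(8.70)   # info session vs. applying
p_accept = chi2_pvalue_df1(2.41)  # info session vs. accepting

print(f"applying:  chi2 = 8.70 -> p = {p_apply:.4f}")
print(f"accepting: chi2 = 2.41 -> p = {p_accept:.4f}")
```

Under that assumption the first statistic falls below the .005 threshold and the second lands a little above .10, consistent with a significant effect on applying and no significant effect on accepting.
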
What Should be Emphasized in Information Sessions?
• Attributes that are important and where we do well and/or where our competition does not do well.
• Importance
  • Focus on attributes that predict applying & accepting
  • Rank order attribute importance
• Look at important attributes where the perception gap (Duke – Comp) is positive.
Quasi-MAAM for Communication Content: Demonstrated Importance
Quasi-MAAM for Communication Content: Using Importance Ratings
WEMBA Takeaways
• Be backward in your analysis process too
  • Outline your analysis before you begin
  • Think about tables needed, then do the analysis to make them
• Real data is imperfect… do the best you can
• Survey data is correlational, not causal
• The funnel approach applies to many business problems
Multiple Regression
Like simple linear regression, but with more than one predictor: $\hat{y} = a + b_1 x_1 + b_2 x_2 + \dots + b_k x_k$
a = intercept: predicted value of y if $x_1 = x_2 = \dots = x_k = 0$
$b_1$ = slope of y on $x_1$, given that $x_2, \dots, x_k$ are already in the equation
R = multiple correlation = correlation of $\hat{y}$ with y ($0 \le R \le 1$)
$R^2$ = proportion of variance in y explained by the best linear regression equation
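
The definitions of a, b, R, and R² can be made concrete with a small least-squares fit. This is a minimal sketch using NumPy; the six data points are hypothetical numbers chosen only for illustration and do not come from the course.

```python
import numpy as np

# Hypothetical illustration data (not from the course): 6 cases, 2 predictors.
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 6.0, 5.0])
y = np.array([3.0, 4.0, 8.0, 8.0, 13.0, 12.0])

# Design matrix: a leading column of ones estimates the intercept a.
X = np.column_stack([np.ones_like(x1), x1, x2])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a, b1, b2 = coef

# Fitted values y-hat from the estimated equation.
y_hat = X @ coef

# R is the correlation between y-hat and y.
R = np.corrcoef(y_hat, y)[0, 1]

# R^2 computed independently as 1 - SSE / SST (share of variance explained).
r_squared = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

print(f"a = {a:.3f}, b1 = {b1:.3f}, b2 = {b2:.3f}")
print(f"R = {R:.3f}, R^2 = {r_squared:.3f}")
```

Squaring the correlation between the fitted and observed values gives the same number as the variance-explained calculation, which is why the two definitions of R² are interchangeable when the model includes an intercept.
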
Conjoint as Dummy Variable Regression
(Y)      (X1)       (X2)    (X3)
Rating   Size16oz   Pepsi   Caffeine
7        0          0       0
9        0          0       1
3        0          1       0
4        0          1       1
8        1          0       0
10       1          0       1
5        1          1       0
6        1          1       1
$R^2_{Y|X_1,X_2,X_3} = r^2_{Y|X_1} + r^2_{Y|X_2} + r^2_{Y|X_3}$ because the dummies are uncorrelated:
$.976 = (.327)^2 + (-.873)^2 + (.327)^2$
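
The additivity of the squared correlations can be checked directly on the eight conjoint profiles from the table. The sketch below uses NumPy least squares; the variable names are chosen for this illustration, and the data are copied from the slide.

```python
import numpy as np

# Ratings and dummy codes copied from the conjoint table.
rating = np.array([7, 9, 3, 4, 8, 10, 5, 6], dtype=float)
size16 = np.array([0, 0, 0, 0, 1, 1, 1, 1], dtype=float)
pepsi = np.array([0, 0, 1, 1, 0, 0, 1, 1], dtype=float)
caffeine = np.array([0, 1, 0, 1, 0, 1, 0, 1], dtype=float)

# Dummy-variable regression: intercept plus one coefficient per attribute.
X = np.column_stack([np.ones(8), size16, pepsi, caffeine])
coef, *_ = np.linalg.lstsq(X, rating, rcond=None)
fitted = X @ coef

# Multiple R^2 from the full regression.
sse = np.sum((rating - fitted) ** 2)
sst = np.sum((rating - rating.mean()) ** 2)
multiple_r2 = 1.0 - sse / sst

# Simple (zero-order) correlation of each dummy with the rating.
simple_r = [np.corrcoef(x, rating)[0, 1] for x in (size16, pepsi, caffeine)]
sum_r2 = sum(r ** 2 for r in simple_r)

print("intercept and part-worths:", np.round(coef, 3))
print("simple r:", np.round(simple_r, 3))
print(f"multiple R^2 = {multiple_r2:.3f}, sum of r^2 = {sum_r2:.3f}")
```

Because the full-factorial design makes the three dummies mutually uncorrelated, the sum of the squared simple correlations matches the multiple R² of about .976, and each regression coefficient equals the simple difference in mean rating between the two levels of that attribute.
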
A Framework for Understanding Multicollinearity
Major problem in multiple regression: assessing the (unique) contribution of the individual predictors.
[Venn diagram: overlapping circles for variance in y, variance in x1, and variance in x2, with regions labeled e, a, b, c]
Multicollinearity
The area in c causes ambiguity in specifying the contributions of x1 and x2 to explaining y. Should we attribute all of this variance to x1? All to x2? Somehow split it?
There are two different ways of assessing the contribution of x1 to explaining y.
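
One reading of the "two ways" (an assumption here, based on the zero-order and partial-effect slides that follow) is to credit x1 with everything it shares with y, or only with what it adds once x2 is already in the equation. The sketch below contrasts the two on simulated data; the data-generating process, the seed, and the helper `r_squared` are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
n = 500

# Hypothetical simulated data: x1 and x2 share a common component,
# so the two predictors are correlated with each other.
shared = rng.normal(size=n)
x1 = shared + rng.normal(size=n)
x2 = shared + rng.normal(size=n)
y = x1 + x2 + rng.normal(size=n)


def r_squared(y, *predictors):
    """R^2 from an OLS regression of y on the given predictors (with intercept)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


# Way 1: zero-order contribution, x1 gets credit for the shared area too.
zero_order_x1 = r_squared(y, x1)

# Way 2: unique contribution, the gain in R^2 from adding x1 after x2.
unique_x1 = r_squared(y, x1, x2) - r_squared(y, x2)

print(f"zero-order r^2 for x1:     {zero_order_x1:.3f}")
print(f"unique (incremental) R^2:  {unique_x1:.3f}")
```

The gap between the two numbers is the variance that x1 and x2 explain jointly, which is the region the slide calls c.
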
Understand Expenditures in Milan Food
Run two regressions
[Diagram variables: Any kids 6-18? (0=no, 1=yes); Weekly food expenditures; Total in household; Annual income]
Zero order coefficients tell us:
• More people in household, more weekly food expenditures
• If any kids 6-18, more weekly food expenditures
• Higher income, more spent on food
$r_{bc} = +.43$, $r_{bi} = +.40$, $r_{bj} = +.37$
All strongly statistically significant with 498 df
Which two should predict weekly expenditures best?
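
The claim that all three correlations are strongly significant with 498 df can be checked with the standard t test for a correlation, t = r·sqrt(df / (1 − r²)). A minimal sketch, with the function name `t_for_correlation` made up for illustration:

```python
import math


def t_for_correlation(r: float, df: int) -> float:
    """t statistic for testing a correlation against zero, with df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


df = 498  # degrees of freedom as stated on the slide
correlations = {"r_bc": 0.43, "r_bi": 0.40, "r_bj": 0.37}

t_values = {name: t_for_correlation(r, df) for name, r in correlations.items()}
for name, t in t_values.items():
    print(f"{name} = {correlations[name]:+.2f} -> t = {t:.2f}")
```

All three t values come out well above 8, far beyond the conventional cutoff of about 2, so even the weakest of the three correlations is clearly distinguishable from zero at this sample size.
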
“Partial Effect” Milan Food Problem
Run two regressions
[Diagram variables: Any children under 6? (0=no, 1=yes); Weekly food expenditures; Any children under 6? (0=no, 1=yes); Total in household]
Doritos
• XL Models: effects of own price (& price promotion) & price promotions of other sizes (SM, XXL, 3XL) on sales of XL size
Promotion Models (p. 195)
• If the coupon dummy (1 = yes, 0 = no) and the promo dummy (1 = yes, 0 = no) are (nearly) perfectly correlated and each is correlated, say, r = .5 with weekly sales:
  • $R^2$(sales | coupon, promo) = .25 << .25 + .25.
  • Coefficients on coupon and promo would be indistinguishable from zero (nonsignificant), with huge standard errors.
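
The pattern described here can be reproduced in a small simulation. With exactly perfect correlation the two coefficients cannot be estimated separately at all, so this sketch makes the dummies differ in two weeks (r = .98). Everything in it is hypothetical: the 200 weeks, the 500-unit effects, the noise level, and the helper `ols`; the resulting correlations with sales will not equal the slide's r = .5 exactly.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed for reproducibility
n = 200  # hypothetical number of weeks

# Promo runs in the first half of the weeks; coupon matches it except in two weeks.
promo = np.repeat([1.0, 0.0], n // 2)
coupon = promo.copy()
coupon[0], coupon[-1] = 0.0, 1.0  # r(promo, coupon) = .98

# Hypothetical sales process: each lever adds 500 units, plus weekly noise.
sales = 5000 + 500 * promo + 500 * coupon + rng.normal(scale=1000, size=n)


def ols(y, *predictors):
    """Return OLS coefficients, standard errors, and R^2 (intercept included)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    sigma2 = (resid @ resid) / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return coef, se, r2


_, se_promo_only, r2_promo = ols(sales, promo)
_, _, r2_coupon = ols(sales, coupon)
coef_both, se_both, r2_both = ols(sales, promo, coupon)

print(f"R^2 promo only = {r2_promo:.3f}, coupon only = {r2_coupon:.3f}")
print(f"R^2 both       = {r2_both:.3f}")
print(f"SE(promo): alone = {se_promo_only[1]:.0f}, with coupon = {se_both[1]:.0f}")
```

The joint R² is barely higher than either single-predictor R², far short of their sum, and the standard error on promo grows several-fold once coupon enters the model.
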
Promotion Models
• Omitted Variable Bias
  • Promo and coupons each boost sales by 1000 units, but
  • if coupon is omitted and correlates r = 1 with promo, the coefficient for promo will be 2000 (pp. 175-181).
• Multicollinearity & Overloaded Models
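
The omitted-variable arithmetic can be verified with a noise-free toy example. The eight weeks and the baseline of 10,000 units below are hypothetical; only the two 1000-unit effects and the r = 1 overlap come from the slide.

```python
# Hypothetical weekly schedule: promo and coupon always run together (r = 1).
promo = [0, 1, 0, 1, 1, 0, 0, 1]
coupon = list(promo)
baseline = 10_000  # assumed baseline weekly sales, for illustration only

# Each lever truly adds 1000 units; no noise, to keep the arithmetic visible.
sales = [baseline + 1000 * p + 1000 * c for p, c in zip(promo, coupon)]

# Simple regression of sales on promo alone (coupon omitted):
# slope = cov(promo, sales) / var(promo), intercept = mean(sales) - slope * mean(promo).
n = len(promo)
mean_promo = sum(promo) / n
mean_sales = sum(sales) / n
cov = sum((p - mean_promo) * (s - mean_sales) for p, s in zip(promo, sales))
var = sum((p - mean_promo) ** 2 for p in promo)
promo_coef = cov / var
intercept = mean_sales - promo_coef * mean_promo

print(f"intercept = {intercept:.0f}, promo coefficient = {promo_coef:.0f}")
```

With coupon left out, the promo coefficient absorbs both effects and comes out at 2000, so it has to be read as the effect of "promo plus whatever always accompanies it" rather than of promo alone.
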
Takeaways
• Multiple Regression
• Dummy Variable Regression for Conjoint (uncorrelated predictors)
• Correlated predictors make it difficult to assess each predictor’s unique contribution.
  • Common in promotion analysis because it is common to pull multiple promotional levers simultaneously.
• Two solutions:
  • Drop a predictor (omitted variable bias, so reinterpret coefficients)
  • Leave both in (inflated standard errors, hard to assess impact of each)