Binomial regression Greg Francis PSY 626: Bayesian Statistics for Psychological Science Fall 2018 Purdue University
Zener Cards • Guess which card appears next:
Data • Score indicates whether you predicted correctly (1) or not (0) • File ZennerCards.csv contains the data for 22 participants • # load full data file • ZCdata<-read.csv(file="ZennerCards.csv",header=TRUE,stringsAsFactors=FALSE)
Binomial model • yi is the number of observed outcomes (e.g., correct responses) from n draws when the probability of a correct response is pi • We know n for any trial is 1 • We estimate pi • It is convenient to actually estimate the logit of pi
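With n = 1 the binomial likelihood reduces to the Bernoulli; a quick sanity check using only base R (no assumptions beyond the slide):

```r
# With n = 1 the binomial reduces to the Bernoulli:
# P(y = 1) = p and P(y = 0) = 1 - p
p <- 0.2                       # chance rate for five Zener cards
dbinom(1, size = 1, prob = p)  # probability of a correct guess: 0.2
dbinom(0, size = 1, prob = p)  # probability of an incorrect guess: 0.8
```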
Binomial regression • We will model the logit as a linear equation • Among other things, this ensures that our pi value is always between 0 and 1 (as all probabilities must be) • Makes it easier to identify priors • Our model posterior will give us logit(pi); to get back pi we have to invert (plogis function)
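The logit link and its inverse are available in base R as qlogis and plogis; a minimal sketch of the round trip:

```r
# logit maps (0, 1) onto the whole real line; plogis inverts it
p <- 0.2
lg <- qlogis(p)   # log(p / (1 - p)) = log(0.25), about -1.386
plogis(lg)        # back to 0.2

# a linear predictor on the logit scale always inverts to a valid probability
plogis(c(-10, 0, 10))  # approximately 0.0000454, 0.5, 0.9999546
```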
Model set up • No independent variables # Estimate probability of correct model1 = brm(Score ~ 1, data = ZCdata, family="binomial") print(summary(model1)) Family: binomial Links: mu = logit Formula: Score ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.31 0.07 -1.46 -1.17 1401 1.00
Odd warning > model1 = brm(Score ~ 1, data = ZCdata, family="binomial") Using the maximum response value as the number of trials. Only 2 levels detected so that family 'bernoulli' might be a more efficient choice. Compiling the C++ model Start sampling SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1). Gradient evaluation took 0.000312 seconds 1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) • No independent variables
Odd warning • Data can be coded as 0 – 1 or as a frequency • For the latter, brm needs to know the number of trials so that it can compute the proportion of successes • The number of trials might vary across reported frequencies, so this should be provided for each score • In our case it is 1 trial for each score model1 = brm(Score|trials(1) ~ 1, data = ZCdata, family="binomial") Only 2 levels detected so that family 'bernoulli' might be a more efficient choice. Compiling the C++ model Start sampling SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1). Gradient evaluation took 0.000312 seconds 1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.32 0.07 -1.46 -1.18 1443 1.00
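The trials() term matters more when data arrive as counts rather than 0/1 scores. A sketch (toy data, not the actual ZennerCards.csv) of collapsing trial-level scores to one row per participant, which could then be fit with the count form of the model:

```r
# toy 0/1 data in the same shape as ZCdata (hypothetical: 3 participants x 50 trials)
set.seed(1)
toy <- data.frame(Participant = rep(1:3, each = 50),
                  Score       = rbinom(150, size = 1, prob = 0.2))

# collapse to one row per participant: number correct and number of trials
agg <- aggregate(Score ~ Participant, data = toy,
                 FUN = function(x) c(Correct = sum(x), Trials = length(x)))
agg <- do.call(data.frame, agg)  # flatten the matrix column aggregate() returns
names(agg) <- c("Participant", "Correct", "Trials")
agg
# the aggregated data could then be fit as, e.g.:
# brm(Correct | trials(Trials) ~ 1, data = agg, family = "binomial")
```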
Model results • Convert posterior logit to proportion # compute means of posteriors post<-posterior_samples(model1) props<-plogis(post$b_Intercept) dev.new() plot(density(props))
Model results • Can quickly answer some questions such as • Probability that the success rate is better than pure chance (0.2)? betterThanChance <-length(props[props > 0.2])/length(props) cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n") Probability that success rate is better than pure chance = 0.82225
Skeptical model • Set a prior to be tight around 0.2 • Have to set it for the logit of 0.2 # Skeptical prior lgSkeptical = qlogis(0.2) stanvars <-stanvar(lgSkeptical, name='lgSkeptical') prs <- c(prior(normal(lgSkeptical, 0.01), class = "Intercept") ) model2 = brm(Score|trials(1) ~ 1, data = ZCdata, family="binomial", prior = prs, stanvars=stanvars ) print(summary(model2)) Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.38 0.01 -1.40 -1.36 1253 1.00
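As a check on where the skeptical prior sits: the logit of chance performance is qlogis(0.2) = log(0.2/0.8), about -1.386, which is essentially where the posterior intercept (-1.38) lands under the tight prior. A quick sketch:

```r
# the prior mean on the logit scale
lgSkeptical <- qlogis(0.2)   # log(0.2 / 0.8), about -1.386294
lgSkeptical

# with prior sd = 0.01, the prior pins the success rate very close to 0.2:
plogis(lgSkeptical + c(-2, 0, 2) * 0.01)  # roughly 0.1968, 0.2, 0.2032
```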
Model results • Convert posterior logit to proportion # compute means of posteriors post<-posterior_samples(model2) props<-plogis(post$b_Intercept) dev.new() plot(density(props))
Model results • Can quickly answer some questions such as • Probability that the success rate is better than pure chance (0.2)? betterThanChance <-length(props[props > 0.2])/length(props) cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n") Probability that success rate is better than pure chance = 0.559 > mean(props) [1] 0.2002437
Differences across actual cards model3 = brm(Score|trials(1) ~ 0 + ActualCard, data = ZCdata, family="binomial") • Maybe you suspect that some cards are more easily guessed than other cards Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + ActualCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat ActualCardCircle -1.51 0.17 -1.87 -1.19 4000 1.00 ActualCardCross -1.13 0.16 -1.46 -0.84 4000 1.00 ActualCardSquare -1.01 0.15 -1.30 -0.72 4000 1.00 ActualCardStar -1.52 0.18 -1.86 -1.18 4000 1.00 ActualCardWavy -1.51 0.17 -1.87 -1.18 4000 1.00
Differences across actual cards dev.new() plot(marginal_effects(model3) ) • Plot of marginal means converts logits to proportions
Differences across actual cards • We might want to compare predicted performance of a model that considers different guess rates for different cards against a model that treats all cards equally • A model that uses different probabilities for different cards is expected to do better than a model that ignores card type. • Does this suggest that people have some kind of predictive power that works for some cards and not for other cards? print(model_weights(model1, model3, weights="waic")) model1 model3 0.3752853 0.6247147
Differences in guessed cards model4 = brm(Score|trials(1) ~ 0 + GuessedCard, data = ZCdata, family="binomial") print(summary(model4)) • It might be better to see if participants’ guesses improve model fit Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GuessedCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GuessedCardCircle -1.45 0.17 -1.80 -1.11 4000 1.00 GuessedCardCross -1.32 0.15 -1.63 -1.03 4000 1.00 GuessedCardSquare -1.18 0.15 -1.47 -0.90 4000 1.00 GuessedCardStar -1.53 0.17 -1.87 -1.20 4000 1.00 GuessedCardWavy -1.15 0.18 -1.51 -0.79 4000 1.00
Differences across guessed cards dev.new() plot(marginal_effects(model4) ) • Plot of marginal means converts logits to proportions
Differences across cards • We might want to compare predicted performance of models that: • consider different success rates for different cards • treat all cards equally • consider different success rates for different guesses • A model that uses different probabilities for different cards is expected to do better than a model that ignores card type or a model that is based on guessed cards. print(model_weights(model1, model3, model4, weights="waic")) model1 model3 model4 0.35892977 0.59748876 0.04358147
Interaction of cards and guesses • Maybe subjects are more successful for some combination of cards and guesses? • Duh! • Type Circle, guess Circle will be 100% correct model5 = brm(Score|trials(1) ~ 0 + GuessedCard*ActualCard, data = ZCdata, family="binomial") print(summary(model5))
Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GuessedCard * ActualCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000
Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GuessedCardCircle 18748.29 23335.67 569.90 82365.89 14 1.48
GuessedCardCross -130341.46 188186.65 -687440.63 -1398.18 4 1.95
GuessedCardSquare -125573.75 330002.03 -1389784.88 -567.34 9 1.49
GuessedCardStar -91900.30 138111.23 -516973.74 -2489.86 5 1.71
GuessedCardWavy -52455.04 69700.33 -297285.00 -908.54 13 1.37
ActualCardCross -260964.33 338283.58 -1186086.07 -15152.35 3 2.02
ActualCardSquare -113004.55 122265.01 -497683.26 -6292.06 11 1.66
ActualCardStar -166532.86 214562.89 -861071.25 -7501.83 8 1.63
ActualCardWavy -109945.16 138081.22 -532928.00 -4535.43 12 1.42
GuessedCardCross:ActualCardCross 621878.51 673531.42 50147.20 2308893.53 3 2.83
GuessedCardSquare:ActualCardCross -171411.07 1023514.89 -3291415.19 1607030.41 6 2.55
GuessedCardStar:ActualCardCross -82208.08 313172.23 -1112235.35 376533.31 27 1.14
GuessedCardWavy:ActualCardCross -226065.05 329146.11 -1148391.59 79984.64 7 1.47
GuessedCardCross:ActualCardSquare -141046.83 744019.18 -3237012.28 430138.02 14 1.28
GuessedCardSquare:ActualCardSquare 630016.28 1348957.38 28622.18 6647163.21 8 1.43
GuessedCardStar:ActualCardSquare -516200.69 1222076.03 -5277814.59 190150.81 6 1.48
GuessedCardWavy:ActualCardSquare -254554.08 378122.65 -1105859.03 208732.66 19 1.18
GuessedCardCross:ActualCardStar -236520.96 571416.03 -1990659.48 498791.23 18 1.16
GuessedCardSquare:ActualCardStar -138495.18 405125.90 -1409142.16 453403.85 28 1.06
GuessedCardStar:ActualCardStar 591728.12 684506.33 32211.24 2642849.92 7 1.72
GuessedCardWavy:ActualCardStar -108802.15 264440.16 -908997.28 224749.11 13 1.36
GuessedCardCross:ActualCardWavy -196106.32 347193.69 -1102970.69 423275.50 20 1.21
GuessedCardSquare:ActualCardWavy -420514.11 705384.91 -2611403.36 248315.35 43 1.12
GuessedCardStar:ActualCardWavy -367909.12 576345.19 -2096693.45 79980.82 8 2.04
GuessedCardWavy:ActualCardWavy 416592.62 569143.92 23860.54 1966798.87 8 1.56
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
Warning messages: 1: The model has not converged (some Rhats are > 1.1). Do not analyse the results! We recommend running more iterations and/or setting stronger priors. 2: There were 3875 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Interaction • If we ignore the warning about model convergence • And don’t notice that the interaction model is stupid • It looks like the interaction model does great! • Indeed, it has to do great, because it has all the answers • Using both guessed and actual cards does not make sense for a model > print(model_weights(model1, model2, model3, model4, model5, weights="waic")) model1 model2 model3 model4 model5 7.780087e-248 1.316699e-247 1.295104e-247 9.446629e-249 1.000000e+00
Effect of trial model6 = brm(Score|trials(1) ~ 0 + Trial, data = ZCdata, family="binomial") print(summary(model6)) • Success could be related to trial (information available for later trials) • Success rate could be higher for later trials Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + Trial Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Trial -0.04 0.00 -0.05 -0.04 1387 1.00
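Since the model has no intercept, the predicted proportion at trial t is simply plogis(-0.04 * t); a small sketch of what the fitted slope implies:

```r
# predicted success rate implied by the fitted slope (-0.04, no intercept)
b_trial <- -0.04
trials <- c(1, 10, 25, 50)
plogis(b_trial * trials)   # roughly 0.49, 0.40, 0.27, 0.12: success falls over trials
```

Note that with no intercept the model forces the success rate toward plogis(0) = 0.5 as Trial approaches 0, which is one reason this parameterization fits so poorly.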
Effect of trial • Plot of marginal means converts logits to proportions • Opposite of information availability! dev.new() plot(marginal_effects(model6) )
Effect of trial • Compare to previous models (except for stupid model5) • Trial does not help the model at all • The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate) • The model that considers variation across actual cards is almost as good • Trading off flexibility for fit > print(model_weights(model1, model2, model3, model4, model6, weights="waic")) model1 model2 model3 model4 model6 2.232912e-01 3.778972e-01 3.716994e-01 2.711215e-02 5.280207e-20
Effect of subject • Could be that subjects have different success rates # By default, participant numbers are treated as _numbers_. Need to correct that. ZCdata$ParticipantFactor = factor(ZCdata$Participant) model7 = brm(Score|trials(1) ~ 0 + ParticipantFactor, data = ZCdata, family="binomial") print(summary(model7)) dev.new() plot(marginal_effects(model7) )
Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + ParticipantFactor Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000
Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
ParticipantFactor1 -2.06 0.46 -3.04 -1.24 4000 1.00
ParticipantFactor2 -0.77 0.30 -1.36 -0.19 4000 1.00
ParticipantFactor3 -1.42 0.37 -2.17 -0.74 4000 1.00
ParticipantFactor4 -1.71 0.40 -2.56 -0.97 4000 1.00
ParticipantFactor5 -1.55 0.37 -2.33 -0.84 4000 1.00
ParticipantFactor6 -1.56 0.38 -2.37 -0.86 4000 1.00
ParticipantFactor7 -1.30 0.34 -2.02 -0.68 4000 1.00
ParticipantFactor8 -1.30 0.35 -2.00 -0.67 4000 1.00
ParticipantFactor9 -0.77 0.31 -1.38 -0.19 4000 1.00
ParticipantFactor10 -1.42 0.36 -2.16 -0.79 4000 1.00
ParticipantFactor11 -1.19 0.35 -1.90 -0.52 4000 1.00
ParticipantFactor12 -1.87 0.42 -2.79 -1.10 4000 1.00
ParticipantFactor13 -1.57 0.39 -2.38 -0.83 4000 1.00
ParticipantFactor14 -1.30 0.35 -2.02 -0.64 4000 1.00
ParticipantFactor15 -1.57 0.38 -2.36 -0.87 4000 1.00
ParticipantFactor16 -0.68 0.31 -1.29 -0.10 4000 1.00
ParticipantFactor17 -1.07 0.32 -1.71 -0.48 4000 1.00
ParticipantFactor18 -1.56 0.38 -2.35 -0.87 4000 1.00
ParticipantFactor19 -1.18 0.34 -1.87 -0.56 4000 1.00
ParticipantFactor20 -1.18 0.34 -1.88 -0.54 4000 1.00
ParticipantFactor21 -2.06 0.46 -3.02 -1.23 4000 1.00
ParticipantFactor22 -1.42 0.36 -2.19 -0.74 4000 1.00
Effect of subject • Compare to previous models (except for stupid model5) • Subject-specific success rates do not help the model at all • The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate) • The model that considers variation across actual cards is almost as good • Trading off flexibility for fit > print(model_weights(model1, model2, model3, model4, model6, model7, weights="waic")) model1 model2 model3 model4 model6 model7 2.232906e-01 3.778962e-01 3.716984e-01 2.711208e-02 5.280193e-20 2.675989e-06
Your turn • Modify the code to treat subject as a “random” effect • Should shrink performance to indicate less difference between subjects • Modify the code to shrink population scores for each actual card • Set a prior to shrink toward a success rate of 0.2 • Priors have to be set up in terms of logit scores!
Driving data set • Questionnaire about driving in inclement weather • CHO2DRI: How often he or she chooses to drive in inclement weather: 1 = always, 3 = sometimes, 5 = never # load full data file DRdata<-read.csv(file="Driving.csv",header=TRUE,stringsAsFactors=FALSE) # mark whether Choose 2 Drive in inclement weather or not DRdata$Score<-DRdata$CHO2DRI DRdata$Score[DRdata$Score!=5] =0 DRdata$Score[DRdata$Score==5] =1
Hypothesis test • Compare males and females • Z-test
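A frequentist comparison of two proportions can be run with prop.test, which reports the chi-squared statistic equivalent to the squared z statistic. The counts below are hypothetical, for illustration only (they are not the actual female/male tallies from Driving.csv):

```r
# hypothetical counts: number reporting "never drives" out of n, by sex
successes <- c(13, 5)   # females, males (made-up numbers)
n         <- c(44, 17)
prop.test(successes, n, correct = FALSE)  # X-squared here equals z^2
```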
Frequentist binomial regression Call: glm(formula = Score ~ 1, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8083 -0.8083 -0.8083 1.5985 1.5985 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.9510 0.2856 -3.33 0.000868 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 72.189 on 60 degrees of freedom AIC: 74.189 Number of Fisher Scoring iterations: 4 modelf1null = glm(formula = Score ~ 1, family = "binomial", data = DRdata) • Model with no effect of sex
Frequentist binomial regression modelf1 <- glm(Score ~ GENDER, data=DRdata, family="binomial") print(summary(modelf1) ) • Different rates for females and males Call: glm(formula = Score ~ GENDER, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8446 -0.8446 -0.7375 1.5518 1.6942 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.8473 0.3450 -2.456 0.0141 * GENDERMale -0.3159 0.6177 -0.511 0.6091 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 71.922 on 59 degrees of freedom AIC: 75.922 Number of Fisher Scoring iterations: 4
Frequentist binomial regression Call: glm(formula = Score ~ GENDER, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8446 -0.8446 -0.7375 1.5518 1.6942 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.8473 0.3450 -2.456 0.0141 * GENDERMale -0.3159 0.6177 -0.511 0.6091 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 71.922 on 59 degrees of freedom AIC: 75.922 Number of Fisher Scoring iterations: 4 > plogis(-0.8437) [1] 0.3007561 > plogis(-0.8437-0.3159) [1] 0.23874 • Convert to proportions
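With treatment (dummy) coding, the female rate is plogis(intercept) and the male rate is plogis(intercept + GENDERMale), since the male offset is added on the logit scale. A sketch using the coefficients from the summary table above:

```r
# treatment (dummy) coding: intercept = females, GENDERMale = offset for males
b0 <- -0.8473   # (Intercept) from the glm summary
b1 <- -0.3159   # GENDERMale
plogis(b0)       # female rate, about 0.30
plogis(b0 + b1)  # male rate, about 0.24
```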
Frequentist model comparison • Compare null model to gender model • Null model is expected to better predict future data
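One frequentist route to this comparison is AIC: the null model wins here because the gender term adds a parameter without reducing the deviance much (AIC 74.189 vs 75.922 in the summaries). A self-contained sketch on simulated data (an assumption for illustration, not the actual driving data):

```r
# simulated binary outcome with no true gender effect (illustration only)
set.seed(626)
sim <- data.frame(GENDER = rep(c("Female", "Male"), each = 50))
sim$Score <- rbinom(100, size = 1, prob = 0.28)

m_null   <- glm(Score ~ 1,      data = sim, family = "binomial")
m_gender <- glm(Score ~ GENDER, data = sim, family = "binomial")
AIC(m_null, m_gender)  # the null model typically has the lower (better) AIC
```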
Bayesian binomial regression model1null = brm(Score|trials(1) ~ 1, data=DRdata, family="binomial") print(summary(model1null)) • No effect of sex Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -0.97 0.28 -1.55 -0.45 1548 1.00
Bayesian binomial regression model1 = brm(Score|trials(1) ~ 0 + GENDER, data=DRdata, family="binomial") print(summary(model1)) • Different rates for females and males Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GENDER Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GENDERFemale -0.87 0.35 -1.55 -0.21 3198 1.00 GENDERMale -1.24 0.55 -2.38 -0.24 2839 1.00
Bayesian binomial regression > plogis(-0.87) [1] 0.2952543 > plogis(-1.24) [1] 0.224436 • Convert to proportions Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GENDER Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GENDERFemale -0.87 0.35 -1.55 -0.21 3198 1.00 GENDERMale -1.24 0.55 -2.38 -0.24 2839 1.00
Posteriors > mean(propsF) [1] 0.2995896 > mean(propsM) [1] 0.2382168 # compute means of posteriors post<-posterior_samples(model1) propsF<-plogis(post$b_GENDERFemale) propsM<-plogis(post$b_GENDERMale) dev.new() xrange = c(min(c(propsF, propsM)), max(c(propsF, propsM))) yrange = c(0, max(c(density(propsF)$y, density(propsM)$y))) plot(density(propsF), xlim= xrange, ylim=yrange, col="green", main="Posterior Female (green) and Male (red)") lines(density(propsM), col="red", lwd=3, lty=2) • Convert to proportions
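The same posterior samples can answer a directional question: the probability that the female rate exceeds the male rate is just mean(propsF > propsM). The sketch below approximates the logit posteriors with normal draws matched to the summary table (an assumption made so the code runs stand-alone; with the fitted model you would use propsF and propsM from posterior_samples directly):

```r
# normal approximations to the logit posteriors (means/SDs from the summary table)
set.seed(1)
propsF <- plogis(rnorm(4000, mean = -0.87, sd = 0.35))  # females
propsM <- plogis(rnorm(4000, mean = -1.24, sd = 0.55))  # males
mean(propsF > propsM)  # posterior probability that the female rate is higher
```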
Posteriors dev.new() plot(marginal_effects(model1) ) • Can also look at credible intervals
Bayesian model comparison • Compare null model to gender model • Null model is expected to better predict future data > print(model_weights(model1null, model1, weights="loo")) model1null model1 0.7523272 0.2476728
Your turn • The driving data set includes the age of each respondent • Include AGE as a variable in the binomial regression • Compare to the other models • Frequentist • Bayesian