Binomial regression Greg Francis PSY 626: Bayesian Statistics for Psychological Science Fall 2018 Purdue University
Zener Cards • Guess which card appears next:
Data • Score indicates whether you predicted correctly (1) or not (0) • File ZennerCards.csv contains the data for 22 participants • # load full data file • ZCdata<-read.csv(file="ZennerCards.csv",header=TRUE,stringsAsFactors=FALSE)
Binomial model • yi is the number of observed outcomes (e.g., correct responses) from n draws when the probability of a correct response is pi • We know n for any trial is 1 • We estimate pi • It is convenient to actually estimate the logit of pi
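With n = 1 the binomial likelihood reduces to the Bernoulli; a quick sanity check using only base R (no assumptions beyond the slide):

```r
# With n = 1 the binomial reduces to the Bernoulli:
# P(y = 1) = p and P(y = 0) = 1 - p
p <- 0.2                       # chance rate for five Zener cards
dbinom(1, size = 1, prob = p)  # probability of a correct guess: 0.2
dbinom(0, size = 1, prob = p)  # probability of an incorrect guess: 0.8
```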
Binomial regression • We will model the logit as a linear equation • Among other things, this ensures that our pi value is always between 0 and 1 (as all probabilities must be) • Makes it easier to identify priors • Our model posterior will give us logit(pi); to get back pi we have to invert (plogis function)
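The logit link and its inverse are available in base R as qlogis and plogis; a minimal sketch of the round trip:

```r
# logit maps (0, 1) onto the whole real line; plogis inverts it
p <- 0.2
lg <- qlogis(p)   # log(p / (1 - p)) = log(0.25), about -1.386
plogis(lg)        # back to 0.2

# a linear predictor on the logit scale always inverts to a valid probability
plogis(c(-10, 0, 10))  # approximately 0.0000454, 0.5, 0.9999546
```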
Model set up • No independent variables # Estimate probability of correct model1 = brm(Score ~ 1, data = ZCdata, family="binomial") print(summary(model1)) Family: binomial Links: mu = logit Formula: Score ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.31 0.07 -1.46 -1.17 1401 1.00
Odd warning > model1 = brm(Score ~ 1, data = ZCdata, family="binomial") Using the maximum response value as the number of trials. Only 2 levels detected so that family 'bernoulli' might be a more efficient choice. Compiling the C++ model Start sampling SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1). Gradient evaluation took 0.000312 seconds 1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) • No independent variables
Odd warning • Data can be coded as 0 – 1 or as a frequency • For the latter, brm needs to know the number of trials so that it can compute the proportion of successes • The number of trials might vary across reported frequencies, so this should be provided for each score • In our case it is 1 trial for each score model1 = brm(Score|trials(1) ~ 1, data = ZCdata, family="binomial") Only 2 levels detected so that family 'bernoulli' might be a more efficient choice. Compiling the C++ model Start sampling SAMPLING FOR MODEL 'binomial brms-model' NOW (CHAIN 1). Gradient evaluation took 0.000312 seconds 1000 transitions using 10 leapfrog steps per transition would take 3.12 seconds. Adjust your expectations accordingly! Iteration: 1 / 2000 [ 0%] (Warmup) Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.32 0.07 -1.46 -1.18 1443 1.00
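The trials() term matters more when data arrive as counts rather than 0/1 scores. A sketch (toy data, not the actual ZennerCards.csv) of collapsing trial-level scores to one row per participant, which could then be fit with the count form of the model:

```r
# toy 0/1 data in the same shape as ZCdata (hypothetical: 3 participants x 50 trials)
set.seed(1)
toy <- data.frame(Participant = rep(1:3, each = 50),
                  Score       = rbinom(150, size = 1, prob = 0.2))

# collapse to one row per participant: number correct and number of trials
agg <- aggregate(Score ~ Participant, data = toy,
                 FUN = function(x) c(Correct = sum(x), Trials = length(x)))
agg <- do.call(data.frame, agg)  # flatten the matrix column aggregate() returns
names(agg) <- c("Participant", "Correct", "Trials")
agg
# the aggregated data could then be fit as, e.g.:
# brm(Correct | trials(Trials) ~ 1, data = agg, family = "binomial")
```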
Model results • Convert posterior logit to proportion # compute means of posteriors post<-posterior_samples(model1) props<-plogis(post$b_Intercept) dev.new() plot(density(props))
Model results • Can quickly answer some questions such as • Probability that the success rate is better than pure chance (0.2)? betterThanChance <-length(props[props > 0.2])/length(props) cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n") Probability that success rate is better than pure chance = 0.82225
Skeptical model • Set a prior to be tight around 0.2 • Have to set it for the logit of 0.2 # Skeptical prior lgSkeptical = qlogis(0.2) stanvars <-stanvar(lgSkeptical, name='lgSkeptical') prs <- c(prior(normal(lgSkeptical, 0.01), class = "Intercept") ) model2 = brm(Score|trials(1) ~ 1, data = ZCdata, family="binomial", prior = prs, stanvars=stanvars ) print(summary(model2)) Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -1.38 0.01 -1.40 -1.36 1253 1.00
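As a check on where the skeptical prior sits: the logit of chance performance is qlogis(0.2) = log(0.2/0.8), about -1.386, which is essentially where the posterior intercept (-1.38) lands under the tight prior. A quick sketch:

```r
# the prior mean on the logit scale
lgSkeptical <- qlogis(0.2)   # log(0.2 / 0.8), about -1.386294
lgSkeptical

# with prior sd = 0.01, the prior pins the success rate very close to 0.2:
plogis(lgSkeptical + c(-2, 0, 2) * 0.01)  # roughly 0.1968, 0.2, 0.2032
```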
Model results • Convert posterior logit to proportion # compute means of posteriors post<-posterior_samples(model2) props<-plogis(post$b_Intercept) dev.new() plot(density(props))
Model results • Can quickly answer some questions such as • Probability that the success rate is better than pure chance (0.2)? betterThanChance <-length(props[props > 0.2])/length(props) cat("Probability that success rate is better than pure chance = ", betterThanChance, "\n") Probability that success rate is better than pure chance = 0.559 > mean(props) [1] 0.2002437
Differences across actual cards model3 = brm(Score|trials(1) ~ 0 + ActualCard, data = ZCdata, family="binomial") • Maybe you suspect that some cards are more easily guessed than other cards Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + ActualCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat ActualCardCircle -1.51 0.17 -1.87 -1.19 4000 1.00 ActualCardCross -1.13 0.16 -1.46 -0.84 4000 1.00 ActualCardSquare -1.01 0.15 -1.30 -0.72 4000 1.00 ActualCardStar -1.52 0.18 -1.86 -1.18 4000 1.00 ActualCardWavy -1.51 0.17 -1.87 -1.18 4000 1.00
Differences across actual cards dev.new() plot(marginal_effects(model3) ) • Plot of marginal means converts logits to proportions
Differences across actual cards • We might want to compare predicted performance of a model that considers different guess rates for different cards against a model that treats all cards equally • A model that uses different probabilities for different cards is expected to do better than a model that ignores card type. • Does this suggest that people have some kind of predictive power that works for some cards and not for other cards? print(model_weights(model1, model3, weights="waic")) model1 model3 0.3752853 0.6247147
Differences in guessed cards model4 = brm(Score|trials(1) ~ 0 + GuessedCard, data = ZCdata, family="binomial") print(summary(model4)) • It might be better to see if participants’ guesses improve model fit Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GuessedCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GuessedCardCircle -1.45 0.17 -1.80 -1.11 4000 1.00 GuessedCardCross -1.32 0.15 -1.63 -1.03 4000 1.00 GuessedCardSquare -1.18 0.15 -1.47 -0.90 4000 1.00 GuessedCardStar -1.53 0.17 -1.87 -1.20 4000 1.00 GuessedCardWavy -1.15 0.18 -1.51 -0.79 4000 1.00
Differences across guessed cards dev.new() plot(marginal_effects(model4) ) • Plot of marginal means converts logits to proportions
Differences across cards • We might want to compare predicted performance of models that: • consider different success rates for different cards • treat all cards equally • consider different success rates for different guesses • A model that uses different probabilities for different cards is expected to do better than a model that ignores card type or a model that is based on guessed cards. print(model_weights(model1, model3, model4, weights="waic")) model1 model3 model4 0.35892977 0.59748876 0.04358147
Interaction of cards and guesses • Maybe subjects are more successful for some combination of cards and guesses? • Duh! • Type Circle, guess Circle will be 100% correct model5 = brm(Score|trials(1) ~ 0 + GuessedCard*ActualCard, data = ZCdata, family="binomial") print(summary(model5))
Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GuessedCard * ActualCard Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000
Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
GuessedCardCircle 18748.29 23335.67 569.90 82365.89 14 1.48
GuessedCardCross -130341.46 188186.65 -687440.63 -1398.18 4 1.95
GuessedCardSquare -125573.75 330002.03 -1389784.88 -567.34 9 1.49
GuessedCardStar -91900.30 138111.23 -516973.74 -2489.86 5 1.71
GuessedCardWavy -52455.04 69700.33 -297285.00 -908.54 13 1.37
ActualCardCross -260964.33 338283.58 -1186086.07 -15152.35 3 2.02
ActualCardSquare -113004.55 122265.01 -497683.26 -6292.06 11 1.66
ActualCardStar -166532.86 214562.89 -861071.25 -7501.83 8 1.63
ActualCardWavy -109945.16 138081.22 -532928.00 -4535.43 12 1.42
GuessedCardCross:ActualCardCross 621878.51 673531.42 50147.20 2308893.53 3 2.83
GuessedCardSquare:ActualCardCross -171411.07 1023514.89 -3291415.19 1607030.41 6 2.55
GuessedCardStar:ActualCardCross -82208.08 313172.23 -1112235.35 376533.31 27 1.14
GuessedCardWavy:ActualCardCross -226065.05 329146.11 -1148391.59 79984.64 7 1.47
GuessedCardCross:ActualCardSquare -141046.83 744019.18 -3237012.28 430138.02 14 1.28
GuessedCardSquare:ActualCardSquare 630016.28 1348957.38 28622.18 6647163.21 8 1.43
GuessedCardStar:ActualCardSquare -516200.69 1222076.03 -5277814.59 190150.81 6 1.48
GuessedCardWavy:ActualCardSquare -254554.08 378122.65 -1105859.03 208732.66 19 1.18
GuessedCardCross:ActualCardStar -236520.96 571416.03 -1990659.48 498791.23 18 1.16
GuessedCardSquare:ActualCardStar -138495.18 405125.90 -1409142.16 453403.85 28 1.06
GuessedCardStar:ActualCardStar 591728.12 684506.33 32211.24 2642849.92 7 1.72
GuessedCardWavy:ActualCardStar -108802.15 264440.16 -908997.28 224749.11 13 1.36
GuessedCardCross:ActualCardWavy -196106.32 347193.69 -1102970.69 423275.50 20 1.21
GuessedCardSquare:ActualCardWavy -420514.11 705384.91 -2611403.36 248315.35 43 1.12
GuessedCardStar:ActualCardWavy -367909.12 576345.19 -2096693.45 79980.82 8 2.04
GuessedCardWavy:ActualCardWavy 416592.62 569143.92 23860.54 1966798.87 8 1.56
Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
Warning messages: 1: The model has not converged (some Rhats are > 1.1). Do not analyse the results! We recommend running more iterations and/or setting stronger priors. 2: There were 3875 divergent transitions after warmup. Increasing adapt_delta above 0.8 may help. See http://mc-stan.org/misc/warnings.html#divergent-transitions-after-warmup
Interaction • If we ignore the warning about model convergence • And don’t notice that the interaction model is stupid • It looks like the interaction model does great! • Indeed, it has to do great, because it has all the answers • Using both guessed and actual cards does not make sense for a model > print(model_weights(model1, model2, model3, model4, model5, weights="waic")) model1 model2 model3 model4 model5 7.780087e-248 1.316699e-247 1.295104e-247 9.446629e-249 1.000000e+00
Effect of trial model6 = brm(Score|trials(1) ~ 0 + Trial, data = ZCdata, family="binomial") print(summary(model6)) • Success could be related to trial (information available for later trials) • Success rate could be higher for later trials Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + Trial Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Trial -0.04 0.00 -0.05 -0.04 1387 1.00
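Since the model has no intercept, the predicted proportion at trial t is simply plogis(-0.04 * t); a small sketch of what the fitted slope implies:

```r
# predicted success rate implied by the fitted slope (-0.04, no intercept)
b_trial <- -0.04
trials <- c(1, 10, 25, 50)
plogis(b_trial * trials)   # roughly 0.49, 0.40, 0.27, 0.12: success falls over trials
```

Note that with no intercept the model forces the success rate toward plogis(0) = 0.5 as Trial approaches 0, which is one reason this parameterization fits so poorly.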
Effect of trial • Plot of marginal means converts logits to proportions • Opposite of information availability! dev.new() plot(marginal_effects(model6) )
Effect of trial • Compare to previous models (except for stupid model5) • Trial does not help the model at all • The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate) • The model that considers variation across actual cards is almost as good • Trading off flexibility for fit > print(model_weights(model1, model2, model3, model4, model6, weights="waic")) model1 model2 model3 model4 model6 2.232912e-01 3.778972e-01 3.716994e-01 2.711215e-02 5.280207e-20
Effect of subject • Could be that subjects have different success rates # By default, participant numbers are treated as _numbers_. Need to correct that. ZCdata$ParticipantFactor = factor(ZCdata$Participant) model7 = brm(Score|trials(1) ~ 0 + ParticipantFactor, data = ZCdata, family="binomial") print(summary(model7)) dev.new() plot(marginal_effects(model7) )
Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + ParticipantFactor Data: ZCdata (Number of observations: 1100) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000
Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat
ParticipantFactor1 -2.06 0.46 -3.04 -1.24 4000 1.00
ParticipantFactor2 -0.77 0.30 -1.36 -0.19 4000 1.00
ParticipantFactor3 -1.42 0.37 -2.17 -0.74 4000 1.00
ParticipantFactor4 -1.71 0.40 -2.56 -0.97 4000 1.00
ParticipantFactor5 -1.55 0.37 -2.33 -0.84 4000 1.00
ParticipantFactor6 -1.56 0.38 -2.37 -0.86 4000 1.00
ParticipantFactor7 -1.30 0.34 -2.02 -0.68 4000 1.00
ParticipantFactor8 -1.30 0.35 -2.00 -0.67 4000 1.00
ParticipantFactor9 -0.77 0.31 -1.38 -0.19 4000 1.00
ParticipantFactor10 -1.42 0.36 -2.16 -0.79 4000 1.00
ParticipantFactor11 -1.19 0.35 -1.90 -0.52 4000 1.00
ParticipantFactor12 -1.87 0.42 -2.79 -1.10 4000 1.00
ParticipantFactor13 -1.57 0.39 -2.38 -0.83 4000 1.00
ParticipantFactor14 -1.30 0.35 -2.02 -0.64 4000 1.00
ParticipantFactor15 -1.57 0.38 -2.36 -0.87 4000 1.00
ParticipantFactor16 -0.68 0.31 -1.29 -0.10 4000 1.00
ParticipantFactor17 -1.07 0.32 -1.71 -0.48 4000 1.00
ParticipantFactor18 -1.56 0.38 -2.35 -0.87 4000 1.00
ParticipantFactor19 -1.18 0.34 -1.87 -0.56 4000 1.00
ParticipantFactor20 -1.18 0.34 -1.88 -0.54 4000 1.00
ParticipantFactor21 -2.06 0.46 -3.02 -1.23 4000 1.00
ParticipantFactor22 -1.42 0.36 -2.19 -0.74 4000 1.00
Effect of subject • Compare to previous models (except for stupid model5) • Subject-specific success rates do not help the model at all • The best model is the one that supposes the success rate is almost precisely 0.2 (guess rate) • The model that considers variation across actual cards is almost as good • Trading off flexibility for fit > print(model_weights(model1, model2, model3, model4, model6, model7, weights="waic")) model1 model2 model3 model4 model6 model7 2.232906e-01 3.778962e-01 3.716984e-01 2.711208e-02 5.280193e-20 2.675989e-06
Your turn • Modify the code to treat subject as a “random” effect • Should shrink performance to indicate less difference between subjects • Modify the code to shrink population scores for each actual card • Set a prior to shrink toward a success rate of 0.2 • Priors have to be set up in terms of logit scores!
Driving data set • Questionnaire about driving in inclement weather • CHO2DRI: How often he or she chooses to drive in inclement weather: 1 = always, 3 = sometimes, 5 = never # load full data file DRdata<-read.csv(file="Driving.csv",header=TRUE,stringsAsFactors=FALSE) # mark whether Choose 2 Drive in inclement weather or not DRdata$Score<-DRdata$CHO2DRI DRdata$Score[DRdata$Score!=5] =0 DRdata$Score[DRdata$Score==5] =1
Hypothesis test • Compare males and females • Z-test
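A frequentist comparison of two proportions can be run with prop.test, which reports the chi-squared statistic equivalent to the squared z statistic. The counts below are hypothetical, for illustration only (they are not the actual female/male tallies from Driving.csv):

```r
# hypothetical counts: number reporting "never drives" out of n, by sex
successes <- c(13, 5)   # females, males (made-up numbers)
n         <- c(44, 17)
prop.test(successes, n, correct = FALSE)  # X-squared here equals z^2
```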
Frequentist binomial regression Call: glm(formula = Score ~ 1, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8083 -0.8083 -0.8083 1.5985 1.5985 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.9510 0.2856 -3.33 0.000868 *** --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 72.189 on 60 degrees of freedom AIC: 74.189 Number of Fisher Scoring iterations: 4 modelf1null = glm(formula = Score ~ 1, family = "binomial", data = DRdata) • Model with no effect of sex
Frequentist binomial regression modelf1 <- glm(Score ~ GENDER, data=DRdata, family="binomial") print(summary(modelf1) ) • Different rates for females and males Call: glm(formula = Score ~ GENDER, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8446 -0.8446 -0.7375 1.5518 1.6942 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.8473 0.3450 -2.456 0.0141 * GENDERMale -0.3159 0.6177 -0.511 0.6091 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 71.922 on 59 degrees of freedom AIC: 75.922 Number of Fisher Scoring iterations: 4
Frequentist binomial regression Call: glm(formula = Score ~ GENDER, family = "binomial", data = DRdata) Deviance Residuals: Min 1Q Median 3Q Max -0.8446 -0.8446 -0.7375 1.5518 1.6942 Coefficients: Estimate Std. Error z value Pr(>|z|) (Intercept) -0.8473 0.3450 -2.456 0.0141 * GENDERMale -0.3159 0.6177 -0.511 0.6091 --- Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘’ 1 (Dispersion parameter for binomial family taken to be 1) Null deviance: 72.189 on 60 degrees of freedom Residual deviance: 71.922 on 59 degrees of freedom AIC: 75.922 Number of Fisher Scoring iterations: 4 > plogis(-0.8437) [1] 0.3007561 > plogis(-0.8437-0.3159) [1] 0.23874 • Convert to proportions
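With treatment (dummy) coding, the female rate is plogis(intercept) and the male rate is plogis(intercept + GENDERMale), since the male offset is added on the logit scale. A sketch using the coefficients from the summary table above:

```r
# treatment (dummy) coding: intercept = females, GENDERMale = offset for males
b0 <- -0.8473   # (Intercept) from the glm summary
b1 <- -0.3159   # GENDERMale
plogis(b0)       # female rate, about 0.30
plogis(b0 + b1)  # male rate, about 0.24
```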
Frequentist model comparison • Compare null model to gender model • Null model is expected to better predict future data
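One frequentist route to this comparison is AIC: the null model wins here because the gender term adds a parameter without reducing the deviance much (AIC 74.189 vs 75.922 in the summaries). A self-contained sketch on simulated data (an assumption for illustration, not the actual driving data):

```r
# simulated binary outcome with no true gender effect (illustration only)
set.seed(626)
sim <- data.frame(GENDER = rep(c("Female", "Male"), each = 50))
sim$Score <- rbinom(100, size = 1, prob = 0.28)

m_null   <- glm(Score ~ 1,      data = sim, family = "binomial")
m_gender <- glm(Score ~ GENDER, data = sim, family = "binomial")
AIC(m_null, m_gender)  # the null model typically has the lower (better) AIC
```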
Bayesian binomial regression model1null = brm(Score|trials(1) ~ 1, data=DRdata, family="binomial") print(summary(model1null)) • No effect of sex Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 1 Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept -0.97 0.28 -1.55 -0.45 1548 1.00
Bayesian binomial regression model1 = brm(Score|trials(1) ~ 0 + GENDER, data=DRdata, family="binomial") print(summary(model1)) • Different rates for females and males Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GENDER Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GENDERFemale -0.87 0.35 -1.55 -0.21 3198 1.00 GENDERMale -1.24 0.55 -2.38 -0.24 2839 1.00
Bayesian binomial regression > plogis(-0.87) [1] 0.2952543 > plogis(-1.24) [1] 0.224436 • Convert to proportions Family: binomial Links: mu = logit Formula: Score | trials(1) ~ 0 + GENDER Data: DRdata (Number of observations: 61) Samples: 4 chains, each with iter = 2000; warmup = 1000; thin = 1; total post-warmup samples = 4000 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat GENDERFemale -0.87 0.35 -1.55 -0.21 3198 1.00 GENDERMale -1.24 0.55 -2.38 -0.24 2839 1.00
Posteriors > mean(propsF) [1] 0.2995896 > mean(propsM) [1] 0.2382168 # compute means of posteriors post<-posterior_samples(model1) propsF<-plogis(post$b_GENDERFemale) propsM<-plogis(post$b_GENDERMale) dev.new() xrange = c(min(c(propsF, propsM)), max(c(propsF, propsM))) yrange = c(0, max(c(density(propsF)$y, density(propsM)$y))) plot(density(propsF), xlim= xrange, ylim=yrange, col="green", main="Posterior Female (green) and Male (red)") lines(density(propsM), col="red", lwd=3, lty=2) • Convert to proportions
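The same posterior samples can answer a directional question: the probability that the female rate exceeds the male rate is just mean(propsF > propsM). The sketch below approximates the logit posteriors with normal draws matched to the summary table (an assumption made so the code runs stand-alone; with the fitted model you would use propsF and propsM from posterior_samples directly):

```r
# normal approximations to the logit posteriors (means/SDs from the summary table)
set.seed(1)
propsF <- plogis(rnorm(4000, mean = -0.87, sd = 0.35))  # females
propsM <- plogis(rnorm(4000, mean = -1.24, sd = 0.55))  # males
mean(propsF > propsM)  # posterior probability that the female rate is higher
```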
Posteriors dev.new() plot(marginal_effects(model1) ) • Can also look at credible intervals
Bayesian model comparison • Compare null model to gender model • Null model is expected to better predict future data > print(model_weights(model1null, model1, weights="loo")) model1null model1 0.7523272 0.2476728
Your turn • The driving data set includes the age of each respondent • Include AGE as a variable in the binomial regression • Compare to the other models • Frequentist • Bayesian