Bayesian style ANOVA Greg Francis PSY 626: Bayesian Statistics for Psychological Science Fall 2018 Purdue University
Smiles and Leniency • LaFrance & Hecht (1995): a vignette describes a person committing some minor infraction. A photo shows the person with one of three smile types (false, felt, or miserable) or a neutral expression • The question is how lenient subjects are in punishing the person in the vignette • Each subject contributes one score
Standard analysis • Summary statistics, One-way ANOVA
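• For reference, a minimal sketch of this standard analysis in base R (assuming SLdata has been read in as shown on a later slide):
# Summary statistics per condition
aggregate(Leniency ~ SmileType, data = SLdata, FUN = mean)
aggregate(Leniency ~ SmileType, data = SLdata, FUN = sd)
# One-way between-subjects ANOVA
fit <- aov(Leniency ~ SmileType, data = SLdata)
summary(fit)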
Standard analysis • Contrasts to compare neutral versus each other condition
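• A sketch of one way to get these contrasts: make Neutral the reference level, so each regression slope tests a smile condition against Neutral (again assuming SLdata is loaded):
SLdata$SmileType <- relevel(SLdata$SmileType, ref = "Neutral")
contrast.fit <- lm(Leniency ~ SmileType, data = SLdata)
summary(contrast.fit)  # each slope compares a smile type with Neutral
SLdata$SmileType <- relevel(SLdata$SmileType, ref = "False")  # restore the original coding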
Bayesian variation of ANOVA • We start with default priors (uniform, improper) • Follow along by downloading “SmilesLeniency.csv” and “SmilesLeniency1.R” from the class web site # load data file SLdata<-read.csv(file="SmilesLeniency.csv",header=TRUE,stringsAsFactors=TRUE) > head(SLdata) SmileType Leniency 1 False 2.5 2 False 5.5 3 False 6.5 4 False 3.5 5 False 3.0 6 False 3.5
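• A quick sanity check is useful here, because R orders factor levels alphabetically and that ordering determines the model parameterization on the next slide:
levels(SLdata$SmileType)   # "False" comes first, so it becomes the reference level
table(SLdata$SmileType)    # number of subjects in each condition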
Define the model • ANOVA is just linear regression • This makes it a bit awkward to interpret the results • First level of SmileType (False) becomes the intercept • Every other condition is a slope relative to that intercept • library(brms) • model1 = brm(Leniency ~ SmileType, data = SLdata, iter = 2000, warmup = 200, chains = 3, thin = 2 )
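• To see the dummy coding concretely, inspect the design matrix R builds from the same formula (a sketch):
# One intercept column plus a 0/1 indicator for each non-reference level
head(model.matrix(~ SmileType, data = SLdata))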
Output • print(summary(model1)) • Prints out a table of estimates of Intercept (False SmileType) and deviations for other conditions Family: gaussian Links: mu = identity; sigma = identity Formula: Leniency ~ SmileType Data: SLdata (Number of observations: 136) Samples: 3 chains, each with iter = 2000; warmup = 200; thin = 2; total post-warmup samples = 2700 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 5.37 0.28 4.81 5.95 2470 1.00 SmileTypeFelt -0.46 0.40 -1.24 0.30 2442 1.00 SmileTypeMiserable -0.46 0.40 -1.25 0.28 2464 1.00 SmileTypeNeutral -1.24 0.40 -2.01 -0.41 2576 1.00 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 1.64 0.10 1.46 1.87 2700 1.00 Samples were drawn using sampling(NUTS). For each parameter, Eff.Sample is a crude measure of effective sample size, and Rhat is the potential scale reduction factor on split chains (at convergence, Rhat = 1).
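• Because the non-Intercept terms are deviations from the False condition, the implied condition means can be recovered from the fixed effects (a sketch using brms's fixef()):
b <- fixef(model1)[, "Estimate"]
c(False = unname(b["Intercept"]),
  Felt = unname(b["Intercept"] + b["SmileTypeFelt"]),
  Miserable = unname(b["Intercept"] + b["SmileTypeMiserable"]),
  Neutral = unname(b["Intercept"] + b["SmileTypeNeutral"]))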
Output • # Check on model convergence and posteriors • plot(model1) • Looks good!
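• The trace plots can be complemented with numeric checks (a sketch; rhat() and neff_ratio() are generics that brms supports):
rhat(model1)        # values near 1 indicate convergence
neff_ratio(model1)  # effective sample size relative to total samples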
Output • # Plot model fit to data • dev.new() • plot(marginal_effects(model1), points = TRUE)
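• Another useful check is a posterior predictive plot; brms's pp_check() overlays data simulated from the fitted model on the observed Leniency distribution:
dev.new()
pp_check(model1)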
Like Tukey # Compare each condition to every other condition (similar to Tukey's test)
mcmc = model1
coefs <- as.matrix(mcmc)[, 1:4]
newdata <- data.frame(x = levels(SLdata$SmileType))
# A Tukey contrast matrix
library(multcomp)
# table(newdata$x) gets the number of replicates of each level
tuk.mat <- contrMat(n = table(newdata$x), type = "Tukey")
Xmat <- model.matrix(~x, data = newdata)
pairwise.mat <- tuk.mat %*% Xmat
pairwise.mat
# plot posteriors of differences
library(bayesplot)
dev.new()
mcmc_areas(coefs %*% t(pairwise.mat)) • Look at differences between conditions
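• If you only need one or two comparisons, a lighter-weight sketch uses brms's hypothesis(), which evaluates a linear combination of coefficients directly from the posterior:
hypothesis(model1, "SmileTypeNeutral - SmileTypeFelt = 0")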
HPDI • You might be interested in the highest posterior density interval (HPDI) for the differences of means
# Generate HPDIs for differences
library(broom)  # provides tidyMCMC() (broom.mixed in newer versions)
comps <- tidyMCMC(coefs %*% t(pairwise.mat), conf.int = TRUE, conf.method = "HPDinterval")
print(comps)
                 term     estimate std.error   conf.low   conf.high
1        Felt - False -0.458715814 0.3984868 -1.1789537  0.35172475
2   Miserable - False -0.462241088 0.3998117 -1.2402750  0.29442586
3     Neutral - False -1.239229116 0.4043405 -2.0430631 -0.47332554
4    Miserable - Felt -0.003525275 0.3985073 -0.7327741  0.78471045
5      Neutral - Felt -0.780513302 0.4038958 -1.5626832 -0.01610910
6 Neutral - Miserable -0.776988028 0.4006923 -1.5729656 -0.03355703
Plot HPDI • I find I have to run this directly from the R console rather than by sourcing the file
dev.new()
ggplot(comps, aes(y = estimate, x = term)) +
  geom_pointrange(aes(ymin = conf.low, ymax = conf.high)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  scale_y_continuous("Effect size") +
  scale_x_discrete("") +
  coord_flip() +
  theme_classic()
ANOVA versus Bayesian • Note that we get the same pattern of results either way • It looks like the Neutral condition differs from each of the other conditions • E.g., it is tempting to note that the HPDI of the Neutral - Felt difference does not include 0: significance?! • Wait, we can do better • Don't we worry about multiple comparisons? • Not really • The Bayesian analysis is not about making decisions; it is about estimating parameter values, given the prior and the data • You might still make Type I errors, but you cannot control the Type I error rate anyhow
Posteriors • You can compute things that are not possible with the standard ANOVA • Probability that the mean Leniency rating for a False smile is bigger than the mean Leniency rating for a Miserable smile
# Extract posteriors from samples
post <- posterior_samples(model1)
# Compute probability that mean leniency rating is bigger for a False smile than for a Miserable smile
FalseLeniency <- post$b_Intercept
MiserableLeniency <- post$b_Intercept + post$b_SmileTypeMiserable
difLeniency <- FalseLeniency - MiserableLeniency
length(difLeniency[difLeniency > 0])/length(difLeniency)
[1] 0.8703704
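• The same recipe works for any pair of conditions; for example, a sketch of the probability that mean Leniency is lower for a Neutral expression than for a Felt smile (using the post samples above):
FeltLeniency <- post$b_Intercept + post$b_SmileTypeFelt
NeutralLeniency <- post$b_Intercept + post$b_SmileTypeNeutral
mean(NeutralLeniency < FeltLeniency)  # proportion of posterior samples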
Using priors • model2 = brm(Leniency ~ SmileType, data = SLdata, iter = 2000, warmup = 200, chains = 3, thin = 2, prior = c(prior(normal(0, 10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0, 5), class = "sigma")) ) Family: gaussian Links: mu = identity; sigma = identity Formula: Leniency ~ SmileType Data: SLdata (Number of observations: 136) Samples: 3 chains, each with iter = 2000; warmup = 200; thin = 2; total post-warmup samples = 2700 Population-Level Effects: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat Intercept 5.36 0.28 4.83 5.90 2585 1.00 SmileTypeFelt -0.45 0.40 -1.21 0.33 2530 1.00 SmileTypeMiserable -0.45 0.40 -1.24 0.31 2183 1.00 SmileTypeNeutral -1.24 0.40 -2.01 -0.43 2423 1.00 Family Specific Parameters: Estimate Est.Error l-95% CI u-95% CI Eff.Sample Rhat sigma 1.64 0.10 1.45 1.86 2561 1.00
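• It is worth confirming which priors the fitted model actually used; brms provides prior_summary() for this:
prior_summary(model2)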
Using priors • model2 = brm(Leniency ~ SmileType, data = SLdata, iter = 2000, warmup = 200, chains = 3, thin = 2, prior = c(prior(normal(0, 10), class = "Intercept"), prior(normal(0, 10), class = "b"), prior(cauchy(0, 5), class = "sigma")) ) • Class activity: try to “break” the analysis by using crazy priors
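• As a starting point for the activity, a sketch of one deliberately unreasonable prior (model3 is just an illustrative name, and the numbers are arbitrary, chosen to fight the data):
# A prior that insists every condition effect is near -20
model3 = brm(Leniency ~ SmileType, data = SLdata,
             iter = 2000, warmup = 200, chains = 3, thin = 2,
             prior = c(prior(normal(-20, 0.1), class = "b")))
print(summary(model3))  # compare the estimates with model1 and model2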
Conclusions • Bayesian ANOVA is just linear regression with priors • No multiple-comparisons correction (because no decisions are being made, yet) • Priors can be set explicitly; check that reasonable choices do not change the conclusions