Applied Bayesian Analysis for the Social Sciences. Philip Pendergast Computing and Research Services Department of Sociology philip.pendergast@colorado.edu. Sponsored by Computing and Research Services and the Institute of Behavioral Science.
Suspending Disbelief-- Faith in Classical Statistics What are some issues that we have with classical statistics? Think back to your introductory class…
Suspending Disbelief-- Faith in Classical Statistics • Conducting an infinite number of experiments/ repeated sampling • Assume that some parameter is unknown but has a fixed value • P-value worship • Null hypothesis testing • Multiple comparisons • Strict data assumptions, often unmet • Confidence Interval interpretation • Small samples are an issue
The Coin Flip • Frequentist • We can determine the bias of a coin (b) by repeatedly flipping it and counting heads. As long as we repeat the process enough times, we should be able to estimate the “true” bias of the coin. • If a result at least as extreme as ours would occur with probability p < .05 when b = 0.5, we reject the null hypothesis that the coin is unbiased.
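The frequentist decision rule on this slide can be sketched as an exact two-sided binomial test. This is an illustrative Python sketch, not code from the workshop; the function name and the 61-heads-in-100-flips data are hypothetical.

```python
from math import comb

def binomial_two_sided_p(heads, flips, b_null=0.5):
    """Exact two-sided p-value: the total probability, under the null bias,
    of every outcome that is no more likely than the observed one."""
    def pmf(k):
        return comb(flips, k) * b_null**k * (1 - b_null)**(flips - k)

    observed = pmf(heads)
    # The small tolerance guards against floating-point ties.
    return sum(pmf(k) for k in range(flips + 1) if pmf(k) <= observed * (1 + 1e-9))

# Hypothetical data: 61 heads in 100 flips.
p_value = binomial_two_sided_p(61, 100)
print(f"p = {p_value:.4f}; reject b = 0.5 at the .05 level: {p_value < 0.05}")
```

Note that the p-value is a statement about data we might have seen under b = 0.5, not a probability that b equals 0.5.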
The Nail Flip • Frequentist • We determine the bias of a nail (b) by repeatedly flipping it and counting “heads” (landing on its flat base). • If a result at least as extreme as ours would occur with probability p < .05 when b = 0.5, we reject the null hypothesis that the nail is unbiased. Does this seem reasonable? Don’t we know that the nail is biased?
Classical Statistics is Atheoretical • Science is an iterative process; we should learn from past research. • Theory should guide us in how we analyze data. • Typically, beyond the lit. review, theory informs: • Variable selection • Model building • Choice of model (e.g. SEM, HLM) • But NOT the actual way parameters are estimated in the analysis
Bayesian Statistics and Theory p(θ|y) ∝ p(y|θ)p(θ) • Bayesian statistics considers θ to be unknown, possessing a probability distribution reflecting our degree of uncertainty about it. • We take into consideration theory and uncertainty when estimating θ. • The Posterior: A probability distribution for θ given our data on hand. • The Data: Need only meet the assumption of exchangeability. • The Prior: A distribution based on our knowledge about θ and how certain we are of it.
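To make "posterior is proportional to likelihood times prior" concrete, here is a minimal grid-approximation sketch in Python. The grid, the flat prior, and the 7-heads-in-10-flips data are all assumptions chosen for illustration.

```python
from math import comb

def grid_posterior(heads, flips, prior, grid):
    """Multiply likelihood by prior at each grid value of theta, then normalize."""
    likelihood = [comb(flips, heads) * t**heads * (1 - t)**(flips - heads) for t in grid]
    unnormalized = [lik * p for lik, p in zip(likelihood, prior)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]

grid = [i / 100 for i in range(101)]        # candidate values of theta
flat_prior = [1 / len(grid)] * len(grid)    # assumed prior: every value equally credible
posterior = grid_posterior(heads=7, flips=10, prior=flat_prior, grid=grid)

most_credible = grid[posterior.index(max(posterior))]
print(f"Most credible theta on the grid: {most_credible}")
```

Swapping `flat_prior` for a list that piles weight on particular values shows how theory can enter the estimate directly.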
The Nail Flip • Bayesian • Prior Beliefs: We consult several nail experts, who are relatively certain that nails will land on their heads only 1/50 times, or 2% of the time. • Data on Hand: We flip the nail 100 times. • Posterior: We combine our prior beliefs with the data to obtain the distribution of the nail’s bias given both (the Posterior distribution), then sample from it to see whether the experts’ opinions are reasonable and/or if our nail shares a similar bias to other nails. Well, if we examine the anatomy of the nail…
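One hedged way to sketch the nail example in Python is a conjugate Beta-binomial update. Encoding the experts' 2% belief as a Beta(1, 49) prior (mean 1/50, worth roughly 50 prior flips) is an assumption, and the 5 heads in 100 flips is made-up data; the slides give neither.

```python
def beta_binomial_update(prior_a, prior_b, heads, flips):
    """A Beta(a, b) prior plus binomial data gives a Beta(a + heads, b + tails) posterior."""
    return prior_a + heads, prior_b + (flips - heads)

# Assumed prior: Beta(1, 49) has mean 1 / (1 + 49) = 0.02.
prior_a, prior_b = 1, 49

# Hypothetical data: the nail lands on its head 5 times in 100 flips.
post_a, post_b = beta_binomial_update(prior_a, prior_b, heads=5, flips=100)
posterior_mean = post_a / (post_a + post_b)

print(f"Posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```

The posterior mean lands between the experts' 2% and the observed 5%, pulled toward the data because 100 flips outweigh a prior worth about 50.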
Priors and Subjectivity • “B-B-B-Bbbbut wait, aren’t these priors subjective? We are objective scientists!” • Variable selection, model choice, research questions are all subjective decisions. • By making these subjective decisions explicit, we open ourselves to critique and are forced to thoughtfully choose and defend our choice of priors. If we have no good theory, we must choose a prior that lets the data speak for itself.
Choosing Sensible Priors • How much do we know? How accurate do we take this information to be? • Informative priors: Historical data, expert opinion, past research findings, theoretical implications. • Non-informative prior: Uniform distribution over a sensible range of values. • If the prior has high precision (1/σ²) or N is small, it will heavily influence the posterior distribution. If it has low precision or N is large, the data influences the posterior more.
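The tug-of-war between prior precision and sample size can be sketched for the simplest case: a normal prior on a mean, with the data variance treated as known. That simplification, and all numbers below, are assumptions made for illustration.

```python
def normal_posterior(prior_mean, prior_var, sample_mean, data_var, n):
    """Posterior for a normal mean with known data variance.
    The posterior mean is a precision-weighted average of prior mean and sample mean."""
    prior_precision = 1 / prior_var   # 1 / sigma^2 of the prior
    data_precision = n / data_var     # grows with the sample size N
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * sample_mean) / post_precision
    return post_mean, 1 / post_precision

# High-precision prior, small N: the posterior stays close to the prior mean of 0.
tight_mean, _ = normal_posterior(prior_mean=0, prior_var=0.01, sample_mean=10, data_var=4, n=5)

# Low-precision prior, large N: the posterior sits almost on the sample mean of 10.
vague_mean, _ = normal_posterior(prior_mean=0, prior_var=100, sample_mean=10, data_var=4, n=500)

print(f"tight prior, N=5:   {tight_mean:.3f}")
print(f"vague prior, N=500: {vague_mean:.3f}")
```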
Conjugate Prior Distributions • A conjugate prior is one that, when combined with the data, yields a posterior distribution in the same family as the prior.
Data Distribution | Conjugate Prior
Normal | Normal or Uniform
Poisson | Gamma
Binomial | Beta
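As a small illustration of what conjugacy buys, the Poisson row of the table can be sketched in Python: a Gamma prior on the Poisson rate stays Gamma after seeing counts, so updating is just arithmetic. The shape/rate parameterization, the Gamma(2, 1) prior, and the counts are assumptions for illustration.

```python
def gamma_poisson_update(prior_shape, prior_rate, counts):
    """Gamma(shape, rate) prior + Poisson counts -> Gamma(shape + sum, rate + n) posterior."""
    return prior_shape + sum(counts), prior_rate + len(counts)

counts = [3, 1, 4, 2, 0]  # hypothetical event counts from five observation periods
post_shape, post_rate = gamma_poisson_update(prior_shape=2, prior_rate=1, counts=counts)
posterior_mean = post_shape / post_rate  # the mean of a Gamma(shape, rate) is shape / rate

print(f"Posterior: Gamma({post_shape}, {post_rate}), mean rate = {posterior_mean:.2f}")
```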
The Posterior Distribution and Monte Carlo Integration • Recall that p(θ|y) is a probability distribution. • It is computationally demanding to directly derive summary measures of p(θ|y). • Instead, we repeatedly sample from p(θ|y) and summarize the distribution formed by these samples. • This is called Monte Carlo Integration.
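A minimal Python sketch of Monte Carlo integration: draw many samples from a posterior and summarize the draws instead of doing the integrals analytically. The Beta(6, 144) posterior is an assumed stand-in chosen because the standard library can sample from it directly.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Assumed posterior for illustration: Beta(6, 144).
draws = sorted(rng.betavariate(6, 144) for _ in range(20_000))

# Summaries of the draws approximate the corresponding integrals over the posterior.
mc_mean = sum(draws) / len(draws)
lower = draws[int(0.025 * len(draws))]       # 2.5th percentile
upper = draws[int(0.975 * len(draws)) - 1]   # 97.5th percentile
prob_above_5pct = sum(d > 0.05 for d in draws) / len(draws)

print(f"mean = {mc_mean:.4f}, central 95% interval = ({lower:.4f}, {upper:.4f})")
print(f"P(theta > 0.05 | y) is roughly {prob_above_5pct:.3f}")
```

Any question about the posterior, such as "how probable is a bias above 5%?", becomes a count over the draws.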
Markov Chains, Continued • We specify the number of chains as well as the number of iterations made. • They “dance” around the posterior from starting values, moving to areas of higher density. • Chains eventually stabilize (converge), after which they sample from the posterior rather than wander toward it. • Once stabilized, discard early iterations (Burn-in samples). • Estimates of the posterior come from the post-burn-in period.
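The chain behavior described above can be sketched with a random-walk Metropolis sampler in Python. This is one simple MCMC algorithm, not necessarily the one any particular package uses; the Beta(6, 144) target, the starting values, the step size, and the burn-in length are all assumptions.

```python
import math
import random

def log_posterior(theta, a=6, b=144):
    """Unnormalized log density of the assumed Beta(a, b) target."""
    if not 0 < theta < 1:
        return -math.inf
    return (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta)

def metropolis_chain(start, iterations, step_sd, rng):
    """Random-walk Metropolis: propose a nearby value and accept it with
    probability min(1, posterior ratio); otherwise stay at the current value."""
    chain = [start]
    for _ in range(iterations):
        current = chain[-1]
        proposal = current + rng.gauss(0, step_sd)
        log_ratio = log_posterior(proposal) - log_posterior(current)
        accept_prob = math.exp(min(0.0, log_ratio))
        chain.append(proposal if rng.random() < accept_prob else current)
    return chain

starts = [0.5, 0.2, 0.01]            # deliberately scattered starting values
burn_in, iterations = 1_000, 6_000
chains = [metropolis_chain(s, iterations, 0.02, random.Random(seed))
          for seed, s in enumerate(starts)]

kept = [chain[burn_in:] for chain in chains]   # discard the burn-in samples
pooled = [x for chain in kept for x in chain]
posterior_mean = sum(pooled) / len(pooled)
print(f"Posterior mean from {len(chains)} chains: {posterior_mean:.4f}")
```

Plotting each chain against iteration number (a trace plot) makes the early drift from the starting values, and the reason for discarding it, easy to see.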
Bayesian Analysis (Finally!) • Decide on a model. • Specify the # of Markov Chains, # of iterations, a burn-in period, and your prior beliefs. • Run model diagnostics to check for convergence. • Compare results of models with different specifications of priors, parameters, etc. to see which best “returns” the data in-hand or obtains the best model fit (e.g. BIC, Bayes Factor, Deviance).
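One common convergence diagnostic is the Gelman-Rubin statistic, which compares variation between chains to variation within them; values near 1 suggest the chains agree. The sketch below is a basic Python version of that statistic, and the simulated "chains" are independent normal draws standing in for real sampler output.

```python
import random
from statistics import mean, variance

def gelman_rubin(chains):
    """Basic potential scale reduction factor for equal-length chains."""
    n = len(chains[0])
    within = mean(variance(chain) for chain in chains)             # W
    between_over_n = variance([mean(chain) for chain in chains])   # B / n
    pooled_var = (n - 1) / n * within + between_over_n
    return (pooled_var / within) ** 0.5

rng = random.Random(0)
# Stand-in for converged output: three chains exploring the same distribution.
mixed = [[rng.gauss(0, 1) for _ in range(1_000)] for _ in range(3)]
# Stand-in for non-converged output: each chain stuck in a different region.
stuck = [[rng.gauss(center, 1) for _ in range(1_000)] for center in (0, 3, 6)]

print(f"R-hat, well-mixed chains: {gelman_rubin(mixed):.3f}")
print(f"R-hat, stuck chains:      {gelman_rubin(stuck):.3f}")
```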
Overcoming Classical Shortcomings
• Conducting an infinite number of experiments/repeated sampling: Only use data on hand, no extrapolating to other potential(ly conflicting) data
• Assume that some parameter is unknown but has a fixed value: Directly estimate our uncertainty about θ
• P-value worship: Report HDIs, thoughtfully draw conclusions
• Null hypothesis testing: More meaningful hypothesis testing (e.g. different priors)
• Multiple comparisons: Not an issue
• Strict data assumptions, often unmet: Minimal assumptions (exchangeability)
• Confidence Interval interpretation: HDI shows the believability (probability) of values
• Small samples are an issue: If strong priors, still useful
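Since HDIs (highest density intervals) carry much of the reporting load here, a short Python sketch of how one can be computed from posterior draws may help: it is the narrowest interval containing the requested share of the samples. The Beta(6, 144) draws are an assumed stand-in for real MCMC output.

```python
import math
import random

def hdi(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    window = math.ceil(mass * n)  # number of samples the interval must cover
    # Slide a window of that size across the sorted draws and keep the narrowest.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]

rng = random.Random(1)
draws = [rng.betavariate(6, 144) for _ in range(20_000)]  # assumed posterior draws

hdi_low, hdi_high = hdi(draws)
ordered = sorted(draws)
eq_low, eq_high = ordered[500], ordered[19_499]  # equal-tailed 95% interval for comparison

print(f"95% HDI:          ({hdi_low:.4f}, {hdi_high:.4f})")
print(f"95% equal-tailed: ({eq_low:.4f}, {eq_high:.4f})")
```

For a skewed posterior like this one, the HDI sits lower than the equal-tailed interval because it follows the region of highest density.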
References Kruschke, J. K. (2011). Doing Bayesian Data Analysis: A tutorial with R and BUGS. Oxford: Academic Press. Kaplan, D. (2014). Bayesian Statistics for the Social Sciences. New York: Guilford Press.
R “MCMCpack” Tutorial • Run simple models predicting job satisfaction as a function of income. • One model uses an uninformative prior (specifically, the uniform distribution) • The other uses an informed prior from earlier data • Compare the Bayes Factors to see which “retrieves” the data better (i.e. is a better fit)
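The idea of comparing an uninformative and an informed prior with a Bayes Factor can be sketched outside of R. The Python sketch below is not the MCMCpack regression from this tutorial; it swaps in the simpler nail example, where the marginal likelihood has a closed form. The Beta(1, 49) informed prior and the 5-heads-in-100-flips data are assumptions for illustration.

```python
from math import comb, exp, lgamma, log

def log_beta(a, b):
    """Log of the Beta function, computed via log-gamma for numerical stability."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def marginal_likelihood(heads, flips, a, b):
    """Probability of the data averaged over a Beta(a, b) prior (beta-binomial)."""
    log_m = (log(comb(flips, heads))
             + log_beta(a + heads, b + flips - heads)
             - log_beta(a, b))
    return exp(log_m)

heads, flips = 5, 100  # hypothetical data

m_uniform = marginal_likelihood(heads, flips, a=1, b=1)    # uniform prior
m_informed = marginal_likelihood(heads, flips, a=1, b=49)  # assumed expert prior

bayes_factor = m_informed / m_uniform
print(f"Bayes Factor (informed vs. uniform): {bayes_factor:.2f}")
```

A Bayes Factor above 1 means the data were better anticipated by the informed prior; below 1, by the uniform one.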
R “MCMCpack” Tutorial • Open R • Click “Packages”-->Set CRAN mirror--> Pick anything in the US. • Open “Packages” again-->Install Packages-->Scroll down to MCMCpack. • Say “yes” to a new library. • Type “library(MCMCpack)” to load it in; also type “library(foreign)” to enable reading of the Stata file.