Learn about Bayesian theory, inference methods, conjugate families, and computational techniques for generating posterior distributions. Explore examples and computational simulations in R for predictive analysis.
Bayesian Essentials Slides by Peter Rossi and David Madigan
Distribution Theory 101 • Marginal and Conditional Distributions • Joint of (X, Y): uniform on the triangle 0 < y < x < 1 (figure: the triangle in the unit square, axes X and Y)
Simulating from the Joint • To draw from the joint: • i. Draw X from its marginal • ii. Condition on this draw, and draw Y from the conditional of Y|X

```r
library(triangle)
NumDraws <- 10000
x <- rtriangle(NumDraws, 0, 1, 1)  # marginal of X: triangle on [0,1], peak at 1
y <- runif(NumDraws, 0, x)         # conditional Y|X ~ unif(0, x)
plot(x, y)
```
Triangular Distribution • If U ~ unif(0,1), then sqrt(U) has the standard triangle distribution (density 2x on [0,1]). • If U1, U2 ~ unif(0,1), then Y = max{U1, U2} has the standard triangle distribution.
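Both constructions can be checked quickly in R; a minimal sketch (sample size and seed are illustrative), using the fact that the standard triangle density 2x on [0,1] has mean 2/3:

```r
set.seed(1)
n <- 100000
u  <- runif(n)
u1 <- runif(n); u2 <- runif(n)

x1 <- sqrt(u)        # construction 1: square root of a uniform
x2 <- pmax(u1, u2)   # construction 2: max of two uniforms

# Both sample means should be near 2/3, the mean of the density 2x on [0,1]
c(mean(x1), mean(x2))
```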
Sampling Importance Resampling • Goal: draw from a target density f when we can only draw from a proposal density g • Draw a big sample from g • Sub-sample from that sample with probability proportional to f/g
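The two-step recipe can be sketched in R. Here the target f is the standard triangle density 2x on [0,1] and the proposal g is unif(0,1), so the resampling weights are proportional to f/g = 2x (sample sizes are illustrative):

```r
set.seed(1)
BigN <- 100000; SmallN <- 10000

# Step 1: draw a big sample from the proposal g = unif(0,1)
g_draws <- runif(BigN)

# Step 2: resample with probability proportional to f/g = 2x
w <- 2 * g_draws
f_draws <- sample(g_draws, SmallN, replace = TRUE, prob = w)

mean(f_draws)   # should be near 2/3, the mean of the triangle target
```

Note that `sample()` normalizes `prob` internally, so unnormalized weights f/g are enough.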
Metropolis • Start with current = 0.5 • To get the next value: draw a "proposal" from g • Keep the proposal with probability min{1, f(proposal)/f(current)}; else keep current
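A minimal R sketch of this scheme, again with the standard triangle density 2x on [0,1] as the target f and unif(0,1) proposals (for a uniform independence proposal the acceptance ratio reduces to f(proposal)/f(current)):

```r
set.seed(1)
f <- function(x) 2 * x    # target density on [0,1]

NumDraws <- 50000
draws <- numeric(NumDraws)
current <- 0.5            # starting value
for (i in 1:NumDraws) {
  proposal <- runif(1)    # draw a proposal from g = unif(0,1)
  # accept with probability min(1, f(proposal)/f(current))
  if (runif(1) < f(proposal) / f(current))
    current <- proposal
  draws[i] <- current
}
mean(draws)   # should be near 2/3
```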
The Goal of Inference • Make inferences about unknown quantities using available information. • Inference: make probability statements. • Unknowns: parameters, functions of parameters, states or latent variables, "future" outcomes, outcomes conditional on an action. • Information: data-based; non-data-based (theories of behavior, subjective views, mechanism; e.g., parameters are finite or lie in some range).
Bayes theorem • p(θ|D) ∝ p(D|θ) p(θ) • Posterior ∝ Likelihood × Prior • Modern Bayesian computing: simulation methods for generating draws from the posterior distribution p(θ|D).
Summarizing the posterior • Output from Bayesian inference: a possibly high-dimensional posterior distribution. • Summarize this object via simulation: report the marginal distributions of the individual parameters; don't just compute a point estimate. • Contrast with sampling theory: a point estimate/standard error is a summary of an irrelevant (sampling) distribution, a bad summary (normal approximation), and subject to the limitations of asymptotics.
Metropolis • Start somewhere with θ_current. • To get the next value, generate a proposal θ_proposal. • Accept with probability min{1, p(θ_proposal|D)/p(θ_current|D)}; else keep θ_current.
Example Believe these measurements (D) come from N(μ,1): 0.9072867 -0.4490744 -0.1463117 0.2525023 0.9723840 -0.8946437 -0.2529104 0.5101836 1.2289795 0.5685497 Prior for μ? p(μ) = 2μ on [0,1] (the standard triangle density)
Example, continued • p(D|μ)? With y1, …, y10 iid N(μ,1): p(D|μ) ∝ exp(−½ Σ (yi − μ)²) • Switch to R… • Other priors? unif(0,1), norm(0,1), norm(0,100) • How do we generate good candidates (proposals)?
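The "switch to R" step might look like the following Metropolis sampler for p(μ|D) ∝ p(D|μ)·2μ, using unif(0,1) candidates (which match the support of the prior p(μ) = 2μ on [0,1]); the data are the ten measurements above, and the number of draws is illustrative:

```r
set.seed(1)
y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)

# Unnormalized posterior: N(mu, 1) likelihood times the prior 2*mu on [0,1]
post <- function(mu) prod(dnorm(y, mean = mu, sd = 1)) * 2 * mu

NumDraws <- 20000
draws <- numeric(NumDraws)
current <- 0.5
for (i in 1:NumDraws) {
  proposal <- runif(1)    # candidate from unif(0,1)
  if (runif(1) < post(proposal) / post(current))
    current <- proposal
  draws[i] <- current
}
mean(draws)   # posterior mean of mu
hist(draws)   # marginal posterior of mu
```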
Prediction • ỹ: a future observable. • See D, then compute p(ỹ|D) = ∫ p(ỹ|θ) p(θ|D) dθ — the "Predictive Distribution".
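Simulating the predictive distribution is one extra step on top of posterior draws: for each posterior draw of the parameter, draw one new observation. A sketch for the N(μ,1) example with prior 2μ on [0,1]; the grid-based posterior sampler here is just an illustrative stand-in for any posterior sampler:

```r
set.seed(1)
y <- c(0.9072867, -0.4490744, -0.1463117, 0.2525023, 0.9723840,
       -0.8946437, -0.2529104, 0.5101836, 1.2289795, 0.5685497)

# Posterior draws of mu on a fine grid (stand-in for any posterior sampler)
grid <- seq(0.001, 0.999, length.out = 1000)
w <- sapply(grid, function(mu) prod(dnorm(y, mu, 1)) * 2 * mu)
mu_draws <- sample(grid, 10000, replace = TRUE, prob = w)

# Predictive distribution: for each posterior draw of mu, draw a new y
y_pred <- rnorm(length(mu_draws), mean = mu_draws, sd = 1)
hist(y_pred)   # p(y_new | D): wider than N(mu_hat, 1) because mu is uncertain
```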
Bayes/Classical Estimators • As the sample grows, the prior washes out — it acts locally uniform — so Bayes and classical estimators agree asymptotically. • Bayes is consistent unless you have a dogmatic prior (one that puts zero probability near the truth).
Bayesian Computations • Before simulation methods, Bayesians used posterior expectations of various functions as the summary of the posterior: E[h(θ)|D] = ∫ h(θ) p(θ|D) dθ. • If p(θ|D) is in a convenient form (e.g. normal), then this integral can be computed analytically for some h.
Conjugate Families • Models with convenient analytic properties almost invariably come from conjugate families. • Why care now? Conjugate models are used as building blocks, and they build intuition about the mechanics of Bayesian inference. • Definition: a prior is conjugate to a likelihood if the posterior is in the same class of distributions as the prior. • Basically, a conjugate prior is like the posterior from some imaginary dataset combined with a diffuse prior.
Need a prior! The Beta-Binomial model: y|θ ~ Binomial(n, θ) with conjugate prior θ ~ Beta(a, b), giving posterior θ|y ~ Beta(a + y, b + n − y).
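The conjugate update can be sketched in R; the data (y = 7 successes in n = 10 trials) and the Beta(2, 2) prior are illustrative:

```r
set.seed(1)
a <- 2; b <- 2      # Beta prior parameters
n <- 10; y <- 7     # observed: y successes in n trials

# Conjugate update: posterior is Beta(a + y, b + n - y)
post_a <- a + y
post_b <- b + n - y

theta_draws <- rbeta(10000, post_a, post_b)
mean(theta_draws)             # simulation estimate of the posterior mean
post_a / (post_a + post_b)    # exact posterior mean, (a + y)/(a + b + n)
```

Because the posterior is a known Beta, the simulation here is only a check; in non-conjugate models the draws are all you have.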
Bayesian Regression • Model: y = Xβ + ε, ε ~ N(0, σ²I) • Prior: β|σ² ~ N(β̄, σ²A⁻¹); interpretation as information from another dataset. • Inverted Chi-Square prior for σ²: σ² = νs²/χ²_ν • Draw from the prior?
IID Simulations Scheme: model [y|X, β, σ²] with prior [β|σ²][σ²]; the posterior factors as [β, σ²|y, X] = [σ²|y, X] [β|σ², y, X] 1) Draw [σ² | y, X] 2) Draw [β | σ², y, X] 3) Repeat
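The two-step scheme can be sketched in R for the conjugate normal regression model above. The simulated data and prior settings (betabar, A, nu, ssq) are illustrative; the posterior mean betatilde and the scale of the σ² draw follow the standard conjugate-update formulas:

```r
set.seed(1)
# Illustrative data: y = X beta + e, e ~ N(0, sigma^2), sigma = 1
n <- 100
X <- cbind(1, rnorm(n))
beta_true <- c(1, 2)
y <- X %*% beta_true + rnorm(n)

# Prior: beta|sigma^2 ~ N(betabar, sigma^2 * A^-1); nu*ssq/sigma^2 ~ chi^2_nu
betabar <- c(0, 0); A <- 0.01 * diag(2)
nu <- 3; ssq <- 1

XtX <- crossprod(X)
betatilde <- drop(solve(XtX + A, crossprod(X, y) + A %*% betabar))
# Posterior sum of squares entering the sigma^2 draw
s <- drop(crossprod(y - X %*% betatilde) +
          t(betatilde - betabar) %*% A %*% (betatilde - betabar))

NumDraws <- 2000
beta_draws  <- matrix(0, NumDraws, 2)
sigsq_draws <- numeric(NumDraws)
for (i in 1:NumDraws) {
  # 1) Draw sigma^2 | y, X from its inverted chi-square posterior
  sigsq <- (nu * ssq + s) / rchisq(1, nu + n)
  # 2) Draw beta | sigma^2, y, X from N(betatilde, sigsq * (X'X + A)^-1)
  root <- chol(sigsq * solve(XtX + A))
  beta_draws[i, ] <- betatilde + drop(t(root) %*% rnorm(2))
  sigsq_draws[i] <- sigsq
}
colMeans(beta_draws)   # should be near beta_true
mean(sigsq_draws)      # should be near 1
```

The draws are iid (not a Markov chain) because the posterior factors exactly into [σ²|y, X][β|σ², y, X].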