
Bayesian modelling hevruta


Presentation Transcript


  1. Bayesian modelling hevruta How does it work? JAGS, MCMC, and more…

  2. Our goal Combine the Prior and the Likelihood to get the Posterior • Examples of statistical analysis parameters: • μ – for simple normally distributed data • β – GLM coefficients • ρ – for bivariate distributions • Σ (covariance matrix) – for multivariate distributions
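Written out (a standard statement of Bayes' rule, matching the prior/likelihood/posterior labels above):

```latex
\underbrace{p(\theta \mid y)}_{\text{posterior}}
  = \frac{\overbrace{p(y \mid \theta)}^{\text{likelihood}}\;
          \overbrace{p(\theta)}^{\text{prior}}}{p(y)}
  \;\propto\; p(y \mid \theta)\, p(\theta)
```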

  3. Data analysis example We want to say something about the height of a population. We have the following sample:

  4. Data analysis example The first question is: what is the data-generating process? We assume a normal process, so that yᵢ ~ Normal(μ, σ²) (the data variance σ² is assumed to be known for simplicity)

  5. Data analysis example The second question is: what kind of prior do we want to assume for μ?

  6. Generate data
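The slide's data-generation code is not in the transcript. A minimal R sketch, assuming illustrative values (n = 100 heights with a true mean of 170 cm and a known SD of 10 cm; none of these numbers appear on the slides):

```r
set.seed(42)                                 # for reproducibility
n        <- 100                              # sample size (assumed)
true_mu  <- 170                              # true mean height in cm (assumed)
known_sd <- 10                               # data SD, assumed known as on slide 4
y <- rnorm(n, mean = true_mu, sd = known_sd) # simulated sample of heights
hist(y, main = "Sample of heights", xlab = "Height (cm)")
```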

  7. Data analysis example For a weakly informative prior we can choose a prior that represents what we know about heights in general, but allows for high variation. E.g. a normal distribution centered on a typical adult height with a large standard deviation

  8. JAGS model
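The slide's model code is not in the transcript. A minimal rjags sketch of the normal-mean model, reusing y and known_sd from the data-generation sketch above; the prior mean of 170 and prior SD of 20 are illustrative assumptions. Note that JAGS parameterizes dnorm by precision (1/variance), not by SD:

```r
library(rjags)

model_string <- "
model {
  for (i in 1:n) {
    y[i] ~ dnorm(mu, 1 / pow(sigma, 2))  # likelihood; sigma is known
  }
  mu ~ dnorm(170, 1 / pow(20, 2))        # weakly informative prior (assumed values)
}
"

jags <- jags.model(textConnection(model_string),
                   data = list(y = y, n = length(y), sigma = known_sd),
                   n.chains = 3)
update(jags, 1000)                       # burn-in
samples <- coda.samples(jags, variable.names = "mu", n.iter = 10000)
```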

  9. Posterior distribution
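To inspect the sampled posterior (standard coda calls on the samples object from the sketch above, not the slide's own code):

```r
summary(samples)  # posterior mean, SD, and quantiles of mu
plot(samples)     # trace plot and density plot for each chain
```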

  10. Highest posterior density (highest density interval, HDI)

  11. Highest posterior density (highest density interval, HDI)
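One way to compute the interval, using coda on the pooled chains (assumes the samples object from the JAGS sketch):

```r
library(coda)
HPDinterval(as.mcmc(as.matrix(samples)), prob = 0.95)  # 95% HPD interval for mu
```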

  12. Prior check
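The slide's content is not in the transcript; a plausible one-line prior check is simply plotting the prior density (using the assumed Normal(170, SD = 20) prior from the JAGS sketch):

```r
curve(dnorm(x, mean = 170, sd = 20), from = 100, to = 240,
      xlab = expression(mu), ylab = "Prior density")  # weakly informative prior for mu
```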

  13. Posterior predictive check: simulate new data from the posterior

  14. Posterior predictive check: simulate new data from the posterior
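A minimal sketch of the check: draw values of μ from the posterior sample and simulate a replicated dataset for each draw (assumes the y, known_sd, and samples objects from the earlier sketches):

```r
mu_draws <- as.matrix(samples)[, "mu"]        # pooled posterior draws of mu
n_rep <- 100                                  # number of replicated datasets (assumed)
y_rep <- sapply(sample(mu_draws, n_rep), function(m)
  rnorm(length(y), mean = m, sd = known_sd))  # one simulated dataset per draw

# compare an observed summary statistic with its posterior predictive distribution
hist(colMeans(y_rep), main = "Posterior predictive means", xlab = "mean of y_rep")
abline(v = mean(y), col = "red", lwd = 2)     # observed sample mean
```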

  15. Original sample vs. posterior predictive

  16. Binomial likelihood – coin toss

  17. Binomial likelihood – coin toss P-value (one-tailed = 0.028; two-tailed = 0.057)

  18. Prior – the beta distribution – Beta(a=1, b=1)

  19. Beta(a=100, b=100)

  20. Beta(a=10, b=1)

  21. Beta(a=1, b=10)
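A short R sketch plotting the four Beta priors from slides 18–21:

```r
theta <- seq(0, 1, length.out = 500)
par(mfrow = c(2, 2))                         # 2 x 2 grid, one panel per prior
for (ab in list(c(1, 1), c(100, 100), c(10, 1), c(1, 10))) {
  plot(theta, dbeta(theta, ab[1], ab[2]), type = "l",
       main = sprintf("Beta(a=%g, b=%g)", ab[1], ab[2]),
       xlab = expression(theta), ylab = "Density")
}
par(mfrow = c(1, 1))
```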

  22. JAGS model – Beta-binomial
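The slide's code is not in the transcript. A minimal rjags sketch of the beta-binomial model with the flat Beta(1, 1) prior from slide 18; the counts y = 49 and N = 100 are an assumption read off slide 27's y = 49 + 40, N = 200:

```r
bb_string <- "
model {
  y ~ dbin(theta, N)   # binomial likelihood: y heads in N tosses
  theta ~ dbeta(1, 1)  # flat prior on the probability of heads
}
"

bb <- jags.model(textConnection(bb_string),
                 data = list(y = 49, N = 100),  # experiment 1 counts (assumed)
                 n.chains = 3)
update(bb, 1000)
bb_samples <- coda.samples(bb, variable.names = "theta", n.iter = 10000)
```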

  23. Today’s posterior is tomorrow’s prior

  24. Conclusions?

  25. JAGS model

  26. Posterior beliefs after experiment 1 vs. posterior beliefs after both experiments

  27. Using experiment 1's posterior as experiment 2's prior vs. combining the results of both experiments (y = 49 + 40, N = 200)
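With a conjugate Beta prior the two routes give the identical posterior; a quick check in R (reading y = 49 + 40, N = 200 as experiment 1: 49/100 and experiment 2: 40/100, which is an assumption about the split):

```r
a0 <- 1; b0 <- 1                      # Beta(1, 1) prior
# Route 1: sequential updating
a1 <- a0 + 49; b1 <- b0 + (100 - 49)  # posterior after experiment 1...
a2 <- a1 + 40; b2 <- b1 + (100 - 40)  # ...used as the prior for experiment 2
# Route 2: combining both experiments at once
a_all <- a0 + 89; b_all <- b0 + (200 - 89)
c(a2, b2)        # 90 112
c(a_all, b_all)  # 90 112, the same Beta(90, 112) posterior
```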

  28. MCMC – why do we need this • For discrete parameters we can pretty easily sum across all possible values of θ to obtain the normalized posterior – at least when there is a single parameter to estimate… • For continuous parameters we can't easily integrate across θ, especially when there is more than one parameter to estimate, so we have to work with the non-normalized posterior (both forms are written out below)
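The two forms referred to on the slide (standard Bayes'-rule expressions; the integral in the denominator is the normalizing constant that MCMC lets us avoid computing):

```latex
\text{normalized: } p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta)\, p(\theta)\, d\theta}
\qquad
\text{non-normalized: } p(\theta \mid y) \propto p(y \mid \theta)\, p(\theta)
```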

  29. MCMC – how does it work • A group of algorithms that can use our knowledge about the relative (non-normalized) posterior probability of each pair of values to build a representative sample from the posterior distribution • Kruschke's island-hopping politician metaphor: • Wants to visit each island proportionally to its population • Knows only the population of the current island and can find out the population of the two adjacent islands • The Metropolis algorithm (simplification): • Toss a coin to decide whether to check the population of the island to the right or to the left • If the population of the proposed island is bigger than the population of the current island – go there • Otherwise, go to the proposed island with a probability of p(proposed) / p(current) (a sketch follows below)
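A sketch of the island-hopping walk in R. Following Kruschke's textbook example, I assume seven islands whose populations are proportional to their index; the decision rule is the one on the slide:

```r
set.seed(1)
population <- 1:7                     # relative island populations (assumed)
steps   <- 50000
current <- 4                          # start on the middle island
visits  <- integer(steps)
for (t in 1:steps) {
  proposal <- current + sample(c(-1, 1), 1)    # coin toss: left or right
  if (proposal >= 1 && proposal <= 7 &&        # off the ends counts as population 0
      runif(1) < population[proposal] / population[current]) {
    current <- proposal                        # accept with prob min(1, p_prop / p_curr)
  }
  visits[t] <- current
}
table(visits) / steps  # visit frequencies approximate population / sum(population)
```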

  30. The magic works…

  31. A more realistic example • Start at an initial value θ_current for which p(θ_current | y) > 0 • Calculate the relative probability of each θ by the non-normalized posterior p(y | θ) p(θ) • Draw the proposed value θ_proposed from a normal distribution centered around the current value (the SD of the proposal distribution will affect the smoothness of the random walk) • Use the same decision rule: accept with probability min(1, p(θ_proposed | y) / p(θ_current | y)) (a sketch follows below)
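A minimal R sketch of these steps for the normal-mean example, reusing y and known_sd from earlier; the proposal SD and the prior values are assumptions, and working on the log scale only avoids numerical underflow, it is otherwise the same rule:

```r
set.seed(2)
log_post <- function(mu) {                           # log non-normalized posterior
  sum(dnorm(y, mu, known_sd, log = TRUE)) +          # log-likelihood
    dnorm(mu, 170, 20, log = TRUE)                   # log-prior (assumed values)
}
steps       <- 20000
proposal_sd <- 2                                     # tuning parameter (assumed)
chain    <- numeric(steps)
chain[1] <- mean(y)                                  # initial value with p(theta | y) > 0
for (t in 2:steps) {
  proposal <- rnorm(1, chain[t - 1], proposal_sd)    # proposal centered on current value
  accept <- log(runif(1)) < log_post(proposal) - log_post(chain[t - 1])
  chain[t] <- if (accept) proposal else chain[t - 1] # Metropolis decision rule
}
```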

  32. Other (more efficient) samplers • Gibbs sampler – used by JAGS and WinBUGS. • Great for models with a lot of parameters, especially when parameters are inter-dependent (e.g. hierarchical models) • Hamiltonian MCMC sampler – used by Stan • We probably won't talk about it unless you want us to. It's faster but more complicated software that is much better than JAGS for more complex models (e.g. time-series models, multivariate models, models with complex covariance structures)

  33. MCMC diagnostics - representativeness • Trace plot
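With coda, the trace plot of the earlier samples object is one call:

```r
traceplot(samples)  # chains should overlap and look like stationary 'fuzzy caterpillars'
```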

  34. MCMC diagnostics - representativeness • Actual posteriors

  35. MCMC diagnostics – representativeness • Gelman-Rubin statistic • Both the point estimate and its upper confidence limit should be at around 1, and not higher than 1.1. This can also be used to decide on the burn-in.
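The statistic is available in coda (it needs at least two chains, as in the runs above):

```r
gelman.diag(samples)  # point estimate and upper CI; both should be ~1 and below 1.1
gelman.plot(samples)  # shrink factor vs. iteration, useful for choosing the burn-in
```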

  36. MCMC diagnostics – autocorrelation • Why is there autocorrelation? • Why is it a problem? • Effective sample size – the size of a sample of relatively independent iterations (10,000 is a good heuristic number, per Kruschke)
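Both quantities are one call each in coda:

```r
autocorr.plot(samples)  # autocorrelation at increasing lags, per chain
effectiveSize(samples)  # effective number of independent draws (aim for ~10,000)
```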

  37. Diagnostics – the beta-binomial example

  38. Diagnostics – the beta-binomial example

  39. Diagnostics – the beta-binomial example The effective sample size is larger than 9,000 because of negative autocorrelations; this can be ignored.
