More about Posterior Distributions
• The process of Bayesian inference involves passing from a prior distribution to a posterior distribution. It is natural to expect that some general relations hold between these two distributions.
• For example, since the posterior distribution incorporates the information in the data, it should on average be less variable than the prior distribution.
Recall – Conditional Expectation
• The theorem of total expectation states that $E\bigl[E(X \mid Y)\bigr] = E(X)$; a useful consequence for variances is derived below.
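Combining total expectation with the definition of conditional variance yields the law of total variance, which the claim on the next slide relies on. This is a standard derivation, not recovered from the slides:

```latex
% Law of total variance, obtained from the theorem of total expectation:
\begin{aligned}
\operatorname{Var}(X)
  &= E(X^2) - [E(X)]^2 \\
  &= E\bigl[E(X^2 \mid Y)\bigr] - \bigl[E\bigl(E(X \mid Y)\bigr)\bigr]^2
     && \text{(total expectation, applied twice)} \\
  &= E\bigl[\operatorname{Var}(X \mid Y) + \{E(X \mid Y)\}^2\bigr]
     - \bigl[E\bigl(E(X \mid Y)\bigr)\bigr]^2 \\
  &= E\bigl[\operatorname{Var}(X \mid Y)\bigr] + \operatorname{Var}\bigl(E(X \mid Y)\bigr).
\end{aligned}
```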
Claim
• The posterior variance is on average smaller than the prior variance, that is, $E\bigl[\operatorname{Var}(\theta \mid x)\bigr] \le \operatorname{Var}(\theta)$.
• Proof: sketched below.
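A proof sketch, presumably the standard argument via the law of total variance above, taking $X = \theta$ and conditioning on the data $x$:

```latex
\operatorname{Var}(\theta)
  = E\bigl[\operatorname{Var}(\theta \mid x)\bigr]
    + \operatorname{Var}\bigl(E(\theta \mid x)\bigr)
  \ge E\bigl[\operatorname{Var}(\theta \mid x)\bigr],
```

since $\operatorname{Var}\bigl(E(\theta \mid x)\bigr) \ge 0$, with equality only when the posterior mean does not vary with the data.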
Example
• In the location normal model, the posterior variance is smaller than the variance of the sample mean: with $X_1, \dots, X_n$ i.i.d. $N(\theta, \sigma_0^2)$, $\sigma_0^2$ known, and prior $\theta \sim N(\mu_0, \tau_0^2)$, the posterior variance is $\bigl(1/\tau_0^2 + n/\sigma_0^2\bigr)^{-1} < \sigma_0^2/n$.
• Consider the numerical illustration sketched below.
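A minimal Python sketch of this comparison; the numbers (n, σ₀², τ₀²) are hypothetical, chosen only to illustrate the inequality:

```python
# Location normal model: X_1,...,X_n iid N(theta, sigma0_sq), sigma0_sq known,
# with prior theta ~ N(mu0, tau0_sq).  All numbers below are hypothetical.
n = 25            # sample size
sigma0_sq = 4.0   # known sampling variance sigma_0^2
tau0_sq = 2.0     # prior variance tau_0^2

# Posterior variance of theta: (1/tau_0^2 + n/sigma_0^2)^(-1)
post_var = 1.0 / (1.0 / tau0_sq + n / sigma0_sq)
var_xbar = sigma0_sq / n   # variance of the sample mean

print(f"posterior variance = {post_var:.4f}")   # 0.1481
print(f"Var(sample mean)   = {var_xbar:.4f}")   # 0.1600
```

The posterior variance falls below $\sigma_0^2/n$ for every choice of the constants, since adding the prior precision $1/\tau_0^2 > 0$ to the sampling precision $n/\sigma_0^2$ can only shrink the reciprocal.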
Inference Based on the Posterior
• As we have seen, the principle of conditional probability implies that the posterior distribution contains all the relevant information about the unknown parameter that is in the sampling model, the prior, and the data.
• We now proceed to make inferences about the unknown parameter, or some other characteristic of interest that is a function of it, using the posterior distribution.
• In particular, we will specify how to compute estimates and credible regions and how to carry out hypothesis assessment.
Estimation – Posterior Mode
• Suppose we want to calculate an estimate of the parameter of interest θ based on its posterior distribution. There are several different approaches to this problem.
• One of the most natural estimates is the posterior mode $\hat{\theta}$: the point where the posterior probability or density function of θ attains its maximum.
• In the discrete case, it is the value with the highest posterior probability; in the continuous case, it is the value with the highest amount of posterior probability in a short interval containing it.
• To calculate the posterior mode we need to maximize the posterior density $\pi(\theta \mid x)$ as a function of θ. Note that this is equivalent to maximizing the product $\pi(\theta)\, f_\theta(x)$ of the prior and the likelihood, so we do not need to compute the inverse normalizing constant; a numerical sketch follows.
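As a concrete illustration of maximizing prior times likelihood without the normalizing constant, here is a minimal Python sketch; the normal-normal setup and all numbers are assumptions chosen for illustration, not the slides' example:

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.stats import norm

# Hypothetical setup: prior theta ~ N(mu0, tau0^2), data x_i ~ N(theta, sigma0^2).
rng = np.random.default_rng(0)
mu0, tau0, sigma0 = 0.0, 1.5, 2.0
x = rng.normal(1.0, sigma0, size=30)

def neg_log_posterior_kernel(theta):
    # -(log prior + log likelihood); the normalizing constant is omitted
    # because it does not depend on theta.
    return -(norm.logpdf(theta, mu0, tau0) + norm.logpdf(x, theta, sigma0).sum())

res = minimize_scalar(neg_log_posterior_kernel)
print("numerical posterior mode:", res.x)
```

In this conjugate setup the mode can be checked against the closed-form posterior mean (the normal posterior is symmetric), but the numerical approach works whenever the kernel can be evaluated.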
Example: Bernoulli Model
• Suppose we observe a sample $x_1, \dots, x_n$ from the Bernoulli(θ) distribution with θ unknown, and we place a Beta(α, β) prior on θ.
• We already determined that the posterior distribution of θ is the Beta$(n\bar{x} + \alpha,\, n(1-\bar{x}) + \beta)$ distribution.
• The posterior density is then proportional to $\theta^{\,n\bar{x} + \alpha - 1}(1-\theta)^{\,n(1-\bar{x}) + \beta - 1}$.
• So we need to maximize $\theta^{\,n\bar{x} + \alpha - 1}(1-\theta)^{\,n(1-\bar{x}) + \beta - 1}$ over (0, 1), or equivalently its logarithm.
• Taking the first derivative of the log of this function, setting it equal to 0, and solving gives the solution $\hat{\theta} = \dfrac{n\bar{x} + \alpha - 1}{n + \alpha + \beta - 2}$.
• Next we need to check the second derivative, which is $-\dfrac{n\bar{x} + \alpha - 1}{\theta^2} - \dfrac{n(1-\bar{x}) + \beta - 1}{(1-\theta)^2}$.
• Now, if α ≥ 1 and β ≥ 1, we see that the second derivative is always negative, and so $\hat{\theta}$ is the unique posterior mode. This restriction on the choice of α and β implies that the prior has a mode in (0, 1) rather than at 0 or 1.
• Note that when α = β = 1, namely when we put a uniform prior on θ, the posterior mode is $\hat{\theta} = \bar{x}$, the same as the maximum likelihood estimate (MLE); a numerical check is sketched below.
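A quick Python check that the closed-form mode agrees with direct numerical maximization of the log posterior kernel; the values of α, β and the data are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize_scalar

alpha, beta = 2.0, 3.0                        # hypothetical Beta prior parameters
x = np.array([1, 0, 1, 1, 0, 1, 0, 1, 1, 1])  # hypothetical Bernoulli sample
n, s = len(x), int(x.sum())                   # here n*xbar = s = 7

closed_form = (s + alpha - 1) / (n + alpha + beta - 2)

def neg_log_post(theta):
    # Negative log of the Beta(s + alpha, n - s + beta) posterior kernel.
    return -((s + alpha - 1) * np.log(theta) + (n - s + beta - 1) * np.log(1 - theta))

res = minimize_scalar(neg_log_post, bounds=(1e-6, 1 - 1e-6), method="bounded")
print(closed_form, res.x)   # both approximately 0.6154
```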