Estimation – Posterior mean

• An alternative estimate to the posterior mode is the posterior mean. It is given by E(θ | s), whenever it exists.
• This estimate is commonly used and has a natural interpretation.
• If the posterior distribution of θ is symmetric about its mode, and the expectation exists, then the posterior mean is the same as the posterior mode; otherwise these estimates will generally differ.
• If we want our estimate to reflect where the central mass of the posterior probability lies, then in cases where the posterior is highly skewed, the mode is a better choice than the mean.
week 4

Example: Bernoulli Model
• Suppose we observe a sample $(x_1, \ldots, x_n)$ from the Bernoulli(θ) distribution with θ ∈ [0, 1] unknown, and we place the Beta(α, β) prior on θ.
• We already determined that the posterior distribution of θ is the Beta($n\bar{x} + \alpha$, $n(1 - \bar{x}) + \beta$) distribution.
• The posterior expectation of θ is given by…
• When we have a uniform prior (i.e. α = β = 1), the posterior expectation is given by $(n\bar{x} + 1)/(n + 2)$.
• Note that when n is large, the mode and the mean will be very close together and in fact very close to the MLE $\bar{x}$.
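As a minimal sketch of the Bernoulli example, the Python below compares the posterior mean, the posterior mode, and the MLE. It relies on the standard facts that a Beta(a, b) distribution has mean a/(a + b) and, when a, b > 1, mode (a − 1)/(a + b − 2). The function name and the data (7 successes in 10 trials, then 700 in 1000) are illustrative assumptions, not taken from the slides.

```python
def bernoulli_posterior_summary(successes, n, alpha=1.0, beta=1.0):
    """Posterior mean, mode, and MLE for Bernoulli data with a Beta(alpha, beta) prior."""
    a = successes + alpha        # n * xbar + alpha
    b = n - successes + beta     # n * (1 - xbar) + beta
    mean = a / (a + b)
    # The mode formula is valid only when both posterior parameters exceed 1.
    mode = (a - 1) / (a + b - 2) if a > 1 and b > 1 else None
    mle = successes / n
    return mean, mode, mle

# Hypothetical data under the uniform prior (alpha = beta = 1).
small = bernoulli_posterior_summary(7, 10)
large = bernoulli_posterior_summary(700, 1000)
print("n = 10:   mean, mode, MLE =", small)
print("n = 1000: mean, mode, MLE =", large)
```

Under the uniform prior the mode coincides with the MLE, and the gap between the mean and the MLE shrinks as n grows.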

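Example: Location Normal Model
• Suppose that $(x_1, \ldots, x_n)$ is a sample from an $N(\mu, \sigma_0^2)$ distribution, where μ is unknown and $\sigma_0^2$ is known, and we take the prior distribution of μ to be the $N(\mu_0, \tau_0^2)$ distribution for some specified choices of μ0 and $\tau_0^2$.
• We have seen that the posterior distribution is
$N\left( \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right),\ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \right)$.
• Since this normal distribution is symmetric about its mode, and the mean exists, the posterior mode and mean agree and equal
$\left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right)$.
• This is a weighted average of the prior mean μ0 and the sample mean $\bar{x}$, and lies between these two values.
• When n is large, this estimator is approximately equal to the sample mean $\bar{x}$, which is also the MLE for this situation.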
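The weighted-average reading of the posterior mean can be sketched as follows. The function name, argument names, and numerical values are assumptions chosen for illustration; the formulas are the standard normal-normal conjugate update with known sampling variance.

```python
def normal_posterior(xbar, n, sigma0_sq, mu0, tau0_sq):
    """Posterior mean, variance, and prior weight for N(mu, sigma0_sq) data
    with an N(mu0, tau0_sq) prior on mu."""
    precision = 1.0 / tau0_sq + n / sigma0_sq     # posterior precision
    post_var = 1.0 / precision
    post_mean = post_var * (mu0 / tau0_sq + n * xbar / sigma0_sq)
    weight_prior = (1.0 / tau0_sq) / precision    # weight placed on mu0
    return post_mean, post_var, weight_prior

# Hypothetical setting: prior N(0, 1), known variance 4, observed sample mean 2.
results = {}
for n in (1, 10, 1000):
    results[n] = normal_posterior(xbar=2.0, n=n, sigma0_sq=4.0, mu0=0.0, tau0_sq=1.0)
    mean, var, w = results[n]
    print(f"n={n}: posterior mean={mean:.4f}, weight on prior mean={w:.4f}")
```

The posterior mean equals `w * mu0 + (1 - w) * xbar`, so it always lies between the prior mean and the sample mean, and moves toward the sample mean as n increases.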

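Important Notes Re Location Normal Example
• When we take the prior to be very diffuse, that is, taking the prior variance $\tau_0^2$ to be very large, then again this estimator is close to the sample mean $\bar{x}$.
• The ratio of the sampling variance of $\bar{x}$ to the posterior variance of μ is given by
$\frac{\sigma_0^2}{n} \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right) = 1 + \frac{\sigma_0^2}{n\tau_0^2}$,
where $\sigma_0^2$ is the known sampling variance, and this is always greater than 1. The closer $\tau_0^2$ is to 0, the larger this ratio is.
• Further, as $\tau_0^2 \to 0$, the Bayesian estimate converges to μ0.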
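A quick numerical check of the variance ratio, as a sketch with assumed values (n = 10, sampling variance 4, and three hypothetical prior variances):

```python
def variance_ratio(n, sigma0_sq, tau0_sq):
    """Sampling variance of the sample mean divided by the posterior variance of mu."""
    sampling_var = sigma0_sq / n
    posterior_var = 1.0 / (1.0 / tau0_sq + n / sigma0_sq)
    return sampling_var / posterior_var

# The ratio should match 1 + sigma0_sq / (n * tau0_sq) and grow as tau0_sq shrinks.
ratios = {t: variance_ratio(n=10, sigma0_sq=4.0, tau0_sq=t) for t in (100.0, 1.0, 0.01)}
for tau0_sq, r in ratios.items():
    print(f"tau0_sq={tau0_sq}: ratio={r:.4f}")
```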

• If we are fairly confident that the population mean μ is close to the prior mean μ0, we will take the prior variance $\tau_0^2$ to be small, so that the bias in the Bayesian estimate will be small and its variance will be much smaller than the sampling variance of $\bar{x}$. In such a situation, the Bayesian estimator improves on the accuracy of the sample mean.
• If we are not confident that μ is close to the prior mean μ0, we will take $\tau_0^2$ to be large, and the Bayesian estimator will basically be the MLE.
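The trade-off described here can be made concrete with a small sketch that computes the frequentist mean squared error (squared bias plus sampling variance) of the posterior mean at a given true μ and compares it with that of the sample mean. Reading "bias" and "variance" in this sampling sense is an assumption about the slide's intent, and all names and numbers are hypothetical.

```python
def bayes_vs_sample_mean_mse(mu, n, sigma0_sq, mu0, tau0_sq):
    """MSE of the posterior mean and of the sample mean when the true mean is mu."""
    w = (1.0 / tau0_sq) / (1.0 / tau0_sq + n / sigma0_sq)  # weight on mu0
    bias = w * (mu0 - mu)                       # E[w*mu0 + (1-w)*xbar] - mu
    variance = (1.0 - w) ** 2 * sigma0_sq / n   # Var[(1-w)*xbar]
    mse_bayes = bias ** 2 + variance
    mse_xbar = sigma0_sq / n                    # the sample mean is unbiased
    return mse_bayes, mse_xbar

# Tight prior centred near the truth, tight prior far from the truth, diffuse prior.
good = bayes_vs_sample_mean_mse(mu=0.1, n=10, sigma0_sq=4.0, mu0=0.0, tau0_sq=0.25)
bad = bayes_vs_sample_mean_mse(mu=3.0, n=10, sigma0_sq=4.0, mu0=0.0, tau0_sq=0.25)
diffuse = bayes_vs_sample_mean_mse(mu=3.0, n=10, sigma0_sq=4.0, mu0=0.0, tau0_sq=100.0)
print("tight prior, mu near mu0:", good)
print("tight prior, mu far from mu0:", bad)
print("diffuse prior:", diffuse)
```

A tight prior helps only when μ really is near μ0; when it is not, the squared bias dominates, which is why a large $\tau_0^2$ is the safer choice without strong prior knowledge.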

Accuracy of Bayesian Estimates
• The accuracy of Bayesian estimates is naturally based on the posterior distribution and how concentrated it is about the quantity of interest.
• If we choose the posterior mean as the Bayesian estimate for θ, we would compute the posterior variance as a measure of spread for the posterior distribution of θ about its mean.
• We will discuss later how to assess the accuracy of the posterior mode as a Bayesian estimate.

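Examples
• In the Bernoulli model with a Beta(α, β) prior, the posterior variance is given by
$\frac{(n\bar{x} + \alpha)(n(1 - \bar{x}) + \beta)}{(n + \alpha + \beta)^2 (n + \alpha + \beta + 1)}$.
Note that the posterior variance converges to 0 as $n \to \infty$. Intuitively, this means that as we collect more data the Bayesian estimate becomes more accurate.
• In the location normal model, with known sampling variance $\sigma_0^2$ and prior variance $\tau_0^2$, the posterior variance is given by
$\left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1}$.
Note that the posterior variance converges to 0 when we have a very precise prior, i.e., when $\tau_0^2 \to 0$, and converges to $\sigma_0^2 / n$, the variance of the sample mean, when we have a very diffuse prior, i.e., when $\tau_0^2 \to \infty$.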
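Both limiting statements can be checked numerically. The sketch below uses the standard Beta variance ab/((a + b)²(a + b + 1)) and the normal posterior variance; the function names and the chosen values are assumptions for illustration only.

```python
def beta_posterior_variance(successes, n, alpha=1.0, beta=1.0):
    """Posterior variance of theta for Bernoulli data with a Beta(alpha, beta) prior."""
    a = successes + alpha
    b = n - successes + beta
    return a * b / ((a + b) ** 2 * (a + b + 1))

def normal_posterior_variance(n, sigma0_sq, tau0_sq):
    """Posterior variance of mu in the location normal model."""
    return 1.0 / (1.0 / tau0_sq + n / sigma0_sq)

# Bernoulli: hold the sample proportion at 0.7 and let n grow.
bern = [beta_posterior_variance(int(0.7 * n), n) for n in (10, 100, 10000)]
print("Bernoulli posterior variances:", bern)

# Location normal: n = 10, sigma0_sq = 4, so sigma0_sq / n = 0.4.
precise = normal_posterior_variance(10, 4.0, 1e-8)   # very precise prior
diffuse = normal_posterior_variance(10, 4.0, 1e8)    # very diffuse prior
print("precise prior:", precise, "diffuse prior:", diffuse)
```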

Credible Intervals
• A credible interval for a real-valued parameter θ is an interval C(s) = [l(s), u(s)] that we believe will contain the true value of θ.
• As with the frequentist approach, we specify a probability α, and then find an interval C(s) satisfying
$\Pi(\theta \in C(s) \mid s) = \Pi(l(s) \le \theta \le u(s) \mid s) \ge \alpha$. (**)
That is, the posterior probability of the set of all θ values satisfying l(s) ≤ θ ≤ u(s) is greater than or equal to α.
• We try to find an α-credible interval C(s) so that the above posterior probability is as close as possible to α, and such that C(s) is as short as possible.

Highest Posterior Density Intervals
• HPD intervals are of the form
$C(s) = \{\theta : \pi(\theta \mid s) \ge c\}$,
where π(θ | s) is the posterior density of θ and where c is chosen as large as possible so that (**) is satisfied.
• For example…
• The length of an α-credible interval for θ will serve the same purpose as the margin of error does with confidence intervals.
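One way to approximate an HPD interval numerically, sketched here for a unimodal density on a bounded range, is to evaluate the density on a fine grid and add grid cells in order of decreasing density until the accumulated posterior mass reaches the target level. This is an illustrative grid method, not a procedure from the slides; the function names, the grid size, and the Beta(8, 4) posterior (for example, 7 successes in 10 Bernoulli trials under a uniform prior) are assumptions.

```python
import math

def beta_pdf(theta, a, b):
    """Density of the Beta(a, b) distribution at theta in (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(theta) + (b - 1) * math.log(1 - theta))

def hpd_interval_grid(pdf, level, lo=0.0, hi=1.0, points=20001):
    """Approximate HPD interval for a unimodal density on (lo, hi).

    Returns (lower, upper, cutoff), where cutoff plays the role of c."""
    step = (hi - lo) / points
    grid = [lo + (i + 0.5) * step for i in range(points)]   # cell midpoints
    dens = [pdf(t) for t in grid]
    total = sum(dens) * step                                # numerical normalisation
    order = sorted(range(points), key=lambda i: dens[i], reverse=True)
    mass, kept = 0.0, []
    for i in order:              # lower the cutoff one grid cell at a time
        kept.append(i)
        mass += dens[i] * step / total
        if mass >= level:
            break
    # For a unimodal density the kept cells form one contiguous block.
    return grid[min(kept)] - step / 2, grid[max(kept)] + step / 2, dens[kept[-1]]

lower, upper, cutoff = hpd_interval_grid(lambda t: beta_pdf(t, 8, 4), level=0.95)
print(f"approximate 0.95 HPD interval: [{lower:.4f}, {upper:.4f}], cutoff c = {cutoff:.4f}")
```

Because the density is (approximately) equal at both endpoints, no interval with the same posterior probability can be shorter, which is the property the slide asks for.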

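Example: Location Normal Model
• Suppose that $(x_1, \ldots, x_n)$ is a sample from an $N(\mu, \sigma_0^2)$ distribution, where μ is unknown and $\sigma_0^2$ is known, and we take the prior distribution of μ to be the $N(\mu_0, \tau_0^2)$ distribution for some specified choices of μ0 and $\tau_0^2$.
• We have already seen that the posterior distribution is
$N\left( \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right),\ \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \right)$.
• Since this distribution is symmetric about its mode (also the mean) $\left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right)$, a shortest α-credible interval for μ is of the form
$\left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1} \left(\frac{\mu_0}{\tau_0^2} + \frac{n\bar{x}}{\sigma_0^2}\right) \pm c \left(\frac{1}{\tau_0^2} + \frac{n}{\sigma_0^2}\right)^{-1/2}$,
where c is such that…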
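A minimal sketch of this interval in Python follows. Since the slide elides the condition on c, the code assumes the usual choice for a symmetric normal posterior, namely that c satisfies Φ(c) − Φ(−c) = α, i.e. c is the (1 + α)/2 quantile of the standard normal. The function name, the argument `level` (standing in for α), and the numerical values are hypothetical.

```python
from statistics import NormalDist

def normal_credible_interval(xbar, n, sigma0_sq, mu0, tau0_sq, level=0.95):
    """Symmetric credible interval for mu, centred at the posterior mean."""
    post_var = 1.0 / (1.0 / tau0_sq + n / sigma0_sq)
    post_mean = post_var * (mu0 / tau0_sq + n * xbar / sigma0_sq)
    # Assumed condition on c: Phi(c) - Phi(-c) = level.
    c = NormalDist().inv_cdf((1.0 + level) / 2.0)
    half_width = c * post_var ** 0.5
    return post_mean - half_width, post_mean + half_width

# Hypothetical setting: prior N(0, 1), known variance 4, n = 10, sample mean 2.
lo, hi = normal_credible_interval(xbar=2.0, n=10, sigma0_sq=4.0, mu0=0.0, tau0_sq=1.0)
print(f"0.95-credible interval for mu: [{lo:.4f}, {hi:.4f}]")
```

The half-width `hi - lo` over two shrinks as n grows or as the prior becomes more precise, mirroring the behaviour of the posterior variance.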
