This presentation covers two approaches to statistical inference about regression parameters: using the posterior distribution after the data are seen, and using the sampling distribution before the data are seen. It discusses the logic of each approach and the normal posterior distribution, with worked examples for estimating parameters and computing probabilities. The Bayesian (one-sample) and sampling-distribution (many-samples) approaches are then compared.
Set 17: Distributions for statistical inference about the regression parameters
Two different approaches • Use the posterior distribution: after the data are seen; one sample • Use the sampling distribution: before the data are seen; many, many samples • In both approaches the parameter βj is unknown
Logic of the two approaches • The posterior (Bayesian updating) approach • Probability is applicable to unknown parameters (e.g., βj) • Probability is applicable to results actually computed from a particular sample • The sampling distribution (long-run frequency) approach • Probability is applicable to the results of many, many samples repeated under identical conditions, in the long run • Probability is applicable to the outcomes of a sample (e.g., the least-squares coefficients bj) before the sample is seen • Probability is not applicable to unknown parameters (e.g., βj) • Probability is not applicable to results actually computed from a particular sample
Normal Posterior Distribution • The posterior distribution g(βj) is normal • Approximately, when the sample size is large • Exactly, when the prior distribution is uniform and the data distribution is normal (Figure: normal density g(βj) plotted over βj)
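Stated compactly, the claim on this slide is that the posterior density of βj is (at least approximately) normal, centered at the least-squares estimate bj with standard deviation σj. The display below is my own rendering of that statement, not a formula copied from the slide:

$$\beta_j \mid \text{data} \;\sim\; N\!\left(b_j,\ \sigma_j^{2}\right)$$

This holds exactly under a uniform prior with normally distributed data, and approximately in large samples.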
Example • From the regression output: b1 = 3.57 and s1 = .35 • The posterior distribution of β1, given the data and σ², is centered at 3.57 • But σ is unknown, so we cannot compute the z-ratio • Hence we cannot compute the probability
Variance unknown • In practice σ usually is not known • Estimate σj using sj, the standard error of bj (from the regression output) • Compute the t-ratio instead of the z-ratio • Taking the uncertainty about the unknown σ into account (using a prior distribution) leads to the t distribution
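The t-ratio referred to above can be written as follows (my own rendering in the slides' notation; sj is the standard error of bj from the regression output, and the residual degrees of freedom equal 23 in the example that follows):

$$t = \frac{\beta_j - b_j}{s_j} \;\sim\; t_{df}$$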
Example • Estimate σ1 by the standard error from the regression output, s1 = .35 • This estimates the posterior standard deviation • Compute the t-ratio (df = 23) • Select an upper-tail probability α • Solve for β1 to find the threshold b* • Then P(β1 > b*) = α
Upper 5% (lower 95%) threshold • From the t distribution with df = 23, the upper 5% point is t = 1.714 • For α = .05, b* = 3.57 + 1.714 × .35 = 4.17, so P(β1 > 4.17) = .05 • One-sided lower 95% interval estimate for β1: P(β1 < 4.17) = .95
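The one-sided threshold above is easy to check numerically. A minimal sketch (not from the slides), assuming b1 = 3.57, s1 = 0.35, and df = 23 as in the example:

```python
# Reproduce the upper 5% posterior threshold for beta_1.
from scipy import stats

b1, s1, df, alpha = 3.57, 0.35, 23, 0.05

t_crit = stats.t.ppf(1 - alpha, df)   # upper 5% point of t(23), about 1.714
b_star = b1 + t_crit * s1             # threshold, about 4.17

print(f"t = {t_crit:.3f}, b* = {b_star:.2f}")
# Under the t posterior: P(beta_1 > b*) = 0.05 and P(beta_1 < b*) = 0.95
```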
Thresholds for middle 95% • From the t distribution with df = 23, the middle 95% lies between t = −2.069 and t = 2.069 • Setting t = 2.069 gives upper and lower thresholds 3.57 ± 2.069 × .35 • Two-sided interval for β1: P(2.85 < β1 < 4.30) = .95
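The same check works for the two-sided interval. Again a minimal sketch (not from the slides), assuming b1 = 3.57, s1 = 0.35, and df = 23:

```python
# Reproduce the middle-95% posterior interval for beta_1.
from scipy import stats

b1, s1, df = 3.57, 0.35, 23

t_crit = stats.t.ppf(0.975, df)                      # about 2.069
lower, upper = b1 - t_crit * s1, b1 + t_crit * s1

print(f"({lower:.2f}, {upper:.2f})")                 # about (2.85, 4.30)
# Under the t posterior: P(2.85 < beta_1 < 4.30) = 0.95
```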
Summary • Bayesian approach • One-sample analysis • Probability of the regression parameters (βj) after the data are seen • Sampling distribution approach • Many-samples analysis • Probability of the least-squares estimates (bj) before the data are seen • Analysis of normal samples • Analysis of large samples