Confidence Intervals


  1. Confidence Intervals
     • Underlying model: unknown parameter
     • We know how to calculate point estimates
       • E.g. regression analysis
     • But different data would change our estimates.
     • So, we treat our estimates as random variables
     • Want a measure of how confident we are in our estimate.
     • Calculate a "Confidence Interval"

  2. What is it?
     • If we know how the data were sampled
     • We can construct a Confidence Interval for an unknown parameter, θ.
     • A 95% C.I. gives a range such that the true θ is in the interval 95% of the time.
     • A 100(1−α)% C.I. captures the true θ a fraction (1−α) of the time.
     • The smaller α is, the more sure we are that the true θ falls in the interval, but the wider the interval.
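
The "95% of the time" claim can be checked by simulation. The sketch below is illustrative only: the true mean, standard deviation, and sample size are made-up values, and it swaps in a normal (z) interval with the standard deviation treated as known, so that only the Python standard library is needed. It is not the t-based interval used later in the slides.

```python
import random
from statistics import NormalDist, mean

def coverage(true_mean=50.0, sigma=5.0, n=8, alpha=0.05, trials=2000, seed=0):
    """Fraction of simulated intervals that contain true_mean.

    All default values are hypothetical; sigma is treated as known.
    """
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    half_width = z * sigma / n ** 0.5         # half-width of each interval
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        xbar = mean(sample)
        if xbar - half_width <= true_mean <= xbar + half_width:
            hits += 1
    return hits / trials

cov95 = coverage(alpha=0.05)   # wide interval, should capture about 95%
cov50 = coverage(alpha=0.50)   # narrower interval, should capture about 50%
print(cov95, cov50)
```

The two calls illustrate the trade-off on the slide: the 95% interval is wider and captures the true mean far more often than the 50% interval.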

  3. Example 1: Lead in Water
     • Lead in drinking water causes serious health problems.
     • To test contamination, we require a control site.
     • Problems:
       • Lead concentration in control site?
       • Estimate 95% confidence interval

  4. Example 2: Gas Market
     • Recall U.S. gas market question:
       • By how much does gas consumption decrease when price increases?
     • Our linear model:
       • Gas consumption = β0 + β1·PG + β2·Y + β3·PNC + β4·PUC + ε (regressors as listed in the S-Plus output)
     • Estimate of β1: -.04237.
     • How confident are we in this estimate?
     • Construct 90% C.I. for this estimate

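  5. If Data ~ N(μ, σ²)
     • Since we don't know σ, use the t-distribution.
     • 95% C.I. for μ: [x̄ − (s)(t97.5), x̄ + (s)(t97.5)], where x̄ is the sample mean
     • s is the standard error of the mean.
     • t97.5 is the critical value of the t-distribution (its 97.5th percentile)
     • Draw on board (Prob = 2.5% in each tail)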

  6. t-distribution
     • Similar to the Normal distribution
     • Requires "degrees of freedom" (d.f.).
     • d.f. = (# data points) − (# variables).
     • E.g. mean of lead concentration, 8 samples, one variable: d.f. = 7.
     • The higher the d.f., the closer t is to the Normal distribution.

  7. If Distribution Unknown
     • Can use "Bootstrapping":
       • Draw a large sample from the data, with replacement
       • Calculate its mean
       • Repeat many times
       • Draw a histogram of the sample means
       • Calculate the empirical 95% C.I.
     • Requires no previous knowledge of the underlying process
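
The bootstrap steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used to produce the numbers in the slides: the data values are hypothetical (the original measurements are not given), each resample has the same size as the data (a common choice), and the interval is read off as the 2.5th and 97.5th percentiles of the resampled means. The name `bootstrap_ci` is made up for this example.

```python
import random
from statistics import mean

def bootstrap_ci(data, level=0.95, n_resamples=5000, seed=0):
    """Percentile bootstrap interval for the mean (illustrative sketch)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement, compute the mean, repeat many times.
    means = sorted(mean(rng.choices(data, k=n)) for _ in range(n_resamples))
    tail = (1 - level) / 2
    lo = means[int(tail * n_resamples)]            # about the 2.5th percentile
    hi = means[int((1 - tail) * n_resamples) - 1]  # about the 97.5th percentile
    return lo, hi

# Hypothetical measurements, for illustration only.
hypothetical_data = [44.1, 47.3, 49.8, 50.2, 52.6, 53.9, 55.0, 58.4]
lo, hi = bootstrap_ci(hypothetical_data)
print(lo, hi)
```

Sorting the resampled means plays the role of the histogram: the interval is simply the central 95% of that empirical distribution.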

  8. Lead Concentration
     • 8 lead measurements:
       • Mean = 51.39, s = 5.75, t97.5 = 2.365
       • Lower = 51.39 − (5.75)(2.365)
       • Upper = 51.39 + (5.75)(2.365)
       • C.I. = [37.8, 65.0]
     • Using bootstrapped samples:
       • C.I. = [40.8, 62.08]
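
The arithmetic on this slide can be reproduced in a few lines. The function name `t_interval` is made up for this sketch; the inputs are the mean, standard error, and critical value quoted on the slide.

```python
def t_interval(estimate, std_error, t_crit):
    """Return (lower, upper) = estimate -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width

# Values taken from the slide: mean, standard error, t critical value (d.f. = 7).
lower, upper = t_interval(51.39, 5.75, 2.365)
print(round(lower, 1), round(upper, 1))  # 37.8 65.0
```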

  9. Gas Regression: S-Plus
     Coefficients:
                       Value  Std. Error     t value   Pr(>|t|)
     (Intercept)  -0.0898134   0.0507787  -1.7687217  0.0867802
              PG  -0.0423712   0.0098406  -4.3057672  0.0001551
               Y   0.0001587   0.0000068  23.4188561  0.0000000
             PNC  -0.1013809   0.0617077  -1.6429209  0.1105058
             PUC  -0.0432496   0.0241442  -1.7913093  0.0830122
     Residual standard error: 0.02680668 on 31 degrees of freedom
     Multiple R-Squared: 0.9678838
     F-statistic: 233.5615 on 4 and 31 degrees of freedom, the p-value is 0

  10. Gas Price Response
     • β1 = -.04237, s = .00984
     • 90% C.I.: t95 = 1.695 (d.f. = 36 − 5 = 31, as in the S-Plus output)
     • C.I. = [-.0591, -.0256]
     • Using bootstrapped samples:
       • C.I. = [-.063, -.026]
     • Response is probably between 2.5 gallons and 6 gallons.
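
The same interval can be built for every coefficient in the S-Plus output, not just the price term. The sketch below copies the values and standard errors from that output and uses the critical value 1.695 quoted on this slide, which corresponds to the 31 degrees of freedom reported by S-Plus. The names `ci90` and `intervals` are made up for illustration.

```python
T95 = 1.695  # 95th percentile of t, as quoted on the slide (31 d.f.)

# (Value, Std. Error) pairs copied from the S-Plus coefficient table.
coefficients = {
    "(Intercept)": (-0.0898134, 0.0507787),
    "PG": (-0.0423712, 0.0098406),
    "Y": (0.0001587, 0.0000068),
    "PNC": (-0.1013809, 0.0617077),
    "PUC": (-0.0432496, 0.0241442),
}

def ci90(value, std_error, t_crit=T95):
    """90% interval: value -/+ t_crit * std_error."""
    half_width = t_crit * std_error
    return value - half_width, value + half_width

intervals = {name: ci90(v, se) for name, (v, se) in coefficients.items()}

for name, (lo, hi) in intervals.items():
    excludes_zero = lo > 0 or hi < 0
    print(f"{name:>11}: [{lo:.4f}, {hi:.4f}]  excludes zero: {excludes_zero}")
```

An interval that excludes zero matches a two-sided p-value below 0.10 in the table; under this calculation only the PNC interval contains zero, in line with its reported p-value of 0.1105.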

  11. Interpretation & Other Facts
     • We are 95% confident that the true average lead concentration lies in this range: intervals built this way capture the true value 95% of the time.
     • Likewise, we are 90% confident that the true value of β1 lies in its range.
     • A "confidence region" can also be calculated for 2 or more variables.