Understanding Confidence Intervals in Statistics

Economics 105: Statistics • Any questions? • Review #1 • GH 9 and GH 10 due on Wednesday

Confidence Intervals • Population Mean (σ Known or σ Unknown) • Population Proportion

Confidence Interval for μ (σ Unknown) • If the population standard deviation σ is unknown, we can substitute the sample standard deviation, s • This introduces extra uncertainty, since s varies from sample to sample • Use the t distribution instead of the normal distribution

Student’s t distribution • William Sealy Gosset was an English statistician who worked for the Guinness Brewery in Dublin in the early 1900s. He was interested in the effects of various ingredients and temperature on beer, but only had a few batches of each “formula” to analyze. Thus, he needed a way to correctly treat SMALL SAMPLES in statistical analysis. He was not supposed to be publishing, so he used the pseudonym “Student”.

Student’s t distribution • The t is a family of distributions • The shape depends on the degrees of freedom (d.f.) • d.f. = n - 1: the number of observations that are free to vary after the sample mean has been calculated

Student’s t distribution • Note: t → Z as n increases • t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal • [Figure: density curves centered at 0 for the standard normal (t with df = ∞), t (df = 13), and t (df = 5)]

Confidence Interval for μ (σ Unknown) (continued) • Assumptions • Population standard deviation is unknown • Population is normally distributed • Use Student’s t distribution • Confidence Interval Estimate: $\bar{X} \pm t_{n-1,\,\alpha/2}\,\frac{s}{\sqrt{n}}$ (where t is the critical value of the t distribution with n - 1 degrees of freedom and an area of α/2 in each tail)

Student’s t Table • Let: n = 3, so df = n - 1 = 2 • α = 0.10, so α/2 = 0.05

        Upper Tail Area
df      .25      .10      .05
1       1.000    3.078    6.314
2       0.817    1.886    2.920
3       0.765    1.638    2.353

• The body of the table contains t values, not probabilities • With df = 2 and an upper tail area of α/2 = 0.05, the critical value is t = 2.920

t distribution values • With comparison to the Z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z
0.80                1.372          1.325          1.310          1.28
0.90                1.812          1.725          1.697          1.645
0.95                2.228          2.086          2.042          1.96
0.99                3.169          2.845          2.750          2.58

Note: t → Z as n increases
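The Z column of the comparison can be reproduced with Python's standard library. This is only a sketch: the helper name `z_critical` is made up for illustration, and the t columns are not recomputed here because the standard library has no t distribution.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard normal for a confidence level."""
    alpha = 1.0 - confidence
    # Put alpha/2 in each tail, so look up the 1 - alpha/2 quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.2f}  Z = {z_critical(level):.3f}")
```

The printed values agree with the Z column once rounded to the precision used in the table.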

Confidence Intervals for μ • A manufacturer produces bags of flour whose weights are normally distributed. A random sample of 25 bags was taken and their mean weight was 19.8 ounces with a sample standard deviation of 1.2 ounces. Find and interpret a 99% confidence interval for the true average weight for all bags of flour produced by the company.
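One way to work the flour example is sketched below. The critical value 2.797 is an assumption taken from a standard t table (24 degrees of freedom, upper tail area 0.005); it is not given on the slide. The function name `t_interval` is hypothetical.

```python
import math

# Values from the flour example.
n = 25        # sample size
x_bar = 19.8  # sample mean, ounces
s = 1.2       # sample standard deviation, ounces

# Assumed from a standard t table: df = n - 1 = 24, alpha/2 = 0.005.
t_crit = 2.797


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar +/- t * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin


lower, upper = t_interval(x_bar, s, n, t_crit)
print(f"99% CI for the mean weight: ({lower:.3f}, {upper:.3f}) ounces")
```

Under these assumptions the interval runs from about 19.13 to 20.47 ounces, so we would be 99% confident that the true average bag weight lies in that range.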

Confidence Intervals • Population Mean (σ Known or σ Unknown) • Population Proportion

Confidence Intervals for π • A random sample of 100 people shows that 25 are left-handed. • Form a 95% confidence interval for the true proportion of left-handers

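Confidence Intervals for π (continued) • A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers. • Confidence Interval Estimate: $p \pm Z\sqrt{\frac{p(1-p)}{n}}$ • With p = 25/100 = 0.25: $0.25 \pm 1.96\sqrt{\frac{0.25(0.75)}{100}} = 0.25 \pm 1.96(0.0433)$, which gives 0.1651 to 0.3349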

Interpretation • We are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%. • Although the interval from 0.1651 to 0.3349 may or may not contain the true proportion, 95% of intervals formed in repeated samples of size 100 in this manner are expected to contain the true proportion.
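The "repeated samples" reading of the interval can be illustrated with a small simulation. This is a sketch under stated assumptions: the true proportion is set to 0.25 purely for illustration (in practice it is unknown), and the function names, the number of trials, and the seed are arbitrary choices.

```python
import math
import random


def proportion_interval(successes, n, z=1.96):
    """Return (lower, upper) for p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


def coverage(true_pi, n, trials, seed=0):
    """Share of simulated intervals that contain the true proportion."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the "successes".
        successes = sum(rng.random() < true_pi for _ in range(n))
        lower, upper = proportion_interval(successes, n)
        if lower <= true_pi <= upper:
            hits += 1
    return hits / trials


# Assumed true proportion of 0.25, samples of size 100.
rate = coverage(true_pi=0.25, n=100, trials=2000)
print(f"Share of intervals containing the true proportion: {rate:.3f}")
```

The share should come out close to, though not exactly, 95%. Any single interval either contains the true proportion or it does not.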

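Confidence Intervals for π (continued) • A random sample of 1000 people shows that 250 are left-handed. Form a 95% confidence interval for the true proportion of left-handers. • With p = 250/1000 = 0.25: $0.25 \pm 1.96\sqrt{\frac{0.25(0.75)}{1000}} = 0.25 \pm 1.96(0.0137)$, which gives 0.2232 to 0.2768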
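The two left-hander samples can be compared side by side with a short sketch. The function name `proportion_interval` is hypothetical, and Z = 1.96 is the usual 95% critical value.

```python
import math


def proportion_interval(successes, n, z=1.96):
    """Return (lower, upper) for p +/- z * sqrt(p(1-p)/n)."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


# Same sample proportion (0.25), two different sample sizes.
for successes, n in ((25, 100), (250, 1000)):
    lower, upper = proportion_interval(successes, n)
    print(f"n = {n:4d}: ({lower:.4f}, {upper:.4f}), width = {upper - lower:.4f}")
```

Both intervals are centered at 0.25, but the larger sample gives a narrower one: multiplying n by 10 divides the margin of error by √10.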

Determining Sample Size • For the Mean • For the Proportion

Sampling Error • The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 - α) • The margin of error is also called sampling error • The amount of imprecision in the estimate of the population parameter • The amount added and subtracted to the point estimate to form the confidence interval

Determining Sample Size • For the Mean • Sampling error (margin of error): $e = Z\,\frac{\sigma}{\sqrt{n}}$

Determining Sample Size (continued) • For the Mean • Starting from $e = Z\,\frac{\sigma}{\sqrt{n}}$, now solve for n to get $n = \frac{Z^2 \sigma^2}{e^2}$

Determining Sample Size (continued) • To determine the required sample size for the mean, you must know: • The desired level of confidence (1 - α), which determines the critical Z value • The acceptable sampling error, e • The standard deviation, σ

Required Sample Size Example • If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence? • $n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$ • So the required sample size is n = 220 (Always round up)
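The same calculation can be sketched in Python. The function name `sample_size_for_mean` is made up for illustration; the critical Z value is looked up from the standard library rather than read from a table, so it carries a few more decimals than 1.645.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, e, confidence):
    """Smallest n with Z * sigma / sqrt(n) <= e, i.e. n = Z^2 sigma^2 / e^2 rounded up."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / e) ** 2)


n = sample_size_for_mean(sigma=45, e=5, confidence=0.90)
print(n)
```

`math.ceil` implements the "always round up" rule: rounding down would leave the margin of error slightly larger than the target.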

If σ is unknown • If unknown, σ can be estimated when using the required sample size formula • Use a value for σ that is expected to be at least as large as the true σ • Select a pilot sample and estimate σ with the sample standard deviation, s

Determining Sample Size (continued) • For the Proportion • Sampling error (margin of error): $e = Z\sqrt{\frac{\pi(1-\pi)}{n}}$ • Now solve for n to get $n = \frac{Z^2\,\pi(1-\pi)}{e^2}$

Determining Sample Size (continued) • To determine the required sample size for the proportion, you must know: • The desired level of confidence (1 - α), which determines the critical Z value • The acceptable sampling error, e • The true proportion of “successes”, π • π can be estimated with a pilot sample, if necessary (or conservatively use π = 0.5)

Required Sample Size Example How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence? (Assume a pilot sample yields p = 0.12)

Required Sample Size Example (continued) • Solution: For 95% confidence, use Z = 1.96 • e = 0.03 • p = 0.12, so use this to estimate π • $n = \frac{Z^2\,\pi(1-\pi)}{e^2} = \frac{(1.96)^2 (0.12)(1-0.12)}{(0.03)^2} = 450.75$ • So use n = 451
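A sketch of the proportion calculation follows, including the conservative choice π = 0.5 for comparison. The function name `sample_size_for_proportion` is hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(pi, e, confidence):
    """n = Z^2 * pi * (1 - pi) / e^2, rounded up to the next whole observation."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil(z ** 2 * pi * (1 - pi) / e ** 2)


# Pilot-sample estimate from the example: p = 0.12 stands in for pi.
n_pilot = sample_size_for_proportion(pi=0.12, e=0.03, confidence=0.95)

# Conservative choice: pi = 0.5 maximizes pi * (1 - pi).
n_conservative = sample_size_for_proportion(pi=0.5, e=0.03, confidence=0.95)

print(n_pilot, n_conservative)
```

The conservative sample is more than twice as large, which is the price of not relying on the pilot estimate.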

What is a Hypothesis? • A hypothesis is a claim (assumption) about a population parameter: Example: The mean monthly cell phone bill of this city is μ = $42 Example: The proportion of adults in this city with cell phones is π = 0.68

The Null Hypothesis, H0 • States the claim or assertion to be tested • Example: The average number of TV sets in U.S. homes is equal to three ( H0: μ = 3 ) • Is always about a population parameter, not about a sample statistic

The Null Hypothesis, H0 (continued) • Begin with the assumption that the null hypothesis is true • Similar to the notion of innocent until proven guilty • Refers to the status quo • Always contains the “=”, “≤” or “≥” sign • May or may not be rejected

The Alternative Hypothesis, H1 • Is the opposite of the null hypothesis • e.g., The average number of TV sets in U.S. homes is not equal to 3 ( H1: μ ≠ 3 ) • Challenges the status quo • Never contains the “=”, “≤” or “≥” sign • May or may not be proven • Is generally the hypothesis that the researcher is trying to prove

Hypothesis Testing Process • Claim: the population mean age is 50 (Null Hypothesis H0: μ = 50) • Now select a random sample from the population • Suppose the sample mean age is 20: X̄ = 20 • Is X̄ = 20 likely if μ = 50? • If not likely, REJECT the Null Hypothesis

Reason for Rejecting H0 • [Figure: sampling distribution of X̄, centered at μ = 50 if H0 is true, with the observed X̄ = 20 far out in the tail] • If it is unlikely that we would get a sample mean of this value ... • ... if in fact this were the population mean ... • ... then we reject the null hypothesis that μ = 50.
