Use of Chebyshev’s Theorem to Determine Confidence Intervals

Use of Chebyshev’s Theorem to Determine Confidence Intervals • Use when a sample comes from a relatively small sample n>.05N • Non normally distributed • (1 – 1/k2) = C%

Find an 80% and a 90% confidence interval for a population of 500 with a sample size n = 100, X = 45 and a standard deviation s = 12. Use Chebyshev’s Theorem 100>.05(500) 1 – 1/k2 = .8, k = 2.23 [42.324, 47.676] 1 – 1/k2 = .9 k = 3.16 [41.208, 48.792]

1. The sample is a simple random sample. 2. The conditions for the binomial distribution are satisfied. 3. The normal distribution can be used to approximate the distribution of sample proportions because np  5 and n(1 – p) 5 are both satisfied. Assumptions Confidence Intervals for Population Proportions

Notation for Proportions

π= population proportion p = x n sample proportion Notation for Proportions of xsuccesses in a sample of size n

DefinitionPoint Estimate

The sample proportion pis the best point estimate of the population proportion π. DefinitionPoint Estimate

Standard Error SE =

Confidence Interval for Population Proportion P + Z(P) * SE

Confidence Interval for Population Proportion A poll of 100 students found that 60 prefer Mrs. Peloquin as a teacher compared to Mr. Roesler. Find a 95% and 85% confidence interval that prefer Mrs. Peloquin. p = .6 (need at least 80 see pg 340) 1 - p = .4 P + Z(P) * SE .6 + 1.96 * .048 [.505,.694] .6 + 1.44 * .048 [.531,.669]

Round the confidence interval limits to three significant digits. Round-Off Rule for Confidence Interval Estimates of p

Determining Sample Size

Determining Sample Size p(1-p) z ME =  n 

Determining Sample Size p(1-p) z ME =  n (solve for n by algebra)

z  Determining Sample Size p(1-p) z ME =  n (solve for n by algebra) ()2 p (1-p) n= ME2

z ()2 p (1-p)  n= ME2 z  Sample Size for Estimating Proportion p When an estimate of p is known: When no estimate of p is known: ()2 0.25 n= ME2

Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail.

n = [z/2 ]2p (1-p) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail. ME2

n = [z/2 ]2p(1-p) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail. ME2 = [1.645]2(0.169)(0.831) 0.042

n = [z/2 ]2p (1-p) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? A 1997 study indicates 16.9% of U.S. households used e-mail. ME2 To be 90% confident that our sample percentage is within four percentage points of the true percentage for all households, we should randomly select and survey 238 households. = [1.645]2(0.169)(0.831) 0.042 = 237.51965 = 238 households

Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage.

n = [z/2 ]2(0.25) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage. ME2

n = [z/2 ]2(0.25) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage. ME2 = (1.645)2 (0.25) 0.042 = 422.81641 = 423 households

n = [z/2 ]2(0.25) Example:We want to determine, with a margin of error of four percentage points, the current percentage of U.S. households using e-mail. Assuming that we want 90% confidence in our results, how many households must we survey? There is no prior information suggesting a possible value for the sample percentage. ME2 = (1.645)2 (0.25) With no prior information, we need a larger sample to achieve the same results with 90% confidence and an error of no more than 4%. 0.042 = 422.81641 = 423 households

If 1) n 30 2) The sample is a simple random sample. 3) The sample is from a normally distributed population. Case 1 ( is known): Largely unrealistic; Use z-scores Case 2 (is unknown): Use Student t distribution Small SamplesAssumptions

If the distribution of a population is essentially normal, then the distribution of Student t Distribution x - µ t = s n

If the distribution of a population is essentially normal, then the distribution of Student t Distribution x - µ t = s n • is essentially a Student t Distributionfor all samples of size n. • is used to find critical values denoted by t/ 2

Degrees of Freedom (df ) corresponds to the number of sample values that can vary after certain restrictions have imposed on all data values Definition

Degrees of Freedom (df ) corresponds to the number of sample values that can vary after certain restrictions have imposed on all data values Definition df = n - 1 in this section

Degrees of Freedom (df ) = n - 1 corresponds to the number of sample values that can vary after certain restrictions have imposed on all data values Definition Specific # Any # Any # Any # Any # Any # Any # Any # Any # Any # n = 10 df = 10 - 1 = 9 so that x = 80

Table A-3 t Distribution .005 (one tail) .01 (two tails) .01 (one tail) .02 (two tails) .025 (one tail) .05 (two tails) .05 (one tail) .10 (two tails) .10 (one tail) .20 (two tails) .25 (one tail) .50 (two tails) Degrees of freedom 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 Large (z) 63.657 9.925 5.841 4.604 4.032 3.707 3.500 3.355 3.250 3.169 3.106 3.054 3.012 2.977 2.947 2.921 2.898 2.878 2.861 2.845 2.831 2.819 2.807 2.797 2.787 2.779 2.771 2.763 2.756 2.575 31.821 6.965 4.541 3.747 3.365 3.143 2.998 2.896 2.821 2.764 2.718 2.681 2.650 2.625 2.602 2.584 2.567 2.552 2.540 2.528 2.518 2.508 2.500 2.492 2.485 2.479 2.473 2.467 2.462 2.327 12.706 4.303 3.182 2.776 2.571 2.447 2.365 2.306 2.262 2.228 2.201 2.179 2.160 2.145 2.132 2.120 2.110 2.101 2.093 2.086 2.080 2.074 2.069 2.064 2.060 2.056 2.052 2.048 2.045 1.960 6.314 2.920 2.353 2.132 2.015 1.943 1.895 1.860 1.833 1.812 1.796 1.782 1.771 1.761 1.753 1.746 1.740 1.734 1.729 1.725 1.721 1.717 1.714 1.711 1.708 1.706 1.703 1.701 1.699 1.645 3.078 1.886 1.638 1.533 1.476 1.440 1.415 1.397 1.383 1.372 1.363 1.356 1.350 1.345 1.341 1.337 1.333 1.330 1.328 1.325 1.323 1.321 1.320 1.318 1.316 1.315 1.314 1.313 1.311 1.282 1.000 .816 .765 .741 .727 .718 .711 .706 .703 .700 .697 .696 .694 .692 .691 .690 .689 .688 .688 .687 .686 .686 .685 .685 .684 .684 .684 .683 .683 .675

Given a variable that has a t-distribution with the specified degrees of freedom, what percentage of the time will it be in the indicated region? a. 10 df, between -1 .81 and 1 .81. b. 10 df, between -2.23 and 2.23. c. 24 df, between -2.06 and 2.06. d. 24 df, between -2.80 and 2.80. e. 24 df, outside the interval from -2.80 and 2.80. f. 24 df, to the right of 2.80. g. 10 df, to the left of -1.81. 90% 95% 95 % 99 % 1 % .5 % 5 %

What are the appropriate t critical values for each of the confidence intervals? 95 % confidence, n = 17 b. 90 % confidence, n = 12 c. 99 % confidence, n = 14 d. 90 % confidence, n = 25 e. 90 % confidence, n = 13 f. 95 % confidence, n = 10 2.12 1.80 3.01 1.71 1.78 2.262

Based on an Unknown  and a Small Simple Random Sample from a Normally Distributed Population Margin of Error E for Estimate of  s ME = t/ n 2 where t/ 2 has n - 1 degrees of freedom

Confidence Interval for the Estimate of MEBased on an Unknown  and a Small Simple Random Sample from a Normally Distributed Population x - ME < µ < x + ME s ME = t/2 where n

1. The Student t distribution is different for different sample sizes (see Figure 6-5 for the cases n = 3 and n = 12). 2. The Student t distribution has the same general symmetric bell shape as the normal distribution but it reflects the greater variability (with wider distributions) that is expected with small samples. 3. The Student tdistribution has a mean of t = 0 (just as the standard normal distribution has a mean of z = 0). 4. The standard deviation of the Student t distribution varies with the sample size and is greater than 1 (unlike the standard normal distribution, which has a = 1). 5. As the sample size n gets larger, the Student t distribution gets closer to the normal distribution. For values of n > 30, the differences are so small that we can use the critical z values instead of developing a much larger table of critical t values. (The values in the bottom row of Table A-3 are equal to the corresponding critical z values from the standard normal distribution.) Important Properties of the Student t Distribution

Student t Distributions for n = 3 and n = 12 Student t distribution with n = 12 Standard normal distribution Student t distribution with n = 3 0

Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.)

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.)

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 t/2 = 2.201 Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) ME = t2 s =(2.201)(15,873) = 10,085.29 12 n

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 t/2 = 2.201 n Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) ME = t2 s =(2.201)(15,873) = 10,085.3 12 x - ME < µ < x +ME

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 t/2 = 2.201 x - E < µ < x + E Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) E = t2 s =(2.201)(15,873) = 10,085.3 n 12 26,227 - 10,085.3 < µ < 26,227 + 10,085.3

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 t/2 = 2.201 Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) ME = t2 s =(2.201)(15,873) = 10,085.3 12 n x - ME < µ < x + ME 26,227 - 10,085.3 < µ < 26,227 + 10,085.3 $16,141.7< µ < $36,312.3

x = 26,227 s = 15,873  = 0.05 /2 = 0.025 t/2 = 2.201 x - E < µ < x + E Example: A study of 12 Dodge Vipers involved in collisions resulted in repairs averaging $26,227 and a standard deviation of $15,873. Find the 95% interval estimate of , the mean repair cost for all Dodge Vipers involved in collisions. (The 12 cars’ distribution appears to be bell-shaped.) E = t2 s =(2.201)(15,873) = 10,085.3 n 12 26,227 - 10,085.3 < µ < 26,227 + 10,085.3 $16,141.7< µ < $36,312.3 We are 95% confident that this interval contains the average cost of repairing a Dodge Viper.

One Sided Confidence Intervals A one sided confidence interval is a confidence interval that establishes either a likely minimum or a likely maximum value for µ, but not both. Likely maximum Likely minimum

Use of Chebyshev’s Theorem to Determine Confidence Intervals

Use of Chebyshev’s Theorem to Determine Confidence Intervals

Presentation Transcript

Breach of Confidence – The Basics

Nonparametric Methods II

SELF-CONFIDENCE: THE KEY TO SPORT SUCCESS

Chapter 6

Understanding P-values and Confidence Intervals

6.2 Confidence Intervals for the Mean (Small Samples)

Chapter 12 Bond Selection

Final Exam Review Introduction to Biostatistics, Fall 2005

Estimation and Confidence Intervals

Intro to Confidence Intervals

API 653 Tank Inspection Intervals

Basic Statistics

Chapter 10

Chapter 7 Statistical Inference: Confidence Intervals

SAACE CONFIDENCE INDEX

Chapter 3

Introduction to Data Analysis

Chapter 7 Confidence Intervals and Sample Sizes

Integration

Vector Calculus

第四章电路定理 (Circuit Theorems)

Estimating the Value of a Parameter Using Confidence Intervals