Confidence Intervals

Confidence Intervals W&W, Chapter 8

Confidence Interval Review
For the mean:
If $\sigma$ is known: $\mu = M \pm Z_{\alpha/2}\,(\sigma/\sqrt{N})$
If $\sigma$ is unknown and N is large ($\geq 100$): $\mu = M \pm Z_{\alpha/2}\,(s/\sqrt{N})$
If $\sigma$ is unknown and N is small ($< 100$): $\mu = M \pm t_{\alpha/2}\,(s/\sqrt{N})$
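The large-sample rule above can be sketched in a few lines of Python. This is only an illustration: the function name `z_interval_for_mean` and the numbers (M = 70, standard deviation 10, N = 100) are hypothetical and do not come from the text. The critical value $Z_{\alpha/2}$ is taken from the standard normal distribution in the standard library.

```python
import math
from statistics import NormalDist


def z_interval_for_mean(sample_mean, sd, n, confidence=0.95):
    """Return (low, high) for sample_mean +/- Z_{alpha/2} * sd / sqrt(n)."""
    alpha = 1 - confidence
    # Upper alpha/2 critical value of the standard normal (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - alpha / 2)
    margin = z * sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Hypothetical inputs, chosen only to exercise the formula.
low, high = z_interval_for_mean(sample_mean=70, sd=10, n=100)
print(f"95% CI for the mean: {low:.2f} to {high:.2f}")
```

For a small sample, the same arithmetic applies with $t_{\alpha/2}$ (looked up for N - 1 degrees of freedom) in place of $Z_{\alpha/2}$.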

Confidence Interval for the Proportion
If $\pi$ is known: $\pi = P \pm Z_{\alpha/2}\sqrt{\pi(1-\pi)/n}$
If $\pi$ is unknown and n is large (at least 5 successes and 5 failures turn up): $\pi = P \pm Z_{\alpha/2}\sqrt{P(1-P)/n}$

Difference in Means
Many times we are interested in comparing the means across groups, such as the mean GDP level for developed versus developing countries.
To construct a confidence interval when the population variances are known:
$(\mu_1 - \mu_2) = (M_1 - M_2) \pm Z_{\alpha/2}\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$

Difference of Means, Independent Samples
It is usually the case that we are comparing means from two random samples and $\sigma_1^2$ and $\sigma_2^2$ are not known.
$(\mu_1 - \mu_2) = (M_1 - M_2) \pm t_{\alpha/2}\, s_p \sqrt{1/n_1 + 1/n_2}$
where $s_p$ is the pooled standard deviation, the square root of the pooled variance $s_p^2$ (we assume that both populations have the same variance, $\sigma_1^2 = \sigma_2^2$, so that we can pool the samples together).

Pooled Variance
$s_p^2 = \dfrac{\sum (X_1 - M_1)^2 + \sum (X_2 - M_2)^2}{(n_1 - 1) + (n_2 - 1)}$
$df = (n_1 - 1) + (n_2 - 1)$
Example: From a large class, a sample of 4 grades was drawn, and from a second large class, an independent sample of 3 grades was drawn. Calculate the 95% confidence interval for the difference between the two class means, $\mu_1 - \mu_2$.

Example
Class 1                        Class 2
X1    (X1 - M1)^2              X2    (X2 - M2)^2
64    (64-74)^2 = 100          56    (56-60)^2 = 16
66    (66-74)^2 = 64           71    (71-60)^2 = 121
89    (89-74)^2 = 225          53    (53-60)^2 = 49
77    (77-74)^2 = 9
M1 = 296/4 = 74                M2 = 180/3 = 60
$\sum (X_1 - M_1)^2 = 398$     $\sum (X_2 - M_2)^2 = 186$

Example
$s_p^2 = \dfrac{398 + 186}{(4-1) + (3-1)} = 584/5 \approx 117$
$s_p = \sqrt{117} \approx 10.8$
$df = (n_1 - 1) + (n_2 - 1) = (4 - 1) + (3 - 1) = 5$
$(\mu_1 - \mu_2) = (M_1 - M_2) \pm t_{\alpha/2}\, s_p \sqrt{1/n_1 + 1/n_2} = (74 - 60) \pm 2.57 \times 10.8 \times \sqrt{1/4 + 1/3}$

Interpretation
$(\mu_1 - \mu_2) = 14 \pm 21$, or -7 to 35
With 95% confidence, we can conclude that the average of the first class may be 7 marks below the average of the second class, or it may be 35 marks above, or anywhere in between.
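The pooled-variance example can be checked with a short script. This is a sketch, not a general-purpose routine: the helper names are made up for illustration, and the critical value 2.57 is simply the tabulated $t_{.025}$ for 5 degrees of freedom used in the example, passed in by hand rather than computed.

```python
import math

# Grades from the worked example.
class_1 = [64, 66, 89, 77]
class_2 = [56, 71, 53]


def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs)


def pooled_interval(xs1, xs2, t_crit):
    """CI for mu1 - mu2 assuming equal population variances."""
    n1, n2 = len(xs1), len(xs2)
    df = (n1 - 1) + (n2 - 1)
    # Pooled standard deviation: square root of the pooled variance.
    sp = math.sqrt((sum_sq_dev(xs1) + sum_sq_dev(xs2)) / df)
    diff = sum(xs1) / n1 - sum(xs2) / n2
    margin = t_crit * sp * math.sqrt(1 / n1 + 1 / n2)
    return diff - margin, diff + margin


T_CRIT_DF5 = 2.57  # t_{.025} with 5 df, taken from a t table as in the example
low, high = pooled_interval(class_1, class_2, T_CRIT_DF5)
print(f"95% CI for mu1 - mu2: {low:.1f} to {high:.1f}")
```

Carrying full precision gives roughly -7.2 to 35.2, which rounds to the -7 to 35 reported above.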

Difference of Means, Matched or Paired Samples
In the previous example, the samples were independently drawn (two separate classes). Often we want to compare two variables that are not drawn independently. This is what we will refer to as matched, or paired, samples.
Suppose we want to compare the same students across 2 exams. We want to know how the students' grades changed, that is, the difference between the two exam scores:
$D = X_1 - X_2$
We can treat the differences in scores as a single sample.

Difference of Means, Matched or Paired Samples
A confidence interval for matched samples is calculated as follows:
$\Delta = D_{\text{mean}} \pm t_{\alpha/2}\,(s_D/\sqrt{N})$
where $\Delta$ is the difference between the two scores in the populations, and $D_{\text{mean}}$ is the average difference in our sample.
$s_D^2 = \dfrac{\sum (D - D_{\text{mean}})^2}{N - 1}$
which is a standard calculation for variance.

Example
Student   X1   X2   D = X1 - X2   D - Dmean   (D - Dmean)^2
Amy       64   57        7           -4            16
Bill      66   57        9           -2             4
Becky     89   73       16            5            25
Mark      77   65       12            1             1
$D_{\text{mean}} = (7 + 9 + 16 + 12)/4 = 44/4 = 11$
$s_D^2 = (16 + 4 + 25 + 1)/(4 - 1) = 46/3 \approx 15.3$
$s_D = \sqrt{15.3} \approx 3.9$
$df = N - 1 = 4 - 1 = 3$

Example
$\Delta = D_{\text{mean}} \pm t_{\alpha/2}\,(s_D/\sqrt{N})$
$\Delta = 11 \pm 3.18 \times (3.9/\sqrt{4}) = 11 \pm 6$, or 5 to 17
Conclusion: With 95% confidence, the average difference between the two exam scores in the population lies between 5 and 17 points.
Notice that the confidence interval shrinks when we are using paired or matched samples. This is because we are holding constant many extraneous variables (like year in college, IQ, hours spent studying, etc.).
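The paired calculation reduces to a one-sample interval on the differences, which the following sketch reproduces. The function name is illustrative, and the critical value 3.18 is the tabulated $t_{.025}$ for 3 degrees of freedom used in the example, supplied by hand.

```python
import math
from statistics import mean, stdev

# Exam scores for Amy, Bill, Becky, and Mark from the worked example.
exam_1 = [64, 66, 89, 77]
exam_2 = [57, 57, 73, 65]


def paired_interval(xs1, xs2, t_crit):
    """CI for the mean difference, treating D = X1 - X2 as a single sample."""
    diffs = [a - b for a, b in zip(xs1, xs2)]
    d_mean = mean(diffs)
    s_d = stdev(diffs)  # sample standard deviation (divides by N - 1)
    margin = t_crit * s_d / math.sqrt(len(diffs))
    return d_mean - margin, d_mean + margin


T_CRIT_DF3 = 3.18  # t_{.025} with 3 df, taken from a t table as in the example
low, high = paired_interval(exam_1, exam_2, T_CRIT_DF3)
print(f"95% CI for the mean difference: {low:.1f} to {high:.1f}")
```

Without intermediate rounding the interval is about 4.8 to 17.2, in line with the rounded 5 to 17 above.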

Difference in two proportions for large samples
$(\pi_1 - \pi_2) = (P_1 - P_2) \pm Z_{\alpha/2}\sqrt{P_1(1-P_1)/N_1 + P_2(1-P_2)/N_2}$
Example: The Gallup poll periodically takes a random sample of about 1500 Americans. The percentage who favor the legalization of marijuana possession declined from 52% in 1980 to 46% in 1985.

Example
Construct a 95% confidence interval for the population percentage in favor each year.
1980: $\pi = P \pm Z_{\alpha/2}\sqrt{P(1-P)/N} = .52 \pm 1.96\sqrt{(.52)(.48)/1500} = .52 \pm .025$
.495 to .545; 49.5% to 54.5% of people favored legalization
1985: $\pi = P \pm Z_{\alpha/2}\sqrt{P(1-P)/N} = .46 \pm 1.96\sqrt{(.46)(.54)/1500} = .46 \pm .025$
.435 to .485; 43.5% to 48.5% of people favored legalization
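The two single-year intervals can be reproduced with a small helper. This is a sketch under the large-sample assumption stated earlier; the function name `proportion_interval` is illustrative, and the sample size is taken as exactly 1500 even though the text says "about 1500".

```python
import math
from statistics import NormalDist


def proportion_interval(p, n, confidence=0.95):
    """Large-sample CI for a proportion: p +/- Z_{alpha/2} * sqrt(p(1-p)/n)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin


N = 1500  # assumed exact; the poll size is only approximate in the text
ci_1980 = proportion_interval(0.52, N)
ci_1985 = proportion_interval(0.46, N)
print(f"1980: {ci_1980[0]:.3f} to {ci_1980[1]:.3f}")
print(f"1985: {ci_1985[0]:.3f} to {ci_1985[1]:.3f}")
```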

Example
Find a 95% confidence interval for the change in this percentage.
$(\pi_1 - \pi_2) = (P_1 - P_2) \pm Z_{\alpha/2}\sqrt{P_1(1-P_1)/N_1 + P_2(1-P_2)/N_2}$
$= (.52 - .46) \pm 1.96\sqrt{(.52)(.48)/1500 + (.46)(.54)/1500} = .06 \pm .036$
We are 95% confident that the decline in support for legalization between 1980 and 1985 is between 2.4 and 9.6 percentage points.
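The interval for the change can be checked the same way. Again this is only a sketch: the function name is hypothetical, both samples are assumed to have exactly 1500 respondents, and the two polls are treated as independent samples.

```python
import math
from statistics import NormalDist


def diff_proportion_interval(p1, n1, p2, n2, confidence=0.95):
    """Large-sample CI for pi1 - pi2 from two independent samples."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Variances of the two sample proportions add under independence.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


# 1980 (52%) versus 1985 (46%), each assumed to be n = 1500.
low, high = diff_proportion_interval(0.52, 1500, 0.46, 1500)
print(f"95% CI for the decline: {low:.3f} to {high:.3f}")
```

Since the whole interval lies above zero, the data are consistent with a real decline rather than sampling noise alone.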
