Chapter 13: Comparing Two Population Parameters. 13.1 – Comparing Two Means.
Comparative studies are more convincing than single-sample investigations, so one-sample inference is less common than comparative (two-sample) inference. In a comparative study, we may want to compare two treatments, or we may want to compare two populations. In either case, the samples must be chosen randomly and independently in order to perform statistical inference. How is this different from a matched pairs design? In a matched pairs design, the observations come in pairs (two similar subjects, or the same subject measured twice), so the two sets of data are not independent. Here we are comparing two separate, independent samples!
Two-Sample inference: Compare two treatments or two populations. The null hypothesis is that there is no difference between the two parameters.
Review: How do you subtract two means? $\mu_1 - \mu_2$
How do you subtract two standard deviations? $\sqrt{\sigma_1^2 + \sigma_2^2}$
Add their variances and take the square root!
Two Sample Z (σ is known):
SRS
Normality
• Population approx normal, or
• n1 + n2 ≥ 30 by CLT
Independence
• N ≥ 10n
• The two samples are independent

Two Sample T (σ is not known):
SRS
Normality
• Population approx normal, or
• n1 + n2 ≥ 30 by CLT, or
• n1 + n2 < 30 and data doesn’t have strong skewness
Independence
• N ≥ 10n
• The two samples are independent
Note! The two-sample t statistic does not have an exact t-distribution, so the degrees of freedom are calculated differently. Your calculator will do this for you!
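For readers curious about what the calculator does behind the scenes, here is a minimal sketch of one common approximation for the degrees of freedom, the Welch-Satterthwaite formula. That the calculator uses exactly this formula is an assumption, and the function name `welch_df` and the summary statistics below are hypothetical, chosen only for illustration.

```python
def welch_df(s1, n1, s2, n2):
    """Welch-Satterthwaite approximate degrees of freedom for an
    unpooled two-sample t statistic (assumed formula, see lead-in)."""
    v1 = s1 ** 2 / n1  # estimated variance of the first sample mean
    v2 = s2 ** 2 / n2  # estimated variance of the second sample mean
    return (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))


# Hypothetical summary statistics, for illustration only.
df = welch_df(s1=4.0, n1=12, s2=6.0, n2=15)
print(round(df, 2))
```

The result is usually not a whole number. It always falls between the smaller of n1 – 1 and n2 – 1 and the pooled value n1 + n2 – 2.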
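Confidence Interval: estimate ± (critical value)(standard deviation of statistic)
Two Sample Z: $(\bar{x}_1 - \bar{x}_2) \pm z^* \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
Two Sample T: $(\bar{x}_1 - \bar{x}_2) \pm t^* \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$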
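Hypothesis Test: test statistic = (estimate – hypothesized value) / (standard deviation of statistic)
Two Sample Z: $z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
Two Sample T: $t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$
Here $\mu_1 - \mu_2$ is the hypothesized difference, which is 0 when the null hypothesis is "no difference".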
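The two-sample z procedures can be sketched in a few lines of Python using only the standard library. The function names and the summary statistics in the usage lines are hypothetical and are not taken from the examples in this chapter. The sketch assumes both population standard deviations are known.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, hypothesized_diff=0.0):
    """Return (z, se): the test statistic and the standard deviation
    of the difference between the two sample means."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)  # add variances, then take the root
    z = ((xbar1 - xbar2) - hypothesized_diff) / se
    return z, se


def two_sample_z_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, confidence=0.95):
    """Return (lower, upper) for the difference between two means."""
    se = sqrt(sigma1 ** 2 / n1 + sigma2 ** 2 / n2)
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # critical value z*
    diff = xbar1 - xbar2
    return diff - z_star * se, diff + z_star * se


# Hypothetical inputs, for illustration only.
z, se = two_sample_z(52.0, 50.0, 3.0, 4.0, 36, 64)
lo, hi = two_sample_z_interval(52.0, 50.0, 3.0, 4.0, 36, 64)
print(f"z = {z:.3f}, se = {se:.4f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

For the t procedures the structure is the same, with sample standard deviations in place of σ and a t critical value in place of z*. The standard library has no t-distribution, so that part is left to the calculator.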
Calculator Tip!
Two Sample Z: STAT-TESTS-2-SampZTest, STAT-TESTS-2-SampZInt
Two Sample T: STAT-TESTS-2-SampTTest, STAT-TESTS-2-SampTInt
Note: The only time you pool is when the population standard deviations are the same. This almost never happens, so just don’t do it!
Example #1 Patients with heart-attack symptoms arrive at an emergency room either by ambulance or self-transportation provided by themselves, family, or friends. When a patient arrives at the emergency room, the time of arrival is recorded. The time when the patient’s diagnostic treatment begins is also recorded. An administrator of a large hospital wanted to determine whether the mean wait time (time between arrival and diagnostic treatment) for patients with heart-attack symptoms differs according to the mode of transportation. A random sample of 150 patients with heart-attack symptoms who had reported to the emergency room was selected. For each patient, the mode of transportation and wait time were recorded. Summary statistics for each mode of transportation are shown in the table below.
Use a 99% confidence interval to estimate the difference between the mean wait times for ambulance transported patients and self-transported patients at this emergency room.
P: μS = mean wait time for diagnostic treatment if traveled by self-transportation
μA = mean wait time for diagnostic treatment if traveled by ambulance
μD = μA – μS = difference in mean wait times
A: SRS (says so)
Normality: nA + nS ≥ 30, so by the CLT it is ok to assume normality (73 + 77 ≥ 30, 150 ≥ 30)
Independence: N ≥ 10n (more than 1500 people with heart-attack symptoms). Self-transported patients shouldn’t influence the wait time of ambulance-transported patients.
N: Two-Sample t-interval
C: I am 99% confident the true difference in mean wait times between ambulance-transported and self-transported patients (μA – μS) is between –4.2910 and –0.2291 minutes.
b. Based only on this confidence interval, do you think the difference in the mean wait times is statistically significant? Justify your answer.
Since 0 is not in the confidence interval, we can say that the mean wait time for ambulance-transported patients is statistically significantly shorter than the mean wait time for self-transported patients at the 0.01 significance level (99% confidence).
Example #2: The following is a list of salary rates (per hour in dollars) for men and women with a high school diploma.
If the two samples are independent and are taken randomly, is there significant evidence that the men make more money than the women? Assume that from past experience σ = 1.99 dollars for men and σ = 2.01 dollars for women.
P: μM = mean dollars per hour for men with a high school diploma
μW = mean dollars per hour for women with a high school diploma
μD = μM – μW = difference in mean dollars per hour
A: SRS (says so)
Normality: nM + nW ≥ 30, so by the CLT it is ok to assume normality (26 + 26 ≥ 30, 52 ≥ 30)
Independence: N ≥ 10n (more than 520 people with a high school diploma). Men’s salaries shouldn’t influence the salaries of women with a high school diploma. Also, the problem says the samples are independent.
N: Two-Sample Z-Test
O: P(Z > 1.24) = 1 – P(Z < 1.24) = 1 – 0.8925 = 0.1075
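As a check on the table lookup, the tail probability can be computed with Python's standard library. The sketch below also computes the standard deviation of the difference from the σ values and sample sizes stated in the example. The test statistic 1.24 is taken as reported on the slide, since the salary data themselves are not reproduced here.

```python
from math import sqrt
from statistics import NormalDist

sigma_m, sigma_w = 1.99, 2.01  # population standard deviations given in the example
n_m = n_w = 26                 # sample sizes used in the conditions check

# Standard deviation of the difference between the two sample means.
se = sqrt(sigma_m ** 2 / n_m + sigma_w ** 2 / n_w)

z = 1.24                       # test statistic as reported in the example
p_value = 1 - NormalDist().cdf(z)  # upper-tail area, P(Z > z)

print(f"standard deviation of the difference: {se:.4f}")
print(f"P(Z > {z}) = {p_value:.4f}")
```

The computed tail area should agree with the table value of 0.1075 to four decimal places.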
M: p-value > α (0.1075 > 0.05), so fail to reject the null hypothesis.
S: There is not enough evidence to say that men with a high school diploma make more money per hour than women.
If we want to compare two populations or compare the responses to two treatments from independent samples, we look at the difference between two proportions: $p_1 - p_2$, estimated from the samples by $\hat{p}_1 - \hat{p}_2$.
Conditions for Proportion Interval:
SRS
Normality
Independence
• N ≥ 10(n1 + n2)
• The two samples are independent
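Confidence Interval: estimate ± (critical value)(standard deviation of statistic)
$(\hat{p}_1 - \hat{p}_2) \pm z^* \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$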
Conditions for Proportion Test:
SRS
Normality
Independence
• N ≥ 10(n1 + n2)
• The two samples are independent
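Hypothesis Test: test statistic = (estimate – hypothesized value) / (standard deviation of statistic)
$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$, where $\hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$ is the pooled sample proportion and the hypothesized difference is 0.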
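Calculator Tip!
Confidence Interval: STAT-TESTS-2-PropZInt
Hypothesis Test: STAT-TESTS-2-PropZTest
Note: The advice not to pool applies to the two-sample t procedures, where you pool only when the population standard deviations are the same. 2-PropZTest has no pooling option: it always combines the two samples to estimate the common proportion under the null hypothesis.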
Example #1 An election is bitterly contested between two rivals. In a poll of 750 potential voters taken 4 weeks before the election, 420 indicated a preference for candidate Grumpy over candidate Dopey. Two weeks later, a new poll of 900 randomly selected potential voters found 465 who plan to vote for Grumpy. Dopey immediately began advertising that support for Grumpy was slipping drastically and that he was going to win the election. Statistically speaking (at the 0.05 level), how happy should Dopey be?
P: p1 = true proportion of people who want Grumpy to win at the time of the 1st poll
p2 = true proportion of people who want Grumpy to win at the time of the 2nd poll
pD = p1 – p2 = difference in the proportions between the 1st poll and the 2nd
H: H0: p1 = p2 or p1 – p2 = 0 or pD = 0
HA: p1 > p2 or p1 – p2 > 0 or pD > 0
A: SRS (stated for the second poll only; must assume it for the first)
Normality
Independence: Safe to assume there were more than 10(750 + 900), or 16,500, voters. The first poll might have influenced the second poll, so proceed with caution!
N: 2-PropZTest
O: P(Z > 1.75) = 1 – P(Z < 1.75) = 1 – 0.9599 = 0.0401
Or, by calculator: P(Z > 1.75) = 0.03941
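The test statistic and p-value for the election example can be reproduced from the poll counts. This is a minimal sketch that assumes the pooled standard deviation (the two polls are combined to estimate the common proportion under the null hypothesis). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x1, n1 = 420, 750  # 1st poll: voters preferring Grumpy, poll size
x2, n2 = 465, 900  # 2nd poll: voters preferring Grumpy, poll size

p1_hat = x1 / n1
p2_hat = x2 / n2
p_pooled = (x1 + x2) / (n1 + n2)  # combined estimate, assuming p1 = p2

# Standard deviation of the difference under the null hypothesis.
se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))

z = (p1_hat - p2_hat) / se
p_value = 1 - NormalDist().cdf(z)  # one-sided, upper tail

print(f"z = {z:.4f}, p-value = {p_value:.5f}")
```

The unrounded statistic is about 1.76, and the resulting p-value is close to the calculator value of 0.03941. The table answer 0.0401 differs slightly because it rounds z to 1.75 before the lookup.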
M: p-value < α (0.03941 < 0.05), so reject the null hypothesis.
S: There is enough evidence to say that the proportion of voters that support Grumpy has dropped from the 1st poll to the second. Dopey should be very happy!
Example #2 Two groups of 40 randomly selected students took part in a study on drop-out rates. One group was enrolled in a counseling program designed to give them skills needed to succeed in school and the other group received no special counseling. Fifteen of the students who received counseling dropped out of school, and 23 of the students who did not receive counseling dropped out. Construct a 90% confidence interval for the true difference between the drop-out rates of the two groups.
P: pC = true proportion of students who drop out with counseling
pN = true proportion of students who drop out without any counseling
pD = pC – pN = difference in proportion of students who drop out with counseling vs. without
A: SRS (says in both groups)
Normality
Independence: Safe to assume there were more than 10(40 + 40), or 800, students. The drop-out rate of the group with counseling might influence the group without counseling. Proceed with caution!
N: 2-PropZInt
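The interval itself can be sketched from the counts given in the problem. The code below assumes the usual unpooled formula for a two-proportion z interval; whether the calculator rounds in exactly the same way is not confirmed here, and the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

x_c, n_c = 15, 40  # drop-outs and group size, with counseling
x_n, n_n = 23, 40  # drop-outs and group size, without counseling

p_c = x_c / n_c
p_n = x_n / n_n

# Unpooled standard deviation of the difference between sample proportions.
se = sqrt(p_c * (1 - p_c) / n_c + p_n * (1 - p_n) / n_n)

z_star = NormalDist().inv_cdf(0.95)  # critical value for 90% confidence
diff = p_c - p_n
lower, upper = diff - z_star * se, diff + z_star * se

print(f"90% CI for pC - pN: ({lower:.4f}, {upper:.4f})")
```

Under these assumptions the interval runs from roughly –0.38 to –0.02. Because 0 is not in the interval, the data suggest a lower drop-out rate with counseling, subject to the caution about independence noted in the conditions.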