Announcements
• Extra office hours this week: Thursday, 12:00-12:45.
• The midterm will cover material through Section 13.4.
• I will spend half of Thursday’s class going over the remaining material in Section 13.4 and the other half reviewing.
• Instead of a review session on Sunday, I will hold office hours from 2:00 to 5:00.
• On Monday, I will hold office hours from 9:00 to 11:30. I should also be in my office most of the afternoon (after 1:30).
• A practice exam and practice problems are posted. Answers will be posted Thursday night.
Lecture 7
• Inference for a difference between means from independent samples
• Observational vs. experimental data
• Differences between means in matched pairs experiments
Inference about μ1 - μ2: Equal variances
• Construct the t-statistic from the pooled variance estimate.
• Perform a hypothesis test:
  • H0: μ1 - μ2 = 0
  • H1: μ1 - μ2 > 0 (or < 0, or ≠ 0)
• Or build a confidence interval.
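For reference, here is a sketch of the standard pooled-variance formulas for this case. The notation is an assumption: $\bar{x}_i$, $s_i^2$ and $n_i$ denote the sample mean, sample variance and sample size of group $i$, and $\nu$ denotes the degrees of freedom.

```latex
\[
s_p^2 = \frac{(n_1 - 1)\,s_1^2 + (n_2 - 1)\,s_2^2}{n_1 + n_2 - 2},
\qquad
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}},
\qquad
\nu = n_1 + n_2 - 2
\]
\[
\text{Confidence interval: }\;
(\bar{x}_1 - \bar{x}_2) \;\pm\; t_{\alpha/2,\,\nu}
\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}
\]
```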
Inference about the Difference between Two Means
• Example 13.1
• Do people who eat high-fiber cereal for breakfast consume, on average, fewer calories at lunch than people who do not?
• A sample of 150 people was randomly drawn. Each person was identified as an eater or non-eater of high-fiber cereal.
• For each person, the number of calories consumed at lunch was recorded.
• The 43 high-fiber eaters had a mean of 604.02 calories at lunch with s = 64.05. The 107 non-eaters had a mean of 633.23 calories at lunch with s = 103.29.
Inference about μ1 - μ2: Unequal variances
• Conduct a hypothesis test as needed, or build a confidence interval.
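For reference, here is a sketch of the standard unequal-variances (Welch) formulas. The notation is an assumption: $\bar{x}_i$, $s_i^2$ and $n_i$ are the sample mean, sample variance and sample size of group $i$, and $\nu$ is the approximate degrees of freedom.

```latex
\[
t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}
         {\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}},
\qquad
\nu \approx
\frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^{2}}
     {\dfrac{\left(s_1^2/n_1\right)^{2}}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^{2}}{n_2 - 1}}
\]
\[
\text{Confidence interval: }\;
(\bar{x}_1 - \bar{x}_2) \;\pm\; t_{\alpha/2,\,\nu}
\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\]
```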
What is common (cont.) • p-values:
What’s not common
• The way the standard error is estimated:
  • Equal variances: pooled standard deviation estimate
  • Separate variances: separate standard deviation estimates
• The degrees of freedom:
  • Equal variances: simple, exact degrees of freedom
  • Separate variances: complex, approximate degrees of freedom
Which case to use: equal variances or unequal variances?
• Whenever there is insufficient evidence that the variances are unequal, it is preferable to perform the equal-variances t-test.
• This is because, for any two given samples, the number of degrees of freedom for the equal-variances case ≥ the number of degrees of freedom for the unequal-variances case.
Equal/different variances in practice
• First step: obtain standard deviation or variance estimates for each group. If they are drastically different, use the unequal-variances test.
• Later we will see a test for equality of variances.
Example 13.1 continued
• Test the scientist’s claim that high-fiber cereal eaters consume fewer calories at lunch than non-eaters, assuming unequal variances, at the 5% significance level.
• There were 43 high-fiber eaters who had a mean of 604.02 calories for lunch with s = 64.05. There were 107 non-eaters who had a mean of 633.23 calories for lunch with s = 103.29.
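The unequal-variances test for Example 13.1 can be sketched from the summary statistics alone. This is a minimal illustration using only the Python standard library, not the course's JMP workflow; the function names `welch_t` and `t_upper_tail` are hypothetical, and the p-value comes from a simple numerical integration of the t density rather than from a statistics package.

```python
import math


def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Unequal-variances t statistic and approximate degrees of freedom."""
    v1, v2 = sd1**2 / n1, sd2**2 / n2          # per-group variance of the mean
    t = (mean1 - mean2) / math.sqrt(v1 + v2)   # hypothesized difference is 0
    df = (v1 + v2) ** 2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))
    return t, df


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for Student's t, via Simpson's rule on the density."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    a = abs(t)
    h = a / steps                               # steps must be even
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    upper = 0.5 - total * h / 3                 # P(T > |t|)
    return upper if t >= 0 else 1.0 - upper


# Group 1: high-fiber eaters; group 2: non-eaters (figures from the slide).
t, df = welch_t(604.02, 64.05, 43, 633.23, 103.29, 107)

# H1: mu1 - mu2 < 0, so the p-value is the left tail P(T < t).
p_left = 1.0 - t_upper_tail(t, df)

print(f"t = {t:.2f}, df = {df:.1f}, left-sided p-value = {p_left:.4f}")
```

With these inputs the statistic is about -2.09 on roughly 122.6 degrees of freedom, and the left-sided p-value falls below 0.05, so the claim is supported at the 5% level.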
Additional Example: Problem 13.49
Tire manufacturers are constantly researching ways to produce tires that last longer, and new tires are tested by both professional drivers and ordinary drivers on racetracks. Suppose that, to determine whether a new steel-belted radial tire lasts longer than the company’s current model, two new-design tires were installed on the rear wheels of 20 randomly selected cars and two existing-design tires were installed on the rear wheels of another 20 cars. All drivers were told to drive in their usual way until the tires wore out. The number of miles (in 1,000s) was recorded (Xr13-49). Can the company infer that the new tire will last longer than the existing tire?
From 2-sided to right/left-sided
• Given a 2-sided p-value, how do we get a 1-sided p-value? (JMP gives only the former.)
• Right-sided test:
  • If the observed difference in sample means is greater than the hypothesized difference: right-sided p-value = (2-sided p-value) / 2.
  • If the observed difference in sample means is less than the hypothesized difference: right-sided p-value > 0.5, so we can’t reject H0 …
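The conversion rule can be sketched as a small helper. The function name `one_sided_p` and its parameters are hypothetical; the left-sided branch is the mirror image of the right-sided rule stated on the slide and relies on the symmetry of the t distribution.

```python
def one_sided_p(two_sided_p, observed_diff, hypothesized_diff=0.0, tail="right"):
    """Convert a 2-sided p-value from a t-test into a 1-sided p-value."""
    if tail not in ("right", "left"):
        raise ValueError("tail must be 'right' or 'left'")
    half = two_sided_p / 2
    if tail == "right":
        in_direction = observed_diff > hypothesized_diff
    else:
        in_direction = observed_diff < hypothesized_diff
    # If the data point in the direction of H1, halve the 2-sided p-value;
    # otherwise the 1-sided p-value is 1 - half, which is at least 0.5.
    return half if in_direction else 1.0 - half


# Hypothetical 2-sided p-value of 0.04 reported by software.
print(one_sided_p(0.04, observed_diff=5.0))    # data agree with H1: mu1 - mu2 > 0
print(one_sided_p(0.04, observed_diff=-5.0))   # data point the other way
```

In the second call the result exceeds 0.5, which is the slide's "can't reject" case.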
Observational vs. Experimental Data
• Observational data: The researcher observes individuals and measures variables of interest but does not control which group each individual (unit) is assigned to.
• Experimental data: The researcher controls which group each individual is assigned to. A common procedure in statistical studies is to use random assignment.
• Example 13.1 (high-fiber cereal example): observational or experimental data?
• Problem 13.49 (tire problem): observational or experimental data?
Interpretation of Experimental vs. Observational Data
• For both observational and experimental data obtained by random sampling, we can use statistical inference to assess the evidence that there is a difference between the two groups.
• A well-controlled experiment (e.g., a randomized experiment) provides evidence of the effect of the treatment (high-fiber cereal) on the outcome (calories eaten for lunch).
• But for observational data, we cannot conclude that a difference between groups is due to the treatment. The difference could be due to a confounding factor, e.g., high-fiber cereal eaters are more health conscious.
13.4 Matched Pairs Experiments
• What is a matched pairs experiment?
• Why are matched pairs experiments needed?
• How do we deal with data produced in this way?
The following example demonstrates a situation where a matched pairs experiment is the correct approach to testing the difference between two population means.
13.4 Matched Pairs Experiment: Example 13.3
• To investigate the job offers obtained by MBA graduates, a study focusing on salaries was conducted.
• In particular, the salaries offered to finance majors were compared to those offered to marketing majors.
• Random samples of 25 graduates from each discipline were selected, and the highest salary offer was recorded for each one. The data are stored in file Xm13-03.
• Can we infer that finance majors obtain higher salary offers than do marketing majors among MBAs?
What’s happening here?
• Question:
• The difference between the sample means is 65,624 - 60,423 = 5,201.
• So why could we not reject H0 in favor of H1 (μ1 - μ2 > 0)?
The effect of a large sample variability
• Answer:
• The pooled variance sp² is large because the sample variances are large: sp² = 311,330,926.
• A large variance reduces the value of the t-statistic, making it more difficult to reject H0.
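The size of the effect can be checked with a short calculation that uses only the figures quoted for Example 13.3 (sample means, pooled variance, and 25 graduates per group). It is a sketch of the equal-variances t-statistic, with variable names chosen here for illustration.

```python
import math

# Figures quoted on the slides for Example 13.3.
mean_finance = 65624
mean_marketing = 60423
sp2 = 311_330_926        # pooled variance
n1 = n2 = 25             # graduates sampled per major

diff = mean_finance - mean_marketing            # 5,201
se = math.sqrt(sp2 * (1 / n1 + 1 / n2))         # standard error of the difference
t_stat = diff / se
df = n1 + n2 - 2

print(f"difference = {diff}, standard error = {se:.0f}, t = {t_stat:.2f}, df = {df}")
```

The standard error comes out at nearly 5,000, almost as large as the difference itself, so the t-statistic is only about 1.04 on 48 degrees of freedom.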
Reducing the variability
• The values in each sample might vary markedly (figure: the range of observations in sample A and in sample B) ...
Reducing the variability: differences
• ... but the differences between pairs of observations might be quite close to one another, resulting in a small variability of the differences (figure: the range of the differences).
The matched pairs experiment
• Example 13.4
• It was suspected that salary offers were affected by students’ GPAs (which caused s1² and s2² to increase).
• To reduce this variability, the following procedure was used:
  • 25 ranges of GPAs were predetermined.
  • Students from each major were randomly selected, one from each GPA range.
  • The highest salary offer for each student was recorded.
• From the data presented, can we conclude that finance majors are offered higher salaries?
Matched Pairs => One-Sample Test
• After taking differences of observations within each pair, continue with a one-sample t-test on the differences, with degrees of freedom equal to the number of pairs minus 1.
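The matched pairs procedure can be sketched as follows. The salary figures are hypothetical (they are not the textbook data files), chosen so that offers vary widely across GPA ranges while the within-pair differences stay small; the independent-samples (pooled) statistic is computed on the same numbers for comparison.

```python
import math
import statistics

# Hypothetical salary offers (in $1,000s), one finance/marketing pair per GPA range.
finance = [65, 70, 58, 62, 75]
marketing = [62, 68, 57, 60, 71]

# Matched pairs: reduce to one sample of within-pair differences.
diffs = [f - m for f, m in zip(finance, marketing)]
n_d = len(diffs)
mean_d = statistics.mean(diffs)
sd_d = statistics.stdev(diffs)                  # sample standard deviation
t_paired = mean_d / (sd_d / math.sqrt(n_d))     # H0: mean difference = 0
df_paired = n_d - 1

# For comparison: treat the same data as two independent samples (pooled variance).
n1, n2 = len(finance), len(marketing)
sp2 = ((n1 - 1) * statistics.variance(finance)
       + (n2 - 1) * statistics.variance(marketing)) / (n1 + n2 - 2)
t_indep = (statistics.mean(finance) - statistics.mean(marketing)) / math.sqrt(
    sp2 * (1 / n1 + 1 / n2))

print(f"paired:      t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: t = {t_indep:.2f}, df = {n1 + n2 - 2}")
```

Both statistics share the same numerator (a mean difference of 2.4), but the paired version divides by a much smaller standard error, which is the variability reduction that matching is meant to achieve.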
Additional Example: Problem 13.75
Tire example continued (Problem 13.49). Suppose now we redo the experiment: on 20 randomly selected cars, one tire of each type is installed on the rear wheels and, as before, the cars are driven until the tires wear out. The number of miles (in 1,000s) is stored in Xr13-75. Can we conclude that the new tire is superior?
Practice Problems
• 13.40, 13.56, 13.68, 13.76