Lecture 15: Tues., Mar. 2 • Inferences about Linear Combinations of Group Means (Chapter 6.2) • Chi-squared test (Handout/Notes) • Thursday: Simple Linear Regression (Chapter 7)
Review of One-way layout • Assumptions of ideal model: • All populations have the same standard deviation. • Each population is normal. • Observations are independent. • Planned comparisons: Usual t-test, but use all groups to estimate $\sigma$ (the pooled standard deviation $s_p$). If there are many planned comparisons, use Bonferroni to adjust for multiple comparisons. • Test of $H_0: \mu_1 = \mu_2 = \cdots = \mu_I$ vs. the alternative that at least two means differ: one-way ANOVA F-test. • Unplanned comparisons: Use the Tukey-Kramer procedure to adjust for multiple comparisons.
Case Study 5.1.2: Spock Conspiracy Trial • In 1968, Dr. Spock was tried in U.S. District Court of Boston on charges of conspiring to violate the Selective Service Act by encouraging young men to resist being drafted into military service. • The defense challenged the method by which jurors were selected, claiming that women (many of whom had raised children according to popular methods developed by Dr. Spock) were underrepresented. • The venire for the trial contained only one woman. • The defense argued that the judge in the trial had a history of venires in which women were systematically underrepresented.
Data for Spock Conspiracy Trial • Percent of women in recent 30-juror venires for the Spock Trial judge and six other Boston area district judges (A,B,C,D,E,F). Seven groups (judges) in one-way layout. Data in spock.JMP. • Key question: How does the mean percentage of women for the Spock Trial judge compare to the average of the mean percentages of women for the other six judges, i.e., what is $\mu_1 - \frac{1}{6}(\mu_2 + \cdots + \mu_7)$, where $\mu_1$ is the mean for the Spock Trial judge and $\mu_2, \ldots, \mu_7$ are the means for judges A through F?
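Inference about Linear Combinations of Group Means • Parameter of interest: $\gamma = C_1\mu_1 + C_2\mu_2 + \cdots + C_I\mu_I$. For Spock study, $\gamma = \mu_1 - \frac{1}{6}(\mu_2 + \cdots + \mu_7)$, with $\mu_1$ the mean for the Spock Trial judge. • Point estimate: $g = C_1\bar{Y}_1 + C_2\bar{Y}_2 + \cdots + C_I\bar{Y}_I$ • Standard Error: $SE(g) = s_p\sqrt{\frac{C_1^2}{n_1} + \frac{C_2^2}{n_2} + \cdots + \frac{C_I^2}{n_I}}$, where $s_p$ is the pooled standard deviation • 95% Confidence Interval for $\gamma$: $g \pm t_{n-I}(.975)\,SE(g)$ • Test of $H_0: \gamma = \gamma_0$: For level .05 test, reject $H_0$ if and only if $\gamma_0$ does not belong to the 95% confidence interval.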
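As a rough sketch of the calculation on this slide, the point estimate, standard error, and confidence interval for a linear combination can be computed as below. The function name `linear_combination_ci`, the group means, sample sizes, pooled standard deviation, and the t multiplier are all hypothetical values chosen for illustration; they are not the spock.JMP data, and the multiplier should be looked up in a t table for n - I degrees of freedom.

```python
import math

def linear_combination_ci(means, ns, coeffs, s_p, t_mult):
    """Return (g, SE(g), CI) for gamma = sum of C_i * mu_i.

    means  : group sample means (Ybar_i)
    ns     : group sample sizes (n_i)
    coeffs : coefficients C_i of the linear combination
    s_p    : pooled standard deviation from all groups
    t_mult : t quantile with n - I degrees of freedom (from a table)
    """
    g = sum(c * m for c, m in zip(coeffs, means))
    se = s_p * math.sqrt(sum(c ** 2 / n for c, n in zip(coeffs, ns)))
    return g, se, (g - t_mult * se, g + t_mult * se)

# Hypothetical numbers for illustration only (NOT the Spock data).
means = [15.0, 30.0, 30.0, 30.0, 30.0, 30.0, 30.0]  # group 1 vs. six others
ns = [10] * 7
coeffs = [1.0] + [-1.0 / 6.0] * 6   # mu_1 minus the average of the other six
s_p = 7.0                           # assumed pooled standard deviation
t_mult = 2.0                        # rounded stand-in for t_{63}(.975)

g, se, ci = linear_combination_ci(means, ns, coeffs, s_p, t_mult)
print(g, se, ci)
# Level .05 test of H0: gamma = 0 rejects iff 0 lies outside the interval.
reject_h0 = not (ci[0] <= 0.0 <= ci[1])
```

Note that the coefficients sum to zero, as they do for any comparison of group means.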
Linear Combinations: Comparing Rates • In the mice diet study, we are interested in the rate of increase in lifetime for each additional kilocalorie by which the diet is reduced. • For example, we are interested in comparing the rate of increase in lifetime associated with a reduction from 50 to 40 kcal/wk vs. the rate of increase in lifetime associated with a reduction from 85 to 50 kcal/wk.
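One way to write this comparison as a linear combination of group means is sketched below. The labels $\mu_{40}$, $\mu_{50}$, $\mu_{85}$ are notation assumed here for the mean lifetimes of the 40, 50, and 85 kcal/wk groups; each rate is a difference in means divided by the number of kilocalories removed (10 and 35, respectively).

```latex
\gamma
  = \frac{\mu_{40} - \mu_{50}}{10} - \frac{\mu_{50} - \mu_{85}}{35}
  = \frac{1}{10}\,\mu_{40} - \frac{9}{70}\,\mu_{50} + \frac{1}{35}\,\mu_{85}
```

The coefficients $\frac{1}{10}$, $-\frac{9}{70}$, $\frac{1}{35}$ sum to zero, and the usual point estimate, standard error, and confidence interval for a linear combination apply.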
Populations of Nominal Data • So far we have focused on comparing populations of interval data (e.g., heights, scores, incomes). • We now consider comparing populations of nominal data. Nominal data are data that are categories. Examples: • Candidate a person voted for (Bush or Gore) • Color of M&Ms (brown, yellow, red, orange, green or blue) • A population of nominal data with k categories can be described by the proportion in each category: $p_1$ in category 1, $p_2$ in category 2, …, $p_k$ in category k ($p_1 + p_2 + \cdots + p_k = 1$), e.g., population of M&M’s is supposed to have
One Sample Test for Nominal Data • Analogue of one sample problem with interval population: Take a random sample of size n from a population of nominal data. We want to test whether the population frequencies are equal to specified (hypothesized) values, i.e., $H_0: p_1 = p_1^*, \ldots, p_k = p_k^*$.
SAT example • People sometimes say that “b” and “c” answers occur most frequently on multiple choice tests. To see if there is any evidence that the answers do not occur with equal frequency, a random SAT exam was selected from The College Board, 10 SATs, New York: College Entrance Examination Board.
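Chi-squared Test • Chi-squared test statistic: $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ is the observed frequency in category i and $E_i = n p_i^*$ is the expected frequency under the hypothesized proportion $p_i^*$. • Reject $H_0$ for large values of $\chi^2$. Critical value for level .05 test is the .95 quantile of the $\chi^2$ distribution with k-1 degrees of freedom (Table A.3). • Test is only valid if expected frequencies in each cell are 5 or more. When necessary, cells should be combined in order to satisfy this condition.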
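A minimal sketch of the chi-squared calculation, assuming five equally likely answer categories as in the SAT example. The observed counts are hypothetical and chosen only for illustration (the slide's actual frequencies are not reproduced here); the function name `chi_squared_statistic` is likewise an assumption. The critical value 9.488 is the .95 quantile of the chi-squared distribution with 4 degrees of freedom.

```python
def chi_squared_statistic(observed, probs):
    """Return (chi-squared statistic, expected counts).

    observed : observed frequency in each category (O_i)
    probs    : hypothesized proportion for each category (p_i*)
    """
    n = sum(observed)
    expected = [n * p for p in probs]  # E_i = n * p_i*
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, expected

# Hypothetical counts of answers a, b, c, d, e (NOT the real SAT data).
observed = [15, 20, 25, 22, 18]
probs = [0.2] * 5  # H0: all five answers equally likely

stat, expected = chi_squared_statistic(observed, probs)

# The test is only valid if every expected frequency is 5 or more.
valid = all(e >= 5 for e in expected)

critical_value = 9.488  # .95 quantile, chi-squared with k - 1 = 4 df
reject_h0 = stat > critical_value
print(stat, valid, reject_h0)
```

With these made-up counts the statistic is well below the critical value, so the hypothesis of equal frequencies would not be rejected at level .05.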
Chi-Squared Test in JMP • (For the SAT example) • Method I (list all observations in sample): Create a column for answer and list the sample. Then click Analyze, Distribution, put the column with answer in Y, click OK, then click the red triangle next to answer, click Test Probabilities and then input the hypothesized probabilities (0.2 for each category for SAT example). Then click OK. The row Pearson gives the chi-squared statistic and the p-value. • Method II (list frequencies for each category): Create a column answer with one row for each category (a,b,c,d,e) and another column frequency which contains the frequency of each answer. Then click Analyze, Distribution, put the column with answer in Y, put the column with frequency in Freq and click OK. Then follow the Method I instructions from the red triangle onward.
Random numbers experiment • When selecting random numbers (e.g., for a random sample or randomized experiment), you should always use a random number generator or a random number table. People are very bad at picking random numbers themselves. • Experiment: Everybody pick a random whole number between 1 and 10. We’ll then survey the class and test whether people’s “random” numbers are really random.
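For comparison with the class survey, here is a small sketch that draws numbers from Python's pseudo-random generator and computes the chi-squared statistic against equal frequencies for 1 through 10. The sample size of 60 and the seed are arbitrary choices made here; 16.919 is the .95 quantile of the chi-squared distribution with 9 degrees of freedom.

```python
import random
from collections import Counter

rng = random.Random(0)   # arbitrary seed so the run is repeatable
n = 60                   # assumed sample size; expected count per number is 6
picks = [rng.randint(1, 10) for _ in range(n)]  # whole numbers 1..10 inclusive

counts = Counter(picks)
expected = n / 10        # equal frequencies under H0 (must be 5 or more)

# Sum over all ten categories, including any number that was never picked.
stat = sum((counts.get(v, 0) - expected) ** 2 / expected for v in range(1, 11))

critical_value = 16.919  # .95 quantile, chi-squared with 10 - 1 = 9 df
reject_h0 = stat > critical_value
print(sorted(counts.items()), stat, reject_h0)
```

Replacing `picks` with the numbers collected from the class gives the test described on this slide.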
M&M’s • According to the M&M’s web site, the color distribution in peanut butter M&M’s is 20% brown, 20% yellow, 20% red, 20% blue, 10% green and 10% orange. Test
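As a sketch of how the expected frequencies for this example could be set up, the code below uses the color proportions stated on the slide. The sample size of 100 is a hypothetical value, since no observed counts are given here, and the function name `expected_counts` is an assumption.

```python
# Hypothesized color proportions for peanut butter M&M's (from the slide).
color_probs = {
    "brown": 0.20,
    "yellow": 0.20,
    "red": 0.20,
    "blue": 0.20,
    "green": 0.10,
    "orange": 0.10,
}

def expected_counts(n, probs):
    """Expected frequency E_i = n * p_i for each category."""
    return {color: n * p for color, p in probs.items()}

n = 100  # hypothetical number of M&M's in the sample
expected = expected_counts(n, color_probs)

# Validity condition for the chi-squared test: every expected count >= 5.
valid = all(e >= 5 for e in expected.values())

degrees_of_freedom = len(color_probs) - 1  # k - 1 = 5
print(expected, valid, degrees_of_freedom)
```

The observed color counts from a real sample would then be compared with these expected counts using the chi-squared statistic with 5 degrees of freedom.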