Apr. 5 Stat 100
To do • Read Chapter 21, try problems 1-6 • Skim Chapter 22
Thought Question • Suppose we think mean classes missed per week is higher for males than for females. • How would we determine whether this is the case?
Basic Strategy • Collect data on missed classes for males and females • Calculate mean for each sex and determine the difference. • Then what? Suppose the mean for 50 males was 0.5 higher than mean for 50 females. • Could we generalize that for all students, mean is higher for males?
An Issue • Did the observed difference occur just by chance or does it reflect a true difference? • What’s the likelihood the observed difference would be 0.5 if there’s really no difference in the populations?
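One way to get a feel for the "just by chance" question is to shuffle the group labels and see how often a difference at least as large as the observed one turns up. The sketch below uses a label-shuffling (permutation) approach, which is a stand-in chosen for illustration and not the method these slides go on to use. The data are hypothetical, generated only for the example; nothing here comes from a real class survey.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data (made up for illustration): classes missed per week
# for 50 males and 50 females.
males = [random.choice([0, 1, 1, 2, 2, 3]) for _ in range(50)]
females = [random.choice([0, 0, 1, 1, 2, 2]) for _ in range(50)]
observed_diff = mean(males) - mean(females)

# If there is really no difference in the populations, the sex labels are
# arbitrary. Shuffle them many times and count how often chance alone gives
# a difference at least as large as the one observed.
pooled = males + females
n_shuffles = 5000
count = 0
for _ in range(n_shuffles):
    random.shuffle(pooled)
    diff = mean(pooled[:50]) - mean(pooled[50:])
    if diff >= observed_diff:
        count += 1

chance_prob = count / n_shuffles
print(f"observed difference = {observed_diff:.2f}")
print(f"share of shuffles with a difference at least this large = {chance_prob:.4f}")
```

A small share suggests the observed difference would be unusual if there were no true difference; a large share suggests chance alone could easily explain it.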
Significance Test • Data used to decide between two competing statements (hypotheses) about the population • Statements are called null hypothesis and alternative hypothesis
The two hypotheses • null hypothesis: a “nothing happening” statement • no difference between groups, or no relationship, or nothing new • alternative hypothesis: a “something’s happening” statement • there is a difference, or there is a relationship, or something’s new
Notation • H0 represents null hypothesis • HA represents alternative hypothesis
Missed classes by men and women • H0: No difference in mean classes missed for men and women • HA: There is a difference between mean missed classes by men and women
Deciding between the hypotheses: an example • Students asked to randomly pick a number between 1 and 10 • In the past, a bias toward picking 7 has been noticed • With random picking, what proportion would pick 7? • Answer = 1/10 = 0.1 because there are 10 numbers
Are students picking randomly? • Let p represent the proportion of all students who would pick 7 • null: random picking, p = 0.10 • alternative: bias toward 7, p > 0.10
The sample data • Suppose 24 of 100 students in the class pick 7. • The proportion picking 7 is 24/100 = 0.24 (much higher than .10) • Could this have happened through random picking? • Or, does it reflect bias toward 7?
With true random picking • Distribution of possible sample proportions would be approximately a bell curve • Centered at 0.10 (chance of randomly picking 7) • SD = sqrt[(0.1)(1 − 0.1)/100] = 0.03 • About 99.7% chance that sample proportion would fall in range 0.10 ± 3(0.03), or 0.01 to 0.19
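The arithmetic on this slide can be checked in a few lines. This is a minimal sketch; the variable names are chosen here for illustration, and the values are the ones from the slide (p = 0.10 under random picking, 100 students).

```python
import math

p0 = 0.10   # proportion expected to pick 7 under random picking
n = 100     # number of students in the sample

# Standard deviation of the sample proportion when picking is random
sd = math.sqrt(p0 * (1 - p0) / n)

# Range covering about 99.7% of possible sample proportions (mean +/- 3 SD)
low, high = p0 - 3 * sd, p0 + 3 * sd

print(f"SD = {sd:.2f}")
print(f"3-SD range: {low:.2f} to {high:.2f}")
```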
Where the observed statistic falls • Sample value of 0.24 is outside interval of what might normally occur with random picking • Reasonable to reject the hypothesis that picking is random
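As a cross-check on the bell-curve reasoning, the chance of seeing 24 or more sevens among 100 random pickers can also be computed directly from the binomial distribution. This is a sketch of an alternative calculation and not the approach the slides take; the helper name `binom_tail` is made up for the example.

```python
from math import comb

n, p0, observed = 100, 0.10, 24   # values from the number-picking example

def binom_tail(k, n, p):
    """P(X >= k) when X counts successes in n independent tries, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

tail = binom_tail(observed, n, p0)
print(f"P(at least {observed} of {n} pick 7 by chance) = {tail:.6f}")
```

The result is a very small probability, which agrees with the conclusion that 0.24 is not what random picking would normally produce.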
Where the observed statistic falls • Observed = 0.24 picked 7 • Find percentile rank of 0.24 in bell curve with mean 0.10 and SD = 0.03 • z score = (0.24 − 0.10) / 0.03 = 0.14 / 0.03 = 4.67 • Page 137 table, proportion to left of z is above 0.995 (for a z this large it is essentially 1)
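The z score and the proportion to its left can be computed without a printed table. This is a minimal sketch using the standard normal curve via `math.erf`; the function name `normal_cdf` is chosen here for illustration.

```python
import math

def normal_cdf(z):
    """Proportion of a standard bell curve lying to the left of z."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

observed, p0, sd = 0.24, 0.10, 0.03   # values from the slide

z = (observed - p0) / sd
left = normal_cdf(z)

print(f"z = {z:.2f}")
print(f"proportion to the left of z = {left:.6f}")
```

A value this close to 1 means almost no random-picking samples would give a proportion as high as 0.24.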
Statistically Significant • A result is called statistically significant when a null hypothesis is rejected • In our example, the result was statistically significant
Thought Question • Imagine all the criminal trials in the United States • In each trial, what is the null hypothesis? • What is the alternative hypothesis? • Over all criminal trials, what are the two types of mistakes that can be made?
Statistical Errors • Type 1: rejecting the null hypothesis when you should not (like convicting an innocent person) • Type 2: not rejecting the null hypothesis when you should (like failing to convict a guilty person)
Example • Experiment is done to see if new treatment for depression is better than old treatment • null hyp: new treatment not better • alternative hyp: new treatment is better • Type 1 error: deciding new treatment is better when it is not • Type 2 error: deciding new treatment is not better when it actually is
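The two error types can be made concrete with the earlier number-picking example. The sketch below assumes a decision rule of "reject random picking when the sample proportion is above 0.10 + 3 SD", which is one reading of the slides and not something they state outright. The true proportion used for the Type 2 calculation (0.25) is hypothetical, picked only to have something to compute.

```python
from math import comb, sqrt

n, p0 = 100, 0.10
sd = sqrt(p0 * (1 - p0) / n)
cutoff = p0 + 3 * sd   # assumed rule: reject the null if sample proportion exceeds this

def binom_pmf(k, n, p):
    """Chance of exactly k successes in n independent tries, each with chance p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def rejects(k):
    # Small tolerance so a count sitting exactly on the cutoff is not rejected
    return k / n > cutoff + 1e-9

# Type 1 error: picking really is random (p = 0.10), but the sample leads us to reject.
type1 = sum(binom_pmf(k, n, p0) for k in range(n + 1) if rejects(k))

# Type 2 error: there really is a bias (hypothetical true p = 0.25), but we fail to reject.
p_true = 0.25
type2 = sum(binom_pmf(k, n, p_true) for k in range(n + 1) if not rejects(k))

print(f"Type 1 error chance under this rule: {type1:.4f}")
print(f"Type 2 error chance if true p = {p_true}: {type2:.4f}")
```

Moving the cutoff higher would make Type 1 errors rarer and Type 2 errors more common, which is the same trade-off as in the courtroom analogy.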