Hypothesis Testing
Null Hypothesis: The means of the populations from which the samples were drawn are the same. The samples come from the same population (μG/R).
Alternative Hypothesis: The means of the populations from which the samples were drawn are different. The samples come from different populations (μG ≠ μR).
Our problem: All we see are the data we have sampled, not the populations.
Are the sampled data more consistent with the null hypothesis “single population” idea (μG/R)?
Or with the alternative hypothesis “separate populations” idea (μG ≠ μR)?
[Figure: histogram. Distribution of differences between the means of two samples drawn from the same population (μ = 0, σ = 1). X-axis: Difference Between the Means of the Samples, −0.6 to 0.6. Y-axis: Frequency (Probability × 1000), 0 to 300.]
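A distribution like the one in the figure can be approximated by simulation. The sketch below is an illustration only, not the method used to build the slide: the sample size per group (`n_per_sample=50`), the number of trials, and the seed are all hypothetical values, since the slide does not state them.

```python
import random
import statistics

def simulate_mean_differences(n_per_sample=50, n_trials=1000, seed=0):
    """Draw two samples from the same normal population (mean 0, sd 1)
    and record the difference between their means, n_trials times.
    n_per_sample, n_trials and seed are assumed values."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_sample)]
        b = [rng.gauss(0, 1) for _ in range(n_per_sample)]
        diffs.append(statistics.fmean(a) - statistics.fmean(b))
    return diffs

def tally(diffs, bin_width=0.1):
    """Count differences per bin, keyed by the bin centre (one decimal)."""
    counts = {}
    for d in diffs:
        # Adding 0.0 turns a possible -0.0 into 0.0 for cleaner printing.
        centre = round(round(d / bin_width) * bin_width, 1) + 0.0
        counts[centre] = counts.get(centre, 0) + 1
    return dict(sorted(counts.items()))

diffs = simulate_mean_differences()
for centre, count in tally(diffs).items():
    print(f"{centre:+.1f}  {count}")
```

With 1000 trials, each count can be read as "probability × 1000", matching the y-axis label of the figure.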
“What is the probability that the observed samples could have been drawn from the same population?” (μG/R) P = 0.79
“What is the probability that the observed samples could have been drawn from the same population?” (μG/R) P = 0.18
“What is the probability that the observed samples could have been drawn from the same population?” (μG/R) P = 0.02
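One way to put a number on that question is a permutation test, sketched below. This is a swapped-in technique for illustration, not necessarily how the P values on the slides were obtained. The function name, the two data sets, the shuffle count, and the seed are all hypothetical.

```python
import random
import statistics

def permutation_p_value(sample_a, sample_b, n_shuffles=10000, seed=0):
    """Two-sided permutation test for a difference in means.
    Returns the fraction of random relabellings whose mean difference
    is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = abs(statistics.fmean(sample_a) - statistics.fmean(sample_b))
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        # Under the null hypothesis the group labels are interchangeable.
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:n_a]) - statistics.fmean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles

# Hypothetical measurements, invented for this example.
green = [0.1, -0.4, 0.3, 0.0, -0.2, 0.5]
red_close = [0.2, -0.3, 0.4, -0.1, 0.0, 0.3]
red_far = [1.9, 2.4, 2.1, 1.6, 2.2, 2.0]

p_close = permutation_p_value(green, red_close)
p_far = permutation_p_value(green, red_far)
print(f"similar means: P = {p_close:.3f}")
print(f"distant means: P = {p_far:.3f}")
```

The closer the two sample means, the larger P becomes, which is the pattern the three slides above walk through.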
The Dilemma
How much risk should you take when deciding whether the null or the alternative hypothesis is correct?
Too much risk and you may commit a Type I error: rejecting a true null hypothesis (accepting a false alternative, that is, deciding the samples are different when they are the same).
Too little risk and you may commit a Type II error: failing to reject a false null hypothesis (accepting a false null, that is, deciding the samples are the same when they are different).
The conventionally accepted risk level is 0.05, a 1 in 20 chance that you will accidentally accept a false alternative (commit a Type I error). Statistical tables are commonly set to this level of risk.
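The "1 in 20" figure can be checked by simulation: when both samples really come from the same population, a test run at the 0.05 level should reject the null hypothesis about 5% of the time. The sketch below assumes a two-sample z-test with a known standard deviation of 1. The sample size, trial count, and seed are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean

def type_i_error_rate(alpha=0.05, n_per_sample=20, n_trials=2000, seed=1):
    """Fraction of trials in which a true null hypothesis is rejected.
    Both samples are drawn from the same population (mean 0, sd 1)."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided critical value
    std_error = sqrt(2 / n_per_sample)             # sd of the mean difference
    false_rejections = 0
    for _ in range(n_trials):
        a = [rng.gauss(0, 1) for _ in range(n_per_sample)]
        b = [rng.gauss(0, 1) for _ in range(n_per_sample)]
        z = (fmean(a) - fmean(b)) / std_error
        if abs(z) > z_crit:
            false_rejections += 1                  # a Type I error
    return false_rejections / n_trials

rate = type_i_error_rate()
print(f"Observed Type I error rate at alpha = 0.05: {rate:.3f}")
```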
More on Type I and II Errors
• Sometimes in medical studies we seek to lower the chance of a Type I error (rejecting a true null hypothesis), so we set a stricter level for rejecting the null hypothesis, such as p = 0.01, a 1 in 100 chance of committing a Type I error
• With the sample size held fixed, decreasing the chance of a Type I error increases the chance of a Type II error
• The only way to simultaneously reduce the risk of Type I and Type II errors is to increase the sample size
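The trade-off in the bullets above can be made concrete with a short calculation. The sketch assumes a two-sided, two-sample z-test with a known standard deviation. The true difference of 0.5 and the sample sizes 20 and 80 are hypothetical values chosen only to show the direction of the effect.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def type_ii_error(true_diff, n_per_sample, alpha, sigma=1.0):
    """Probability of failing to reject the null hypothesis when the
    population means really differ by true_diff (two-sided z-test)."""
    z_crit = STANDARD_NORMAL.inv_cdf(1 - alpha / 2)
    # Expected z statistic when the true difference is true_diff.
    shift = true_diff / (sigma * sqrt(2 / n_per_sample))
    power = (1 - STANDARD_NORMAL.cdf(z_crit - shift)) + STANDARD_NORMAL.cdf(-z_crit - shift)
    return 1 - power

for n in (20, 80):
    for alpha in (0.05, 0.01):
        beta = type_ii_error(true_diff=0.5, n_per_sample=n, alpha=alpha)
        print(f"n = {n:2d}  alpha = {alpha:.2f}  Type II error = {beta:.3f}")
```

For a fixed n, moving alpha from 0.05 to 0.01 raises the Type II error; raising n lowers it at either alpha.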
Measurements of Central Tendency
Data Set: 2 red, 3 blue, 1 green, 2 yellow, 2 black
[Figure: Frequency Histogram of the data set. X-axis: red, blue, green, yellow, black. Y-axis: frequency, 1 to 3.]
Mean - sum of data divided by the number of data points
Median - middlemost data point when data are arrayed in sequence (lowest to highest)
Mode - most frequently occurring value
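The three measures can be computed with Python's standard library, as a minimal sketch. The color data set supports only the mode, because a mean and a median need numeric values. For those, the example borrows the Exam I scores listed under Measurements of Dispersion, a pairing chosen here for illustration rather than one made on the slide.

```python
from collections import Counter
from statistics import mean, median, mode

# Data set from the slide: 2 red, 3 blue, 1 green, 2 yellow, 2 black.
colors = ["red"] * 2 + ["blue"] * 3 + ["green"] + ["yellow"] * 2 + ["black"] * 2
most_common_color, frequency = Counter(colors).most_common(1)[0]
print(f"Mode of the color data: {most_common_color} (occurs {frequency} times)")

# Exam I scores for students A to G (numeric, so all three measures apply).
exam_1 = [90, 95, 85, 90, 85, 90, 95]
print(f"Mean:   {mean(exam_1)}")     # sum of data / number of data points
print(f"Median: {median(exam_1)}")   # middle value of the sorted data
print(f"Mode:   {mode(exam_1)}")     # most frequently occurring value
```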
Measurements of Dispersion
Data Set:
Student   A    B    C    D    E    F    G
Exam I    90   95   85   90   85   90   95
Exam II   100  80   70   85   95   100  100
Range: highest and lowest values
Variance: $s^2 = \dfrac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$
Standard Deviation: the square root of variance
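As a worked example, the sketch below applies these definitions to the two exams, using the computational form of the sample variance (sum of squares minus the squared sum over n, divided by n − 1). The function names are hypothetical.

```python
from math import sqrt

def sample_variance(values):
    """Sample variance: (sum(x^2) - (sum(x))^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x_squared = sum(x * x for x in values)
    return (sum_x_squared - sum_x ** 2 / n) / (n - 1)

def describe(name, values):
    """Print range, variance and standard deviation for one data set."""
    variance = sample_variance(values)
    print(f"{name}: range {min(values)} to {max(values)}, "
          f"variance {variance:.2f}, standard deviation {sqrt(variance):.2f}")

# Scores for students A to G, taken from the data set above.
exam_1 = [90, 95, 85, 90, 85, 90, 95]
exam_2 = [100, 80, 70, 85, 95, 100, 100]

describe("Exam I", exam_1)
describe("Exam II", exam_2)
```

Both exams have the same mean (90), so the difference between them shows up only in the dispersion measures.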