Hypothesis Testing (Introduction-I). QSCI 381 – Lecture 25 (Larson and Farber, Sect 7.1). Illustrative Example-I.
Illustrative Example-I • It is claimed that 20% of a certain species of rockfish spawns each year. We sample 780 fish and find that 125 show evidence of spawning. Is the sample statistic (125/780 ≈ 0.16) “sufficiently different” from the claimed value that we can reject the claim?
Illustrative Example-II • We construct the sampling distribution for samples of size 780 if the population proportion were really 0.2. [Figure: sampling distribution under the claim, with the positions of the data and the claim marked.] • Is the difference “big enough”?
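The question on this slide can be made concrete with a short calculation. The sketch below assumes the 780 fish are an independent random sample, so that the number of spawners follows a binomial distribution when the claim is true; the variable and function names are illustrative and not from the lecture. It reports how many standard errors the sample proportion lies from the claimed 0.2, and the exact probability of seeing 125 or fewer spawners if the claim holds.

```python
import math

n = 780         # sample size, from the example
x = 125         # fish showing evidence of spawning, from the example
p_claim = 0.20  # claimed population proportion

p_hat = x / n   # sample proportion

# Standard error of the sample proportion if the claim is true.
se = math.sqrt(p_claim * (1 - p_claim) / n)
z = (p_hat - p_claim) / se  # distance from the claim, in standard errors


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


# Exact probability of 125 or fewer spawners if the claim is true
# (assumes independent fish, i.e. a binomial model).
lower_tail = sum(binom_pmf(k, n, p_claim) for k in range(x + 1))

print(f"sample proportion = {p_hat:.3f}")
print(f"standard errors from the claim = {z:.2f}")
print(f"P(X <= {x} | p = {p_claim}) = {lower_tail:.4f}")
```

The sample proportion sits roughly 2.8 standard errors below the claimed value, so a result this low would be unusual if the claim were true.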
Sampling Distributions • A sampling distribution is a (continuous) probability distribution for a sample statistic (given values for the population parameters). • One can think of the sampling distribution for the sample mean as a histogram of the values the sample mean could take across a collection of (very) many samples of size n.
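The "histogram of many sample means" picture can be simulated directly. This is a minimal sketch: the Uniform(0, 1) population, the sample size of 25, the 5000 repetitions and the seed are all arbitrary choices made for illustration, not values from the lecture.

```python
import random
import statistics


def simulate_sample_means(n, n_samples, seed=1):
    """Draw n_samples samples of size n and return each sample's mean.

    The population is Uniform(0, 1), a stand-in chosen for illustration.
    """
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.random() for _ in range(n)])
        for _ in range(n_samples)
    ]


means = simulate_sample_means(n=25, n_samples=5000)

# Crude text histogram of the simulated sampling distribution.
bin_width = 0.02
counts = {}
for m in means:
    b = int(m / bin_width)
    counts[b] = counts.get(b, 0) + 1
for b in sorted(counts):
    print(f"{b * bin_width:.2f} {'#' * (counts[b] // 20)}")

print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
```

The simulated means should centre near the population mean (0.5) with a spread close to the population standard deviation divided by the square root of n.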
Hypothesis Tests-I • A hypothesis test is a process that uses sample statistics to test a claim about the value of a population parameter. • Colloquially, what we are going to do is to see whether the data we have “could have happened” if the claim was true.
Hypothesis Tests-II • A claim about a population parameter is called a statistical hypothesis. To test a statistical hypothesis, we state a pair of hypotheses – one that represents the claim and the other its complement. • We use statistical methods to determine whether or not we can reject the claim.
Hypothesis Tests-III • A null hypothesis H0 is a statistical hypothesis that contains a statement of equality, e.g. ≤, ≥, or =. • The alternative hypothesis Ha is the complement of the null hypothesis. It is a statement that must be true if H0 is false. • H0 is read as “H subzero” or “H naught”; Ha is read as “H sub-a”. • Note that it is sometimes necessary to define the claim as the alternative hypothesis.
Null and Alternative Hypotheses • First determine the claim and hence the null hypothesis. The first two hypotheses have one-sided alternatives while the third hypothesis has a two-sided alternative.
Null and Alternative Hypotheses • What are the null and alternative hypotheses for: • The density of salmon is 100 fish / ha. • The escapement is 20%. • More than 60% of the population is mature. • The number of recaptures is 7. • The number of recaptures is not 7.
Null and Alternative Hypotheses • Notes: • The null hypothesis and the alternative hypothesis should be determined before the data are collected. • Always use a two-sided alternative unless there are good theoretical reasons for using a one-sided alternative. This is particularly true if the hypotheses are constructed after the data are collected.
Testing Hypotheses • To test a hypothesis: • We assume the null hypothesis is true. • We determine the sampling distribution of the sample statistic under the null hypothesis. • We compare the observed data with that sampling distribution. • There are two outcomes from this: • We reject the null hypothesis. • We fail to reject the null hypothesis. • Note: we do not accept the null hypothesis – why not?
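The steps above can be sketched for the rockfish example as a two-sided test of H0: p = 0.2. Several choices here are assumptions rather than part of the lecture: the normal approximation to the sampling distribution, the function name `proportion_z_test`, and the default cutoff `alpha=0.05` (the lecture's "level of significance", defined on a later slide).

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(x, n, p0, alpha=0.05):
    """Two-sided test of H0: p = p0 against Ha: p != p0."""
    # Step 1: assume H0 is true, so the population proportion is p0.
    # Step 2: sampling distribution of the sample proportion under H0
    #         (normal approximation; reasonable when n*p0 and n*(1-p0)
    #         are both large).
    se = sqrt(p0 * (1 - p0) / n)
    # Step 3: compare the data with that distribution.
    z = (x / n - p0) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    decision = "reject H0" if p_value <= alpha else "fail to reject H0"
    return z, p_value, decision


z, p_value, decision = proportion_z_test(x=125, n=780, p0=0.20)
print(f"z = {z:.2f}, p-value = {p_value:.4f}: {decision}")
```

The function only ever returns "reject H0" or "fail to reject H0", mirroring the two outcomes listed on the slide; it never reports that H0 has been accepted.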
Type I and Type II Errors-I • A type I error occurs if the null hypothesis is rejected when it is actually true. • A type II error occurs if the null hypothesis is not rejected when it is actually false. • Errors occur because the sample is not the same as the population.
Type I and Type II Errors-III • The consequences of Type I and Type II errors can be quite different and very important. Consider the claims: • By implementing this management measure, the rate of recovery will be at least 1% per annum. • We sampled 4 clams and claim that the proportion of the population with a given disease is 3% or less. • We claim that male and female gnu grow at the same rate. We measure 10,000 gnu and compare the mean lengths of males and females. • How would you balance Type I and Type II errors in these cases?
Significance • In a hypothesis test, the level of significance is the maximum allowable probability of making a type I error (denoted α). • The probability of making a type II error is denoted by β. • There are three commonly used levels of significance (α = 0.1, 0.05 and 0.01) – they are all arbitrary.
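A small simulation can illustrate what α and β mean in practice. The sketch below is built on assumptions chosen only for illustration: a two-sided normal-approximation test of H0: p = 0.2, samples of 200 fish, α = 0.05, a hypothetical true proportion of 0.16 for the case where H0 is false, and an arbitrary random seed. It estimates how often the test rejects a true H0 (type I error) and how often it fails to reject a false H0 (type II error).

```python
import random
from math import sqrt
from statistics import NormalDist


def rejects_h0(x, n, p0, alpha):
    """Two-sided normal-approximation test of H0: p = p0."""
    z = (x / n - p0) / sqrt(p0 * (1 - p0) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value <= alpha


def rejection_rate(p_true, p0, n, alpha, n_sims, rng):
    """Fraction of simulated samples in which H0: p = p0 is rejected."""
    rejections = 0
    for _ in range(n_sims):
        # Number of "successes" in one simulated sample of size n.
        x = sum(rng.random() < p_true for _ in range(n))
        rejections += rejects_h0(x, n, p0, alpha)
    return rejections / n_sims


rng = random.Random(381)  # arbitrary seed for repeatability
p0, n, alpha, n_sims = 0.20, 200, 0.05, 2000

# H0 true: every rejection is a type I error.
type_1_rate = rejection_rate(p0, p0, n, alpha, n_sims, rng)
# H0 false (true proportion 0.16, a hypothetical value): every
# non-rejection is a type II error.
type_2_rate = 1 - rejection_rate(0.16, p0, n, alpha, n_sims, rng)

print(f"estimated type I error rate:  {type_1_rate:.3f} (alpha = {alpha})")
print(f"estimated type II error rate: {type_2_rate:.3f}")
```

The estimated type I error rate should land near α, though not exactly on it, because counts are discrete and the test uses an approximation. The type II error rate depends on how far the true proportion is from the null value and on the sample size, which is why β cannot be fixed in advance the way α is.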