Hypothesis testing A common aim in many studies is to check whether the data agree with certain predictions. These predictions are hypotheses about variables measured in the study. A hypothesis is a statement about some characteristic of a variable or a collection of variables. A significance test is a way of statistically testing a hypothesis by comparing the data to values predicted by the hypothesis. Data that fall far from the predicted values provide evidence against the hypothesis. All significance tests have five elements: assumptions, hypotheses, test statistic, p-value, and conclusion. All significance tests require certain assumptions for the tests to be valid. These assumptions refer, e.g., to the type of data, the form of the population distribution, method of sampling, and sample size.
Hypothesis testing Example: a firm produces metal boxes and wants to evaluate the production process. They want to be sure that the longest side of the box is 368 mm. They take a sample of 25 boxes. If the length of the side turns out to be different, the whole production process will need a correction.
Hypothesis testing A significance test considers two hypotheses about the value of a population parameter: the null hypothesis and the alternative hypothesis. The null hypothesis H0 is the hypothesis that is directly tested. This is usually a statement that the parameter has a value corresponding to, in some sense, no effect. The alternative hypothesis Ha is a hypothesis that contradicts the null hypothesis. This hypothesis states that the parameter falls in some alternative set of values to what the null hypothesis specifies.
Hypothesis testing • A significance test analyzes the strength of sample evidence against the null hypothesis. The test is conducted to investigate whether the data contradict the null hypothesis, hence suggesting that the alternative hypothesis is true. The alternative hypothesis is judged acceptable if the sample data are inconsistent with the null hypothesis. That is, the alternative hypothesis is supported if the null hypothesis appears to be incorrect. The hypotheses are formulated before collecting or analyzing the data. • The test statistic is a statistic calculated from the sample data to test the null hypothesis. This statistic typically involves a point estimate of the parameter to which the hypotheses refer.
Hypothesis testing Decision rule: The sampling distribution of the test statistic is divided into two regions: the region of rejection and the region of acceptance. If the value of the test statistic falls in the region of acceptance, the null hypothesis cannot be rejected. If it falls in the region of rejection, the null hypothesis must be rejected.
Hypothesis testing To decide on the null hypothesis, we need to find the critical value of the test statistic. This is the value that divides the acceptance region from the rejection region. [Figure: sampling distribution with a central acceptance region and a rejection region in each tail, separated by the two critical values]
Hypothesis testing • Consider the collection of test statistic values that contradict H0 at least as strongly as the value actually observed. The p-value is the probability, if H0 were true, that the test statistic would fall in this collection of values. • The smaller the p-value, the more strongly the data contradict H0. • For example, a p-value such as 0.3 or 0.8 indicates that the observed data would not be unusual if H0 were true. But a p-value such as 0.001 means that such data would be very unlikely if H0 were true. This provides strong evidence against H0.
Test for the mean (known variance) To verify that the mean of a population is equal to a certain value μ, against the alternative hypothesis of a value different from it, if we know σ, we can use the test statistic Z: $Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}$, where $\bar{X}$ is the sample mean and n is the sample size. If X is normally distributed, then under H0, Z follows a standard normal distribution. If Z takes a value near 0 we do not reject H0; otherwise we reject H0 (two-sided test).
Test for the mean (known variance) Critical value approach (level of significance of 0.05). Decision rule: reject H0 if Z > +1.96 or if Z < -1.96; otherwise do not reject H0. [Figure: standard normal curve with a central acceptance region and a rejection region in each tail, separated by the critical values -1.96 and +1.96]
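The critical value 1.96 does not have to be read from a table. As a minimal sketch, assuming Python 3.8 or later with its standard `statistics` module, it can be recovered from the significance level. The function name `two_sided_critical_value` is our own, chosen for illustration.

```python
from statistics import NormalDist


def two_sided_critical_value(alpha: float) -> float:
    """Return z such that P(|Z| > z) = alpha for a standard normal Z."""
    # Each tail receives alpha / 2, so we look up the (1 - alpha / 2) quantile.
    return NormalDist().inv_cdf(1 - alpha / 2)


z_crit = two_sided_critical_value(0.05)
print(round(z_crit, 2))  # 1.96
```

With a different level of significance, for example 0.01, the same call gives the corresponding critical value.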
Test for the mean (known variance) Example: a firm produces metal boxes and wants to evaluate the production process. They want to be sure that the longest side of the box is 368 mm. They take a sample of 25 boxes. The standard deviation is 15 mm and the sample mean is 372.5 mm. H0: μ = 368 H1: μ ≠ 368 The test statistic is $Z = \frac{372.5 - 368}{15 / \sqrt{25}} = 1.5$. Since -1.96 < 1.5 < +1.96, the value of the test statistic falls in the acceptance region and H0 cannot be rejected. [Figure: standard normal curve with the acceptance region between the two rejection regions]
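The box example can be checked with a few lines of Python. This is only a sketch of the critical value approach at the 0.05 level. The function name `z_statistic` and the variable names are our own, and the critical value 1.96 is taken from the decision rule for the known-variance test.

```python
from math import sqrt


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """Z = (sample mean - hypothesized mean) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


# Values from the metal box example.
z = z_statistic(sample_mean=372.5, mu0=368, sigma=15, n=25)

# Two-sided decision rule at the 0.05 level of significance.
reject_h0 = abs(z) > 1.96

print(z, reject_h0)  # 1.5 False
```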
P-value approach • Decision rule: • if the p-value is greater than or equal to α, the null hypothesis is not rejected. • if the p-value is less than α, the null hypothesis is rejected.
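As an illustration, the p-value approach can be applied to the metal box data (sample mean 372.5 mm, σ = 15 mm, n = 25, H0: μ = 368). This is a sketch under the known-variance, two-sided setting. The helper names `two_sided_p_value` and `decide` are hypothetical, and the choice α = 0.05 is an assumption.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(z: float) -> float:
    """Probability, under H0, of a |Z| at least as large as the observed one."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(p_value: float, alpha: float) -> str:
    """Apply the decision rule: reject H0 only when the p-value is below alpha."""
    return "reject H0" if p_value < alpha else "do not reject H0"


z = (372.5 - 368) / (15 / sqrt(25))  # observed test statistic
p = two_sided_p_value(z)

print(round(p, 4), decide(p, alpha=0.05))
```

The p-value is well above 0.05, so this approach agrees with the critical value approach: H0 is not rejected.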
Test for the mean (unknown variance) Usually we do not know σ and we estimate it through the sample standard deviation S. In this case the test statistic to be used is t: $t = \frac{\bar{X} - \mu}{S / \sqrt{n}}$. It has the Student's t distribution with n − 1 degrees of freedom if H0 is true. Also in this case we can use either the critical value approach or the p-value approach. The tables to be used are those of the Student's t distribution.
Test for the mean (unknown variance) Example: t distribution with a level of significance of 0.05 and 11 degrees of freedom. [Figure: t distribution with a central acceptance region and a rejection region in each tail, separated by the two critical values]
Test for the mean (unknown variance) Example: The following data are the amounts in dollars in a random sample of 12 sales invoices: 108.98, 152.22, 111.45, 110.59, 127.46, 107.26, 93.32, 91.97, 111.56, 75.71, 128.58, 135.11. H0: μ = 120 H1: μ ≠ 120 α = 0.05
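…Example The sample mean is 112.85 and the sample standard deviation is S = 20.80, so $t = \frac{112.85 - 120}{20.80 / \sqrt{12}} = -1.19$. Since -2.201 < t = -1.19 < 2.201, we do not reject H0.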
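The invoice example can be reproduced from the raw data. This is a minimal sketch using only the Python standard library. The critical value 2.201 is copied from the example rather than computed, because the standard library has no t distribution, and the variable names are our own.

```python
from math import sqrt
from statistics import mean, stdev

# Amounts in dollars for the 12 sampled sales invoices.
invoices = [108.98, 152.22, 111.45, 110.59, 127.46, 107.26,
            93.32, 91.97, 111.56, 75.71, 128.58, 135.11]

mu0 = 120        # value of the mean under H0
t_crit = 2.201   # from the example: alpha = 0.05, 11 degrees of freedom

n = len(invoices)
x_bar = mean(invoices)
s = stdev(invoices)  # sample standard deviation (divides by n - 1)

t = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = abs(t) > t_crit

print(round(x_bar, 2), round(s, 2), round(t, 2), reject_h0)
```

The computed t is about -1.19, which lies between -2.201 and 2.201, so H0 is not rejected.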