Hypothesis and Testing of Hypothesis Prof. KG Satheesh Kumar Asian School of Business
Illustration 1: Court Verdict • Under the Indian legal system, an accused is presumed innocent until proved guilty “beyond a reasonable doubt”. • This presumption is the null hypothesis. We may write: H0: Accused is innocent • The court holds the null hypothesis as true until the evidence proves, beyond reasonable doubt, that it is false
The Verdict • If H0 is proved to be false, it is rejected and the alternative hypothesis, H1, is accepted • We may write: H1: Accused is not innocent, hence guilty. • If H0 cannot be proved false beyond reasonable doubt, it cannot be rejected and is therefore retained (in everyday terms, “accepted”)
Illustration 2: Bottling Cola • The company claims a 2 litre volume; a consumer advocate wants to test the claim H0: Mean Volume >= 2 litres H1: Mean Volume < 2 litres • Consumers are happy, but the company suspects that there is overfilling H0: Mean Volume <= 2 litres H1: Mean Volume > 2 litres
Bottling Cola – Engineer’s View • The plant engineer wants to take corrective action if the average volume is either more or less than 2 litres H0: Mean Volume = 2 litres H1: Mean Volume ≠ 2 litres
Prerequisites for this chapter • Random variable and its probability distribution / probability density function • The Normal Distribution • Sampling and Sampling Distribution • Estimation
Hypothesis • A thesis is something that has been proven to be true • A hypothesis is something that has not yet been proven to be true • Hypothesis testing is the process of determining, through statistical methods, whether or not a given hypothesis may be accepted as true • Hypothesis testing is an important part of statistical inference – making decisions about the population based on sample evidence
Setting up and testing hypotheses is an essential part of statistical inference. In order to formulate such a test, usually some theory has been put forward, either because it is believed to be true or because it is to be used as a basis for argument, but has not been proved. E.g.: Claiming that a new drug is better than the current drug for treatment of the same symptoms.
The question of interest is simplified into two competing claims / hypotheses between which we have a choice: the null hypothesis, denoted H0, against the alternative hypothesis, denoted H1. These two competing claims / hypotheses are not, however, treated on an equal basis: special consideration is given to the null hypothesis. In fact, only the null hypothesis is tested; the decision is whether or not to reject it.
Null and Alternative Hypothesis • A hypothesis is a testable assertion about the population (the value of a parameter) • The null hypothesis (Ho) is an assertion held true unless we have sufficient statistical evidence to conclude otherwise • The alternative hypothesis (H1 or Ha) is the negation of the null hypothesis. • The two are mutually exclusive. One and only one of the two can be true.
Determining the null hypothesis • The null hypothesis is often a claim made by someone, and the alternative hypothesis is the suspicion about that claim. • There may be no claim; then what we wish to demonstrate is the alternative hypothesis, and its negation is the null hypothesis. • H1 describes the situation we believe to be true and Ho describes the situation contrary to what we believe about the population. • The null hypothesis is the one which, when true, calls for no corrective action. If the alternative hypothesis is true, some corrective action would be necessary. • If the observed statistic is very unlikely under Ho, we reject Ho • Note: The equality sign always appears in Ho.
Examples of H0 and H1 • Ex 1: A pharmaceutical company claims that four out of five doctors prescribe the pain medicine it produces. Set up Ho and H1 to test this claim. (Answer) • Ex 2: A medicine is effective only if the concentration of a certain chemical is at least 200 ppm. At the same time, the medicine would produce an undesirable side effect if the concentration of the same chemical exceeds 200 ppm. Set up H0, H1. (Answer) • Ex 3: A maker of golf balls claims that the variance of the weights of the company’s golf balls is controlled to within 0.0028 oz2. Set up hypotheses to test this claim (Ans)
More examples • Ex 4: The average cost of a traditional open-heart surgery is claimed to be $49,160. If you suspect that the claim exaggerates the cost, how would you set up the hypotheses? (Ans) • Ex 5: A vendor claims that he can fulfill an order in at most six working days. You suspect that the average is greater than six working days and want to test the hypothesis. How will you set up the hypotheses? (Ans)
More examples • Ex 6: At least 20% of the visitors to a particular store are said to end up placing an order. How will you set up hypotheses to test the claim? (Answer) • Ex 7: Web surfers will lose interest if downloading takes more than 12 seconds. If you wish to test the effectiveness of a newly designed web page in regard to download time, how will you set up the null and alternative hypotheses? (Answer)
Common types of hypothesis tests • Parametric tests of hypotheses about population parameters: • Mean (μ), proportion (p) and variance (σ²) using z, t and chi-square distributions • Test of difference between two population means using t and z distributions • paired observations; independent observations • Test of difference between two population proportions using z distribution • Test of equality of two population variances using F-distribution • Analysis of variance for comparing several population means • Parametric tests are more powerful than non-parametric tests because the data are derived from interval and ratio measurements • Non-parametric tests are used to test hypotheses with nominal and ordinal data • The Sign Test, the Runs Test, Wald-Wolfowitz Test, Mann-Whitney U Test, Kruskal-Wallis Test, Chi-Square Test for Goodness of Fit • An important assumption for parametric tests is that the population is approximately normal (or the sample size is large). No such assumption is required for non-parametric tests, which are hence also called distribution-free tests.
Steps in Hypothesis Testing • Set up the null and alternative hypotheses • Decide on the significance level, α (standard values: 10%, 5%, 1%) • Using a random sample, get sample statistic and then calculate test statistic • Find the table value of test statistic corresponding to the required α value • Compare the calculated and table values of the test statistic and interpret. • Note: Only the null hypothesis is actually tested
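The five steps can be traced in a short script. This is a minimal sketch in Python (not part of the original slides); the numbers are illustrative and anticipate Ex 8 later in this deck, and scipy's t distribution supplies the table value.

```python
# A minimal sketch of the five steps for a right-tailed test of a mean.
# The numbers (n, x_bar, s, mu0) are illustrative; they anticipate Ex 8.
import math
from scipy import stats

# Step 1: set up hypotheses: Ho: mu <= 247 vs H1: mu > 247
mu0 = 247.0

# Step 2: decide the significance level
alpha = 0.05

# Step 3: sample statistic -> test statistic (t, since sigma is unknown)
n, x_bar, s = 60, 250.0, 12.0
se = s / math.sqrt(n)                      # standard error of the mean
t_calc = (x_bar - mu0) / se

# Step 4: table (critical) value of t for alpha with n - 1 df
t_table = stats.t.ppf(1 - alpha, df=n - 1)

# Step 5: compare and interpret
print(f"t = {t_calc:.3f}, table value = {t_table:.3f}")
print("Reject Ho" if t_calc > t_table else "Do not reject Ho")
```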
Type I, Type II errors • Four outcomes are possible • Ho is true and is not rejected (Not an error) • Ho is true, but is rejected (Type I error) • Ho is false, but not rejected (Type II error) • Ho is false and is rejected (Not an error) • Type I error is when we reject a true null hypothesis • Type II error is when we do not reject a false null hypothesis
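To see what the Type I error rate means operationally, here is a small simulation sketch (an illustration, not from the slides): when Ho is actually true, a two-tailed z-test at α = 5% rejects in roughly 5% of repeated samples. All numbers are invented for demonstration.

```python
# Simulation sketch: when Ho is true, a two-tailed z-test at alpha = 5%
# commits a Type I error (rejects a true Ho) in about 5% of samples.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, mu0, sigma, n, trials = 0.05, 100.0, 15.0, 30, 10_000
z_crit = stats.norm.ppf(1 - alpha / 2)

rejections = 0
for _ in range(trials):
    sample = rng.normal(mu0, sigma, n)             # population satisfies Ho
    z = (sample.mean() - mu0) / (sigma / np.sqrt(n))
    rejections += abs(z) > z_crit                  # two-tailed rejection rule

print(f"Observed Type I error rate: {rejections / trials:.3f} (target: {alpha})")
```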
One-tailed and two-tailed tests • Left-tailed test: when Ho makes a ">=" claim, rejection occurs when the statistic falls far below the hypothesized value, i.e. on the left tail. • Right-tailed test: Ho makes a "<=" claim and rejection occurs on the right tail • Two-tailed test: Ho makes a "=" claim and rejection occurs on both tails. • Rejection and non-rejection regions are marked on the distribution of the sample statistic and the test statistic for interpreting the test results.
The p-value • The p-value • is the probability, computed assuming the null hypothesis is actually true, of getting sample evidence at least as unfavorable to H0 as the observed sample statistic • is a “credibility rating” for H0 • is the smallest significance level at which H0 would be rejected • is not, strictly speaking, the probability that Ho is true given the sample evidence, though it is often informally read that way
Significance Level, α • This is the maximum “set” probability of a Type I error. Accordingly, α decides the policy to reject / accept H0. • Policy: If the p-value is less than α, reject H0 • If the p-value is not less than α, we do not reject H0; but this does not mean that H0 is true, only that we do not have sufficient evidence to reject H0. • The selected value of α indirectly decides the probability of making a Type II error. We use the symbol β for this probability.
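The reject-if-p-below-α policy is easy to demonstrate. A hedged sketch, assuming a right-tailed z-test with an illustrative calculated statistic of 1.936:

```python
# Sketch of the policy: reject Ho when the p-value is below alpha.
from scipy import stats

z_calc = 1.936                             # illustrative test statistic
p_value = 1 - stats.norm.cdf(z_calc)       # right-tail area under Ho
for alpha in (0.10, 0.05, 0.01):
    decision = "reject Ho" if p_value < alpha else "do not reject Ho"
    print(f"alpha = {alpha:.2f}: p = {p_value:.4f} -> {decision}")
```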
Confidence level • The fraction 1 – α is called the confidence level. If α = 5%, the confidence level is 95%, which means we want to be at least 95% confident that Ho is false before we reject it. • Choosing an optimal α is a compromise between Type I and Type II errors, and depends on the cost of each type of error (producer’s risk vs consumer’s risk).
Type II Error and Power of a Test • The probability of a Type II error, β, is difficult to estimate; it depends on α, the sample size, and the actual population parameter. • Power of a Test • The complement of the Type II error probability, 1 – β, is called the power of the test. It is the probability that a false null hypothesis will be detected by the test.
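Power can be computed once a specific alternative is assumed. The sketch below (illustrative, not from the slides) fixes a hypothetical true mean mu1 and finds β and 1 – β for a right-tailed z-test; the numbers echo the medicine example used later in the deck.

```python
# Power sketch for a right-tailed z-test: beta depends on alpha, n, and
# the assumed true mean mu1. All numbers are illustrative.
import math
from scipy import stats

alpha, mu0, mu1, sigma, n = 0.05, 247.0, 250.0, 12.0, 60
se = sigma / math.sqrt(n)
x_crit = mu0 + stats.norm.ppf(1 - alpha) * se   # rejection boundary for x_bar
beta = stats.norm.cdf((x_crit - mu1) / se)      # P(not reject | mu = mu1)
print(f"beta = {beta:.3f}, power = {1 - beta:.3f}")
```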
Test Statistic • Test Statistic • A random variable, calculated from the sample evidence, and having a well-known probability distribution • The most commonly used are Z, t, χ² and F. The distributions of these random variables are well known and tables are available. • See tables of Z and t distributions.
Test statistic used • Test statistic for the mean is z or t (see next slide) • Test statistic = (Sample mean - hypothesized population mean) / SE, where SE is the Standard Error • Test statistic for a proportion (assuming a large sample) is Z • Z = (sample proportion - p) / SE, where p is the hypothesized population proportion and SE = √(pq/n); q = 1 - p • Test statistic for variance: • χ² = (n - 1)S²/σ², where S² is the sample variance and σ² is the hypothesized population variance
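The proportion and variance statistics above translate directly into code. A sketch with illustrative inputs (the proportion numbers borrow Ex 12 below; the variance inputs n2 and s2 are invented for demonstration):

```python
# Sketches of the proportion and variance test statistics.
import math

# Proportion (large sample): Z = (sample proportion - p) / sqrt(pq/n)
p0, n, p_hat = 0.16, 300, 39 / 300        # numbers from Ex 12
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
print(f"z = {z:.3f}")                     # about -1.42

# Variance: chi-square = (n - 1) * S^2 / sigma^2 with n - 1 df
n2, s2, sigma0_sq = 25, 0.0031, 0.0028    # n2, s2 invented for illustration
chi2 = (n2 - 1) * s2 / sigma0_sq
print(f"chi-square = {chi2:.2f}, df = {n2 - 1}")
```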
Test statistic for population mean • When the null hypothesis is about the population mean, the test statistic is: • Z if the population standard deviation, σ, is known • t if only the sample standard deviation, S, is known When Z is used, Z = (sample mean - μ)/(σ/√n) When t is used, t = (sample mean - μ)/(S/√n) In the latter case, use n - 1 degrees of freedom • It is necessary that either the population is normal or the sample size is large enough
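Why does the choice between σ and S matter? A quick comparison sketch: at the same α, the t critical value exceeds the z critical value, and the gap shrinks as n grows, which is why t is needed when S replaces σ in small samples.

```python
# Compare z and t critical values at the same alpha (illustrative).
from scipy import stats

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha)             # one-tailed z critical value
for n in (5, 15, 30, 60, 120):
    t_crit = stats.t.ppf(1 - alpha, df=n - 1)  # t widens for small samples
    print(f"n = {n:4d}: t = {t_crit:.3f}  vs  z = {z_crit:.3f}")
```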
Examples on Hypothesis Testing • Ex 8: A certain medicine is supposed to contain an average of 247 ppm of a chemical. If the concentration exceeds 247 ppm, the drug may cause undesirable side effects. A random sample of 60 portions is tested and the sample mean is found to be 250 ppm and sample standard deviation 12 ppm. Perform a statistical hypothesis test at 1% and 5% significance. (Ans)
Ex 9: In the above example, assume that there are no side effects, but we are told that the drug may be ineffective if the concentration is below 247 ppm. The sample evidence is the same as before. Formulate and test the hypothesis. (Ans) • Ex 10: In the above example, assume that side effects and effectiveness are both to be considered. The sample evidence is the same. Formulate and test the hypothesis. (Ans)
Ex 11: Certain eggs are stated to have reduced cholesterol content, with an average of only 2.5% cholesterol. A concerned health group wants to test whether the claim is true. A random sample of 100 eggs reveals a sample average content of 3.0% cholesterol with a standard deviation of 2.8%. Does the health group have cause for action? (Ans) • Ex 12: A survey of medical schools indicates that 16% of the faculty positions are vacant. A placement agency conducts a survey to test this claim, using a random sample of 300 faculty positions and finds that 39 out of the 300 are vacant. Test the claim at 5% level of significance (Ans)
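Worked answers for Ex 11 are not included in these slides, but the numbers given are enough for a quick check. A hedged sketch, assuming the health group tests Ho: μ <= 2.5 against H1: μ > 2.5 (action is warranted only if cholesterol is higher than claimed):

```python
# Hedged check of Ex 11, assuming Ho: mu <= 2.5 vs H1: mu > 2.5.
import math
from scipy import stats

n, x_bar, s, mu0 = 100, 3.0, 2.8, 2.5
se = s / math.sqrt(n)                      # 2.8 / 10 = 0.28
z = (x_bar - mu0) / se                     # large n, so z is reasonable
p_value = 1 - stats.norm.cdf(z)
print(f"z = {z:.3f}, p = {p_value:.4f}")   # z ~ 1.79, p ~ 0.037
```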
Random Variable • A variable associated with a random experiment like drawing a random sample from the population – the variable may be the mean, proportion, or variance • A random variable is an uncertain quantity whose value depends on chance • A random variable (denoted by X) takes a range of discrete values with some discrete probability distribution, P(X), or continuous values with some probability density, f(X). • P(X) or f(X), as the case may be, can be used to find the probability that the random variable takes specific values or a range of values
The Normal Distribution • If a random variable X is affected by many independent causes, none of which is overwhelmingly large, the probability distribution of X closely follows the normal distribution. Then X is called a normal variate and we write X ~ N(μ, σ²), where μ is the mean and σ² is the variance • A normal pdf is completely defined by its mean, μ, and variance, σ². The square root of the variance is called the standard deviation, σ. • If several independent random variables are normally distributed, their sum will also be normally distributed with mean equal to the sum of the individual means and variance equal to the sum of the individual variances.
The area under any pdf between two given values of X is the probability that X falls between these two values
Standard Normal Variate, Z • The SNV, Z, is the normal random variable with mean 0 and standard deviation 1 • Tables are available for standard normal probabilities • X and Z are connected by: Z = (X - μ)/σ and X = μ + σZ • The area under the X curve between X1 and X2 is equal to the area under the Z curve between Z1 and Z2.
Standard Normal Probabilities (Table of z distribution)
z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09
0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1  0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753
0.2  0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141
0.3  0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517
0.4  0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879
0.5  0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224
0.6  0.2257 0.2291 0.2324 0.2357 0.2389 0.2422 0.2454 0.2486 0.2517 0.2549
0.7  0.2580 0.2611 0.2642 0.2673 0.2704 0.2734 0.2764 0.2794 0.2823 0.2852
0.8  0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133
0.9  0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389
1.0  0.3413 0.3438 0.3461 0.3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621
1.1  0.3643 0.3665 0.3686 0.3708 0.3729 0.3749 0.3770 0.3790 0.3810 0.3830
1.2  0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.3  0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177
1.4  0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 0.4306 0.4319
1.5  0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441
1.6  0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545
1.7  0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633
1.8  0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706
1.9  0.4713 0.4719 0.4726 0.4732 0.4738 0.4744 0.4750 0.4756 0.4761 0.4767
2.0  0.4772 0.4778 0.4783 0.4788 0.4793 0.4798 0.4803 0.4808 0.4812 0.4817
2.1  0.4821 0.4826 0.4830 0.4834 0.4838 0.4842 0.4846 0.4850 0.4854 0.4857
2.2  0.4861 0.4864 0.4868 0.4871 0.4875 0.4878 0.4881 0.4884 0.4887 0.4890
2.3  0.4893 0.4896 0.4898 0.4901 0.4904 0.4906 0.4909 0.4911 0.4913 0.4916
2.4  0.4918 0.4920 0.4922 0.4925 0.4927 0.4929 0.4931 0.4932 0.4934 0.4936
2.5  0.4938 0.4940 0.4941 0.4943 0.4945 0.4946 0.4948 0.4949 0.4951 0.4952
2.6  0.4953 0.4955 0.4956 0.4957 0.4959 0.4960 0.4961 0.4962 0.4963 0.4964
2.7  0.4965 0.4966 0.4967 0.4968 0.4969 0.4970 0.4971 0.4972 0.4973 0.4974
2.8  0.4974 0.4975 0.4976 0.4977 0.4977 0.4978 0.4979 0.4979 0.4980 0.4981
2.9  0.4981 0.4982 0.4982 0.4983 0.4984 0.4984 0.4985 0.4985 0.4986 0.4986
3.0  0.4987 0.4987 0.4987 0.4988 0.4988 0.4989 0.4989 0.4989 0.4990 0.4990
3.1  0.4990 0.4991 0.4991 0.4991 0.4992 0.4992 0.4992 0.4992 0.4993 0.4993
3.2  0.4993 0.4993 0.4994 0.4994 0.4994 0.4994 0.4994 0.4995 0.4995 0.4995
3.3  0.4995 0.4995 0.4995 0.4996 0.4996 0.4996 0.4996 0.4996 0.4996 0.4997
3.4  0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4997 0.4998
The z-value is on the left and top margins and the probability (the area under the curve between 0 and z) is in the body of the table
Sampling Distribution • The sampling distribution of x̄ is the probability distribution of all possible values of x̄ for a given sample size n taken from the population. • According to the Central Limit Theorem, for a large enough sample size n, the sampling distribution is approximately normal with mean μ and standard deviation σ/√n. This standard deviation is called the standard error. • The CLT holds for non-normal populations also and states: For large enough n, x̄ ~ N(μ, σ²/n)
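The CLT claim is easy to check by simulation. A sketch (illustrative, using an exponential population, which is decidedly non-normal; for the exponential, σ equals the mean μ):

```python
# CLT simulation sketch: sample means from an exponential population
# have mean close to mu and SD close to sigma/sqrt(n).
import numpy as np

rng = np.random.default_rng(1)
mu, n, reps = 2.0, 50, 10_000              # illustrative numbers
means = rng.exponential(mu, size=(reps, n)).mean(axis=1)
print(f"mean of x_bar: {means.mean():.3f} (theory: {mu})")
print(f"SD of x_bar: {means.std(ddof=1):.3f} (theory: {mu / np.sqrt(n):.3f})")
```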
Estimation • The value of an estimator (see next slide), obtained from a sample, can be used to estimate the value of the population parameter. Such an estimate is called a point estimate. • This is a 50:50 estimate, in the sense that the actual parameter value is equally likely to be on either side of the point estimate. • A more useful estimate is the interval estimate, where an interval is specified along with a measure of confidence (90%, 95%, 99%, etc.) • The interval estimate with its associated measure of confidence is called a confidence interval. • A confidence interval is a range of numbers believed to include the unknown population parameter, with a certain level of confidence
Estimators • Population parameters (μ, σ², p) and sample statistics (x̄, s², ps) • An estimator of a population parameter is a sample statistic used to estimate the parameter • The statistic x̄ is an estimator of the parameter μ • The statistic s² is an estimator of the parameter σ² • The statistic ps is an estimator of the parameter p
Ex 1 The claim is the null hypothesis and its negation is the alternative hypothesis. If p denotes the proportion of doctors prescribing the medicine, we set the hypotheses as: Ho: p >= 0.8 H1: p < 0.8
Ex 2 The null hypothesis is the one which calls for no corrective action and the alternative hypothesis is the one that calls for corrective action. If μ denotes the concentration of the chemical, we set up the hypotheses as: Ho: μ = 200 ppm H1: μ ≠ 200 ppm
Ex 3 The claim is the null hypothesis. Using σ² to denote the variance, the hypotheses can be set up as: Ho: σ² <= 0.0028 oz² H1: σ² > 0.0028 oz²
Ex 4 The claim is the null hypothesis and your suspicion (belief) is the alternative hypothesis. If μ denotes the average cost, the hypotheses are: Ho: μ >= $49,160 H1: μ < $49,160
Ex 5 The claim is the null hypothesis and your suspicion is the alternative hypothesis. If μ denotes the average number of days to fulfill an order, the hypotheses are: Ho: μ <= 6 H1: μ > 6
Ex 6 The claim becomes the null hypothesis. Let p denote the proportion of visitors placing an order. Then the hypotheses will be set up as: Ho: p >= 0.20 H1: p < 0.20
Ex 7 Corrective action is needed if the average downloading time exceeds 12 seconds, so this forms H1. Let μ denote the average download time. Then: Ho: μ <= 12 s H1: μ > 12 s
Ex 8 Let μ denote the average ppm of the chemical. The hypotheses are: Ho: μ <= 247 H1: μ > 247 Sample statistic: x̄ = 250; sample SD, s = 12; sample size n = 60 (large sample); standard error, SE = 12/√60 = 1.55. Right-tailed test. Since we know only the sample SD, the test statistic follows the t-distribution with 59 degrees of freedom. Test statistic: t = (250 - 247)/1.55 = 1.936 From the table of the t-distribution, the one-tailed t-values for 59 df are: t5% = 1.671 and t1% = 2.390 Comparing the calculated and table values of the test statistic, we reject the null hypothesis at the 5% level of significance (95% confidence), but do not reject the null hypothesis at the 1% level of significance (99% confidence level)
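As a cross-check on Ex 8 (a sketch, assuming scipy is available): the exact one-tailed p-value for t = 1.936 with 59 df lies between 1% and 5%, consistent with the two conclusions above.

```python
# Cross-check of Ex 8: exact one-tailed p-value for t = 1.936, 59 df.
from scipy import stats

p_value = stats.t.sf(1.936, df=59)   # right-tail (survival function)
print(f"p = {p_value:.4f}")          # about 0.029: reject at 5%, not at 1%
```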