No criminal on the run: The concept of test of significance. FETP India
Competency to be gained from this lecture Formulate and test null hypotheses
Key issues • Null and alternate hypotheses • Type I and Type II errors • Statistical testing
What is the question at hand? • Estimating a quantity? • Testing a hypothesis? Hypotheses
Taking into account the sampling variation in decision-making • Studies are conducted on a sample of subjects, not on the entire population • There is sampling variation • Allowance should be made for this sampling variation when taking a decision Hypotheses
Rationalizing decision-making • Research studies test hypotheses • Experiment and data collection • Hypotheses are tested on the basis of inference from the available data • Considering a difference as significant may be subjective • The concept of statistical significance is a decision-making tool that makes a subjective decision objective Hypotheses
A man is brought to court accused of a crime • The judge needs to start from the hypothesis that the person is innocent • The evidence is brought in: • Fingerprints • Pictures Hypotheses
Assessing whether the evidence is caused by chance or not • The judge assesses whether the evidence could be due to chance • If the probability that the evidence is caused by chance is high: • The judge accepts the hypothesis of innocence • If the probability that the evidence is caused by chance is low: • The judge rejects the hypothesis of innocence Hypotheses
Hypotheses formulated by epidemiologists • Ho: Null hypothesis (=“innocence”) • The difference observed is caused by chance, or sampling variation • H1: Alternate hypothesis • The probability that the difference observed is caused by chance alone is low Hypotheses
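To make the two hypotheses concrete, here is a minimal formal statement (an addition, using the height example that appears later in this lecture, with postulated mean μ₀ = 65"):

```latex
H_0 : \mu = \mu_0 \quad \text{(any observed difference is due to chance, i.e.\ sampling variation)}
H_1 : \mu \neq \mu_0 \quad \text{(the observed difference is unlikely to be due to chance alone)}
```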
From sampling distribution to hypothesis testing • Epidemiologists decide on a critical (rejection) region • That decision is arbitrary, set by convention • If the observed value falls in this extreme, rejection region, the null hypothesis is rejected Hypotheses
Type I and type II errors • Type I • Rejection error, also called alpha error • Rejecting the null hypothesis when it is true • Punishing an innocent person • Particularly unacceptable to society • Must be minimized • Type II • Acceptance error, also called beta error • Accepting the null hypothesis when it is false • Releasing a guilty person Errors
Balancing the risk of errors • If the judge wants to always avoid type I error, he can release everyone • He will always commit the type II error • If the judge wants to always avoid type II error, he can charge everyone • He will always commit the type I error • To balance the risk of errors, we will fix one error and try to minimize the other Errors
Examples of errors • An example where type II error is important • If a new drug becomes available for HIV, we must minimize the risk of rejecting a drug that would actually work (accepting the null hypothesis of no effect when it is false) • An example where type I error is important • If a new drug becomes available for hypertension, since many anti-hypertensives are already available, we cannot take the risk of accepting an ineffective or unsafe drug (rejecting the null hypothesis when it is true) Errors
Behind the errors are the right decisions • 1 − alpha • Probability of accepting the null hypothesis when it is true (the right decision) • 1 − beta • Probability of rejecting the null hypothesis when it is false (the right decision) • Also called statistical power Errors
Alpha and beta error • Ho true, Ho accepted: correct decision (probability 1 − alpha) • Ho true, Ho rejected: type I (alpha) error • Ho false, Ho accepted: type II (beta) error • Ho false, Ho rejected: correct decision (probability 1 − beta, the statistical power) Errors
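The 2×2 structure above can be checked by simulation. Below is a minimal sketch (an addition, not part of the original lecture) that draws many samples from a population in which Ho is true and verifies that a 5% rejection rule commits a type I error in roughly 5% of samples; the population values (mean 65", s.d. 10", n = 100) are borrowed from the height example used later.

```python
import numpy as np

rng = np.random.default_rng(42)
mu0, sd, n, alpha = 65.0, 10.0, 100, 0.05
z_crit = 1.96  # two-sided critical value at the 5% level

n_sims = 10_000
rejections = 0
for _ in range(n_sims):
    # Ho is true: the sample really comes from a population with mean 65"
    sample = rng.normal(mu0, sd, size=n)
    z = (sample.mean() - mu0) / (sd / np.sqrt(n))
    if abs(z) > z_crit:
        rejections += 1  # type I error: rejecting a true Ho

print(f"Type I error rate: {rejections / n_sims:.3f}  (expected about {alpha})")
```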
Sampling fluctuation in samples of 100 subjects for height measurement • Even when statistically sound sampling techniques are employed • The mean in samples of 100 will not necessarily be 65" • There is variation from sample to sample • This must be taken into account when interpreting differences • The method for doing so is called a significance test • [Figure: sampling distribution of the sample mean for a population of 10,000 with mean height 65" and s.d. 10"; sample means spread over roughly 63"–67"; sampling error of the mean = 1"] Testing
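As an illustration of this sampling fluctuation, here is a small added sketch (the simulated population and seed are assumptions) that repeatedly draws samples of 100 from a population with mean 65" and s.d. 10", showing that the sample means scatter around 65" with a standard deviation close to 1", the sampling error of the mean.

```python
import numpy as np

rng = np.random.default_rng(0)
population = rng.normal(65.0, 10.0, size=10_000)  # mean 65", s.d. 10"

# Draw 1,000 samples of 100 subjects each and record the sample means
means = [rng.choice(population, size=100, replace=False).mean()
         for _ in range(1_000)]

print(f"Mean of sample means: {np.mean(means):.2f}")  # close to 65
print(f"S.d. of sample means: {np.std(means):.2f}")   # close to 10/sqrt(100) = 1
```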
Magnitude of allowance • Suppose the true difference is 0% • Observed differences of 1%, 2%, 3% • Not large: could easily arise from sampling variation • Observed differences of 20%, 30% • Large: not willing to consider the true difference as 0% • Why? • If the true difference is 0%, the chance (probability) of getting a difference exceeding 20% is very small Testing
Decision rule • Formulate a decision rule based on the probability of getting the observed difference • Null hypothesis (Ho) • Assuming Ho is true, compute the probability of obtaining the observed difference, or a more extreme one • If the probability is low: • Reject Ho • Otherwise, accept Ho Testing
Choosing a rejection level • The definition of a low probability is subjective • Conventionally: • Low probability = 5% (P = 0.05) • If P < 0.05, the observed difference is statistically 'significant' • If P < 0.01, it is sometimes termed 'highly significant' • Computation of P-values: • A statistical exercise • Depends on the nature of the data and the design of the study • Necessary condition: a probability sample • No test of significance on convenience or quota samples Testing
Concept of test of significance • Question: • Could the population mean be 65"? • Hypothesis: • Population mean = 65" • Question: • What is the probability of obtaining a sample mean of 68" from this population when sample size = 100? • If this probability is small (e.g., < 5%) • Reject the hypothesis • If not, accept the hypothesis • [Diagram: from a population of 10,000, a random sample of size 100 is drawn; observed sample mean height = 68"] Testing
Test of significance: Computation of probability • Observed mean = 68" Postulated mean = 65" • Standard deviation = 10" Sample size = 100 • Sampling error (s.e.) of mean = 10 / √100 = 1 • Compute: (Observed mean − Postulated mean) / s.e. of mean = (68 − 65) / 1 = 3 • Critical value for significance at the 5% level = 1.96 • Since 3 > 1.96, the difference is statistically significant • Exact probability = 0.0027, i.e., 0.27% Testing
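The same computation can be reproduced in a few lines. This is an added sketch (scipy is assumed to be available); it reproduces the z value of 3 and the exact two-sided probability of 0.0027.

```python
from math import sqrt
from scipy.stats import norm

observed_mean, postulated_mean = 68.0, 65.0
sd, n = 10.0, 100

se = sd / sqrt(n)                            # sampling error of the mean = 1
z = (observed_mean - postulated_mean) / se   # = 3.0
p_two_sided = 2 * (1 - norm.cdf(z))          # = 0.0027

print(f"z = {z:.2f}, P = {p_two_sided:.4f}")
```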
What if the distribution is not normal? • Transform the data (e.g., drug concentrations, cell counts) to some other scale to obtain a normal distribution • e.g., logarithm, square root • If that is not feasible, and provided the sample size exceeds 30, make use of the result that the sample mean is approximately normally distributed (central limit theorem) Testing
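A minimal sketch of the transformation idea (the simulated 'drug concentration' data are an assumption for illustration): right-skewed values become roughly symmetric on the log scale, after which the normal-theory test above can be applied.

```python
import numpy as np
from scipy.stats import skew

rng = np.random.default_rng(1)
concentrations = rng.lognormal(mean=2.0, sigma=0.8, size=200)  # right-skewed

log_conc = np.log(concentrations)  # transform to the log scale

print(f"Skewness before: {skew(concentrations):.2f}")  # clearly positive
print(f"Skewness after:  {skew(log_conc):.2f}")        # close to 0
```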
Estimating the sample size • The epidemiologist examines the willingness to commit: • Alpha error • Beta error • Sample size calculation is the step at which decisions are made in this respect (see the sketch below) Testing
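As one concrete example (not spelled out in the lecture), a standard formula for the sample size needed to detect a difference δ between a sample mean and a postulated mean, with two-sided alpha and power 1 − beta, is n = ((z₁₋α/₂ + z₁₋β) · σ / δ)². A sketch, reusing the height example's s.d.:

```python
from math import ceil
from scipy.stats import norm

def sample_size(delta, sd, alpha=0.05, power=0.80):
    """Sample size to detect a mean difference `delta` (one-sample, two-sided)."""
    z_alpha = norm.ppf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = norm.ppf(power)           # 0.84 for power = 0.80
    return ceil(((z_alpha + z_beta) * sd / delta) ** 2)

# e.g., to detect a 3" difference with s.d. 10", alpha 5%, power 80%:
print(sample_size(delta=3.0, sd=10.0))  # about 88 subjects
```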
Interpretation of significance • “Significant” does not necessarily mean that the observed difference is REAL or IMPORTANT • “Significant” only means that it is unlikely (<5%) that the difference is due to chance • Trivial differences can be statistically significant if they are based on large numbers Testing
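To illustrate the last point, here is a small added sketch: with the height example's s.d. of 10", a trivial difference of 0.1" is nowhere near significant for n = 100, but becomes 'significant' once the sample is large enough.

```python
from math import sqrt
from scipy.stats import norm

def p_value(diff, sd, n):
    """Two-sided P-value for an observed difference of means."""
    z = diff / (sd / sqrt(n))
    return 2 * (1 - norm.cdf(z))

for n in (100, 10_000, 100_000):
    print(f"n = {n:>7}: P = {p_value(0.1, 10.0, n):.4f}")
# The trivial 0.1" difference is non-significant at n = 100
# but statistically significant at n = 100,000
```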
Interpretation of non-significance • 'Non-significant' does not necessarily mean that there is no real difference • 'Non-significant' means only that the observed difference could easily be due to chance • Probability of at least 5% under the null hypothesis • There could be a real or important difference, but due to an inadequate sample size we might have obtained a non-significant result Testing
Significance does not necessarily mean causation: Potential explanations for a significant association • Chance: Addressed by the significance test • Bias • Confounding factor • Causation • Consider it only after the first three have been ruled out • Test against causality criteria Testing
The choice of a one-sided test depends upon the alternate hypothesis • One-sided test • Used when the alternate hypothesis is in one direction only • The actual P-values should be quoted, rather than stating just P < 0.05 or P < 0.01 Testing
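Below is a short added sketch of the one-sided versus two-sided distinction, reusing the z = 3 height example: the one-sided P-value is half the two-sided one, which is why the direction of H1 must be stated in advance.

```python
from scipy.stats import norm

z = 3.0  # from the height example above

p_one_sided = 1 - norm.cdf(z)  # H1: population mean > 65"
p_two_sided = 2 * p_one_sided  # H1: population mean != 65"

print(f"One-sided P = {p_one_sided:.5f}")  # 0.00135
print(f"Two-sided P = {p_two_sided:.5f}")  # 0.00270
```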
Quick checklist for statistical testing • A statistical test is indeed needed • The test used is appropriate for the data and the study design • The test is calculated correctly • The interpretation of the test is appropriate Testing
Key messages • Under the null hypothesis, the differences observed are caused by chance alone • Type I error consists of rejecting the null hypothesis when it is true, while type II error consists of accepting the null hypothesis when it is false • Statistical tests estimate the probability that an observed difference may be caused by chance alone