Inferential statistics

cobbc

Presentation Transcript


  1. Inferential statistics

  2. In inferential statistics • Data from samples are used to make inferences about populations • Researchers can make generalizations about an entire population based on a smaller number of observations • However, the sample means will not all be the same when repeated random samples are taken from a population Evidence-based Chiropractic

  3. Sampling distributions • If many different samples were taken from a population, their means would form a distribution of sample means • With enough samples of adequate size, this distribution takes on an approximately normal shape • Even if the underlying population is not normal • The distribution that would result from an infinite number of samples is called the sampling distribution
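The idea on this slide can be illustrated with a small simulation. This is only a sketch under stated assumptions: the skewed population is hypothetical (drawn from an exponential distribution), the helper name `sample_means` is invented for illustration, and samples are drawn with replacement.

```python
import random
import statistics

def sample_means(population, sample_size, n_samples, seed=0):
    """Draw repeated random samples (with replacement) and return their means."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.choices(population, k=sample_size))
        for _ in range(n_samples)
    ]

# Hypothetical, deliberately skewed (non-normal) population.
pop_rng = random.Random(1)
population = [pop_rng.expovariate(1.0) for _ in range(10_000)]

means = sample_means(population, sample_size=50, n_samples=2_000)

# The sample means cluster around the population mean,
# and they vary far less than the individual values do.
print("population mean:", round(statistics.fmean(population), 3))
print("mean of sample means:", round(statistics.fmean(means), 3))
print("population SD:", round(statistics.pstdev(population), 3))
print("SD of sample means:", round(statistics.stdev(means), 3))
```

Plotting a histogram of `means` would show a roughly bell-shaped curve even though the population itself is strongly skewed.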

  4. Sampling distributions (cont.) • Which of the sample means is truly the population mean? • It would be useful to know, but an exact figure is not possible • The population mean can be inferred from the sample • The sample mean is an estimate • Referred to as the point estimate

  5. Sampling distributions (cont.) • Because sampling distributions are approximately normal, the properties of the normal distribution can be used • e.g., about 68.3%, 95.4%, and 99.7% of the area under the curve lies within 1, 2, and 3 standard deviations of the mean
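The proportions of area under the normal curve can be checked directly. The sketch below uses `NormalDist` from Python's standard library; the helper name `area_within` is an assumption made for illustration.

```python
from statistics import NormalDist

def area_within(k):
    """Proportion of a normal distribution lying within k SDs of the mean."""
    z = NormalDist()  # standard normal: mean 0, SD 1
    return z.cdf(k) - z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {area_within(k) * 100:.1f}%")
```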

  6. Standard error of the mean (SEm) • The spread of means around the mean of a sampling distribution • Can be estimated from the sample • SEm is calculated by dividing the SD of the sample by the square root of the number of units in the sample

  7. SEm (cont.) • SEm is higher when • The sample's SD is large or • The sample size is small • Lower when • The sample's SD is small or • The sample size is large • A small SEm is preferable because generalizations are more precise
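The SEm calculation described in the two slides above can be sketched as follows. The scores are made-up values and the function name `sem` is an assumption for illustration.

```python
import math
import statistics

def sem(values):
    """Standard error of the mean: sample SD divided by the square root of n."""
    return statistics.stdev(values) / math.sqrt(len(values))

scores = [10, 12, 14, 16, 18]  # hypothetical sample
print("SEm, n = 5:", round(sem(scores), 3))

# Same spread of values but four times as many units: the SEm shrinks.
print("SEm, n = 20:", round(sem(scores * 4), 3))
```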

  8. Confidence Intervals (CIs) • A CI is a range of values that is likely to contain the population parameter that is being estimated (e.g., the mean) • The confidence level is typically 95%, meaning that about 95% of intervals calculated this way from repeated samples would contain the population parameter • Thus, the 95% confidence interval

  9. Confidence Intervals (CIs) [Figure: normal curve with the horizontal axis marked from -3 to +3]

  10. CIs (cont.) • One can have 95% confidence that the value of the true mean lies within the calculated interval (i.e., 95% CI)

  11. Calculating a 95% CI • Find the z-score (using a z-table) that corresponds to the area under the distribution that includes 95% of all values (i.e., z = ±1.96 for a 95% CI) • Multiply the z-score by the SEm • Add the product to the sample mean to find the upper limit of the CI and subtract it to find the lower limit
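The three steps on this slide can be sketched in a few lines. The data are hypothetical and the function name `confidence_interval` is an assumption; the z-score is looked up with `NormalDist.inv_cdf` instead of a printed z-table. Five values are used only to keep the arithmetic easy to follow; with a sample this small, a t-based interval would normally be preferred.

```python
import math
from statistics import NormalDist, fmean, stdev

def confidence_interval(values, level=0.95):
    """z-based CI for the mean: sample mean ± z × SEm."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% CI
    mean = fmean(values)
    se = stdev(values) / math.sqrt(len(values))
    margin = z * se
    return mean - margin, mean + margin

scores = [10, 12, 14, 16, 18]  # hypothetical sample, mean = 14
lower, upper = confidence_interval(scores)
print(f"95% CI: {lower:.2f} to {upper:.2f}")
```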

  12. Size (width) of CIs • The size of the CI is related to the size of the sample and the size of the data variation • Small samples & large variation = larger CIs • Large samples & small variation = smaller CIs

  13. Hypothesis testing • A hypothesis is an assumption that appears to explain certain events, which must be tested to see whether it is true • Research hypothesis • a.k.a., alternative hypothesis • Denoted H1 • The research hypothesis is not tested directly • Instead the null hypothesis (H0) is tested

  14. Hypothesis tests • Depending on the outcome of the test of H0, the evidence either supports or fails to support the research hypothesis • Hypothesis testing involves the comparison of the means of groups in an experiment • The objective is to find out whether they are significantly different from each other

  15. Hypothesis tests (cont.) • When comparing the means of an active treatment group and a control group, one looks for a difference • The treatment may produce a better outcome, leading to a higher mean than the control group • The difference may appear real, but it may be due to chance • Statistical tests estimate how likely it is that the difference is due to chance

  16. The null hypothesis • H0 states that there is no difference between the group means • H1 is accepted only if the null hypothesis proves to be unlikely • Typically it must be at least 95% unlikely • If H0 is unlikely, it is rejected • Not unlike the innocent until proven guilty concept in our legal system

  17. A hypothetical neck pain study • Patients are treated with chiropractic vs. usual medical care • Outcome measure is the Neck Disability Index (NDI) • H1 • Chiropractic patients will have lower mean NDI scores after treatment • H0 • There is no difference between mean NDI scores

  18. Hypothetical study (cont.) • Results • Mean NDI scores of chiropractic patients • 28 before, 10 after treatment • Mean NDI scores of medical patients • 29 before, 15 after treatment • Chiropractic care appears to be better • But is there enough difference to rule out chance? • Must perform statistical tests to find out

  19. Hypothetical study (cont.) [Figure: mean NDI score (axis from 0 to 30) at baseline and at outcome for the chiropractic and medical groups] • Is this difference enough to be meaningful?

  20. Statistical significance • The results of a study (i.e., the difference between groups) are unlikely to be due to chance • At a specified probability level, referred to as alpha (α) • α is the probability of incorrectly rejecting a null hypothesis • If the results are unlikely to be due to chance, H0 is rejected and H1 is accepted

  21. Statistical significance (cont.) • It must be at least 95% unlikely that H0 is true before it can be rejected • There is still a 5% chance that H0 would be rejected, when it was actually true • Accordingly, P values must be equal to or less than 5% in order for the results of a study to reach a level of statistical significance

  22. Statistical significance (cont.) • The level of significance (alpha level) is not the same as the P value • The alpha level must be set before the study begins • The P value is calculated at the completion of the study and must be less than or equal to the alpha level in order to reach statistical significance

  23. Statistical significance (cont.) • Even when there is no real difference between groups, about 1 in 20 tests will reach statistical significance by chance alone at an alpha level of 0.05 • Fishing • When researchers perform a lot of statistical tests on their data • Increases the chance that at least one of the tests will wrongly reach statistical significance
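The effect of fishing can be put into numbers with a short sketch. It assumes the tests are independent of one another, which the slide does not state, and the function name `familywise_error` is chosen for illustration.

```python
def familywise_error(alpha, n_tests):
    """Chance that at least one of n independent tests of true null
    hypotheses reaches significance purely by chance."""
    return 1 - (1 - alpha) ** n_tests

for n in (1, 5, 20):
    print(f"{n:2d} tests at alpha = 0.05: {familywise_error(0.05, n):.1%}")
```

Under these assumptions, running 20 tests on data with no real effects gives roughly a 64% chance of at least one spurious "significant" result.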

  24. Type I & II errors • Type I error (a.k.a., alpha error) • Rejecting a true null hypothesis • The probability of making a Type I error is equal to the value of α • Type II error (a.k.a., beta error) • Failure to reject a false null hypothesis • The probability of making a Type II error is equal to the value of beta (β)

  25. Type I & II errors (cont.) [Table: consequences of accepting or rejecting true and false null hypotheses]

  26. Type I & II errors (cont.) • There is a trade-off between the likelihood of a study resulting in a Type I error versus a Type II error • As alpha becomes smaller, the chance of making a Type I error decreases • Whereas the chance of making a Type II error increases • Because it is more likely that a false H0 will not be rejected

  27. Type I & II errors (cont.) The 0.05 alpha level is a compromise between Type I and Type II errors

  28. Power • The probability of correctly rejecting a false H0 • Related to β error • Power is equal to 1 - β • Power depends on sample size, the magnitude of the difference between group means, and the value of α

  29. Power (cont.) • Power increases as • Sample size increases • Only to a certain extent, then it becomes a waste of resources • The difference between group means increases • α increases • A power value of 0.80 is often sought by researchers
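How these three factors drive power can be sketched with a normal approximation. This is an assumption-laden illustration and not a method taken from the slides: it treats the SD as known and equal in both groups, assumes equal group sizes and a two-tailed test, and ignores the negligible chance of rejecting in the wrong direction. The function name `approx_power` and all numbers are hypothetical.

```python
import math
from statistics import NormalDist

def approx_power(diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-tailed comparison of two group means."""
    z = NormalDist()
    se_diff = sd * math.sqrt(2 / n_per_group)  # SE of the difference in means
    z_crit = z.inv_cdf(1 - alpha / 2)          # about 1.96 when alpha = 0.05
    return z.cdf(abs(diff) / se_diff - z_crit)

print("baseline:          ", round(approx_power(5, 10, 64), 2))
print("larger sample:     ", round(approx_power(5, 10, 100), 2))
print("larger difference: ", round(approx_power(8, 10, 64), 2))
print("larger alpha:      ", round(approx_power(5, 10, 64, alpha=0.10), 2))
```

With these made-up inputs the baseline case lands close to the commonly sought 0.80, and each of the three changes raises power.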

  30. Power (cont.) • Power may be calculated after a study has been completed (post hoc) • If low power is detected during post hoc power analysis and H0 was not rejected, it may be grounds to repeat the study using a larger sample

  31. Confidence intervals and hypothesis testing • If the value specified as the difference between group means in the null hypothesis is included in the 95% CI, then H0 should not be rejected • The test is not statistically significant • H0 states there is no difference between group means, so the specified "no difference" value is always zero

  32. CIs and hypothesis testing (cont.) • If zero is not included in the 95% CI, the null hypothesis should be rejected • The test is statistically significant • CIs are becoming more prevalent in the health care literature because they convey more information than P values alone

  33. CIs and hypothesis testing (cont.) • Example study • Brinkhaus et al. • Acupuncture was more effective in improving pain on VAS* than no acupuncture in chronic low back pain patients • Difference, 21.7 mm (95% CI 13.9 to 30.0) • But no statistical difference between acupuncture and minimal acupuncture • Difference, 5.1 mm (95% CI -3.7 to 13.9) • * VAS = visual analog scale
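The rule from the preceding slides (reject H0 only when the 95% CI excludes zero) can be applied to the two intervals quoted in this example. The function name `is_significant` is an assumption for illustration.

```python
def is_significant(ci_lower, ci_upper, null_value=0.0):
    """True when the CI for the difference excludes the null ("no difference") value."""
    return not (ci_lower <= null_value <= ci_upper)

# Intervals quoted on the slide (differences in mm on the VAS).
print("acupuncture vs. no acupuncture:", is_significant(13.9, 30.0))
print("acupuncture vs. minimal acupuncture:", is_significant(-3.7, 13.9))
```

The first interval lies entirely above zero, so the difference is statistically significant; the second spans zero, so it is not.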

  34. Clinical significance (a.k.a., practical significance) • Do the findings of a study really matter in clinical situations? • Sometimes a study is statistically significant, but the findings are not important in clinical terms • Large studies with small differences between groups can generate statistically significant findings that are not meaningful to practitioners

  35. Clinical significance (cont.) • For example • A study found a statistically significant difference between mean Headache Disability Inventory (HDI) scores of only 10 points • Yet at least a 29-point change must occur from test to retest before the changes can be attributed to a patient’s treatment • The HDI is not very responsive to change

  36. Commonly encountered statistical tests • Statistical tests determine the probabilities associated with relationships in studies • Are the results real or merely due to chance? • t-test, ANOVA, and chi-square are common in journal articles • Familiarity with these tests is helpful in the appraisal of articles

  37. t-test • Used to find out whether the means of two groups are statistically different • Results are not entirely black-and-white • Only indicates that the means are probably different • Or, that they are probably the same, if the study fails to find a difference • The t-test can be used for a single group by comparing the mean with known values

  38. t-test (cont.) • The actual difference between the means is considered • As is the amount of variability of the scores • A high degree of variability of group scores can obscure the differences between means

  39. t-test (cont.) • The differences between means are the same in both examples, but the variability of group scores differs • The lower example would be much more likely to reach statistical significance because of the narrow spread

  40. Assumptions of the t-test • The data should be normally distributed and involve interval or ratio measurement • Groups should be independent • The variances of groups should be equal • When the sample size is large enough (about 30 subjects), violations of these assumptions are less important

  41. Alternatives to the t-test • The t-test for unequal variances • Non-parametric tests for use with skewed data • Mann-Whitney U test • Wilcoxon test

  42. The t-score • The t-score (a.k.a., t-ratio) is similar to the z-score • However, the t-distribution and a t-table are used • This is because the SD of the population is estimated from the sample, whereas it is known in the z-distribution • P values are found using the calculated t-score and a t-table

  43. The t-score (cont.) • t-tables consider the number of subjects in the groups • Referred to as degrees of freedom (df) • Signifies the number of subjects in a group minus 1 • The total number of subjects minus 2 when there are two groups • Thus, a study that compares the means of 2 groups that involve a total of 30 subjects has 28 df

  44. The t-table • t-distributions eventually become nearly normal when many subjects are included • As a result, t-tables usually only go to 100 df • Alpha levels are shown for two cases • When α is all in one tail (α1 or one-tailed test) • When α is split between the tails (α2 or two-tailed test)

  45. Critical value for 10 df and α2 = 0.05 [Table: excerpt of a t-table, which continues to 100 df]

  46. One-tailed test vs. two-tailed test • One-tailed test (a.k.a., directional test) • Alpha is all in one tail • The researcher specifies the direction the test results will go before the data analysis • Either higher or lower • Two-tailed test (a.k.a., non-directional test) • Alpha is split between the tails • The study’s results could go either way

  47. One-tailed test vs. two-tailed test (cont.) • In a non-directional test, the researcher wants to know if the means are different • For example, in a study comparing manipulation with acupuncture for tension headaches, the results could go either way • That is the case with almost all studies that compare treatments

  48. One-tailed test vs. two-tailed test (cont.) • It is easier to reach statistical significance using a directional test • Consequently it is tempting for researchers to use directional hypotheses • A directional test is appropriate only when the opposite direction is of no interest to the researcher • But it is almost always possible for the test to go either way when comparing treatments

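  49. Calculating the t-score • The t-score is a ratio of the difference between group means and the variability of the data • Variability is represented by the standard error of the difference ($SE_{\bar{x}_1 - \bar{x}_2}$) rather than the SD • Thus $t = \dfrac{\text{difference between group means}}{\text{standard error of the difference}}$ or $t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE_{\bar{x}_1 - \bar{x}_2}}$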

  50. The t-score • For the t-test result to be statistically significant • The difference between the means (the numerator) must be large • And the variability of the data (the denominator) must be small • This results in a t-score that is larger than the critical value of t in the t-table
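The ratio described on the last two slides can be sketched for two independent groups. The slides do not give a formula for the standard error of the difference; the pooled-variance form used here is an assumption, chosen because it matches the equal-variances assumption listed earlier. The scores and the function name `t_score` are hypothetical.

```python
import math
from statistics import fmean, variance

def t_score(group_a, group_b):
    """Return (t, df) for two independent groups, using a pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2  # total number of subjects minus 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))  # SE of the difference
    return (fmean(group_a) - fmean(group_b)) / se_diff, df

treatment = [12, 14, 16, 18, 20]  # hypothetical scores, mean 16
control = [8, 10, 12, 14, 16]     # hypothetical scores, mean 12
t, df = t_score(treatment, control)
print(f"t = {t:.2f}, df = {df}")
```

For 8 df a standard t-table lists a two-tailed critical value of about 2.306 at α = 0.05, so the t-score from these made-up data falls short of statistical significance despite a 4-point difference in means.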
