1 / 37

Statistics

Explore various hypothesis testing methods in biomedical research such as one vs. two-sided tests, t-tests, sign test, and Wilcoxon rank-sum test with detailed formulas and examples.

cisnerosc
Download Presentation

Statistics

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Statistics March 26, 2009 Tim Bunnell, Ph.D. & Jobayer Hossain, Ph.D. Nemours Bioinformatics Core Facility Nemours Biomedical Research

  2. Hypothesis Testing (Quantitative variable) Nemours Biomedical Research

  3. One vs two sided tests • One-sided • One direction of effect (e.g. mean efficacy of treatment group is greater than the mean efficacy of placebo group) • Greater power to detect difference in expected direction • Two-sided • Effect could be in either direction (e.g. mean efficacy of treatment group is not equal to the mean efficacy of placebo group) • More conservative Nemours Biomedical Research

  4. One vs two sided tests Nemours Biomedical Research

  5. Single group sample - one sample t-test • Test for value of a single mean • E.g., to see if mean SBP of all AIDHC employees is 120 mm Hg • Assumptions • Parent population is normal • Sample observations (subjects) are independent Nemours Biomedical Research

  6. Single group sample-one sample t-test • Formula Consider a sample of size n from a normal population with mean µ and variance σ2, then the following statistic is distributed as Student’s t with (n-1) degrees of freedom. x is thesample mean and s is the sample standard deviation Nemours Biomedical Research

  7. One group sample - Sign Test (Nonparametric) • Use: (1) Compares the median of a single group with a specified value (instead of single sample t-test). • Hypothesis:H0:Median = c Ha:Median  c • Test Statistic: We take the difference of observations from median (xi - c). The number of positive difference follows a Binomial distribution. For large sample size, this distribution follows normal distribution. (Rcmdr does not have a default sign test) Nemours Biomedical Research

  8. One group sample: Rcmdr demonstration • Load data default to Rcmdr and then select- • Statistics->Means->single sample t-test. From this window, select variable (e.g.hgt), alternative hypothesis, value for null hypothesis (e.g. 50). data: data$hgt t = -2.6925, df = 59, p-value = 0.009217 alternative hypothesis: true mean is not equal to 50 95 percent confidence interval: 46.50166 49.48460 sample estimates: mean of x 47.99313 Nemours Biomedical Research

  9. Two-group (independent) samples - two-sample t-statistic • Use • Test for equality of two means • Assumptions • Parent population is normal • Sample observations (subjects) are independent. Nemours Biomedical Research

  10. Two-group (independent) samples - two-sample t-statistic • Formula (two groups) • Case 1: Equal Population Standard Deviations: • The following statistic is distributed as t distribution with (n1+n2 -2) d.f. The pooled standard deviation, n1 and n2 are the sample sizes and S1 and S2 are the sample standard deviations of two groups. Nemours Biomedical Research

  11. Two-group (independent) samples - two-sample t-statistic • Formula (two groups) • Case 2: Unequal population standard deviations • The following statistic follows t distribution. • The d.f. of this statistic is, Nemours Biomedical Research

  12. Two-sample (independent) t-statistic • Statistics->Means-> Independent sample t test • From the small window, select response variable (e.g. PLUC.pre), select group variable (e.g. sex) and select alternative hypothesis type (e.g. two sided), confidence level (e.g. .95), and Assume equal variance (e.g yes) Two Sample t-test data: PLUC.pre by sex t = -0.4866, df = 58, p-value = 0.6284 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -1.4004136 0.8527181 sample estimates: mean in group f mean in group m 9.606148 9.879995 Nemours Biomedical Research

  13. Two-group (independent) samples- Wilcoxon Rank-Sum Test (Nonparametric) • Use: Compares medians of two independent groups. • Corresponds to t-test for 2 Independent sample means • Test Statistic: Consider two samples of sizes m and n. Suppose N=m+n. Compute the rank of all N observations. Then, the statistic, Wm= Sum of the ranks of all observations of the first variable (size m). Nemours Biomedical Research

  14. Two-group (independent) samples- Wilcoxon Rank-Sum Test (Nonparametric) • Statistics-> Nonparametric tests->Two sample Wilcoxon test -> select response variable (e.g. PLUC.pre), group variable (e.g. sex), Alternative hypothesis (e.g. two sided), Type of Test (e.g. exact). Wilcoxon rank sum test data: PLUC.pre by sex W = 439, p-value = 0.8776 alternative hypothesis: true location shift is not equal to 0 Nemours Biomedical Research

  15. Two-group (matched) samples - paired t-statistic • Use: Compares equality of means of two matched or paired samples (e.g. pretest versus posttest) • Assumptions: • Parent population is normal • Sample observations (subjects) are independent Nemours Biomedical Research

  16. Two-group (matched) samples - paired t-statistic • Formula • The following statistic follows t distribution with n-1 d.f. Where, d is the difference of two matched samples and Sdis the standard deviation of the variable d. Nemours Biomedical Research

  17. Paired t-test:Rcmdr demo • Statistics -> Means->paired t-test-> pick First variable (e.g. PLUC.pre), pick second variable (e.g. PLUC.post), Alternative Hypothesis (e.g. Two sided), Confidence level (e.g. 0.95). data: data$PLUC.pre and data$PLUC.post t = -5.4953, df = 59, p-value = 8.739e-07 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -3.275342 -1.526769 sample estimates: mean of the differences -2.401056 Nemours Biomedical Research

  18. Two-group (matched) samples Wilcoxon Signed-Rank Test (Nonparametric) • USE: • Compares medians of two paired samples. • Test Statistic • Obtain Differences of two variables, Di= X1i- X2i • Take Absolute Value of Differences, Di • Assign Ranks to absolute values (lower to higher), Ri • Sum up ranks for positive differences (T+) and negative differences (T-) • Test Statistic is smaller of T- or T+ (2-tailed) Nemours Biomedical Research

  19. Two-group (matched) samples Wilcoxon Signed-Rank Test (Nonparametric) • Statistics -> Nonparametric tests ->Paired sample Wilcoxon test -> select first variable (e.g. PLUC. Pre) and second variable (e.g. PLUC.post), Alternative hypothesis (e.g.two sided), type of test (e.g. normal approximation) median(data$PLUC.pre - data$PLUC.post, na.rm=TRUE) # median difference [1] -2.283691 > wilcox.test(data$PLUC.pre, data$PLUC.post, alternative='two.sided', + paired=TRUE) Nemours Biomedical Research

  20. Comparing more than two independent samples: F statistic (Parametric) • Use: • Compare means of more than two groups • Test the equality of variances. Nemours Biomedical Research

  21. Comparing > 2 independent samples: F statistic (Parametric) • Let X and Y be two independent Chi-square variables with n1 and n2 d.f. respectively, then the following statistic follows a F distribution with n1 and n2 d.f. • Let, X and Y are two independent normal variables with sample sizes n1 and n2. Then the following statistic follows a F distribution with n1 and n2 d.f. Where, sx2 and sy2 are sample variances of X and Y. Nemours Biomedical Research

  22. Comparing > 2 independent samples: F statistic (Parametric) • Hypotheses: H0: µ1= µ2=…. =µn Ha: µ1≠ µ2 ≠ …. ≠µn • Comparison will be done using analysis of variance (ANOVA) technique. • ANOVA uses F statistic for this comparison. • The ANOVA technique will be covered in another class session. Nemours Biomedical Research

  23. More than two independent samples: One-way analysis of variance (ANOVA) • Create a new variable: PLUC.chng=PLUC.post – PLUC.pre (Data-> Manage variable in active dataset-> Compute new variable -> New variable name (on the left): PLUC.chng = PLUC.post – PLUC.pre (on the right). • For ANOVA: Statistics -> means -> One-way ANOVA -> Select model, response variable (e.g. PLUC.chng), group variable (e.g. grp), pairwise comparison of means. Nemours Biomedical Research

  24. More than two independent samples: One-way analysis of variance (ANOVA) AnovaModel.5 <- aov(PLUC.chng ~ grp, data=data) Df Sum Sq Mean Sq F value Pr(>F) > summary(AnovaModel.5) grp 2 93.64 46.82 4.5839 0.01426 * Residuals 57 582.17 10.21 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > numSummary(data$PLUC.chng , groups=data$grp, statistics=c("mean", "sd")) mean sd n 1 1.023051 3.686748 20 2 2.132612 2.509201 20 3 4.047504 3.279040 20 Contd. Nemours Biomedical Research

  25. More than two independent samples: One-way analysis of variance (ANOVA) threegrp <- aov(PLUC.chng ~ grp, data=data) > summary(threegrp) Df Sum Sq Mean Sq F value Pr(>F) grp 2 93.64 46.82 4.5839 0.01426 * Residuals 57 582.17 10.21 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 > numSummary(data$PLUC.chng , groups=data$grp, statistics=c("mean", "sd")) mean sd n 1 1.023051 3.686748 20 2 2.132612 2.509201 20 3 4.047504 3.279040 20 Contd. Nemours Biomedical Research

  26. More than two independent samples: One-way analysis of variance (ANOVA) Nemours Biomedical Research

  27. More than two groups: Nonparametric Kruskal-Wallis Test • Compares median of three or more groups or (means of ranks of three or more groups) • Rcmdr Demo: Statistics -> Nonparametric tests ->Kruskal-Wallis test and then select response and group variables etc. > kruskal.test(PLUC.chng ~ grp, data=data) Kruskal-Wallis rank sum test data: PLUC.chng by grp Kruskal-Wallis chi-squared = 6.5452, df = 2, p-value = 0.03791 Nemours Biomedical Research

  28. Proportion Tests • Use • Test for equality of two Proportions • E.g. proportions of subjects in two treatment groups who benefited from treatment. • Test for the value of a single proportion • E.g., to test if the proportion of smokers in a population is some specified value (less than 1) Nemours Biomedical Research

  29. Proportion Tests • Formula • One Group: • Two Groups: Nemours Biomedical Research

  30. Proportion Test • Statistics-> proportions -> single sample (e.g.sex) or two sample-> Select, variable, Alternative hypothesis, null hypothesis, Confidence Level, Type of test. > .Table sex f m 30 30 > binom.test(rbind(.Table), alternative='two.sided', p=.5, conf.level=.95) Exact binomial test data: rbind(.Table) number of successes = 30, number of trials = 60, p-value = 1 alternative hypothesis: true probability of success is not equal to 0.5 95 percent confidence interval: 0.368062 0.631938 sample estimates: probability of success 0.5 Nemours Biomedical Research

  31. Chi-square statistic • USE • Testing the population variance σ2= σ02. • Testing the goodness of fit. • Testing the independence/ association of attributes • Assumptions • Sample observations should be independent. • Cell frequencies should be >= 5. • Total observed and expected frequencies are equal Nemours Biomedical Research

  32. Chi-square statistic • Formula: If xi (i=1,2,…n) are independent and normally distributed with mean µ and standard deviation σ, then, • If we don’t know µ, then we estimate it using a sample mean and then, Nemours Biomedical Research

  33. Chi-square statistic • For a contingency table, we use the following chi- square test statistic, Nemours Biomedical Research

  34. Chi-square statistic Nemours Biomedical Research

  35. Chi-square statistic – calculation of expected frequency • To obtain the expected frequency for any cell, use: • Corresponding (row total X column total) / grand total • E.g: cell for group 1 and female, substituting: (30 X 20 / 60) = 10 Nemours Biomedical Research

  36. Chi-square statistic:Rcmdr demonstration • Statistics->Contingency tables -> Two way table -> Pick row and column variable, select type of cell percentage you want (e.g. row), hypothesis tests (e.g. chi-square) > rowPercents(.Table) # Row Percentages grp sex 1 2 3 Total Count f 30.0 40.0 30.0 100.0 30 m 36.7 26.7 36.7 100.1 30 > .Test <- chisq.test(.Table, correct=FALSE) > .Test Pearson's Chi-squared test data: .Table X-squared = 1.2, df = 2, p-value = 0.5488 Nemours Biomedical Research

  37. Thank you Nemours Biomedical Research

More Related