
Common Non-Parametric Methods for Comparing Two Samples


borna

Presentation Transcript


  1. Common Non-Parametric Methods for Comparing Two Samples (Session 20)

  2. Learning Objectives At the end of this session, you will be able to • Understand the type of logic behind common non-parametric tests for comparing two groups based on ranks • Interpret and understand the commonly used Wilcoxon signed-rank test • Appreciate some practical problems associated with the methods

  3. An illustrative example Paired-Samples 10 farmers recorded their crop yield (tonnes/hectare) before and after the use of a fertiliser. Has the use of fertiliser changed the yield? Data (after-before pair-wise differences): 0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84 How can we address this question objectively?

  4. Start by plotting - Roughly symmetric distribution?

  5. Addressing the question … • A paired t-test is often employed in such cases • Recall this is simply a one-sample t-test applied to the pair-wise differences • The procedure assumes the pair-wise differences are from a normal distribution
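The paired t-test described above can be sketched as a one-sample t statistic on the pair-wise differences. This is a minimal illustration using only the Python standard library; the variable names are chosen here and are not from the slides, and the p-value (which needs the t distribution on n - 1 degrees of freedom) is left to statistical software or tables.

```python
import math
import statistics

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]

n = len(diffs)
mean_diff = statistics.mean(diffs)
sd_diff = statistics.stdev(diffs)      # sample SD (n - 1 denominator)
std_error = sd_diff / math.sqrt(n)

# One-sample t statistic for H0: mean difference = 0.
t_stat = mean_diff / std_error
df = n - 1                             # degrees of freedom for the reference t distribution

print(f"mean difference = {mean_diff:.3f}, t = {t_stat:.3f} on {df} df")
```

The statistic would then be compared with a t distribution on `df` degrees of freedom, which is where the normality assumption enters.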

  6. Addressing the question • Recall the t-test procedure is quite robust against departures from normality (cf. Session 19) • However, if we are concerned about the validity of the normality assumption, we might use a non-parametric test that does not make this assumption

  7. Addressing the question One possibility is to use a sign test, to test H0: population median difference = 0 vs. H1: population median difference ≠ 0. However, this procedure is inefficient as it effectively only utilises the signs (+/-) of the pair-wise differences. Can more information be used?
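To see what "only utilises the signs" means in practice, here is a hedged sketch of an exact two-sided sign test on the example differences. It assumes the usual formulation in which, under H0, the number of positive signs follows a Binomial(n, 1/2) distribution and any zero differences are dropped; neither detail is spelled out on the slide, and the variable names are illustrative.

```python
from math import comb

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]

# Assumption: zero differences carry no sign information and are dropped.
nonzero = [d for d in diffs if d != 0]
n = len(nonzero)
n_pos = sum(d > 0 for d in nonzero)

# Under H0 the count of positive signs is Binomial(n, 1/2).
# Two-sided p-value: double the smaller tail, capped at 1.
k = min(n_pos, n - n_pos)
tail = sum(comb(n, i) for i in range(k + 1)) / 2**n
p_sign = min(1.0, 2 * tail)

print(f"{n_pos} positive signs out of {n}, two-sided p = {p_sign:.4f}")
```

Only the counts of positive and negative differences enter the calculation; the sizes of the differences are discarded, which is the inefficiency the slide refers to.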

  8. Wilcoxon signed-rank test Yes, but at a price… We use the rank order of the pair-wise differences, but not the actual values. This leads to the Wilcoxon signed-rank test. Assumptions: the pair-wise differences are not only independent, but also come from a symmetric, though otherwise unspecified, distribution

  9. Back to the example • Let us assume the pair-wise differences are symmetrically distributed • Not unreasonable based on the plots • Also, the sample median and mean are similar: 0.53 and 0.46 respectively
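The quoted median and mean can be checked directly from the ten differences. A small sketch using the standard library (variable names are illustrative):

```python
import statistics

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]

median_diff = statistics.median(diffs)  # average of the 5th and 6th ordered values
mean_diff = statistics.mean(diffs)

print(f"median = {median_diff:.2f}, mean = {mean_diff:.2f}")
```

Similar values of the median and mean are consistent with symmetry but do not establish it, so the plot remains the main check.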

  10. Wilcoxon signed-rank test • Rank the n=10 differences according to their magnitude • Re-attach the signs to give signed-ranks: Notes • Use average ranks for ties • Zero differences are ignored in the above process (reducing the sample size)
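The ranking steps above can be sketched as follows. The function name `signed_ranks` is hypothetical; it drops zero differences, ranks the remaining differences by magnitude using average ranks for ties, and re-attaches the signs. Ties are detected by exact equality of absolute values, which is adequate for a sketch but may need a tolerance for computed floating-point data.

```python
import math

def signed_ranks(diffs):
    """Return (difference, signed rank) pairs for the non-zero differences."""
    nonzero = [d for d in diffs if d != 0]          # zero differences are ignored
    order = sorted(range(len(nonzero)), key=lambda i: abs(nonzero[i]))
    ranks = [0.0] * len(nonzero)
    i = 0
    while i < len(order):
        # Extend j over the run of values tied with position i.
        j = i
        while (j + 1 < len(order)
               and abs(nonzero[order[j + 1]]) == abs(nonzero[order[i]])):
            j += 1
        avg_rank = (i + j) / 2 + 1                  # mean of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    # Re-attach the sign of each difference to its rank.
    return [(d, math.copysign(r, d)) for d, r in zip(nonzero, ranks)]

# After-minus-before yield differences (tonnes/hectare) from the example.
diffs = [0.02, 0.89, -0.06, 0.26, 0.83, 0.42, 0.80, -0.05, 0.64, 0.84]
pairs = signed_ranks(diffs)
t_plus = sum(r for _, r in pairs if r > 0)
t_minus = -sum(r for _, r in pairs if r < 0)

for d, r in pairs:
    print(f"{d:6.2f} -> {r:+.1f}")
print("sum of positive ranks:", t_plus, "| sum of negative ranks:", t_minus)
```

For the example data there are no ties or zeros, so the ranks are simply 1 to 10 with the signs of the original differences.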

  11. Wilcoxon signed-rank test • Let T+ = sum of +ve ranks = 50, T− = sum of −ve ranks = 5 • Take either T+ or T− as a test statistic • T+ + T− = n(n+1)/2 • Consider T+. A sufficiently small or large value is evidence to reject H0 • To obtain a p-value we compare T+ with its null distribution • This is a symmetric discrete distribution • A two-sided p-value is then Prob(T+ ≤ 5) + Prob(T+ ≥ 50)
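As a sketch of how the null distribution can be obtained exactly: under H0 each of the ranks 1 to n is equally likely to carry a positive or negative sign, so the distribution of the positive rank sum can be built by counting the subsets of {1, ..., n} with each possible sum. The function name `rank_sum_null_counts` is hypothetical, and the code assumes no ties and no zero differences, as in the example.

```python
def rank_sum_null_counts(n):
    """counts[s] = number of subsets of {1, ..., n} whose elements sum to s."""
    max_sum = n * (n + 1) // 2
    counts = [0] * (max_sum + 1)
    counts[0] = 1                       # the empty subset has sum 0
    for rank in range(1, n + 1):
        # Go downwards so that each rank is used at most once per subset.
        for s in range(max_sum, rank - 1, -1):
            counts[s] += counts[s - rank]
    return counts

n = 10
t_plus = 50                             # observed sum of positive ranks
t_minus = n * (n + 1) // 2 - t_plus     # 55 - 50 = 5

counts = rank_sum_null_counts(n)
total = 2 ** n                          # equally likely sign patterns under H0

# Two-sided p-value: probability in both tails, i.e. at or below the smaller
# observed sum plus at or above the larger observed sum.
lower = sum(counts[: min(t_plus, t_minus) + 1])
upper = sum(counts[max(t_plus, t_minus):])
p_exact = (lower + upper) / total

print(f"exact two-sided p-value = {p_exact:.4f}")
```

Enumeration of this kind is feasible for small n; with ties or larger samples, statistical software is the practical route.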

  12. The p-value calculation • Exact method • Cumbersome • Use appropriate software • Large sample approximation • Approximate the null distribution of T+ using a normal distribution • n>20 will usually give a reasonable approximation
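The large-sample approximation can be sketched using the standard null mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of the positive rank sum. These formulas are not stated on the slides; the sketch also assumes no ties and applies no continuity correction. With n = 10 the example is below the n > 20 guideline, so the result is only a rough guide.

```python
import math
from statistics import NormalDist

n = 10
t_plus = 50                                   # observed sum of positive ranks

# Null mean and variance of the positive rank sum (no ties assumed).
mean_t = n * (n + 1) / 4
var_t = n * (n + 1) * (2 * n + 1) / 24

z = (t_plus - mean_t) / math.sqrt(var_t)      # no continuity correction
p_approx = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.3f}, approximate two-sided p-value = {p_approx:.4f}")
```

Under these assumptions the calculation agrees with the z = 2.293 and p ≈ 0.022 reported from Stata in the example.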

  13. P-value calculation • Using statistical software, e.g. Stata:

          sign |  obs   sum ranks
      ---------+-----------------
      positive |    8          50
      negative |    2           5
           all |   10          55

      Ho: diff = 0
               z = 2.293
      Prob > |z| = 0.0218

      • Approximate p-value = 0.022 • From SPSS, exact p-value = 0.020

  14. Conclusions • The p-value is small. Hence, there is evidence to reject H0 • The estimated median difference (after – before), 0.53, is significantly different from zero • There is evidence based on this study for a positive fertiliser effect

  15. Comments While the Wilcoxon signed-rank test makes less restrictive assumptions than the t-test there are still a number of major practical problems • The symmetric assumption is still quite limiting, as many distributions are skewed • Confidence intervals (CIs) • As with the sign test (Session 19) most software packages concentrate on the p-value rather than point estimates and confidence intervals

  16. Two independent samples • For comparing independent samples, a t-test for independent samples is often used • If we were concerned about the validity of the underlying assumptions we could employ a non-parametric method • The Wilcoxon rank-sum test (or the equivalent Mann-Whitney U test) is a common choice • Once again this is based on ranks

  17. Concluding remarks • Non-parametric tests may be used. However… • Their usefulness is often over-rated • The lack of confidence intervals is a major disadvantage • Practical statistics is frequently more complicated than comparing two groups • In this case, t-test methodology naturally extends into a more general modelling framework • The non-parametric tests discussed do not naturally extend
