Statistics in Applied Science and Technology Chapter 14: Nonparametric Methods
Key Concepts in this Chapter • Nonparametric methods • Distribution-free methods • Ranks of observations • Wilcoxon Rank-Sum Test • Kruskal-Wallis One-Way ANOVA by Ranks • Spearman Rank-Order Correlation Coefficient (rs)
Rationale for Nonparametric Methods • Nonparametric methods, often referred to as distribution-free methods, require no assumptions about the shape of the underlying population distribution and no minimum sample size. • Nonparametric methods are appropriate when dealing with data measured on a nominal or ordinal scale.
Advantages and Disadvantages • Advantages: • No restrictive assumptions such as normality of the observations or large sample size • Easy and speedy computation • Good for nominal or ordinal data • Disadvantages: • Less efficient (require a larger sample size to reject a false H0) • Less specific • Make minimal use of the information in the distribution of the data
Inherent Characteristic of Nonparametric Methods • Nonparametric methods deal with ranks rather than values of the observations • Computation is simple
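As a quick illustration of working with ranks rather than raw values — a minimal sketch assuming Python with SciPy, which the slides themselves do not use:

```python
# Nonparametric methods replace raw values with their ranks;
# tied values receive the average of the ranks they occupy.
from scipy.stats import rankdata

values = [3.2, 1.5, 4.8, 1.5, 2.9]
print(rankdata(values))  # [4.  1.5 5.  1.5 3. ]
```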
Wilcoxon Rank-Sum Test (I) • The Wilcoxon Rank-Sum Test is used to test whether there is a difference between two population distributions • Corresponds to the t test for two independent sample means • No distributional assumptions are necessary
Wilcoxon Rank-Sum Test (II) • H0: No difference between the two population distributions • H1: There is a difference between the two population distributions • Test statistic: Z test using the sum of the ranks
Wilcoxon Rank-Sum Test (III) The test statistic Z can be calculated by:

$$Z = \frac{W_1 - W_E}{\sigma_W}$$

Where: W1 is the sum of the ranks of the first sample; WE is the expected sum of the ranks assuming H0 is true; σW is the standard error. WE can be found using the following equation:

$$W_E = \frac{n_1(n_1 + n_2 + 1)}{2}$$

Where: n1 and n2 are the numbers of observations in the two samples, respectively. σW can be found using the following equation:

$$\sigma_W = \sqrt{\frac{n_1 n_2 (n_1 + n_2 + 1)}{12}}$$

Where: n1 and n2 are defined as above.
Wilcoxon Rank-Sum Test (IV) • Decision Rule: At α = 0.05, reject H0 if Z is above 1.96 or below −1.96. At α = 0.01, reject H0 if Z is above 2.58 or below −2.58.
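A minimal sketch of the full calculation from the formulas above, assuming Python with SciPy; the two samples are invented for illustration:

```python
# Rank-sum Z statistic: rank both samples together, sum the ranks
# of sample 1, and standardize against the expected sum under H0.
import math
from scipy.stats import rankdata

sample1 = [12, 15, 9, 20, 18]
sample2 = [14, 7, 8, 11, 10, 13]
n1, n2 = len(sample1), len(sample2)

ranks = rankdata(sample1 + sample2)        # rank all observations together
w1 = ranks[:n1].sum()                      # W1: rank sum of sample 1
we = n1 * (n1 + n2 + 1) / 2                # WE: expected rank sum under H0
sigma_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

z = (w1 - we) / sigma_w                    # reject H0 at alpha=0.05 if |Z| > 1.96
print(f"Z = {z:.3f}")
```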
Kruskal-Wallis One-Way ANOVA by Ranks (I) • Nonparametric equivalent of the one-way ANOVA (discussed in Chapter 10) • Appropriate when the underlying populations are not normally distributed or the samples do not have equal variances • Appropriate when the data are ordinal
Kruskal-Wallis One-Way ANOVA by Ranks (II) • H0: No differences among the population distributions of the k groups • H1: At least one group has a population distribution different from the others • Test statistic: H test using the sums of the ranks
Kruskal-Wallis One-Way ANOVA by Ranks (III) The test statistic H can be calculated by the following:

$$H = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1)$$

Where: k = the number of groups; nj = the number of observations in the jth group; N = the total number of observations in all groups; Rj = the sum of ranks in the jth group
Kruskal-Wallis One-Way ANOVA by Ranks (IV) • Decision Rule: Reject H0 when the calculated H exceeds the critical H, which can be found in Appendix F (textbook p. 298). • Tied observations somewhat influence H; a correction term introduced in the denominator adjusts for this effect (p. 230).
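A minimal sketch of H without the tie correction, assuming Python with SciPy; the three groups are invented for illustration:

```python
# Kruskal-Wallis H: rank all N observations together, accumulate
# Rj^2 / nj per group, then apply the 12/(N(N+1)) ... - 3(N+1) form.
from scipy.stats import rankdata

groups = [[27, 31, 25], [40, 36, 38, 33], [22, 29, 26]]
all_obs = [x for g in groups for x in g]
N = len(all_obs)
ranks = rankdata(all_obs)                  # rank all N observations together

total, start = 0.0, 0
for g in groups:
    rj = ranks[start:start + len(g)].sum() # Rj: rank sum of the jth group
    total += rj ** 2 / len(g)
    start += len(g)

H = 12 / (N * (N + 1)) * total - 3 * (N + 1)
print(f"H = {H:.3f}")
```

For comparison, scipy.stats.kruskal(*groups) returns the tie-corrected H along with a p-value.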
Spearman Rank-Order Correlation Coefficient (rs) • Appropriate when two interval-ratio variables deviate from the normal distribution • Appropriate when dealing with two ordinal variables that have a broad range of many different categories, since using Gamma would be somewhat inconvenient
Spearman Rank-Order Correlation Coefficient (rs) • rs may take on values from −1 to +1. Values close to ±1 indicate a strong correlation; values close to zero indicate a weak association. The sign of rs indicates the direction of the association. • rs² represents the proportional reduction in errors of prediction when predicting rank on one variable from rank on the other variable, as compared to predicting rank while ignoring the other variable.
Calculation of rs

$$r_s = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)}$$

Where: di is the difference between the paired ranks; n is the number of pairs
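A minimal sketch of this calculation, assuming Python with SciPy; the paired data are invented for illustration:

```python
# Spearman rs via the rank-difference formula: rank x and y
# separately, take paired differences di, and plug into the formula.
from scipy.stats import rankdata

x = [86, 97, 99, 100, 101, 103, 106, 110, 112, 113]
y = [2.0, 20, 28, 27, 50, 29, 7, 17, 6, 12]
n = len(x)

d = rankdata(x) - rankdata(y)              # di: differences between paired ranks
rs = 1 - 6 * (d ** 2).sum() / (n * (n ** 2 - 1))
print(f"rs = {rs:.3f}")
```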
Is rs “statistically significant”? • If the sample size is at least 10, and x and y represent randomly selected, independent pairs of ranks, we can use a t test to test the hypotheses: • H0: ρs = 0 • H1: ρs ≠ 0
Is rs “statistically significant”? • The t test is calculated by:

$$t = r_s \sqrt{\frac{n - 2}{1 - r_s^2}}$$

with n − 2 df
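A minimal sketch of this significance test, assuming Python with SciPy; the rs and n values here are assumed for illustration, not taken from the textbook:

```python
# t test for Spearman's rs with n-2 degrees of freedom.
import math
from scipy.stats import t as t_dist

rs, n = 0.60, 12                                 # assumed values for illustration
t_stat = rs * math.sqrt((n - 2) / (1 - rs ** 2))
p_value = 2 * t_dist.sf(abs(t_stat), df=n - 2)   # two-tailed p-value, n-2 df
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```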