Statistics in Applied Science and Technology Chapter 14. Nonparametric Methods
Key Concepts in this Chapter • Nonparametric methods • Distribution-free methods • Ranks of observations • Wilcoxon Rank-Sum Test • Kruskal-Wallis One-Way ANOVA By Ranks • Spearman Rank-Order Correlation Coefficient (rs)
Rationale for Nonparametric Methods • Nonparametric methods, often referred to as distribution-free methods, do not require assumptions about the shape of the underlying population distribution or a large sample size. • Nonparametric methods are appropriate when dealing with data that are measured on a nominal or ordinal scale.
Advantages and Disadvantages • Advantages: • No restrictive assumptions such as normality of the observations or a large sample size • Easy and speedy computation • Good for nominal or ordinal data • Disadvantages: • Less efficient (require a larger sample size to reject a false H0) • Less specific • Make minimal use of the information in the data, since values are reduced to ranks
Inherent Characteristic of Nonparametric Methods • Nonparametric methods deal with ranks rather than values of the observations • Computation is simple
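Since every method in this chapter starts by replacing observations with their ranks, the ranking step can be sketched in Python (a minimal sketch; the function name and data are mine, not the textbook's). Tied observations receive the average of the rank positions they occupy, the usual convention for the tests that follow:

```python
# Minimal sketch: convert observations to ranks (1 = smallest),
# giving tied values the average (mid) rank of the positions they share.
def midranks(values):
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Walk past all observations tied with values[order[i]]
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + 1 + j + 1) / 2  # average of rank positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

print(midranks([3.2, 1.5, 3.2, 4.0]))  # -> [2.5, 1.0, 2.5, 4.0]
```

The two tied values at 3.2 occupy rank positions 2 and 3, so each gets the midrank 2.5.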
Wilcoxon Rank-Sum Test (I) • The Wilcoxon Rank-Sum Test is used to test whether there is a difference between two population distributions • Corresponds to the t test for two independent sample means • No distributional assumptions are necessary
Wilcoxon Rank-Sum Test (II) • H0: No difference between the two population distributions • H1: There is a difference between the two population distributions • Test statistic: Z test using the sum of the ranks
Wilcoxon Rank-Sum Test (III) • Test statistic Z can be calculated by: Z = (W1 - We) / σW Where: W1 is the sum of the ranks of the first sample, We is the expected sum of the ranks assuming H0 is true, and σW is the standard error. • We can be found using the following equation: We = n1(n1 + n2 + 1) / 2 Where: n1 and n2 are the numbers of observations in the two samples, respectively. • σW can be found using the following equation: σW = √[n1n2(n1 + n2 + 1) / 12] Where: n1 and n2 are defined as above.
Wilcoxon Rank-Sum Test (IV) • Decision Rule: At α of 0.05, reject H0 if Z is above 1.96 or below -1.96. At α of 0.01, reject H0 if Z is above 2.58 or below -2.58.
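The Z statistic above can be sketched in Python (a minimal sketch; the function name and sample data are mine, not the textbook's, and for simplicity it assumes no tied observations):

```python
import math

# Sketch of the Wilcoxon rank-sum Z statistic: Z = (W1 - We) / sigma_W.
# Assumes no ties, so each value's pooled rank is its sorted position + 1.
def wilcoxon_z(sample1, sample2):
    n1, n2 = len(sample1), len(sample2)
    pooled = sorted(sample1 + sample2)
    # W1: sum of the pooled ranks of the first sample (ranks start at 1)
    w1 = sum(pooled.index(x) + 1 for x in sample1)
    we = n1 * (n1 + n2 + 1) / 2                        # expected rank sum under H0
    sigma_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # standard error
    return (w1 - we) / sigma_w

# At alpha = 0.05, reject H0 when Z is above 1.96 or below -1.96
z = wilcoxon_z([1.1, 2.3, 3.5], [4.2, 5.8, 6.1, 7.0])
print(round(z, 3))  # -> -2.121, so H0 would be rejected at alpha = 0.05
```

Here W1 = 1 + 2 + 3 = 6, We = 3(3 + 4 + 1)/2 = 12, and σW = √8, giving Z ≈ -2.121.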
Kruskal-Wallis One-Way ANOVA By Ranks (I) • Nonparametric equivalent of the one-way ANOVA (the one we discussed in chapter 10). • Appropriate when the underlying population is not normally distributed or the samples do not have equal variances. • Appropriate when data are ordinal
Kruskal-Wallis One-Way ANOVA By Ranks (II) • H0: No differences among the population distributions of the k groups • H1: At least one group has a population distribution different from the others • Test statistic: H test using the sums of the ranks
Kruskal-Wallis One-Way ANOVA By Ranks (III) • Test statistic H can be calculated by the following: H = [12 / (N(N + 1))] Σ (Rj² / nj) - 3(N + 1) Where: k = the number of groups nj = the number of observations in the jth group N = total number of observations in all groups Rj = the sum of ranks in the jth group
Kruskal-Wallis One-Way ANOVA By Ranks (IV) • Decision Rule: Reject H0 when the calculated H exceeds the critical H, which can be found in Appendix F (textbook pg. 298) • Tied observations somewhat influence H; a correction term introduced in the denominator adjusts for this effect. (pg. 230)
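The H statistic can be sketched in Python (a minimal sketch; the function name and data are mine, not the textbook's; it assumes no ties, so the tie correction mentioned above is omitted):

```python
# Sketch of the Kruskal-Wallis H statistic:
#   H = [12 / (N(N + 1))] * sum(Rj^2 / nj) - 3(N + 1)
# Ranks are taken over all observations pooled across groups.
def kruskal_wallis_h(*groups):
    pooled = sorted(x for g in groups for x in g)  # assumes no ties
    n_total = len(pooled)                          # N
    rank_sum_term = 0.0
    for g in groups:
        rj = sum(pooled.index(x) + 1 for x in g)   # rank sum of group j
        rank_sum_term += rj ** 2 / len(g)          # Rj^2 / nj
    return 12 / (n_total * (n_total + 1)) * rank_sum_term - 3 * (n_total + 1)

h = kruskal_wallis_h([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h, 3))  # -> 7.2; compare against the critical H from Appendix F
```

With rank sums 6, 15, and 24 over N = 9 observations, H = (12/90)(12 + 75 + 192) - 30 = 7.2.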
Spearman Rank-Order Correlation Coefficient (rs) • Appropriate when two interval-ratio variables deviate from the normal distribution • Appropriate when we deal with two ordinal variables that have a broad range of many different categories, since using Gamma would be somewhat inconvenient.
Spearman Rank-Order Correlation Coefficient (rs) • rs may take on values from –1 to +1. Values close to 1 indicate a strong correlation; values close to zero indicate a weak association. The sign of rs indicates the direction of association. • rs2 represents the proportional reduction in errors of prediction when predicting rank on one variable from rank on the other variable, as compared to predicting rank while ignoring the other variable.
Calculation of rs • rs can be calculated by: rs = 1 - (6 Σ di²) / [n(n² - 1)] Where: di is the difference between the paired ranks n is the number of pairs
Is rs “statistically significant”? • If the sample size is at least 10, and x and y represent randomly selected and independent pairs of ranks, we can use a t test to test the hypotheses: • H0: ρs = 0 • H1: ρs ≠ 0
Is rs “statistically significant”? • t test calculation: t = rs √[(n - 2) / (1 - rs²)] With n - 2 df
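The rs formula and its t test can be sketched together in Python (a minimal sketch; the function names and rank data are mine, not the textbook's; the inputs are assumed to already be ranks):

```python
import math

# Sketch of Spearman's rank-order correlation from the difference formula:
#   rs = 1 - 6 * sum(di^2) / (n(n^2 - 1))
def spearman_rs(x_ranks, y_ranks):
    n = len(x_ranks)
    d_sq = sum((xi - yi) ** 2 for xi, yi in zip(x_ranks, y_ranks))
    return 1 - 6 * d_sq / (n * (n ** 2 - 1))

# t statistic for testing H0: rho_s = 0, with n - 2 degrees of freedom:
#   t = rs * sqrt((n - 2) / (1 - rs^2))
def rs_t_statistic(rs, n):
    return rs * math.sqrt((n - 2) / (1 - rs ** 2))

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]       # ranks on the first variable
y = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]       # ranks on the second variable
rs = spearman_rs(x, y)
print(round(rs, 3), round(rs_t_statistic(rs, len(x)), 3))  # -> 0.939 7.75
```

Each pair of ranks differs by 1, so Σdi² = 10 and rs = 1 - 60/990 = 31/33 ≈ 0.939; with n - 2 = 8 df the resulting t of 7.75 would be compared against the critical t value.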