CHAPTER 15 NONPARAMETRIC STATISTICS
Learning Objectives • Determine situations where nonparametric procedures are better alternatives to the parametric tests • Understand the assumptions of nonparametric tests • Use one- and two-sample nonparametric tests • Use nonparametric alternatives to the single-factor ANOVA
Nonparametric vs. Parametric • Earlier procedures assumed that we are working with random samples from normal populations • These are called parametric methods because they are based on a particular parametric family of distributions • This chapter describes procedures called nonparametric methods • These make no assumptions about the population distribution other than that it is continuous
Why Nonparametric Procedures • Useful when distributions are not close to normal • Data need not be quantitative but can be categorical (such as yes or no, defective or nondefective) or rank data • Usually very quick and easy to perform • Can provide considerable improvement over the normal-theory parametric methods when the normality assumption does not hold • Disadvantage: they do not utilize all the information provided by the sample • Disadvantage: they therefore require a larger sample size than the parametric procedure to achieve the same power
Which One? • Which one to choose? • If both methods are applicable to a particular problem, use the more efficient parametric procedure • Otherwise, use the nonparametric procedure
SIGN TEST • Used to test hypotheses about the median of a continuous distribution • The mean of a normal distribution equals its median • So the sign test can also be used to test hypotheses about the mean of a normal distribution • The t-test was used for that problem in Chapter 9 • The sign test is appropriate for samples from any continuous distribution • It is the nonparametric counterpart of the t-test
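Description of the Test • Test $H_0: \tilde{\mu} = \tilde{\mu}_0$ using the differences $X_i - \tilde{\mu}_0$, $i = 1, 2, \ldots, n$ • $X_i$ is the ith sample observation and $\tilde{\mu}_0$ is the specified median value • The number of plus signs $R^+$ is a value of a binomial random variable with parameters $n$ and $p = 1/2$ when $H_0$ is true • Reject $H_0$ if the proportion of plus signs is significantly different from 1/2
Using P-value • Use the P-value computed from the binomial distribution (two-sided alternative) • If $r^+ < n/2$, the P-value is $P = 2P(R^+ \le r^+ \mid p = 1/2)$ • If $r^+ > n/2$, the P-value is $P = 2P(R^+ \ge r^+ \mid p = 1/2)$ • If the P-value is less than the significance level $\alpha$, we reject $H_0$ and conclude that $H_1$ is true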
The Normal Approximation • With $p = 0.5$, the binomial distribution is well approximated by a normal distribution when $n > 10$ • Mean $= np = 0.5n$ and variance $= np(1-p) = 0.25n$ • Test statistic: $Z_0 = \dfrac{R^+ - 0.5n}{0.5\sqrt{n}}$ • Critical region can be chosen from the table of the standard normal distribution
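The normal approximation on this slide can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names and the sample values n = 20 and r+ = 15 are hypothetical, and the two-sided P-value is obtained from the standard normal tail through `math.erfc`.

```python
import math

def sign_test_z(r_plus, n):
    """Z0 = (R+ - 0.5 n) / (0.5 sqrt(n)): mean np and variance np(1 - p) with p = 0.5."""
    return (r_plus - 0.5 * n) / (0.5 * math.sqrt(n))

def two_sided_normal_p(z):
    """Two-sided standard normal P-value, 2 * (1 - Phi(|z|)), written with erfc."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical values chosen only to exercise the formula (n > 10).
n, r_plus = 20, 15
z0 = sign_test_z(r_plus, n)
p_approx = two_sided_normal_p(z0)
print(f"z0 = {z0:.3f}, approximate two-sided P-value = {p_approx:.4f}")
```

For a two-sided test at level α, H0 would be rejected when |z0| exceeds the standard normal critical value for α/2.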
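Sign Test for Paired Samples • Applied to paired observations drawn from two continuous populations • Define the paired difference as $D_j = X_{1j} - X_{2j}$, $j = 1, 2, \ldots, n$ • Test the hypothesis that the two populations have a common median, $\tilde{\mu}_1 = \tilde{\mu}_2$ • Equivalent to testing that the median of the differences is $\tilde{\mu}_D = 0$ • Done by applying the sign test to the n observed differences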
Example • Ten samples were taken from a plating bath used in an electronics manufacturing process, and the bath pH was determined. • The sample pH values are 7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42 • Manufacturing engineering believes that pH has a median value of 7.0. Do the sample data indicate that this statement is correct? Use the sign test with α = 0.05 to investigate this hypothesis. Find the P-value for this test
Calculate the differences $x_i - 7.0$ • Use the general procedure covered in Chapter 8 1. Parameter of interest is the median of the distribution of pH 2. $H_0: \tilde{\mu} = 7.0$ 3. $H_1: \tilde{\mu} \ne 7.0$ 4. α = 0.05
Solution - Cont • Data and the observed plus signs: the differences $x_i - 7.0$ are +0.91, +0.85, -0.18, +1.01, +0.46, -0.05, +0.05, +0.35, +0.25, +0.42, giving eight plus signs and two minus signs 5. Test statistic is the observed number of plus differences r+ = 8 6. Reject H0 if the P-value corresponding to r+ = 8 is less than or equal to α = 0.05
Solution-Cont. 7. Since r+ = 8 > n/2 = 5, we calculate the P-value by using the binomial formula with n = 10 and p = 0.5 • Hence, the P-value = $2P(R^+ \ge 8 \mid p = 0.5) = 2\sum_{r=8}^{10}\binom{10}{r}(0.5)^{10} = 0.109$ • Since P = 0.109 is not less than α = 0.05, we cannot reject the null hypothesis 8. The observed number of plus signs r+ = 8 was not large enough to indicate that the median pH is different from 7.0
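As a check on the arithmetic in this example, the sign test can be sketched in Python using only the standard library. The pH values and the hypothesized median 7.0 come from the example; the function name `sign_test` and the choice to drop zero differences are assumptions made for this illustration.

```python
from math import comb

ph = [7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42]
median0 = 7.0

def sign_test(data, m0):
    """Two-sided sign test: returns (r_plus, n, p_value) using Binomial(n, 1/2)."""
    # Assumption: observations equal to m0 carry no sign and are dropped.
    diffs = [x - m0 for x in data if x != m0]
    n = len(diffs)
    r_plus = sum(d > 0 for d in diffs)
    if r_plus > n / 2:
        tail = sum(comb(n, k) for k in range(r_plus, n + 1))   # P(R+ >= r+)
    else:
        tail = sum(comb(n, k) for k in range(0, r_plus + 1))   # P(R+ <= r+)
    p_value = min(1.0, 2 * tail / 2 ** n)
    return r_plus, n, p_value

r_plus, n, p_value = sign_test(ph, median0)
print(f"r+ = {r_plus}, n = {n}, P-value = {p_value:.3f}")
# prints: r+ = 8, n = 10, P-value = 0.109
```

The P-value of 0.109 exceeds 0.05, which matches the conclusion that H0 cannot be rejected.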
Using Table • Appendix Table VII gives critical values for the sign test, for two-sided and one-sided alternative hypotheses • Let R = min(R+, R-) • Reject H0: • If r- ≤ critical value, when (>) is used for H1 • If r+ ≤ critical value, when (<) is used for H1 • If r ≤ critical value, when (≠) is used for H1
Wilcoxon Signed-rank Test • The sign test uses only the plus and minus signs of the differences • It does not take into consideration the size or magnitude of these differences • The Wilcoxon signed-rank test uses both direction (sign) and magnitude • Applies to symmetric and continuous distributions, for which the mean equals the median • Test H0: µ = µ0
Description of the Test • Compute the differences $X_i - \mu_0$, $i = 1, 2, \ldots, n$ • $X_i$ is the ith sample observation and $\mu_0$ is the specified median or mean value • Rank the absolute differences $|X_i - \mu_0|$ in ascending order • Give the ranks the signs of their corresponding differences • Let W+ be the sum of the positive ranks and W- be the absolute value of the sum of the negative ranks, and let W = min(W+, W-) • Table VIII contains critical values of W • Reject H0: • If w- ≤ critical value, when (>) is used for H1 • If w+ ≤ critical value, when (<) is used for H1 • If w ≤ critical value, when (≠) is used for H1
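Large-Sample Approximation • For a large sample size (n > 20) • $W^+$ (or $W^-$) has approximately a normal distribution • Mean and variance: $\mu_{W^+} = \dfrac{n(n+1)}{4}$, $\sigma^2_{W^+} = \dfrac{n(n+1)(2n+1)}{24}$ • Test statistic: $Z_0 = \dfrac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$ • Appropriate critical region can be chosen from a table of the standard normal distribution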
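A minimal Python sketch of the large-sample approximation for the signed-rank statistic follows. It uses the standard mean n(n+1)/4 and variance n(n+1)(2n+1)/24 of W+ under H0; the function name and the values n = 25 and w+ = 250 are hypothetical and serve only to exercise the formula.

```python
import math

def signed_rank_normal_approx(w_plus, n):
    """Return (mean, variance, z0) for W+ under H0, using the normal approximation."""
    mean = n * (n + 1) / 4
    variance = n * (n + 1) * (2 * n + 1) / 24
    z0 = (w_plus - mean) / math.sqrt(variance)
    return mean, variance, z0

# Hypothetical values with n > 20.
mean_w, var_w, z0 = signed_rank_normal_approx(w_plus=250, n=25)
print(f"mean = {mean_w}, variance = {var_w}, z0 = {z0:.3f}")
```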
Paired Observations • Applied to paired observations drawn from two continuous and symmetric populations • Define the paired difference as $D_j = X_{1j} - X_{2j}$, $j = 1, 2, \ldots, n$ • Test the hypothesis that the two populations have a common mean, $\mu_1 = \mu_2$ • Equivalent to testing that the mean of the differences is $\mu_D = 0$
Description of the Test • Differences are first ranked in ascending order of their absolute values • Ranks are given the signs of the differences • Ties are assigned average ranks • Let W+ be the sum of the positive ranks and W- be the absolute value of the sum of the negative ranks, and let W = min(W+, W-) • Table VIII contains critical values of W • Reject H0: • If w- ≤ critical value, when (>) is used for H1 • If w+ ≤ critical value, when (<) is used for H1 • If w ≤ critical value, when (≠) is used for H1
Example • Consider the data in the previous example and assume that the distribution of pH is symmetric and continuous. • Use the Wilcoxon signed-rank test with α = 0.05 to test the following hypothesis H0: µ = 7 vs. H1: µ ≠ 7
Solution 1. Parameter of interest is the mean of the pH 2. H0: µ = 7 3. H1: µ ≠ 7 4. α = 0.05 5. Test statistic w = min(w+, w-) 6. Reject H0 if w ≤ w*0.05 = 8 from Table VIII (n = 10)
Solution – Cont. 7. Signed ranks • Rank the absolute differences |xi - 7|, then take the smaller of the two rank sums • w+ = (1.5 + 4 + 5 + 6 + 7 + 8 + 9 + 10) = 50.5 • w- = (1.5 + 3) = 4.5 • Test statistic is w = min(50.5, 4.5) = 4.5
Solution-Cont. 8. Since w = 4.5 is less than the critical value w*0.05 = 8 • Reject the null hypothesis
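The signed-rank sums in this example can be reproduced with a short Python sketch. The pH data, the hypothesized value 7 and the critical value 8 come from the example; the helper names are hypothetical. Rounding the differences to two decimals is an assumption made here so that floating-point noise does not break the tie between |6.95 - 7| and |7.05 - 7|.

```python
from bisect import bisect_left, bisect_right

def average_ranks(values):
    """1-based ascending ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(bisect_left(s, v) + 1 + bisect_right(s, v)) / 2 for v in values]

ph = [7.91, 7.85, 6.82, 8.01, 7.46, 6.95, 7.05, 7.35, 7.25, 7.42]
mu0 = 7.0

# Data are reported to two decimals, so round the differences accordingly.
diffs = [round(x - mu0, 2) for x in ph]
diffs = [d for d in diffs if d != 0]            # zero differences would be dropped
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
w = min(w_plus, w_minus)

critical_value = 8                               # Table VIII value quoted in the example
reject = w <= critical_value
print(f"w+ = {w_plus}, w- = {w_minus}, w = {w}, reject H0: {reject}")
# prints: w+ = 50.5, w- = 4.5, w = 4.5, reject H0: True
```

Note that the signed-rank test rejects H0 here while the sign test on the same data did not, because the signed-rank test also uses the magnitudes of the differences.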
WILCOXON RANK-SUM TEST • Statistical inference for two samples • The Wilcoxon rank-sum test is a nonparametric alternative to the two-sample t-test • Two independent continuous populations X1 and X2 with means µ1 and µ2 • Wish to test the hypotheses H0: µ1 = µ2 vs. H1: µ1 ≠ µ2 • n1 and n2 are the sample sizes, with n1 ≤ n2
Description of the Test • Arrange all n1 + n2 observations in ascending order of magnitude and assign ranks to them • Ties are assigned the average rank • Let W1 be the sum of the ranks in the smaller sample (1), and define W2 to be the sum of the ranks in the other sample • W2 can also be found from $W_2 = \dfrac{(n_1+n_2)(n_1+n_2+1)}{2} - W_1$ • Table IX contains the critical values of the rank sums for two significance levels • Reject H0: • If w2 ≤ critical value, when (>) is used for H1 • If w1 ≤ critical value, when (<) is used for H1 • If either w1 or w2 ≤ critical value, when (≠) is used for H1
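The ranking step for the rank-sum test can be sketched as follows. The two samples are hypothetical values invented for illustration (they include one tie), and the function names are not from the slides. The last line checks the identity W2 = (n1 + n2)(n1 + n2 + 1)/2 - W1.

```python
from bisect import bisect_left, bisect_right

def average_ranks(values):
    """1-based ascending ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(bisect_left(s, v) + 1 + bisect_right(s, v)) / 2 for v in values]

def rank_sums(sample1, sample2):
    """Rank the pooled observations and return (w1, w2), the rank sums per sample."""
    pooled = list(sample1) + list(sample2)
    ranks = average_ranks(pooled)
    n1 = len(sample1)
    return sum(ranks[:n1]), sum(ranks[n1:])

# Hypothetical data: sample 1 is the smaller sample (n1 = 4, n2 = 5).
sample1 = [1.1, 2.3, 3.0, 4.2]
sample2 = [2.0, 3.0, 4.8, 5.1, 6.0]

w1, w2 = rank_sums(sample1, sample2)
n_total = len(sample1) + len(sample2)
print(f"w1 = {w1}, w2 = {w2}")
# prints: w1 = 14.5, w2 = 30.5
print(w2 == n_total * (n_total + 1) / 2 - w1)
# prints: True
```

The values w1 and w2 would then be compared with the Table IX critical value for the chosen n1, n2 and significance level.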
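Large-Sample Approximation • When both n1 and n2 are moderately large • The distribution of W1 can be well approximated by the normal distribution with mean $\mu_{W_1} = \dfrac{n_1(n_1+n_2+1)}{2}$ and variance $\sigma^2_{W_1} = \dfrac{n_1 n_2 (n_1+n_2+1)}{12}$ • Test statistic: $Z_0 = \dfrac{W_1 - \mu_{W_1}}{\sigma_{W_1}}$ • Appropriate critical region can be chosen from the standard normal table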
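A short Python sketch of this approximation is given below. It uses the standard mean n1(n1 + n2 + 1)/2 and variance n1 n2 (n1 + n2 + 1)/12 of W1 under H0; the function name and the values n1 = 10, n2 = 12 and w1 = 80 are hypothetical.

```python
import math

def rank_sum_normal_approx(w1, n1, n2):
    """Return (mean, variance, z0) for the rank sum W1 under H0."""
    mean = n1 * (n1 + n2 + 1) / 2
    variance = n1 * n2 * (n1 + n2 + 1) / 12
    z0 = (w1 - mean) / math.sqrt(variance)
    return mean, variance, z0

# Hypothetical sample sizes and observed rank sum.
mean_w1, var_w1, z0 = rank_sum_normal_approx(w1=80, n1=10, n2=12)
print(f"mean = {mean_w1}, variance = {var_w1}, z0 = {z0:.3f}")
```

A strongly negative z0 means sample 1 holds mostly low ranks, which is evidence for H1: µ1 < µ2 (or for the two-sided alternative).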
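Kruskal-Wallis Test • Recall the single-factor analysis of variance model $Y_{ij} = \mu + \tau_i + \epsilon_{ij}$, $i = 1, \ldots, a$, $j = 1, \ldots, n_i$ • There, the error terms $\epsilon_{ij}$ were assumed to be normally and independently distributed with mean zero and variance $\sigma^2$ • The Kruskal-Wallis test is a nonparametric alternative • The error terms $\epsilon_{ij}$ are assumed only to be from the same continuous distribution
Description of the Test • Compute the total number of observations $N = \sum_{i=1}^{a} n_i$ • Rank all N observations from smallest to largest • Assign the smallest observation rank 1, the next smallest rank 2, . . . , and the largest observation rank N • Let $R_{ij}$ be the rank of observation $Y_{ij}$ • Let $R_{i.}$ denote the total and $\bar{R}_{i.}$ the average of the $n_i$ ranks in the ith treatment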
Test Statistic • Calculate $H = \dfrac{12}{N(N+1)} \sum_{i=1}^{a} \dfrac{R_{i.}^2}{n_i} - 3(N+1)$ • H has approximately a chi-square distribution with a-1 degrees of freedom when H0 is true • Reject H0 if the observed value h is greater than or equal to the critical value, $h \ge \chi^2_{\alpha, a-1}$ • The critical value is read from the chi-square distribution table; the critical region is in the upper tail, since only large values of H contradict H0
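Ties in the Kruskal-Wallis Test • When observations are tied, assign an average rank to each of the tied observations • Use the following test statistic: $H = \dfrac{1}{S^2}\left[\sum_{i=1}^{a} \dfrac{R_{i.}^2}{n_i} - \dfrac{N(N+1)^2}{4}\right]$, where $S^2 = \dfrac{1}{N-1}\left[\sum_{i=1}^{a}\sum_{j=1}^{n_i} R_{ij}^2 - \dfrac{N(N+1)^2}{4}\right]$ • $n_i$ is the number of observations in the ith treatment • N is the total number of observations • $S^2$ is just the variance of the ranks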
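Both forms of the Kruskal-Wallis statistic can be sketched in Python as follows. The three treatment groups are hypothetical values invented for illustration (not the cotton data), and the function names are assumptions. The sketch computes H = 12/(N(N+1)) Σ R_i.²/n_i - 3(N+1) and the tie-corrected H = (1/S²)[Σ R_i.²/n_i - N(N+1)²/4], where S² is the variance of the ranks.

```python
from bisect import bisect_left, bisect_right

def average_ranks(values):
    """1-based ascending ranks; tied values share the average of their positions."""
    s = sorted(values)
    return [(bisect_left(s, v) + 1 + bisect_right(s, v)) / 2 for v in values]

def kruskal_wallis(groups):
    """Return (rank_sums, h_uncorrected, h_tie_corrected) for a list of samples."""
    pooled = [y for g in groups for y in g]
    N = len(pooled)
    ranks = average_ranks(pooled)

    # Split the pooled ranks back into treatments and total them (R_i.).
    rank_sums, pos = [], 0
    for g in groups:
        rank_sums.append(sum(ranks[pos:pos + len(g)]))
        pos += len(g)

    term = sum(R * R / len(g) for R, g in zip(rank_sums, groups))
    correction = N * (N + 1) ** 2 / 4
    h_uncorrected = 12.0 / (N * (N + 1)) * term - 3 * (N + 1)
    s2 = (sum(r * r for r in ranks) - correction) / (N - 1)   # variance of the ranks
    h_tie_corrected = (term - correction) / s2
    return rank_sums, h_uncorrected, h_tie_corrected

# Hypothetical data: a = 3 treatments, one tied pair (the two 2s).
groups = [[1, 2, 3], [2, 4, 5], [6, 7, 8]]
rank_sums, h, h_ties = kruskal_wallis(groups)
print(rank_sums, round(h, 4), round(h_ties, 4))
```

With no ties, S² equals N(N+1)/12 and the two statistics coincide; with ties, S² is slightly smaller and the corrected statistic is slightly larger. Either value would be compared with the chi-square critical value with a-1 degrees of freedom.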
Example 15-7 • Montgomery (2001) presented data from an experiment in which five different levels of cotton content in a synthetic fiber were tested to determine whether cotton content has any effect on fiber tensile strength. The sample data and ranks from this experiment are shown in the following table • Does cotton percentage affect breaking strength? Use α = 0.01
Solution • Rank all observations from smallest to largest • The three smallest observations are tied, so each is assigned the average rank (1 + 2 + 3)/3 = 2 • Perform the same calculations for the other tied observations
Solution-Cont. • Data and Ranks for the Tensile Testing Experiment • There is a fairly large number of ties • Use the equation that was defined for the tied observations
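Solution-Cont. • Compute $S^2$ and then the test statistic h from the equations for tied observations • With a = 5 treatments, the critical value is $\chi^2_{0.01,4} = 13.28$ • Since h > 13.28, we would reject the null hypothesis • Conclude that the treatments differ • The same conclusion is given by the usual analysis of variance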
Next Agenda • Introduces statistical quality control • Fundamentals of statistical process control