Nonparametric Statistical Methods. Presented by Guo Cheng, Ning Liu, Faiza Khan, Zhenyu Zhang, Du Huang, Christopher Porcaro, Hongtao Zhao, Wei Huang.
Definition • Nonparametric methods, here rank-based methods, are used when we do not know the form of the population distribution from which the data are sampled. • They are used for small sample sizes. • They are used when the data are measured on an ordinal scale, so that only their ranks are meaningful.
Outline • 1. Sign Test • 2. Wilcoxon Signed Rank Test • 3. Inferences for Two Independent Samples • 4. Inferences for Several Independent Samples • 5. Friedman Test • 6. Spearman’s Rank Correlation • 7. Kendall’s Rank Correlation Coefficient
Parameter of interest: Median • The median is used as the parameter because, for skewed distributions, it is a better measure of the center of the data than the mean.
Hypothesis test • H0: µ = µ0 vs. Ha: µ > µ0, where µ0 is a specified value and µ is the unknown median.
Testing Procedure • Step 1: Given a random sample x1, x2, …, xn from a population with unknown median µ, count the number of xi’s that exceed µ0 and denote this count by s+. • Then s- = n - s+ is the number of xi’s that fall below µ0 (assuming no xi equals µ0). • Step 2: Reject H0 if s+ is large or, equivalently, if s- is small.
How to reject H0? • To determine how large s+ must be in order to reject H0, we need the distribution of the corresponding random variable S+ under H0. • Xi: random variable corresponding to the observed value xi • S+, S-: random variables corresponding to s+ and s-
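For a continuous population, each Xi exceeds µ0 with probability 1/2 under H0, so S+ follows a Binomial(n, 1/2) distribution and the p-value is the upper tail P(S+ ≥ s+). The following is a minimal Python sketch of that calculation, not the presenters' code: the function name `sign_test_upper` and the sample readings are hypothetical, and dropping observations equal to µ0 is one common convention assumed here.

```python
from math import comb


def sign_test_upper(xs, mu0):
    """One-sided sign test of H0: median = mu0 vs Ha: median > mu0."""
    # Assumption: observations equal to mu0 carry no sign and are dropped.
    kept = [x for x in xs if x != mu0]
    n = len(kept)
    s_plus = sum(1 for x in kept if x > mu0)
    s_minus = n - s_plus
    # Under H0, S+ ~ Binomial(n, 1/2), so p = P(S+ >= s_plus).
    p_value = sum(comb(n, k) for k in range(s_plus, n + 1)) / 2 ** n
    return s_plus, s_minus, p_value


# Hypothetical readings, invented for illustration only.
sample = [12.1, 9.8, 11.4, 10.9, 13.2, 10.3, 11.7, 9.5]
s_plus, s_minus, p = sign_test_upper(sample, mu0=10.0)
print(s_plus, s_minus, round(p, 4))  # prints: 6 2 0.1445
```

Here 6 of the 8 readings exceed 10.0, and P(S+ ≥ 6) = (28 + 8 + 1)/256, so H0 would not be rejected at the 5% level.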
SAS code
DATA thermostat;
  INPUT temp;
  DATALINES;
202.2
203.4
…
;
/* LOCCOUNT prints the counts of observations above, equal to, and below MU0 */
PROC UNIVARIATE DATA=thermostat LOCCOUNT MU0=200;
  VAR temp;
RUN;
SAS Output
Basic Statistical Measures
      Location                 Variability
  Mean     201.7700     Std Deviation          2.41019
  Median   201.7500     Variance               5.80900
  Mode     .            Range                  8.30000
                        Interquartile Range    2.90000
Tests for Location: Mu0=200
  Test           -Statistic-      -----p Value------
  Student's t    t  2.322323      Pr > |t|     0.0453
  Sign           M  3             Pr >= |M|    0.1094
  Signed Rank    S  19.5          Pr >= |S|    0.048
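The Sign row of this output can be checked by hand. SAS reports the sign statistic as M = (s+ - s-)/2, so s+ = n/2 + M. The sample size is not shown in the listing; n = 10 is an assumption here, inferred from the reported mean, standard deviation, and t statistic, and it also assumes no reading equals 200. A short Python sketch under those assumptions (the function name is hypothetical):

```python
from math import comb


def sign_p_from_m(m, n):
    """Recover s+ and binomial p-values from the sign statistic M = (s+ - s-)/2."""
    s_plus = round(n / 2 + m)  # because s+ + s- = n
    upper = sum(comb(n, k) for k in range(s_plus, n + 1)) / 2 ** n
    lower = sum(comb(n, k) for k in range(0, s_plus + 1)) / 2 ** n
    two_sided = min(1.0, 2 * min(upper, lower))
    return s_plus, upper, two_sided


# M = 3 is read from the output; n = 10 is an inferred assumption.
s_plus, upper, two_sided = sign_p_from_m(m=3, n=10)
print(s_plus, round(upper, 4), round(two_sided, 4))  # prints: 8 0.0547 0.1094
```

The two-sided value agrees with the reported Pr >= |M| = 0.1094. For the one-sided alternative Ha: µ > 200 stated earlier, the relevant p-value is the upper tail, about 0.0547.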
Inventor • Frank Wilcoxon (2 September 1892, County Cork, Ireland – 18 November 1965, Tallahassee, Florida, USA) was a chemist and statistician, known for the development of several statistical tests.
What is it used for? • Two related samples • Matched samples • Repeated measurements on a single sample
SAS code
DATA thermo;
  INPUT temp;
  DATALINES;
202.2
203.4
…
;
/* The Signed Rank row of "Tests for Location" is the Wilcoxon signed rank test */
PROC UNIVARIATE DATA=thermo LOCCOUNT MU0=200;
  TITLE "Wilcoxon signed rank test for the thermostat data";
  VAR temp;
RUN;
SAS output (selected results)
Basic Statistical Measures
      Location                 Variability
  Mean     201.7700     Std Deviation          2.41019
  Median   201.7500     Variance               5.80900
  Mode     .            Range                  8.30000
                        Interquartile Range    2.90000
Tests for Location: Mu0=200
  Test           -Statistic-      -----p Value------
  Student's t    t  2.322323      Pr > |t|     0.0453
  Sign           M  3             Pr >= |M|    0.1094
  Signed Rank    S  19.5          Pr >= |S|    0.048
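The Signed Rank row can be reproduced in a similar way. SAS centers the statistic as S = W+ - n(n+1)/4, where W+ is the sum of the ranks of the positive differences xi - µ0. Under H0 each rank 1, …, n is attached to a positive difference with probability 1/2, which gives the exact null distribution of W+. The Python sketch below assumes n = 10 (inferred from the output, not shown in it) and no ties or zero differences; the function names are hypothetical.

```python
def signed_rank_null_counts(n):
    """counts[w] = number of subsets of the ranks 1..n whose sum is w."""
    max_w = n * (n + 1) // 2
    counts = [0] * (max_w + 1)
    counts[0] = 1
    for rank in range(1, n + 1):
        # Go downward so each rank is used at most once per subset.
        for w in range(max_w, rank - 1, -1):
            counts[w] += counts[w - rank]
    return counts


def signed_rank_two_sided_p(w_plus, n):
    """Exact two-sided p-value for W+ (assumes no ties, no zero differences)."""
    counts = signed_rank_null_counts(n)
    max_w = n * (n + 1) // 2
    # The null distribution is symmetric about max_w / 2.
    extreme = max(w_plus, max_w - w_plus)
    return min(1.0, 2 * sum(counts[extreme:]) / 2 ** n)


n = 10                              # inferred assumption
s = 19.5                            # Signed Rank S from the output
w_plus = int(s + n * (n + 1) / 4)   # undo the centering: W+ = S + n(n+1)/4
p = signed_rank_two_sided_p(w_plus, n)
print(w_plus, round(p, 4))          # prints: 47 0.0488
```

The result, 50/1024, agrees with the reported Pr >= |S| = 0.048 to the digits shown. Unlike the sign test, the signed rank test rejects H0 at the 5% level here because it also uses the magnitudes of the differences.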
Example • To test whether two classes taught by the same teacher have the same grades, we randomly pick 7 students from Class A and 9 from Class B. Their scores are as follows: • A: 8.50 9.48 8.65 8.16 8.83 7.76 8.63 • B: 8.27 8.20 8.25 8.14 9.00 8.10 7.20 8.32 7.70
SAS code
Data exam;
  Input group $ score @@;
  Datalines;
A 8.50 A 9.48 A 8.65 A 8.16 A 8.83 A 7.76 A 8.63
B 8.27 B 8.20 B 8.25 B 8.14 B 9.00 B 8.10 B 7.20 B 8.32 B 7.70
;
SAS code
Proc npar1way data=exam wilcoxon;
  Var score;
  Class group;
  Exact wilcoxon;
Run;
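As a cross-check on what the Wilcoxon rank sum test computes for these data, the statistic and its exact null distribution can be worked out directly. The Python sketch below is an illustration only, not the presenters' code. It ranks the 16 pooled scores from the example, sums the ranks of Class A, and enumerates all ways of assigning 7 of the 16 ranks to Class A, which are equally likely under H0. It relies on there being no tied scores in the example.

```python
from itertools import combinations

a = [8.50, 9.48, 8.65, 8.16, 8.83, 7.76, 8.63]
b = [8.27, 8.20, 8.25, 8.14, 9.00, 8.10, 7.20, 8.32, 7.70]

pooled = sorted(a + b)
# No ties in these data, so every score has a unique rank (1 = smallest).
rank = {value: i + 1 for i, value in enumerate(pooled)}

w_a = sum(rank[x] for x in a)                # rank sum of Class A
u_a = w_a - len(a) * (len(a) + 1) // 2       # Mann-Whitney U for Class A

n = len(pooled)
center2 = len(a) * (n + 1)                   # twice the null mean of the rank sum
total = upper = two_sided = 0
for subset in combinations(range(1, n + 1), len(a)):
    w = sum(subset)
    total += 1
    if w >= w_a:
        upper += 1
    if abs(2 * w - center2) >= abs(2 * w_a - center2):
        two_sided += 1

p_upper = upper / total                      # exact P(W >= w_a) under H0
p_two_sided = two_sided / total              # exact two-sided p-value
print(w_a, u_a, total)                       # prints: 75 47 11440
print(p_upper, p_two_sided)
```

The rank sum for Class A is 75, compared with a null mean of 7 × 17 / 2 = 59.5. Full enumeration is practical here because there are only 11440 possible rank assignments.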
Introduction • We know that if our data are normally distributed and the population standard deviations are equal, we can test for a difference among several populations by using the one-way ANOVA F test.