Chapter 10: Nonparametric Techniques
Chapter Outline • Chi square: testing the observed versus the expected • Procedures for rank-order data • Correlation • Differences among groups
Analyzing Data Appropriately • Behavioral scientists believe most data are normally distributed: God loves the normal curve! • Is that true? Micceri (1989) said no for large data sets in psychology. • Do we as scientists look carefully at the distribution of our data? God may not love the normal curve!
Parametric Statistical Procedures • Are parametric statistical procedures sensitive to nonnormality? • Substantial evidence exists that parametric statistical procedures are not as robust to violations of the normality assumption as once thought.
Chi Square: Testing the Observed Versus the Expected • Formula for chi square: $\chi^2 = \sum (O - E)^2 / E$, where O = observed frequency, E = expected frequency, and the sum runs over all cells
Coach Rabbitfoot and His Tennis Courts
Known values

Court number            1      2      3      4    Total
Observed losses (O)    24     34     22     40      120
Expected losses (E)    30     30     30     30      120
O – E                  –6     +4     –8    +10
(O – E)^2              36     16     64    100
(O – E)^2 / E        1.20    .53   2.13   3.33

$\chi^2 = \sum (O - E)^2 / E = 1.20 + .53 + 2.13 + 3.33 = 7.19$
df = number of cells – 1 = number of courts – 1 = 4 – 1 = 3
$\chi^2(3) = 7.19$, p > .05, not significant
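The tennis-court calculation can be reproduced in a few lines. This is a minimal sketch using only the numbers on the slide; the function name `chi_square` is illustrative, and the critical value 7.815 (df = 3, alpha = .05) is taken from a standard chi-square table rather than from the slides.

```python
# Goodness-of-fit chi square for the tennis-court example.
observed = [24, 34, 22, 40]                 # losses on courts 1-4
total = sum(observed)                       # 120 losses overall
expected = [total / len(observed)] * len(observed)  # 30 per court if courts do not matter


def chi_square(obs, exp):
    """Sum of (O - E)^2 / E over all cells."""
    return sum((o - e) ** 2 / e for o, e in zip(obs, exp))


chi2 = chi_square(observed, expected)
df = len(observed) - 1

# Assumption: critical value from a standard chi-square table (df = 3, alpha = .05).
CRITICAL_05_DF3 = 7.815
significant = chi2 > CRITICAL_05_DF3

print(f"chi2({df}) = {chi2:.2f}, significant at .05: {significant}")
```

The unrounded total is 7.20; the slide's 7.19 comes from summing the four terms after rounding each to two decimals. Either way the value falls below the critical value, so the conclusion is unchanged.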
Contingency Tables • Chi square with two or more categories and two or more groups • Athletes and nonathletes respond to an ethical statement about whether one should tell the umpire if they trap a fly ball in baseball. • Respondents answer on a three-point Likert-type scale: 3 = Agree, 2 = No opinion, 1 = Disagree
Working Out the Answer

Observed responses
                 Agree   No opinion   Disagree   Total
Athletes            30           46        124     200
Nonathletes        114           80         56     250
Total              144          126        180     450

Expected = (column total × row total) / N
                 Agree   No opinion   Disagree
Athletes            64           56         80
Nonathletes         80           70        100

$\chi^2 = \frac{(-34)^2}{64} + \frac{34^2}{80} + \frac{(-10)^2}{56} + \frac{10^2}{70} + \frac{44^2}{80} + \frac{(-44)^2}{100} = 79.29$
df = (r – 1)(c – 1) = (2 – 1)(3 – 1) = 2, p < .01
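The expected frequencies and the chi-square total for the athlete/nonathlete table can be checked the same way. This is a sketch built from the observed counts on the slide; the variable names are illustrative, and the critical value 9.210 (df = 2, alpha = .01) comes from a standard chi-square table.

```python
# Chi square for a 2 x 3 contingency table (athletes vs. nonathletes).
observed = [
    [30, 46, 124],   # athletes: agree, no opinion, disagree
    [114, 80, 56],   # nonathletes
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = (row total * column total) / N
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: critical value from a standard chi-square table (df = 2, alpha = .01).
CRITICAL_01_DF2 = 9.210

print("expected:", expected)
print(f"chi2({df}) = {chi2:.2f}, exceeds .01 critical value: {chi2 > CRITICAL_01_DF2}")
```

Because each deviation is squared, every cell contributes a positive amount to the total regardless of the sign of O – E.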
Puri and Sen Rank-Order General Linear Method • This method maintains good power. • This method protects against type I error. • Change data to ranks. • Run any of the standard parametric procedures on the ranked scores using SPSS or SAS.
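The "change data to ranks" step might look like the sketch below. The helper name `average_ranks` is hypothetical, the sample scores are made up for illustration, and giving tied scores the mean of the ranks they span is a common convention assumed here rather than something the slides specify.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions.

    Assumption: mean ranks for ties (the slides do not say how ties are handled).
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1   # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


# Illustrative scores only (not data from the chapter).
scores = [10, 20, 20, 30]
print(average_ranks(scores))
```

The ranked scores then replace the raw scores as input to the usual regression or ANOVA routine.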
General Linear Model (GLM) • Basis for procedures of • regression: r, R, Rc • differences: t, ANOVA, MANOVA • $Y = BX + E$, where Y = vector of scores on p dependent variables, X = vector of scores on q independent variables, B = p × q matrix of regression coefficients, and E = vector of errors
Calculating the Test Statistic for Ranked Data • Instead of the parametric test statistic (t or F), calculate L: $L = (N - 1)r^2$, with df = pq
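For the simplest case (one dependent and one independent variable, so p = q = 1), the L statistic can be sketched as follows. The data values and helper names are invented for illustration, and the ranking helper assumes there are no tied scores.

```python
import math


def simple_ranks(values):
    """Rank from 1 (smallest) to N. Assumption: no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def pearson_r(x, y):
    """Ordinary Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative values only (not the Thomas et al. data).
skinfold = [12, 15, 11, 19, 22, 17]
percent_fat = [20, 24, 18, 25, 33, 30]

# Step 1: rank each variable. Step 2: correlate the ranks. Step 3: L = (N - 1) r^2.
r = pearson_r(simple_ranks(skinfold), simple_ranks(percent_fat))
n = len(skinfold)
L = (n - 1) * r ** 2
df = 1 * 1   # p dependent variables times q independent variables

print(f"r on ranks = {r:.3f}, L({df}) = {L:.2f}")
```

With more predictors or more dependent variables, the same formula applies with r² replaced by the corresponding squared multiple or multivariate association reported by the statistics package.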
Example From Regression Can skinfold measurements be used to predict percentage fat (determined by underwater weighing) in women grouped by ethnicity? Data from K.T. Thomas et al., 1997.
Examples From Regression (Distribution)
Thigh

Frequency   Stem & Leaf
     8       1*  01233444
    20       1.  55555677777788889999
    15       2*  000022223333344
    17       2.  55555667777888899
     6       3*  000224
     7       3.  5578899
     4       4*  2223
     2       4.  66

Stem width: 10.0   Each leaf: 1 case(s)
N = 79   M (mm) = 24.97   SD = 8.80   Skewness = 0.67   Kurtosis = 0.28
Examples From Regression (Distribution)
Percent fat from hydrostatic weighing

Frequency   Stem & Leaf
     1       Extreme
     4       1.  6799
     5       2*  01222
    12       2.  666777888999
    17       3*  01111122222344444
    23       3.  55556666778888889999999
    15       4*  000001122233334
     2       4.  56

Stem width: 10.0   Each leaf: 1 case(s)
N = 79   M = 33.89   SD = 7.48   Skewness = 0.67   Kurtosis = 0.09
Multiple Regression on Original Data *p < .05 F(4,74) = 32.87, p < .001, for linear composite of predictors
Multiple Regression Using Ranked Data *p < .05 L(4) = 53.01, p < .001, for linear composite of predictors
Example Using Factorial ANOVA Do boys and girls differ in push-up scores in grades 4, 5, and 6? Data from J.K. Nelson et al., 1991.
Stem-and-Leaf, Mean, Standard Deviation, Skewness, and Kurtosis for Push-Up Scores for Boys and Girls in Grades 4, 5, and 6

Frequency   Stem & Leaf
    30       0*  001111111122222222233334444444
    30       0   555556666666777777778888889999
    32       1*  00000001111122222233333334444444
    21       1   555556667777777889999
    25       2*  0000000111112233333344444
    25       2   5555555566666666677788999
     8       3*  00023444
     7       3   5566889
     2       4*  02

Stem width: 10   Each leaf: 1
N = 180   M = 15.63   SD = 10.27   Skewness = 0.41   Kurtosis = 0.70
3 × 2 ANOVA Results for Original Data • Original data • Grade: F(2, 174) = 7.30, p < .001 • Sex: F(1, 174) = 17.48, p < .001 • Interaction: not significant
3 × 2 ANOVA Results for Ranked Data • Ranked data • Grade: L(2) = 11.67, p < .005 • Sex: L(1) = 13.21, p < .001 • Interaction: not significant
Example Using Repeated-Measures ANOVA • Does V̇O2 differ by walking speed in older and younger participants? Data from P.E. Martin, D.E. Rothstein, & D.D. Larish, 1992, “Effects of age and physical activity status on the speed-aerobic demand relationship of walking,” Journal of Applied Physiology, 73: 200-206.
Characteristics of V̇O2 at Five Walking Speeds

Miles per hour    1.5     2.0     2.5     3.0     3.5
M                 9.3    10.3    11.7    13.6    16.3
SD                1.0     1.1     1.5     1.6     1.6
Median            9.2    10.3    11.4    13.6    16.2
Skewness         0.29    0.30    0.59    0.45    0.44
Kurtosis        –0.13   –0.29    0.06   –0.17    0.00
Summary Tables of Repeated-Measures ANOVAs for Original and Ranked Data

Original data
Source         Pillai’s trace              df        F         Signif.
Age            .14 (r^2 = SSBet/SSTot)     1, 57     9.98      .003
Speed          .98                         4, 54     589.37    .0001
Age × Speed    .22                         4, 54     3.70      .01
Huynh-Feldt epsilon = .65

Ranked data
Source         Pillai’s trace              df        L         Signif.
Age            .14 (r^2 = SSBet/SSTot)     1         8.12      <.01
Speed          .98                         5         56.84     <.001
Age × Speed    .22                         5         12.76     <.05
Huynh-Feldt epsilon = .77
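The L values in the ranked-data table follow directly from L = (N – 1)r², with Pillai's trace playing the role of r². The sketch below checks this; N = 59 is an assumption inferred from the error df of 57 for the age effect with two age groups, since the slide does not state the sample size.

```python
# Check that each reported L equals (N - 1) * Pillai's trace.
n = 59  # assumption: inferred from F(1, 57) for age with two groups (57 = N - 2)

pillai_trace = {"Age": 0.14, "Speed": 0.98, "Age x Speed": 0.22}
reported_L = {"Age": 8.12, "Speed": 56.84, "Age x Speed": 12.76}

computed_L = {
    source: round((n - 1) * trace, 2) for source, trace in pillai_trace.items()
}

for source, value in computed_L.items():
    print(f"{source}: computed L = {value}, reported L = {reported_L[source]}")
```

All three computed values match the table, which illustrates that the only extra work after running the standard procedure on ranks is multiplying the reported trace by N – 1.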
Example Using Factorial MANOVA Do four ethnic groups at two age levels differ on two skinfold measurements and hip-to-waist ratio? Data from K.T. Thomas et al., 1997.
Using MANOVA on Original and Ranked Data • Data are for four ethnic groups (African American, European American, Mexican American, and Native American) at two age levels (20–30 and 40–50), include the previously reported data on abdomen and calf skinfolds, and add a third dependent variable, hip-to-waist ratio.
4 (Ethnic Group) × 2 (Age Level) MANOVA on Three Dependent Variables • Original data • Ethnic group: F(3, 152) = 5.64, p < .0001 • Age level: F(3, 152) = 7.86, p < .0001 • Interaction: not significant • Ranked data (Pillai’s trace = $R^2$) • Ethnic group: L(3) = 22.54, p < .0001 • Age level: L(9) = 41.86, p < .0001 • Interaction: not significant
Applications to GLM • These procedures are appropriate for all GLM models. • Regression: Pearson r, multiple R, canonical correlation (Rc) • ANOVA: t, simple and factorial ANOVA (including repeated measures), ANCOVA • Multivariate techniques: discriminant analysis, MANOVA (including repeated measures), MANCOVA
Summary • Are data from physical activity normally distributed? • If not, changing the data to ranks lets the researcher keep using standard statistical packages; the only extra step is calculating the L statistic from the output.