Using Permutation Tests to Study Infant Handling by Female Baboons

Thomas L. Moore and Vicki Bentley-Condit, Grinnell College

Infant handling examples

Plan for talk
• The data & problem
• The use of permutation tests
• Interpret results
• The stability of results
• The choice of test statistics
• Summary

Primary question
• Is infant handling behavior related to the dominance hierarchy rankings of females in the troop?
• Dominance hierarchy scores are determined from female-on-female interactions (sans infants)
• Scores cluster into high, mid, and low ranks

The data (handout)

                                              HANDLERS
INFANTS/        | KM KN NQ PO | HQ LL NY PS SK ST WK | AL CO DD LS LY MH ML MM PA PH PT RS
Mothers   ranks |  1  1  1  1 |  2  2  2  2  2  2  2 |  3  3  3  3  3  3  3  3  3  3  3  3
KG/KM       1   |  0  0  4  1 |  1  0  0  0  3  1  0 |  0  0  0  0  0  0  0  0  0  0  2  1
HZ/HQ       2   | 13 23  7  5 |  0  2  1  1  5  6 18 |  1  6  3  0  1  4  1  0  9  0 10  1
LC/LL       2   |  4  0  1  4 |  3  0  2  1  1  5  3 |  1  0  0  1  0  2  1  1  1  0  1  6
NK/NY       2   | 12  4 10  5 |  9  1  0  2  3 11  7 |  8  6  3  1  0  2  1  1  5  3  2  3
PZ/PS       2   |  1  3  4  1 |  0  0  0  0  0  0  2 |  0  2  0  0  0  3  0  1  1  0  3  0
CY/CO       3   |  2  2  7  3 |  1  1  2  0  3 12 16 |  3  0  2  0  0  2  0  0  1  0  0  2
LZ/LS       3   |  1  0  3  2 |  1  1  0  0  2  0  5 |  2  2  2  0  1  9  2  0  0  0  3  2
MQ/ML       3   |  0  1  5  2 |  2  4  2  2  2  4  5 |  7  5  2  1  1  7  0  4  4  1  0  2
MW/MH       3   |  3  0  7  4 |  2  3  0  5  2  8 13 |  7 14  2  0  0  0  4  0  8  0 13  6
MX/MM       3   |  2  3  4  5 |  0  0  0  0  0  5  2 |  9  3  1  0  0  2  0  0  1  2  2  3
PK/PH       3   |  2  0  6  4 |  3  4  1  0  0 15 10 |  8  5  1  0  3  1  1  6  3  0  7  5

High-ranked female handles mid-ranked infant: Female NQ handles Infant NK 10 times. NK’s mother is NY.

INFANTS/        | KM KN NQ PO | HQ
Mothers   ranks |  1  1  1  1 |  2
KG/KM       1   |  0  0  4  1 |  1
HZ/HQ       2   | 13 23  7  5 |  0
LC/LL       2   |  4  0  1  4 |  3
NK/NY       2   | 12  4 10  5 |  9
PZ/PS       2   |  1  3  4  1 |  0

The variables
• Handler rank: high (1), mid (2), low (3)
• Infant rank: high (1), mid (2), low (3)
• The number of interactions between a given infant-handler pair

3 kinds of handling behavior
• Passive: movement to within 1 m of the mother-infant pair with no attempt to handle,
• Unsuccessful: movement to within 1 m of the mother-infant pair with an attempted (but not successful) handle, or
• Successful: a successful handle.
• NOTE: Each count in the matrix above is the sum of counts from the 3 categories listed on this slide.

Research hypotheses
• Females will tend to handle the infants of females who are ranked the same as or lower than themselves. (RH1)
• Females will tend to handle the infants of females who are ranked directly below them (or of the same rank if the female is low-ranked). (RH2)

The data (handout)

Summing the counts in the handout table within each combination of infant rank (rows) and handler rank (columns) gives:

     [,1] [,2] [,3]
[1,]    5    5    3
[2,]   97   83   95
[3,]   68  138  184

X = handler rank; Y = infant rank

(A) Counts
                 Handler's rank
                 Hi     Mid    Low
Infant   Hi        5      5      3
rank     Mi       97     83     95
         Lo       68    138    184
Totals:          170    226    282

(B) Column %
                 Handler's rank
                 Hi        Mid       Low
Infant   Hi      2.9%  >   2.2%  >   1.1%
rank     Mi     57.1%  >  36.7%  >  33.7%
         Lo     40.0%  <  61.1%  <  65.2%
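The column percentages follow directly from the counts, so recomputing them is a quick check on the rounded figures. A minimal sketch in Python (the function and variable names are illustrative, not from the talk):

```python
# Counts from the slide: rows = infant rank (Hi, Mi, Lo),
# columns = handler rank (Hi, Mid, Low).
counts = [
    [5, 5, 3],
    [97, 83, 95],
    [68, 138, 184],
]


def column_percentages(table):
    """Return each cell as a percentage of its column total."""
    totals = [sum(col) for col in zip(*table)]
    return [[100 * cell / total for cell, total in zip(row, totals)] for row in table]


col_totals = [sum(col) for col in zip(*counts)]
col_pct = column_percentages(counts)

print("Totals:", col_totals)
for label, row in zip(["Hi", "Mi", "Lo"], col_pct):
    print(label, "  ".join(f"{p:5.1f}%" for p in row))
```

Reading across a row shows how the share of handling directed at infants of that rank changes with the handler's rank.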

Adjusted residuals

                 Handler's rank
                 Hi      Mid     Low
Infant   Hi      1.12    0.40   -1.37
rank     Mi      5.06   -1.44   -3.08
         Lo     -5.34    1.32    3.43
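The slides do not say how the residuals were computed. A minimal sketch, assuming the usual adjusted (standardized Pearson) residual for a two-way table, (observed - expected) / sqrt(expected * (1 - row share) * (1 - column share)), reproduces the values above to two decimals:

```python
import math

# Counts from the slide: rows = infant rank, columns = handler rank.
counts = [
    [5, 5, 3],
    [97, 83, 95],
    [68, 138, 184],
]


def adjusted_residuals(table):
    """Adjusted residuals under independence of rows and columns (assumed definition)."""
    n = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    result = []
    for i, row in enumerate(table):
        out = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            std_err = math.sqrt(
                expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            )
            out.append((observed - expected) / std_err)
        result.append(out)
    return result


residuals = adjusted_residuals(counts)
for label, row in zip(["Hi", "Mi", "Lo"], residuals):
    print(label, "  ".join(f"{r:6.2f}" for r in row))
```

Large positive values mark cells with more handling than independence of the two ranks would predict; large negative values mark cells with less.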

Is the relationship statistically significant?

The Null Model The female handlers interacted with infants as given in the data set. These interactions involved a variety of complex causes, but none of this complexity had anything to do with ranks. That is, ranks can be viewed as meaningless labels attached to infants and females.

Computing a permutation test
• Choose a test statistic, C.
• (1) Assign ranks at random to infants and females using the rank distributions of the data set. That is, assign ranks at random so that infants are assigned, in this case, 1 High, 4 Mid, and 6 Low and so that females are assigned 4 Highs, 7 Mids, and 12 Lows. This assignment leads to the original data table but with permuted ranks.
• (2) Re-form the 3-by-3 table.
• (3) Compute the value of C for this table.
• Iterate (1)-(3) many times to build an empirical null distribution.
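Steps (1)-(3) can be sketched in Python using the handout counts. Two things here are assumptions rather than the authors' stated method. First, the statistic `lte` is taken to be the total count in cells where the infant's rank is the same as or lower than the handler's, which is one reading of RH1. Second, the sample permutation on the handout gives each infant its mother's permuted rank, so the sketch shuffles ranks among the 11 mothers and, separately, among the 12 other females; this keeps both rank distributions fixed. All function names are illustrative.

```python
import random

# Handout data. Each row: infant, mother, then counts for the 23 handlers in order.
HANDLERS = "KM KN NQ PO HQ LL NY PS SK ST WK AL CO DD LS LY MH ML MM PA PH PT RS".split()
OBSERVED_RANKS = dict(zip(HANDLERS, [1] * 4 + [2] * 7 + [3] * 12))  # 1=high, 2=mid, 3=low
ROWS = """
KG KM  0  0  4  1  1  0  0  0  3  1  0  0  0  0  0  0  0  0  0  0  0  2  1
HZ HQ 13 23  7  5  0  2  1  1  5  6 18  1  6  3  0  1  4  1  0  9  0 10  1
LC LL  4  0  1  4  3  0  2  1  1  5  3  1  0  0  1  0  2  1  1  1  0  1  6
NK NY 12  4 10  5  9  1  0  2  3 11  7  8  6  3  1  0  2  1  1  5  3  2  3
PZ PS  1  3  4  1  0  0  0  0  0  0  2  0  2  0  0  0  3  0  1  1  0  3  0
CY CO  2  2  7  3  1  1  2  0  3 12 16  3  0  2  0  0  2  0  0  1  0  0  2
LZ LS  1  0  3  2  1  1  0  0  2  0  5  2  2  2  0  1  9  2  0  0  0  3  2
MQ ML  0  1  5  2  2  4  2  2  2  4  5  7  5  2  1  1  7  0  4  4  1  0  2
MW MH  3  0  7  4  2  3  0  5  2  8 13  7 14  2  0  0  0  4  0  8  0 13  6
MX MM  2  3  4  5  0  0  0  0  0  5  2  9  3  1  0  0  2  0  0  1  2  2  3
PK PH  2  0  6  4  3  4  1  0  0 15 10  8  5  1  0  3  1  1  6  3  0  7  5
"""
DATA = [
    (fields[1], [int(x) for x in fields[2:]])
    for fields in (line.split() for line in ROWS.strip().splitlines())
]
MOTHERS = [mother for mother, _ in DATA]
OTHERS = [h for h in HANDLERS if h not in MOTHERS]


def collapse(ranks):
    """Step (2): re-form the 3x3 table (rows = infant's rank, columns = handler's rank)."""
    table = [[0] * 3 for _ in range(3)]
    for mother, counts in DATA:
        for handler, count in zip(HANDLERS, counts):
            table[ranks[mother] - 1][ranks[handler] - 1] += count
    return table


def lte(table):
    """ASSUMED statistic C: count where infant rank is the same as or lower than handler rank."""
    return sum(table[i][j] for i in range(3) for j in range(3) if i >= j)


def permute_ranks(rng):
    """Step (1), ASSUMED scheme: shuffle ranks among mothers and among the other females."""
    mother_ranks = [OBSERVED_RANKS[m] for m in MOTHERS]
    other_ranks = [OBSERVED_RANKS[h] for h in OTHERS]
    rng.shuffle(mother_ranks)
    rng.shuffle(other_ranks)
    return {**dict(zip(MOTHERS, mother_ranks)), **dict(zip(OTHERS, other_ranks))}


def permutation_test(statistic, n_resamples=1000, seed=0):
    """Return the observed statistic and the share of resamples at least as large."""
    rng = random.Random(seed)
    observed = statistic(collapse(OBSERVED_RANKS))
    hits = sum(
        statistic(collapse(permute_ranks(rng))) >= observed  # steps (2) and (3)
        for _ in range(n_resamples)
    )
    return observed, hits / n_resamples


if __name__ == "__main__":
    observed, p_value = permutation_test(lte)
    print(f"observed C = {observed}, permutation p-value = {p_value:.3f}")
```

Because the resampling is random and the statistic is an assumed reading, the p-value printed here need not match the figure reported in the talk. Passing a different `statistic` function reuses the same machinery for another hypothesis.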

Test statistic for Research hypothesis 1

A sample permutation (handout)

                                              HANDLERS
INFANTS/        | KM KN NQ PO HQ LL NY PS SK ST WK AL CO DD LS LY MH ML MM PA PH PT RS
Mothers   ranks |  1  3  1  3  3  3  2  2  3  1  3  2  2  3  3  1  3  3  3  2  2  3  2
KG/KM       1   |  0  0  4  1  1  0  0  0  3  1  0  0  0  0  0  0  0  0  0  0  0  2  1
HZ/HQ       3   | 13 23  7  5  0  2  1  1  5  6 18  1  6  3  0  1  4  1  0  9  0 10  1
LC/LL       3   |  4  0  1  4  3  0  2  1  1  5  3  1  0  0  1  0  2  1  1  1  0  1  6
NK/NY       2   | 12  4 10  5  9  1  0  2  3 11  7  8  6  3  1  0  2  1  1  5  3  2  3
PZ/PS       2   |  1  3  4  1  0  0  0  0  0  0  2  0  2  0  0  0  3  0  1  1  0  3  0
CY/CO       2   |  2  2  7  3  1  1  2  0  3 12 16  3  0  2  0  0  2  0  0  1  0  0  2
LZ/LS       3   |  1  0  3  2  1  1  0  0  2  0  5  2  2  2  0  1  9  2  0  0  0  3  2
MQ/ML       3   |  0  1  5  2  2  4  2  2  2  4  5  7  5  2  1  1  7  0  4  4  1  0  2
MW/MH       3   |  3  0  7  4  2  3  0  5  2  8 13  7 14  2  0  0  0  4  0  8  0 13  6
MX/MM       3   |  2  3  4  5  0  0  0  0  0  5  2  9  3  1  0  0  2  0  0  1  2  2  3
PK/PH       2   |  2  0  6  4  3  4  1  0  0 15 10  8  5  1  0  3  1  1  6  3  0  7  5

Resulting 3-by-3 table:

     [,1] [,2] [,3]
[1,]    5    1    7
[2,]   85   60  119
[3,]   81  117  203

Null distribution: 1000 resamples

Conclusion
• P-value ≈ 15/1000 = .015
• The observed pattern is unlikely to be the result of chance alone.

Summary by type of interaction

        LTE     LT      n
PA      0.015   0.038   678
Pass    0.013   0.071   377
Un      0.012   0.119   189
Succ    0.372   0.017   112
---------------------------
p-values for two test statistics (LTE and LT) for 4 datasets of counts.

Look at Successful interactions (tables: Counts, Resids, Column%)

Stability of results
• Suggested by Clifford Lunneborg (Stats 2002)
• Stable description: “finding of the study … is not unduly influenced by the inclusion in the study of one particular source of observations.”

Four views of stability

For example …
• Remove infants, one at a time, and recompute the test statistic.
• Use a normalized test statistic: LTE* = LTE / (sum of table entries);
• LTE* and LTE have the same permutation distribution, …
• But LTE* accounts for sub-table count variation.
• LTE* values are stable (plot below)
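The remove-one-infant calculation can be sketched as follows. The per-infant counts are the rows of the handout table summed by handler rank; the infant ranks come from the handout. As an assumption, LTE is taken to be the total count in cells where the infant's rank is the same as or lower than the handler's, and LTE* divides that by the sub-table total. The names are illustrative.

```python
# infant -> (infant rank, [counts from high-, mid-, low-ranked handlers]),
# obtained by summing each row of the handout table within handler rank.
INFANTS = {
    "KG": (1, [5, 5, 3]),
    "HZ": (2, [48, 33, 36]),
    "LC": (2, [9, 15, 14]),
    "NK": (2, [31, 33, 35]),
    "PZ": (2, [9, 2, 10]),
    "CY": (3, [14, 35, 10]),
    "LZ": (3, [6, 9, 23]),
    "MQ": (3, [8, 21, 34]),
    "MW": (3, [14, 33, 54]),
    "MX": (3, [14, 7, 23]),
    "PK": (3, [12, 33, 40]),
}


def lte_star(infants):
    """ASSUMED LTE (infant rank same as or lower than handler rank) over the table total."""
    lte = sum(
        count
        for infant_rank, row in infants.values()
        for handler_rank, count in enumerate(row, start=1)
        if infant_rank >= handler_rank  # rank 3 is lowest
    )
    total = sum(sum(row) for _, row in infants.values())
    return lte / total


full = lte_star(INFANTS)
leave_one_out = {
    name: lte_star({k: v for k, v in INFANTS.items() if k != name})
    for name in INFANTS
}

print(f"all infants: LTE* = {full:.3f}")
for name, value in leave_one_out.items():
    print(f"without {name}: LTE* = {value:.3f}")
```

Dividing by the sub-table total puts the eleven leave-one-out values on a common scale, since removing a heavily handled infant shrinks the total count considerably.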

Stability summary

Choice of test statistics
• LTE and LT are ad hoc, but intuitive
• Power analysis to compare LTE and LT to some other statistics
• Correlation-based statistics
• M = Pearson’s correlation putting scores on ranks (Agresti,88)
• GK = Goodman and Kruskal’s gamma (Agresti,58)
• Beta = the asymmetry parameter in an ordinal quasi-symmetric log-linear model (Agresti,202)
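Two of these alternatives are easy to compute from the 3-by-3 table of counts. The sketch below shows Goodman and Kruskal's gamma and a score-based Pearson correlation; the equally spaced scores 1, 2, 3 are an assumption, and the Beta statistic is not shown because it requires fitting a log-linear model. Either function could serve as the statistic C in a permutation test.

```python
import math

# Observed counts: rows = infant rank, columns = handler rank (1 = high, 3 = low).
counts = [
    [5, 5, 3],
    [97, 83, 95],
    [68, 138, 184],
]


def goodman_kruskal_gamma(table):
    """Gamma = (C - D) / (C + D) over concordant (C) and discordant (D) pairs."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]
    return (concordant - discordant) / (concordant + discordant)


def score_correlation(table, scores=(1, 2, 3)):
    """Pearson correlation after scoring the ranks (scores are an ASSUMPTION)."""
    n = sum(map(sum, table))
    cells = [
        (scores[i], scores[j], table[i][j])
        for i in range(len(table))
        for j in range(len(table[0]))
    ]
    mean_u = sum(u * c for u, _, c in cells) / n
    mean_v = sum(v * c for _, v, c in cells) / n
    cov = sum((u - mean_u) * (v - mean_v) * c for u, v, c in cells)
    var_u = sum((u - mean_u) ** 2 * c for u, _, c in cells)
    var_v = sum((v - mean_v) ** 2 * c for _, v, c in cells)
    return cov / math.sqrt(var_u * var_v)


gamma = goodman_kruskal_gamma(counts)
m = score_correlation(counts)
print(f"GK gamma = {gamma:.3f}, M = {m:.3f}")
```

With rank 1 as high and rank 3 as low, a positive value means lower-ranked handlers tend to be paired with lower-ranked infants.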

Simulation
• Product multinomial model where J-th handler generates n_J interactions
• 2^(7-3) design to estimate main effects for 7 factors, including
  • Total sample size
  • Table dimension
  • RH1 vs RH2
  • Strength of research hypothesis
  • Patterns of non-homogeneity of counts
  • Infant rank distribution
  • Handler rank distribution
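One draw from a product multinomial model can be sketched as below. The slides do not give the probabilities or counts used in the power study, so every number here is hypothetical, as is the simplification that each handler spreads its interactions over infant ranks rather than over individual infants.

```python
import random
from collections import Counter


def simulate_table(handler_ranks, n_per_handler, probs_by_handler_rank, seed=0):
    """Draw one 3x3 table (rows = infant rank, columns = handler rank).

    Each handler independently spreads its n_j interactions over the three
    infant ranks, using the probability vector for its own rank.
    """
    rng = random.Random(seed)
    table = [[0] * 3 for _ in range(3)]
    for handler_rank, n_j in zip(handler_ranks, n_per_handler):
        weights = probs_by_handler_rank[handler_rank]
        draws = rng.choices([1, 2, 3], weights=weights, k=n_j)
        for infant_rank, count in Counter(draws).items():
            table[infant_rank - 1][handler_rank - 1] += count
    return table


# Hypothetical settings: 23 handlers with the handout's rank distribution,
# 30 interactions each, and made-up probabilities over infant ranks.
handler_ranks = [1] * 4 + [2] * 7 + [3] * 12
n_per_handler = [30] * 23
probs = {
    1: [0.1, 0.5, 0.4],  # high-ranked handlers
    2: [0.1, 0.3, 0.6],  # mid-ranked handlers
    3: [0.1, 0.3, 0.6],  # low-ranked handlers
}

table = simulate_table(handler_ranks, n_per_handler, probs)
for row in table:
    print(row)
```

Repeating the draw many times and applying a permutation test to each simulated table gives an estimate of a statistic's power under the chosen settings.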

Results of power study
• LTE and LT outperform others;
• LTE does best under RH1
• LT does best under RH2
• Good news!

Summary of talk
• There is evidence for both research hypotheses, depending on the type of handling behavior;
• This suggests a nuanced description of how infant-handling behavior works.
• Permutation tests gave a way of analyzing a messy data set;
• We assessed stability through a simple remove-one-at-a-time strategy;
• We compared the power of test statistics and found simple ones to perform well.

Final slide
• Thank you for listening.
• Email: mooret@grinnell.edu
• Slides and handout: http://www.math.grinnell.edu/~mooret/reports/reports.html
• Note: this info is on your handout.
• Research supported by Grinnell College and the Frank and Roberta Furbush fund
