Explore nonparametric methods, computer algebra, Mann-Whitney statistic, moments computation, generating functions, and practical applications in statistics.
Computer algebra and rank statistics Alessandro Di Bucchianico HCM Workshop Coimbra November 5, 1997
How to run this presentation? • the presentation runs itself most of the time • click the mouse if you want to continue • type S to stop or restart the presentation • underlined items are hyperlinks to files on the World Wide Web (usually PostScript files of technical reports) Enjoy my presentation!
Outline • General remarks on nonparametric methods • What is computer algebra? • Case study: the Mann-Whitney statistic • Critical values of rank test statistics • Moments of the Mann-Whitney statistic • Conclusions
General remarks on nonparametric methods Practical problems • tables (limited, errors, not exact,…) • limited availability in statistical software • procedures in statistical software often only based on asymptotics
General remarks on nonparametric methods Mathematical problems • in general no closed expression for distribution function • direct enumeration only feasible for small sample sizes • recurrences are time-consuming
Case study: Mann-Whitney statistic independent samples X1,…,Xm and Y1,…,Yn with continuous distribution functions F and G, respectively (hence no ties, with probability one) order the pooled sample from small to large
Mann-Whitney (continued) Wilcoxon: Wm,n = Σi rank(Xi) Mann-Whitney: Mm,n = #{(i,j) | Yj < Xi} Wm,n = Mm,n + ½ m(m+1) What is the distribution of Mm,n under H0: F = G?
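The relation between the two statistics can be checked on a small data set. The following is a minimal sketch, not taken from the slides: the function names `mann_whitney_m` and `wilcoxon_w` and the sample values are illustrative assumptions, and the samples are assumed to contain no ties.

```python
def mann_whitney_m(xs, ys):
    """Count the pairs (i, j) with ys[j] < xs[i]."""
    return sum(1 for x in xs for y in ys if y < x)


def wilcoxon_w(xs, ys):
    """Sum of the ranks of the X-observations in the pooled sample.

    Assumes there are no ties, as on the slide.
    """
    pooled = sorted(xs + ys)
    rank = {value: position for position, value in enumerate(pooled, start=1)}
    return sum(rank[x] for x in xs)


# Illustrative data (hypothetical, chosen only to have no ties).
xs = [3.1, 1.2, 5.0]
ys = [2.0, 4.0]
m = len(xs)

M = mann_whitney_m(xs, ys)
W = wilcoxon_w(xs, ys)
print(M, W, M + m * (m + 1) // 2)
```

Here the last two printed numbers agree, which is the identity Wm,n = Mm,n + ½ m(m+1) for this sample.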
Computational speed (Pentium 133 MHz) • Exact: P(M5,5 ≤ 4) = 1/21 ≈ 0.0476; computing time: 0.05 sec (generating function: degree 25) • Asymptotic approximation: P(M5,5 ≤ 4) ≈ 0.0384 • Exact: P(M20,20 ≤ 138) = 0.0482 (rounded); computing time: 8.5 sec (generating function: degree 400) • Asymptotic approximation: P(M20,20 ≤ 138) ≈ 0.0475 Asymptotics and exact calculations are both useful!
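The generating function itself is not reproduced in this transcript. The sketch below assumes the standard product form under H0, in which the probability generating function of Mm,n is the Gaussian binomial coefficient ∏ (1 − x^(n+i)) / (1 − x^i), i = 1,…,m, divided by the ordinary binomial coefficient C(m+n, m). The function names are illustrative, and plain Python with exact fractions stands in for a computer algebra system.

```python
from fractions import Fraction
from math import comb


def mann_whitney_counts(m, n):
    """Coefficients of prod_{i=1..m} (1 - x^(n+i)) / (1 - x^i).

    Entry k is the number of arrangements of the pooled sample with M = k.
    """
    coeffs = [1]
    for i in range(1, m + 1):
        shift = n + i
        # Multiply by (1 - x^shift).
        product = coeffs + [0] * shift
        for k, c in enumerate(coeffs):
            product[k + shift] -= c
        # Divide exactly by (1 - x^i): q[k] = product[k] + q[k - i].
        for k in range(i, len(product)):
            product[k] += product[k - i]
        # The top i coefficients are zero after the exact division.
        coeffs = product[: len(product) - i]
    return coeffs


def mann_whitney_cdf(m, n, k):
    """Exact P(M_{m,n} <= k) under H0 as a Fraction."""
    counts = mann_whitney_counts(m, n)
    return Fraction(sum(counts[: k + 1]), comb(m + n, m))


p_small = mann_whitney_cdf(5, 5, 4)
print(p_small)  # 1/21, the exact value quoted on the slide
```

The polynomial for m = n = 5 has degree 25 and the one for m = n = 20 has degree 400, matching the degrees quoted on the slide.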
Other examples of nonparametric test statistics with closed form for generating function include: • Wilcoxon signed rank statistic • Kendall rank correlation statistic • Kolmogorov one-sample statistic • Smirnov two-sample statistic • Jonckheere-Terpstra statistic Consult the combinatorial literature! What to do if there is no generating function?
Linear rank statistics Zi = 1 if the ith order statistic in the pooled sample is an X-observation, and 0 otherwise Streitberg & Röhmel 1986 (cf. Euler 1748): Branch-and-bound algorithm (Van de Wiel)
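The formula attributed to Streitberg and Röhmel is missing from this transcript. The sketch below assumes the two-variable generating function usually associated with their work: for a linear rank statistic T = Σ a(i) Zi with scores a(1),…,a(N), the coefficient of x^t y^m in ∏ (1 + x^a(i) y) counts the ways to place m X-observations so that T = t. The name `linear_rank_counts` and the integer-score restriction are assumptions made for this illustration.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb


def linear_rank_counts(scores, m):
    """Coefficient of y^m in prod_i (1 + x^scores[i] * y), as {t: count}."""
    # table[j][t] = number of ways to pick j positions with score sum t.
    table = [defaultdict(int) for _ in range(m + 1)]
    table[0][0] = 1
    for a in scores:
        # Go downwards in j so that each position is used at most once.
        for j in range(m, 0, -1):
            for value, count in list(table[j - 1].items()):
                table[j][value + a] += count
    return dict(table[m])


# Wilcoxon scores a(i) = i with m = n = 5 give the rank-sum statistic W.
counts = linear_rank_counts(list(range(1, 11)), 5)
total = comb(10, 5)
# W <= 19 corresponds to M <= 4, because W = M + m(m+1)/2 = M + 15.
p = Fraction(sum(c for t, c in counts.items() if t <= 19), total)
print(p)  # 1/21
```

With Wilcoxon scores this reproduces the exact Mann-Whitney tail probability, while other score choices give other two-sample linear rank statistics.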
Moments of Mann-Whitney statistic • Mann and Whitney (1947) calculated the 4th central moment • Fix and Hodges (1955) calculated the 6th central moment • Computations are based on recurrences Can we improve? 21st-century solution: computer algebra and generating functions
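As an illustration of that idea, moments can be read off directly from the coefficients of the generating function. This is a minimal sketch rather than the method of the talk: it assumes the Gaussian binomial product form of the generating function under H0, and the function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def gaussian_binomial_coeffs(m, n):
    """Coefficients of prod_{i=1..m} (1 - x^(n+i)) / (1 - x^i)."""
    coeffs = [1]
    for i in range(1, m + 1):
        s = n + i
        coeffs = coeffs + [0] * s
        # Multiply by (1 - x^s) in place, highest power first.
        for k in range(len(coeffs) - 1, s - 1, -1):
            coeffs[k] -= coeffs[k - s]
        # Divide by (1 - x^i) in place, lowest power first.
        for k in range(i, len(coeffs)):
            coeffs[k] += coeffs[k - i]
        del coeffs[len(coeffs) - i:]  # trailing zeros after exact division
    return coeffs


def central_moment(m, n, r):
    """Exact r-th central moment of M_{m,n} under H0."""
    counts = gaussian_binomial_coeffs(m, n)
    total = comb(m + n, m)
    mean = Fraction(sum(k * c for k, c in enumerate(counts)), total)
    return sum(c * (k - mean) ** r for k, c in enumerate(counts)) / total


def mean_of_m(m, n):
    """Exact E(M_{m,n}) under H0."""
    counts = gaussian_binomial_coeffs(m, n)
    return Fraction(sum(k * c for k, c in enumerate(counts)), comb(m + n, m))


print(mean_of_m(5, 5), central_moment(5, 5, 2))
```

For m = n = 5 the results agree with the known closed forms E(M) = mn/2 and Var(M) = mn(m+n+1)/12; higher central moments follow by changing `r`.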
Computing moments of Mm,n recompute E(Mm,n) (following René Swarttouw)
Hence, it remains to calculate for 1 ≤ k ≤ m: After some simplifications:
L’Hôpital’s rule yields that the limit equals: It is tedious to perform these computations by hand. Alternative: compute moments using Mathematica.
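The displayed formulas for this computation are missing from the transcript. One way to reconstruct the hand calculation, assuming the Gaussian binomial product form of the generating function G and a logarithmic derivative (the original slides may have proceeded differently), is:

```latex
\begin{align*}
G(x) &= \binom{m+n}{m}^{-1} \prod_{k=1}^{m} \frac{1 - x^{n+k}}{1 - x^{k}},
  \qquad G(1) = 1, \\
E(M_{m,n}) &= G'(1) = \lim_{x \to 1} \frac{d}{dx} \log G(x)
  = \sum_{k=1}^{m} \lim_{x \to 1}
    \left( \frac{k\,x^{k-1}}{1 - x^{k}}
         - \frac{(n+k)\,x^{n+k-1}}{1 - x^{n+k}} \right), \\
\frac{j\,x^{j-1}}{1 - x^{j}} &= \frac{1}{1 - x} - \frac{j-1}{2} + O(1 - x)
  \quad (x \to 1), \\
\lim_{x \to 1}
    \left( \frac{k\,x^{k-1}}{1 - x^{k}}
         - \frac{(n+k)\,x^{n+k-1}}{1 - x^{n+k}} \right)
  &= \frac{(n+k-1) - (k-1)}{2} = \frac{n}{2}, \\
E(M_{m,n}) &= \sum_{k=1}^{m} \frac{n}{2} = \frac{mn}{2}.
\end{align*}
```

Each limit has the indeterminate form ∞ − ∞, which is where L’Hôpital’s rule (or the expansion in the third line) is needed. The same approach for higher moments requires higher derivatives of log G, which is where the hand computation becomes tedious.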
Conclusions • generating functions are also useful in nonparametric statistics • computer algebra is a natural tool for mathematicians • asymptotics and exact calculations complement each other
Topics under investigation • tests for censored data • power calculations • nonparametric ANOVA (Kruskal-Wallis, block designs, multiple comparisons) • Spearman’s ρ (rank correlation) • multimedia / World Wide Web implementation Click on underlined items to obtain PostScript file of technical report
References • A. Di Bucchianico, Combinatorics, computer algebra and the Wilcoxon-Mann-Whitney test, to appear in J. Stat. Plann. Inf. • B. Streitberg and J. Röhmel, Exact distributions for permutation and rank tests: An introduction to some recently published algorithms, Stat. Software Newsletter 12 (1986), 10-18
References (continued) • M.A. van de Wiel, Exact distributions of nonparametric statistics using computer algebra, Master’s Thesis, TUE, 1996 • M.A. van de Wiel and A. Di Bucchianico, The exact distribution of Spearman’s rho, technical report • M.A. van de Wiel, A. Di Bucchianico and P. van der Laan, Exact distributions of nonparametric test statistics using computer algebra, technical report
References (continued) • M.A. van de Wiel, Edgeworth expansions with exact cumulants for two-sample linear rank statistics, technical report • M.A. van de Wiel, Exact distributions of two-sample rank statistics and block rank statistics using computer algebra, technical report