This study introduces new interval estimation procedures for the probability of disease transmission in multiple-vector transfer designs. It discusses the advantages and disadvantages of single-vector and multiple-vector transfers and compares the properties of the maximum likelihood estimator, Wald confidence intervals, variance stabilizing intervals, and modified Clopper-Pearson intervals. The study also notes applications of the group testing design in other fields.
New interval estimating procedures for the disease transmission probability in multiple-vector transfer designs
Joshua M. Tebbs and Christopher R. Bilder
Department of Statistics, Oklahoma State University
tebbs@okstate.edu and chris@chrisbilder.com
Introduction
• Plant disease is responsible for major agricultural losses throughout the world
• Diseases are often spread by insect vectors (e.g., aphids, leafhoppers, planthoppers)
• Example: www.knowledgebank.irri.org/ricedoctor_mx/Fact_Sheets/Pests/Planthopper.htm
[Figure: brown planthopper and whitebacked planthopper]
Example
• Ornaghi et al. (1999) study the effects of the “Mal Rio Cuarto” (MRC) virus and its spread by the Delphacodes kuscheli planthopper
• The MRC virus is the most damaging maize virus in Argentina
• The goal was to estimate p, the probability of disease transmission for a single vector
• Plant pathologists often use vector transfers to estimate p
• In such experiments, insects are moved from an infected source to the test plants
Single-vector transfers
• The most straightforward way to estimate p is by using a single-vector transfer
• Each test plant hosts one vector, and test plants must be individually caged
• Under the binomial model, the proportion of infected test plants gives the maximum likelihood estimate of p
• Disadvantages of a single-vector transfer:
  • Requires a large amount of space (since insects must be individually isolated)
  • Is costly, since one needs a large number of test plants and individual cages
Multiple-vector transfers
[Figure: greenhouse with enclosed test plants, each marked Y=1 or Y=0; the legend distinguishes planthoppers that transmit the virus from those that do not]
• A group of s > 1 insect vectors is allocated to each test plant
• Even though test plants are occupied by multiple insects, the goal is still to estimate p, the probability of disease transmission for a single vector
Multiple-vector transfers
• Advantages of a multiple-vector versus single-vector transfer:
  • Potential savings in time, cost, and space
  • Statistical properties of estimators are much better (for a fixed number of test plants)
• A multiple-vector transfer is an application of the group-testing experimental design
• Other applications of group testing:
  • Infectious disease seroprevalence estimation in human populations
  • Disease transmission in animal studies
  • Drug discovery applications
Notation and assumptions
• Define:
  • n = number of test plants
  • s = number of insects per plant (“group size”)
  • Y = 1: “infected test plant”, a plant for which at least one vector (out of s) infects
  • Y = 0: “uninfected test plant”, a plant for which no vectors (out of s) infect
• Assumptions:
  • Common group size s
  • The statuses of individual vectors are iid Bernoulli random variables with mean p
  • The statuses of test plants are independent
  • Test plants are not misclassified
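The assumptions above imply that a test plant is infected with probability 1 − (1 − p)^s. The following is a minimal simulation sketch of that fact, not part of the original talk; the function name, the seed, and the value p = 0.02 are illustrative assumptions.

```python
import random


def simulate_infected_plants(n, s, p, rng):
    """Return T, the number of infected plants among n, each hosting s vectors."""
    infected = 0
    for _ in range(n):
        # A plant is infected (Y = 1) if at least one of its s vectors transmits.
        if any(rng.random() < p for _ in range(s)):
            infected += 1
    return infected


rng = random.Random(2003)   # arbitrary seed, for reproducibility only
n, s, p = 24, 7, 0.02       # hypothetical values chosen for illustration
reps = 5000

# Average proportion of infected plants over many simulated experiments.
mean_prop = sum(simulate_infected_plants(n, s, p, rng) for _ in range(reps)) / (reps * n)
theta = 1 - (1 - p) ** s    # probability that a single plant is infected

print(mean_prop, theta)
```

The two printed values should be close, which is the link between the plant-level data and the vector-level parameter p that the estimators rely on.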
Maximum likelihood estimator for p
• Let $T = \sum_{i=1}^{n} Y_i$ denote the number of infected test plants. Under our design assumptions, T has a binomial distribution with parameters n and $\theta = 1 - (1-p)^s$
• The maximum likelihood estimator of p is given by $\hat{p} = 1 - (1 - \hat{\theta})^{1/s}$, where $\hat{\theta} = T/n$ (the proportion of infected test plants)
• Estimates of p are computed by only examining the test plants (and not the individual vectors themselves)
• The binomial model is only appropriate if test plants do not differ materially in their resistance to pathogen transmission
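As a small worked sketch of the estimator: writing θ = 1 − (1 − p)^s for the probability that a test plant is infected, the MLE is p̂ = 1 − (1 − T/n)^(1/s). The function name below is an assumption; the numbers are the Ornaghi et al. (1999) data used in this talk (s = 7, n = 24, t = 3).

```python
def mle_transmission_probability(t, n, s):
    """MLE of p from t infected plants out of n, with s vectors per plant."""
    theta_hat = t / n                      # proportion of infected test plants
    return 1.0 - (1.0 - theta_hat) ** (1.0 / s)


p_hat = mle_transmission_probability(t=3, n=24, s=7)
print(round(p_hat, 4))
```

Only plant-level counts enter the calculation; no individual insect is ever classified.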
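Properties of the MLE and the Wald CI
• The statistic $\hat{p}$ has the following properties:
  • Consistent as n gets large
  • Approximately normally distributed; more precisely, $\sqrt{n}\,(\hat{p} - p) \xrightarrow{d} N(0, \sigma_p^2)$, where $\sigma_p^2 = \dfrac{1 - (1-p)^s}{s^2 (1-p)^{s-2}}$
• A 100(1−α) percent Wald confidence interval is given by $\hat{p} \pm z_{1-\alpha/2} \sqrt{\widehat{\mathrm{var}}(\hat{p})}$, where $\widehat{\mathrm{var}}(\hat{p}) = \dfrac{1 - (1-\hat{p})^s}{n s^2 (1-\hat{p})^{s-2}}$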
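A minimal sketch of the Wald interval, assuming the delta-method variance estimate [1 − (1 − p̂)^s] / [n s² (1 − p̂)^(s−2)] for the MLE p̂ = 1 − (1 − t/n)^(1/s). The function name is an assumption, and the sketch requires t < n so that the variance estimate is finite.

```python
from math import sqrt
from statistics import NormalDist


def wald_interval(t, n, s, alpha=0.05):
    """Wald interval for p; assumes 0 <= t < n."""
    p_hat = 1.0 - (1.0 - t / n) ** (1.0 / s)
    # Delta-method variance estimate of p_hat under the binomial model.
    var_hat = (1.0 - (1.0 - p_hat) ** s) / (n * s**2 * (1.0 - p_hat) ** (s - 2))
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sqrt(var_hat)
    return p_hat - half_width, p_hat + half_width


# Ornaghi et al. (1999) data from this talk: s = 7, n = 24, t = 3.
lower, upper = wald_interval(t=3, n=24, s=7)
print(lower, upper)
```

For these data the lower limit comes out below zero, which illustrates why a symmetric interval is awkward when p is small.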
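Variance stabilizing interval (VSI)
• Goal: Find $g(\hat{p})$ whose variance is free of the parameter p
• Solve the following differential equation: $g'(p) = \dfrac{c_0\, s\, (1-p)^{s/2-1}}{\sqrt{1 - (1-p)^s}}$
• With $c_0 = 1$, a solution is given by $g(p) = 2 \arcsin\sqrt{1 - (1-p)^s}$
• It follows that $1 - \left\{1 - \sin^2\!\left[\arcsin\sqrt{\hat{\theta}} \mp \dfrac{z_{1-\alpha/2}}{2\sqrt{n}}\right]\right\}^{1/s}$ is a 100(1−α) percent confidence interval for p. Here, $\hat{\theta} = T/n$ and $z_{1-\alpha/2}$ is the $1-\alpha/2$ standard normal quantile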
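A minimal sketch of a variance stabilizing interval, assuming the arcsine square-root transformation of θ̂ = t/n (whose asymptotic variance 1/(4n) does not depend on p), followed by the back-transformation p = 1 − (1 − θ)^(1/s). The function name is an assumption, and the sketch does not treat the case where the shifted angle leaves [0, π/2].

```python
from math import asin, sin, sqrt
from statistics import NormalDist


def vsi_interval(t, n, s, alpha=0.05):
    """Arcsine-based variance stabilizing interval for p."""
    theta_hat = t / n
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    center = asin(sqrt(theta_hat))          # transformed estimate
    half_width = z / (2.0 * sqrt(n))        # does not depend on p
    theta_lo = sin(center - half_width) ** 2
    theta_hi = sin(center + half_width) ** 2
    # Map the limits for theta back to the p scale.
    return (1.0 - (1.0 - theta_lo) ** (1.0 / s),
            1.0 - (1.0 - theta_hi) ** (1.0 / s))


# Ornaghi et al. (1999) data from this talk: s = 7, n = 24, t = 3.
lower, upper = vsi_interval(t=3, n=24, s=7)
print(lower, upper)
```

Because sin² is never negative, the lower limit cannot fall below zero, and the interval is not symmetric about the MLE.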
Modified Clopper-Pearson (CP) interval
• The number of infected test plants, T, has a binomial distribution with parameters n and $\theta = 1 - (1-p)^s$
• One can obtain an exact Clopper-Pearson interval for $\theta$ and then transform back to the p scale (Chiang and Reeves, 1962)
• Exact 100(1−α) percent confidence limits for p are given by $p_L = 1 - \left\{\dfrac{(n-t+1)\,F_{1-\alpha/2,\,2(n-t+1),\,2t}}{t + (n-t+1)\,F_{1-\alpha/2,\,2(n-t+1),\,2t}}\right\}^{1/s}$ and $p_U = 1 - \left\{\dfrac{n-t}{n - t + (t+1)\,F_{1-\alpha/2,\,2(t+1),\,2(n-t)}}\right\}^{1/s}$, where $F_{1-\gamma,a,b}$ denotes the $1-\gamma$ quantile of the central F distribution with a (numerator) and b (denominator) degrees of freedom
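The Clopper-Pearson limits for θ can also be found without F quantiles. The sketch below swaps in a different numerical route: it solves the two defining binomial tail equations, P(T ≥ t | θ_L) = α/2 and P(T ≤ t | θ_U) = α/2, by bisection, and then maps the limits to the p scale. All function names are assumptions for illustration.

```python
from math import comb


def binom_cdf(k, n, theta):
    """P(T <= k) for T ~ binomial(n, theta)."""
    return sum(comb(n, j) * theta**j * (1.0 - theta) ** (n - j) for j in range(k + 1))


def solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for f(x) = target, where f is increasing in x."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson_p(t, n, s, alpha=0.05):
    """Exact limits for theta, transformed to limits for p."""
    # Lower limit: P(T >= t | theta) = alpha/2 (this tail increases with theta).
    theta_lo = 0.0 if t == 0 else solve_increasing(
        lambda th: 1.0 - binom_cdf(t - 1, n, th), alpha / 2.0)
    # Upper limit: P(T <= t | theta) = alpha/2, i.e. P(T > t | theta) = 1 - alpha/2.
    theta_hi = 1.0 if t == n else solve_increasing(
        lambda th: 1.0 - binom_cdf(t, n, th), 1.0 - alpha / 2.0)
    return (1.0 - (1.0 - theta_lo) ** (1.0 / s),
            1.0 - (1.0 - theta_hi) ** (1.0 / s))


# Ornaghi et al. (1999) data from this talk: s = 7, n = 24, t = 3.
lower, upper = clopper_pearson_p(t=3, n=24, s=7)
print(lower, upper)
```

The back-transformation is monotone in θ, so the coverage guarantee of the interval for θ carries over to the interval for p.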
Comparing the Wald, VSI, and CP
• The Wald interval is simple and easy to compute. However, it has three main drawbacks:
  • Provides symmetric confidence intervals even though the distribution of $\hat{p}$ may be very skewed
  • Often produces negative lower limits when p is small!
• The VSI handles each of these drawbacks
  • Not symmetric
  • Always produces lower limits within the parameter space (i.e., strictly larger than zero)
• The CP interval’s main advantage is that its coverage probability is always greater than or equal to 1−α. However, such intervals can be wastefully wide, especially if n is small.
Bayesian estimation
• Prior distribution for p
  • One-parameter beta distribution, $f(p) = \beta (1-p)^{\beta-1}$ for $0 < p < 1$, for a known value of $\beta$
  • Takes into account that p is small
  • Example when $\beta = 52.4$
Bayesian estimation
• Prior distribution for p
• Why use a one-parameter instead of a two-parameter beta?
  • Sensible model acknowledging p is small
  • Bayes and empirical Bayes estimators are simpler
  • Under squared error loss, the estimator resulting from a two-parameter beta is a ratio of complicated alternating sums
  • See Chaubey and Li (Journal of Official Statistics, 1995) for Bayes estimators
Bayesian estimation
• Posterior distribution: $f(p \mid t) = \dfrac{s\,\Gamma(n+1+\beta/s)}{\Gamma(t+1)\,\Gamma(n-t+\beta/s)}\,[1-(1-p)^s]^{t}\,(1-p)^{s(n-t)+\beta-1}$ for $0 < p < 1$
• Note: $U = 1 - (1-P)^s \sim \text{beta}(t+1,\; n-t+\beta/s)$
Empirical Bayesian estimation
• Use the marginal distribution of T to derive an estimate of $\beta$
• Why?
  • Avoid a possibly poor choice of $\beta$
  • n is often small in multiple-vector transfer experiments
  • Posterior may be adversely affected by the prior
• Marginal distribution of T: $f_T(t \mid \beta) = \dbinom{n}{t}\,\dfrac{\beta}{s}\,\dfrac{\Gamma(t+1)\,\Gamma(n-t+\beta/s)}{\Gamma(n+1+\beta/s)}$ for t = 0, 1, …, n
• Maximize $f_T(t \mid \beta)$ as a function of $\beta$ to obtain the marginal maximum likelihood estimate, $\hat{\beta}$
• Iteratively solve for $\hat{\beta}$ in $\hat{\beta} = \dfrac{s}{\psi(n+1+\hat{\beta}/s) - \psi(n-t+\hat{\beta}/s)}$, where $\psi(\cdot)$ is the digamma function
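A minimal sketch of the iterative solution, assuming the prior f(p) = β(1 − p)^(β−1), under which the marginal likelihood equation can be written as β = s / [ψ(n + 1 + β/s) − ψ(n − t + β/s)]. Because the two digamma arguments differ by the integer t + 1, the difference reduces to a finite sum, so no special-function library is needed. The function names, starting value, and tolerance are assumptions.

```python
def digamma_difference(x, m):
    """psi(x + m) - psi(x) for a positive integer m, using psi(x + 1) = psi(x) + 1/x."""
    return sum(1.0 / (x + k) for k in range(m))


def marginal_mle_beta(t, n, s, beta_start=1.0, tol=1e-10, max_iter=10_000):
    """Fixed-point iteration for the marginal maximum likelihood estimate of beta."""
    if not 0 < t < n:
        # For t = 0 or t = n the equation has no positive finite solution.
        raise ValueError("this sketch needs 0 < t < n")
    beta = beta_start
    for _ in range(max_iter):
        x = n - t + beta / s
        new_beta = s / digamma_difference(x, t + 1)
        if abs(new_beta - beta) < tol:
            return new_beta
        beta = new_beta
    raise RuntimeError("fixed-point iteration did not converge")


# Ornaghi et al. (1999) data from this talk: s = 7, n = 24, t = 3.
beta_hat = marginal_mle_beta(t=3, n=24, s=7)
print(round(beta_hat, 1))
```

For these data the iteration settles near 52.4, the value shown in the prior-distribution example earlier in the talk.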
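Credible intervals
• (1 − α)100% equal-tail
  • $[p_L, p_U]$ satisfy $\int_0^{p_L} f(p \mid t)\,dp = \alpha/2$ and $\int_{p_U}^{1} f(p \mid t)\,dp = \alpha/2$
  • Use the relationship with the beta distribution, $U = 1 - (1-p)^s \sim \text{beta}(t+1,\; n-t+\beta/s)$
  • Interval: $\left[\,1 - (1 - B_{\alpha/2,\,t+1,\,n-t+\beta/s})^{1/s},\;\; 1 - (1 - B_{1-\alpha/2,\,t+1,\,n-t+\beta/s})^{1/s}\,\right]$, where $B_{\gamma,a,b}$ is the $\gamma$ quantile of a beta(a, b) distribution
  • Remember that $\theta = 1 - (1-p)^s$ implies $p = 1 - (1-\theta)^{1/s}$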
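A minimal sketch of the equal-tail interval, assuming the posterior relationship U = 1 − (1 − p)^s ~ beta(t + 1, n − t + β/s). To stay within the standard library, the beta quantile is found by bisection on a finite-sum form of the beta CDF that holds when the first shape parameter is a positive integer (here t + 1). The function names are assumptions, and β = 52.4 is the example value from this talk.

```python
def beta_cdf_integer_a(x, a, b):
    """CDF of beta(a, b) at x for a positive integer a and any b > 0."""
    total, coef = 0.0, 1.0
    for j in range(a):
        if j > 0:
            coef *= (b + j - 1.0) / j      # coef = Gamma(b + j) / (Gamma(b) j!)
        total += coef * x**j
    return 1.0 - (1.0 - x) ** b * total


def beta_quantile_integer_a(q, a, b, iters=200):
    """Quantile of beta(a, b) by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if beta_cdf_integer_a(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def equal_tail_interval(t, n, s, beta, alpha=0.05):
    """Equal-tail credible interval for p from the beta posterior of U."""
    a, b = t + 1, n - t + beta / s
    u_lo = beta_quantile_integer_a(alpha / 2.0, a, b)
    u_hi = beta_quantile_integer_a(1.0 - alpha / 2.0, a, b)
    # Transform the limits for U back to the p scale.
    return 1.0 - (1.0 - u_lo) ** (1.0 / s), 1.0 - (1.0 - u_hi) ** (1.0 / s)


# Ornaghi et al. (1999) data from this talk, with beta = 52.4.
lower, upper = equal_tail_interval(t=3, n=24, s=7, beta=52.4)
print(lower, upper)
```

A statistics library's beta quantile function would serve equally well in place of the bisection step.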
Credible intervals
• (1 − α)100% highest posterior density (HPD) regions
  • Posterior is unimodal and right skewed
  • Find $[p_L, p_U]$ such that (1 − α)100% of the area under the posterior density is included and $p_U - p_L$ is as small as possible
  • See Tanner (1996, pp. 103-104)
  • Key is to sample from the posterior distribution
  • Use the relationship $U = 1 - (1-p)^s \sim \text{beta}(t+1,\; n-t+\beta/s)$
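A minimal sampling sketch of an HPD interval, assuming the posterior relationship U = 1 − (1 − p)^s ~ beta(t + 1, n − t + β/s). It draws U, transforms each draw to the p scale, and takes the shortest window that contains a (1 − α) share of the sorted draws. This shortest-window rule is a common Monte Carlo approximation; whether it matches the exact algorithm in Tanner (1996) is an assumption, as are the function names, the seed, and the number of draws.

```python
import random
from math import ceil


def posterior_draws(t, n, s, beta, size, rng):
    """Draw from the posterior of p by sampling U and back-transforming."""
    draws = []
    for _ in range(size):
        u = rng.betavariate(t + 1, n - t + beta / s)
        draws.append(1.0 - (1.0 - u) ** (1.0 / s))
    return draws


def hpd_from_draws(draws, alpha=0.05):
    """Shortest interval containing at least (1 - alpha) of the draws."""
    ordered = sorted(draws)
    m = len(ordered)
    k = ceil((1.0 - alpha) * m)            # draws each candidate window must hold
    # Slide a window of k consecutive order statistics and keep the narrowest.
    best = min(range(m - k + 1), key=lambda i: ordered[i + k - 1] - ordered[i])
    return ordered[best], ordered[best + k - 1]


rng = random.Random(1999)                  # arbitrary seed
# Ornaghi et al. (1999) data from this talk, with beta = 52.4.
draws = posterior_draws(t=3, n=24, s=7, beta=52.4, size=20000, rng=rng)
lower, upper = hpd_from_draws(draws)
print(lower, upper)
```

For a right-skewed posterior the HPD interval sits to the left of the equal-tail interval and is no wider than it.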
Example: Ornaghi et al. (1999)
• Data
  • s = 7 planthoppers per plant
  • n = 24 plants
  • t = 3 infected plants observed
• 95% interval estimates for p
Interval comparisons
• Coverage: $C(p) = \sum_{t} I(n,t,s) \dbinom{n}{t} \theta^{t} (1-\theta)^{n-t}$ with $\theta = 1 - (1-p)^s$, where I(n,t,s) = 1 if the interval contains p and I(n,t,s) = 0 otherwise
• Do not consider the t = 0 and t = n cases
  • Poor multiple-vector transfer experimental design
  • See Swallow (1985, Phytopathology) for guidance in choosing s
• Brown, Cai, and DasGupta (2001, Statistical Science)
• Frequentist evaluation similar to how Carlin and Louis (2000) approach evaluating confidence and credible intervals
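A minimal sketch of an exact coverage calculation: for a given p, add up the binomial probabilities of every outcome t whose interval contains p. Summing over t = 1, …, n − 1 without renormalizing is an assumption about how the excluded t = 0 and t = n cases are handled; the Wald interval is used only as an example procedure, and all function names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def wald_interval(t, n, s, alpha=0.05):
    """Wald interval for p based on the MLE; assumes 0 <= t < n."""
    p_hat = 1.0 - (1.0 - t / n) ** (1.0 / s)
    var_hat = (1.0 - (1.0 - p_hat) ** s) / (n * s**2 * (1.0 - p_hat) ** (s - 2))
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sqrt(var_hat)
    return p_hat - half_width, p_hat + half_width


def coverage(p, n, s, interval, alpha=0.05):
    """Probability that the interval contains p, summed over t = 1, ..., n - 1."""
    theta = 1.0 - (1.0 - p) ** s           # probability a test plant is infected
    total = 0.0
    for t in range(1, n):                  # t = 0 and t = n are not considered
        lower, upper = interval(t, n, s, alpha)
        if lower <= p <= upper:
            total += comb(n, t) * theta**t * (1.0 - theta) ** (n - t)
    return total


# Settings used for the comparison plots in this talk: alpha = 0.05, n = 40, s = 10.
for p in (0.01, 0.02, 0.05):
    print(p, coverage(p, n=40, s=10, interval=wald_interval))
```

Any function with the same `(t, n, s, alpha)` signature can be passed in place of `wald_interval`, so several procedures can be compared on the same grid of p values.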
Interval comparisons
• α = 0.05, n = 40, and s = 10
• The black line denotes the Wald interval and the bold line denotes the interval named in the plot title
Summary
• Best interval: VSI or modified Clopper-Pearson
• Credible intervals may be improved by taking into account the variability of the estimators
• Bootstrap intervals are mentioned in the abstract; VSI and Clopper-Pearson perform better
• Many other intervals could be investigated!
• Website
  • www.chrisbilder.com/bilder_tebbs
  • Contains R programs for examining the interval estimation properties
  • Different values of p, n, and s can be used
  • Also calculates empirical Bayes estimators
  • Program for the Ornaghi et al. (1999) data example
Contact addresses starting Fall 2003:
• Joshua M. Tebbs, Department of Statistics, Kansas State University
• Christopher R. Bilder, Department of Statistics, University of Nebraska-Lincoln, chris@chrisbilder.com