1 / 30

Nonparametric Estimation with Recurrent Event Data

Nonparametric Estimation with Recurrent Event Data. Edsel A. Pena Department of Statistics University of South Carolina E-mail: pena@stat.sc.edu. Talk at MUSC, 11/12/01. Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State).

pebbles
Download Presentation

Nonparametric Estimation with Recurrent Event Data

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Nonparametric Estimation with Recurrent Event Data Edsel A. Pena Department of Statistics University of South Carolina E-mail: pena@stat.sc.edu Talk at MUSC, 11/12/01 Based on joint works with R. Strawderman (Cornell) and M. Hollander (Florida State) Research supported by NIH and NSF Grants

  2. A Real Recurrent Event Data(Source: Aalen and Husebye (‘91), Statistics in Medicine) Variable:Migrating motor complex (MMC) periods, in minutes, for 19 individuals in a gastroenterology study concerning small bowel motility during fasting state.

  3. t-S3=4 T2=59 T1=284 T3=186 S2=343 t=533 S3=529 S1=284 Calendar Scale Pictorial Representation of Data for a Subject • Consider unit/subject #3. • K = 3 • Period (Gap) Times, Tj: 284, 59, 186 • Censored Time, t-SK: 4 • Calendar Times, Sj: 284, 343, 529 • Limit of Observation Period: t = 533 0

  4. Features of Data Set • Random observation period per subject (administrative constraints). • Length of period: t • Event of interest is recurrent. A subject may have more than one event during observation period. • # of events (K) informative about MMC period distribution (F). • Last MMC period right-censored by a variable informative about F. • Calendar times: S1, S2, …, SK. • Right-censoring variable: t-SK.

  5. Assumptions and Problem • Aalen and Husebye: “Consecutive MMC periods for each individual appear (to be) approximate renewal processes.” • Translation: The inter-event times Tij’s are assumed stochastically independent. • Problem: Under this IID assumption, and taking into account the informativeness of K and the right-censoring mechanism, to estimate the inter-event distribution, F.

  6. General Form of Data Accrual Calendar Times of Event Occurrences Si0=0 and Sij = Ti1 + Ti2 + … + Tij Number of Events in Observation Period Ki = max{j: Sij<ti} Upper limit of observation periods, t’s, could be fixed, or assumed to be IID with unknown distribution G.

  7. Observables Theoretical Problem To obtain a nonparametric estimator of the gap-time or inter-event time distribution, F; and to determine its properties. Such estimator could be used for other purposes.

  8. Relevance and Applicability • Recurrent phenomena occur in public health/medical settings. • Outbreak of a disease. • Hospitalization of a patient. • Tumor occurrence. • Epileptic seizures. • In other settings. • Non-life insurance claims. • When stock index (e.g., Dow Jones) decreases by at least 6% in one day. • Labor strikes.

  9. Existing Methods and Their Limitations • Consider only the first, possibly right-censored, observation per unit and use the product-limit estimator (PLE). • Loss of information • Inefficient • Ignore the right-censored last observation, and use empirical distribution function (EDF). • Leads to bias (“biased sampling”) • Estimator actually inconsistent

  10. Review: Prior Results Single-Event Complete Data • T1, T2, …, Tn IID F(t) = P(T < t) • Empirical Survivor Function (EDF) • Asymptotics of EDF where W1 is a zero-mean Gaussian process with covariance function

  11. In Hazards View • Hazard rate function • Cumulative hazard function • Equivalences • Another representation of the variance

  12. Single-Event Right-Censored Data • Failure times: T1, T2, …, Tn IID F • Censoring times: C1, C2, …, Cn IID G • Right-censored data (Z1, d1), (Z2, d2), …, (Zn, dn) Zi = min(Ti, Ci) di = I{Ti< Ci} • Product-limit or Kaplan-Meier Estimator

  13. PLE/KME Properties • Asymptotics of PLE where W2 is a zero-mean Gaussian process with covariance function • If G(w) = 0 for all w, so no censoring,

  14. Relevant Processes for Recurrent Event Setting • Calendar-Time Processes for ith unit Then, is a vector of square-integrable zero-mean martingales.

  15. Difficulty: arises because interest is on l(.) or L(.), but these appear in the compensator process as is the length since last event at calendar time v t GapTime s 284 343 529 533 Calendar Time • Needed:Calendar-Gaptime Space For Unit 3 in MMC Data 0

  16. Processes in Calendar-Gaptime Space • Ni(s,t) = # of events in calendar time [0,s] for the ith unit with gaptimes at most t • Yi(s,t) = number of events in [0,s] for the ith unit with gaptimes at least t.

  17. MMC Unit #3 t GapTime t=100 s 284 343 529 533 s=400 Calendar Time K3(s=400) = 2 N3(s=400,t=100) = 1 Y3(s=400,t=100) = 1

  18. “Change-of-Variable” Formulas

  19. Estimators of L and F for Recurrent Event Setting By “change-of-variable” formula, RHS is a sq-int. zero-mean martingale, so Estimator of L(t)

  20. Estimator of F • Since a generalized product-limit estimator(GPLE). • GPLE extends the EDF for complete data and the PLE or KME for single-event right-censored data to recurrent setting.

  21. Asymptotic Properties of GPLE

  22. Theorem: ; Limiting Distribution

  23. EDF: • PLE: • GPLE (recurrent event): For large s, Comparison of Variance Functions • If in stationary state, R(t) = t/mF, so mG(w) is the mean residual life of t, given t> w.

  24. Wang-Chang Estimator (JASA, ‘99) • Can handle correlated inter-event times. Comparison with GPLE may therefore be unfair to WC estimator!

  25. Frailty Model • U1, U2, …, Un are IID unobservedGamma(a, a) random variables, called frailties. • Given Ui = u, (Ti1, Ti2, Ti3, …) are independent inter-event times with • Marginal survivor function of Tij:

  26. Estimator Under Frailty Model • Frailty parameter, a, determines dependence among inter-event times. • Small (Large) a: Strong (Weak) dependence. • EM algorithm needed to obtain the estimator (FRMLE). Unobserved frailties viewed as missing.GPLE needed in EM algorithm. • Under frailty model: GPLE not consistent when frailty parameter is finite.

  27. Monte Carlo Comparisons • Under gamma frailty model. • F = EXP(q): q = 6 • G = EXP(h): h = 1 • n = 50 • # of Replications = 1000 • Frailty parameter a took values {Infty (IID), 6, 2} • Computer programs: S-Plus and Fortran routines.

  28. Simulated Comparison of the Three Estimators for Varying Frailty Parameter Black=GPLE; Blue=WCPLE; Red=FRMLE IID

  29. Effect of the Frailty Parameter (a) for Each of the Three Estimators (Black=Infty; Blue=6; Red=2)

  30. The Three Estimates of MMC Period Survivor Function IID assumption seems acceptable. Estimate of a is 10.2.

More Related