This study aims to estimate the distribution function of the incubation period of HIV/AIDS, which is important for predicting the future course of the epidemic. The nonparametric maximum likelihood estimator (MLE) for the 2-dimensional distribution is computed using parameter reduction and optimization techniques.
Estimating the distribution of the incubation period of HIV/AIDS. Marloes H. Maathuis. Joint work with Piet Groeneboom and Jon A. Wellner.
Incubation period: time between HIV infection and onset of AIDS. [Figure: timeline with HIV infection in 1985 and onset of AIDS in 1996, an incubation period of 11 years.]
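[Figure: the point (HIV 1985, AIDS 1996) plotted with HIV infection time on the horizontal axis and AIDS onset time on the vertical axis, both axes starting at 1980.]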
Censored data [Figure: interval of HIV infection 1983 to 1986; interval of onset of AIDS 1992 to 1996.] Lower bound of incubation period: 6 years. Upper bound of incubation period: 13 years.
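The two bounds on this slide follow from simple interval arithmetic. A minimal Python sketch, where the function name `incubation_bounds` and the tuple layout are illustrative assumptions rather than anything from the slides:

```python
def incubation_bounds(hiv_interval, aids_interval):
    """Bounds on Z = Y - X when X and Y are only known up to intervals.

    hiv_interval  = (earliest, latest) possible time of HIV infection
    aids_interval = (earliest, latest) possible time of onset of AIDS
    """
    hiv_lo, hiv_hi = hiv_interval
    aids_lo, aids_hi = aids_interval
    # Shortest incubation: latest infection, earliest onset (never below 0).
    lower = max(aids_lo - hiv_hi, 0)
    # Longest incubation: earliest infection, latest onset.
    upper = aids_hi - hiv_lo
    return lower, upper


print(incubation_bounds((1983, 1986), (1992, 1996)))  # (6, 13)
```

With the intervals from the slide this gives 1992 - 1986 = 6 years and 1996 - 1983 = 13 years.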
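[Figure: the censored observation drawn as a rectangle in the (X (HIV), Y (AIDS)) plane, spanning the interval of HIV infection 1983 to 1986 on the X axis and the interval of onset of AIDS 1992 to 1996 on the Y axis; both axes start at 1980.]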
Distribution functions • Goal: estimate the distribution function of the incubation period of HIV/AIDS • Why? This is important for predicting the future course of the epidemic • Strategy: First estimate the 2-dimensional distribution
Main focus • Nonparametric maximum likelihood estimator (MLE) for 2-dimensional distribution: • Computational aspects • Theoretical properties (consistency)
Computation of the MLE • Parameter reduction: determine the inner rectangles • Optimization: determine the amounts of mass assigned to the inner rectangles.
Inner rectangles [Figure sequence: observation rectangles in the (X (HIV), Y (AIDS)) plane, with the inner rectangles marked.] The MLE is insensitive to the distribution of mass within the inner rectangles. This gives non-uniqueness.
[Figure sequence: inner rectangles in the (X (HIV), Y (AIDS)) plane carrying masses α1, α2, α3, α4.] The αi are chosen to maximize the likelihood s.t. αi ≥ 0 and α1 + α2 + α3 + α4 = 1. [Figure: example solution with masses 2/5, 3/5, 0 and 0.] The αi's are not always uniquely determined: this is a second type of non-uniqueness.
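The constrained problem for the masses αi can be written out in the usual form for a nonparametric MLE with censored data. This is a sketch in assumed notation, not the formula from the slide: O_1, ..., O_n denote the observation rectangles, R_1, ..., R_m the inner rectangles, and α_j the mass on R_j.

```latex
\[
\max_{\alpha \in \mathbb{R}^m} \;
\sum_{i=1}^{n} \log\Bigl( \sum_{j=1}^{m} 1\{R_j \subseteq O_i\}\, \alpha_j \Bigr)
\quad \text{s.t.} \quad
\alpha_j \ge 0 \;\; (j = 1, \dots, m)
\quad \text{and} \quad
\sum_{j=1}^{m} \alpha_j = 1 .
\]
```

The objective depends on α only through the n inner sums, so two different vectors α that give the same inner sums have the same likelihood. That is one way the second type of non-uniqueness can arise.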
Graph theory [Figure: a set of rectangles R1, R2, R3, R4, R5 and their intersection graph.] Maximal cliques: {R1,R2,R3}, {R3,R4}, {R4,R5}, {R2,R5}. The maximal cliques correspond to the inner rectangles.
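The link between rectangles, intersection graph and maximal cliques can be sketched in a few lines of Python. The coordinates below are hypothetical, chosen only so that the intersection graph has the four maximal cliques listed on the slide. The clique search is plain Bron-Kerbosch enumeration, used here for illustration; it is not one of the reduction algorithms named in the talk.

```python
from itertools import combinations


def intersects(a, b):
    """Closed rectangles (x1, x2, y1, y2) overlap iff they overlap on both axes."""
    return a[0] <= b[1] and b[0] <= a[1] and a[2] <= b[3] and b[2] <= a[3]


def maximal_cliques(rects):
    """Maximal cliques of the intersection graph (Bron-Kerbosch, no pivoting)."""
    adj = {name: set() for name in rects}
    for u, v in combinations(rects, 2):
        if intersects(rects[u], rects[v]):
            adj[u].add(v)
            adj[v].add(u)
    found = []

    def expand(clique, candidates, excluded):
        if not candidates and not excluded:
            found.append(frozenset(clique))
            return
        for v in list(candidates):
            expand(clique | {v}, candidates & adj[v], excluded & adj[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(frozenset(), set(rects), set())
    return found


def common_region(rects, clique):
    """Intersection of all rectangles in a clique: the matching inner rectangle."""
    x1, x2, y1, y2 = zip(*(rects[name] for name in clique))
    return (max(x1), min(x2), max(y1), min(y2))


# Hypothetical coordinates (x1, x2, y1, y2), not taken from the slides.
rects = {
    "R1": (1, 3, 7, 9),
    "R2": (0, 10, 8, 10),
    "R3": (0, 2, 0, 10),
    "R4": (0, 10, 0, 2),
    "R5": (8, 10, 0, 10),
}
cliques = maximal_cliques(rects)
```

For axis-parallel rectangles, pairwise overlap implies a common overlap, which is why every clique has a non-empty `common_region`.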
Existing reduction algorithms • Betensky and Finkelstein (1999) • Gentleman and Vandal (2001, 2002) • Song (2001) These algorithms are slow, with complexity O(n^4) to O(n^5).
New algorithms • MaxCliqueFinder: complexity ≤ O(n^2 log n) • SimpleCliqueFinder: complexity O(n^2)
Segment tree [Figure sequence: rectangles R1 to R5 drawn on a grid and processed step by step with a segment tree.] Maximal cliques: {R5,R2}, {R3,R1,R2}
SimpleCliqueFinder
1 1 1 1 1 0 0 0 0
1 2 2 2 1 0 0 0 0
1 2 3 3 2 1 1 1 0
1 2 3 2 3 1 2 1 0
1 2 2 2 1 0 1 0 0
0 1 1 1 0 0 1 0 0
0 1 1 2 1 1 2 1 1
0 0 0 1 1 1 2 1 1
0 0 0 0 0 0 1 0 0
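The array of counts on the SimpleCliqueFinder slide reads like a height map, in which each entry is the number of observation rectangles covering that grid cell. That reading is an assumption, as is everything in the sketch below: the function name `height_map`, the half-open integer cell coordinates, and the use of a 2D difference array. It shows one way to build such a map in O(n + width × height) time and is not the algorithm from the talk.

```python
def height_map(rects, width, height):
    """Count, for every grid cell, how many rectangles cover it.

    Each rectangle is (x1, x2, y1, y2) in integer cell coordinates and
    covers the cells with x1 <= column < x2 and y1 <= row < y2.
    """
    # 2D difference array: mark the four corners of every rectangle.
    diff = [[0] * (width + 1) for _ in range(height + 1)]
    for x1, x2, y1, y2 in rects:
        diff[y1][x1] += 1
        diff[y1][x2] -= 1
        diff[y2][x1] -= 1
        diff[y2][x2] += 1

    # 2D prefix sums of the difference array give the cover counts.
    h = [[0] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            above = h[row - 1][col] if row > 0 else 0
            left = h[row][col - 1] if col > 0 else 0
            diag = h[row - 1][col - 1] if row > 0 and col > 0 else 0
            h[row][col] = diff[row][col] + above + left - diag
    return h


# Two hypothetical overlapping rectangles on a 3 x 3 grid.
for row in height_map([(0, 2, 0, 2), (1, 3, 1, 3)], width=3, height=3):
    print(*row)
```

Cells where the count is locally highest are the natural candidates for inner rectangles, since they are covered by the most observation rectangles at once.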
Computation of the MLE • Parameter reduction: determine the inner rectangles • Optimization: determine the amounts of mass assigned to the inner rectangles.
Optimization • High-dimensional convex constrained optimization problem
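The slides do not say which optimizer is used. As a small illustration of the problem itself, the sketch below uses the classical EM (self-consistency) iteration, a simple technique swapped in here for demonstration only. The name `em_masses` and the 0/1 matrix `A`, with `A[i][j] = 1` when inner rectangle j lies inside observation rectangle i, are assumptions.

```python
def em_masses(A, iterations=200):
    """EM iteration for the masses on the inner rectangles.

    A is an n x m 0/1 matrix; every row must contain at least one 1.
    Returns a list of m non-negative masses that sum to 1.
    """
    n, m = len(A), len(A[0])
    alpha = [1.0 / m] * m  # start from the uniform distribution
    for _ in range(iterations):
        # Probability currently assigned to each observation rectangle.
        totals = [sum(a * w for a, w in zip(row, alpha)) for row in A]
        # Each observation spreads weight 1/n over its inner rectangles
        # in proportion to their current masses.
        alpha = [
            alpha[j] * sum(A[i][j] / totals[i] for i in range(n)) / n
            for j in range(m)
        ]
    return alpha


# Hypothetical data: observations 1 and 3 contain only inner rectangle 1,
# observation 2 contains only inner rectangle 2.
masses = em_masses([[1, 0], [0, 1], [1, 0]])
```

Each update keeps the masses non-negative and summing to 1, so the constraints hold automatically. EM can be slow on large problems, which is one reason dedicated convex optimization methods matter in high dimensions.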
Amsterdam Cohort Study among injecting drug users • Open cohort study • Data available from 1985 to 1997 • 637 individuals were enrolled • 216 individuals tested positive for HIV during the study
Model • X: time of HIV infection • Y: time of onset of AIDS • Z = Y - X: incubation period • U1, U2: observation times for X • C: censoring variable for Y • (X, Y) and (U1, U2, C) are independent
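Because Z = Y - X, an estimate of the 2-dimensional distribution of (X, Y) yields an estimate of the distribution of the incubation period. In assumed notation, with F the joint distribution function of (X, Y) and F_Z that of Z:

```latex
\[
F_Z(z) \;=\; P(Y - X \le z) \;=\; \iint 1\{\, y - x \le z \,\}\, dF(x, y) .
\]
```

The integral collects all mass that F places on or below the line y = x + z, so where the mass sits inside each inner rectangle can change the resulting estimate of F_Z.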
[Figure sequence: the observation times u1 and u2 marked on the HIV axis and t = min(c, y) marked on the AIDS axis.] We observe: W = (U1, U2, T = min(C, Y), Δ)