
MISSING DATA IN THE INFECTIOUS DISEASES INSTITUTE CLINIC DATABASE

Agnes N Kiragga. East Africa IeDEA investigators’ meeting, 4-5th May 2010. East African Regional consortium. Objectives: describe level of missing data for key variables.


Presentation Transcript


  1. MISSING DATA IN THE INFECTIOUS DISEASES INSTITUTE CLINIC DATABASE • Agnes N Kiragga • East Africa IeDEA investigators’ meeting, 4-5th May 2010 • East African Regional consortium

  2. Objectives • Describe the level of missing data for key variables • Identify factors associated with missing data for patients on antiretroviral therapy (ART) • Assess missing data assumptions in observational databases

  3. Assumptions of missing data • “Missing completely at random” [MCAR]: missingness does not depend on anything important; e.g. a blood sample is lost or not taken in error • “Missing at random” [MAR]: missingness depends only on other measured factors, not on the missing (unobserved) value; e.g. the study specifies blood pressure below a threshold, so after registering a high value the patient is withdrawn [blood pressure at this visit] • “Missing not at random” [MNAR]: missingness is related to the missing outcome itself; e.g. a patient withdrew from the study because they “didn't feel well”
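The three mechanisms on slide 3 can be made concrete with a small simulation. The sketch below is illustrative only: the patient values, the link between weight and CD4 count, and the missingness probabilities are all invented for the example and are not taken from the clinic database. It compares the mean CD4 count among observed values with the mean in the full data under each mechanism.

```python
import random
import statistics


def simulate_patients(n=20000, seed=0):
    """Hypothetical cohort: weight (kg) is always observed, CD4 may go missing."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        weight = rng.gauss(60, 10)
        # Assumed relationship: heavier patients tend to have higher CD4 counts.
        cd4 = max(0.0, 200 + 3 * (weight - 60) + rng.gauss(0, 80))
        patients.append((weight, cd4))
    return patients


# Probability that the CD4 value is missing, under each assumed mechanism.
MECHANISMS = {
    "MCAR": lambda weight, cd4: 0.3,                         # depends on nothing
    "MAR": lambda weight, cd4: 0.6 if weight < 60 else 0.1,  # depends on observed weight
    "MNAR": lambda weight, cd4: 0.6 if cd4 < 200 else 0.1,   # depends on the CD4 value itself
}


def observed_mean_cd4(patients, p_missing, seed=1):
    """Mean CD4 among patients whose value was not deleted."""
    rng = random.Random(seed)
    kept = [cd4 for weight, cd4 in patients
            if rng.random() >= p_missing(weight, cd4)]
    return statistics.fmean(kept)


def compare_mechanisms():
    patients = simulate_patients()
    result = {"full data": statistics.fmean([cd4 for _, cd4 in patients])}
    for name, rule in MECHANISMS.items():
        result[name] = observed_mean_cd4(patients, rule)
    return result


if __name__ == "__main__":
    for name, mean in compare_mechanisms().items():
        print(f"{name:>9}: mean CD4 = {mean:.1f}")
```

In this toy setup the complete-case mean stays close to the full-data mean under MCAR and drifts upward under MAR and MNAR. Under MAR the drift can in principle be corrected by adjusting for weight, because weight is observed for everyone; under MNAR no observed variable carries that information.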

  4. Study population, 04/2000 – 04/2010 • Registered: 23121 • Active: 15070 • Non-ART: 13310 • ART: 9811 • DART: 300, leaving 9511 • Before 2005: 1043 • After 2005: 8468

  5. Source of CD4 data • Electronic download: 86146 (95%) • Recorded: 3085 (5%)

  6. Missing baseline variables • Note: 1 = 3 months pre-ART, 2 = 6 months pre-ART, 3 = 12 months pre-ART

  7. Number of missing baseline variables • Note (a): variables include weight, height and CD4 count

  8. Factors associated with missing baseline CD4 count No association with gender, age, weight

  9. CD4 counts at follow-up visits • CD4 tested 6-monthly (± 2 months) • Baseline CD4 counts excluded • Complete CD4 data: no. of CD4 tests expected given duration on ART >= total no. of CD4 counts observed • Missing CD4 data: no. of CD4 tests expected given duration on ART ≠ total no. of CD4 counts observed • 1423 (15%): insufficient follow-up • 8088 (85%): assessed for missing CD4
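The rule on slide 9 can be sketched as a small classifier. Several choices here are assumptions rather than the author's definitions: the expected number of tests is taken as one per completed 6 months on ART, "insufficient follow-up" is read as fewer than 6 months on ART, and "complete" is read as at least as many observed follow-up counts as expected (the comparison symbols on the slide are ambiguous after extraction). The function names and the convention that test times are months since ART start are hypothetical.

```python
def expected_tests(months_on_art, interval=6):
    """Assumed schedule: one follow-up CD4 test per completed 6 months on ART."""
    return int(months_on_art // interval)


def classify_follow_up(months_on_art, test_months):
    """Classify a patient's follow-up CD4 data.

    test_months: times of CD4 tests in months since ART start;
    values at or before 0 are treated as baseline and excluded.
    """
    expected = expected_tests(months_on_art)
    if expected == 0:
        return "insufficient follow-up"
    observed = sum(1 for m in test_months if m > 0)
    return "complete" if observed >= expected else "missing"


def all_tests_timely(months_on_art, test_months, interval=6, window=2):
    """True if every scheduled 6-monthly slot has a test within +/- 2 months."""
    follow_up = [m for m in test_months if m > 0]
    for k in range(1, expected_tests(months_on_art, interval) + 1):
        due = k * interval
        if not any(abs(m - due) <= window for m in follow_up):
            return False
    return True


if __name__ == "__main__":
    print(classify_follow_up(13, [-1, 6.5, 12]))  # two tests expected, two observed
    print(classify_follow_up(13, [-1, 6.5]))      # two tests expected, one observed
    print(all_tests_timely(13, [3, 9.5]))         # neither test is within 2 months of month 6
```

Counting tests and checking their timing are kept separate because a patient can have the expected number of tests without those tests falling inside the scheduled windows.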

  10. Categorization of follow-up CD4 data (N = 8088)

      Categorization                          |    Freq.    Percent
      ----------------------------------------+--------------------
      complete baseline + complete follow-up  |    2,878      35.58
      complete baseline + missing follow-up   |    2,529      31.27
      missing baseline + complete follow-up   |    1,315      16.26
      missing baseline + missing follow-up    |    1,366      16.89
      ----------------------------------------+--------------------
      Total                                   |    8,088     100.00

      • Complete baseline + complete f/up + cd4 testing + timely cd4 tests = 864 (10.7%) • Included all nested research cohort patients
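The percentages on slide 10 follow directly from the counts, which can be rechecked with a few lines. The counts below are copied from the slide; the helper names and the boolean flags used to build a category label are hypothetical.

```python
# Counts as reported on slide 10.
COUNTS = {
    ("complete", "complete"): 2878,
    ("complete", "missing"): 2529,
    ("missing", "complete"): 1315,
    ("missing", "missing"): 1366,
}


def category(baseline_complete, follow_up_complete):
    """Build the (baseline, follow-up) category key from two flags."""
    return ("complete" if baseline_complete else "missing",
            "complete" if follow_up_complete else "missing")


def percentages(counts):
    """Return {category: percent of total}, rounded to 2 decimals."""
    total = sum(counts.values())
    return {key: round(100 * n / total, 2) for key, n in counts.items()}


if __name__ == "__main__":
    total = sum(COUNTS.values())
    for (baseline, follow_up), pct in percentages(COUNTS).items():
        print(f"{baseline} baseline + {follow_up} follow-up: {pct:.2f}%")
    print(f"Total: {total}")
    print(f"Fully complete and timely: {100 * 864 / total:.1f}%")
```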

  11. Categorization of follow-up CD4 data by year of ART initiation, for patients with at least 6 months of follow-up [Figure: group sizes n=995, n=2487, n=1174, n=1555, n=960, n=917]

  12. Validation of incident Post-ART Tuberculosis cases • Tuberculosis most common opportunistic infection (rate (95% CI) 2.79 (2.45-3.16)) in first 24 months after ART initiation • Merged flagged TB cases with TB drug database • Identified patients on TB treatment • 334 incident post-ART cases
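The validation step on slide 12 amounts to cross-referencing two sources on patient ID. The sketch below is a guess at the logic, not the author's procedure: it assumes each source can be reduced to a mapping from a hypothetical patient ID to a date, and it treats a flagged case as a confirmed incident post-ART case when TB treatment started after ART initiation.

```python
from datetime import date


def confirm_incident_tb(flagged_ids, tb_treatment_start, art_start):
    """Keep flagged TB cases that also appear in the TB drug records
    with a treatment start date after ART initiation.

    flagged_ids: iterable of patient IDs flagged with TB in the clinic data
    tb_treatment_start: {patient_id: date TB treatment started}
    art_start: {patient_id: date ART started}
    """
    confirmed = {}
    for pid in flagged_ids:
        tb_date = tb_treatment_start.get(pid)
        art_date = art_start.get(pid)
        if tb_date is None or art_date is None:
            continue  # not found in one of the sources, so cannot be confirmed
        if tb_date > art_date:
            confirmed[pid] = tb_date
    return confirmed


if __name__ == "__main__":
    flagged = {"P1", "P2", "P3"}
    tb_drugs = {"P1": date(2006, 5, 1), "P2": date(2004, 1, 10)}
    art = {"P1": date(2006, 1, 1), "P2": date(2005, 3, 1), "P3": date(2007, 7, 1)}
    print(confirm_incident_tb(flagged, tb_drugs, art))
```

Here P1 is confirmed, P2 is dropped because TB treatment preceded ART, and P3 is dropped because no TB treatment record exists.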

  13. Probability of development of Tuberculosis (TB) by baseline CD4 data [Figure: probability of TB (y-axis 0.00 to 0.30) against analysis time (x-axis 0 to 2.5), complete versus missing baseline CD4 data] • Log rank P<0.435 • Assumption 1: baseline CD4 data missing completely at random
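Slide 13 compares the two groups with a log-rank test. The slides do not say which software was used, so the following is only a minimal pure-Python sketch of the standard two-group log-rank statistic, with a p-value from the chi-square distribution on 1 degree of freedom. The function name and input layout are assumptions.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test.

    times_*: follow-up time per patient; events_*: 1 if TB occurred, 0 if censored.
    Returns (chi-square statistic, p-value on 1 degree of freedom).
    """
    data = [(t, e, 0) for t, e in zip(times_a, events_a)]
    data += [(t, e, 1) for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})

    observed_a = expected_a = variance = 0.0
    for t in event_times:
        # Patients still at risk just before time t, by group.
        n_a = sum(1 for ti, _, g in data if ti >= t and g == 0)
        n_b = sum(1 for ti, _, g in data if ti >= t and g == 1)
        # Events occurring exactly at time t.
        d_a = sum(1 for ti, e, g in data if ti == t and e and g == 0)
        d_b = sum(1 for ti, e, g in data if ti == t and e and g == 1)
        n, d = n_a + n_b, d_a + d_b
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            # Hypergeometric variance of the event count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)

    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    # Survival function of chi-square with 1 df: P(Z^2 > x) = erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))


if __name__ == "__main__":
    # Toy data, not from the clinic database.
    chi2, p = logrank_test([1, 2, 3, 4, 5], [1] * 5, [6, 7, 8, 9, 10], [1] * 5)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

A large p-value, as reported on the slide, means the TB-free survival curves of patients with and without baseline CD4 data are not distinguishable by this test. That is consistent with, but does not prove, the MCAR assumption.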

  14. Probability of development of Tuberculosis (TB) by follow-up CD4 data [Figure: probability of TB (y-axis 0.00 to 0.30) against analysis time (x-axis 0 to 2.5), missing versus complete follow-up CD4 data] • Assumption 2: follow-up CD4 data missing at random

  15. Preliminary insights from analysis • Reconcile local and IeDEA-wide analyses • Baseline CD4 data missing completely at random (MCAR) • Follow-up CD4 data missing at random (MAR) • Ignoring the missing data will lead to biased estimates of ART outcomes • Strategies needed to identify patterns and mechanisms of missing data in observational data prior to analysis

  16. Planned analyses • Missing data and other HIV outcomes, e.g. immune response, incidence of other opportunistic infections, toxicity, treatment changes/switches • Strength of the nested research cohort can be used to validate imputed data in the large database • CD4 trajectories versus mortality: estimate the distribution of CD4 marker trajectories and the distribution of log survival time using mixed-effects models, measuring time from the first pre-HAART CD4
