Data structure for a continuous-time event history analysis


  1. Data structure for a continuous-time event history analysis Jane E. Miller, PhD

  2. Overview
     • Structure of most survey data: one record per respondent
     • Event history analysis requires separate records for each period at risk of the event
       • One record per spell
     • How to create one record per spell
       • Components of the dependent variable
       • Fixed characteristics
       • Time-varying characteristics
     The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.

  3. Data preparation for an event history
     • Conducting a continuous-time event history analysis requires one record per period at risk
       • Also known as a spell
     • Survey data often contain one record per respondent
     • Creating an event history data set involves generating one record per spell
       • Creating the components of the dependent variable
       • Mapping the fixed covariates onto each spell
       • Calculating time-varying covariates for each spell

  4. Source data: 1 record per respondent

  5. Creating an analytic data set for study of divorce
     Information drawn from the source data:
     • Fixed covariate
     • Dates of beginning of period(s) at risk
     • Dates pertaining to censoring and status at end of observation
     • Dates pertaining to time-varying covariates
     • Dates of event

  6. Example timelines for study of divorce
     Legend: M = Married, D = Divorced, L = Lost to follow-up, O = Censored by end of study, X = Died
     Each timeline runs from the start of the observation period to the end of the observation period.
     Not married -> not at risk of divorce -> not part of a spell
     • Case 1: Never married -> no spells
     • Case 2: Married once, censored by end of survey (timeline: M, O)
     • Case 3: Married twice, lost to follow-up before end of survey (timeline: M, D, M, L)
     • Case 4: Married once, died before end of survey (timeline: M, X)

  7. Calculating number of spells for each respondent
     • Each respondent contributes a spell for each time they are at risk of the event under study
     • If they are never at risk -> no spells
       • Thus some respondents in the original data set might not be included in the event history analysis. E.g., in an analysis of
         • Getting married, anyone who was already married throughout the period of observation is not at risk of becoming married -> no spells
         • Getting divorced, the same respondent would be at risk the entire time!
     • If they are at risk once -> one spell
       • No more than one spell per respondent for non-repeatable events like death
     • For repeatable events -> potential for multiple spells
       • In an event history analysis of divorce, anyone who is observed during two periods of marriage contributes two spells

  8. Example spells for a study of divorce
     Legend: M = Married, D = Divorced, L = Lost to follow-up, O = Censored by end of study, X = Died
     Event under study = divorce
     • Case 1: Never married -> contributes zero spells to the divorce event history data set
     • Case 2: Married once, censored by end of survey. Contributes one open spell (timeline: M, O)
     • Case 3: Married twice, lost to follow-up before end of survey. Contributes two total spells: one closed and one open (censored) (timeline: M, D, M, L)
     • Case 4: Married once, died before end of survey. Contributes one open spell (timeline: M, X)

  9. Event history data: continuous time, 1 record per spell
     • Case 1 does not contribute ANY spells at risk of divorce because she was never married
     • Cases 2 and 4 each contribute 1 spell, because each was married once
     • Case 3 contributes 2 spells (periods at risk of divorce), 1 for each time married
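
The conversion from one record per respondent to one record per spell can be sketched in Python as follows. This is a minimal illustration, not the presentation's own code: the class names, field names, the end-of-study year (2010), and all marriage and divorce years are hypothetical. Only the end-of-observation years for Case 3 (2005) and Case 4 (2002) come from the slides.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Respondent:
    rid: int
    # One (marriage_year, divorce_year) pair per marriage;
    # divorce_year is None if no divorce was observed.
    marriages: List[Tuple[int, Optional[int]]]
    # Year observation ended (end of study, loss to follow-up, or death).
    end_obs: int


@dataclass
class Spell:
    rid: int
    spell_no: int
    start: int
    end: int
    divorced: bool


def build_spells(r: Respondent) -> List[Spell]:
    """Return one spell per period of marriage (period at risk of divorce)."""
    spells = []
    for i, (start, divorce) in enumerate(r.marriages, start=1):
        if divorce is not None:
            # Closed spell: the event under study occurred.
            spells.append(Spell(r.rid, i, start, divorce, True))
        else:
            # Open (censored) spell: runs until observation ended.
            spells.append(Spell(r.rid, i, start, r.end_obs, False))
    return spells


# Hypothetical source data: one record per respondent.
respondents = [
    Respondent(1, [], 2010),                            # never married
    Respondent(2, [(1995, None)], 2010),                # married once, censored
    Respondent(3, [(1990, 1998), (2001, None)], 2005),  # married twice, LFU
    Respondent(4, [(1992, None)], 2002),                # married once, died
]

# Event history data: one record per spell.
spells = [s for r in respondents for s in build_spells(r)]
for s in spells:
    print(s)
```

Under these assumptions, respondent 1 yields no records, respondents 2 and 4 yield one each, and respondent 3 yields two, matching the spell counts described for the four cases.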

  10. Dependent variables for an event history
      • Duration of spell
      • Event indicator
        • Can be dichotomous
          • Occurred or not
        • Can be multichotomous
          • Differentiate between different reasons for nonevent
            • Death
            • Lost to follow-up
      • Both components must be constructed from information about the respondent’s timeline

  11. Measures of duration replace most dates
      • In the event history data set, measures of duration replace the dates of
        • Onset of risk period
        • Event occurrence
        • Censoring
      • Durations are calculated from the time elapsed between dates in the event history

  12. Duration calculations
      • Unless precise dates are known, events or censoring are assumed to have occurred halfway through the period, yielding 0.5 person-units of exposure for that period.
        • Assuming that an event occurred in the middle of a time period corresponds to a constant risk of the event during that time interval (Trussell and Hammerslough, 1983)
      • If exact dates are known, fractional person-time units can be assigned accordingly.
        • For instance, if a person was divorced on March 10, they would be assigned 10/30 or 0.333 person-months at risk in that month (treating a month as 30 days).
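
The two duration rules above can be written out as a short Python sketch. The function names and the 30-day default are assumptions chosen to mirror the 10/30 example on the slide; they are not taken from the presentation.

```python
from typing import Optional


def final_period_exposure(event_day: Optional[int],
                          days_in_period: int = 30) -> float:
    """Exposure contributed by the period in which the spell ended.

    If the exact day is unknown (None), assume the event or censoring
    happened halfway through the period. Otherwise assign the fraction
    of the period that had elapsed by the event day.
    """
    if event_day is None:
        return 0.5
    if not 1 <= event_day <= days_in_period:
        raise ValueError("event_day must fall within the period")
    return event_day / days_in_period


def spell_duration(full_periods: int,
                   event_day: Optional[int] = None,
                   days_in_period: int = 30) -> float:
    """Total person-periods at risk: completed periods plus the final fraction."""
    return full_periods + final_period_exposure(event_day, days_in_period)


# Divorced on the 10th of the month: 10/30 of a person-month in that month.
print(round(final_period_exposure(10), 3))

# 24 completed months at risk, exact day of the event unknown.
print(spell_duration(24))
```

The second call illustrates the midpoint rule: with no exact date, the final month contributes half a person-month on top of the 24 completed months.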

  13. Detailed indicator of status at end of spell
      • Case 2: Married once, still married at end of survey
      • Case 3: First marriage ended in divorce
      • Case 3: Married second time, lost to follow-up in 2005
      • Case 4: Married once, died in 2002
      Coding of status indicator:
      • 0 = censored
      • 1 = divorced
      • 2 = lost to follow-up (LFU)
      • 3 = died
      Coding of divorce event indicator:
      • 0 = censored, LFU, died
      • 1 = divorced
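
As a minimal sketch, the detailed status codes above can be collapsed into the dichotomous divorce indicator like this. The variable and function names are hypothetical; the codes and the status of each spell follow the slide.

```python
# Detailed status at end of spell, coded as on the slide.
STATUS_LABELS = {
    0: "censored",
    1: "divorced",
    2: "lost to follow-up",
    3: "died",
}


def divorce_event_indicator(status: int) -> int:
    """Collapse the detailed status into 1 = divorced, 0 = anything else."""
    if status not in STATUS_LABELS:
        raise ValueError(f"unknown status code: {status}")
    return 1 if status == 1 else 0


# (case, spell number, detailed status) for the four example spells.
spell_status = [
    ("Case 2", 1, 0),  # still married at end of survey
    ("Case 3", 1, 1),  # first marriage ended in divorce
    ("Case 3", 2, 2),  # second marriage, lost to follow-up
    ("Case 4", 1, 3),  # died
]

indicators = [divorce_event_indicator(status) for _, _, status in spell_status]
print(indicators)
```

Keeping the detailed code alongside the dichotomous one preserves the reason for each nonevent, which a multichotomous analysis would need.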

  14. Dependent variable components
      • Dependent variable component #1 – duration measure
      • Dependent variable component #2 – dichotomous event indicator

  15. Fixed and time-varying covariates
      • Covariate = independent variable
      • Fixed covariates are those that have the same value for a given respondent throughout the spell
        • E.g., except in rare cases, each person’s gender remains constant
      • Time-varying covariates are those whose values can change for a respondent between or during spells
        • E.g., number of children
      • Need to map each of these correctly from the one-record-per-respondent data onto the one-record-per-spell data

  16. Fixed covariates Age at start of spell and gender do not change during the course of a spell

  17. Time-varying covariate None of these respondents had children prior to their first marriage (though theoretically some cases could). Respondent #3 had one child during his first marriage, so the value of that variable changes between his first and second marriages (first and second spells at risk of divorce).
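
Mapping covariates from the respondent record onto each spell might look like the sketch below. All names and years are hypothetical (respondent birth year, child's birth year, marriage years); the only facts taken from the slides are that Respondent #3 is male, had no children before his first marriage, and had one child during it.

```python
from typing import Dict, List


def covariates_for_spell(gender: str,
                         birth_year: int,
                         child_birth_years: List[int],
                         spell_start: int) -> Dict[str, object]:
    """Build the covariate values for one spell from respondent-level data."""
    return {
        # Fixed covariate: copied unchanged onto every spell.
        "gender": gender,
        # Fixed within a spell, but recomputed for each spell.
        "age_at_start": spell_start - birth_year,
        # Time-varying: children born before this spell began.
        "children_at_start": sum(1 for y in child_birth_years if y < spell_start),
    }


# Hypothetical record for Respondent #3: born 1965, one child born in 1994
# (during the first marriage), marriages starting in 1990 and 2001.
spell_starts = [1990, 2001]
rows = [covariates_for_spell("male", 1965, [1994], start) for start in spell_starts]
for row in rows:
    print(row)
```

Here gender is identical on both records, while the number of children rises from 0 at the start of the first spell to 1 at the start of the second.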

  18. Presenting information on event history construction: Background work
      • Most of the gory details of programming an event history are behind-the-scenes work
      • Important to do consistency checks to make sure event histories were created correctly given
        • Original data source of information for timeline construction
        • Type of event under study
        • Fixed covariates
        • Time-varying covariates
      • E.g., check for correct
        • Number of spells for each respondent
        • Duration and event indicators for each spell
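
A few of these consistency checks can be automated. The sketch below is one possible approach under stated assumptions: spells are stored as dictionaries with hypothetical keys `rid`, `start`, `end`, and `event`, and the expected number of spells per respondent has been derived separately from the source data. The example years are invented for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def check_spells(spells: List[dict], expected_counts: Dict[int, int]) -> List[str]:
    """Return a list of problems found; an empty list means all checks passed."""
    problems = []
    by_rid = defaultdict(list)
    for s in spells:
        by_rid[s["rid"]].append(s)
        # Every spell must have positive duration.
        if s["end"] <= s["start"]:
            problems.append(f"respondent {s['rid']}: non-positive duration")
        # The dichotomous event indicator must be 0 or 1.
        if s["event"] not in (0, 1):
            problems.append(f"respondent {s['rid']}: invalid event indicator")
    # Number of spells must match what the source data imply.
    for rid, expected in expected_counts.items():
        found = len(by_rid.get(rid, []))
        if found != expected:
            problems.append(f"respondent {rid}: expected {expected} spells, found {found}")
    # Spells for the same respondent must not overlap in time.
    for rid, group in by_rid.items():
        group = sorted(group, key=lambda s: s["start"])
        for prev, nxt in zip(group, group[1:]):
            if nxt["start"] < prev["end"]:
                problems.append(f"respondent {rid}: overlapping spells")
    return problems


expected = {1: 0, 2: 1, 3: 2, 4: 1}
good = [
    {"rid": 2, "start": 1995, "end": 2010, "event": 0},
    {"rid": 3, "start": 1990, "end": 1998, "event": 1},
    {"rid": 3, "start": 2001, "end": 2005, "event": 0},
    {"rid": 4, "start": 1992, "end": 2002, "event": 0},
]
# Same data, but the second spell for respondent 3 starts before the first ends.
bad = [dict(s) for s in good]
bad[2]["start"] = 1996

print(check_spells(good, expected))
print(check_spells(bad, expected))
```

Returning a list of messages rather than raising on the first failure makes it easier to review every inconsistency in one pass.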

  19. Presenting information on event history construction
      • In the data and methods section, describe:
        • Original data source of information for timeline construction
          • Dates, status, duration of events
        • Type of event under study
          • What constitutes censoring
        • Fixed covariates
        • Time-varying covariates
          • Source(s) of information for determining timing of changes in those variables
      • See checklist in chapter 17 of The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition for more detail on what to report

  20. Summary
      • A continuous-time event history analysis requires a separate record for each period at risk of the event
      • For each spell, calculate
        • Components of the dependent variable
          • Duration measure
          • Event indicator
        • Fixed characteristics
        • Time-varying characteristics
      • In data and methods section, describe data sources and variables for the event history

  21. Suggested resources
      • Allison, P. D. 2010. Survival Analysis Using the SAS System: A Practical Guide, 2nd Edition. Cary, NC: SAS Institute.
      • Trussell, James, and Charles Hammerslough. 1983. “A Hazards-Model Analysis of the Covariates of Infant and Child Mortality in Sri Lanka.” Demography 20 (1): 1–26.
      • Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapter 17.

  22. Suggested online resources • Podcast on data structure for a discrete-time event history analysis

  23. Suggested exercises
      • Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
        • Question #3a in the problem set for chapter 17
        • Suggested course extensions for chapter 17
          • “Reviewing” exercises #2a through 2h
          • “Applying statistics and writing” exercises #1 and 2a

  24. Contact information
      Jane E. Miller, PhD
      jmiller@ifh.rutgers.edu
      Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html
