This article provides an overview of event history analysis and explains the process of creating one record per spell for continuous-time data. It covers the components of the dependent variable, fixed and time-varying characteristics, and various examples.
Data structure for a continuous-time event history analysis
Jane E. Miller, PhD
Overview
• Structure of most survey data: one record per respondent
• Event history analysis requires separate records for each period at risk of the event
  • One record per spell
• How to create one record per spell
  • Components of the dependent variable
  • Fixed characteristics
  • Time-varying characteristics
The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition.
Data preparation for an event history
• Conducting a continuous-time event history analysis requires one record per period at risk
  • A period at risk is also known as a spell
• Survey data often contain one record per respondent
• Creating an event history data set involves generating one record per spell
  • Creating the components of the dependent variable
  • Mapping the fixed covariates onto each spell
  • Calculating time-varying covariates for each spell
Creating an analytic data set for study of divorce
[Table not reproduced. Column groups: fixed covariate; dates of beginning of period(s) at risk; dates of event; dates pertaining to censoring and status at end of observation; dates pertaining to time-varying covariates]
Example timelines for study of divorce
Key: M = Married; D = Divorced; L = Lost to follow-up; O = Censored by end of study; X = Died
[Figure: one timeline per case, running from start of observation period to end of observation period]
• Case 1: Never married -> no spells
• Case 2: Married once, censored by end of survey (timeline markers: M, O)
• Case 3: Married twice, lost to follow-up before end of survey (timeline markers: M, D, M, L)
• Case 4: Married once, died before end of survey (timeline markers: M, X)
• Not married -> not at risk of divorce -> not part of a spell
Calculating number of spells for each respondent
• Each respondent contributes a spell for each time they are at risk of the event under study
• If they are never at risk -> no spells
  • Thus some respondents in the original data set might not be included in the event history analysis. E.g., in an analysis of
    • Getting married, anyone who was already married throughout the period of observation is not at risk of becoming married -> no spells
    • Getting divorced, the same respondent would be at risk the entire time!
• If they are at risk once -> one spell
  • No more than one spell/respondent for non-repeatable events like death
• For repeatable events -> potential for multiple spells
  • In an event history analysis of divorce, anyone who is observed during two periods of marriage contributes two spells
Example spells for a study of divorce
Event under study = divorce
Key: M = Married; D = Divorced; L = Lost to follow-up; O = Censored by end of study; X = Died
• Case 1: Never married -> contributes zero spells to the divorce event history data set
• Case 2: Married once, censored by end of survey. Contributes one open spell (timeline markers: M, O)
• Case 3: Married twice, lost to follow-up before end of survey. Contributes two total spells: one closed and one open (censored) (timeline markers: M, D, M, L)
• Case 4: Married once, died before end of survey. Contributes one open spell (timeline markers: M, X)
Event history data: Continuous time, 1 record per spell
• Case 1 does not contribute ANY spells at risk of divorce because she was never married
• Cases 2 and 4 each contribute 1 spell, because each was married once
• Case 3 contributes 2 spells (periods at risk of divorce), 1 for each time married
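The step from one record per respondent to one record per spell can be sketched in a few lines of Python. This is a minimal illustration, not the procedure used in the lecture: the `Marriage` and `Respondent` classes, the `build_spells` function, and all years except the 2005 loss to follow-up (case 3) and the 2002 death (case 4) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Marriage:
    start_year: int
    divorce_year: Optional[int] = None  # None if no divorce was observed


@dataclass
class Respondent:
    resp_id: int
    marriages: List[Marriage]
    exit_year: int  # last year the respondent was observed (hypothetical field)


def build_spells(r: Respondent) -> List[Tuple[int, int, int, int, bool]]:
    """Return one (resp_id, spell_number, start, end, divorced) row per marriage.

    A marriage with no observed divorce yields an open (censored) spell
    that ends when observation of the respondent ends.
    """
    spells = []
    for n, m in enumerate(r.marriages, start=1):
        if m.divorce_year is not None:
            spells.append((r.resp_id, n, m.start_year, m.divorce_year, True))
        else:
            spells.append((r.resp_id, n, m.start_year, r.exit_year, False))
    return spells


# Illustrative data loosely following the four example cases.
respondents = [
    Respondent(1, [], 2010),                                        # never married
    Respondent(2, [Marriage(1998)], 2010),                          # still married at end of study
    Respondent(3, [Marriage(1995, 1999), Marriage(2002)], 2005),    # divorced, remarried, lost in 2005
    Respondent(4, [Marriage(1997)], 2002),                          # died in 2002
]

spell_file = [row for r in respondents for row in build_spells(r)]
spells_per_case = {r.resp_id: len(build_spells(r)) for r in respondents}
```

With these inputs, `spell_file` holds four rows from three respondents; case 1 drops out of the analytic data set because she was never at risk of divorce.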
Dependent variables for an event history
• Duration of spell
• Event indicator
  • Can be dichotomous
    • Occurred or not
  • Can be multichotomous
    • Differentiate between different reasons for nonevent
      • Death
      • Lost to follow-up
• Both components must be constructed from information about the respondent’s timeline
Measures of duration replace most dates
• In the event history data set, measures of duration replace dates of
  • Onset of risk period
  • Event occurrence
  • Censoring
• Durations are calculated as the elapsed time between dates on the respondent’s timeline
Duration calculations
• Unless precise dates are known, events or censoring are assumed to have occurred halfway through the period, yielding 0.5 person-units of exposure for that period.
  • Assuming that an event occurred in the middle of a time period corresponds to a constant risk of the event during that time interval (Trussell and Hammerslough, 1983)
• If exact dates are known, fractional person-time units can be assigned accordingly.
  • For instance, if a person was divorced on March 10, they would be assigned 10/30 or 0.333 person-months at risk in that month (treating a month as 30 days; with March's actual 31 days the fraction is 10/31, or about 0.323).
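The two exposure rules can be written as small Python functions. This is a sketch under stated assumptions: the function names are hypothetical, the year 2004 is invented for the example date, and the default 30-day denominator follows the arithmetic in the March 10 example rather than the calendar length of the month.

```python
import calendar
from datetime import date


def midpoint_exposure(completed_periods: int) -> float:
    """Duration when only the period of the event or censoring is known.

    The event is assumed to fall halfway through its period, so that
    period contributes 0.5 person-units on top of the completed periods.
    """
    return completed_periods + 0.5


def exact_month_exposure(event_date: date, days_in_month: int = 30) -> float:
    """Fraction of the event month spent at risk when the exact date is known."""
    return event_date.day / days_in_month


divorce_date = date(2004, 3, 10)  # hypothetical year; the source gives only "March 10"

# 30-day-month convention, as in the worked example: 10/30
exposure_30 = exact_month_exposure(divorce_date)

# Alternative: use the actual length of the month (31 days for March): 10/31
actual_days = calendar.monthrange(divorce_date.year, divorce_date.month)[1]
exposure_actual = exact_month_exposure(divorce_date, actual_days)

# Three completed years at risk, event at an unknown point in the fourth year
duration_years = midpoint_exposure(3)
```

Whichever denominator is chosen, it should be applied consistently to every spell and reported in the data and methods section.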
Detailed indicator of status at end of spell
• Case 2: Married once, still married at end of survey
• Case 3: First marriage ended in divorce
• Case 3: Married second time, lost to follow-up in 2005
• Case 4: Married once, died in 2002
Coding of status indicator:
• 0 = censored
• 1 = divorced
• 2 = lost to follow-up (LFU)
• 3 = died
Coding of divorce event indicator:
• 0 = censored, LFU, died
• 1 = divorced
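Collapsing the detailed status code into the dichotomous divorce indicator is a one-line recode. The sketch below uses the coding listed on this slide; the function name `divorce_indicator` and the dictionary layout are illustrative choices, not part of the original materials.

```python
# Detailed status at end of spell, using the coding from the slide.
STATUS_LABELS = {0: "censored", 1: "divorced", 2: "lost to follow-up", 3: "died"}


def divorce_indicator(status: int) -> int:
    """Return 1 if the spell ended in divorce, 0 for every kind of nonevent."""
    if status not in STATUS_LABELS:
        raise ValueError(f"unknown status code: {status}")
    return 1 if status == 1 else 0


# One entry per spell, keyed by (case, spell number).
spell_status = {
    (2, 1): 0,  # still married at end of survey
    (3, 1): 1,  # first marriage ended in divorce
    (3, 2): 2,  # second marriage, lost to follow-up
    (4, 1): 3,  # died
}

event = {spell: divorce_indicator(code) for spell, code in spell_status.items()}
```

Keeping the detailed code in the data set alongside the dichotomous indicator makes it possible to distinguish reasons for nonevent later without rebuilding the spells.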
Dependent variable components
• Dependent variable component #1: duration measure
• Dependent variable component #2: dichotomous event indicator
Fixed and time-varying covariates
• Covariate = independent variable
• Fixed covariates are those that have the same value for a given respondent throughout the spell
  • E.g., except in rare cases, each person’s gender remains constant
• Time-varying covariates are those whose values can change for a respondent between or during spells
  • E.g., number of children
• Need to map each of these correctly from the one-record-per-respondent data set onto the one-record-per-spell data set
Fixed covariates Age at start of spell and gender do not change during the course of a spell
Time-varying covariate None of these respondents had children prior to their first marriage (though theoretically some cases could). Respondent #3 had one child during his first marriage, so the value of that variable changes between his first and second marriages (first and second spells at risk of divorce).
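Mapping covariates onto spells can be sketched as follows. The function names, field names, and all years are hypothetical; the example only mirrors the pattern described for respondent #3 (no children before the first marriage, one child born during it).

```python
from typing import Dict, List, Union


def children_before(year: int, child_birth_years: List[int]) -> int:
    """Number of children born before the given year."""
    return sum(1 for y in child_birth_years if y < year)


def spell_covariates(
    gender: str,
    birth_year: int,
    spell_start_years: List[int],
    child_birth_years: List[int],
) -> List[Dict[str, Union[int, str]]]:
    """Build one covariate row per spell for a single respondent."""
    rows = []
    for n, start in enumerate(spell_start_years, start=1):
        rows.append({
            "spell": n,
            "gender": gender,                      # fixed: copied onto every spell
            "age_at_start": start - birth_year,    # fixed within a spell
            "children_at_start": children_before(start, child_birth_years),  # varies between spells
        })
    return rows


# Hypothetical respondent: born 1970, married in 1995 and 2002, one child born in 1997.
rows = spell_covariates("M", 1970, [1995, 2002], [1997])
```

Here gender is identical on both rows, while the number of children rises from 0 at the start of the first spell to 1 at the start of the second. A covariate that changes during a spell would need finer handling, such as splitting the spell at the date of change, which this sketch does not attempt.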
Presenting information on event history construction: Background work
• Most of the gory details of programming the creation of an event history are behind-the-scenes work
• Important to do consistency checks to make sure event histories were created correctly given
  • Original data source of information for timeline construction
  • Type of event under study
  • Fixed covariates
  • Time-varying covariates
• E.g., check that the following are correct
  • Number of spells for each respondent
  • Duration and event indicators for each spell
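A few of these consistency checks can be automated. The sketch below is one possible set of checks, not a complete validation: the `check_spells` function, the tuple layout, and the durations are all hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple

# (resp_id, spell_number, duration, event indicator)
Spell = Tuple[int, int, float, int]


def check_spells(spells: List[Spell], expected_counts: Dict[int, int]) -> List[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # Check 1: each respondent has the expected number of spells.
    counts = Counter(resp_id for resp_id, _, _, _ in spells)
    for resp_id, expected in expected_counts.items():
        found = counts.get(resp_id, 0)
        if found != expected:
            problems.append(f"respondent {resp_id}: expected {expected} spells, found {found}")

    # Checks 2 and 3: durations are positive and the event indicator is 0 or 1.
    for resp_id, n, duration, event in spells:
        if duration <= 0:
            problems.append(f"respondent {resp_id} spell {n}: non-positive duration {duration}")
        if event not in (0, 1):
            problems.append(f"respondent {resp_id} spell {n}: invalid event indicator {event}")

    return problems


# Spell counts expected from the four example cases; durations are invented.
expected = {1: 0, 2: 1, 3: 2, 4: 1}
good_file = [(2, 1, 12.0, 0), (3, 1, 4.0, 1), (3, 2, 3.0, 0), (4, 1, 5.0, 0)]
bad_file = [(2, 1, -1.0, 5)]

good_problems = check_spells(good_file, expected)
bad_problems = check_spells(bad_file, {2: 1})
```

The expected spell counts would normally be derived independently from the original one-record-per-respondent file, so that the check is not simply comparing the spell file against itself.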
Presenting information on event history construction
• In the data and methods section, describe:
  • Original data source of information for timeline construction
    • Dates, status, duration of events
  • Type of event under study
  • What constitutes censoring
  • Fixed covariates
  • Time-varying covariates
    • Source(s) of information for determining timing of changes in those variables
• See checklist in chapter 17 of Writing about Multivariate Analysis, 2nd Edition for more detail on what to report
Summary
• A continuous-time event history analysis requires a separate record for each period at risk of the event
• For each spell, calculate
  • Components of the dependent variable
    • Duration measure
    • Event indicator
  • Fixed characteristics
  • Time-varying characteristics
• In data and methods section, describe data sources and variables for the event history
Suggested resources
• Allison, P. D. 2010. Survival Analysis Using the SAS System: A Practical Guide, 2nd Edition. Cary, NC: SAS Institute.
• Trussell, James, and Charles Hammerslough. 1983. “A Hazards-Model Analysis of the Covariates of Infant and Child Mortality in Sri Lanka.” Demography 20 (1): 1–26.
• Miller, J. E. 2013. The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. University of Chicago Press, chapter 17.
Suggested online resources • Podcast on data structure for a discrete-time event history analysis
Suggested exercises • Study guide to The Chicago Guide to Writing about Multivariate Analysis, 2nd Edition. • Question #3a in the problem set for chapter 17 • Suggested course extensions for chapter 17 • “Reviewing” exercises #2a through 2h • “Applying statistics and writing” exercises #1 and 2a
Contact information
Jane E. Miller, PhD
jmiller@ifh.rutgers.edu
Online materials available at http://press.uchicago.edu/books/miller/multivariate/index.html