This article explores the use of instrumental variables (IVs) and examines the sensitivity to unobserved biases in IV methods. It presents an example of the IV method in studying the effect of World War II military service on future earnings. The strength of IVs and the challenges of invalid exclusion restrictions are also discussed.
Strategies for Using Partially Valid Instrumental Variables • Dylan Small, Department of Statistics, Wharton School, University of Pennsylvania • Joint work with Paul Rosenbaum, Mike Baiocchi, Marshall Joffe, and Tom Ten Have
Overview • Example of Instrumental Variables (IV) method: Effect of World War II military service on future earnings. • Sensitivity to unobserved biases for IV method. • Strength of IVs and sensitivity to unobserved biases: How do small studies with strong IVs compare to large studies with weak IVs? • Extended instrumental variables methods when exclusion restriction for IV is invalid.
WWII Veteran Status and Earnings • Does military service raise or lower earnings? • Angrist and Krueger (1994) studied this in context of WWII military service and 1980 earnings (using 5% public use sample of US Census). • Lower earnings? Military service in WWII interrupts education or career. • Higher earnings? Labor market might favor veterans, GI Bill increases education.
WWII Vets (76% of men) earned on average $4500 more in 1980 than Non-Vets. This is association not causation: WWII Vets might not be comparable to Non-Vets in terms of health, criminal behavior…
We created matched triples: men matched on quarter of birth, race, age, education up to 8 years and location of birth. This figure provides reason to doubt military service increases earnings by $4500. From 1924 to 1926, the proportion of veterans stayed about constant and the earnings stayed about the same. From 1926 to 1928, the proportion of veterans decreased by 50% but earnings increased, suggesting military service decreases earnings.
Unmeasured Confounding • Diagram (conditional on measured confounders: race, education up to 8 years, location of birth): Veteran Status, Earnings, Unobserved Variables.
Instrumental Variables Strategy • Y = Outcome, W = Treatment, Z = IV • Extract variation in W from Z that is free of unobserved confounders and use this variation to estimate the causal effect of W on Y. • Key IV Assumptions: (1) Z is independent of unobserved variables; (2) Z has no direct effect on the outcome. • Diagram (conditional on measured confounders: race, education up to 8 years, location of birth): Z: Year of Birth, W: Veteran Status, Y: Earnings, Unobserved Variables.
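The strategy above can be illustrated with a minimal simulation sketch. Everything in it is an assumption made for illustration, not taken from the talk: the data are synthetic, the instrument is binary, the treatment effect is constant, and the helper name `wald_estimate` is hypothetical. It shows the simplest IV estimator (the Wald ratio) recovering an effect that a naive treated-versus-untreated comparison gets wrong because of an unobserved confounder.

```python
import numpy as np

def wald_estimate(y, w, z):
    """Ratio of the Z-contrast in Y to the Z-contrast in W (binary Z)."""
    y, w, z = (np.asarray(v, dtype=float) for v in (y, w, z))
    effect_on_y = y[z == 1].mean() - y[z == 0].mean()
    effect_on_w = w[z == 1].mean() - w[z == 0].mean()
    return effect_on_y / effect_on_w

rng = np.random.default_rng(0)
n = 200_000
true_effect = -1.0                  # hypothetical causal effect of W on Y

u = rng.normal(size=n)              # unobserved confounder
z = rng.integers(0, 2, size=n)      # instrument: independent of u, no direct effect on y
w = (0.8 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)
y = true_effect * w + 2.0 * u + rng.normal(size=n)

naive = y[w == 1].mean() - y[w == 0].mean()   # confounded by u
iv = wald_estimate(y, w, z)
print(f"naive: {naive:.2f}  IV: {iv:.2f}  truth: {true_effect:.2f}")
```

In this synthetic setup the naive contrast is pushed upward by the confounder, while the IV ratio stays close to the simulated effect because Z satisfies both key assumptions by construction.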
Strength of IV • An IV is strong if encouragement has a strong effect on treatment received; an IV is weak if encouragement has only a weak effect on treatment received. • Effects of weak IVs: (1) increased variance; (2) increased sensitivity to bias.
Effect of Weak IVs I: Increased Variance • If Z is a weak IV, then the variance of the IV estimate will be higher because less variation in W can be extracted from Z. • Diagram (conditional on X): Z, W, Y, Unobserved Variables.
95% CI for effect of military service using 1926 vs. 1928 IV: (-$1,445, -$500). 95% CI for effect of military service using 1924 vs. 1926 IV: (-$10,130, $10,750)
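The contrast between the two confidence intervals can be mimicked in a small simulation sketch. The data-generating process, the `strength` parameter, and the function name `simulate_wald` are all hypothetical choices for illustration. The sketch only addresses the variance point: with the same sample size, a weaker first stage gives a much more variable Wald estimate.

```python
import numpy as np

def simulate_wald(strength, n, reps, rng, true_effect=-1.0):
    """Return `reps` Wald estimates from simulated studies of size n."""
    estimates = np.empty(reps)
    for i in range(reps):
        u = rng.normal(size=n)                      # unobserved confounder
        z = rng.integers(0, 2, size=n)              # binary instrument
        # `strength` controls how much Z shifts the chance of treatment
        w = (strength * z + 0.5 * u + rng.normal(size=n) > strength / 2).astype(float)
        y = true_effect * w + 2.0 * u + rng.normal(size=n)
        num = y[z == 1].mean() - y[z == 0].mean()
        den = w[z == 1].mean() - w[z == 0].mean()
        estimates[i] = num / den
    return estimates

rng = np.random.default_rng(0)
strong = simulate_wald(strength=1.5, n=5_000, reps=200, rng=rng)
weak = simulate_wald(strength=0.2, n=5_000, reps=200, rng=rng)
sd_strong, sd_weak = strong.std(), weak.std()
print(f"SD of estimates, strong IV: {sd_strong:.2f}; weak IV: {sd_weak:.2f}")
```

The sketch does not cover sensitivity to unobserved bias, which the talk treats as a separate consequence of a weak instrument.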
Extended IV Methods for Addressing Violation of Exclusion Restriction • Angrist, Imbens and Rubin (1996): two key conditions for a valid IV are: • IV is effectively randomly assigned conditional on measured covariates X • No direct effect on Y (exclusion restriction). • We consider situations in which the random assignment is plausible but the exclusion restriction is not.
Vascular access in hemodialysis • Hemodialysis • One of main treatment options in end-stage renal disease (ESRD) • Requires access to vascular system • Three main types • Catheter • Synthetic material • Native arteriovenous fistula (AVF)
Vascular access (cont’d) • Type of VA (A) partially determines dose of dialysis (DD; S) • Native AVF allows larger doses than catheter • S may affect outcomes (e.g., mortality) • VA may have effects on outcome (Y) not mediated by dose (e.g., infection) • Incomplete directed acyclic graph (DAG) of key variables
Estimand of interest • To gauge impact of type of VA, interested in overall effect • Involves both • Direct effect (A->Y) • Indirect effect (A->S->Y) • Formulate in terms of potential outcomes:
Confounding by indication • AVFs given preferentially to healthier subjects • Results in confounding by indication • Often difficult to control using standard methods based on ignorable treatment assignment • Variety of treatments of dialysis patients in which standard approaches based on ignorability lead to implausible results • Dose of dialysis choice (S) also nonignorable
Instrumental variables • Alternative approach for estimation • Need to find instrumental variable (R) • Associated with treatment of interest (A) • Independent of unmeasured confounders, i.e., shares no unmeasured common cause with outcome Y. • Has no direct effect on outcome (exclusion restriction) • Practice at which dialysis provided reasonable candidate • Used for various analyses in Dialysis Outcomes and Practice Patterns Study (DOPPS) • Large, international study with hundreds of practices • Will assume that practice (R) shares no unmeasured common causes with S or Y.
Revise DAG • Need to elaborate DAG • Include • instrument/center (R) • Measured (X) and unmeasured (U) common causes of variables of interest • Is R a valid instrument for the overall effect of A on Y?
Graphical criteria for instrument • Remove effect of treatment of interest • Check whether R independent of/D-separated from Y • Directed path R->S->Y • Criterion not satisfied • R not a valid instrument for overall effect of A • In Angrist, Imbens & Rubin framework, the problem is that R has direct effect on Y through S and hence violates the exclusion restriction.
Second Example: Return to Schooling • Y=Earnings, A=Years of Education • Unmeasured confounders: Ability, Motivation. • Card (1993) proposes as an IV, R= distance person grew up from nearest four year college. • Problem: • R also affects whether person lives in an SMSA as an adult (S) conditional on A and measured confounders X (whether lived in an SMSA growing up, region where grew up and family background variables). • There is a wage premium to living in an SMSA as an adult.
Return to Schooling DAG • R (living near college growing up) is not a valid instrument for the overall effect of A (years of schooling) on Y (earnings) because it has direct effect on Y through S (lives in SMSA as an adult).
Estimation • For estimating overall effects of A in these two problems, can’t use • Standard methods based on ignorability • Standard instrumental variables methods • Idea: Look for interactions between R and X that can serve as instruments.
Extended Instruments • Look for a component of X that interacts with R to affect A but not Y directly. • Card proposes family income as a component of X that • Interacts with R to affect A: college proximity lowers the costs of higher education, so it has a bigger effect on a poorer family • Does not directly affect S or Y: the direct earnings effect of living near a college, and the direct effect on living in an SMSA, do not vary by family background.
Two-step approach • Estimate joint effect of A, S on Y • Estimate effect of A on S • Combine to obtain overall effect • In systems of linear models, overall effect is sum of • Direct effect of A: $\psi_A$ • Indirect effect of A: $\psi_S\Phi_A$
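The sum above follows by substitution. The short derivation below is a sketch that assumes the two linear, rank-preserving models used in the deck's two-step slides (a joint model for Y in terms of A and S, and a model for S in terms of R and A). The notation $Y_{a,S_{ra}}$, the outcome when A is set to $a$ and S takes the value it would have under $(r, a)$, is introduced here for illustration.

```latex
\begin{align*}
Y_{a,\,S_{ra}} &= Y_{00} + a\,\psi_A + S_{ra}\,\psi_S \\
               &= Y_{00} + a\,\psi_A + \left(S_{00} + r\,\Phi_R + a\,\Phi_A\right)\psi_S, \\
Y_{a,\,S_{ra}} - Y_{a',\,S_{ra'}} &= (a - a')\left(\psi_A + \psi_S\,\Phi_A\right).
\end{align*}
```

Holding $r$ fixed, a one-unit change in A therefore changes the outcome by $\psi_A + \psi_S\Phi_A$: the direct effect plus the effect transmitted through S.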
Two-step approach (1st step) • $Y_{as}$: potential outcome • Model for joint effect: $Y_{as} = Y_{00} + a\psi_A + s\psi_S$ • Rank-preserving/deterministic formulation • Model for observables • $E^*$ = best linear predictor • $E^*(Y \mid X,R,X*R) = E^*(Y_{AS} \mid X,R,X*R) = E^*(Y_{00} \mid X,R,X*R) + E^*(A \mid X,R,X*R)\,\psi_A + E^*(S \mid X,R,X*R)\,\psi_S$ • Identifiability requires that $E^*(Y_{00} \mid X,R,X*R)$, $E^*(A \mid X,R,X*R)$ and $E^*(S \mid X,R,X*R)$ not be collinear. • One way: assume $E^*(Y_{00} \mid X,R,X*R)$ depends only on X. Then we need one component of X that interacts with R to affect A. • Another way: assume $E^*(Y_{00} \mid X,R,X*R)$ depends on X and R but not X*R. Then we need at least two components of X that interact with R. • Estimation by two-stage least squares: regress A and S on X, R and X*R; then regress Y on the fitted values $\hat{A}$ and $\hat{S}$ together with the terms assumed to enter $E^*(Y_{00} \mid X,R,X*R)$.
Two-step approach (2nd step) • Under assumptions • Effect of A on S confounded • R not instrument for effect of A on S • Consider alternative • Linear model for joint effect of R, A: $S_{ra} = S_{00} + r\Phi_R + a\Phi_A$ • Model for observables: $E^*(S \mid X,R,X*R) = E^*(S_{00} \mid X,R,X*R) + R\,\Phi_R + E^*(A \mid X,R,X*R)\,\Phi_A$ • Can estimate by 2SLS under the assumption that $E^*(S_{00} \mid X,R,X*R)$ does not depend on X*R (uncheckable) and that X*R affects A. • Regress A on X, R, X*R. Regress S on $\hat{A}$, X, R.
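The two steps can be sketched end to end on synthetic data. This is an illustration under stated assumptions, not the authors' implementation: the data-generating process, the coefficient values, and the helper `tsls` are hypothetical, and the simulation follows the "one way" identification route in which the baseline outcome depends on X but not on R or X*R.

```python
import numpy as np

def tsls(y, endog, exog, instruments):
    """Two-stage least squares; returns coefficients on the endogenous columns."""
    endog = np.asarray(endog, dtype=float).reshape(len(y), -1)
    first_stage = np.column_stack([exog, instruments])
    fitted = first_stage @ np.linalg.lstsq(first_stage, endog, rcond=None)[0]
    second_stage = np.column_stack([fitted, exog])
    coef = np.linalg.lstsq(second_stage, y, rcond=None)[0]
    return coef[: endog.shape[1]]

rng = np.random.default_rng(1)
n = 100_000
psi_A, psi_S, phi_R, phi_A = 1.0, 2.0, 1.0, 0.5   # hypothetical true values

x = rng.normal(size=n)                         # measured covariate
r = rng.integers(0, 2, size=n).astype(float)   # R: affects S directly, so not a valid IV alone
u = rng.normal(size=n)                         # unmeasured confounder
a = 0.5 * x + 1.0 * r + 1.0 * x * r + u + rng.normal(size=n)   # X*R shifts A
s = phi_R * r + phi_A * a + u + rng.normal(size=n)             # S_00 = u + noise
y = 2.0 * x + psi_A * a + psi_S * s + 3.0 * u + rng.normal(size=n)

ones = np.ones(n)
# Step 1: A and S endogenous; X included; R and X*R are the excluded instruments.
psi_A_hat, psi_S_hat = tsls(y, np.column_stack([a, s]),
                            np.column_stack([ones, x]),
                            np.column_stack([r, x * r]))
# Step 2: A endogenous; X and R included; X*R is the only excluded instrument.
(phi_A_hat,) = tsls(s, a, np.column_stack([ones, x, r]), x * r)

overall_hat = psi_A_hat + psi_S_hat * phi_A_hat
print(f"psi_A={psi_A_hat:.2f} psi_S={psi_S_hat:.2f} phi_A={phi_A_hat:.2f} overall={overall_hat:.2f}")
```

The point estimates come from plain least-squares algebra. Standard errors would need the usual 2SLS corrections, which this sketch omits.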
Summary • The IV method can be a powerful strategy for observational studies when there are confounders that are hard to measure and there is a “random” encouragement to receive treatment. • When encouragement is not actually random, it is important to do a sensitivity analysis. • Strong IVs are much less sensitive to bias. • When the exclusion restriction might be violated, we developed extended IV methods that use X*R as IVs.
Papers • Small, D.S. and Rosenbaum, P.R. (2008), “War and Wages: The Strength of Instrumental Variables and Their Sensitivity to Unobserved Biases,” Journal of the American Statistical Association, 103, 924-933. • Joffe, M.M., Small, D.S., Brunelli, S., Ten Have, T.R., and Feldman, H.I. (2008), “Extended Instrumental Variables Estimation for Overall Effects,” International Journal of Biostatistics, 4. • Baiocchi, M., Small, D.S., Lorch, S.A. and Rosenbaum, P.R. (2010), “Building a Stronger Instrument in an Observational Study of Perinatal Care for Premature Infants,” Journal of the American Statistical Association, 105, 1285-1296. • e-mail: dsmall@wharton.upenn.edu
Alternative estimands • Assumed that interested in overall effect • Vascular Access (VA) inevitably affects Dose of Dialysis (DD) • Type of VA limits possible dose • However, may be possible to alter DD • Interested in • Effect of DD • Effect of VA if affects DD in different fashion from under current practice
Alternative estimands (cont’d) • Show altered effect, new intervention on DAG • Formulate in terms of potential outcomes • Contrast for different levels of treatment
Alternative estimands (cont’d) • Defining intervention on S • Individualize target levels of S • e.g., base on maximum tolerated DD • Insufficient information in established databases (e.g., DOPPS) • Set target level of S based on A, covariates X • Currently little information to set target levels • Available covariate information may be insufficient to determine whether particular DD feasible for individual
Alternative estimands (cont’d) • Defining intervention on S • Speculate about feasible interventions on S at aggregate level • Consider effects of A on S under those interventions; i.e., propose value for $\Phi_A^*$ • Compute overall effect from component effects: $\psi_A + \psi_S\Phi_A^*$ • Perform sensitivity analysis for values of $\Phi_A^*$
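The sensitivity analysis over proposed values of the A-to-S effect can be sketched as a simple sweep. The numeric values of the component effects and the grid below are hypothetical placeholders, not estimates from DOPPS or any other source. In practice they would come from the first-step fit and from subject-matter judgment about feasible interventions.

```python
import numpy as np

def overall_effect(psi_A, psi_S, phi_A_star):
    """Overall effect of A when A would change S by phi_A_star per unit of A."""
    return psi_A + psi_S * phi_A_star

# Hypothetical first-step estimates (illustrative only).
psi_A_hat, psi_S_hat = -0.10, -0.30

# Hypothetical grid of proposed values for the effect of A on S.
phi_A_star_grid = np.linspace(0.0, 1.0, 5)
sensitivity = {
    float(p): float(overall_effect(psi_A_hat, psi_S_hat, p))
    for p in phi_A_star_grid
}
for p, effect in sensitivity.items():
    print(f"phi_A* = {p:.2f} -> overall effect = {effect:.3f}")
```

Reporting the whole table, rather than one value, shows how strongly the conclusion depends on the assumed intervention on S.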
One-step approach • Estimator of effect of A on S does not require either standard ignorability or IV • Can we do the same for overall effect of A on Y? • Remove S from graph, redraw diagram • Graph identical to original graph removing Y • Use same methods of estimation as for effect of A on S