This presentation covers methods and applications for estimating causal effects from observational data, focusing on propensity-score-based methods (matching and weighting) and their limitations.
Causal Effect Estimation with Observational Data: Methods and Applications, Part II
Michael Lamm and Yiu-Fai Yung, SAS Institute
2018 Iowa and Nebraska SAS Users Groups
Outline
• Part I
  • Issues of causal inference from observational data
  • Introducing the propensity score
  • Theories and assumptions
  • Matching methods
• Part II
  • Weighting methods
  • Doubly robust methods
  • Limitations
  • Summary and conclusions
Confounding variables complicate the estimation of causal effects from an observational study
[Diagram: Sports, Music (music training), GPA (academic performance)]
• Confounding variables are pretreatment characteristics associated with both the treatment and the outcome variables
• Confounding variables explain parts of the observed treatment-outcome association and can bias causal effect estimates
The propensity score is commonly used as the basis of matching methods
[Flowchart: (Re-)specify a propensity score model -> Good covariate balance? -> No: respecify the model; Yes: outcome analysis]
• A propensity score is the probability of receiving treatment given the pretreatment covariates $X$: $e(X) = P(T = 1 \mid X)$
Causal effects are defined by using potential outcomes
• Potential outcomes describe what outcome would occur for a subject under every possible treatment scenario
  • $Y(1)$: potential outcome in the treatment condition
  • $Y(0)$: potential outcome in the control condition
• You can estimate the ATE, $E[Y(1) - Y(0)]$, or the ATT, $E[Y(1) - Y(0) \mid T = 1]$
• The stable unit treatment value assumption (SUTVA) ensures that causal effects are well-defined
• The consistency assumption relates the observed outcomes to the potential outcomes: $Y = T\,Y(1) + (1 - T)\,Y(0)$
• No unmeasured confounding is assumed to enable the identification of treatment effects: $(Y(1), Y(0)) \perp T \mid X$
If you can observe all the potential outcomes ...
• A hypothetical perfect sample in which you can observe all potential outcomes Y(1), Y(0)
• Average treatment effect = Mean(Y(1)) - Mean(Y(0)) = 6 - 4 = 2
The fundamental problem of causal inference (Holland 1986)
• Each observed Y reveals only one of the potential outcomes
• The other potential outcome is always missing
• Observed mean of (Y|T=0) = 2
• Observed mean of (Y|T=1) = 6
• Observed effect = 6 - 2 = 4
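The gap between the true effect (2) and the observed effect (4) can be reproduced with a tiny example. The four subjects below are invented for illustration; they are chosen only so that the summary values match the numbers on the slides, and they are not the sample the presenters used.

```python
# Hypothetical "perfect sample": both potential outcomes are known for every subject.
# Each tuple is (T, Y1, Y0); the values are invented to match the slides' summaries.
subjects = [
    (1, 6, 6),
    (1, 6, 6),
    (0, 6, 2),
    (0, 6, 2),
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# True average treatment effect: needs both potential outcomes for everyone.
true_ate = mean(y1 for _, y1, _ in subjects) - mean(y0 for _, _, y0 in subjects)

# In practice only one potential outcome is observed: Y = T*Y1 + (1 - T)*Y0.
observed = [(t, y1 if t == 1 else y0) for t, y1, y0 in subjects]
observed_effect = (mean(y for t, y in observed if t == 1)
                   - mean(y for t, y in observed if t == 0))

print(true_ate)         # 2.0
print(observed_effect)  # 4.0
```

The treated subjects here would have had high outcomes even without treatment, which is exactly the kind of imbalance that confounding produces.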
Inverse probability weighting can create a pseudo-population with comparable treatment conditions
• Inverse probability weighting is a common approach for handling missing data
• A patient with covariates $X$ and treatment $T$ receives a weight based on the propensity score $e(X)$: $1/e(X)$ if $T = 1$ and $1/(1 - e(X))$ if $T = 0$
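A minimal sketch of how the weights build a pseudo-population, using invented data with a single binary confounder X and a propensity score that is assumed to be known (in practice it is estimated). The estimator shown is the normalized (ratio) form of inverse probability weighting; the data are constructed so that the true ATE is 2.

```python
# Hypothetical data: tuples of (X, T, Y). The propensity score is assumed known:
# P(T=1 | X=0) = 0.25 and P(T=1 | X=1) = 0.75.
data = [
    (0, 1, 4), (0, 0, 2), (0, 0, 2), (0, 0, 2),
    (1, 1, 8), (1, 1, 8), (1, 1, 8), (1, 0, 6),
]
propensity = {0: 0.25, 1: 0.75}


def ipw_weight(t, e):
    """Inverse of the probability of the treatment actually received."""
    return 1.0 / e if t == 1 else 1.0 / (1.0 - e)


def weighted_mean(pairs):
    """pairs is an iterable of (weight, value)."""
    pairs = list(pairs)
    return sum(w * y for w, y in pairs) / sum(w for w, _ in pairs)


def arm_mean(arm, weighted):
    """Mean outcome in one treatment arm, with or without IPW weights."""
    return weighted_mean(
        (ipw_weight(t, propensity[x]) if weighted else 1.0, y)
        for x, t, y in data if t == arm
    )


naive_diff = arm_mean(1, False) - arm_mean(0, False)  # confounded comparison
ipw_diff = arm_mean(1, True) - arm_mean(0, True)      # pseudo-population comparison
print(naive_diff, ipw_diff)
```

In the weighted sample each arm has total weight 8, the full sample size, and X is split evenly within each arm, so the arms become comparable.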
Observational studies have a large amount of missing potential outcomes
• Only a single outcome is observed for each subject, so at least half of the potential outcomes are "missing"
• This is a much larger amount of missing data than is typically encountered in an experiment or RCT
Like experiments, observational studies should be carefully designed to ensure proper analyses
• Designing an experiment helps ensure that you are examining a well-defined causal question that satisfies the SUTVA
  • What is the target population?
  • When is treatment assigned and how long is the treatment period?
  • Is the outcome being properly measured?
• The same questions should be considered when designing and analyzing observational studies
• Clear answers to design questions are necessary to state the conditions under which claims of causality are valid
Does quitting smoking lead to weight change?
• Data: a subset (N=1,746) of the NHANES I Epidemiologic Follow-Up Study (NHEFS) in Hernán and Robins (2016)
• Medical and behavioral information was collected in an initial physical examination
• Follow-up interviews were done approximately 10 years later
• Treatment variable Quit: quit smoking during the 10-year period
• Outcome variable Change: change in weight (in kg)
• Confounders include: Activity, Age, BaseWeight, Education, Exercise, PerDay, Race, Sex, Weight, YearsSmoke
PROC PSMATCH: Estimating ATT through output weights

   proc psmatch data=smokingweight;
      class Sex Race Education Exercise Activity Quit;
      psmodel Quit(Treated='1') = Sex Age Education Exercise Activity YearsSmoke PerDay;
      output out=smokeATTWeights attwgt=attwgt;
   run;

   proc ttest data=smokeATTWeights;
      class Quit;
      var Change;
      weight attwgt;
   run;
ATT weights correct for bias and target the analysis to the population of interest
• For an individual with observed treatment $T$ and covariates $X$, the ATT weight equals $T + (1 - T)\,\frac{e(X)}{1 - e(X)}$
• The propensity score is $e(X) = P(T = 1 \mid X)$
• For subjects in the treatment condition, ATT weight = 1
• For subjects in the control condition, ATT weight = $\frac{e(X)}{1 - e(X)}$
The output data set contains the original data, propensity scores, and ATT weights

   Obs  sex  age  ...  _PS_     attwgt
   1    0    42   ...  0.20604  0.25951
   2    0    36   ...  0.16018  0.19073
   3    1    56   ...  0.27455  0.37845
   4    0    68   ...  0.46022  0.85260
   5    0    40   ...  0.28227  0.39327
   .    .    .    ...  .        .
   .    .    .    ...  .        .
   .    .    .    ...  .        .
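The listed weights can be checked by hand against the ATT weight rule (1 for treated subjects, propensity odds for control subjects). The sketch below assumes the five listed observations are control subjects, which is inferred only from the fact that none of their weights equals 1; the function name `att_weight` is illustrative and not part of any SAS procedure.

```python
def att_weight(treated, ps):
    """ATT weight: 1 for treated subjects, ps / (1 - ps) for control subjects."""
    return 1.0 if treated else ps / (1.0 - ps)


# (_PS_, attwgt) pairs copied from the first five observations in the listing.
# Assumption: all five are control subjects, because no listed weight equals 1.
listed = [
    (0.20604, 0.25951),
    (0.16018, 0.19073),
    (0.27455, 0.37845),
    (0.46022, 0.85260),
    (0.28227, 0.39327),
]

for ps, attwgt in listed:
    recomputed = att_weight(False, ps)
    # Agreement holds only up to rounding of the printed propensity scores.
    print(f"ps={ps:.5f}  listed={attwgt:.5f}  recomputed={recomputed:.5f}")
```

Control subjects who resemble the treated group (high propensity score) get larger weights, which is how the weighted controls are made to stand in for the treated population.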
Inverse propensity score weighting method for estimating ATT
T-test with weights based on the sample created from the PSMATCH procedure

   Variable: Change

   Quit        Method          Mean      95% CL Mean          Std Err
   0                           1.2495    0.8253     1.6736    0.2162
   1                           4.5251    3.6684     5.3818    0.4358
   Diff (1-2)  Pooled         -3.2756   -4.0773    -2.4740    0.4087
   Diff (1-2)  Satterthwaite  -3.2756   -4.2310    -2.3203    0.4865
PROC CAUSALTRT: Estimating ATT with METHOD=IPWR

   proc causaltrt data=smokingweight method=ipwr att;
      class Sex Race Education Exercise Activity Quit;
      psmodel Quit(Event='1') = Sex Age Education Exercise Activity YearsSmoke PerDay
              / plots=pscovden(effects(Age YearsSmoke));
      model Change;
   run;

• The model statement and psmodel statement are both required when you use a weighting method in PROC CAUSALTRT
• You use the method= option to select an estimation method
• By default PROC CAUSALTRT estimates the ATE; to request estimation of the ATT, you use the att option
Inverse propensity score weighting method for estimating ATT
Estimation of ATT by the IPWR method of the CAUSALTRT procedure

   Analysis of Causal Effect

              Treatment             Robust    Wald 95%
   Parameter  Level      Estimate   Std Err   Confidence Limits    Z       Pr > |Z|
   POM        1          4.5251     0.4352    3.6720    5.3781     10.40   <.0001
   POM        0          1.2495     0.2565    0.7467    1.7522      4.87   <.0001
   ATT                   3.2756     0.4815    2.3319    4.2193      6.80   <.0001
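The ATT row follows arithmetically from the other reported values: the estimate is the difference of the two potential outcome means (POM), and the limits and Z statistic are the usual Wald quantities. A small check, assuming a normal critical value of about 1.96 (the helper name `wald_summary` is illustrative):

```python
def wald_summary(estimate, std_err, z_crit=1.959964):
    """Return (lower, upper, z) for a Wald interval; z_crit is the 97.5% normal quantile."""
    half_width = z_crit * std_err
    return estimate - half_width, estimate + half_width, estimate / std_err


pom1, pom0 = 4.5251, 1.2495                   # potential outcome means from the table
att = pom1 - pom0                             # ATT = POM(1) - POM(0)
lower, upper, z = wald_summary(att, 0.4815)   # 0.4815 is the reported robust std err

print(round(att, 4), round(lower, 4), round(upper, 4), round(z, 2))
```

The recomputed values agree with the table up to rounding of the printed inputs.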
Which PS-weighting method in the CAUSALTRT and PSMATCH procedures should you use?
• Both yield the same estimates of ATT in this example
• PROC CAUSALTRT produces standard error estimates that take the estimation of propensity scores into account
• The catch: you must be certain that your propensity score model is correct
CAUSALTRT or PSMATCH
• Same theoretical foundations: potential outcomes framework (Neyman 1923; Rubin 1974)
• Some overlap in functionalities (e.g., weighting methods)
• PROC PSMATCH motto: "Do not involve the outcome variables when you do propensity score analysis: stratification, matching, or weighting"
  • Advantage: separation of design from analysis enables exploratory analysis in propensity score analysis
• PROC CAUSALTRT: results from the propensity score model and outcome model can be "combined" (AIPW or IPWREG)
  • Advantage: more efficient point estimates and more accurate standard error estimates
Regression Adjustment Method
Estimation by regression adjustment performs the following steps:
• Fit models for the outcome separately within each of the treatment conditions
• Compute predicted outcomes for each subject from these models
• Use the predicted values to estimate the treatment effect of interest
Use the METHOD=REGADJ option in PROC CAUSALTRT
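The three steps above can be sketched on a toy data set. This is an illustration under stated assumptions, not the CAUSALTRT implementation: the data are invented, there is a single covariate X, and the outcome model within each arm is a simple linear regression fit by ordinary least squares (the helper `fit_line` is hypothetical).

```python
# Hypothetical data: tuples of (X, T, Y), constructed so that the true ATE is 2.
data = [
    (0, 1, 4), (0, 0, 2), (0, 0, 2), (0, 0, 2),
    (1, 1, 8), (1, 1, 8), (1, 1, 8), (1, 0, 6),
]


def fit_line(points):
    """Ordinary least squares for y = a + b*x with a single covariate."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Step 1: fit the outcome model separately within each treatment condition.
a1, b1 = fit_line([(x, y) for x, t, y in data if t == 1])
a0, b0 = fit_line([(x, y) for x, t, y in data if t == 0])

# Step 2: predict both potential outcomes for every subject, treated or not.
pred1 = [a1 + b1 * x for x, _, _ in data]
pred0 = [a0 + b0 * x for x, _, _ in data]

# Step 3: average the predictions to get the potential outcome means and the ATE.
pom1 = sum(pred1) / len(data)
pom0 = sum(pred0) / len(data)
ate_regadj = pom1 - pom0
print(pom1, pom0, ate_regadj)
```

Because predictions are averaged over all subjects, each arm's model is asked to predict for subjects in the other arm, which is where the extrapolation concern arises when covariate distributions differ.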
Considerations when using regression adjustment
• The method depends on a correctly specified outcome model
  • Incorrectly specified outcome models can lead to biased model estimates and biased treatment effects
• Extrapolation might be a concern if covariate distributions in the treatment conditions are systematically different
Doubly Robust Methods of PROC CAUSALTRT
• Augmented inverse probability weighting
  • Estimate the propensity scores and perform weighting
  • Augment weighting by using predicted outcome values
  • METHOD=AIPW in PROC CAUSALTRT
• Inverse probability weighted regression
  • Estimate the propensity scores
  • Fit outcome models with inverse probability weights
  • Estimate the causal effects by using predicted values from these models
  • METHOD=IPWREG in PROC CAUSALTRT
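For reference, one common way to write the AIPW estimator of the potential outcome means is shown below, in the form given by Lunceford and Davidian (2004). The notation is an assumption for this note: $\hat{e}_i$ is the estimated propensity score for subject $i$, and $\hat{m}_1(X_i)$ and $\hat{m}_0(X_i)$ are predicted outcomes from the treated and control outcome models. The exact form used internally by a given procedure may differ in details.

```latex
\begin{aligned}
\hat{\mu}_1 &= \frac{1}{n}\sum_{i=1}^{n}
  \left[\frac{T_i Y_i}{\hat{e}_i}
        - \frac{T_i - \hat{e}_i}{\hat{e}_i}\,\hat{m}_1(X_i)\right], \\
\hat{\mu}_0 &= \frac{1}{n}\sum_{i=1}^{n}
  \left[\frac{(1 - T_i)\,Y_i}{1 - \hat{e}_i}
        + \frac{T_i - \hat{e}_i}{1 - \hat{e}_i}\,\hat{m}_0(X_i)\right], \\
\widehat{\mathrm{ATE}} &= \hat{\mu}_1 - \hat{\mu}_0 .
\end{aligned}
```

The first term in each bracket is the inverse probability weighted outcome; the second term is the augmentation, which has mean zero when the propensity score model is correct.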
Main Idea of Doubly Robust Methods
• You can get unbiased estimation of causal treatment effects if either or both of the following models that you specify are true:
  • Propensity score model for the treatment variable
  • Regression model for the outcome variable
• "Doubly" robust: you have two chances to get it right
• AIPW formulas: analytic formulas for computing standard errors (Lunceford & Davidian 2004)
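The "two chances" property can be seen numerically. The sketch below applies a standard AIPW formula to invented data whose true ATE is 2, pairing correct and deliberately wrong working models. All model functions are hypothetical stand-ins for fitted models, and the data are not from the presentation.

```python
# Hypothetical data: tuples of (X, T, Y), constructed so that the true ATE is 2.
data = [
    (0, 1, 4), (0, 0, 2), (0, 0, 2), (0, 0, 2),
    (1, 1, 8), (1, 1, 8), (1, 1, 8), (1, 0, 6),
]


def aipw_ate(ps, m1, m0):
    """AIPW estimate of the ATE from a propensity function and two outcome models."""
    n = len(data)
    mu1 = sum(t * y / ps(x) - (t - ps(x)) / ps(x) * m1(x)
              for x, t, y in data) / n
    mu0 = sum((1 - t) * y / (1 - ps(x)) + (t - ps(x)) / (1 - ps(x)) * m0(x)
              for x, t, y in data) / n
    return mu1 - mu0


def ps_right(x):   # matches how treatment was assigned in the toy data
    return 0.25 if x == 0 else 0.75

def ps_wrong(x):   # ignores X entirely
    return 0.5

def m1_right(x):   # matches the treated outcomes exactly
    return 4 + 4 * x

def m0_right(x):   # matches the control outcomes exactly
    return 2 + 4 * x

def m1_wrong(x):   # ignores X: raw treated mean
    return 7.0

def m0_wrong(x):   # ignores X: raw control mean
    return 3.0


ps_only = aipw_ate(ps_right, m1_wrong, m0_wrong)       # only the PS model is right
outcome_only = aipw_ate(ps_wrong, m1_right, m0_right)  # only the outcome model is right
neither = aipw_ate(ps_wrong, m1_wrong, m0_wrong)       # both models are wrong
print(ps_only, outcome_only, neither)
```

With either model correct the estimate recovers the true effect of 2; with both wrong it falls back to the confounded difference of 4.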
Estimating ATE by the AIPW Method

   proc causaltrt data=smokingweight method=aipw covdiffps plots=all;
      class Sex Race Education Exercise Activity Quit / descending;
      psmodel Quit = Sex Age Education Exercise Activity YearsSmoke PerDay;
      model Change = Sex Age Exercise Activity BaseWeight;
   run;

• The aipw estimation method requires models for both the treatment and the outcome
• You can request measures of covariate balance by using the covdiffps option
Estimation of ATE by the AIPW Method

              Treatment             Robust    Wald 95%
   Parameter  Level      Estimate   Std Err   Confidence Limits    Z       Pr > |Z|
   POM        1          5.0830     0.4495    4.2019    5.9641     11.31   <.0001
   POM        0          1.7781     0.2156    1.3556    2.2007      8.25   <.0001
   ATE                   3.3049     0.4911    2.3423    4.2675      6.73   <.0001
Assessing Covariate Balance

   Covariate Differences for Propensity Score Model

                   Standardized Difference    Variance Ratio
   Parameter       Unweighted   Weighted      Unweighted   Weighted
   Sex 1           -0.1603      -0.0200       0.9962       1.0006
   Sex 0
   Age              0.2820       0.0318       1.0731       0.9847
   Education 5      0.1660       0.0111       1.4610       1.0268
   Education 4     -0.0270       0.0196       0.9167       1.0624
   Education 3     -0.0472      -0.0015       0.9811       0.9994
   Education 2     -0.1116      -0.0034       0.8498       0.9953
   Education 1
   Exercise 2       0.0568      -0.0029       1.0252       0.9986
   Exercise 1       0.0398       0.0166       1.0119       1.0049
   Exercise 0
   Activity 2       0.0740      -0.0074       1.2182       0.9796
   Activity 1       0.0268       0.0196       1.0043       1.0029
   Activity 0
   YearsSmoke       0.1589       0.0253       1.1846       1.0894
   PerDay          -0.2167       0.0027       1.1679       1.3323
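To make the two balance measures concrete, here is a sketch that computes a standardized mean difference and a variance ratio for one covariate, before and after inverse probability weighting. It uses invented data and one common convention (difference in means divided by the square root of the average of the two group variances, with weighted variances computed as weighted mean squared deviations). The exact definitions used by a given procedure may differ.

```python
import math

# Hypothetical data: tuples of (X, T, Y) with an assumed known propensity score.
data = [
    (0, 1, 4), (0, 0, 2), (0, 0, 2), (0, 0, 2),
    (1, 1, 8), (1, 1, 8), (1, 1, 8), (1, 0, 6),
]
propensity = {0: 0.25, 1: 0.75}


def weighted_mean_var(values, weights):
    """Weighted mean and weighted mean squared deviation."""
    total = sum(weights)
    mean = sum(w * v for v, w in zip(values, weights)) / total
    var = sum(w * (v - mean) ** 2 for v, w in zip(values, weights)) / total
    return mean, var


def balance(weighted):
    """Return (standardized difference, variance ratio) of X, treated vs. control."""
    stats = {}
    for arm in (1, 0):
        xs = [x for x, t, _ in data if t == arm]
        if weighted:
            ws = [1 / propensity[x] if arm == 1 else 1 / (1 - propensity[x])
                  for x in xs]
        else:
            ws = [1.0] * len(xs)
        stats[arm] = weighted_mean_var(xs, ws)
    (m1, v1), (m0, v0) = stats[1], stats[0]
    return (m1 - m0) / math.sqrt((v1 + v0) / 2), v1 / v0


print("unweighted:", balance(False))
print("weighted:  ", balance(True))
```

Good balance shows up as a standardized difference near 0 and a variance ratio near 1, which is the pattern to look for in the Weighted columns of the table.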
Family Aid and Child Development
• A subset of data from the 1997 Child Development Supplement to the Panel Study of Income Dynamics (Hofferth et al. 2001; Guo and Fraser 2015)
• Treatment variable AFDC: receiving welfare benefit
• Outcome variable Lwi: child's development, as measured by the age-normalized letter-word identification portion of the Woodcock-Johnson Tests for Achievement
• N=1,003 children whose primary caregiver was less than 36 years old
Other Variables
• Age: age of the child in 1997
• PcgAFDC: indicator for whether the child's primary caregiver received support from a public assistance program when the primary caregiver was between the ages of 6 and 12
• PcgEd: number of years of schooling for the child's primary caregiver
• Race: indicator for whether the child is African-American
• Ratio: ratio of family income to the poverty threshold in 1996
• Sex: indicator for whether the child is male
Data Set

   data Children;
      input Sex Race Age Ratio PcgEd PcgAFDC AFDC Lwi;
      datalines;
   0 0  4 0.6089 12 0 1  81
   0 0 12 0.4113  9 0 1  93
   0 0 12 4.9965 12 0 0 109
   1 0  6 1.0683 11 1 0  74
   1 0  4 1.0683 11 1 0  79
   0 0  4 3.1081 12 1 0  88
   ... more lines ...
   1 1  6 0.7390 12 1 1  99
   0 1  5 1.1932 12 1 0 115
   0 1  4 1.5719 11 1 0 108
   1 1  3 1.1919 12 1 0 108
   1 1  3 0.3129  9 0 1 101
   0 1  5 2.3229 12 0 0  79
   ;
Estimating Welfare Effect on Child Development

   proc causaltrt data=Children covdiffps nthreads=2;
      class AFDC PcgAFDC Race Sex;
      psmodel AFDC(ref='0') = Sex Race Age PcgEd PcgAFDC
              / plots=(pscovden weightcloud);
      model Lwi = Sex PcgEd Ratio;
      bootstrap seed=1776;
   run;
AIPW Estimation of ATE by PROC CAUSALTRT

   Analysis of Causal Effect

              Treatment             Robust    Bootstrap   Wald 95%
   Parameter  Level      Estimate   Std Err   Std Err     Confidence Limits
   POM        1           98.5565   1.4458    2.1344       95.7229   101.39
   POM        0          103.14     0.8086    0.7919      101.56     104.73
   ATE                    -4.5867   1.6437    2.2470       -7.8082    -1.3652

                         Bootstrap Bias
              Treatment  Corrected 95%
   Parameter  Level      Confidence Limits      Z        Pr > |Z|
   POM        1           94.5800   103.17       68.17   <.0001
   POM        0          101.74     104.64      127.56   <.0001
   ATE                    -8.8993     0.4247     -2.79   0.0053
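The bootstrap columns come from re-estimating the effect on resampled data. The sketch below illustrates the general idea only. It uses invented data, a normalized IPW estimator instead of AIPW, a propensity score that is held fixed rather than refit in each resample, and a plain percentile interval rather than the bias-corrected interval reported in the table.

```python
import random
import statistics

# Hypothetical data: tuples of (X, T, Y) with an assumed known propensity score.
data = [
    (0, 1, 4), (0, 0, 2), (0, 0, 2), (0, 0, 2),
    (1, 1, 8), (1, 1, 8), (1, 1, 8), (1, 0, 6),
]
propensity = {0: 0.25, 1: 0.75}


def ipw_ate(sample):
    """Normalized IPW estimate of the ATE; returns None if an arm is empty."""
    sums = {1: [0.0, 0.0], 0: [0.0, 0.0]}   # arm -> [sum of w*y, sum of w]
    for x, t, y in sample:
        e = propensity[x]
        w = 1.0 / e if t == 1 else 1.0 / (1.0 - e)
        sums[t][0] += w * y
        sums[t][1] += w
    if sums[1][1] == 0 or sums[0][1] == 0:
        return None
    return sums[1][0] / sums[1][1] - sums[0][0] / sums[0][1]


def bootstrap(n_reps=2000, seed=1776):
    """Bootstrap standard error and a simple 95% percentile interval."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_reps):
        resample = [rng.choice(data) for _ in data]   # sample rows with replacement
        est = ipw_ate(resample)
        if est is not None:                           # skip resamples missing an arm
            estimates.append(est)
    estimates.sort()
    std_err = statistics.stdev(estimates)
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[int(0.975 * len(estimates)) - 1]
    return std_err, lower, upper


std_err, lower, upper = bootstrap()
print(ipw_ate(data), std_err, lower, upper)
```

A bootstrap interval that includes 0 while the Wald interval does not, as in the ATE row of the table, is a signal that the conclusion is sensitive to how uncertainty is assessed.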
Assessing Covariate Balance

   Covariate Differences for Propensity Score Model

                   Standardized Difference    Variance Ratio
   Parameter       Unweighted   Weighted      Unweighted   Weighted
   Sex 0            0.0335      -0.0433       1.0036       0.9936
   Sex 1
   Race 0          -0.9343      -0.0621       0.7404       0.9989
   Race 1
   Age              0.2196       0.0020       1.0266       0.9650
   PcgEd           -0.9067      -0.0974       0.6789       0.5439
   PcgAFDC 0       -0.6476      -0.0660       1.8658       1.0739
   PcgAFDC 1
Other Types of Propensity Score Models
• PROC PSMATCH uses logistic regression models for estimating propensity scores
• If a propensity score model does not lead to good covariate balance, what can you do?
  • Use another set of predictors in the logistic model
  • Use another type of model
Can other modeling techniques yield better propensity scores?

   /* Use of HPSPLIT to fit PSMODEL */
   proc hpsplit data=children seed=12345;
      class AFDC PcgAFDC Race Sex;
      model AFDC = Sex Race Age PcgEd PcgAFDC;
      output out=smpred;
      id AFDC PcgAFDC Race Sex Age PcgEd Lwi;
   run;

• A decision tree that uses the same covariates offers a nonparametric alternative for predicting the propensity scores
HPSPLIT Output Data Set

   Obs  AFDC  PcgAFDC  Race  Sex  Age  PcgEd  Lwi  _Node_  _Leaf_  P_AFDC0  P_AFDC1
    1   1     0        0     0     4   12      81   6      1       0.92088  0.07912
    2   1     0        0     0    12    9      93   9      3       0.28571  0.71429
    3   0     0        0     0    12   12     109   6      1       0.92088  0.07912
    4   0     1        0     1     6   11      74  10      4       0.75000  0.25000
    5   0     1        0     1     4   11      79  10      4       0.75000  0.25000
    6   0     1        0     0     4   12      88   6      1       0.92088  0.07912
    7   1     0        1     0     7   12      95  12      6       0.59633  0.40367
    8   0     0        1     0     6   14     140   4      0       0.79070  0.20930
    9   0     0        1     1     4   14      84   4      0       0.79070  0.20930
   10   0     0        0     1    11   12      99   6      1       0.92088  0.07912
   ... more lines ...
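In the listing, every subject in the same leaf shares the same predicted probabilities. A classification tree typically predicts the class proportions observed in each leaf, so a tree-based propensity score can be thought of as the share of treated subjects in a subject's leaf. The sketch below illustrates that idea with invented leaf memberships; it is not the HPSPLIT algorithm, and the function name is hypothetical.

```python
from collections import defaultdict


def leaf_propensities(rows):
    """rows: iterable of (leaf_id, treated) with treated in {0, 1}.

    Returns the proportion of treated subjects in each leaf.
    """
    counts = defaultdict(lambda: [0, 0])   # leaf -> [n_treated, n_total]
    for leaf, treated in rows:
        counts[leaf][0] += treated
        counts[leaf][1] += 1
    return {leaf: n_treated / n_total
            for leaf, (n_treated, n_total) in counts.items()}


# Hypothetical leaf memberships (not taken from the Children data).
rows = [(1, 0), (1, 0), (1, 0), (1, 1), (3, 1), (3, 1), (3, 0)]
scores = leaf_propensities(rows)
print(scores)
```

One consequence is that a tree yields only as many distinct propensity score values as it has leaves, which affects how matching on those scores behaves.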
Full matching with propensity scores from HPSPLIT

   proc psmatch data=smpred;
      class afdc;
      psdata treatvar=afdc(treated='1') ps=p_afdc1;
      match method=full(kmax=5) stat=ps caliper=.;
      assess ps var=(Age PcgEd) / plots=all weight=matchatewgt;
      output out=fullMatchTree matchid=_MID_ matchatewgt=atewgt;
   run;

• You specify the psdata statement when the input data set includes precomputed propensity scores
• The treatvar= option identifies the treatment variable
• The ps= option identifies the variable that contains the propensity scores
Assessing balance of categorical covariates of the data set created by full matching

   proc freq data=fullMatchTree;
      table afdc*Sex afdc*Race afdc*PcgAFDC;
      weight atewgt;
   run;
Estimation of the ATE

   proc ttest data=fullMatchTree;
      class afdc;
      var Lwi;
      weight atewgt;
   run;

   Variable: Lwi

   AFDC        Method          Mean       95% CL Mean
   0                           103.5      102.3      104.7
   1                            94.7856    93.0614    96.5099
   Diff (1-2)  Pooled            8.7242     6.7855    10.6629
   Diff (1-2)  Satterthwaite     8.7242     6.6174    10.8309