IV Analysis

IV Analysis
Stefan Walter, Dept of Epidemiology and Biostatistics, UCSF, swalter@psg.ucsf.edu

Causality from IV analysis
[Diagram nodes: U, X, Y]
IV methods can consistently estimate the average causal effect of an exposure on an outcome even in the presence of unmeasured confounding! Instrumental variable estimation uses the unconfounded component of the variance in the exposure X (the component determined by the instrument Z) to estimate the effect of X on the outcome Y. It estimates the effect of treatment among those who receive treatment because of the instrument. … if the instrument is valid …

IV Analyses: leverage (pseudo-)randomization
Use pseudo-randomization as an instrument to estimate the effect of a phenotype on the outcome. Example instruments:
• Randomization in an RCT
• Before/after policy change, e.g., labeling rules, pharmacy rules, especially if not implemented universally
• Physician preference
• Distance to service provider
• Any characteristic that makes a patient ineligible for treatment but does not otherwise affect the outcome
[Diagram nodes: Natural experiment (Z), Phenotype (X), Disease (Y), Unmeasured Confounders]

Instrumental Variable Analysis
• Causal diagram representing the assumptions for genetic IV analyses to estimate the effect of BMI on cognition. The causal diagram follows the rules for directed acyclic graphs (DAGs):
• 1) the genotype affects BMI;
• 2) the genetic instrumental variables do not influence the outcome except via BMI; and
• 3) there are no common causes of genotype and cognition.
[Diagram nodes: Gene, BMI, Cognition, Unmeasured Confounder]

Link to RCTs
[Diagram nodes: Z, X, Y, U]
Z – Randomization
X – Treatment
U – Unmeasured Confounding / Selection
Y – Outcome
We value RCTs so much because we are relatively confident that randomization fulfills the assumptions for a valid IV.

Estimation: Many options
$$\beta_{IV} = \frac{\text{Relation Instrument} \rightarrow \text{Outcome (ITT)}}{\text{Relation Instrument} \rightarrow \text{Treatment (Adherence)}}$$
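The ratio on this slide can be sketched in R as follows. This is a minimal illustration on simulated data, not the author's code: the variable names, the binary instrument, and the coefficients (including a true effect of 1.5) are all assumptions chosen for the example.

```r
# Illustrative simulation; every name and value here is an assumption
set.seed(1)
n <- 20000
Z <- rbinom(n, 1, 0.5)            # instrument, e.g. randomized assignment
U <- rnorm(n)                     # unmeasured confounder
X <- 0.8 * Z + U + rnorm(n)       # exposure depends on Z and on U
Y <- 1.5 * X + 2 * U + rnorm(n)   # true causal effect of X on Y is 1.5

itt       <- coef(lm(Y ~ Z))["Z"]          # relation instrument -> outcome (ITT)
adherence <- coef(lm(X ~ Z))["Z"]          # relation instrument -> treatment
beta_iv   <- unname(itt / adherence)       # ratio estimate
beta_ols  <- unname(coef(lm(Y ~ X))["X"])  # naive estimate, confounded by U

print(c(IV = beta_iv, OLS = beta_ols))
```

In this setup the ratio estimate should land near the simulated effect, while the naive regression is pulled away from it by the confounder.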

2 Stage Least Squares (2SLS)
• Calculate the predicted values of the exposure (Z → X), e.g., linear regression of BMI on the IV.
1st Stage: E(X|Z) = predicted phenotype = b0 + b1 Z + bk Other Covariates
• Use the predicted value to explain the outcome (X(Z) → Y), e.g., linear regression of cognition on predicted BMI.
2nd Stage: E(Y|E(X|Z)) = g0 + g1 E(X|Z) + gk Other Covariates
g1 is the IV 2SLS estimate (Local Average Treatment Effect - LATE) (Angrist, Imbens, Rubin, 1993, p. 19)

Control Function Approach (Tchetgen Tchetgen and Vansteelandt, 2013)
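The slide body is not preserved in this transcript. As a sketch only, a common linear form of the control function approach, written in the notation of the 2SLS slide and matching the `lm(Y~A+res1)` call in the practice session, is given below. The symbols $a_0$, $a_1$, $g_2$ and $\hat{\Delta}$ are introduced here for illustration and are not taken from the cited paper.

```latex
\begin{align*}
\text{1st stage:}\quad & E(X \mid Z) = a_0 + a_1 Z,
  \qquad \hat{\Delta} = X - \hat{a}_0 - \hat{a}_1 Z \\
\text{2nd stage:}\quad & E(Y \mid X, \hat{\Delta}) = g_0 + g_1 X + g_2 \hat{\Delta}
\end{align*}
```

Here $g_1$ plays the role of the IV estimate, and the first-stage residual $\hat{\Delta}$ absorbs the confounded part of the exposure.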

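Practice Session with simulated data
• Generate the universe:
# three independent instruments taking the values 0, 1, 2
z1 <- sample(c(0,0,0,0,0,1,1,1,2,2), 10000, replace=TRUE)
z2 <- sample(c(0,0,0,0,0,1,1,1,2,2), 10000, replace=TRUE)
z3 <- sample(c(0,0,0,0,0,1,1,1,2,2), 10000, replace=TRUE)
e1 <- rnorm(10000, sd=3)   # noise in the exposure
e2 <- rnorm(10000, sd=3)   # noise in the outcome
U <- rnorm(10000, sd=1)    # unmeasured confounder
A <- 27 + 0.6*z1 + 0.3*z2 + 0.1*z3 + 2*U + e1   # exposure
Y <- 30 + 1.5*A + 4*U + e2                      # outcome; true causal effect of A is 1.5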

2SLS
# Data-generating model: Y <- 30 + 1.5*A + 4*U + e2
# First stage: association of each instrument with the exposure
summary(lm(A~z1)) #beta = 0.6
summary(lm(A~z2)) #beta = 0.3
summary(lm(A~z3)) #beta = 0.1
summary(lm(A~z1+z2+z3))
# Naive regression of outcome on exposure, confounded by U
summary(lm(Y~A)) #beta = 2.2
#twostep
pred1 <- predict(lm(A~z1))
summary(lm(Y~pred1)) #beta = 1.5
#Control Function IV
res1 <- residuals(lm(A~z1))
summary(lm(Y~A+res1)) #beta = 1.5
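One caveat of the manual two-step approach is that the standard error printed by the second `lm()` is based on residuals computed from the predicted exposure rather than the observed one. The sketch below regenerates data with the same structure as the practice session (with a single instrument and an assumed seed) and rescales the standard error using residuals based on the observed exposure. It is an illustration under those assumptions, not part of the original slides.

```r
# Regenerated illustrative data; seed and single-instrument setup are assumptions
set.seed(2)
n  <- 10000
z1 <- sample(c(0,0,0,0,0,1,1,1,2,2), n, replace = TRUE)
U  <- rnorm(n)
A  <- 27 + 0.6 * z1 + 2 * U + rnorm(n, sd = 3)
Y  <- 30 + 1.5 * A + 4 * U + rnorm(n, sd = 3)

pred1  <- fitted(lm(A ~ z1))   # first stage
stage2 <- lm(Y ~ pred1)        # second stage
b      <- coef(stage2)

# Naive SE: uses residuals Y - b0 - b1 * pred1
se_naive <- summary(stage2)$coefficients["pred1", "Std. Error"]

# Corrected SE: same design matrix, but the residual variance is
# computed from residuals that use the observed exposure A
res_iv  <- Y - b[1] - b[2] * A
sigma2  <- sum(res_iv^2) / (n - 2)
se_2sls <- se_naive * sqrt(sigma2) / summary(stage2)$sigma

print(c(estimate = unname(b[2]), se_naive = se_naive, se_2sls = se_2sls))
```

Dedicated 2SLS routines apply this kind of correction automatically, which is one reason to prefer them over the manual two-step fit for inference.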

2 Stage Least Squares (2SLS): in SAS, Stata, R
• PROC SYSLIN (SAS)
• ivreg2 (Stata)
• tsls (R)
(Local Average Treatment Effect - LATE) (Angrist, Imbens, Rubin, 1993, p. 19)

IV assumptions
• Assumption IV.1: Z and exposure A are associated, because either
  • Z has a causal effect on A, or
  • Z and A share common causes
• Assumption IV.2: Z affects the outcome Y only through A
  • no direct effect of Z on Y (“exclusion restriction”)
• Assumption IV.3: Z does not share common causes with the outcome Y, or all common causes are controlled
  • no confounding for the effect of Z on Y
• Assumption IV.4 (most popular option): there are no defiers
  • This assumption is sometimes described as a “monotonicity assumption”
  • There is no individual in the population who would be exposed (A = 1) under Z = 0 but unexposed (A = 0) under Z = 1
  • In an RCT, a defier would be a person who does the exact opposite of what he/she is told to do
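Of these assumptions, only IV.1 can be checked directly in the data. A minimal sketch of such a check, using the first-stage F statistic on data simulated like the practice session (the seed and the single instrument are assumptions), might look like this. A first-stage F above roughly 10 is a commonly cited rule of thumb, not a guarantee of validity.

```r
# Illustrative data with the same structure as the practice session
set.seed(3)
n  <- 10000
z1 <- sample(c(0,0,0,0,0,1,1,1,2,2), n, replace = TRUE)
U  <- rnorm(n)
A  <- 27 + 0.6 * z1 + 2 * U + rnorm(n, sd = 3)

first  <- lm(A ~ z1)                                  # first-stage regression
f_stat <- unname(summary(first)$fstatistic["value"])  # instrument strength
r2     <- summary(first)$r.squared                    # variance in A explained by z1

print(c(F = f_stat, R2 = r2))
```

Assumptions IV.2 to IV.4 involve unobserved quantities, so they have to be argued from subject-matter knowledge or probed indirectly.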

Whose Causal Effect is it?
The IV estimate is not necessarily the population average causal effect.
Classify people based on their treatment under either value of the IV/randomly assigned treatment.
• What the person will do if assigned to Experimental Treatment B: Take B, or Take A
• What the person will do if assigned to Control Treatment A: Take A, or Take B

Whose Causal Effect?
Classify people based on their treatment under either value of the IV/randomly assigned treatment.
• What the person will do if assigned to experimental treatment: take experimental, or take control
• What the person will do if assigned to control treatment: take control, or take experimental
• Never-takers do not contribute to any outcome differences between the IV=0 and IV=1 groups, because their treatment is the same under either value of the IV.
• Always-takers do not contribute to any outcome differences between the IV=0 and IV=1 groups, for the same reason.
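The point of this classification can be illustrated with a small simulation. In the sketch below, the shares of each type and their treatment effects are invented for illustration, and the population contains no defiers. The ratio estimate should then track the effect among those whose treatment follows the assignment (2 in this setup) rather than the population average (2.5).

```r
# Illustrative population; shares and effects are assumptions
set.seed(4)
n    <- 50000
type <- sample(c("complier", "always", "never"), n, replace = TRUE,
               prob = c(0.5, 0.25, 0.25))        # no defiers
Z    <- rbinom(n, 1, 0.5)                        # random assignment
A    <- ifelse(type == "always", 1,
        ifelse(type == "never", 0, Z))           # compliers take what they are assigned

# Individual treatment effects differ by type
effect <- unname(c(complier = 2, always = 6, never = 0)[type])
Y      <- 10 + effect * A + rnorm(n)

# Ratio of the outcome difference to the treatment difference across assignment arms
wald <- (mean(Y[Z == 1]) - mean(Y[Z == 0])) /
        (mean(A[Z == 1]) - mean(A[Z == 0]))
ate  <- mean(effect)                             # population average effect

print(c(IV = wald, population_average = ate))
```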

Overview
• IV analysis with binary outcome
• IV analysis in case-control studies
• IV analysis with survival outcomes
• IV analysis in R

IV Analysis with binary outcome
• Traditionally: use 2SLS with a linear probability model
• Problem: the predicted values are not restricted to the range of a valid probability (0 <= P <= 1)
• … this might not be a problem when using genetic variants as instruments, given that they explain so little that the estimate will hardly ever be out of bounds
• Solution: use a link function: log, logit, probit
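A minimal sketch of the two options, assuming a simple two-stage fit in which the predicted exposure replaces the observed exposure in the second stage, could look like this. The simulated data, names, and coefficients are assumptions; this is one of several possible estimators and not necessarily the one shown on the following slides.

```r
# Illustrative data with a binary outcome; all values are assumptions
set.seed(7)
n <- 20000
Z <- rbinom(n, 1, 0.5)                       # instrument
U <- rnorm(n)                                # unmeasured confounder
X <- Z + U + rnorm(n)                        # continuous exposure
Y <- rbinom(n, 1, plogis(-1 + 0.5 * X + U))  # binary outcome

pred <- fitted(lm(X ~ Z))                    # first stage

# Second stage, option 1: linear probability model (risk difference scale)
lpm   <- lm(Y ~ pred)
# Second stage, option 2: logit link (log odds ratio scale)
logit <- glm(Y ~ pred, family = binomial(link = "logit"))

b_lpm   <- unname(coef(lpm)["pred"])
b_logit <- unname(coef(logit)["pred"])
print(c(risk_difference = b_lpm, log_odds_ratio = b_logit))
```

Because the odds ratio is non-collapsible, the logit coefficient from such a two-stage fit need not equal the conditional coefficient used to generate the data, so it is best read as a population-averaged quantity.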

IV Analysis with binary outcome: IV analysis with logit link

IV for survival outcome: IV for Cox Proportional Hazards Model

Two Sample IV designs
• Using published data only:
• The effect of BMI on Late Onset Alzheimer's Disease
• The effect of Type 2 Diabetes on Late Onset Alzheimer's Disease

Inverse Variance Weighted IV of separate samples (Burgess et al. 2013)
Genetic variant k, k = 1, …, K, is associated with an observed mean change X_k in the risk factor per additional variant allele, with standard error σ_Xk, and an observed mean change Y_k in the outcome per allele, with standard error σ_Yk.
Burgess, Stephen, Adam Butterworth, and Simon G. Thompson. "Mendelian randomization analysis with multiple genetic variants using summarized data." Genetic Epidemiology 37.7 (2013): 658-665.
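In this notation, the inverse variance weighted estimate and its standard error, as computed in the R code later in this deck (`sum(Xk*Yk*Ykse^-2)/sum(Xk^2*Ykse^-2)`), can be written as follows. The label $\hat{\beta}_{IVW}$ is introduced here for convenience.

```latex
\[
\hat{\beta}_{IVW} = \frac{\sum_{k=1}^{K} X_k Y_k \sigma_{Y_k}^{-2}}
                         {\sum_{k=1}^{K} X_k^{2} \sigma_{Y_k}^{-2}},
\qquad
\operatorname{se}\!\left(\hat{\beta}_{IVW}\right)
  = \sqrt{\frac{1}{\sum_{k=1}^{K} X_k^{2} \sigma_{Y_k}^{-2}}}
\]
```

With a single variant ($K = 1$) the estimate reduces to the ratio $Y_1 / X_1$.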

Inverse Variance Weighted IV: Effect of BMI on Dementia

The Model Dementia Related Phenotypes

BMI on Dementia Mukherjee, Shubhabrata, et al. "Genetically predicted body mass index and Alzheimer's disease–related phenotypes in three large samples: Mendelian randomization analyses." Alzheimer's & Dementia 11.12 (2015): 1439-1451.

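Split Sample IV
mrozb <- as.data.frame(cbind(Y,A,U,z1,z2,z3))
# two-step estimate in the full sample
pred1 <- predict(lm(A~z1))
summary(lm(Y~pred1)) #beta = 1.5
# random subsample of 500 observations
mrozvs <- mrozb[sample(1:10000, 500),]
# instrument weights from the first-stage regression in the full sample
a <- coef(lm(A~z1+z2+z3, data=mrozb))
# weighted score (GRS): each instrument multiplied by its first-stage coefficient, then summed
mrozvs$GRS <- apply(sweep(mrozvs[c("z1","z2","z3")], MARGIN=2, c(a[2:4]), `*`), 1, function(x) sum(x, na.rm=TRUE))
summary(lm(Y~GRS, data=mrozvs)) #beta = 0.77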

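Inverse Variance Weighted IV (external data)
### Burgess approach
library(lmtest) # provides coeftest()
coeftest(lm(A~z1)) # estimate 0.516322, SE 0.046800
coeftest(lm(A~z2)) # estimate 0.342791, SE 0.046807
coeftest(lm(Y~z1)) # estimate 0.84012, SE 0.11495
coeftest(lm(Y~z2)) # estimate 0.66891, SE 0.11469
Xk <- c(0.516,0.343)     # instrument-exposure associations
Xkse <- c(0.047,0.047)   # their standard errors
Yk <- c(0.840,0.669)     # instrument-outcome associations
Ykse <- c(0.115,0.115)   # their standard errors
sum(Xk*Yk*Ykse^-2)/sum(Xk^2*Ykse^-2) # inverse variance weighted IV estimate
(1/sum(Xk^2*Ykse^-2))^0.5 # standard error of the estimate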

Inverse Variance Weighted IV (external data)
library(meta)
# IV estimate and standard error using the first instrument only
sum(Xk[1]*Yk[1]*Ykse[1]^-2)/sum(Xk[1]^2*Ykse[1]^-2) #1.626
(1/sum(Xk[1]^2*Ykse[1]^-2))^0.5 #0.223
# IV estimate and standard error using the second instrument only
sum(Xk[2]*Yk[2]*Ykse[2]^-2)/sum(Xk[2]^2*Ykse[2]^-2) #1.950
(1/sum(Xk[2]^2*Ykse[2]^-2))^0.5 #0.335
# combine the two single-instrument estimates as in a meta-analysis
metagen(c(1.626,1.950),c(0.223, 0.335)) #identical --> no heterogeneity, Instrument OK
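The inverse variance weighted point estimate can also be obtained as a weighted regression of the instrument-outcome associations on the instrument-exposure associations, forced through the origin. The sketch below uses the summary numbers from the slides above. The variable names `ivw_formula`, `ivw_reg` and `fit` are introduced here for illustration.

```r
# Summary associations as reported on the preceding slides
Xk   <- c(0.516, 0.343)   # instrument-exposure associations
Yk   <- c(0.840, 0.669)   # instrument-outcome associations
Ykse <- c(0.115, 0.115)   # standard errors of Yk

# Closed-form inverse variance weighted estimate and standard error
ivw_formula <- sum(Xk * Yk * Ykse^-2) / sum(Xk^2 * Ykse^-2)
se_formula  <- (1 / sum(Xk^2 * Ykse^-2))^0.5

# Same point estimate from a weighted regression without intercept
fit     <- lm(Yk ~ Xk - 1, weights = Ykse^-2)
ivw_reg <- unname(coef(fit)["Xk"])

print(c(formula = ivw_formula, regression = ivw_reg, se = se_formula))
```

Only the point estimates coincide. The standard error reported by `lm()` additionally estimates a residual variance, so it generally differs from the closed-form value.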

Doubting Instruments: major biases similar to critiques of RCTs
• Do they have other pathways to the outcome?
  • Unblinded trials: controls become demoralized
• Is there a common cause of the instrument and the outcome?
  • Trials: unfair random assignment
• Do they actually affect anyone’s exposure?
  • Trials: nobody adheres to assignment
[Diagrams: causal graphs with nodes Z, X, Y, U, U2, G]

Evaluating the assumptions
• Constraints implied by theory
• Over-identification tests
• Stratification-based tests (similar to over-identification)
• IV inequality constraints
• Negative controls
• Independent from known confounders
• Egger tests
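As an example of an over-identification test, the sketch below computes a Sargan-type statistic by hand: the 2SLS residuals (using the observed exposure) are regressed on the instruments, and n times the R-squared is compared with a chi-square distribution. The helper name `sargan_test`, the seed, and the direct effect of 3 added in the second scenario are assumptions for illustration.

```r
# Hypothetical helper: Sargan-type over-identification test for one exposure
sargan_test <- function(Y, A, Zmat) {
  n      <- length(Y)
  pred   <- fitted(lm(A ~ Zmat))        # first stage with all instruments
  b      <- coef(lm(Y ~ pred))          # 2SLS point estimates
  res_iv <- Y - b[1] - b[2] * A         # residuals using the observed exposure
  stat   <- n * summary(lm(res_iv ~ Zmat))$r.squared
  df     <- ncol(Zmat) - 1              # instruments minus endogenous exposures
  c(statistic = stat, p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Data with the same structure as the practice session
set.seed(5)
n  <- 10000
z1 <- sample(c(0,0,0,0,0,1,1,1,2,2), n, replace = TRUE)
z2 <- sample(c(0,0,0,0,0,1,1,1,2,2), n, replace = TRUE)
z3 <- sample(c(0,0,0,0,0,1,1,1,2,2), n, replace = TRUE)
U  <- rnorm(n)
A  <- 27 + 0.6 * z1 + 0.3 * z2 + 0.1 * z3 + 2 * U + rnorm(n, sd = 3)
Y  <- 30 + 1.5 * A + 4 * U + rnorm(n, sd = 3)
Zmat <- cbind(z1, z2, z3)

valid   <- sargan_test(Y, A, Zmat)           # all instruments valid
invalid <- sargan_test(Y + 3 * z3, A, Zmat)  # z3 also affects the outcome directly
print(rbind(valid, invalid))
```

A small p-value suggests that the instruments disagree about the causal effect. A large one does not prove validity, since all instruments could be biased in the same direction.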

The End

What to do with a binary exposure?
• In genetic IV, convert the binary exposure to the probability scale by reweighting the predicted probability from a first stage model

Two Stage Least Squares

IV Analysis with binary outcome: IV analysis with log link

IV Analysis with binary outcome

IV for survival outcome: IV for Aalen Additive Hazards Models

2SLS
#ivreg2 script: http://diffuseprior.wordpress.com/tag/over-identification/
#ivreg2(form,endog,iv,data,digits); the function is defined in the script above and must be loaded before use
mroz <- as.data.frame(cbind(Y,A,z1,z2,z3))
ivreg2(form=Y ~ A, endog="A", iv=c("z1","z2","z3"), data=mroz)
# control function estimate on the same data frame
mroz$res1 <- residuals(lm(A~z1, data=mroz))
summary(lm(Y~A+res1, data=mroz)) #beta = 1.5

Whose Causal Effect?
With another assumption we can say whose causal effect it is.
Classify people based on their treatment under either value of the IV/randomly assigned treatment.
• What the person will do if assigned to Experimental Treatment B: Take B, or Take A
• What the person will do if assigned to Control Treatment A: Take A, or Take B

Egger Regression: treat each IV estimate as an element in a meta-analysis
• Under the InSIDE (instrument strength independent of direct effects) assumption, bias converges to zero.
• Regress the Z-Y associations on the Z-X associations. The intercept is an estimate of average pleiotropy and the slope is an estimate of the true causal effect under InSIDE.
Bowden, Davey Smith, and Burgess. "Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression." IJE 2015.
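The regression described above can be sketched on simulated summary data. The number of variants, the average direct effect of 0.2, and the causal effect of 1.5 are assumptions for illustration, and the direct effects are drawn independently of the instrument strengths so that InSIDE holds by construction.

```r
# Illustrative summary statistics for K variants; all values are assumptions
set.seed(6)
K     <- 50
Xk    <- runif(K, 0.1, 0.6)                  # Z-X associations (all positive)
alpha <- rnorm(K, mean = 0.2, sd = 0.05)     # direct effects, independent of Xk (InSIDE)
Ykse  <- rep(0.05, K)                        # standard errors of the Z-Y associations
Yk    <- alpha + 1.5 * Xk + rnorm(K, sd = Ykse)   # Z-Y associations

# Egger regression: weighted regression with an intercept
egger <- lm(Yk ~ Xk, weights = Ykse^-2)
# Inverse variance weighted estimate for comparison: same regression without intercept
ivw   <- lm(Yk ~ Xk - 1, weights = Ykse^-2)

egger_intercept <- unname(coef(egger)["(Intercept)"])  # average pleiotropy
egger_slope     <- unname(coef(egger)["Xk"])           # causal effect under InSIDE
ivw_slope       <- unname(coef(ivw)["Xk"])
print(c(intercept = egger_intercept, egger = egger_slope, ivw = ivw_slope))
```

In this setup the intercept picks up the average direct effect, while the estimate without an intercept absorbs it into the slope.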
