This talk explores study designs, estimating treatment effects, and developing decision tools for observational Comparative Effectiveness Research (CER). Challenges and ongoing programs are also discussed.
Practical Applications and Decisions for Using Propensity Score Methods Doug Landsittel, PhD Professor of Medicine, Biostatistics and Translational Science Director, Section on Biomarkers and Prediction Modeling CER Track Director, Institute for Clinical Research Education Director of Biostatistics, Starzl Transplant Institute Faculty Member, CER Center and Center for Research on Health Care
Agenda for this talk • Study designs and PCORI Methodology Standards • Propensity Score (PS) methods • Estimating the PS • Estimating treatment effect with PSs • Developing a decision tool for observational CER • Systematic review of studies on statistical properties • Decision Tool for Observational Data Analysis Methods for CER (DecODe CER) • Challenges in clinical research education • Some ongoing programs and resources
We will focus on the analysis methods, but… • Study design must be considered • All observational studies are not created equal • If data have too many inherent limitations… consider not doing the study • Propensity scores can also have a role in study design
“The choice of study design often has profound consequences for the causal interpretation of study results.”
Observational Studies • Treatments are assigned by the mechanisms of routine practice. • The actual treatment assignment process or mechanism is generally unknown. Developing a Protocol for Observational Comparative Effectiveness Research: A User’s Guide. Chapter 2. Study Design Considerations. Page 22.
There are many different designs used for CER studies • Objectives: CER versus not CER • Experimental • Standard RCT (gold standard) • Variations of the RCT (e.g., pragmatic and/or cluster randomized) • Observational • Epidemiologic studies (e.g., cohort or case-control study) • Quasi-experimental (e.g., pre-post intervention or systematic assignment) • Existing records (e.g., registry, EHRs) • Survey research (e.g., NHANES) • Analytical techniques • Systematic reviews and meta-analysis (summarize multiple studies) • Decision analysis (account for stochastic events, costs, utilities)
The importance of the study design is emphasized in the PCORI Methodology Standards • Methods are part of the ACA • Methodology Committee • Numerous advisory committees • Methodology Report • Methodology Standards • 5 cross-cutting standards • 6 study design specific standards (including causal inference and observational designs, that include registries and data networks) http://www.pcori.org/research-results/research-methodology/pcori-methodology-standards
There are 6 causal inference standards; the first 4 are on design and reporting • CI-1: Define analysis population using covariate histories • Specify timing; use data from appropriate intervals • CI-2: Describe population that gave rise to the effect estimate(s) • Justify exclusions; describe final analysis population • CI-3: Precisely define timing of the outcome relative to exposure initiation and duration • CI-4: Measure confounders before exposure and report on confounders
The other 2 CI standards are specific to propensity scores and IVs • CI-5: Report the assumptions underlying the construction of propensity scores • Describe resulting groups in terms of the overlap and balance of potential confounders • CI-6: Assess the validity of the instrumental variable and report the balance of covariates • Describe how the IV satisfies the 3 key assumptions
Ideally, we can use observational data to make causal inferences • Would like to say ‘using treatment A versus treatment B caused an improvement in y’ • Usually limited to saying ‘using treatment A versus treatment B is associated with an improvement in y’
What exactly do we mean by causal inference? • Concept can be illustrated through potential outcomes • Two treatments (a=1 and a=2) are available • Patients receive treatment 1 or treatment 2 at baseline • Observe an outcome at a subsequent time, Y_a • Potential outcomes: Y_1 and Y_2 • Only Y_1 or Y_2 can be observed; the other is the counterfactual outcome • The idea of causal inference is to estimate the difference in potential outcomes • i.e. causal effect = E(Y_1 - Y_2) or some function of that difference (e.g. relative risk)
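To make the potential-outcomes idea concrete, here is a minimal sketch with made-up numbers (the six patients and their outcomes are hypothetical, not from the talk). It pretends both potential outcomes are known for every patient, which is never true in real data, and contrasts the causal effect E(Y_1 - Y_2) with the naive comparison of observed group means.

```python
# Hypothetical potential outcomes for six patients (illustrative numbers only).
# In real data only one of y1 / y2 is ever observed for each patient.
patients = [
    # (y1, y2, treatment actually received)
    (5.0, 3.0, 1),
    (6.0, 4.0, 1),
    (7.0, 5.0, 1),
    (2.0, 0.0, 2),
    (3.0, 1.0, 2),
    (4.0, 2.0, 2),
]


def mean(values):
    return sum(values) / len(values)


# Causal effect E(Y1 - Y2): requires both potential outcomes for everyone.
true_effect = mean([y1 - y2 for y1, y2, _ in patients])

# What the data actually show: Y1 among patients given treatment 1,
# and Y2 among patients given treatment 2.
observed_1 = [y1 for y1, _, a in patients if a == 1]
observed_2 = [y2 for _, y2, a in patients if a == 2]
naive_difference = mean(observed_1) - mean(observed_2)

print("causal effect:", true_effect)
print("naive observed difference:", naive_difference)
```

In this toy example the patients given treatment 1 would have done better under either treatment, so the naive difference overstates the causal effect. That gap is the confounding that propensity score methods try to remove.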
Concept of propensity scores • Balance the treatment groups w.r.t. the propensity for being treated • Emulate a randomized trial • PS = an optimal balancing score • Two steps: 1) estimate the PS, and 2) apply the PS to estimate treatment effects • Step 1: modeling the probability for treatment • Step 2: match, stratify, adjust, or weight the data using its propensity score
Estimating the propensity score • Select some probability model, as done for outcome regression • Example: logistic regression: g(P(T = 1 | X)) = Xβ, with g the logit link • PS = predicted probability of a specific treatment given each patient’s covariate profile • The propensity score model is often limited to main effects, but it does not need to be • Objective: predicting the PS, not estimating β’s • Not concerned about 10 observations/variable, parsimony, or other rules used for fitting the outcome model
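As an illustration of step 1, the sketch below fits a main-effects logistic regression by plain gradient ascent using only the Python standard library, then takes the fitted probabilities as the PS. The data, the two covariates (a standardized age and a diabetes indicator), and the function names are all hypothetical. In practice one would normally use an established routine in R, Stata, or similar rather than a hand-written fit.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def linear_predictor(beta, x):
    # beta[0] is the intercept; beta[1:] are the covariate coefficients.
    return beta[0] + sum(b * v for b, v in zip(beta[1:], x))


def fit_logistic(X, t, lr=0.1, n_iter=5000):
    """Fit logit(P(T=1|X)) = X*beta by gradient ascent on the log-likelihood."""
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for x, ti in zip(X, t):
            resid = ti - sigmoid(linear_predictor(beta, x))
            grad[0] += resid
            for j, v in enumerate(x):
                grad[j + 1] += resid * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# Hypothetical covariates: [standardized age, diabetes (0/1)]; t = 1 if treated.
X = [[-1.5, 0], [-1.0, 0], [-0.5, 1], [0.0, 0], [0.0, 1],
     [0.5, 0], [1.0, 1], [1.5, 1], [-0.8, 1], [0.9, 0]]
t = [0, 0, 0, 1, 0, 1, 1, 1, 1, 0]

beta = fit_logistic(X, t)
# The PS is the predicted probability of treatment for each patient.
ps = [sigmoid(linear_predictor(beta, x)) for x in X]

for ti, p_i in zip(t, ps):
    print(ti, round(p_i, 3))
```

The coefficients themselves are not of interest here. The fitted probabilities are what feed into matching, stratification, adjustment, or weighting.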
Other considerations in estimating the propensity score • Specifying which variables to use in the PS • Discourage exclusion based on non-significant tests • Encourage inclusion of clinically significant variables • Include both confounders and predictors • Dangerous to exclude potential confounders • May use modern regression methods • May need to consider complexities of high dimensional data with small treatment counts • Pay attention to missing data • The treatment modeled by the PS may be continuous or have >2 categories • Greatly complicates the associated methods
Other reporting for the propensity score • Refer to CI-5 from the PCORI Methodology Standards • Report on covariate balance via standardized differences before and after PS adjustment • Display PS distributions by treatment group • Useful comparisons are limited to areas of overlap • May prefer matching if PS distributions have poor overlap • Report statistics on sensitivity to unmeasured confounding
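A standardized difference for a continuous covariate can be sketched as follows. This uses one common definition (difference in means divided by the square root of the average of the two group variances). The ages and the "after matching" comparison group are invented for illustration.

```python
import math
import statistics


def standardized_difference(treated, control):
    """Difference in means scaled by the pooled standard deviation."""
    pooled_sd = math.sqrt(
        (statistics.variance(treated) + statistics.variance(control)) / 2
    )
    return (statistics.mean(treated) - statistics.mean(control)) / pooled_sd


# Hypothetical ages before any PS adjustment.
age_treated = [60, 65, 70, 75]
age_control = [50, 55, 60, 65]
before = standardized_difference(age_treated, age_control)

# Hypothetical comparison group retained after PS matching.
age_control_matched = [61, 64, 71, 74]
after = standardized_difference(age_treated, age_control_matched)

print("before:", round(before, 3))
print("after:", round(after, 3))
```

An absolute standardized difference below about 0.1 is often used as a rule of thumb for acceptable balance, although the threshold is a convention rather than a formal test.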
Estimating the treatment effect via adjustment or stratification • Adjust for the logit(PS) as a covariate in a multivariable regression • Simple with some theoretical justification • Simulations seem to show greater bias in practice • Create quintiles based on the PS • Estimate treatment effect within strata • Calculate an overall estimate over strata • Investigate standardized differences within strata • Also referred to as sub-classification
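The quintile approach can be sketched as below: sort on the PS, cut into five equal-sized strata, take the treated-minus-control difference in mean outcome within each stratum, and average over strata weighted by stratum size. The data are simulated so that the within-stratum effect is exactly 2 and treatment is more common at higher PS. The function name and the choice to skip strata lacking either group are assumptions of this sketch.

```python
def stratified_effect(ps, treated, outcome, n_strata=5):
    """Size-weighted average of within-stratum differences in mean outcome."""
    order = sorted(range(len(ps)), key=lambda i: ps[i])
    n = len(order)
    estimates = []
    for k in range(n_strata):
        idx = order[k * n // n_strata:(k + 1) * n // n_strata]
        y_t = [outcome[i] for i in idx if treated[i] == 1]
        y_c = [outcome[i] for i in idx if treated[i] == 0]
        if not y_t or not y_c:
            continue  # no overlap in this stratum, so it cannot contribute
        diff = sum(y_t) / len(y_t) - sum(y_c) / len(y_c)
        estimates.append((len(idx), diff))
    total = sum(size for size, _ in estimates)
    return sum(size * diff for size, diff in estimates) / total


# Hypothetical data: 5 PS levels, 4 patients each; higher PS means more treated
# patients and a higher baseline outcome (confounding). True effect is 2.
ps, treated, outcome = [], [], []
for k, n_treated in enumerate([1, 1, 2, 3, 3]):
    for j in range(4):
        is_treated = 1 if j < n_treated else 0
        ps.append(0.1 + 0.2 * k + 0.01 * j)
        treated.append(is_treated)
        outcome.append(float(k) + 2.0 * is_treated)

y_treated = [y for y, a in zip(outcome, treated) if a == 1]
y_control = [y for y, a in zip(outcome, treated) if a == 0]
naive = sum(y_treated) / len(y_treated) - sum(y_control) / len(y_control)
stratified = stratified_effect(ps, treated, outcome)

print("naive difference:", round(naive, 3))
print("stratified estimate:", round(stratified, 3))
```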
Estimating the treatment effect via matching • Match 1:m depending on sample sizes • Apply existing methods for greedy matching • Existing algorithms in R and Stata, other software • Larger m increases the utilized sample size but may worsen the utility of selected matches • Choose a caliper to limit acceptable matches • Common choice = 0.2×standard deviation • Substantial literature on the matching criteria • Analysis based on the matching strategy • e.g. conditional logistic regression • Optimal statistical properties in some scenarios
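A minimal sketch of 1:1 greedy nearest-neighbour matching without replacement is shown below. It assumes the caliper is 0.2 times the standard deviation of the logit of the PS, which is one common reading of the "0.2 × standard deviation" rule; the slide does not say which scale is meant. The PS values and function names are hypothetical, and existing packages in R or Stata would normally be used instead.

```python
import math
import statistics


def logit(p):
    return math.log(p / (1.0 - p))


def greedy_match(ps_treated, ps_control, caliper_sd=0.2):
    """Return (treated index, control index) pairs matched within the caliper."""
    lt = [logit(p) for p in ps_treated]
    lc = [logit(p) for p in ps_control]
    # Assumption: caliper is defined on the logit scale, pooled over both groups.
    caliper = caliper_sd * statistics.stdev(lt + lc)
    available = set(range(len(lc)))
    pairs = []
    for i, x in enumerate(lt):
        if not available:
            break
        # Greedy step: take the closest control that is still unmatched.
        j = min(sorted(available), key=lambda c: abs(lc[c] - x))
        if abs(lc[j] - x) <= caliper:
            pairs.append((i, j))
            available.remove(j)
    return pairs


# Hypothetical propensity scores.
ps_treated = [0.30, 0.50, 0.90]
ps_control = [0.10, 0.28, 0.52, 0.55]

pairs = greedy_match(ps_treated, ps_control)
print(pairs)
```

Here the treated patient with PS = 0.90 has no control within the caliper and is left unmatched, which illustrates how a caliper trades sample size for match quality.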
Using inverse probability of treatment weighting (IPTW) • Basic Ideas: • Reweight the population to resemble groups with equal propensities for treatment • If PS = P(Trt A), weight those on Trt A by 1/PS, and those on Trt B by 1/(1-PS) • Down-weight observations with expected treatment • Existing algorithms, as used for survey weighting, in R and Stata, other software • Check if extremes of the PS distributions overlap between treatment groups • May truncate weights if extremely large • Optimal statistical properties in some scenarios
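The weighting rule above can be sketched in a few lines. Treated patients (Trt A) get weight 1/PS, untreated patients (Trt B) get 1/(1-PS), and an optional cap truncates very large weights. The data are hypothetical, with a true effect of 2 and confounding by a two-level covariate; the function names and the `max_weight` parameter are illustrative choices, not a reference implementation.

```python
def iptw_weights(ps, treated, max_weight=None):
    """1/PS for treated, 1/(1-PS) for untreated, optionally truncated."""
    weights = []
    for p, a in zip(ps, treated):
        w = 1.0 / p if a == 1 else 1.0 / (1.0 - p)
        if max_weight is not None:
            w = min(w, max_weight)  # truncate extreme weights
        weights.append(w)
    return weights


def weighted_mean_difference(outcome, treated, weights):
    """Weighted mean outcome in the treated minus that in the untreated."""
    def wmean(group):
        num = sum(w * y for y, a, w in zip(outcome, treated, weights) if a == group)
        den = sum(w for a, w in zip(treated, weights) if a == group)
        return num / den
    return wmean(1) - wmean(0)


# Hypothetical data: high-PS patients (0.75) have baseline outcome 4,
# low-PS patients (0.25) have baseline 0; treatment adds 2 to the outcome.
ps      = [0.75, 0.75, 0.75, 0.75, 0.25, 0.25, 0.25, 0.25]
treated = [1,    1,    1,    0,    1,    0,    0,    0]
outcome = [6.0,  6.0,  6.0,  4.0,  2.0,  0.0,  0.0,  0.0]

weights = iptw_weights(ps, treated)
iptw_estimate = weighted_mean_difference(outcome, treated, weights)
unweighted = weighted_mean_difference(outcome, treated, [1.0] * len(ps))

print("unweighted difference:", unweighted)
print("IPTW estimate:", iptw_estimate)
```

Patients who received the treatment they were expected to receive get small weights, while the few who received the unexpected treatment get large ones. That is why extreme PS values can produce unstable estimates and why truncation is sometimes applied.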
Some assumptions http://idbdocs.iadb.org/wsdocs/getdocument.aspx?docnum=35320229
An interesting example • Compare effectiveness of bare metal stents (BMS) to covered stents (CS) in common iliac artery (CIA) interventions for aortoiliac occlusive disease using 2010-14 data from the Vascular Quality Initiative (VQI) • Outcome: time to loss of primary patency • Multivariable Cox regression and PS analysis • Patients had unilateral or bilateral CIA stents • 1,727 unilateral stents; 85.7% BMS, 14.3% CS • 1,101 bilateral stents; 83.3% BMS, 16.7% CS • Results still pending
Why is this example interesting? • Evaluation of bilateral stents conceptually similar to unilateral stents • Bilateral stents impose a specific structure to the data • Decided to match patients not stents (bilateral case) • Decided to separate bilateral and unilateral analyses • %CS relatively small compared to %BMS • Exploring 1:1 and 1:2 matching • Difficult to match with m>2; reduces sample size • Now exploring IPTW (similar distributions of PSs) • Bilateral stents: multiple sources of correlation • Less published on PS properties for survival • The approach matters to the final conclusions
Some closing thoughts on propensity scores • PSs only account for measured confounders • Sensitivity methods are important • Need to check resulting balance achieved after applying propensity methods • Plots of standardized differences • Balance can get worse if a variable is excluded from the PS model • Describing the distribution of PSs across treatment groups yields useful information
Our group is working to develop a decision tool and further educational efforts • Decision Tool for Observational Data Analysis Methods in CER (DecODe CER) • Educational efforts including the Expanding National Capacity in PCOR through Training Program
Decision Tool for Observational Data Analysis Methods in CER (DecODe CER) • Motivated by CER course at Pittsburgh • Informed the systematic review of the literature • Conducting simulations to fill in the gaps • Working with an advisory committee and ‘stakeholder co-investigators’ to develop DecODe CER • Funded by PCORI from 3/2014 through 2/2017
Basic format of DecODe CER Introduction and motivation for causal inference in observational CER Restrictions, Limitations, Assumptions, Minimum study design considerations Summary of methods Static Treatment (e.g. Propensity Scores) Time-Varying Treatment (e.g. Marginal Structural Models) Questions, initial analyses, diagnostics Input from the systematic review User answers questions; conducts analyses (e.g. distribution of propensity scores, measures of covariate balance) Input from stakeholders and advisory committee Final table with pros and cons for each method and scenario Limit or guide methods under consideration
Published methods papers focused on… • 105 studies of propensity scores utilized • Covariate adjusted (n=29) • Probability weighting (41) • Matching (46) • Stratification (34) • 102 studies of other causal inference methods • Doubly robust methods (22) • Instrumental variables (18) • Marginal structural models (35) • Structural nested mean models/G-estimation (19) • Other (29) • 39 included PS and other methods
Final Thought: What is the role of teaching propensity scores in clinical research education?
Many ongoing educational efforts • Institute for Clinical Research Education CER course at the University of Pittsburgh • CER Track for MS and Certificate Program • Workshops and invited talks • Many resources have been developed • PCORI methods curriculum (Johns Hopkins) • UC-Davis, OSU and others have methods courses • http://ctsa-cermethodscourse.org/cer-lessons/ • http://cph.osu.edu/hopes/cer • Pitt’s CER Center website has other resources • http://www.healthpolicyinstitute.pitt.edu/cerc/about-comparative-effectiveness-research-center
AHRQ supports five R25 grants in PCOR training • Expanding National Capacity in PCOR through Training (ENACT) Program • The RFA included the following requirements: • Target a particular research community • Collaborate with specific program partners • Basic, advanced, and experiential training • Pittsburgh application focused on collaborations with Minority-Serving Institutions (MSIs) • Partners have been instrumental in all phases • Other R25s funded at Albert Einstein, Brown U., MD Anderson, U. of Washington
Pittsburgh is partnering with 6 Minority-Serving Institutions • Program Partners: 6 institutions from Research Centers in Minority Institutions (RCMIs) • Charles R. Drew University of Medicine and Science • Howard University • Meharry Medical College • Morehouse School of Medicine • University of Hawai’i at Manoa • University of Puerto Rico, Medical Science Campus • Developed a fundamental course (= vocabulary) of PCOR and an advanced methods and grant writing course (online through Acatar Learning Environment).
Some final thoughts on propensity scores • ‘Propensity scores’ does not, alone, describe a specific method • Many other methods utilize the PS, including doubly robust methods • The PS can be a valuable tool for exploring the nature of treatment selection
Acknowledgements (PCORI contract) • Funded by a Patient-Centered Outcomes Research Institute (PCORI) Project Award (ME-1306-03827) • Disclaimer: All statements in this report, including its findings and conclusions, are solely those of the author and do not necessarily represent the views of the PCORI, its Board of Governors, or Methodology Committee. • Investigative Team • Sally Morton, PhD, Joyce Chang, PhD • Margaret Chen, MS, Andrew Topp, PhD Candidate • Librarians: Ester Saghafi, MEd, MLS, Barb Folb, MM, MLS, MPH, Andrea Ketchum, MLIS • Stakeholder Co-investigators • Michael Schneider, PhD, DCC, Pam Smithberger, PharmD, MS • Advisory Board • Sharon-Lise Normand, PhD (Harvard), Dylan Small, PhD (Penn), Maria Glymour, PhD (UCSF), Dominick Esposito, PhD (Mathematica Policy Institute), Gerard Brennan, PhD, PT (Intermountain Healthcare)
Acknowledgements: ENACT R25 • Funded by the AHRQ (through PCORI trust funds) • Investigative Team and Executive Committee (Pittsburgh) • W. Kapoor, S. Morton, D. Rubio, E. Davis, K. Abebe, K. McTigue, C. Shen, V. Gilliam, L. Bell • Advisory Board (from the MSIs and Pittsburgh) • P. Davis, E. Garcia-Rivera, E. Miller, N. Morone, S.M. Nouraie, C. Pettigrew, M.F. Lima, A. Quarshie, T.B. Seto, M.A. Shaheen, J. South-Paul, C. Wilkins • ENACT Fellows and students • Other leadership at the MSIs • Dean’s office of the U. of Pittsburgh SOM (A. Levine)
Questions? Feel free to contact me at landsitteldp@upmc.edu or dpl12@pitt.edu.