Propensity Scores: Friday, June 1st, 10:15am-12:00pm

Deborah Rosenberg, PhD, Research Associate Professor
Kristin Rankin, PhD, Research Assistant Professor
Division of Epidemiology and Biostatistics, University of IL School of Public Health
Training Course in MCH Epidemiology

Propensity Scores
• The goal of using propensity scores is to more completely and efficiently address observed confounding of an exposure-outcome relationship.
• Program evaluation – Addresses selection bias
• Epidemiology – Addresses non-randomization of exposure
• Propensity scores are the predicted probabilities from a regression model of this form:
  Exposure = pool of observed confounders
• “Conditional probability of being exposed or treated (or both)”

Propensity Scores
When exposed and unexposed groups are not equivalent such that the distribution on covariates is not only different, but includes non-overlapping sets of values, then the usual methods for controlling for confounding may be inadequate. Non-overlapping distributions (lack of common support) means that individuals in one group have values on some of the covariates that don’t exist in the other group and vice versa.

Area of “Common Support” Sturmer, et al 2006, J Clin Epidemiol

Benefits of Propensity Score Methods
• The accessibility of multivariable regression methods means they are often misused, with reporting of estimates that are extrapolations beyond available data.
• The process of generating propensity scores:
  • focuses attention on model specification to account for covariate imbalance across exposure groups, and on support of the data with regard to “exchangeability” of exposed and unexposed
  • allows for trying to mimic randomization by simultaneously matching people on large sets of known covariates
  • forces the researcher to design the study and check covariate balance before looking at outcomes
Oakes and Johnson, Methods in Social Epidemiology

Propensity Scores
• Propensity scores might be used in three ways:
  • as a covariate in a model along with exposure, or as weights for the observations in a crude model (not recommended due to possible off-support inference)
  • as values on which to stratify/subclassify data to form more comparable groups
  • as values on which to match an exposed to an unexposed observation, then using the matched pair in an analysis that accounts for the matching

Propensity Scores
Propensity scores are the predicted probabilities from a regression model of this form: Exposure = pool of observed confounders
  proc logistic data=analysis desc;
    class &propenvars / param=ref ref=first;
    model adeq = &propenvars;
    output out=predvalues p=propscore;
  run;
Once the propensity scores are generated, they are used to run the real model of interest: outcome = exposure
*Note: Make sure you start with a dataset with no missing values on the outcome, or you will end up with matched pairs in which one member cannot be analyzed

Generating Propensity Scores
• Consider only covariates that are measured pre-program/intervention/exposure or that do not change over time; their values shouldn’t be affected by exposure or lie in the causal pathway between exposure and outcome
• Covariates should be based on theory or prior empirical findings; never use model selection procedures such as stepwise selection for these covariates. If conceptually based, they should stay in the model regardless of statistical significance
• Include higher order terms and interactions to get the best estimated probability of exposure and balance across covariates; there is a trade-off between fully accounting for confounding and including so many unnecessary variables/terms that common support becomes an issue and PS distributions are more likely to be non-overlapping
Oakes and Johnson, Methods in Social Epidemiology

Propensity Score Distributions
• Examine the distribution of propensity scores in exposed and unexposed
• If there is not enough overlap (not enough “common support”), then these data cannot be used to answer the research question
• Observations with no overlap cannot be used in matched analysis
• If there are areas that don’t overlap, the matched sample may not be representative (examine characteristics of excluded individuals to assess this)
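
The overlap check described on this slide might be sketched in SAS as follows. This is a minimal sketch, assuming the scored dataset `predvalues` with the propensity score `propscore` and the exposure indicator `adeq` that appear in the PROC LOGISTIC example in these slides; substitute your own names.

```sas
/* Histograms and descriptive statistics of the propensity score,
   produced separately for each exposure group */
proc univariate data=predvalues;
  class adeq;
  var propscore;
  histogram propscore;
run;

/* Range of the propensity score within each exposure group;
   the overlap of the two ranges is the region of common support */
proc means data=predvalues n min max;
  class adeq;
  var propscore;
run;
```

Comparing the two minimums and the two maximums shows how many observations fall outside the range shared by both groups.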

Propensity Scores
• Sometimes propensity scores are used to verify that pre-defined comparison groups are actually equivalent
• If they are, then the propensity scores may not have to be used in analysis

Propensity Scores: Florida Healthy Start Evaluation (from Bill Sappenfield)
[Figure: distribution of propensity scores, Reference 1 vs. Care Coordination; x-axis: Propensity Score, 0.5 to 1]

Propensity Scores: Florida Healthy Start Evaluation (from Bill Sappenfield)
[Figure: distribution of propensity scores, Reference 2 vs. Care Coordination; x-axis: Propensity Score, 0.2 to 0.7]

Analysis Approach 1: Propensity Score as a Covariate or Weight in Model
• Use the propensity score as a covariate in the model
  • 1 degree of freedom as opposed to 1 or more for each original covariate; particularly useful when the prevalence of the outcome is small relative to the number of covariates that must be controlled, leading to small cell sizes
• Weight data using the propensity scores
  • the weight for an “exposed” subject is the inverse of the propensity score
  • the weight for an “unexposed” subject is the inverse of 1 minus the propensity score; weights must be normalized
• These approaches do not handle the issue of off-support data unless data are restricted to the range of propensity scores common to both the exposed and unexposed
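
The weighting rule above might be sketched in SAS as follows. This assumes the dataset `predvalues` with `propscore`, the exposure `adeq` (1 = exposed), and the outcome `ptb` used elsewhere in these slides. The names `weighted`, `weighted_norm`, `ipw`, and `ipw_norm` are hypothetical. Dividing by the overall mean weight is only one common way to normalize; normalizing within each exposure group is another.

```sas
/* Inverse probability of exposure weights from the propensity score */
data weighted;
  set predvalues;
  if adeq = 1 then ipw = 1 / propscore;            /* exposed   */
  else if adeq = 0 then ipw = 1 / (1 - propscore); /* unexposed */
run;

/* Normalize so that the weights average 1 (and sum to the sample size).
   PROC SQL remerges the overall mean onto every row. */
proc sql;
  create table weighted_norm as
  select *, ipw / mean(ipw) as ipw_norm
  from weighted;
quit;

/* Crude model of the outcome on exposure, weighted by ipw_norm.
   The standard errors from this simple call do not account for the
   fact that the weights were estimated. */
proc logistic data=weighted_norm descending;
  model ptb = adeq;
  weight ipw_norm;
run;
```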

Analysis Approach 2: Subclassification by Categories of the Propensity Scores
• Stratifying by quintiles of the overall distribution of propensity scores can remove approximately 90% of the bias due to the covariates used to estimate the propensity score
• The measure of effect is then computed in each stratum and a weighted average is estimated based on the number of observations in each stratum
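
A minimal SAS sketch of subclassification, assuming the dataset `predvalues` with `propscore`, `adeq`, and `ptb` as in the other examples in these slides. The names `ranked` and `ps_quintile` are hypothetical. The pooled estimate here is the Mantel-Haenszel estimate from the CMH option, which is swapped in for convenience; its weights are not identical to the simple stratum-size weights described on the slide.

```sas
/* Assign quintiles of the overall propensity score distribution;
   PROC RANK numbers the groups 0 through 4 */
proc rank data=predvalues out=ranked groups=5;
  var propscore;
  ranks ps_quintile;
run;

/* Stratum-specific relative risks (RELRISK) and a pooled
   Mantel-Haenszel summary across the five strata (CMH) */
proc freq data=ranked;
  tables ps_quintile*adeq*ptb / relrisk cmh;
run;
```

The row and column order of `adeq` and `ptb` determines which levels are compared, so check the output labels (or apply formats with ORDER=FORMATTED) before interpreting the estimates.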

Analysis Approach 3: Propensity Score Matching
• Several matching techniques are available:
  • Nearest Neighbor (with or without replacement)
  • Caliper and Radius
  • Kernel and Local Linear
• Several software solutions are available to perform matching. Two examples:
  • PSMATCH2 in Stata
  • GREEDY macro in SAS

Analysis Approach 3: Propensity Score Matching
• PSMATCH2 (Stata):
  • PSMATCH2 is flexible and user-controlled with regard to matching techniques
• GREEDY (5→1 digit) macro in SAS:
  • Performs one-to-one nearest neighbor within-caliper matching
  • First, matches are made within a caliper width of 0.00001 (“best matches”); the caliper width is then widened incrementally for unmatched cases, up to 0.1
  • At each stage, the “unexposed” subject with the “closest” propensity score is selected as the match to the exposed; in the case of ties, the unexposed is randomly selected
  • Sampling is without replacement

After Matching…
• Check for balance in the covariates between the exposed and unexposed groups
• If not balanced, re-specify the model and re-generate propensity scores; consider adding interactions or higher order terms for variables that were not balanced
• If balanced, calculate a measure of association from an analysis that accounts for the matched nature of the data
  • Relative Risk / Odds Ratio / Hazard Ratio / Rate Ratio and 95% CI
  • Risk Difference (Attributable Risk) and 95% CI

Matched Analysis
• Analysis to estimate the effect of exposure on outcome should account for the matched design in estimation of standard errors, since members of a matched pair are no longer statistically independent
• Estimates of effect need not be adjusted for matching because exposed are matched to unexposed; therefore a selection bias is not imposed on the data as it is in a matched case-control study, where conditional logistic regression is needed

Matched Analysis
• Multivariable regression is not necessary (but GEE can be used) since matching addresses confounding, so a simple 2x2 table can be used, but this 2x2 table must reflect the matched nature of the data
[Table: matched-pair 2x2 table cross-classifying pairs by whether the exposed member experiences the outcome and whether the unexposed member experiences the outcome]

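Matched Analysis: Measures of Effect (95% CI)
• Relative Risk (RR) = (a+c)/(a+b)
  • SE(lnRR) = sqrt[(b+c) / {(a+b)(a+c)}]
  • 95% CI = exp[lnRR ± (1.96*SE)]
• Risk Difference (RD) / Attributable Risk (AR) = (b-c)/n
  • SE(RD) = sqrt[{(b+c) − (b−c)^2/n} / n^2]
  • 95% CI = RD ± 1.96(SE)
• Here a and d count the concordant pairs (both members, or neither member, experience the outcome), b and c count the discordant pairs, and n = a+b+c+d is the number of matched pairs. Check which discordant cell holds the pairs in which only the exposed member has the outcome before interpreting the direction of RR and the sign of RD.
• Note: Measures of effect from propensity score-matched analyses are often called the “Average Treatment Effect in the Treated (ATT)” in the propensity score literature. This usually refers to the RD, but sometimes an ATT ratio is reported.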
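
The formulas on this slide can be evaluated in a short DATA step. This is a sketch only: the pair counts a, b, c, and d are hypothetical values chosen for illustration, the dataset name `matched_effect` is an assumption, and the standard error of RD is taken as the square root of the variance expression for RD.

```sas
data matched_effect;
  /* Hypothetical matched-pair cell counts, for illustration only */
  a = 40; b = 110; c = 80; d = 770;
  n = a + b + c + d;                 /* number of matched pairs */

  /* Relative risk and 95% CI, computed on the log scale */
  rr      = (a + c) / (a + b);
  se_lnrr = sqrt((b + c) / ((a + b) * (a + c)));
  rr_lcl  = exp(log(rr) - 1.96 * se_lnrr);
  rr_ucl  = exp(log(rr) + 1.96 * se_lnrr);

  /* Risk difference and 95% CI */
  rd     = (b - c) / n;
  se_rd  = sqrt(((b + c) - (b - c)**2 / n) / n**2);
  rd_lcl = rd - 1.96 * se_rd;
  rd_ucl = rd + 1.96 * se_rd;
run;

proc print data=matched_effect noobs;
run;
```

In practice the four counts would come from the matched-pair 2x2 table rather than being typed in.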

Propensity Scores Using the 2007 National Survey of Children’s Health (NSCH) for Illinois

Example: Association between receiving care in a medical home and reported overall health
• Exposure
• Outcome
• Output from SAS proc surveyfreq

Example: Association between medical home (Y/N) and reported overall health
• % of children whose overall health was reported as excellent or very good, according to whether the care they received met the medical home criteria.

Crude Logistic Regression Model: Output from SAS proc surveylogistic
• The odds of a child’s overall health being described as at least very good are 3.7 times as high for children whose care met the medical home criteria as for those whose care did not.

Creating Propensity Scores for the Medical Home
• Many factors, sociodemographic as well as medical, are likely to confound the association between medical home and reported overall health.
• It may not be feasible to adjust for all of these factors in a conventional regression model.
• Instead, propensity scores will be generated to simultaneously account for many factors.

Creating Propensity Scores for the Medical Home: 3 Versions
• 12 variables: demographic variables only
• 14 variables: 12 demographic variables plus a composite variable used to identify children with special health care needs (CSHCN) and a composite variable indicating severity of any health conditions
• 38 variables: 12 demographic variables plus 5 individual CSHCN screener variables and 21 indicators of condition severity

Distribution of Propensity Scores Before Matching
• Version 3 – 38 Variables
[Figure: distribution of propensity scores before matching (n=1428), Medical Home = NO vs. Medical Home = YES]

Creating Propensity Scores for the Medical Home: 3 Versions

Creating Propensity Scores for the Medical Home
• Sample SAS code for outputting the predicted values that are the propensity scores:
  proc surveylogistic data=datasetname;
    title1 "text";
    strata state;
    cluster idnumr;
    weight nschwt;
    class classvars (ref=" ") / param=ref;
    model medical_home (descending) = confounder pool;
    output out=outputdataset p=name for pred. value;
  run;

Creating Propensity Scores for the Medical Home: Excerpt from SAS proc print

Modeling General Health: 3 approaches for each of 3 pools of Variables
*SAS Greedy Macro used for matches; PROC GENMOD used for GEE logistic regression with no weights or survey design variables.
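
A GEE logistic regression on the matched sample, as mentioned in the footnote, might look like the sketch below. The dataset name `matched` and the pair identifier `matchto` follow the restructuring code later in these slides. The variable names `genhealth` (1 = excellent/very good) and `medical_home` (1 = yes, 0 = no) are assumptions, as is the exchangeable working correlation.

```sas
/* GEE logistic regression with matched pairs as clusters;
   no weights or survey design variables are used */
proc genmod data=matched descending;
  class matchto;
  model genhealth = medical_home / dist=binomial link=logit;
  repeated subject=matchto / type=exch;
  /* Exponentiate the coefficient to report an odds ratio */
  estimate 'OR: medical home yes vs. no' medical_home 1 / exp;
run;
```

The REPEATED statement is what makes the standard errors reflect the pairing; without it, the two members of each pair would be treated as independent.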

Modeling General Health: 3 approaches for each of 3 pools of Variables
• Example of statistical results when including the medical home plus 12 covariates:

Modeling General Health: 3 approaches for each of 3 pools of Variables
• As the number of variables increases, it becomes more difficult to implement a conventional model.
• With the medical home plus 38 variables, there were convergence problems:
  • Warning: Ridging has failed to improve the loglikelihood. You may want to increase the initial ridge value (RIDGEINIT= option), or use a different ridging technique (RIDGING= option), or switch to using linesearch to reduce the step size (RIDGING=NONE), or specify a new set of initial estimates (INEST= option).
  • Warning: The SURVEYLOGISTIC procedure continues in spite of the above warning. Results shown are based on the last maximum likelihood iteration. Validity of the model fit is questionable.
• Fortunately, convergence was not a problem when using the 38 variables to create the propensity scores.

Modeling General Health: 3 approaches for each of 3 pools of Variables
• Using the propensity score as a covariate in the model requires only 1 df, making it feasible to account for many variables simultaneously

Distribution of Propensity Scores Before and After Matching
• Version 3 – 38 Variables
[Figure: distributions of propensity scores before and after matching, Medical Home = NO vs. Medical Home = YES]

Modeling General Health: Stratified by Whether the Child is Screened as CSHCN
• 12 Variable Version
^Stratum-specific estimates for the unmatched analyses were obtained using a DOMAIN statement in PROC SURVEYLOGISTIC in SAS 9.2
*PROC GENMOD was used for GEE logistic regression with no weights or survey design variables; matching was performed separately within CSHCN and non-CSHCN

Modeling General Health: Stratified by Whether the Child is Screened as CSHCN
• Rather than running a stratified analysis, obtain stratified results by including a product term in the model:
  genhealth = medical home(Y/N) + prop score (12) + medical home*cshcn
• Use contrast statements in SAS to generate the stratum-specific results:
  contrast 'odds ratio among cshcn y' medicalhome 1 medicalhome*cshcn 1 / estimate=exp;
  contrast 'odds ratio among cshcn n' medicalhome 1 / estimate=exp;
• These results are attenuated compared to the matched, stratified results.

Propensity Score Example: Using 2003 Natality Data for Illinois

Example: Association between receiving adequate prenatal care and Preterm Birth
• Exposure
• Outcome
• Output from SAS PROC FREQ

Crude Measures of Effect
  proc freq data=analysis order=formatted;
    tables adeq*ptb / relrisk riskdiff;
    format adeq ptb yn.;
  run;

Creating Propensity Scores for PNC Adequacy
How might the variables be different if the exposure were entry into PNC?

Creating Propensity Scores for PNC Adequacy
• Sample SAS code for outputting the predicted values that are the propensity scores:
  proc logistic data=datasetname desc;
    title1 "text";
    class classvars / param=ref ref=first;
    model adeq = confounder pool;
    output out=outputdataset p=name for pred. value;
  run;

Creating Propensity Scores for PNC Adequacy: Excerpts from SAS proc print (n=160,642)

Distribution of Propensity Score by PNC Adequacy, before Matching
• Inadequate (range): 0.386-0.988
• Adequate (range): 0.366-0.995; 38 observations at the top and 2 at the bottom of the distribution in the Adequate group fall outside the Inadequate range
• On Support = 0.386-0.988
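
Restricting the data to the region of common support might be sketched as below, assuming the scored dataset `predvalues` with `propscore` and `adeq` from the earlier PROC LOGISTIC example. The names `bounds`, `onsupport`, `ps_lower`, and `ps_upper` are hypothetical.

```sas
proc sql;
  /* Lower bound = the larger of the two group minimums;
     upper bound = the smaller of the two group maximums */
  create table bounds as
  select max(ps_min) as ps_lower, min(ps_max) as ps_upper
  from (select adeq,
               min(propscore) as ps_min,
               max(propscore) as ps_max
        from predvalues
        group by adeq) as r;

  /* Keep only observations whose score lies within both groups' ranges */
  create table onsupport as
  select p.*
  from predvalues as p, bounds as b
  where p.propscore between b.ps_lower and b.ps_upper;
quit;
```

With the ranges reported on this slide, the bounds would be 0.386 and 0.988, and the observations dropped would be those in the Adequate group outside that interval.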

Analyzing Data: Four Approaches

Checking Covariate Balance Before Propensity Score Matching (GREEDY 1:1 Match)
*Standardized difference, calculated as: 100 * (mean_exp − mean_unexp) / sqrt((s_exp^2 + s_unexp^2) / 2), where s = standard deviation of the covariate in each group
Commonly, a standardized difference of 10% or more in absolute value indicates imbalance
Note: All factors are significantly associated with adequate PNC at p<0.0001
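
For a single continuous covariate, the standardized difference might be computed as in this sketch. The dataset `analysis` and exposure `adeq` come from the earlier code in these slides; the covariate name `mage` (maternal age) is hypothetical. Running the same query on the matched dataset gives the after-matching value.

```sas
proc sql;
  /* Standardized difference (in percent) between exposed (adeq=1)
     and unexposed (adeq=0) for one covariate */
  select 100 * (e.m - u.m) / sqrt((e.v + u.v) / 2) as std_diff_pct
  from (select mean(mage) as m, var(mage) as v
        from analysis
        where adeq = 1) as e,
       (select mean(mage) as m, var(mage) as v
        from analysis
        where adeq = 0) as u;
quit;
```

Each inline view returns a single row (the group mean and variance), so the join produces exactly one row holding the standardized difference.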

Checking Covariate Balance Before and After Propensity Score Matching (GREEDY 1:1 Match) ^Calculated as:

Distribution of Propensity Score by PNC Adequacy, after Matching (GREEDY)

Results: Four Approaches Using SAS
Is PNC Associated with Reduced Risk of Preterm Birth?

Results: Restructuring data for matched 2x2 table
  /* Restructuring data from one observation per infant to one observation per matched pair (n obs from 30020 to 15010) */
  /* Unexposed member of each pair (inadequate PNC) */
  data inadeq (rename=(ptb=InAdeqPTB));
    set matched; where adeq=0;
  run;
  proc sort data=inadeq; by matchto; run;
  /* Exposed member of each pair (adequate PNC) */
  data adeq (rename=(ptb=AdeqPTB));
    set matched; where adeq=1;
  run;
  proc sort data=adeq; by matchto; run;
  /* One row per pair, carrying both members' outcomes */
  data matchedpair;
    merge inadeq adeq;
    by matchto;
  run;
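
Once the data are one row per pair, the matched 2x2 table might be produced as sketched below, using the `matchedpair` dataset and the `AdeqPTB` and `InAdeqPTB` variables created by the restructuring code. The AGREE option, which requests McNemar's test for a 2x2 table, is an addition here and is not shown in the slides.

```sas
/* Matched-pair 2x2 table: outcome of the exposed member by outcome of
   the unexposed member. AGREE adds McNemar's test, which is based on
   the discordant pairs only. */
proc freq data=matchedpair;
  tables AdeqPTB*InAdeqPTB / agree;
run;
```

The four cell counts from this table are the a, b, c, and d that feed the matched relative risk and risk difference formulas.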
