Propensity Score Models. Michael Massoglia, Department of Sociology, University of Wisconsin Madison
General Overview • The logic of propensity models • Application-based discussion of some of the key features • Emphasis on a working understanding and use of the models • Brief formal presentation of the models • Empirical example • Questions and discussion • Please interrupt with questions and clarifications
My orientation • Neither an advocate nor a detractor • Try to understand the strengths and weaknesses • Research in this area is expanding rapidly • Focus on 1 statistics program -- 2 modules • Used in published work • Level of talk • Data are often problematic in social science research • Propensity models • One tool that can help with data limitations
Part I: Basic Logic: Standard Regression Estimator • Net of controls, the estimate is based upon mean differences on some outcome between those who experienced the event or treatment (marriage, incarceration, job) and those who did not, and is assumed to be an average effect generalizable to the entire population • Under conditions in which • 1) The treatment is random and • 2) The population is homogeneous (prior to treatment) • Often unlikely in the social sciences
Problems of Experimental Design • Many social processes cannot be randomly assigned • Incarceration • Marriage • Drug use • Divorce • And the list goes on • Data limitations • Cross sectional, few waves, retrospective data, measures change • Propensity models attempt to replicate experimental design with statistics
Propensity models • Rooted in classic experimental design • Treatment group • Exposed to some treatment • Control group • Not exposed to treatment • Individuals are statistically randomized into groups • Identical (net of covariates) • Or differ in ways unrelated to outcomes • Treatment can be seen as random • Ignorable treatment (conditional independence) assumption
Counterfactuals • PSM: Toward a consideration of counterfactuals • Some people receive treatment -- marriage, incarceration, job. • The counterfactual • “What would have happened to those who, in fact, did receive treatment, if they had not received treatment (or the converse)?” • Counterfactuals cannot be observed, but we can create an estimate of them • Rubin “The fundamental problem…” • At the heart of PSM
Part II: Application-Based Discussion: Propensity Score • Calculate the predicted probability of some treatment • Assuming the treatment can be manipulated • Comparatively minor debate in literature • We have a predicted probability (for everyone) • Predicted probability is based on observed covariates • Once we know the predicted probability • 1) Find people who experienced a treatment • 2) Match to people who have the same* predicted probability, but did not experience treatment • 3) Observe differences on some outcome
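The three steps on this slide can be sketched in a few lines. This is only an illustration in plain Python, not the Stata modules the talk uses: the scores, treatment flags, and outcomes are hypothetical toy values, `nearest_neighbor_att` is a made-up helper name, and "same" predicted probability is approximated by the closest untreated case (matching with replacement).

```python
# Hypothetical toy data: (propensity score, treated flag, outcome).
units = [
    (0.80, 1, 10.0),
    (0.60, 1, 8.0),
    (0.30, 1, 7.0),
    (0.78, 0, 7.0),
    (0.55, 0, 6.0),
    (0.33, 0, 6.5),
    (0.10, 0, 5.0),
]


def nearest_neighbor_att(units):
    """Average outcome difference between treated cases and their closest control."""
    treated = [u for u in units if u[1] == 1]    # step 1: people who got treated
    controls = [u for u in units if u[1] == 0]
    diffs = []
    for score, _, outcome in treated:
        # Step 2: the untreated case with the closest predicted probability.
        match = min(controls, key=lambda c: abs(c[0] - score))
        # Step 3: difference on the outcome within the matched pair.
        diffs.append(outcome - match[2])
    return sum(diffs) / len(diffs)


att = nearest_neighbor_att(units)
print(f"Estimated effect on the treated: {att:.3f}")
```

The control with score 0.10 is never used, which is the practical meaning of matching: cases with no comparable counterpart contribute nothing to the estimate.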
The process of Matching • All based on matching a treated case to a control case • 1 program, 2 modules • Nearest neighbor matching • 1-1 match • Kernel matching • Weights for distance • Radius matching • 0.01 around each treated case • Stratification matching • Breaks propensity scores into strata based on region of common support • Great visual from Pop Center at PSU • http://help.pop.psu.edu/help-by-statistical-method/propensity-matching/Intro%20to%20P-score_Sp08.pdf/?searchterm=None
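To make the difference between radius and kernel matching concrete, here is a minimal sketch in plain Python. The 0.01 radius comes from the slide; the Gaussian kernel, the 0.06 bandwidth, the function names, and the data are assumptions chosen for illustration and are not taken from the talk or from either Stata module.

```python
import math


def radius_match(score, controls, radius=0.01):
    """Mean outcome of all controls whose score lies within `radius` of `score`."""
    inside = [y for s, y in controls if abs(s - score) <= radius]
    return sum(inside) / len(inside) if inside else None   # None: no match found


def kernel_match(score, controls, bandwidth=0.06):
    """Weighted mean of all control outcomes; closer scores get larger weights."""
    weights = [math.exp(-0.5 * ((s - score) / bandwidth) ** 2) for s, _ in controls]
    total = sum(weights)
    return sum(w * y for w, (_, y) in zip(weights, controls)) / total


# Hypothetical controls as (propensity score, outcome) pairs.
controls = [(0.295, 4.0), (0.305, 6.0), (0.50, 9.0)]
treated_score = 0.30

radius_estimate = radius_match(treated_score, controls)   # uses the two close controls only
kernel_estimate = kernel_match(treated_score, controls)   # uses all three, far one barely counts
unmatched = radius_match(0.90, controls)                  # nobody within 0.01 of 0.90
print(radius_estimate, kernel_estimate, unmatched)
```

Radius matching discards distant controls outright, while kernel matching keeps every control and shrinks its weight with distance, so a treated case with no close control still receives a (possibly poor) counterfactual.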
3 Key Components • Range of common support • Existence Condition • Balancing Property • Ignorable treatment assumption • Observed Covariates • Reviewers pay attention • More so than other methods? • Important to keep in mind: Cross group models • Not within person “fixed effects models”
Range of Common Support • We use data only from the region of common support; cases outside it violate the existence condition • Assumption of common support (1) • Range of matched cases
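One common way to define the region of common support is the overlap between the score ranges of the two groups. The sketch below assumes that min/max rule; the Stata modules may apply a slightly different definition, and the scores and the `common_support` helper are hypothetical.

```python
def common_support(treated_scores, control_scores):
    """Overlap of the two score ranges: [larger of the minima, smaller of the maxima]."""
    low = max(min(treated_scores), min(control_scores))
    high = min(max(treated_scores), max(control_scores))
    return low, high


# Hypothetical estimated propensity scores.
treated_scores = [0.35, 0.50, 0.72, 0.91]
control_scores = [0.05, 0.20, 0.40, 0.65, 0.80]

low, high = common_support(treated_scores, control_scores)

# Keep only cases inside the overlap; the rest have no comparable counterpart.
kept_treated = [s for s in treated_scores if low <= s <= high]
kept_controls = [s for s in control_scores if low <= s <= high]
print(low, high, kept_treated, kept_controls)
```

Here the treated case at 0.91 and the controls at 0.05 and 0.20 fall outside the overlap and would be dropped before matching.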
Balanced • Among those with the same predicted probability of treatment, those who get treated and those who do not differ only on their error term in the propensity score equation. • But this error term is approximately independent of the X’s. • Ignorable treatment assumption • The reality: • Treated and untreated are the same given the covariates
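Balance is usually checked covariate by covariate. A minimal sketch, assuming the standardized percentage difference in means (difference in means divided by the square root of the average of the two group variances) as the balance measure; the ages and the function name are hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(treated_x, control_x):
    """Difference in means as a percentage of the pooled standard deviation."""
    pooled_sd = math.sqrt((variance(treated_x) + variance(control_x)) / 2)
    return 100 * (mean(treated_x) - mean(control_x)) / pooled_sd


# Hypothetical covariate (age) for treated cases, all controls, and matched controls.
treated_age = [20, 22, 24, 26]
all_control_age = [30, 32, 34, 36]
matched_control_age = [21, 22, 25, 26]

before = standardized_difference(treated_age, all_control_age)
after = standardized_difference(treated_age, matched_control_age)
print(f"before matching: {before:.1f}%   after matching: {after:.1f}%")
```

If matching worked, the difference after matching should be much closer to zero than before; a covariate that stays badly unbalanced suggests the matching equation needs to be respecified.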
Observed Covariates • Propensity models are based on observed covariates • Much like many other regression-based models • Yet, reviewers pay particular attention • Models get additional attention • PSM • Cannot: Fix out some variables • Fixed effects models can: hard-to-measure, time-stable traits • Can: Assess the role of unobserved variables with simulations
Part 3: Brief Formal Presentation: Propensity score • More formally: • The propensity score for subject i (i = 1, …, N) is the conditional probability of being assigned to treatment Zi = 1 vs. control Zi = 0 given a vector xi of observed covariates: • $e(x_i) = \Pr(Z_i = 1 \mid X_i = x_i)$ • where it is assumed that, given the X’s, the Zi’s are independent
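In practice the conditional probability is estimated from the data. The sketch below assumes a logit model with a single covariate, fitted by plain gradient ascent on the log-likelihood; the talk does not say which link function is used, and the data, the function name, and the tuning values (`steps`, `rate`) are hypothetical.

```python
import math


def fit_propensity(xs, zs, steps=5000, rate=0.1):
    """Fit Pr(Z = 1 | x) = 1 / (1 + exp(-(a + b*x))) by gradient ascent."""
    a = b = 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, z in zip(xs, zs):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            grad_a += z - p            # score equation for the intercept
            grad_b += (z - p) * x      # score equation for the slope
        a += rate * grad_a / n
        b += rate * grad_b / n
    return a, b


# Hypothetical covariate values and treatment indicators (not perfectly separable).
xs = [-2, -1, -1, 0, 0, 0, 1, 1, 2, 2]
zs = [0, 0, 1, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_propensity(xs, zs)
scores = [1.0 / (1.0 + math.exp(-(a + b * x))) for x in xs]   # estimated e(x_i)
for x, z, s in zip(xs, zs, scores):
    print(f"x={x:>2}  Z={z}  estimated score={s:.3f}")
```

Every subject, treated or not, receives a score between 0 and 1, and subjects with the same covariate value receive the same score, which is what makes matching on the score possible.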
Assumption(s) • Given the X’s, the Zi’s are independent (given covariates) • Moves the logic of propensity scores to that of an experiment • Substantively means • Conditional on the observed variables, treatment status is independent of the potential outcomes • Treatment status occurs at random • Ignorable Treatment Assumption (2) • Stable unit treatment value assumption: the potential outcomes of one unit should be unaffected by the particular assignment of treatments to the other units • Issues of independence
Part 4: Empirical Example • 3 part process • 1) Assign propensity scores • Create your matching equation • Some programs do this at the same time they estimate the treatment effect • My view is do them separately • Greater flexibility if you have propensity scores independent of treatment effects • High, low, females, males • 2) Create matched sample • Average treatment effect • 3) Tests of robustness
Add on to Stata • Can be done in SAS, S-Plus, R, Mplus, SPSS* • Stata • PSMATCH2: Stata module for propensity score matching, common support graphing, and covariate imbalance testing • psmatch2.ado • PSCORE: same basic features • More user “friendly” • pscore.ado • . net search psmatch2 • . net search pscore • . ssc install psmatch2, replace
Moving into Stata • Becker and Ichino (2002), Estimation of average treatment effects based on propensity scores, The Stata Journal Vol. 2, No. 4, pp. 358-377. • Walk through the process • Create propensity score • From observed covariates in the data • Use different matching groups • Estimates • Test the robustness of effect • Bias from unobservables
Two quick notes
1) tab mypscore

  Estimated |
 propensity |
      score |      Freq.     Percent        Cum.
------------+-----------------------------------
    .000416 |          1        0.02        0.02
    .000446 |          1        0.02        0.04
   .0004652 |          1        0.02        0.05
   .0005133 |          1        0.02        0.07
   .0005242 |          1        0.02        0.09
   .0005407 |          1        0.02        0.11
   .0005493 |          1        0.02        0.13
   .0005666 |          3        0.05        0.18
   .0005693 |          1        0.02        0.20
   .0005729 |          1        0.02        0.22

2) Bad Matching Equation: Link back to PSU
3) Link: IU
Sensitivity Tests • gen delta • delta is the difference in the outcome between each treated case and its matched control • rbounds delta, gamma(1 (0.1) 2) • gamma: odds of differential assignment to treatment due to unobserved heterogeneity (the values 1 to 2 are odds ratios; gamma = 1 means no hidden bias) • Rosenbaum bounds take the difference in the response variable between treatment and control cases as delta, and examine how inference about delta changes with gamma • LINK TO IU 2
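The logic behind the bounds can be sketched without Stata. The code below assumes the textbook normal approximation to the Wilcoxon signed-rank test under hidden bias of size gamma, with no ties or zero differences handled specially; it illustrates the idea and is not a reimplementation of the `rbounds` module. The matched-pair differences and the function names are hypothetical.

```python
import math


def signed_rank_statistic(deltas):
    """Sum of ranks of |delta| over positive deltas, and the number of nonzero pairs."""
    nonzero = [d for d in deltas if d != 0]
    ordered = sorted(nonzero, key=abs)           # rank 1 = smallest |delta|
    t_plus = sum(rank for rank, d in enumerate(ordered, start=1) if d > 0)
    return t_plus, len(nonzero)


def upper_bound_p(deltas, gamma):
    """Largest one-sided p-value consistent with hidden bias of size gamma."""
    t_plus, n = signed_rank_statistic(deltas)
    p_plus = gamma / (1.0 + gamma)               # worst-case chance a pair favors treatment
    expected = p_plus * n * (n + 1) / 2
    var = p_plus * (1 - p_plus) * n * (n + 1) * (2 * n + 1) / 6
    z = (t_plus - expected) / math.sqrt(var)
    return 0.5 * math.erfc(z / math.sqrt(2))     # 1 - Phi(z)


# Hypothetical outcome differences (treated minus matched control), one per pair.
deltas = [1.2, 0.8, 2.5, -0.3, 1.9, 0.6, 3.1, -0.5, 1.4, 2.2, 0.9, 1.7]

p_values = {gamma: upper_bound_p(deltas, gamma) for gamma in (1.0, 1.5, 2.0, 3.0)}
for gamma, p in p_values.items():
    print(f"gamma={gamma:.1f}  upper-bound p-value={p:.4f}")
```

At gamma = 1 the bound is the ordinary signed-rank p-value. The gamma at which the bound crosses the chosen significance level indicates how strong an unobserved confounder would have to be to explain away the estimated effect.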
A few concluding comments • Propensity models • Dependent on data • As are all models • Reviewers and editors seem to care more • Yet weaknesses appear similar to those of traditional regression models • You can empirically test the role of unobservables with simulations • Significant advancement
Thank you! • A small window into propensity models • Regression, matched sample, use as covariates, as an instrument • With longitudinal data perfectly measured on all variables over time • Open to an argument for preferring • Fixed effects models • And variants: Difference in differences • We do not live in such a world • Propensity models help us through imperfect data • Questions? (5) • Prefer an open discussion