POLS 7170X Master’s Seminar: Program/Policy Evaluation
Class 5
Brooklyn College – CUNY
Shang E. Ha
How to Measure Impact?
• Impact assessment: to determine what effects programs have on their intended outcomes and whether there are important unintended effects
• The central question: what would have happened in the absence of the program?
• Since the counterfactual is not observable, the key goal of all impact evaluation methods is to construct or “mimic” the counterfactual
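One common way to write this idea down uses potential-outcomes notation. The notation is not on the original slides; the symbols $Y_i(1)$, $Y_i(0)$, and $D_i$ are assumptions introduced here for illustration only.

```latex
% Y_i(1): outcome of unit i with the program
% Y_i(0): outcome of the same unit without the program (the counterfactual for participants)
\text{Impact}_i = Y_i(1) - Y_i(0)

% Only one of the two outcomes is ever observed for a given unit,
% where D_i = 1 if unit i receives the program and D_i = 0 otherwise:
Y_i^{\text{obs}} = D_i \, Y_i(1) + (1 - D_i) \, Y_i(0), \qquad D_i \in \{0, 1\}
```

Because $Y_i(0)$ is never observed for participants, every method in the following slides can be read as a different way of estimating it from a comparison group.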
Constructing the Counterfactual
• The counterfactual is often constructed by selecting a group not affected by the program
• Randomized:
  • Use random assignment of the program to create a control group that mimics the counterfactual
• Non-randomized:
  • Argue that a certain excluded group mimics the counterfactual
Types of Impact Evaluation Methods
• Randomized Evaluations, also known as:
  • Random Assignment Studies
  • Randomized Field Trials
  • Social Experiments
  • Randomized Controlled Trials (RCTs)
  • Randomized Controlled Experiments
• The “gold standard” research design for assessing causal effects
Types of Impact Evaluation Methods (Cont.)
• Non-experimental or Quasi-experimental Methods
  • Pre-Post
  • Differences-in-Differences
  • Statistical Matching
  • Instrumental Variables
  • Regression Discontinuity
  • Interrupted Time Series
• These methods lack the random assignment to conditions that is essential for true experiments
Randomize or not?
• Designs using nonrandomized controls universally yield less convincing results than well-executed randomized field experiments
• The randomized field experiment is always the optimal choice for impact assessment
• Nevertheless, quasi-experiments are useful for impact assessment when it is impractical or impossible to conduct a true randomized experiment
Randomized trials
• Simple case
  • Take a sample of program applicants
  • Randomly assign them to either:
    • Treatment group: is offered treatment
    • Control group: not allowed to receive treatment (during the evaluation period)
    • [Or Placebo group: receives an innocuous treatment]
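The simple case above can be sketched in a few lines of Python. This is a minimal illustration, not a procedure from the slides: the function name `assign_treatment`, the applicant IDs, the 50/50 split, and the seed are all assumptions made for the example.

```python
import random

def assign_treatment(applicants, seed=0):
    """Randomly split applicants into two groups of (nearly) equal size."""
    rng = random.Random(seed)   # fixed seed so the assignment can be reproduced and audited
    shuffled = list(applicants)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

# Hypothetical applicant IDs, for illustration only
applicants = [f"applicant_{i:03d}" for i in range(1, 21)]
groups = assign_treatment(applicants, seed=42)
print(len(groups["treatment"]), len(groups["control"]))
```

Shuffling and cutting the list fixes the group sizes in advance, which a coin flip per applicant would not.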
Randomized trials
• The critical element in estimating program effects by randomized field experiment is configuring a control group that does not participate in the program but is equivalent to the group that does
  • Identical composition: intervention and control groups contain the same mixes of persons or other units in terms of their program-related and outcome-related characteristics
  • Identical predispositions: intervention and control groups are equally disposed toward the project and equally likely, without intervention, to attain any given outcome status
  • Identical experiences: over the time of observation, intervention and control groups experience the same time-related processes (maturation, interfering events, etc.)
Randomized trials
• Even though target units are assigned randomly, the intervention and control groups will never be exactly the same
• But if the random assignment were made over and over, the chance differences between the groups would average out to zero
• Statistical methods are used to judge whether a specific difference in outcome is likely to have occurred simply by chance or is more likely to represent the effect of the intervention
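One statistical method that follows this logic directly is a permutation test. The slide does not name it; it is used here as an illustrative choice. The sketch reshuffles the group labels many times and counts how often chance alone produces a difference in means at least as large as the one observed. The outcome scores are made up, and `permutation_p_value` is a hypothetical helper name.

```python
import random
from statistics import mean

def permutation_p_value(treated, control, n_perm=5000, seed=0):
    """Two-sided p-value for the difference in means under random relabeling."""
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_t = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # one hypothetical re-run of the random assignment
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_perm

# Made-up outcome scores, for illustration only
treated = [14, 15, 13, 16, 15, 17, 14, 16]
control = [12, 13, 11, 14, 12, 13, 12, 11]
diff, p = permutation_p_value(treated, control)
print(f"difference in means = {diff:.2f}, permutation p-value = {p:.4f}")
```

A small p-value means that few random relabelings produce a gap as large as the observed one, so chance alone is an unlikely explanation.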
Units of analysis
• The units on which outcome measures are taken in an impact assessment are called the units of analysis
• The choice of the units of analysis should be based on the nature of the intervention and the target units to which it is delivered
• Cf. units of randomization
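The contrast with units of randomization can be made concrete with a small sketch in which schools are randomized but outcomes are measured on students. The school and student IDs, the function name `assign_clusters`, and the choice of two treated schools are all hypothetical.

```python
import random

def assign_clusters(cluster_ids, n_treated, seed=0):
    """Randomly pick n_treated clusters for the intervention; the rest are controls."""
    rng = random.Random(seed)
    ordered = sorted(cluster_ids)   # sort first so the draw is reproducible
    treated = set(rng.sample(ordered, n_treated))
    return {c: ("intervention" if c in treated else "control") for c in ordered}

# Hypothetical roster: each student (unit of analysis) belongs to a
# school (unit of randomization).
students = {
    "student_01": "school_A", "student_02": "school_A",
    "student_03": "school_B", "student_04": "school_B",
    "student_05": "school_C", "student_06": "school_C",
    "student_07": "school_D", "student_08": "school_D",
}
school_arm = assign_clusters(set(students.values()), n_treated=2, seed=1)

# Every student inherits the condition of his or her school.
student_arm = {s: school_arm[school] for s, school in students.items()}
print(school_arm)
```

When the two units differ like this, the analysis generally has to account for the fact that students in the same school were assigned together.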
Key steps in conducting a randomized experiment
• Design the study carefully
• Randomly assign people to treatment or control
• Collect baseline data
• Verify that assignment looks random
• Monitor the process so that the integrity of the experiment is not compromised
• Collect follow-up data for both the treatment and control groups in identical ways
• Estimate program impacts by comparing mean outcomes of the treatment group vs. mean outcomes of the control group
• Assess whether program impacts are statistically significant and practically significant
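Two of the steps above, verifying that assignment looks random and estimating impact by comparing group means, can be sketched as follows. The scores are made up for illustration, `difference_in_means` is a hypothetical helper, and the 95% interval relies on a normal approximation that is crude for samples this small.

```python
from math import sqrt
from statistics import mean, variance

def difference_in_means(treated, control):
    """Difference in group means and its standard error (unequal variances)."""
    diff = mean(treated) - mean(control)
    se = sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return diff, se

# Made-up scores for illustration only; not data from any real program.
baseline = {"treatment": [50, 52, 48, 51, 49, 50], "control": [49, 51, 50, 52, 48, 50]}
followup = {"treatment": [58, 60, 55, 59, 57, 59], "control": [52, 54, 51, 55, 50, 53]}

# Balance check: before the program, the two groups should look alike.
balance_diff, balance_se = difference_in_means(baseline["treatment"], baseline["control"])

# Impact estimate: mean follow-up outcome of treatment minus that of control.
impact, impact_se = difference_in_means(followup["treatment"], followup["control"])

# Rough 95% interval (normal approximation; an assumption of this sketch).
low, high = impact - 1.96 * impact_se, impact + 1.96 * impact_se
print(f"baseline difference = {balance_diff:.2f} (SE {balance_se:.2f})")
print(f"estimated impact = {impact:.2f}, rough 95% interval = ({low:.2f}, {high:.2f})")
```

Whether an impact of this size matters in practice is a separate judgment that the code cannot make.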
Key advantages of experiments
• Because members of the groups (treatment and control) do not differ systematically at the outset of the experiment, any difference that subsequently arises between them can be attributed to the treatment rather than to other factors
• Relative to results from non-experimental studies, results from experiments are:
  • Less subject to methodological debates
  • More likely to be convincing to program funders and/or policymakers
Limitations of experiments
• Ethical considerations
• Time and cost
• External validity?
Validity
• A tool to assess the credibility of a study
• Internal validity: related to the ability to draw causal inference, i.e., can we attribute our impact estimates to the program and not to something else?
• External validity: related to the ability to generalize to other settings of interest, i.e., can we generalize our impact estimates from this program to other populations, time periods, countries, etc.?
Example 1
• The Child and Adolescent Trial for Cardiovascular Health (CATCH) [Exhibit 8-B]
• 96 elementary schools (CA, LA, MN, TX); 56 intervention sites and 40 control sites
• Intervention sites: training sessions for the food service staffs informing them of the rationale for nutritionally balanced school menus and providing recipes and menus that would achieve that goal
• Outcomes were measured by 24-hour dietary intake interviews with children at baseline and at follow-up
• Children in the intervention schools had significantly lower total food intake and fewer calories derived from fat and saturated fat than children in control schools, but did not differ in intake of cholesterol or sodium
Example 2
• The Minnesota Family Investment Program (MFIP) [Exhibit 8-E]
• Problem with Aid to Families with Dependent Children (AFDC): it does not encourage recipients to leave the welfare rolls and seek employment
• Conduct an experiment that would encourage AFDC clients to seek employment and allow them to receive greater income than AFDC would allow if they become employed
• Three conditions:
  • An MFIP intervention group receiving more generous benefits and mandatory participation in employment and training activities
  • An MFIP intervention group receiving only the more generous benefits and not the mandatory employment and training activities
  • A control group that continued to receive the old AFDC benefits and services
• MFIP intervention families were more likely to be employed and, when employed, had larger incomes than control families
• Those in the intervention group receiving both MFIP benefits and mandatory employment and training activities were more often employed and earned more than the intervention group receiving only the MFIP benefits
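A three-condition design like this one can be sketched by shuffling the units and dealing them out to the arms in turn. The arm labels, family IDs, and equal allocation below are assumptions for illustration; the slide does not say how MFIP actually allocated families across conditions.

```python
import random
from collections import Counter

def assign_arms(units, arms, seed=0):
    """Shuffle units, then deal them out to arms so arm sizes differ by at most one."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: arms[i % len(arms)] for i, unit in enumerate(shuffled)}

# Hypothetical labels for the three conditions described above
arms = ["benefits_plus_mandatory_services", "benefits_only", "afdc_control"]
families = [f"family_{i:03d}" for i in range(1, 31)]  # hypothetical IDs

assignment = assign_arms(families, arms, seed=11)
print(Counter(assignment.values()))
```

Comparing each intervention arm with the control arm estimates the effect of that package, while comparing the two intervention arms isolates what the mandatory activities add.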
Some variations on the basics
• Assigning to multiple treatment groups
• [Example: Education Program]
• Problems
  • Large class size
  • Children at different levels of learning
  • Teachers often absent
• Possible remedies
  • Hire more teachers to split classes
  • Stream pupils into different achievement bands
  • Make teachers more accountable, so that they may show up more
Solutions
• Do smaller class sizes improve test scores?
  • Add new teachers
• Do accountable teachers get better results?
  • New teachers are more accountable
• Does streaming improve test scores?
  • Divide some classes by achievement
How to Randomize?
• Lottery
• Phase-In
Lottery
• An example from clinical trials
  • Take 1000 people and give half of them the new drug
• Can we simply apply this approach to social programs?
• Need to consider the constraints
Resource Constraints
• Many programs have limited resources
• Far more eligible recipients than the available resources can serve
• Quite common in practice:
  • Training for entrepreneurs or farmers
  • School vouchers
• How do you allocate these resources?
Advantages of Lottery
• Simple, common, transparent, and flexible
• Participants know who the “winners” and “losers” are
• There is no a priori reason to discriminate
• Perceived as fair
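A lottery for an oversubscribed program can be sketched as below. The numbers (200 eligible applicants, 60 slots), the IDs, and the function name `run_lottery` are hypothetical, and the sketch assumes each applicant ID appears only once.

```python
import random

def run_lottery(eligible, n_slots, seed=0):
    """Draw n_slots winners at random; everyone else forms the comparison group."""
    rng = random.Random(seed)   # a published seed makes the draw reproducible
    pool = list(eligible)
    winners = set(rng.sample(pool, min(n_slots, len(pool))))
    offered = [p for p in pool if p in winners]
    not_offered = [p for p in pool if p not in winners]
    return offered, not_offered

# Hypothetical example: 200 eligible applicants, funding for 60 slots
eligible = [f"applicant_{i:03d}" for i in range(1, 201)]
offered, not_offered = run_lottery(eligible, n_slots=60, seed=2024)
print(len(offered), len(not_offered))
```

Because every eligible applicant has the same chance of winning, the applicants who lose the lottery serve as the control group for those who win.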
Phase-In
• Everyone gets the program eventually
  • “In five years, we will cover 500 schools, 100 per year”
• Advantages
  • Everyone gets treatment eventually
• Concerns
  • Can complicate estimating long-run effects
  • Care required with phase-in windows
  • Do expectations change actions/behavior today?
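The “500 schools, 100 per year” example can be sketched by randomizing the order in which schools enter the program. The school IDs and the helper names `phase_in_schedule` and `groups_in_year` are hypothetical; in any given year, schools not yet phased in act as the comparison group.

```python
import random
from collections import Counter

def phase_in_schedule(units, per_year, seed=0):
    """Randomly order units and assign each a start year (1, 2, 3, ...)."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    return {unit: 1 + i // per_year for i, unit in enumerate(shuffled)}

def groups_in_year(start_year, year):
    """Split units into those already covered and those still waiting in a given year."""
    covered = [u for u, y in start_year.items() if y <= year]
    waiting = [u for u, y in start_year.items() if y > year]
    return covered, waiting

schools = [f"school_{i:03d}" for i in range(1, 501)]  # hypothetical IDs
start_year = phase_in_schedule(schools, per_year=100, seed=7)

covered, waiting = groups_in_year(start_year, year=2)
print(sorted(Counter(start_year.values()).items()))
print(len(covered), len(waiting))
```

After year 5 no school is left waiting, which is why long-run effects are hard to estimate with this design.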