PA 522—Program Evaluation—Summer 2013
Introduction to Program Evaluation: Contextual Planning and Theory-Driven Frameworks
Dr. Mario A. Rivera, Regents' Professor, School of Public Administration
Program Evaluation
• Refers to the methodical collection of evidence to judge the effectiveness of social programs or other community, healthcare, and similar kinds of intervention.
• The use of social research procedures and methods to systematically investigate the effectiveness of such interventions and programs.
• In short, the purpose of an evaluation is to assess the effects and effectiveness of a social, policy, or programmatic intervention.
• As a field and as a complex of methods, Program Evaluation draws broadly from social and behavioral science disciplines, particularly education, sociology, social psychology, economics, and political science. In practice, it involves the application of some combination of qualitative and quantitative methods, or "mixed methods." The course will examine a number of these, relying on case studies. Case study is in itself one of those methods.
History of Evaluation:
• In education, the major impetus to the development of decision-oriented evaluation was the curriculum reform movement of the 1960s. Human service evaluation began primarily as an extension of educational evaluation. The field has also applied methods from economics and operations research, especially cost-effectiveness and cost-benefit analysis.
• The rise of the field prompted the development of professional associations (foremost, the American Evaluation Association).
• The past 25 years have seen, first, the enactment of influential legislation and government mandates and, second, efforts by academics, practitioners, and foundations to help nonprofit organizations improve their evaluation practices by developing training manuals and tools (e.g., the Kellogg Foundation's work on logic models and the United Way's Outcome Measurement System).
History of Evaluation:
• The field has its roots in government-sponsored evaluation initiatives, which became institutionalized during the 1960s.
• In an effort to assess the effectiveness of the War on Poverty's social initiatives, improve public accountability, learn what worked, and lend legitimacy to these new public programs, the federal government required evaluation research to be conducted as these programs were implemented.
• Successive Presidents promulgated initiatives designed to link budgetary expenditures with programmatic results, such as the Planning-Programming-Budgeting System under President Johnson, Management by Objectives under President Nixon, and Zero-Based Budgeting under President Carter.
• Under the Reagan administration, however, spending on social services was cut, and evaluation was deemphasized. Government agencies were not required to report much beyond program inputs and outputs to justify their existence, nor were they required as often to conduct formal evaluations of their programs. New social initiatives, which often created ripe opportunities for evaluation work, were curbed, further affecting the practice of program evaluation.
History of Evaluation:
• This trend, however, shifted by the late 1980s and early 1990s, with the movement to "reinvent government" coming to the fore during the Clinton administration; interest in evaluation and performance measurement reemerged. Most notable were the National Performance Review of the mid- to late 1990s, designed to streamline the bureaucracy, and, before that, the Government Performance and Results Act of 1993 (GPRA). By the early 1990s, Congress was already concerned about a lack of confidence in the efficacy of public programs, and GPRA was a means of addressing that concern. GPRA requires that federal agencies
• produce a strategic plan that includes organizational goals and objectives,
• generate a performance plan that includes measurement and data on meeting these goals and objectives, and
• compile a performance report that includes actual performance data.
Evaluation History:
• The Bush administration continued the Clinton administration's emphasis on performance-based management at the executive level by promulgating and implementing the Program Assessment Rating Tool (PART) within the Office of Management and Budget, as a supplement to GPRA reporting. PART provides a focused summary of key performance findings from GPRA reports.
• With the signing of the Serve America Act in 2009, the Obama administration restored the priority of program evaluation, making it key to "scaling up what works." Program evaluation has come back in government, and it is required of federally funded or contracted nonprofit implementers as well. Federal funding often requires that grant proposals specify evaluation plans (external evaluation in particular). Increasingly, it also requires specification of partnership-based implementation and evaluation plans, a major concern in this course.
Practical Program Evaluation: Assessing and Improving Planning, Implementation, and Effectiveness, by Huey-Tsyh Chen
Chen's Approach: Chen proposes a taxonomy, or classification, of types of program evaluation, built around the program stage that is the desired focus of the evaluation and around the desired function of the evaluation (either program improvement or program impact assessment). Certain themes cut across all stages and types of evaluation, above all stakeholder participation: stakeholder involvement, "stakeholder validity," stakeholder empowerment, and communication with stakeholders.
Chen
• Program Planning Stage. The first of the four stages is the program planning stage. This is the very beginning of program evaluation. Stakeholders at this stage—for example, program designers and managers, other principals, and clients or beneficiaries—are developing a plan that will serve as a foundation for organizing and implementing a program at some future date.
• One needs to distinguish between program planning and planning for a program evaluation, which can take place at any stage of program development. A design-phase or developmental evaluation refers to the assessment of a program at the pre-deployment or early-deployment stage. How is such early evaluation possible?
Chen
• Implementation Stage. Program evaluation has, for much of its history, focused on outcomes. Lessons from the field, however, have shown that program failures are often essentially implementation failures, and the focus of evaluation has gradually broadened to include process evaluation. The current view is that much of implementation failure can be traced to poor program planning and development. Evaluators can make important contributions in these areas, where attention is most needed.
Chen
• Mature Implementation Stage. This stage follows initial implementation, at a point when the program has settled into fairly routine activities. Rules and procedures for conducting program activities are now well established. Stakeholders are likely to be interested in one or more of the following: continued unearthing of the sources of immediate problems, generation of data reassuring to those to whom stakeholders are accountable, and program improvement. Even in maturity, a program is subject to problems such as client dissatisfaction with services. Identifying and resolving problems are key to improving a program, and, as a program matures, stakeholders may think more about accountability.
Chen
• Outcome Stage. The fourth stage of program growth is the outcome stage. Following a period of program maturity, stakeholders inside and outside the program want to know whether the program is achieving its goals. An evaluation at this point can serve any of several evaluation needs, including merit assessment (how well the program has functioned) and fidelity assessment (how closely it has come to projected outcomes).
Evaluation Phases, Purposes & Types
• Design-phase or developmental evaluation: helps ensure that programs are well conceived and well designed (program development stage or phase)
• Formative or process evaluation: helps improve the implementation and management of programs (mature program implementation stage or phase)
• Summative, outcome, or impact evaluation: helps determine whether or to what extent a program worked (program outcome stage or phase)
Chen: Theory-Driven Program Evaluation
• All programs have implicit theories.
• Program modeling (e.g., via logic models) helps make implicit (or tacit) theory more explicit and therefore subject to scrutiny.
• Implementation fidelity: preserving the causal mechanism in implementation; scaling up; staying close to projected/intended outcomes (a judgment call—what of positive unintended outcomes, or negative unintended consequences of projected and attained outcomes?).
Chen: Programs and Fidelity
• Intended model vs. implemented model
• Normative theory (the program model: what the program is intended to achieve, and how what it does is expected to work)
• Causative theory (theory of change)
• Models too often substitute for reality (they should not—a kind of "formalism")
• Models can support: assessment of "evaluability," needs assessments, program development and refinement, and monitoring and evaluation
The need for a classification system
• Classifying evaluation approaches has a pragmatic purpose: it provides evaluation practitioners with the detail needed to choose among evaluation approaches on the basis of their inherent parameters, purposes, and processes (Mathison 2005).
• It also assists in finding the most appropriate fit among the evaluation's purpose, its underlying values, and the methodologies best suited to achieving rigorous results.
Chen's functional classification system linked to evaluation strategies
• Chen (2005) suggests four evaluation strategies: assessment strategies (assessing the performance of the program or intervention), development strategies (planning the programmatic intervention), enlightenment strategies (examining and shedding light on underlying assumptions and mechanisms), and partnership strategies (involving stakeholders).
• The distinction is based on the purpose or objectives of the evaluation.
Evaluation approaches vary in underlying philosophy
• Theory-driven evaluation philosophies adopt a more "scientific" approach to evaluation research in order to identify the critical success factors of the evaluation, linked to an in-depth understanding of the workings of a program or activity. They turn on notions of scientific validity.
• Participation-driven evaluation philosophies lean toward a more applied, social-improvement approach to evaluation research, with the general aim of development, empowerment, and the creation of shared understanding of the program between the evaluator(s) and stakeholders (beneficiaries and decision-makers). They turn on notions of "stakeholder validity" (Chen).
Our practitioner-oriented classification—based on broad terminological consensus
• Formative/process evaluation (stress on program improvement, increasing organizational or inter-organizational capacity)
• Outcome/impact or summative evaluation (stress on assessing program results, with due attention to fidelity of results to original intentions of or for the program)
Logic Model
A graphic representation that clearly identifies and lays out the logical relationships among program conditions (needs), resources/inputs, activities, outputs, and outcomes or impacts.
Logic models & implicit/explicit program theory
• A good logic model clearly identifies program goals, objectives, inputs, activities, outputs, desired outcomes, and eventual impacts.
• A more or less explicit program theory specifies the relationship between program efforts and expected results (cf. theory-driven versus utilization-focused evaluation—Chen and critics).
• A logic model helps specify what to measure in an evaluation.
• It guides assessment of underlying assumptions.
• It allows for stakeholder consultation and corrective action, and for "telling the performance story."
Welfare-to-Work Logic Model
Columns: Inputs ($ and FTE for each strategy) → Activities/Outputs → Intermediate Outcomes → End Outcomes
• Goal: Increase self-sufficiency in the community through increased employment. Measures: decrease in the welfare ratio ($ paid to # of clients); decrease in unemployment (unemployment rate overall and for clients); increase in self-sufficiency (% of community and % of clients achieving a self-sufficient wage).
• Strategy 1: Improve hard skills of clients to reflect hiring needs of the economy. Activities: # of training courses held; # of training methodologies developed; # of employer surveys completed; # of training promotional kits deployed; # of career counseling sessions provided; # of employers offering continuing-education assistance. Outputs: # of clients trained for standard employment; # of clients trained for, or completing a degree in, a high-wage employment area. Intermediate outcomes: increase in % of clients with adequate hard skills for standard employment; increase in % of clients completing continuing education for high-wage career advancement.
• Strategy 2: Improve the soft skills of clients to aid in job placement and retention (increase in % of clients with appropriate soft skills).
• Strategy 3: Reduce substance abuse and mental health barriers (decrease in % of clients with substance abuse or mental health barriers).
• Strategy 4: Enhance access to day care (decrease in % of clients without day care access).
• Strategy 5: Enhance access to transportation (decrease in % of clients without transportation).
• Strategy 6: Decrease barriers presented by physical disability (increase in % of employers offering an "integrative" workplace for people with disabilities; decrease in % of clients with a physical disability preventing employment).
• External factors: # of jobs created in the economy annually; % of jobs created with self-sufficient income potential.
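For readers who find it easier to see a logic model's anatomy in structured form, the sketch below shows, in Python, one minimal way the elements of a model like the welfare-to-work example could be captured as data. This is only an illustrative aid under stated assumptions: the class and field names (LogicModel, Strategy, and so on) are hypothetical and are not part of Chen's framework, the course materials, or any evaluation toolkit.

```python
# Illustrative sketch only: a hypothetical data representation of a logic
# model's elements (goal, strategies, activities, outputs, outcomes,
# external factors). Names and fields are assumptions for teaching purposes.
from dataclasses import dataclass, field
from typing import List


@dataclass
class Strategy:
    name: str
    activities: List[str] = field(default_factory=list)             # what the program does
    outputs: List[str] = field(default_factory=list)                # direct products of activities
    intermediate_outcomes: List[str] = field(default_factory=list)  # near-term changes sought


@dataclass
class LogicModel:
    goal: str
    end_outcome_measures: List[str] = field(default_factory=list)
    external_factors: List[str] = field(default_factory=list)
    strategies: List[Strategy] = field(default_factory=list)

    def describe(self) -> str:
        """Return a one-line-per-element summary, e.g., for a planning memo."""
        lines = [f"Goal: {self.goal}"]
        for s in self.strategies:
            lines.append(f"  Strategy: {s.name}")
            lines += [f"    Activity: {a}" for a in s.activities]
            lines += [f"    Output: {o}" for o in s.outputs]
            lines += [f"    Intermediate outcome: {i}" for i in s.intermediate_outcomes]
        lines += [f"End-outcome measure: {m}" for m in self.end_outcome_measures]
        lines += [f"External factor: {x}" for x in self.external_factors]
        return "\n".join(lines)


# Example drawn loosely from the welfare-to-work model above.
model = LogicModel(
    goal="Increase self-sufficiency in the community through increased employment",
    end_outcome_measures=["Decrease in unemployment rate for clients",
                          "% of clients achieving a self-sufficient wage"],
    external_factors=["# of jobs created in the economy annually"],
    strategies=[Strategy(
        name="Improve hard skills of clients",
        activities=["# of training courses held"],
        outputs=["# of clients trained for standard employment"],
        intermediate_outcomes=["Increase in % of clients with adequate hard skills"],
    )],
)
print(model.describe())
```

The point of the sketch is simply that a logic model is a structured chain from inputs through activities and outputs to outcomes, with external factors noted alongside; any comparable outline or spreadsheet would serve the same purpose.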
Chen: Programs and Fidelity
• Implementation and evaluating fidelity
• Context for evaluating fidelity—it may become evident that the program has strayed from its design, but for good reasons, making for better outcomes.
• Considerations for conceptualizing fidelity: the multilevel nature of many interventions; level and intensity of measurement aligned with need; capacity for monitoring fidelity; burden of monitoring fidelity; alignment with desired outcomes
'Evaluation Planning Incorporating Context'
Debra Holden, RTI International, and Marc Zimmerman, Univ. of Michigan, A Practical Guide to Program Evaluation Planning (Sage, 2008).
In Chapters 1 and 2, the authors provide a step-by-step process to guide evaluators as they begin developing an evaluation plan in various settings. The stress is therefore on planning evaluations, not program planning. The authors define evaluation planning as the initial or background processes that go into the final design and implementation of a program evaluation. Their five-step conceptual framework, called Evaluation Planning Incorporating Context (EPIC), includes: (1) assessing context (e.g., stating the purpose of the evaluation); (2) gathering "reconnaissance" (e.g., determining the evaluation's uses); (3) engaging stakeholders (e.g., ensuring stakeholders' buy-in); (4) describing the program (including its theoretical underpinnings); and (5) focusing the evaluation (e.g., assessing the feasibility of proposed measures).
Evaluation Planning Incorporating Context (EPIC) Model Overview
The EPIC model provides a heuristic (a flexible set of guidelines) for evaluation planning rather than a specified set of steps required for all evaluations. Some parts of the model may be more or less applicable depending on such issues as the type of evaluation, the setting of the evaluation, the outcomes of interest, and the sponsor's interests. Thus, the EPIC model can be used as a kind of instruction guide in preparing for a program evaluation. It is a model of evaluation plan preparation, as distinguished from program planning. Several evaluation plans will be posted to the class webpage for your review.
Chapter 3. Planning for an Education Evaluation
Julie Marshall, University of Colorado Denver. This chapter describes contextual factors identified in planning the evaluation of a school-based nutrition curriculum in a rural, low-income community. Curriculum delivery included hands-on food preparation, cooperative learning, and activities tied to content standards in math, literacy, and science. The evaluation focused on understanding long-term curriculum effectiveness and the factors that influence curriculum adoption and delivery. Evaluation planning considered local and state stakeholders, the stress surrounding high-stakes testing, and the burden placed on schools for health-related activities. Including teachers on the evaluation team was critical for framing evaluation questions, for illuminating the context within which teachers and students operate (which may modify curriculum delivery and impact), and for developing evaluation tools.
Chapter 4. Planning for a Service Program Evaluation
Mari Millery, Columbia University. In this case study, the EPIC model is applied to describe the planning of a study at the Leukemia & Lymphoma Society's Information Resource Center (LLS IRC), which responds to nearly 80,000 telephone inquiries a year from cancer patients and their family members. The study focused on a patient navigation intervention consisting of follow-up calls the LLS IRC made to its clients. The case study describes each planning step and discusses lessons learned. Issues of particular importance to service programs are highlighted, including the complexity of context, the importance of stakeholders, process versus outcome evaluation, and the use of tools, conceptual frameworks, and evaluation research concepts while working with service program stakeholders.
Chapter 5. Call and Response: Developing a Collaborative Evaluation Plan for a Community-Based Program
Thomas Reischl & Susan Franzen, Univ. of Michigan. Planning an evaluation of a new community-based program required successful partnerships with the project's coordinating agency and other community-based organizations. The case study describes these relationships and the role the evaluators played in developing and assessing the new program. The evaluators adopted a "responsive predisposition," focusing plan development on key issues, problems, and concerns experienced by the program's stakeholders. They also engaged key stakeholders in the evaluation planning and in the implementation of the evaluation study. The authors describe the development of the evaluation plan for a new telephone information and referral service focused on serving African American families and reducing infant mortality among African American mothers. Finally, they discuss the utility of an Evaluation Planning Matrix in focusing the evaluation on process evaluation goals and on an attempted outcome evaluation study using baseline data from a previous study.
Chapter 6. Planning for a Media Evaluation: Case Example of the National Truth Campaign
W. Douglas Evans, George Washington University. Media evaluation is an overarching subject area that includes the study of marketing campaigns intended to promote or change consumer behavior, as well as assessments of educational and entertainment media and of the effects of news media on public discourse and policy. This chapter describes evaluation planning strategies and research methods for health communication and marketing campaigns designed to affect consumer health behavior. Media evaluation is distinct from other forms of program evaluation: it focuses on media effects on healthy behaviors or the avoidance of unhealthy behaviors, as opposed to broad evaluation strategies that cut across multiple venues and approaches. Media evaluations measure four key process and outcome dimensions of campaign effectiveness: (1) exposure and recall, (2) message reactions and receptivity, (3) behavioral determinants, and (4) behavioral outcomes. After describing media evaluation methods, the author describes the evaluation of the "truth" campaign, the largest anti-tobacco media campaign ever conducted in the United States.
The text considers the CDC Program Evaluation Framework (MMWR, 1999, "Framework for Program Evaluation in Public Health"); we will add the CDC Partnership Evaluation model and some network evaluation models as well.