Evaluation Policy & Monitoring: State and Impact Analysis

What is the Current State of Evaluation Policy and Methodologies for Monitoring Program Performance? OR GPRA & PART: Through a Glass Darkly
Irwin Feller, Senior Visiting Scientist, American Association for the Advancement of Science
WREN Workshop, Washington, DC, June 6, 2008

Evaluation Modalities

“Here Comes Performance Assessment-and It Might Even be Good for You” (Behn, 1994) • Having objectives (“knowing where you want to go”) is helpful • Objectives provide a useful baseline for assessing each of 4 modalities of accountability (finance; equity; use of power; performance) • Well-defined objectives and documentation of results facilitate communication with funders, performers, users, and others • R&D managers ought to devise their own performance measures

R&D Investment Criteria • Consistent with President’s priorities • Focus on activities that require a federal presence to attain national goals • Maximize quality of the research process and efficiency of public r&d programs through use of competitive, merit-based processes where appropriate (Exceptions must be justified) • Reduce or eliminate funding for programs that have completed their mission or that are redundant or obsolete

Research Agency Officials Question White House’s Review of Basic Science • “The cure for cancer can’t be compared to the delivery of a FedEx package, but right now it’s being put in the same mold” • “Science can’t tell you about what the results and outcomes will be in the time frame they want” D. Duran, WREN Workshop, Chronicle of Higher Education, 12/19/2003, p. A25

FY 2008 R&D Investment Budget Priorities In general, the Administration favors Federal R&D Investments that: • Advance fundamental scientific discovery;… • Support high-leverage basic research to spur technological innovation, economic competitiveness and new job growth;… • Maximize the efficiency and effectiveness of the S&T enterprise through expansion of competitive, merit-based peer review processes and phase-out of programs that are only marginally productive or are not important to an agency’s mission…

PART: Empirical Estimates • Effects on small programs may be large, as high as a 20% change (over the previous year's budget) • Weighted by program size, the effect is 3% • Findings relate to all surveyed programs, not specifically r&d programs (Olsen and Levy, 2004)
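
The gap between a large effect on small programs and a small size-weighted effect can be illustrated with a short sketch. The program names, budgets, and percentage changes below are invented for illustration and are not taken from Olsen and Levy (2004). The point is only that weighting each program's effect by its budget lets a few large programs dominate the average.

```python
# Hypothetical illustration only: all names and numbers below are invented
# and do not come from Olsen and Levy (2004).
programs = [
    # (name, prior-year budget in $ millions, % budget change attributed to PART)
    ("small_a", 10.0, 20.0),
    ("small_b", 25.0, 15.0),
    ("large_c", 2000.0, 2.0),
    ("large_d", 5000.0, 3.0),
]


def unweighted_mean_effect(rows):
    """Simple average of percentage effects; every program counts equally."""
    return sum(effect for _, _, effect in rows) / len(rows)


def size_weighted_mean_effect(rows):
    """Average of percentage effects weighted by prior-year budget."""
    total_budget = sum(budget for _, budget, _ in rows)
    return sum(budget * effect for _, budget, effect in rows) / total_budget


if __name__ == "__main__":
    print("unweighted mean effect (%):", round(unweighted_mean_effect(programs), 2))
    print("size-weighted mean effect (%):", round(size_weighted_mean_effect(programs), 2))
```

With these made-up inputs the unweighted mean is several times the size-weighted mean, which is the same qualitative pattern the slide reports.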

R&D Performance Metrics (for Basic Research)? Justifying its recommendation that the US act to expand its investments in particle physics research: • “The committee affirms the intrinsic value of elementary particle physics as part of the broader scientific and technological enterprise and identifies it as a key priority within the physical sciences” • National Research Council, Revealing the Hidden Nature of Space and Time

Evaluating the Evaluators “I know nothing of the licenser, but that I have his own hand here for his arrogance; who shall warrant me his judgment?” (Milton, Areopagitica, 1644)

How to Evaluate • Input Additionality? • Output Additionality? • Behavioral Additionality?

Counterfactual History What would the national s&t enterprise have looked like, or how would it have performed, between FY1994 and FY2008 in the absence of GPRA and/or PART?

Input Additionality? What impacts have GPRA/PART had on: • total level of Federal r&d? • allocation of r&d by functional fields/programs? • allocation of r&d by agencies?

Dominant Trends in Executive Budget Proposals for Science and Technology • Initial increases, then steady 5-year decline in real terms (FY2009 down by 9.1% from FY2004; AAAS) • Strong support for basic research • Discontinuous priority shifts (between biomedical and physical sciences) • NIH Roller Coaster • COMPETES (NSF, NIST, DOE Office of Science) • Erosion of funding for environmental/climate/“regulatory”-related research across agencies • Longstanding, manifest antipathy to civilian technology programs (ATP/MEP) • Strong support for competitive, merit-based allocations

Output Additionality? What effects have GPRA/PART had on the outputs generated from federally funded r&d with respect to: • r&d priorities (feedback loops); • r&d strategies; • effectiveness/efficiency; • choice of performers?

R&D Performance Metrics

Behavioral Additionality? What effects have GPRA/PART had on: • agency r&d priorities • r&d management strategies • interactions with OMB • relationships with performers • relationships with Congress/other intra-agency units/“stakeholders”/constituencies?

Evaluation in the PART Process Evaluation—evidence that the program has independent and quality evaluations indicating effectiveness and achievement of results—constitutes only one portion of the PART score

Feller (Revised) Theorem on Use of Evaluation/Evidence in Policy Making • Tolstoy: “Doing good will not make you happy but doing bad will surely make you unhappy”. • Feller: “A good (well-done) evaluation showing bad (program) results may or may not kill a program. A good (well-done) evaluation showing good (program) results may or may not save a program”.

PART’s Method(s) for Evaluating R&D Programs • Extensive reliance on National Research Council “expert” assessments of basic research programs • Familiarity with relevant published evaluations • Few self-initiated independent assessments or methodological innovations
