
Establishing Successful Project Evaluation for NSF Funded Programs


Presentation Transcript


  1. Establishing Successful Project Evaluation for NSF Funded Programs Mack Shelley Iowa State University mshelley@iastate.edu Prepared for presentation at the CESMEE STEM Education Collaboration Coffee March 5, 2012

  2. Your Presenter • Statistics and Political Science, Iowa State University • Currently, Director of the Public Policy and Administration program in Political Science • Research Institute for Studies in Education (RISE), Iowa State University; Coordinator of Research, 1999-2003; Director, 2003-2007 • 35 years of experience with consulting and evaluation activity • PI, co-PI, evaluator, consultant on federal, state, and other grants and contracts • National Science Foundation • STEP (SEEC project; E2020) [STEM] • Industry/University Cooperative Research Centers • Integrative Graduate Education and Research Traineeship • Also U.S. Department of Education, U.S. Department of Health and Human Services, various state agencies

  3. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • In the U.S. and many other countries, research and evaluation funding from government and other sources has become tied more closely to use of the “medical model” of randomized clinical trials (RCTs) • randomized assignment of subjects to treatment or control groups • fidelity of treatment effects • consistent measurement of well-defined outcomes

  4. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • This focus on RCT-style interventions has been emphasized particularly in the requirements for research and evaluation in education and other human sciences areas. • This leads to a greater need for careful attention to evaluation and research requirements by content experts in education, and for evaluation and methodology experts to be willing and available to partner in joint efforts. • NSF’s The 2010 User-Friendly Handbook for Project Evaluation http://www.westat.com/pdf/projects/2010ufhb.pdf

  5. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • Standards for acceptable research and evaluation by federal funders, particularly in the area of education, have been affected greatly by: • the No Child Left Behind Act of 2001, Public Law 107–110 [H.R. 1], passed on January 8, 2002 • the Education Sciences Reform Act (H.R. 3801), passed January 23, 2002 • creation of the Institute of Education Sciences (IES) in the U.S. Department of Education http://www.ed.gov/about/offices/list/ies/index.html

  6. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • Together, these developments • have reconstituted federal support for research, evaluation and dissemination of information in education • are meant to foster “scientifically valid research,” and • have established what is referred to as the “gold standard” for research in education • These and other developments denote that greater emphasis in fundable education research now is placed on • quantification, • the use of randomized trials, and • the selection of valid control groups

  7. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • In HR 3801, “scientifically valid education evaluation” • adheres to the highest possible standards of quality with respect to research design and statistical analysis • examines the relationship between program implementation and program impacts • provides an analysis of the results achieved by the program with respect to its projected effects • employs experimental designs using random assignment when feasible, and other research methodologies that allow for the strongest possible causal inferences when random assignment is not feasible • may study program implementation through a combination of scientifically valid and reliable methods

  8. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • Evaluating whether an intervention is backed by “strong” evidence of effectiveness hinges on • well-designed and well-implemented randomized controlled trials • demonstrating there are no systematic differences between intervention and control groups before the intervention • measures and instruments of proven validity • “real-world” objective measures of the outcomes the intervention is designed to affect

  9. The Need for Rigorous Evaluation: NCLB, The Education Sciences Reform Act and Beyond • Whether an intervention is backed by “strong” evidence of effectiveness also hinges on • attrition of no more than 25% of the original sample • effect size, as well as p-values • W.T. Grant Foundation’s Optimal Design software http://sitemaker.umich.edu/group-based/optimal_design_software • Russ Lenth’s (U. of Iowa) Java applets for power and sample size http://www.cs.uiowa.edu/~rlenth/Power/ • adequate sample size to achieve statistical significance • controlled trials implemented in more than one site representing a cross-section of all schools
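Alongside the Optimal Design and power/sample-size tools listed above, a rough power calculation can also be scripted directly. The sketch below is a minimal illustration in Python using the statsmodels package (an assumption; the slides do not name this tool): it solves for the per-group sample size needed to detect a medium effect in a two-group comparison.

```python
# Minimal power-analysis sketch (assumes the statsmodels package is available).
# Solves for the per-group n a two-sample t-test needs to detect a medium
# effect (Cohen's d = 0.5) with 80% power at alpha = .05, two-sided.
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()
n_per_group = analysis.solve_power(effect_size=0.5, alpha=0.05, power=0.80)
print(f"Required sample size per group: {n_per_group:.1f}")  # roughly 64 per group
```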

  10. Logic Models: A Tool for Planning Evaluations • Purposeful Planning for the Future • [Thanks to University of Wisconsin Extension] • Use logic models to help guide purposeful activity planning for each of the grant objectives. • Logic models provide a process for linking activities to outcomes (and, in turn, to evaluation). • Logic models are becoming more prevalent in grant proposal submissions and grant evaluations.

  11. Logic Models: A Tool for Planning Evaluations A logic model is an advance organizer used to help design evaluation and performance measurement, including: • a model of how the program works • evaluation questions • key performance measures • an outline of the story to be told in the evaluation report • a shared understanding among program and evaluation staff of what is important

  12. Logic Models: A Tool for Planning Evaluations • What's the benefit of using logic models? • Focus on and be accountable for what matters – OUTCOMES • Provides a common language • Supports continuous improvement • Promotes communication • Makes assumptions EXPLICIT Assumptions underlie much of what we do. It is often these underlying assumptions that hinder success or produce less-than-expected results. One benefit of logic modeling is that it helps us make our assumptions explicit.

  13. Logic Models: A Tool for Planning Evaluations The basic logic model chain: (1) Resources → (2) Activities → (3) Outputs → (4) Outcomes → (5) Impact. Resources and Activities are Your Planned Work; Outputs, Outcomes, and Impact are Your Intended Results. Guiding questions: Where are you going? How will you get there? What will show that you've arrived?

  14. Logic Models: A Tool for Planning Evaluations • Resources include: people, time, materials, funds… dedicated to or consumed by the program. Resources can often be referred to as inputs. • Program Activities are: what the program does with the resources to achieve desired results. The processes, tools, events, technology, and actions are the intentional part of the program implementation.

  15. Logic Models: A Tool for Planning Evaluations • Outputs are: the direct products of program activities; they may include types, levels, and targets of services to be delivered. • Outcomes are: the changes expected to result from a program: changes among participants, clients, communities, systems, or organizations (short-term, 1-3 yrs; long-term, 4-6 yrs). • Impact is: the fundamental intended or unintended change occurring in organizations, communities, or systems as a result of program activities within 7-10 years.

  16. Logic Models: A Tool for Planning Evaluations Underlying a logic model is a series of if-then relationships: IF we invest time, effort, and money, THEN we can provide advising 10 hrs/week for 50 students; IF we provide that advising, THEN students struggling academically can be advised; IF they are advised, THEN they will learn better and improve their skills; IF they learn better, THEN they will get better grades; IF they get better grades, THEN the number of students who are retained and graduate will increase.

  17. Logic Models: A Tool for Planning Evaluations INPUTS (program investments: what we invest) → OUTPUTS (activities: what we do; participation: who we reach) → OUTCOMES (short, medium, and long-term: what results), with feedback loops and multiple dimensions.

  18. Logic Model Planning: GEEMaP IGERT Project at the University of Iowa • http://www.geemap.stat.uiowa.edu/

  19. Simpson’s Paradox OK, not those Simpsons

  20. Simpson’s Paradox (the real one) • Edward H. Simpson (1951) • The successes of groups seem reversed when the groups are combined. • This result is often encountered in social and medical science statistics. • Occurs when a third (weighting) variable, which is irrelevant to the individual group assessment, must be used in the combined assessment. • The effect appears paradoxical only because of the statistical interpreter's tendency to causal interpretation of proportional changes. • Counterintuitive results come from inferring causal relationships based on the association between two variables. • The issue is “lurking variables”

  21. Simpson’s Paradox (an example) • To help consumers make informed decisions about health care, the government releases “report card” data about patient outcomes in hospitals. Here is a two-way table of data on the survival of patients after surgery in two hospitals, Hospital A and Hospital B. All patients undergoing surgery in a recent time period are included. “Survived” means that the patient lived at least 6 weeks following surgery.

  22. Simpson’s Paradox (an example) • This is not a hypothetical situation. • These “report cards” on comparative hospital death rates are exactly what is compiled and released by the U.S. Health Care Financing Administration (HCFA; in the U.S. Department of Health and Human Services).

  23. Simpson’s Paradox (an example)
                Hospital A    Hospital B
      Died        63 (3%)       16 (2%)
      Survived    2037          784
      Total       2100          800
  So, with the fatality rate lower for Hospital B, that’s where you want to go for surgery, right? Well …

  24. Simpson’s Paradox (an example) • Not all surgery cases are equally serious. Later in the government report card you find data on the outcome of surgery broken down by the condition of the patient before the operation. Patients are classified as being in either “poor” or “good” condition. Here are the more detailed data.

  25. Simpson’s Paradox (an example)
                   Good condition             Poor condition
                Hospital A   Hospital B    Hospital A   Hospital B
      Died        6 (1%)       8 (1.3%)     57 (3.8%)     8 (4%)
      Survived    594          592          1443          192
      Total       600          600          1500          200
  So, in fact Hospital A is safer for patients in both good and poor condition. If you are facing surgery, you should choose Hospital A. [This example is drawn from David Moore and George McCabe, Introduction to the Practice of Statistics (3rd ed.)]
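To make the reversal concrete, the counts in the tables above can be recomputed directly; the sketch below uses Python with pandas purely for illustration (no software is specified in the slides).

```python
# Sketch: recompute hospital death rates overall and within patient condition,
# using the counts from the tables above (pandas assumed).
import pandas as pd

df = pd.DataFrame({
    "hospital":  ["A", "A", "B", "B"],
    "condition": ["good", "poor", "good", "poor"],
    "died":      [6, 57, 8, 8],
    "total":     [600, 1500, 600, 200],
})

# Aggregated over condition: Hospital B looks safer (2.0% vs. 3.0%).
overall = df.groupby("hospital")[["died", "total"]].sum()
print((overall["died"] / overall["total"]).round(3))

# Stratified by condition: Hospital A has the lower death rate in BOTH groups.
by_condition = df.set_index(["condition", "hospital"])
print((by_condition["died"] / by_condition["total"]).round(3))
```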

  26. Simpson’s Paradox (an example) • The patient’s condition is a lurking variable when we compare the death rates at the two hospitals. • When we ignore the lurking variable, Hospital B seems safer, even though Hospital A does better for both classes of patients. • How can this be?

  27. Simpson’s Paradox (an example) • Looking at the data, Hospital A attracts seriously ill patients from a wide region; 1500 were in poor condition, whereas Hospital B had only 200 in poor condition. • Patients in poor condition are more likely to die, so Hospital A has a higher death rate despite its superior performance for each class of patients. • HCFA’s report card actually does include information on patients’ condition and diagnosis, but simply looking at the mortality rates can be misleading.

  28. A Possible Template for STEM Research and Evaluation? • Ways to Extend the Reach of STEM Evaluation and Research • Data mining • Experimental and quasi-experimental designs of all types • Synthesize multiple studies (meta-analysis) • Longitudinal studies of long-term effects • Growth-curve models (within-person designs) • Structural equation modeling (SEM) • Hierarchical linear modeling (HLM)

  29. Confirmatory Factor Analysis (a Type of SEM) from Program for International Student Assessment (PISA) 2003 Mathematics Data [Path diagram shown on slide.] INTMAT = interested in and enjoy mathematics; INSTMOT = usefulness for career

  30. Hierarchical Linear Model from PISA data Level-1 model (student); Level-2 model (school). [Model equations shown on slide.]
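The slide's own equations do not survive in this transcript. As a general reference only (not necessarily the exact PISA specification on the slide), a two-level model with students nested in schools can be written as:

```latex
% Generic two-level model: students (i) nested in schools (j).
% X_{ij} is a student-level predictor; W_j is a school-level predictor.
\begin{align*}
\text{Level 1 (student):} \quad Y_{ij} &= \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \\
\text{Level 2 (school):}  \quad \beta_{0j} &= \gamma_{00} + \gamma_{01} W_j + u_{0j} \\
                                \beta_{1j} &= \gamma_{10} + \gamma_{11} W_j + u_{1j}
\end{align*}
```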

  31. Individual Growth Curves • Limitations of traditional repeated-measures analysis • Restrictive assumptions (e.g., sphericity) often not met • Ignores individual growth trajectories • Has difficulty dealing with missing data and inconsistent time periods • Growth curve modeling • Models change in an outcome (Y) as a function of time • Can estimate both rate and pattern of growth

  32. Individual Growth Curves Level-1 Model: $Y_{ti} = \pi_{0i} + \pi_{1i} a_{ti} + \pi_{2i} a_{ti}^2 + e_{ti}$ Level-2 Model: • $\pi_{0i} = \beta_{00} + \beta_{01} X_i + r_{0i}$ (predictors of baseline performance) • $\pi_{1i} = \beta_{10} + \beta_{11} X_i + r_{1i}$ (predictors of initial learning rate) • $\pi_{2i} = \beta_{20} + \beta_{21} X_i + r_{2i}$ (predictors of acceleration)
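As a minimal sketch of how a quadratic growth model of this kind might be fit, the code below simulates long-format data and uses Python's statsmodels mixed-effects routine; the data, variable names, and software are illustrative assumptions, not part of the slide.

```python
# Sketch: fit a quadratic individual growth curve (random intercept + slope).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated long-format data: 50 students, each measured at 4 time points.
rng = np.random.default_rng(0)
n_students, n_times = 50, 4
df = pd.DataFrame({
    "student": np.repeat(np.arange(n_students), n_times),
    "time": np.tile(np.arange(n_times), n_students),
})
baselines = rng.normal(50.0, 5.0, n_students)   # student-specific starting points
rates = rng.normal(2.0, 0.5, n_students)        # student-specific learning rates
df["Y"] = (baselines[df["student"]] + rates[df["student"]] * df["time"]
           - 0.1 * df["time"] ** 2               # mild deceleration (quadratic term)
           + rng.normal(0.0, 1.0, len(df)))

# Fixed effects estimate baseline, rate, and acceleration; the random part lets
# each student have their own intercept and linear slope (cf. the level-2 equations).
model = smf.mixedlm("Y ~ time + I(time ** 2)", data=df,
                    groups=df["student"], re_formula="~time")
result = model.fit()
print(result.summary())
```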

  33. How to Model the Time Variable Is a Major Methodological Issue • For example, centering the time variable can dramatically change the interpretation of lower-order coefficients • [The slide's diagram shows what growth curves might look like for 3 students]
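To spell out the centering point in general terms (an illustration added here, not the slide's specific example): if time is centered at a constant c, the lower-order coefficients are interpreted at that time point.

```latex
% Centering time at c: the intercept and linear term are now evaluated at a_{ti} = c.
\begin{align*}
Y_{ti} &= \pi_{0i} + \pi_{1i}\,(a_{ti} - c) + \pi_{2i}\,(a_{ti} - c)^{2} + e_{ti} \\
\pi_{0i} &: \ \text{expected outcome for person } i \text{ at time } a_{ti} = c \\
\pi_{1i} &: \ \text{instantaneous rate of change at } a_{ti} = c \\
\pi_{2i} &: \ \text{curvature (acceleration), which is unaffected by the choice of } c
\end{align*}
```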

  34. Questions? • Comments?
