
“Value added” measures of teacher quality: use and policy validity

“Value added” measures of teacher quality: use and policy validity. Sean P. Corcoran, New York University. NYU Abu Dhabi Conference, January 22, 2009. Overview: an introduction to the use of “value added” measures (VAM) of teacher effectiveness, in both research and practice.


Presentation Transcript


  1. “Value added” measures of teacher quality: use and policy validity. Sean P. Corcoran, New York University. NYU Abu Dhabi Conference, January 22, 2009.

  2. Overview. An introduction to the use of “value added” measures (VAM) of teacher effectiveness, in both research and practice. A discussion of the policy validity of VAM, motivated by current work on “teacher effects” on multiple assessments of similar skills. With Jennifer L. Jennings (Columbia U) and Andrew A. Beveridge (Queens College).

  3. What are “value added” measures? Essentially, an indirect estimate of a teacher’s contribution to learning, measured using gains in students’ standardized test scores. What makes them “indirect”? They rely on a statistical model that accounts for certain student characteristics (key: past achievement) and attributes the remaining test score gains to the teacher. Clearly an improvement over test score levels.
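The idea on this slide can be sketched in a few lines. This is a minimal illustration, not the model used in the talk: it assumes a single control (the prior-year score), fits one pooled least-squares line, and treats each teacher's average residual as the "value added". The function name `value_added` and the sample records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def value_added(records):
    """Estimate a crude teacher effect from (teacher_id, prior, current) tuples.

    Step 1: regress current score on prior score, pooling all students.
    Step 2: average each teacher's residuals (actual minus predicted score).
    """
    priors = [r[1] for r in records]
    currents = [r[2] for r in records]
    x_bar, y_bar = mean(priors), mean(currents)

    # Ordinary least squares slope and intercept for current ~ prior
    sxx = sum((x - x_bar) ** 2 for x in priors)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(priors, currents))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar

    # What is left after accounting for past achievement is attributed to the teacher
    residuals = defaultdict(list)
    for teacher, prior, current in records:
        residuals[teacher].append(current - (intercept + slope * prior))

    return {teacher: mean(values) for teacher, values in residuals.items()}


# Made-up scores for two teachers whose students start at the same levels
records = [
    ("A", 40, 52), ("A", 50, 61), ("A", 60, 72),
    ("B", 40, 46), ("B", 50, 57), ("B", 60, 66),
]
effects = value_added(records)
```

In this toy data, both classrooms start from the same prior scores, so the whole difference in predicted-versus-actual results is assigned to the teacher. Operational models typically add further student characteristics and more elaborate estimation.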

  4. What are “value added” measures? Generally, “teacher effects” cannot be separated from “classroom effects” (e.g. two classrooms of similarly situated students, one of which has a particularly disruptive student). It may be possible to improve VAM by using multiple years of results for each teacher. This approach raises a range of additional issues and questions, some of which I will address in a moment.

  5. Growth in VAM. VAM of teacher effectiveness were initially mostly of academic interest. Rivkin et al. (2005): effect size of .10/.11 SD for reading/math. Nye et al. (2004): a shift from the 25th to the 75th percentile of teacher quality increased reading/math by .35/.48 SD.

  6. Growth in VAM. Value added assessment of teachers is becoming widespread practice in the U.S. Houston, Dallas, Denver, Minneapolis, Charlotte. EVAAS. New York City, for now a “development tool” only. The Teacher Data Tool Kit.

  7. Why the sudden interest? A logical extension of school accountability: the movement to collect and publicly report student achievement measures at the school level, in some cases with rewards and sanctions (e.g. NCLB). Common sense appeal (both Obama and McCain supported “pay for performance” for teachers).

  8. Why the sudden interest? Data availability: large longitudinal databases of student performance enabled these calculations. Concurrent advancements in methodology.

  9. Why the sudden interest? Improving our assessment and measurement of teacher quality. Easily observed characteristics of teachers are often poor predictors of classroom achievement (Hanushek and Rivkin 2006). This is especially true of qualifications for which teachers are remunerated (e.g. education, certification, experience).

  10. Issues with VAM (to name a few…). Focus on a narrow measure of educational outcomes: does “the test” adequately reflect our expectations of the educational system (e.g. skill content, short-term vs. long-term benefits)? Validity: assuming “the test” reflects outcomes we care about, is the instrument a valid one? Teaching to the test and test inflation (Koretz 2007): even “good” tests lose validity over time.

  11. Issues with VAM (to name a few…). Modeling for causal inference: how can we be confident that our VAM are providing “good” estimates of the teacher’s true (i.e. causal) contribution to student learning? Students are not randomly assigned to teachers. Dynamic tracking. “Teacher effects” may be context dependent.

  12. Issues with VAM (to name a few…). Precision: estimates of teacher effects are just that, estimates. Each student’s test score gain is a small (and noisy) indicator of teacher effectiveness. Are our estimates precise enough to base personnel decisions on them?
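To make the precision point concrete, here is a rough sketch of how the uncertainty around one teacher's estimate might be gauged. It assumes the teacher effect is a simple mean of student-level residuals and uses a normal-approximation 95% interval, which is crude for a single classroom. The function name `effect_with_interval` and all numbers are made up for illustration.

```python
from math import sqrt
from statistics import mean, stdev


def effect_with_interval(residuals, z=1.96):
    """Return (estimate, lower, upper) for a teacher's mean residual.

    The standard error shrinks with the square root of the number of
    students, which is why pooling several years of results helps.
    """
    n = len(residuals)
    estimate = mean(residuals)
    std_error = stdev(residuals) / sqrt(n)
    return estimate, estimate - z * std_error, estimate + z * std_error


# Hypothetical student residuals (in test-score SD units) for one classroom
one_year = [0.9, -0.6, 0.4, -0.2, 0.7, -0.5, 0.1, 0.3]
# Artificial stand-in for three years of data: the same residuals repeated
three_years = one_year * 3

est1, lo1, hi1 = effect_with_interval(one_year)
est3, lo3, hi3 = effect_with_interval(three_years)
```

With one year of these made-up residuals, the interval spans zero, so the teacher cannot be distinguished from average. Tripling the sample narrows the interval, although real year-to-year data would also add variation that repeating one year's residuals does not capture.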

  13. Issues with VAM (to name a few…). Other: perverse incentives (gaming / cheating); subject dependency; persistence; scaling issues (e.g. ceiling effects); missing data (e.g. absent or exempted students).

  14. The “policy validity” of VAM. Do VAM of teacher effectiveness have “policy validity”? That is, are they appropriate for practical implementation, and for what purposes? (Harris 2007) If one were to make personnel decisions based on VAM, at the very least these measures should be convincing as “causal” estimates and relatively precise.

  15. Our research question. If VAM are meaningful indicators of teacher effectiveness, they should be relatively consistent across alternative assessments of the same skills (especially for narrowly defined skills). In most cases we observe only one assessment, the “high stakes” state assessment, upon which teacher effects are estimated.

  16. Houston. Houston is unusual in that one can observe two measures of student achievement: TAKS, a “high stakes” exam, and the Stanford 10, a “low stakes” exam. Both test reading and math skills. How consistent are VAM of effectiveness on these two tests?

  17. Houston data and method. Longitudinal student-level data on all students in the Houston ISD, 1998-2006 (we use 2003-06). Students are linked to their teachers. The data include student background. About 127,000 students. We estimate teacher effects for 4th and 5th grade teachers on both the TAKS and Stanford tests, using 1 and 3 years of results.
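One simple way to check consistency across the two tests, once a teacher effect has been estimated separately on each, is a Pearson correlation over teachers. The sketch below is an assumption about how such a comparison could be coded, not the authors' procedure. The effect values are invented placeholders, not Houston results.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    x_bar, y_bar = mean(xs), mean(ys)
    cov = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    var_x = sum((x - x_bar) ** 2 for x in xs)
    var_y = sum((y - y_bar) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented teacher effects (student-level SD units), keyed by teacher id
high_stakes = {"t1": 0.20, "t2": -0.10, "t3": 0.05, "t4": -0.15}
low_stakes = {"t1": 0.10, "t2": 0.05, "t3": -0.05, "t4": -0.10}

# Compare only teachers with an estimate on both tests, in a fixed order
teachers = sorted(high_stakes.keys() & low_stakes.keys())
r = pearson(
    [high_stakes[t] for t in teachers],
    [low_stakes[t] for t in teachers],
)
```

A correlation near 1 would mean the two tests rank teachers almost identically. Values well below 1 would indicate that a teacher's estimated effectiveness depends on which test is used.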

  18. Correlation across tests

  19. Teacher effects on multiple tests

  20. Teacher effects on multiple tests (one year of data only)

  21. Teacher effects on multiple subjects

  22. Teacher effect stability

  23. Conclusions. Teachers who are good at promoting growth on a high-stakes test are not necessarily those who are good at promoting growth on a low-stakes test of the same subject. Teacher effects vary significantly across years and subjects. Useful for policy? Probably, but we should resist relying too heavily on these measures. Of course, more research is needed!
