Seattle’s Teacher Evaluation Reform in Context
Two questions • How does PG&E’s design compare to evaluation systems nationwide? • What can SPS learn about implementing PG&E from similar efforts in other districts?
Today’s briefing on implementation analysis • Approach • Findings • Implications • Discussion
Our approach • Reviewed empirical studies on implementation of PG&E-like reforms • Looked for evidence on what districts are actually doing, not what they should be doing
Presentation includes information from studies on: • Chicago (2), Denver, Washington, D.C., Coventry, RI, Washoe County (Reno), NV, and Cincinnati • The Measures of Effective Teaching (MET) project and the Teacher Advancement Program (TAP)
What the research base covers • Studies focus on • Implementation dynamics and fidelity • Validity and reliability of performance ratings • But generally offer no evidence on • Effects on the teacher workforce or classroom practice • Effects on student learning
Four key findings • Evaluation reforms can expose problems in other district-wide systems • Teachers and principals often struggle with understanding and carrying out the reforms • Observation-based ratings can identify “effective” teachers, but there’s room to improve • Observation-based ratings are more reliable when based on multiple observations
Reforms expose problems in other district-wide systems • The reform’s focus on teaching highlights misalignments in instructional and operational systems • Are PD and curriculum aligned with instructional frameworks and assessments? • Are training, hiring, and payroll aligned in HR? • Do data systems speak to each other (e.g., compensation and evaluation)?
People struggle to understand and implement the reforms • Teachers struggle to understand the structure and purpose of new evaluation systems • Especially financial incentives • Principals struggle to work with teachers to improve teaching practice • Most training focuses on calibrating observations and ratings • Time constraints are a big issue
Observation-based ratings “work,” but could be better • Teachers who do well on observation ratings also tend to have higher value-added (VAM) scores • Ratings are better at identifying “effective” teachers when combined with other measures
Combining measures adds predictive power (Kane & Staiger, 2012, p.9)
More observations = more reliable (Kane & Staiger, 2012, p.37)
Implications • Ensure district improvement initiatives complement and support PG&E implementation • e.g., the work of EDs, HR, C&I, and DoTS • Assess how well people understand PG&E and redouble communication efforts
Implications (cont’d) • Train principals not only in conducting observations and assigning ratings but also in working with teachers to improve practice • Place a premium on hiring and developing leadership talent • Create a systematic process to monitor the reliability and validity of PG&E evaluations • Double-rating observations • Comparing ratings to VAM scores
Inclusion criteria
• Research must be on programs with teacher evaluation systems, not simply pay reform systems
• Research must evaluate a domestic reform at the district, county, or state level
• The study must examine student outcomes, instructional practice, or effects on staffing (recruitment, retention, dismissal)
• Studies must clearly state the methodology, the research sample, and the sources of data used
• Authors must explain and justify how their samples were constructed (i.e., reports must not simply use convenience samples)
• The study must include quantitative or qualitative data representing reform outcomes throughout the geographic area of implementation
• The research must compare measured outcomes to a control group, the school’s past performance, or both
Studies in review
• Milanowski, A.T. (2004). The relationship between teacher performance evaluation scores and student achievement: Evidence from Cincinnati. Peabody Journal of Education, 79(4), 33-53.
• Proctor, D., Walters, B., Reichardt, R., Goldhaber, D., & Walch, J. (2011). Making a difference in education reform: ProComp external evaluation report 2006-2010. University of Colorado Denver Center for Education Data and Research.
• Sartain, L., Stoelinga, S.R., & Brown, E.R. (2011). Rethinking teacher evaluation in Chicago: Lessons learned from classroom observations, principal-teacher conferences, and district implementation. Consortium on Chicago School Research at the University of Chicago Urban Education Institute.
• Curtis, R. (2011). District of Columbia Public Schools: Defining instructional expectations and aligning accountability and support. Washington, D.C.: The Aspen Institute.
• Glazerman, S., & Seifullah, A. (2012). An evaluation of the Chicago Teacher Advancement Program (Chicago TAP) after four years. Washington, D.C.: Mathematica Policy Research.
• Kane, T.J., & Staiger, D.O. (2012). Gathering feedback for teaching: Combining high-quality observations with student surveys and achievement gains. Seattle, WA: Bill & Melinda Gates Foundation.
• Kimball, S.M., White, B., Milanowski, A.T., & Borman, G. (2004). Examining the relationship between teacher evaluation and student assessment results in Washoe County. Peabody Journal of Education, 79(4), 54-79.
• Springer, M.G. (2008). Impact of the Teacher Advancement Program on student test score gains: Findings from an independent appraisal. National Center on Performance Incentives, Peabody College of Vanderbilt University. Retrieved from http://www.performanceincentives.org/data/files/news/PapersNews/200819_Springer_ImpactAdvancedProg1.pdf
• White, B. (2004). The relationship between teacher evaluation scores and student achievement: Evidence from Coventry, RI. Madison, WI: Consortium for Policy Research in Education.