Teacher Effectiveness Measurement: Some Whys and Hows Amy McIntosh, Senior Fellow, Regents Research Fund November 1, 2012 All materials from research studies described here are reprinted with permission of the authors.
Agenda Discussion of new research studies that confirm: • Teacher effectiveness matters • To improve it, NY needs to measure it using multiple measures How we are measuring teacher effectiveness in NYS • And how you can help
Study Number One: Long-term Impacts of Teachers The Long-Term Impacts of Teachers: Teacher Value-Added and Student Outcomes in Adulthood (Chetty, Friedman & Rockoff). http://obs.rc.fas.harvard.edu/chetty/value_added.html Study data: • 2.5 million children followed from childhood to early adulthood in one large district • Teacher/course linkages and test scores in grades 3-8 from 1991-2009 • US government tax data from W-2s on both parents and students • About parents: household income, retirement savings, home ownership, marriage, age when the student was born • About students up to age 28: teen birth, college attendance, earnings, neighborhood “quality”
What is “teacher value-added”? • A statistical measure of the growth of a teacher’s students that takes into account the differences in students across classrooms that school systems can measure but teachers cannot control. • Researchers using “value-added” are measuring growth compared to the average growth of similar students, where “similar” includes student, classroom, and school characteristics.
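To make the comparison concrete, here is a minimal sketch of the value-added idea using made-up growth numbers. The function name and data are illustrative only; the researchers’ actual models use regression adjustment for student, classroom, and school characteristics rather than this direct subtraction.

```python
# A minimal sketch of the value-added idea, with made-up numbers: a
# teacher's value-added is her students' average growth relative to the
# average growth of observably similar students.

def value_added(teacher_growth, similar_growth):
    """Average growth of this teacher's students minus the average
    growth of comparable students district-wide."""
    teacher_avg = sum(teacher_growth) / len(teacher_growth)
    baseline_avg = sum(similar_growth) / len(similar_growth)
    return teacher_avg - baseline_avg

# This teacher's students grew 12 points on average while similar
# students grew 9, for a value-added of +3 score points.
print(value_added([10, 14, 12], [8, 9, 10]))  # 3.0
```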
Key Finding: Teacher effectiveness matters Having a higher value-added teacher for even one year in grades 4-8 has substantial positive long-term impacts on a student’s life outcomes, including: • Likelihood of attending college (UP 1.25%) • Likelihood of teen pregnancy (DOWN 1.25%) • Salary earned in lifetime (UP $25K per average student) • Neighborhood (more college grads live there) • Retirement savings (UP)
Study Number Two: Measures of Effective Teaching http://www.metproject.org
Study Number Two: Measures of Effective Teaching Unique project in many ways: • in the variety of indicators tested: 5 instruments for classroom observations; student surveys (Tripod Survey); value-added on state tests • in its scale: 3,000 teachers; 22,500 observation scores (7,500 lesson videos x 3 scores); 900+ trained observers; 44,500 students completing surveys and supplemental assessments • and in the variety of student outcomes studied: gains on state math and ELA tests; gains on supplemental tests (BAM & SAT9 OE); student-reported outcomes (effort and enjoyment in class)
Dynamic Trio: Measures have different strengths …and weaknesses [Table: each of the three measures rated High, Medium, or Low across different criteria.]
Framework for Teaching (Danielson): Four Steps
Student Feedback: related to student learning gains Student survey items with the strongest relationship to middle school math gains, by rank: 1. Students in this class treat the teacher with respect 2. My classmates behave the way my teacher wants them to 3. Our class stays busy and doesn’t waste time 4. In this class, we learn a lot every day 5. In this class, we learn to correct our mistakes Student survey items with the weakest relationship to middle school math gains: 38. I have learned a lot this year about [the state test] 39. Getting ready for [the state test] takes a lot of time in our class Note: Sorted by absolute value of correlation with student achievement gains. Drawn from “Learning about Teaching: Initial Findings from the Measures of Effective Teaching Project”. For a list of Tripod survey questions, see Appendix Table 1 in the Research Report.
Combining Observations with other measures improved predictive power [Chart: predictive power of the “dynamic trio” of combined measures.]
Key Finding: Use multiple measures • All the observation rubrics are positively associated with student achievement gains • Using multiple observations per teacher is VERY important (and ideally multiple observers) • The student feedback survey tested is ALSO positively associated with student achievement gains • Combining observation measures, student feedback, and value-added growth results on State tests was more reliable, and a better predictor of a teacher’s value-added on State tests with a different cohort of students, than: • any measure alone • graduate degrees • years of teaching experience • Combining measures is also a strong predictor of student performance on other kinds of student tests.
Key Points about NYS Growth Measures • We are measuring student growth, not achievement • Growth measures allow teachers to achieve high ratings regardless of the incoming achievement levels of their students • We are measuring growth compared to similar students • Similar students: up to three years of the same prior achievement, plus three student-level characteristics (economic disadvantage, SWD, and ELL status) • In 2012-13, NY’s “value-added model,” which needs Board of Regents approval, will consider additional student and classroom characteristics Every educator has a fair chance to demonstrate effectiveness on these measures regardless of the composition of his/her class or school.
Prior Year Performance for Students in Two Teachers’ Classrooms: Proficiency
Current Year Performance of Same Students: Proficiency
Student A’s Current Year Performance Compared to “Similar” Students If we compare Student A’s current score to other students who had the same prior score (450), we can measure her growth relative to other students. We describe her growth as a “student growth percentile” (SGP). Student A’s SGP is 45, meaning she performed better in the current year than 45 percent of similar students. [Chart: ELA scale scores from 2011 to 2012, showing Student A’s prior score of 450 and the range from low to high SGPs.]
Comparing Performance of “Similar” Students Given any prior score, we see a range of current year scores, which give us SGPs of 1 to 99. [Scatter plot: prior year score vs. current year score.]
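Under a simplifying assumption, the SGP can be sketched as a percentile rank among peers with the same prior score. The peer scores below are hypothetical, chosen so the result matches Student A’s SGP of 45 from the previous slide; NYSED’s actual model conditions on more than a single prior score.

```python
# A sketch of the SGP calculation: compare a student's current score
# only to peers who had the exact same prior score, and report the
# percent of those peers she outperformed. Scores here are made up.
from bisect import bisect_left

def student_growth_percentile(current_score, peer_current_scores):
    """Percent of similar students scoring below this student in the
    current year, so values run from roughly 1 to 99."""
    peers = sorted(peer_current_scores)
    scores_below = bisect_left(peers, current_score)
    return round(100 * scores_below / len(peers))

# Among 20 students who shared Student A's prior score of 450, nine
# scored below her current score of 470, giving an SGP of 45.
peers = [452, 455, 458, 460, 461, 463, 466, 468, 469, 471,
         474, 476, 478, 480, 483, 485, 488, 490, 493, 496]
print(student_growth_percentile(470, peers))  # 45
```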
From Student Growth to Teachers and Principals (continued) To measure teacher performance, we find the mean growth percentile (MGP) for her students. To find an educator’s mean growth percentile, take the average of the SGPs in the classroom. In this case: Step 1: 45 + 40 + 70 + 60 + 40 = 255 Step 2: 255 / 5 = 51 Ms. Smith’s mean growth percentile (MGP) is 51, meaning on average her students performed better than 51 percent of similar students. A principal’s performance is measured by finding the mean growth percentile for all students in the school.
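The MGP arithmetic from this slide is simple enough to express directly; this sketch just averages the five SGPs in the example:

```python
def mean_growth_percentile(sgps):
    """Average of the student growth percentiles attributed to an educator."""
    return sum(sgps) / len(sgps)

# Ms. Smith's five students: (45 + 40 + 70 + 60 + 40) / 5 = 51
print(mean_growth_percentile([45, 40, 70, 60, 40]))  # 51.0
```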
Expanding the Definition of “Similar” Students • So far we have been talking about “similar” students as those with the same prior year assessment score • We will now add two additional features to the conversation: • Two additional years of prior assessment scores • Remember: a student MUST have current year and prior year assessment scores to be included • Student-level factors: • Economic disadvantage • Students with disabilities (SWDs) • English language learners (ELLs)
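As a rough illustration of the expanded definition, the sketch below groups hypothetical students by a key built from their prior scores and the three student-level flags. The field names and records are invented for illustration; NYSED’s actual model uses statistical adjustment rather than literal exact-match grouping.

```python
# A sketch of how "similar students" might be keyed under the expanded
# definition: up to three prior scores plus three student-level flags.
from collections import defaultdict

def similarity_key(student):
    """Key on the prior-score history plus the three student-level flags."""
    return (student["prior_scores"],        # up to three years, as a tuple
            student["econ_disadvantaged"],
            student["swd"],                 # student with a disability
            student["ell"])                 # English language learner

students = [
    {"id": 1, "prior_scores": (450, 442), "econ_disadvantaged": True,
     "swd": False, "ell": False, "current_score": 470},
    {"id": 2, "prior_scores": (450, 442), "econ_disadvantaged": True,
     "swd": False, "ell": False, "current_score": 462},
]

groups = defaultdict(list)
for s in students:
    groups[similarity_key(s)].append(s["current_score"])

# Each group's current-year scores form the comparison pool used to
# compute SGPs for the students in that group.
```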
Scatter Plot of Teacher MGPs and Percent of Students in Poverty in Class – Adjusted Model Another very small downward slope, suggesting a very small relationship with economic disadvantage (ED). Source: 2011-12 Technical Report, NYS Growth Measures.
MGPs and Statistical Confidence [Diagram: an example MGP of 87 shown with its lower limit, upper limit, and confidence range.] • NYSED will report a 95 percent confidence range, meaning we can be 95 percent confident that an educator’s “true” MGP lies within that range. Upper and lower limits of MGPs will also be reported. • An educator’s confidence range depends on a number of factors, including the number of student scores included in their MGP and the variability of student performance in their classroom.
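A 95 percent confidence range of this kind can be sketched as roughly MGP ± 1.96 standard errors. The standard error below is a hypothetical input, back-solved so the limits match the sample teacher report shown later in this deck; NYSED derives it from the number of student scores and their variability.

```python
def confidence_range(mgp, standard_error, z=1.96):
    """Return the (lower, upper) limits of a 95% confidence range."""
    return round(mgp - z * standard_error), round(mgp + z * standard_error)

# An MGP of 47 with a standard error of about 4.1 yields limits of
# (39, 55), matching the sample teacher report later in this deck.
print(confidence_range(47, 4.1))  # (39, 55)
```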
From MGPs to Growth Ratings: Teachers Rules on the last slide result in these HEDI criteria for 2011-12 (each question falls through to the next if the answer is no): • Is your MGP ≥ 69? If yes: is your lower limit > the mean of 52? If yes, Highly Effective (results are well above state average for similar students); if no, Effective. • If no: is your MGP 42-68 (any confidence range)? If yes, Effective (results equal state average for similar students). • If no: is your MGP 36-41? If yes: is your upper limit < the mean of 52? If yes, Developing (results are below state average for similar students); if no, Effective. • If no (your MGP ≤ 35): is your upper limit < 44? If yes, Ineffective (results are well below state average for similar students); if no, Developing.
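These decision rules translate directly into code. The sketch below follows the flowchart reconstruction above, including its fall-through behavior (for example, an MGP of 70 whose lower limit does not clear the state mean of 52 rates as Effective):

```python
def hedi_rating(mgp, lower_limit, upper_limit, state_mean=52):
    """Map an MGP and its confidence range to a 2011-12 growth rating."""
    if mgp >= 69:
        # Highly Effective also requires the lower limit to clear the mean
        return "Highly Effective" if lower_limit > state_mean else "Effective"
    if 42 <= mgp <= 68:
        return "Effective"  # any confidence range
    if 36 <= mgp <= 41:
        return "Developing" if upper_limit < state_mean else "Effective"
    # MGP of 35 or lower
    return "Ineffective" if upper_limit < 44 else "Developing"

print(hedi_rating(47, 39, 55))  # the sample teacher's numbers -> "Effective"
```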
First, let’s look at a growth report about a teacher, Jane Eyre: • Jane’s MGP = 47 (this is what is used to determine the growth score and growth rating) • Jane’s upper limit = 55 and lower limit = 39
Teacher-level Report (District X, School #1: Jane Eyre) • Teacher 1D’s Growth Score and Growth Rating are listed here • Teacher 1D does not have any growth data reported for any of the subgroups because 16 student scores are required to report any data • Teacher 1D has a higher adjusted MGP in Math than in ELA
School-level Report (District X, School #1) • The Growth Score and Growth Rating for the Principal of School #1 are listed here • An adjusted MGP and associated confidence range will be reported for each subject and grade level within the school • 49% of students at School #1 scored above the State median; 36% of the student scores are from economically disadvantaged students, and none are from English language learners • School #1 has scores broken out by subject for grades 4-6 Summary of Revised APPR Provisions Memo: http://engageny.org/wp-content/uploads/2012/03/nys-evaluation-plans-guidance-memo.pdf
School-level Report: Detailed View (District X, School #1) School #1 has 12 teachers (Teachers 1A through 1L) who teach grades 4-8 ELA and Math • Each teacher receives an adjusted MGP and associated confidence range that are used to determine the growth rating and growth score • Teachers 1E and 1G did not receive any growth data because they are linked to fewer than 16 student scores • Teacher 1B has the most student scores linked to him (43 scores) • 43 student scores could not be linked to any of the teachers
District-level View, Page 1: NY State Summary NYS summary data is included on ALL district reports: • NY Statewide Adjusted MGP = 52; State Median = 50 • Number of student scores included in calculation of the State MGP • Statewide, about 50% of ELL, SWD, and economically disadvantaged students scored above the State median
District-level View, Pages 1-2: District Summary (District X) • District-wide Adjusted MGP • Number of student scores included in calculation of the district-wide MGP District X summary data continues on the next page of the report
District-level View, Page 3: List of Schools (District X) District X has two schools that have grades 4-8 ELA and Math scores: • Principal of School #1: Growth Score = 14, Growth Rating = Effective • Principal of School #2: Growth Score = 6, Growth Rating = Developing
Using Growth Score results • Beyond evaluation, growth scores can provide additional evidence to help with instructional improvement. • Of course, these measures are only one of multiple sources of evidence to use for this purpose. • The best insight comes from considering the results in the context of other information about a teacher, group of teachers, principal, or group of schools.
Districts may want to: Analyze district-level information using these reflective questions: • How much did our students grow, on average, compared to similar students? Is this higher, lower, or about what we would have expected? Why? • How do our MGPs for each reported subgroup (ELL, SWD, economically disadvantaged students, high- and low-achieving students) compare to each other and to our overall MGP? Are there any patterns? Are the MGPs higher, lower, or about what we would have expected? Why? • How do the MGPs compare by subject and across grade levels? Why might they be similar or different? • What should we do to understand any surprises, using other information and evidence? • Do we have the right plans in place to aid in professional learning?
Districts may want to: Convene principals to reflect upon their school growth results in the context of other information about student learning and teacher effectiveness in their schools: • Use BOCES trainers and/or SED online resources to ensure basic understanding of the measures and what information is found on reports • Engage principals individually or in a group to reflect on questions about their school information in the context of other evidence of teacher effectiveness: • How much did the students of my teachers grow, on average, compared to similar students, and how does this differ across teachers? Are there differences across grades or subjects? • How do my teachers’ MGPs differ across each reported subgroup? Do I see any patterns?
Principals may want to: • Consider the reflective questions in their school-level reports: • See the Principal’s Guide to Interpreting Growth Scores: http://engageny.org/wp-content/uploads/2012/06/Principals_Guide_to_Interpreting_Your_Growth_Score.pdf • See the Sample Principal Report—Annotated: http://engageny.org/wp-content/uploads/2012/06/Principal_Sample_Growth_Report.pdf • Plan how teachers will get the information they need to understand their own growth reports
Teachers may want to: • Review materials from SED about growth measures • View the “Growth Model for Educator Evaluation 2011-12” Webinar: http://engageny.org/resource/growth-model-for-educator-evaluation-in-2011-2012/ • View the “Using Growth Measures for Educator Evaluation in 2011-12” Webinar: http://engageny.org/resource/using-growth-measures-for-educator-evaluation-in-2011-2012/ • See the Teacher’s Guide to Interpreting Growth Scores: http://engageny.org/wp-content/uploads/2012/06/Teachers_Guide_to_Interpreting_Your_Growth_Score.pdf • See the Sample Teacher Report—Annotated: http://engageny.org/wp-content/uploads/2012/06/Teacher_Sample_Growth_Report.pdf • Consider the following reflective questions: • How much did my students grow, on average, compared to similar students? Is this higher, lower, or about what I would have expected? Why? • How does this information about student growth align with information about my instructional practice received through observations or other measures? Why might this be?
You can help by supporting Districts to: • Understand the basics of the growth measures • Analyze district-level information using these and other reflective questions, drawing on growth and other measures: • How much did our students grow, on average, compared to similar students? Is this higher, lower, or about what we would have expected? Why? • What do we learn from subgroup, grade, and subject-level information? • What should we do to understand any surprises, using other information and evidence? • Put the right plans in place to aid in understanding evaluation measures and using them to support professional growth and learning for our educators • Ensure accurate reporting of student/teacher linkage information for the 2012-13 school year
How would you answer these common misconceptions? • New York’s evaluation system is based mostly on State test scores, and that’s not good. • A principal knows a good teacher when s/he sees one; we don’t need to include value-added results too. • I’ve been doing teacher observations for years. I don’t need to go to your training. • Teacher value-added information is unreliable and shouldn’t be a part of teacher evaluation. • I am a teacher with lots of students in poverty. How can measuring my student test score results be fair? • I have a lot of high-achieving students in my classes/school. They have nowhere to go but down, so we won’t do well on “growth” measures.
How would you answer these common misconceptions? • New York’s evaluation system is based mostly on State test scores, and that’s not good. • NY uses multiple measures, as research advises. 60% involves measures of educator practice, 20-25% involves GROWTH on state assessments or comparable measures, and the remaining points will be a locally selected measure of student growth or achievement. • A principal knows a good teacher when s/he sees one; we don’t need to include value-added results too. • The recent MET study shows that combining observation results and teacher value-added is more predictive and reliable than either measure alone. • I’ve been doing teacher observations for years. I don’t need to go to your training. • The MET study shows that regularly recalibrating observers against benchmarks of accurate observation ratings is critical to ensuring a valid and reliable evaluation system. Even the best observers can “drift” over time, and the best can help others stay in sync. In addition, NYS training will help everyone identify evidence that the new Common Core standards are being implemented well in classrooms.
How would you answer these common misconceptions? • I am a teacher with lots of students in poverty. How can measuring my student test score results be fair? • NY’s growth measures compare the performance of students to that of similar students, including similar prior test score history, poverty, and other student characteristics. There is little relationship between the percent of students in poverty and a teacher’s mean growth percentile. • I have a lot of high-achieving students in my classes/school. They have nowhere to go but down, so we won’t do well on “growth” measures. • NY’s growth measures compare the performance of students to that of similar students using prior test score history and other student characteristics. Teachers whose high-achieving students outperform other high-achieving students will do well on these growth measures whether or not the students’ scale scores go up year over year. • Teacher value-added information is unreliable and shouldn’t be a part of teacher evaluation. • Many researchers have shown that teacher value-added is the best predictor we have of the future learning growth of a teacher’s students. Two new research studies, Chetty/Friedman/Rockoff and the Measures of Effective Teaching study, add new evidence in support of this argument.
For More Information… Please review resources about the State-provided growth measures here: http://engageny.org/resource/resources-about-state-growth-measures/ And the guidance on NYS’s APPR Law and Regulations: http://engageny.org/resource/guidance-on-new-yorks-annual-professional-performance-review-law-and-regulations/
Review of Terms • SGP (student growth percentile): the result of a statistical model that calculates each student’s change in achievement between two or more points in time on a State assessment or other comparable measure, and compares each student’s performance to that of similarly achieving students • Similar students: students with similar prior test scores (up to three years) and the same ELL, SWD, and economic disadvantage status. The model also includes a correction for test measurement error. • Unadjusted and adjusted MGP (mean growth percentile): the average of the student growth percentiles attributed to a given educator. For evaluation purposes, the overall adjusted MGP is used; this is the MGP that includes all of a teacher’s or principal’s students and takes student demographics into account.