Educator Evaluations, TSDL, Growth, VAMs Office of Psychometrics, Accountability, Research and Evaluation
Important Dates - Overview • During school years 2011/12 and 2012/13, Educator Evaluation Systems are locally determined, but evaluations must be based on student growth measures. • Data from local, state, and nationally standardized assessments should be integrated where available, along with other evidence of growth such as portfolios and behavior rubrics.
Requirements • Report one of four labels required by legislation in REP: • Highly effective • Effective • Minimally effective • Ineffective • The Governor’s Council* will develop a tool to be used by districts beginning in 2013-14. • *Renamed the Michigan Council for Educator Effectiveness as part of DTMB as of April 30, 2012.
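Since local systems ultimately have to report one of these four labels in the REP, a district tool might represent the label set as a fixed type. A minimal sketch in Python; the enum is illustrative only, and the REP's actual field codes come from the REP Data Descriptions, not from this snippet:

```python
from enum import Enum

class EffectivenessLabel(Enum):
    """The four labels required by legislation for REP reporting."""
    # Values here are display labels, not REP field codes; the actual
    # codes are defined in the REP Data Descriptions document.
    HIGHLY_EFFECTIVE = "Highly effective"
    EFFECTIVE = "Effective"
    MINIMALLY_EFFECTIVE = "Minimally effective"
    INEFFECTIVE = "Ineffective"
```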
MCEE’s Interim Progress Report The Michigan Council for Educator Effectiveness (MCEE) has issued its Interim Report, recommending that the state start with a pilot program to be tested in 12 school districts during the 2012-13 school year.
Detroit Free Press – 4/27/2012 • MCEE concluded a pilot is imperative, saying rushing to develop a system “would be reckless, both fiscally and technically.” • During the pilot, districts would test three tools for observing teachers and several models for using student assessment to evaluate teachers. • Results of the pilot will be used to develop MCEE’s final recommendations. • http://www.freep.com/article/20120428/NEWS06/204280401/Council-says-state-should-start-small-on-educator-evaluation-system?odyssey=mod%7Cnewswell%7Ctext%7CFRONTPAGE%7Cs
Who MUST be evaluated? • Based on the code used to report the employee in the REP. • Visit www.michigan.gov/CEPI. • Click on CEPI Applications on the left • Then, click on Registry of Educational Personnel on the left • Scroll down to EOY 2012 REP Preview • Click on EOY 2012 REP Data Descriptions and go to page 71.
Who MUST be evaluated? • Required Reporting Codes (listed in the EOY 2012 REP Data Descriptions referenced above)
Who is OPTIONAL to evaluate? • Optional Reporting Codes (also listed in the REP Data Descriptions)
TSDL The Teacher-Student Data Link: What it is and how it could be used as part of a district evaluation system
Teacher/Student Data Link • Data initiative to link each student to the courses he/she took and to the teachers who taught those courses • Required as a deliverable under the State Fiscal Stabilization Fund • Will mature in the coming years to be able to provide measures and information over time
State-provided measures • Extremely limited, so districts choose which “pieces” make sense in their local context • Generated for each educator of students in tested grades, regardless of subject taught. • BUT “growth”, or Performance Level Change (PLC), exists only for reading and mathematics for MEAP and MI-Access FI in grades 4-8
How does the TSDL Work? • Teachers are linked to courses • Students are linked to courses • For each course taught, a teacher has a list of students who were reported as taking that course. • Spring 2011 and fall 2011 assessment data will be attributed to teachers from the 2010-2011 school year • “Feeder school” attribution applies for fall assessment data
Linking assessment data to students • Once teachers are linked to students, the TSDL file provides: • Performance level change (PLC) for MEAP and MI-Access FI in reading and mathematics for each teacher where available (regardless of subject taught) in grades 4-8. • Performance level in writing, science, social studies, reading and mathematics for each teacher where available (regardless of subject taught) across all tested grades.
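As a minimal sketch of the linking logic, assuming simplified column names and tiny in-memory tables (the real TSDL file layout and fields differ):

```python
import pandas as pd

# Teachers linked to courses, and students linked to courses.
teacher_courses = pd.DataFrame({
    "teacher_id": ["T1", "T1", "T2"],
    "course_id":  ["MATH6-A", "MATH6-B", "ELA6-A"],
})
student_courses = pd.DataFrame({
    "student_id": ["S1", "S2", "S3", "S1"],
    "course_id":  ["MATH6-A", "MATH6-A", "MATH6-B", "ELA6-A"],
})
# Assessment records: performance level change (PLC) where available.
assessments = pd.DataFrame({
    "student_id": ["S1", "S2", "S3"],
    "subject":    ["mathematics", "mathematics", "mathematics"],
    "plc":        ["Improvement", "Decline", "Significant Improvement"],
})

# Chain the links: teacher -> course -> student -> assessment record.
roster = teacher_courses.merge(student_courses, on="course_id")
linked = roster.merge(assessments, on="student_id")
print(linked)
```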
Access to TSDL data • A TSDL User role must be established in the Secure Site to access the data at the district or school level • The spring assessment (high school) TSDL link has been available through the Secure Site since January. • The fall assessment (elementary and middle school) TSDL has been available through the Secure Site since March.
After downloading the TSDL File The district/school needs to adjust each list based on rules such as • student attendance • subject-taught match • grade taught • other local factors
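A hedged sketch of applying such rules to a downloaded roster; the file names, column names (attendance_pct, subject_taught, tested_subject, grade_taught, tested_grade), and the 90% threshold are all assumptions, not TSDL specifications:

```python
import pandas as pd

# Hypothetical file name; the real download comes from the Secure Site.
roster = pd.read_csv("tsdl_download.csv")

# Example local business rules; thresholds and columns are assumptions.
MIN_ATTENDANCE = 0.90  # e.g., require 90% attendance in the course

keep = (
    (roster["attendance_pct"] >= MIN_ATTENDANCE)              # attendance
    & (roster["subject_taught"] == roster["tested_subject"])  # subject match
    & (roster["grade_taught"] == roster["tested_grade"])      # grade match
)
adjusted = roster[keep].drop_duplicates()   # de-duplicate records
adjusted.to_csv("tsdl_adjusted.csv", index=False)
```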
Using PLC Data with MDE Tool • This year, the TSDL provides districts with PLC data linked to teachers for integration into local systems, along with an optional tool. • These are general guidelines and suggestions, NOT requirements, for reading and math in grades 4-8. • Currently, MDE is working with districts in pilot programs to research the most valid way to use PLC and other assessment data in value-added models and educator evaluation systems.
One Possible Method Using MDE Tool STEP #1 • Download the TSDL file through the BAA Secure Site • Apply rules regarding which students “count” toward a teacher’s evaluation (e.g. attendance rules) • Consider de-duplication of records • Paste your modified TSDL data into the Weighted PLC Tool
One Possible Method Using MDE Tool STEP #2 • Determine or adjust the weights for the PLCs in the tool (calculations adjust automatically) • Default weights are built into the MDE TSDL Weighted PLC Tool; a sketch of the weighting idea follows below.
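The actual default weights ship with the tool itself. As an illustration only, here is an assumed mapping consistent with the stated -2 to 2 range, with 0 meaning a proficient student was maintained; both category names and values are assumptions, not the tool's defaults:

```python
# Assumed PLC-to-weight mapping -- NOT the tool's actual defaults.
PLC_WEIGHTS = {
    "Significant Improvement":  2.0,
    "Improvement":              1.0,
    "Maintained (proficient)":  0.0,   # 0 = proficient student maintained
    "Decline":                 -1.0,
    "Significant Decline":     -2.0,
}
```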
One Possible Method Using MDE Tool STEP #3 • Look at the results at various levels: what is the Weighted PLC at the district, school, grade, and/or subject level? • What is a reasonable Weighted PLC for teachers to show? • Note: The possible range using this Weighted PLC method is from -2 to 2. • A value of 0 here means that, on average, you are maintaining your proficient students. • If using different weights, you must determine the range and meaning of the results.
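Continuing the sketch, a district could compute the Weighted PLC at each level as the mean of the per-student weights within the group; column and file names remain assumptions:

```python
import pandas as pd

# Assumed weights from the sketch above.
PLC_WEIGHTS = {"Significant Improvement": 2.0, "Improvement": 1.0,
               "Maintained (proficient)": 0.0, "Decline": -1.0,
               "Significant Decline": -2.0}

df = pd.read_csv("tsdl_adjusted.csv")         # hypothetical adjusted file
df["plc_weight"] = df["plc"].map(PLC_WEIGHTS)

# Weighted PLC at each level = mean per-student weight within the group.
for level in ["school", "grade", "subject", "teacher_id"]:
    print(df.groupby(level)["plc_weight"].mean().round(3))
```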
Example: Determining Thresholds • In Sunshine School, the Weighted PLC for math at the school level is .643. Considerations • Positive Weighted PLC = effective • Negative Weighted PLC = minimally effective • Determine thresholds for highly effective and ineffective • One option: set the bar at the school level, so that teachers should at least meet the school-level Weighted PLC. • For example, for a teacher to be considered effective for this portion of the evaluation, he/she must have a Weighted PLC of .60 or greater.
Using weighted PLC and thresholds • To calculate the teacher’s Weighted PLC, divide the sum of the students’ PLC weights by the number of students: 3/8 = .375 • If the target for “effective” was .643, this teacher did not meet the “effective” threshold. • BUT, if the target for effective was having a positive Weighted PLC (>0), this teacher would have met it. • Use this as one “growth” component of a multi-measure evaluation system
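In code form, using the numbers from this slide (8 students whose PLC weights sum to 3):

```python
# Teacher-level Weighted PLC from the slide's example numbers.
weight_sum, n_students = 3, 8
teacher_wplc = weight_sum / n_students       # 0.375

school_target = 0.643    # school-level Weighted PLC used as the bar
print(teacher_wplc >= school_target)         # False: misses that bar
print(teacher_wplc > 0)                      # True: meets a positive-growth bar
```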
Cautions • Must base targets on data; set targets that are attainable but also challenge educators to improve student learning • Make decisions about the extent (if at all) to which reading and math PLC should count in subjects other than reading and math • Make decisions about which students contribute; you need firm business rules that apply to all! • Use other measures and factors!
Integrating Growth Carefully • Use in conjunction with other measures • Use other types of growth too (e.g. portfolios, rubrics, performance-based assessments), particularly in non-tested subjects and grades and for special populations.
Integrating Growth (again) • Growth data can be used more qualitatively too: set general guidelines/targets, but use the data to inform the decision • Consider the measures that may already be in place in your district that are meant to show growth, and develop rules around that data
Non-Tested Grades and Content Areas • Caveat: No easy answer to this question! • One answer: develop more tests • But tests in new content areas take time (and are difficult to build, particularly if a state has not yet adopted content standards in those areas) • Another answer: set standards for “adequate growth” on common assessments that you already have • One more answer: use instruments to measure growth that are not “assessments” (e.g. portfolios, progress toward goals, etc.)
Step #1: Take stock of your common assessments/local assessments • Ask yourself the following questions: • What is our spectrum of locally administered and identified assessments? • What content areas do we currently assess? For your common assessments: • Do we currently have standards set for proficiency (i.e. how much is enough to be considered “proficient” on this assessment)? • How often do we administer this test? • How is the test scored? Scaled?
Step #1: Take stock of your common assessments/local assessments For purchased assessments: • Do they establish how much “more” is sufficient growth? • How are these tests scaled and scored? • Work with the company producing these tests to determine how they can or cannot be used to measure growth.
Step #2: Setting standards for GROWTH on common assessments • Even if you have standards for proficiency, you may need to set standards for growth • Several standard-setting methods can be used to help make these determinations; a sketch of one simple rule follows below
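One simple, hedged form such a growth standard could take is a minimum fall-to-spring gain conditioned on the starting score. Every cut point below is an illustrative assumption, not a recommended value:

```python
def met_growth_standard(fall_score: float, spring_score: float) -> bool:
    """Return True if a student's gain meets an assumed growth standard."""
    # Illustrative rule: lower-scoring students are expected to gain more.
    expected_gain = 10 if fall_score < 50 else 5
    return (spring_score - fall_score) >= expected_gain

print(met_growth_standard(40, 52))   # True: gained 12, needed 10
print(met_growth_standard(70, 73))   # False: gained 3, needed 5
```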
Step #3: Set the standards, implement and EVALUATE • Although legislation does not necessarily provide for this (yet??), all of these systems need continuous improvement • If the standards for growth that you set using one of the above methods seem out of sync with actual student learning, re-evaluate!
Value-Added Modeling • VAMs attempt to isolate the contributions, or “value add”, of teachers or schools to student learning. • MDE is not, at present, running VAMs for every teacher in the state • Issue #1: We do not have sufficient data • Issue #2: We do not have sufficient systems • Issue #3: We do not necessarily believe we can specify these models in a way that allows us to get fair estimates • Issue #4: We are not currently given the direct authority/responsibility to do this
VAMs • We do want to start providing our best expert advice to districts • Non-binding • Take it or leave it • Tell us if you disagree • The level at which the VAM is run is important for three reasons: • available data • coherence with local expectations • ability to compare teachers
VAMs: Who should run these? • VAM run by MDE for the whole state: • Pros: Standardized model; allows you to talk about distribution of effectiveness across the state. • Cons: Less data available (i.e. only state assessments); not reflective of district expectations; only available for a limited number of educators (i.e. those who teach in a tested grade/subject)
VAMs: Who should run these? • VAM run by a district: • Pros: Can utilize both state and local assessment data; can be more sensitive to local assumptions; can hypothetically include many more teachers. • Cons: Analytically difficult to do; may not follow standard format; “lowest X% of teachers” is now district-specific
Value-Added Modeling: What to do? • Involve educators in the roster verification process • Are these my students? Which ones should be included in the VAM? Districts need firm business rules around this • Use prior achievement data as a predictor • More years of data are better • Can use data from all available subjects • Can use data from different types of tests • BUT, you need complete data on every student • SO, there is a tradeoff between how many years of data you include and how many students you include in the model • More assessment scores = fewer students who will have all those scores available; the sketch below illustrates this
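A small sketch of that tradeoff, assuming a hypothetical score-history file with one column per prior year: each additional year of required prior scores shrinks the set of students with complete data.

```python
import pandas as pd

df = pd.read_csv("student_scores.csv")       # hypothetical score history
prior_cols = ["prior_year_1", "prior_year_2", "prior_year_3"]  # assumed names

# Requiring more prior years keeps only students with complete histories.
for k in range(1, len(prior_cols) + 1):
    complete = df.dropna(subset=prior_cols[:k])
    print(f"{k} prior year(s) required: {len(complete)} students remain")
```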
Value-Added Modeling: What to do? • Consider using aggregate peer effects at the teacher level as a covariate • For example: average economic disadvantage • McCaffrey: including student-level covariates is not helpful • However, if there is a correlation between the aggregate demographics and the value-added estimate, that sets up differential expectations depending on classroom demographics • Teachers with small numbers of students: need firm business rules • McCaffrey recommends 10 or more students • Need rules for what to do with teachers who have fewer than 10 students
Value-Added Modeling: What to do? • There are many different types of VAMs, all keyed to a certain set of assumptions and, more importantly, to the underlying data that you have. • Gain scores: need a vertical scale • Covariate adjustment • Which covariates you select is critical • Concerns: dropping students with incomplete data, and differential expectations • Random or fixed effects • Are the time-invariant teacher characteristics fixed? • Do we need to explicitly model the variation within and between teachers?
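As one hedged illustration of the covariate-adjustment flavor, here is a minimal sketch: regress current scores on prior achievement with teacher fixed effects, after dropping teachers with fewer than 10 students. File and variable names are assumptions, and a real model would need all the cautions above. The aggregate peer covariate from the previous slide is deliberately omitted here, since a teacher-level aggregate is collinear with teacher fixed effects; that is one reason to consider random-effects variants instead.

```python
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("linked_scores.csv")        # hypothetical linked data

# Business rule from the previous slide: drop teachers with < 10 students.
counts = df.groupby("teacher_id")["student_id"].transform("count")
df = df[counts >= 10]

# Covariate-adjustment VAM with teacher fixed effects; each teacher's
# coefficient is a rough value-added estimate relative to the baseline
# teacher, after adjusting for prior achievement.
model = smf.ols("score ~ prior_score + C(teacher_id)", data=df).fit()
print(model.params.filter(like="teacher_id"))
```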
Including growth = VAM? • You can include student growth data in an evaluation system WITHOUT running a VAM • Our TSDL tool does this • Don’t forget to call your local university, your ISD, OR the Department if you want to run a VAM and need assistance • AND, growth data does not always have to be obtained from assessments
Contact Information Carla Howe Olivares Evaluation Research and Accountability Office of Psychometrics, Accountability, Research and Evaluation (OPARE) Bureau of Assessment and Accountability (BAA) Mde-accountability@michigan.gov 517-373-1342 olivaresc@michigan.gov
MDE website for Ed Eval info • www.michigan.gov/baa • Click on the Educator Evaluation tab on the left to access materials, resources, and links