The Determinants of Student Achievement: Different Estimates for Different Measures
Tim Sass, Department of Economics, Florida State University
CALDER Conference, October 4, 2007
Different Measures
• Types of Tests
  • Criterion-Referenced Tests
    • Test whether the student has learned the elements in state-established instructional standards
    • State-specific
  • Nationally Normed Tests
    • Test whether the student has learned a set of concepts and skills that may or may not correspond to any particular state's curriculum benchmarks
    • Allow interstate comparisons
Different Measures
• Scaling
  • Non-Vertically Aligned Scale Scores
    • Scale is potentially different at each grade level
    • Learning gains cannot be compared across grades
    • Criterion-referenced tests are typically not vertically aligned
  • Vertical or Developmental Scales
    • A single equal-interval scale that spans all grade levels
    • A one-unit change means the same thing at all levels, within and between grades (see the sketch after the figures below)
    • Some norm-referenced exams, such as the Stanford Achievement Test, are of this type
[Figure: Non-Vertically Aligned Scores — a separate scale for each grade, 3 through 10, ranging from single-digit addition to trigonometry.]
[Figure: Vertically Scaled Scores — a single scale spanning grades 3 through 10, from simple addition to trigonometry.] If done right, a vertically scaled exam is ideal for analyzing learning gains, since a one-point change has the same meaning everywhere on the scale.
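A small numeric sketch of the point above: on a vertical scale, subtracting two scores yields an interpretable learning gain, while on non-vertically aligned scales the difference mixes metrics. All scores in the snippet are invented purely for illustration.

```python
# Illustrative arithmetic only: the scores below are made up for this sketch.

# On a vertical (developmental) scale, grade-4 and grade-5 scores share one
# equal-interval metric, so the difference is a learning gain.
vertical_grade4 = 480
vertical_grade5 = 520
gain = vertical_grade5 - vertical_grade4              # 40 developmental-scale points

# On non-vertically aligned scales, each grade's exam has its own metric,
# so subtracting the scores mixes units and the result is not a gain.
grade4_scale_score = 310
grade5_scale_score = 295
not_a_gain = grade5_scale_score - grade4_scale_score  # not interpretable

print(gain, not_a_gain)
```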
Different Measures
• Scale Scores Normalized by Grade and Year
  • Frequently used by researchers to compare a student's performance on criterion-referenced tests over time
  • Compare a student's performance relative to that of other students taking the same grade-level exam in the same year (see the sketch below)
  • The unit of measure is the standard deviation
  • If the performance distribution changes from grade to grade, normalized scores may not be comparable
  • Also sometimes used to try to equate performance on different exams when a state changes its test midstream
[Figure: Normalized Scores — normalization sets the mean to zero and rescales the scores (Grade 5 distribution shown).]
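A minimal sketch of normalizing scale scores by grade and year, as described above. The file name and column names (scale_score, grade, year) are assumptions made for this illustration, not Florida's actual data layout.

```python
import pandas as pd

# Hypothetical student-level score file; names are placeholders.
df = pd.read_csv("scores.csv")

def to_z(scores: pd.Series) -> pd.Series:
    """Set the mean to zero and rescale to standard-deviation units."""
    return (scores - scores.mean()) / scores.std()

# Each score is expressed relative to other students taking the same
# grade-level exam in the same year.
df["norm_score"] = df.groupby(["grade", "year"])["scale_score"].transform(to_z)
```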
Different Results
• Analysis of the Effectiveness of NBPTS-Certified Teachers
  • Harris and Sass, "The Effects of NBPTS-Certified Teachers on Student Achievement" (February 2007)
  • Compares the effectiveness of NBPTS-certified teachers (NBCTs) with that of non-NBCTs in Florida
  • In many cases, results vary depending on whether scores come from Florida's criterion-referenced test, the FCAT-Sunshine State Standards exam (FCAT-SSS), or from the Stanford Achievement Test, a norm-referenced test (FCAT-NRT) (a hedged sketch of this type of regression follows the results table below)
[Table: Value-Added Estimates of Reading Achievement]
Note: all coefficients are expressed in standard-deviation units; the omitted experience category is teachers with 10+ years of experience; coefficients in green are statistically significant at the 95% confidence level.
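The snippet below is only a minimal sketch of a value-added regression of this kind, not the Harris and Sass specification; the file name and the variable names (norm_gain, nbct, advanced_degree, exp_0_4, exp_5_9, teacher_id, grade, year) are assumptions for illustration.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical student-level panel; column names are placeholders.
df = pd.read_csv("student_panel.csv")

# Outcome: test-score gain normalized by grade and year (standard-deviation units).
# The omitted experience category is teachers with 10+ years, as in the table note.
model = smf.ols(
    "norm_gain ~ nbct + advanced_degree + exp_0_4 + exp_5_9 + C(grade) + C(year)",
    data=df,
)

# Cluster standard errors at the teacher level.
result = model.fit(cov_type="cluster", cov_kwds={"groups": df["teacher_id"]})
print(result.summary())
```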
Different Results
• More variation in estimated effects across exams than across different scalings of the same exam
• Estimated effects of variables representing small proportions of teachers are the most variable
  • NBPTS certification
  • Advanced degrees
• Why are there differences across exams?
  • Differences in the material covered
  • Differential ceiling effects (illustrated in the simulation after the figure below)
[Figure: Vertically Scaled Scores with Ceiling — a single scale spanning grades 3 through 10, from simple addition to trigonometry, truncated by a ceiling at the top of the scale.]
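The following simulation is a hedged illustration of the ceiling mechanism sketched in the figure above; the score distributions and the ceiling value are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

true_prior = rng.normal(500, 50, size=10_000)  # hypothetical vertical-scale scores
true_gain = rng.normal(20, 10, size=10_000)    # true learning gains
ceiling = 560                                  # illustrative top of the score scale

# Observed scores cannot exceed the ceiling.
observed_prior = np.minimum(true_prior, ceiling)
observed_post = np.minimum(true_prior + true_gain, ceiling)
observed_gain = observed_post - observed_prior

# Measured gains for students who start near the ceiling are understated,
# which can bias estimated effects for teachers of high-achieving students.
high = true_prior > 520
print("mean true gain, high achievers:    ", round(true_gain[high].mean(), 1))
print("mean observed gain, high achievers:", round(observed_gain[high].mean(), 1))
```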
Conclusions
• Not much difference between developmental scale scores and non-vertically aligned scores that are normalized by grade and year
• Different tests can yield different results
• Low-incidence variables appear to be the most sensitive to the test instrument
• It is not clear whether the differences are due to the material tested or to differential ceiling effects