Measuring Effectiveness in Mathematics Education for Teachers
Heather Hill, University of Michigan School of Education
Learning Mathematics for Teaching 2007, MSRI, June 1, 2007
Avoiding Arbitrariness!
"16 is my favorite number"
Challenge
Knowing that you have added (relevant) knowledge to prospective or in-service teachers. (Student achievement as an outcome is not discussed here.)
Issues to consider as you work to understand your impact:
- Getting clear on your question
- Research design
- Instrument selection
- Comparability to other projects
Getting Clear on Your Question
Do you want to know the effect of:
- A set of materials?
- A course?
- A course and its instructor?
- A sequence of courses/instructors?
Different questions imply different designs. The simplest design answers the simplest question: what is the effect of the course and instructor together?
Research Design
The core question: what would these people have known and been able to do in the absence of our program? We want to estimate the difference between the actual outcome and this "counterfactual."
The problem: we cannot observe the same person with and without the program at the same time (e.g., Marcia in December WITH and WITHOUT TE401).
Random assignment provides the best estimate of the counterfactual; quasi-experimental designs are more feasible to implement.
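To make the counterfactual idea concrete, here is a minimal simulation (illustrative only, not from the talk; all numbers are made up): every teacher has a potential score both with and without the program, we can only ever observe one of the two, and random assignment nonetheless recovers the average effect.

```python
# Minimal counterfactual simulation (hypothetical numbers).
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

knowledge_without = rng.normal(50, 10, n)   # score each teacher WOULD get without the course
knowledge_with = knowledge_without + 5      # ... and WITH the course (true effect = 5 points)

treated = rng.random(n) < 0.5               # random assignment to treatment
observed = np.where(treated, knowledge_with, knowledge_without)  # we see only one outcome

# Randomization balances the groups on average, so the simple
# treatment-minus-comparison difference recovers the true effect.
estimate = observed[treated].mean() - observed[~treated].mean()
print(f"estimated effect: {estimate:.2f} (true effect: 5.00)")
```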
Stop. Design.
1 minute: Think about how you would evaluate your own work with teachers.
- What is your question?
- How can you gather evidence about your question?
3 minutes: Share and critique with neighbors.
Best Solution: Random Assignment
One problem: it rules out the easiest research question, the effect of you plus your materials.
- Treatment (random assignment of students) occurs at the class level, so statistical tests must be performed at the level of treatment (e.g., compare this class to that class); a class-level comparison is sketched below.
- Using individual students as the unit amounts to cheating by artificially boosting your statistical power.
- You therefore need a large N of classrooms or programs for statistical power, and even mathematicians aren't this prolific.
Another problem: it is technically complex.
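Here is a sketch of what "tests at the level of treatment" can look like in practice, using entirely made-up data: student scores are first aggregated to class means, and the test is then run with N equal to the number of classes, not the number of students.

```python
# Class-level analysis sketch (hypothetical data).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# 6 treatment classes and 6 comparison classes, ~25 students each;
# aggregate each class's student scores to a single class mean.
t_class_means = [rng.normal(55, 10, 25).mean() for _ in range(6)]
c_class_means = [rng.normal(50, 10, 25).mean() for _ in range(6)]

# Correct unit of analysis: N = 12 classes, not N = 300 students.
t_stat, p_value = stats.ttest_ind(t_class_means, c_class_means)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}  (df based on classes, not students)")
```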
Quasi-Experimental Designs
Definition: no randomization to treatment.
Problems:
- Not causal; there is always a threat to inferences (selection, missing pre-test controls, "natural" learning over time).
- "Assignment" is still at the class level for some questions.
But these designs are easier to implement.
Quasi-Experimental Designs
Worst: post-test only (Tpost).
  Threats: selection, no comparison group, no pre-test control.
Second-worst: treatment and comparison post-tests only (Tpost / Cpost).
  Threats: selection into T and C, no pre-test control.
Quasi-Experimental Designs
Slightly less bad, but still not good: pre-test and post-test, no comparison group (Tpre Tpost).
  Threats: "natural" learning over time; learning from the instrument itself; selection.
Good: pre-test and post-test for both treatment and comparison groups (Tpre Tpost / Cpre Cpost).
  Threats: selection.
Quasi-Experimental Designs
Best: three (or more) waves of measurement for both groups (T1 T2 T3 / C1 C2 C3).
  Threats: selection.
  Advantage: allows for growth modeling (a sketch follows below).
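One way to exploit three waves, sketched here under the assumption of a simple linear growth model (the talk does not prescribe a specific analysis, and all data below are simulated): fit a mixed model in which each teacher has repeated scores over time and the treatment effect shows up as extra growth per wave.

```python
# Growth-modeling sketch with three waves (simulated data, assumed linear growth).
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
rows = []
for teacher in range(60):
    treat = teacher < 30                    # first 30 teachers are "treated"
    base = rng.normal(50, 8)                # teacher-specific starting knowledge
    for wave in (0, 1, 2):                  # three measurement waves
        growth = (3.0 if treat else 1.0) * wave   # treated teachers grow faster
        rows.append({"teacher": teacher, "wave": wave,
                     "treat": int(treat),
                     "score": base + growth + rng.normal(0, 2)})
df = pd.DataFrame(rows)

# Random intercept per teacher; the 'wave:treat' term estimates the
# additional growth per wave attributable to treatment.
model = smf.mixedlm("score ~ wave * treat", df, groups=df["teacher"]).fit()
print(model.summary())
```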
Quasi-Experimental Design: The Unit-of-Analysis Problem Does Not Go Away
To understand YOUR effect with YOUR materials, the unit of analysis can be the student (e.g., comparing 32 pre/post tests, as sketched below).
To separate the materials effect from the instructor effect, you need multiple classrooms.
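For that student-level case, the analysis can be as simple as a paired pre/post comparison; a minimal sketch with simulated scores for 32 students (the gain of roughly 4 points is made up):

```python
# Paired pre/post comparison for one class of 32 students (simulated scores).
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
pre = rng.normal(50, 10, 32)
post = pre + rng.normal(4, 5, 32)   # hypothetical gain of ~4 points

# Paired t-test: each student serves as their own control.
t_stat, p_value = stats.ttest_rel(post, pre)
print(f"mean gain = {np.mean(post - pre):.1f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```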
Example: Quasi-Experimental Design
Hill/Ball study of California's Mathematics Professional Development Institutes (MPDI; 2002-2003 data):
- Pre/post for the "treatment" group: about 1,000 teachers in about 25 sites
- Pre/post for a "comparison" group: about 300 teachers who signed up for MPDIs but did not attend
- Can compare change in the treatment group to change in the comparison group (see the sketch below)
- MKT instrument; outcomes can be compared across the 25 programs
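A sketch of the change-versus-change comparison this design supports, with simulated gain scores (not the MPDI data) and group sizes matching those above:

```python
# Difference-in-differences on gain scores (simulated data, not MPDI results).
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
t_gain = rng.normal(3.0, 4.0, 1000)   # pre-to-post gains, ~1000 treated teachers
c_gain = rng.normal(0.5, 4.0, 300)    # pre-to-post gains, ~300 comparison teachers

did = t_gain.mean() - c_gain.mean()   # difference-in-differences estimate
t_stat, p_value = stats.ttest_ind(t_gain, c_gain, equal_var=False)
print(f"DiD estimate = {did:.2f} points, t = {t_stat:.2f}, p = {p_value:.3g}")
```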
Instrumentation
Criteria:
- Aligned to your program's content
- Technically checked and validated
- Linked to student achievement
Types of instruments:
- Teacher knowledge
- Teacher "practice"
- Mathematical quality of teaching
Teacher Knowledge: Multiple Choice
- LMT: K-5 and 6-8 measures in number/operations, algebra, and geometry (soon: rational number, proportional reasoning). www.sitemaker.umich.edu/lmt
- KAT: algebra. www.msu.edu/~kat/
- DTAMS: K-5 and 6-8 measures in whole number computation, rational number computation, geometry/measurement, and probability/statistics/algebra. http://louisville.edu/edu/crmstd/diag_math_assess_elem_teachers.html
Knowledge: Other Methods
- Kersting (LessonLab): teacher analysis of video segments
- Discourse analysis and clinical interviews (e.g., TELT; see Ball's personal website); videos of clinical teaching experiences
- Home-grown tests
Possible Instruments: Observational
Of "practice":
- Reformed Teaching Observation Protocol
- Horizon's Inside the Classroom
Of the "mathematical quality" of instruction:
- LMT Mathematical Quality of Instruction
- TIMSS instruments
A Plea from Meta-Analysts: Comparability
Use common measures across teacher education efforts. Why? Knowledge is built by comparing the effects of different programs:
- Knowing that Program A has a 0.5 effect is good.
- Knowing that Program A = 0.5 and Program B = 0.3 is better; we can then ask what aspects of Program A "worked."
This must be done with a large "N" of programs. A sketch of one standardized effect-size computation follows.
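For comparisons like "A = 0.5, B = 0.3" to be meaningful across projects, effects need a common standardized scale. One common choice (an assumption here; the slides do not specify the metric) is Cohen's d, sketched below with hypothetical data:

```python
# Standardized effect sizes for cross-program comparison (hypothetical data).
import numpy as np

def cohens_d(treatment, comparison):
    """Standardized mean difference using the pooled standard deviation."""
    nt, nc = len(treatment), len(comparison)
    pooled_var = ((nt - 1) * np.var(treatment, ddof=1)
                  + (nc - 1) * np.var(comparison, ddof=1)) / (nt + nc - 2)
    return (np.mean(treatment) - np.mean(comparison)) / np.sqrt(pooled_var)

rng = np.random.default_rng(5)
program_a = rng.normal(55, 10, 80)   # post-test scores under Program A
program_b = rng.normal(53, 10, 80)   # post-test scores under Program B
control = rng.normal(50, 10, 80)     # shared comparison group
print(f"Program A: d = {cohens_d(program_a, control):.2f}")
print(f"Program B: d = {cohens_d(program_b, control):.2f}")
```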
Comparison Example
Carnegie (Matt Ellinger):
- Formative assessment (feedback to the programs involved)
- Four programs with math/math-ed collaboration; seven sections
- Place value is the content focus
- An LMT instrument focused on place value serves as the pre/post measure
- No comparison/control group; relies on internal variation
Comparison Example
Mathematical Education of Elementary Teachers (Raven McCrory):
- 37 sections, 27 instructors, 13 institutions
- 588 total matched-pair student responses
- Can compare outcomes by program characteristics:
  - Instructor surveys of topics taught
  - Textbook used and chapters covered
  - Cognitive demand measure (based on Adding It Up)
  - Instructor characteristics
Randomized Example: Hill (fall 2007)
Four conditions, each measured with classroom video before and after:
- Lesson Study: Videopre → Videopost
- Math Content: Videopre → Videopost
- Coaching: Videopre → Videopost
- Records of Practice: Videopre → Videopost
Conclusion Don’t be arbitrary Link to many instruments described here www.sitemaker.umich.edu/lmt Good design advice: Institute for Social Research: Robin Jacob (rjacob@umich.edu) Local university-based evaluators