
It’s a myth: High stakes cause test score inflation

Richard P. Phelps. International Test Commission, 11th Conference, July 4, 2018, Montréal, Canada.


Presentation Transcript


  1. It’s a myth: High stakes cause test score inflation
     Richard P. Phelps
     International Test Commission, 11th Conference, July 4, 2018, Montréal, Canada

  2. Educational testing in the US: early 1980s International Test Commission, 11th Conference, Montreal, Canada

  3. Educational testing in the US: 1980s
     • Student testing with stakes reintroduced: late 1970s, early 1980s
     • Debra P. v. Turlington
     • “Truth in testing” laws

  4. John J. Cannell, M.D.
     • Residency in rural, poor Appalachia, 1980s
     • Surprised by claims that state and school district scored “above average” on national tests
     • Investigated; found that all US states claimed to be “above average”

  5. “Welcome to Lake Wobegon, where all the women are strong, all the men are good-looking, and all the children are above average.”
     - Garrison Keillor, A Prairie Home Companion

  6. Cannell’s suspects
     • Lax security
     • Outdated or invalid norms
     • Deliberate educator manipulation (i.e., cheating)

  7. “While supporting Cannell’s general finding … our analyses lead us to conclusions that are different, and certainly less sensational, than the ones he reached.”
     - Linn, Graue, Sanders, CRESST, 1990
     “There are many reasons for the Lake Wobegon Effect, most of which are less sinister than those emphasized by Cannell.”
     - Linn, CRESST, 2000

  8. CRESST’s Lake Wobegon suspects
     • Outdated or invalid norms
     • High stakes that induce “teaching to the test” (i.e., test coaching) under pressure

  9. CRESST counters Cannell’s Lake Wobegon study with its own, 1991
     • Students took a test for a few years. Scores rose.
     • Then they took a “competing test” the district had used before. Scores fell.

  10. CRESST 1991 “Generalization” Study
      • 3 tests in the study
      • Annual NRT
      • Parallel form
      • A “competing” NRT

  11. CRESST 1991 “Generalization” Study
      • Unnamed school district
      • Unnamed tests
      • Neither replicable nor falsifiable
      • A conference presentation; not peer-reviewed
      • Called an “experiment”, but no controls for test security or other factors

  12. 1991 CRESST “Generalization” Study: the study’s assumptions
      1. Publication of aggregate results = “high stakes”
      2. “Competing” NRTs should get the same results
      3. “Test coaching” improves scores
      4. Low-stakes test scores are reliable and can be used to benchmark unreliable high-stakes scores
      5. High stakes cause test-score inflation?

  13. 1. Publication of aggregate results = high stakes?
      “... Such tests include the many statewide achievement tests whose results are reported by local newspapers on a school-by-school or district-by-district basis.”
      - Jim Popham, “high stakes” definition, 1987

  14. 2. Research: Comparability of different tests
      Scores Comparable? / Scores Not Comparable
      • NRTs: Freeman, Kuhs, Porter, Floden, Schmidt, Schwille (1983); Debra P. v. Turlington (1984); Cohen, Spillane (1993); La Marca, Redfield, Winter, Bailey, and Despriet (2000); Wainer (2011)
      • Standards: Archbald (1994); Buckendahl, Plake, Impara, Irwin (2000); Bhola, Impara, Buckendahl (2003); Phelps (2005)
      • CRTs: Massell, Kirst, Hoppe (1997); Wiley, Hembry, Buckendahl, Forte, Towles, Nebelsick-Gullett (2015)

  15. 3. Research: Effects of test coaching
      • It works (significant score increase from learning format tricks): Aldeman & Powers (1980); Samson (1985); Scruggs (1985); Roznowski & Bassett (1992); McMann (1994); Holmes, Keffer (1995); Camel & Chung (2002); Filizola (2008)
      • It doesn’t work (negligible score increase): Messick & Jungeblut (1981); Ellis, Konoske, Wulfeck, & Montague (1982); DerSimonian and Laird (1983); Kulik, Bangert-Drowns & Kulik (1984); Fraker (1986/1987); Halpin (1987); Whitla (1988); Snedecor (1989); Becker (1990); Smyth (1990); Moore (1991); Alderson & Wall (1992); Powers (1993); Powers & Rock (1994); Scholes, Lane (1997); Allalouf & Ben Shakhar (1998); Robb & Ercanbrack (1999); McClain (1999); Camara (1999, 2001, 2008); Stone & Lane (2000, 2003); Din & Soldan (2001); Briggs (2001); Palmer (2002); Briggs & Hansen (2004); Cankoy & Ali Tut (2005); Crocker (2005); Allensworth, Correa, & Ponisciak (2008); Domingue & Briggs (2009)

  16. 4. Research: Low-stakes test reliability
      • Not reliable (student effort varies; scores easy to manipulate): Rothe (1947); Jennings (1953); Uguroglu, Walberg (1979); Taylor & White (1981); Arvey, et al. (1990); Schmit, Ryan (1992); Brown & Walberg (1993); Kim, McLean (1995); Wolf, Smith (1995); Wolf, Smith, DiPaulo (1996); Schiel (1996); Sundre (1999); Sundre, Moore (2002); Sundre, Wise (2003); DeMars (2000); Wise (2006a, 2006b); Wise, DeMars (2005, 2005, 2006, 2010); Wise, et al. (2009); Hoyt (2001); Eklof (2006, 2007, 2010); List, Livingston, Neckerman (2016); etc.
      • Reliable (“no incentive to manipulate scores”): Kiplinger, Linn (1992); O’Neil, Sugrue, Baker (1995)*; Hout, Elliot (2011)
      * 1 of 2 groups

  17. 5. High stakes cause test score inflation? Then, why no score inflation with certification and licensure tests?

  18. Large-scale internally-administered test, tight security

  19. Large-scale internally-administered test, lax security

  20. Test Score Inflation Occurs where Security is Lax
      • Cannell found score inflation in elementary school tests in dozens of states – none of those tests had high stakes.
      • Cannell also found score inflation in secondary school tests in dozens of states – only one had high stakes.

  21. Harms of misinformation
      1. Unfairly discredits useful evaluation tool
      2. Test security (in U.S.) remains shoddy
      3. Teachers given mixed messages
      4. Now spreading worldwide

  22. 1. Uniquely useful evaluation tool is discredited
      … and, in the US, the only objective measure available to the public (i.e., not under the control of insiders).

  23. 2. Test security (in U.S.) remains shoddy
      • ACT, SAT, PARCC, SBAC now administered statewide by schools, on varying dates.
      • Tests save money, hassle, gain customers by outsourcing (or ignoring) test security.

  24. 3. Teachers given mixed messages
      • “Teaching to the test” is unethical; don’t do it! Teach content beyond the standards.
      • “Teaching to the test” works! You and your students will be better off if you do it!

  25. 4. Misinformation spreading worldwide

  26. Cover-up successful; most believe CRESST’s version
      • Cannell’s work was an opportunity to fix a large problem.
      • US education chose to deny, confuse, and cover up.
      • This unfortunate tendency blocks genuine progress.

  27. http://nonpartisaneducation.org/Review/Articles/v6n3.htm
      richard@nonpartisaneducation.org
