Accommodations Research: Reconsidering the Test Accommodation Validation Process: A Paradigm for Research Design With Initial Outcomes • Gerald Tindal, University of Oregon
Accommodations: Issues and Options • A Bi-polar Condition OR Universal Design of Something
Marshall McLuhanisms • Why is it so easy to acquire the solutions of past problems and so difficult to solve current ones? • Mud sometimes gives the illusion of depth. • The answers are always inside the problem, not outside. • “I may be wrong, but I’m never in doubt.” • You mean my whole fallacy’s wrong?
The Measurement Conundrum • “Fixing a condition of measurement reduces error and increases the precision of measurements, but it does so at the expense of narrowing interpretations of measurements” (Brennan, 2001, p. 2). • The reliability-validity paradox: Attempts to increase reliability through standardization can actually lead to a decrease in the validity of interpretations.
Distinguishing between Nouns and Verbs • Constructs • Meaning and Interpretation • Construct Irrelevant Variance • Construct Under- or Misrepresentation • To Construct: The Test Environment • Contexts and Settings • Expected Routines • Enacted Behaviors
Research Basis • Naturalistic Evaluations • That it works • Quasi-experimental Studies • Sometimes it works • Experimental Studies • How it works
The Unit of Analysis • Test Level • Bundled Items • Variation in Skills • Reporting Categories • Item Level • Specific Skills • Difficulty and Discrimination • Differential Item Functioning
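The item-level statistics named on this slide can be made concrete with a minimal sketch. It assumes classical test theory definitions, which the slides do not spell out: difficulty as the proportion of students answering an item correctly, and discrimination as the correlation between the item score and the score on the rest of the test. The function names and the response matrix are hypothetical, for illustration only.

```python
from statistics import mean, pstdev


def item_difficulty(item_scores):
    """Proportion of students who answered the item correctly (0/1 scores)."""
    return mean(item_scores)


def item_discrimination(item_scores, rest_scores):
    """Pearson correlation between a 0/1 item score and the rest-of-test score."""
    sx, sy = pstdev(item_scores), pstdev(rest_scores)
    if sx == 0 or sy == 0:
        return float("nan")  # undefined when either variable has no variance
    mx, my = mean(item_scores), mean(rest_scores)
    cov = mean([(x - mx) * (y - my) for x, y in zip(item_scores, rest_scores)])
    return cov / (sx * sy)


# Hypothetical responses: rows are students, columns are items (1 = correct).
responses = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

for j in range(len(responses[0])):
    item = [row[j] for row in responses]
    # Rest score excludes the item itself so it does not correlate with itself.
    rest = [sum(row) - row[j] for row in responses]
    print(j, item_difficulty(item), round(item_discrimination(item, rest), 3))
```

Computing these per item, separately for accommodated and non-accommodated administrations, is one way to look below the test-level score; differential item functioning would need a further group comparison that this sketch does not attempt.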
Keeping Score For All • The effects of inclusion and accommodation policies on large-scale educational assessment • National Research Council, 2004
Chapter 1. Introduction: Two Questions • “Do commonly used accommodations yield scores that are comparable to those obtained when accommodations are not used? Do they over- or under-correct for the impediment for which they are designed to compensate?” (p. 13) • “Do commonly used accommodations alter the construct being tested? What methods should be used for evaluating the effects of a particular accommodation on the validity of test results?” (p. 13)
Criticisms of Previous Research • Score Gains • Differential score gain vs. overall score gain • Quasi-experimental studies • Descriptive research • Population comparisons
Chapter 2. Characteristics of SWD and Assessments and Accommodations • Identification rates • Legal mandates • State testing programs • Allowable accommodations • No mention of skill levels • No mention of disaggregated performances • No mention of rationale for accommodations given type of test
Chapter 3. Participation in NAEP • Considerable increase in participation of SWD and ELL from 1992 to 1998 • Use of 3 cohorts to study the effects of accommodations (and maintain integrity to past data): (a) SWD excluded, (b) no accommodations, (c) accommodations permitted.
Conclusion on Participation • “The increased use of accommodations with NAEP assessments has corresponded to increased participation rates for students with disabilities and English Language Learners.” • So what?
What We Don’t Know • What accommodations have been used in NAEP (singly and bundled). • The SKILL of the student (versus the disability). • Use of the accommodation in instruction. • Teacher recommendation for accommodation (on NAEP or state test) • Performance levels on accommodated versus non-accommodated items.
Chapter 4. Factors that Affect Accuracy of NAEP's Estimates of Achievement • Comparison of states that allow an accommodation and NAEP allowance of the accommodation. • Representativeness of NAEP samples and comparison of NAEP with national samples. • “Decision making regarding the inclusion or exclusion of students and the use of accommodations for NAEP is controlled at the school level. There is variability in the way these decisions are made, both across schools within a state and across states” (p. 83).
Recommendations • Review criteria for inclusion and accommodation of SWD and ELL • Clarify, elaborate, and revise the criteria • Standardize implementation of criteria at the school level. • Make policies more consistent between state and NAEP. • “More clearly define the characteristics of the population of students to whom the results are intended to generalize. This definition should serve as a guide for decision-making and the formulation of regulations regarding inclusion, exclusion, and reporting” (p. 84). • Confirm the inclusion rates with state data.
Chapter 5. Available Research on Effects of Accommodations on Validity • “The vast majority of studies pertaining to the interaction hypothesis showed that all student groups (SWD, ELL, and their general education peers) had score gains under accommodation conditions. Moreover, in general, the gains for SWD and ELL were greater than their general education peers under accommodation conditions. These conclusions varied somewhat across student groups and accommodation conditions, as we discuss below” …(p. 60)
Chapter 5. Available Research on Effects of Accommodations on Validity • However, it appears that the interaction hypothesis needs qualification. When SWD or ELL students exhibit greater gains with accommodations than their general education peers, an interaction is present. When the gains experienced by SWD or ELL are significantly greater than the gains experienced by their general education peers, the fact that the general education students achieved higher scores with an accommodation condition does not imply that the accommodation is unfair. It could imply that the standardized test conditions are too stringent for all students (p. 60).
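The distinction between overall and differential score gain that runs through the interaction hypothesis can be sketched in a few lines. This is a minimal descriptive illustration, not the analysis used in the studies the report reviews: the function names are hypothetical, the scores are made up, and no significance test is included.

```python
from statistics import mean


def score_gain(standard_scores, accommodated_scores):
    """Mean score under accommodation minus mean score under standard conditions."""
    return mean(accommodated_scores) - mean(standard_scores)


def differential_gain(target_std, target_acc, comparison_std, comparison_acc):
    """Compare the gain of a target group (e.g. SWD or ELL) with that of peers.

    A positive differential gain means the target group benefited more from
    the accommodation than the comparison group did.
    """
    target = score_gain(target_std, target_acc)
    comparison = score_gain(comparison_std, comparison_acc)
    return {
        "target_gain": target,
        "comparison_gain": comparison,
        "differential_gain": target - comparison,
    }


# Hypothetical scores, for illustration only.
result = differential_gain(
    target_std=[10, 12],
    target_acc=[15, 17],
    comparison_std=[20, 22],
    comparison_acc=[22, 24],
)
print(result)
```

In this made-up case both groups gain, which matches the pattern the chapter describes: a gain for general education peers does not by itself show the accommodation is unfair, and the quantity of interest is the difference between the two gains.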
Chapter 6. Articulating Validation Arguments • Target and ancillary skills required by NAEP reading and math items • Use of claims, data, and warrants • Disconnect with previous literature • No anchor to state and NAEP relationships • No focus on item level
No Right Way to Do a Wrong Thing • The NAEP database, as it is structured, can never address the question of accommodations. • Research designs are lacking. • Data are too global to answer any serious question. • Construct validity at item level is lacking.
What We Know and Don’t Know • Need to consider accommodations as complex packages • Need different research designs than randomized experiments (because of low sample size and inappropriate use of group statistics) • We need to study populations and items more carefully • Smart about items • Smart about people
Two Examples • Kansas Computer-Based Testing • Guidelines, highlight and erase, presentation of passage and item format, mark text with icons, cross-out, synthesized read aloud (in math), mark for review. • Oregon Accommodation Station • MATH: Reading skills analysis, comprehension, computation skills, trial changes: Read aloud in math, simplified, Spanish, and perception survey • READING: word search, split screen, drag and drop, highlight
Design 1 of Research: Smart about Items • Student is presented a standard item • Can I solve the problem as presented? • [Flowchart: decision branches No / Yes; item conditions Standard / Accommodated; outcomes Correct / Incorrect]
Design 2 of Research: Smart about People • Pre Measure Student Reading Fluency • Pre Measure Student Basic Math Skill • Student groups: Low Fluency, Low Math Skill; Low Fluency, Intact Math Skill; Intact Fluency, Low Math Skill; Intact Fluency, Intact Math Skill • [Diagram: item conditions Standard / Simplified / Read Aloud; outcomes Correct / Incorrect]
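The grouping step in Design 2 can be sketched as a simple classification on the two pre-measures. The four group labels come from the slide; the cut scores, score scales, student records, and function names are all hypothetical, and the slide does not recover which accommodation each group receives, so that assignment is deliberately left out.

```python
from collections import defaultdict


def skill_group(fluency, math_skill, fluency_cut, math_cut):
    """Label a student by reading fluency and basic math skill pre-measures."""
    fluency_label = "Intact Fluency" if fluency >= fluency_cut else "Low Fluency"
    math_label = "Intact Math Skill" if math_skill >= math_cut else "Low Math Skill"
    return f"{fluency_label}, {math_label}"


def group_students(students, fluency_cut, math_cut):
    """Map each group label to the list of student identifiers in that group."""
    groups = defaultdict(list)
    for student_id, fluency, math_skill in students:
        label = skill_group(fluency, math_skill, fluency_cut, math_cut)
        groups[label].append(student_id)
    return dict(groups)


# Hypothetical records: (student id, words correct per minute, math pre-test score).
students = [
    ("s1", 60, 5),
    ("s2", 60, 18),
    ("s3", 130, 6),
    ("s4", 140, 19),
]

# Hypothetical cut scores chosen only to make the example run.
groups = group_students(students, fluency_cut=100, math_cut=10)
for label, members in groups.items():
    print(label, members)
```

Grouping on measured skill rather than on disability category is the point of being "smart about people": item outcomes under each condition can then be compared within each of the four groups.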
Assessment Adaptations beyond Research Findings • The ASK Settlement in Oregon • When the Sidewalk Ends: Practice in the Absence of Research • Purpose: What is the construct irrelevant variance? • Function: How does it work? • Error: What are the false positives and false negatives? • Systems: What are the implications for the whole? • Modifications
Accommodations to Modified Achievement Standards • System Level Uniformity • Manipulation of Breadth and Depth • Modified Achievement Standards (2%) • Alternate Achievement Standards (1%) • Meaning of Score Reporting Categories • Exceeds, Meets, Does Not Meet • Consequences as Validation Process • Social policy versus construct validity
Manipulating Breadth and Depth • Content • Grade Level • Context • Applications and experience • Concepts • Attributes and Examples