
The use of the Test Standards with evolving program uses

This presentation explores the use of test standards in evolving educational assessment programs, including K-12 statewide assessments, admissions testing, and classroom-based assessments. The speaker discusses the five sources of validity evidence and highlights the development processes, response processes, and consequences associated with each type of assessment. The presentation also addresses the challenges and options for improving validity arguments in testing programs.


Presentation Transcript


  1. The use of the Test Standards with evolving program uses. Presented at the annual meeting of the National Council on Measurement in Education, April 29, 2017. Andrew Wiley, PhD

  2. TEST STANDARDS
  • Standard 1.1: The test developer should set forth clearly how test scores are intended to be interpreted and consequently used.
  • Standard 1.2: A rationale should be presented for each intended interpretation of test scores for a given use.
  • Standard 1.3: If validity for some common or likely interpretation for a given use has not been evaluated, …, that fact should be made clear …

  3. Test Standards – 5 sources of validity evidence
  • Evidence based on test content
  • Evidence based on response processes
  • Evidence based on internal structure (see the sketch below)
  • Evidence based on relations to other variables
  • Evidence for validity and consequences of testing
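To make the internal-structure source a little more concrete, here is a minimal Python sketch of coefficient alpha, one internal-consistency statistic commonly reported as part of internal-structure evidence. The score matrix and function are illustrative, not from the presentation.

```python
# A minimal sketch (hypothetical data) of coefficient alpha,
# one common piece of internal-structure/consistency evidence.
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Coefficient alpha for an (examinees x items) score matrix."""
    k = item_scores.shape[1]                         # number of items
    item_vars = item_scores.var(axis=0, ddof=1)      # per-item variances
    total_var = item_scores.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Toy data: 5 examinees x 4 dichotomously scored items.
scores = np.array([
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
])
print(f"alpha = {cronbach_alpha(scores):.2f}")
```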

  4. So let’s think about different types of educational assessment programs
  • K-12 statewide assessment programs
    • Grades 3 to 8
    • Grades 10, 11, EOC
  • Admissions testing
  • Classroom-based assessments
    • Formative
    • Interim

  5. K-12 statewide assessment program
  • Developmental processes
    • Who was involved?
    • What steps/policies did they follow?
    • What review procedures were in place?
  • Response processes
    • How did they ensure that test takers were responding as expected?
  • Consequences
    • How did the introduction of the assessment program impact students’ progression across grade levels?

  6. ADMISSIONS Assessment
  • Relations with other variables
    • What is the relationship between test scores and eventual performance in the school/program of interest?
  • Fairness
    • Do test scores predict accurately and consistently across critical demographic groups (e.g., race/ethnicity, gender, language)? (see the sketch below)
  • Development processes
    • Does the content of the assessment reflect critical KSAs required for success in the program of interest?
  • Consequences
    • How does the use of the assessment for admissions impact the overall admissions process, including who applies and who is accepted?
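The fairness question on this slide is often examined with a Cleary-style differential-prediction model: regress the criterion on test score, group membership, and their interaction, and look for large group or interaction effects. A hedged sketch follows; the variable names (gpa, score, group) and the data are hypothetical illustrations, not the speaker's analysis.

```python
# A hedged sketch of a Cleary-style differential-prediction check.
# All data and variable names below are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.DataFrame({
    "gpa":   [3.1, 2.4, 3.8, 2.9, 3.5, 2.2, 3.0, 3.6, 2.7, 3.3],
    "score": [ 55,  40,  70,  52,  66,  38,  50,  68,  45,  60],
    "group": ["A", "A", "A", "A", "A", "B", "B", "B", "B", "B"],
})

# A large group main effect or score-by-group interaction suggests the
# test over- or under-predicts the criterion for one subgroup.
model = smf.ols("gpa ~ score * C(group)", data=df).fit()
print(model.summary())
```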

  7. CLASSROOM BASED ASSESSMENT
  • Developmental procedures
    • What steps/procedures were followed when determining the KSAs that would be assessed?
  • Relations with other variables
    • What type of alignment evidence is available, and what procedures were followed during this work?
    • What is the relationship between scores on this assessment and other high-stakes assessments (e.g., statewide, admissions)? (see the sketch below)
  • Consequences
    • How has the use of the program impacted other aspects of the educational program (e.g., time available for teaching, classroom activities, impact on curriculum)?
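Relations-to-other-variables evidence can start as simply as a correlation between classroom assessment scores and a high-stakes assessment taken by the same students. A minimal sketch with hypothetical score vectors:

```python
# A minimal sketch: correlation between classroom assessment scores and a
# statewide test for the same students (all numbers hypothetical).
from scipy.stats import pearsonr

classroom = [72, 65, 88, 54, 91, 70, 63, 80]
statewide = [68, 60, 85, 50, 94, 74, 58, 77]

r, p = pearsonr(classroom, statewide)
print(f"r = {r:.2f} (p = {p:.3f})")  # one piece of relations-to-other-variables evidence
```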

  8. But then things get a little messy
  • Use of an admissions test for K-12 accountability
  • Diagnostic feedback from a K-12 assessment
    • Students
    • Teachers
    • Schools
  • Use of classroom assessments for
    • Promotion to the next grade
    • Teacher evaluations
    • School funding

  9. Building a comprehensive validity argument
  • I think we have to acknowledge that, for most testing programs, the evidence supporting the program is gathered in an environment where resources are limited and budgets do not allow every appropriate study to be completed.
  • Can we figure out better ways to identify which components should be considered essential, which could wait a little while, and which can reasonably be postponed for significant periods of time?

  10. Let’s highlight some specific standards

  11. Evolving from classroom to higher stakes
  • Test and item development procedures
  • Administration practices
  • Fairness
  • Validation

  12. Test and item development procedures
  • Standard 4.0: Test developers should document steps taken during the design and development process to provide evidence of fairness, reliability, and validity for intended uses …
  • Standard 4.9: When items or test form tryouts are conducted, the procedures used for selecting the sample(s) of test takers, as well as the resulting characteristics of the sample(s), should be documented.

  13. Test and item development procedures: as stakes begin to rise
  • Revamp/replace the entire item pool
    • Costly / lost resources (items)
    • Time-consuming; long time to implement
  • Link to historical test forms (see the equating sketch below)
    • Gradual revision following best practices
    • Uneven performance on items/test forms
    • Timeline for “reaching the goal” is tenuous
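Linking to historical test forms typically involves some form of equating. Below is a minimal sketch of linear equating under a random-groups assumption; the design, seed values, and scores are hypothetical and stand in for whatever linking plan a program actually documents.

```python
# A hedged sketch of linear equating (random-groups design): map new-form
# scores onto the old form's scale by matching means and standard deviations.
import numpy as np

def linear_equate(x, old_scores, new_scores):
    """Transform a new-form score x onto the old form's scale."""
    mu_old, sd_old = np.mean(old_scores), np.std(old_scores, ddof=1)
    mu_new, sd_new = np.mean(new_scores), np.std(new_scores, ddof=1)
    return mu_old + (sd_old / sd_new) * (x - mu_new)

old = np.random.default_rng(0).normal(500, 100, 2000)  # historical form (hypothetical)
new = np.random.default_rng(1).normal(480,  95, 2000)  # new form (hypothetical)
print(round(linear_equate(480, old, new)))  # ~500: new-form mean maps near old-form mean
```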

  14. Test and item development procedures: as stakes begin to rise
  Some other options to consider:
  • Retroactively conduct independent reviews of items
  • Focus groups to evaluate how students read/react to items

  15. Test administration
  • Standard 6.0: …Assessment instruments should have established procedures for test administration, scoring, reporting, and interpretation. Those responsible for administering, scoring, reporting, and interpreting should have sufficient training and supports to help them follow the established procedures.

  16. Test administration
  Some options to consider:
  • Introduction of new standardized procedures
  • Loss of flexibility
  • Resource requirements for test administration
  • Comparability between old and new test administrations
  • Customer compliance with new/updated administration protocols

  17. Fairness in testing
  • Standard 3.3: Those responsible for test development should include relevant subgroups in validity, reliability/precision, and other preliminary studies used when constructing the test.
  • Standard 3.5: Test developers should specify and document provisions that have been made to test administration and scoring procedures to remove construct-irrelevant barriers for all relevant subgroups in the test-taker population.
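Standard 3.3's call to include relevant subgroups in preliminary studies is often operationalized with differential item functioning (DIF) analyses. Here is a minimal sketch of the Mantel-Haenszel procedure; the stratum counts are hypothetical and the slide itself does not prescribe this method.

```python
# A minimal Mantel-Haenszel DIF sketch: compare item performance for the
# reference vs. focal group within strata of matched total score.
import numpy as np

# Per score stratum: [reference right, reference wrong, focal right, focal wrong]
strata = np.array([
    [30, 20, 22, 28],
    [45, 15, 35, 25],
    [60, 10, 50, 20],
])

A, B, C, D = strata.T
N = strata.sum(axis=1)
alpha_mh = (A * D / N).sum() / (B * C / N).sum()  # common odds ratio
mh_d_dif = -2.35 * np.log(alpha_mh)               # ETS delta metric
print(f"alpha_MH = {alpha_mh:.2f}, MH D-DIF = {mh_d_dif:.2f}")
```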

  18. Fairness in testing
  Some options to consider:
  • Changing performance standards moving forward
  • Continuity between old and new content
    • Use of the “old content” while the new content is being developed
    • Length of time to create new content
  • Retroactive committee review of the current item pool
    • Loss of items/test forms
    • Can be slightly less time-consuming for the initial phase

  19. Validation
  • Standard 1.0: Clear articulation of each intended test score interpretation for a specified use should be set forth, and appropriate validity evidence in support of each intended interpretation should be provided.

  20. Validation – Employment testing
  [Diagram: predictor measure and criterion measure linked to the predictor construct domain and criterion construct domain, with five numbered inferential links.]

  21. Some other possible changes in use
  • Use of an admissions test for K-12 accountability
  • Diagnostic feedback from a K-12 assessment

  22. Validation – Employment testing
  [Diagram repeated: predictor measure and criterion measure linked to the predictor construct domain and criterion construct domain, with five numbered inferential links.]

  23. Some more stuff
  • I want to add some material here relating K-12 and admissions testing to the Test Standards, including:
    • The rights of test takers to fair and accurate information
    • The right to appropriate information to help them prepare (repeat test takers’ right to know why they failed)
  • Also, consequential validity issues when admissions testing is used for K-12:
    • Value to the community
    • Load of testing requirements
