
Michigan Assessment Consortium Common Assessment Development Series Module 16 – Validity

Michigan Assessment Consortium Common Assessment Development Series Module 16 – Validity. Developed by Bruce R. Fay, PhD, Wayne RESA, and James Gullen, PhD, Oakland Schools.



Presentation Transcript


  1. Michigan Assessment Consortium Common Assessment Development Series Module 16 – Validity

  2. Developed by Bruce R. Fay, PhD Wayne RESA James Gullen, PhD Oakland Schools

  3. Support The Michigan Assessment Consortium professional development series in common assessment development is funded in part by the Michigan Association of Intermediate School Administrators in cooperation with …

  4. In Module 16 you will learn about • Validity: what it is, what it isn’t, why it’s important • Types/Sources of Evidence for Validity

  5. Validity & Achievement Testing – The Old(er) View Validity is the degree to which a test measures what it is intended to measure. This view suggests that validity is a property of a test.

  6. Validity and Achievement Testing – The New(er) View • Validity relates to the meaningful use of results • Validity is not a property of a test • Key question: Is it appropriate to use the results of this test to make the decision(s) we are trying to make?

  7. Validity & Proposed Use “Validity refers to the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of the tests.” (AERA, APA, & NCME, 1999, p. 9)

  8. Validity as Evaluation “Validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment.” (Messick, 1989, p. 13)

  9. Meaning in Context • Validity is contextual – it does not exist in a vacuum • Validity has to do with the degree to which test results can be meaningfully interpreted and correctly used with respect to a question to be answered or a decision to be made – it is not an all-or-nothing matter

  10. Prerequisites to Validity Certain things have to be in place before validity can be addressed

  11. Reliability • A property of the test • Statistical in nature • “Consistency” or repeatability • The test actually measures something
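
The "statistical in nature" point can be made concrete with a small computation. The sketch below is a hypothetical illustration (the data and function are not from the module): it estimates internal-consistency reliability with Cronbach's alpha, one common reliability coefficient, on an invented matrix of item scores.

```python
# Hypothetical illustration: internal-consistency reliability
# (Cronbach's alpha) computed on an invented item-score matrix.

def cronbach_alpha(scores):
    """scores: one list per student, one 0/1 (or point) score per item."""
    k = len(scores[0])  # number of items

    def variance(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    item_vars = [variance([s[i] for s in scores]) for i in range(k)]
    total_var = variance([sum(s) for s in scores])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Five students, four dichotomously scored items (1 = correct)
data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(cronbach_alpha(data), 3))  # prints 0.8
```

Coefficients near or above 0.8 are often treated as adequate for classroom-level decisions, though what counts as "reliable enough" depends on the stakes of the decision being made.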

  12. Fairness • Freedom from bias with respect to: • Content • Item construction • Test administration (testing environment) • Anything else that would cause differential performance based on factors other than the student's knowledge/ability with respect to the subject matter

  13. The Natural Order of Things Reliability precedes Fairness precedes Validity. Only if a test is reliable can you then determine if it is fair, and only if it is fair, can you then make any defensible use of the results. However, having a reliable, fair test does not guarantee valid use.

  14. Validity Recap • Not a property of the test • Not essentially statistical • Interpretation of results • Meaning in context • Requires judgment

  15. Types/Sources of Validity • Internal Validity: Face, Content, Response, Criterion (int), Construct • External Validity: Criterion (ext), Concurrent, Predictive, Consequential

  16. Internal Validity • Practical: Content, Response, Criterion (int) • Not so much: Face, Construct

  17. External Validity • Criterion (ext) • Consequential • Relates directly to the “correctness” of decisions based on results • Usually established over multiple cases and time • Usually statistical (measures of association or correlation) • Requires the existence of other tests or points of quantitative comparison • May require a “known good” assumption
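
The "usually statistical" bullet can be illustrated with a simple correlation. The following is a hypothetical sketch (the scores are invented, not from the module): it correlates local test scores with scores from an assumed "known good" external measure; a strong positive Pearson r is one piece of criterion-related evidence.

```python
# Hypothetical illustration: criterion evidence via correlation of local
# test scores with an assumed "known good" external measure. Invented data.

def pearson_r(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

local = [62, 75, 81, 55, 90, 70, 68, 84]   # local common assessment
state = [58, 72, 85, 60, 88, 65, 70, 80]   # "known good" external test
print(round(pearson_r(local, state), 3))
```

A high correlation supports (but does not by itself establish) the claim that the local test ranks students in roughly the same way as the trusted measure.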

  18. To Validate or Not to Validate… …is that the question?

  19. Decision-making without data… is just guessing.

  20. But use of improperly validated data… …leads to confidently arriving at potentially false conclusions.

  21. Practical Realities Although validity is not a statistical property of a test, both quantitative and qualitative methods are used to establish evidence of validity for any particular use. Many of these methods are beyond the scope of what most schools/districts can do for themselves…but there are things you can do

  22. Clear Purpose Be clear and explicit about the intended purpose for which a test is developed and how the results are to be used

  23. Documented Process Implementing the process outlined in this training, with fidelity, will provide a big step in this direction, especially if you document what you are doing

  24. Internal First, then External • Focus first on Internal Validity • Content • Response • Criterion • Focus next on External Validity • Concurrent • Predictive • Consequential

  25. Content & Criterion Evidence • Create the foundation for these by: • Using test blueprints to design and explicitly document the relationship (alignment and coverage) of the items on a test to content standards • Specifying appropriate numbers, types, and levels of items for the content to be assessed

  26. More on Content & Criterion • Have test items written and reviewed by people with content/assessment expertise using a defined process such as the one described in this series. Be sure to review for bias and other criteria. • Create rubrics, scoring guides, or answer keys as needed, and check them for accuracy

  27. It’s Not Just the Items… • Establish/document administration procedures • Determine how the results will be reported and to whom. Develop draft reporting formats.

  28. Field Testing and Scoring • Field test your assessment • Evaluate the test administration • For open-ended items, train scorers and check that scoring is consistent (establish inter-rater reliability) • Create annotated scoring guides using actual (anonymous) student papers as exemplars
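
Inter-rater reliability, mentioned above, can be quantified. Below is a minimal sketch with invented rater scores (not part of the module), using Cohen's kappa, a standard chance-corrected agreement statistic for two scorers.

```python
# Hypothetical illustration: Cohen's kappa as a check on scorer agreement
# for an open-ended item. Both raters' score lists are invented.

def cohens_kappa(rater_a, rater_b):
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    # proportion of exact agreements
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # agreement expected by chance from each rater's score distribution
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)

a = [2, 3, 1, 2, 3, 2, 1, 3, 2, 2]  # rater A's rubric scores
b = [2, 3, 1, 2, 2, 2, 1, 3, 3, 2]  # rater B's rubric scores
print(round(cohens_kappa(a, b), 3))  # prints 0.677
```

Kappa corrects for the agreement two raters would reach by chance alone, so it is a sterner check than raw percent agreement.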

  29. Field Test Results Analysis • Analyze the field test results for reliability, bias, and response patterns • Make adjustments based on this analysis • Report results to field testers and evaluate their ability to interpret the data and make correct inferences/decisions • Repeat the field testing if needed
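
The response-pattern analysis mentioned above typically starts with per-item statistics. A hypothetical sketch (invented data, not from the module): item difficulty (proportion correct) and a corrected discrimination index, the correlation of each item with the rest-of-test total.

```python
# Hypothetical illustration: basic field-test item statistics --
# difficulty (p-value) and corrected point-biserial discrimination.

def correlation(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

def item_stats(matrix):
    """matrix: one list per student of 0/1 item scores."""
    n, k = len(matrix), len(matrix[0])
    stats = []
    for i in range(k):
        item = [row[i] for row in matrix]
        rest = [sum(row) - row[i] for row in matrix]  # total minus this item
        stats.append((sum(item) / n, correlation(item, rest)))
    return stats

data = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
for p, disc in item_stats(data):
    print(round(p, 2), round(disc, 2))
```

Items with very extreme difficulty or low (especially negative) discrimination are the usual candidates for revision before the next round of field testing.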

  30. How Good is Good Enough? • Establish your initial performance standards in light of your field test data, and adjust if needed • Consider external validity by “comparing” pilot results to results from other “known good” tests or data points

  31. Ready, Set, Go! (?) When the test “goes live,” take steps to ensure that it is administered properly; monitor and document this, noting any anomalies

  32. Behind the Scenes Ensure that tests are scored accurately. Pay particular attention to the scoring of open-ended items. Use a process that allows you to check on inter-rater reliability, at least on a sample basis

  33. Making Meaning • Ensure that test results are reported: • Using previously developed formats • To the correct users • In a timely fashion • Follow up on whether the users can/do make meaningful use of the results

  34. Conclusion
