Take-Home Message: Principles Unique to Alternate Assessments. William D. Schafer, University of Maryland
What is the Inference? • Validity depends on the inference that is to be taken from an assessment. • For alternate assessments, it makes most sense to consider a summative PROGRAM EVALUATION inference. • The crucial inference is: Did the student’s teacher meet his or her educational goals?
Interpreting Scores • In order to make any inference, a student’s score needs to be contextualized • For NCLB assessments, it makes most sense to contextualize the score using achievement levels and their associated cut-scores • Criterion referencing rather than norm referencing
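A minimal sketch of criterion-referenced interpretation, assuming a single score scale with ordered cut scores. The cut scores (40 and 60), the level labels, and the function name `achievement_level` are hypothetical and are not taken from any actual assessment; the point is only that a score is interpreted against fixed criteria rather than against other students' scores.

```python
from bisect import bisect_right

# Hypothetical values for illustration only.
CUT_SCORES = [40, 60]  # lowest score that earns the 2nd and the 3rd level
LEVELS = ["Basic", "Proficient", "Advanced"]


def achievement_level(score, cuts=CUT_SCORES, levels=LEVELS):
    """Return the achievement level whose score range contains `score`.

    A score exactly equal to a cut score is placed in the higher level.
    """
    return levels[bisect_right(cuts, score)]


if __name__ == "__main__":
    for s in (39, 40, 75):
        print(s, achievement_level(s))
```

Because the level depends only on the cut scores, two students with the same score always receive the same level, whatever the rest of the group scored.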
Evaluating Cut Scores • Cut-score reliability and validity are as important as are reliability and validity of student scores • Cut scores and proficiency level descriptions help implement the FUNDAMENTAL ACCOUNTABILITY PRINCIPLE: Test every student on what they are supposed to be studying • For the regular assessment, cut score reliability and validity are developed through standard-setting studies • Alternate Assessments are different:
Individualizing Success • All students who take Alternate Assessments have IEPs • Students’ educational goals may be individualized (e.g., through IEPs) • Achievement standards should also be individualized • Judgments about reliability and validity of achievement standards (criteria) should reflect that individualization
Grouping for Psychometric Study • Groupings of students may make sense in order to generate reliability and validity evidence • Degree of challenge might be classified as low, medium, or high, or by some other system • Age of diagnosis might be a proxy for degree of challenge • Qualitative differences might also be used to develop groups
An Expectation for Validity Evidence • A positive outcome for validity evidence would be to find that the degree of challenge a student faces is independent of the achievement-level judgment that student receives • This is my belief, but it is controversial • Others believe that, as with the regular assessment, we should expect scores to reveal lower achievement (and lower achievement levels) for the students who are most challenged • This is a fundamental philosophical principle that separates alternate assessments from each other
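One way the independence expectation above might be examined is with a contingency table of challenge group by achievement level. The sketch below computes the Pearson chi-square statistic and Cramér's V using only the standard library. The function name and the counts are invented for illustration, and the slides do not prescribe this particular statistic; it is one plausible choice among several.

```python
from math import sqrt


def chi_square_independence(table):
    """Return (chi2, dof, cramers_v) for a table of counts.

    Rows are challenge groups and columns are achievement levels.
    Assumes at least two rows and two columns, each with a nonzero total.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    cramers_v = sqrt(chi2 / (n * min(len(row_totals) - 1, len(col_totals) - 1)))
    return chi2, dof, cramers_v


if __name__ == "__main__":
    # Hypothetical counts: rows = low / medium / high challenge,
    # columns = three achievement levels.
    counts = [[10, 20, 10], [5, 10, 5], [20, 40, 20]]
    print(chi_square_independence(counts))
```

A Cramér's V near zero is consistent with independence, although a small statistic in a small sample is weak evidence either way.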
Reliability: True Variance vs. Replicability • We should be more interested in documenting the capacity for replication of results than in identifying individual differences (traditional reliability) • True variance is not a useful construct and neither is variance partitioning • More useful is to conceptualize reliability as sufficiency of evidence for replication • Decision Accuracy for Alt-MSA is an example
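To make the replication view concrete, the sketch below compares two independent classifications of the same students (for example, two separate reviews of the same evidence) and reports the proportion of matching decisions along with Cohen's kappa. This is not the Alt-MSA decision accuracy procedure, which the slides do not describe; the function name and the data are hypothetical.

```python
from collections import Counter


def agreement_and_kappa(first, second):
    """Return (observed agreement, Cohen's kappa) for two classifications.

    `first` and `second` hold the achievement level assigned to each
    student on two independent occasions, in the same student order.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")

    n = len(first)
    observed = sum(a == b for a, b in zip(first, second)) / n

    # Agreement expected by chance, from each occasion's marginal proportions.
    counts_1, counts_2 = Counter(first), Counter(second)
    expected = sum(counts_1[k] * counts_2[k] for k in counts_1) / (n * n)

    if expected == 1.0:
        # Every decision fell in one category; kappa is undefined.
        return observed, float("nan")
    return observed, (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical decisions: P = proficient, B = below proficient.
    review_1 = ["P", "P", "B", "B"]
    review_2 = ["P", "P", "B", "P"]
    print(agreement_and_kappa(review_1, review_2))
```

Neither index depends on how much students differ from one another in a true-score sense; both ask only whether the same decision would be reached again.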