CCSSO Criteria for High-Quality Assessments
Technical Issues and Practical Application of Assessment Quality Criteria
Background
• Test Validity
• Madaus's call for a Test Monitoring Board
• Unified Concept of Test Validity
• AERA/APA/NCME Standards for Educational and Psychological Testing
• Peer Review
• Criteria for High-Quality Assessments
Technical Quality
• Indicate Progress toward College & Career Readiness
• Valid for Required and Intended Purposes
• Ensuring Reliability
• Valid and Consistent Test Score Interpretations within and across Years
• Accessibility to ALL Students, including English Learners and Students with Disabilities
• Transparency of Test Design and Expectations
• Meeting Requirements for Data Privacy and Ownership
Progress Toward College & Career Readiness
• Description of the process for developing performance level descriptors and setting performance standards, including:
  • Involvement of higher education and career experts
  • Use of external evidence to inform standards
  • Evidence that external benchmarks are valid for the intended purpose
• Description of studies to be conducted to evaluate the validity of the standards over time
Valid for Required and Intended Purposes
• Well-articulated validity evaluation that focuses on:
  • the validity of test score uses
  • scoring and reporting structures consistent with the structure of the state standards
  • total test and sub-scores relating to external variables as expected (illustrated in the sketch below)
  • assessments leading to intended outcomes
  • content validity of test forms and usefulness of score reports
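As a rough illustration of the external-variables evidence listed above, the sketch below correlates total and sub-scores with an outside measure. All scores and variable names are hypothetical placeholders, not data from any state program, and this is only one simple form of convergent evidence, not the CCSSO-prescribed analysis.

```python
import numpy as np

# Hypothetical score vectors, one entry per student; placeholder values only.
total_score      = np.array([310, 275, 342, 298, 330, 260, 315, 289])
reading_subscore = np.array([ 38,  30,  45,  34,  41,  27,  39,  33])
external_measure = np.array([ 21,  17,  25,  19,  23,  15,  22,  18])  # e.g., another readiness exam

# Convergent evidence: scores should correlate with external variables in the
# direction and magnitude the test design predicts.
print(np.corrcoef(total_score, external_measure)[0, 1])
print(np.corrcoef(reading_subscore, external_measure)[0, 1])
```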
Ensuring Reliability
• Reliability of test scores for the total population and reported sub-populations
• Precision of scores at cut points and consistency of student classifications (see the sketch below)
• Generalizability across relevant sources of variance:
  • Variability within and across groups
  • Variability among schools
  • Consistency across test forms
  • Consistency across rater scores
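A minimal sketch, assuming classical test theory, of two quantities a technical report might summarize for this criterion: Cronbach's alpha as one common reliability estimate, and a simple two-form check of classification consistency at a cut score. Operational programs generally use model-based estimators (for example, Livingston-Lewis) rather than this two-form shortcut.

```python
import numpy as np

def cronbach_alpha(item_scores: np.ndarray) -> float:
    """Internal-consistency reliability for an examinees-by-items score matrix."""
    k = item_scores.shape[1]
    item_vars = item_scores.var(axis=0, ddof=1)
    total_var = item_scores.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_vars.sum() / total_var)

def decision_consistency(form_a: np.ndarray, form_b: np.ndarray, cut: float) -> float:
    """Proportion of examinees classified the same way (at/above vs. below the
    cut score) by two parallel forms."""
    return float(np.mean((form_a >= cut) == (form_b >= cut)))
```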
Consistency in Score Interpretations Within and Across Years
• Assessment Forms
  • Comparability within years
  • Linking across years
  • Consistency in meaning across achievement levels
• Score Scales
  • Method used to transform raw scores to scale scores is coherent with the test design and intended claims (see the sketch below)
  • Scaling procedures support the intended interpretations
  • Evidence supports the validity of vertical scales
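The raw-to-scale transformation itself is usually a short, fully specified rule. Below is a minimal sketch assuming a linear transformation of an IRT ability estimate; the slope, intercept, and reporting range are hypothetical placeholders, not any program's published scale.

```python
def theta_to_scale(theta: float,
                   slope: float = 25.0,
                   intercept: float = 500.0,
                   lowest: int = 400,
                   highest: int = 600) -> int:
    """Map an IRT ability estimate to the reporting scale with a linear
    transformation, then round and clip to the published score range."""
    scale = round(slope * theta + intercept)
    return int(min(max(scale, lowest), highest))

print(theta_to_scale(0.8))   # 520 under these placeholder constants
print(theta_to_scale(-5.0))  # clipped to the lowest obtainable score, 400
```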
Accessibility for ALL Students
• Principles of Universal Design
  • Description of the item development process used to reduce construct-irrelevant variance
  • Sample items and interfaces that reflect principles of Universal Design
• Appropriate Accommodations
  • Accessibility features
  • Access to translations and definitions
  • Construct validity of accessibility features
• Evidence that the test items and accessibility features permit English learners and students with disabilities to demonstrate their knowledge, skills, and abilities
Transparency of Test Design
• Test blueprints demonstrate the range of standards covered (a simple coverage check is sketched below)
• Release plan yields a representative sample of items on a regular basis and across grades
• Sample items with annotations and scoring rubrics available
• Item development specifications available
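A test blueprint is essentially a table of standards and allowed item counts, so a form's coverage can be checked mechanically. The sketch below assumes a hypothetical blueprint and form; the standard codes, ranges, and item records are illustrative only.

```python
from collections import Counter

# Hypothetical blueprint: allowed item counts per standard on one form.
blueprint = {"RL.5.1": range(3, 6), "RL.5.2": range(2, 5), "RI.5.4": range(2, 4)}

def check_blueprint(form_items, blueprint):
    """Return, for each standard, whether the form's item count falls in the
    blueprint's allowed range."""
    counts = Counter(item["standard"] for item in form_items)
    return {std: counts.get(std, 0) in allowed for std, allowed in blueprint.items()}

form = [{"id": 1, "standard": "RL.5.1"}, {"id": 2, "standard": "RL.5.1"},
        {"id": 3, "standard": "RL.5.1"}, {"id": 4, "standard": "RL.5.2"},
        {"id": 5, "standard": "RL.5.2"}, {"id": 6, "standard": "RI.5.4"}]
print(check_blueprint(form, blueprint))
# {'RL.5.1': True, 'RL.5.2': True, 'RI.5.4': False} -- RI.5.4 is under-covered
```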
Data Privacy
• Assurance of student privacy protection that complies with all federal and state requirements
• Assurance of state ownership of all data
• The state is provided all underlying data in a timely and usable manner to support secondary state analyses
• Description of secure data management procedures
Challenges to Implementation
• Validity and reliability are not dichotomous, and acceptable levels vary based on test use
  • Guidance is needed on acceptable levels for different uses
• Timing of the availability of evidence
  • New programs may have only descriptions of plans
  • Existing programs have student data to perform analyses
• Objectivity, transparency, and reliability of the quality review
Proposed Phased Approach
• Phase I: Content and Test Design
  • Focus on alignment with standards, item quality, and accessibility features
  • Occurs after initial item development and test form construction
• Phase II: Test Characteristics
  • Focus on validity, reliability, and generalizability
  • Occurs after field testing and/or first operational use
• Phase III: Program Implementation
  • Focus on test administration, reporting, and test/item pool maintenance
  • Occurs after the second or third year of operational administration