EVALUATING EPP-CREATED ASSESSMENTS
Donna Fiebelkorn, Lake Superior State University
Beth Grzelak, Eastern Michigan University
Goals for today’s session:
• Introduce the CAEP Evaluation Framework
• Provide the opportunity for participants to engage with the Evaluation Framework
• Discuss validity and reliability from the CAEP perspective
• Q & A
Evaluation Framework focuses on:
Relevancy
1. Administration and Purpose
2. Content of Assessment
Reliability and Actionability
3. Scoring Quality
4. Data Reliability
5. Data Validity
6. Survey Content
7. Survey Data Quality
Note: For Survey Instruments, use Sections 1, 2, 6, and 7 only. For all others, use Sections 1, 2, 3, 4, and 5.
Briefly, about the work we have done
• Between us, we have reviewed assessments from nine programs, public and private, small and medium.
• Not one assessment, from any EPP, met the ‘Sufficient’ level as defined by the CAEP Evaluation Framework.
• In each case, we worked in teams of 2-3, initially independently and then with team members to reconcile discrepancies.
• The review team’s report was then reviewed and accepted by a CAEP staff member.
• Before we discuss the issues we saw, let’s introduce you to the CAEP Evaluation Framework…
And then . . . Holistic Evaluation of EPP-Created Assessments
Criteria evaluated during stages of accreditation review and decision-making:
• EPP provides evidence that data are compiled and tabulated accurately
• Interpretations of assessment results are appropriate for items and resulting data
• Results from successive administrations are compared
Data Validity Example: Sufficient? CAEP Training Example: This observation rubric has “face validity” because all of the stakeholders (supervising faculty, program faculty, mentors and principals) agreed, after modifications, that it was measuring good teaching.
Data Reliability Example: Sufficient? CAEP Training Example: We have not yet determined inter-rater reliability for this assessment but plan to conduct a series of faculty meetings in which faculty observe videos of candidates in their teaching residency, complete the observation rubric individually and then discuss differences in ratings. Inter-rater reliability will be determined and differences discussed in order to ensure consistency across raters.
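The training example above says that inter-rater reliability "will be determined" without naming a statistic. As a minimal sketch of one way it could be computed once two faculty members have scored the same videos, the Python below calculates simple percent agreement and Cohen's kappa for a pair of raters. The choice of Cohen's kappa, the function names, and the rubric scores are illustrative assumptions; they are not taken from the CAEP Evaluation Framework or from any EPP's data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Share of candidates on whom both raters gave the same rubric level."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of candidates.")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters: agreement corrected for chance."""
    n = len(rater_a)
    observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Agreement expected by chance, from each rater's own distribution of levels.
    expected = sum(counts_a[level] * counts_b[level] for level in counts_a) / (n * n)
    if expected == 1.0:
        # Both raters used one identical level for every candidate.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical scores (rubric levels 1-4) from two faculty members
# who each rated the same ten candidate videos independently.
faculty_1 = [3, 2, 4, 3, 1, 2, 3, 4, 2, 3]
faculty_2 = [3, 2, 3, 3, 1, 2, 4, 4, 2, 2]

print(f"Percent agreement: {percent_agreement(faculty_1, faculty_2):.2f}")
print(f"Cohen's kappa:     {cohens_kappa(faculty_1, faculty_2):.2f}")
```

With these made-up scores the raters agree on 7 of 10 candidates, and kappa is lower than the raw agreement because some of those matches would be expected by chance. Reporting a figure of this kind, rather than a plan to compute one, is the sort of evidence the "Sufficient?" question on this slide is probing.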
Opportunities for Continued Discussion Open Space Topic? Team Work? Other?