Examining Rubric Design and Inter-rater Reliability: A Fun Grading Project • Presented at the Third Annual Association for the Assessment of Learning in Higher Education (AALHE) Conference, Lexington, Kentucky, June 3, 2013 • Dr. Yan Zhang Cooksey, University of Maryland University College
Outline of Today’s Presentation • Background and purposes of the full-day grading project • Procedural methods of the project • Results and decisions informed by the assessment findings • Lessons learned through the process
Purposes of the Full-day Grading Project • To simplify the current assessment process • To validate the newly developed common rubric measuring four core student learning areas (written communication, critical thinking, technology fluency, and information literacy)
Procedural Methods of the Grading Project • Data Source • Rubric • Experimental design for data collection • Inter-rater reliability
Procedural Methods of the Grading Project (Cont.) • Data Source (redacted student papers)
Procedural Methods of the Grading Project (Cont.) • Common Assignment • Rubric (rubric design and refinement) • 18 Raters (faculty members)
Procedural Methods of the Grading Project (Cont.) • Experimental design for data collection • randomized trial (Groups A & B) • raters’ norming and training • grading instructions
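As one illustration of the randomized-trial step, the sketch below splits a set of papers at random into Group A and Group B. The paper identifiers, group sizes, and seed are invented for illustration; they are not the project's actual data or procedure.

```python
import random

# Hypothetical identifiers for the redacted student papers (illustrative only).
paper_ids = [f"paper_{i:03d}" for i in range(1, 41)]

random.seed(2013)          # fixed seed so the random assignment is reproducible
random.shuffle(paper_ids)  # randomize the order of papers

# Split the shuffled list in half: first half -> Group A, second half -> Group B.
midpoint = len(paper_ids) // 2
group_a = paper_ids[:midpoint]
group_b = paper_ids[midpoint:]

print(f"Group A ({len(group_a)} papers):", group_a[:3], "...")
print(f"Group B ({len(group_b)} papers):", group_b[:3], "...")
```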
Procedural Methods of the Grading Project (Cont.) • Inter-rater reliability (literature) • Stemler (2004): in any situation that involves judges (raters), the degree of inter-rater reliability is worth investigating, as it has significant implications for the validity of the subsequent study results. • Intraclass Correlation Coefficients (ICC) were used in this study.
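To make the ICC computation concrete, here is a minimal sketch of a two-way random-effects, absolute-agreement ICC in the sense of Shrout & Fleiss (1979), computed from a targets-by-raters score matrix. The sample scores are invented, and the slides do not specify which ICC form or software the project actually used.

```python
import numpy as np

def icc_two_way(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater,
    for an n_targets x n_raters matrix of scores (Shrout & Fleiss, 1979)."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between-paper variability
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between-rater variability
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Invented example: 5 papers scored by 3 raters on a 1-4 rubric scale.
ratings = [[3, 3, 2],
           [4, 4, 4],
           [2, 3, 2],
           [1, 2, 1],
           [3, 4, 3]]
print(f"ICC(2,1) = {icc_two_way(ratings):.3f}")
```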
Results and Findings • Two-sample t-test
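The slide does not show the underlying data, so the sketch below only illustrates what a two-sample comparison of the two grading groups might look like. The scores are hypothetical, and Welch's t-test (scipy's `equal_var=False` option) is one reasonable choice rather than necessarily the exact test the project ran.

```python
from scipy import stats

# Hypothetical mean rubric scores for papers graded by Group A and Group B raters.
group_a_scores = [3.2, 2.8, 3.5, 3.0, 2.9, 3.4, 3.1, 2.7]
group_b_scores = [3.0, 3.1, 2.9, 3.3, 2.8, 3.2, 3.0, 2.6]

# Welch's two-sample t-test (does not assume equal variances between groups).
t_stat, p_value = stats.ttest_ind(group_a_scores, group_b_scores, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```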
Results and Findings (Cont.) • Inter-rater Reliability: Intraclass Correlation Coefficients (ICC)
Results and Findings (Cont.) • Intraclass Correlation Coefficient by Criterion
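A per-criterion ICC breakdown could be produced by grouping long-format scores by rubric criterion and computing an ICC within each group. The sketch below assumes the third-party pingouin package (`pingouin.intraclass_corr`) and uses invented paper, rater, and score values; it is not the project's actual data or necessarily its software.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per rater's score on one criterion of one paper.
df = pd.DataFrame({
    "paper":     ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5", "p6", "p6"],
    "rater":     ["r1", "r2"] * 6,
    "criterion": ["written_communication"] * 12,   # repeat for each rubric criterion
    "score":     [3, 3, 4, 4, 2, 3, 1, 2, 3, 4, 2, 2],
})

# Compute an ICC table separately for each rubric criterion.
for criterion, sub in df.groupby("criterion"):
    icc_table = pg.intraclass_corr(data=sub, targets="paper",
                                   raters="rater", ratings="score")
    print(criterion)
    print(icc_table[["Type", "ICC", "CI95%"]])
```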
Results and Findings (Cont.) • Inter-Item Correlation for Group A
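For the inter-item analysis, one straightforward approach is a pairwise correlation matrix across the four rubric criteria for papers scored by Group A. The sketch below uses invented per-paper criterion scores and pandas' default Pearson correlation; the project's actual data and correlation method are not shown on the slide.

```python
import pandas as pd

# Hypothetical per-paper criterion scores from Group A raters (illustrative only).
group_a = pd.DataFrame({
    "written_communication": [3, 4, 2, 1, 3, 4, 2, 3],
    "critical_thinking":     [3, 4, 3, 2, 3, 4, 2, 2],
    "technology_fluency":    [2, 4, 2, 1, 3, 3, 2, 3],
    "information_literacy":  [3, 3, 2, 2, 4, 4, 1, 3],
})

# Pairwise Pearson correlations between the four rubric criteria.
inter_item = group_a.corr()
print(inter_item.round(2))
```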
Lessons Learned through the Process • Get faculty excited about assessment! • Strategies to improve inter-rater agreement • More training • Clear rubric criteria • Map assignment instructions to rubric criteria • Make decisions based on the assessment results • Further refined the rubric and common assessment activity
Resources • McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46 (Correction, 1(1), 390). • Nunnally, J. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill. • Stemler, S. E. (2004). A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research & Evaluation, 9(4). Retrieved from http://pareonline.net/getvn.asp?v=9&n=4 • Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428. Retrieved from http://www.hongik.edu/~ym480/Shrout-Fleiss-ICC.pdf
Stay Connected… • Dr. Yan Zhang Cooksey, Director for Outcomes Assessment, The Graduate School, University of Maryland University College • Email: yan.cooksey@umuc.edu • Homepage: http://assessment-matters.weebly.com