Examining Rubric Design and Inter-rater Reliability: A Fun Grading Project • Presented at the Third Annual Association for the Assessment of Learning in Higher Education (AALHE) Conference, Lexington, Kentucky, June 3, 2013 • Dr. Yan Zhang Cooksey, University of Maryland University College
Outline of Today’s Presentation • Background and purposes of the full-day grading project • Procedural methods of the project • Results and decisions informed by the assessment findings • Lessons learned through the process
Purposes of the Full-day Grading Project • To simplify the current assessment process • To validate the newly developed common rubric measuring four core student learning areas (written communication, critical thinking, technology fluency, and information literacy)
Procedural Methods of the Grading Project • Data Source • Rubric • Experimental design for data collection • Inter-rater reliability
Procedural Methods of the Grading Project (Cont.) • Data Source (redacted student papers)
Procedural Methods of the Grading Project (Cont.) • Common Assignment • Rubric (rubric design and refinement) • 18 Raters (faculty members)
Procedural Methods of the Grading Project (Cont.) • Experimental design for data collection • randomized trial (Groups A & B) • raters’ norming and training • grading instructions
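As one illustration of the randomized-trial step, the sketch below splits a set of papers at random into Group A and Group B. The paper identifiers, group sizes, and seed are invented for illustration; they are not the project's actual data or procedure.

```python
import random

# Hypothetical identifiers for the redacted student papers (illustrative only).
paper_ids = [f"paper_{i:03d}" for i in range(1, 41)]

random.seed(2013)          # fixed seed so the random assignment is reproducible
random.shuffle(paper_ids)  # randomize the order of papers

# Split the shuffled list in half: first half -> Group A, second half -> Group B.
midpoint = len(paper_ids) // 2
group_a = paper_ids[:midpoint]
group_b = paper_ids[midpoint:]

print(f"Group A ({len(group_a)} papers):", group_a[:3], "...")
print(f"Group B ({len(group_b)} papers):", group_b[:3], "...")
```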
Procedural Methods of the Grading Project (Cont.) • Inter-rater reliability (literature) • Stemler (2004): in any situation that involves judges (raters), the degree of inter-rater reliability is worth investigating, as it has significant implications for the validity of the subsequent study results. • Intraclass Correlation Coefficients (ICC) were used in this study.
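To make the ICC computation concrete, here is a minimal sketch of a two-way random-effects, absolute-agreement ICC in the sense of Shrout & Fleiss (1979), computed from a targets-by-raters score matrix. The sample scores are invented, and the slides do not specify which ICC form or software the project actually used.

```python
import numpy as np

def icc_two_way(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater,
    for an n_targets x n_raters matrix of scores (Shrout & Fleiss, 1979)."""
    x = np.asarray(scores, dtype=float)
    n, k = x.shape
    grand = x.mean()

    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()   # between-paper variability
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()   # between-rater variability
    ss_total = ((x - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Invented example: 5 papers scored by 3 raters on a 1-4 rubric scale.
ratings = [[3, 3, 2],
           [4, 4, 4],
           [2, 3, 2],
           [1, 2, 1],
           [3, 4, 3]]
print(f"ICC(2,1) = {icc_two_way(ratings):.3f}")
```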
Results and Findings • Two-sample t-test
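The slide does not show the underlying data, so the sketch below only illustrates what a two-sample comparison of the two grading groups might look like. The scores are hypothetical, and Welch's t-test (scipy's `equal_var=False` option) is one reasonable choice rather than necessarily the exact test the project ran.

```python
from scipy import stats

# Hypothetical mean rubric scores for papers graded by Group A and Group B raters.
group_a_scores = [3.2, 2.8, 3.5, 3.0, 2.9, 3.4, 3.1, 2.7]
group_b_scores = [3.0, 3.1, 2.9, 3.3, 2.8, 3.2, 3.0, 2.6]

# Welch's two-sample t-test (does not assume equal variances between groups).
t_stat, p_value = stats.ttest_ind(group_a_scores, group_b_scores, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.3f}")
```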
Results and Findings (Cont.) • Inter-rater Reliability: Intraclass Correlation Coefficients (ICC)
Results and Findings (Cont.) • Intraclass Correlation Coefficient by Criterion
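A per-criterion ICC breakdown could be produced by grouping long-format scores by rubric criterion and computing an ICC within each group. The sketch below assumes the third-party pingouin package (`pingouin.intraclass_corr`) and uses invented paper, rater, and score values; it is not the project's actual data or necessarily its software.

```python
import pandas as pd
import pingouin as pg

# Hypothetical long-format data: one row per rater's score on one criterion of one paper.
df = pd.DataFrame({
    "paper":     ["p1", "p1", "p2", "p2", "p3", "p3", "p4", "p4", "p5", "p5", "p6", "p6"],
    "rater":     ["r1", "r2"] * 6,
    "criterion": ["written_communication"] * 12,   # repeat for each rubric criterion
    "score":     [3, 3, 4, 4, 2, 3, 1, 2, 3, 4, 2, 2],
})

# Compute an ICC table separately for each rubric criterion.
for criterion, sub in df.groupby("criterion"):
    icc_table = pg.intraclass_corr(data=sub, targets="paper",
                                   raters="rater", ratings="score")
    print(criterion)
    print(icc_table[["Type", "ICC", "CI95%"]])
```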
Results and Findings (Cont.) • Inter-Item Correlation for Group A
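For the inter-item analysis, one straightforward approach is a pairwise correlation matrix across the four rubric criteria for papers scored by Group A. The sketch below uses invented per-paper criterion scores and pandas' default Pearson correlation; the project's actual data and correlation method are not shown on the slide.

```python
import pandas as pd

# Hypothetical per-paper criterion scores from Group A raters (illustrative only).
group_a = pd.DataFrame({
    "written_communication": [3, 4, 2, 1, 3, 4, 2, 3],
    "critical_thinking":     [3, 4, 3, 2, 3, 4, 2, 2],
    "technology_fluency":    [2, 4, 2, 1, 3, 3, 2, 3],
    "information_literacy":  [3, 3, 2, 2, 4, 4, 1, 3],
})

# Pairwise Pearson correlations between the four rubric criteria.
inter_item = group_a.corr()
print(inter_item.round(2))
```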
Lessons Learned through the Process • Get faculty excited about assessment! • Strategies to improve inter-rater agreement • More training • Clear rubric criteria • Map assignment instructions to rubric criteria • Make decisions based on the assessment results • Further refined the rubric and common assessment activity
Resources • McGraw, K. O., & Wong, S. P. (1996). Forming inferences about some intraclass correlation coefficients. Psychological Methods, 1(1), 30-46 (Correction, 1(1), 390). • Nunnally, J. (1978). Psychometric theory (2nd ed.). New York: McGraw-Hill. • Stemler, S. E. (2004). A comparison of consensus, consistency, and measurement approaches to estimating interrater reliability. Practical Assessment, Research & Evaluation, 9(4). Retrieved from http://pareonline.net/getvn.asp?v=9&n=4 • Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: Uses in assessing rater reliability. Psychological Bulletin, 86(2), 420-428. Retrieved from http://www.hongik.edu/~ym480/Shrout-Fleiss-ICC.pdf
Stay Connected… • Dr. Yan Zhang Cooksey, Director for Outcomes Assessment, The Graduate School, University of Maryland University College • Email: yan.cooksey@umuc.edu • Homepage: http://assessment-matters.weebly.com