Electronic Essay Graders
Jay Lubomirski
Topics
• How electronic essay graders evaluate writing samples
• Comparing the electronic graders to the human graders
• Gaming the system
ETS e-rater®
• Educational Testing Service (ETS) is a non-profit test administration company
• Responsible for tests such as the GRE®, SAT® Subject Tests, and TOEFL® Test
• Criterion® Service: online writing evaluation service
• e-rater® scoring engine: system that scores essays written within the Criterion® Service
e-rater®
• Started in 1998, with new versions released since
• Focuses on writing quality rather than content
• Uses natural language processing to look at grammar, usage, mechanics, and development
• Goal is to predict the score a human grader would give an essay
Process
• e-rater® is fed a sample set of essays written to the same prompt (question), along with the scores a human grader gave them
• e-rater® builds a model of the essay content and how it relates to the scores the human grader gave the essays
• e-rater® is then fed the evaluation essays to score
• Assumption is that “good essays resemble other good essays”
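The train-then-score process can be illustrated with a toy sketch. This is not e-rater®'s actual model: it assumes a single hypothetical feature (word count), a straight-line fit to the human scores, and a 1 to 6 score scale, all chosen only to keep the example small.

```python
from statistics import mean


def word_count(essay: str) -> int:
    """Hypothetical stand-in feature; a real engine uses many features."""
    return len(essay.split())


def train(essays: list[str], human_scores: list[float]) -> tuple[float, float]:
    """Fit score = slope * word_count + intercept by least squares."""
    xs = [word_count(e) for e in essays]
    mx, my = mean(xs), mean(human_scores)
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("training essays must differ in word count")
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, human_scores))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict(model: tuple[float, float], essay: str, lo: int = 1, hi: int = 6) -> int:
    """Score a new essay and clamp to the assumed lo..hi score scale."""
    slope, intercept = model
    raw = slope * word_count(essay) + intercept
    return max(lo, min(hi, round(raw)))
```

Calling `train` on the human-scored sample essays and then `predict` on each evaluation essay mirrors the two phases on the slide. The sketch also shows why such a model can be gamed: it rewards whatever surface features happened to correlate with high scores in the training set.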
Grammar & Lexical Complexity
• Grammar checker looks for 30 error types, including:
  • Subject-verb agreement
  • Homophone errors
  • Misspellings
  • Overuse of vocabulary
• The lexical complexity scorer computes a word frequency index and compares it against the word frequency in the model
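One way to picture a word frequency index is sketched below. The slides do not give the formula e-rater® uses, so this is an assumption: the index here is the average log relative frequency of the essay's words in a reference word count, with add-one smoothing for unseen words. The names `frequency_index` and `reference_counts` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep runs of letters and apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_index(essay: str, reference_counts: Counter) -> float:
    """Average log relative frequency of the essay's words.

    Values closer to zero mean more common vocabulary; more negative
    values mean rarer vocabulary. Add-one smoothing (an assumption of
    this sketch) keeps unseen words finite.
    """
    words = tokenize(essay)
    if not words:
        return 0.0
    total = sum(reference_counts.values())
    vocab = len(reference_counts)
    logs = [
        math.log((reference_counts.get(w, 0) + 1) / (total + vocab))
        for w in words
    ]
    return sum(logs) / len(logs)
```

Under this sketch, the comparison against the model would be the difference between an essay's index and the index computed over the high-scoring training essays.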
Organization and Development
• Automatically identifies sentences that follow essay-discourse categories
  • Introductory material
  • Thesis
  • Main ideas
  • Supporting ideas
  • Conclusion
• Organization is determined by computing the length of discourse elements
• Scored against the model
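The length computation can be sketched as follows, assuming the sentences have already been labeled with a discourse category (the labeling step itself is not shown, and the label names are hypothetical). Measuring length in words and comparing by absolute difference are also assumptions of this sketch, not details from the slides.

```python
CATEGORIES = ("introduction", "thesis", "main_idea", "supporting_idea", "conclusion")


def element_lengths(labeled_sentences: list[tuple[str, str]]) -> dict[str, int]:
    """Total word count per discourse category.

    labeled_sentences holds (category, sentence) pairs produced by some
    upstream labeler, which this sketch does not implement.
    """
    lengths = {c: 0 for c in CATEGORIES}
    for category, sentence in labeled_sentences:
        lengths[category] += len(sentence.split())
    return lengths


def length_gap(lengths: dict[str, int], model_lengths: dict[str, int]) -> int:
    """Sum of absolute per-category differences from the model's lengths."""
    return sum(abs(lengths[c] - model_lengths[c]) for c in CATEGORIES)
```

A smaller `length_gap` means the essay's discourse elements are sized more like those in the model essays; an essay with no supporting ideas at all would show a large gap in that category.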
Scoring the Systems
• In 2012, Mark Shermis compared 9 electronic grading systems (8 commercial, 1 open source) on 8 essay prompts
• Essays were sourced from high school writing assessments that had been graded by human readers
• Results demonstrated that electronic essay scoring was capable of producing scores similar to those of human readers
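"Similar to human readers" is usually made concrete with agreement statistics. The sketch below shows two simple ones, exact agreement and adjacent agreement (scores within one point). It is an illustration only; the score lists in the usage note are made up, not data from the study.

```python
def exact_agreement(human: list[int], machine: list[int]) -> float:
    """Fraction of essays where the two scores are identical."""
    if not human or len(human) != len(machine):
        raise ValueError("score lists must be non-empty and equal in length")
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def adjacent_agreement(human: list[int], machine: list[int]) -> float:
    """Fraction of essays where the two scores differ by at most one point."""
    if not human or len(human) != len(machine):
        raise ValueError("score lists must be non-empty and equal in length")
    return sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / len(human)
```

For the invented scores `human = [4, 3, 5, 2]` and `machine = [4, 4, 3, 2]`, exact agreement is 0.5 and adjacent agreement is 0.75. Computing the same statistics between two human readers gives a baseline for judging the machine's numbers.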
Problems with electronic grading systems
• These systems look at language structure; they cannot verify facts presented in the essay
• Les Perelman, Director of Writing at MIT, wrote an essay that received the top score from e-rater®
• The essay prompt was about the rising costs of college
• Perelman based his essay on the premise that college costs are so high because “Teaching assistants are paid an excessive amount of money.”
“In conclusion, as Oscar Wilde said, "I can resist everything except temptation." Luxury dorms are not the problem. The problem is greedy teaching assistants. It gives me an organizational scheme that looks like an essay, it limits my focus to one topic and three subtopics so I don’t wander about thinking irrelevant thoughts, and it will be useful for whatever writing I do in any subject. I don’t know why some teachers seem to dislike it so much. They must have a different idea about education than I do.”
Sources
Winerip, M. (2012). “Facing a Robo-Grader? Just Keep Obfuscating Mellifluously.” New York Times, April 22, 2012. Retrieved 4/13/2013 from http://www.nytimes.com/2012/04/23/education/robo-readers-used-to-grade-test-essays.html
Ramineni, C. (2012). “Evaluation of the e-rater® Scoring Engine for the GRE® Issue and Argument Prompts.” Educational Testing Service. Retrieved 4/13/2013 from http://www.ets.org/Media/Research/pdf/RR-12-02.pdf
Kolowich, S. (2012). “Large study shows little difference between human and robot essay graders.” Inside Higher Ed. Retrieved 4/11/2013 from http://www.insidehighered.com/news/2012/04/13/large-study-shows-little-difference-between-human-and-robot-essay-graders
Shermis, M. (2012). “Contrasting State-of-the-Art Automated Scoring of Essays: Analysis.” Retrieved 4/13/2013 from http://www.scoreright.org/NCME_2012_Paper3_29_12.pdf
Dikli, S. (2006). “An Overview of Automated Scoring of Essays.” Journal of Technology, Learning, and Assessment, 5(1).