Automated Essay Scoring: Giving Voice to Grievances and Exploring New Contexts
Elizabeth Edwards, PhD Candidate and Graduate Teaching Assistant, Washington State University
A Need to Define Terms and Situations • When efficiency and profit override the AES debate, what argumentative options are available to composition studies? • What areas of compromise and consideration exist, how are they discussed, and in which directions do they propel debates about AES?
The Complexity of Writing • Vojak et al.: writing is “a socially situated activity, … functionally and formally diverse, [and] … a meaning-making activity that can be conveyed in multiple modalities” (98). • Ericsson: “[c]onsidering the meaning of meaning is vitally important…because most people believe that conveying meaning is the most important goal of the written word” (30). • Anson: “[i]nferencing…provide[s] the connective tissue between assertions and yielding meaning and interpretation” (42). Inferencing “can’t be replicated by a computer” (39) without “turn[ing words]…into mindless bits of linguistic code” (48).
The Commodification of Writing • Haswell: “data mining” allows entrepreneurs to “develop front ends and sets of documentation that make their systems ‘friendly’ – that is, easy to use, cheap…, and accurate” (24). • Burstein and Marcu: machines can build “model[s to] look at trends across 1,200 essays” (458). • Kukich: AES “might…elucidate many of the features that characterize good and bad writing, and many of the linguistic, [and] cognitive…skills that underlie the human capacity for…writing” (22).
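The “trends across 1,200 essays” that Burstein and Marcu describe are, in most AES systems, captured by a statistical model fitted to surface features of previously human-scored essays. The following is a minimal, illustrative sketch of that idea only; the feature set and the ordinary-least-squares model (via scikit-learn) are assumptions for demonstration, not a description of e-rater or any commercial product.

```python
# Minimal sketch of feature-based essay scoring. Illustrative only:
# the features and the linear model are placeholder assumptions,
# not any vendor's actual system.
import re
from sklearn.linear_model import LinearRegression

def surface_features(essay: str) -> list[float]:
    """Crude surface counts of the kind AES critics point to."""
    words = re.findall(r"[A-Za-z']+", essay)
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words) or 1
    n_sents = len(sentences) or 1
    return [
        n_words,                                    # essay length
        n_words / n_sents,                          # average sentence length
        sum(len(w) for w in words) / n_words,       # average word length
        len({w.lower() for w in words}) / n_words,  # vocabulary variety
    ]

def train_scorer(essays: list[str], human_scores: list[float]) -> LinearRegression:
    """Fit a linear model to essays that humans have already scored."""
    return LinearRegression().fit([surface_features(e) for e in essays], human_scores)

def machine_score(model: LinearRegression, essay: str) -> float:
    """Predict a score for a new essay from its surface features alone."""
    return float(model.predict([surface_features(essay)])[0])
```

Nothing in such a model “understands” meaning in the sense Ericsson and Anson insist on; it only correlates surface counts with past human judgments, which is precisely the reduction the critics object to.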
Subtracting Humans (Teachers) from the Equation • Kukich: “… using e-rater clearly relieves a significant portion of the load on human scoring experts” (25). • Rudner and Gagne: “computer scoring can be faster, reduce costs, increase accuracy, and eliminate concerns about rater consistency and fatigue” (1). • Shermis et al. concede that “[o]bviously computers do not ‘understand’ written messages in the same way that humans do” (403), but assert that “alternative technologies achieve similar results” (403).
Mild Concessions from AES Supporters • Klobucar et al.: “… automated essay scoring might fit within a larger ecology as one among a family of assessment techniques supporting the development of digitally enhanced literacy” (105). • Klobucar et al.: “[W]e must guard against the possibility that the machine will misjudge a writing feature and that students will be wrongly counseled” (114). • Messick: validity should be “plural, not singular” (37)… and “issues with construct validity are not just errors or ‘bugs’ in the experiment – they are the issues with the entire test or system in the first place” (41). • Byrne et al. write in an “ethical postscript” that “machines [can] not detect other subtleties of writing such as irony, metaphor, puns, connotation, or other rhetorical devices” (33).
Options for Reframing the AES Debate • Whithaus: “…composition researchers and teachers need to step back from a discourse of rejection…to make finer…distinctions among types of software and their uses” (170). Finer distinctions will allow a “situation by situation consideration of how software is used and its impact on writing pedagogy” (176). • Condon: when assessment and prompt design occur at a national level, both “almost certainly have little to do with local curricula, and they may well be inappropriate for a local student population” (214). • Condon: “machine scoring’s principal advantage – economy – comes at too great a cost…Machine scoring simply cannot compete economically, as long as we consider all the costs of employing it” (217, original emphasis).
Works Cited (1) • Anson, Chris M. “Can’t Touch This: Reflections on the Servitude of Computers as Readers.” Machine Scoring of Student Essays: Truth and Consequences. Ed. Patricia Freitag Ericsson and Richard H. Haswell. Logan, UT: Utah State University Press, 2006. 38-56. Print. • Burstein, Jill, and Daniel Marcu. “A Machine Learning Approach for Identification of Thesis and Conclusion Statements in Student Essays.” Computers and the Humanities 12.3 (2004): 455-467. • Byrne, Roxanne, et al. “eGrader, a Software Application That Automatically Scores Student Essays: With a Postscript on the Ethical Complexities.” Journal of Systemics, Cybernetics & Informatics 8.6 (2010): 30-35. • Condon, William. “Why Less Is Not More: What We Lose by Letting a Computer Score Writing Samples.” Machine Scoring of Student Essays: Truth and Consequences. Ed. Patricia Freitag Ericsson and Richard H. Haswell. Logan, UT: Utah State University Press, 2006. 211-220. Print.
Works Cited (2) • Ericsson, Patricia Freitag. “The Meaning of Meaning: Is a Paragraph More Than an Equation?” Machine Scoring of Student Essays: Truth and Consequences. Ed. Patricia Freitag Ericsson and Richard H. Haswell. Logan, UT: Utah State University Press, 2006. 28-37. Print. • Haswell, Richard H. “Automatons and Automated Scoring: Drudges, Black Boxes, and Dei Ex Machina.” Machine Scoring of Student Essays: Truth and Consequences. Ed. Patricia Freitag Ericsson and Richard H. Haswell. Logan, UT: Utah State University Press, 2006. 57-78. Print. • Klobucar, A., et al. “Automated Scoring in Context: Rapid Assessment for Placed Students.” Assessing Writing 18 (2013): 62-84. • Kukich, Karen. “Beyond Automated Essay Scoring.” IEEE Intelligent Systems 15.5 (2000): 22-27. • McCurry, Doug. “Can Machine Scoring Deal with Broad and Open Writing Tests as Well as Human Readers?” Assessing Writing 15.2 (2010): 118-129.
Works Cited (3) • Messick, S. “Test Validity: A Matter of Consequences.” Social Indicators Research 45 (1998): 35-44. • Rudner, Lawrence, and Phill Gagne. “An Overview of Three Approaches to Scoring Written Essays by Computer.” Practical Assessment, Research, and Evaluation 7.26 (2001). • Shermis, Mark D., et al. “Applications of Computers in Assessment and Analysis of Writing.” Handbook of Writing Research. Ed. Charles A. MacArthur, Steve Graham, and Jill Fitzgerald. New York: Guilford Press, 2005. 403-416. • Vojak, Colleen, et al. “New Spaces and Old Places: An Analysis of Writing Assessment Software.” Computers and Composition 28 (2011): 97-111. • Whithaus, Carl. “Always Already: Automated Essay Scoring and Grammar-Checkers in College Writing Courses.” Machine Scoring of Student Essays: Truth and Consequences. Ed. Patricia Freitag Ericsson and Richard H. Haswell. Logan, UT: Utah State University Press, 2006. 166-176. Print.
When Machines Grade, Who Reads? Automation and the Production and Consumption of Writing
Mike Edwards (@preterite) #cwcon #g6
Deautomation • Automation • Capital • Labor • Economics • Technology • Futures
didactics • modeling • finding topics • enculturation • self-discovery • inquiry • analysis • engaging difficulty • apprenticeship
replace labor-intensive processes with capital-intensive processes
purpose of essay as labor vs. purpose of essay as capital
purpose of assessment as labor vs. purpose of assessment as capital
William & Flora Hewlett Foundation ASAP prize: $200,000
O’Reilly Strata Conference Natural Language Processing
technological capital: one-time investment vs. pedagogical labor: ongoing investment
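The capital-versus-labor contrast on this slide can be made concrete with a back-of-the-envelope break-even calculation. Every number below is a hypothetical placeholder, not an actual licensing fee or grading rate; only the structure of the comparison (a fixed cost amortized over volume versus a recurring per-essay cost) matters.

```python
# Hypothetical break-even sketch: a one-time "capital" outlay for scoring
# software versus the ongoing "labor" cost of human scoring.
# All figures are invented placeholders for illustration only.
SOFTWARE_LICENSE = 100_000.00   # one-time purchase (hypothetical)
COST_PER_HUMAN_READ = 2.50      # per-essay human scoring cost (hypothetical)

def break_even_essays(license_cost: float, per_essay_cost: float) -> float:
    """Essay volume at which the one-time outlay appears cheaper than
    recurring human scoring."""
    return license_cost / per_essay_cost

print(break_even_essays(SOFTWARE_LICENSE, COST_PER_HUMAN_READ))  # 40000.0
```

Condon's counterargument, quoted earlier, is that this arithmetic omits most of the real costs (validation, prompt development, appeals, and the pedagogical value of a human reader), which is why he concludes that machine scoring cannot compete economically once all costs are counted.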
College English 1947: Arthur Coon, “An Economic X Marks the Spot.”
Computers & Composition 1994: Carolyn Dowling, “The Ongoing Difficulty of Writing.”
over the past few decades, student essays have gotten longer
labor’s value measured by time and volume
Aggregation Problem (Piero Sraffa) Cambridge Capital Controversy
Keynesian, Marxian, Neoclassical investigations of AES algorithms:
who produces, distributes, uses, re-produces; how value is appropriated in assessment
Piketty, Thomas. Capital in the Twenty-First Century. Cambridge, MA: Belknap, 2014.
“a redistribution of income away from labor and toward holders of capital”
technological innovation and the replacement of labor with capital contribute to rising inequality
technological casualization of the labor of writing instruction
labor is objectively scarce as a time-metered (C-M-C) commodity
what are the comparative measures of increase in G and R for students & instructors?
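Assuming G and R on this slide carry Piketty's meanings (g, the annual growth rate of income and output; r, the average annual rate of return on capital), his central relation can be stated in one line:

r > g

When that inequality holds, accumulated capital grows faster than earned income and distribution shifts toward holders of capital. The question above appears to transpose the comparison onto writing instruction: whether returns to assessment technology are outpacing growth in what students and instructors gain from their labor.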
The Machine Has Spoken: Automated Invention in an Era of Automated Assessment
Matthew Frye (matthew.frye@email.wsu.edu), PhD Student, Washington State University, Machine Loyalist
Overview • Short, qualitative study (12 participants) • What happens when machines write and humans evaluate? • Are our programs showing progress towards “understanding” writing?
Overview • Re: Perelman’s “Basic Automatic BS Essay Language” (BABEL) Generator • We’ve gotten really good at showing that we can fool the machine evaluations (Chronicle of Higher Ed., April 28, 2014) • Can the machines fool us?
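For a sense of what BABEL-style machine writing looks like, here is a toy word-salad generator in the spirit of (but emphatically not the implementation of) Perelman's BABEL Generator; the word lists and sentence template are invented for illustration.

```python
# Toy "babble" generator: grammatical but meaningless academic-sounding prose,
# the kind of text surface-feature scorers tend to reward. Not Perelman's
# actual BABEL code; vocabulary and template are illustrative only.
import random

ADJS = ["hegemonic", "multimodal", "recursive", "institutional", "latent"]
NOUNS = ["paradigm", "assessment", "epistemology", "autonomy", "curriculum"]
VERBS = ["problematizes", "interrogates", "necessitates", "recapitulates"]

def babble_sentence(rng: random.Random) -> str:
    return (f"The {rng.choice(ADJS)} {rng.choice(NOUNS)} "
            f"{rng.choice(VERBS)} the {rng.choice(ADJS)} {rng.choice(NOUNS)}.")

def babble_essay(n_sentences: int = 8, seed: int = 0) -> str:
    rng = random.Random(seed)
    return " ".join(babble_sentence(rng) for _ in range(n_sentences))

if __name__ == "__main__":
    print(babble_essay())
```

Text like this says nothing, yet a scorer driven by length and vocabulary can rate it highly, which is the asymmetry the study probes: instead of machines reading human prose, humans are asked to read machine prose.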