CAA-2003. Genre Analysis and the Automated Extraction of Arguments from Student Essays. Emanuela Moreale (e.moreale@open.ac.uk), Maria Vargas-Vera (m.vargas-vera@open.ac.uk)
Introduction • Automatic extraction of arguments from student essays • Aim • “Proof of concept” • Provide feedback & aid assessment • “Qualitative” • Aims for limited “understanding” • Theory-based (vs. statistical approaches) • Who for? • Mainly tutors • Also students
Topics of Discussion • Argument Modelling in Academic Papers • Ontologies / Schemas • Swales, Teufel, Hyland, ClaiMaker • Our own categorisation • Experiments in using NLP (IE techniques) for argument extraction • GATE, MnM, Amilcare (ANNIE) • Student Essay Viewer (SEV) + our categorisation • SEV: architecture, GUI, feedback and assessment • Results / Future Work
Argumentation Modelling in Papers • Research into: • supporting rhetorical argument construction / the writing process • tools for “making thinking visible”: Writer’s Assistant, Belvedere, SenseMaker • Research on Paper Structure: • Swales’s CARS model • Teufel • Hyland • ClaiMaker
Argument Construction / Writing • Writer’s Assistant • Text and Argument views • Belvedere and SenseMaker • Develop scientific argumentation skills • Users: unpracticed beginners • Focus on rhetorical relations between items: evidence, claims, explanations • Interesting, but… • is this generic enough? • Our users: tutors and university students
Argumentation in Papers: Swales • Swales – CARS model (1990) • Research paper introductions • Three moves with 3 or 4 steps each: • Establish territory • “There has been a great interest…” • Establish a niche • “This approach fails to…” • Occupy niche • “The purpose of this paper is to…” • “The paper is structured as follows…” • Influential model
Argumentation in Papers (2): Teufel • Teufel et al. (1999) • Extend CARS model to the whole paper • Add a few new moves • Goal: bias automatic summarisation • Focus: mark the purpose of the paper in relation to previous literature • Types of sentences: • Background (background knowledge), Other (outside this paper), Own (author’s new contributions) • Aim – main research goal of this paper • Textual – structure of the paper • Contrast – own work vs. other work • Basis – work used as a basis for this work • Human annotators (not an implemented system)
Argumentation in Papers (3): Hyland • Hyland’s Metadiscourse Schema (1998) • Metadiscourse in academic texts: • 1. Textual metadiscourse • allows the recovery of the writer’s intention by explicitly establishing preferred interpretations: • Code glosses (namely, in other words…) • relates propositions to each other and to other texts: • Logical connectives (but, therefore, thus…) • Endophoric markers (noted above, see Fig 1)
Argumentation in Papers (4): Hyland • Hyland’s Metadiscourse Schema (cont.) • Metadiscourse in academic texts: • 2. Interpersonal metadiscourse • alerts readers to the author’s perspective on both the information and the readers themselves (expresses the writer’s persona) • Hedges: might, perhaps, it is possible… • Emphatics: in fact, definitely, it is clear… • Relational markers: frankly, note that, you can see that • Attitude markers: surprisingly, I agree, X claims • Person markers: I, we, my, mine, our
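Hyland-style marker lists lend themselves to a simple gazetteer lookup. The sketch below (ours, not Hyland's or the presenters' implementation) shows how a few of the markers above might be encoded and counted in Python; the phrase lists are illustrative and deliberately incomplete.

```python
# Minimal sketch: metadiscourse marker lists as a gazetteer, with a counter.
import re
from collections import Counter

METADISCOURSE_MARKERS = {
    "code_gloss": ["namely", "in other words"],
    "connective": ["but", "therefore", "thus"],
    "endophoric": ["noted above", "see fig"],
    "hedge":      ["might", "perhaps", "it is possible"],
    "emphatic":   ["in fact", "definitely", "it is clear"],
    "relational": ["frankly", "note that", "you can see that"],
    "attitude":   ["surprisingly", "i agree"],
    "person":     ["i", "we", "my", "mine", "our"],
}

def count_markers(text: str) -> Counter:
    """Count occurrences of each metadiscourse category in a text."""
    counts = Counter()
    lowered = text.lower()
    for category, phrases in METADISCOURSE_MARKERS.items():
        for phrase in phrases:
            # word boundaries so "i" does not match inside other words
            counts[category] += len(re.findall(r"\b" + re.escape(phrase) + r"\b", lowered))
    return counts

if __name__ == "__main__":
    sample = "In fact, we might argue that this is clear. Note that I agree."
    print(count_markers(sample))
```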
Mixed Approaches • We also looked at ScholOnto, a project aiming to: • model arguments in academic papers • devise an ontology for scholarly discourse • The team has also produced a tool called ClaiMaker • Claim types: general, problem-related, evidence-related, similarity and causal
ClaiMaker • ClaiMaker is mainly meant for academic papers: • Academic paper = set of interlinked parts • Statements: in one paper -> in another paper • Result: network of cross-referencing claims • Use: • Given a complete paper, claims are annotated manually by the reader • In the future, paper authors may publish the paper as well as its claims • ClaiMaker could be used when writing a paper (claim search, issue focusing)
Towards Our Categorisation • Interesting categorisations/tools, but an imperfect fit • ClaiMaker (tool): • academic papers (not student essays) • currently, claims are entered manually • Belvedere/SenseMaker (tools): • unpracticed beginners in scientific argumentation • Argumentation categorisations so far: • target academic papers • Swales: introduction only • Teufel: manual annotation, no implementation • ClaiMaker: ontology fine for manual editing, but is it suitable for an automated system?
Towards Our Categorisation (2) • With these concerns in mind, we identified categories of possible arguments in student essays. • Preliminary manual analysis of essay texts • Some categories may have been “influenced” by ClaiMaker • Our initial categories were: • Definition • Comparison • General • Critical thinking • Reporting • Viewpoint • Problem • Evidence • Causal • Taxonomic • Content/expected • Connectors
Towards Our Categorisation (3) • A review of the first categorisation led us to: • Reduce the number of categories • Cognitive overload • Ease of display • Reassess the main categories • Main elements in a university-level essay: • Showing knowledge of the background / research area • Reporting other people’s views • Demonstrating analytical thinking • Contrasting & comparing viewpoints • Defining and “relating” terms / concepts
Our Categorisation
Our Categorisation: Characteristics • Overall, remarkable similarities across categorisations • STRATEGY: • Teufel: Textual category • Hyland: Endophoric markers “see section 4” • Swales: M3, S1a: Purpose & M3, S1d: Structure • REPORTING: • Swales: M1, S3 verbs like “show, demonstrate, establish” • Teufel: Other • Hyland: Evidentials e.g. “according to X (1990)” • POSITIONING: • Swales: Move 2 (Establishing a niche) • Teufel: Contrast • Hyland: Emphatics, Attitude Markers, Person Markers
Our Categorisation: Characteristics • BUT • No AIM category (unlike Teufel’s schema): • Essays have an implicit aim: to answer the essay question(s) • No distinction between OTHER and OWN • Tricky distinction anyway • Not applicable to student essays • New category: Content/Expected • Relates to essay content (material covered) • Student essay-specific • Domain-dependent and tutor-specified
Natural Language Technologies • Problem to be solved: • How can Natural Language technologies help in the automatic extraction of arguments? • Requirements: • provide tutors with a tool that is easy to use • Analysed tools: • GATE identifies entities (names of people, organisations, dates, money and locations) using gazetteers and regular expressions • Good tool for developers • Too difficult for tutors to use • MnM • Uses Amilcare, an information extraction engine • Requires a training set from which it learns patterns; these patterns can then be recognised in new documents (from the test set)
Natural Language Technologies • What is Information Extraction (IE)? • IE extracts facts about pre-defined types of information from documents • Origin: IE research began in the late 1980s, partly as a product of the Cold War: automatic extraction of information from naval messages • Uses: • An IE system designed for the terrorism domain might extract perpetrators, victims, physical targets, weapons, dates and locations (Riloff et al. 1993) • Domains: • IE has been applied to scientific articles, bibliographic notices [Proux et al. 1997] and medical records [Soderland et al. 1995]
Natural Language Technologies • The first MnM implementation was able to fill templates such as the following: • “visiting-a-place-or-people” event • The template looks like this: • Visitor • Date of visit • Place being visited • Pattern learnt: • X visited Y, where: • the type of X is Person and • the type of Y is Location
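For illustration only, here is a toy version of this kind of template fill. It is not MnM or Amilcare output; the gazetteers, regular expression and field names are invented to show how a "visiting-a-place-or-people" template could be filled once X is a known Person and Y a known Location.

```python
# Toy template-filling sketch for the "visiting-a-place-or-people" event.
import re

PEOPLE = {"Maria", "Emanuela"}        # hypothetical gazetteer of Person names
PLACES = {"London", "Milton Keynes"}  # hypothetical gazetteer of Location names

VISIT_PATTERN = re.compile(r"(\w+) visited ([\w ]+?) on (\d{1,2} \w+ \d{4})")

def fill_visit_template(sentence: str):
    """Return a filled template dict if the sentence matches 'X visited Y on DATE'."""
    match = VISIT_PATTERN.search(sentence)
    if not match:
        return None
    visitor, place, date = match.groups()
    if visitor in PEOPLE and place in PLACES:
        return {"visitor": visitor, "place_visited": place, "date_of_visit": date}
    return None

if __name__ == "__main__":
    print(fill_visit_template("Maria visited London on 3 June 2003"))
```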
Natural Language Technologies • Information Extraction seems very appealing for finding patterns • IE tools work very well in narrow domains • BUT • In our domain (student essays), it is not easy to define in advance templates containing concepts and the relations between them: • We cannot anticipate which concepts / relations students are going to use
Proposed Solution • Our approach combines: • cue phrases with • a set of patterns without template filling. • We started off by defining gazetteers of cue phrases and patterns written as regular expressions. • The set of patterns is organised on the basis of our categories • We then developed a tool to show this “in use”: the Student Essay Viewer (SEV).
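As a rough sketch of the approach just described (cue-phrase gazetteers compiled into regular expressions and grouped by category), the Python below returns the character spans that a viewer such as SEV could highlight. The category names echo our categorisation, but the cue phrases are illustrative placeholders, not the project's actual gazetteers.

```python
# Sketch: category -> cue-phrase regexes -> highlightable spans.
import re
from typing import Dict, List, Tuple

CUE_PHRASES: Dict[str, List[str]] = {
    "Definition":  [r"is defined as", r"refers to", r"can be described as"],
    "Reporting":   [r"according to \w+", r"\w+ \(\d{4}\) argues", r"as \w+ points out"],
    "Positioning": [r"i (dis)?agree", r"in my view", r"however,"],
    "Comparison":  [r"in contrast", r"similarly", r"on the other hand"],
}

COMPILED = {
    category: re.compile("|".join(patterns), re.IGNORECASE)
    for category, patterns in CUE_PHRASES.items()
}

def extract_arguments(essay: str) -> Dict[str, List[Tuple[int, int]]]:
    """Return, for each category, the (start, end) spans of matched cue phrases."""
    return {
        category: [m.span() for m in pattern.finditer(essay)]
        for category, pattern in COMPILED.items()
    }

if __name__ == "__main__":
    text = "According to Swales (1990), a genre is defined as a class of events. However, I disagree."
    for category, spans in extract_arguments(text).items():
        print(category, [text[s:e] for s, e in spans])
```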
Architecture – SEV Components
About SEV • WHY: gives a visual representation of argumentation within an essay • INTUITION: • essays with considerably more “highlighted text” contain much more argumentation (and “content”), THUS • they attract higher grades than essays with little highlighting
SEV - Users • Main Target Group: Tutors • Context: assessment / feedback • SEV’s automatic count indicators • Citation highlighting • Amount/distribution of highlighting • Possible Target Group: Students • Formative assessment • Essay has little highlighting -> revise • Indicates the type of argumentation needed
SEV - Interface • SEV looks like a simple webpage • Top part • categorisation browser/selector • Main part • Essay • Initially not annotated (no highlighting) • After a category/categorisation is selected, the annotated essay is displayed • Automatic link counts (by type) are also displayed
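A minimal sketch of how such a highlighted webpage view could be produced, assuming the regex-per-category approach above. This is not the actual SEV code; the cue phrases and CSS class names are invented for illustration.

```python
# Sketch: wrap cue-phrase matches for a selected category in HTML spans
# and report the per-category link count shown alongside the essay.
import html
import re

CATEGORY_PATTERNS = {  # illustrative cue phrases, as in the extraction sketch
    "Definition": re.compile(r"is defined as|refers to", re.IGNORECASE),
    "Reporting":  re.compile(r"according to \w+", re.IGNORECASE),
}

def annotate(essay: str, category: str) -> tuple[str, int]:
    """Return the essay as HTML with matches of `category` highlighted, plus the count."""
    pattern = CATEGORY_PATTERNS[category]
    escaped = html.escape(essay)
    highlighted, count = pattern.subn(
        lambda m: f'<span class="{category.lower()}">{m.group(0)}</span>', escaped
    )
    return highlighted, count

if __name__ == "__main__":
    essay = "According to Hyland (1998), metadiscourse is defined as ..."
    page, n = annotate(essay, "Definition")
    print(f"Definition links: {n}")
    print(page)
```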
SEV: Initial Presentation
SEV: After Selecting “Definition”
SEV: Highlighted Essay (Ours)
SEV & Assessment • 3 underlying assumptions: • 1) Number of annotations: weaker essays have fewer annotations than better essays • 2) Critical analysis and background are two essential elements in essays: the corresponding annotations are expected to correlate closely with grade • 3) The relative importance of annotation categories may vary and depend on essay type
Assumption 1: Annotation Count & Score • Correlation: r = 0.878; N = 12; p < 0.01 • ANOVA F-statistic: F(1,10) = 33.501; p < 0.01
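For readers who want to reproduce this kind of analysis, the snippet below shows how the correlation and the equivalent regression F-test could be computed with SciPy. The counts and scores are hypothetical placeholders, not the study's 12 essays.

```python
# Sketch: Pearson correlation and simple-regression F-test over hypothetical data.
from scipy import stats

annotation_counts = [12, 18, 25, 30, 14, 22, 28, 35, 16, 20, 27, 33]  # hypothetical
essay_scores      = [52, 58, 65, 72, 55, 62, 70, 78, 54, 60, 68, 75]  # hypothetical

r, p_corr = stats.pearsonr(annotation_counts, essay_scores)
print(f"Pearson r = {r:.3f}, p = {p_corr:.4f}")

# For a single predictor, the regression F-test with df = (1, N-2)
# corresponds to the ANOVA reported on the slide.
slope, intercept, r_value, p_reg, stderr = stats.linregress(annotation_counts, essay_scores)
n = len(annotation_counts)
f_stat = (r_value ** 2 / (1 - r_value ** 2)) * (n - 2)
print(f"F(1,{n - 2}) = {f_stat:.3f}, p = {p_reg:.4f}")
```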
Assumption 2: Essential Elements in an Essay • Corr(Positioning, Score): r = 0.753; N = 12; p < 0.01 • MANOVA Positioning/Background: F(1,10) = 18.462; p < 0.01 • NOTE: Background = Expected + Reporting
Assumption 3: Essay “Types” + Annotations • Analysis of 4 different assignments • Questions: • Say How and Why • Opinion about X • Describe and Discuss • Give example of X and critique X • Basic Idea: different essay questions require different “link/annotation profiles” • E.g. “Summarise X” will be answered by an essay with many “reporting” links • Better fit for some assignments (1) than others (4)
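One way to operationalise the idea of "link/annotation profiles" is to compare an essay's annotation mix against an expected profile for each question type. The sketch below is purely illustrative: the profiles and counts are invented, not taken from the four assignments analysed.

```python
# Sketch: expected annotation profiles per question type and a crude fit measure.
EXPECTED_PROFILES = {
    "say_how_and_why":      {"Reporting": 0.3, "Positioning": 0.4, "Definition": 0.3},
    "opinion_about_x":      {"Positioning": 0.6, "Reporting": 0.2, "Definition": 0.2},
    "describe_and_discuss": {"Reporting": 0.5, "Positioning": 0.3, "Definition": 0.2},
}

def profile_fit(counts: dict, question_type: str) -> float:
    """Return a crude distance between an essay's annotation mix and the expected profile."""
    total = sum(counts.values()) or 1
    observed = {cat: n / total for cat, n in counts.items()}
    expected = EXPECTED_PROFILES[question_type]
    return sum(abs(observed.get(cat, 0.0) - share) for cat, share in expected.items())

if __name__ == "__main__":
    essay_counts = {"Reporting": 8, "Positioning": 3, "Definition": 2}  # hypothetical
    print(profile_fit(essay_counts, "describe_and_discuss"))
```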
Results - Summary • Student essay metadiscourse schema • compared with research paper categorisations • Links: Argument Extraction and Assessment • Total number of links correlated with score • Positioning and background most useful for score prediction • Different link profiles to answer different essay questions • Student Essay Viewer (SEV) • Highlights instances of our categories in an essay • Helpful to tutors • Quick visualisation of type of argumentation • Quick visualisation of amount/distribution of argumentation • Helpful for students • Feedback about the essay (e.g. lacking categories)
Results & Future Work • Real data (actual postgraduate essays) • A larger number in future studies • Encouraging results • Future Work • SEV could categorise longer linguistic units • e.g. sentences or paragraphs • SEV to provide reasons why a specific categorisation is assigned to a linguistic unit • Explanation in pseudo-natural language • Inclusion of an “Essay Question Analysis” tool • Particularly useful for students
Conclusion • Generic metadiscourse schema for student essays • Links to similar schemas for research papers • SEV: an argumentation visualisation tool • Detects argumentation by using: • our annotation schema and cue phrases • Target users: tutors (and students) • Can aid assessment • Link between annotation count and score • Positioning and background links most important in predicting score • Formative feedback to students