A Human-Computer Collaboration Approach to Improve Accuracy of an Automated English Scoring System
NAACL-HLT 2010, June 5, 2010
Jee Eun Kim (HUFS) & Kong Joo Lee (CNU)
Outline
• Overview of the system
• Issue
  • Redundant errors
• Solution
  • Introducing a method to determine redundant errors
• Evaluation
• Conclusion
NAACL-HLT2010
Procedure of Automated Scoring System
[Figure: a teacher stores questions in the question database; a student submits an answer to the automated scoring system and receives a scoring result with feedback]
• Question: 그녀는 방과 후에 축구를 했다. (Korean: "She played soccer after school.")
• Correct answers: She played soccer after school. / She played soccer after school is over.
• Input: She play footboll.
• Scoring result: score of 3 points out of 6
  • (1) error in number agreement (play → plays | played)
  • (2) misspelling (footboll → football)
  • (3) tense mismatching (play → played)
  • (4) missing elements “after school”
Automated English Scoring System
• Scoring a single sentence, not an essay
• Target users
  • Junior high school students learning English as a second language
• Calculating a score based on
  • the number of errors
  • the types of errors
System Overview
[Figure: system architecture; labels recovered from the diagram]
• Input: a student’s answer, a set of correct answers
• Intra-sentential error detection module
  • morphological analyzer (word errors)
  • syntactic analyzer (syntactic errors)
• Resources: lexicon, lexical information & syntactic rules, synonyms
• Intermediate representation: dependency structures
• Inter-sentential error detection module
  • comparing sentences & calculating similarity (mapping errors)
• Output: a scoring result & diagnostic feedback
Errors
• 76 error types to be detected by the system
  • 16 word errors (morphological analyzer)
  • 46 syntactic errors (syntactic analyzer)
  • 14 mapping errors (comparing sentences)
• Error Reporting
  • Format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO
  • Example sentence: She is too week to carry the bag.
  • e.g., CONFUSABLE_WORD_EROR | 4 | weak
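The error report format above can be read mechanically. The following is a minimal sketch, not the authors' implementation: the `ErrorReport` class and `parse_error_report` function are hypothetical names, the position is assumed to be a 1-based word index (which matches "week" being word 4 of the example), and a range such as `7-9` is assumed to be an inclusive span of word indices.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ErrorReport:
    """One record of the form ERROR_ID | ERROR_POSITION | ERROR_CORRECTION_INFO."""
    error_id: str
    start: int                 # assumed: 1-based index of the first word involved
    end: int                   # assumed: 1-based index of the last word (inclusive)
    correction: Optional[str]  # None when the third field is empty


def parse_error_report(line: str) -> ErrorReport:
    """Split a '|'-separated record into its three fields."""
    fields = [field.strip() for field in line.split("|")]
    if len(fields) != 3:
        raise ValueError(f"expected 3 '|'-separated fields, got {len(fields)}")
    error_id, position, correction = fields
    # A position is either a single word index ("4") or a range ("7-9").
    first, _, last = position.partition("-")
    start = int(first)
    end = int(last) if last else start
    return ErrorReport(error_id, start, end, correction or None)


# The error ID is copied verbatim from the slide, including its spelling.
report = parse_error_report("CONFUSABLE_WORD_EROR | 4 | weak")
words = "She is too week to carry the bag.".split()
print(report.error_id, words[report.start - 1], "->", report.correction)
# prints: CONFUSABLE_WORD_EROR week -> weak
```

Storing every position as a `start`/`end` pair keeps single-word and multi-word errors comparable, which is convenient when checking whether two errors share a position.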
Issue
Correct Answer: She is too weak to carry the bag.
Student Answer: She is too weak to carry the her bag.
• Teacher’s assessment: ‘her’ has to be omitted
  • A single error has been detected
• Error detection result produced by the system
  • Syntactic processing phase: EXTRA_DET_ERROR | 7-9 |
  • Mapping processing phase: UNNECESSARY_NODE_ERROR | 8 | (her)
• System’s assessment: treated them as two distinct errors
Error Example
Correct Answer: She is a teacher who came to our school last week.
Student Answer: She is a teacher who come school last weak.
One of the errors has to be removed!
Redundant Errors
• A pair of errors is considered redundant if it satisfies all three of the following conditions
  • COND1: Sharing an error position
  • COND2: Detected in different processing phases
  • COND3: Dealing with the same linguistic phenomenon
• Objectives
  • To remove one of the redundant errors
  • To improve the accuracy of the system
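The three conditions can be sketched as a predicate over a pair of detected errors. This is an illustration under stated assumptions, not the system's code: `DetectedError`, `REDUNDANT_ID_PAIRS`, and the function names are hypothetical; COND3 is approximated by a lookup table of error-ID pairs already judged to describe the same phenomenon (the one entry comes from the Issue slide's example); and which error of a pair gets dropped is an arbitrary choice here, since the slides do not say.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class DetectedError:
    error_id: str
    start: int   # 1-based word index, inclusive
    end: int     # 1-based word index, inclusive
    phase: str   # e.g. "word", "syntactic", "mapping"


# Hypothetical stand-in for COND3: ID pairs judged to cover the same phenomenon.
REDUNDANT_ID_PAIRS = {frozenset({"EXTRA_DET_ERROR", "UNNECESSARY_NODE_ERROR"})}


def shares_position(a: DetectedError, b: DetectedError) -> bool:
    """COND1: the two word spans overlap."""
    return a.start <= b.end and b.start <= a.end


def is_redundant(a: DetectedError, b: DetectedError) -> bool:
    """All three conditions must hold together."""
    return (
        shares_position(a, b)                                            # COND1
        and a.phase != b.phase                                           # COND2
        and frozenset({a.error_id, b.error_id}) in REDUNDANT_ID_PAIRS    # COND3
    )


def remove_redundant(errors: List[DetectedError]) -> List[DetectedError]:
    """Keep the first-listed error of each redundant pair (arbitrary assumption)."""
    kept: List[DetectedError] = []
    for err in errors:
        if not any(is_redundant(err, earlier) for earlier in kept):
            kept.append(err)
    return kept


errors = [
    DetectedError("EXTRA_DET_ERROR", 7, 9, "syntactic"),
    DetectedError("UNNECESSARY_NODE_ERROR", 8, 8, "mapping"),
]
print([e.error_id for e in remove_redundant(errors)])
```

With the Issue slide's example, the two reports collapse into one, which matches the teacher's assessment that a single error occurred.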
Deciding Redundant Errors
[Flowchart: filtering pipeline]
• 14,892 sentences with errors detected by the system
• Filtering by Cond #1 & #2 → 150,419 pairs of errors, 657 pairs of error ID
• Filtering by PMI & RFC → 29,588 pairs of errors, 111 pairs of error ID
• Filtering by human experts → three groups of 20, 47, and 44 pairs of error ID
• Deciding by Decision Tree → redundant or non-redundant
• Final labels: redundant, non-redundant
Deciding Redundant Errors (1)
• Filtering by COND #1 & #2
  • COND1: Sharing an error position
  • COND2: Detected in different processing phases
• Input
  • 14,892 task-takers’ sentences scored by the system
  • All the possible pairs of errors which could occur in a sentence
• Output
  • 150,419 pairs of errors remained after filtering
  • 657 pairs of error ID
• Error record format: ERROR_ID | ERROR_POSITION | ERROR_CORRECTION
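A rough sketch of this filtering step follows. It assumes each scored sentence is available as a list of `(error_id, start, end, phase)` tuples with 1-based inclusive positions; that layout and the function names are hypothetical. It also reads the two output figures as the total number of surviving pairs ("pairs of errors") and the number of distinct ID combinations among them ("pairs of error ID"), which is an interpretation of the slide.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, Iterator, List, Tuple

# (error_id, start, end, phase); positions are 1-based and inclusive (assumed).
Error = Tuple[str, int, int, str]


def candidate_pairs(errors: List[Error]) -> Iterator[Tuple[Error, Error]]:
    """Yield every pair of errors in one sentence that passes COND1 and COND2."""
    for a, b in combinations(errors, 2):
        overlaps = a[1] <= b[2] and b[1] <= a[2]   # COND1: shared position
        if overlaps and a[3] != b[3]:              # COND2: different phases
            yield a, b


def count_id_pairs(sentences: Iterable[List[Error]]) -> Counter:
    """Count surviving pairs, keyed by the unordered pair of error IDs."""
    counts: Counter = Counter()
    for errors in sentences:
        for a, b in candidate_pairs(errors):
            counts[tuple(sorted((a[0], b[0])))] += 1
    return counts


# Toy data for illustration only; not taken from the authors' corpus.
sentences = [
    [("EXTRA_DET_ERROR", 7, 9, "syntactic"),
     ("UNNECESSARY_NODE_ERROR", 8, 8, "mapping"),
     ("CONFUSABLE_WORD_EROR", 4, 4, "word")],
    [("EXTRA_DET_ERROR", 2, 4, "syntactic"),
     ("UNNECESSARY_NODE_ERROR", 3, 3, "mapping")],
]
counts = count_id_pairs(sentences)
print("pairs of errors:", sum(counts.values()))
print("pairs of error ID:", len(counts))
```

Sorting the two IDs before counting makes the pair unordered, so the same combination found in a different order is not counted as a new ID pair.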
Deciding Redundant Errors (2)
• Filtering using thresholds on PMI & RFC [Su et al., 1994]
• Input
  • 657 pairs of error ID from the previous step
• Pointwise Mutual Information (PMI)
• Relative Frequency Count (RFC)
• Filtering
• Output
  • 111 pairs of error ID
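To make the two measures concrete, here is a hedged sketch. PMI uses its standard definition, log2(P(x, y) / (P(x) P(y))), with probabilities estimated as relative frequencies over sentences, which is an assumption about how the counts were normalized. RFC is taken to be a pair's count divided by the average count over all pairs, which is one reading of Su et al. (1994) and should be checked against that paper. The slide gives no threshold values, and it does not say how the two tests are combined, so the thresholds and the "both must pass" rule below are placeholders.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Pair = Tuple[str, str]


def pmi_and_rfc(pair_counts: Counter, id_counts: Counter,
                n_sentences: int) -> Dict[Pair, Tuple[float, float]]:
    """Return {pair: (PMI, RFC)} for every error-ID pair.

    pair_counts: how often each ID pair co-occurred (after COND1/COND2 filtering)
    id_counts:   how often each single error ID occurred
    n_sentences: normalizer for the probability estimates (assumption)
    """
    mean_count = sum(pair_counts.values()) / len(pair_counts)
    scores: Dict[Pair, Tuple[float, float]] = {}
    for (x, y), c_xy in pair_counts.items():
        p_xy = c_xy / n_sentences
        p_x = id_counts[x] / n_sentences
        p_y = id_counts[y] / n_sentences
        pmi = math.log2(p_xy / (p_x * p_y))
        rfc = c_xy / mean_count
        scores[(x, y)] = (pmi, rfc)
    return scores


def keep_pairs(scores: Dict[Pair, Tuple[float, float]],
               min_pmi: float, min_rfc: float) -> List[Pair]:
    """Keep pairs that clear both thresholds (combination rule is assumed)."""
    return [pair for pair, (pmi, rfc) in scores.items()
            if pmi >= min_pmi and rfc >= min_rfc]


# Toy counts with placeholder IDs "A", "B", "C"; not real data.
pair_counts = Counter({("A", "B"): 8, ("A", "C"): 2})
id_counts = Counter({"A": 10, "B": 8, "C": 20})
scores = pmi_and_rfc(pair_counts, id_counts, n_sentences=100)
print(keep_pairs(scores, min_pmi=1.0, min_rfc=1.0))
```

In the toy data, "A" and "B" co-occur far more often than chance would predict, so that pair survives, while "A" and "C" co-occur about as often as independent errors would and is dropped.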
Deciding Redundant Errors (3)
• Filtering by human experts
• Background of the experts
  • Junior high school English teachers
  • With linguistics knowledge
  • With 10 or more years of teaching experience
• Input
  • 111 pairs of error ID
• Output
  • Errors categorized into 3 classes
Deciding Redundant Errors (4)
• 3 error classes
Deciding Redundant Errors (5)
• For the 44 “yet to be decided” pairs
  • Additional information is needed to determine whether they are redundant
• Using a decision tree
  • Extracting decision rules
Deciding Redundant Errors (6)
• Features for decision tree learning
  • For a pair of errors (E1, E2)
Examples of Decision Rules
Evaluation
• Scoring 200 unseen student sentences by the system
• Overall system performance
  • Improved by 2.6%…
• Reducing the gap between human scoring and machine scoring
Conclusion
• Improvement was achieved by collaborating with human experts
• Overall accuracy of the system has been improved
Thank you!
Cannot be decided yet
Cannot be decided yet (cont’d)