Direkt Profil: an automatic analyzer of texts written in French as a second language Jonas Granfeldt(1), Pierre Nugues(2), Suzanne Schlyter(1), Malin Ågren (1), Edin Kukovic (1), Emil Persson (1), Jonas Thulin (2), Lisa Persson (2), Fabian Kostadinov (3) (1) Lund University, Centre for languages and literature, French (2) Lund Institute of Technology, Department of Computer Science (3) University of Zürich, Department of Computer Science http://profil.sol.lu.se Jonas.Granfeldt@rom.lu.se
OUTLINE • Introduction • The idea • Rationale • The knowledge bases • Demo • Theoretical background • Developmental sequences and developmental stages in L2 French • Method • CEFLE - The development corpus • The Direkt Profil system • Overview of the system • Annotation • Defining profiles/stages with machine learning • Results • Annotation • Defining profiles/stages • Example of an applied study with Direkt Profil • Direkt Profil and teachers’ assessments: a correlation study • Conclusion • Problems • Future work
INTRODUCTION • The idea was… • To provide researchers, teachers and learners with an easy-to-use tool for overall diagnostic assessment of developmental stage. • To base the assessment on current research on second language acquisition. • To automatically provide feedback to teachers and learners on language level and central target features of the language. • To use learners’ free written production as the basis of assessment (rather than cloze tests).
INTRODUCTION Rationale • Language acquisition is a process which follows a specific and definable order. • Learners and teachers want to know about the progress the learners make. • Instruction is probably most effective if it is adapted to the learners’ present developmental level (cf. the Teachability Hypothesis, Pienemann, 1985).
INTRODUCTION The knowledge bases for the project • Second Language Research • Linguistics (French) • Natural Language Processing • Engineering
<CORPUS> <SAMPLE SUBJECT_ID="XXXX"> <TEXT>C'est deux personne, une fille et sa mère. La fille est grand et elle a une robe blue. Sa mère est petite mais grosse et elle a une robe vert. Elles va à L'Italie dans ses vacances. La fille pense à les garcons italien et sa mere pense du soleil. Elles sont derière un table avec une map. Elles boire des café. Leur voiture est vert. La voiture est trés petite est la bagage n'est pas fit. Maintenant elles à destination D'Italie. Elles check in. Le monsiuer fait une ronde tête est une grand moustache. Leur chambre est beau avec deux lis est une trés beaux vue. Elle est sur la plage. Sur la mere il y a des bateaux. Elles fait du soleil. Dans la soir elle a dîner dans une restaurant. À côté il y a un garcon avec une costume blue. Après le diner elles boire du vin rouge dans la bar. Les deux garcon d'italien ils voir la mère et sa fille. Ils sont d'amour. Ils parlent et boire de alcohol. Aprés ils fait du dancing. Le jour aprés ils fait du sightseeing avec Tony et son autobus rouge. Il est bold. Après le sightseeing ils visite un marche. La dame grosse a une hat rouge. Le monsieur grand a un hat noir. La fille grand amour le garcon petite mais grosse. Sur le soir ils separé - le grand monsieur avec la petite mais grosse dame et la grand fille avec le petite mais trés grand monsieur. Le jour après ils revenir a Suede avec les deux monsieurs. </TEXT> <INFO TASK_NAME="VOYAGE_ITALIE" GROUP_SUBJECT="MAIN" SUBJECT_LEVEL="2" SOURCE_SCHOOL="XXXX"/> </SAMPLE> INTRODUCTION An example: a learner text from the corpus
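A sample in this format can be read with a standard XML parser. The following is a minimal sketch, assuming only the element and attribute names visible in the sample above (CORPUS, SAMPLE, TEXT, INFO); the function name `read_samples` and the shortened text are illustrative and not part of Direkt Profil.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample above; the learner's errors are kept verbatim.
SAMPLE = """<CORPUS>
<SAMPLE SUBJECT_ID="XXXX">
<TEXT>C'est deux personne, une fille et sa mère. La fille est grand et elle a une robe blue.</TEXT>
<INFO TASK_NAME="VOYAGE_ITALIE" GROUP_SUBJECT="MAIN" SUBJECT_LEVEL="2" SOURCE_SCHOOL="XXXX"/>
</SAMPLE>
</CORPUS>"""


def read_samples(xml_string):
    """Return one dict per <SAMPLE>: the text plus the attributes of <INFO>."""
    root = ET.fromstring(xml_string)
    samples = []
    for sample in root.iter("SAMPLE"):
        info = sample.find("INFO")
        # Copy the metadata attributes (task, group, level, school) if present.
        record = dict(info.attrib) if info is not None else {}
        record["SUBJECT_ID"] = sample.get("SUBJECT_ID")
        record["TEXT"] = (sample.findtext("TEXT") or "").strip()
        samples.append(record)
    return samples


samples = read_samples(SAMPLE)
for s in samples:
    print(s["TASK_NAME"], s["SUBJECT_LEVEL"], len(s["TEXT"].split()), "tokens")
```

Keeping the metadata next to the text makes it straightforward to group texts by task or by manually assigned level later on.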
INTRODUCTION DEMO HERE
THEORETICAL BACKGROUND French L2 in a developmental perspective • Many projects since the 1980s (examples) • ESF project (Perdue, 1993; L2 French, different L1s) • InterFra project (Bartning, 1997 and later) (Swedish L1) • FIFI/DURS project (Schlyter, 1986 and later; Granfeldt, 2003) (Swedish L1) • Myles & Mitchell, Myles (2002 and later) (FLLOC project, English L1) • Empirical objectives of this research: • arrive at rich and empirically valid descriptions of how French interlanguage develops over time. • identify features at different linguistic levels which are developmentally related. • Some syntheses are emerging: • Bartning & Schlyter (2004): a proposal of six stages of development. • Véronique et al. (2009): a proposal of three stages.
THEORETICAL BACKGROUND Benchmarking grammatical development of French L2 (Bartning & Schlyter, 2004) • Objectives: • Describe developmental sequences in French L2 for a number of morphosyntactic phenomena • Establish general learner stages/profiles with respect to grammatical development • Data: • Oral corpora of French L2 (L1 = Swedish). • Post-puberty learners (N=35, 80 recordings) • Method: • Frequency analysis and linguistic profiling • Manual and semi-automated tagging of transcriptions
A model with 6 profiles/stages (sample) [Table not recovered; stages grouped as Initial, Intermediate, Advanced] Granfeldt (2003); Bartning & Schlyter (2004)
METHOD Direkt Profil • Objectives: • To implement the model of Bartning & Schlyter (2004) • To develop an easy-to-use system for automated annotation, extraction and frequency analysis of as many as possible of the features in B&S’s work • To develop a system for defining developmental stages/profiles • Method: • Constructing an interlanguage partial parser for L2 French • Connecting the parser to a module for machine learning • Constructing an interface • We have expanded on B&S’s original work with respect to: • Type of data (written rather than oral) • Quantity of data • Additional features (more morphosyntactic features, lexical and quantitative features)
The development corpus CEFLE • CEFLE: Corpus Ecrit de Français Langue Etrangère • 400 texts written under controlled conditions by 85 Swedish and 22 French students, 4 texts per learner (317 texts used here). • Manual assignment of a “stage” to one text from each learner (Voyage en Italie) using the B&S criteria. Granfeldt, Nugues et al. (2006)
ANNOTATION • We developed an annotation scheme based on the B&S (2004) framework. • The concept of the noun or verb group is the grammatical representation of most phenomena in this framework. • It is essential to the Direkt Profil annotation. • Many syntactic annotation frameworks for French take this into consideration. • An example from Gendner et al. (2004): et mademoiselle qui <NV> appelait </NV> au secours ! ... ou plutôt non , <NV> on ne l' entendait </NV> plus ... <NV> elle était </NV> peut-être morte ... • This annotation, however, makes no provision for the specific details of the B&S framework.
ANNOTATION (cont’d) • The Direkt Profil annotation is an XML-based markup, split into 5 levels: 1. Tokenisation 2. Identification of prefabricated structures (c’est, je m’appelle, etc.)
ANNOTATION (cont’d) 3. POS tagging (Det, Prep, Pron, V(être/avoir), Konj) 4. Group detection/chunking: rule-based (decision tree), using a set of grammatical words (« mots vides », Tesnière, 1959; Vergne, 1998) 5. Chunk classification: rule-based feature checking between elements.
The sentence Ils parlons dans la bar is annotated as <segment class="c5148"><tag pos="pro:nom:pl:p3:mas"> Ils</tag> <tag pos="ver:impre:pl:p1"> parlons </tag></segment> dans <segment class="c3071"> <tag pos ="det:fem:sg">la</tag> <tag pos="nom:mas:sg">bar</tag> </segment> c5148 reads: “Lexical verb/Present tense/3rd.pers.PL/no_agreement” c3071 reads: “Det_Noun_NP/singular_det/without_gender_agreement” • Features are finally counted and raw occurrences are converted to percentages (where relevant)
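The final counting step can be sketched as follows. This is an illustration, not the Direkt Profil implementation: the `<sentence>` wrapper element is a hypothetical addition so that the fragment parses as one XML document, whitespace inside the tags is normalised, and taking each class as a share of all segments is only one possible choice of denominator.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The annotated sentence from above, wrapped in a hypothetical <sentence> root.
ANNOTATED = (
    '<sentence>'
    '<segment class="c5148"><tag pos="pro:nom:pl:p3:mas">Ils</tag> '
    '<tag pos="ver:impre:pl:p1">parlons</tag></segment> dans '
    '<segment class="c3071"><tag pos="det:fem:sg">la</tag> '
    '<tag pos="nom:mas:sg">bar</tag></segment>'
    '</sentence>'
)

# Glosses given in the text for the two class codes.
GLOSSES = {
    "c5148": "Lexical verb/Present tense/3rd.pers.PL/no_agreement",
    "c3071": "Det_Noun_NP/singular_det/without_gender_agreement",
}


def count_classes(xml_string):
    """Count how often each segment class code occurs in the annotation."""
    root = ET.fromstring(xml_string)
    return Counter(seg.get("class") for seg in root.iter("segment"))


def class_percentages(counts):
    """Convert raw counts to percentages of all segments (assumed denominator)."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: 100.0 * n / total for code, n in counts.items()}


counts = count_classes(ANNOTATED)
percentages = class_percentages(counts)
for code, pct in percentages.items():
    print(f"{code} ({GLOSSES.get(code, 'unknown')}): {pct:.1f} %")
```

In practice the denominator would depend on the feature, for example all Det+Noun groups when reporting the share of groups without gender agreement.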
The dictionary • The engine uses a dictionary of French inflected forms freely available from the Association des Bibliophiles Universels (ABU). • We have corrected and supplemented it, and converted it to XML. • We have also added frequency-of-use information from the Lexique database (New, Pallier & Ferrand, 2005).
DEFINING STAGES/PROFILES • Using the criteria in Bartning & Schlyter (2004), two researchers manually classified 82 texts of the sub-corpus Le voyage en Italie (part of CEFLE). • The classification was subsequently re-used for all texts from the same learner, resulting in 317 classified texts. • We trained classifiers using automatically extracted phenomena as the features representing the learners’ texts. • Currently 142 phenomena (features/attributes) are used when establishing a learner profile/stage. • We used C4.5 (Quinlan, 1986), LMT (Landwehr et al., 2003), and Support Vector Machines (Boser et al., 1992) from the Weka collection (Witten & Frank, 2005).
RESULTS Annotation Granfeldt, Nugues et al., 2005
RESULTS CLASSIFICATION using all features Granfeldt & Nugues, 2007
A sample decision tree
% NPs with gender agreement <= 93
| % nominative pronouns <= 4: 1 (7.0/1.0)
| % nominative pronouns > 4
| | % NPs with num+gen agreement <= 94: 1 (2.0)
| | % NPs with num+gen agreement > 94
| | | % pluperfect verbs in S-V agreement <= 0
| | | | S-V agreement w/ modal verbs <= 10
| | | | | Average sentence length <= 15
| | | | | | % of the next 2,000 words <= 0: 1 (2.0/1.0)
| | | | | | % of the next 2,000 words > 0
| | | | | | | % D-N-A in agreement <= 0: 2 (11.0)
| | | | | | | % D-N-A in agreement > 0
| | | | | | | | % D-A-N in agreement <= 50
| | | | | | | | | % of the next 2,000 words <= 1: 2 (8.0/1.0)
| | | | | | | | | % of the next 2,000 words > 1
| | | | | | | | | | % prepositions <= 9
| | | | | | | | | | | % vbs in the imperfect <= 0
| | | | | | | | | | | | % mod+inf verbs in S-V agreement <= 33: 2 (4.0)
| | | | | | | | | | | | % mod+inf verbs in S-V agreement > 33: 3 (3.0/1.0)
| | | | | | | | | | | % vbs in the imperfect > 0: 3 (2.0)
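A tree of this kind reads as nested threshold tests, each leaf giving a stage. The sketch below transcribes only the first two leaves of the sample tree; the dictionary keys are hypothetical names for the percentages in the tree, and the function returns None wherever the sample is not transcribed or not shown.

```python
def classify_top_of_tree(features):
    """Apply the first branches of the sample tree to one text.

    `features` maps hypothetical feature names to percentages.
    Returns a stage number, or None for branches not covered here.
    """
    if features["pct_np_gender_agreement"] <= 93:
        if features["pct_nominative_pronouns"] <= 4:
            return 1  # first leaf of the sample tree
        if features["pct_np_num_gen_agreement"] <= 94:
            return 1  # second leaf of the sample tree
        return None  # deeper branches of the sample are not transcribed
    return None  # the branch for values above 93 is not shown in the sample


# Made-up feature values, for illustration only.
text_a = {
    "pct_np_gender_agreement": 80,
    "pct_nominative_pronouns": 3,
    "pct_np_num_gen_agreement": 70,
}
text_b = {
    "pct_np_gender_agreement": 85,
    "pct_nominative_pronouns": 12,
    "pct_np_num_gen_agreement": 90,
}
print(classify_top_of_tree(text_a), classify_top_of_tree(text_b))
```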
Attribute selection • We ran an attribute selection procedure in order to identify the best features at this point. • To evaluate the 142 attributes, we measured the information gain for each attribute with respect to the class. This method is derived from ID3 and is part of the Weka software.
Top 10 features according to the InfoGain metric
Average merit    Feature
0.4371    % Determiner Noun agreement (gender errors)
0.3351    % Unknown words (i.e. not in dictionary)
0.3232    % NPs with gender agreement (including adjectives)
0.2925    Average sentence length
0.2565    % Prepositions (out of all parts-of-speech)
0.2082    % S-V agreement with modal verbs followed by infinitive
0.1953    % Noun Adjective with agreement (gender and number)
0.1793    % S-V agreement with auxiliary in passé composé
0.1739    % S-V agreement with être/avoir 3ppl (all tenses)
0.153    % K1Tokens (out of all tokens)
Granfeldt & Nugues, 2007
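Information gain measures how much knowing an attribute reduces the entropy of the class. The following is a minimal sketch for a numeric attribute split at a single hand-picked threshold; the data are made up and the threshold choice is an assumption, since a real tool has to decide for itself how to discretise numeric attributes.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels, threshold):
    """Entropy of the class minus its weighted entropy after a binary split."""
    total = len(labels)
    if total == 0:
        return 0.0
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    remainder = (len(left) / total) * entropy(left) + (len(right) / total) * entropy(right)
    return entropy(labels) - remainder


# Made-up toy data: six texts, two stages.
stages = [1, 1, 1, 2, 2, 2]
gender_agreement = [60, 70, 75, 95, 97, 99]  # separates the stages cleanly
sentence_length = [8, 12, 9, 11, 10, 13]     # separates them poorly

gain_agreement = information_gain(gender_agreement, stages, threshold=90)
gain_length = information_gain(sentence_length, stages, threshold=10)
print(gain_agreement, gain_length)
```

An attribute that splits the texts cleanly by stage receives a high gain, while one whose values are spread across stages receives a gain near zero, which is the basis of a ranking like the one above.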
Results after feature selection (top 20 attributes) Granfeldt & Nugues, 2007
Direkt Profil and teachers’ assessment: a correlation study • An example of an applied study with Direkt Profil • Several scholars have suggested that work on developmental sequences and stages could be used as a means of assessing the language development of a particular individual at a given time (Clahsen, 1985; Pienemann & Johnston, 1987; the Rapid Profile program, Pienemann & Mackay, 1992; Brindley, 1998)
Research questions • What is the correlation between the developmental stage and teachers’ assessments of the same texts? (RQ1) • To what extent can the developmental stage predict teachers’ ranking of a particular text? (RQ2)
Method • 50 texts from the CEFLE corpus (Ågren, 2005) were selected (task: Le voyage en Italie picture series). • The learner texts had previously been manually analysed for developmental stage following the criteria in B&S. • The texts were also analysed by Direkt Profil, resulting in two separate indications of developmental stage (manual and automated).
Method (cont’d) • 7 experienced upper secondary school teachers rated the 50 texts on a six-point scale (6 = highest level). • They were asked to assess the texts in three domains: a) “Form”, i.e. language (grammar, lexicon, spelling etc.) b) “Content and Communication” (content in relation to the pictures, the communicative success of the text) c) “Overall”, i.e. combining a and b (in a way they found suitable) • The teachers also stated, for each assessment, the degree of certainty with which they had rated the text (five-point scale, where 5 = completely certain and 1 = completely uncertain).
RESULT: Median and distribution of ratings for form (language) Granfeldt & Ågren, 2009
RESULT Inter-rater agreement between teachers Granfeldt & Ågren, 2009
RESULT: Correlating developmental stage and teachers’ assessments Answering Research Question 1: The developmental stage is better correlated with the assessments of the teachers than instructional level. Granfeldt & Ågren, 2009
RESULT: Regression analysis Answering Research Question 2: Approximately 70% of the variance in the teachers’ ranking of the texts can be explained by the developmental stage as analysed by Direkt Profil.
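"Variance explained" is the coefficient of determination, R². The slide does not say which regression model was used; the sketch below assumes ordinary least squares with developmental stage as the single predictor, and the numbers are made up for illustration, not taken from the study.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys):
    """Share of the variance in ys explained by the fitted line."""
    slope, intercept = linear_fit(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


# Hypothetical data: developmental stage per text and a teacher rating (1 to 6).
stages = [1, 1, 2, 2, 3, 3, 4, 4]
ratings = [1, 2, 2, 3, 3, 5, 4, 6]

r2 = r_squared(stages, ratings)
print(f"R^2 = {r2:.2f}")
```

An R² of about 0.70 would correspond to the "approximately 70% of the variance" reported above.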
Conclusion • We have presented a system for assessing developmental stage/profile in French as a second language. • The system implements the current theory of stages/profiles of development in French. • The system consists of • an interlanguage partial parser for French L2 called Direkt Profil and • a machine-learning module connected to it. • Results: • An evaluation of the annotation showed mixed results, depending very much on the developmental stage of the writer. • Results from classification experiments show: • Best results with a 3-stage classification: a mean F of 0.82 • Stage 1 is the most problematic • The texts from the natives are relatively easy to classify: a mean F of 0.91 • A large feature set does not seem to be necessary (at least not for this data) • Using an attribute/feature selection method, we have identified a list of “10 best attributes”
Problems “Briefly, the language produced by learners is about the worst imaginable type of language for NLP.” (Tschichold, 2007) • Lexical spelling (orthographe lexicale) is a problem – incorrect forms lead to increased ambiguity and to incorrect annotation • Attribute selection is not sufficiently studied. • Amount of data is still insufficient.
Future work • Optimising annotation: • Procedures to address the spelling problem • Review the rules • Ongoing student tests with a stochastic parser (trained on the Le Monde corpus) • Adding more texts from higher stages of development • Expanding to other languages (Italian L2) • Continue working with other assessment schemes, i.e. the Common European Framework of Reference (Granfeldt, 2008)
Thank you for your attention! • Direkt Profil is free to use • Available at this address: • http://profil.sol.lu.se • Acknowledgments The profiling team in Lund: Pierre Nugues, Suzanne Schlyter, Malin Ågren, Edin Kuckovic, Emil Persson, Fabian Kostadinov, Lisa Persson This work was supported by the Swedish Research Council, grant number 2004-1674