A new corpus for Spanish Second Language Acquisition Research L. Dominguez, R. Mitchell, M. J. Arche (U. of Southampton), E. Marsden (U. of York), F. Myles (Newcastle U.)
A corpus for L2 Acquisition
• SLA theory aims to understand the complex mechanisms and conditions behind learner grammars
• Access to good quality data is crucial: learner production data + focused comprehension tasks
• Increasing interest in the creation of electronic learner corpora:
  - sharing data more easily
  - automating some aspects of data analysis through the use of software such as concordancers, part-of-speech taggers, etc.
Some Existing Learner Corpora
• CHILDES: http://childes.psy.cmu.edu/
• TALKBANK: http://talkbank.org/
• ICLE (Centre for English Corpus Linguistics): http://cecl.fltr.ucl.ac.be/Cecl-Projects/Icle/icle.htm
• L2 FRENCH
  - FLLOC: www.flloc.soton.ac.uk/
• L2 (Written) SPANISH
  - CEDEL 2: www.ugr.es/~cristoballozano/cedel2.htm
SPLLOC “Spanish Language Learner Oral Corpus”
• 2-year ESRC-funded corpus project investigating the development of L2 Spanish
• Aims:
  - a small-scale, high-quality cross-sectional database of spoken learner Spanish
  - topics being investigated lie at the syntax/discourse interface
• Data collected:
  - c. 40 hours of audio recordings (native/non-native)
  - 80 written focused tests on word order
  - 60 computer-based tests on clitic comprehension
• 95% transcribed to date!
Immediate Research Agenda
• Syntax/discourse interface as conceptualised in generative linguistics, including:
  - The acquisition of Spanish word order
  - Clitic pronouns
• Verbal morphology
• Development of the L2 lexicon
Corpus Design
• Balance of spontaneous and focused data (semi-spontaneous oral tasks are complemented by focused judgement and production tasks)
• Balance of genres (semi-spontaneous oral tasks include interview, narrative and discussion)
• Balance of participants (20 L2 speakers at each of the beginner, intermediate and advanced levels + native speakers)
• Flexibility of computer-aided analysis (use of the CHILDES system, plus an XML version)
• Free web access to all materials (anonymised sound files, transcripts, analysis files) for all bona fide research users
Loch Ness Illustrations by Alex Brychta for “A Monster Mistake” by Roderick Hunt (Oxford Reading Tree, 2003) used by permission of Oxford University Press.
Photos task
• Description of states
• Description of events
Clitic Comprehension (computer-based)
• The learner hears a sentence with a clitic pronoun and has to click on the object it refers to.
• 32 screens:
  - Combination of number and gender (canonical and non-canonical) plus syntactic collocation.
  - Canonical feminine: -a ending (e.g. calculadora ‘calculator’)
  - Canonical masculine: -o ending (e.g. teléfono ‘phone’)
  - Non-canonical: no -a/-o ending (e.g. lápiz ‘pencil’)
  - Collocation: proclitic (as with conjugated verbs) vs. enclitic (as with infinitives).
Clitic Production (computer-based)
• The learner is asked a question referring to an object, based on the sequence of pictures shown.
• 32 slides: combination of number and gender (canonical and non-canonical) plus syntactic collocation.
Word Order Task (paper & pencil)
• Context-dependent word order preference test
• The learner is presented with 28 situations, each followed by a question
• Two types of questions:
  - What happened? (Broad focus)
  - Who did x? (Narrow focus)
• 4 items for each of 7 syntactic contexts: 4xSVO, 4xVOS, 4xCLLD, 4xUnerg/Narrow, 4xUnerg/Broad, 4xUnacc/Narrow and 4xUnacc/Broad
• Three options: inverted (VS), non-inverted (SV) and both
• Example items:
1. You get home and your brother tells you that he has just got an email from your friend Sue and that he has very good news to tell you. You ask your brother “¿Qué ha pasado?” (What happened?) What could he say?
   a. Se ha comprado un coche Sue (Sue has bought a car)
   b. Sue se ha comprado un coche (Sue has bought a car)
   c. Both sentences
2. Your brother is having some friends over for a get-together at home. When your mother comes in she sees some smoke coming out of the bathroom and she asks your brother: “¿Quién está fumando?” (Who’s smoking?) What could your brother say?
   a. Oscar está fumando (Oscar is smoking)
   b. Está fumando Oscar (Oscar is smoking)
   c. Both sentences
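
The 4 x 7 design described above can be sketched in a few lines of Python. This is only an illustration of how the 28 items and the three response options might be represented and tallied; the function names (`build_items`, `tally`) and the demo responses are hypothetical and are not part of the SPLLOC materials.

```python
from collections import Counter
from itertools import product

# The seven syntactic contexts and three response options named on the slide.
CONTEXTS = [
    "SVO", "VOS", "CLLD",
    "Unerg/Narrow", "Unerg/Broad",
    "Unacc/Narrow", "Unacc/Broad",
]
ITEMS_PER_CONTEXT = 4
OPTIONS = ("VS", "SV", "both")


def build_items():
    """Enumerate every (context, item number) pair in the design."""
    return [
        {"context": context, "item": number}
        for context, number in product(CONTEXTS, range(1, ITEMS_PER_CONTEXT + 1))
    ]


def tally(responses):
    """Count chosen options per context from (context, option) pairs."""
    table = {context: Counter() for context in CONTEXTS}
    for context, option in responses:
        if context not in table:
            raise ValueError(f"unknown context: {context}")
        if option not in OPTIONS:
            raise ValueError(f"unknown option: {option}")
        table[context][option] += 1
    return table


if __name__ == "__main__":
    print(len(build_items()))  # 4 items x 7 contexts
    # Invented responses, for illustration only.
    demo = [("Unacc/Broad", "VS"), ("Unacc/Broad", "both"), ("Unerg/Narrow", "VS")]
    print(tally(demo)["Unacc/Broad"])
```

Keeping the context label on every response makes it straightforward to compare preferences for inverted and non-inverted orders across broad-focus and narrow-focus conditions.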
Tools for Data Analysis
• CHILDES (The Child Language Data Exchange System)
• CLAN = Computerised Language Analysis
  - Computer program suite for transcribing, searching and analysing language data
• CHAT = Codes for the Human Analysis of Transcripts
  - A format for notation and transcription
• Types of analyses:
  - FREQ, MLU, COMBO, KWAL
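
To give a feel for what analyses such as FREQ, MLU and KWAL compute, here is a minimal Python sketch over a toy transcript. It is not CLAN and does not reproduce CLAN's options or output. The transcript is invented and uses a simplified CHAT-like layout ("@" header lines, "*XXX:" speaker tiers, "%" dependent tiers); the speaker codes `LRN` and `INV` are assumptions. The function `mlu_words` counts words per utterance, a rough stand-in for a morpheme-based MLU.

```python
import re
from collections import Counter

# Invented toy transcript in a simplified CHAT-like layout (illustration only).
TRANSCRIPT = """\
@Begin
@Participants:\tLRN Learner, INV Investigator
*INV:\tqué ha pasado ?
*LRN:\tSue se ha comprado un coche .
*LRN:\tel monstruo está en el lago .
%com:\tlearner points at the picture
*INV:\tquién está fumando ?
*LRN:\testá fumando Oscar .
@End
"""

# A main tier is "*", a speaker code, ":", a tab, then the utterance.
MAIN_TIER = re.compile(r"^\*(\w+):\t(.*)$")


def words_of(text):
    """Lowercased word tokens; drops terminators such as '.' and '?'."""
    return re.findall(r"[^\W\d_]+", text.lower())


def utterances(transcript, speaker):
    """Return one word list per main-tier utterance by `speaker`."""
    result = []
    for line in transcript.splitlines():
        match = MAIN_TIER.match(line)
        if match and match.group(1) == speaker:
            result.append(words_of(match.group(2)))
    return result


def freq(transcript, speaker):
    """FREQ-like word frequency count for one speaker."""
    return Counter(w for utt in utterances(transcript, speaker) for w in utt)


def mlu_words(transcript, speaker):
    """Mean length of utterance in words (an approximation of MLU)."""
    utts = utterances(transcript, speaker)
    return sum(len(u) for u in utts) / len(utts) if utts else 0.0


def kwal(transcript, keyword):
    """KWAL-like search: main-tier lines that contain `keyword` as a word."""
    hits = []
    for line in transcript.splitlines():
        match = MAIN_TIER.match(line)
        if match and keyword.lower() in words_of(match.group(2)):
            hits.append(line)
    return hits


if __name__ == "__main__":
    print(freq(TRANSCRIPT, "LRN").most_common(3))
    print(mlu_words(TRANSCRIPT, "LRN"))
    for hit in kwal(TRANSCRIPT, "está"):
        print(hit)
```

Restricting the counts to one speaker's main tiers mirrors the usual practice of analysing learner and investigator speech separately.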
Next Steps
• Database will be available for use by the research community via www.splloc.soton.ac.uk (in spring 2008)
• Articles & conference papers (in 2007):
  - BAAL LLT SIG
  - GALA
  - BUCLD
  - HLS
  - SLRF
• CHILDES training workshop:
  - 25 January 2008, University of Southampton
Acknowledgments
The SPLLOC project is supported by an ESRC research grant (RES 000231609). We would like to thank all the participants in the project, including subjects, transcribers and fieldworkers.
References
• Domínguez, L., Arche, M.J. 2007a. “Deviant optional forms in L2 Spanish: the case of word order variation”. Poster presentation at GALA, Barcelona, 6-8 September.
• Domínguez, L., Arche, M.J. 2007b. “Optionality in L2 grammars: the acquisition of SV/VS contrast in Spanish”. To be presented at BUCLD 32, Boston, 1-4 November.
• Domínguez, L., Arche, M.J. 2007c. “The L2 Acquisition of SV/VS contrast in Spanish”. To be presented at the Hispanic Linguistic Symposium, Texas, 1-4 November.
• Domínguez, L., Arche, M.J., Mitchell, R., Marsden, E. and Myles, F. 2007. “Innovations in Spanish SLA research methodology: introducing the ‘Spanish Learner Language Oral Corpus’”. To be presented at the Hispanic Linguistic Symposium, Texas, 1-4 November.
• Granger, S., J. Hung and S. Petch-Tyson (eds.). 2002. Computer Learner Corpora, second language acquisition and foreign language teaching. Amsterdam: John Benjamins.
• Lozano, C. & Mendikoetxea, A. (in press). Verb-Subject order in L2 English: new evidence from the ICLE corpus. In: Actas del XXV Congreso Internacional de AESLA. Universidad de Murcia.
• Lozano, C. & Mendikoetxea, A. (forthcoming 2007). Postverbal subjects at the interfaces in Spanish and Italian learners of L2 English: a corpus analysis. In: Papp, S., Díez, B. and Gilquin, G. (eds). Linking up contrastive and corpus learner research. Rodopi.
• Mitchell, R., Marsden, E., Domínguez, L., Arche, M.J. and Myles, F. 2007. “Creation and analysis of a Spanish language learner oral corpus (SPLLOC)”. Poster presentation at BAAL LLT SIG Conference “Towards a Researched Pedagogy”, University of Lancaster, 2-3 July.
• Mitchell, R., Domínguez, L., Arche, M.J., Myles, F. and Marsden, E. “Developing a CHILDES-based corpus of L2 oral Spanish”. To be presented at Second Language Research Forum, Urbana-Champaign, 11-14 October.
• Myles, F. 2002. Linguistic development in classroom learners of French: a cross-sectional study (No. End of ESRC award report R000223421). Southampton: University of Southampton.
• Myles, F. 2005. Interlanguage corpora and second language acquisition research. Second Language Research, 21,4: 373-391.