Using Corpora to Teach Vocabulary • Helping Students Help Themselves
What are Corpora? Large, free, computerized databases of natural language • Corpus of Contemporary American English (COCA) • MICASE (Michigan Corpus of Academic Spoken English) • MICUSP (Michigan Corpus of Upper-Level Student Papers) • British National Corpus
Corpus Linguistics = Methodology Bennett (2010) • Corpus-influenced materials • Textbooks, materials based on frequency & patterns • Corpus-cited texts • Dictionaries (Collins COBUILD) • Grammar books (Real Grammar: A Corpus-Based Approach to English) • Corpus-designed materials • Learner or teacher-created using a corpus
Pre-made Materials Corpus learning 101
Vocabulary Based on Corpus Studies Frequency Lists • West’s General Service List (first ~2000 most frequent words) • Academic Word List (570 word families; 3000 words) LexTutor’s VocabProfiler • Insert your own texts to assess vocabulary level
West’s General Service List 1 the 2 be 3 of 4 and 5 a 6 to 7 in 8 he 9 have 10 it 11 that 12 for 13 they 14 I 15 with 16 as 17 not 18 on 19 she 20 at 21 by 22 this 23 we 24 you 25 do 26 but 27 from 28 or 29 which 30 one 31 would
AWL abandon abstract academy access accommodate accompany accumulate accurate achieve acknowledge acquire adapt
AWL Analyse – head word analyser analysers analyses analysing analysis (most common) analyst analysts analytic analytical analytically analyze analyzed analyzes analyzing
VocabProfiler Why? • Materials development • Check vocabulary levels of webpages • Decide on vocabulary to focus on How? • Create a .txt document in Word (Save As, then select .txt) • Copy the text • Paste the text into the VocabProfiler site • Double-click on proper nouns to exclude • Click Submit
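To make the idea behind a vocabulary profile concrete, here is a minimal sketch of what such a tool does in principle: it sorts each word of a text into a frequency band. This is not LexTutor's implementation. The two word sets are tiny hypothetical samples, the `profile` function and its parameters are invented for illustration, and matching is by exact word form only (no word families).

```python
import re
from collections import Counter

# Hypothetical mini word lists; a real profile would load the full GSL and AWL.
GSL_SAMPLE = {"the", "be", "of", "and", "a", "to", "in", "have", "it",
              "that", "for", "we", "this"}
AWL_SAMPLE = {"analyse", "analysis", "acquire", "access", "accurate"}

def profile(text, proper_nouns=()):
    """Count tokens per band: 'gsl', 'awl', or 'off-list'."""
    # Proper nouns are excluded, mirroring the "double-click to exclude" step.
    excluded = {w.lower() for w in proper_nouns}
    tokens = [t for t in re.findall(r"[a-z']+", text.lower())
              if t not in excluded]
    bands = Counter()
    for token in tokens:
        if token in GSL_SAMPLE:
            bands["gsl"] += 1
        elif token in AWL_SAMPLE:
            bands["awl"] += 1
        else:
            bands["off-list"] += 1
    return bands

sample = "In this analysis we acquire access to the data that Debra collected."
result = profile(sample, proper_nouns=["Debra"])
print(dict(result))
```

A high share of off-list words suggests the text may be too difficult for the class, or points to vocabulary worth pre-teaching.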
MS Office Shortcuts Ctrl + A select all Ctrl + C copy Ctrl + V paste Ctrl + X cut Ctrl + Z undo
Data-Driven Learning Using a Corpus to Teach Vocabulary
Knowing a Word (Nation, 2001) Metalinguistic awareness = dictionary definition + • spelling • morphology • part of speech • pronunciation • variant meanings • collocations • specific uses • register
Data-Driven Learning (Johns, 1991) • Learners become “language detectives” (Johns, 1991) • Uses authentic examples and encourages “noticing” or “awareness-raising” (Römer, 2008)
Using a Corpus Pros • Natural language • Practice analytical skills/verify choices • Creates self-sufficient learners • Contexts rich, varied • Focus on accuracy Cons • Significant teacher training needed • Few ready-made exercises and challenging to design • Lexical information vast/confusing • Contexts incomplete • No focus on fluency
Data-Driven Learning: The Corpus of Contemporary American English
COCA • 450 million words • 20 million words added yearly (1990-2012) • 90 million spoken words • Academic and general • Spoken • Fiction • Magazines • Newspapers • Academic
Academic Genres • Education • Geography/Social Science • Law/Philosophy • Humanities • Philosophy/Religion • Science/Technology • Medicine • Miscellaneous
Class Use Sign up for group access at least 2 days prior to use • http://corpus.byu.edu/groupAccess.asp Notice the group limits • One active request at a time • Four-hour limit • Teacher must be a registered user
Parts of Speech with KWIC (Key Words in Context) They certainly will not grow as learners without opportunity to analyze their strengths and weaknesses.
Language Development • KWIC search • Parts of speech color-coded • Students code nearby words • Students code a 100-word sample
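For readers curious how a KWIC display is built, the following is a minimal sketch that lines up a keyword with a few words of context on each side. It runs on the slide's own example sentence rather than on COCA. The `kwic` function and its `width` parameter are hypothetical names for illustration, and the sketch does no part-of-speech tagging or color coding.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, keyword, right) tuples with `width` words of context."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows

sentence = ("They certainly will not grow as learners without opportunity "
            "to analyze their strengths and weaknesses.")
rows = kwic(sentence, "analyze")
for left, kw, right in rows:
    # Right-align the left context so the keyword forms a column.
    print(f"{left:>30} [{kw}] {right}")
```

Students can then label the part of speech of the words immediately before and after the keyword.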
Language Development Frequency searches (easiest) • Reading fluency • Should you memorize dawdle, meander, or drift?
Phrasal Verb Frequencies Intermediate Class • Explain what phrasal verbs are with examples (mess around, use up, call on, wrap up) • Use COCA to find sample sentences
High beginning writing class • Check spelling and non-English words on 30-minute timed writing • Students look for words that might be misspelled • Use COCA • If frequency below 10, circle the word (e.g., speciel)
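The "frequency below 10" rule can be sketched as a simple threshold check. The counts below are invented placeholders standing in for what students would look up in COCA by hand. They are not real corpus figures, and `words_to_circle` is a hypothetical helper, not part of any COCA tool.

```python
import re

# Hypothetical frequency counts standing in for manual COCA lookups.
FREQUENCIES = {"my": 500000, "teacher": 60000, "is": 900000, "a": 1000000,
               "special": 45000, "person": 80000, "speciel": 2}

def words_to_circle(text, frequencies, threshold=10):
    """Return words whose frequency is below the threshold, in first-seen order."""
    flagged = []
    for word in re.findall(r"[a-z']+", text.lower()):
        # Words missing from the table count as frequency 0.
        if frequencies.get(word, 0) < threshold and word not in flagged:
            flagged.append(word)
    return flagged

flagged = words_to_circle("My teacher is a speciel person.", FREQUENCIES)
print(flagged)
```

A very low count does not prove a misspelling (rare words and names also score low), so the circled words are candidates for the student to check, not confirmed errors.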
COCA for Morphology • Transport • transportation • transported • transports
Wildcard (*) Searches Circle the word not related in meaning • clar*: clarify, clarinet, clarity, clark • *note: connote, denote, keynote
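As a rough illustration of how the * wildcard matches, the sketch below applies the two patterns from the slide to a small hand-typed word list using Python's standard `fnmatch` module. The word list and the `wildcard_search` helper are assumptions for illustration. COCA's own search syntax has more features than this.

```python
from fnmatch import fnmatchcase

# Small hand-typed word list; a real search would run against the corpus.
WORDS = ["clarify", "clarinet", "clarity", "clark",
         "connote", "denote", "keynote", "transport"]

def wildcard_search(pattern, words):
    """Return the words matching a pattern where * stands for any letters."""
    return [w for w in words if fnmatchcase(w, pattern)]

clar_matches = wildcard_search("clar*", WORDS)
note_matches = wildcard_search("*note", WORDS)
print(clar_matches)
print(note_matches)
```

The matches share spelling, not meaning, which is why the exercise asks students to circle the unrelated word in each set.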
What are Concordancers? • Computer programs used to analyze text • LexTutor • VocabProfiler • AntConc • Create specialized corpora for ESP classes
Websites of Interest ELT Resource Training Wiki (with Amber Warren) • http://eltresourcetraining.pbworks.com AWL • http://englishvocabularyexercises.com VocabProfiler • http://www.lextutor.ca/vp/ Grimm’s Fairy Tales in .txt • http://www.cs.cmu.edu/~spok/grimmtmp/
Contact Information Debra S. Lee Vanderbilt University English Language Center dleetn@gmail.com Twitter: dleetn Google+: dleetn