Corpus Linguistics (2) The Tools of the Trade http://tinyurl.com/669o4zt
Today’s session
• An introduction to some features of tools
• Demo of different (kinds of) tools
• Hands-on practice with one tool
AIM: Help you know what to look for in a tool for your work (and what options there are)
Quick summary from LAST WEEK’S SESSION
Need to know
• What is in your corpus
• How to extract information
• How to use information
There are different TYPES OF TOOLS
Different kinds of tools
• Online / offline
• For one particular corpus / for any corpus
• Use straight away / prepare corpus
• 'Free' / licence conditions and costs
Tools may
• take different text formats: .txt, .xml, .html
• have different functions: concordance, wordlist, statistics, collocation, …
• handle annotation
• interpret tags, ignore tags, treat tags as text
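The three ways of dealing with tags can be sketched in a few lines of Python. This is only an illustration: the `<w pos='…'>` markup in `sample` is a made-up tagging scheme, not the format of any particular corpus, and the function names are hypothetical.

```python
import re

TAG = re.compile(r"<[^>]+>")                       # anything that looks like <...>
W_ELEM = re.compile(r"<w pos='([^']+)'>([^<]+)</w>")  # hypothetical word element

def tokens_ignoring_tags(text):
    """Ignore tags: strip the markup, then split into words."""
    return re.findall(r"\w+", TAG.sub(" ", text).lower())

def tokens_interpreting_tags(text):
    """Interpret tags: keep each word together with its part-of-speech label."""
    return [(word.lower(), pos) for pos, word in W_ELEM.findall(text)]

def tokens_tags_as_text(text):
    """Treat tags as text: tag names and attributes are counted as words."""
    return re.findall(r"\w+", text.lower())

# Invented sample line, for illustration only
sample = "<s><w pos='DET'>la</w> <w pos='N'>question</w></s>"
print(tokens_ignoring_tags(sample))
print(tokens_interpreting_tags(sample))
print(tokens_tags_as_text(sample))
```

The third function shows why it matters to check how a tool handles annotation: if tags are read as text, 'w' and 'pos' end up in the word list.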
Different tools have different functions. TYPICAL FUNCTIONS
Concordance
• Search word + context
• Can be displayed as KWIC
• Can usually be sorted
• Used to see patterns of use
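A minimal sketch of what a concordancer does, assuming the corpus is a single plain-text string. The names `kwic` and `sort_by_right_context` and the sample sentences are invented for illustration; real tools offer many more search and sort options.

```python
import re

def kwic(text, node, width=30):
    """Return (left context, match, right context) for each whole-word hit."""
    pattern = re.compile(r"\b%s\b" % re.escape(node), re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Pad both sides so the search word lines up in a column
        lines.append((left.rjust(width), m.group(0), right.ljust(width)))
    return lines

def sort_by_right_context(lines):
    """Sort concordance lines by what follows the search word."""
    return sorted(lines, key=lambda line: line[2].strip().lower())

# Invented sample text, for illustration only
text = ("Il a posé une question difficile. La question est de savoir "
        "qui viendra. Il est question de partir.")

for left, node, right in sort_by_right_context(kwic(text, "question")):
    print(left, node, right)
```

Sorting on the right (or left) context groups similar lines together, which is what makes patterns of use visible.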
Chambers-Rostand Corpus of Journalistic French – made with AntConc
Word List (frequency and alphabetical)

Sorted by frequency:
Rank  Freq    Word
1     110721  de
2     58740   la
3     44832   l
4     43136   le
5     41836   à
6     37974   les
7     36106   et
8     34903   d
9     33714   des
10    27968   en
11    25300   du
12    23830   un
13    21088   est
14    20357   a
15    20146   une

Sorted alphabetically:
Rank  Freq  Word
29    4     abaissement
30    22    abaisser
31    6     abaissée
32    2     ABANDON
33    56    abandon
34    4     abandonnait
35    12    abandonnant
36    20    abandonne
37    8     abandonnent
38    40    abandonner
39    2     abandonnerons
40    2     abandonneront
41    2     ABANDONNÉ
42    42    abandonné
43    8     abandonnée
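A rough sketch of how a word list like this can be built. It is not AntConc's own method: the tokeniser here simply splits on anything that is not a letter, digit or underscore (so "l'abandon" becomes "l" and "abandon"), and case is kept as found. The function names and the sample sentence are invented.

```python
import re
from collections import Counter

def word_list(text):
    """Count every run of word characters; apostrophes split words."""
    return Counter(re.findall(r"\w+", text))

def by_frequency(counts):
    """Most frequent first; ties broken alphabetically."""
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))

def alphabetical(counts):
    """Plain code-point order, which is not the same as dictionary order."""
    return sorted(counts.items())

# Invented sample text, for illustration only
counts = word_list("la question de la paix et de la guerre")

for rank, (word, freq) in enumerate(by_frequency(counts), start=1):
    print(rank, freq, word)
```

Decisions such as whether to fold case or how to treat apostrophes change the counts, so two tools can give different word lists for the same corpus.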
Collocates: adjectives immediately preceding BUSINESS
Corpus of Contemporary American English http://www.americancorpus.org/
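A simplified sketch of the idea behind this kind of search. It counts every word that comes immediately before the search word in plain text; restricting the results to adjectives, as in the slide, needs a part-of-speech tagged corpus, which this sketch does not assume. The function name and sample text are invented, and the count ignores sentence boundaries.

```python
import re
from collections import Counter

def left_collocates(text, node):
    """Count the words found immediately before `node` (case-insensitive)."""
    tokens = re.findall(r"\w+", text.lower())
    node = node.lower()
    # Pair each token with the one that follows it
    return Counter(prev for prev, cur in zip(tokens, tokens[1:]) if cur == node)

# Invented sample text, for illustration only
text = ("Small business owners say big business ignores them. "
        "A small business needs credit.")

for word, freq in left_collocates(text, "business").most_common():
    print(freq, word)
```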
Questions to consider when choosing a corpus tool
• What functions does it have?
• What languages can it handle?
• What text format?
• What platform (PC, Mac, Unix)?
• How does it deal with tags and annotation?
• Is there any user support?
• What will it cost (now and later)?
• What are the alternatives?
What’s the question? Foreign learners of French often misuse phrases involving the word 'question'. English speakers often make errors, for example saying 'demander une question' or 'il y a question de'. How is 'question' more typically used by native speakers?
https://ota.oerc.ox.ac.uk/bncweb-cgi/BNCweb.pl BNCweb (Oxford log-in)
BNCweb http://bncweb.lancs.ac.uk/bncwebSignup/
Tip of the week http://www.wordfrequency.info/ Book and word lists from Corpus of Contemporary American English (COCA). By Mark Davies
Next week (Session 3): Collocation
Corpus linguists claim to have identified an important principle that is responsible for the creation of much of the meaning of texts: collocation (co-occurrences). What is it, and are the claims true?
Optional reading:
* Xiao, Richard, and Tony McEnery (2006). "Collocation, Semantic Prosody, and Near Synonymy: A Cross-Linguistic Perspective." Applied Linguistics 27(1): 103-129. http://applij.oxfordjournals.org/cgi/content/full/27/1/103