Corpus Linguistics Case study 2. Grammatical studies based on morphemes or words. G Kennedy (1998) An introduction to corpus linguistics , London: Longman, pp. 121-137. Introduction. Survey of corpus-based studies of uses of verbs and verb forms, taken from various sources
Corpus Linguistics Case study 2 Grammatical studies based on morphemes or words. G Kennedy (1998) An introduction to corpus linguistics, London: Longman, pp. 121-137
Introduction • Survey of corpus-based studies of uses of verbs and verb forms, taken from various sources • Besides considering the results, we should also consider what the parameters are for each search, and how we could express them • Analysis of the Brown and LOB corpora reveals that nearly 20% of words are verbs • Apparently 224 different verb forms are possible
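As a first example of expressing a search parameter, the "share of words that are verbs" figure can be sketched over a tagged corpus. This is only an illustration: the function name `verb_share`, the Penn Treebank-style tags (`MD`, `VB…`), the decision to count modals as verbs, and the toy sentence are all assumptions, not the procedure used on Brown or LOB.

```python
def verb_share(tagged_tokens):
    """Proportion of word tokens whose tag marks a verb.

    tagged_tokens: list of (word, tag) pairs. Tags are ASSUMED to be
    Penn Treebank style; Brown and LOB use their own tagsets.
    """
    # Ignore punctuation: keep only tokens containing a letter or digit.
    words = [(w, t) for w, t in tagged_tokens if any(c.isalnum() for c in w)]
    if not words:
        return 0.0
    # Assumption: modals (MD) are counted as verbs alongside VB, VBD, VBZ...
    verbs = sum(1 for _, t in words if t == "MD" or t.startswith("VB"))
    return verbs / len(words)


# Invented toy sentence, not corpus data.
sample = [("She", "PRP"), ("has", "VBZ"), ("left", "VBN"),
          ("the", "DT"), ("room", "NN"), (".", ".")]
print(verb_share(sample))
```

Whether auxiliaries, modals and participles used as adjectives count as "verbs" is exactly the kind of parameter that changes the resulting percentage.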
Ota (1963) • Manual (pre-computer) analysis of a small corpus (150k words), mainly transcribed speech • 17,166 finite verb forms (977 of them passives) • Table (next slide) shows distribution of tenses • Looked at relative frequency of adverbs with different tenses • today and this year more commonly found with past tense • be is the most frequent verb (30%); together with 6 other verbs it accounts for 50% of all verbs
Notice predominance of simple present and simple past over other tenses • Simple past as frequent as, or more frequent than, present with said, went, did, came, made, took • Present progressive hardly used, except going, doing, coming
Ota (1963) • Remember that Ota’s corpus is mostly spoken language • Predominance of present over past also found in other studies • Stative verbs (know, want) rarely found in progressive, and 10x more frequently with 1st person subject
Joos (1964) • Also manual • Study of 9100 finite and non-finite verb forms in a book, an account of a courtroom trial • 23 most frequent forms listed on next slide • These cover 95% of data • Another 56 forms found in remaining 5% of data • 145 of the 224 possible forms were not found at all
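A coverage figure of the "23 forms cover 95% of the data" kind can be computed from a frequency list. The sketch below is a minimal illustration; the function name and the toy counts are invented and are not Joos's data.

```python
from collections import Counter


def forms_needed_for_coverage(counts, target):
    """Number of most frequent forms needed to cover `target` (0 to 1) of all tokens."""
    total = sum(counts.values())
    running = 0
    # Walk down the frequency list, accumulating token counts.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        running += n
        if running / total >= target:
            return rank
    return len(counts)


# Invented toy frequencies (sum = 100), for illustration only.
toy_counts = Counter({
    "simple past": 50,
    "simple present": 30,
    "present perfect": 10,
    "past progressive": 6,
    "past perfect": 4,
})
print(forms_needed_for_coverage(toy_counts, 0.95))  # 4
```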
Three studies • Ota (1963) – transcribed speech • Joos (1964) – written account of a verbal process • George (1963) – analysed 108,783 verb forms from a 0.5m-word corpus of expository texts, mostly from newspapers, nonfiction and reference books • Despite genre differences, findings are very similar
Perfect and progressive less frequent than simple tenses in all cases • Present more frequent than past in these compound tenses in speech • Vice versa in written English • Results slightly biased by presence of be, which is rarely used with perfect or progressive aspect
Manual counts confirmed • Early manual studies benefited from replication with computerized corpora, which confirmed the findings (eg with Brown corpus) • Might make a good project for some of you • Note how much distribution of tenses varies with genre
Interlude • How would you define the finite tense forms in a corpus search? • Combination of verb form • bare form (?infinitive), -s, -ed • note that simple past and past participle differ with strong verbs, but not others • Auxiliaries (have, be) + participles (-ing, -ed) • Various modals (if will and would indicate tenses, so do can and must) • Active vs passive • Infinitive with to – non-finite tense forms difficult to distinguish • Some verb groups are discontinuous, eg should normally have been • And many verbs form homograph pairs with nouns
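One possible answer to the question above, sketched under stated assumptions: the text is already POS-tagged as `word_TAG` tokens with Penn Treebank-style tags (an assumption; Brown and LOB use different tagsets), so noun/verb homographs are left to the tagger. A regular expression collects verb groups, allowing intervening adverbs so that discontinuous groups such as should normally have been are kept together, and a helper labels each group. All names here are hypothetical.

```python
import re

# Tokens look like "word_TAG"; Penn Treebank-style tags are ASSUMED.
VERB = r"(?<!\S)\S+_(?:MD|VB[DGNPZ]?)(?!\S)"
ADVERB = r"\S+_RB(?!\S)"
# A verb group: a verb, then further verbs optionally preceded by adverbs.
VERB_GROUP = re.compile(rf"{VERB}(?:\s+(?:{ADVERB}\s+)*{VERB})*")

BE = {"be", "am", "is", "are", "was", "were", "been", "being"}
HAVE = {"have", "has", "had", "having"}
FINITE_TAGS = {"MD", "VBD", "VBZ", "VBP"}


def describe(group):
    """Label one matched verb group with coarse grammatical features."""
    pairs = [tok.rsplit("_", 1) for tok in group.split()]
    verbs = [(w.lower(), t) for w, t in pairs if t != "RB"]  # drop adverbs
    features = set()
    if verbs[0][1] in FINITE_TAGS:
        features.add("finite")
    if verbs[0][1] == "MD":
        features.add("modal")
    # Look at each auxiliary together with the verb that follows it.
    for (w1, _), (_, t2) in zip(verbs, verbs[1:]):
        if w1 in HAVE and t2 == "VBN":
            features.add("perfect")
        if w1 in BE and t2 == "VBG":
            features.add("progressive")
        if w1 in BE and t2 == "VBN":
            features.add("passive")
    return features


# Invented example sentence.
tagged = "The_DT report_NN should_MD normally_RB have_VB been_VBN checked_VBN ._."
for m in VERB_GROUP.finditer(tagged):
    print(m.group(0), sorted(describe(m.group(0))))
```

The sketch only allows adverbs inside a group; other interruptions (for example a subject in questions, as in Has she left?) would need a wider pattern.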
George (1963) • Finite vs non-finite verb forms
Francis & Kučera (1982) • Analysed syntactic and semantic functions of different verb forms in Brown corpus, eg …
Modals • Modals make up 7.6% of verb forms in Brown corpus • Coates (1983) studied both distribution and use • Major differences between spoken and written English
Modals • Epistemic use (eg You must be exhausted) more frequent than root meaning in spoken corpus • Major genre differences between root and epistemic use of some modals
Main use with bare form or with passive • shall, would rarely seen with passive • can more usually with passive
Mindt (1995) • Classified uses of modals, eg should: • Advisability/desirability (eg You should plant potatoes) 55% • Hypothetical event/result (eg I should have left) 36% • Politeness (eg I should like to thank you) 9% • Other non-modal meanings include tense • Past time (I told him I should have gone) 38% • Future time (This should be done soon) 25% • Present time (I don’t think we should wait) 19% • Timeless (You should brush your teeth regularly) 18%
Use of passive and genre • Study also showed that agentless passive (80%) much more usual than passive with by-agent
Other topics • Verbs + particles • Use of subjunctive (if it were…, important that he join us): more prevalent in AmE than BrE apparently • Prepositions • Frequency • Immediate right collocates • Use as prepositional object markers (look at, wait for etc) • Semantic function (locative/time vs other) • Conjunctions (eg since, when, because) • Frequency • Function, esp. as indicative of genre • More vs less (and fewer)
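The "immediate right collocates" search mentioned for prepositions can be sketched as a count of the word directly following each occurrence of a node word. This is a minimal illustration on an invented sentence; the function name and whitespace tokenization are assumptions.

```python
from collections import Counter


def right_collocates(tokens, node):
    """Count the words that occur immediately to the right of `node`."""
    node = node.lower()
    return Counter(
        nxt.lower()
        for cur, nxt in zip(tokens, tokens[1:])
        if cur.lower() == node
    )


# Invented toy text, tokenized on whitespace for simplicity.
text = "look at the sky , then look at the map and wait at home"
counts = right_collocates(text.split(), "at")
print(counts.most_common())  # [('the', 2), ('home', 1)]
```

The same function applied to the word to the left (swap `cur` and `nxt`) would show which verbs a preposition follows, as in look at and wait for.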