This case study explores corpus-based research on verb forms, analyzing tense distribution in English corpora and examining the prevalence of different verb forms. The study compares findings from various sources, shedding light on the usage patterns and frequencies of verbs in different contexts.
Corpus Linguistics Case study 2 Grammatical studies based on morphemes or words. G Kennedy (1998) An introduction to corpus linguistics, London: Longman, pp. 121-137
Introduction • Survey of corpus-based studies of uses of verbs and verb forms, taken from various sources • Besides considering the results, we should also consider what the parameters are for each search, and how we could express them • Analysis of Brown and LOB corpora reveals that nearly 20% of words are verbs • Apparently 224 different verb forms are possible
Ota (1963) • Manual (pre-computer) analysis of a small corpus (150k words), mainly transcribed speech • 17,166 finite verb forms (977 of them passives) • Table (next slide) shows distribution of tenses • Looked at relative frequency of adverbs with different tenses • today and this year more commonly found with past tense • be is the most frequent verb (30%); together with 6 other verbs it accounts for 50% of all verbs
Notice predominance of simple present tense and simple past over other tenses • simple past as frequent as or more frequent than present with said, went, did, came, made, took • present progressive hardly used, except going, doing, coming
Ota (1963) • Remember that Ota’s corpus is mostly spoken language • Predominance of present over past also found in other studies • Stative verbs (know, want) rarely found in progressive, and 10x more frequently with 1st person subject
Joos (1964) • Also manual • Study of 9100 finite and non-finite verb forms in a book, an account of a courtroom trial • 23 most frequent forms listed on next slide • These cover 95% of data • Another 56 forms found in remaining 5% of data • 145 of the 224 possible forms were not found at all
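The counts on this slide are consistent with each other: 224 possible forms minus 145 unattested forms leaves 79 attested forms, which is the 23 frequent forms plus the other 56. A minimal sketch of how a coverage figure like "23 forms cover 95% of the data" could be computed from a frequency list is given here. The function name `forms_needed` and the toy counts are assumptions made for illustration; they are not Joos's data or method.

```python
from collections import Counter

def forms_needed(freqs, target=0.95):
    """Return how many of the most frequent forms are needed to reach `target` coverage."""
    total = sum(freqs.values())
    running = 0
    for rank, (_, count) in enumerate(freqs.most_common(), start=1):
        running += count
        if running / total >= target:
            return rank
    return len(freqs)

# Consistency check on the figures quoted on the slide.
possible, not_found = 224, 145
attested = possible - not_found
assert attested == 23 + 56

# Hypothetical counts for illustration only, NOT Joos's data.
toy = Counter({
    "simple past": 500,
    "simple present": 300,
    "infinitive": 160,
    "present perfect": 25,
    "past progressive": 15,
})
print(forms_needed(toy))
```

With a real tagged corpus, `freqs` would hold one count per verb-form category, and the same loop would show how steeply the distribution falls away after the most frequent forms.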
Three studies • Ota (1963) – transcribed speech • Joos (1964) – written account of a verbal process • George (1963) – analysed 108,783 verb forms from a 0.5m word corpus of expository texts mostly from newspapers, nonfiction and reference books • Despite genre differences, findings are very similar
Perfect and progressive less frequent than simple tenses in all cases • Present more frequent than past in these compound tenses in speech • Vice versa in written English • Results slightly biased by presence of be, which is rarely used with perfect or progressive aspect
Manual counts confirmed • Early manual studies benefited from replication with computerized corpora, which confirmed the findings (eg with Brown corpus) • Might make a good project for some of you • Note how much distribution of tenses varies with genre
Interlude • How would you define the finite tense forms in a corpus search? • Combination of verb form • bare form (?infinitive), -s, -ed • note that simple past and past participle differ with strong verbs, but not others • Auxiliaries (have, be) + participles (-ing, -ed) • Various modals (if will and would indicate tenses, so do can and must) • Active vs passive • Infinitive with to – non-finite tense forms difficult to distinguish • Some verb groups are discontinuous, eg should normally have been • And many verbs form homograph pairs with nouns
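One way to make these search parameters concrete is to work on part-of-speech tagged text rather than raw words, which also sidesteps the verb/noun homograph problem. The sketch below is only an illustration under several assumptions: the input is already tagged with Penn Treebank-style tags (MD, VB, VBD, VBZ, VBP, VBG, VBN, RB), a verb group starts at a finite element, and adverbs inside the group are skipped so that discontinuous groups are still found. The function names, the label scheme (modal-led groups are simply labelled "modal") and the example sentence are hypothetical.

```python
BE = {"be", "am", "is", "are", "was", "were", "been", "being"}
HAVE = {"have", "has", "had", "having"}
FINITE = {"MD", "VBD", "VBZ", "VBP"}   # tags that can open a finite verb group
NONFINITE = {"VB", "VBG", "VBN"}       # tags that can continue it
INSIDE = NONFINITE | {"RB"}            # adverbs may interrupt the group

def verb_groups(tagged):
    """Yield one list of (word, tag) pairs per finite verb group, dropping adverbs."""
    i = 0
    while i < len(tagged):
        if tagged[i][1] in FINITE:
            group = [tagged[i]]
            j = i + 1
            while j < len(tagged) and tagged[j][1] in INSIDE:
                if tagged[j][1] != "RB":
                    group.append(tagged[j])
                j += 1
            yield group
            i = j
        else:
            i += 1

def describe(group):
    """Label a verb group by its first element plus perfect/progressive/passive marking."""
    first_tag = group[0][1]
    if first_tag == "MD":
        features = ["modal"]
    elif first_tag == "VBD":
        features = ["past"]
    else:
        features = ["present"]
    # Look at each auxiliary together with the tag of the verb that follows it.
    for (word, _), (_, next_tag) in zip(group, group[1:]):
        word = word.lower()
        if word in HAVE and next_tag == "VBN":
            features.append("perfect")
        if word in BE and next_tag == "VBG":
            features.append("progressive")
        if word in BE and next_tag == "VBN":
            features.append("passive")
    return " ".join(features)

# Hypothetical tagged sentence built around the slide's discontinuous example.
example = [("They", "PRP"), ("should", "MD"), ("normally", "RB"),
           ("have", "VB"), ("been", "VBN"), ("warned", "VBN")]
for group in verb_groups(example):
    print([w for w, _ in group], "->", describe(group))
```

The sketch ignores infinitives with to, other intervening material such as not or a subject in questions, and tagging errors, so counts from it would need manual checking against a sample.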
George (1963) • Finite vs non-finite verb forms
Francis & Kučera (1982) • Analysed syntactic and semantic functions of different verb forms in Brown corpus, eg …
Modals • Modals make up 7.6% of verb forms in Brown corpus • Coates (1983) studied both distribution and use • Major differences between spoken and written English
Modals • epistemic use eg You must be exhausted, more frequent than root meaning in spoken corpus • Major genre differences between root and epistemic use of some modals
Main use with bare form or with passive • shall, would rarely seen with passive • can more usually with passive
Mindt (1995) • Classified uses of modals, eg should: • Advisability/desirability (eg You should plant potatoes) 55% • Hypothetical event/result (eg I should have left) 36% • Politeness (eg I should like to thank you) 9% • Other non-modal meanings include tense • Past time (I told him I should have gone) 38% • Future time (This should be done soon) 25% • Present time (I don’t think we should wait) 19% • Timeless (You should brush your teeth regularly) 18%
Use of passive and genre • Study also showed that agentless passive (80%) much more usual than passive with by-agent
Other topics • Verbs + particles • Use of subjunctive (if it were…, important that he join us): more prevalent in AmE than BrE apparently • Prepositions • Frequency • Immediate right collocates • Use as prepobj markers (look at, wait for etc) • Semantic function (locative/time vs other) • Conjunctions (eg since, when, because) • Frequency • Function, esp. as indicative of genre • More vs less (and fewer)
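As an illustration of the "immediate right collocates" search mentioned for prepositions, the following minimal sketch counts the word that directly follows each occurrence of a node word. The function name `right_collocates`, the simple regular-expression tokenizer and the sample text are assumptions made for this example; a real study would use a tokenized corpus and respect sentence boundaries.

```python
import re
from collections import Counter

def right_collocates(text, node, n=5):
    """Return the `n` most frequent words found immediately to the right of `node`."""
    # Crude tokenizer: lowercase alphabetic strings, apostrophes kept.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(
        right for left, right in zip(tokens, tokens[1:]) if left == node
    )
    return counts.most_common(n)

# Hypothetical sample text, not corpus data.
sample = "She looked at the door, then at the window. Wait for me at home."
print(right_collocates(sample, "at"))
```

The same function applied to a verb such as look or wait would give a first view of which prepositions it takes as prepobj markers.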