
Corpus Linguistics Case study 2

Corpus Linguistics Case study 2. Grammatical studies based on morphemes or words. G Kennedy (1998) An introduction to corpus linguistics, London: Longman, pp. 121-137. Introduction. Survey of corpus-based studies of uses of verbs and verb forms, taken from various sources

tconey


Presentation Transcript


  1. Corpus Linguistics Case study 2 Grammatical studies based on morphemes or words. G Kennedy (1998) An introduction to corpus linguistics, London: Longman, pp. 121-137

  2. Introduction • Survey of corpus-based studies of uses of verbs and verb forms, taken from various sources • Besides considering the results, we should also consider what the parameters are for each search, and how we could express them • Analysis of the Brown and LOB corpora reveals that nearly 20% of words are verbs • Apparently 224 different verb forms are possible

  3. Ota (1963) • Manual (pre-computer) analysis of a small corpus (150k words), mainly transcribed speech • 17,166 finite verb forms (977 of them passives) • Table (next slide) shows distribution of tenses • Looked at relative frequency of adverbs with different tenses • today and this year more commonly found with past tense • be is the most frequent verb (30%); together with 6 other verbs it accounts for 50% of all verbs

  4. Notice the predominance of simple present and simple past over other tenses • simple past as frequent as, or more frequent than, present with said, went, did, came, made, took • present progressive hardly used, except with going, doing, coming

  5. Ota (1963) • Remember that Ota’s corpus is mostly spoken language • Predominance of present over past also found in other studies • Stative verbs (know, want) rarely found in progressive, and 10x more frequently with 1st person subject

  6. Joos (1964) • Also manual • Study of 9100 finite and non-finite verb forms in a book, an account of a courtroom trial • 23 most frequent forms listed on next slide • These cover 95% of data • Another 56 forms found in remaining 5% of data • 145 of the 224 possible forms were not found at all
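The form counts on the Joos slide can be cross-checked with a little arithmetic. The sketch below uses only the figures quoted above; the variable names are illustrative and not taken from Joos or Kennedy.

```python
# Figures quoted on the Joos (1964) slide.
POSSIBLE_FORMS = 224      # verb forms that are possible in principle
frequent_forms = 23       # forms covering 95% of the data
rare_forms = 56           # further forms found in the remaining 5%
unattested_forms = 145    # forms not found at all

# Attested and unattested forms should add up to the inventory of 224.
attested_forms = frequent_forms + rare_forms
assert attested_forms + unattested_forms == POSSIBLE_FORMS

attested_share = attested_forms / POSSIBLE_FORMS
print(f"{attested_forms} of {POSSIBLE_FORMS} forms attested ({attested_share:.1%})")
```

So the three counts are mutually consistent: 79 forms were attested, a little over a third of the possible inventory, and 23 of those do almost all the work.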

  7. Again, 62% are simple present or simple past

  8. Three studies • Ota (1963) – transcribed speech • Joos (1964) – written account of a verbal process • George (1963) – analysed 108,783 verb forms from a 0.5m word corpus of expository texts mostly from newspapers, nonfiction and reference books • Despite genre differences, findings are very similar

  9. Perfect and progressive less frequent than simple tenses in all cases • Present more frequent than past in these compound tenses in speech • Vice versa in written English • Results slightly biased by presence of be, which is rarely used with perfect or progressive aspect

  10. Manual counts confirmed • Early manual studies benefited from replication with computerized corpora, which confirmed the findings (eg with Brown corpus) • Might make a good project for some of you • Note how much distribution of tenses varies with genre

  11. [Table not reproduced; column headings: informative, imaginative]

  12. Interlude • How would you define the finite tense forms in a corpus search? • Combination of verb form • bare form (?infinitive), -s, -ed • note that simple past and past participle differ with strong verbs, but not others • Auxiliaries (have, be) + participles (-ing, -ed) • Various modals (if will and would indicate tenses, so do can and must) • Active vs passive • Infinitive with to – non-finite tense forms difficult to distinguish • Some verb groups are discontinuous, eg should normally have been • And many verbs form homograph pairs with nouns
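One way to make the interlude's question concrete is to write the search down as patterns over raw text. The following is only a rough sketch under strong assumptions: the word lists, the crude adverb slot and the function name `find_verb_groups` are all invented for illustration, and a serious study would query a part-of-speech-tagged corpus instead.

```python
import re

# Hypothetical, deliberately small word lists (assumptions for this sketch).
HAVE = r"(?:have|has|had)"
BE = r"(?:am|is|are|was|were|be|been|being)"
# Crude slot for one intervening adverb, to catch some discontinuous groups.
ADVERB = r"(?:\w+ly|not|never|always)"

PATTERNS = {
    # have + (adverb) + word ending in -ed/-en
    "perfect": re.compile(rf"\b{HAVE}\s+(?:{ADVERB}\s+)?(\w+(?:ed|en))\b", re.I),
    # be + (adverb) + word ending in -ing
    "progressive": re.compile(rf"\b{BE}\s+(?:{ADVERB}\s+)?(\w+ing)\b", re.I),
}

def find_verb_groups(text):
    """Return (label, matched string) pairs for each pattern, in pattern order."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits

sample = "She has finished the report and is slowly working on the next one."
print(find_verb_groups(sample))
```

The limitations mirror the bullets above: the -ed/-en suffix test misses strong participles such as gone or made, nothing stops a noun homograph from matching, and only a single adverb can intervene between auxiliary and participle.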

  13. George (1963) • Finite vs non-finite verb forms

  14. Francis & Kučera (1982) • Analysed syntactic and semantic functions of different verb forms in Brown corpus, eg …

  15. Modals • Modals make up 7.6% of verb forms in Brown corpus • Coates (1983) studied both distribution and use • Major differences between spoken and written English
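A figure such as "modals make up 7.6% of verb forms" is a ratio of tag counts. The sketch below shows how such a share could be computed from (word, tag) pairs. The tiny sample and the simplified tag names (`MD` for modals, `VB...` for other verb forms) are assumptions for illustration, not the full Brown tagset, and `modal_share` is a hypothetical helper.

```python
def modal_share(tagged_tokens, modal_tag="MD", verb_prefixes=("VB", "MD")):
    """Share of modal tokens among all tokens whose tag marks a verb form."""
    verb_tags = [tag for _, tag in tagged_tokens if tag.startswith(verb_prefixes)]
    if not verb_tags:
        return 0.0
    modals = sum(1 for tag in verb_tags if tag == modal_tag)
    return modals / len(verb_tags)

# Hand-tagged toy sample (invented, not corpus data).
sample = [
    ("You", "PRP"), ("must", "MD"), ("be", "VB"), ("exhausted", "VBN"), (".", "."),
    ("She", "PRP"), ("left", "VBD"), (".", "."),
]
print(f"{modal_share(sample):.1%}")
```

Note that the denominator includes the modals themselves, matching the slide's wording of modals as a proportion of verb forms.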

  16. Modals • epistemic use eg You must be exhausted, more frequent than root meaning in spoken corpus • Major genre differences between root and epistemic use of some modals

  17. Main use with bare form or with passive • shall, would rarely seen with passive • can more usually with passive

  18. Mindt (1995) • Classified uses of modals, eg should: • Advisability/desirability (eg You should plant potatoes) 55% • Hypothetical event/result (eg I should have left) 36% • Politeness (eg I should like to thank you) 9% • Other non-modal meanings include tense • Past time (I told him I should have gone) 38% • Future time (This should be done soon) 25% • Present time (I don’t think we should wait) 19% • Timeless (You should brush your teeth regularly) 18%

  19. Voice: active vs passive

  20. Use of passive and genre • Study also showed that agentless passive (80%) much more usual than passive with by-agent
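The agentless vs by-agent split could be approximated in a raw-text search along the following lines. This is a heuristic sketch only, not the method of the study cited: the pattern, the word list and the function name `classify_passives` are assumptions, and a following by is not always an agent (eg was finished by Monday).

```python
import re

# Hypothetical list of forms of "be" (assumption for this sketch).
BE = r"(?:am|is|are|was|were|be|been|being)"
# be + word ending in -ed/-en, optionally followed directly by "by".
PASSIVE = re.compile(rf"\b{BE}\s+(\w+(?:ed|en))\b(\s+by\b)?", re.I)

def classify_passives(text):
    """Count candidate passives with and without an immediately following 'by'."""
    counts = {"agentless": 0, "by_agent": 0}
    for match in PASSIVE.finditer(text):
        key = "by_agent" if match.group(2) else "agentless"
        counts[key] += 1
    return counts

sample = ("The house was painted by my sister. "
          "The window was broken. The letters were written.")
print(classify_passives(sample))
```

As with any suffix-based search, strong participles without -ed/-en (eg was made) are missed, so hits would need manual checking before being reported as frequencies.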

  21. Other topics • Verbs + particles • Use of subjunctive (if it were…, important that he join us): more prevalent in AmE than BrE apparently • Prepositions • Frequency • Immediate right collocates • Use as prepobj markers (look at, wait for etc) • Semantic function (locative/time vs other) • Conjunctions (eg since, when, because) • Frequency • Function, esp. as indicative of genre • More vs less (and fewer)
