
Atkins-Rundell

Atkins and Rundell, The Oxford Guide to Practical Lexicography (2008). Part I: Pre-lexicography. Dictionary types and dictionary users: the birth of a dictionary; types of dictionary; types of dictionary users; tailoring the entry to the user who needs it.



  1. Atkins-Rundell, The Oxford Guide to Practical Lexicography (2008)
     Part I: Pre-lexicography
     Dictionary types and dictionary users
     • The birth of a dictionary
     • Types of dictionary
     • Types of dictionary users
     • Tailoring the entry to the user who needs it

  2. The birth of a dictionary (p. 18)
     • a dictionary is hugely expensive to produce from scratch
     • pre-lexicography involves decisions taken by the senior editor and the publisher
     • see Table 2.2 (a major publisher's project), p. 19
     • academic projects: Dizionario di Anglicismi in italiano, Dictionary of Bioethics, Dictionary of Rum

  3. Marketing research (p. 30)
     • www.macmillandictionaries.com
     • free material for teachers to use in the classroom
     • ‘Word of the Week’ e-zine
     • online questionnaires
     • monitoring log files to check what people have looked up
     • research on dictionary use

  4. Lexicographic evidence (p. 46)
     • introspection: based on our mental lexicon, necessarily partial and subjective
     • objective evidence: observing language in use
     • Rationalism (Chomsky): describe linguistic ‘competence’ (p. 49)
     • Empiricism (corpus linguists): describe linguistic ‘performance’, or typical, frequent and well-dispersed patterns of language

  5. OED and the collection of citations
     • www.sba.unito.it blogosphere, stakeholder, hub, spoke
     • recruiting and training readers
     • collecting slips with citations
     • storing the data in a computer database

  6. The central role of a corpus (p. 53)
     • objective evidence of language is a fundamental prerequisite for a reliable dictionary
     • "A corpus is a collection of pieces of language text in electronic form, selected according to external criteria to represent, as far as possible, a language or language variety as a source of data for linguistic research" (Sinclair 2005: 16)
     • COBUILD: spoken and written, non-technical, current, standard British English

  7. Descriptivism vs prescriptivism
     • Samuel Johnson (1755): ‘to preserve the purity… of our English idiom’
     • only ‘writers of the first reputation’ would be Johnson’s data
     • today a dictionary must provide a genuine snapshot of a language (cf. “dizionario dell’uso”, ‘usage dictionary’)
     • a dictionary is a bridge between norm (the received rules) and usage (the realization of the rules in authentic language use)

  8. Does a corpus favour high-quality language?
     • the lexicographer is a historian, not a critic
     • the BNC is the best pre-web corpus of English
     • the BNC is a ‘gold standard’ for corpus linguists
     • 90% written language, 10% spoken
     • 75% informative, 25% imaginative
     • spoken: 42% demographic, 58% context-governed
     • British English only
     • 100 million words
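The percentages on this slide can be turned into approximate word counts. The sketch below is only an illustration: it takes the slide's round figure of 100 million words at face value and assumes that the 75%/25% informative/imaginative split applies to the written part, which the slide does not state explicitly. The function name `bnc_breakdown` is made up for this example.

```python
# Approximate BNC composition in words, from the proportions on the slide.
# Assumption: the informative/imaginative split applies to the written part.
TOTAL_WORDS = 100_000_000  # round figure quoted on the slide


def bnc_breakdown(total=TOTAL_WORDS):
    """Return approximate word counts per component (integer arithmetic)."""
    written = total * 90 // 100
    spoken = total - written
    return {
        "written": written,
        "written_informative": written * 75 // 100,
        "written_imaginative": written * 25 // 100,
        "spoken": spoken,
        "spoken_demographic": spoken * 42 // 100,
        "spoken_context_governed": spoken * 58 // 100,
    }


if __name__ == "__main__":
    for component, words in bnc_breakdown().items():
        print(f"{component:26s}{words:>14,d}")
```

On these assumptions the spoken demographic component comes to roughly 4.2 million words, which gives a sense of how small the sample of spontaneous conversation is compared with the written part.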

  9. Zipf’s Law (1930s)
     • George Kingsley Zipf, a Harvard linguist, studied word frequency
     • “the frequency with which a word appears in a collection of texts is inversely proportional to its ranking in a frequency table”
     • e.g. in the BNC, ‘was’ ranks 10th with 923,957 hits and ‘at’ ranks 20th with 478,177 hits
     • the 100 most frequent words in English make up 45% of the BNC’s 100 million words
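The two BNC figures on this slide can be used for a quick check of Zipf's Law. If frequency is inversely proportional to rank, then rank × frequency should be roughly constant, and doubling the rank should roughly halve the frequency. The sketch below uses only the two data points from the slide; the function names are invented for this illustration.

```python
# A rough check of Zipf's Law on the two BNC data points quoted on the slide.
# Under the law, frequency ~ C / rank, so rank * frequency should be ~ constant.

observations = {
    "was": (10, 923_957),  # (rank, hits in the BNC)
    "at": (20, 478_177),
}


def zipf_constant(rank, frequency):
    """rank * frequency; roughly the same for every word if the law holds."""
    return rank * frequency


def predicted_frequency(known_rank, known_freq, target_rank):
    """Frequency predicted at target_rank from one known (rank, frequency) pair."""
    return known_freq * known_rank / target_rank


if __name__ == "__main__":
    for word, (rank, freq) in observations.items():
        print(word, zipf_constant(rank, freq))

    rank_was, freq_was = observations["was"]
    rank_at, freq_at = observations["at"]
    predicted = predicted_frequency(rank_was, freq_was, rank_at)
    error = abs(freq_at - predicted) / freq_at
    print(f"predicted for 'at': {predicted:,.1f}; observed: {freq_at:,d}; "
          f"relative error: {error:.1%}")
```

The prediction for ‘at’ derived from ‘was’ comes out within a few per cent of the observed count, which is why the pair works well as a classroom example. Two points are of course not a test of the law over a whole frequency list.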

  10. Corpus size
     • you need a very large corpus to obtain information about rare words
     • the more data we have, the more information we have
     • the larger the corpus, the better the lexical profile

  11. Corpus design: how large?
     • Brown Corpus, AmE (1960s): 1 million words
     • LOB, BrE (1960s)
     • Bank of English (Cobuild): 20 million words
     • BNC: 100 million words
     • ukWaC: 2 billion words (2,000 million words)
     • Il Giornale del Turismo: 150,000 words
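To see why corpus size matters for rare items, the sizes listed here can be combined with one figure from slide 13: ‘to break someone’s service’ has 12 hits in the 100-million-word BNC. The sketch below scales that rate to the other corpora. It rests on a strong simplifying assumption, namely that the expression occurs at the same rate in every corpus regardless of genre, period or variety, which is not true in practice. The names `CORPUS_SIZES` and `expected_hits` are invented for this example, and LOB is left out because the slide gives no size for it.

```python
from fractions import Fraction

# Corpus sizes in words, as listed on the slide (LOB omitted: no size given).
CORPUS_SIZES = {
    "Brown Corpus": 1_000_000,
    "Bank of English (Cobuild)": 20_000_000,
    "BNC": 100_000_000,
    "ukWaC": 2_000_000_000,
    "Il Giornale del Turismo": 150_000,
}

# Reference point from slide 13: 12 hits in the 100-million-word BNC.
BNC_HITS = 12
BNC_SIZE = 100_000_000


def expected_hits(corpus_size, hits=BNC_HITS, reference_size=BNC_SIZE):
    """Expected hits if the item occurred at the same rate as in the reference.

    Assumption (illustrative only): the rate is uniform across corpora.
    """
    return Fraction(hits, reference_size) * corpus_size


if __name__ == "__main__":
    for name, size in CORPUS_SIZES.items():
        print(f"{name:28s}{float(expected_hits(size)):10.3f}")
```

Under this assumption a 1-million-word corpus would be expected to contain the expression about 0.12 times, that is, most likely not at all, while a 2-billion-word corpus would be expected to yield a few hundred examples.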

  12. Corpus representativeness
     • A balanced corpus is the ideal objective for lexicographic work
     • A balanced corpus reflects the diversity of the target language and contains texts that cover the full repertoire of ways in which people use the language
     • Spoken data:
        • demographic approach (gender, social class, age, religion, etc.)
        • context-governed approach (conversational, educational, business, political, leisure, etc.)

  13. Corpus representativeness
     • Right- or left-sorted concordances from a corpus of 100 million words clearly show most of the normal patterns of usage for all words except the very rare
     • to break someone’s service (12 hits in the BNC)
     • mucosa and unfortunate have the same number of hits in the BNC
     • a case of skewing: a feature is over- or under-represented (the larger the corpus
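Right- and left-sorting refers to ordering concordance lines by the words that follow or precede the search word, so that recurrent patterns line up on the page. The sketch below is a minimal illustration of the idea, not the way any particular corpus tool implements it. The sample sentence is invented for the example and is not taken from the BNC; the function names are hypothetical.

```python
# Minimal keyword-in-context (KWIC) concordance with left and right sorting.
# The sample text is invented for illustration; it is not corpus data.

SAMPLE = ("she managed to break his service early , then failed to break "
          "the deadlock , and finally had to break her service again")


def kwic(tokens, node, width=3):
    """Return (left context, node word, right context) for each hit of `node`."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node:
            left = tokens[max(0, i - width):i]
            right = tokens[i + 1:i + 1 + width]
            lines.append((left, token, right))
    return lines


def sort_right(lines):
    """Order lines by the words to the right of the node, nearest word first."""
    return sorted(lines, key=lambda line: [w.lower() for w in line[2]])


def sort_left(lines):
    """Order lines by the words to the left of the node, nearest word first."""
    return sorted(lines, key=lambda line: [w.lower() for w in reversed(line[0])])


def show(lines):
    for left, node, right in lines:
        print(f"{' '.join(left):>22s}  [{node}]  {' '.join(right)}")


if __name__ == "__main__":
    hits = kwic(SAMPLE.split(), "break")
    print("Right-sorted:")
    show(sort_right(hits))
    print("Left-sorted:")
    show(sort_left(hits))
```

In the right-sorted view the two ‘break … service’ lines end up next to each other, which is the effect a lexicographer relies on to spot a recurrent pattern among thousands of lines.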

  14. Parallel corpora
     • Translation corpus, e.g. the EU documents
     • Parallel corpus, e.g. ICE (International Corpus of English): 15 corpora of varieties of English (New Zealand, Indian, Jamaican, etc.)

  15. Collection of corpus data
     • style (e.g. journalistic)
     • medium: written, spoken
     • a corpus consisting of a single type of text will reflect only the stylistic and subject-matter features of that particular genre
     • web: ukWaC corpus, 2 billion words (www.sketchengine.co.uk)
