A Sprinkling of Key Words Mike Scott Aston University June 30, 2010
Issues: Key words (KWs) • Keyness • Aboutness • Distribution patterns of KWs
Fractal • A fractal is "a rough or fragmented geometric shape that can be split into parts, each of which is (at least approximately) a reduced-size copy of the whole,"[1] a property called self-similarity • (Wikipedia) • [1] Mandelbrot, B.B. (1982). The Fractal Geometry of Nature. W.H. Freeman and Company.
Keyness • aboutness • importance • a textual category
aboutness • what the text is about • what the message is • what it all means • picture from mindreadersdictionary.com
importance • centrality
Context • Claimp by Maya Goldblum • New Designers 07
Impoverished context • Dandelion Light by Sunghwa Jang • New Designers 07
Identification of KWs: criteria • simple verbatim repetition • no allowance for anaphora, synonymy, antonymy etc. • threshold • one word, or more than one?
Corpus-bound or corpus-driven? • Machine-identified keyness is ideal for corpus-driven research • The researcher lets the PC suggest areas needing further chasing up • See recent work by McEnery, Baker, etc. and Nelia Scott 1998
Research Questions • How are the KWs of Bleak House distributed? • Are the KWs of different kinds (nouns/verbs … character/place/style words) distributed differently? • Do the KWs of the chapters reflect the pattern of the whole text but on a smaller scale?
Bleak House • published 1852-3 • (20 monthly instalments) • 350,000 words • Preface + 66 Chapters
reference corpus • 9 million words • 52 novels • 29 by other 19th Century authors • 23 by Dickens
Procedures • download Bleak House (Project Gutenberg) • separate each chapter into its own file • create a wordlist of the reference corpus • create a wordlist of the whole of Bleak House • create a batch of wordlists, one for each chapter of Bleak House
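The splitting and wordlist steps above can be sketched in Python. This is only an illustration, not the tooling used for the talk. It assumes that chapter headings in the downloaded plain text sit on their own line in the form "CHAPTER I", "CHAPTER II" and so on, and that a word is a lower-cased run of letters with an optional internal apostrophe. The names `split_chapters` and `wordlist` are hypothetical.

```python
import re
from collections import Counter

# Assumption: headings look like "CHAPTER I", "CHAPTER II", ... on a line of their own.
CHAPTER_HEADING = re.compile(r"^CHAPTER [IVXLC]+\b.*$", re.MULTILINE)
# Assumption: a token is a run of letters, optionally with one internal apostrophe.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)?")

def split_chapters(text):
    """Return the body of each chapter in order.

    Anything before the first heading (for example the preface) is left out,
    and the heading line itself is not part of the returned body.
    """
    headings = list(CHAPTER_HEADING.finditer(text))
    chapters = []
    for i, match in enumerate(headings):
        end = headings[i + 1].start() if i + 1 < len(headings) else len(text)
        chapters.append(text[match.end():end])
    return chapters

def wordlist(text):
    """Count verbatim word forms (no lemmatisation, no synonym handling)."""
    return Counter(TOKEN.findall(text.lower()))

def batch_wordlists(text):
    """One wordlist per chapter, in chapter order."""
    return [wordlist(chapter) for chapter in split_chapters(text)]
```

Calling `wordlist` on the full text gives the whole-novel list, and `batch_wordlists` gives the per-chapter batch; the reference corpus would be passed through `wordlist` in the same way.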
KW Procedures • Compute KW list of the whole novel • Compute batch of KW lists, one for each chapter
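As a rough sketch of how a KW list can be computed from two wordlists, the Python below scores each word with the log-likelihood statistic, a common choice for keyness. The slides do not say which statistic or settings were used, so the statistic, the `min_freq` value and the `threshold` of 6.63 (the chi-square critical value for p < 0.01 with one degree of freedom) are all assumptions. The function names are hypothetical.

```python
import math
from collections import Counter

def log_likelihood(a, b, n1, n2):
    """Log-likelihood for a word seen a times in a study text of n1 tokens
    and b times in a reference corpus of n2 tokens."""
    expected_study = n1 * (a + b) / (n1 + n2)
    expected_ref = n2 * (a + b) / (n1 + n2)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total

def keywords(study, reference, min_freq=2, threshold=6.63):
    """Return (positive, negative) KW lists as (word, score) pairs.

    study and reference are Counter wordlists. min_freq and threshold
    are illustrative settings, not those used in the talk.
    """
    n1, n2 = sum(study.values()), sum(reference.values())
    positive, negative = [], []
    for word in set(study) | set(reference):
        a, b = study[word], reference[word]
        score = log_likelihood(a, b, n1, n2)
        if score < threshold:
            continue
        if a / n1 > b / n2:
            # Unusually frequent in the study text.
            if a >= min_freq:
                positive.append((word, score))
        else:
            # Unusually infrequent in the study text.
            negative.append((word, score))
    positive.sort(key=lambda pair: pair[1], reverse=True)
    negative.sort(key=lambda pair: pair[1], reverse=True)
    return positive, negative
```

Running `keywords` once on the whole-novel wordlist and once per chapter wordlist, always against the same reference wordlist, mirrors the two steps on this slide.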
Overall Results • Over 300 positive KWs for the whole novel • About 70 negative KWs including God (half as frequent as in 19th C literature overall)
Excel • spreadsheet constructed at the same time as the batch of KW files • fewer characters in first chapters • pronouns are sprinkled • http://www.lexically.net/downloads/corpus_linguistics/Bleak_House.xls
Chapter by Chapter • Average of 23 KWs per chapter – same settings, same reference corpus (19th C Lit.) • Per chapter: minimum 5, maximum 38.
middling burstiness • verbs: appears, begins, puts, observes, replies, continues, says, considers, etc.
Preliminary findings • All chapters have KWs • Individual chapters differ considerably in their KWs • because KWs are not all global • Character KWs enter the novel gradually • Pronouns and verbs present in many sections but absent in many too • not much to do with aboutness • middling level of burstiness • KWs of different kinds are distributed differently
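One simple way to look at how KWs are spread over chapters is to record, for each KW, the first chapter in which it is key and the share of chapters in which it is key. The Python sketch below does that. It is a crude proxy for distribution, not the burstiness measure behind these findings, and the function name `chapter_spread` and the toy data in the usage note are invented for illustration.

```python
def chapter_spread(chapter_kws):
    """Summarise where each KW is key.

    chapter_kws is a list with one set of KWs per chapter, in chapter order.
    Returns {word: {"first": first chapter number (1-based),
                    "chapters": number of chapters where the word is key,
                    "share": that number divided by the chapter count}}.
    """
    total = len(chapter_kws)
    spread = {}
    for number, kws in enumerate(chapter_kws, start=1):
        for word in kws:
            entry = spread.setdefault(word, {"first": number, "chapters": 0})
            entry["chapters"] += 1
    for entry in spread.values():
        entry["share"] = entry["chapters"] / total
    return spread
```

With invented input such as `[{"fog", "says"}, {"says", "dedlock"}, {"says", "dedlock"}, {"jo"}]`, a word like "says" is key in three of the four chapters, while "jo" first becomes key in chapter 4. A late "first" value is what a character KW entering the novel gradually would look like.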
Preliminary conclusion • KWs of the chapters do not simply reflect the pattern of the whole text on a smaller scale • Keyness is not fractal
References • Baker, P., Gabrielatos, C., Khosravinik, M., Krzyzanowski, M., McEnery, T. & Wodak, R., 2008. A useful methodological synergy? Combining critical discourse analysis and corpus linguistics to examine discourses of refugees and asylum seekers in the UK press. Discourse & Society 19(3), 273-305. • McEnery, T., 2009. "Keywords and Moral Panics: Mary Whitehouse and Media Censorship". In Dawn Archer (ed.) What's in a Word-list? Investigating word frequency and keyword extraction. Farnham: Ashgate, 93-124. • Scott, M. Nelia, 1998. Normalisation and Readers' Expectations: A Study of Literary Translation with Reference to Lispector's A Hora da Estrela. Liverpool: Unpublished PhD thesis, University of Liverpool.