Ways of searching for the Zeitgeist of Modernity - a corpus-based approach to modern fiction
Ilina Doykova
Shumen University, Shumen (Bulgaria)
ilina.doykova@abv.bg
Statistical analysis
• Simple things may characterise different styles
  • average sentence length
  • average word length
  • vocabulary richness
  • vocabulary growth (homogeneity of text)
• More complex analyses give a more interesting picture
  • specific syntactic structures
  • degree of modification in NPs
  • types of verbs (e.g. verbs of persuasion, speech verbs, action verbs, descriptive verbs)
  • distribution of pronouns (1st/2nd/3rd person)
  • themes, beliefs, etc.
  • authorship
• Especially when used comparatively
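The simple measures listed above can be sketched in a few lines of Python. This is only an illustration, not the procedure used in WordSmith or Wmatrix: the function name `style_profile`, the naive sentence splitter and the use of the type-token ratio as a stand-in for vocabulary richness are all assumptions made for the example.

```python
import re

def style_profile(text):
    """Return simple stylometric measures for a non-empty text (illustrative helper)."""
    # Assumption: a sentence ends at ., ! or ? followed by whitespace.
    # Abbreviations such as "Mr. Smith" will be split incorrectly.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    # Assumption: a word is a run of letters, optionally with one apostrophe part.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", text.lower())
    types = set(words)
    return {
        "avg_sentence_length": len(words) / len(sentences),  # words per sentence
        "avg_word_length": sum(len(w) for w in words) / len(words),  # characters per word
        "type_token_ratio": len(types) / len(words),  # crude vocabulary richness
    }

sample = "She was alone. She had always been alone, and she knew it."
profile = style_profile(sample)
print(profile)
```

The type-token ratio depends strongly on text length, so it is only meaningful when the texts compared are of similar size, which fits the point that these measures work best comparatively.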
Linguistic Tools: WordSmith and Wmatrix
Useful features:
+ Tagging = identifies and labels parts of speech (PoS)
+ WordList = generates word-frequency lists
+ Concordance = lists occurrences of a word in context (KWIC, key word in context) with its immediate environment, and gives access to collocates
  • Identify syntactic use of word
  • Identify range of meanings
  • Identify relative frequency of different uses/meanings
+ KeyWords = identification of key words through a comparison with a reference corpus
+ Word Clouds = semantic tagsets in 21 domains
• Listings can be customised to show what you want more clearly:
  • sort according to next or previous word
  • show more or less context
  • highlight important information
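What a concordance listing does, and how sorting by the next word works, can be shown with a minimal sketch. It is not WordSmith's implementation: the names `kwic` and `sort_by_next_word`, the lower-casing and the simple tokenisation are assumptions chosen for the example.

```python
import re

def kwic(text, node, width=3):
    """Return (left context, node, right context) for every hit of `node`."""
    # Assumption: tokens are runs of word characters, compared in lower case.
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, tok, right))
    return lines

def sort_by_next_word(lines):
    """Sort concordance lines alphabetically by their right-hand context."""
    return sorted(lines, key=lambda line: line[2])

text = "She sat alone in the room. He was alone again. Alone at last, she slept."
lines = kwic(text, "alone", width=2)
for left, node, right in sort_by_next_word(lines):
    # Align the node word in a central column, as concordancers do.
    print(f"{left:>15}  {node.upper()}  {right}")
```

Changing `width` shows more or less context, and sorting on `line[0]` instead of `line[2]` would order the lines by the left-hand context.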
WordSmith frequency list of predicative adjectives, Modern British Women Fiction Writers Corpus
Key words list and dispersion plot (ALONE in MBWFW corpus)
Consistency analysis indicates whether a word is found consistently across many different texts, only in a narrow set of texts, or in one specific text.
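The idea behind consistency analysis and a dispersion plot can be approximated as follows. This is a rough sketch under stated assumptions: the corpus is a plain dictionary of text name to text, the three short texts are invented for the example, and the functions `consistency` and `dispersion` are hypothetical helpers, not WordSmith functions.

```python
import re

def tokenize(text):
    # Assumption: tokens are lower-cased runs of word characters.
    return re.findall(r"\w+", text.lower())

def consistency(corpus, word):
    """Return how many texts contain `word`, and which ones."""
    hits = [name for name, text in corpus.items() if word in tokenize(text)]
    return len(hits), hits

def dispersion(text, word):
    """Return the relative positions (0 = start, 1 = end) of `word` in a text."""
    tokens = tokenize(text)
    return [round(i / len(tokens), 2) for i, tok in enumerate(tokens) if tok == word]

# Invented miniature corpus, for illustration only.
corpus = {
    "text_a": "She was alone and alone again",
    "text_b": "They left together",
    "text_c": "Alone at last",
}
count, names = consistency(corpus, "alone")
positions = dispersion(corpus["text_a"], "alone")
print(count, names, positions)
```

A word found in most texts is consistent across the corpus, while a word with many hits in one text only is characteristic of that text. The relative positions are the values a dispersion plot draws along each text's bar.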
Lemmatized results for relational pairs
WordSmith and Wmatrix
Investigation of semantic domains through semantic tagging (Wmatrix)
Key Domain clouds (for Wmatrix only)
• The larger the word, the greater its “keyness”, or uniqueness, as compared to the BNC Written Sampler of imaginative texts.
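Keyness is commonly measured with the log-likelihood statistic, which compares an item's frequency in the study corpus with its frequency in a reference corpus. The sketch below assumes that statistic. The function name `log_likelihood` and all the counts are made up for illustration and are not figures from the MBWFW corpus or the BNC sampler.

```python
import math

def log_likelihood(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of an item in a study corpus against a reference corpus."""
    total = freq_study + freq_ref
    # Expected frequencies if the item were equally common in both corpora.
    expected_study = size_study * total / (size_study + size_ref)
    expected_ref = size_ref * total / (size_study + size_ref)
    ll = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll

# Hypothetical counts: 30 hits in a 1,000-word study corpus,
# 10 hits in a 1,000-word reference corpus.
score = log_likelihood(30, 1000, 10, 1000)
same_rate = log_likelihood(10, 1000, 10, 1000)
print(round(score, 2), same_rate)
```

A score above 3.84 corresponds to p < 0.05 with one degree of freedom. The statistic alone does not show the direction of the difference, so the relative frequencies must be compared to see whether the item is overused or underused in the study corpus.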
Research and language learning
• Word frequency: knowledge in present-day language textbooks (grammatical, collocational, semantic) is frequency-based
• Real usage: corpora represent actual, not prescribed, usage
• Translation: find the best equivalent
• Grammar: investigate word classes and specific syntactic structures
• Teaching collocations: ‘trouble and strife’, ‘the elephant in the room’, ‘blue murder’
• Decoding specific content (sexist, racist, ideological, etc.)
• Authorship: identification of true authorship
• Analysis of texts written in any language and any alphabet
Electronic text resources
• http://www.stylist.co.uk/books
• http://www.newyorker.com
• http://narrativemagazine.com
• http://www.one-story.com
• http://www.teachingenglish.org.uk/teaching-resources
• http://www.guardian.co.uk/books
• http://gutenberg.net.au/