Stylistics – case study • See last slide for websites used to get numerical information from texts
Stylistic analysis • Literary vs linguistic stylistics • Lit crit focuses on effect on the reader, intended or otherwise, so largely intuitive and subjective • Linguistic stylistics looks for characterisations of style (including literary style) in terms of linguistic phenomena at the various levels of linguistic description
Stylistic analysis • Inventory of linguistic devices and their effect • usually in a contrastive way: • in contrast with other texts of a similar genre • in contrast with other genres • Linguistic devices described in terms of the usual linguistic levels of description: phonology, morphology, lexis, grammar, etc.
Example • Newspaper reporting of a similar story • Sun vs Independent • readership by social class • Sun: widely read (c. 5m), mostly by lower class and lower middle class • Independent: circulation 0.25m, educated middle class • How would you expect this different readership to be reflected in the styles?
Sun vs Independent • Targeted readership largely dictates subject matter and the angle of coverage • From a purely linguistic point of view we might expect differences in … • vocabulary • complexity of sentence structure • Other differences might include literary • But (compared to other texts) features of the genre (newspaper story) may be shared
http://www.independent.co.uk/news/world/asia/hawker-family-make-new-plea-799964.html
http://www.thesun.co.uk/sol/homepage/news/justice/article952630.ece
Some differences • Differences of detail • [Some are due to slightly different publication time, before or after the press conference] • What elements are of interest? • Differences of vocabulary • cops vs officers, dad vs father, year after vs anniversary • Differences of explication • capital of Japan, Facebook • Differences of syntax • surprisingly few • but a possible stylistic trademark of the redtop is the internal structure of noun phrases …
Appositive noun phrases (Sun vs Independent) • a sand-filled bath vs a bath filled with sand • Parents Bill and Julia vs Miss Hawker's parents, Bill and Julia • capital Tokyo vs Tokyo; the Japanese capital • suspect Tatsuya Ichihashi, 29 vs 29-year-old suspect Tatsuya Ichihashi • website Facebook vs [the] social networking site Facebook
Numerical comparison • Thanks to computers it is now (relatively) easy to count things • What should we count? • easy to count number of paragraphs, sentences, words, letters • may give a measure of complexity • average sentence length (words/sentence) • average word length • percentage of long words • type:token ratio (vocabulary richness) • number of types = number of different words • number of tokens = total number of words • Hapax legomena = words that occur only once in the text
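The counts listed above can be sketched in a few lines of Python. This is only an illustration, not the method used by the websites on the last slide: the function names `tokenize` and `basic_stats` are made up here, sentences are split naively on `.`, `!` and `?`, and hyphenated words and contractions are each counted as one word.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: a word is a run of letters, optionally joined by
    # internal apostrophes or hyphens (so "don't" and "sand-filled" are one token).
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def basic_stats(text):
    # Naive sentence split: any run of . ! ? ends a sentence.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    tokens = tokenize(text)
    counts = Counter(tokens)
    return {
        "sentences": len(sentences),
        "tokens": len(tokens),                      # total number of words
        "types": len(counts),                       # number of different words
        "hapax": sum(1 for c in counts.values() if c == 1),  # words seen once
        "avg_sentence_length": len(tokens) / len(sentences),
        "avg_word_length": sum(len(t) for t in tokens) / len(tokens),
        "type_token_ratio": len(counts) / len(tokens),
    }

stats = basic_stats("The cat sat. The dog sat on the mat.")
print(stats)
```

Note that the type:token ratio falls as a text gets longer, so it is only comparable between texts of roughly the same length.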
Normalization and significance • Always important to compare like with like • It is usual when counting things to “normalize” over the length of the text • If one text is longer than the other, of course you would expect higher frequencies of everything • Issue of statistical significance • Small differences may not really tell you anything • Various measures can confirm whether difference is statistically significant or due to random fluctuation
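As a rough sketch of both ideas, the snippet below normalizes a raw count to a rate per 1,000 words and computes a chi-square statistic for a 2x2 table (occurrences of a word vs all other words, in two texts). The slides do not name a particular test, so chi-square is an assumption here, and the function names are hypothetical. With one degree of freedom, a statistic above 3.84 corresponds to p < 0.05.

```python
def per_thousand(count, total_tokens):
    # Normalized frequency: occurrences per 1,000 words of text.
    return 1000 * count / total_tokens

def chi_square_2x2(count_a, total_a, count_b, total_b):
    # Observed table: rows are texts A and B,
    # columns are "this word" and "all other words".
    observed = [
        [count_a, total_a - count_a],
        [count_b, total_b - count_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b,
                  (total_a - count_a) + (total_b - count_b)]
    grand_total = total_a + total_b
    statistic = 0.0
    for i in range(2):
        for j in range(2):
            # Expected count if the word were equally frequent in both texts.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed[i][j] - expected) ** 2 / expected
    return statistic

# Invented figures, purely for illustration (not taken from the two articles).
print(per_thousand(5, 500))
print(chi_square_2x2(30, 100, 10, 100) > 3.84)
```

The test treats every word as an independent observation, which running text does not really satisfy, and it becomes unreliable when expected counts are very small, so the result is best read as a rough guide.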
How to count • How to recognize paragraph breaks? • How to recognize sentence breaks? • Headlines don’t end in a full stop • Not all sentences end in a full stop • Not all full stops are sentence ending (abbreviations) • How to count words? • Hyphenated words, contractions e.g. don’t • How to measure word-length/complexity? • length only roughly corresponds to complexity • number of characters vs number of syllables • cf. through vs idea • counting syllables implies either a dictionary or an algorithm
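Two of the problems above can be illustrated with a minimal sketch: a sentence splitter that skips full stops after known abbreviations, and a syllable counter that simply counts groups of vowels. Both are deliberately crude. The abbreviation list is a small made-up sample, and neither function is claimed to match what any of the online tools do.

```python
import re

# Hypothetical, far from complete list of abbreviations (without final full stop).
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "e.g", "i.e", "etc", "c"}

def split_sentences(text):
    sentences, current = [], []
    for chunk in text.split():
        current.append(chunk)
        if chunk[-1] in ".!?":
            word = chunk.rstrip(".!?").lower()
            # A full stop after a known abbreviation does not end the sentence.
            if chunk.endswith(".") and word in ABBREVIATIONS:
                continue
            sentences.append(" ".join(current))
            current = []
    if current:
        # Trailing text with no final punctuation, e.g. a headline.
        sentences.append(" ".join(current))
    return sentences

def count_syllables(word):
    # Heuristic: one syllable per group of consecutive vowel letters.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

print(split_sentences("Dr. Smith spoke to police. He said nothing!"))
for w in ("through", "idea"):
    print(w, len(w), count_syllables(w))
```

The heuristic gets "through" right (7 letters, 1 syllable) but gives 2 for "idea", which has 3 syllables in only 4 letters. This is why accurate syllable counts need a pronunciation dictionary or a more careful algorithm. The splitter also ignores cases such as a closing quotation mark after the full stop.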
Numerical comparison • texts are roughly the same length • Hard to know if any differences are statistically significant with such a small amount of data, but … • Indy does have more complex words … • and higher AWL and ASL … • and higher ratio of short:long sentences … • and richer vocabulary
Word length • Comparison of distribution of words by length only tells us that the two texts are very similar • correlation ρ = 0.977
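A comparison of this kind can be sketched as follows: build the proportion of words at each length for both texts, then correlate the two distributions. The slide does not say which coefficient lies behind ρ = 0.977, so Pearson's is used here purely as an assumption, and the function names and sample word lists are invented for illustration.

```python
from collections import Counter
from math import sqrt

def length_distribution(tokens, max_len=15):
    # Proportion of words of each length 1..max_len
    # (words longer than max_len are grouped into the last bin).
    counts = Counter(min(len(t), max_len) for t in tokens)
    total = len(tokens)
    return [counts.get(n, 0) / total for n in range(1, max_len + 1)]

def pearson(xs, ys):
    # Pearson correlation coefficient of two equal-length sequences.
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

text_a = "the cops found a body in a bath".split()
text_b = "officers discovered the body in a bath filled with sand".split()
print(pearson(length_distribution(text_a), length_distribution(text_b)))
```

Word-length distributions of almost any two English texts look alike (many short words, few long ones), so a high correlation is to be expected and says little about style.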
Syntactic information • Again, hard to know if differences are significant • This kind of measure more useful to distinguish different genres
Readability • Big interest from teachers, publishers and researchers in quantifying the appropriate reading age for a text • i.e. what level of education do you need to understand this text? (reader-oriented view) • or: for what age of readership is this text appropriate (text-oriented view) • Most measures based on combination of average word length (measured in characters or syllables), and average sentence length • some additionally take into account proportion of long/short words
Readability indexes • Most give a (US) school grade: • Kincaid (Flesch-Kincaid Grade Level) – best for technical material; short sentences, e.g. in dialogues, will lower the score • ARI (Automated Readability Index) • Coleman-Liau – counts characters rather than syllables, so easier to implement • SMOG (Simple Measure of Gobbledygook) (McLaughlin 1969) – can be estimated by sampling e.g. three 10-sentence segments; said to give best correlation with its criterion. See http://www.harrymclaughlin.com/SMOG.htm • FOG (Gunning 1952) – gives a school grade. Score >12 means “too hard to read”! • A few give a raw score: • Flesch Reading Ease (the raw score reported alongside the Flesch-Kincaid grade) – widely used, simple calculation; the higher the score, the easier the text is to read. Highest possible score is about 121 (text made up of one-word one-syllable sentences). Score around 100 means OK for 11-yr old. Time magazine ~52, Harvard Law Review low 30s. • Lix (Björnsson) – originally developed for Swedish, raw score <24 suitable for children, >55 very hard.
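For reference, the indexes listed above can be sketched from their commonly published formulas. The functions take ready-made counts as input, because the way words, sentences and syllables are counted varies between tools. The function and parameter names are chosen here for illustration; "complex" and "polysyllabic" words are those of three or more syllables, and Lix "long" words are those of more than six letters.

```python
from math import sqrt

def flesch_reading_ease(words, sentences, syllables):
    # Raw score: higher means easier to read.
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words, sentences, syllables):
    # US school grade.
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

def automated_readability_index(characters, words, sentences):
    return 4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43

def coleman_liau(letters, words, sentences):
    letters_per_100 = 100 * letters / words
    sentences_per_100 = 100 * sentences / words
    return 0.0588 * letters_per_100 - 0.296 * sentences_per_100 - 15.8

def gunning_fog(words, sentences, complex_words):
    return 0.4 * ((words / sentences) + 100 * (complex_words / words))

def smog(polysyllables, sentences):
    # Polysyllable count is scaled to a 30-sentence sample.
    return 1.043 * sqrt(polysyllables * 30 / sentences) + 3.1291

def lix(words, sentences, long_words):
    # Raw score: average sentence length plus percentage of long words.
    return words / sentences + 100 * long_words / words

# A text made up of one-word, one-syllable sentences gives the maximum
# Flesch Reading Ease score.
print(round(flesch_reading_ease(words=1, sentences=1, syllables=1), 2))
```

Because every formula depends only on a handful of counts, the differences between online tools usually come from how they count, not from the formulas themselves.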
Readability • Conversion: add 1 to the US grade to give the British school year, e.g. 11th grade = Year 12 • Note: with the Flesch raw score (Reading Ease), a lower score means harder to read • Website used: http://www.editcentral.com/gwt/com.editcentral.EC/EC.html (also suggests where improvements can be made!) • Also used (give slightly different figures, probably depending on how they count things): http://www.readability.info/ and http://www.online-utility.org/english/readability_test_and_improve.jsp