The Armchair and the Machine

Corpus-Assisted Discourse Studies. Alan Partington, Lorient, 14/09/07.

Presentation Transcript


  1. The Armchair and the Machine Corpus-Assisted Discourse Studies Alan Partington Lorient 14/09/07

  2. Corpus-Assisted Discourse Studies (CADS) • What does CADS do? • Examples (politics & media) & • Types of research questions / methodologies • Teaching material?

  3. “two types of linguist” the Armchair linguist … “sits in a deep soft comfortable armchair, with his eyes closed and his hands clasped behind his head. Once in a while he opens his eyes, sits up abruptly shouting, “Wow, what a neat fact!”, grabs his pencil, and writes something down. Then he paces around for a few hours in the excitement of having come still closer to knowing what language is really like.” Introspection

  4. “two types of linguist” the Corpus linguist … “has all the primary facts that he needs, in the form of approximately one zillion running words, and he sees his job as that of deriving secondary facts from his primary facts. At the moment he is busy determining the relative frequencies of the eleven parts of speech as the first word of a sentence” Data observation

  5. “two types of linguist” however “These two don’t speak to each other very often, but when they do the corpus linguist says to the armchair linguist, ‘Why should I think that what you tell me is true?’, and the armchair linguist says to the corpus linguist, ‘Why should I think that what you tell me is interesting?’” (Fillmore)

  6. Four stages of science • respect for authority (generally Scripture and Aristotle) • rationalist introspection (Descartes: cogito ergo sum - I introspect therefore I am) • “observationism” and distrust of theory (Bacon: ‘The intellect, left to itself, ought always to be suspected’) • the mutually reinforcing hermeneutic interaction of theory and observation

  8. Psycho- & Socio- …corpus linguists have so far contributed little to answering classic questions of cognitive and social theory; they have hardly considered the relevance of corpus evidence to questions about the mental lexicon and the construction of the social world (though one of Halliday’s central topics) (Stubbs 2006: 15)

  9. Data observation

  10. Intuition & contemplation

  11. Speculation Stubbs 2006: …could be related …may be reducible… may also be internally related … seems to show … might also provide … show how we could do real ‘ordinary language philosophy’ …

  12. Interdependence of technology & theory, of machine and mind: New instruments lead to New ways of observing lead to New ways of thinking

  13. New instruments = grinding of lenses (Galileo, Spinoza) lead to New ways of observing = astronomy lead to New ways of thinking = model of universe

  14. New instruments = radio transmitter, receiver lead to New ways of observing = radio-telescopy lead to New ways of thinking = theory of creation

  15. New instruments = corpora lead to New ways of observing = inductive data-driven lead to New ways of thinking = lexical grammar

  16. What does CADS do? Investigate (and compare) discourse types (DTs): ‘Non-obvious’ meanings, to “not get caught in using corpora just to tell you more about what you know already” (Sinclair 2004: 183)

  17. It combines • Corpus Linguistics: data crunching; statistical OVERVIEW (very quickly); “quantitative” approach (“general” language dictionaries, grammars) • Discourse analysis: DETAILED analysis, even of single texts; “qualitative” approach

  18. “Traditional” Corpus Linguistics vs CADS

  19. Traditional Corpus Linguistics: • Very large ‘general’ – heterogeneric - corpora: BNC, BoE CADS: • Compile your own ‘specialized’ corpus/corpora • Comparison: Particular features of a discourse type, DT(a)? Compare DT(a) – DT(b) – DT(n) Compare DT(a) – BNC / BoE

  20. Traditional CL: Corpus: “Black box” – Keep out!

  21. CADS: Make friends with our corpus. Detailed knowledge of DT: • Frequency information > Concordancing • Reading / watching / listening to corpus-held DT tokens • Intuitions • “External” data (esp. in politics and media): interviews with protagonists; official documents

  22. Beginnings: Hardt-Mautner (1995); Stubbs (1996; 2001); Teubert; Mahlberg. ITALY: Newspool: Partington, Morley & Haarman (eds) 2004; CorDis: Morley & Bayley (eds) forthcoming; Intune

  23. FRANCE “I’ve been doing CADS for years and never knew it” (Geoffrey Williams, Siena 2006)

  24. What’s been done?

  25. What’s been done? Berlusconi’s election speeches (Garzone & Santulli 2004) Word lists (WordSmith): Italia; stato; libertà Concordanced
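The word-list step mentioned on this slide can be illustrated with a minimal sketch. It is not WordSmith's implementation, only a plain frequency count; the tokeniser, the function name `word_list` and the sample sentence are assumptions made for illustration, and the sample is invented rather than taken from the Berlusconi corpus.

```python
import re
from collections import Counter

def word_list(text):
    """Return a frequency word list: lower-cased word tokens and their counts."""
    # Assumed tokeniser: runs of letters only (digits and underscores excluded).
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    return Counter(tokens)

# Invented sample text, NOT taken from the speeches analysed in the study.
sample = "Lo stato amico. Lo stato moderno. La libertà e l'Italia."
freq = word_list(sample)

# Print the most frequent items, as a word-list tool would rank them.
for word, count in freq.most_common(3):
    print(word, count)
```

On a real corpus the top of such a list is dominated by grammatical words, which is why the frequent content nouns (Italia, stato, libertà) are the ones picked out for concordancing.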

  26. What’s been done? Lo stato when it is run by the Left: autoritario, burocratico, invasivo, moloch, padrone, stato-partito (authoritarian, bureaucratic, invasive, moloch, bossy, a party-state)

  27. What’s been done? Lo stato when treated to the Forza Italia cure becomes: amico, civile, di diritto, liberale, moderno (friend, civilised, lawful, liberal, modern)

  28. What’s been done? Libertà is the third most frequent noun; but it is rarely attached to an individual in the co-text. Whose liberty?

  29. Research question type 1 How does P achieve G with language? What does this tell us about P? Comparative: how do P1 and P2 differ?

  30. September 11th

  31. September 11th • C2001: Sept 11-18 2001, 150,000 words; Times - Independent - Telegraph - Guardian • C2002: Sept 11-18 2002, 150,000 words; Times - Independent - Telegraph - Guardian • WordSmith Keywords

  32. September 11th world (468 - 136): • an attack on the whole civilised world • convinced the world is its enemy • the world will never be the same. Global dimension: an attack on the international community, not just the USA

  33. September 11th war (351 - 60) • a totally new kind of war, acts of war, the first war of the 21st century, (or simply) this war Reaction must be: declare war on terrorism, launch an international war
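How a keyness score can be derived from paired counts like these is sketched below. It assumes that the figures in brackets are the word's frequencies in C2001 and C2002 respectively, that both corpora are exactly 150,000 words (the slides give this as a round figure), and that log-likelihood is the statistic; the slides do not say which statistic was actually used, so this is an illustration rather than a reconstruction of the original analysis.

```python
import math

def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood (G2) keyness for one word across two corpora."""
    total = freq_a + freq_b
    # Expected frequencies if the word were spread evenly by corpus size.
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2 * g2

SIZE = 150_000  # assumption: both corpora treated as exactly this size
counts = {"world": (468, 136), "war": (351, 60)}  # (C2001, C2002), from the slides

scores = {w: log_likelihood(a, SIZE, b, SIZE) for w, (a, b) in counts.items()}
for word, score in scores.items():
    print(f"{word}: G2 = {score:.1f}")
```

With one degree of freedom, a G2 above 3.84 corresponds to p < 0.05, so both items would count as strongly key under these assumptions.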

  34. September 11th

  35. September 11th enemy (106 - 20) • ghostlike global enemy, shadowy enemy, not a clearly defined enemy, absence of a tangible enemy. Collocates: semantic preference for the unknown
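Collocates of a node word such as enemy can be collected by counting the words that fall within a fixed span around each occurrence. The sketch below is a simplified illustration: the function name `collocates`, the span of two words and the crude tokeniser are assumptions, and the input is just the four phrases quoted on the slide rather than the full corpus.

```python
import re
from collections import Counter

def collocates(text, node, span=4):
    """Count words occurring within `span` tokens either side of `node`."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for i, token in enumerate(tokens):
        if token == node:
            left = tokens[max(0, i - span):i]
            right = tokens[i + 1:i + 1 + span]
            counts.update(left + right)
    return counts

# The four phrases quoted on the slide, processed one at a time so that
# windows do not run across phrase boundaries.
phrases = [
    "ghostlike global enemy",
    "shadowy enemy",
    "not a clearly defined enemy",
    "absence of a tangible enemy",
]
counts = Counter()
for phrase in phrases:
    counts.update(collocates(phrase, "enemy", span=2))

print(sorted(counts))
```

Grouping the resulting collocates into a semantic set (here, the unknown) remains an interpretive step carried out by the analyst, not by the count.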

  36. September 11th in- and un- words: inconceivability: • what was once thought inconceivable • an unimaginable tragedy • the unthinkable has happened inexpressibility: • unspeakable horror of today’s inhuman terrorist attacks, unspeakable sadness • untold hundreds ... of dead and injured

  37. September 11th • incalculable, unfathomable • incredible, incredulity • unbearable, intolerable • “…surpassing the collective ability to understand and feel” (Blair)
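Candidate in- and un- words like those on the last two slides could be pulled out of a corpus with a simple pattern search, as in the rough sketch below. The pattern and the function name `negative_prefix_candidates` are assumptions for illustration; the search deliberately over-generates, so the output is a list of candidates to be filtered by eye, not a finished set.

```python
import re

# Assumed pattern: "in" or "un" followed by at least four more letters.
NEG_PREFIX = re.compile(r"\b(?:in|un)[a-z]{4,}\b", re.IGNORECASE)

def negative_prefix_candidates(text):
    """Return the sorted set of word forms that start with in- or un-."""
    return sorted({m.group(0).lower() for m in NEG_PREFIX.finditer(text)})

# Phrases quoted on the slides, joined into one sample string.
sample = (
    "what was once thought inconceivable; an unimaginable tragedy; "
    "the unthinkable has happened; untold hundreds of dead and injured"
)
candidates = negative_prefix_candidates(sample)
print(candidates)
```

The match on "injured" shows why manual filtering is needed: the pattern sees only spelling, not whether in- is actually a negative prefix.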

  38. TYPICAL CADS METHODOLOGY • Step 1: Design, unearth, stumble upon research question • Step 2: Choose, edit or compile an appropriate corpus • Step 3: Choose, edit or compile an appropriate reference corpus / corpora

  39. TYPICAL CADS METHODOLOGY • Step 4: Run a Keywords comparison of the corpora • Step 5: Determine the existence of sets of key items (by eye and brain) • Step 6: Concordance interesting key items (varying quantities of co-text: sentence, ‘chunk’)
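Step 6 can be sketched as a simple key-word-in-context (KWIC) concordancer in which the amount of co-text is a parameter. This is a minimal illustration, not how WordSmith or any other concordancer works internally; the function name `kwic` and the character-based `width` are assumptions, and the sample text reuses the war phrases quoted on an earlier slide.

```python
import re

def kwic(text, node, width=30):
    """Return concordance lines for `node` with `width` characters of co-text."""
    lines = []
    pattern = rf"\b{re.escape(node)}\b"
    for m in re.finditer(pattern, text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left co-text so the node words line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right}")
    return lines

sample = "a totally new kind of war, acts of war, the first war of the 21st century"
lines = kwic(sample, "war", width=20)
for line in lines:
    print(line)
```

Raising `width` widens the view from a short ‘chunk’ towards something closer to a full sentence; a fuller version might cut the co-text at sentence boundaries instead of at a fixed character count.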
