Informing Narrative Analysis with Corpus Analytical Methods
Xiaofei Lu
APLNG Round Table, October 21, 2011
Introduction
Goal: Informing (qualitative) narrative analysis by using corpus analytical methods to reveal quantitatively salient linguistic features
Analytical tools and procedure
Discussion of (some) results
Analytical tools and procedure
Stanford part-of-speech tagger: sentence segmentation, word segmentation, POS tagging
I_PRP normally_RB teach_VBP courses_NNS on_IN how_WRB to_TO rebuild_VB states_NNS after_IN war_NN ._.
MORPHA: lemmatization
i normally teach course on how to rebuild state after war .
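The tagger output shown on this slide uses a word_TAG format, which is easy to take apart for later counting. The following is a minimal sketch, assuming one tagged sentence per string; the function name `parse_tagged` is hypothetical and is not part of the Stanford tagger.

```python
def parse_tagged(line):
    """Split 'word_TAG' tokens into (word, tag) pairs."""
    pairs = []
    for token in line.split():
        # rpartition splits on the LAST underscore, so a word that
        # itself contains '_' keeps its full form.
        word, _, tag = token.rpartition("_")
        pairs.append((word, tag))
    return pairs


# The tagged sentence from the slide.
tagged = ("I_PRP normally_RB teach_VBP courses_NNS on_IN how_WRB "
          "to_TO rebuild_VB states_NNS after_IN war_NN ._.")
pairs = parse_tagged(tagged)
words = [word for word, _ in pairs]
tags = [tag for _, tag in pairs]
print(pairs[:3])
```

Keeping words and tags in parallel lists makes it straightforward to count POS categories or to filter tokens by tag.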
Stanford parser: syntactic parsing
(ROOT (S (NP (PRP I)) (ADVP (RB normally)) (VP (VBP teach) (NP (NP (NNS courses)) (PP (IN on) (SBAR (WHADVP (WRB how)) (S (VP (TO to) (VP (VB rebuild) (NP (NNS states)) (PP (IN after) (NP (NN war)))))))))) (. .)))
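A bracketed parse like the one on this slide can be read into a nested structure with a few lines of code. This is a rough sketch, not the parser's own API: `read_tree` and `leaves` are hypothetical helper names, and the tree is stored as plain (label, children) tuples.

```python
import re


def read_tree(text):
    """Parse a bracketed string into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        pos += 1                      # consume "("
        label = tokens[pos]
        pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(node())
            else:
                children.append(tokens[pos])  # a leaf word
                pos += 1
        pos += 1                      # consume ")"
        return (label, children)

    return node()


def leaves(tree):
    """Return the words of the tree in left-to-right order."""
    _, children = tree
    out = []
    for child in children:
        out.extend(leaves(child) if isinstance(child, tuple) else [child])
    return out


# The parse from the slide.
parse = ("(ROOT (S (NP (PRP I)) (ADVP (RB normally)) (VP (VBP teach) "
         "(NP (NP (NNS courses)) (PP (IN on) (SBAR (WHADVP (WRB how)) "
         "(S (VP (TO to) (VP (VB rebuild) (NP (NNS states)) "
         "(PP (IN after) (NP (NN war)))))))))) (. .)))")
tree = read_tree(parse)
print(" ".join(leaves(tree)))
```

Reading the leaves back out is a quick sanity check that the bracketing was consumed correctly.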
Querying annotated text
AntConc: frequency of lemmas, n-grams, POS categories, etc.; concordances of quantitatively salient linguistic features; data view (search and highlight linguistic features in text)
Tregex: querying syntactic parses to retrieve structures matching a specified pattern, e.g., 'ROOT < (S|SBARQ|SQ < (VP < VBD))'
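To make the idea of a concordance concrete, here is a small keyword-in-context sketch. It is only a stand-in for the kind of view AntConc provides, not a description of how AntConc works; the function name `concordance` and the `width` parameter are assumptions made for illustration.

```python
def concordance(tokens, keyword, width=3):
    """Return keyword-in-context lines with `width` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# The lemmatized example sentence from the earlier slide.
tokens = "i normally teach course on how to rebuild state after war .".split()
for line in concordance(tokens, "rebuild"):
    print(line)
```

Running the same function over a whole lemmatized file would list every occurrence of a salient lemma with its immediate context.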
Lemma and bigram frequency
Demo: narrative.lem, AntConc (Word List, N-grams)
901 word tokens, 321 word types
Top content words: Khmer (11), Rouge (11), Vietnamese (10), Cambodium (7), Vietnam (7), mother (5), picture (5), place (5)
Top bigrams: Khmer Rouge (11), my mother/father (6), a picture (4), American bombing (3)
What does this indicate about the recurring themes/contexts of the narrative?
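Token, type, and bigram counts of this kind can be approximated with a short script. The sketch below assumes the lemmatized text has already been split into a list of lemmas with punctuation removed; the sample string is an invented toy input, not taken from narrative.lem, and `frequency_profile` is a hypothetical name.

```python
from collections import Counter


def frequency_profile(lemmas):
    """Return token count, type count, lemma counts, and bigram counts."""
    unigrams = Counter(lemmas)
    # Pair each lemma with its successor to form adjacent bigrams.
    bigrams = Counter(zip(lemmas, lemmas[1:]))
    return len(lemmas), len(unigrams), unigrams, bigrams


# Invented toy input for illustration only.
lemmas = "a picture of my mother and a picture of my father".split()
n_tokens, n_types, unigrams, bigrams = frequency_profile(lemmas)
print(n_tokens, n_types)
print(unigrams.most_common(3))
print(bigrams.most_common(3))
```

Sorting with `most_common` surfaces the frequent items; scanning for expected items with a count of zero covers the rare end of the distribution.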
Lemma and bigram frequency
Pronouns: I (13), she (13), my (11), they (9), we (8), her (6), you (3)
Participants of the narrative event?
Conjunctions: and (38), so (8), but (7), because (5)
Relations of clauses to narrative sequence? Coherence?
Modality (modal operators, comment/mood adjuncts): actually (8), in fact (3), probably/likely/may/might (0)
What does this suggest?
Pronouns
Patterns of pronoun use
Demo: narrative.lem, AntConc (File View)
Primary participants of the narrative event: they|she, clear boundaries
I: narrator/evaluator, not a primary participant in the narrative event until the end
We: not a primary participant, affected by how ‘she’ acted
You: this isn’t about ‘you’
Conjunctions
Heavy use of conjunctions as textual theme
39 out of 65 ‘sentences’ (unit loosely used here; for illustration only) begin with And/But/Because/So
Characteristic of personal narratives? Only 1 out of 6 in the video section.
Demo: narrative.pos, AntConc (File View, And|But|So|Because)
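A count of sentence-initial conjunctions like the one on this slide could be produced along the following lines. This is a sketch under the assumption that the text is already segmented into sentence strings; the sample sentences are invented for illustration and the helper names are hypothetical.

```python
CONJUNCTIONS = {"and", "but", "because", "so"}


def starts_with_conjunction(sentence):
    """True if the first word of the sentence is And/But/Because/So."""
    words = sentence.split()
    return bool(words) and words[0].strip(",").lower() in CONJUNCTIONS


def initial_conjunction_count(sentences):
    """Return (sentences starting with a conjunction, total sentences)."""
    hits = sum(1 for s in sentences if starts_with_conjunction(s))
    return hits, len(sentences)


# Invented sample sentences, not from the narrative.
sentences = [
    "And then we left .",
    "She stayed behind .",
    "So I asked her why .",
    "But nobody answered .",
]
hits, total = initial_conjunction_count(sentences)
print(f"{hits} out of {total} sentences begin with a conjunction")
```

Matching only the first word keeps clause-internal uses of "and" or "so" out of the count, which is what the textual-theme reading requires.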
Verb tense patterns
Amount of evaluation and explanation embedded in the narrative, and the pattern of embedding
Past tense: clauses related to the narrative sequence
Present tense: comments, evaluations, explanations
VBD (47) vs. VBZ (25) / VBP (18)
Demo: narrative.pos, AntConc (Concordance, Plot, File View)
Only 27 of the 65 ‘sentences’ have a main verb in past tense
Tregex demo: ./tregex.sh 'ROOT < (S|SBARQ|SQ < (VP < VBD))' narrative.par
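In Tregex, `A < B` means that A immediately dominates B, so the pattern on this slide matches a ROOT whose clause child (S, SBARQ, or SQ) has a VP child that in turn has a VBD child. The sketch below spells out that logic by hand; it is not Tregex itself. Trees are plain (label, children) tuples, the two example trees are invented, and `has_past_main_verb` is a hypothetical name.

```python
CLAUSE_LABELS = {"S", "SBARQ", "SQ"}


def child_nodes(tree, labels):
    """Yield immediate children of `tree` whose label is in `labels`."""
    for child in tree[1]:
        if isinstance(child, tuple) and child[0] in labels:
            yield child


def has_past_main_verb(tree):
    """Mimic 'ROOT < (S|SBARQ|SQ < (VP < VBD))' with immediate dominance."""
    if tree[0] != "ROOT":
        return False
    for clause in child_nodes(tree, CLAUSE_LABELS):
        for vp in child_nodes(clause, {"VP"}):
            if any(child_nodes(vp, {"VBD"})):
                return True
    return False


# Invented toy trees for illustration.
past = ("ROOT", [("S", [("NP", [("PRP", ["She"])]),
                        ("VP", [("VBD", ["left"])]),
                        (".", ["."])])])
present = ("ROOT", [("S", [("NP", [("PRP", ["I"])]),
                           ("VP", [("VBP", ["teach"]),
                                   ("NP", [("NNS", ["courses"])])]),
                           (".", ["."])])])
print(has_past_main_verb(past), has_past_main_verb(present))
```

Because only immediate dominance is checked, a past-tense verb inside an embedded clause does not count as the main verb, which matches the intent of the slide's query.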
Summary
Identification of quantitatively salient (both frequent and rare) linguistic features
Close examination of the distribution, patterns of use, and functions of these features
Complements insights gained from clause-by-clause qualitative analysis