Capturing patterns of linguistic interaction in a parsed corpus: A methodological case study
Sean Wallis, Survey of English Usage, University College London
s.wallis@ucl.ac.uk
Capturing linguistic interaction...
• Parsed corpus linguistics
• Intra-structural priming
• Experiments
• Attributive AJPs before a noun
• Embedded postmodifying clauses
• Sequential postmodifying clauses
• Speech vs. writing
• Conclusions
• The handout explains the analytical method in more detail (so read it later!)
Parsed corpus linguistics
• An example tree from ICE-GB (spoken), S1A-006 #23
Parsed corpus linguistics
• Three kinds of evidence may be obtained from a parsed corpus
  • Frequency evidence of a particular known rule, structure or linguistic event
  • Coverage evidence of new rules, etc.
  • Interaction evidence of the relationship between rules, structures and events
• This evidence is necessarily framed within a particular grammatical scheme
• How might we evaluate this grammar?
Intra-structural priming
• Priming effects within a structure
• Study repeating an additive step in structures
• Consider
  • a phrase or clause that may (in principle) be extended ad infinitum
    • e.g. an NP with a noun head
  • a single additive step applied to this structure
    • e.g. add an attributive AJP before the head
• Q. What is the effect of repeatedly applying this operation to the structure?
[Figure: an NP tree built up one step at a time around the head noun (N) ship, adding the AJPs tall, then very green, then old]
Experiment 1: analysis of results
[Figure: probability (0.00 to 0.20) plotted against x = 0 to 5, with error bars]
• Sequential probability analysis
  • calculate probability of adding each AJP
  • error bars: Wilson intervals
• probability falls
  • second < first
  • third < second
• decisions interact
• Every AJP added makes it harder to add another
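The sequential probability analysis above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names and the frequency counts are hypothetical, and the counts are invented purely to show the calculation (they are not ICE-GB results). It assumes that the probability of adding the x-th AJP is the number of NPs with at least x AJPs divided by the number with at least x-1, and it attaches a standard Wilson score interval to each probability.

```python
import math


def wilson_interval(p, n, z=1.959964):
    """Wilson score interval for an observed proportion p out of n cases.

    z is the two-tailed critical value of the normal distribution
    (about 1.96 for a 95% interval).
    """
    if n <= 0:
        raise ValueError("n must be positive")
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def sequential_probabilities(at_least):
    """at_least[x] = number of structures with at least x additive steps.

    Returns a list of (x, p, lower, upper), where p is the probability of
    adding the x-th item given that x-1 items are already present.
    """
    rows = []
    for x in range(1, len(at_least)):
        n = at_least[x - 1]
        if n == 0:
            break  # nothing left to extend
        p = at_least[x] / n
        lower, upper = wilson_interval(p, n)
        rows.append((x, p, lower, upper))
    return rows


if __name__ == "__main__":
    # Hypothetical counts, invented for illustration only.
    counts = [1000, 200, 20, 1]
    for x, p, lower, upper in sequential_probabilities(counts):
        print(f"step {x}: p = {p:.3f}  [{lower:.3f}, {upper:.3f}]")
```

With counts of this shape the probability falls at each step, which is the pattern the slide describes. Whether a fall is significant still has to be tested separately.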
Experiment 1: explanations?
• Feedback loop: for each successive AJP, it is more difficult to add a further AJP
  • logical-semantic constraints
    • tend to say the tall green ship
    • do not tend to say tall short ship or green tall ship
  • communicative economy
    • once a speaker has said tall green ship, they tend to say only ship
  • memory/processing constraints
    • unlikely: this is a small structure, as are AJPs
Experiment 1: speech vs. writing
[Figure: probability (0.00 to 0.25) plotted against x = 0 to 5, with separate lines for written and spoken data]
• Spoken vs. written subcorpora
  • Same overall pattern
  • Spoken data tends to have fewer attributive AJPs
    • Support for communicative economy or memory/processing hypotheses?
• Significance tests
  • Paired 2x1 Wilson tests (Wallis 2011)
  • first and second observed spoken probabilities are significantly smaller than the corresponding written probabilities
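One way to compare a spoken probability with a written one is sketched below. This is not the "paired 2x1 Wilson test" of Wallis (2011), whose details are in the handout and the cited paper. It substitutes a related, well-known technique: Newcombe's Wilson-based interval for the difference between two independent proportions. The function names and the input figures are hypothetical.

```python
import math


def wilson_interval(p, n, z=1.959964):
    """Wilson score interval for an observed proportion p out of n cases."""
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def newcombe_wilson_difference(p1, n1, p2, n2, z=1.959964):
    """Interval for d = p1 - p2, built from the two Wilson intervals.

    Returns (d, lower, upper). The difference is treated as significant
    when the interval excludes zero.
    """
    l1, u1 = wilson_interval(p1, n1, z)
    l2, u2 = wilson_interval(p2, n2, z)
    d = p1 - p2
    lower = d - math.sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = d + math.sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return d, lower, upper


def significantly_different(p1, n1, p2, n2, z=1.959964):
    """True if the difference interval does not contain zero."""
    _, lower, upper = newcombe_wilson_difference(p1, n1, p2, n2, z)
    return lower > 0 or upper < 0


if __name__ == "__main__":
    # Hypothetical written vs. spoken probabilities, for illustration only.
    print(newcombe_wilson_difference(0.20, 1000, 0.10, 1000))
    print(significantly_different(0.20, 1000, 0.10, 1000))
```

The sketch assumes the two subcorpora are independent samples, which is reasonable for separate spoken and written texts but is an assumption all the same.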
Experiment 2: preverbal AVPs
[Figure: probability (0.00 to 0.10) plotted against x = 0 to 4]
• Consider adverb phrases before a verb
• Results very different
  • Probability does not fall significantly between first and second AVP
  • Probability does fall between second and third AVP
• Possible constraints
  • (weak) communicative
  • (weak) semantic
• Further investigation needed
Experiment 3: postmodifying clauses
• Another way to specify nouns in English
  • add clause after noun to explicate it
    • the ship [that was in the port]
    • the ship [called Ariadne]
  • may be embedded
    • the ship [that was in the port [we visited last week]]
  • or successively postmodified
    • the ship [called Ariadne][that was in the port]
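The difference between embedding and sequential postmodification can be made concrete with a toy data structure. The sketch below is illustrative only: the `NP` and `Clause` classes and both measuring functions are hypothetical and far simpler than the ICE-GB parse trees. It encodes the two bracketed examples above and measures each along both axes.

```python
from dataclasses import dataclass, field


@dataclass
class Clause:
    text: str
    nps: list = field(default_factory=list)  # noun phrases inside the clause


@dataclass
class NP:
    head: str
    postmods: list = field(default_factory=list)  # postmodifying clauses


def sequential_length(np):
    """Number of clauses postmodifying the same head."""
    return len(np.postmods)


def embedding_depth(np):
    """Longest chain of postmodifying clauses nested inside one another."""
    best = 0
    for clause in np.postmods:
        inner = max((embedding_depth(n) for n in clause.nps), default=0)
        best = max(best, 1 + inner)
    return best


# the ship [that was in the port [we visited last week]]
embedded = NP("ship", [
    Clause("that was in the port", [
        NP("port", [Clause("we visited last week")]),
    ]),
])

# the ship [called Ariadne][that was in the port]
sequential = NP("ship", [
    Clause("called Ariadne"),
    Clause("that was in the port", [NP("port")]),
])

if __name__ == "__main__":
    print(embedding_depth(embedded), sequential_length(embedded))
    print(embedding_depth(sequential), sequential_length(sequential))
```

Counting how many NPs reach each depth or each sequence length gives the frequencies from which a sequential probability at each step can be calculated.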
Experiment 3: (i) embedding
[Figure: probability (0.00 to 0.10) plotted against x = 0 to 4, with lines for written, spoken and all data]
• Probability of adding a further embedded postmodifying clause falls with size
  • All data
    • second < first
    • third < first
  • Spoken
    • second < first
  • Written
    • third < second
• Compare with effect of sequential postmodification of same head
Experiment 3: (ii) sequential
[Figure: probability (0.00 to 0.15) plotted against x = 0 to 5, with lines for spoken and written data]
• Probability of sequential postmodifying falls and, for spoken data, falls, then rises
  • All data
    • second < first
  • Spoken
    • third > second
• Option: count conjoins separately or treat as single item
  • Either way, results show similar pattern
• Negative feedback: the ‘in for a penny’ effect
Experiment 3: (iii) embed vs. seq
[Figure: probability (0.00 to 0.15) plotted against x = 0 to 5, with lines for sequential and embedding]
• Embedded vs. sequential postmodification
  • embedding > sequence (second level)
• It is slightly easier to modify the latest head than a more remote one:
  • semantic constraints?
  • backtracking cost?
• Third level
  • embedding < sequence (if counting conjoins)
  • long sequences seem to be easier to construct than comparable layers of embedding
Conclusions
• A method for evaluating interactions along grammatical axes
  • General purpose, robust, structural
  • More abstract than ‘linguistic choice’ experiments
  • Depends on a concept of grammatical distance along an axis, based on the chosen grammar
• Method has philosophical implications
  • Grammar viewed as outcome of linguistic choices
  • Linguistics as an evaluable observational science
  • Signature (trace) of language production decisions
• A unification of theoretical and corpus linguistics?
Potential applications
• Corpus linguistics
  • Optimising existing grammatical framework
    • e.g. coordination, compound nouns
  • Comparing genres/languages/periods
• Theoretical linguistics
  • Comparing different grammars, same language
• Psycholinguistics
  • Search for evidence of language production constraints in spontaneous speech corpora
    • speech and language therapy
    • language acquisition and development
References
Nelson, G., Wallis, S. & Aarts, B. (2002) Exploring natural language. Benjamins.
Pickering, M. & Ferreira, V. (2008) Structural priming. Psychological Bulletin 134, 427–459.
Wallis, S.A. (2011) Comparing χ² tests for separability. Survey of English Usage.
• For explanation of the analysis method see the handout!
• For more detail and a draft of the full paper see http://corplingstats.wordpress.com