Saying nothing: Frequency effects in Dominican Spanish null subjects Cristina Martinez Sanz (U of Ottawa) & Gerard Van Herk (Memorial U)
The spark: • “every aspect of language can profitably be re-examined in light of the important frequency effects” (Bybee 2002) • …but most work so far is on phonology
For example… • intervocalic /d/ (Bybee, 2002; Díaz-Campos & Gradoville, 2011) • coronal stop deletion (Bayley & Loudermilk, 2008; Bybee 2001, 2002; Walker 2012) • lenition of Spanish syllable final /r/ (Díaz-Campos, 2005, 2006) • syllable final /s/ (Brown, 2009; Fife-Muriel, 2009)
Some studies find frequency effects, some don’t • But, very little work on frequency effects in syntax
Today’s variable • Null subject in Spanish, aka “subject personal pronoun” (SPP) variation • For example: yo/Ø te voy a hacer una historia buena ‘I’m going to tell you a good story’
A widely studied variable… • …but not with respect to frequency
Dialectal differences • Factor groups: stable across dialects • Factor group rankings: Person is the first-ranked factor group in Caribbean dialects, whereas it is overridden by discourse-related constraints (switch reference) in non-Caribbean dialects (Otheguy et al. 2007, Orozco & Guy 2008, Martínez Sanz 2011).
Frequency effects • Erker & Guy (2012) look at Mexican and Dominican Spanish speakers in New York City • 12 informants, 4916 tokens • “Frequent” forms (N=13): each form that makes up more than 1% of the token file (N=1,120) • Frequency has no direct role, but it influences whether other factors play a role (activation) and/or how much (amplification)
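The 1% cutoff described on this slide can be sketched in a few lines. This is only an illustration of the threshold idea, not Erker & Guy's actual procedure: the function name `frequent_forms`, the `threshold` parameter, and the toy token list are all assumptions made for the example.

```python
from collections import Counter

def frequent_forms(verb_forms, threshold=0.01):
    """Return the verb forms whose share of all tokens exceeds the threshold."""
    counts = Counter(verb_forms)
    total = len(verb_forms)
    return {form for form, n in counts.items() if n / total > threshold}

# Invented toy token file (1,000 tokens), not data from any study.
toy_tokens = (
    ["tengo"] * 30
    + ["tiene"] * 25
    + ["dijo"] * 5
    + [f"verb{i}" for i in range(940)]  # 940 forms that each occur once
)

# tengo (3%) and tiene (2.5%) pass the cutoff; dijo (0.5%) does not.
print(sorted(frequent_forms(toy_tokens)))
```

Because the cutoff is relative to the token file, the same verb form can count as frequent in one corpus and infrequent in another.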
E&G’s findings • Activated: morphological regularity, semantic class, person/number • Amplified: TMA, switch reference
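One rough way to picture the difference between activation and amplification is to compare how large a constraint's effect is among frequent and infrequent forms. The sketch below does this with raw null-subject rates for a single constraint (switch reference). It is not the analysis used in Erker & Guy (2012); the function names and all token values are invented for illustration.

```python
def null_rate(tokens):
    """Share of tokens with a null subject (1 = null, 0 = overt)."""
    return sum(tokens) / len(tokens)

def reference_effect(same_ref, switch_ref):
    """Null-subject rate in same-reference contexts minus switch-reference contexts."""
    return null_rate(same_ref) - null_rate(switch_ref)

# Invented toy codings, not data from any study.
frequent = {"same": [1, 1, 1, 0], "switch": [0, 0, 0, 1]}
infrequent = {"same": [1, 1, 0, 0], "switch": [1, 0, 0, 0]}

effect_frequent = reference_effect(frequent["same"], frequent["switch"])
effect_infrequent = reference_effect(infrequent["same"], infrequent["switch"])

# Amplification: the effect shows up in both subsets but is larger among frequent forms.
# Activation: the effect is absent among infrequent forms and appears only among frequent ones.
print(effect_frequent, effect_infrequent)
```

In this toy case the effect is present in both subsets and larger among frequent forms, which is the amplification pattern; an activation pattern would show no effect at all in the infrequent subset.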
Theoretical assumption • Speakers need a certain level of familiarity with a form to figure out the probabilistic constraints on variant choice in that context • It’s an exemplar thing
Bayley (2013) • Mexican Spanish speakers in Texas, California • 29 informants, 8676 tokens • Tested E&G, threw in a couple more factor groups • 19 frequent forms (N=2612)
Bayley’s findings • Amplified: Semantic class (like E&G) • Weakened: TMA, switch reference, person/number, lexical aspect (opposite of E&G) • So frequency doesn’t activate or amplify constraints in this data set, contrary to what Erker & Guy (2012) predict • Frequency does have a (weak) independent effect, given a large enough data set
Why replicate again? • Maybe we can resolve these differences • We have access to non-contact data • And contact with English might account for the wonkiness of E&G’s and Bayley’s findings • We have access to Dominican Spanish data • This variety is famously variable, with higher rates of overt subjects than elsewhere • And in more contexts (Martínez Sanz 2011)
The data • 25 interviews, 2008, Dominican Republic
Data extraction • 34 speakers • First 200 tokens per speaker • 4567 tokens • after exclusions, other variants, etc. • 835 frequent-form tokens
Other (linguistic) constraints • Switch reference • Bayley & Pease-Alvarez 1996, 1997; Bayley et al. 2012; Cameron 1995, 1996; Flores-Ferrán 2004; Otheguy & Zentella 2012; Silva-Corvalán 1994 • Person/number • Bayley & Pease-Alvarez 1996, 1997; Cameron 1992, 1996; Erker & Guy 2012; Flores-Ferrán 2004; Otheguy et al. 2007; Otheguy & Zentella 2012 • Tense/mood/aspect • Silva-Corvalán 1994; Bayley & Pease-Alvarez 1997; Bayley et al. 2012 • Ambiguity re person • Hochberg 1986; Cameron 1993 • Verb semantics • Travis 2007
Interactions • Previous studies have coded for different things • But many of these overlap • TMA and form ambiguity • Semantic features and lexical aspect • In Bayley, subject type and stativity interact with frequency
Factor groups coded • Person/number of the subject • 1/2/3, singular/plural, plus 2nd indefinite • TMA of the verb (~ambiguity) • Preterit, present & related, other • Semantic features of the verb (~lexical aspect) • Reference • Same or switch • Form frequency • Tengo and tiene are different forms
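The coding scheme on this slide can be pictured as one record per token. The sketch below is an assumed layout, not the authors' actual coding sheet: the class name `CodedToken`, the field names, the level labels (such as "1sg"), and the toy tokens are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass(frozen=True)
class CodedToken:
    form: str           # inflected verb form; "tengo" and "tiene" count separately
    person_number: str  # e.g. "1sg", "3pl", "2indef" (labels are assumptions)
    tma: str            # "preterit", "present", or "other"
    semantics: str      # verb semantic class; the slide does not list the categories
    reference: str      # "same" or "switch"
    null_subject: bool  # True = null subject (the application value)

def null_rate_by(tokens, factor):
    """Null-subject rate for each level of one factor group."""
    totals = defaultdict(int)
    nulls = defaultdict(int)
    for token in tokens:
        level = getattr(token, factor)
        totals[level] += 1
        nulls[level] += token.null_subject
    return {level: nulls[level] / totals[level] for level in totals}

# Invented toy tokens, not data from the study.
toy = [
    CodedToken("tengo", "1sg", "present", "stative", "same", True),
    CodedToken("tengo", "1sg", "present", "stative", "switch", False),
    CodedToken("tiene", "3sg", "present", "stative", "same", True),
    CodedToken("dijo", "3sg", "preterit", "speech", "switch", False),
]

print(null_rate_by(toy, "reference"))
```

Raw rates like these are only a first look; they do not separate the contribution of overlapping factor groups.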
Findings • (See also table in handout!) • Note that null subject is the application value
All tokens together • All usual factor groups are significant • Direction virtually identical to earlier (Caribbean Spanish) studies • Frequency is not significant • i.e., higher-frequency items are not more or less likely to have null subjects • As in E&G and Bayley (more or less)
Infrequent verb forms • Behave almost exactly like the full data set • Because they are the full data set, more or less!
Frequent verb forms • Same effects as full data set, very slightly weaker • One difference: Person/Number • Because there were no frequent forms in plurals or second person, and those were the strongest constraints in the full data set • Similar effects in Bayley, Erker & Guy
Frequent verb forms • TMA (ambiguity): very slightly weaker effects • Verb semantics (lexical aspect): very slightly stronger effects • Switch reference: no differences
So… • Frequency alone doesn’t matter • …although for this variable, nobody has said that it should • Frequency doesn’t amplify or activate any other constraints
1. Speech community differences • Dialect differences • Mexican Spanish has a less productive overt subject pronoun system • Contact • Previous studies were based on Spanish speakers in contact with English • Their constraint systems might be eroded in infrequent contexts (a Nancy Dorian thing)
2. Data collection differences: Is this all lexically driven? • If frequency is determined based on the corpus, topic and interlocutors could affect which verb forms are frequent. • The Bayley interviews are about language choices, so verba dicendi are frequent • What’s the theoretical justification for this method? Surely what matters is which forms are frequent in a speaker’s cognitive system
3. Similarities across corpora: are we measuring frequency wrong? • Even if frequent forms vary from corpus to corpus, a number of forms appear in all three corpora. • In the three data sets, most frequent forms are in the first and third person singular, and they include a lot of statives, which trigger overt subject insertion.
Further research: redefining frequency • Person: Should we look at different persons separately, given that they have very different properties? • Verb forms and verb types: Frequent verb forms include verbs with very different properties. Should we look at how frequency affects verb types, not just verb forms?
Further research: redefining frequency • Frequency might not work the same in phonology (where most frequency work has been carried out) as in syntax. • Given the conflicting results in SPP frequency studies, we might want to explore different ways of measuring frequency in syntax (by lemma, form, collocation?)
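To make the form-versus-lemma question concrete, the sketch below counts the same toy tokens both ways. The `LEMMA` lookup table, the function name `relative_freq`, and the token list are assumptions for illustration; a real study would need a full lemmatizer and real corpus counts.

```python
from collections import Counter

# Hypothetical form-to-lemma lookup covering only the toy forms below.
LEMMA = {
    "tengo": "tener",
    "tiene": "tener",
    "digo": "decir",
    "dijo": "decir",
    "voy": "ir",
}

def relative_freq(items):
    """Share of the total for each distinct item."""
    counts = Counter(items)
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items()}

# Invented toy tokens, not data from the study.
forms = ["tengo", "tiene", "tengo", "digo", "dijo", "voy", "tiene", "voy"]

by_form = relative_freq(forms)
by_lemma = relative_freq(LEMMA[form] for form in forms)

# By form, tengo and tiene are separate items; by lemma, they pool under tener.
print(by_form)
print(by_lemma)
```

A lemma can clear a frequency cutoff that none of its individual forms reaches, so the choice of counting unit changes which tokens are treated as frequent.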
Conclusion • Maybe there’s no frequency effect • Maybe there is, but we haven’t yet figured out how to find it
Thanks • Bob Bayley (and indirectly Greg Guy and Danny Erker), for sharing work in progress • The people in the Dominican Republic who shared their language with Cristina
Contact us • Gerard: gvanherk@mun.ca • Cristina: cristina.martinez.sanz@gmail.com