The question:

Differential Possessor Expression in English: Re-evaluating Animacy and Topicality Effects
Catherine O'Connor, Boston University
Arto Anttila, NYU
Vivienne Fong, NYU
Joan Maling, Brandeis University
Annual Meeting of the Linguistic Society of America
January 9 - 11, 2004
Boston, Massachusetts

The question: What are the factors that drive the English alternation between the "Saxon genitive" and the "Of genitive"?
X'S (Spec): The man's widow
OF-X (Comp): The widow of the man

Hypothesis 1: ANIMACY
The X'S construction tends to attract animate possessors/modifiers, and the OF-X construction tends to attract inanimate possessors/modifiers. (Jespersen, Rosenbach, Stefanowitsch, Anschutz, R. Hawkins...)

This is a statistical tendency, at best:
X'S: Walking's many virtues
OF-X: The many virtues of walking
(Hemispheres Magazine, 2001)

Hypothesis 2: DISCOURSE STATUS
The X'S construction attracts old, topical, or highly accessible modifiers. The OF-X construction attracts newer or less accessible modifiers. (Deane, Anschutz)

This is also a statistical tendency:
X'S: its rejection (a bill)
OF-X: ...recommend passage of it
X'S: A neighbor's car
OF-X: The car of a neighbor

Hypothesis 3: WEIGHT
The X'S prenominal construction attracts lighter modifiers, and the OF-X construction attracts heavier modifiers. (Stefanowitsch; also cf. Arnold et al., J. Hawkins, Wasow)

An analytical problem: These three hypotheses are seriously confounded:
Example: His advocacy of betting on the ponies
Pronouns are light
Humans are often topical
Topics are often expressed as pronouns

Another analytical problem: Which examples can legitimately be expected to alternate between Of-X and X's?
Of-X: The glass of water
X's: Water's glass
There are many, many such distractors.

Our plan of inquiry:
1. Secure a large number of OF-X and X'S tokens in the Brown Corpus.
2. Exclude tokens of non-reversible types.
3. Code remaining tokens for weight, animacy and discourse status.
4. Control for confounds where possible, and try to model the statistical findings within an OT grammar, following work by Aissen and others.

1. Cleaning the sample

First: exclude non-nominals. Using F. Karlsson's part-of-speech tagged version of the Brown corpus (1995), we excluded all irrelevant Of-NP and NP's tokens. A few examples:
Verbal OF-X: He thought of her.
Adjectival OF-X: bald and afraid of women.
Contraction X'S: Kate's all right.

"All NP" sample, after removal of non-nominal examples (N = 10,006):
X'S   4744  47%
OF-X  5263  53%

Second: exclude all tokens of non-reversible constructions. A few examples:
Partitives: half of his stirrup guard
Measure and container phrases: a drop of liquor; two saucers of water
Classifier phrases: a grove of trees; a flight of wooden steps
Configuration and constitutive phrases: strips of skin; a...castle of pine boughs

Second: exclude all tokens of non-reversible constructions (continued)
'Sort' phrases: the crassest kind of materialism
Headless OF-X: that of a frustrated gnome
Nominal compounds: dog-eared men's magazines
and many more:
a man of brooding suspicions
[the] concept of the white-suited big-daddy colonel
the notion of philosophy as Queen Bee . . .

Partially clean sample, after removal of 'strict' non-reversibles (N = 7,443):
X'S   4604  62%
OF-X  2839  38%

Third: exclude tokens where reversal substantially alters meaning ('soft' non-reversibles)
Example: fear of him vs. his fear
(a) Idioms, fixed phrases, and titles:
bachelor of science vs. *science's bachelor
Satan's L'il Lamb vs. #the L'il Lamb of Satan
(b) Deverbal nominals with argument constraints (see handout for more examples)

Cleaner sample, after removal of 'soft' non-reversibles (N = 6570):
X'S   4585  70%
OF-X  1985  30%

2. Coding the sample

X'S: My sister's house
OF-X: The house of my sister
For each token, we coded the head and the modifier. Each was coded for animacy, definiteness, NP form, and weight.

CODING for ANIMACY:
ANIMATE: Human(oid)s; Animals
ORG: Human organizations
INANIMATE: Concrete objects; Abstract entities; Locations; Temporal entities

CODING for WEIGHT: Arnold et al., Wasow, and J. Hawkins assert that the [orthographic] word is a reasonable measure of weight for most purposes. It is also easily automated. Each head and each modifier was coded for weight in words, from 1 through >20.
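A minimal sketch of how such weight coding could be automated. It assumes that an orthographic word is a whitespace-delimited string; the function name `weight_in_words` and the label format are illustrative and are not taken from the study.

```python
def weight_in_words(phrase: str, cap: int = 20) -> str:
    """Code a phrase for weight as its orthographic word count.

    Assumption: words are whitespace-delimited. Counts above `cap`
    collapse into a single '>cap' category, mirroring the
    '1 through >20' scale described above.
    """
    n = len(phrase.split())
    return str(n) if n <= cap else f">{cap}"


# Modifiers drawn from the examples in this talk
for modifier in ["his", "the taxpayers'", "Speaker Sam Rayburn's"]:
    print(modifier, "->", weight_in_words(modifier))
```

Whitespace splitting treats the possessive marker as part of its host word, so "Rayburn's" counts as one word under this assumption.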

How to code for Discourse Status? Even simple codes such as 'New', 'Inferrable', and 'Old' are quite time-consuming, although they are clearly desirable. With thousands of tokens, we chose instead to exploit certain robust relationships between NP form and discourse status / accessibility. Relying on previous research of Prince, Gundel et al., Ariel, i.a., we coded modifiers and heads for NP form and for morphosyntactic definiteness.

Coding for NP Form and Definiteness:
Most accessible, most topical, discourse-old...
Pronoun
Proper Noun
Common Noun (definite)
Common Noun (indefinite)
Least accessible, least topical, discourse-new...

3. Controlling weights

After we coded our clean sample for weight, we noticed that 99% of our X'S examples had possessive modifiers that were 1, 2, or 3 words in weight.
his only attack on the Republicans
the taxpayers' pockets
Speaker Sam Rayburn's forces
We controlled for weight effects by limiting OF-X tokens to those of 1, 2, or 3 words in weight.
the invasion of Cuba
the rapid growth of juvenile delinquency
the 9th precinct of the 23rd ward

Cleanest sample, after removal of modifiers greater than 3 words in weight (N = 6034):
X'S   4455  75%
OF-X  1485  25%

4. Generalizations

We decided to convert the raw numbers of X'S and Of-X tokens into ratios. For example, of 4177 animate tokens:
X'S   3909
Of-X   268

Let's compare the inanimate tokens. Of 1359 inanimate tokens:
X'S    357
Of-X  1002
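The conversion from raw counts to ratios can be sketched as follows. The counts are the animate and inanimate figures given above; the helper name `xs_to_of_ratio` is illustrative, and the base-10 logarithm is included only because the ratio charts in this talk use a log scale.

```python
import math

# (X'S count, Of-X count) per animacy category, as reported above
counts = {
    "animate": (3909, 268),
    "inanimate": (357, 1002),
}


def xs_to_of_ratio(xs: int, of: int) -> float:
    """Ratio of X'S tokens to Of-X tokens; values above 1 favor X'S."""
    return xs / of


for label, (xs, of) in counts.items():
    ratio = xs_to_of_ratio(xs, of)
    # log10 is positive when X'S is favored and negative when Of-X is favored
    print(f"{label}: ratio = {ratio:.2f}, log10 = {math.log10(ratio):+.2f}")
```

The animate ratio comes out at about 14.59 (roughly 15 : 1), and the inanimate ratio at about 0.36 (roughly 1 : 3).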

Ratio of X'S to OF-X by Animacy category in Cleanest sample (n=6034), log scale; higher ratios favor X'S, lower ratios favor Of-X:
Animate    15 : 1   (N=3937)
Org        1.3 : 1  (N=498)
Inanimate  1 : 3    (N=1359)

Ratio of X'S to OF-X by NP form type in 'Cleanest' sample (n=6034); higher ratios favor X'S, lower ratios favor Of-X:
Pronoun                   297 : 1   (N=3577)
Proper Noun               1.85 : 1  (N=971)
Common Noun (definite)    1 : 2.3   (N=947)
Common Noun (indefinite)  1 : 7.7   (N=539)

Both animacy and discourse status seem to have a large effect. How about weight? Do we find effects of similar magnitude when we examine our possessive modifiers by our three weight values--1, 2 and 3 words? Yes: here, the results span two orders of magnitude.

Ratio of X'S to OF-X by Weight in Cleanest sample (n=6034); higher ratios favor X'S, lower ratios favor Of-X:
1 word   10 : 1  (N=4443)
2 words  1 : 2   (N=1174)
3 words  1 : 5   (N=417)

Animacy, discourse status, and weight all show strong effects. If we hold one factor constant, do the other factors disappear? First we will hold animacy constant and look at the effects of discourse status, through the proxy of NP form.

Chart: Ratio of X's to OF-X by NP form, controlling Animacy (n=6034). Panels: Animate (N=4177), Org (N=498), Inanimate (N=1359).

If we hold NP form constant, do the Animacy rankings hold up?

Chart: Ratio of X's to OF-X by Animacy category, controlling NP form (n=6034). Panels: Pronoun (n=3577), Proper Noun (n=971), Common Noun (definite) (n=947), Common Noun (indefinite) (n=539).

Do the animacy and discourse status ratios hold the same relative values when we control for weight? Yes. If we repeat the process for all tokens with modifiers of weight 1, 2, and 3, the relative ranking of ratios stays the same, and the magnitude of the differences persists.

Just how robust are these results? The animacy and discourse status effects remained intact no matter what we controlled for. The relative ranking of the animacy and NP form categories was unchanged, although the ratios themselves differed somewhat in magnitude. What would happen if we computed the same ratios on our original sample of 'All NPs' (n=10,006)? Did all our laborious extractions and exclusions really make a difference?

Return to initial sample, "All NPs" (N = 10,006):
X'S   4744  47%
OF-X  5263  53%

Ratio of X'S to OF-X by NP form type: Comparison of Cleanest (n=6034) vs. All NPs (n=9963)
NP form                   Cleanest  All NPs
Pronoun                   297       28
Proper Noun               1.85      0.96
Common Noun (definite)    0.44      0.15
Common Noun (indefinite)  0.13      0.04

Ratio of X'S to OF-X by Animacy: Comparison of Cleanest (n=6140) vs. All NPs (n=9963)
Animacy    Cleanest  All NPs
Animate    14.59     5.53
Org        1.29      0.59
Inanimate  0.36      0.11

Interpreting the results

Recall our goal: 4. Control for confounds where possible, and try to model the statistical findings within an OT grammar, following work by Aissen and others. This is in progress, with very good results. A set of three binary constraints fits the data from our corpus study, and makes predictions that can be tested cross-linguistically.

OT Analysis
In this preliminary phase, we classify possessors in terms of three binary features: [±animate] [±human] [±pronoun]

Input
[+anim, +hum, +pron] = she
[+anim, +hum, –pron] = butler
[–anim, +hum, +pron] = it (organization)
[+anim, –hum, +pron] = it (animal)
[–anim, –hum, +pron] = it (other)
[+anim, –hum, –pron] = dog
[–anim, +hum, –pron] = government
[–anim, –hum, –pron] = table
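The eight input types follow from crossing the three binary features. A minimal sketch of that enumeration, where the dictionary representation and the function names are illustrative choices rather than part of the analysis:

```python
from itertools import product

FEATURES = ("anim", "hum", "pron")


def feature_bundles():
    """All combinations of the three binary features (2**3 = 8 bundles)."""
    return [
        dict(zip(FEATURES, values))
        for values in product((True, False), repeat=len(FEATURES))
    ]


def show(bundle):
    """Render a bundle in bracket notation, using ASCII +/- signs."""
    return "[" + ", ".join(("+" if bundle[f] else "-") + f for f in FEATURES) + "]"


for bundle in feature_bundles():
    print(show(bundle))
```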

OT constraints for the Complement (the X in Of-X)
(a) *P/C ‘No [+pron] in Comp.’
    *P-NP/C ‘No [±pron] in Comp.’
(b) *A/C ‘No [+anim] in Comp.’
    *A-I/C ‘No [±anim] in Comp.’
(c) *H/C ‘No [+hum] in Comp.’
    *H-NH/C ‘No [±hum] in Comp.’

OT constraints for the Specifier (the X in X'S)
(a) *NP/S ‘No [−pron] in Spec.’
    *NP-P/S ‘No [±pron] in Spec.’
(b) *I/S ‘No [−anim] in Spec.’
    *I-A/S ‘No [±anim] in Spec.’
(c) *NH/S ‘No [−hum] in Spec.’
    *NH-H/S ‘No [±hum] in Spec.’
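A rough sketch of how these constraints could be checked against a possessor, under one reading of the glosses. It assumes that a [±F] constraint is violated by any possessor in the relevant position, whatever its value for F. That reading, the dictionary representation, and the function name `violations` are assumptions for illustration. No constraint ranking is given here, so the sketch lists violations only and does not select a winning candidate.

```python
def violations(possessor: dict, position: str) -> list:
    """List the constraints violated when `possessor` occupies `position`.

    `position` is 'Comp' (the X in Of-X) or 'Spec' (the X in X'S).
    Assumption: the general [±F] constraints are violated by every
    possessor in that position, regardless of feature value.
    """
    if position == "Comp":
        specific = [
            ("*P/C", possessor["pron"]),   # no [+pron] in Comp
            ("*A/C", possessor["anim"]),   # no [+anim] in Comp
            ("*H/C", possessor["hum"]),    # no [+hum] in Comp
        ]
        general = ["*P-NP/C", "*A-I/C", "*H-NH/C"]
    elif position == "Spec":
        specific = [
            ("*NP/S", not possessor["pron"]),  # no [-pron] in Spec
            ("*I/S", not possessor["anim"]),   # no [-anim] in Spec
            ("*NH/S", not possessor["hum"]),   # no [-hum] in Spec
        ]
        general = ["*NP-P/S", "*I-A/S", "*NH-H/S"]
    else:
        raise ValueError("position must be 'Comp' or 'Spec'")
    return [name for name, violated in specific if violated] + general


she = {"anim": True, "hum": True, "pron": True}
table = {"anim": False, "hum": False, "pron": False}
for name, possessor in [("she", she), ("table", table)]:
    for position in ("Comp", "Spec"):
        print(name, position, violations(possessor, position))
```

Under this reading, "she" incurs the most violations in Comp and "table" the most in Spec, which is the direction of the animacy and pronoun effects reported above.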

19 predicted languages (out of 256 logically possible ones)
