Lecture 4: Sorites 2
• Finish Kamp 1981
• Explain other context-based approaches (Kyburg & Morreau: fitting words)
Context for sorites: Kamp 1981
• Context C is a deductively closed set of formulas, e.g.,
  a. {small(1), small(2), ¬small(3)}
  b. {small(1), small(2), ¬small(4)}
  c. {small(1), small(6), ¬small(4)}
• C is incoherent ⇔ ∃x∃y (small(x) ∈ C & ¬small(y) ∈ C & (x ~ y ∨ y << x))
• Suppose JND = 2. Then
  a is incoherent (2 ~ 3)
  b is coherent (not 2 ~ 4)
  c is incoherent (4 << 6)
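The coherence test can be tried out on the three example contexts. The following is only a minimal sketch under stated assumptions: objects are integers, a context is represented by its small / ¬small literals alone (deductive closure is ignored), x ~ y is read as |x - y| < JND, and y << x as x - y ≥ JND. The function names and this numeric reading are illustrative assumptions, chosen because they reproduce the verdicts for a, b and c.

```python
JND = 2  # just-noticeable difference, as on the slide

def similar(x, y, jnd=JND):
    """x ~ y: the difference is below the JND (assumed numeric reading)."""
    return abs(x - y) < jnd

def much_less(x, y, jnd=JND):
    """x << y: y exceeds x by at least the JND (assumed numeric reading)."""
    return y - x >= jnd

def incoherent(small, not_small, jnd=JND):
    """C is incoherent iff some small(x), ¬small(y) in C have x ~ y or y << x."""
    return any(similar(x, y, jnd) or much_less(y, x, jnd)
               for x in small for y in not_small)

# Each context is given as (objects asserted small, objects asserted not small).
contexts = {
    "a": ({1, 2}, {3}),
    "b": ({1, 2}, {4}),
    "c": ({1, 6}, {4}),
}
for name, (small, not_small) in contexts.items():
    print(name, "incoherent" if incoherent(small, not_small) else "coherent")
```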
3-valued truth definition (here only truth):
• [small(a)]_C = 1 iff ∃b (small(b) ∈ C & (b ~ a or a << b))
• [p → q]_C = 1 iff [p]_C = 0 or [q]_{C ∪ {p}} = 1
• [p & q]_C = 1 iff [p]_C = 1 and [q]_C = 1 and C ∪ {p, q} is coherent
Let C contain small(0) and ¬small(1000). Then
• For every i, we have [small(i) → small(i+1)]_C = 1 because [small(i+1)]_{C ∪ {small(i)}} = 1
• But let p1 = small(0) → small(1), p2 = small(1) → small(2), etc. Then [p1 & … & p1000]_C ≠ 1, since C ∪ {p1, …, p1000} is not coherent
Kamp 1981 universal quantification: like conjunction
• [∀x φ(x)]_C = 1 iff [φ(a)]_C = 1 for all constants a & the set {φ(a)} of instantiations is coherent
• Example: [∀i (small(i) → small(i+1))]_C = 1 iff [small(1) → small(2)]_C = 1 & … & the set of all the instantiations is coherent
Analysis of paradox (Kamp 81)
Every instance of the crucial premiss is true, but
• Not their conjunction
• Not the quantified version
Is the paradox valid?
The argument (n = 1000)
small(0)
small(0) → small(1), so small(1)
small(1) → small(2), so small(2)
…
small(999) → small(1000), so small(1000)
Contextualised version of validity: if the premisses are true then the conclusion is true in the context that collects all premisses. The argument is valid.
Key: if [small(i)]_C = 1 and [small(i) → small(i+1)]_C = 1 then [small(i+1)]_{C ∪ {small(i)}} = 1
Kamp 1981: consequences
• The argument is deemed valid
• Each instantiation of the crucial premiss is true, but their conjunction is not
• Likewise, the quantified version is false
• Drawback: in many ways, this is as nonclassical as earlier accounts. (Compare Kamp’s critique of these)
Veltman 1987
• Variant of Kamp’s approach, based on an idea of Nelson Goodman (further developed by Michael Dummett)
• Key idea: whether two objects can be distinguished depends on other objects in the context
  a ~ b ~ c, but a << c
• If c is absent then a and b are indistinguishable. But with c present, a difference between a and b can be inferred: c is called a help element.
Notations
• (x << y)_A ⇔ x << y ∨ ∃h∈A (h << y & ¬ h << x) ∨ ∃h∈A (h >> x & ¬ h >> y)
• (x ~ y)_A ⇔ ¬(x << y)_A & ¬(y << x)_A
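The effect of a help element can be checked mechanically. The sketch below assumes a numeric reading of << (y exceeds x by at least one JND); the value JND = 10 and the objects 0, 6 and 12 are hypothetical, picked so that a ~ b ~ c while a << c. It is an illustration of the two definitions above, not Veltman's own formalisation.

```python
JND = 10  # assumed just-noticeable difference, in arbitrary units

def much_less(x, y, jnd=JND):
    """Plain x << y: y exceeds x by at least the JND (assumed numeric reading)."""
    return y - x >= jnd

def much_less_in(x, y, context, jnd=JND):
    """(x << y)_A: x << y directly, or some help element h in A separates them."""
    if much_less(x, y, jnd):
        return True
    return any(
        (much_less(h, y, jnd) and not much_less(h, x, jnd))     # h << y but not h << x
        or (much_less(x, h, jnd) and not much_less(y, h, jnd))  # h >> x but not h >> y
        for h in context
    )

def similar_in(x, y, context, jnd=JND):
    """(x ~ y)_A: neither (x << y)_A nor (y << x)_A."""
    return (not much_less_in(x, y, context, jnd)
            and not much_less_in(y, x, context, jnd))

a, b, c = 0, 6, 12               # hypothetical values: a ~ b, b ~ c, but a << c
print(similar_in(a, b, set()))   # no help element available
print(similar_in(a, b, {c}))     # c is present and can act as a help element
```

With the empty context a and b come out indistinguishable; once c is in the context, the same pair is separated.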
Veltman 1987 (sketch)
• Proposed to judge the truth of a discourse relative to a context determined by the discourse
• A context is coherent iff it satisfies this version of EOI: small(i)_C & (i ~ i+1)_C → small(i+1)_C
• When premisses small(i) → small(i+1) are judged on their own, they must be true in any coherent context, because the “context determined” is {i, i+1}: there is no help element, so (i ~ i+1)_C, so EOI “fires”.
Veltman 1987 (sketch)
• But when taken together, the premisses “determine” a larger context.
• For example, (99 ~ 100)_C is judged in C = {0, 1, 2, …, 100}. There are plenty of help elements, so (99 ~ 100)_C does not hold, EOI does not fire, and coherence does not imply that small(100).
Veltman 1987 (sketch)
• In fact, it is not possible that all premisses of the form (i ~ i+1)_C are true in the context jointly determined by them. (Suppose small(k) & ¬small(k+1). Then C must contain h such that (k << h) & ¬(k+1 << h). Consequently, (k << k+1)_C and therefore C is incoherent.)
Veltman 1987 (sketch)
Similar reasoning makes the quantified version of the crucial premiss false in any coherent model.
Comments:
• A flaw in an otherwise appealing approach? (Similar to Kamp 81)
• The idea of using context may be taken further. For instance, contexts may be built up from left to right
• More importantly, linguistic approaches to context show that where there is context dependence there is ambiguity (e.g., Graeme Hirst, Computational Linguistics 1996)
Van Deemter 1992, 1996
• In NPs like `a small elephant’, the predicate small is known to be context dependent: context is provided by the noun elephant.
• Discourse Representation Theory (DRT) suggests that context can also be built up through discourse:
  x1 . dinosaur(x1)
  x2 . whale(x2)
  x3 . x3 = Jumbo
  small(Jumbo)
• For concreteness, focus on one way in which small(x) might depend on context. Notation: small(x)_A, S(x)_A
Some definitions
• small(x)_A ⇔def |{y∈A: y << x}| < |{y∈A: y >> x}|
• K_A[x] =def {y∈A: y << x}
• G_A[x] =def {y∈A: y >> x}
(Figure: a row of dots with x in the middle; K_A[x] is labelled on one side of x and G_A[x] on the other. Dots represent the elements of the context A.)
We have contextualised both ~ and small
This leads to many possible versions of the crucial premiss. In particular
• Plain version: S(x)_A & x ~ y → S(y)_{A ∪ {x}}
• Sophisticated version: S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}
What other contextualisations can you think of? Are they really different?
Recall:
• Plain version: S(x)_A & x ~ y → S(y)_{A ∪ {x}}
• Sophisticated version: S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}
Plain: S(x)_A & x ~ y → S(y)_{A ∪ {x}}
Sophisticated: S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}
• S(x) may be relativised to A or to A ∪ {x}
• x ~ y may be relativised to nothing, to A, to A ∪ {x} or to A ∪ {x, y}
• S(y) may be relativised to A, to A ∪ {x}, to A ∪ {y} or to A ∪ {x, y}
Each of these options is equivalent to either Plain or Sophisticated. For instance, (x ~ y)_A ⇔ (x ~ y)_{A ∪ {x, y}}, and S(y)_A ⇔ S(y)_{A ∪ {x, y}} because x ~ y (in both versions).
First claim
Plain version is invalid: ¬╞ S(x)_A & x ~ y → S(y)_{A ∪ {x}}
This is easy to see: let context A = {x, y, z}, with x ~ y ~ z but x << z.
• K_A[x] = {}, G_A[x] = {z}, therefore S(x)_A
• K_A[y] = {}, G_A[y] = {}, therefore ¬S(y)_{A ∪ {x}}
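The counterexample can be replayed with concrete numbers. This is a sketch under assumptions: << is read numerically (a gap of at least one JND), JND = 10, and x, y, z are given the hypothetical values 0, 6 and 12 so that x ~ y ~ z and x << z.

```python
JND = 10  # assumed just-noticeable difference

def much_less(x, y, jnd=JND):
    """x << y: y exceeds x by at least the JND (assumed numeric reading)."""
    return y - x >= jnd

def similar(x, y, jnd=JND):
    """Plain x ~ y: neither object is noticeably bigger than the other."""
    return abs(x - y) < jnd

def K(x, context):
    """K_A[x]: elements of A that are noticeably smaller than x."""
    return {y for y in context if much_less(y, x)}

def G(x, context):
    """G_A[x]: elements of A that are noticeably greater than x."""
    return {y for y in context if much_less(x, y)}

def small(x, context):
    """small(x)_A: |K_A[x]| < |G_A[x]|."""
    return len(K(x, context)) < len(G(x, context))

x, y, z = 0, 6, 12        # hypothetical values: x ~ y ~ z and x << z
A = {x, y, z}

# Antecedent of the plain premiss holds, its consequent fails.
print(small(x, A), similar(x, y), small(y, A | {x}))
```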
How about the sophisticated version? Is it valid?
╞ S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}
Second claim
Sophisticated version is valid: ╞ S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}
Proof: suppose S(a)_A & (a ~ b)_A. Now suppose ¬S(b)_{A ∪ {a}}. This is equivalent to ¬S(b)_A. We therefore have S(a)_A and ¬S(b)_A, so either ∃x∈A (x << b & ¬ x << a) or ∃x∈A (x >> a & ¬ x >> b). But then x is a help element for distinguishing a and b, contradicting (a ~ b)_A.
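As a sanity check on the two claims, one can search a small finite domain exhaustively. The sketch below is not a proof: it assumes a numeric reading of << with a hypothetical JND of 3 over the integers 0 to 7, and the function `counterexamples` is an illustrative helper that tries every context A and every pair (a, b).

```python
from itertools import combinations

JND = 3  # assumed just-noticeable difference on a small integer scale

def much_less(x, y):
    """x << y: y exceeds x by at least the JND (assumed numeric reading)."""
    return y - x >= JND

def much_less_in(x, y, A):
    """(x << y)_A, allowing help elements from A."""
    return much_less(x, y) or any(
        (much_less(h, y) and not much_less(h, x))
        or (much_less(x, h) and not much_less(y, h))
        for h in A
    )

def similar_in(x, y, A):
    """(x ~ y)_A; with A empty this is the plain relation x ~ y."""
    return not much_less_in(x, y, A) and not much_less_in(y, x, A)

def small(x, A):
    """S(x)_A: |K_A[x]| < |G_A[x]|."""
    below = sum(1 for y in A if much_less(y, x))
    above = sum(1 for y in A if much_less(x, y))
    return below < above

def counterexamples(domain, sophisticated):
    """Collect every (A, a, b) that falsifies the chosen reading of the premiss."""
    found = []
    for size in range(len(domain) + 1):
        for A in map(frozenset, combinations(domain, size)):
            for a in domain:
                for b in domain:
                    judged_in = A if sophisticated else frozenset()
                    if (small(a, A) and similar_in(a, b, judged_in)
                            and not small(b, A | {a})):
                        found.append((A, a, b))
    return found

domain = range(8)
print(len(counterexamples(domain, sophisticated=True)))       # none found
print(len(counterexamples(domain, sophisticated=False)) > 0)  # plain reading fails
```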
So, the premiss is ambiguous between a plain version that is invalid, and a sophisticated version that is valid • Since there exists a valid version of the premiss, does this mean that the paradox follows? • No, for the sophisticated version is not strong enough to support the (paradoxical) conclusion
Two further claims
1. Plain version supports sorites.
2. Sophisticated version does not support sorites.
1. Plain version supports sorites. Suppose S(x)_A & x ~ y → S(y)_{A ∪ {x}}. Then sorites goes: S(0)_A, so S(1)_{A ∪ {0}}, so S(2)_{A ∪ {0,1}}, so S(3)_{A ∪ {0,1,2}}, etc.
2. Sophisticated version does not support sorites. Suppose we use S(x)_A & (x ~ y)_A → S(y)_{A ∪ {x}}. Then the sorites chain breaks off:
S(0)_A, so [since (0 ~ 1)_A]
S(1)_{A ∪ {0}}, so [since (1 ~ 2)_{A ∪ {0}}]
S(2)_{A ∪ {0,1}}, so [since (2 ~ 3)_{A ∪ {0,1}}]
S(3)_{A ∪ {0,1,2}}, so [since (3 ~ 4)_{A ∪ {0,1,2}}], etc.
The context for judging (i ~ i+1) grows, creating more and more help elements. Beyond some point, no suitable y can be found.
Proof
Suppose ∀i: (0 ~ i)_{A ∪ {0, 1, 2, …, i-1}}. Then certainly ∀i: (0 ~ i). If that is the case, all objects in the domain are indistinguishable from each other.
Illustration
• Suppose the argument concerns the heights of people. (S = short)
• Suppose A = {I} = {speaker}
• Let size(x_{i+1}) - size(x_i) = 1 mm
• Let JND = 10 mm
Three types of situations can arise:
1. We reach x10, which is the first element to be distinguishable from x0 (I = x400)
   x0 ~ x9, but x0 << x10 << I
2. We reach x10, which is the first element to be indistinguishable from the speaker (I = x19)
   x9 << I, but x1 ~ x10 ~ I
3. Upon reaching x10, both changes coincide (I = x19)
   x0 ~ x9 << I, but x0 << x10 ~ I
In all three situations, two things happen at the same time: • While x9 was indistinguishable from its predecessor, x10 was not • While x9 was short with respect to its context, x10 was not
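The three situations can be simulated. The sketch below makes several assumptions: heights are integers in mm with x_i at height i, the speaker is at height 400 or 19 (matching I = x400 and I = x19), x << y means a gap of at least one JND, and (x_{i-1} ~ x_i) is judged in the context built up before x_{i-1} is added, as in the sophisticated sorites chain. The helper `first_breaks` is hypothetical.

```python
JND = 10  # mm, as in the illustration

def much_less(x, y):
    """x << y: y is at least one JND taller than x (assumed numeric reading)."""
    return y - x >= JND

def much_less_in(x, y, A):
    """(x << y)_A, allowing help elements from A."""
    return much_less(x, y) or any(
        (much_less(h, y) and not much_less(h, x))
        or (much_less(x, h) and not much_less(y, h))
        for h in A
    )

def similar_in(x, y, A):
    """(x ~ y)_A."""
    return not much_less_in(x, y, A) and not much_less_in(y, x, A)

def short(x, A):
    """S(x)_A: fewer elements noticeably below x than noticeably above it."""
    below = sum(1 for y in A if much_less(y, x))
    above = sum(1 for y in A if much_less(x, y))
    return below < above

def first_breaks(start, speaker, limit=50):
    """Walk x_start, x_start+1, ... and return (i, j), where x_i is the first
    element distinguishable from its predecessor in the context built so far
    and x_j is the first element that is not short in its context."""
    context = {speaker}                       # A = {I}
    first_distinct = first_not_short = None
    prev = None
    for i in range(start, limit):
        if prev is not None:
            if first_distinct is None and not similar_in(prev, i, context):
                first_distinct = i
            context = context | {prev}        # the predecessor joins the context
        if first_not_short is None and not short(i, context):
            first_not_short = i
        prev = i
    return first_distinct, first_not_short

for label, start, speaker in [("1", 0, 400), ("2", 1, 19), ("3", 0, 19)]:
    print(label, first_breaks(start, speaker))
```

Under these assumptions both changes happen at x10 in each of the three situations.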
DRT-ish perspective on sorites:
• I
• x0: S(x0)_{{I}}
• x1: S(x1)_{{I} ∪ {x0}}
• x2: S(x2)_{{I} ∪ {x0, x1}}
• …
Properties of this solution
• In terms of Hyde’s classification: this approach depicts sorites as a fallacious argument (Fallacy of Ambiguity)
• The pattern is as follows:
  (a) Some readings of sorites have premisses all of which are true
  (b) Some readings of sorites are valid
  There are no readings for which both (a) and (b) are true
• See paper in syllabus (section 4)
Properties of this solution
• Proposal is logically conservative: 2-valued standard semantics for connectives
• Some empirical support: by working with formulas like Short(x)_A, the context-dependence of `Short’ has been taken into account, in the spirit of context-change semantics
Approach to meaning of gradable adjectives • One could argue that the vagueness of gradable adjectives was denied • Fuzzy logic really models vagueness • n-valued logic models it to some approximation • Kamp 1981 and Veltman 1987 are similar to our proposal in this respect • Maybe vagueness cannot be modelled classically?
Approach to vagueness • EOI is done justice: “Suppose the objects a and b are observationally indistinguishable in the respects relevant to P; then either a and b both satisfy P or else neither of them does.” but observationally indistinguishable has been contextualised, using Goodman/Dummett’s trick
Some nagging doubts
• Some recent proposals bank on the Goodman/Dummett notion of a help element
• Read Dummett 1975: `Wang’s Paradox’. Synthese 30
• Strange predictions. Imagine a clock whose hands can be manually adjusted. Following Goodman, any difference between two hands can be made `distinguishable’ by positioning the third hand.
• If experiments were done, one would find that indistinguishability was not always judged in the same way. (JNDs were defined with this in mind!)
• How do these considerations affect our proposals?
What can we learn from these proposals?
• A computational viewpoint: what should an NLP system understand about vagueness? What pitfalls should be avoided and how?
• Where the system employs vague relations like `similar’, `equivalent’, it should be aware of their non-transitivity (unlike real equivalence, which is transitive).
• The system should be able to reason with vagueness, at least in the style of Goyal & Shoham.
• It should not implement EOI, since this leads to paradox: fixed standards would be preferable!
• When interpreting vague statements, the system should understand that speakers may be using different standards, and that these standards are dependent on context, which tends to lead to ambiguities
• When producing vague statements, it should be aware of the same unclarities, and preferably mirror the way people (e.g., the system’s user) speak.
• Sometimes it may be better to use numbers. In this case, the system has to understand the notion of an approximation. This implies understanding
  • that a measured difference can be arbitrarily small
  • that no difference may be measured even though one exists
  • that `nice’ numbers have a vague reading
• More about these issues in the next lecture!
Semantics of vague adjectives • Speakers can vary their standards `by fiat’ (Kennedy 1999, van Deemter 2000) m1:2cm, m2: 5cm, m3: 7cm, m4: 10cm • `The large mouse’ = m4 • `The two large mice’ = {m3,m4} • Kyburg and Morreau (2000): `Just as a home handyman can fit an adjustable wrench to a nut, we think, a speaker can adjust the extension of a vague expression to suit his needs’
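The mouse example can be mimicked by letting the number in the description fix the standard. This is a deliberately simple sketch, not the analysis of Kennedy 1999 or van Deemter 2000: the function `the_n_large` is a hypothetical helper that returns the n largest individuals and gives up when a tie makes it impossible to separate exactly n of them.

```python
mice = {"m1": 2, "m2": 5, "m3": 7, "m4": 10}   # sizes in cm, from the slide

def the_n_large(sizes, n):
    """Referent of `the n large mice' (n >= 1): the n largest individuals.

    Returns None when the n-th and (n+1)-th largest are equal in size,
    since then no standard picks out exactly n individuals.
    """
    ranked = sorted(sizes, key=sizes.get, reverse=True)
    if n < len(ranked) and sizes[ranked[n - 1]] == sizes[ranked[n]]:
        return None
    return set(ranked[:n])

print(the_n_large(mice, 1))   # `the large mouse'
print(the_n_large(mice, 2))   # `the two large mice'
```

The same word `large’ thus gets a different cut-off in each description, which is the sense in which the standard is set by fiat.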
How does this wrench work? A stab • at formalisation by Kyburg and Morreau • at implementation by DeVault & Stone (NLU) and Van Deemter (NLG) • Here: Kyburg and Morreau • Their example: pigs in a pen. Language: • fat-pig(x) • at-least-as-fat-as(x,y) • fatter(x,y) • individual constants: Arnold, Babe • D(p) (“p is Definitely the case”)
One way of thinking about D: D records the things that the wrench cannot change
• D is similar to using three truth values: if Arnold is borderline-fat then ¬D(fat-pig(Arnold)) & ¬D¬(fat-pig(Arnold))
• Abbreviation: I(fat-pig(Arnold)) (Arnold’s being fat is Indeterminate)
Sketch of model theory
Model M = <Dom, P, ≤, L>
• Dom = domain
• P = set of evaluation points
• ≤ = tree ordering of P
• @ = `appropriate point’ = bottom of tree
• L = interpretation function
  L+(R,p) = objects having property R at p
  L-(R,p) = objects lacking property R at p
Constraint on interpretation L:
• If p ≤ q then L+(R,p) ⊆ L+(R,q) and L-(R,p) ⊆ L-(R,q) (moving up the tree, more predications get resolved)
• A point p is complete iff, for all R, L+(R,p) ∪ L-(R,p) = Dom
Truth at point p is roughly supervaluational:
• [R(a)]_{M,p} = true/false iff for all complete q such that p ≤ q, [a] ∈ L+(R,q) / L-(R,q)
• [D(φ)]_{M,p} = true iff [φ]_{M,@} = true (NB This makes [D(φ)] independent of p)
• φ is true/false/undefined in the model M iff φ is true/false/undefined at @
Consequence: (at complete and incomplete points) it may be that [R(a)]_{M,p} = true while [D(R(a))]_{M,p} = false
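A toy model makes the last consequence concrete. Everything specific in the sketch below is an assumption made for illustration: the three-point tree (root '@' with two complete refinements 'p1' and 'p2'), the single predicate fat-pig, and the dictionary encoding of L+ and L-. It also assumes that every point has at least one complete point at or above it.

```python
DOM = {"Arnold", "Babe"}

# Hypothetical tree of evaluation points: '@' is the bottom of the tree.
CHILDREN = {"@": ["p1", "p2"], "p1": [], "p2": []}

# L_PLUS[p][R] / L_MINUS[p][R]: objects having / lacking R at point p.
# Both only grow when moving up the tree, as the constraint on L requires.
L_PLUS = {
    "@":  {"fat-pig": set()},
    "p1": {"fat-pig": {"Arnold"}},
    "p2": {"fat-pig": set()},
}
L_MINUS = {
    "@":  {"fat-pig": {"Babe"}},
    "p1": {"fat-pig": {"Babe"}},
    "p2": {"fat-pig": {"Arnold", "Babe"}},
}

def at_or_above(p):
    """All points q with p ≤ q: p itself and everything further up the tree."""
    yield p
    for child in CHILDREN[p]:
        yield from at_or_above(child)

def is_complete(p):
    """p is complete iff every predicate is resolved for every object."""
    return all(L_PLUS[p][r] | L_MINUS[p][r] == DOM for r in L_PLUS[p])

def value(r, a, p):
    """Supervaluational value of R(a) at p: True, False, or None (undefined)."""
    complete = [q for q in at_or_above(p) if is_complete(q)]
    if all(a in L_PLUS[q][r] for q in complete):
        return True
    if all(a in L_MINUS[q][r] for q in complete):
        return False
    return None

def definitely(r, a):
    """D(R(a)) is true iff R(a) is true at '@', whatever point we are at."""
    return value(r, a, "@") is True

print(value("fat-pig", "Arnold", "@"))    # undefined at the bottom of the tree
print(value("fat-pig", "Arnold", "p1"))   # true at p1
print(definitely("fat-pig", "Arnold"))    # but not definitely true
print(value("fat-pig", "Babe", "@"))      # resolved the same way everywhere
```

At p1 the atom fat-pig(Arnold) is true while D(fat-pig(Arnold)) is false, which is the situation the slide describes.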
Example 1
• Common ground: I(fat-pig(Arnold)), D(skinny-pig(Babe)), ∀x (pig(x) → x = Arnold ∨ x = Babe), fatter(Arnold,Babe)
• Utterance: s = `the fat pig won the prize’
• Entailed: fat-pig(Arnold), ¬fat-pig(Babe), so `the fat pig’ = Arnold
Reasoning:
• s implies ∃!x (fat-pig(x))
• Common ground says D(skinny-pig(Babe)), so ¬fat-pig(Babe)
• Since Arnold is the only other pig, it follows that fat-pig(Arnold)
Example 2
• Common ground: I(fat-pig(Arnold)), I(fat-pig(Babe)), ∀x (pig(x) → x = Arnold ∨ x = Babe), fatter(Arnold,Babe)
• Utterance: s = `the fat pig won the prize’
• Entailed: fat-pig(Arnold), ¬fat-pig(Babe), so `the fat pig’ = Arnold
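Examples 1 and 2 can be replayed by enumerating the complete ways of resolving fat-pig. The sketch below rests on two assumptions that the slides do not state as part of the formalism: that being definitely skinny rules out being fat, and that a monotonicity law holds (if y is fat and x is fatter than y, then x is fat). The helper names are hypothetical.

```python
from itertools import product

PIGS = ["Arnold", "Babe"]
FATTER = {("Arnold", "Babe")}          # fatter(Arnold, Babe), from the common ground

def admissible(fat, definitely_not_fat=()):
    """Does the complete valuation `fat` (dict pig -> bool) respect the common ground?"""
    if any(fat[p] for p in definitely_not_fat):
        return False
    # Assumed monotonicity law: if y is fat and x is fatter than y, x is fat too.
    return all(fat[x] or not fat[y] for x, y in FATTER)

def resolve_the_fat_pig(definitely_not_fat=()):
    """Possible referents of `the fat pig': the fat pig in each admissible
    valuation that makes exactly one pig fat (the uniqueness implied by s)."""
    referents = set()
    for bits in product([True, False], repeat=len(PIGS)):
        fat = dict(zip(PIGS, bits))
        if admissible(fat, definitely_not_fat) and sum(bits) == 1:
            referents.add(next(p for p in PIGS if fat[p]))
    return referents

print(resolve_the_fat_pig(definitely_not_fat=("Babe",)))   # Example 1
print(resolve_the_fat_pig())                                # Example 2
```

In both cases only one valuation survives, so the description resolves to Arnold; in Example 2 it is the fatter relation, not any definite fact about Babe, that does the work.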
Example 3: revising the common ground
• Common ground: fat-pig(Arnold), fat-pig(Babe), ∀x (pig(x) → x = Arnold ∨ x = Babe), fatter(Arnold,Babe)
• Utterance: s = `the pig that is not fat won the prize’
• Entailed: fat-pig(Arnold), ¬fat-pig(Babe), so `the pig that is not fat’ = Babe
Kyburg & Morreau (summing up)
• Intuitions about accommodation of word meaning seem spot on. Some aspects of `dynamics’ are captured
• Formal apparatus: some questions, e.g., how does the tree arise? E.g., does it incorporate laws saying large-pig(x) & larger(y,x) → large-pig(y)?
• Maybe the wrench is too adaptable.
  • After calling {Arnold,Babe} `the large pigs’, can we call {Arnold} `the large pig’?
  • Can statements like fatter(Arnold,Babe) also be revised?
• Might this be embedded into a general theory of information update?