Semantic Composition with λ-DRT
Christof Rumpf
Heinrich-Heine-Universität Düsseldorf
http://www.phil-fak.uni-duesseldorf.de/~rumpf/
30.07.2003
the problem(s)
• semantic construction
  • Given a text in a natural language, is there a systematic method for constructing the representation of its meaning from the meanings of its parts (Frege)?
• inference
  • Can we draw inferences with these representations?
• computation
  • Can all this be automated?
the framework
• first-order logic (fol)
• λ-calculus + (context-free) grammar
• DRT: Discourse Representation Theory
This framework proved useful in the machine translation project Verbmobil (1993 - 2000, 166 million DM, 375 publications), which was funded by the BMBF and some industry partners.
the sources
• You will find almost all the material from this presentation (and much more) in:
  • Blackburn, Patrick & Johan Bos (to appear) Representation and Inference for Natural Language. Stanford: CSLI Publications.
• and for free at
  • http://www.comsem.org
levels of semantic representation
• word level
  • lexical semantics: lexicalist λ-abstractions of fol expressions that specify the combinatorial potential at the phrase level
  • representations may be complex
• phrase level
  • semantic composition with functional application
  • results in very simple phrase structure rules
• discourse level
  • λ-DRT: λ-calculus over discourse representation structures (DRSs) for the word, phrase and discourse levels
simplified composition model
  CFG: S → NP VP
       VP → V NP
  S [Peter likes Mary]: like(peter, mary)
    NP [Peter]: peter
    VP [likes Mary]: like(?, mary)
      V [likes]: like(?, ?)
      NP [Mary]: mary
• Is this rule-based? What does the rule look like?
• The process of variable binding should be guided by explicit rules.
λ-calculus
• λ-calculus can be used as a metalanguage over fol, serving as a 'glue language' for fol expressions.
• The λ-operator binds variables over individuals and predicates (second-order logic) by λ-abstraction.
• Variables are associated with arguments via functional application. Access to variables is constrained by the order of the λ-operators.
• Substitution of arguments for variables is performed by an operation called β-conversion.
λ-abstraction & functional application
• λ-abstraction: λx.woman(x)
  • The λ-operator binds the occurrence of x in the one-place predicate woman and marks it as a landing site for an argument.
• functional application: λx.woman(x)@mary
  • The λ-expression is applied to the term mary with the operator @, which denotes functional application. Applications have the shape functor@argument. Argument substitution is done by β-conversion.
β-conversion
• Functional applications are instructions to perform β-conversion:
  λy.λx.likes(x, y)@mary
  β-conversion yields
  λx.likes(x, mary)
• β-conversion replaces the leftmost variable in the sequence of λ-abstractions with the argument attached by functional application. This eliminates the λ-abstraction for that variable.
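The β-conversion step above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class names (Var, Const, Pred, Lam, App) and the helpers substitute, beta and show are hypothetical, and the sketch assumes that variable names are already distinct, so no renaming is attempted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Const:
    name: str


@dataclass(frozen=True)
class Pred:
    name: str
    args: tuple


@dataclass(frozen=True)
class Lam:
    var: str
    body: object


@dataclass(frozen=True)
class App:  # functor@argument
    functor: object
    arg: object


def substitute(term, name, value):
    """Replace free occurrences of variable `name` in `term` by `value`."""
    if isinstance(term, Var):
        return value if term.name == name else term
    if isinstance(term, Const):
        return term
    if isinstance(term, Pred):
        return Pred(term.name, tuple(substitute(a, name, value) for a in term.args))
    if isinstance(term, Lam):
        if term.var == name:  # an inner binder with the same name shadows `name`
            return term
        return Lam(term.var, substitute(term.body, name, value))
    if isinstance(term, App):
        return App(substitute(term.functor, name, value),
                   substitute(term.arg, name, value))
    raise TypeError(f"unknown term: {term!r}")


def beta(term):
    """Apply beta-conversion until no functor@argument redex is left."""
    if isinstance(term, App):
        functor, arg = beta(term.functor), beta(term.arg)
        if isinstance(functor, Lam):
            return beta(substitute(functor.body, functor.var, arg))
        return App(functor, arg)
    if isinstance(term, Lam):
        return Lam(term.var, beta(term.body))
    if isinstance(term, Pred):
        return Pred(term.name, tuple(beta(a) for a in term.args))
    return term


def show(term):
    """Render a term in the slide notation."""
    if isinstance(term, (Var, Const)):
        return term.name
    if isinstance(term, Pred):
        return f"{term.name}({', '.join(show(a) for a in term.args)})"
    if isinstance(term, Lam):
        return f"λ{term.var}.{show(term.body)}"
    return f"({show(term.functor)})@{show(term.arg)}"


likes = Lam("y", Lam("x", Pred("likes", (Var("x"), Var("y")))))
example = App(likes, Const("mary"))
print(show(example))        # (λy.λx.likes(x, y))@mary
print(show(beta(example)))  # λx.likes(x, mary)
```

The outermost λ is consumed first, which is why the object slot y is filled before the subject slot x.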
α-conversion
• The variables of the two terms in a functor@argument structure need to be distinct to make β-conversion sound. α-conversion renames bound variables.
• without α-conversion:
  λy.λx.like(x, y)@x ⇒β λx.like(x, x)
• with α-conversion:
  λy.λx.like(x, y)@x ⇒α λy1.λx1.like(x1, y1)@x2 ⇒β λx1.like(x1, x2)
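The variable capture shown above can be reproduced with a small sketch. Everything here is an assumption made for illustration: terms are encoded as plain tuples, and the helpers subst, beta, alpha and show are hypothetical names. Unlike the slide, the sketch renames only bound variables and leaves the free argument x untouched, so the fresh names differ from x1, y1, x2.

```python
import itertools

# Term encoding (assumed for this sketch):
#   ("var", name) | ("lam", name, body) | ("app", functor, arg) | ("pred", name, args)


def subst(t, name, val):
    """Naive substitution: no renaming, so free variables in `val` can be captured."""
    kind = t[0]
    if kind == "var":
        return val if t[1] == name else t
    if kind == "lam":
        return t if t[1] == name else ("lam", t[1], subst(t[2], name, val))
    if kind == "app":
        return ("app", subst(t[1], name, val), subst(t[2], name, val))
    return ("pred", t[1], tuple(subst(a, name, val) for a in t[2]))


def beta(t):
    """Repeated beta-conversion."""
    kind = t[0]
    if kind == "app":
        f, a = beta(t[1]), beta(t[2])
        return beta(subst(f[2], f[1], a)) if f[0] == "lam" else ("app", f, a)
    if kind == "lam":
        return ("lam", t[1], beta(t[2]))
    if kind == "pred":
        return ("pred", t[1], tuple(beta(a) for a in t[2]))
    return t


def alpha(t, fresh, env=None):
    """Give every bound variable a fresh name; free variables are left alone.

    Assumes the generated names do not clash with free variables of the term.
    """
    env = env or {}
    kind = t[0]
    if kind == "var":
        return ("var", env.get(t[1], t[1]))
    if kind == "lam":
        new = f"{t[1]}{next(fresh)}"
        return ("lam", new, alpha(t[2], fresh, {**env, t[1]: new}))
    if kind == "app":
        return ("app", alpha(t[1], fresh, env), alpha(t[2], fresh, env))
    return ("pred", t[1], tuple(alpha(a, fresh, env) for a in t[2]))


def show(t):
    kind = t[0]
    if kind == "var":
        return t[1]
    if kind == "lam":
        return f"λ{t[1]}.{show(t[2])}"
    if kind == "app":
        return f"({show(t[1])})@{show(t[2])}"
    return f"{t[1]}({', '.join(show(a) for a in t[2])})"


like = ("lam", "y", ("lam", "x", ("pred", "like", (("var", "x"), ("var", "y")))))
term = ("app", like, ("var", "x"))

captured = show(beta(term))                           # the free x gets captured
sound = show(beta(alpha(term, itertools.count(1))))   # rename first, then convert
print(captured)  # λx.like(x, x)
print(sound)     # λx2.like(x2, x)
```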
quantifiers
• every: λP.λQ.∀x(P@x → Q@x)
• beetle: λy.beetle(y)
• every beetle: λP.λQ.∀x(P@x → Q@x)@λy.beetle(y)
• by β-conversion:
  1. λQ.∀x(λy.beetle(y)@x → Q@x)
  2. λQ.∀x(beetle(x) → Q@x)
• every beetle hums:
  λQ.∀x(beetle(x) → Q@x)@λy.hum(y)
  ⇒β ∀x(beetle(x) → λy.hum(y)@x)
  ⇒β ∀x(beetle(x) → hum(x))
• exists: λP.λQ.∃x(P@x ∧ Q@x)
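The quantifier entries can be tried out with ordinary Python closures, where a Python function call plays the role of @ together with β-conversion. This is a minimal sketch, not the presentation's implementation: the formulas are built as plain strings, and the bound variable is hard-wired to x, so nesting two quantifiers would require fresh variable names.

```python
def every(P):
    """λP.λQ.∀x(P@x → Q@x), with formulas rendered as strings."""
    return lambda Q: f"∀x({P('x')} → {Q('x')})"


def exists(P):
    """λP.λQ.∃x(P@x ∧ Q@x)"""
    return lambda Q: f"∃x({P('x')} ∧ {Q('x')})"


def beetle(y):
    """λy.beetle(y)"""
    return f"beetle({y})"


def hum(y):
    """λy.hum(y)"""
    return f"hum({y})"


every_beetle = every(beetle)   # corresponds to λQ.∀x(beetle(x) → Q@x)
print(every_beetle(hum))       # ∀x(beetle(x) → hum(x))
print(exists(beetle)(hum))     # ∃x(beetle(x) ∧ hum(x))
```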
proper names
• Quantified noun phrases (in subject position) are functors that take verbs as arguments to build sentences:
  λQ.∀x(beetle(x) → Q@x)@λy.hum(y)
• Proper names are raised to functors to allow for uniform composition:
  (λP.P@mary)@λy.hum(y)
  ⇒β λy.hum(y)@mary
  ⇒β hum(mary)
transitive verbs
• Transitive verbs are represented as functors that take the semantic representation of their object NP as an argument:
  λQ.λx.(Q@λy.like(x, y))@λP.P@mary
  ⇒β λx.(λP.P@mary@λy.like(x, y))
  ⇒β λx.(λy.like(x, y)@mary)
  ⇒β λx.like(x, mary)
semantic construction
• In contrast to the earlier example, this is rule-guided semantic composition: we know in advance where each argument goes.
  S (NP@VP) [Peter likes Mary]: like(peter, mary)
    NP [Peter]: λP.P@peter
    VP (V@NP) [likes Mary]: λz.like(z, mary)
      V [likes]: λQ.λx.Q@λy.like(x, y)
      NP [Mary]: λP.P@mary
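The tree for Peter likes Mary can be replayed bottom-up with Python closures. This is a sketch under assumptions: the helper proper_name and the string rendering of formulas are introduced here for illustration, and Python function application stands in for @ plus β-conversion.

```python
def proper_name(constant):
    """Raised proper name: λP.P@constant"""
    return lambda P: P(constant)


def like(Q):
    """Transitive verb: λQ.λx.Q@λy.like(x, y)"""
    return lambda x: Q(lambda y: f"like({x}, {y})")


peter = proper_name("peter")   # NP
mary = proper_name("mary")     # NP

vp = like(mary)                # VP = V@NP, behaves like λz.like(z, mary)
s = peter(vp)                  # S  = NP@VP

print(vp("z"))  # like(z, mary)
print(s)        # like(peter, mary)
```

Because every node is built by applying one daughter to the other, no node needs to inspect the internal structure of its daughters.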
Prolog DCG
  Syntax
    s(NP@VP) --> np(NP), vp(VP).
    vp(V@NP) --> tv(V), np(NP).
    vp(V) --> iv(V).
    np(PN) --> pn(PN).
    np(Det@N) --> det(Det), n(N).
  Lexicon
    det(λP.λQ.∀X(P@X → Q@X)) --> [every].
    n(λY.beetle(Y)) --> [beetle].
    pn(λP.P@mary) --> [mary].
    iv(λY.hum(Y)) --> [hums].
    tv(λQ.λX.Q@λY.like(X, Y)) --> [likes].
  Query
    ?- s(S1, [every,beetle,likes,mary], []), betaconvert(S1, S2).
    S1 = (λP.λQ.∀X(P@X → Q@X)@λY.beetle(Y))@(λQ.λX.Q@λY.like(X, Y)@λP.P@mary)
    S2 = ∀X(beetle(X) → like(X, mary))
• The λ-expressions in the lexicon have to be converted to linearized Prolog notation.
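For readers without a Prolog system at hand, the same grammar and lexicon can be approximated in Python. This is a rough analogue, not a translation of the DCG: the names LEXICON, parse_np, parse_vp and parse_s are hypothetical, the parser is deterministic (no backtracking), and formulas are rendered as strings with the bound variable fixed to x.

```python
# word -> (category, semantics); the semantics are closures standing in for λ-terms
LEXICON = {
    "every": ("det", lambda P: lambda Q: f"∀x({P('x')} → {Q('x')})"),
    "beetle": ("n", lambda y: f"beetle({y})"),
    "mary": ("pn", lambda P: P("mary")),
    "hums": ("iv", lambda y: f"hum({y})"),
    "likes": ("tv", lambda Q: lambda x: Q(lambda y: f"like({x}, {y})")),
}


def lookup(words):
    """Return (category, semantics) of the first word, or fail."""
    if not words or words[0] not in LEXICON:
        raise ValueError(f"cannot parse: {words}")
    return LEXICON[words[0]]


def parse_np(words):
    """np --> pn | det n; returns (semantics, remaining words)."""
    cat, sem = lookup(words)
    if cat == "pn":
        return sem, words[1:]
    if cat == "det":
        noun_cat, noun_sem = lookup(words[1:])
        if noun_cat == "n":
            return sem(noun_sem), words[2:]   # Det@N
    raise ValueError(f"no NP at: {words}")


def parse_vp(words):
    """vp --> iv | tv np; returns (semantics, remaining words)."""
    cat, sem = lookup(words)
    if cat == "iv":
        return sem, words[1:]
    if cat == "tv":
        obj, rest = parse_np(words[1:])
        return sem(obj), rest                 # V@NP
    raise ValueError(f"no VP at: {words}")


def parse_s(words):
    """s --> np vp; returns the sentence semantics."""
    subj, rest = parse_np(words)
    pred, rest = parse_vp(rest)
    if rest:
        raise ValueError(f"unparsed words: {rest}")
    return subj(pred)                         # NP@VP


print(parse_s("every beetle likes mary".split()))  # ∀x(beetle(x) → like(x, mary))
print(parse_s("mary hums".split()))                # hum(mary)
```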
DRT
• Discourses are sequences of sentences.
• In discourse representation theory (DRT), discourses are represented as discourse representation structures (DRSs), which contain discourse referents and conditions on discourse referents (individuals).
• DRSs provide a language that restricts expressiveness to possible discourse structures.
discourse representation structures
  (a DRS box with the referents x1, ..., xn on top and the conditions γ1, ..., γm below is written linearly here as [x1, ..., xn | γ1, ..., γm])
• If x1, ..., xn (n ≥ 0) are discourse referents and γ1, ..., γm (m ≥ 0) are conditions, then [x1, ..., xn | γ1, ..., γm] is a DRS.
• If R is a relation symbol of arity n and x1, ..., xn are discourse referents, then R(x1, ..., xn) is a condition.
• If t1 and t2 are discourse referents or constants, then t1 = t2 is a condition.
• If K1 and K2 are DRSs, then K1 ⇒ K2 and K1 ∨ K2 are conditions.
• If K is a DRS, then ¬K is a condition.
• Nothing else is a DRS.
indefinite noun phrases
• DRS for "a beetle hums", written as [referents | conditions]:
  [x | beetle(x), hum(x)]
• fol: ∃x(beetle(x) ∧ hum(x))
• The discourse referent x is introduced by the noun phrase.
proper names, negation
• DRS for "mary does not hum", written as [referents | conditions]:
  [x | x=mary, ¬[ | hum(x)]]
• fol: ∃x(¬hum(x) ∧ x=mary)
• The discourse referent x is introduced by the noun phrase.
universal quantifiers
• DRS for "every beetle hums", written as [referents | conditions]:
  [ | [x | beetle(x)] ⇒ [ | hum(x)]]
• fol: ∀x(beetle(x) → hum(x))
semantics of DRSs
• DRSs can be translated to a subset of fol with equality.
• One can give DRSs a model-theoretic semantics in parallel to the equivalent fol expressions (satisfiability).
• DRSs include a notion of accessibility for variables, which introduces deliberate restrictions on possible discourse structures.
• These restrictions constitute DRT as a theory of discourse structures.
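The translation from DRSs to fol can be sketched as follows. The sketch uses the usual scheme in which the referents of a DRS become existential quantifiers, except that the referents of the antecedent of an implication become universal ones. The class names DRS, Neg, Imp and Or, the encoding of atomic conditions as strings, and the use of ⊤ for an empty condition list are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    refs: tuple = ()    # discourse referents
    conds: tuple = ()   # atomic conditions (strings) or Neg / Imp / Or


@dataclass(frozen=True)
class Neg:
    drs: DRS


@dataclass(frozen=True)
class Imp:
    antecedent: DRS
    consequent: DRS


@dataclass(frozen=True)
class Or:
    left: DRS
    right: DRS


def drs_to_fol(k):
    """[x1..xn | c1..cm] becomes ∃x1...∃xn(c1 ∧ ... ∧ cm)."""
    body = " ∧ ".join(cond_to_fol(c) for c in k.conds) or "⊤"
    if k.refs:
        return "".join(f"∃{x}" for x in k.refs) + f"({body})"
    return body if len(k.conds) <= 1 else f"({body})"


def cond_to_fol(c):
    if isinstance(c, str):   # atomic condition such as "beetle(x)" or "x=mary"
        return c
    if isinstance(c, Neg):
        return f"¬{drs_to_fol(c.drs)}"
    if isinstance(c, Or):
        return f"({drs_to_fol(c.left)} ∨ {drs_to_fol(c.right)})"
    if isinstance(c, Imp):
        # referents of the antecedent are universally quantified
        ante = " ∧ ".join(cond_to_fol(x) for x in c.antecedent.conds) or "⊤"
        if len(c.antecedent.conds) > 1:
            ante = f"({ante})"
        prefix = "".join(f"∀{x}" for x in c.antecedent.refs)
        return f"{prefix}({ante} → {drs_to_fol(c.consequent)})"
    raise TypeError(f"unknown condition: {c!r}")


a_beetle_hums = DRS(("x",), ("beetle(x)", "hum(x)"))
mary_does_not_hum = DRS(("x",), ("x=mary", Neg(DRS((), ("hum(x)",)))))
every_beetle_hums = DRS((), (Imp(DRS(("x",), ("beetle(x)",)), DRS((), ("hum(x)",))),))

print(drs_to_fol(a_beetle_hums))      # ∃x(beetle(x) ∧ hum(x))
print(drs_to_fol(mary_does_not_hum))  # ∃x(x=mary ∧ ¬hum(x))
print(drs_to_fol(every_beetle_hums))  # ∀x(beetle(x) → hum(x))
```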
accessibility
• Accessibility of discourse referents between nested DRSs is restricted.
• Discourse referents of DRS K1 are accessible from DRS K2 iff
  • K1 subordinates K2 or
  • K1 = K2
subordination
Let K1 and K2 be DRSs. K1 subordinates K2 iff
• K1 contains a condition ¬K2
• K1 contains a condition K2 ⇒ K, where K is some DRS
• K1 contains a condition K2 ∨ K or K ∨ K2 for some DRS K
• K1 ⇒ K2 is a condition in some DRS K
• Some DRS K subordinates K2, and K1 subordinates K (transitivity of subordination)
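The subordination clauses can be turned into a small traversal that collects, for every DRS in a nested structure, the referents accessible from it. This is a sketch under assumptions: DRSs are encoded as tuples ("drs", referents, conditions), complex conditions as ("not", K), ("imp", K1, K2) and ("or", K1, K2), and the function name accessible and the path-like labels are hypothetical. In the example, the pronoun referent y is placed in the outermost DRS, which is also an assumption.

```python
def accessible(drs, inherited=(), label="top", out=None):
    """Map a label for each (sub-)DRS to the referents accessible from it."""
    out = {} if out is None else out
    _, refs, conds = drs
    here = tuple(inherited) + tuple(refs)   # own referents plus those of subordinating DRSs
    out[label] = here
    for i, cond in enumerate(conds):
        if isinstance(cond, str):           # atomic condition, nothing nested
            continue
        tag = cond[0]
        if tag == "not":
            accessible(cond[1], here, f"{label}.{i}.not", out)
        elif tag == "or":
            accessible(cond[1], here, f"{label}.{i}.left", out)
            accessible(cond[2], here, f"{label}.{i}.right", out)
        elif tag == "imp":
            accessible(cond[1], here, f"{label}.{i}.ante", out)
            # the antecedent subordinates the consequent, so its referents are passed on
            accessible(cond[2], here + tuple(cond[1][1]), f"{label}.{i}.cons", out)
    return out


# "every beetle flies. it hums"
discourse = ("drs", ("y",), [
    ("imp", ("drs", ("x",), ["beetle(x)"]), ("drs", (), ["fly(x)"])),
    "hum(y)",
])

acc = accessible(discourse)
print(acc["top"])          # ('y',)
print(acc["top.0.ante"])   # ('y', 'x')
print(acc["top.0.cons"])   # ('y', 'x')
```

The referent x is visible inside the implication but not at the top level, which is exactly why a pronoun in a following sentence cannot pick it up.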
pronoun resolution
• DRS for "a beetle flies. it hums", written as [referents | conditions]:
  [x, y | beetle(x), fly(x), hum(y), x=y]
• fol: ∃x∃y(beetle(x) ∧ fly(x) ∧ hum(y) ∧ x=y)
• How can we construct this?
  • standard construction algorithm
  • λ-DRT (explained on the following slides)
accessibility conflict
• DRS for "every beetle flies. it hums", written as [referents | conditions]:
  [ | [x | beetle(x)] ⇒ [ | fly(x)], hum(y), y=?]
• The variable x is not accessible for pronoun resolution.
• fol: ∀x(beetle(x) → hum(x))
λ-DRT
• In λ-DRT we define λ-calculus over DRSs. So we have
  • λ-abstraction over DRSs and discourse referents
  • functional application of DRSs and discourse referents
  • β- and α-conversion of those functional applications
• In addition, we use an operation (merge) to build a DRS from two DRSs by the
  • union of the discourse referents
  • union of the conditions
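The merge operation is easy to sketch. The DRS class and the function name merge below are assumptions made for illustration; referents and conditions are kept as ordered tuples, so the union preserves the order of introduction and drops duplicates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    refs: tuple = ()    # discourse referents
    conds: tuple = ()   # conditions


def merge(k1, k2):
    """Union of the referents and union of the conditions of two DRSs."""
    refs = k1.refs + tuple(r for r in k2.refs if r not in k1.refs)
    conds = k1.conds + tuple(c for c in k2.conds if c not in k1.conds)
    return DRS(refs, conds)


a_beetle_flies = DRS(("x",), ("beetle(x)", "fly(x)"))
something_hums = DRS(("y",), ("hum(y)",))

print(merge(a_beetle_flies, something_hums))
# DRS(refs=('x', 'y'), conds=('beetle(x)', 'fly(x)', 'hum(y)'))
```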
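some lexical entries
  (DRS boxes are written as [referents | conditions]; ⊕ stands for the merge operation)
• beetle: λx.[ | beetle(x)]
• hums: λx.[ | hum(x)]
• loves: λP.λx.P@λy.[ | love(x, y)]
• a: λP.λQ.([x | ] ⊕ P@x ⊕ Q@x)
• every: λP.λQ.[ | ([x | ] ⊕ P@x) ⇒ Q@x]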
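example analysis
  (DRS boxes are written as [referents | conditions]; ⊕ stands for the merge operation)
  S (NP@VP) [every beetle hums]: [ | [x | beetle(x)] ⇒ [ | hum(x)]]
    NP (Det@N) [every beetle]: λQ.[ | [x | beetle(x)] ⇒ Q@x]
      Det [every]: λP.λQ.[ | ([x | ] ⊕ P@x) ⇒ Q@x]
      N [beetle]: λx.[ | beetle(x)]
    VP [hums]: λx.[ | hum(x)]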
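The analysis of every beetle hums can be replayed with Python closures over DRS objects. This is a sketch, not the presentation's implementation: the classes DRS and Imp, the function merge, and the fixed referent name x are assumptions, and Python function application stands in for @ plus β-conversion.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    refs: tuple = ()
    conds: tuple = ()


@dataclass(frozen=True)
class Imp:              # the condition K1 ⇒ K2
    antecedent: DRS
    consequent: DRS


def merge(k1, k2):
    """Union of referents and union of conditions."""
    refs = k1.refs + tuple(r for r in k2.refs if r not in k1.refs)
    conds = k1.conds + tuple(c for c in k2.conds if c not in k1.conds)
    return DRS(refs, conds)


def beetle(x):
    """λx.[ | beetle(x)]"""
    return DRS((), (f"beetle({x})",))


def hums(x):
    """λx.[ | hum(x)]"""
    return DRS((), (f"hum({x})",))


def a(P):
    """λP.λQ.([x | ] merged with P@x and Q@x)"""
    return lambda Q: merge(merge(DRS(("x",)), P("x")), Q("x"))


def every(P):
    """λP.λQ.[ | ([x | ] merged with P@x) ⇒ Q@x]"""
    return lambda Q: DRS((), (Imp(merge(DRS(("x",)), P("x")), Q("x")),))


every_beetle = every(beetle)          # NP = Det@N
sentence = every_beetle(hums)         # S  = NP@VP
print(sentence)
print(a(beetle)(hums))                # DRS(refs=('x',), conds=('beetle(x)', 'hum(x)'))
```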
pronouns
• Pronouns introduce variables that need special anaphoric bindings.
• One has to find an accessible and appropriate (antecedent) discourse referent.
• Appropriateness can be checked with additional constraints such as gender agreement and the type of the pronoun (reflexivity: it vs. itself).
• We introduce a special notation, α-DRSs, where α stands for anaphoric.
• It is not functional application but our merge operation that has to be extended to cope with α-DRSs.
• Analysis of "a beetle flies. it hums" (DRSs written as [referents | conditions], ⊕ for merge, α marking the anaphoric DRS):
  [x | beetle(x), fly(x)] ⊕ α[y | hum(y), y=?] = [x, y | beetle(x), fly(x), hum(y), x=y]
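An extended merge for anaphoric DRSs might look like the sketch below. It is a deliberately naive illustration: the function name merge_anaphoric is hypothetical, only the top-level referents of the context are treated as accessible, and the antecedent is simply the most recently introduced referent. Appropriateness checks such as gender agreement or reflexivity are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DRS:
    refs: tuple = ()
    conds: tuple = ()


def merge_anaphoric(context, anaphor, pronoun):
    """Merge the anaphoric DRS `anaphor` into `context` and bind `pronoun`.

    The open condition "<pronoun>=?" is replaced by an equation with an antecedent.
    Assumption: the antecedent is the last referent introduced at the top level
    of `context`; referents nested inside conditions are not accessible.
    """
    if not context.refs:
        raise ValueError("no accessible antecedent")
    antecedent = context.refs[-1]
    open_cond = f"{pronoun}=?"
    resolved = tuple(
        f"{antecedent}={pronoun}" if c == open_cond else c
        for c in anaphor.conds
    )
    return DRS(context.refs + anaphor.refs, context.conds + resolved)


a_beetle_flies = DRS(("x",), ("beetle(x)", "fly(x)"))
it_hums = DRS(("y",), ("hum(y)", "y=?"))

print(merge_anaphoric(a_beetle_flies, it_hums, "y"))
# DRS(refs=('x', 'y'), conds=('beetle(x)', 'fly(x)', 'hum(y)', 'x=y'))
```

With a context such as the DRS for every beetle flies, whose top level introduces no referent, the function raises ValueError, which mirrors the accessibility conflict.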
coreference resolution
• Anaphoric binding of pronouns is just one instance of the coreference resolution problem.
• There are (many?) other instances of this problem, for example synonyms and alternative descriptions: Abendstern (evening star), Morgenstern (morning star); current Chancellor of Germany, Gerhard Schröder, Herr Schröder etc.
• Current information technology systems still perform poorly on coreference resolution: < 58% f-measure (the harmonic mean of precision and recall, i.e. of correctness and completeness).
• There seems to be a lot of work to do here.
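For reference, the balanced f-measure is the harmonic mean of precision and recall. A minimal sketch follows; the function name f_measure and the sample values are purely illustrative and are not evaluation results from any system.

```python
def f_measure(precision, recall):
    """Balanced f-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# illustrative values only
print(f_measure(0.5, 0.5))   # 0.5
print(f_measure(1.0, 0.0))   # 0.0
```

The harmonic mean is dominated by the lower of the two values, so a system cannot reach a high f-measure by maximizing only precision or only recall.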
some extensions of λ-DRT
• focussing
  • Anaphoric binding in large texts is not unlimited. What are the appropriate distance limits for bindings in a certain context?
• quantifier scope ambiguities
  • The most recent solutions are based on underspecification, whereas earlier approaches used storage techniques.
• presupposition resolution
  • What are the pieces of information that can be taken for granted in a context?
conclusion
• Semantic composition is interesting not only for philosophers, linguists and other friends of the science of language and/or mind.
• It is also a crucial problem for current well-funded work on information technology.
• How do/can/would we exploit this modern 'gold rush'?