Shake-and-Bake MT
Chris Brew, The Ohio State University
http://www.purl.org/NET/cbrew.htm
Ed Hovy’s vision
• Automatic summarization / document understanding could be interesting.
• The system should be able to distill information down to a few salient points.
• Users are no longer overwhelmed by floods of irrelevant information.
• NLP people would enjoy picking out the salient points and generating language that expresses these points.
Ed Hovy’s plaint
• To know which are the key points takes world knowledge.
• Summarization systems have no world knowledge and little ability to represent key points in robust ways.
• What they can do is identify sentences that seem to be important.
• They select a series of sentences, then smooth them back together.
• NLP people still seem to like this, but perhaps they should not.
Generating from logical form
John threw a large red ball
∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)
John threw a red ball that is large
∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)
John threw a large ball that is red
∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)
Generating from logical form
Shieber, CL 20(1), 1993
The three terms are equivalent according to first-order logic, but we might not want them to be equivalent for purposes of generation.
Generation systems can be, and often are, broken down into a strategic component and a tactical component.
We might want the tactical component to be tied to a particular grammar, but the strategic component should know nothing of grammar.
∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)
∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)
∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)
Generating from logical form
We might want the generator to be tied to a particular grammar, but the strategic component should know nothing of grammar.
Thus, we may wish that the strategic component have the freedom to pass to the tactical component any LF that has the right meaning.
∃x. throw(j,x) ∧ large(x) ∧ red(x) ∧ ball(x)
∃x. throw(j,x) ∧ red(x) ∧ ball(x) ∧ large(x)
∃x. throw(j,x) ∧ large(x) ∧ ball(x) ∧ red(x)
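The gap between logical equivalence and grammatical canonicality can be made concrete with a minimal sketch. It assumes a logical form is just an ordered tuple of conjunct strings under a shared existential quantifier. The names `logically_equivalent` and `same_canonical_form` are illustrative and do not come from Shieber or the slides, and comparing sets of conjuncts only captures reordering, not full first-order equivalence.

```python
# Each logical form is an ordered tuple of conjuncts under a shared "exists x".
LF_LARGE_RED_BALL = ("throw(j,x)", "large(x)", "red(x)", "ball(x)")
LF_RED_BALL_LARGE = ("throw(j,x)", "red(x)", "ball(x)", "large(x)")
LF_LARGE_BALL_RED = ("throw(j,x)", "large(x)", "ball(x)", "red(x)")


def logically_equivalent(lf_a, lf_b):
    """Compare conjunctions up to reordering (conjunction is commutative)."""
    return frozenset(lf_a) == frozenset(lf_b)


def same_canonical_form(lf_a, lf_b):
    """Compare the exact conjunct order, as an order-sensitive grammar might."""
    return tuple(lf_a) == tuple(lf_b)


if __name__ == "__main__":
    print(logically_equivalent(LF_LARGE_RED_BALL, LF_RED_BALL_LARGE))
    print(same_canonical_form(LF_LARGE_RED_BALL, LF_RED_BALL_LARGE))
```

A strategic component that only guarantees the first kind of equality may hand over any of the three tuples, while a grammar-bound generator may accept only the one ordering it would itself assign.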
Canonical logical forms
• The grammar assigns a logical form to each string.
• In general, the one that it assigns is only one of the many that could express that meaning.
• We call this the canonical logical form.
• Shieber argues that the generator should be able to generate from non-canonical logical forms.
• Reason 1: Canonicality is a fact about the grammar, not the meanings.
• Reason 2: The reasoner would otherwise have to know details of how the generator wants to receive logical forms.
The problem
• Regrettably, there are no effective procedures for mapping non-canonical logical forms to canonical ones.
• One might, for example, use a weaker logic (e.g. propositional logic) that offers normal forms.
• But canonicality (the grammar’s view of equivalence) and normalisation (the logic’s view of equivalence) do not necessarily coincide.
• For example, all those sentences about large red balls…
Machine translation
• Machine translation is a hard engineering problem.
• All-pairs translation between n languages seems to require O(n²) separate systems.
• If n = 16, n² = 256, which is more than the European Community can afford.
[Diagram: the languages KR, JP, EN, FR, DE]
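The arithmetic behind the all-pairs estimate is easy to check. The sketch below reproduces the slide's n² figure. The second function, which counts only ordered pairs of distinct languages, is an added assumption about what "all pairs" would mean in practice. Both grow quadratically.

```python
def systems_per_slide_estimate(n):
    """The slide's estimate: one system per (source, target) combination."""
    return n * n


def ordered_distinct_pairs(n):
    """Assumption: no system is needed to translate a language into itself."""
    return n * (n - 1)


if __name__ == "__main__":
    for n in (2, 5, 16):
        print(n, systems_per_slide_estimate(n), ordered_distinct_pairs(n))
```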
Equivalence of Logical Forms
TLE
• Translationally equivalent expressions
Modularity
• We need:
  • A parser for each source language, delivering some kind of logical form.
  • A transfer mechanism for each language pair, converting source language logical forms into target language logical forms.
  • A generator for each target language, converting logical forms back into strings of target language words.
Modularity ?!
• Gotcha!
• To keep the transfer mechanism simple, it is tempting to mess with the parser, but what is convenient for one language pair will be inconvenient for the others.
• Cleverness in the parser will provide opportunities for complexity in the transfer mechanism.
• The generator is likely to offer the same deadly opportunities for extra cleverness as the parser.
Interlingua
• This sounds like an argument for an interlingua-based approach.
• Each language is responsible for mapping into and out of a powerful logical language.
• Adding a new language is just a matter of adding mapping and unmapping components for the interlingua.
• But this fails because:
  • It is really hard to ensure that each language maps to identical logical forms for things that are intertranslatable.
  • Determining equivalence of logical forms without help is undecidable for any logical form language with adequate logical power. The only place you’re going to get help is from heuristics about the mapping between language pairs.
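The appeal of the interlingua is mostly a matter of counting components. The sketch below compares the two architectures under one stated assumption: a transfer system needs a separate module for each ordered pair of distinct languages. The function names are illustrative only.

```python
def transfer_architecture(n):
    """Parser and generator per language, plus one transfer module per ordered pair."""
    return {"parsers": n, "generators": n, "transfer_modules": n * (n - 1)}


def interlingua_architecture(n):
    """One mapping and one unmapping component per language."""
    return {"mappers": n, "unmappers": n}


if __name__ == "__main__":
    for n in (5, 16):
        print(n,
              sum(transfer_architecture(n).values()),
              sum(interlingua_architecture(n).values()))
```

The linear count holds only if every mapper really does deliver identical logical forms for intertranslatable inputs, which is exactly the condition the slide says is hard to guarantee.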
Lexicalism
• The Shake-and-Bake solution is to keep excess cleverness in check by adopting strong constraints on the architecture of the system.
• Fortunately, HPSG and similar formalisms adopt lexicalism:
  • The only meaningful elements of a grammar are its lexical items.
  • Signs are combined by rules that introduce no independent meaning, but simply equate variables in the logical forms of the combining signs.
  • The derivable logical forms of the grammar are constructed entirely from templates introduced by lexical items.
The Shake-and-Bake idea
• The transfer mechanism simply states equivalences between multisets of lexical items:
  {pay, attention, to} ≡ {faire, attention, à}
  {take, a, walk} ≡ {faire, une, promenade}
  {as, as} ≡ {aussi, que} (as fast as possible ≡ aussi vite que possible)
• The representation of a sentence is a bag of extensions of lexical items, called its base.
• Two sentences are translation equivalent if their bases are equivalent bags and they obey the same constraints on the relevant logical form variables.
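A minimal sketch of a base as a bag follows. It assumes a sign can be reduced to a pair of a word and a tuple of logical-form index variables. That encoding is hypothetical and far simpler than a real HPSG sign, but it shows why the word bag alone is not enough and why the variable constraints must travel with the lexical items.

```python
from collections import Counter


def base(signs):
    """A base is a bag (multiset) of lexical signs; word order is discarded."""
    return Counter(signs)


def words_only(bag):
    """Project a base onto its words, dropping the index variables."""
    words = Counter()
    for (word, _indices), count in bag.items():
        words[word] += count
    return words


# Hypothetical sign encoding: (word, tuple of logical-form index variables).
JOHN_LOVES_MARY = base([("john", ("j",)), ("loves", ("e", "j", "m")), ("mary", ("m",))])
MARY_LOVES_JOHN = base([("mary", ("m",)), ("loves", ("e", "m", "j")), ("john", ("j",))])
SHUFFLED = base([("mary", ("m",)), ("john", ("j",)), ("loves", ("e", "j", "m"))])

if __name__ == "__main__":
    print(words_only(JOHN_LOVES_MARY) == words_only(MARY_LOVES_JOHN))
    print(JOHN_LOVES_MARY == MARY_LOVES_JOHN)
    print(JOHN_LOVES_MARY == SHUFFLED)
```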
The Shake-and-Bake idea
• The second condition is needed to keep “John loves Mary” distinct from “Mary loves John”, even though the two may be correlated. Condition on the grammar: you must be able to find the semantics in SourceSign and Skolemize the variables.
• Bag equivalence: find a set of equivalence statements that use each extended sign in the source bag once. The resulting target bag is the input to generation.
• Both equivalencing and generation are non-deterministic. The first try at a target language sentence representation may not be one that can be sewn together into a target language sentence.
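The bag-equivalence step can be sketched as an exact-cover search over equivalence statements. The rule table below reuses the slide's word-level examples and adds two single-word entries (`fast`, `possible`) purely as assumptions for illustration. Real equivalences relate extended signs with shared variables, which this sketch omits.

```python
from collections import Counter

# Hypothetical equivalence statements: (source bag, target bag).
RULES = [
    (Counter(["pay", "attention", "to"]), Counter(["faire", "attention", "à"])),
    (Counter(["take", "a", "walk"]), Counter(["faire", "une", "promenade"])),
    (Counter(["as", "as"]), Counter(["aussi", "que"])),
    (Counter(["fast"]), Counter(["vite"])),          # assumed entry
    (Counter(["possible"]), Counter(["possible"])),  # assumed entry
]


def target_bags(source, rules=RULES):
    """Yield every target bag obtained by using each source item exactly once."""
    source = Counter(source)
    if not source:
        yield Counter()
        return
    anchor = min(source)  # always consume this item next, so each cover is found once
    for lhs, rhs in rules:
        if anchor in lhs and all(source[w] >= n for w, n in lhs.items()):
            for rest in target_bags(source - lhs, rules):
                yield rhs + rest


if __name__ == "__main__":
    for bag in target_bags(["as", "fast", "as", "possible"]):
        print(sorted(bag.elements()))
```

Because the function is a generator, a caller can ask for the next target bag when generation fails on the first one, which mirrors the non-determinism the slide describes.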
Advantages of Shake-and-Bake
• The ordering of items from the target language bag is entirely a matter for the target language grammar.
• No constraints on grammar formalism, provided the semantic forms on each side are mappable. Nothing stops the English grammar writer from using HPSG, the French one TAG, the Japanese one JPSG and the German one YAHCDF.
• Head-switching and argument switching just work:
  Jan zwemt graag ≡ John enjoys swimming
  John likes Mary ≡ Marie plaît à Jean
• See Whitelock 92 for the details.
Disadvantages
• New algorithm development is needed to handle bag generation.
The input to the task consists of the following elements:
• A set (B) of lexical signs having cardinality |B|.
• A grammar (G) against which to parse this input string.
A solution to the problem consists of:
• A parse of any sequence (S) such that S contains all the elements of B.
We are interested in understanding how hard this is.
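The task definition above suggests an obvious baseline: enumerate orderings of B and parse each one. The sketch below does exactly that, as a generate-and-test illustration rather than anyone's proposed algorithm. The recogniser `toy_parses` is a hypothetical stand-in for parsing against G. In the worst case the loop examines |B|! orderings.

```python
from itertools import permutations


def bag_generate(bag, parses):
    """Return every distinct ordering of `bag` that the recogniser accepts."""
    seen = set()
    results = []
    for candidate in permutations(bag):
        if candidate not in seen and parses(candidate):
            results.append(candidate)
        seen.add(candidate)
    return results


# Hypothetical toy recogniser: a sentence is NAME VERB NAME.
NAMES = {"john", "mary"}
VERBS = {"loves"}


def toy_parses(words):
    return (len(words) == 3 and words[0] in NAMES
            and words[1] in VERBS and words[2] in NAMES)


if __name__ == "__main__":
    print(bag_generate(["mary", "loves", "john"], toy_parses))
```

Both orders come back here because the toy recogniser ignores the logical-form variables that would keep the two readings apart.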
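Shift-reduce parsing
% Succeed when the stack holds a single sign and the input is used up.
shiftreduce([Sign], Sign, [], []).
% Shift: move the next input item onto the stack.
shiftreduce(P0, Sign, [Next|Bag0], Bag) :-
    push(Next, P0, P),
    shiftreduce(P, Sign, Bag0, Bag).
% Reduce: combine the top two stack items with a grammar rule.
shiftreduce(P0, Sign, Bag0, Bag) :-
    pop(First, P0, P1),
    pop(Second, P1, P2),
    rule(Mom, First, Second),
    push(Mom, P2, P),
    shiftreduce(P, Sign, Bag0, Bag).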
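For readers who prefer to trace the control flow outside Prolog, here is a minimal Python sketch of the same shift-reduce scheme. It assumes signs are plain category strings and uses a hypothetical three-rule grammar, `BINARY_RULES`, that is not part of the slides. Backtracking is expressed by trying the shift before the reduce and returning on the first success.

```python
# Hypothetical toy grammar: (left daughter, right daughter) -> mother.
BINARY_RULES = {("Det", "N"): "NP", ("V", "NP"): "VP", ("NP", "VP"): "S"}


def rule(first, second):
    """`first` is the top of the stack (right daughter); `second` lies below it."""
    return BINARY_RULES.get((second, first))


def shift_reduce(stack, words, goal):
    """True if the remaining `words` can be reduced, with `stack`, to `goal`."""
    if stack == [goal] and not words:
        return True
    # Shift: push the next word onto the stack.
    if words and shift_reduce(stack + [words[0]], words[1:], goal):
        return True
    # Reduce: combine the top two stack elements.
    if len(stack) >= 2:
        mom = rule(stack[-1], stack[-2])
        if mom is not None and shift_reduce(stack[:-2] + [mom], words, goal):
            return True
    return False


if __name__ == "__main__":
    print(shift_reduce([], ["NP", "V", "Det", "N"], "S"))
    print(shift_reduce([], ["V", "NP", "Det", "N"], "S"))
```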
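Shake and Bake generation
shake_and_bake([Sign], Sign, [], []).
% Shift: move the next bag item onto the stack.
shake_and_bake(P0, Sign, [Next|Bag0], Bag) :-
    push(Next, P0, P),
    shake_and_bake(P, Sign, Bag0, Bag).
% Reduce: combine the top of the stack with ANY other stack item.
% delete/3 here removes one element nondeterministically from the stack.
shake_and_bake(P0, Sign, Bag0, Bag) :-
    pop(First, P0, P1),
    delete(Second, P1, P2),
    unordered_rule(Mom, First, Second),
    push(Mom, P2, P),
    shake_and_bake(P, Sign, Bag0, Bag).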
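The one-line change from `pop` to `delete` can be sketched in Python as follows. The sketch assumes a sign is a pair of a category and the words it covers, and it uses a hypothetical toy grammar (`RULES`). The `unordered_rule` helper tries both linear orders of the two daughters, which is one possible reading of the predicate on the slide.

```python
# Hypothetical toy grammar: (left daughter, right daughter) -> mother.
RULES = {("Det", "N"): "NP", ("V", "NP"): "VP", ("NP", "VP"): "S"}


def unordered_rule(a, b):
    """Try both linear orders of two signs; a sign is (category, words)."""
    for left, right in ((a, b), (b, a)):
        mom = RULES.get((left[0], right[0]))
        if mom is not None:
            yield (mom, left[1] + right[1])


def shake_and_bake(stack, bag, goal):
    """Yield the word tuple of every derivation that reduces the bag to `goal`."""
    if len(stack) == 1 and not bag and stack[0][0] == goal:
        yield stack[0][1]
    if bag:  # shift
        yield from shake_and_bake(stack + [bag[0]], bag[1:], goal)
    if len(stack) >= 2:  # reduce the top with any other stack element
        first, rest = stack[-1], stack[:-1]
        for i, second in enumerate(rest):
            remaining = rest[:i] + rest[i + 1:]
            for mom in unordered_rule(first, second):
                yield from shake_and_bake(remaining + [mom], bag, goal)


if __name__ == "__main__":
    bag = [("N", ("cat",)), ("V", ("sees",)), ("Det", ("the",)), ("NP", ("John",))]
    for sentence in set(shake_and_bake([], bag, "S")):
        print(" ".join(sentence))
```

The same sentence can be reached by more than one derivation, so the usage example collects results in a set.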
NP-completeness
• It’s intuitively obvious that the change from “pick the second-from-top element of the stack” to “pick any of the elements in the stack” introduces extra indeterminacy.
• In fact, it turns out that bag generation is equivalent to the STABLE MÉNAGE À TROIS problem, and therefore NP-complete and likely intractable (Brew, 92).
• So the ground shifts to answering:
  • Can we find a sensible algorithm anyway?
  • What properties of linguistic signs shall we exploit?
English adjective ordering
The fierce little brown cat
? The brown fierce little cat
? The brown little fierce cat
? The little brown fierce cat
For the sake of argument, let’s pretend that the top one is the only grammatical ordering. I’m not committed to this belief.
Grammar
Item     Remainder    Active Part
the      np           n(_)
fierce   n([])        n([1|_])
little   n([1])       n([1,1|_])
brown    n([1,1])     n([1,1,1|_])
cat      n(_)         <none>
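The list-valued index on n is what enforces the ordering, and a short sketch makes this checkable. It assumes a simplified encoding in which a Prolog list of 1s becomes a pair (number of 1s, open tail or not), so `[1,1|_]` becomes `(2, True)` and `[1]` becomes `(1, False)`. It also assumes each adjective applies to the material on its right. The helper names are illustrative.

```python
from itertools import permutations

ANY = (0, True)  # n(_): an unconstrained index


def unifies(a, b):
    """Unify two list indices encoded as (number_of_ones, has_open_tail)."""
    (na, open_a), (nb, open_b) = a, b
    if open_a and open_b:
        return True
    if open_a:
        return na <= nb
    if open_b:
        return nb <= na
    return na == nb


# adjective -> (index of the remainder, index of the active part)
ADJECTIVES = {
    "fierce": ((0, False), (1, True)),
    "little": ((1, False), (2, True)),
    "brown": ((2, False), (3, True)),
}


def accepted(adjectives):
    """Check 'the ADJ ... cat', combining adjectives from right to left."""
    current = ANY  # cat : n(_)
    for adjective in reversed(adjectives):
        remainder, active = ADJECTIVES[adjective]
        if not unifies(active, current):
            return False
        current = remainder
    return unifies(ANY, current)  # the : np / n(_)


if __name__ == "__main__":
    print([order for order in permutations(ADJECTIVES) if accepted(order)])
```

Under these assumptions a single ordering of the three adjectives survives, which agrees with the unique analysis the constraint graph reaches later in the deck.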
Connections
Node   Category        Lexical Item   Nesting
0      np              <dummy>        1
1      np              the            0
2      n(_)            the            1
3      n([])           fierce         0
4      n([1|_])        fierce         1
5      n([1])          little         0
6      n([1,1|_])      little         1
7      n([1,1])        brown          0
8      n([1,1,1|_])    brown          1
9      n([1,1,1])      cat            0
The search space
Link together pairs that may stand in functor/argument relationships. We still don’t know which elements do stand in functor/argument relationships.
Completing the graph
Add lines linking functor and argument categories. Now the task of finding a parse comes down to finding a Hamiltonian path through the graph.
Applying constraints
We can immediately see that node 3 must be connected to node 2, since there are no other links leading away from node 3. And so on.
Mopping up
• Once these links have been established, we can delete alternative links which they preclude. This results in the deletion of the lines from node 9 to nodes 6, 4 and 2, and that of the line from 7 to 2.
• The resulting system can once again be simplified by deleting the line from node 7 to node 4, yielding a unique circuit through the graph. This corresponds to the correct analysis of “the fierce little brown cat”.
• In this example the constraints encoded in the graph are sufficient to drive the analysis to a unique conclusion, without further search, but this will not always happen. We need a combination of constraint propagation with a facility for making guesses when confronted with a choice of alternatives.
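The propagation described on these slides can be sketched as repeated unit propagation over candidate links. The node table below follows the Connections slide. The encoding of list indices as (number of 1s, open tail) pairs, the reading of nesting 0 as "offers a category" and nesting 1 as "seeks a category", and the function names are all assumptions made for illustration. The sketch only propagates forced choices and does no guessing.

```python
# node -> (category, list index as (ones, open_tail) or None, nesting)
NODES = {
    0: ("np", None, 1), 1: ("np", None, 0), 2: ("n", (0, True), 1),
    3: ("n", (0, False), 0), 4: ("n", (1, True), 1), 5: ("n", (1, False), 0),
    6: ("n", (2, True), 1), 7: ("n", (2, False), 0), 8: ("n", (3, True), 1),
    9: ("n", (3, False), 0),
}


def unifies(a, b):
    (na, open_a), (nb, open_b) = a, b
    if open_a and open_b:
        return True
    if open_a:
        return na <= nb
    if open_b:
        return nb <= na
    return na == nb


def candidate_links(nodes):
    """For each nesting-0 node, collect the nesting-1 nodes it could fill."""
    links = {}
    for i, (cat_i, idx_i, nest_i) in nodes.items():
        if nest_i != 0:
            continue
        links[i] = {j for j, (cat_j, idx_j, nest_j) in nodes.items()
                    if nest_j == 1 and cat_j == cat_i
                    and (cat_i == "np" or unifies(idx_i, idx_j))}
    return links


def propagate(candidates):
    """Commit forced links and delete the alternatives they preclude."""
    candidates = {node: set(opts) for node, opts in candidates.items()}
    changed = True
    while changed:
        changed = False
        for node, opts in candidates.items():
            if len(opts) == 1:
                (partner,) = opts
                for other, other_opts in candidates.items():
                    if other != node and partner in other_opts:
                        other_opts.discard(partner)
                        changed = True
    return candidates


if __name__ == "__main__":
    print(candidate_links(NODES))
    print(propagate(candidate_links(NODES)))
```

On this example every node ends up with a single partner, so no search is needed. A fuller implementation would fall back on guessing whenever propagation leaves more than one option.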
Shake and Bake generation with constraints
% G is threaded through every clause; presumably it is the constraint graph.
shake_and_bake([Sign], Sign, [], [], _).
shake_and_bake(P0, Sign, [Next|Bag0], Bag, G) :-
    push(Next, P0, P),
    shake_and_bake(P, Sign, Bag0, Bag, G).
shake_and_bake(P0, Sign, Bag0, Bag, G) :-
    pop(First, P0, P1, G),
    delete(Second, P1, P2, G),
    unordered_rule(Mom, First, Second, Info),
    update(Info, G),    % propagate the consequences of this reduction
    push(Mom, P2, P),
    shake_and_bake(P, Sign, Bag0, Bag, G).
Implementation notes
• We combine the constraint propagation mechanism with Whitelock's original shift-reduce parser, propagating constraints after every reduction step. The parser has the role of systematically choosing between alternative reductions, while the constraint propagation mechanism fills in the consequences of a particular set of choices.
• One of the elements in a reduction is taken from the top of the stack, while the other is taken from anywhere in the tail of the stack. This idea, due to Whitelock and Reape, ensures that the input is treated as a bag rather than a string.
Performance (number of reductions)
Conclusions
• These preliminary results must obviously be interpreted with some caution, since the examples were specially constructed.
• For grammars related to HPSG it seems probable that considerable benefit would be gained from adding a constraint propagation component to an unordered version of a head-corner parsing algorithm, as described by Van Noord [Van Noord, 1991].
• Whatever the right basis for declarative MT is, it is likely to look something like this.
• The constraint graph is a good place to put statistics.