Artificial Intelligence Definite clause grammars and semantic interpretation

Artificial IntelligenceDefinite clause grammars and semantic interpretation Fall 2008 professor: Luigi Ceccaroni

Semantic interpretation • The task of determining the meaning of a sentence can be divided into 2 steps: • Computing a context-independent notion of meaning (e.g., via DCG parsing) = semantic interpretation • Interpreting the parsed sentence in context to produce the final meaning representation • Many actual systems do not make this division and use contextual information early in the processing

What are definite clause grammars? • Definite Clause Grammars (DCGs) are convenient ways to represent grammatical relationships for various parsing applications. • They can be used for natural language work, for creating formal command and programming languages. • Quite simply, they are a nice notation for writing grammars that hides the underlying difference list variables.

DCGs • A little grammar written as a DCG: s --> np, vp.np --> det, n.vp --> v, np.vp --> v.det --> [the].det --> [a].n --> [woman].n --> [man].v --> [shoots]. • How do we use this DCG? In fact, we use it in exactly the same way as we used the difference list recognizer.

DCGs • For example, to find out whether a woman shoots a man is a sentence, we pose the query: s([a,woman,shoots,a,man],[]). • That is, just as in the difference list recognizer, we ask whether we can get an s by consuming the symbols in [a,woman,shoots,a,man], leaving nothing behind.

DCGs • Similarly, to generate all the sentences in the grammar, we pose the query: s(X,[]). • This asks what values we can give to X, such that we get an s by consuming the symbols in X, leaving nothing behind. • Moreover, the queries for other grammatical categories also work the same way. For example, to find out if a woman is a noun phrase we pose the query: np([a,woman],[]).

DCGs • We generate all the noun phrases in the grammar as follows: np(X,[]). • Quite simply, this DCG is a difference list recognizer! • That is, DCG notation is essentially syntactic sugar: user friendly notation that lets us write grammars in a natural way.

DCGs • The Prolog language can translate this notation into the kinds of difference lists discussed before. • So we have the best of both worlds: • a nice simple notation for working with • the efficiency of difference lists

DCGs • To see what Prolog translates DCG rules into: • let Prolog consult the rules of this DCG, then if you pose the query: listing(s) • you will get the response: s(A,B) :- np(A,C), vp(C,B). • This is what Prolog has translated s --> np,vp into. • Note that this is exactly the difference list rule we used in the recognizer.

DCGs • Similarly, if you pose the query: listing(np) • you will get: np(A,B) :- det(A,C), n(C,B). • This is what Prolog has translated np --> det,n into. • Again (apart from the choice of variables) this is the difference list rule we used in the recognizer.

DCGs • To get a complete listing of the translations of all the rules, simply type: listing.

Separating rules and lexicon • By separating rules and lexicon we mean that we want to eliminate all mentioning of individual words in the DCGs and instead record all the information about individual words separately in a lexicon. • To see what is meant by this, let's return to the basic grammar, namely: np - - > det, n.vp - - > v, np.vp - - > v.det - - > [the].det - - > [a].n - - > [woman].n - - > [man].v - - > [shoots].

Separating rules and lexicon • We are going to separate the rules form the lexicon. • That is, we are going to write a DCG that generates exactly the same language, but in which no rule mentions any individual word. • All the information about individual words will be recorded separately.

Separating rules and lexicon • Here is an example of a (very simple) lexicon. • Lexical entries are encoded by using a predicate lex/2 whose first argument is a word, and whose second argument is a syntactic category: lex(the, det).lex(a, det).lex(woman, n).lex(man, n).lex(shoots, v).

Separating rules and lexicon • A simple grammar that could go with this lexicon will be very similar to the basic DCG. • In fact, both grammars generate exactly the same language. • The only rules that change are those that mention specific words, i.e. the det, n, and v rules. det --> [Word], {lex(Word, det)}.n --> [Word], {lex(Word, n)}.v --> [Word], {lex(Word, v)}.

Separating rules and lexicon Grammar: np - - > det, n.vp - - > v, np.vp - - > v.det --> [Word], {lex(Word, det)}.n --> [Word], {lex(Word, n)}.v --> [Word], {lex(Word, v)}.

Separating rules and lexicon • Consider the new det rule: det --> [Word], {lex(Word, det)}. • This rule says “a det can consist of a list containing a single element Word” (note that Word is a variable). • The extra test adds the crucial condition: “as long as Word matches with something that is listed in the lexicon as a determiner”.

Separating rules and lexicon • With our present lexicon, this means that Word must be matched either with the word “a” or “the”: lex(the, det).lex(a, det). • So this single rule replaces the two previous DCG rules for det.

Separating rules and lexicon • This explains the how of separating rules from lexicon, but it doesn't explain the why. • Is it really so important? • Is this new way of writing DCGs really that much better?

Separating rules and lexicon • The answer is yes! for a theoretical reason: • Arguably rules should not mention specific lexical items. • The purpose of rules is to list general syntactic facts, such as the fact that a sentence can be made up of a noun phrase followed by a verb phrase. • The rules for s, np, and vp describe such general syntactic facts, but the old rules for det, n, and v don't. • Instead, the old rules simply list particular facts: that a is a determiner, that the is a determiner, and so on. • From a theoretical perspective it is much neater to have a single rule that says “anything is a determiner (or a noun, or a verb,...) if it is listed as such in the lexicon”.

Separating rules and lexicon • Now, our little lexicon, with its simple lex entries, is a toy. • But a real lexicon is (most emphatically!) not. • A real lexicon is likely to be very large (it may contain hundreds of thousands, or even millions, of words) and moreover, the information associated with each word is likely to be very rich.

Separating rules and lexicon • Our lex entries give only the syntactical category of each word. • A real lexicon will give much more, such as information about its phonological, morphological, semantic, and pragmatic properties. • Because real lexicons are big and complex, from a software engineering perspective it is best to write simple grammars that have a well-defined way of pulling out the information they need from vast lexicons.

Separating rules and lexicon • That is, grammars should be thought of as separate entities which can access the information contained in lexicons. • We can then use specialized mechanisms for efficiently storing the lexicon and retrieving data from it. • The new rules really do just list general syntactic facts, and the extra tests act as an interface to our (admittedly simple) lexicon that lets the rules find exactly the information they need.

Grammar 1: a trivial grammar for a fragment of language • s  np, vp. % A sentence (s) is a noun phrase (np) plus a verb phrase (vp) • np  det, n. % A noun phrase is a determiner plus a noun • np  n. % ... or just a noun. • vp  v, np. % A verb phrase is a verb and its direct object, which is an np • vp  v. % ... or just the verb (for intransitives). • det  [Word], {lex(Word, det)}. • n  [Word], {lex(Word, n)}. • v  [Word], {lex(Word, v)}.

Grammar 1: a trivial grammar for a fragment of language • lex(the, det). % ‘the’ is a determiner • lex(mary, n). % ‘mary’ is a noun. • lex(john, n). • lex(woman, n). • lex(apple, n). • lex(man, n). • lex(loves, v). % ‘loves’ is a verb. • lex(eats, v). • lex(sings, v).

Sentences for Grammar 1 • mary loves john • the woman eats the apple • the man sings • mary eats

Grammar 2: restrictions in argument selection • s  np, vp. • compl([])  []. • compl([arg(X)])  p(X), np. • compl([])  np. • np  name. • np  det, n. • vp  v(X), compl(X). % A vp is a verb plus a verbal complement (compl)

Grammar 2: restrictions in argument selection • v(A)  [Word], {lex(Word, v, A)}. • name  [Word], {lex(Word, name)}. • n  [Word], {lex(Word, n)}. • det  [Word], {lex(Word, det)}. • p(Word)  [Word], {lex(Word, p)}.

Grammar 2: restrictions in argument selection • lex(piensa, v, [arg(en)]). • lex(está, v, [arg(en)]). • lex(ríe, v, []). • lex(habla, v, [arg(con)]). • lex(lee, v, []). • lex(el, det). % ‘el’ is a determiner

Grammar 2: restrictions in argument selection • lex(un, det). • lex(mary, name). • lex(john, name). • lex(profesor, n). • lex(en, p)

Sentences for Grammar 2 • mary piensa en john • john habla con mary • john ríe • un profesor habla con mary

Extension of Grammar 2 • John habla de Clara con Mary • Needed modifications: • compl([arg(X) | Y])  p(X), np, compl(Y). • lex(habla, v, [arg(de), arg(con)]). • lex(clara, name).

Grammar 3: logical representation of sentences • s(F)  np(S), v(S, X, F), compl(X). • compl([ ])  [ ]. • compl([arg(X,O) | Y])  p(X), np (O), compl(Y). • compl([arg(null, O) | Y])  np(O), compl(Y). • np(S)  name(S). • np(S)  det, n(S).

Grammar 3: logical representation of sentences • v(S,A,F)  [Word], {lex(Word, v, S, A, F)}. • name(Word)  [Word], {lex(Word, name)}. • n(Word)  [Word], {lex(Word, n)}. • det  [Word], {lex(Word, det)}. • p(Word)  [Word], {lex(Word, p)}.

Grammar 3: logical representation of sentences • lex(clara, name). • lex(maria, name). • lex(juan, name). • lex(barcelona, name). • lex(libro, n). • lex(hombre, n). • lex(profesor, n).

Grammar 3: logical representation of sentences • lex(el, det). • lex(un, det). • lex(en, p). • lex(con, p). • lex(de, p).

Grammar 3: logical representation of sentences • lex(ríe, v, S, [], reir(S)). • lex(piensa, v, S, [arg(en, O)], pensar_en(S, O)). • lex(habla, v, S, [arg(de, O),arg(con, O1)], comunica(S,O, O1)). • lex(habla, v, S, [arg(con, O),arg(de, O1)], comunica(S,O1, O)). • lex(está, v, S, [arg(en, O)], locativo(S, O)). • lex(lee, v, S, [arg(null, O)], leer(S, O)).

Sentences for Grammar 3 • unary predicate (ríe, v, S, [ ], reir(S)). • binary predicate (piensa, v, S, [arg(en, O)], pensar_en(S, O)). • ternary predicate (habla, v, S, [arg(con,O), arg(de, O1)], comunica(S, O1, O)). • Example: • Juan piensa en Maria = pensar_en(juan, maria).

Prolog input and output • analysis(F, X, [ ]):- s(F, X, [ ]). • | ?- analysis(F, [juan, está, en, barcelona], [ ]). • F = locativo(juan, barcelona) ? • yes • | ?- analysis(F, [juan, piensa, en, maria], [ ]). • F = pensar_en(juan, maria) ? • yes

Prolog input and output • | ?- analysis(F, [el, libro, está, en, barcelona], [ ]). • F = locativo(libro, barcelona) ? • yes • | ?- analysis(F, [juan, lee, un, libro], [ ]). • F = leer(juan, libro) ? • yes

Prolog input and output • | ?- analysis(F, [el, hombre, habla, de, juan, con, maria], [ ]). • F = comunica(hombre, juan, maria) ? • yes • | ?- analysis(F, [el, hombre, ríe], [ ]). • F = reir(hombre) ? • yes

Prolog input and output • | ?- analysis(F, [el, profesor, piensa, en, un, libro], [ ]). • F = pensar_en(profesor, libro) ? • yes

Grammar 4: quantification • It exists X and X is a libro: • el libro = e(X, libro(X)). • All X such that X is a libro: • todo libro = a(X, libro(X)).

Grammar 4: quantification • el libro cae = e(X, and(libro(X), cae(X))) • Juan piensa en el libro = e(X, and(libro(X), piensa(juan, X))) • todo hombre piensa en el libro = a(X, implies(hombre(X), e(Y, and(libro(Y), piensa(X,Y)))

Grammar 4: quantification • lex(el,det,K,S1,S2,e(K,and(S1,S2))). • lex(un,det,K,S1,S2,e(K,and(S1,S2))). • lex(los,det,K,S1,S2,a(K,implies(S1,S2))). • lex(todo,det,K,S1,S2,a(K,implies(S1,S2))). • np(K,S2,F)  det(K,S1,S2,F), n(K,S1). • np(K,F,F)  name(K).

Grammar 4: quantification • compl([ ],S,S)  [ ]. • compl([arg(X,K) | Y],S1,S)  p(X), np(K,S2,S), compl(Y,S1,S2). • s(S)  np(K,S2,S), v(K,X,S1), compl(X,S1,S2). • n(K,F)  [Word],{lex(Word,n,K,F)}. • lex(libro,n,K,libro(K)).

Prolog input and output • | ?- analysis(F,[el,hombre,ríe],[ ]). • F = e(_A,and(hombre(_A),reir(_A))) ? • yes • | ?- analysis(F,[el,profesor,piensa,en,un,libro],[ ]). • F = e(_B,and(profesor(_B),e(_A,and(libro(_A),pensar_en(_B,_A))))) ? • yes

Prolog input and output • | ?- analysis(F,[el,hombre,habla,de,juan,con,maria],[ ]). • F = e(_A,and(hombre(_A),comunica(_A,juan,maria))) ? • yes • | ?- analysis(F,[todo,hombre,piensa,en,un,libro],[ ]). • F = a(_B,implies(hombre(_B),e(_A,and(libro(_A),pensar_en(_B,_A))))) ? • yes

Prolog input and output • | ?- analysis(F,[todo,libro,esta,en,barcelona],[ ]). • F = a(_A,implies(libro(_A),locativo(_A,barcelona))) ? • yes

Extension of Grammar 4 • El hombre bueno: • e(X, and(hombre(X), bueno(X)))

Artificial Intelligence Definite clause grammars and semantic interpretation

Artificial Intelligence Definite clause grammars and semantic interpretation

Presentation Transcript

Artificial Intelligence

Artificial Intelligence

Artificial Intelligence

Artificial Intelligence

Artificial Intelligence

Definite Clause Grammars

Artificial Intelligence

Lecture 7: Definite Clause Grammars

Artificial Intelligence Definite clause grammars and semantic interpretation

Definite Clause Grammar

Definite Clause Grammar

Inferring Definite-Clause Grammars to Express Multivariate Time Series

Artificial life and artificial intelligence

Artificial Intelligence

Artificial Intelligence

ARTIFICIAL INTELLIGENCE

Artificial Intelligence

CSA2050: Definite Clause Grammars IV

DEFINITE CLAUSE GRAMMARS

4. Semantic Processing and Attribute Grammars

Syntax and Processing it: Definite Clause Grammars in Prolog (optional material)

Semantic Interpretation