
CSA2050 Introduction to Computational Linguistics


myron

Presentation Transcript


  1. CSA2050 Introduction to Computational Linguistics
     Lecture 8: Definite Clause Grammars
     CSA2050: DCG I

  2. Rationale
     [Diagram with the labels: CFG + Sentence, Logic, Prolog Program, Sentence Structure]

  3. Logic Rules and Grammar Rules
     • Basic question: what is the connection between logic rules and grammar rules?
     • ∀x ∀y male(x) & parent(x,y) → father(x,y)
     • S → NP VP
     • They are both concerned with the definition of predicates.

  4. Logic Rules and Grammar Rules
     • Logic: arbitrary n-ary predicates, e.g. raining; clever(x); father(x,y); between(x,y,z)
     • Grammar rules: predicates over text segments, e.g. np(x); vp(y); s(z).

  5. Text Segments
     • A text segment is a sequence of consecutive words.
     • A text segment can be identified by two pointers, if we assign names to the spaces between words:
         0 the 1 cat 2 sat 3 on 4 the 5 mat 6
     • (0,6) is the whole sentence.
     • (0,2) is the first noun phrase.
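
The pointer idea can be tried out directly in Prolog. The following is only a minimal sketch: the predicate names word/3 and segment/3 are hypothetical and do not come from the lecture, which introduces its own input representation later.

```prolog
% Hypothetical facts: word(P1, P2, W) says that word W lies between
% positions P1 and P2 of "the cat sat on the mat".
word(0, 1, the).
word(1, 2, cat).
word(2, 3, sat).
word(3, 4, on).
word(4, 5, the).
word(5, 6, mat).

% segment(P1, P2, Words): Words is the sequence of consecutive words
% lying between positions P1 and P2.
segment(P, P, []).
segment(P1, P2, [W|Ws]) :-
    word(P1, P, W),
    segment(P, P2, Ws).
```

With these facts, the query ?- segment(0, 2, Ws). binds Ws to [the, cat], the first noun phrase, and ?- segment(0, 6, Ws). yields the whole sentence.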

  6. From Grammar Rules to Logic
     • The general statement made by the CF rule
         S → NP VP
     • can be summarised, using predicates over segments, by the following logic statement:
         NP(p1,p) & VP(p,p2) => S(p1,p2)

  7. From Grammar Rules to Logic
     [Figure: the sentence 0 the 1 cat 2 sat 3 on 4 the 5 mat 6, with its segments labelled NP, VP, and S]

  8. From Logic to Prolog
     • Each logic statement of the form
         NP(p1,p) & VP(p,p2) => S(p1,p2)
     • corresponds to the "definite clause"
         s(P1,P2) :- np(P1,P), vp(P,P2).

  9. Converting a Grammar
     Context-free rules:
       S → NP VP
       NP → N
       NP → Det N
       VP → V NP
     Corresponding Prolog clauses:
       s(P1,P2) :- np(P1,P), vp(P,P2).
       np(P1,P2) :- n(P1,P2).
       np(P1,P2) :- det(P1,P), n(P,P2).
       vp(P1,P2) :- v(P1,P), np(P,P2).

  10. Lexical Categories and Rules
      • Lexical categories are those which are not defined in the grammar itself (e.g. N and V in our grammar).
      • Instead, they are defined by the words that they rewrite:
          V → run, sleep, talk, etc.
      • Lexical categories always derive exactly one input token.

  11. Lexical Rules
      • A rule defining lexical category C must express the following information:
          there is a C between positions p1 and p2 if some word of syntactic category C spans those positions.
      • There are many different ways to translate such a rule into a Prolog clause.
      • Each way needs to make reference to how the input sentence is represented.

  12. Defining Lexical Categories
      • Each category is defined in terms of the words it can rewrite:
          d(P1,P2) :- input(P1,P2,[the]).
          n(P1,P2) :- input(P1,P2,[cat]).
          n(P1,P2) :- input(P1,P2,['John']).
          v(P1,P2) :- input(P1,P2,[ate]).
      • How is the input sentence represented?
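
Since there are many ways to write lexical rules, here is one alternative, offered only as a sketch. It keeps the words in a table and shares a single lookup clause between the categories. The names lex/2 and category/3 are hypothetical, and the two input/3 facts are included only so that the block runs on its own.

```prolog
% Hypothetical lexicon table: lex(Word, Category).
lex(the, d).
lex(cat, n).
lex('John', n).
lex(ate, v).

% Sample input facts (assumed here purely for illustration).
input(0, 1, ['John']).
input(1, 2, [ate]).

% There is a C between P1 and P2 if a single word of category C
% spans those positions.
category(C, P1, P2) :-
    input(P1, P2, [W]),
    lex(W, C).

% One clause per lexical category, all using the same lookup.
d(P1, P2) :- category(d, P1, P2).
n(P1, P2) :- category(n, P1, P2).
v(P1, P2) :- category(v, P1, P2).
```

Here ?- n(0, 1). and ?- v(1, 2). succeed, while ?- d(0, 1). fails. Adding a word means adding one lex/2 fact rather than a new clause.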

  13. Representing the Input
      • Define the predicate input(P1,P2,L) such that P1 and P2 are positions and L is a list containing the words spanning those positions.
      • Checkpoint: show how to represent the input sentence "John ate the cat".

  14. John ate the cat
        input(0,1,['John']).
        input(1,2,[ate]).
        input(2,3,[the]).
        input(3,4,[cat]).
      • Checkpoints
        • Why is John in quotes?
        • Why use a list of one element rather than an atom?
        • Is this the only way to do it?
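
Typing the input/3 facts by hand becomes tedious for longer sentences. As a sketch only, a helper can generate them from a word list. The names load_sentence/1 and load_from/2 are hypothetical and not part of the lecture.

```prolog
:- dynamic input/3.

% load_sentence(+Words): replace any stored input/3 facts with one
% fact per word of Words, numbering positions from 0.
load_sentence(Words) :-
    retractall(input(_, _, _)),
    load_from(Words, 0).

% load_from(+Words, +P1): store the words starting at position P1.
load_from([], _).
load_from([W|Ws], P1) :-
    P2 is P1 + 1,
    assertz(input(P1, P2, [W])),
    load_from(Ws, P2).
```

After ?- load_sentence(['John', ate, the, cat]). the database holds the same four facts as shown on the slide.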

  15. Complete Program
      1. Grammar
           s(P1,P2) :- np(P1,P), vp(P,P2).
           np(P1,P2) :- n(P1,P2).
           np(P1,P2) :- d(P1,P), n(P,P2).
           vp(P1,P2) :- v(P1,P2).
           vp(P1,P2) :- v(P1,P), np(P,P2).
      2. Lexicon
           d(P1,P2) :- input(P1,P2,[the]).
           n(P1,P2) :- input(P1,P2,[cat]).
           n(P1,P2) :- input(P1,P2,['John']).
           v(P1,P2) :- input(P1,P2,[ate]).
      3. Input
           input(0,1,['John']).
           input(1,2,[ate]).
           input(2,3,[the]).
           input(3,4,[cat]).
      4. Query
           ?- s(0,4).

  16. Trace of query ?- vp(1,4)
        1 1 Call: vp(1,4) ?
        2 2 Call: v(1,4) ?
        3 3 Call: input(1,4,[ate]) ?
        3 3 Fail: input(1,4,[ate]) ?
        2 2 Fail: v(1,4) ?
        2 2 Call: v(1,_349) ?
        3 3 Call: input(1,_349,[ate]) ?
        3 3 Exit: input(1,2,[ate]) ?
        2 2 Exit: v(1,2) ?
        4 2 Call: np(2,4) ?
        5 3 Call: n(2,4) ?
        6 4 Call: input(2,4,[cat]) ?
        6 4 Fail: input(2,4,[cat]) ?
        6 4 Call: input(2,4,[John]) ?
        6 4 Fail: input(2,4,[John]) ?
        5 3 Fail: n(2,4) ?
        5 3 Call: d(2,_1338) ?
        6 4 Call: input(2,_1338,[the]) ?
        6 4 Exit: input(2,3,[the]) ?
        5 3 Exit: d(2,3) ?
        7 3 Call: n(3,4) ?
        8 4 Call: input(3,4,[cat]) ?
        8 4 Exit: input(3,4,[cat]) ?
        7 3 Exit: n(3,4) ?
        4 2 Exit: np(2,4) ?
        1 1 Exit: vp(1,4) ?

  17. Representing the Sentence Using Difference Lists
      We can represent the input as a pair of pointers:
      • The first pointer points to the entire list.
      • The second pointer points to a suffix of the list.
      • The represented list is the difference between the two lists.
          input(['John',ate,the,cat],['John',ate,the,cat]).
          input(['John',ate,the,cat],[ate,the,cat]).
          input(['John',ate,the,cat],[the,cat]).
          input(['John',ate,the,cat],[]).
          input([X|Y],Y,X).
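
To see difference lists at work, the sketch below reuses the grammar and lexicon clauses from the complete program unchanged and swaps only the definition of input/3, so that positions are lists rather than integers. Note one assumption: the third argument is written here as the one-element list [W], so that lexicon clauses such as d(P1,P2) :- input(P1,P2,[the]) still match. This differs from the form input([X|Y],Y,X) shown on the slide.

```prolog
% A position is now a list of remaining words. The word W spans
% positions P1 and P2 when P1 is P2 with W added at the front.
input([W|Rest], Rest, [W]).

% Grammar (as in the complete program).
s(P1,P2)  :- np(P1,P), vp(P,P2).
np(P1,P2) :- n(P1,P2).
np(P1,P2) :- d(P1,P), n(P,P2).
vp(P1,P2) :- v(P1,P2).
vp(P1,P2) :- v(P1,P), np(P,P2).

% Lexicon (as in the complete program).
d(P1,P2) :- input(P1,P2,[the]).
n(P1,P2) :- input(P1,P2,[cat]).
n(P1,P2) :- input(P1,P2,['John']).
v(P1,P2) :- input(P1,P2,[ate]).
```

The query ?- s(['John',ate,the,cat], []). succeeds: the sentence is the difference between the whole list and the empty list. No per-sentence input facts are needed any more.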

  18. DCG Notation
      • The conversion of CF rules into Prolog is so simple that it can be done automatically.
      • Clauses in DCG notation:
          s --> np, vp.
          np --> d, n.
          n --> [cat].
        are automatically translated, when read in, to:
          s(P1,P2) :- np(P1,P), vp(P,P2).
          np(P1,P2) :- d(P1,P), n(P,P2).
          n([cat|L],L).

  19. DCG Notation
      • Every DCG rule takes the form
          nonterminal --> expansion
        where expansion is any of:
        • A nonterminal symbol, e.g. np
        • A list of terminal symbols, e.g. [each,other]
        • A null constituent, [ ]
        • A plain Prolog goal enclosed in braces, e.g. {write('Found')}
        • A series of any of these expansions joined by commas.
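
The kinds of expansion can be combined in one small grammar. This is an illustrative sketch only: the nonterminals pron, opt_adv, v and vp and the word choices are hypothetical and are not taken from the lecture's grammar.

```prolog
% A list of terminal symbols.
pron --> [each, other].

% A null constituent: the adverb is optional.
opt_adv --> [].
opt_adv --> [really].

% A terminal followed by a plain Prolog goal in braces. The goal
% consumes no input; it just prints a message.
v --> [like], { write('Found'), nl }.

% A series of expansions joined by commas.
vp --> opt_adv, v, pron.
```

Both ?- vp([like, each, other], []). and ?- vp([really, like, each, other], []). succeed, each printing Found once on the way.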

  20. Complete DCG
      1. Grammar
           s --> np, vp.
           np --> n.
           np --> d, n.
           vp --> v.
           vp --> v, np.
      2. Lexicon
           d --> [the].
           n --> [cat].
           n --> ['John'].
           v --> ['ate'].
      3. Input (supplied as the first argument of the query)
      4. Query
           ?- s(['John', ate, the, cat], []).

  21. Checkpoints
      • What is your system's translation of
          s --> np, vp.
          n --> [cat].
        ?
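
One way to find out is to ask the system to translate a rule and print the result. The sketch below assumes a Prolog that provides expand_term/2 and portray_clause/1, as SWI-Prolog and SICStus do. The helper name show_translation/1 is hypothetical.

```prolog
% show_translation(+Rule): print the clause that this system's DCG
% translator produces for Rule.
show_translation(Rule) :-
    expand_term(Rule, Clause),
    portray_clause(Clause).
```

Try ?- show_translation((s --> np, vp)). and ?- show_translation((n --> [cat])). The variable names, and the way terminals are handled, differ between systems, so the printed clauses may not look exactly like those on the DCG Notation slide.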
