
Lists: Joining, splitting & pattern matching



Presentation Transcript


  1. Lists: Joining, splitting & pattern matching
     Remember that a list can be regarded as consisting of a head, which is one (or more) Prolog terms that are the first members of the list, and a tail, which is a list containing the rest of the members of the list.

  2. Lists: Joining, splitting & pattern matching
     We can use this concept to join and split lists by a simple process of pattern matching. Let us suppose we have a series of very simple statements, each represented as a list of words:

         [tom, loves, mary]
         [cows, eat, grass]
         [mary, is, going, to, the, shops]

     If we know that all our sentences have the subject as the first word, then we can easily split them into agent and action.

  3. Lists: Joining, splitting & pattern matching

         [tom, loves, mary]
         [cows, eat, grass]
         [mary, is, going, to, the, shops]

     We could define a Prolog rule such as:

         split1([Agent|Action], Agent, Action).

     or, if L is a list representing one of our sentences, we could write:

         L = [Agent|Action].

     In either case, using the first statement as an example, Agent would become tom and Action would become [loves, mary].
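
The head/tail split described on this slide can be tried directly at the prompt. The following is a minimal sketch, assuming SWI-Prolog; the argument names are the ones used on the slide and the answers are shown as comments.

```prolog
% split1(+Sentence, -Agent, -Action)
% The first word of the sentence is the agent; everything after it is the action.
split1([Agent|Action], Agent, Action).

% Example queries:
% ?- split1([tom, loves, mary], Agent, Action).
% Agent = tom,
% Action = [loves, mary].
%
% The same split written as a plain unification:
% ?- [cows, eat, grass] = [Agent|Action].
% Agent = cows,
% Action = [eat, grass].
```

No recursion is involved: the whole split is done by unifying the sentence with the pattern [Agent|Action].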

  4. Lists: Joining, splitting & pattern matching
     Now let us assume that we knew the agent was always represented by two words, for example:

         [my, brother, loves, mary]
         [your, sister, is, going, to, the, shops]

     We could define a new Prolog rule such as:

         split2([A, B|Action], A, B, Action).

     or, if L is a list representing one of our sentences, we could write:

         L = [A, B|Action].

     In either case, using the first statement as an example, A would become my, B would become brother and Action would become [loves, mary].
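
A short sketch of the two-word version, again assuming SWI-Prolog. It also shows what happens when a sentence is too short to match the pattern: the goal simply fails.

```prolog
% split2(+Sentence, -A, -B, -Action)
% The first two words form the agent; the rest of the list is the action.
split2([A, B|Action], A, B, Action).

% ?- split2([my, brother, loves, mary], A, B, Action).
% A = my,
% B = brother,
% Action = [loves, mary].
%
% A list with fewer than two members cannot match [A, B|Action]:
% ?- split2([mary], A, B, Action).
% false.
```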

  5. Lists: Using append/3 (or concatenate)
     SWI-Prolog provides the predicate append/3, which can be used to join two lists together. If, for example, we set the goal:

         ?- append([monday, tuesday], [wednesday, thursday], L).

     Prolog would reply:

         L = [monday, tuesday, wednesday, thursday]

  6. Lists: Using append/3 (or concatenate)
     • If you were using a version of Prolog which did not include append/3, you could write your own list concatenation rule. Examples along these lines are to be found in most textbooks:

         conc([], L, L).
         conc([H|T], L, [H|L1]) :-
             conc(T, L, L1).

     • This is a recursive process which makes one call for every member of the first list, so it can be inefficient when processing large numbers of long lists. There are special techniques, such as difference lists, to get round the problem. Modern fast machines make it less of a problem.
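
To make the recursion concrete, here is the textbook rule again with comments and a hand trace of one small goal. This is only an illustrative sketch; the trace is written out by hand rather than copied from the tracer.

```prolog
% conc(?Front, ?Back, ?Whole): Whole is the list Front followed by the list Back.
conc([], L, L).                 % joining the empty list to L gives L
conc([H|T], L, [H|L1]) :-       % otherwise keep the head H ...
    conc(T, L, L1).             % ... and join the tail T to L

% How the goal conc([a,b], [c], X) unfolds:
%   conc([a,b], [c], X)    second clause: X  = [a|X1], call conc([b], [c], X1)
%   conc([b],   [c], X1)   second clause: X1 = [b|X2], call conc([],  [c], X2)
%   conc([],    [c], X2)   first clause:  X2 = [c]
% so X = [a,b,c], after one call per member of the first list plus one for [].
```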

  7. Lists: Using append/3 to split a list
     append/3 can be used in reverse to split a list. For example:

         ?- append(A, B, [a, b, c, d]).

     would give the reply

         A = []            B = [a, b, c, d]

     then on backtracking would give in succession

         A = [a]           B = [b, c, d]
         A = [a, b]        B = [c, d]
         A = [a, b, c]     B = [d]
         A = [a, b, c, d]  B = []
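
Rather than pressing ; repeatedly, the splits can be collected in one go with findall/3. This is a small sketch assuming SWI-Prolog; all_splits/2 is a hypothetical helper name, not something from the slides.

```prolog
:- use_module(library(lists)).   % append/3

% all_splits(+List, -Splits): hypothetical helper.
% Splits is a list of Front-Back pairs, one for each way of cutting List in two,
% in the order in which append/3 finds them on backtracking.
all_splits(List, Splits) :-
    findall(Front-Back, append(Front, Back, List), Splits).

% ?- all_splits([a, b, c, d], Splits).
% Splits = [[]-[a,b,c,d], [a]-[b,c,d], [a,b]-[c,d], [a,b,c]-[d], [a,b,c,d]-[]].
```

A list of N members always has N + 1 such splits.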

  8. Lists: Using append/3 to split a list
     By including conditions we can arrange for a list to be split into two parts at a given point, for example at a verb:

         split_at_verb(L, Front, [H|Rest]) :-
             verb(H),
             append(Front, [H|Rest], L).

         verb(loves).
         verb(eats).

     Hence

         ?- split_at_verb([my, brother, george, loves, to, drive, fast, cars], Front, Back).

     would give:

         Front = [my, brother, george]
         Back = [loves, to, drive, fast, cars]
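
One detail worth seeing in action: because verb(H) is called first, the rule tries the verbs in the order of the verb/1 facts, not in the order in which they occur in the sentence. The sketch below, assuming SWI-Prolog, repeats the slide's rule and adds a hypothetical variant, split_at_first_verb/3, which walks the sentence from the left instead.

```prolog
:- use_module(library(lists)).   % append/3

verb(loves).
verb(eats).

% The rule from the slide: pick a verb, then look for it in the sentence.
split_at_verb(L, Front, [H|Rest]) :-
    verb(H),
    append(Front, [H|Rest], L).

% Hypothetical variant: try each split from the left and stop at the first
% one whose second part starts with a verb.
split_at_first_verb(L, Front, [H|Rest]) :-
    append(Front, [H|Rest], L),
    verb(H),
    !.

% ?- split_at_verb([tom, eats, what, mary, loves], Front, Back).
% Front = [tom, eats, what, mary],
% Back = [loves] ;
% Front = [tom],
% Back = [eats, what, mary, loves].
%
% ?- split_at_first_verb([tom, eats, what, mary, loves], Front, Back).
% Front = [tom],
% Back = [eats, what, mary, loves].
```

For sentences containing a single verb, such as the example on the slide, both versions give the same answer.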

  9. Lists: Using append/3 to split a list
     The two rules which follow provide a more generalised split facility.

         /* to split a list at a specific word */
         split(L, At, Front, [At|Back]) :-
             append(Front, [At|Back], L).

         /* to split a list at a key phrase entered as a list */
         split(L, Key_Phrase, Before, After) :-
             append(Before, Rest, L),
             append(Key_Phrase, After, Rest).

     Note: these involve a lot of backtracking; use the tracer to see it. Therefore use them with care if the lists are long.
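
The two split rules can be exercised as follows. This is a sketch assuming SWI-Prolog and the sentences used earlier in the presentation. Note that the first rule keeps the word in the back part, whereas the second leaves the key phrase out of both parts.

```prolog
:- use_module(library(lists)).   % append/3

/* to split a list at a specific word */
split(L, At, Front, [At|Back]) :-
    append(Front, [At|Back], L).

/* to split a list at a key phrase entered as a list */
split(L, Key_Phrase, Before, After) :-
    append(Before, Rest, L),
    append(Key_Phrase, After, Rest).

% Splitting at a word:
% ?- split([tom, loves, mary], loves, Front, Back).
% Front = [tom],
% Back = [loves, mary]
%
% Splitting at a key phrase:
% ?- split([your, sister, is, going, to, the, shops], [is, going], Before, After).
% Before = [your, sister],
% After = [to, the, shops]
```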

  10. Natural Language Processing
     • If we want to represent (a subset of) the English language in Prolog, a frequently used technique is to represent the vocabulary of the language as atoms and to represent sentences as lists of atoms.
     • The techniques we have been using up to this point are suitable for working in a very simple way with lists of words or any other items.
     • If, however, we want to process a range of different sentence formats, we need a more powerful system. The ones used are still based on list manipulation.

  11. Natural Language Processing
     A very simple way to define a language is to define a set of terminal symbols, that is to say symbols that cannot be subdivided. We can think of these as the words of the language. We can define a grammar as a set of rules which specify the ways in which we can construct legal sequences of terminal symbols to make valid expressions in the language, that is to say legal sentences. We could then go on to define our language as the set of all legal sentences which can be constructed by applying our grammar rules to our set of terminal symbols.

  12. Natural Language Processing
     • The program which follows is a very simple English grammar with just one type of sentence.
     • Sentences in this grammar are constructed using phrases, and phrases are constructed from terminal symbols (words).
     • All components of the language are represented as Prolog lists, and the predicate append/3 is used to join together phrases in order to construct sentences.

  13. A Simple Grammar

         sentence(S) :-
             noun_phrase(N),
             verb_phrase(V),
             append(N, V, S).

         noun_phrase(N1) :-
             determiner(D1),
             noun(N2),
             append(D1, N2, N1).

         verb_phrase(V1) :-
             verb(V2),
             noun_phrase(N),
             append(V2, N, V1).

         determiner([the]).
         determiner([a]).
         noun([girl]).
         noun([woman]).
         verb([loves]).
         verb([kicked]).
         adjective([fat]).
         adjective([young]).

  14. Using our simple grammar
     We can use our grammar to generate sentences and, if we like, we can use backtracking to generate all the legal sentences of our language (provided it is small enough!). For example, if we set a goal:

         ?- sentence(S).

     we get:

         S = [the, girl, loves, the, girl] ;
         S = [the, girl, loves, the, woman] ;
         S = [the, girl, loves, a, girl] ;
         S = [the, girl, loves, a, woman]

  15. Using our simple grammar
     We can also use our grammar to test the legality of a sentence. For example, if we set a goal:

         ?- sentence([a, girl, loves, the, woman]).

     Prolog will reply Yes (true in recent versions of SWI-Prolog). Or if we set a goal:

         ?- sentence([a, girl, loves, the, elephant]).

     Prolog will reply No (false in recent versions of SWI-Prolog).

  16. Using our simple grammar
     We can also set goals which include variables in order to try to complete a sentence. For example:

         ?- sentence([a, girl, loves, the, X]).

     where Prolog replies:

         X = girl ;
         X = woman

     Or if we set a goal:

         ?- sentence([a, girl|L]).

     Prolog replies:

         L = [loves, the, girl] ;
         L = [loves, the, woman]

     and so on for the remaining completions.
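
The three uses of the simple grammar (generating, testing and completing sentences) can be checked with the sketch below, assuming SWI-Prolog. The grammar is the one given earlier, without the unused adjective facts; count_sentences/1 is a hypothetical helper added only to show how small the language is.

```prolog
:- use_module(library(lists)).   % append/3

sentence(S) :-
    noun_phrase(N),
    verb_phrase(V),
    append(N, V, S).

noun_phrase(N1) :-
    determiner(D1),
    noun(N2),
    append(D1, N2, N1).

verb_phrase(V1) :-
    verb(V2),
    noun_phrase(N),
    append(V2, N, V1).

determiner([the]).
determiner([a]).
noun([girl]).
noun([woman]).
verb([loves]).
verb([kicked]).

% count_sentences(-Count): hypothetical helper that generates every legal
% sentence by backtracking and counts them.
count_sentences(Count) :-
    findall(S, sentence(S), Sentences),
    length(Sentences, Count).

% ?- count_sentences(N).
% N = 32.      % 4 noun phrases x 2 verbs x 4 noun phrases
%
% ?- findall(X, sentence([a, girl, loves, the, X]), Xs).
% Xs = [girl, woman].
```

Note that even when testing a given sentence, this grammar works by generating candidate phrases and comparing them with the input, which is one reason the later slides move to a different representation.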

  17. Definite Clause Grammar Notation
     • Most versions of Prolog support an alternative notation, known as Definite Clause Grammar (DCG) notation, as an easy way to define a grammar.
     • It allows us to define a grammar using a simpler syntax, similar to that used in linguistics.
     • When you consult the program it is translated into conventional Prolog syntax.
     • The Prolog generated uses difference lists as a means of overcoming the slow recursive process of list concatenation.

  18. A Grammar Written using DCG Notation

         sentence --> noun_phrase, verb_phrase.
         noun_phrase --> determiner, noun.
         noun_phrase --> proper_noun.
         verb_phrase --> verb, noun_phrase.
         proper_noun --> [john].
         proper_noun --> [mary].
         noun --> [cat].
         noun --> [mouse].
         determiner --> [the].
         determiner --> [a].
         verb --> [hates].
         verb --> [loves].

  19. Translation of DCG to Prolog

         sentence(A, C) :-
             noun_phrase(A, B),
             verb_phrase(B, C).

         noun_phrase(A, C) :-
             determiner(A, B),
             noun(B, C).
         noun_phrase(A, B) :-
             proper_noun(A, B).

         verb_phrase(A, C) :-
             verb(A, B),
             noun_phrase(B, C).

         proper_noun([john|X], X).
         proper_noun([mary|X], X).
         noun([cat|X], X).
         noun([mouse|X], X).
         determiner([the|X], X).
         determiner([a|X], X).
         verb([hates|X], X).
         verb([loves|X], X).
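
In practice a DCG is usually called through phrase/2 rather than by supplying the two list arguments by hand. The sketch below assumes SWI-Prolog and uses the DCG from the presentation unchanged; the queries are illustrative.

```prolog
sentence --> noun_phrase, verb_phrase.
noun_phrase --> determiner, noun.
noun_phrase --> proper_noun.
verb_phrase --> verb, noun_phrase.
proper_noun --> [john].
proper_noun --> [mary].
noun --> [cat].
noun --> [mouse].
determiner --> [the].
determiner --> [a].
verb --> [hates].
verb --> [loves].

% phrase(sentence, Words) succeeds if Words is a complete sentence.
% It behaves like calling the translated predicate as sentence(Words, []).
%
% ?- phrase(sentence, [john, loves, mary]).
% true.
%
% ?- phrase(sentence, [the, cat, hates, a, mouse]).
% true.
%
% ?- phrase(sentence, [cat, the, hates]).
% false.
%
% ?- listing(sentence).     % shows the clause produced by the translation
```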

  20. Difference Lists
     List concatenation as it was defined on an earlier slide is an inefficient process, particularly for long lists. Using normal list processing, the only way we can process a list is to work from the head, item by item, yet in concatenation we want to join one list to the tail of another. Difference lists are an alternative method of list representation which makes this kind of list processing very much more efficient, although it is a bit less flexible than concatenation of normal lists. We represent a list as a pair of lists that we could write A-B. We then impose the condition that the list B must be a legal tail of list A, for example [a,b,c|B] - B.

  21. Difference Lists
     We could now define a predicate that will concatenate two lists to produce a third list thus:

         concat(A - B, B - C, A - C).

     If we now set a goal:

         ?- concat([a,b,c|B] - B, [d,e|C] - C, L).

     we get the reply:

         B = [d,e|C]
         L = [a,b,c,d,e|C] - C

  22. Difference Lists

         -------- A --------
         [a, b, c, d, e | C]
                   --- B ---

         A     is [a,b,c,d,e|C]
         B     is [d,e|C]
         A - B is [a,b,c]
         B - C is [d,e]
         A - C is [a,b,c,d,e]

     If we set the goal:

         ?- concat([a,b,c|B] - B, [d,e|C] - C, A - C).

     we get:

         B = [d, e|_G400]
         C = _G400
         A = [a, b, c, d, e|_G400]

  23. Difference Lists
     If we now decide to make C the empty list we can set our goal as:

         ?- concat([a,b,c|B] - B, [d,e] - [], A - C).

     and we get the result:

         B = [d, e]
         A = [a, b, c, d, e]
         C = []
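
The difference-list examples can be combined into one small runnable sketch, assuming SWI-Prolog. concat/3 is the predicate from the presentation; join_as_list/3 is a hypothetical helper that closes the result by making the final tail the empty list.

```prolog
% concat(+DL1, +DL2, -DL3): join two difference lists.
% There is no recursion: the open tail of the first list is simply
% unified with the second list.
concat(A-B, B-C, A-C).

% join_as_list(+DL1, +DL2, -List): hypothetical helper.
% Joins the two difference lists and binds the remaining tail to [],
% so that List is an ordinary, closed list.
join_as_list(DL1, DL2, List) :-
    concat(DL1, DL2, List-[]).

% ?- join_as_list([a,b,c|B]-B, [d,e|C]-C, L).
% B = [d, e],
% C = [],
% L = [a, b, c, d, e].
```

The price of the single-step join is the reduced flexibility mentioned earlier: once B has been bound, the first difference list cannot be joined to anything else.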

  24. Using our grammar written in DCG Notation
     If we set a goal:

         ?- sentence(S, X).

     Prolog will reply:

         S = [the, cat, hates, the, cat|_G217]
         X = _G217

     If we set a goal:

         ?- sentence(S, []).

     Prolog will reply:

         S = [the, cat, hates, the, cat]

     We can therefore use our grammar to generate sentences without having to pay too much attention to the fact that we are using difference lists.

  25. Context Free Grammars
     All the grammars we have studied up to this point can be considered to be context free grammars. The choice of one word to go into our sentence does not impose any restriction on the other words we might add to the sentence. This is because we have chosen only singular forms of nouns and verbs. In reality most grammar rules are context sensitive, meaning that only certain noun and verb combinations are legal. We have to modify our NLP programs to allow for this, usually by adding extra arguments.

  26. Context Sensitive Grammars
     If we consider the first language we implemented, we can see that all the determiners, nouns and verbs can be used as singular forms. If we consider a sentence of this language to have the general format [determiner, noun, verb|Rest], then we can build a sentence using any combination of determiner, noun and verb.

         determiner([the]).
         determiner([a]).
         noun([girl]).
         noun([woman]).
         verb([loves]).
         verb([kicked]).
         adjective([fat]).
         adjective([young]).

  27. Context Sensitive Grammars
     If we add three plural forms (girls, women and love) to our language, then our original grammar is no longer acceptable. It can now generate sentences which are not grammatically correct:

         The girls loves the woman
         A women kicked the girls
         etc.

         determiner([the]).
         determiner([a]).
         noun([girl]).
         noun([girls]).
         noun([woman]).
         noun([women]).
         verb([loves]).
         verb([love]).
         verb([kicked]).
         adjective([fat]).
         adjective([young]).

  28. Context Sensitive Grammars

         determiner([the], singular).
         determiner([the], plural).
         determiner([a], singular).
         noun([girl], singular).
         noun([girls], plural).
         noun([woman], singular).
         noun([women], plural).
         verb([loves], singular).
         verb([love], plural).
         verb([kicked], singular).
         adjective([fat], singular).
         adjective([young], singular).

     By adding an extra argument to the representation of each terminal symbol we can identify its number as singular or plural. Then we have to rewrite our grammar so that it uses arity 2 predicates for its terminal symbols and enforces grammatically correct number agreement in sentences.

  29. Context Sensitive Grammars

         determiner([the], singular).
         determiner([the], plural).
         determiner([a], singular).
         noun([girl], singular).
         noun([girls], plural).
         noun([woman], singular).
         noun([women], plural).
         verb([loves], singular).
         verb([love], plural).
         verb([kicked], singular).
         adjective([fat], singular).
         adjective([young], singular).

         sentence(S, Num) :-
             noun_phrase(N, Num),
             verb_phrase(V, Num),
             append(N, V, S).

         noun_phrase(N1, Num) :-
             determiner(D1, Num),
             noun(N2, Num),
             append(D1, N2, N1).

         verb_phrase(V1, Num) :-
             verb(V2, Num),
             noun_phrase(N, _),
             append(V2, N, V1).
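
The same number agreement can also be expressed in DCG notation by giving each grammar symbol an extra argument. The version below is a sketch of that idea, not a grammar taken from the slides: it assumes SWI-Prolog, reuses the vocabulary and number markings given above, and leaves out the unused adjectives.

```prolog
% Num is either singular or plural and is shared between the subject
% noun phrase and the verb. The object noun phrase may have any number.
sentence(Num) --> noun_phrase(Num), verb_phrase(Num).
noun_phrase(Num) --> determiner(Num), noun(Num).
verb_phrase(Num) --> verb(Num), noun_phrase(_).

determiner(singular) --> [the].
determiner(plural) --> [the].
determiner(singular) --> [a].
noun(singular) --> [girl].
noun(plural) --> [girls].
noun(singular) --> [woman].
noun(plural) --> [women].
verb(singular) --> [loves].
verb(plural) --> [love].
verb(singular) --> [kicked].

% ?- phrase(sentence(Num), [the, girls, love, the, woman]).
% Num = plural.
%
% ?- phrase(sentence(_), [the, girls, loves, the, woman]).
% false.
%
% ?- phrase(sentence(_), [a, women, kicked, the, girls]).
% false.
```

As in the append/3 version, kicked is marked only as singular, so a sentence such as "the girls kicked the woman" is rejected until a plural entry for that verb is added.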
