Prolog for Linguists Symbolic Systems 139P/239P

Prolog for Linguists, Symbolic Systems 139P/239P. John Dowding. Week 5, November 5, 2001. jdowding@stanford.edu. Office hours: we have reserved 4 workstations in the Unix Cluster in Meyer Library (fables 1-4), 4:30-5:30 on Thursday this week. Or contact me and we can make other arrangements.


Presentation Transcript


  1. Prolog for Linguists
     Symbolic Systems 139P/239P
     John Dowding
     Week 5, November 5, 2001
     jdowding@stanford.edu

  2. Office Hours
     • We have reserved 4 workstations in the Unix Cluster in Meyer Library, fables 1-4
     • 4:30-5:30 on Thursday this week
     • Or, contact me and we can make other arrangements

  3. Course Schedule
     • Oct. 8
     • Oct. 15
     • Oct. 22
     • Oct. 29
     • Nov. 5 (double up)
     • Nov. 12
     • Nov. 26 (double up)
     • Dec. 3
     No class on Nov. 19

  4. More about cut!
     • It is common to distinguish between red cuts and green cuts
     • Red cuts change the solutions of a predicate
     • Green cuts do not change the solutions, but affect the efficiency
     • Most of the cuts we have used so far are red cuts
         % delete_all(+Element, +List, -NewList)
         delete_all(_Element, [], []).
         delete_all(Element, [Element|List], NewList) :-
             !,
             delete_all(Element, List, NewList).
         delete_all(Element, [Head|List], [Head|NewList]) :-
             delete_all(Element, List, NewList).
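One way to see why the cut in delete_all/3 is red is to compare it with a copy that has no cut. The sketch below repeats the slide's predicate and adds delete_all_nocut/3, a hypothetical name introduced only for this comparison.

```prolog
% delete_all/3 as on the slide, with the red cut.
delete_all(_Element, [], []).
delete_all(Element, [Element|List], NewList) :-
    !,
    delete_all(Element, List, NewList).
delete_all(Element, [Head|List], [Head|NewList]) :-
    delete_all(Element, List, NewList).

% delete_all_nocut/3 (hypothetical): the same clauses without the cut.
% The third clause can now also keep an element equal to Element.
delete_all_nocut(_Element, [], []).
delete_all_nocut(Element, [Element|List], NewList) :-
    delete_all_nocut(Element, List, NewList).
delete_all_nocut(Element, [Head|List], [Head|NewList]) :-
    delete_all_nocut(Element, List, NewList).
```

Collecting all answers for the list [a,b,a] with findall/3 gives [[b]] for delete_all/3, but [[b],[b,a],[a,b],[a,b,a]] for delete_all_nocut/3. Removing the cut changes the set of solutions, which is what makes it a red cut.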

  5. Green cuts
     • Green cuts can be used to avoid unproductive backtracking
         % identical(?Term1, ?Term2)
         identical(Var1, Var2) :-
             var(Var1), var(Var2), !,
             Var1 == Var2.
         identical(Atomic1, Atomic2) :-
             atomic(Atomic1), atomic(Atomic2), !,
             Atomic1 == Atomic2.
         identical(Term1, Term2) :-
             compound(Term1), compound(Term2),
             functor(Term1, Functor, Arity),
             functor(Term2, Functor, Arity),
             identical_helper(Arity, Term1, Term2).
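The slide calls identical_helper/3 without defining it. A minimal sketch of one possible definition, assuming it compares arguments from position Arity down to 1, could look like this (the clause bodies of identical_helper/3 are an assumption, not taken from the slides):

```prolog
% identical/2 as on the slide.
identical(Var1, Var2) :-
    var(Var1), var(Var2), !,
    Var1 == Var2.
identical(Atomic1, Atomic2) :-
    atomic(Atomic1), atomic(Atomic2), !,
    Atomic1 == Atomic2.
identical(Term1, Term2) :-
    compound(Term1), compound(Term2),
    functor(Term1, Functor, Arity),
    functor(Term2, Functor, Arity),
    identical_helper(Arity, Term1, Term2).

% identical_helper(+N, +Term1, +Term2): assumed definition.
% Succeeds if arguments N, N-1, ..., 1 of both terms are identical.
identical_helper(0, _Term1, _Term2) :- !.
identical_helper(N, Term1, Term2) :-
    arg(N, Term1, Arg1),
    arg(N, Term2, Arg2),
    identical(Arg1, Arg2),
    Next is N - 1,
    identical_helper(Next, Term1, Term2).
```

With this definition, identical(f(X, a, g(1)), f(X, a, g(1))) succeeds, while identical(f(X), f(Y)) fails for distinct unbound X and Y.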

  6. Input/Output of Terms
     • Input and output in Prolog take place on streams
     • By default, input comes from the keyboard, and output goes to the screen.
     • Three special streams:
       • user_input
       • user_output
       • user_error
     • read(-Term)
     • write(+Term)
     • nl

  7. Example: Input/Output
     • repeat/0 is a built-in predicate that always succeeds again on backtracking
         % classifying terms
         classify_term :-
             repeat,
             write('What term should I classify? '), nl,
             read(Term),
             process_term(Term),
             Term == end_of_file.
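The loop above relies on process_term/1, which the slide does not show. A minimal sketch, assuming it only reports what kind of term was read, might use a hypothetical helper term_class/2 built from the standard type-checking predicates:

```prolog
% term_class(+Term, -Class): hypothetical helper naming the kind of Term.
term_class(Term, variable) :- var(Term), !.
term_class(Term, number)   :- number(Term), !.
term_class(Term, atom)     :- atom(Term), !.
term_class(Term, compound) :- compound(Term), !.
term_class(_Term, other).  % anything else, e.g. system-specific types

% process_term(+Term): assumed behaviour, writes the class of Term.
process_term(Term) :-
    term_class(Term, Class),
    write(Class), nl.
```

Under this sketch, typing foo(bar). at the prompt prints compound, and typing end_of_file. prints atom and then ends the loop.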

  8. Streams
     • You can create streams with open/3
         open(+FileName, +Mode, -Stream)
     • Mode is one of read, write, or append.
     • When finished reading from or writing to a Stream, it should be closed with close(+Stream)
     • There are Stream versions of the other Input/Output predicates
       • read(+Stream, -Term)
       • write(+Stream, +Term)
       • nl(+Stream)
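As an illustration of the stream predicates listed above, the following sketch writes a term to a file and reads it back. The predicate name write_then_read/3 and the file name used in the usage note are assumptions made for this example. It uses writeq/2 rather than write/2 so that quoted atoms survive the round trip.

```prolog
% write_then_read(+File, +Term, -Copy): hypothetical example predicate.
% Writes Term to File, then opens File again and reads the term back.
write_then_read(File, Term, Copy) :-
    open(File, write, Out),
    writeq(Out, Term),   % writeq/2 quotes atoms so read/2 can parse them
    write(Out, '.'),     % read/2 expects a terminating full stop
    nl(Out),
    close(Out),
    open(File, read, In),
    read(In, Copy),
    close(In).
```

For example, write_then_read('stream_demo.tmp', foo('Bar', 42), Copy) should bind Copy to foo('Bar', 42), provided the file is writable.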

  9. Characters and character I/O
     • Prolog represents characters in two ways:
       • Single-character atoms: 'a', 'b', 'c'
       • Character codes: numbers that represent the character in some character encoding scheme (like ASCII)
     • By default, the character encoding scheme is ASCII, but others are possible for handling international character sets.
     • Input and output predicates for characters follow a naming convention:
       • If the predicate deals with single-character atoms, its name ends in _char.
       • If the predicate deals with character codes, its name ends in _code.
     • Characters are character codes in traditional "Edinburgh" Prolog, but single-character atoms were introduced in the ISO Prolog Standard.

  10. Special Syntax I
      • Prolog has a special syntax for typing character codes:
      • 0'a is an expression that means the character code that represents the character a in the current character encoding scheme.

  11. Special Syntax II
      • A sequence of characters enclosed in double quote marks is shorthand for a list containing those character codes.
      • "abc" = [97, 98, 99]
      • It is possible to change this default behavior to one that uses single-character atoms instead of character codes, but we won't do that here.
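The two notations can be checked with a short sketch. The predicate names code_of_a/1 and codes_of/2 are hypothetical, and the numeric values in the usage note assume an ASCII-compatible encoding. The sketch uses atom_codes/2 instead of a double-quoted literal, because some systems read "abc" as something other than a code list by default.

```prolog
% code_of_a(-Code): 0'a denotes the character code of the letter a.
code_of_a(Code) :-
    Code = 0'a.

% codes_of(+Atom, -Codes): the list of character codes spelling Atom.
codes_of(Atom, Codes) :-
    atom_codes(Atom, Codes).
```

With an ASCII-compatible encoding, code_of_a(Code) gives Code = 97 and codes_of(abc, Codes) gives Codes = [97, 98, 99].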

  12. Built-in Predicates:
      • atom_chars(Atom, CharacterCodes)
        • Converts an Atom to its corresponding list of character codes,
        • Or converts a list of CharacterCodes to an Atom.
        • In ISO Prolog this conversion is named atom_codes/2; ISO atom_chars/2 works with single-character atoms.
      • put_code(Code) and put_code(Stream, Code)
        • Write the character represented by Code
      • get_code(Code) and get_code(Stream, Code)
        • Read a character, and return its corresponding Code
      • Checking the status of a Stream:
        • at_end_of_file(Stream)
        • at_end_of_line(Stream)

  13. Review homework problems: last/2
          % last(?Element, ?List)
          last(Element, [Element]).
          last(Element, [_Head|Tail]) :-
              last(Element, Tail).
      Or
          last(Element, List) :-
              append(_EverythingElse, [Element], List).

  14. evenlist/1 and oddlist/1
          % evenlist(?List)
          evenlist([]).
          evenlist([_Head|Tail]) :-
              oddlist(Tail).
          % oddlist(?List)
          oddlist([_Head|Tail]) :-
              evenlist(Tail).

  15. palindrome/1
          % palindrome1(+List)
          palindrome1([]).
          palindrome1([_OneElement]).
          palindrome1([Head|Tail]) :-
              append(Rest, [Head], Tail),
              palindrome1(Rest).

  16. Or, palindrome/1
          % palindrome(+List)
          palindrome(List) :-
              reverse(List, List).
          % reverse(+List, -ReversedList)
          reverse(List, ReversedList) :-
              reverse(List, [], ReversedList).
          % reverse(+List, +Partial, -ReversedList)
          reverse([], Result, Result).
          reverse([Head|Tail], Partial, Result) :-
              reverse(Tail, [Head|Partial], Result).

  17. subset/2
          % subset(?Set, ?SubSet)
          subset([], []).
          subset([Element|RestSet], [Element|RestSubSet]) :-
              subset(RestSet, RestSubSet).
          subset([_Element|RestSet], SubSet) :-
              subset(RestSet, SubSet).

  18. union/3
          % union(+Set1, +Set2, -SetUnion)
          union([], Set2, Set2).
          union([Element|RestSet1], Set2, [Element|SetUnion]) :-
              union(RestSet1, Set2, SetUnion),
              \+ member(Element, SetUnion),
              !.
          union([_Element|RestSet1], Set2, SetUnion) :-
              union(RestSet1, Set2, SetUnion).

  19. intersect/3
          % intersect(+Set1, +Set2, ?Intersection)
          intersect([], _Set2, []).
          intersect([Element|RestSet1], Set2, [Element|Intersection]) :-
              member(Element, Set2),
              !,
              intersect(RestSet1, Set2, Intersection).
          intersect([_Element|RestSet1], Set2, Intersection) :-
              intersect(RestSet1, Set2, Intersection).

  20. split/4
          % split(+List, +SplitPoint, -Smaller, -Bigger)
          split([], _SplitPoint, [], []).
          split([Head|Tail], SplitPoint, [Head|Smaller], Bigger) :-
              Head =< SplitPoint,
              !,  % green cut
              split(Tail, SplitPoint, Smaller, Bigger).
          split([Head|Tail], SplitPoint, Smaller, [Head|Bigger]) :-
              Head > SplitPoint,
              split(Tail, SplitPoint, Smaller, Bigger).

  21. merge/3
          % merge(+List1, +List2, -MergedList)
          merge([], List2, List2).
          merge(List1, [], List1).
          merge([Element1|List1], [Element2|List2], [Element1|MergedList]) :-
              Element1 =< Element2,
              !,
              merge(List1, [Element2|List2], MergedList).
          merge(List1, [Element2|List2], [Element2|MergedList]) :-
              merge(List1, List2, MergedList).

  22. Sorting: quicksort/2
          % quicksort(+List, -SortedList)
          quicksort([], []).
          quicksort([Head|UnsortedList], SortedList) :-
              split(UnsortedList, Head, Smaller, Bigger),
              quicksort(Smaller, SortedSmaller),
              quicksort(Bigger, SortedBigger),
              append(SortedSmaller, [Head|SortedBigger], SortedList).
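To try quicksort/2 it has to be loaded together with split/4 from the earlier slide. The sketch below simply puts the two together. It assumes append/3 comes from library(lists), which is where many Prolog systems provide it.

```prolog
:- use_module(library(lists)).  % assumed source of append/3

% split(+List, +SplitPoint, -Smaller, -Bigger), as on the split/4 slide.
split([], _SplitPoint, [], []).
split([Head|Tail], SplitPoint, [Head|Smaller], Bigger) :-
    Head =< SplitPoint,
    !,  % green cut
    split(Tail, SplitPoint, Smaller, Bigger).
split([Head|Tail], SplitPoint, Smaller, [Head|Bigger]) :-
    Head > SplitPoint,
    split(Tail, SplitPoint, Smaller, Bigger).

% quicksort(+List, -SortedList): the head of the list is the pivot.
quicksort([], []).
quicksort([Head|UnsortedList], SortedList) :-
    split(UnsortedList, Head, Smaller, Bigger),
    quicksort(Smaller, SortedSmaller),
    quicksort(Bigger, SortedBigger),
    append(SortedSmaller, [Head|SortedBigger], SortedList).
```

The query quicksort([3, 1, 4, 1, 5], Sorted) gives Sorted = [1, 1, 3, 4, 5]. Duplicates are kept because split/4 sends elements equal to the pivot into Smaller.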

  23. Sorting: mergesort/2
          % mergesort(+List, -SortedList)
          mergesort([], []) :- !.
          mergesort([One], [One]) :- !.
          mergesort(List, SortedList) :-
              break_list_in_half(List, FirstHalf, SecondHalf),
              mergesort(FirstHalf, SortedFirstHalf),
              mergesort(SecondHalf, SortedSecondHalf),
              merge(SortedFirstHalf, SortedSecondHalf, SortedList).

  24. Merge sort helper predicates
          % break_list_in_half(+List, -FirstHalf, -SecondHalf)
          break_list_in_half(List, FirstHalf, SecondHalf) :-
              length(List, L),
              HalfL is L // 2,  % integer division, so HalfL stays an integer
              first_n(List, HalfL, FirstHalf, SecondHalf).
          % first_n(+List, +N, -FirstN, -Remainder)
          first_n([Head|Rest], L, [Head|Front], Back) :-
              L > 0,
              !,
              NextL is L - 1,
              first_n(Rest, NextL, Front, Back).
          first_n(Rest, _L, [], Rest).
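The merge sort pieces are spread over three slides, so here is a sketch that collects them into one loadable file. Two details are assumptions of this sketch rather than the slides' exact text: the first mergesort/2 clause carries a cut so the empty list is handled by that clause only, and the halving uses integer division (//).

```prolog
% merge(+List1, +List2, -MergedList), as on the merge/3 slide.
merge([], List2, List2).
merge(List1, [], List1).
merge([Element1|List1], [Element2|List2], [Element1|MergedList]) :-
    Element1 =< Element2,
    !,
    merge(List1, [Element2|List2], MergedList).
merge(List1, [Element2|List2], [Element2|MergedList]) :-
    merge(List1, List2, MergedList).

% mergesort(+List, -SortedList)
mergesort([], []) :- !.
mergesort([One], [One]) :- !.
mergesort(List, SortedList) :-
    break_list_in_half(List, FirstHalf, SecondHalf),
    mergesort(FirstHalf, SortedFirstHalf),
    mergesort(SecondHalf, SortedSecondHalf),
    merge(SortedFirstHalf, SortedSecondHalf, SortedList).

% break_list_in_half(+List, -FirstHalf, -SecondHalf)
break_list_in_half(List, FirstHalf, SecondHalf) :-
    length(List, L),
    HalfL is L // 2,
    first_n(List, HalfL, FirstHalf, SecondHalf).

% first_n(+List, +N, -FirstN, -Remainder)
first_n([Head|Rest], L, [Head|Front], Back) :-
    L > 0,
    !,
    NextL is L - 1,
    first_n(Rest, NextL, Front, Back).
first_n(Rest, _L, [], Rest).
```

The first answer to mergesort([3, 1, 2, 5, 4], Sorted) is Sorted = [1, 2, 3, 4, 5]. Because any list of two or more elements is split into two non-empty halves, the recursion always reaches one of the two base clauses.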

  25. Lexicographic Ordering
      • We can extend the sorting predicates to sort all Prolog terms using a lexicographic ordering on terms.
      • Defined recursively:
        • Variables @< Numbers @< Atoms @< CompoundTerms
        • Var1 @< Var2 if Var1 is older than Var2
        • Atom1 @< Atom2 if Atom1 is alphabetically earlier than Atom2.
        • Functor1(Arg11, …, Arg1N) @< Functor2(Arg21, …, Arg2M) if
          • N < M, or
          • N = M and Functor1 @< Functor2, or
          • N = M, Functor1 = Functor2, and
            • Arg11 @< Arg21, or
            • Arg11 == Arg21 and Arg12 @< Arg22, or …

  26. Built-in Relations:
      • Less than @<
      • Greater than @>
      • Less than or equal @=<
      • Greater than or equal @>=
      • The built-in predicate sort/2 sorts Prolog terms using this lexicographic ordering.
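A short sketch can make the ordering concrete. The predicate name sorted_terms/1 and the sample list are made up for this illustration. Note that sort/2 also removes duplicate elements.

```prolog
% sorted_terms(-Sorted): hypothetical example, sorts a mixed list of terms.
sorted_terms(Sorted) :-
    sort([f(a), b, 2, a, g(a, b), 1.5, b], Sorted).

% smaller_arity_first: a compound term with fewer arguments comes first,
% even though the name g is alphabetically after f.
smaller_arity_first :-
    g(a) @< f(a, b).
```

Here sorted_terms(Sorted) gives Sorted = [1.5, 2, a, b, f(a), g(a, b)]: numbers before atoms, atoms before compound terms, and the repeated b appears only once.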

  27. Tokenizer
      • A token is a sequence of characters that constitutes a single unit
      • What counts as a token will vary
      • A token for a programming language may be different from a token for, say, English.
      • We will start to write a tokenizer for English, and build on it in further classes

  28. Homework
      • Read the section in the SICStus Prolog manual on Input/Output
      • This material corresponds to Ch. 5 in Clocksin and Mellish, but the Prolog manual is more up to date and consistent with the ISO Prolog Standard
      • Improve the tokenizer by adding support for contractions
        • can't, won't, haven't, etc.
        • would've, should've
        • I'll, she'll, he'll
        • He's, She's (contracted is, contracted has, and possessive)
      • Don't hand this in, but hold on to it; you'll need it later.

  29. My tokenizer
      • First, I modified it to turn all tokens into lower case
      • Then, added support for integer tokens
      • Then, added support for contraction tokens

  30. Converting character codes to lower case
          % occurs_in_word(+Code, -LowerCaseCode)
          occurs_in_word(Code, Code) :-
              Code >= 0'a, Code =< 0'z.
          occurs_in_word(Code, LowerCaseWordCode) :-
              Code >= 0'A, Code =< 0'Z,
              LowerCaseWordCode is Code + (0'a - 0'A).

  31. Converting to lower case
          % case for regular word tokens
          find_one_token([WordCode|CharacterCodes], Token, RestCharacterCodes) :-
              occurs_in_word(WordCode, LowerCaseWordCode),
              find_rest_word_codes(CharacterCodes, RestWordCodes, RestCharacterCodes),
              atom_chars(Token, [LowerCaseWordCode|RestWordCodes]).
          % find_rest_word_codes(+CharacterCodes, -RestWordCodes, -RestCharacterCodes)
          find_rest_word_codes([WordCode|CharacterCodes], [LowerCaseWordCode|RestWordCodes], RestCharacterCodes) :-
              occurs_in_word(WordCode, LowerCaseWordCode),
              !,  % red cut
              find_rest_word_codes(CharacterCodes, RestWordCodes, RestCharacterCodes).
          find_rest_word_codes(CharacterCodes, [], CharacterCodes).

  32. Adding integer tokens
          % case for integer tokens
          find_one_token([DigitCode|CharacterCodes], Token, RestCharacterCodes) :-
              digit(DigitCode),
              find_rest_digit_codes(CharacterCodes, RestDigitCodes, RestCharacterCodes),
              atom_chars(Token, [DigitCode|RestDigitCodes]).
          % find_rest_digit_codes(+CharacterCodes, -RestDigitCodes, -RestCharacterCodes)
          find_rest_digit_codes([DigitCode|CharacterCodes], [DigitCode|RestDigitCodes], RestCharacterCodes) :-
              digit(DigitCode),
              !,  % red cut
              find_rest_digit_codes(CharacterCodes, RestDigitCodes, RestCharacterCodes).
          find_rest_digit_codes(CharacterCodes, [], CharacterCodes).

  33. Digits
          % digit(+Code)
          digit(Code) :-
              Code >= 0'0, Code =< 0'9.

  34. Contractions
      • Turned unambiguous contractions into the corresponding English word
      • Left ambiguous contractions contracted.
      • Handled 2 cases
        • Simple contractions:
            He's => He + 's
            He'll => He + will
            They've => They + have
        • Exceptions:
            can't => can + not
            won't => will + not

  35. Simple Contractions
          simple_contraction("'re", "are").
          simple_contraction("'m", "am").
          simple_contraction("'ll", "will").
          simple_contraction("'ve", "have").
          simple_contraction("'d", "'d").    % had, would
          simple_contraction("'s", "'s").    % is, has, possessive
          simple_contraction("n't", "not").

  36. handle_contractions/3
          % handle_contractions(+TokenChars, -FrontTokenChars, -RestTokenChars)
          handle_contractions("can't", "can", "not") :- !.
          handle_contractions("won't", "will", "not") :- !.
          handle_contractions(FoundCodes, Front, NewCodes) :-
              simple_contraction(Contraction, NewCodes),
              append(Front, Contraction, FoundCodes),
              Front \== [],
              !.
          % no contraction found: keep the word and append nothing
          handle_contractions(Codes, Codes, []).

  37. Modify find_one_token/3
          % case for regular word tokens
          find_one_token([WordCode|CharacterCodes], Token, RestCharacterCodes) :-
              occurs_in_word(WordCode, LowerCaseWordCode),
              find_rest_word_codes(CharacterCodes, RestWordCodes, TempCharacterCodes),
              handle_contractions([LowerCaseWordCode|RestWordCodes], FirstTokenCodes, CodesToAppend),
              append(CodesToAppend, TempCharacterCodes, RestCharacterCodes),
              atom_chars(Token, FirstTokenCodes).
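The slides show the tokenizer in fragments, so the following is a sketch of how the word and contraction pieces might fit together in one file. Several parts are assumptions not found on the slides: the driver tokenize/2, the apostrophe clause of occurs_in_word/2 (without it a contraction would never be collected as one word), the final catch-all clause of handle_contractions/3, the double_quotes directive, and the use of ISO atom_codes/2 in place of atom_chars/2. Integer tokens are left out to keep the sketch short.

```prolog
:- use_module(library(lists)).              % assumed source of append/3
:- set_prolog_flag(double_quotes, codes).   % "abc" reads as a list of codes

% tokenize(+Codes, -Tokens): hypothetical driver, not shown on the slides.
tokenize([], []).
tokenize(Codes, [Token|Tokens]) :-
    find_one_token(Codes, Token, RestCodes),
    !,
    tokenize(RestCodes, Tokens).
tokenize([_Skipped|Codes], Tokens) :-       % skip spaces and punctuation
    tokenize(Codes, Tokens).

% case for regular word tokens, with contraction handling
find_one_token([WordCode|Codes], Token, RestCodes) :-
    occurs_in_word(WordCode, LowerCode),
    find_rest_word_codes(Codes, RestWordCodes, TempCodes),
    handle_contractions([LowerCode|RestWordCodes], FirstCodes, CodesToAppend),
    append(CodesToAppend, TempCodes, RestCodes),
    atom_codes(Token, FirstCodes).

find_rest_word_codes([Code|Codes], [Lower|RestWord], Rest) :-
    occurs_in_word(Code, Lower),
    !,
    find_rest_word_codes(Codes, RestWord, Rest).
find_rest_word_codes(Codes, [], Codes).

occurs_in_word(Code, Code) :-
    Code >= 0'a, Code =< 0'z.
occurs_in_word(Code, Lower) :-
    Code >= 0'A, Code =< 0'Z,
    Lower is Code + (0'a - 0'A).
occurs_in_word(39, 39).                     % assumption: 39 is the apostrophe

handle_contractions("can't", "can", "not") :- !.
handle_contractions("won't", "will", "not") :- !.
handle_contractions(Found, Front, New) :-
    simple_contraction(Contraction, New),
    append(Front, Contraction, Found),
    Front \== [],
    !.
handle_contractions(Codes, Codes, []).      % assumption: no contraction found

simple_contraction("'re", "are").
simple_contraction("'m", "am").
simple_contraction("'ll", "will").
simple_contraction("'ve", "have").
simple_contraction("'d", "'d").
simple_contraction("'s", "'s").
simple_contraction("n't", "not").
```

Under these assumptions, tokenizing the codes of He'll say I can't yields [he, will, say, i, can, not]. The expansion is pushed back onto the remaining input, so will and not are tokenized as ordinary words on the next step, and the test Front \== [] keeps a pushed-back 's from being expanded again.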

  38. Dynamic predicates and assert
      • Add or remove clauses from a dynamic predicate at run time.
      • To specify that a predicate is dynamic, add
          :- dynamic predicate/Arity.
        to your program.
      • assert/1 adds a new clause
      • retract/1 removes one or more clauses
      • retractall/1 removes all clauses for the predicate
      • Can't modify compiled predicates at run time
      • Modifying a program while it is running is dangerous

  39. assert/1, asserta/1, and assertz/1
      • Asserting facts (most common)
          assert(Fact)
      • Asserting rules
          assert( (Head :- Body) ).
      • asserta/1 adds the new clause at the front of the predicate
      • assertz/1 adds the new clause at the end of the predicate
      • assert/1 leaves the order unspecified

  40. Built-in: retract/1
      • retract(Goal) removes the first clause that matches Goal.
      • On REDO, it will remove the next matching clause, if any.
      • Retract facts:
          retract(Fact)
      • Retract rules:
          retract( (Head :- Body) ).

  41. Built-in: retractall/1
      • retractall(Head) removes all facts and rules whose head matches.
      • Could be implemented with retract/1 as:
          retractall(Head) :-
              retract(Head),
              fail.
          retractall(Head) :-
              retract( (Head :- _Body) ),
              fail.
          retractall(_Head).

  42. Built-in: abolish(Predicate/Arity)
      • abolish(Predicate/Arity) is almost the same as retractall(Predicate(Arg1, …, ArgN)), except that abolish/1 removes all knowledge about the predicate, whereas retractall/1 only removes the clauses of the predicate.
      • That is, if a predicate is declared dynamic, that is remembered after retractall/1, but not after abolish/1.

  43. Example: Stacks & Queues
          :- dynamic stack_element/1.
          empty_stack :-
              retractall(stack_element(_Element)).
          % push_on_stack(+Element)
          push_on_stack(Element) :-
              asserta(stack_element(Element)).
          % pop_from_stack(-Element)
          pop_from_stack(Element) :-
              var(Element),
              retract(stack_element(Element)),
              !.

  44. Queues
          :- dynamic queue_element/1.
          empty_queue :-
              retractall(queue_element(_Element)).
          % put_on_queue(+Element)
          put_on_queue(Element) :-
              assertz(queue_element(Element)).
          % remove_from_queue(-Element)
          remove_from_queue(Element) :-
              var(Element),
              retract(queue_element(Element)),
              !.
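The only difference between the stack and the queue is asserta/1 versus assertz/1. A small sketch with two hypothetical demo predicates, stack_demo/1 and queue_demo/1, shows the effect on the order in which elements come back out:

```prolog
:- dynamic stack_element/1.
:- dynamic queue_element/1.

empty_stack :- retractall(stack_element(_)).
push_on_stack(Element) :- asserta(stack_element(Element)).
pop_from_stack(Element) :-
    var(Element), retract(stack_element(Element)), !.

empty_queue :- retractall(queue_element(_)).
put_on_queue(Element) :- assertz(queue_element(Element)).
remove_from_queue(Element) :-
    var(Element), retract(queue_element(Element)), !.

% stack_demo(-Popped): hypothetical demo, push a, b, c and pop once.
stack_demo(Popped) :-
    empty_stack,
    push_on_stack(a), push_on_stack(b), push_on_stack(c),
    pop_from_stack(Popped).

% queue_demo(-Removed): hypothetical demo, put a, b, c and remove once.
queue_demo(Removed) :-
    empty_queue,
    put_on_queue(a), put_on_queue(b), put_on_queue(c),
    remove_from_queue(Removed).
```

stack_demo(X) gives X = c (last in, first out), while queue_demo(X) gives X = a (first in, first out), because retract/1 always removes the first matching clause.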

  45. Example: prime_number
          :- dynamic known_prime/1.
          find_primes(Prime) :-
              retractall(known_prime(_Prime)),
              find_primes(2, Prime).
          find_primes(Integer, Integer) :-
              \+ composite(Integer),
              assertz(known_prime(Integer)).
          find_primes(Integer, Prime) :-
              NextInteger is Integer + 1,
              find_primes(NextInteger, Prime).

  46. Example: prime_number (cont)
          % composite(+Integer)
          composite(Integer) :-
              known_prime(Prime),
              0 is Integer mod Prime,
              !.
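find_primes/1 works as a generator: each time it is backtracked into, it returns the next prime and records it with assertz/1. The sketch below repeats the two slides and adds a hypothetical helper, first_prime_above/2, to show one way of driving the generator. It is meant to be called with an unbound second argument.

```prolog
:- dynamic known_prime/1.

find_primes(Prime) :-
    retractall(known_prime(_)),
    find_primes(2, Prime).

find_primes(Integer, Integer) :-
    \+ composite(Integer),
    assertz(known_prime(Integer)).
find_primes(Integer, Prime) :-
    NextInteger is Integer + 1,
    find_primes(NextInteger, Prime).

% composite(+Integer): Integer is divisible by a prime found so far.
composite(Integer) :-
    known_prime(Prime),
    0 is Integer mod Prime,
    !.

% first_prime_above(+Limit, -Prime): hypothetical helper.
% Backtracks through the generator until a prime exceeds Limit.
first_prime_above(Limit, Prime) :-
    find_primes(Prime),
    Prime > Limit,
    !.
```

first_prime_above(20, P) gives P = 23. Along the way the facts known_prime(2) through known_prime(23) are asserted, which is the information that survives each failure and retry.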

  47. Aggregation: findall/3
      • findall/3 is a meta-predicate that collects values from multiple solutions to a Goal:
          findall(Value, Goal, Values)
          findall(Child, parent(james, Child), Children)
      • Prolog has other aggregation predicates, setof/3 and bagof/3, but we'll ignore them for now.
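To make the parent/2 query above runnable, here is a sketch with a few made-up parent/2 facts. The facts and the wrapper children_of/2 are hypothetical and serve only as illustration.

```prolog
% parent(?Parent, ?Child): made-up facts for illustration.
parent(james, charles).
parent(james, elizabeth).
parent(charles, catherine).

% children_of(+Parent, -Children): hypothetical wrapper around findall/3.
children_of(Parent, Children) :-
    findall(Child, parent(Parent, Child), Children).
```

children_of(james, Children) gives Children = [charles, elizabeth], in clause order. When the goal has no solutions, findall/3 still succeeds with the empty list, so children_of(catherine, Children) gives Children = [].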

  48. findall/3 and assert/1
      • findall/3 and assert/1 both let you preserve information across failure.
          :- dynamic solutions/1.
          findall(Value, Goal, _Solutions) :-
              retractall(solutions(_)),
              assert(solutions([])),
              call(Goal),
              retract(solutions(S)),
              append(S, [Value], NextSolutions),
              assert(solutions(NextSolutions)),
              fail.
          findall(_Value, _Goal, Solutions) :-
              solutions(Solutions).

  49. Special Syntax III: Operators
      • Convenience in writing terms
      • We've seen them all over already:
          union([Element|RestSet1], Set2, [Element|SetUnion]) :-
              union(RestSet1, Set2, SetUnion),
              \+ member(Element, SetUnion),
              !.
      • This is just an easier way to write the term:
          ':-'(union([Element|RestSet1], Set2, [Element|SetUnion]),
               ','(union(RestSet1, Set2, SetUnion),
                   ','('\+'(member(Element, SetUnion)),
                       !)))

  50. Operators (cont)
      • Operators can come before their arguments (prefix)
        • \+, dynamic
      • Or between their arguments (infix)
        • , + is <
      • Or after their arguments (postfix)
        • Prolog doesn't use any of these (yet)
      • The same operator can be more than one type
        • :-
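Since operator notation is only another way of writing ordinary terms, the two forms unify with each other. The sketch below checks a few cases. The predicate names operator_demo/0 and both_types/0 are hypothetical, and current_op/3 is the standard built-in for inspecting operator declarations.

```prolog
% operator_demo: each line unifies an operator expression with the
% same term written in ordinary functor notation.
operator_demo :-
    (a :- b, c)  = ':-'(a, ','(b, c)),
    (1 + 2 * 3)  = '+'(1, '*'(2, 3)),
    (\+ a)       = '\+'(a),
    (X is 1 + 2) = is(X, '+'(1, 2)).

% both_types: :- is declared both as a prefix (fx) operator, used for
% directives, and as an infix (xfx) operator, used for rules.
both_types :-
    current_op(_, fx, (:-)),
    current_op(_, xfx, (:-)).
```

Both predicates succeed. The second unification in operator_demo/0 also shows operator priority at work: * binds tighter than +, so 2 * 3 becomes the second argument of +.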
