Intelligent systems Lecture 21 Example of ALICE-like dialog system. Language AIML for ALICE-like systems
The Main Processing Cycle of Elizabeth
• Receive the user’s input as the ‘active text’.
• Input Transformations: apply any input transforms.
• Keyword Transformations: search for a keyword; if one is found, replace the active text with a response from the corresponding set; if not, replace it with a no-keyword response.
• Output Transformations: apply any output transforms.
• Output the new ‘active text’.
Input/Output Transformations • Input transformations are applied to the initial input; their main use is to standardise words that you want to be treated similarly, e.g. • I mum => mother • if you want ‘mum’ to be changed to ‘mother’. • Output transformations are applied to the final output; often their main use is to change first-person to second-person and vice-versa, e.g. • O i am => YOU ARE • Make sure you capitalise these as illustrated above.
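As a rough illustration of the cycle and the two kinds of transformation, the following Python sketch runs one turn. It is not Elizabeth's implementation: the names `apply_transforms` and `run_cycle`, the whole-word case-insensitive matching, and the "first listed keyword wins" rule are assumptions made for this example, and the response is fixed rather than chosen from a set.

```python
import re


def apply_transforms(text, transforms):
    """Replace each whole-word pattern with its replacement, ignoring case."""
    for pattern, replacement in transforms:
        regex = rf"\b{re.escape(pattern)}\b"
        # A function replacement keeps the text literal (no backslash processing).
        text = re.sub(regex, lambda m, r=replacement: r, text, flags=re.IGNORECASE)
    return text


def run_cycle(user_input, input_transforms, keyword_sets, no_keyword, output_transforms):
    # 1. The user's input becomes the active text.
    active = user_input
    # 2. Input transformations (e.g. mum => mother).
    active = apply_transforms(active, input_transforms)
    # 3. Keyword transformation: the first keyword found replaces the active text.
    for keyword, response in keyword_sets:
        if re.search(rf"\b{re.escape(keyword)}\b", active, flags=re.IGNORECASE):
            active = response
            break
    else:
        active = no_keyword
    # 4. Output transformations (e.g. i am => YOU ARE), then output.
    return apply_transforms(active, output_transforms)
```

With the input transformation `mum => mother` and a keyword `mother`, an input such as "My mum is kind" reaches the keyword stage as "My mother is kind" and so triggers the family response.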
Simple Keywords and Responses
• The following script commands create a simple keyword/response set with two keywords and three responses:
K MOTHER
K FATHER
R TELL ME MORE ABOUT YOUR FAMILY.
R DO YOU HAVE ANY BROTHERS OR SISTERS?
R ARE YOU THE YOUNGEST IN YOUR FAMILY?
• When ‘mother’ or ‘father’ is found in the active text, one of the responses will be chosen (randomly, but avoiding immediate repetition if possible).
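The selection rule described above (random choice, avoiding immediate repetition where possible) can be sketched in Python as follows. The class name `ResponseSet` and its methods are hypothetical, and the word-splitting rule is an assumption; Elizabeth's own matching may differ.

```python
import random
import re


class ResponseSet:
    """A keyword/response set that avoids repeating its previous response."""

    def __init__(self, keywords, responses, rng=None):
        self.keywords = {k.upper() for k in keywords}
        self.responses = list(responses)
        self.last = None
        self.rng = rng or random.Random()

    def matches(self, active_text):
        # Words may contain letters, hyphens and apostrophes.
        words = re.findall(r"[A-Z'-]+", active_text.upper())
        return any(word in self.keywords for word in words)

    def choose(self):
        # Exclude the previous response unless it is the only one available.
        candidates = [r for r in self.responses if r != self.last]
        self.last = self.rng.choice(candidates or self.responses)
        return self.last
```

Excluding only the most recent response keeps the choice random while still honouring the "if possible" qualification: a set with a single response simply repeats it.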
Keywords with Substitution
• The following script commands create a keyword/response set which pattern-matches the keyword against the active text and then makes appropriate substitutions in the response:
K [phr1] IS YOUNGER THAN [phr2]
R SO [phr2] IS OLDER THAN [phr1]
• Any pattern of the form [p…] is a phrase wildcard, matching any sequence of words (which can contain only letters, hyphens or apostrophes). [phr1] is treated as a separate pattern from [phr2].
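One way to picture phrase wildcards is as named groups in a regular expression. The Python sketch below takes that approach; `compile_keyword` and `keyword_response` are hypothetical names, and upper-casing the substituted text is a choice made here to match the style of the response, not documented Elizabeth behaviour.

```python
import re

WORD = r"[A-Za-z'-]+"  # letters, hyphens or apostrophes only


def compile_keyword(pattern):
    """Turn e.g. '[phr1] IS YOUNGER THAN [phr2]' into a compiled regex."""
    regex = ""
    for part in re.split(r"(\[p\w*\])", pattern):
        if re.fullmatch(r"\[p\w*\]", part):
            name = part[1:-1]  # '[phr1]' -> 'phr1'
            regex += rf"(?P<{name}>{WORD}(?: {WORD})*)"
        else:
            regex += re.escape(part)
    return re.compile(regex, re.IGNORECASE)


def keyword_response(keyword, response, active_text):
    """Return the response with wildcards filled in, or None if no match."""
    match = compile_keyword(keyword).search(active_text)
    if match is None:
        return None
    return re.sub(r"\[(p\w*)\]", lambda m: match.group(m.group(1)).upper(), response)
```

For the input "my sister is younger than my brother", `[phr1]` binds to "my sister" and `[phr2]` to "my brother", so the two phrases swap places in the response.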
Pattern Matching • Any of these patterns can be used in combination (see the help file section ‘Pattern Matching’ for the complete list): • [w…] any single complete word (or part-word) • [t…] any single complete term (or part-term) – a term, unlike a word, may contain digits as well as letters • [l…] any single letter (i.e. any character that can occur in a word, including hyphen/apostrophe) • [p…] a phrase – any sequence of complete words • [X…] any text string which contains only complete ‘items’ (so it cannot contain only half a word or number). • [b…] like [X…], but will only match text in which all brackets – ‘(’, ‘)’, ‘<’, and ‘>’ – correctly pair up. • [;] any punctuation mark • [] matches beginning or end of active text
Memorising and Recalling Phrases • Note from the previous example: • ‘& {…}’ is used to specify an action, in this case one that is triggered by the matching of a keyword and the selection of a corresponding response; • ‘{M [phrase]}’ memorises whatever text was matched against [phrase]; • [M] can then be used to recall the latest remembered text, within any kind of transformation or response; • Here a no-keyword response is created, which when invoked will make use of the latest memory ([M]). • [M-1], [M-2] etc. can be used to recall earlier memories (the last but 1, last but 2, etc.).
Memorising Pronoun References
• One simple use of index-coded memories is to keep track of what’s been referred to by a recent output, so that pronouns (‘it’, ‘they’ etc.) can be dealt with appropriately. The following script might yield the dialogue ‘I watch football. WHAT DO YOU THINK OF DAVID BECKHAM? He crosses well. I LIKE HIS FREE KICKS …’: here the input transformation replaces ‘He’ in the last input with ‘BECKHAM’, enabling an appropriate response to be found.
I HE => [Mhe]
I HIM => [Mhe]
K FOOTBALL
R WHAT DO YOU THINK OF DAVID BECKHAM? & {Mhe BECKHAM}
K BECKHAM
R I LIKE HIS FREE KICKS, BUT NOT HIS HAIR!
Using Multiple Memories
• This script will keep track of some of your favourites, tell you what they are, and then go on repeating them.
W WHAT ARE YOUR FAVOURITE GAME, TEAM AND PLAYER?
K GAME [X?] IS [phrase] & {Mgame [phrase]}
K TEAM [X?] IS [phrase] & {Mteam [phrase]}
K PLAYER [X?] IS [phrase] & {Mplayer [phrase]}
R THANK YOU - SAY "OK" WHEN YOU'VE FINISHED
K OK
R YOUR FAVOURITE GAME IS [Mgame], TEAM IS [Mteam], AND PLAYER IS [Mplayer] & {I [word] => OK}
N PLEASE CARRY ON TELLING ME YOUR FAVOURITES
Note from the previous example: • ‘K GAME [X?] IS [phrase]’ matches any text containing the word ‘GAME’ and then at some later point ‘IS’ followed by a phrase (recall that a ‘phrase’ here just means one or more words in sequence); • ‘& {Mgame [phrase]}’ then memorises the relevant phrase under the index code ‘game’; • ‘R YOUR FAVOURITE GAME IS [Mgame], TEAM IS [Mteam], AND PLAYER IS [Mplayer]’ outputs the three memories, but this response cannot be used until something has been memorised under each of the three index codes (you can check this by inputting ‘OK’); • ‘& {I [word] => OK}’ creates an input transformation which changes all words to ‘OK’ – this simply ensures that from then on, any input will be treated as though it was just ‘OK OK …’.
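The behaviour of index-coded memories, including the rule that a response cannot be used until every memory it mentions has been set, can be sketched in Python. The `Memory` class and the `fill` function are hypothetical names for this illustration; the treatment of `[M-1]` follows the 'last but 1' description, and extending it to named indexes is an assumption.

```python
import re


class Memory:
    def __init__(self):
        self.slots = {}  # index code -> memorised texts, oldest first

    def memorise(self, index, text):
        self.slots.setdefault(index, []).append(text)

    def recall(self, index="", back=0):
        """Return the latest memory (back=0), the last but one (back=1), etc."""
        items = self.slots.get(index, [])
        return items[-1 - back] if back < len(items) else None


def fill(response, memory):
    """Fill [M], [Mindex] and [M-n] references; None if any is not yet set."""
    missing = []

    def substitute(match):
        index, back = match.group(1), int(match.group(2) or 0)
        value = memory.recall(index, back)
        if value is None:
            missing.append(match.group(0))
            return match.group(0)
        return value

    filled = re.sub(r"\[M(\w*)(?:-(\d+))?\]", substitute, response)
    return None if missing else filled
```

Returning `None` for an incomplete response lets a caller skip it and fall back to another response, which mirrors what happens when ‘OK’ is entered before all three favourites are known.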
Changing Mood
• The following script fragment makes Elizabeth get progressively more angry at the user’s swearing (starting off in the ‘calm’ state, then progressing to ‘cross’ and ‘enough’). Note how ‘M\’ is used to delete all memories, and that more than one command can be put inside the curly brackets.
K DAMN
K BLOODY
R [Mcalm] I'D RATHER YOU DIDN'T SWEAR, PLEASE & {M\ Mcross}
R [Mcross] LOOK, JUST STOP SWEARING WILL YOU! & {M\ Menough}
R [Menough] THAT'S IT! I'VE HAD ENOUGH - GO AWAY! & {M\ O [X] => JUST GO AWAY}
Mcalm
Syntactic Analysis • The ELIZA method of simple pattern-matching and pre-formed responses may sometimes be able to generate the illusion of ‘intelligent’ language processing, and even in some cases (e.g. a computer help system) provide the basis for a useful tool. • However to get anywhere near genuine NLP (natural language processing), Elizabeth needs to do more than pattern-match – it must be responsive to the structure of sentences, and react not just according to the literal word strings they contain, but how these words are put together – their syntax.
A Testbed: Simple Transformations • A good testbed for Elizabeth’s potential for handling syntactic structure is the attempt to generate simple grammatical transformations. • A transformation is a change in structure which alters the ‘surface’ form of the sentence (so the words are different, or in a different order), but without significantly altering its ‘propositional content’ (i.e. what ‘facts’ are in question; what the sentence ‘says’ about what or whom). • Transformations played a major and controversial role in the rise of Chomskyan linguistics, but their value as a useful testbed is independent of all that.
Our Starting Point: Active Declarative Sentences • We start from straightforward active declarative sentences, such as: • John chases the cat • The white rabbits bit a black dog • You like her • Declarative simply means that these sentences purport to state (‘declare’) facts – they are not questions or commands, for example. • Here we shall stick to very simple word categories and grammatical constructs.
Some Types of Transformation (1): Active to Passive • Most types of transformation are easier to grasp by example than explanation: • Active to Passive • ‘John chases the cat’ becomes • ‘The cat is chased by John’ • ‘The white rabbits bit a black dog’ becomes • ‘A black dog was bitten by the white rabbits’ • ‘You like her’ becomes • ‘She is liked by you’
(2): Yes/No Questions • These transform the sentence into a question with a simple yes/no answer: • ‘John chases the cat’ becomes • ‘Does John chase the cat?’ • ‘You like her’ becomes • ‘Do you like her?’ • They can also be applied to passive sentences, though here they’re a bit more complicated: • ‘A black dog was bitten by the white rabbits’ becomes • ‘Was a black dog bitten by the white rabbits?’
(3): Tag Questions • A Tag Question is appended to the end of a sentence, to ask for confirmation or to give emphasis to what was said: • ‘John chases the cat’ becomes • ‘John chases the cat, doesn’t he?’ • ‘The white rabbits bit a black dog’ becomes • ‘The white rabbits bit a black dog, didn’t they?’ • ‘You like her’ becomes • ‘You like her, don’t you?’ • These provide an excellent test case, because a tag question must agree with the sentence in number (singular or plural), person (first person, second, third), gender (masculine, feminine, neuter), and tense (past, present, future).
Phrase Structure Rules in Elizabeth • The phrase structure rules above can be reversed and then translated into Elizabeth input transformations suitable for analysing a sentence into its structural constituents: • NP → D N • I (d:[b1]) (n:[b2]) => (np:(D:[b1]) (N:[b2])) • VP → V NP • I (v:[b1]) (np:[b2]) => (vp:(V:[b1]) (NP:[b2])) • S → NP VP • I (np:[b1]) (vp:[b2]) => (s: (NP:[b1]) (VP:[b2])) • Note here that a ‘[b…]’ pattern can match anything at all, as long as it contains matching brackets. This ensures that the sentence structure is recorded by the ‘nested’ brackets, and that the processing respects this structure.
Obviously we also need to specify the categories (noun, verb etc) for the various words. We might end up with a set of input transformations like this: • I the => (d:THE) • I dog => (n:DOG) • I cat => (n:CAT) • I chases => (v:CHASES) • I (d:[b1]) (n:[b2]) => (np:(D:[b1]) (N:[b2])) • I (v:[b1]) (np:[b2]) => (vp:(V:[b1]) (NP:[b2])) • I (np:[b1]) (vp:[b2]) => (s: (NP:[b1]) (VP:[b2])) • If we then input the sentence: • the dog chases the cat • the input transformations will convert this into: • (s: (NP:(D:THE)(N:DOG)) (VP:(V:CHASES) (NP:(D:THE)(N:CAT))))
Having used the input transformations to analyse the sentence into its constituent structure, we can then apply keyword transformations to alter that structure, e.g. from active to passive: • K (s:(NP:[b1]) (VP:[b2])) • R (s:(VP:[b2] passive) (NP:[b1])) • Then output transformations can be used to decompose the sentence structure back into its parts: • O (s:(VP:[b1] passive) (NP:[b2])) => (vp:[b1] passive)(np:[b2]) • O (vp:(V:[b1]) (NP:[b2]) passive) => (np:[b2])(v:[b1] passive) • O (np:(D:[b1]) (N:[b2])) => (d:[b1]) (n:[b2]) • O (v:CHASES passive) => IS CHASED BY • O (d:[b1]) => [b1] • O (n:[b1]) => [b1] • If we then input the sentence: • the dog chases the cat • the output will have been ‘translated’ into the passive form: • the cat is chased by the dog
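The analyse, transform and decompose steps above can be mirrored in a short Python sketch. It uses nested tuples instead of Elizabeth's bracket patterns, so it illustrates the idea rather than the script mechanism; `LEXICON`, `RULES`, `PASSIVE_VERB`, `parse`, `words` and `to_passive` are all names invented for this example.

```python
LEXICON = {"the": "d", "dog": "n", "cat": "n", "chases": "v"}
# Reversed phrase structure rules: (left, right) children reduce to a parent.
RULES = [(("d", "n"), "np"), (("v", "np"), "vp"), (("np", "vp"), "s")]
PASSIVE_VERB = {"CHASES": "IS CHASED BY"}


def parse(sentence):
    """Reduce tagged words bottom-up until no rule applies."""
    nodes = [(LEXICON[w], w.upper()) for w in sentence.lower().split()]
    reduced = True
    while reduced:
        reduced = False
        for (left, right), parent in RULES:
            for i in range(len(nodes) - 1):
                if nodes[i][0] == left and nodes[i + 1][0] == right:
                    nodes[i:i + 2] = [(parent, nodes[i], nodes[i + 1])]
                    reduced = True
                    break
            if reduced:
                break
    return nodes[0] if len(nodes) == 1 else None


def words(node):
    """Flatten a tree back into its list of words."""
    if isinstance(node[1], str):
        return [node[1]]
    return [w for child in node[1:] for w in words(child)]


def to_passive(tree):
    """Swap subject and object of an (s, np, (vp, v, np)) tree."""
    _, subject, (_, verb, obj) = tree
    return " ".join(words(obj) + [PASSIVE_VERB[verb[1]]] + words(subject))
```

Parsing "the dog chases the cat" gives a tree of the same shape as the bracketed structure shown earlier, and `to_passive` then yields "THE CAT IS CHASED BY THE DOG".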
Binary Propositional Connectives • A binary propositional connective joins two propositions together to make a third (complex) proposition. • Such connectives in English include ‘and’, ‘because’, ‘but’, ‘if’, ‘implies’, ‘nevertheless’, ‘only if’, ‘or’, ‘suggests that’, ‘unless’. • ‘Snow is white’ and ‘the moon is cheese’ are atomic propositions (i.e. they’re not themselves made up of other propositions). Using the connectives, we get: • Snow is white and the moon is cheese • Snow is white because the moon is cheese • Snow is white but the moon is cheese • Snow is white if the moon is cheese (etc.)
Language AIML • Artificial Linguistic Internet Computer Entity (A.L.I.C.E.) • Artificial Intelligence Markup Language (AIML) • A.L.I.C.E., the first AIML-based personality program, won the Loebner Prize as “the most human computer” at the annual Turing Test contests in 2000 and 2001 (Loebner 2002). • More than 500 volunteers from around the world have contributed to her development since 1995. • How to use AIML to create robot personalities like A.L.I.C.E. that pretend to be intelligent and self-aware.
AIML • AIML files consist of simple stimulus-response modules called categories. • Each <category> contains a <pattern>, or “stimulus,” and a <template>, or “response.” • AIML software stores the stimulus-response categories in a tree managed by an object called the Graphmaster. • When a bot client inputs text as a stimulus, the Graphmaster searches the categories for a matching <pattern>, along with any associated context, and then outputs the associated <template> as a response.
AIML • These categories can be structured to produce more complex humanlike responses with the use of a very few markup tags. • AIML bots make extensive use of the multi-purpose recursive <srai> tag, as well as two AIML context tags, <that> and <topic>. • Conditional branching in AIML is implemented with the <condition> tag. • AIML implements the ELIZA personal pronoun swapping method with the <person> tag. • Bot personalities are created and shaped through a cyclical process of supervised learning called Targeting. • Targeting is a cycle incorporating client, bot, and botmaster, wherein client inputs that find no complete match among the categories are logged by the bot and delivered as Targets to the botmaster, who then creates suitable responses, starting with the most common queries. • The Targeting cycle produces a progressively more refined bot personality. The art of AIML writing is most apparent in creating default categories, which provide noncommittal replies to a wide range of inputs.
Categories • In its simplest form, the template consists of only plain, unmarked text. More generally, AIML tags transform the reply into a mini computer program which can save data, activate other programs, give conditional responses, and recursively call the pattern matcher to insert the responses from other categories. • AIML currently supports two ways to interface with other languages and systems. The <system> tag executes any program accessible as an operating system shell command, and inserts the results in the reply. Similarly, the <javascript> tag allows arbitrary scripting inside the templates. • The optional context portion of the category consists of two variants, called <that> and <topic>. The <that> tag appears inside the category, and its pattern must match the robot’s last utterance. Remembering the last utterance is important if the robot asks a question. The <topic> tag appears outside the category, and collects a group of categories together. The topic may be set inside any template.
Recursion
• AIML implements recursion with the <srai> operator.
• Kinds of application of <srai>:
(1) Symbolic Reduction: reduce complex grammatical forms to simpler ones.
(2) Divide and Conquer: split an input into two or more subparts, and combine the responses to each.
(3) Synonyms: map different ways of saying the same thing to the same reply.
(4) Spelling or grammar corrections.
(5) Detecting keywords anywhere in the input.
(6) Conditionals: certain forms of branching may be implemented with <srai>.
(7) Any combination of (1)-(6).
• The danger of <srai> is that it permits the botmaster to create infinite loops.
Symbolic Reduction • <category> • <pattern>DO YOU KNOW WHO * IS</pattern> • <template><srai>WHO IS <star/></srai></template> • </category> • Whatever input matched this pattern, the portion bound to the wildcard * may be inserted into the reply with the markup <star/>. This category reduces any input of the form “Do you know who X is?” to “Who is X?”
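To make the recursion concrete, here is a deliberately tiny Python sketch of how an interpreter might evaluate `<star/>` and `<srai>`. It is not a real AIML engine: categories are tried in list order rather than through a Graphmaster, only a single `*` per pattern is handled, the second category ("WHO IS *") is a hypothetical target added for the example, and the depth limit is one possible guard against infinite `<srai>` loops.

```python
import re

CATEGORIES = [
    ("DO YOU KNOW WHO * IS", "<srai>WHO IS <star/></srai>"),
    ("WHO IS *", "I do not know who <star/> is."),  # hypothetical target category
]


def normalize(text):
    """Upper-case the input and drop punctuation."""
    return " ".join(re.findall(r"[A-Z0-9']+", text.upper()))


def pattern_regex(pattern):
    """Translate an AIML-style pattern into a regex; '*' binds 1+ words."""
    return " ".join("(.+)" if w == "*" else re.escape(w) for w in pattern.split())


def respond(text, depth=0):
    if depth > 10:
        raise RecursionError("possible infinite <srai> loop")
    text = normalize(text)
    for pattern, template in CATEGORIES:
        m = re.fullmatch(pattern_regex(pattern), text)
        if m:
            star = m.group(1) if m.groups() else ""
            reply = template.replace("<star/>", star)
            # Each <srai>...</srai> is replaced by the response to its content.
            return re.sub(r"<srai>(.*?)</srai>",
                          lambda s: respond(s.group(1), depth + 1), reply)
    return ""
```

Here "Do you know who Socrates is?" is first reduced to "WHO IS SOCRATES", and the reply comes from whichever category matches that simpler form.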
Divide and Conquer • <category> • <pattern>YES *</pattern> • <template><srai>YES</srai> <sr/></template> • </category> • The markup <sr/> is simply an abbreviation for <srai><star/></srai>
Synonyms • The AIML 1.01 standard does not permit more than one pattern per category. Synonyms are perhaps the most common application of <srai>. Many ways to say the same thing reduce to one category, which contains the reply: • <category> • <pattern>HELLO</pattern> • <template>Hi there!</template> • </category> • <category> • <pattern>HI</pattern> • <template><srai>HELLO</srai></template> • </category> • <category> • <pattern>HI THERE</pattern> • <template><srai>HELLO</srai></template> • </category> • <category> • <pattern>HOWDY</pattern> • <template><srai>HELLO</srai></template> • </category> • <category> • <pattern>HOLA</pattern> • <template><srai>HELLO</srai></template> • </category>
Spelling and Grammar correction • The single most common client spelling mistake is the use of “your” when “you’re” or “you are” is intended. Not every occurrence of “your” however should be turned into “you’re.” A small amount of grammatical context is usually necessary to catch this error: • <category> • <pattern>YOUR A *</pattern> • <template>I think you mean “you’re” or “you are” not “your.” • <srai>YOU ARE A <star/></srai> • </template> • </category> • Here the bot both corrects the client input and acts as a language tutor.
Keywords • <category> • <pattern>MOTHER</pattern> <template> Tell me more about your family. </template> • </category> • <category> • <pattern>_ MOTHER</pattern> <template><srai>MOTHER</srai></template> • </category> • <category> • <pattern>MOTHER _</pattern> <template><srai>MOTHER</srai></template> • </category> • <category> • <pattern>_ MOTHER *</pattern> • <template><srai>MOTHER</srai></template> • </category> • The first category both detects the keyword when it appears by itself, and provides the generic response. The second category detects the keyword as the suffix of a sentence. The third detects it as the prefix of an input sentence, and finally the last category detects the keyword as an infix. Each of the last three categories uses <srai> to link to the first, so that all four cases produce the same reply, but it needs to be written and stored only once.
Conditionals • It is possible to write conditional branches in AIML, using only the <srai> tag. Consider three categories: • <category> • <pattern>WHO IS HE</pattern> • <template><srai>WHOISHE <get name="he"/></srai></template> • </category> • <category> • <pattern>WHOISHE *</pattern> • <template>He is <get name="he"/>.</template> • </category> • <category> • <pattern>WHOISHE UNKNOWN</pattern> • <template>I don’t know who he is.</template> • </category> • Provided that the predicate “he” is initialized to “Unknown,” the categories execute a conditional branch depending on whether “he” has been set. As a convenience to the botmaster, AIML also provides the equivalent function through the <condition> tag.
Context • The keyword “that” in AIML refers to the robot’s previous utterance. • <category> • <pattern>YES</pattern> • <that>DO YOU LIKE MOVIES</that> • <template>What is your favorite movie?</template> • </category> • This category is activated when the client says YES. The robot must find out what the client is saying “yes” to. If the robot asked “Do you like movies?”, this category matches, and the response, “What is your favorite movie?”, continues the conversation along the same lines.
Context (2) • Internally the AIML interpreter stores the input pattern, that pattern and topic pattern along a single path, like: • INPUT <that> THAT <topic> TOPIC • When the values of <that> or <topic> are not specified, the program implicitly sets the values of the corresponding THAT or TOPIC pattern to the wildcard *. • The first part of the path to match is the input. If more than one category has the same input pattern, the program may distinguish between them depending on the value of <that>. If two or more categories have the same <pattern> and <that>, the final step is to choose the reply based on the <topic>. This structure suggests a design rule: never use <that> unless you have written two categories with the same <pattern>, and never use <topic> unless you write two categories with the same <pattern> and <that>.
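The single-path layout can be illustrated with a one-function Python sketch. `build_path` is a hypothetical helper, not part of any AIML interpreter; it only shows how unspecified `<that>` and `<topic>` values default to the wildcard `*`.

```python
def build_path(pattern, that="*", topic="*"):
    """Join the input, <that> and <topic> patterns into one path of tokens."""
    return pattern.split() + ["<that>"] + that.split() + ["<topic>"] + topic.split()
```

For the category with pattern YES and `<that>` DO YOU LIKE MOVIES, the path is YES, `<that>`, DO, YOU, LIKE, MOVIES, `<topic>`, `*`, so the input words are always matched first, then the that-pattern, then the topic.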
Context (3) • A useful application of <topic> is to create subject-dependent “pickup lines”: • <topic name="CARS"> • <category> • <pattern>*</pattern> • <template> • <random> • <li>What’s your favorite car?</li> • <li>What kind of car do you drive?</li> • <li>Do you get a lot of parking tickets?</li> • <li>My favorite car is one with a driver.</li> • </random> • </template> • </category> • </topic> • The botmaster uses the <set> tag to change the value of the topic predicate.
Predicates • One of the most common applications of AIML predicates is remembering pronoun bindings. The template • <template> • <set name="he">Samuel Clemens</set> is Mark Twain. • </template> • results in “He is Mark Twain,” but as a side effect remembers that “he” now stands for “Samuel Clemens.”
Predicates (2) • The AIML specification leaves it up to the botmaster whether a <set> predicate returns the contents between the tags, or the name of the predicate. For example: • <set name="it">Opera</set> returns “it,” but <set name="likes">Opera</set> returns “Opera.” • The botmaster must also specify what happens when the bot gets a predicate which has not already been set. The values returned are called default predicate values and depend completely on the application of the predicate: • When the corresponding predicates have not been initialized with a <set> tag, <get name="she"/> returns “Unknown,” <get name="has"/> returns “a mother” (because everyone has a mother), and <get name="wants"/> returns “to chat”.
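The two botmaster choices described above (what `<set>` returns, and what `<get>` returns for an unset predicate) can be modelled with a small Python class. `Predicates` and its parameters are hypothetical names; falling back to "Unknown" when no default is configured is an assumption of this sketch.

```python
class Predicates:
    def __init__(self, defaults, returns_name=()):
        self.values = {}
        self.defaults = dict(defaults)          # default predicate values
        self.returns_name = set(returns_name)   # predicates whose <set> returns the name

    def get(self, name):
        """Like <get name="..."/>: the stored value, else the default."""
        if name in self.values:
            return self.values[name]
        return self.defaults.get(name, "Unknown")

    def set(self, name, value):
        """Like <set name="...">value</set>: store, then return name or value."""
        self.values[name] = value
        return name if name in self.returns_name else value
```

Listing pronoun-like predicates such as "it" or "he" in `returns_name` reproduces the pronoun-binding behaviour, where the reply uses the pronoun while the predicate remembers what it stands for.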
Person • One of the simple tricks that makes ELIZA so believable is a pronoun swapping substitution. The AIML <person> tag provides this function. • The actual substitutions are defined by the botmaster for local languages and settings. • The most common application of the <person> tag operates directly on the <star/> binding. For that reason, AIML defines a shortcut tag <person/> = <person><star/></person>.
Person (2) • C: My mother takes care of me. • R: Who else in your family takes care of you? • Might be generated by the category • <category> • <pattern>MY MOTHER *</pattern> • <template>Who else in your family <person/>?</template> • </category>
Person (3) • C: You don’t argue with me. • R: Why do you think I don’t argue with you? • Results from the category • <category> • <pattern>YOU DO NOT *</pattern> • <template>Why do you think I don’t <person/>?</template> • </category>
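A minimal Python sketch of the pronoun swap follows. The substitution table `SWAPS` is an invented, English-only example (real tables are defined by the botmaster), and `person` and `my_mother_reply` are hypothetical names. Swapping in a single pass matters: applying "me → you" and then "you → me" one after the other would undo the first change.

```python
import re

# Illustrative first/second person swaps; mapping "you" to "me" is a simplification.
SWAPS = {
    "i": "you", "me": "you", "my": "your", "mine": "yours", "myself": "yourself",
    "you": "me", "your": "my", "yours": "mine", "yourself": "myself",
}


def person(text):
    """Swap first and second person pronouns word by word, in one pass."""
    def swap(match):
        word = match.group(0)
        return SWAPS.get(word.lower(), word)

    return re.sub(r"[A-Za-z']+", swap, text)


def my_mother_reply(star):
    """Template of the MY MOTHER * category, with <person/> applied to the star."""
    return f"Who else in your family {person(star)}?"
```

With the wildcard bound to "takes care of me", the template produces "Who else in your family takes care of you?".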
Graphmaster • To achieve efficient pattern matching time, and a compact memory representation, the AIML software stores all of the categories in a tree managed by an object called the Graphmaster. • The Graphmaster stores AIML patterns along a path from r to a terminal node t, where the AIML template is stored. Let w_1,…,w_k be the sequence of k words or tokens in an AIML pattern. To insert the pattern into the graph, the Graphmaster first looks to see if m = G(r, w_1) exists. If it does, then the program continues the insertion of w_2,…,w_k in the subtree rooted at m. Only when the program encounters a first index i, where ∃ n | G(n, w_i) is undefined, does the program create a new node m = G(n, w_i), whereafter the Graphmaster creates a set of new nodes for each of the remaining w_i,…,w_k.
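The insertion procedure amounts to inserting into a trie keyed by words. The Python sketch below shows that reading; `Node` and `insert` are names chosen for the example, with `children` playing the role of G(n, w) and a plain dictionary standing in for whatever structure a real Graphmaster uses.

```python
class Node:
    def __init__(self):
        self.children = {}    # word -> child node, i.e. G(n, w)
        self.template = None  # set only on terminal nodes


def insert(root, pattern, template):
    """Follow existing nodes for a shared prefix, creating new ones as needed."""
    node = root
    for word in pattern.split():
        # setdefault reuses G(node, word) if it exists, else creates it.
        node = node.children.setdefault(word, Node())
    node.template = template
    return node
```

Patterns that share a prefix, such as "WHO IS *" and "WHO ARE YOU", share the node for WHO, which is where the compact memory representation comes from.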
Graphmaster matching
• Graphmaster matching is a special case of backtracking, depth-first search.
Match(n, h) :-
  if h > k return true;
  else if exists m = G(n, _) and exists j in [h+1..k+1] | Match(m, j), return true;
  else if exists m = G(n, w_h) and Match(m, h+1), return true;
  else if exists m = G(n, *) and exists j in [h+1..k+1] | Match(m, j), return true;
  else return false;
• The first case defines the boundary condition:
0. If there are no more words in the input, the match was successful.
Graphmaster matching (2)
• The heart of the algorithm consists of three cases:
1. Does the node contain the key “_”? If so, search the subgraph rooted at the child node linked by “_”. Try all remaining suffixes of the input to see if one matches. If no match was found, ask:
2. Does the node contain the key w_h, the hth word in the input sentence? If so, search the subgraph linked by w_h, using the tail of the input w_{h+1},…,w_k. If no match was found, ask:
3. Does the node contain the key “*”? If so, search the subgraph rooted at the child node linked by “*”. Try all remaining suffixes of the input to see if one matches. If no match was found, return false.
Graphmaster matching (3) • Note that: • At every node, the “_” wildcard has highest priority, an atomic word second priority, and the “*” wildcard has the lowest priority. • The patterns need not be ordered alphabetically. They are partially ordered so that “_” comes before any word, and “*” comes after any word. • The matching is word-by-word, not category-by-category. • The algorithm combines the input pattern, the <that> pattern and <topic> pattern into a single sentence or path, such as: “PATTERN <that> THAT <topic> TOPIC.” The Graphmaster treats the symbols <that> and <topic> just like ordinary words. The patterns PATTERN, THAT and TOPIC may all contain multiple wildcards. • The matching algorithm is a highly restricted form of depth-first search, also known as backtracking. • For pedagogical purposes, one can explain the algorithm by removing the wildcards and considering match step (2) only. The wildcards may be introduced one at a time, first “*” and then “_.” It is also simpler to explain the algorithm first using input patterns only, and then subsequently develop the explanation of the path including <that> and <topic>.
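The matching procedure and its priority order can be sketched in Python as a backtracking search over a word trie. This is an illustration under stated assumptions, not the reference implementation: `Node`, `insert` and `match` are names chosen here, the input is assumed to be normalized so that it never contains the literal words "_" or "*", and the function returns the stored template (or `None`) instead of true/false so that running out of input at a non-terminal node counts as a failure.

```python
class Node:
    def __init__(self):
        self.children = {}    # word or wildcard -> child node
        self.template = None  # set only where a pattern ends


def insert(root, pattern, template):
    node = root
    for word in pattern.split():
        node = node.children.setdefault(word, Node())
    node.template = template


def match(node, words, h=0):
    """Return the template of the highest-priority match for words[h:]."""
    if h == len(words):
        # Boundary case: no more input; succeed only at a terminal node.
        return node.template
    # Priority order at every node: "_", then the exact word, then "*".
    for key in ("_", words[h], "*"):
        child = node.children.get(key)
        if child is None:
            continue
        if key in ("_", "*"):
            # A wildcard absorbs one or more words: try every remaining suffix.
            for j in range(h + 1, len(words) + 1):
                result = match(child, words, j)
                if result is not None:
                    return result
        else:
            result = match(child, words, h + 1)
            if result is not None:
                return result
    return None  # backtrack
```

With the patterns "HELLO *", "HELLO THERE" and "_ THERE" stored, the input HELLO THERE is answered by the "_ THERE" category, because "_" is tried before the exact word HELLO at the root.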
Targeting • The ALICE brain, at the time of this writing, contains about 41,000 categories. In any given run of the server however, typically only a few thousand of those categories are activated. Potentially every activated category with at least one wildcard in the input pattern, that pattern, or topic pattern, is a source of targets. • The targeting software may include a GUI for browsing the targets. The program displays the original matched category, the matching input data, a proposed new pattern, and a text area to input the new template. The botmaster may choose to delete, skip or complete the target category.
Defaults • The art of AIML writing is most apparent in default categories, that is, categories that include the wildcard “*” but do not <srai> to any other category. • Depending on the AIML set, a significant percentage of client inputs will usually match the ultimate default category with <pattern>*</pattern> (and implicitly, <that>*</that> and <topic>*</topic>). The template for this category generally consists of a long list of randomly selected “pickup lines,” or non sequiturs, designed to direct the conversation back to topics the bot knows about. • <category> • <pattern>*</pattern> • <template><random> • <li>How old are you?</li> • <li>What’s your sign?</li> • <li>Are you a student?</li> • <li>What are you wearing?</li> • <li>Where are you located?</li> • <li>What is your real name?</li> • <li>I like the way you talk.</li> • <li>Are you a man or a woman?</li> • <li>Do you prefer books or TV?</li> • <li>What’s your favorite movie?</li> • <li>What do you do in your spare time?</li> • <li>Can you speak any foreign languages?</li> • <li>When do you think artificial intelligence will replace lawyers?</li> • </random></template> • </category>