Natural Language
What do we mean when we speak of a natural language? • Let us contrast two kinds of language: "unnatural" (or formal) and "natural".
Unnatural language • Computer languages, such as C++, Java, Prolog, or LISP • Highly constrained; they have a very rigid syntax • Work with a very limited initial vocabulary.
A program
    program AddNumbersInTheTable(input, output);
    var
      table: array [1..200] of integer;
      index: integer;
      sum: real;
    begin
      { the table is assumed to have been filled in elsewhere }
      index := 1;
      sum := 0;
      { add up just the first 100 entries }
      while index <= 100 do
      begin
        sum := sum + table[index];
        index := index + 1;
      end;
    end.
Natural Languages • Tend to be far less precisely defined. • Examples include the languages we speak, write and read: English, American, French, Chinese, Japanese, or Sanskrit. • Have a very complex syntax or, perhaps, do not even conform to a well-defined syntax ... especially as they are used. • Typically have an enormous vocabulary.
Natural language dialogue • Here are some numbers in a table. • Add them • All of them? • No, just the first 100.
Natural Language and Perception? • Natural language is conveyed in a number of ways. One of the most important is that it is "spoken". • Speaking a language places sound patterns in the environment. • These sound patterns join other sounds in the environment, produced by the wind rushing through the trees, by machinery, and by other speakers.
Natural language is also conveyed via text, in books, letters, newspapers and on computer screens. • What sensor do we use to interpret text? What is perception in this case? • Are we reasoning about our perceptions? • Do we perceive that the string "My hat is red" is a code for a conceptual entity?
Machine Translation • One of the original motivating reasons for studying Natural Language was to build machines which could translate text from one language to another. • This was originally thought to be a modest task
The method first envisioned was to • 1. Replace the words in the language to be translated with equivalent words in the target language. • 2. Use syntax rules to clean up the resulting sentences.
Sometimes this works. • Consider the following example. English sentence: I must go home. • Word replacements into German: I → Ich, must → muss, go → gehen, home → nach Hause
Resulting sentence: Ich muss gehen nach Hause. • Syntax transform: move the verb "gehen" to the end of the sentence. • German sentence: Ich muss nach Hause gehen. • This is pretty good!
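The two-step method can be sketched in a few lines of Python. This is only an illustration of the idea on the slide, not a real translation system: the word table holds just the four entries from the example, and the names LEXICON, INFINITIVES, substitute, move_infinitive_to_end and translate are made up for this sketch.

```python
# Word table copied from the slide's example (an assumption: a real system
# would need a full bilingual dictionary and would fail on unknown words).
LEXICON = {"i": "Ich", "must": "muss", "go": "gehen", "home": "nach Hause"}

# Assumption: the only infinitive this toy example has to recognise.
INFINITIVES = {"gehen"}


def substitute(sentence):
    """Step 1: replace each English word with its German equivalent."""
    words = sentence.rstrip(".").lower().split()
    return [LEXICON[w] for w in words]  # raises KeyError for unknown words


def move_infinitive_to_end(tokens):
    """Step 2: a single syntax rule, move the infinitive to the end."""
    infinitives = [t for t in tokens if t in INFINITIVES]
    rest = [t for t in tokens if t not in INFINITIVES]
    return rest + infinitives


def translate(sentence):
    """Apply word substitution, then the syntax clean-up rule."""
    return " ".join(move_infinitive_to_end(substitute(sentence))) + "."


print(" ".join(substitute("I must go home.")))  # word-for-word result
print(translate("I must go home."))             # after the syntax rule
```

The sketch has no notion of word sense or context, so it breaks down on ambiguous words and idioms.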
Here is an unlucky example. • English: The spirit is willing but the flesh is weak. • Translate: English → Russian → English • The result: The vodka is strong but the meat is rotten.
Words can mean many things • Most words have more than one meaning, and often the meanings are unrelated. Here is a simple word which has a number of unrelated meanings. • bow: a ribbon tied into a decorative configuration • bow: an instrument used to project an arrow • bow: the forward portion of a boat • bow: an act performed out of respect • bow: deviation of an object from a straight line
Placement • For one thing, selection of the proper meaning of a word seems to depend on where that word occurs in the sentence. The pen is in the box. The box is in the pen.
Context • Sentences change their meaning depending on the context in which the sentence is presented. • The following sentence is ambiguous even if we specify the meaning of each of the words: I saw the man on the hill with a telescope. • How well do you think the word substitution method of machine translation would work on this one?
Natural Language Understanding • The early work in machine translation made it clear that there was more to translating natural language than simply substituting words and massaging the syntax. • Something "deeper" was afoot. • To properly translate a sentence the machine must first "understand" it. What do we mean by "deeper" and "understand"? How do we do it?
What does the following phrase mean? • Water pump pulley adjustment screw threads damage report summary.
How do we understand conversations? • What is going on in the following dialogue? Do you know the time? Yes. Could you tell me the time? Yes. Will you tell me the time? Yes. I need to know the time. I understand.
Eliza: A Step toward Natural Language Understanding? • Although its author, Joseph Weizenbaum, would strongly disagree, it has often been said that Eliza is a demonstration of natural language understanding ... at least to some extent.
Eliza's understanding based on key words • Eliza had a set of templates, each looking for a key word in the input sentence. • These templates were of the following form: (* keyword *) where the *'s are meant to be wildcards, like the ones used in UNIX for filename descriptions. They are meant to match with any string.
The template (* computers *), for example, would match the sentence "I am really frustrated with computers." • by matching the first "*" of the template with the string "I am really frustrated with " and the second "*" with the string ".".
Eliza's "understanding" of the sentence was simple, to be sure, but it was enough to return "meaningful" sentences to the user ... in the context of a therapy session. • The program would store the strings which were matched to the *'s and sometime later generate a new sentence, often using these stored strings. • The responses were built into a table, referenced by the template.
Whenever Eliza found a match, she would generate one of the corresponding responses. Question: Is this understanding?
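The keyword-template idea can be sketched in Python as follows. The response table, the fallback reply and the function names are invented for illustration; they are not Weizenbaum's actual script. Each template is turned into a regular expression in which every "*" matches any string.

```python
import re

# Hypothetical response table, referenced by template. "{after}" is filled
# with the string matched by the second wildcard.
RESPONSES = {
    "* computers *": "Why do you mention computers?",
    "* i am *": "How long have you been {after}?",
}


def match_template(template, sentence):
    """Return the two strings matched by the wildcards, or None."""
    keyword = template.strip("* ")
    pattern = "(.*)" + re.escape(keyword) + "(.*)"
    m = re.fullmatch(pattern, sentence, re.IGNORECASE)
    return m.groups() if m else None


def respond(sentence):
    """Use the first template that matches; fall back to a stock reply."""
    for template, response in RESPONSES.items():
        groups = match_template(template, sentence)
        if groups is not None:
            before, after = (g.strip(" .!?") for g in groups)
            return response.format(before=before, after=after)
    return "Please go on."  # assumption: a generic reply when nothing matches


print(match_template("* computers *", "I am really frustrated with computers."))
print(respond("I am sad."))
```

Nothing in the sketch represents what "computers" or "sad" mean; it only recognises strings, which is the point of the question on the slide.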
Conceptual Dependency • An attempt to represent sentences about actions in a way that addresses the similarity in meaning.
Capturing the similarity of meaning • capture the similarity of meaning found in sentences like: Mary took the book from John. Mary received the book from John. Mary bought the book from John. John gave the book to Mary. John sold the book to Mary. John sold Mary the book. John bought the book from Mary. John traded a cigar to Mary for the book.
In all these examples the "ownership" of the book is transferred between John and Mary. • The direction may vary and the intention may change, but the result of the event is similar.
Frames • Schank and Abelson used frames to represent events. These frames had four slots: • actor: the agent causing the event • action: the action performed by the actor • object: the object being acted upon • direction: the direction in which that action is oriented
All actions are expressed in terms of a small set of "primitive actions". • One list of primitive actions they worked with was: • atrans: transfer of possession • mtrans: transfer of mental information • ptrans: physical transfer of an object from one place to another • mbuild: to build mental structures • speak: the act of making sounds • ingest: to eat • grasp: to hold in one's hand • propel: to apply a force to an object • attend: to focus one's consciousness upon • move: to move a body part, e.g. moving an arm
Four similar sentences • John gave Mary the book. actor: John; action: atrans; object: book; direction: from John to Mary • John took the book from Mary. actor: John; action: atrans; object: book; direction: from Mary to John • John bought the book from Mary. actor: John; action: atrans; object: book; direction: from Mary to John • Mary sold the book to John. actor: Mary; action: atrans; object: book; direction: from Mary to John
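The four-slot frames above can be written down as a small Python data structure. This is a minimal sketch, not Schank's analyzer: the verb table VERBS and the helper atrans_frame are assumptions made for illustration, and the sentence is assumed to be already split into subject, verb, object and the other party.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    actor: str        # the agent causing the event
    action: str       # one of the primitive actions, here always "atrans"
    obj: str          # the object being acted upon
    direction: tuple  # (from, to)


# Hypothetical verb table: does possession move away from the subject
# or toward the subject? All of these verbs map onto atrans.
VERBS = {
    "gave": "away", "sold": "away",
    "took": "toward", "bought": "toward", "received": "toward",
}


def atrans_frame(subject, verb, obj, other):
    """Build the atrans frame for a pre-split sentence."""
    if VERBS[verb] == "away":
        direction = (subject, other)
    else:
        direction = (other, subject)
    return Frame(actor=subject, action="atrans", obj=obj, direction=direction)


took = atrans_frame("John", "took", "book", "Mary")
bought = atrans_frame("John", "bought", "book", "Mary")
sold = atrans_frame("Mary", "sold", "book", "John")

print(took == bought)                    # same frame for "took" and "bought"
print(sold.direction == bought.direction)  # same transfer, different actor
```

With only four slots, "took" and "bought" collapse into the same frame, so the difference in intention is lost.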
Enhancements • One criticism of this representation was that the "understanding" did not capture some of the subtleties present in many sentences. • In response, enhancements were made to the frame structure: • actor: the agent causing the event • action: the action performed by the actor • object: the object being acted upon • direction: the direction in which that action is oriented • instrument: device used to accomplish the action • cause: events caused by the action • time: timeframe
John went to the store • actor: John • action: ptrans • object: John • direction: to (the store) • instrument: unspecified • cause: unspecified • time: unspecified
John flew to New York • actor: John • action: ptrans • object: John • direction: to (New York) • instrument: [actor: plane; action: propel; object: plane; direction: to (New York); other fields: unspecified] • time: past
Challenges: John took a plane to New York • Is it actor: John; action: ptrans; object: plane; direction: to (New York)? • Or should it be actor: John; action: ptrans; object: John; direction: to (New York); instrument: [actor: plane; action: propel; object: plane; direction: to (New York)]; time: past?
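The two competing readings can be made concrete with a frame type whose instrument slot may itself hold a frame. This is a sketch under assumptions: the class Frame, the function render and the variable names are illustrative, and the cause slot is left out for brevity.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Frame:
    actor: str
    action: str
    obj: str
    direction: str
    instrument: Optional["Frame"] = None  # a nested frame, if known
    time: Optional[str] = None


def render(frame, indent=0):
    """Return the slots as text lines; empty slots print as 'unspecified'."""
    pad = " " * indent
    lines = []
    for f in fields(frame):
        value = getattr(frame, f.name)
        if isinstance(value, Frame):
            lines.append(f"{pad}{f.name}:")
            lines.extend(render(value, indent + 2))
        else:
            shown = value if value is not None else "unspecified"
            lines.append(f"{pad}{f.name}: {shown}")
    return lines


# Reading 1: John physically carried a plane to New York.
carried = Frame("John", "ptrans", "plane", "to (New York)")

# Reading 2: John moved himself, using a plane as the instrument.
flew = Frame("John", "ptrans", "John", "to (New York)",
             instrument=Frame("plane", "propel", "plane", "to (New York)"),
             time="past")

print("\n".join(render(flew)))
```

Both frames are well-formed, so the representation alone cannot choose between them; some knowledge about planes and people is needed to prefer the second reading.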
MARGIE (Schank et al., 1975) • MARGIE was a natural language program composed of a conceptual analyzer, a sentence generator and an inference module. • The conceptual analyzer was a kind of compiler that converted English sentences into conceptual dependency structures.
English Sentences → Conceptual Dependency (CD). • The sentence generator produced "natural language sentences" from conceptual dependency structures. • Conceptual Dependency → Natural Language Sentences. • Schank and his students experimented with sentence generators in a number of languages: CD → English, CD → French.
The hope in these efforts was, of course, to get a handle on a solution to the language translation problem. • The inference module would make inferences from the CD expressions in a database together with the new CD expressions created by the new sentences.
Experiments were conducted to see what would happen with and without the inference module enabled.
Inferencing disabled • Here is an example of the output generated with inferencing disabled. • Input: John killed Mary by choking her. • Output paraphrases: John strangled Mary. John choked Mary and she died because she could not breathe. Mary died because she was unable to inhale some air and she was unable to inhale some air because John grabbed her by the neck.
Inferencing enabled • When inferencing was enabled the output became even more interesting: • Input: John gave Mary an aspirin. • Output sentences: John believes that Mary is sick. Mary is sick. Mary wants to feel better. Mary will ingest the aspirin.
Programs like MARGIE are very interesting. Their successes and failures help us understand the natural language understanding process. • When we write programs like this we are trying to discover what it is about sentences that makes it possible for us to: • Understand them. • Understand ones we have not seen before. • Produce new ones that others can understand.
Language building blocks or units • Printed text, for example, is organized according to the following language units • Letters • Words • Phrases • Sentences • Paragraphs • Sections • Chapters • Books • Fields
Syntax or sentence structure • It seems compelling that syntax or sentence structure plays a role in determining the meaning of a sentence. • If we can determine the structure of the sentence, that seems to help resolve the ambiguity.
If we can determine that "on the hill" modifies the man and "with the telescope" modifies the action of seeing, then the sentence meaning is resolved. It means that I used the telescope to see the man and that the man was on the hill. • If, on the other hand, "with the telescope" modifies "the hill", then the sentence means that I saw the man who was located on the particular hill that had a telescope on it. Etc.
Grammar • One form of grammar describes a sentence in terms of the concepts "noun phrase", "verb phrase", "noun", "verb", "prepositional phrase", "adverb", "preposition", etc.
Example
    sentence : nounphrase, verbphrase.
    nounphrase : determiner, nounexpression.
    nounphrase : nounexpression.
    nounexpression : noun.
    nounexpression : adjective, nounexpression.
    verbphrase : verb, nounphrase.
    determiner : the | a.
    noun : dog | bone | mouse | cat.
    verb : ate | chases.
    adjective : big | brown | lazy.
Parsing in Prolog • To begin with, we will simply determine if a sentence is a legal sentence. In other words, we will write a predicate sentence/1, which will determine if its argument is a sentence. • Our two examples assume we have broken the sentences into words (by testing for the "whitespace" between words) and stored the words in the following Prolog lists:
    [the,dog,ate,the,bone]
    [the,big,brown,mouse,chases,a,lazy,cat]
Basic strategies for parsing • The generate-and-test strategy: the list to be parsed is split in different ways, and each splitting is tested to see whether its parts are components of a legal sentence. • Prolog code:
    sentence(L) :- append(NP, VP, L), nounphrase(NP), verbphrase(VP).
The append predicate will generate possible values for the variables NP and VP, by splitting the original list L. • The next two goals test each of the portions of the list to see if they are grammatically correct. If not, backtracking into append/3 causes another possible splitting to be generated. • The clauses for nounphrase/1 and verbphrase/1 are similar to sentence/1, and call further predicates that deal with smaller units of a sentence, until the word definitions are met, such as noun([dog]). verb([ate]). noun([mouse]). verb([chases]).
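For readers less familiar with Prolog, the same generate-and-test idea can be sketched in Python. The helper splits plays the role that append/3 plays in the Prolog clause: it proposes every way of cutting the list in two, and the other functions test the pieces. The grammar is the example grammar given earlier; the function names mirror the Prolog predicates, but this is only an illustrative translation.

```python
# Word definitions from the example grammar.
WORDS = {
    "determiner": {"the", "a"},
    "noun": {"dog", "bone", "mouse", "cat"},
    "verb": {"ate", "chases"},
    "adjective": {"big", "brown", "lazy"},
}


def splits(words):
    """Generate every (front, back) split of the list, like append/3."""
    for i in range(len(words) + 1):
        yield words[:i], words[i:]


def is_word(category, words):
    """Test: is this a one-word list whose word belongs to the category?"""
    return len(words) == 1 and words[0] in WORDS[category]


def nounexpression(words):
    if is_word("noun", words):
        return True
    return any(is_word("adjective", front) and nounexpression(back)
               for front, back in splits(words))


def nounphrase(words):
    if nounexpression(words):
        return True
    return any(is_word("determiner", front) and nounexpression(back)
               for front, back in splits(words))


def verbphrase(words):
    return any(is_word("verb", front) and nounphrase(back)
               for front, back in splits(words))


def sentence(words):
    """Generate each split, then test both halves."""
    return any(nounphrase(np) and verbphrase(vp) for np, vp in splits(words))


print(sentence("the dog ate the bone".split()))
print(sentence("the big brown mouse chases a lazy cat".split()))
```

Most of the generated splits are rejected straight away, which is the inefficiency of this strategy.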
Difference strategy • The more efficient strategy is to skip the generation step and pass the entire list to the lower-level predicates, which in turn take the grammatical portion of the sentence they are looking for from the front of the list and return the remainder of the list.
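The difference strategy can be sketched in Python as well. Here each function receives the whole remaining word list, consumes what it recognises from the front, and yields the remainder. Generators stand in for Prolog's backtracking, which is a design choice of this sketch rather than something stated in the slides. The grammar is the same example grammar as before.

```python
# Word definitions from the example grammar.
WORDS = {
    "determiner": {"the", "a"},
    "noun": {"dog", "bone", "mouse", "cat"},
    "verb": {"ate", "chases"},
    "adjective": {"big", "brown", "lazy"},
}


def word(category, words):
    """Take one word of the category from the front; yield the remainder."""
    if words and words[0] in WORDS[category]:
        yield words[1:]


def nounexpression(words):
    yield from word("noun", words)
    for rest in word("adjective", words):
        yield from nounexpression(rest)


def nounphrase(words):
    for rest in word("determiner", words):
        yield from nounexpression(rest)
    yield from nounexpression(words)


def verbphrase(words):
    for rest in word("verb", words):
        yield from nounphrase(rest)


def sentence(words):
    """Legal if a noun phrase then a verb phrase use up the whole list."""
    for rest in nounphrase(words):
        for remainder in verbphrase(rest):
            if not remainder:
                return True
    return False


print(list(nounphrase("the dog ate the bone".split())))  # remainder after NP
print(sentence("the big brown mouse chases a lazy cat".split()))
```

No candidate splits are generated: the noun phrase reports where it stopped, and the verb phrase starts from there.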