AI - Weeks 18 & 20: Natural Language Processing
Lee McCluskey, room 2/07
Email lee@hud.ac.uk
http://scom.hud.ac.uk/scomtlm/cha2555/
Background: The Turing Test
Assume person A communicates by text or email with 1) a person and 2) a machine. The Turing Test asks A to determine which is the computer and which is the person from the textual responses alone. A continues to ask 1) and 2) questions in written text and elicits responses. If, from the responses over time, A cannot tell the difference between 1) and 2), then the Turing Test is passed.
To date, no system has come close to passing this version of the Turing Test. The "Turing Test" can be applied metaphorically to other areas of computing, e.g. the Turing Test for the game of chess could be said to have been passed.
NLP is NOT SPEECH RECOGNITION
NLP is only a part of "aural communication" (the system of speaking and hearing). Speech understanding is FAR HARDER than NLP but potentially much more valuable, because in translating speech to text we lose MANY visual and aural clues, e.g.
• tone, speed, emotion within the voice, accent
• facial expressions, arm movements, body language
All of these contribute to the meaning of the utterance as well as the plain text. In text there are no sound or visual cues available to give extra meaning: we might raise our voice to show anger, or make gestures to add to the description of a shape. Without these extra cues, NLP is much harder (cf. misinterpreting text messages!).
Some Potential/Current Applications of NLP
-- Q & A services, e.g. automated quiz-answering services: these need to understand the question well enough to choose the correct answer from, e.g., an online search
-- chatbots: online programs that get into conversation with you for entertainment, e.g. Eliza
-- natural language translators: (online) services that take text in one language and translate it into another, e.g. English -> German
-- natural language generation: e.g. games that need to communicate with the user in text, or generate news stories / running commentaries as part of the game
-- text summarisation or categorisation (news stories, spam filters, document classifiers ...)
NLP - the problem
[Diagram] Text (sentences, email, news stories ...) feeds into an UNDERSTANDING PROCESS ("Natural Language Understanding"), which builds a Knowledge Base: a representation of meaning. From the Knowledge Base come the applications: Question Answering, Translation, Natural Language Generation, Summary/Classification.
NLP: the process
[Diagram] Text (sentence, email, news story ...) passes through the UNDERSTANDING PROCESS ("Natural Language Understanding") - Parsing, Referencing, Meaning Extraction and Integration - to build the Knowledge Base: a representation of meaning, which supports Translation, Natural Language Generation and Summary/Classification.
NLP: the process - example
[Diagram] Text: "The cat sits on the mat". Parsing gives a parse tree: sentence -> np ("The cat") + vp ("sits on the mat"). Meaning extraction gives:
Fact(type: statement, agent: cat-002, action: sits_on, object: mat-001)
This is integrated into the Knowledge Base alongside the CONTEXT, e.g.
Fact(type: statement, agent: Fido, action: is_a, object: cat)
Fact(type: statement, agent: Freda, action: loves, object: Fido)
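As a concrete illustration, the Fact frames above could be held in a very simple knowledge base. The following Python sketch is not part of the original slides; the names Fact and KnowledgeBase, and the query method, are illustrative assumptions only.

```python
# Minimal sketch of the meaning frames on this slide as Python objects.
# Fact and KnowledgeBase are illustrative names, not a standard library.
from dataclasses import dataclass

@dataclass(frozen=True)
class Fact:
    type: str      # e.g. "statement"
    agent: str     # e.g. "cat-002"
    action: str    # e.g. "sits_on"
    object: str    # e.g. "mat-001"

class KnowledgeBase:
    def __init__(self):
        self.facts = []

    def add(self, fact):
        self.facts.append(fact)

    def query(self, **slots):
        """Return all facts whose fields match every given slot value."""
        return [f for f in self.facts
                if all(getattr(f, k) == v for k, v in slots.items())]

kb = KnowledgeBase()
kb.add(Fact("statement", "Fido", "is_a", "cat"))             # context
kb.add(Fact("statement", "Freda", "loves", "Fido"))          # context
kb.add(Fact("statement", "cat-002", "sits_on", "mat-001"))   # new sentence

print(kb.query(action="is_a"))   # -> the "Fido is_a cat" fact
```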
NLP (NLU): typical process
1. Scanning
2. Parsing
3. Finding referents for pronouns etc.
4. Resolving ambiguities
5. Meaning extraction
6. Meaning integration
NLP: Scanning and Parsing
Scanning - breaking the sentence down into its components (words).
Parsing - checking that the sentence conforms to a grammar (is syntactically correct) and outputting a parse tree.
These processes are very similar to the FRONT END of computing tools like interpreters, formatters, compilers etc. The difference is that natural languages cannot be (completely) defined by BNF grammars; nevertheless, we can create parsers using parser-generators if we input the grammatical definition of a fragment of English.
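A scanner for plain English sentences can be very small. The sketch below (not from the original slides) simply lower-cases the input and strips punctuation; real scanners handle contractions, numbers, hyphenation and so on.

```python
# Minimal word scanner (tokenizer) sketch.
import re

def scan(sentence):
    """Break a sentence into lower-case word tokens, dropping punctuation."""
    return re.findall(r"[a-z]+", sentence.lower())

print(scan("The cat sits on the mat."))
# -> ['the', 'cat', 'sits', 'on', 'the', 'mat']
```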
Parsing - a small grammar for a subset of English
sentence --> noun_phrase verb_phrase
noun_phrase --> determiner adjective noun
noun_phrase --> adjective noun
noun_phrase --> determiner noun
noun_phrase --> noun
verb_phrase --> verb noun_phrase
verb_phrase --> verb preposition noun_phrase
determiner --> a | an
adjective --> fruit
noun --> flies | fruit | time | arrow
noun --> banana
verb --> like | flies
preposition --> like
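This grammar is small enough to run directly. The sketch below transcribes it into NLTK's CFG notation and parses "fruit flies like a banana" with a chart parser, which yields the two parses discussed on the next slide. It assumes the nltk package is installed; the category abbreviations (S, NP, VP, ...) are just shorthand for the rule names above.

```python
# Running the slide's grammar with NLTK's chart parser (sketch).
import nltk

grammar = nltk.CFG.fromstring("""
S   -> NP VP
NP  -> Det Adj N | Adj N | Det N | N
VP  -> V NP | V P NP
Det -> 'a' | 'an'
Adj -> 'fruit'
N   -> 'flies' | 'fruit' | 'time' | 'arrow' | 'banana'
V   -> 'like' | 'flies'
P   -> 'like'
""")

parser = nltk.ChartParser(grammar)
for tree in parser.parse("fruit flies like a banana".split()):
    print(tree)   # prints both parse trees of this ambiguous sentence
```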
NLP: Reference and Ambiguity
Referencing - finding the "actual" referents of words in a sentence. [Again this is similar to looking up user-defined names in a compiler's symbol table.]
Resolving ambiguities - e.g. a sentence may have more than one parse. [Computer languages are not ambiguous.]
Examples:
• I love him. [who are "I" and "him"?]
• I shot an elephant in my pyjamas. [whose pyjamas?]
• My old friend John is here. [what does "old" refer to? where is "here"?]
• Fruit flies like a banana. [2 parses]
• My cousin Fred has grown another foot. [simple pun]
NLP: Reference and Ambiguity
The solution to problems of reference and ambiguity is to use a range of techniques, e.g.:
• context / contextual disambiguation ["him" = the male person in the previous sentence, the subject of the conversation]
• physical constraints [elephants are too big to fit inside pyjamas]
• default roles in known verb structures [the subject of the verb "flies" must be capable of flying]
• general defaults ["old" by default describes a long-standing friendship, not the friend's age]
Some of these are covered under meaning extraction and integration (steps 5 and 6) below.
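To make the "context" idea concrete, here is a toy sketch (not from the slides, and nothing like a real coreference resolver): it resolves "him"/"her" to the most recently mentioned name of the matching gender. The name lists and the most-recent-antecedent heuristic are illustrative assumptions.

```python
# Toy pronoun resolution by the most-recent-antecedent heuristic.
MALE = {"john", "fred", "fido"}      # illustrative gazetteer
FEMALE = {"freda", "mary"}

def resolve_pronoun(pronoun, previous_tokens):
    candidates = MALE if pronoun.lower() in ("him", "he") else FEMALE
    for token in reversed(previous_tokens):   # most recent mention first
        if token.lower() in candidates:
            return token
    return None

context = "Freda met John at the station".split()
print(resolve_pronoun("him", context))   # -> 'John'
```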
NLP: Meaning Extraction and Integration
Meaning extraction: translate the parse, noun references, etc. into an internal representation language, e.g. logic, action models, "conceptual dependency" frames, or scripts (for story understanding). The idea is that if two distinct sentences have the same meaning they should map to the same internal representation.
Integration: integrate the result with other "knowledge", e.g. the sentence may represent an episode ("John paid the bill") in a script (the Eating Out script).
The main functions are matching the input with stored templates, and using constraints and context to disambiguate and fill in more details.
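The following sketch (not from the slides) illustrates integration by template matching: a frame extracted from "John paid the bill" is matched against the episodes of a stored Eating Out script. The script contents and slot names are illustrative assumptions.

```python
# Integration by matching a sentence frame against script episodes (sketch).
EATING_OUT_SCRIPT = [
    {"episode": "enter", "action": "enter"},
    {"episode": "order", "action": "order"},
    {"episode": "eat",   "action": "eat"},
    {"episode": "pay",   "action": "pay"},
    {"episode": "leave", "action": "leave"},
]

def integrate(frame, script):
    """Return the script episode matching the frame's action, with its slots filled."""
    for episode in script:
        if episode["action"] == frame.get("action"):
            return {**episode,
                    "agent": frame.get("agent"),
                    "object": frame.get("object")}
    return None

sentence_frame = {"agent": "John", "action": "pay", "object": "bill"}
print(integrate(sentence_frame, EATING_OUT_SCRIPT))
# -> {'episode': 'pay', 'action': 'pay', 'agent': 'John', 'object': 'bill'}
```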
NLP: Meaning Extraction - Conceptual Dependency, Roger Schank c.1970
Conceptual Dependency stores basic primitives, such as times and locations, and a set of conceptual transitions, which are like the abstract operator schemas of AI Planning, e.g.
• "ATRANS" represents a transfer such as "give" or "take"
• "PTRANS" is used for acts on locations such as "move" or "go"
• "MTRANS" represents mental acts such as "tell", etc.
Example: "John gave a book to Mary" is represented as an ATRANS involving the two real-world objects John and Mary, with giver = John, taker = Mary.
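As a rough sketch of the example (using a plain Python dictionary rather than Schank's original notation, and with illustrative slot names beyond the slide's giver/taker), the ATRANS could be written as:

```python
# "John gave a book to Mary" as an ATRANS frame (illustrative sketch).
atrans = {
    "primitive": "ATRANS",   # transfer of an abstract relationship (possession)
    "giver": "John",
    "taker": "Mary",
    "object": "book",        # assumed slot: the thing transferred
    "tense": "past",         # assumed slot
}
# "Mary was given a book by John" should map to this same frame,
# since sentences with the same meaning share one internal representation.
```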
Present Day Online Tools: WordNet's options for "fly"
S: (v) fly, wing (travel through the air; be airborne)
S: (v) fly (move quickly or suddenly)
S: (v) fly, aviate, pilot (operate an airplane)
S: (v) fly (transport by aeroplane)
S: (v) fly (cause to fly or float)
S: (v) fly (be dispersed or disseminated)
S: (v) fly (change quickly from one emotional state to another)
S: (v) fly, fell, vanish (pass away rapidly)
S: (v) fly (travel in an airplane)
S: (v) fly (display in the air or cause to float)
S: (v) flee, fly, take flight (run away quickly)
S: (v) fly (travel over (an area of land or sea) in an aircraft)
S: (v) fly (hit a fly)
S: (v) vanish, fly, vaporize (decrease rapidly and disappear)
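The same sense list can be queried programmatically. The sketch below uses NLTK's WordNet interface, assuming nltk and its WordNet corpus are installed (running nltk.download('wordnet') once may be needed).

```python
# Listing WordNet's verb senses for "fly" via NLTK (sketch).
from nltk.corpus import wordnet as wn

for synset in wn.synsets("fly", pos=wn.VERB):
    print(synset.name(), "-", synset.definition())
```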
Online Tools - in the last few years these have become much more sophisticated:
• Parsers, e.g. CMU's Link Grammar parser: http://www.link.cs.cmu.edu/link/submit-sentence-4.html
• Taxonomies ("types") of words, e.g. WordNet: http://wordnet.princeton.edu/
• The Stanford POS tagger and other tools: http://nlp.stanford.edu/software/index.shtml
Summary - Still Huge Challenges
• adequacy of internal representation: translating the prose into some "representation of its meaning" adequate for the purposes of the application, e.g. QA or translation
• even mapping individual words onto one meaning is problematic; ambiguity is still a challenge
• there is no standard grammar for NL: it changes over time, and words go in and out of currency
• NL contains lots of proper names, causing further problems with identity and with determining referents for nouns.