

  1. Natural Language Processing CIS 479/579 Bruce R. Maxim UM-Dearborn

  2. Eliza • In 1966 Weizenbaum developed a program that simulated the behavior of a Rogerian, non-directive psychotherapist • The program seemed to be able to understand anything typed in by the user

  3. Eliza Demo

  4. Eliza • The program was actually fairly “dumb” in modern AI terms • Its “understanding” was the result of programming trickery • Its weaknesses stemmed from relying almost exclusively on the premise that the syntax of a sentence captures its semantic meaning

  5. Eliza Algorithm • Keep track of the two most recent entries from the user • Remove all punctuation from these entries and check for duplicate entries • Make some synonym replacements from a list of pairs (e.g. big for huge) • Change pronouns (e.g. I and me to you)

  6. Eliza Algorithm • Search for keywords in the edited entries • If a keyword is found, copy everything following the keyword from the user’s entry • If no keywords are found, generate a non-committal response

  7. Eliza Algorithm • The nature of the non-committal response depends on whether a stored concept (e.g. “mother” or “hate”) exists:
     if stored concept X exists then
        2/5 of the time: “let’s discuss X”
        3/5 of the time: “earlier you said X”
     else
        give some response like “I see”

  8. Eliza Source Listing
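
A minimal Python sketch of the keyword-matching loop described on slides 5 through 7; the synonym, pronoun, and keyword tables and all response templates below are illustrative assumptions, not Weizenbaum’s original code.

```python
import random
import re

# A minimal sketch of the Eliza algorithm on slides 5-7. The word tables
# and response templates are illustrative assumptions, not Weizenbaum's code.

SYNONYMS = {"huge": "big", "sad": "unhappy"}        # synonym pairs (slide 5)
PRONOUNS = {"i": "you", "me": "you", "my": "your"}  # pronoun changes (slide 5)
KEYWORDS = {"mother", "father", "hate", "dream"}    # trigger words (slide 6)
concepts = []                                       # stored concepts (slide 7)
recent = []                                         # two most recent entries

def edit(entry):
    """Strip punctuation, then apply synonym and pronoun replacements."""
    words = re.sub(r"[^\w\s]", "", entry.lower()).split()
    words = [SYNONYMS.get(w, w) for w in words]
    return [PRONOUNS.get(w, w) for w in words]

def respond(entry):
    words = edit(entry)
    if words in recent:                             # duplicate-entry check
        return "Please do not repeat yourself"
    recent[:] = (recent + [words])[-2:]             # keep the last two entries
    for i, w in enumerate(words):                   # keyword search (slide 6)
        if w in KEYWORDS:
            concepts.append(w)
            tail = " ".join(words[i + 1:]) or w     # copy text after keyword
            return "Tell me more about " + tail
    if concepts:                                    # non-committal response
        x = random.choice(concepts)
        return random.choice(["Let's discuss " + x] * 2 +    # 2/5 of the time
                             ["Earlier you said " + x] * 3)  # 3/5 of the time
    return "I see"

if __name__ == "__main__":
    while True:
        print(respond(input("> ")))
```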

  9. Mechanical Translation • Based on the syntax (surface structure) of a sentence, with no real understanding of its meaning (semantics) • Cold War attempts failed on idiomatic and ambiguous speech: “the spirit is willing but the flesh is weak”, “time flies like an arrow”, “fruit flies like a banana”

  10. Issues in Understanding Natural Language • Large amounts of human knowledge are assumed by the person generating a sentence • Language is pattern-based: phonemes, words, and sentences cannot be randomly ordered in normal communication • Language acts are agent-based, and agents are embedded in complex social environments

  11. Language Components • Prosody • Deals with the rhythm and intonation of language; hard to formalize • Phonology • Examines how sounds are combined to form language; important to computerized speech recognition and generation • Morphology • The components that make up words: prefixes, suffixes, word tense, word number, parts of speech

  12. Language Components • Syntax • Rules for combining words into sentences and the use of these rules to parse and generate sentences • Most easily formalized and so far the most successfully automated component of linguistic analysis • Semantics • Considers meaning of words and sentences and ways meaning is conveyed in natural language expressions

  13. Language Components • Pragmatics • Study of ways in which language is used and its effects on the listener (e.g. the reason “yes” is not a good answer to “do you know what time it is?”) • World knowledge • Includes knowledge of physical world, the conventions of social discourse, and the role of intentions in communication

  14. Stages of Language Analysis • Parsing • Analyzes the syntactic structure of sentences (e.g. identifies components and makes sure sentences are well-formed) • Often builds a parse tree • Semantic interpretation • Produces a representation of the meaning of the sentence, often as a semantic network or conceptual graph • May also use frames, conceptual dependency, or predicate logic

  15. Stages of Language Analysis • World knowledge representation • Structures from the knowledge base are used to augment the internal representation of the sentence to allow meaning to be more accurately inferred • Note: these phases are not purely sequential and may proceed concurrently in many systems
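
As a rough illustration of how these stages fit together, here is a schematic Python sketch; every function body, name, and data shape below is an assumption made for illustration only.

```python
# Schematic sketch of the three stages chained as a pipeline. As the slide
# notes, real systems often interleave these phases rather than run them
# strictly in sequence.

def parse(sentence):
    # Stage 1: syntactic analysis -> parse tree (stubbed here as a token list)
    return sentence.split()

def interpret(tree):
    # Stage 2: semantic interpretation -> meaning representation
    # (stubbed here as a trivial graph-like dict)
    return {"concepts": tree, "relations": []}

def augment(graph, kb):
    # Stage 3: use world-knowledge structures to enrich the representation
    graph["relations"] += [kb[c] for c in graph["concepts"] if c in kb]
    return graph

kb = {"dog": ("dog", "isa", "animal")}   # toy world-knowledge base
print(augment(interpret(parse("the dog bites the man")), kb))
```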

  16. Syntax • Many parsers are built assuming that the production rules can be expressed as a context-free grammar:
     s ::= np vp
     np ::= noun | art noun
     vp ::= verb | verb np
     art ::= a | the
     noun ::= man | dog
     verb ::= likes | bites

  17. Parse Tree • Parse tree for “the dog bites the man”:
     s
     ├─ np
     │   ├─ art: the
     │   └─ noun: dog
     └─ vp
         ├─ verb: bites
         └─ np
             ├─ art: the
             └─ noun: man
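
A minimal recursive-descent parser for this grammar makes the tree construction concrete; the nested-tuple encoding of the tree below is an illustrative choice, not something specified on the slides.

```python
# One parsing function per non-terminal of the slide-16 grammar; each
# returns (subtree, remaining_tokens).

ART, NOUN, VERB = {"a", "the"}, {"man", "dog"}, {"likes", "bites"}

def parse_s(tokens):
    np, rest = parse_np(tokens)
    vp, rest = parse_vp(rest)
    if rest:
        raise SyntaxError("trailing tokens: %s" % rest)
    return ("s", np, vp)

def parse_np(tokens):
    if tokens and tokens[0] in ART:                 # np ::= art noun
        if len(tokens) > 1 and tokens[1] in NOUN:
            return ("np", ("art", tokens[0]), ("noun", tokens[1])), tokens[2:]
        raise SyntaxError("expected a noun after the article")
    if tokens and tokens[0] in NOUN:                # np ::= noun
        return ("np", ("noun", tokens[0])), tokens[1:]
    raise SyntaxError("expected a noun phrase at %s" % tokens)

def parse_vp(tokens):
    if not tokens or tokens[0] not in VERB:
        raise SyntaxError("expected a verb at %s" % tokens)
    verb = ("verb", tokens[0])
    if len(tokens) > 1:                             # vp ::= verb np
        np, rest = parse_np(tokens[1:])
        return ("vp", verb, np), rest
    return ("vp", verb), tokens[1:]                 # vp ::= verb

print(parse_s("the dog bites the man".split()))
# ('s', ('np', ('art', 'the'), ('noun', 'dog')),
#       ('vp', ('verb', 'bites'), ('np', ('art', 'the'), ('noun', 'man'))))
```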

  18. Transition Network Parsers • Represents a grammar as a set of finite-state machines or transition networks • Each network corresponds to a single non-terminal in the grammar • Arcs are labeled with either terminal or non-terminal symbols • Each path through a network corresponds to a rule for that non-terminal

  19. Transition Network Parsers • Finding a successful path through a network corresponds to replacing the non-terminal with the RHS of the rule • The parser must find a path from the network’s initial state to its final state • Terminals must match the input exactly • The network pieces are assembled until the entire sentence is represented

  20. Part of the Transition Network
     s:  s_init --np--> --vp--> s_final
     np: s_init --art--> --noun--> s_final, or s_init --noun--> s_final
     vp: s_init --verb--> s_final, or s_init --verb--> --np--> s_final
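
The networks above can be encoded directly as data and walked recursively. The Python sketch below is an illustrative encoding (the arc-list layout and word-class lexicon are assumptions); a non-terminal arc is followed by recursively running that non-terminal’s own network.

```python
# Each non-terminal gets a tiny network: a list of arcs (state, label, state),
# where a label is either a terminal word class or another network's name.

WORD_CLASS = {"a": "art", "the": "art", "man": "noun", "dog": "noun",
              "likes": "verb", "bites": "verb"}

NETWORKS = {
    "s":  [("init", "np", "mid"), ("mid", "vp", "final")],
    "np": [("init", "art", "mid"), ("mid", "noun", "final"),
           ("init", "noun", "final")],
    "vp": [("init", "verb", "final"), ("init", "verb", "mid"),
           ("mid", "np", "final")],
}

def walk(net, state, tokens):
    """Yield the remaining tokens for every way `net` can reach `final`."""
    if state == "final":
        yield tokens
    for src, label, dst in NETWORKS[net]:
        if src != state:
            continue
        if label in NETWORKS:                       # non-terminal arc: recurse
            for rest in walk(label, "init", tokens):
                yield from walk(net, dst, rest)
        elif tokens and WORD_CLASS.get(tokens[0]) == label:   # terminal arc
            yield from walk(net, dst, tokens[1:])

def accepts(sentence):
    return any(rest == [] for rest in walk("s", "init", sentence.split()))

print(accepts("the dog bites the man"))   # True
print(accepts("dog the bites"))           # False
```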

  21. Chomsky Hierarchy • Regular Grammars • Can be recognized using finite-state machines • Rules may have at most one non-terminal on the right-hand side, and it must appear at one end of the rule • Are not powerful enough to represent even programming language syntax (e.g. arbitrarily nested parentheses)

  22. Chomsky Hierarchy • Context-free Grammars • Rules have only one non-terminal on their left-hand side • Rules can have more than one non-terminal on the right-hand side • Can have recursive rules • Can be parsed by transition network parsers or pushdown automata

  23. Chomsky Hierarchy • Context-Sensitive Grammars • Allow more than one non-terminal on their left-hand side • Make it possible to define a context in which a rule can be applied • Must be non-erasing (i.e. the RHS is never shorter than the LHS) • Can be recognized using a linear bounded automaton (i.e. a Turing machine whose tape is bounded by the length of the input)

  24. Chomsky Hierarchy • Unrestricted or Recursively Enumerable Grammars • There are no restrictions on the form of the rules • Can be recognized using a Turing machine with an unbounded tape • Not very useful for defining the syntax of natural language in AI programming
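
One classic illustration of the boundary between the regular and context-free levels is the language a^n b^n: a finite-state machine cannot count unboundedly, but a pushdown automaton’s stack can. The Python sketch below is an illustration of that idea, recognizing a^n b^n for n >= 1.

```python
# Recognizer for a^n b^n, n >= 1. The stack (which a pure finite-state
# machine lacks) is what makes the unbounded matching possible.

def accepts_anbn(s):
    stack = []
    seen_b = False
    for ch in s:
        if ch == "a":
            if seen_b:                 # an 'a' after a 'b' is out of order
                return False
            stack.append(ch)           # push each 'a'
        elif ch == "b":
            seen_b = True
            if not stack:              # more b's than a's
                return False
            stack.pop()                # match this 'b' against a pushed 'a'
        else:
            return False
    return not stack and seen_b        # every 'a' matched; string nonempty

print(accepts_anbn("aaabbb"))   # True
print(accepts_anbn("aabbb"))    # False
```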

  25. Context-Sensitive Grammars • Increase the number of rules and non-terminals in a grammar • Obscure the phrase structure of the language more than context-free rules do • By attempting to add more checks for things like agreement and semantic consistency, they lose some of the separation between syntax and semantics

  26. Context-Sensitive Grammars • They still do not address the problem of needing to build a semantic representation of the sentence • The parser only accepts or rejects a sentence based on its syntax

  27. Augmented Transition Network Parser • Add a set of registers to a regular transition network to give the ATN the ability to store partially developed parse trees • Allow conditional execution of arcs (e.g. test before calling) • Attach actions to nodes capable of modifying the data structures returned • In short, the recognizer becomes equivalent in power to a full Turing machine
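
A toy Python sketch of one ATN arc sequence may make the registers, tests, and actions concrete; the lexicon, register names, and number-agreement test below are illustrative assumptions.

```python
# One arc sequence of an ATN for np: each arc carries a test (checked before
# the arc is taken) and an action (which fills registers holding the partial
# parse). The lexicon and register names are assumptions for illustration.

LEXICON = {
    "a":    {"cat": "art",  "number": "singular"},
    "dog":  {"cat": "noun", "number": "singular"},
    "dogs": {"cat": "noun", "number": "plural"},
}

def parse_np(tokens):
    if len(tokens) < 2:
        return None
    registers = {}                                  # partial parse storage
    # Arc 1: init --art--> mid; action: record the article and its number.
    art = LEXICON.get(tokens[0], {})
    if art.get("cat") != "art":
        return None
    registers["art"], registers["number"] = tokens[0], art["number"]
    # Arc 2: mid --noun--> final; test: the noun must agree in number.
    noun = LEXICON.get(tokens[1], {})
    if noun.get("cat") != "noun" or noun["number"] != registers["number"]:
        return None                                 # test fails: arc not taken
    registers["noun"] = tokens[1]                   # action: extend the parse
    return ("np", registers)

print(parse_np(["a", "dog"]))    # ('np', {...})  agreement test passes
print(parse_np(["a", "dogs"]))   # None           singular/plural mismatch
```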

  28. Combining Syntax and Semantic Knowledge • The semantic interpreter constructs its interpretation by beginning with the root node of the structure returned by the ATN parser and traversing the “parse tree” • At each node the semantic interpreter interprets the children recursively and combines the results into a single conceptual graph that is passed up the parse tree • The semantic interpreter makes use of a domain-specific knowledge base to build this conceptual graph
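
A minimal Python sketch of this recursive scheme, assuming the nested-tuple parse tree from the slide-17 example; the dict-based stand-ins for the conceptual graph and the domain knowledge base are illustrative assumptions.

```python
# Interpret each child recursively, then merge the child graphs into one
# graph that is passed up the tree, consulting a toy domain knowledge base.

KB = {"dog": "animal", "man": "person", "bites": "bite-event"}  # domain KB

def interpret(node):
    label, children = node[0], node[1:]
    if isinstance(children[0], str):               # leaf: look the word up
        word = children[0]
        if word in KB:
            return {"concepts": [(word, KB[word])], "relations": []}
        return {"concepts": [], "relations": []}   # e.g. articles add nothing
    graphs = [interpret(c) for c in children]      # interpret children first
    merged = {"concepts": sum((g["concepts"] for g in graphs), []),
              "relations": sum((g["relations"] for g in graphs), [])}
    if label == "s" and len(merged["concepts"]) >= 2:
        # join the np's concept to the vp's event concept as its agent
        merged["relations"].append((merged["concepts"][0][0], "agent-of",
                                    merged["concepts"][1][0]))
    return merged

tree = ("s", ("np", ("art", "the"), ("noun", "dog")),
             ("vp", ("verb", "bites"),
                    ("np", ("art", "the"), ("noun", "man"))))
print(interpret(tree))
# {'concepts': [('dog', 'animal'), ('bites', 'bite-event'), ('man', 'person')],
#  'relations': [('dog', 'agent-of', 'bites')]}
```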

  29. Natural Language Processing Applications • Natural language queries against relational databases • Improving free text web searches • Natural language report generators whose input is data files or reports • Realistic adventure game dialogs • News filters and surveillance tools
