Understanding Natural Language: Grammar and Parsing
Motivations • Expert systems interact with users through questions and answers. • Often the interface is the English language. • Today you will learn about natural language processing.
Objectives • Language translation: then and now • Parsing • Grammar • Semantics • Transition networks
Early history of language translation • Dictionary translation, word by word • "Out of sight, out of mind" → "the person is blind and insane" • "The spirit is willing but the flesh is weak." → "The vodka is great but the meat is rotten." • Did not address interrelation among words. • Simple dictionary meaning of a word is not enough. • Shakespeare's Sonnet XVIII opening line: "Shall I compare thee to a summer's day?" Not in Hong Kong, where spring would make better sense.
Levels of language analysis Prosody • rhythm and intonation of language, poetry, religious chants Phonology • phoneme examples: /p/, /b/, /ʃ/, /tʃ/, /i/, etc. • speech analysis: identify phonemes to form words. Morphology • the components that make up a word (ing, ed, ...), e.g., inter-nation-al-iz-ed • inflection: case, gender, number, tense, person, mood, voice Syntax • rules for combining words into legal (syntactically correct) sentences Semantics • attaching meaning to words, phrases, and sentences Pragmatics • how is language usually used? • Merry Christmas and Happy New Year • Happy Christmas and Merry New Year Common sense world knowledge • general background necessary to interpret text or conversation • I can program day and night. • I can program day or night.
Today • A general conversationalist is still far away. • Scale back to interpretation in restricted applications: blocks world, football, chess, bus schedule inquiries, etc. • Audio to text, but with little interpretation: meeting transcripts, singing to lyrics. • Normal steps of linguistic analysis: parsing, then semantic interpretation.
Parts of speech The English language is made up of eight types of words or parts of speech: 1. nouns – words used to name people, animals, places, things, qualities or ideas 2. pronouns 3. verbs 4. adjectives 5. adverbs 6. prepositions 7. conjunctions 8. articles – definite, indefinite, demonstratives, possessives
Parse tree: Tony (the tiger) growled at Bob • The parse tree shows the sentence structure. • Leaf nodes contain terminals (words). • Other nodes contain non-terminals. • Both are grammar symbols.
A simple grammar • sentence → np vp • np → n • np → art n • vp → v • vp → v pp • pp → prep np • prep → at • art → a • art → the • n → Bob • n → Tony • v → likes • v → growled Each line is a rewrite rule. The words on the right-hand sides (at, a, the, Bob, Tony, likes, growled) are the terminals. Legal sentence: a string of terminals that can be derived from these rules.
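The grammar on this slide can be checked mechanically. The following is a minimal sketch, not the method used in the lecture: it stores the rewrite rules in a dictionary and uses a small backtracking recognizer to build a parse tree. The names `GRAMMAR`, `parse`, `expand`, and `parse_sentence` are made up for this illustration.

```python
# Rewrite rules from the slide: non-terminal -> list of alternative right-hand sides.
GRAMMAR = {
    "sentence": [["np", "vp"]],
    "np": [["n"], ["art", "n"]],
    "vp": [["v"], ["v", "pp"]],
    "pp": [["prep", "np"]],
    "prep": [["at"]],
    "art": [["a"], ["the"]],
    "n": [["Bob"], ["Tony"]],
    "v": [["likes"], ["growled"]],
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for every way `symbol` can match words[pos:]."""
    if symbol not in GRAMMAR:
        # Terminal: it must be exactly the next word.
        if pos < len(words) and words[pos] == symbol:
            yield symbol, pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, pos, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols one after another."""
    if not rhs:
        yield (symbol, children), pos
        return
    for subtree, next_pos in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, next_pos, children + [subtree])


def parse_sentence(text):
    """Return a parse tree if `text` is a legal sentence, else None."""
    words = text.split()
    for tree, end in parse("sentence", words, 0):
        if end == len(words):  # the whole input must be consumed
            return tree
    return None


if __name__ == "__main__":
    print(parse_sentence("Tony growled at Bob"))
    print(parse_sentence("growled Tony"))
```

A tree is returned as nested `(symbol, children)` tuples whose leaves are the words themselves, which mirrors the parse tree picture: leaves are terminals and inner nodes are non-terminals. The recognizer works here because the grammar has no left-recursive rules; a grammar with rules such as `np → np pp` would need a different strategy.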
Interpret it with a semantic net • Tony is a tiger. • Bob is a human. • Tigers are a subclass that has stripes and growls. • Mammals are covered with hair. • Humans are a subclass of mammal and are frightened by tigers.
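One way to picture this semantic net in code is as "is a" links plus properties that are inherited along those links. This is only an illustrative sketch: the dictionary layout and the names `ISA`, `PROPERTIES`, `ancestors`, `lookup`, and `is_frightened_by` are assumptions, and the slide does not say which class tigers are a subclass of, so that link is left out.

```python
# "is a" / "subclass of" links taken from the slide.
# The slide does not name the superclass of tiger, so none is recorded here.
ISA = {"Tony": "tiger", "Bob": "human", "human": "mammal"}

# Properties attached to each class (attribute names are hypothetical).
PROPERTIES = {
    "tiger": {"has": "stripes", "sound": "growl"},
    "mammal": {"covered_with": "hair"},
    "human": {"frightened_by": "tiger"},
}


def ancestors(node):
    """Follow the isa links upward and return the classes found, nearest first."""
    chain = []
    while node in ISA:
        node = ISA[node]
        chain.append(node)
    return chain


def lookup(node, attribute):
    """Return the attribute value from the node or the nearest class that defines it."""
    for current in [node] + ancestors(node):
        value = PROPERTIES.get(current, {}).get(attribute)
        if value is not None:
            return value
    return None


def is_frightened_by(person, animal):
    """True if `person` inherits a fear of a class that `animal` belongs to."""
    feared = lookup(person, "frightened_by")
    return feared is not None and (animal == feared or feared in ancestors(animal))


if __name__ == "__main__":
    print(lookup("Bob", "covered_with"))
    print(is_frightened_by("Bob", "Tony"))
```

Bob has no properties of his own; everything known about him is inherited through human and mammal, which is the point of organizing the knowledge as a net.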
Semantic interpretation: Conceptual graph • From the parse tree: growling has an agent (Tony) and an object (Bob). • From the semantic net: tigers frighten people. • Perform the join operation on these two input graphs. • The user tells the machine that "Tony growled at Bob". • The machine also understands that "Tony the tiger frightens Bob the human".
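The join step can be sketched in a few lines. This is a strong simplification of the conceptual graph join, offered only as an illustration: a concept is a `(type, referent)` pair, a graph is a set of triples, and generic concepts in the second graph are restricted to the individual of the same type found in the first. The representation and the names `growl_graph`, `frighten_graph`, `concepts`, and `join` are assumptions.

```python
# A concept is (type, referent); referent None marks a generic concept such as [tiger].
# A graph is a set of (concept, relation, concept) triples.
growl_graph = {
    (("growl", None), "agent", ("tiger", "Tony")),
    (("growl", None), "object", ("human", "Bob")),
}
frighten_graph = {
    (("tiger", None), "frightens", ("human", None)),
}


def concepts(graph):
    """Collect every concept node that appears in the graph."""
    return {c for (source, _, target) in graph for c in (source, target)}


def join(g1, g2):
    """Restrict generic concepts of g2 to same-type individuals of g1, then merge.

    Simplification: assumes at most one individual per type in g1.
    """
    individuals = {}
    for ctype, referent in concepts(g1):
        if referent is not None:
            individuals.setdefault(ctype, referent)

    def restrict(concept):
        ctype, referent = concept
        if referent is None and ctype in individuals:
            return (ctype, individuals[ctype])
        return concept

    return g1 | {(restrict(s), rel, restrict(t)) for (s, rel, t) in g2}


if __name__ == "__main__":
    for triple in sorted(join(growl_graph, frighten_graph), key=str):
        print(triple)
```

After the join, the merged graph contains the new relation between the two individuals, Tony and Bob, which corresponds to the machine's conclusion that Tony the tiger frightens Bob the human.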
The man bites the dog Sentence derivation (the number before each step is the grammar rule applied) 1. sentence → np vp 3. art n vp 7. the n vp 8. the man vp 5. the man v np 11. the man bites np 3. the man bites art n 7. the man bites the n 9. the man bites the dog Grammar 1. sentence → np vp 2. np → n 3. np → art n 4. vp → v 5. vp → v np 6. art → a 7. art → the 8. n → man 9. n → dog 10. v → likes 11. v → bites
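The derivation can be replayed by applying the numbered rules in the order 1, 3, 7, 8, 5, 11, 3, 7, 9. The sketch below assumes each rule rewrites the leftmost occurrence of its left-hand side; `RULES` and `derive` are names chosen for this example.

```python
# Numbered grammar rules from the slide: number -> (left-hand side, right-hand side).
RULES = {
    1: ("sentence", ["np", "vp"]),
    2: ("np", ["n"]),
    3: ("np", ["art", "n"]),
    4: ("vp", ["v"]),
    5: ("vp", ["v", "np"]),
    6: ("art", ["a"]),
    7: ("art", ["the"]),
    8: ("n", ["man"]),
    9: ("n", ["dog"]),
    10: ("v", ["likes"]),
    11: ("v", ["bites"]),
}


def derive(rule_numbers, start="sentence"):
    """Apply the rules in order and return the sentential form after each step."""
    form = [start]
    steps = []
    for number in rule_numbers:
        lhs, rhs = RULES[number]
        i = form.index(lhs)  # leftmost occurrence; ValueError if the rule does not apply
        form = form[:i] + rhs + form[i + 1:]
        steps.append(" ".join(form))
    return steps


if __name__ == "__main__":
    for step in derive([1, 3, 7, 8, 5, 11, 3, 7, 9]):
        print(step)
```

Each printed line is one sentential form of the derivation, ending in a string made only of terminals.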
Transition nets • Transition networks are finite-state machines. • Each network corresponds to a single non-terminal. • Each path from initial to final state corresponds to a grammar rule. • Each arc is labeled with a grammar symbol.
Grammar to transition nets • sentence → np vp • np → art n • np → n • vp → v np • vp → v • art → a • art → the • n → man • n → dog • v → likes • v → bites
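The conversion from rules to networks can be sketched as follows: one network per non-terminal, and one path from the initial to the final state per rule, with each arc labeled by a grammar symbol. The state names and the function `build_networks` are assumptions for this illustration; unlike a hand-drawn net, this sketch does not share arcs between rules that begin with the same symbol.

```python
# Grammar rules from the slide as (left-hand side, right-hand side) pairs.
RULES = [
    ("sentence", ["np", "vp"]),
    ("np", ["art", "n"]),
    ("np", ["n"]),
    ("vp", ["v", "np"]),
    ("vp", ["v"]),
    ("art", ["a"]),
    ("art", ["the"]),
    ("n", ["man"]),
    ("n", ["dog"]),
    ("v", ["likes"]),
    ("v", ["bites"]),
]


def build_networks(rules):
    """Return {non_terminal: [(from_state, label, to_state), ...]}."""
    networks = {}
    for lhs, rhs in rules:
        arcs = networks.setdefault(lhs, [])
        state = "initial"
        for i, label in enumerate(rhs):
            is_last = i == len(rhs) - 1
            # Intermediate states get a fresh name; the last arc ends in "final".
            next_state = "final" if is_last else f"{lhs}_{len(arcs)}"
            arcs.append((state, label, next_state))
            state = next_state
    return networks


if __name__ == "__main__":
    for name, arcs in build_networks(RULES).items():
        print(name, arcs)
```

Every rule becomes a chain of arcs that starts at `initial` and ends at `final`, so the number of distinct paths through a network equals the number of rules for that non-terminal.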
Trace of a transition network parse: "dog bites." • Sentence: parse(Sentence) calls parse(Noun_phrase). • Noun_phrase: parse(Article) fails because its terminals don't match "dog". • Noun_phrase: parse(Noun) succeeds because the terminal "dog" matches.
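A trace like this one can be produced by a small recursive parser over the networks. The sketch below is an assumption-laden illustration, not the lecture's algorithm: each network is stored simply as a list of paths (sequences of arc labels), the parser takes the first path that matches, and the names `NETWORKS`, `parse`, and `accepts` are invented here.

```python
# Each network is a list of paths; a path is the sequence of arc labels
# from the initial state to the final state.
NETWORKS = {
    "sentence": [["noun_phrase", "verb_phrase"]],
    "noun_phrase": [["article", "noun"], ["noun"]],
    "verb_phrase": [["verb", "noun_phrase"], ["verb"]],
    "article": [["a"], ["the"]],
    "noun": [["man"], ["dog"]],
    "verb": [["likes"], ["bites"]],
}


def parse(symbol, words, pos, trace, depth=0):
    """Return the position after matching `symbol` at words[pos:], or None."""
    pad = "  " * depth
    trace.append(f"{pad}parse({symbol})")
    for path in NETWORKS[symbol]:
        cur = pos
        for label in path:
            if label in NETWORKS:
                # Arc labeled with a non-terminal: traverse that network.
                cur = parse(label, words, cur, trace, depth + 1)
            elif cur < len(words) and words[cur] == label:
                # Arc labeled with a terminal that matches the next word.
                trace.append(f"{pad}  terminal {label!r} matches")
                cur += 1
            else:
                cur = None
            if cur is None:
                break  # this path fails; try the next one
        if cur is not None:
            return cur
    trace.append(f"{pad}  {symbol}: no path matches")
    return None


def accepts(text):
    """Return (accepted, trace) for a sentence such as 'dog bites.'."""
    words = text.lower().rstrip(".").split()
    trace = []
    end = parse("sentence", words, 0, trace)
    return end == len(words), trace


if __name__ == "__main__":
    ok, trace = accepts("dog bites.")
    print("\n".join(trace))
    print("accepted:", ok)
```

The first lines of the printed trace follow the slide: the sentence network calls the noun phrase network, the article network fails on "dog", and the noun network then matches it. Taking the first matching path is enough for this small grammar; a larger grammar may need to try every alternative.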
Conclusion • A grammar specifies a language: the set of all legal sentences. • Given a sentence, parse it to reveal its structure, the parse tree. • Analyze the parse tree with semantic networks or conceptual graphs to find the meaning of the sentence.