360 likes | 471 Views
EECS 595 / LING 541 / SI 661. Natural Language Processing. Fall 2005 Lecture Notes #6. Lexicalized and probabilistic parsing. Probabilistic CFG. G (N, Σ , P, S) Non-terminals (N) Terminals ( Σ) Productions (P) augmented with probabilities: A β [p].
E N D
EECS 595 / LING 541 / SI 661 Natural Language Processing Fall 2005 Lecture Notes #6
Lexicalized and probabilistic parsing
Probabilistic CFG • G (N, Σ, P, S) • Non-terminals (N) • Terminals (Σ) • Productions (P) augmented with probabilities: A β [p]
Disambiguation as a probability problem • P(T,S) = P(T) P(S|T) = P(T) • P(Tl) = .15 * .40 * .05 * .05 * .35 * .75 * .40 * .40 * .40 * .30 * .40 * .50 = 1.5 x 10-6 • P(Tr) = .15 * .40 * .40 * .05 * .05 * .75 * .40 * .40 * .40 * .30 * .40 * .50 = 1.7 x 10-6
Probabilistic parsing • Probabilistic Earley algorithm • Top-down parser with a dynamic programming table • Cocke-Younger-Kasami (CYK) algorithm • Bottom-up parser with a dynamic programming table • Probabilities come from a Treebank.
Dependency grammars • Lexical dependencies between head words • Top-level predicate of a sentence is the root • Useful for free word order languages • Also simpler to parse
Dependencies S VP NP NP NNP VBS JJ NNS John likes tabby cats
Introduction • Meaning representation languages: capturing the meaning of linguistic utterances using formal notation • Example: deciding what to order at a restaurant by reading a menu • Example: answering a question on an exam • Semantic analysis: mapping between language and real life • I have a car: ∃x,y: Having(x) ^ Haver(speaker,x) ^ HadThing(y,x) ^ Car(y)
Verifiability • Example: Does LeDog serve vegetarian food? • Knowledge base (KB) • Sample entry in KB: Serves(LeDog,Vegetarian Food) • Convert question to logical form and verify its truth value against the knowledge base
Unambiguousness • Example:I want to eat some place near UM.(multiple interpretations) • Interpretation is important • Preferred interpretations • Vagueness: I want to eat Italian food.- what particular food?
Canonical form • Does LeDog have vegetarian dishes? • Do they have vegetarian food at LeDog? • Are vegetarian dishes served at LeDog? • Does LeDog serve vegetarian fare? • Having vs. serving • Food vs. fare vs. dishes (each is ambiguous but one sense of each matches the others) • word sense disambiguation
Inference and variables; expressiveness • Inference and variables: • I’d like to find a restaurant that serves vegetarian food. • Serves (x,VegetarianFood) • System’s ability to draw valid conclusions based on the meaning representations of inputs and its store of background knowledge. • Expressiveness: • system must be able to handle a wide range of subject matter
Predicate-argument structure • I want Italian food. NP want NP • I want to spend less than five dollars. NP want Inf-VP • I want it to be close by here. NP want NP Inf-VP • Thematic roles: e.g. entity doing the wanting vs. entity that is wanted (linking surface arguments with the semantic=case roles) • Syntactic selection restrictions: I found to fly to Dallas. • Semantic selection restrictions: The risotto wanted to spend less than ten dollars. • Make a reservation for this evening for a table for two persons at eight: Reservation (Hearer,Today,8PM,2)
First-order predicate calculus (FOPC) • Formula AtomicFormula | Formula Connective Formula | Quantifier Variable … Formula | ¬ Formula | (Formula) • AtomicFormula Predicate (Term…) • Term Function (Term…) | Constant | Variable • Connective ∧| ⋁ | ⇒ • Quantifier ∀ | ∃ • Constant A | VegetarianFood | LeDog • Variable x | y | … • Predicate Serves | Near | … • Function LocationOf | CuisineOf | …
Example • I only have five dollars and I don’t have a lot of time. • Have(Speaker,FiveDollars) ∧¬Have(Speaker,LotOfTime) • variables: • Have(x,FiveDollars) ∧¬Have(x,LotOfTime) • Note: grammar is recursive
Semantics of FOPC • FOPC sentences can be assigned a value of true or false. • LeDog is near UM. • Near(LocationOf(LeDog),LocationOf(UM))
Variables and quantifiers • A restaurant that serves Mexican food near UM. • ∃ x: Restaurant(x) ∧ Serves(x,MexicanFood)∧ Near(LocationOf(x),LocationOf(UM)) • All vegetarian restaurants serve vegetarian food. • x: VegetarianRestaurant(x) ⇒Serves (x,VegetarianFood) • If this sentence is true, it is also true for any substitution of x. However, if the condition is false, the sentence is always true.
Inference • Modus ponens:⇒ • Example:VegetarianRestaurant(Joe’s) x: VegetarianRestaurant(x) ⇒ Serves(x,VegetarianFood)Serves(Joe’s,VegetarianFood)
Uses of modus ponens • Forward chaining: as individual facts are added to the database, all derived inferences are generated • Backward chaining: starts from queries. Example: the Prolog programming language • father(X, Y) :- parent(X, Y), male(X).parent(john, bill).parent(jane, bill).female(jane).male (john).?- father(M, bill).
Examples from Russell&Norvig (1) • 7.2. p.213 • Not all students take both History and Biology. • Only one student failed History. • Only one student failed both History and Biology. • The best history in History was better than the best score in Biology. • Every person who dislikes all vegetarians is smart. • No person likes a smart vegetarian. • There is a woman who likes all men who are vegetarian. • There is a barber who shaves all men in town who don't shave themselves. • No person likes a professor unless a professor is smart. • Politicians can fool some people all of the time or all people some of the time but they cannot fool all people all of the time.
Categories & Events • Categories: • VegetarianRestaurant (Joe’s) – categories are relations and not objects • MostPopular(Joe’s,VegetarianRestaurant) – not FOPC! • ISA (Joe’s,VegetarianRestaurant) – reification (turn all concepts into objects) • AKO (VegetarianRestaurant,Restaurant) • Events: • Reservation (Hearer,Joe’s,Today,8PM,2) • Problems: • Determining the correct number of roles • Representing facts about the roles associated with an event • Ensuring that all the correct inferences can be drawn • Ensuring that no incorrect inferences can be drawn
INCIDENT: DATE 30 OCT 89 INCIDENT: LOCATION EL SALVADOR INCIDENT: TYPE ATTACK INCIDENT: STAGE OF EXECUTION ACCOMPLISHED INCIDENT: INSTRUMENT ID INCIDENT: INSTRUMENT TYPEPERP: INCIDENT CATEGORY TERRORIST ACT PERP: INDIVIDUAL ID "TERRORIST" PERP: ORGANIZATION ID "THE FMLN" PERP: ORG. CONFIDENCE REPORTED: "THE FMLN" PHYS TGT: ID PHYS TGT: TYPEPHYS TGT: NUMBERPHYS TGT: FOREIGN NATIONPHYS TGT: EFFECT OF INCIDENTPHYS TGT: TOTAL NUMBERHUM TGT: NAMEHUM TGT: DESCRIPTION "1 CIVILIAN"HUM TGT: TYPE CIVILIAN: "1 CIVILIAN"HUM TGT: NUMBER 1: "1 CIVILIAN"HUM TGT: FOREIGN NATIONHUM TGT: EFFECT OF INCIDENT DEATH: "1 CIVILIAN"HUM TGT: TOTAL NUMBER On October 30, 1989, one civilian was killed in a reported FMLN attack in El Salvador. MUC-4 Example
Subcategorization frames • I ate • I ate a turkey sandwich • I ate a turkey sandwich at my desk • I ate at my desk • I ate lunch • I ate a turkey sandwich for lunch • I ate a turkey sandwich for lunch at my desk - no fixed “arity” (problem for FOPC)
One possible solution • Eating1 (Speaker) • Eating2 (Speaker, TurkeySandwich) • Eating3 (Speaker, TurkeySandwich, Desk) • Eating4 (Speaker, Desk) • Eating5 (Speaker, Lunch) • Eating6 (Speaker, TurkeySandwich, Lunch) • Eating7 (Speaker, TurkeySandwich, Lunch, Desk) Meaning postulates are used to tie semantics of predicates:w,x,y,z: Eating7(w,x,y,z) ⇒ Eating6(w,x,y) Scalability issues again!
Another solution - Say that everything is a special case of Eating7 with some arguments unspecified: ∃w,x,y Eating (Speaker,w,x,y) - Two problems again: • Too many commitments (e.g., no eating except at meals: lunch, dinner, etc.) • No way to individuate events:∃w,x Eating (Speaker,w,x,Desk)∃w,y Eating (Speaker,w,Lunch,y) – cannot combine into∃w Eating (Speaker,w,Lunch,Desk)
Reification • ∃ w: Isa(w,Eating) ∧ Eater(w,Speaker) ∧ Eaten(w,TurkeySandwich) – equivalent to sentence 5. • Reification: • No need to specify fixed number of arguments for a given surface predicate • No more roles are postulated than mentioned in the input • No need for meaning postulates to specify logical connections among closely related examples
Representing time • I arrived in New York • I am arriving in New York • I will arrive in New York • ∃ w: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork)
Representing time • ∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ EndPoint(I,e) ∧ Precedes (e,Now) • ∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ MemberOf(i,Now) • ∃ i,e,w,t: Isa(w,Arriving) ∧ Arriver(w,Speaker) ∧ Destination(w,NewYork) ∧ IntervalOf(w,i) ∧ StartPoint(i,s) ∧ Precedes (Now,s)
Representing time • We fly from San Francisco to Boston at 10. • Flight 1390 will be at the gate an hour now. • Use of tenses • Flight 1902 arrived late. • Flight 1902 had arrived late. • “similar” tenses • When Mary’s flight departed, I ate lunch • When Mary’s flight departed, I had eaten lunch • reference point
Aspect • Stative: I know my departure gate • Activity: John is flyingno particular end point • Accomplishment: Sally booked her flightnatural end point and result in a particular state • Achievement: She found her gate • Figuring out statives:* I am needing the cheapest fare.* I am wanting to go today.* Need the cheapest fare!
Representing beliefs • Want, believe, imagine, know - all introduce hypothetical worlds • I believe that Mary ate British food. • Reified example: • ∃ u,v: Isa(u,Believing) ∧ Isa(v,Eating) ∧ Believer (u,Speaker) ∧ BelievedProp(u,v) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood) However this implies also: • ∃ u,v: Isa(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood) • Modal operators: • Believing(Speaker,Eating(Mary,BritishFood)) - not FOPC! – predicates in FOPC hold between objects, not between relations. • Believes(Speaker, ∃ v: ISA(v,Eating) ∧ Eater(v,Mary) ∧ Eaten(v,BritishFood))
Modal operators • Beliefs • Knowledge • Assertions • Issues: If you are interested in baseball, the Red Sox are playing tonight.
Examples from Russell&Norvig (2) • 7.3. p.214 • One more outburst like that and you'll be in comptempt of court. • Annie Hall is on TV tonight if you are interested. • Either the Red Sox win or I am out ten dollars. • The special this morning is ham and eggs. • Maybe I will come to the party and maybe I won't. • Well, I like Sandy and I don't like Sandy.