Natural Language Processing Lecture 2: Semantics
Last Lecture • Motivation • Paradigms for studying language • Levels of NL analysis • Syntax • Parsing • Top-down • Bottom-up • Chart parsing
Today’s Lecture • DCGs and parsing in Prolog • Semantics • Logical representation schemes • Procedural representation schemes • Network representation schemes • Structured representation schemes
Parsing in PROLOG • How do you represent a grammar in PROLOG?
Writing a CFG in PROLOG • Consider the rule S -> NP VP • We can reformulate this as an axiom: • A sequence of words is a legal S if it begins with a legal NP that is followed by a legal VP • What about s(P1, P3):-np(P1, P2), vp(P2, P3)? • There is an S between position P1 and P3 if there is a position P2 such that there is an NP between P1 and P2 and a VP between P2 and P3
Inputs • The sentence John ate the cat can be described by word positions: • word(john, 1, 2) • word(ate, 2, 3) • word(the, 3, 4) • word(cat, 4, 5) • Or (better) use a list representation: • [john, ate, the, cat]
Lexicon • First representation • isname(john), isverb(ate) • v(P1, P2):- word(Word, P1, P2), isverb(Word) • List representation • name([john|T], T).
A simple PROLOG grammar
s(P1, P3) :- np(P1, P2), vp(P2, P3).
np(P1, P3) :- art(P1, P2), n(P2, P3).
np(P1, P3) :- name(P1, P3).
pp(P1, P3) :- p(P1, P2), np(P2, P3).
vp(P1, P2) :- v(P1, P2).
vp(P1, P3) :- v(P1, P2), np(P2, P3).
vp(P1, P3) :- v(P1, P2), pp(P2, P3).
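Assembled into a self-contained program, the grammar can be run against the position-encoded sentence. This is a sketch: the lexicon facts are ours, and name/2 is renamed to pname/2 because name/2 is a built-in in some Prolog systems.

```prolog
% Position-encoded input: "John ate the cat" spans positions 1..5.
word(john, 1, 2).
word(ate,  2, 3).
word(the,  3, 4).
word(cat,  4, 5).

% Lexicon: category tests over words.
isname(john).
isverb(ate).
isart(the).
isnoun(cat).

% Category predicates look the word up by position.
% (pname/2 rather than name/2: name/2 is a built-in in some Prologs.)
pname(P1, P2) :- word(W, P1, P2), isname(W).
v(P1, P2)     :- word(W, P1, P2), isverb(W).
art(P1, P2)   :- word(W, P1, P2), isart(W).
n(P1, P2)     :- word(W, P1, P2), isnoun(W).

% Grammar rules from the slide (the subset this sentence needs).
s(P1, P3)  :- np(P1, P2), vp(P2, P3).
np(P1, P3) :- art(P1, P2), n(P2, P3).
np(P1, P3) :- pname(P1, P3).
vp(P1, P2) :- v(P1, P2).
vp(P1, P3) :- v(P1, P2), np(P2, P3).
```

The query ?- s(1, 5). succeeds, confirming that an S spans the whole input.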
Definite clause grammars (DCGs) • PROLOG provides an operator (-->) that supports DCGs • Rules look like CFG notation • PROLOG automatically translates these into ordinary clauses
DCGs and Prolog
Prolog grammar:
s(P1, P3) :- np(P1, P2), vp(P2, P3).
np(P1, P3) :- art(P1, P2), n(P2, P3).
np(P1, P3) :- name(P1, P3).
pp(P1, P3) :- p(P1, P2), np(P2, P3).
vp(P1, P2) :- v(P1, P2).
vp(P1, P3) :- v(P1, P2), np(P2, P3).
vp(P1, P3) :- v(P1, P2), pp(P2, P3).
DCG notation:
s --> np, vp.
np --> art, n.
np --> name.
pp --> p, np.
vp --> v.
vp --> v, np.
vp --> v, pp.
DCG lexicon:
name --> [john]. v --> [ate]. art --> [the]. n --> [cat].
Equivalent Prolog lexicon:
name([john|P], P). v([ate|P], P). art([the|P], P). n([cat|P], P).
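A minimal runnable version of the DCG side is sketched below. The name//0 nonterminal is renamed to pn//0, because the name/2 clause it translates to would collide with the built-in name/2 in some Prolog systems.

```prolog
% DCG rules: Prolog translates each into a clause with two extra
% list arguments, exactly like the hand-written grammar.
s  --> np, vp.
np --> art, n.
np --> pn.        % pn rather than name: name/2 is a built-in in some Prologs
vp --> v.
vp --> v, np.

% Lexicon, also in DCG notation.
pn  --> [john].
v   --> [ate].
art --> [the].
n   --> [cat].
```

The query ?- s([john, ate, the, cat], []). (equivalently ?- phrase(s, [john, ate, the, cat]).) succeeds.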
Building a tree with DCGs • We can add extra arguments to DCGs to represent a tree: • s --> np, vp. becomes • s(s(NP, VP)) -->np(NP), vp(VP).
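For instance (a sketch; the lexicon entries carrying their own tree leaves are our addition):

```prolog
% DCG with an extra argument that builds the parse tree.
s(s(NP, VP))   --> np(NP), vp(VP).
np(np(ART, N)) --> art(ART), n(N).
np(np(NAME))   --> name(NAME).
vp(vp(V))      --> v(V).
vp(vp(V, NP))  --> v(V), np(NP).

% Lexicon entries return the corresponding tree fragments.
name(name(john)) --> [john].
v(v(ate))        --> [ate].
art(art(the))    --> [the].
n(n(cat))        --> [cat].
```

The query ?- s(Tree, [john, ate, the, cat], []). binds Tree = s(np(name(john)), vp(v(ate), np(art(the), n(cat)))).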
An ambiguous DCG
s(s(NP, VP)) --> np(NP), vp(VP).
np(np(ART, N)) --> art(ART), n(N).
np(np(NAME)) --> name(NAME).
np(np(ART, N, PP)) --> art(ART), n(N), pp(PP).
pp(pp(P, NP)) --> p(P), np(NP).
vp(vp(V)) --> v(V).
vp(vp(V, NP)) --> v(V), np(NP).
vp(vp(V, PP)) --> v(V), pp(PP).
vp(vp(V, NP, PP)) --> v(V), np(NP), pp(PP).
% Lexicon
art(art(the)) --> [the].
n(n(man)) --> [man].
n(n(boy)) --> [boy].
n(n(telescope)) --> [telescope].
v(v(saw)) --> [saw].
p(p(with)) --> [with].
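Loading this grammar (minus the np --> name rule, for which the slide's lexicon has no entries) and collecting all parses of the man saw the boy with the telescope shows the attachment ambiguity directly:

```prolog
s(s(NP, VP))       --> np(NP), vp(VP).
np(np(ART, N))     --> art(ART), n(N).
np(np(ART, N, PP)) --> art(ART), n(N), pp(PP).
pp(pp(P, NP))      --> p(P), np(NP).
vp(vp(V))          --> v(V).
vp(vp(V, NP))      --> v(V), np(NP).
vp(vp(V, PP))      --> v(V), pp(PP).
vp(vp(V, NP, PP))  --> v(V), np(NP), pp(PP).

% Lexicon
art(art(the))   --> [the].
n(n(man))       --> [man].
n(n(boy))       --> [boy].
n(n(telescope)) --> [telescope].
v(v(saw))       --> [saw].
p(p(with))      --> [with].
```

The query ?- findall(T, s(T, [the,man,saw,the,boy,with,the,telescope], []), Ts), length(Ts, N). gives N = 2: the PP attaches either inside the object NP (the boy who has the telescope) or to the VP (saw by means of the telescope).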
Semantics • What does it mean?
Semantic ambiguity • A sentence may have a single syntactic structure but multiple semantic interpretations • Every boy loves a dog (one dog loved by all boys, or a possibly different dog per boy) • Vagueness – some senses are more specific than others • “Person” is vaguer than “woman” • Quantifiers: Many people saw the accident (how many is “many”?)
Logical forms • Most common is first-order predicate calculus (FOPC) • PROLOG is an ideal implementation language
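As a sketch of what such logical forms look like as Prolog terms, here are the two quantifier scopings of Every boy loves a dog from the previous slide. The functors all/2, exists/2, imp/2, and and/2 are our own naming convention, not built-ins.

```prolog
% Wide-scope universal: each boy may love a different dog.
logical_form(wide_universal,
    all(x, imp(boy(x), exists(y, and(dog(y), loves(x, y)))))).

% Wide-scope existential: one particular dog is loved by every boy.
logical_form(wide_existential,
    exists(y, and(dog(y), all(x, imp(boy(x), loves(x, y)))))).
```

Since logical forms are just terms, Prolog programs can inspect and transform them by unification, which is part of why it is a convenient implementation language here.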
Thematic roles • Consider the following sentences: • John broke the window with the hammer • The hammer broke the window • The window broke • The syntactic structure is different, but John, the hammer, and the window have the same semantic roles in each sentence
Themes/Cases • We can define a notion of theme or case • John broke the window with the hammer • The hammer broke the window • The window broke • John is the AGENT • The window is the THEME (syntactic OBJECT -- what was Xed) • The hammer is the INSTR(ument)
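A sketch of these role assignments as Prolog facts (the frame/2 encoding and the role functors are our own choice):

```prolog
:- use_module(library(lists)).  % member/2

% Case frames for the three "break" sentences.
frame(s1, [agent(john), theme(window), instr(hammer)]).  % John broke the window with the hammer
frame(s2, [theme(window), instr(hammer)]).               % The hammer broke the window
frame(s3, [theme(window)]).                              % The window broke

% Despite three different syntactic subjects, the THEME is constant:
theme_of(S, X) :- frame(S, Roles), member(theme(X), Roles).
```

The query ?- theme_of(S, window). succeeds for all three sentences, even though the window is the syntactic subject only in the third.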
Case Frames • Sarah fixed the chair with glue: • EVENT: fix • TIME: past • AGENT: Sarah • THEME: chair • INSTR: glue
Network Representations • Examples: • Semantic networks • Conceptual dependencies • Conceptual graphs
Semantic networks • General term encompassing graph representations for semantics • Good for capturing notions of inheritance • Think of OOP
Part of a type hierarchy
ALL
  PHYSOBJ
    ANIMATE
      DOG
      PERSON
    NON-ANIMATE
      NON-LIVING
      VEGETABLE
  SITUATION
    EVENT
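The hierarchy can be encoded as isa/2 facts, with inheritance as the transitive closure of the edges (a sketch; isa/2 and subtype/2 are our own names):

```prolog
% One isa/2 fact per edge in the hierarchy.
isa(physobj, all).          isa(situation, all).
isa(event, situation).
isa(animate, physobj).      isa(non_animate, physobj).
isa(dog, animate).          isa(person, animate).
isa(non_living, non_animate).
isa(vegetable, non_animate).

% A type subsumes itself and everything reachable below it.
subtype(T, T).
subtype(Sub, Super) :- isa(Sub, Mid), subtype(Mid, Super).
```

?- subtype(dog, physobj). succeeds while ?- subtype(dog, situation). fails, so properties attached to PHYSOBJ are inherited by DOG but EVENT properties are not.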
Strengths of semantic networks • Ease the development of lexicons through inheritance • Reasonably sized grammars can incorporate hundreds of features • Provide a richer set of semantic relationships between word senses to support disambiguation
Conceptual dependencies • Influential in early semantic representations • Base representation on a small set of primitives
Primitives for conceptual dependency • Transfer • ATRANS - abstract transfer (as in transfer of ownership) • PTRANS - physical transfer • MTRANS - mental transfer (as in speaking) • Bodily activity • PROPEL (applying force), MOVE (a body part), GRASP, INGEST, EXPEL • Mental action • CONC (conceptualize or think) • MBUILD (perform inference)
Problems with conceptual dependency • A very ambitious project • Tries to reduce all semantics to a single canonical form that is syntactically identical for all sentences with the same meaning • The primitives turn out to be inadequate for inference • Larger structures must be built out of the primitives, and inference performed over those structures
Structured representation schemes • Frames • Scripts
Frames • Much of the inference required for NLU involves making assumptions about what is typically true about a situation • Encode this stereotypical information in a frame • Frames resemble case frames, but at a higher level of abstraction
Frames • For an (old) PC: Class PC(p): Roles: Keyb, Disk1, MainBox Constraints: Keyboard(Keyb) & PART_OF(Keyb, p) & CONNECTED_TO(Keyb,KeyboardPlug(MainBox)) & DiskDrive(Disk1) & PART-OF(Disk1, p) & CONNECTED_TO(Disk1, DiskPort(MainBox)) & CPU(MainBox) & PART_OF(MainBox, p)
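A sketch of how such a frame might be checked in Prolog. The instance facts (kb1, dd1, box1, pc1) are invented example data, and the constraint set is simplified to existence of connected role fillers.

```prolog
% Facts describing one concrete machine.
keyboard(kb1).     part_of(kb1, pc1).   connected_to(kb1, kb_plug_box1).
disk_drive(dd1).   part_of(dd1, pc1).   connected_to(dd1, disk_port_box1).
cpu(box1).         part_of(box1, pc1).

% P satisfies the PC frame if fillers for every role exist
% and the constraints hold.
pc(P) :-
    keyboard(K),   part_of(K, P), connected_to(K, _),
    disk_drive(D), part_of(D, P), connected_to(D, _),
    cpu(B),        part_of(B, P).
```

?- pc(pc1). succeeds; ?- pc(pc2). fails, since no role fillers are part of pc2.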
Scripts • A means of identifying common situations in a particular domain • A means of generating expectations • We precompile information, rather than recomputing from first principles
Scripts • Travel by plane: • Roles: Actor, Clerk, Source, Dest, Airport, Ticket, Money, Airplane • Constraints: Person(Actor), Value(Money, Price(Ticket)), . . . • Preconditions: Owns(Actor, Money), At(Actor, Source) • Effects: not(Owns(Actor, Money)), not(At(Actor, Source)), At(Actor, Dest) • Decomposition: • GoTo(Actor, Airport) • BuyTicket(Actor, Clerk, Money, Ticket),. . .
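A sketch of the script's precondition check in Prolog. The script/3 encoding and predicate names are our own, and the roles are left as constants for brevity.

```prolog
:- use_module(library(lists)).  % member/2

% Preconditions and effects of the plane-travel script.
script(travel_by_plane,
       preconds([owns(actor, money), at(actor, source)]),
       effects([del(owns(actor, money)), del(at(actor, source)),
                add(at(actor, dest))])).

% A script is applicable when every precondition holds in the state.
applicable(Script, State) :-
    script(Script, preconds(Ps), _),
    forall(member(P, Ps), member(P, State)).
```

?- applicable(travel_by_plane, [owns(actor, money), at(actor, source)]). succeeds; remove owns(actor, money) from the state and it fails, generating the expectation that the actor must first obtain money.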
Issues with Scripts • Script selection • How do we decide which script is relevant? • Where are we in the script?
NLP -- Where are we? • We’re five years away (??) • Call 1-888-NUANCE9 (banking/airline ticket demo) • 1-888-LSD-TALK (Weather information) • Google • Ask Jeeves • Office Assistant