
Combinatory Categorial Grammar

Combinatory Categorial Grammar. Weiwei Sun (孙薇薇), Institute of Computer Science and Technology, Peking University. Outline. Local, Nonlocal. TAG, HPSG, LFG. Combinatory categorial grammar. Grammar formalisms and linguistic theories. Linguistics aims to explain natural language: What is universal grammar? What are language-specific constraints?


Presentation Transcript


  1. Combinatory Categorial Grammar • Weiwei Sun (孙薇薇), Institute of Computer Science and Technology, Peking University

  2. Outline • Local, Nonlocal • TAG, HPSG, LFG • Combinatory categorial grammar

  3. Grammar formalisms and linguistic theories • Linguistics aims to explain natural language: • What is universal grammar? • What are language-specific constraints? • Formalisms are mathematical theories: • They provide a language in which linguistic theories can be expressed (like calculus for physics) • They define elementary objects (trees, strings, feature structures) and recursive operations which generate complex objects from simple objects. • They do impose linguistic constraints (e.g. on the kinds of dependencies they can capture)

  4. Lexicalized formalisms • Lexicalized formalisms: • TAG, HPSG, LFG and CCG • The lexicon: • pairs words with elementary objects • specifies all language-specific information • The grammatical operations: • are universal • define (and impose constraints on) recursion

  5. TAG, HPSG, LFG and CCG They describe different kinds of linguistic objects: • TAG: trees • LFG is a multi-level theory based on a projection architecture relating different types of linguistic objects • trees, feature structures, … • HPSG: typed feature structures • CCG: (syntactic and semantic) types

  6. (Lexicalized) Tree-Adjoining Grammar • TAG is a tree-rewriting formalism: • TAG defines operations (substitution and adjunction) on trees. • The elementary objects in TAG are trees (not strings) • TAG is lexicalized: • Each elementary tree is anchored to a lexical item (word) • “Extended domain of locality”: The elementary tree contains all arguments of the anchor. • TAG requires a linguistic theory which specifies the shape of these elementary trees. • TAG is mildly context-sensitive: • can capture Dutch crossing dependencies • but is still efficiently parseable

  7. a1: X Y X Y Substitute Y X a3: a2: a1 a2 a3 TAG substitution (arguments) Derived tree: Derivation tree:

  8. TAG adjunction (modifiers) • [Figure: auxiliary tree b1 with root X and foot node X* is adjoined into tree a1 at a node X; panels: Derived tree, Derivation tree]

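  9. A small TAG lexicon • a1: initial tree anchored by eats, with S over NP and VP, and VP over VBZ (eats) and NP • a2: NP anchored by John • a3: NP anchored by tapas • b1: auxiliary tree anchored by always, with root VP, foot node VP* and RB (always)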

  10. A TAG derivation • [Figure: a2 (John) and a3 (tapas) are substituted at the NP nodes of a1 (eats); b1 (always) has not yet been adjoined]

  11. A TAG derivation • [Figure: b1 (always) is adjoined at the VP node of a1, giving the derived tree for "John always eats tapas"; the derivation tree links a1 to a2, a3 and b1]
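The two TAG operations used in the derivation above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the original slides: the `Node` class and the `substitute`, `adjoin` and `derive` helpers are hypothetical names, and the order RB before VP* inside b1 is an assumption chosen so that the yield matches English word order.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    label: str                                   # e.g. "S", "NP", "VP"
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None                   # lexical anchor, if any
    subst: bool = False                          # open substitution site
    foot: bool = False                           # foot node of an auxiliary tree


def leaves(n):
    """Return the words at the frontier of the tree, left to right."""
    if n.word is not None:
        return [n.word]
    return [w for c in n.children for w in leaves(c)]


def substitute(tree, label, initial):
    """Replace the first open substitution node labelled `label` (in place)."""
    for i, c in enumerate(tree.children):
        if c.subst and c.label == label:
            tree.children[i] = initial
            return True
        if substitute(c, label, initial):
            return True
    return False


def _plug_foot(aux, subtree):
    """Replace the foot node of `aux` with `subtree` (in place)."""
    for i, c in enumerate(aux.children):
        if c.foot:
            aux.children[i] = subtree
            return True
        if _plug_foot(c, subtree):
            return True
    return False


def adjoin(tree, label, aux):
    """Adjoin `aux` at the first internal node labelled `label` (in place)."""
    if tree.label == label and tree.children and not tree.foot:
        excised = Node(tree.label, tree.children)  # subtree below the site
        _plug_foot(aux, excised)
        tree.children = aux.children               # aux root has the same label
        return True
    return any(adjoin(c, label, aux) for c in tree.children)


def derive():
    # The small lexicon from slide 9.
    a1 = Node("S", [Node("NP", subst=True),
                    Node("VP", [Node("VBZ", word="eats"),
                                Node("NP", subst=True)])])
    a2 = Node("NP", word="John")
    a3 = Node("NP", word="tapas")
    # Assumption: RB precedes the foot node VP*.
    b1 = Node("VP", [Node("RB", word="always"), Node("VP", foot=True)])

    substitute(a1, "NP", a2)   # subject slot
    substitute(a1, "NP", a3)   # object slot
    adjoin(a1, "VP", b1)       # modifier
    return a1


print(" ".join(leaves(derive())))
```

Substitution fills argument slots, while adjunction splices a modifier tree into the middle of an existing tree, which is why the adverb ends up between the subject and the verb.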

  12. Head-Driven Phrase Structure Grammar (HPSG) • HPSG is a unification-/constraint-based theory of grammar • Syntactic/semantic constraints are uniformly denoted by signs, which are represented with feature structures • Two components of HPSG • Lexical entries represent word-specific constraints • elementary objects • Principles express generic grammatical regularities • grammatical operations

  13. Sign • A sign is a formal representation that combines a phonological form with syntactic and semantic constraints • PHON (string): phonological form • SYNSEM (synsem): syntactic/semantic constraints • SYNSEM|LOCAL (local): local constraints • SYNSEM|LOCAL|CAT (category): syntactic category • SYNSEM|LOCAL|CAT|HEAD (head): syntactic head • SYNSEM|LOCAL|CAT|HEAD|MOD (synsem): modifying constraints • SYNSEM|LOCAL|CAT|VAL (valence): subcategorization frames, with SPR (list), SUBJ (list) and COMPS (list) • SYNSEM|LOCAL|CONT (content): semantic representations • SYNSEM|NONLOCAL (nonlocal): non-local dependencies, with QUE (list), REL (list) and SLASH (list) • DTRS (dtrs): daughter structures

  14. Lexical entries • Lexical entries express word-specific constraints

  15. Principles • Principles describe generic regularities of grammar • They do not correspond to construction-specific rules • Head Feature Principle • The value of HEAD must be percolated from the head daughter • Valence Principle • Subcategorization requirements that are not consumed are percolated to the mother • Immediate Dominance (ID) Principle • A mother and her immediate daughters must satisfy one of the ID schemata • Many other principles: percolation of NONLOCAL features, semantics construction, etc.

  16. Syntactic Structure • Lexical entries determine syntactic/semantic constraints of words • Lexical entries: • John: [HEAD noun, SUBJ <>, COMPS <>] • saw: [HEAD verb, SUBJ <HEAD noun>, COMPS <HEAD noun>] • Mary: [HEAD noun, SUBJ <>, COMPS <>]

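  17. Syntactic Structure • Principles determine generic constraints of grammar • Schema, applied by unification: mother [HEAD [1], SUBJ [2], COMPS [4]]; head daughter [HEAD [1], SUBJ [2], COMPS <[3] | [4]>]; non-head daughter [3] • Lexical entries: • John: [HEAD noun, SUBJ <>, COMPS <>] • saw: [HEAD verb, SUBJ <HEAD noun>, COMPS <HEAD noun>] • Mary: [HEAD noun, SUBJ <>, COMPS <>]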

  18. Syntactic Structure • Principle application produces phrasal signs • saw Mary: [HEAD verb, SUBJ <HEAD noun>, COMPS <>] • John: [HEAD noun, SUBJ <>, COMPS <>] • saw: [HEAD verb, SUBJ <HEAD noun>, COMPS <HEAD noun>] • Mary: [HEAD noun, SUBJ <>, COMPS <>]

  19. Syntactic Structure • Recursive applications of principles • John saw Mary: [HEAD verb, SUBJ <>, COMPS <>] • saw Mary: [HEAD verb, SUBJ <HEAD noun>, COMPS <>] • John: [HEAD noun, SUBJ <>, COMPS <>] • saw: [HEAD verb, SUBJ <HEAD noun>, COMPS <HEAD noun>] • Mary: [HEAD noun, SUBJ <>, COMPS <>]
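The bottom-up construction of "John saw Mary" on slides 16 to 19 can be mimicked with plain Python dictionaries. This is a deliberately simplified sketch under stated assumptions: real HPSG uses unification over typed feature structures with structure sharing, whereas the hypothetical `matches`, `head_complement` and `head_subject` helpers below only compare feature values. The PHON lists are added for readability and are not shown on the slides.

```python
def matches(requirement, sign):
    """True if every feature named in `requirement` has the same value in `sign`."""
    return all(sign.get(k) == v for k, v in requirement.items())


def head_complement(head, comp):
    """Head-complement schema: the head consumes its first COMPS element."""
    if not head["COMPS"] or not matches(head["COMPS"][0], comp):
        return None
    return {
        "PHON": head["PHON"] + comp["PHON"],
        "HEAD": head["HEAD"],          # Head Feature Principle
        "SUBJ": head["SUBJ"],          # Valence Principle: unconsumed, passed up
        "COMPS": head["COMPS"][1:],    # Valence Principle: one element consumed
    }


def head_subject(subj, head):
    """Head-subject schema: a phrase with empty COMPS consumes its subject."""
    if head["COMPS"] or not head["SUBJ"] or not matches(head["SUBJ"][0], subj):
        return None
    return {
        "PHON": subj["PHON"] + head["PHON"],
        "HEAD": head["HEAD"],          # Head Feature Principle
        "SUBJ": head["SUBJ"][1:],
        "COMPS": [],
    }


# Lexical entries from slide 16.
john = {"PHON": ["John"], "HEAD": "noun", "SUBJ": [], "COMPS": []}
mary = {"PHON": ["Mary"], "HEAD": "noun", "SUBJ": [], "COMPS": []}
saw = {"PHON": ["saw"], "HEAD": "verb",
       "SUBJ": [{"HEAD": "noun"}], "COMPS": [{"HEAD": "noun"}]}

vp = head_complement(saw, mary)   # slide 18: SUBJ <HEAD noun>, COMPS <>
s = head_subject(john, vp)        # slide 19: SUBJ <>, COMPS <>
print(vp)
print(s)
```

The two schema functions contain no word-specific information; everything language-specific sits in the three lexical entries, which is the division of labour the slides describe.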

  20. Lexical-Functional Grammar (LFG) Two (basic) levels of representation: • C-structure: represents surface syntactic configurations • word order, annotated phrase structures • represented as trees • F-structure: represents abstract grammatical functions • SUBJ, OBJ, OBL, PRED, COMP, ADJ, … • represented as attribute-value matrices (AVMs) • F-structure approximates basic predicate-argument structure, similar to a dependency representation

  21. Lexical-Functional Grammar (LFG)

  22. Lexical-Functional Grammar (LFG)

  23. Lexical-Functional Grammar (LFG)

  24. LFG Grammar Rules and Lexical Entries

  25. Outline • Local, Nonlocal • TAG, HPSG, LFG • Combinatory categorial grammar

  26. Motivation for (C)CG • Only a “minimal” extension to CFGs, so the formalism is also well understood from a logical standpoint • Transparent interface to (compositional) semantics • Cross-linguistic generalizations can be made easily • the same set of rules always applies • Flexible constituency

  27. Combinatory Categorial Grammar • Categories: specify subcat lists of words/constituents. • Combinatory rules: specify how constituents can combine. • The lexicon: specifies which categories a word can have. • Derivations: spell out process of combining constituents.

  28. CCG categories • Simple categories: NP, S, PP • Complex categories: functions which return a result when combined with an argument: • VP or intransitive verb: S\NP • Transitive verb: (S\NP)/NP • Adverb: (S\NP)\(S\NP) • PPs: ((S\NP)\(S\NP))/NP, (NP\NP)/NP • Every category has a semantic interpretation

  29. The combinatory rules • Function application: λx.f(x) a -> f(a) • X/Y Y -> X (>) • Y X\Y -> X (<) • Function composition: λx.f(x) λy.g(y) -> λx.f(g(x)) • X/Y Y/Z -> X/Z (>B) • Y\Z X\Y -> X\Z (<B) • X/Y Y\Z -> X\Z (>Bx) • Y/Z X\Y -> X/Z (<Bx) • Type-raising: a -> λf.f(a) • X -> T/(T\X) (>T) • X -> T\(T/X) (<T)

  30. Function application • Combines a function with its argument to yield a result: • (S\NP)/NP NP -> S\NP: eats + tapas yields eats tapas • NP S\NP -> S: John + eats tapas yields John eats tapas • Used in all variants of categorial grammar
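Forward and backward application are easy to try out in code. The sketch below is an illustration only, not taken from the slides: it assumes a category is either an atomic string or a nested tuple `(result, slash, argument)`, and the names `parse`, `forward_apply` and `backward_apply` are hypothetical.

```python
def parse(s):
    """Parse a category string into an atom (str) or a (result, slash, arg) tuple."""
    pos = 0

    def atom():
        nonlocal pos
        if s[pos] == "(":
            pos += 1                 # skip "("
            c = cat()
            pos += 1                 # skip ")"
            return c
        start = pos
        while pos < len(s) and s[pos] not in "()/\\":
            pos += 1
        return s[start:pos]

    def cat():
        nonlocal pos
        left = atom()
        # Slashes are treated as left-associative.
        while pos < len(s) and s[pos] in "/\\":
            slash = s[pos]
            pos += 1
            left = (left, slash, atom())
        return left

    return cat()


def forward_apply(left, right):
    # X/Y  Y  ->  X   (>)
    if isinstance(left, tuple) and left[1] == "/" and left[2] == right:
        return left[0]
    return None


def backward_apply(left, right):
    # Y  X\Y  ->  X   (<)
    if isinstance(right, tuple) and right[1] == "\\" and right[2] == left:
        return right[0]
    return None


# Raw strings keep the backslash in categories literal.
john = parse("NP")
eats = parse(r"(S\NP)/NP")
tapas = parse("NP")

vp = forward_apply(eats, tapas)      # eats tapas : S\NP
s = backward_apply(john, vp)         # John eats tapas : S
print(vp, s)
```

Because the rules only inspect the categories, the same two functions work for any lexicon; a rule that does not apply simply returns `None`.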

  31. A (C)CG derivation

  32. Type-raising and function composition • Type-raising: turns an argument into a function. Corresponds to case: • NP -> S/(S\NP) (nominative) • NP -> (S\NP)\((S\NP)/NP) (accusative) • Function composition: composes two functions (complex categories) • (S\NP)/PP PP/NP -> (S\NP)/NP • S/(S\NP) (S\NP)/NP -> S/NP
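The examples on this slide can be reproduced with a small sketch. As before, this is an assumption-laden illustration rather than the author's implementation: categories are atomic strings or `(result, slash, argument)` tuples, and `fwd`, `bwd`, `show`, `type_raise_forward`, `type_raise_backward` and `compose_forward` are hypothetical helper names.

```python
def fwd(x, y):
    return (x, "/", y)       # X/Y


def bwd(x, y):
    return (x, "\\", y)      # X\Y


def show(c):
    """Render a category with parentheses around complex sub-categories."""
    if isinstance(c, str):
        return c
    res, slash, arg = c

    def wrap(x):
        return x if isinstance(x, str) else "(" + show(x) + ")"

    return wrap(res) + slash + wrap(arg)


def type_raise_forward(x, t):
    # X  ->  T/(T\X)   (>T)
    return fwd(t, bwd(t, x))


def type_raise_backward(x, t):
    # X  ->  T\(T/X)   (<T)
    return bwd(t, fwd(t, x))


def compose_forward(left, right):
    # X/Y  Y/Z  ->  X/Z   (>B)
    if (isinstance(left, tuple) and isinstance(right, tuple)
            and left[1] == "/" and right[1] == "/" and left[2] == right[0]):
        return fwd(left[0], right[2])
    return None


vp = bwd("S", "NP")                           # S\NP
eats = fwd(vp, "NP")                          # (S\NP)/NP

nominative = type_raise_forward("NP", "S")    # subject NP, T = S
accusative = type_raise_backward("NP", vp)    # object NP, T = S\NP
john_eats = compose_forward(nominative, eats)

print(show(nominative))
print(show(accusative))
print(show(john_eats))
```

The composed category for "John eats" is a constituent that is still waiting for its object, which is the kind of non-standard constituent that flexible constituency relies on.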

  33. Type-raising and Composition • Wh-movement: • Right-node raising:

  34. Another CCG derivation

  35. CCG: semantics • Every syntactic category and rule has a semantic counterpart:

  36. The CCG lexicon • Pairs words with their syntactic categories (and semantic interpretation): • eats: (S\NP)/NP : λx.λy.eats’xy • eats: S\NP : λx.eats’x • The main bottleneck for wide-coverage CCG parsing
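A lexicon of this kind can be sketched as a dictionary from words to (category, semantics) pairs, with the lambda terms written as Python functions. The entries for eats follow the slide; the constants `john'` and `tapas'`, the tuple encoding of terms and the `LEXICON` and `derive_semantics` names are assumptions made for illustration.

```python
# Each word maps to a list of (category, semantics) pairs; a word may be ambiguous.
LEXICON = {
    "eats": [
        # (S\NP)/NP : λx.λy.eats'xy
        (r"(S\NP)/NP", lambda x: lambda y: ("eats'", x, y)),
        # S\NP : λx.eats'x
        (r"S\NP", lambda x: ("eats'", x)),
    ],
    "John": [("NP", "john'")],     # assumed constant, not on the slide
    "tapas": [("NP", "tapas'")],   # assumed constant, not on the slide
}


def derive_semantics():
    """Mirror the syntactic derivation of 'John eats tapas' in the semantics."""
    _, eats_sem = LEXICON["eats"][0]     # transitive entry
    _, john_sem = LEXICON["John"][0]
    _, tapas_sem = LEXICON["tapas"][0]

    # Forward application in syntax is function application in semantics:
    # (S\NP)/NP  NP  ->  S\NP
    vp_sem = eats_sem(tapas_sem)
    # Backward application:  NP  S\NP  ->  S
    return vp_sem(john_sem)


print(derive_semantics())
```

The term is built in the order in which the syntactic arguments are consumed: the object fills x first and the subject fills y second, following the lambda term on the slide.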

  37. Example from Chinese CCGBank

  38. Example from Chinese CCGBank

  39. Example from Chinese CCGBank

  40. Example from Chinese CCGBank

  41. Example from Chinese CCGBank

  42. Summary • CCG is a lexicalized grammar formalism • The “rules” of CCG are extremely general, just like HPSG schemata • CCG is nearly context-free • Weakly equivalent to TAG • CCG has a flexible constituent structure • CCG has a transparent syntax-semantics interface • Every syntactic category and combinatory rule has a semantic interpretation • Movement or traces don’t exist • CCG rules are type-driven, not structure-driven • E.g. intransitive verbs and VPs are indistinguishable
