Probabilistic Parsing Ling 571 Fei Xia Week 5: 10/25-10/27/05
Outline • Lexicalized CFG (Recap) • Hw5 and Project 2 • Parsing evaluation measures: ParseVal • Collins' parser • TAG • Parsing summary
Lexicalized CFG • Lexicalized rules: every nonterminal carries its head word, e.g., VP(likes) → V(likes) NP(her) • This creates a sparse data problem, so rule generation is decomposed: • First generate the head • Then generate the unlexicalized rule
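As a sketch of one such decomposition (the notation here is assumed, not taken from the slides): generate the unlexicalized rule conditioned on the parent label and its head word, then generate the head word of each non-head child:

P(A(h) → B_1(b_1) … H(h) … B_n(b_n)) ≈ P(B_1 … H … B_n | A, h) · ∏_{i ≠ head} P(b_i | B_i, A, h)

The head child H shares the parent's head word h, so only the non-head children need new head words.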
An example • he likes her
Building a statistical tool • Design a model: • Objective function: generative model vs. discriminative model • Decomposition: independence assumptions • The types of parameters and the parameter size • Training: estimate model parameters • Supervised vs. unsupervised • Smoothing methods • Decoding: find the most likely structure given the trained model and the input
Team Project 1 (Hw5) • Form a team: programming language, schedule, expertise, etc. • Understand the lexicalized model • Design the training algorithm • Work out the decoding (parsing) algorithm: augment the CYK algorithm (a sketch of the base algorithm follows below) • Illustrate the algorithms with a real example.
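As a starting point for the decoding step, here is a minimal probabilistic CYK sketch for a plain PCFG in Chomsky normal form (the grammar encoding, function names, and toy probabilities are mine, purely for illustration; the lexicalized augmentation is what Hw5 asks you to design):

```python
from collections import defaultdict

def cyk_parse(words, lexical, binary):
    """Probabilistic CYK for a PCFG in Chomsky normal form.

    lexical: dict mapping word -> list of (nonterminal, prob)
    binary:  list of (parent, left_child, right_child, prob)
    Returns the best probability for each (start, end, label) span,
    plus backpointers for recovering the best tree.
    """
    n = len(words)
    best = defaultdict(float)   # (i, j, label) -> best probability
    back = {}                   # (i, j, label) -> (split, left, right)

    # Initialize spans of length 1 with the lexical rules.
    for i, w in enumerate(words):
        for label, p in lexical.get(w, []):
            best[(i, i + 1, label)] = p

    # Fill longer spans bottom-up, trying every split point.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for parent, left, right, p in binary:
                    score = p * best[(i, k, left)] * best[(k, j, right)]
                    if score > best[(i, j, parent)]:
                        best[(i, j, parent)] = score
                        back[(i, j, parent)] = (k, left, right)
    return best, back

# Toy grammar (probabilities made up for illustration).
lexical = {"he": [("NP", 1.0)], "her": [("NP", 1.0)], "likes": [("V", 1.0)]}
binary = [("S", "NP", "VP", 1.0), ("VP", "V", "NP", 1.0)]
best, back = cyk_parse("he likes her".split(), lexical, binary)
print(best[(0, 3, "S")])  # probability of the best S over the whole sentence
```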
Team Project 2 • Task: parse real data with a real grammar extracted from a treebank. • Parser: PCFG or lexicalized PCFG • Training data: English Penn Treebank Sections 02-21 • Development data: Section 00
Team Project 2 (cont) • Hw6: extract a PCFG from the treebank • Hw7: make sure your parser works given a real grammar and real sentences; measure parsing performance • Hw8: improve parsing results • Hw10: write a report and give a presentation
Evaluation of parsers: ParseVal • Labeled recall: # of correct constituents in the parser output / # of constituents in the gold standard • Labeled precision: # of correct constituents in the parser output / # of constituents in the parser output • Labeled F-measure: 2PR/(P+R), the harmonic mean of labeled precision (P) and labeled recall (R) • Complete match: % of sents where recall and precision are 100% • Average crossing: # of crossing brackets per sent • No crossing: % of sents which have no crossing brackets.
An example Gold standard: (VP (V saw) (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))) Parser output: (VP (V saw) (NP (NP (Det the) (N man)) (PP (P with) (NP (Det a) (N telescope)))))
ParseVal measures • Gold standard: (VP, 1, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6) • System output: (VP, 1, 6), (NP, 2, 6), (NP, 2, 3), (PP, 4, 6), (NP, 5, 6) • Recall=4/4, Prec=4/5, crossing=0
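A small sketch of how these numbers fall out of the span sets (variable names are illustrative):

```python
# Labeled constituents as (label, start, end) spans from the example above.
gold = {("VP", 1, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)}
system = {("VP", 1, 6), ("NP", 2, 6), ("NP", 2, 3), ("PP", 4, 6), ("NP", 5, 6)}

correct = gold & system
recall = len(correct) / len(gold)        # 4/4 = 1.0
precision = len(correct) / len(system)   # 4/5 = 0.8
f1 = 2 * precision * recall / (precision + recall)
print(recall, precision, f1)
```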
A different annotation Gold standard: (VP (V saw) (NP (Det the) (N’ (N man))) (PP (P with) (NP (Det a) (N’ (N telescope))))) Parser output: (VP (V saw) (NP (Det the) (N’ (N man) (PP (P with) (NP (Det a) (N’ (N telescope)))))))
ParseVal measures (cont) • Gold standard: (VP, 1, 6), (NP, 2, 3), (N’, 3, 3), (PP, 4, 6), (NP, 5, 6), (N’, 6,6) • System output: (VP, 1, 6), (NP, 2, 6), (N’, 3, 6), (PP, 4, 6), (NP, 5, 6), (N’, 6, 6) • Recall=4/6, Prec=4/6, crossing=1
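And a sketch of the crossing-bracket count for this second example; a system bracket crosses a gold bracket when the two overlap without either containing the other (spans are inclusive word indices, names are mine):

```python
def crosses(a, b):
    """True if spans a=(s1,e1) and b=(s2,e2) overlap without nesting."""
    (s1, e1), (s2, e2) = a, b
    return s1 < s2 <= e1 < e2 or s2 < s1 <= e2 < e1

gold_spans = [(1, 6), (2, 3), (3, 3), (4, 6), (5, 6), (6, 6)]
system_spans = [(1, 6), (2, 6), (3, 6), (4, 6), (5, 6), (6, 6)]

crossing = sum(any(crosses(s, g) for g in gold_spans) for s in system_spans)
print(crossing)  # 1: the system's (3, 6) crosses the gold (2, 3)
```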
EVALB • A tool that calculates the ParseVal measures • To run it: evalb -p parameter_file gold_file system_output • A copy is available in my dropbox • You will need it for Team Project 2
Summary of Parsing evaluation measures • ParseVal is the most widely used: F-measure is the most important measure • The results depend on the annotation style • EVALB is a tool that calculates the ParseVal measures • Other measures are used too: e.g., accuracy of dependency links
History-based models • A history-based approach maps (T, S) into a decision sequence d_1, …, d_n • Probability of tree T for sentence S is: P(T, S) = ∏_{i=1..n} P(d_i | d_1, …, d_{i-1})
History-based models (cont) • PCFGs can be viewed as history-based models • There are other history-based models • Magerman's parser (1995) • Collins' parsers (1996, 1997, …) • Charniak's parsers (1996, 1997, …) • Ratnaparkhi's parser (1997)
Collins’ models • Model 1: Generative model of (Collins, 1996) • Model 2: Add complement/adjunct distinction • Model 3: Add wh-movement
Model 1 • First generate the head constituent label • Then generate the left and right dependents outward from the head, stopping with a STOP symbol on each side (see the sketch below)
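For a rule P(h) → L_n(l_n) … L_1(l_1) H(h) R_1(r_1) … R_m(r_m), Collins (1997) Model 1 decomposes the rule probability roughly as follows (a sketch; the distance features of the actual model are omitted here for readability):

P_h(H | P, h) · ∏_{i=1..n+1} P_l(L_i(l_i) | P, H, h) · ∏_{j=1..m+1} P_r(R_j(r_j) | P, H, h)

where L_{n+1} = R_{m+1} = STOP, so the model itself decides when to stop generating dependents on each side.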
An example Sentence: Last week Marks bought Brooks.
Model 2 • Generate a head label H • Choose left and right subcategorization frames • Generate left and right arguments • Generate left and right modifiers (a sketch follows below)
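A simplified sketch of the Model 2 decomposition (again omitting distance features): the left and right subcat frames LC and RC are generated once, and each dependent is conditioned on the remaining frame:

P_h(H | P, h) · P_lc(LC | P, H, h) · P_rc(RC | P, H, h) · ∏_i P_l(L_i(l_i) | P, H, h, LC) · ∏_j P_r(R_j(r_j) | P, H, h, RC)

Each argument that is generated is removed from the remaining frame, and STOP can only be generated once the frame is empty, which is how the complement/adjunct distinction constrains the parse.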
Model 3 • Add trace and wh-movement • Given that the LHS of a rule has a gap, there are three ways to pass down the gap • Head: S(+gap) → NP VP(+gap) • Left: S(+gap) → NP(+gap) VP • Right: SBAR(that)(+gap) → WHNP(that) S(+gap)
TAG • TAG basics • TAG extensions: • Lexicalized TAG (LTAG) • Synchronous TAG (STAG) • Multi-component TAG (MCTAG) • …
TAG basics • A tree-rewriting formalism (Joshi et al., 1975) • It can generate mildly context-sensitive languages • The primitive elements of a TAG are elementary trees • Elementary trees are combined by two operations: substitution and adjoining • TAG has been used in • Parsing, semantics, discourse, etc. • Machine translation, summarization, generation, etc.
Two types of elementary trees • Initial tree: anchored by the verb "draft" (S → NP VP, VP → V NP), with NP substitution nodes • Auxiliary tree: anchored by the adverb "still" (VP → ADVP VP*, where VP* is the foot node)
Adjoining operation: an auxiliary tree with root Y and foot node Y* is inserted at a node labeled Y in another tree; the subtree originally at that node moves down to the foot node
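A minimal sketch of the two operations on simple tuple-encoded trees (the representation and function names are mine, purely for illustration):

```python
# Trees as (label, children) pairs; a label ending in "*" marks the foot node.

def substitute(tree, site_label, initial_tree):
    """Replace leaf substitution nodes labeled site_label with an initial tree.
    (For simplicity this replaces every such leaf; a real implementation
    would address substitution sites individually.)"""
    label, children = tree
    if label == site_label and not children:
        return initial_tree
    return (label, [substitute(c, site_label, initial_tree) for c in children])

def adjoin(tree, site_label, aux_tree):
    """Insert an auxiliary tree at a node labeled site_label; the subtree
    rooted there moves down to the auxiliary tree's foot node."""
    label, children = tree
    if label == site_label:
        return _plug_foot(aux_tree, tree)
    return (label, [adjoin(c, site_label, aux_tree) for c in children])

def _plug_foot(aux_tree, subtree):
    """Replace the foot node (label ending in '*') with the displaced subtree."""
    label, children = aux_tree
    if label.endswith("*"):
        return subtree
    return (label, [_plug_foot(c, subtree) for c in children])

# The "draft" initial tree and "still" auxiliary tree from the slides.
draft = ("S", [("NP", []),
               ("VP", [("V", [("draft", [])]), ("NP", [])])])
still = ("VP", [("ADVP", [("ADV", [("still", [])])]), ("VP*", [])])

# Adjoin "still" at the VP node: S -> NP (VP (ADVP still) (VP draft NP)).
print(adjoin(draft, "VP", still))
```

In a real TAG parser, substitution and adjoining sites are addressed by node position rather than by label, and the derivation tree records each operation.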
Derivation tree • Elementary trees combine to form a derived tree; the derivation tree records which elementary trees were used, and where they were substituted or adjoined [tree diagrams omitted]
Derived tree vs. derivation tree • The mapping between the two is not 1-to-1 • Finding the best derivation is not the same as finding the best derived tree
Wh-movement • Example: What do they draft? [tree diagrams omitted]
Long-distance wh-movement • Example: What does John think they draft? [tree diagrams omitted]
Long-distance wh-movement (cont) • Example: Who did you have dinner with? [tree diagrams omitted]
TAG extensions • Lexicalized TAG (LTAG) • Synchronous TAG (STAG) • Multi-component TAG (MCTAG) • …
STAG • The primitive elements in STAG are elementary tree pairs • Used for machine translation (MT)
Summary of TAG • A formalism beyond CFG • Primitive elements are trees, not rules • Extended domain of locality • Two operations: substitution and adjoining • Parsing algorithms: CYK-style TAG parsing runs in O(n^6) time • Statistical parsers for TAG • Algorithms for extracting TAGs from treebanks