
A Pipeline Model for Bottom-Up Dependency Parsing

Tenth Conference on Natural Language Learning, Shared Task, New York, USA, 2006. Ming-Wei Chang, Quang Do, Dan Roth. Computer Science Department, University of Illinois, Urbana-Champaign.


Presentation Transcript


  1. A Pipeline Model for Bottom-Up Dependency Parsing
     Tenth Conference on Natural Language Learning, Shared Task, New York, USA, 2006
     Ming-Wei Chang, Quang Do, Dan Roth
     Computer Science Department, University of Illinois, Urbana-Champaign

  2. Dependency Parsing Approach: Summary
     • Modified shift-reduce parser.
     • Actions are selected via a classifier
       + Extended action set
       + Look ahead search
     • Control policy: left to right, with step back
     • Dependency types: a separate multiclass classifier
     • Multilingual: convert non-projective languages to projective [Nivre and Nilsson, 2005]
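The last bullet presupposes a way to tell whether a tree is projective in the first place. The following is a minimal sketch of such a check, not the transformation of Nivre and Nilsson itself. The function name `is_projective` and the representation of a tree as a list of head indices (with `None` for the root) are assumptions made for illustration, and the input is assumed to be a well-formed, acyclic tree.

```python
from typing import List, Optional


def is_projective(heads: List[Optional[int]]) -> bool:
    """Return True if every arc covers only descendants of its head.

    heads[i] is the index of the parent of word i, or None for the root.
    Assumes the input is an acyclic tree (hypothetical representation).
    """

    def is_descendant(node: Optional[int], ancestor: int) -> bool:
        # Walk up the tree from `node` until we hit `ancestor` or the root.
        while node is not None:
            if node == ancestor:
                return True
            node = heads[node]
        return False

    for child, head in enumerate(heads):
        if head is None:
            continue
        low, high = sorted((head, child))
        # Every word strictly between head and child must descend from head.
        for between in range(low + 1, high):
            if not is_descendant(between, head):
                return False
    return True


# "the dog barks loudly": the -> dog, dog -> barks, loudly -> barks.
projective_tree = [1, 2, None, 2]
# Word 1 attaches to word 3 across the root (word 2): a crossing arc.
crossing_tree = [2, 3, None, 2]
```

A tree that fails this check is one that would need to be converted before a projective parser of the kind described here could recover it.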

  3. Motivation
     • Shift-reduce parsing as a pipeline model:
       • A classifier is used to determine which action to take.
       • The decision at each stage depends on previous decisions.
         + Making a decision can rely on information acquired in previous stages.
         - Making a decision can rely on incorrect information acquired in previous stages.
     • Viewed this way, we want to:
       • Reduce the number of decisions
       • Make local decisions more robust

  4. Parsing Algorithm
     [Figure: the Shift (S), Left (L), Right (R), and WaitLeft (WL) actions illustrated on the words a, b, c, d]
     • Parsing proceeds from left to right, considering pairs of (currently) consecutive words (a, b), with a < b.
     • For the pair (a, b), b must be a complete subtree before it can become the child of a.
     • Standard action set: Left, Right, Shift [Yamada and Matsumoto, 03]
       • Left: a is the parent of b
       • Right: b is the parent of a
       • Shift: the action is not Left or Right.
     + We split Shift into Shift, WaitLeft, WaitRight
       • WaitLeft: a is the parent of b, but b is the parent of other nodes. The action is deferred.
       • WaitRight: not needed.
     • Control policy: Step Back
       • Provably allows parsing in one pass over the sentence [ACL06]
       • Reduces the number of decisions
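The control loop on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `parse` and `gold_oracle` are hypothetical, the classifier is replaced by an oracle that reads the actions off a known gold tree, and "step back" is taken to mean moving the focus one pair to the left after a Left or Right action.

```python
from typing import Callable, Dict, List, Optional, Tuple

SHIFT, LEFT, RIGHT, WAIT_LEFT = "Shift", "Left", "Right", "WaitLeft"

Heads = Dict[int, int]  # child index -> parent index
Chooser = Callable[[int, int, Heads], str]


def parse(n_words: int, choose_action: Chooser) -> Tuple[Heads, List[str]]:
    """Left-to-right parse over pairs of currently adjacent words."""
    words = list(range(n_words))  # words that have not been attached yet
    heads: Heads = {}
    trace: List[str] = []
    i = 0
    while i < len(words) - 1:
        a, b = words[i], words[i + 1]
        action = choose_action(a, b, heads)
        trace.append(action)
        if action == LEFT:        # a is the parent of b
            heads[b] = a
            del words[i + 1]
            i = max(i - 1, 0)     # step back (assumed: one pair to the left)
        elif action == RIGHT:     # b is the parent of a
            heads[a] = b
            del words[i]
            i = max(i - 1, 0)     # step back
        else:                     # Shift and WaitLeft both move the focus on
            i += 1
    return heads, trace


def gold_oracle(gold: Dict[int, Optional[int]]) -> Chooser:
    """Stand-in for the classifier: derive actions from a gold tree."""

    def complete(word: int, heads: Heads) -> bool:
        # A word is a complete subtree once all its gold children are attached.
        return all(c in heads for c, h in gold.items() if h == word)

    def choose(a: int, b: int, heads: Heads) -> str:
        if gold.get(b) == a:
            return LEFT if complete(b, heads) else WAIT_LEFT
        if gold.get(a) == b and complete(a, heads):
            return RIGHT
        return SHIFT

    return choose


# "I saw pictures of cats": I -> saw, pictures -> saw, of -> pictures, cats -> of.
gold = {0: 1, 1: None, 2: 1, 3: 2, 4: 3}
heads, trace = parse(5, gold_oracle(gold))
# trace == ['Right', 'WaitLeft', 'WaitLeft', 'Left', 'Left', 'Left']
```

In this example the two WaitLeft decisions defer attaching "pictures" and "of" until their own children are in place, and the step back then lets the parser attach them without a second pass over the sentence.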

  5. A Pipeline Model with Look Ahead Search
     • Pipelining decisions may result in error accumulation:
       • The correct dependencies: [Figure: dependency tree over the words w, x, y, z]
       • If the algorithm decides w → x before x → y and x → z, we cannot recover the correct parent for y and z.
     • Correct early decisions are crucial.
     • A look ahead search algorithm takes into account future predicted actions.
     • Local decisions are more robust.

  6. A Pipeline Model with Look Ahead (cont.)
     [Figure: action sequences "a0 a1" (depth=1), "a0 a1 a2" (depth=2), and "a0 a1 a2 a3" (depth=3), each annotated "keep this action"]
     • The search algorithm performs a search of length depth.
     • Additive scoring is used to score the sequence.
     • The first action in this sequence is performed.
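The three bullets above can be sketched as a small search routine. This is a sketch under assumptions: it enumerates every action sequence exhaustively, which may not be the search strategy the authors used, and the names `best_sequence`, `look_ahead_action`, and the toy score table are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

State = Tuple[str, ...]  # toy state: the actions taken so far


def best_sequence(
    state: State,
    depth: int,
    actions: Callable[[State], Sequence[str]],
    score: Callable[[State, str], float],
    step: Callable[[State, str], State],
) -> Tuple[float, Tuple[str, ...]]:
    """Return (total score, sequence) of the best sequence of length `depth`."""
    if depth == 0:
        return 0.0, ()
    best: Optional[Tuple[float, Tuple[str, ...]]] = None
    for action in actions(state):
        tail_score, tail = best_sequence(
            step(state, action), depth - 1, actions, score, step
        )
        total = score(state, action) + tail_score  # additive scoring
        if best is None or total > best[0]:
            best = (total, (action,) + tail)
    return best if best is not None else (0.0, ())


def look_ahead_action(state, depth, actions, score, step) -> Optional[str]:
    """Search `depth` steps ahead, then commit only to the first action."""
    _, sequence = best_sequence(state, depth, actions, score, step)
    return sequence[0] if sequence else None


# Toy scores (made up for illustration): Left looks best locally,
# but Shift leads to a much better follow-up action.
TABLE: Dict[State, Dict[str, float]] = {
    (): {"Left": 0.6, "Shift": 0.4},
    ("Left",): {"Left": 0.1, "Shift": 0.2},
    ("Shift",): {"Left": 0.9, "Shift": 0.3},
}


def toy_actions(state: State) -> Sequence[str]:
    return list(TABLE.get(state, {}))


def toy_score(state: State, action: str) -> float:
    return TABLE[state][action]


def toy_step(state: State, action: str) -> State:
    return state + (action,)


greedy = look_ahead_action((), 1, toy_actions, toy_score, toy_step)   # 'Left'
careful = look_ahead_action((), 2, toy_actions, toy_score, toy_step)  # 'Shift'
```

With depth 1 the routine reduces to the greedy choice. With depth 2 the first action changes because the summed score of the whole sequence is compared, which is how look ahead makes the local decision more robust.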

  7. Experiments (for Swedish)*
     • The effect of the new action.
     • The effect of look ahead search.
     * For other languages, please refer to our paper.

  8. Analysis
     • WaitLeft and the look ahead search improve the parsing results.
     • Results can be improved by:
       • Selecting features and parameters more carefully
         • Currently we use exactly the same set of features and the same parameters for all languages.
       • Using the FEAT column properly
         • Results for languages with a FEAT column are generally worse than for languages without one.

  9. Thank you!

  10. • Finding the globally optimal predicted sequence in the pipeline model is not tractable when the depth is large.
        • In the pipeline framework, the feature vector of the current decision depends on all previous predictions.
      • The FEAT column
        • Average difference between our system and the best systems: with FEAT 4.5%, without FEAT 3.4%

  11. Labeling the Dependency Type
      • A post-processing task, performed after predicting the head of each token in the sentence.
      • This is a multi-class classification problem.
        • Consider every edge of the tree
        • Classify the edge into one of several classes
      • The parents assigned to the tokens in the first phase are used as features.
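The second phase described above can be sketched as a loop over the edges of the predicted tree. Everything specific here is an assumption made for illustration: the feature names, the tag and label names, and the rule-based `toy_classifier`, which only stands in for the trained multi-class classifier that the slide refers to.

```python
from typing import Callable, Dict, List

Features = Dict[str, str]


def edge_features(
    forms: List[str], tags: List[str], heads: Dict[int, int], child: int
) -> Features:
    """Describe one edge using the parent predicted in the first phase."""
    head = heads[child]
    return {
        "child_form": forms[child],
        "child_tag": tags[child],
        "head_form": forms[head],
        "head_tag": tags[head],
        # Which side of the child the predicted parent lies on.
        "head_side": "right" if head > child else "left",
    }


def label_edges(
    forms: List[str],
    tags: List[str],
    heads: Dict[int, int],
    classify: Callable[[Features], str],
) -> Dict[int, str]:
    """Classify every edge (child -> label); the root has no edge."""
    return {
        child: classify(edge_features(forms, tags, heads, child))
        for child in sorted(heads)
    }


def toy_classifier(features: Features) -> str:
    """Hypothetical stand-in for a trained multi-class classifier."""
    if features["child_tag"] == "DET":
        return "det"
    if features["child_tag"] == "NOUN" and features["head_tag"] == "VERB":
        return "subj" if features["head_side"] == "right" else "obj"
    return "dep"


forms = ["the", "dog", "barks"]
tags = ["DET", "NOUN", "VERB"]
heads = {0: 1, 1: 2}  # output of the first phase; word 2 is the root
labels = label_edges(forms, tags, heads, toy_classifier)
# labels == {0: 'det', 1: 'subj'}
```

Because the features are built from predicted parents, any head error from the first phase is passed on to the labeling step, which is the pipeline effect discussed in the Motivation slide.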

  12. [Figure: the Left (L), Right (R), and WaitLeft (WL) actions illustrated on the words a, b, c]

  13. Experiments • We show the effect of the new action (on Swedish).

  14. Experiments • The effect of look ahead search (on Swedish).
