Dependency Hashing for n-best CCG Parsing

Dominick Ng and James R. Curran • Presented by Yun Huang

Background: CCG • CCG derivation • Dependency evaluation • All components of a dependency structure must match the gold standard • Precision / recall / F-score
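The evaluation above can be sketched in a few lines. This is a minimal illustration, assuming each dependency is represented as a hashable tuple (the tuple layout and the function name `dependency_prf` are hypothetical, not taken from the slides). A predicted dependency counts as correct only if the whole tuple equals a gold tuple.

```python
def dependency_prf(predicted, gold):
    """Return (precision, recall, f_score) for two collections of dependency tuples.

    A dependency is correct only if every component of its tuple matches
    a gold dependency exactly (set intersection on whole tuples).
    """
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score
```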

Background: CCGbank • CCGbank was created by converting the phrase-structure trees in the PTB into normal-form CCG derivations (99.44% coverage of the PTB).

Background: C&C parser • Supertagger: assigns possible lexical categories to each word (e.g. S\NP, (S\NP)/PP for swim) • Tag dictionary extracted from training data • Adaptive supertagging: β and k • C&C parser: log-linear model parser • POS tags and lexical categories as input • CKY chart parsing • N-best reranking
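A rough sketch of how the β threshold and adaptive supertagging interact, assuming β keeps every category whose probability is within a factor β of the best category for that word, and that the parser falls back to a more permissive β when it finds no analysis. The function names, the `try_parse` callback, and the β values in the usage below are hypothetical. The tag-dictionary cutoff k is not modelled here.

```python
def prune_supertags(category_probs, beta):
    """Keep the categories whose probability is at least beta times the best one."""
    best = max(category_probs.values())
    return {cat: p for cat, p in category_probs.items() if p >= beta * best}


def adaptive_supertag(sentence_probs, betas, try_parse):
    """Try each beta in turn (strictest first) until try_parse succeeds.

    sentence_probs: one {category: probability} dict per word.
    try_parse: hypothetical callback returning a parse, or None on failure.
    """
    for beta in betas:
        pruned = [prune_supertags(probs, beta) for probs in sentence_probs]
        result = try_parse(pruned)
        if result is not None:
            return result, beta
    return None, None
```

For example, with probabilities `{r"S\NP": 0.6, r"(S\NP)/PP": 0.3, "N": 0.01}` for swim, β = 0.1 keeps the first two categories and drops N.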

Ambiguity in n-best CCG parsing • Spurious ambiguity • Normal-form derivations (usually right-branching) • Absorption ambiguity • Diversity problem: the n-best CCG derivations are distinct, but many yield duplicate dependencies

Dependency Hashing (1) • Constraint: no n-best candidate may have the same dependencies as a candidate already in the list. • Similar to SMT, where duplicate strings are removed • Which duplicate to delete: the one inserted later, or the one with the lower score?
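The constraint can be illustrated with a post-hoc filter over a finished candidate list. This is a simplification for illustration only: it assumes candidates are (score, dependencies) pairs and resolves the open question above by keeping the higher-scoring duplicate, which is one possible policy rather than the one stated on the slide. The name `dedupe_nbest` is hypothetical.

```python
def dedupe_nbest(candidates):
    """Return candidates best-first, keeping one candidate per distinct dependency set.

    candidates: iterable of (score, dependencies) pairs, where dependencies
    is an iterable of hashable dependency tuples.
    """
    seen = set()
    unique = []
    for score, deps in sorted(candidates, key=lambda c: c[0], reverse=True):
        key = frozenset(deps)  # order of dependencies does not matter
        if key not in seen:
            seen.add(key)
            unique.append((score, deps))
    return unique
```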

Dependency Hashing (2) • Implementation: • 32-bit hash value for each dependency • Bit-wise XOR to combine sub-derivations • Only the hash value is kept, no hash table • Collision: a candidate with different dependencies can be mistaken for a duplicate, so some useful dependencies are missed
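A minimal sketch of the hashing scheme, assuming CRC32 as the 32-bit hash (the slides do not say which hash function is used) and a tuple representation for dependencies. Both function names are hypothetical.

```python
import zlib


def dependency_hash(dep):
    """Map one dependency tuple to a 32-bit value (CRC32 is an assumed stand-in)."""
    text = "\t".join(str(field) for field in dep)
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF


def combine(*hashes):
    """XOR together the hashes of dependencies or of whole sub-derivations."""
    result = 0
    for h in hashes:
        result ^= h
    return result
```

Because XOR is commutative and associative, the hash of a derivation does not depend on the order in which its sub-derivations were built, so two derivations with the same dependencies get the same value. One caveat of XOR is that the same value combined twice cancels to zero.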

Diversity experiments • Dependency • Grammatical relation

Parsing Results • Oracle • Reranking upper bound • Reranking Gap
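The oracle can be sketched as follows, assuming it means picking, for each sentence, the n-best candidate whose dependencies score the highest F-score against the gold standard. That choice bounds what any reranker over the same list could achieve. The function names and the set-based representation are hypothetical.

```python
def f_score(predicted, gold):
    """F-score between two collections of dependency tuples."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def oracle_parse(nbest, gold):
    """Return the candidate in the n-best list with the highest F-score."""
    return max(nbest, key=lambda deps: f_score(deps, gold))
```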

Three types of error • Grammar error • Only a subset of CCGbank rules is used • Seen rule constraint • Supertagger error • Restricted categories by frequency cutoff • Probability threshold β and cutoff k • Model error • Suboptimal parse

Grammar Error • Given gold-standard categories, the parser F-score is 99.49%, with 95.61% coverage • Grammar error therefore costs about 0.5% in F-score and about 4.4% in coverage

Supertagger and model error • Supertagger error: differs from oracle • Model error: differs from baseline

More experiments • Tradeoff of speed and accuracy • Gold/automatic POS tags

Conclusion • Dependency hashing for n-best CCG parsing • Avoids derivations with the same dependencies • Increases diversity in the n-best list • Comprehensive error analysis • Grammar error: 0.5% • Supertagger error: 5% • Model error: 7.5%

Thank you Q & A
