Parsing with HPSG Grammars: Overview and Implementations

Grammar Engineering: Parsing with HPSG Grammars. Miguel Hormazábal

Overview • The Parsing Problem • Parsing with constraint-based grammars • Advantages and drawbacks • Three different approaches

The Parsing Problem • Given a grammar and a sentence: • Can the grammar < S, Θ > generate the input string, or does it rule it out? • A candidate sentence must satisfy all the principles of the grammar • Coreferences as the main explanatory mechanism in HPSG

Parsing with Constraint-based Grammars • Object-based formalism • Complex specifications on signs • Structure sharing imposed by the theory • Feature structures • Sort-resolved and well-typed • Multiple information levels (PHON, SYNSEM) • Universal and language-specific principles to be met
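To make the notion of constraints on feature structures concrete, here is a minimal Python sketch of unification over nested dictionaries. It is an illustration only: the function name `unify` and the toy features are assumptions, and the sketch deliberately leaves out the sort hierarchy and the structure sharing (reentrancy) that an actual HPSG system needs.

```python
from typing import Any, Optional


def unify(a: Any, b: Any) -> Optional[Any]:
    """Unify two toy feature structures; return None if they clash.

    Feature structures are nested dicts, atomic values are strings.
    No types and no reentrancy: this is a simplification.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # incompatible values under the same feature
                result[feature] = merged
            else:
                result[feature] = value  # b adds new information
        return result
    # Atomic values (or a dict against an atom) unify only if they are equal.
    return a if a == b else None


# Hypothetical sign and constraints, for illustration only.
np_sign = {"SYNSEM": {"HEAD": "noun", "AGR": {"NUM": "sg"}}}
third_person = {"SYNSEM": {"AGR": {"NUM": "sg", "PER": "3"}}}
plural = {"SYNSEM": {"AGR": {"NUM": "pl"}}}

compatible = unify(np_sign, third_person)
clash = unify(np_sign, plural)
```

Here `compatible` carries the information of both inputs, while `clash` is `None` because the NUM values conflict. A principle of the grammar acts on a candidate sign in the same way: it either adds information or rules the sign out.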

Advantages and Drawbacks Pros: • A common formalism for all levels of linguistic Information • All information simultaneously available Cons: • Hard to modularize • Computational overhead for parser

1st Approach: Distributed Parsing • Two kinds of constraints: • Genuine: syntactic, they work as filters on the input • Spurious: semantic, they build representational structures • A single parser cannot distinguish between analytical and structure-building constraints • VERBMOBIL implementation: • Input: word lattices of speech recognition hypotheses • The parser identifies the paths that form acceptable utterances • Lattices can contain hundreds of hypotheses, most of them ungrammatical • Goal: distribute the labour of evaluating the constraints in the grammar over several processes

Distributed Parsing • Analysis strategy: Two parser units: • SYN-Parser: • Works directly with word lattices • Performs as a filter for the SEM-Parser • SEM-Parser: • Works only with successful analysis results • Performs under control by the SYN-Parser

Distributed Parsing • Processing requirements: • Incrementality: • The SYN-Parser must not hold back its results until it has a complete analysis, since that would force the SEM-Parser to wait • Interactivity: • The SYN-Parser must report back when its hypothesis failed • Efficient communication system between the parsers, based on the common grammar

Distributed Parsing • [Figure: centralized parsing compared with distributed parsing]

Distributed Parsing • Bottom-Up Hypotheses • Emitted by the SYN-Parser and sent to the SEM-Parser for semantic verification • Top-Down Hypotheses • Emitted by the SEM-Parser, failures reported back to the SYN-Parser • Completion History: • C-hist(NP-DET-N) := ((DET t0 t1) (N t’1 t2)) • C-hist(DET) := ((“the” t0 t1)) • C-hist(N) := ((“example” t’1 t2))
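The completion history can be read as a small lookup table that records which daughters, with which time spans, a hypothesis was built from. The Python sketch below encodes the example from the slide and walks it down to the words. Keying the table by category label and the helper name `leaves` are assumptions made for illustration; a real system would presumably key entries by hypothesis identifier.

```python
from typing import Dict, Iterator, List, Tuple

# One entry per daughter: (label or word, start time, end time).
Entry = Tuple[str, str, str]

# The example from the slide, written as a dictionary.
c_hist: Dict[str, List[Entry]] = {
    "NP-DET-N": [("DET", "t0", "t1"), ("N", "t'1", "t2")],
    "DET": [("the", "t0", "t1")],
    "N": [("example", "t'1", "t2")],
}


def leaves(label: str, history: Dict[str, List[Entry]]) -> Iterator[Entry]:
    """Yield the word-level entries that a hypothesis was built from."""
    for child, start, end in history[label]:
        if child in history:
            # The daughter is itself a hypothesis: follow its own history.
            yield from leaves(child, history)
        else:
            # The daughter is a word from the lattice.
            yield (child, start, end)


words = list(leaves("NP-DET-N", c_hist))
```

With such a table, a receiving parser can replay how a hypothesis was completed without re-analysing the word lattice itself.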

Distributed Parsing • Compilation of Subgrammars • Both subgrammars are derived from a common source grammar • Straightforward option: split the grammar into syntax and semantics strata • Grammar rules and lexical entries are manipulated to obtain Gsyn and Gsem

2nd Approach: Data-Oriented Parsing • Main goal: achieve domain adaptation to improve the efficiency of HPSG parsing • Assumption: exploiting the frequency and plausibility of linguistic structures within a given domain yields better results • DOP processes new input by combining structure fragments from a treebank • DOP makes it possible to assign probabilities to arbitrarily large syntactic constructions

Data-Oriented Parsing Procedure: • Parse all sentences from a training corpus using HPSG Grammar and Parser • Automatic acquisition of a stochastic lexicalized tree grammar (SLTG) • Each parse tree is decomposed into a set of subtrees. • Assignment of probabilities to each subtree

Data-Oriented Parsing • Implementation using a unification-based grammar, parsing and generation platform: LKB • First, parse each sentence of the training corpus • The resulting feature structure contains the parse tree • Each non-terminal node contains the label of the HPSG rule schema applied • Each terminal node contains the lexical type of the corresponding feature structure • After this, each parse tree is further processed

Data-Oriented Parsing • 1. Decomposition, two operations: • Root: creates ‘passive’ (closed, complete) fragments by extracting substructures • Frontier: creates ‘active’ (open, incomplete) fragments by deleting pieces of substructure • Each non-head subtree is cut off, and the cutting point is marked for substitution.
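The cutting step in the last bullet can be sketched in Python as follows. This is a simplified illustration, not the implementation from the cited work: the `Node` class, the `head` index annotation, the `*` marker for substitution nodes, and the toy category labels are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    label: str
    children: List[Node] = field(default_factory=list)
    head: int = 0               # index of the head daughter (assumed annotation)
    substitution: bool = False  # True for a cutting point


def decompose(tree: Node) -> List[Node]:
    """Cut off every non-head subtree and return all resulting fragments."""
    fragments: List[Node] = []

    def copy_head_path(node: Node) -> Node:
        if not node.children:
            return Node(node.label)
        new_children = []
        for i, child in enumerate(node.children):
            if i == node.head:
                new_children.append(copy_head_path(child))
            else:
                # Cutting point: leave a substitution node behind and
                # decompose the cut-off subtree on its own.
                new_children.append(Node(child.label, substitution=True))
                fragments.extend(decompose(child))
        return Node(node.label, new_children, node.head)

    fragments.insert(0, copy_head_path(tree))
    return fragments


def show(node: Node) -> str:
    """Render a fragment as a bracketed string; '*' marks substitution nodes."""
    if node.substitution:
        return node.label + "*"
    if not node.children:
        return node.label
    return "(" + node.label + " " + " ".join(show(c) for c in node.children) + ")"


# Toy parse tree; the head daughter of S is VP, the head daughter of NP is N.
tree = Node("S", [
    Node("NP", [Node("Det", [Node("the")]), Node("N", [Node("dog")])], head=1),
    Node("VP", [Node("V", [Node("barks")])]),
], head=1)

rendered = [show(fragment) for fragment in decompose(tree)]
```

Each resulting fragment follows one head path down to a single word, which is what makes the extracted tree grammar lexicalized.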

Data-Oriented Parsing • 2. Specialization • Rule labels of the root node and the substitution nodes are replaced with a corresponding category label. Example: signs with a local.cat.head value of type noun and the empty list as local.cat.val.subj value are classified as NPs. • 3. Probability • Count the total number n of all trees with the same root label α • Divide the frequency m of a tree t with root α by n: $p(t) = m / n$ • The probabilities of all trees $t_i$ with root α sum to 1: $\sum_{t_i:\,\mathrm{root}(t_i) = \alpha} p(t_i) = 1$
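The probability step is a relative-frequency estimate per root label. A minimal Python sketch follows; the function name `tree_probabilities`, the representation of trees as bracketed strings, and the toy counts are assumptions for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple

Fragment = Tuple[str, str]  # (root label, tree written as a bracketed string)


def tree_probabilities(observed: Iterable[Fragment]) -> Dict[Fragment, float]:
    """Relative frequency of each tree among the trees with the same root."""
    counts = Counter(observed)  # m for every distinct tree
    root_totals: Counter = Counter()
    for (root, _tree), m in counts.items():
        root_totals[root] += m  # n for every root label
    return {
        (root, tree): m / root_totals[root]
        for (root, tree), m in counts.items()
    }


# Toy fragment occurrences, invented for this example.
observed = (
    [("NP", "(NP Det* (N dog))")] * 3
    + [("NP", "(NP (N dogs))")]
    + [("S", "(S NP* (VP (V barks)))")] * 2
)
probs = tree_probabilities(observed)
np_total = sum(p for (root, _), p in probs.items() if root == "NP")
```

In this toy data the two NP trees receive 3/4 and 1/4, so the probabilities for the root label NP sum to 1 as required.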

Data-Oriented Parsing • This implementation for the VerbMobil project uses a chart-based agenda-driven bottom-up parser • Step 1: Selection of a set of SLTG-trees associated with the lexical items in the input sentence • Step 2: Parsing of the sentence with respect to this set. • Step 3: Each SLTG-parse tree is “expanded” by unifying the feature constraints into the parse trees • If successful, complete valid feature structure • Else, next most likely tree is expanded
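Step 3 amounts to a ranked search: candidates are tried in order of decreasing probability until one expands into a valid feature structure. A small Python sketch of that control loop is given below. The names `first_valid_analysis` and `toy_expand` are hypothetical, and the expansion itself (unifying the feature constraints) is passed in as a function rather than implemented.

```python
from typing import Any, Callable, Iterable, Optional, Tuple


def first_valid_analysis(
    candidates: Iterable[Tuple[float, Any]],
    expand: Callable[[Any], Optional[Any]],
) -> Optional[Tuple[float, Any]]:
    """Expand trees from most to least probable; return the first success.

    `expand` stands for unifying the feature constraints into a parse
    tree and returns None when unification fails.
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    for prob, tree in ranked:
        feature_structure = expand(tree)
        if feature_structure is not None:
            return prob, feature_structure
    return None  # no candidate yields a valid feature structure


# Toy stand-in for expansion: pretend that only "tree-b" and "tree-c" unify.
def toy_expand(tree: str) -> Optional[dict]:
    return {"derived-from": tree} if tree in {"tree-b", "tree-c"} else None


result = first_valid_analysis(
    [(0.2, "tree-c"), (0.5, "tree-a"), (0.3, "tree-b")], toy_expand
)
```

The most probable candidate ("tree-a") fails in this toy run, so the loop falls through to the next one. The expensive unification is only paid for the candidates that are actually tried.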

3rd Approach: Probabilistic CFG Parsing • Main goal: to obtain the Viterbi parse (highest probability) given an HPSG and a probabilistic model • One way: • Parse input without using probabilities • Then select most probable parse looking at every result • Cost: Exponential search space • This Approach: • Define equivalence class function (F.S. reduction) • Integrate SEM and SYN preference into Figures Of Merit (FOMs)

Probabilistic CFG Parsing • Probabilistic Model: • HPSG grammar: $G = \langle L, R \rangle$, where $L = \{\, l = \langle w, F \rangle \mid w \in W,\ F \in \mathcal{F} \,\}$ is a set of lexical entries • $R$ is a set of grammar rules, i.e., $r \in R$ is a partial function $\mathcal{F} \times \mathcal{F} \to \mathcal{F}$

Probabilistic CFG Parsing • Probabilistic HPSG: probability $p(F \mid w)$ of assigning a feature structure $F$ to a given sentence $w$, where $\lambda_i$ is a model parameter, $s_i$ is a fragment of a feature structure, and $\sigma(s_i, F)$ is the number of appearances of the fragment $s_i$ in $F$ • Probabilities represent syntactic/semantic preferences expressed in a feature structure
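The formula itself is not reproduced on this slide. Given the symbols it defines (a weight per fragment and a count of fragment occurrences), a log-linear (maximum entropy) model is a natural reading. The form below is a sketch under that assumption and may differ in notation from the cited paper; the normalization term $Z_w$ is introduced here and does not appear on the slide.

```latex
p(F \mid w) = \frac{1}{Z_w} \exp\Bigl( \sum_i \lambda_i \, \sigma(s_i, F) \Bigr),
\qquad
Z_w = \sum_{F'} \exp\Bigl( \sum_i \lambda_i \, \sigma(s_i, F') \Bigr)
```

Here $F'$ ranges over the feature structures that the grammar assigns to $w$, so that the probabilities of all analyses of a sentence sum to 1.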

Probabilistic CFG Parsing • Implementation: iterative CYK parsing algorithm • Edges are pruned during parsing • The best N parses are tracked • Feature structures are reduced through equivalence classes • Requirement: the reduction must neither overgenerate nor undergenerate • FOMs computed with the reduced feature structures must be equivalent to those of the originals • The parser calculates the Viterbi parse by taking the maximum of the probabilities for the same nonterminal symbol at each point
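The Viterbi step in the last bullet can be illustrated with a plain CYK parser over a probabilistic context-free grammar in Chomsky normal form. This is a deliberately reduced sketch: it works on atomic nonterminals instead of reduced feature structures, and it omits the iterative control, edge pruning, and N-best tracking mentioned above. The function name, the grammar encoding, and the toy probabilities are assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple


def viterbi_cyk(
    words: List[str],
    lexical: Dict[Tuple[str, str], float],      # (A, word) -> p(A -> word)
    binary: Dict[Tuple[str, str, str], float],  # (A, B, C) -> p(A -> B C)
    start: str = "S",
) -> Optional[Tuple[float, tuple]]:
    """Return (probability, tree) of the most probable parse, or None."""
    n = len(words)
    best = defaultdict(dict)  # best[(i, j)][A] = (probability, tree)
    for i, w in enumerate(words):
        cell = best[(i, i + 1)]
        for (a, word), p in lexical.items():
            if word == w and p > cell.get(a, (0.0, None))[0]:
                cell[a] = (p, (a, w))
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            cell = best[(i, j)]
            for k in range(i + 1, j):
                left, right = best[(i, k)], best[(k, j)]
                for (a, b, c), p in binary.items():
                    if b in left and c in right:
                        cand = p * left[b][0] * right[c][0]
                        # Viterbi step: keep only the maximum probability
                        # for each nonterminal over this span.
                        if cand > cell.get(a, (0.0, None))[0]:
                            cell[a] = (cand, (a, left[b][1], right[c][1]))
    return best[(0, n)].get(start)


# Toy grammar, invented for this example.
lexical = {("N", "time"): 0.6, ("N", "flies"): 0.4,
           ("V", "time"): 0.3, ("V", "flies"): 0.7}
binary = {("S", "N", "V"): 0.8, ("S", "V", "N"): 0.2}

prob, tree = viterbi_cyk(["time", "flies"], lexical, binary)
```

Both S rules apply to the toy sentence, and only the more probable analysis (noun followed by verb) is kept in the chart cell. Taking the maximum per nonterminal and span is what keeps the search from enumerating every parse.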

Assessment • The three approaches all aim at higher efficiency of the parsing process • Distributed Parsing: • Unification and copying are faster • Soundness of the grammar is affected: L(G) ⊂ L(Gsyn) ∩ L(Gsem) • DO Parsing: • Fragments at the right level of generality • Straightforward probability computation • PCFG Parsing: • Highly efficient CYK parsing implementation through reduced feature structures and edge pruning

References • Pollard, C. and Sag, I. A. (1994). Head-Driven Phrase Structure Grammar. Chicago, IL: University of Chicago Press. • Richter, F. (2004b). A Web-based Course in Grammar Formalisms and Parsing. Textbook, MiLCA project A4, SfS, Universität Tübingen. http://milca.sfs.uni-tuebingen.de/A4/Course/PDF/gramandpars.pdf • Levine, R. and Meurers, D. (2006). Head-Driven Phrase Structure Grammar: Linguistic Approach, Formal Foundations, and Computational Realization. In Keith Brown (Ed.), Encyclopedia of Language and Linguistics, Second Edition. Oxford: Elsevier. • Diagne, A. K., Kasper, W., and Krieger, H.-U. (1995). Distributed Parsing With HPSG Grammars. In Proceedings of the 4th International Workshop on Parsing Technologies, IWPT-95, pages 79–86. • Neumann, G. (2002). HPSG-DOP: Data-Oriented Parsing with HPSG. Unpublished manuscript, presented at the 9th Int. Conf. on HPSG, HPSG-2002, Seoul, South Korea. • Tsuruoka, Y., Miyao, Y., and Tsujii, J. (2003). Towards efficient probabilistic HPSG parsing: integrating semantic and syntactic preference to guide the parsing. In Proceedings of IJCNLP-04 Workshop: Beyond shallow analyses - Formalisms and statistical modeling for deep analyses.
