‘Alexandru Ioan Cuza’ University of Iasi
Faculty of Computer Science
Master's programme in Computational Linguistics
Maths CL: Professor Corina Forascu
LEXICAL FUNCTIONAL GRAMMAR (LFG)
Anca-Diana BIBIRI
1st semester 2012-2013
LFG
CONTENT:
• Definition
• Distinctions from other grammars
• Structure
• Examples
• Applications
• Constraints
• Mono-/multilingual implementations
• Conclusions
• Bibliography
Anca Bibiri
LFG
Key words:
• constituent structure (c-structure)
• functional structure (f-structure)
• lexicon
• constituency grammar
LFG – Definition
• LFG is a formalism for representing the native speaker’s syntactic knowledge.
• LFG is a grammar framework in theoretical linguistics, a variety of generative grammar.
• It was designed to serve as a medium for expressing and explaining important generalizations about the syntax of human languages, and thus as a vehicle for independent linguistic research.
• It is a restricted, mathematically tractable notation for which simple, psychologically plausible processing mechanisms can be defined.
• LFG was developed by Joan Bresnan and Ronald Kaplan in the late 1970s and early 1980s.
LFG – Distinctions from other Grammars
• LFG, as a constituency-based framework, contrasts with dependency grammars (DGs).
• The dependency relation views the (finite) verb as the structural center of the clause. All other syntactic units (e.g. words) depend on the verb, either directly or indirectly.
• DGs are distinct from phrase structure grammars because DGs lack phrasal nodes. Structure is determined by the relation between a word (a head) and its dependents.
• Dependency structures are flatter than constituency structures, in part because they lack a finite verb phrase constituent. They are therefore well suited to the analysis of languages with free word order (such as Czech and Turkish).
Constituency vs dependency
• Phrase structure rules, as they are commonly employed, result in a constituency-based view of sentence structure. Grammars that employ phrase structure rules are therefore constituency grammars, as opposed to dependency grammars, which view sentence structure as dependency-based.
• For phrase structure rules to be applicable at all, one has to adopt a constituency-based understanding of sentence structure.
• The constituency relation is a one-to-one-or-more correspondence: for every word in a sentence, there are one or more nodes in the syntactic structure that correspond to that word.
• The dependency relation, in contrast, is a one-to-one relation: for every word in the sentence, there is exactly one node in the syntactic structure that corresponds to that word.
LFG – Structure
• LFG views language as being made up of multiple dimensions of structure. Each of these dimensions is represented as a distinct structure with its own rules, concepts, and form.
• LFG assigns two levels of syntactic description (the primary structures) to every sentence of a language:
• constituent structure, or ‘c-structure’, and
• functional structure, or ‘f-structure’.
The two are related by a functional projection, or correspondence function.
LFG – Structure
Other structures are hypothesized in LFG work:
• argument structure (a-structure) – a level which represents the number of arguments of a predicate and some aspects of the lexical semantics of these arguments;
• semantic structure (s-structure) – a level which represents the meaning of phrases and sentences;
• information structure (i-structure);
• morphological structure (m-structure);
• phonological structure (p-structure).
LFG – Structure
• The c-structure is a phrase structure configuration: a conventional phrase structure tree, i.e. a well-formed labeled bracketing that indicates the superficial arrangement of words and phrases in the sentence.
• The f-structure provides a precise characterization of traditional syntactic notions such as subject, ‘understood’ subject, object, complement, and adjunct. It is a hierarchical attribute-value matrix that represents the underlying grammatical relations.
LFG – Structure
• An f-structure is a finite set of attribute-value pairs; it is an abstract description of a sentence.
• Attributes are features such as number, person, and case, or names of grammatical functions such as subject and object.
• Values can be: simple symbols; semantic forms, which govern the process of semantic interpretation; subsidiary f-structures, i.e. sets of ordered pairs representing complexes of internal functions; or sets of symbols, semantic forms, or f-structures.
Examples
• (1) Rules:
• S → NP VP
• NP → Det N
• VP → V NP NP
• (2) Sentence: A girl handed the baby a toy.
• (3) c-structure
C-structure
S
  NP
    Det  A
    N    girl
  VP
    V    handed
    NP
      Det  the
      N    baby
    NP
      Det  a
      N    toy
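The rules in (1) and the c-structure of sentence (2) can be sketched in Python as follows. This is only an illustration, not part of any LFG toolkit: the nested-tuple encoding, the small `LEXICON` table, and the helper names `is_licensed`, `leaves`, and `bracketing` are assumptions made for this example.

```python
# Phrase structure rules from (1): mother category -> allowed daughter sequences.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP", "NP"]],
}

# Hypothetical lexicon (not given on the slides): word -> preterminal category.
LEXICON = {
    "a": "Det", "the": "Det",
    "girl": "N", "baby": "N", "toy": "N",
    "handed": "V",
}

# c-structure of sentence (2) as nested tuples: (label, child, child, ...).
C_STRUCTURE = (
    "S",
    ("NP", ("Det", "A"), ("N", "girl")),
    ("VP",
     ("V", "handed"),
     ("NP", ("Det", "the"), ("N", "baby")),
     ("NP", ("Det", "a"), ("N", "toy"))),
)


def is_licensed(tree):
    """Return True if every local tree is allowed by RULES or LEXICON."""
    label, *children = tree
    # Preterminal node: exactly one child, and that child is a word.
    if len(children) == 1 and isinstance(children[0], str):
        return LEXICON.get(children[0].lower()) == label
    # Phrasal node: words may not appear directly under it.
    if any(isinstance(child, str) for child in children):
        return False
    daughters = [child[0] for child in children]
    return daughters in RULES.get(label, []) and all(is_licensed(c) for c in children)


def leaves(tree):
    """Return the words of the tree from left to right."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return [children[0]]
    return [word for child in children for word in leaves(child)]


def bracketing(tree):
    """Render the tree as a labeled bracketing."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):
        return f"[{label} {children[0]}]"
    return f"[{label} " + " ".join(bracketing(c) for c in children) + "]"


print(is_licensed(C_STRUCTURE))
print(bracketing(C_STRUCTURE))
```

The labeled bracketing printed at the end is the same object as the tree, written on one line; a tree whose daughters appear in an order the rules do not allow (for example N before Det inside an NP) is rejected by `is_licensed`.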
F-structure
SUBJ   [ SPEC A, NUM SG, PRED ‘girl’ ]
TENSE  PAST
PRED   ‘hand ‹(↑SUBJ), (↑OBJ), (↑OBJ2)›’
OBJ    [ SPEC THE, NUM SG, PRED ‘baby’ ]
OBJ2   [ SPEC A, NUM SG, PRED ‘toy’ ]
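Because an f-structure is a finite set of attribute-value pairs whose values may themselves be f-structures, it maps naturally onto a nested dictionary. The sketch below encodes the f-structure of the example sentence that way; the dictionary encoding and the helpers `get_path` and `flatten` are assumptions for illustration, and the semantic form is kept as a plain string rather than being interpreted.

```python
# f-structure of "A girl handed the baby a toy" as nested attribute-value pairs.
F_STRUCTURE = {
    "SUBJ": {"SPEC": "A", "NUM": "SG", "PRED": "girl"},
    "TENSE": "PAST",
    # Semantic form, stored uninterpreted as a string in this sketch.
    "PRED": "hand<(↑SUBJ), (↑OBJ), (↑OBJ2)>",
    "OBJ": {"SPEC": "THE", "NUM": "SG", "PRED": "baby"},
    "OBJ2": {"SPEC": "A", "NUM": "SG", "PRED": "toy"},
}


def get_path(fs, path):
    """Follow a sequence of attributes, e.g. ("SUBJ", "NUM"); None if absent."""
    value = fs
    for attr in path:
        if not isinstance(value, dict) or attr not in value:
            return None
        value = value[attr]
    return value


def flatten(fs, prefix=()):
    """Yield (path, atomic value) pairs for every non-dictionary value."""
    for attr, value in fs.items():
        if isinstance(value, dict):
            yield from flatten(value, prefix + (attr,))
        else:
            yield prefix + (attr,), value


print(get_path(F_STRUCTURE, ("SUBJ", "NUM")))
for path, value in flatten(F_STRUCTURE):
    print(" ".join(path), "=", value)
```

Looking up a path such as `("SUBJ", "NUM")` corresponds to reading the number feature inside the subsidiary f-structure of the subject.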
Mapping
[figure: c-structure to f-structure mapping]
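The correspondence function that relates c-structure to f-structure can be sketched as a many-to-one table from tree nodes to f-structures. The node names below and the choice to identify each f-structure by its attribute path are assumptions for illustration; the sketch also assumes the common analysis in which a head shares the f-structure of its mother, so S, VP, and V all correspond to the outermost f-structure.

```python
from collections import defaultdict

# Hypothetical node names for the c-structure of "A girl handed the baby a toy".
# Each node maps to the path of its f-structure; () is the outermost f-structure.
PHI = {
    "S": (), "VP": (), "V": (),
    "NP_subj": ("SUBJ",), "Det_subj": ("SUBJ",), "N_subj": ("SUBJ",),
    "NP_obj": ("OBJ",), "Det_obj": ("OBJ",), "N_obj": ("OBJ",),
    "NP_obj2": ("OBJ2",), "Det_obj2": ("OBJ2",), "N_obj2": ("OBJ2",),
}


def nodes_by_f_structure(phi):
    """Invert the correspondence: f-structure path -> list of c-structure nodes."""
    groups = defaultdict(list)
    for node, path in phi.items():
        groups[path].append(node)
    return dict(groups)


for path, nodes in nodes_by_f_structure(PHI).items():
    print(path or "(outermost)", "<-", nodes)
```

Twelve c-structure nodes collapse onto four f-structures here, which is one way to see why the f-structure is the more abstract of the two levels.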
Applications
• A central goal of LFG research is to create a model of grammar with a depth that appeals to linguists, while at the same time being efficiently parsable and having the formal rigor that computational linguists require.
• Because of this, LFG has been used as the theoretical basis of various machine translation tools:
• AppTek’s TranSphere and
• the Julietta Research Group's Lekta.
Constraints
• LFG is a unification-based, constraint-based theory. There are different kinds of constraints.
• For example, a functional annotation of the form (↑ Attribute) = Value (e.g. (↑ NUM) = SG) assigns a value to the number feature of an f-structure; this is a defining equation.
• Sometimes we do not want to assign a value but only to verify that a feature has a certain value. A functional annotation of this kind, a constraining equation (conventionally written with =c in place of =), establishes a constraint that has to be met by the final f-structure. It reads: ‘The feature must have the value V, and it must get that value from somewhere else, because I do not assign it.’
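The difference between assigning a value and merely verifying it can be made concrete with a small sketch. The function names `define` and `satisfies`, the `Clash` exception, and the use of a flat dictionary as the f-structure are assumptions for illustration; real LFG systems solve such equations over full f-structures.

```python
class Clash(Exception):
    """Raised when an assignment conflicts with a value that is already present."""


def define(fs, attr, value):
    """Assigning annotation: add the value, or fail if a different one is present."""
    if attr in fs and fs[attr] != value:
        raise Clash(f"{attr}: {fs[attr]} conflicts with {value}")
    fs[attr] = value


def satisfies(fs, attr, value):
    """Verifying annotation: check the final f-structure without changing it."""
    return fs.get(attr) == value


# A noun whose lexical entry assigns singular number.
noun = {}
define(noun, "NUM", "SG")
print(satisfies(noun, "NUM", "SG"))   # the value was supplied elsewhere

# An f-structure in which nothing assigned NUM.
unmarked = {}
print(satisfies(unmarked, "NUM", "SG"))  # the check fails; nothing is added
```

An assignment would succeed on the empty f-structure by simply adding the feature, whereas the verification fails there; that asymmetry is the whole point of the second kind of annotation.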
Multilingual implementations
• ParGram is a parallel grammar project involving Xerox PARC (English), XRCE (French), IMS Stuttgart (German), and the University of Bergen (Norwegian). Its basic goal is to write large-scale LFG grammars with parallel analyses.
• The basic idea behind parallel analysis is that, when linguistically justifiable, similar phenomena are given similar analyses across languages. A linguistically unjustified analysis is never forced on a language. However, if more than one analysis is possible, the one that can be used in all the languages is chosen.
Multilingual implementations
• For example, consider the representation of tense in English, French, and German: Maria will see Hans. / Maria verra Hans. / Maria wird Hans sehen.
• Although the basic meaning of the three sentences is identical, its morphosyntactic manifestation differs in all three languages:
• French uses a single word, verra, to express the future tense;
• English and German use two words: an auxiliary and a main verb. They differ in word order: in English the auxiliary will is adjacent to the main verb see, whereas in German the auxiliary wird is in second position and the main verb sehen is in final position.
Multilingual implementations
• Along with the ParGram project, Xerox PARC has developed the XLE (Xerox Linguistic Environment) system, a platform for large-scale LFG grammar development.
• As a grammar development platform, XLE comes with an interface to finite-state transducers for tokenization and morphological analysis.
Multilingual implementations
• A cascade of tokenizers and normalizers segments the input string into tokens, which are then ‘looked up’ in finite-state morphological transducers.
• Integrating morphological analysis makes it possible to generate large LFG lexica automatically for open-class categories such as nouns, adjectives, and adverbs. These lexica are created from generic LFG lexicon entries, which specify f-structure annotations for the morphological and lexical information provided by the morphology.
• Each grammar comes with hand-coded core LFG lexica for closed-class ‘syntactic’ lexical items. In addition, XLE supports the integration and processing of large subcategorization lexica, which are extracted and converted from machine-readable dictionaries or obtained with corpus analysis tools.
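The cascade described above (tokenize, normalize, look up the morphology, derive lexicon entries) can be sketched in a few lines. This is not the XLE interface: the `MORPHOLOGY` dictionary is a toy stand-in for a finite-state transducer, and the tag names, the `TAG_TO_ANNOTATION` table, and all function names are assumptions for illustration.

```python
import re

# Toy stand-in for a finite-state morphological transducer:
# surface form -> (lemma, morphological tags). Tag names are hypothetical.
MORPHOLOGY = {
    "girl": ("girl", ["+Noun", "+Sg"]),
    "girls": ("girl", ["+Noun", "+Pl"]),
    "toy": ("toy", ["+Noun", "+Sg"]),
    "toys": ("toy", ["+Noun", "+Pl"]),
}

# Generic entry for the open class: each tag contributes one f-structure annotation.
TAG_TO_ANNOTATION = {
    "+Sg": ("NUM", "SG"),
    "+Pl": ("NUM", "PL"),
}


def tokenize(text):
    """Split the input into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def normalize(tokens):
    """Lower-case every token so that lookup ignores capitalization."""
    return [token.lower() for token in tokens]


def lexical_entry(token):
    """Build f-structure annotations for a token, or None if it is unknown."""
    analysis = MORPHOLOGY.get(token)
    if analysis is None:
        return None
    lemma, tags = analysis
    entry = {"PRED": lemma}
    for tag in tags:
        if tag in TAG_TO_ANNOTATION:
            attr, value = TAG_TO_ANNOTATION[tag]
            entry[attr] = value
    return entry


tokens = normalize(tokenize("Girls like toys."))
for token in tokens:
    print(token, "->", lexical_entry(token))
```

Tokens that the toy morphology does not cover (here the verb and the full stop) come back as `None`; in a real grammar these would be handled by the hand-coded closed-class lexica or by other lexical resources.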
Conclusions
• LFG is particularly well suited to high-level syntactic analysis in multilingual NLP tasks (such as machine translation).
• The formal model of external structure in LFG, the c-structure, varies across languages.
• The internal structure, the f-structure, is largely invariant across languages.
• Output:
• tree – c-structure
• AVM (attribute-value matrix) – f-structure.
Bibliography
• Bresnan, Joan and Ronald M. Kaplan (1982). Introduction: grammars as mental representations of language. Cambridge, MA: The MIT Press.
• Bresnan, Joan (2001). Lexical Functional Syntax. Oxford: Blackwell.
• Butt, Miriam, Stefanie Dipper, Anette Frank, and Tracy Holloway King (1999). Writing Large-Scale Parallel Grammars for English, French, and German. The University of Manchester. CSLI Publications.
• Chomsky, Noam (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
• Dalrymple, Mary (2001). Lexical Functional Grammar. No. 42 in Syntax and Semantics Series. New York: Academic Press.
• Falk, Yehuda N. (2001). Lexical-Functional Grammar: An Introduction to Par
• Asudeh, Ash and Ida Toivonen (2009). Lexical-Functional Grammar. In Bernd Heine and Heiko Narrog (eds.), The Oxford Handbook of Linguistic Analysis. Oxford: Oxford University Press.
Thank you!