GreekGram: Building a parallel grammar for Modern Greek
Kakia Chatsiou, achats@essex.ac.uk
Dept of Language and Linguistics, University of Essex
Parts of this project have been funded by ESRC Award PTA-2004-031-00112, support which is gratefully acknowledged.
Parts of this presentation were presented at the LangUE 2007 International Postgraduate Conference, University of Essex.
LAC day 2007
Outline
• Introduction – Overview of the ParGram Project
    • Project Objectives
    • Participating Members
    • The XLE Platform
    • Grammar Architecture
• The GreekGram Project
    • Overview
    • Assumptions
    • Coverage
    • Demonstration
    • Future Development Directions
An Overview of the ParGram Project: Project Objectives
• Broad-coverage grammars
• Inclusion of important and frequently occurring constructions
• Linguistically motivated analyses
• Parallel and crosslinguistic development of grammars between the participating members
• All grammars are guided by a common set of linguistic principles and a commonly agreed set of grammatical analyses and features
• Identical treatment of core crosslinguistic phenomena
• Methods in grammar engineering
• Common test methods and evaluation strategies
• Balance between efficiency, performance, reliability and maintainability across grammars
An Overview of the ParGram Project: Participating Members
• Essex: Greek, Welsh
• Manchester: Arabic
• Oxford: Malagasy
• Bergen, Norway: Georgian, Norwegian, Tigrinya
• DCU, Ireland: Chinese, English, French, German, Japanese, Spanish
• Fuji Xerox: Japanese
• Ho Chi Minh: Vietnamese
• PARC, CA: Chinese, English, French
• Debrecen: Hungarian
• IMS, Stuttgart: German
• Konstanz: Urdu
• Sabanci, Istanbul: Turkish
An Overview of the ParGram Project: The Xerox Linguistics Environment (XLE) Platform
• Under active development at PARC (Palo Alto Research Center, USA)
• An implementation of the Lexical Functional Grammar (LFG) formalism
• Implemented in C; runs on Unix, Linux and MacOS, with an MS Windows version under development
• Integrates a morphological analyser based on finite-state technology
• Can be used for both parsing and generation
• Includes tools for grammar development activities such as performance analysis and test suites
• The core technology of the natural-language consumer search engine currently under development by Powerset
An Overview of the ParGram Project
• Powerset: a Silicon Valley company currently building a transformative consumer search engine based on natural language processing
• It is based on technologies that take advantage of the structure and nuances of natural language
• It offers an innovative approach to searching:
    • Uses natural language to break the confines of keyword search queries
    • Makes search more natural and intuitive
    • Aims to change fundamentally how we search the web while delivering higher-quality results
(source: http://www.powerset.com)
An Overview of the ParGram Project: Basic Grammar Architecture
[Diagram] Grammar resources (parsing and generating):
• Tokenizer
• Morphological Analyser (FST)
• Other Finite State Tools
• Lexicon(s) (hand-written or automatically extracted)
• LFG Grammar (rules, templates)
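The first stages of this architecture can be pictured as a small pipeline. The sketch below is illustrative only: the function names, the tag format and the toy analysis table are assumptions made for this example and are not XLE's actual interface. A Python dictionary stands in for the finite-state morphological analyser.

```python
import re

# Toy stand-in for a finite-state morphological analyser (assumed data):
# each surface form maps to one or more lemma+tag analyses.
MORPH = {
    "ο": ["ο+Det+Masc+Sg+Nom"],
    "άντρας": ["άντρας+Noun+Masc+Sg+Nom"],
    "τρέχει": ["τρέχω+Verb+3+Sg+Pres"],
    ".": [".+Punct"],
}

def tokenize(text):
    """Lower-case the input and split it into words and punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text.lower())

def analyse(tokens):
    """Pair each token with its analyses; unknown forms get a fallback tag."""
    return [(tok, MORPH.get(tok, [tok + "+Unknown"])) for tok in tokens]

def preprocess(text):
    """Tokenizer followed by morphological analysis, as in the diagram."""
    return analyse(tokenize(text))

for token, analyses in preprocess("Ο άντρας τρέχει."):
    print(token, analyses)
```

In a real grammar the output of these stages would feed the lexicon and the LFG rules; that step is left out here.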
The GreekGram Project: Overview
• A preliminary effort to develop a large-scale computational LFG grammar for Modern Greek
• Shares the objectives and principles of similar ParGram projects (parallel and crosslinguistic development; balancing maintainability against broad coverage)
• The current main focus is on developing grammar rules; the lexicon is kept as minimal as possible
The GreekGram Project: Assumptions
• Focus on syntax (but in such a way that the representations produced can serve as direct and useful input to semantic interpretation)
• All four possible Modern Greek word orders are treated as equally acceptable, with no distinction in markedness
• Modern Greek word order is represented non-configurationally (Tzanidaki, 1996; Alexopoulou, 1999)
• No treatment of morphology (neither the morphological analyser nor any other finite-state tool is used); there is a separate lexical entry for each form
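The last assumption above (a separate lexical entry for each form) amounts to a full-form lexicon. A minimal sketch of the idea follows; the feature names and entries are assumptions chosen for illustration and are not taken from the GreekGram lexicon itself.

```python
# Full-form lexicon: every inflected form is listed with its own entry or
# entries, so no morphological analysis is needed at lookup time.
# Feature names and entries are illustrative assumptions.
LEXICON = {
    "άντρας": [
        {"pred": "άντρας", "cat": "N", "gend": "masc", "num": "sg", "case": "nom"},
    ],
    # The same surface form can carry several analyses (vocative omitted).
    "άντρα": [
        {"pred": "άντρας", "cat": "N", "gend": "masc", "num": "sg", "case": "acc"},
        {"pred": "άντρας", "cat": "N", "gend": "masc", "num": "sg", "case": "gen"},
    ],
}

def lookup(form):
    """Return all entries for a surface form, or an empty list if unlisted."""
    return LEXICON.get(form, [])

for entry in lookup("άντρα"):
    print(entry["pred"], entry["case"])
```

The trade-off is visible even at this size: lookup is trivial, but each new noun needs one entry per inflected form, which is why moving morphology into a finite-state component is attractive as the lexicon grows.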
The GreekGram Project: Coverage
• All four possible word orders (irrespective of markedness)
• Pro-drop character of the language
• Subcategorisation frames of intransitive, transitive and ditransitive verbs
• Number, case and gender agreement within the NP or PP; subject-verb agreement
• Basic relative clause structure
• Coordination of relative clauses
• Some punctuation
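Several of the phenomena listed above (agreement, pro-drop, subcategorisation) come down to checking that feature structures are compatible. The sketch below shows the general idea with flat Python dictionaries. It is a simplified illustration, not the GreekGram implementation: the feature names, the entries and the helpers `unify`, `build_np` and `clause_ok` are all assumptions made for this example.

```python
def unify(a, b):
    """Merge two flat feature dicts; return None if any feature clashes."""
    result = dict(a)
    for key, value in b.items():
        if key in result and result[key] != value:
            return None
        result[key] = value
    return result

# Illustrative entries (assumed feature names).
DET_O = {"gend": "masc", "num": "sg", "case": "nom"}      # ο
DET_TIN = {"gend": "fem", "num": "sg", "case": "acc"}     # την
N_ANTRAS = {"pred": "άντρας", "gend": "masc", "num": "sg", "case": "nom", "pers": 3}
N_GYNAIKA = {"pred": "γυναίκα", "gend": "fem", "num": "sg", "case": "acc", "pers": 3}
V_VLEPEI = {  # βλέπει: transitive, third person singular
    "pred": "βλέπω",
    "subj": {"num": "sg", "pers": 3, "case": "nom"},
    "frame": ("OBJ",),
}

PRO = {"pred": "pro"}  # null subject: compatible with whatever the verb requires

def build_np(det, noun):
    """NP-internal agreement: gender, number and case must all unify."""
    return unify(det, noun)

def clause_ok(verb, subj=PRO, objs=None):
    """Check subject-verb agreement and that the objects match the verb's frame."""
    objs = objs or {}
    if subj is None or unify(subj, verb["subj"]) is None:
        return False
    if any(value is None for value in objs.values()):
        return False
    return set(objs) == set(verb["frame"])

subject = build_np(DET_O, N_ANTRAS)
obj = build_np(DET_TIN, N_GYNAIKA)
print(clause_ok(V_VLEPEI, subject, {"OBJ": obj}))   # full clause
print(clause_ok(V_VLEPEI, objs={"OBJ": obj}))       # pro-drop: no overt subject
print(clause_ok(V_VLEPEI, subject))                 # missing object
```

Because the checks only look at features and grammatical functions, not at linear position, the same test applies whichever word order the clause was parsed from.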
The GreekGram Project: Demonstration
The GreekGram Project: Future Development Directions
• Account for a greater variety of verb subcategorisation frames
• Enrich the lexicon by adding more lexical entries
• Account for morphology within the lexicon, using XFST (or other finite-state tools)
• Expand the coverage of the grammar to other constructions such as interrogatives, imperatives and negation
• …
Thank you!
For more information and updates on the progress of the project, visit http://privatewww.essex.ac.uk/~achats/projects/greekgram/index.html