A Tool: Morphological Analyzer / Synthesizer for Lithuanian. Vytautas Zinkevičius, VDU KLC, vytas@donelaitis.vdu. Introduction: importance of the morphology level for Lithuanian language technologies; difficulties caused by Lithuanian morphology.
A Tool: Morphological Analyzer / Synthesizer for Lithuanian Vytautas Zinkevičius VDU KLC vytas@donelaitis.vdu
Introduction • Importance of the morphology level for Lithuanian language technologies • Difficulties caused by Lithuanian morphology • Morphological Analyzer / Synthesizer for Lithuanian
Morphological Analyzer / Synthesizer for Lithuanian • Creating the Tool • Problem of Ambiguity • A Demo of the Tool http://donelaitis.vdu.lt/~vytas/lemo/angl/lemo_down.htm
Morphological Analyzer / Synthesizer for Lithuanian Implementation of the tool in application programs and systems • Spelling checkers (e.g. Lithuanian spellcheckers for Microsoft Office 97-2000) • Grammatical tagging of the Lithuanian text corpus at CCL VMU • Implementation of the tool in SProUT (a multilingual shallow text processing system, Language Technology Lab, DFKI) • Used in compiling the "Frequency Dictionary of Contemporary Lithuanian" (Grumadienė L., Žilinskienė V., Dažninis dabartinės rašomosios lietuvių kalbos žodynas, Vilnius, 1997-1998).
Morphological Analysis in SProUT
Lithuanian text: Šimtų tūkstančių ar milijono ir daugiau metų, per kuriuos atsirado žmogus, procesas vyko toli nuo dabartinės Lietuvos teritorijos. (Roughly: "The process of hundreds of thousands, or a million and more, years during which humans emerged took place far from the territory of present-day Lithuania.")
The result of the morphological analysis:
Šimtų TYPE=wordform LEMMA={POS=numeral <šimtas>} GRAMM_MEANING={POS=numeral + GROUP_OF_NUMERAL=cardinal + GENDER=masculine + NUMBER=plural + CASE=genitive}
tūkstančių TYPE=wordform LEMMA={POS=numeral <tūkstantis>} GRAMM_MEANING={POS=numeral + GROUP_OF_NUMERAL=cardinal + GENDER=masculine + CASE=genitive}
ar TYPE=wordform LEMMA={POS=particle <ar>} GRAMM_MEANING={POS=particle <ar>} LEMMA={POS=conjunction <ar>} GRAMM_MEANING={POS=conjunction <ar>} LEMMA={POS=onomatopoeic_interjection <ar>} GRAMM_MEANING={POS=onomatopoeic_interjection <ar>}
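The analysis output shown on this slide can be read as one surface word form followed by one or more LEMMA / GRAMM_MEANING pairs, where more than one pair means the word form is ambiguous (as with "ar"). The following is a minimal Python sketch of reading that textual layout. It is not part of the tool or of SProUT: the names parse_token, Reading and is_ambiguous are hypothetical, and the regular expressions assume the output looks exactly as printed on the slide.

```python
import re
from dataclasses import dataclass

# One "LEMMA={POS=... <lemma>} GRAMM_MEANING={...}" pair, in the layout
# printed on the slide (an assumption; the real output may differ).
READING_RE = re.compile(
    r"LEMMA=\{POS=(\w+) <([^>]+)>\}\s+GRAMM_MEANING=\{([^}]*)\}"
)
# One NAME=value item; uninflected words repeat the lemma as "<ar>".
FEATURE_RE = re.compile(r"(\w+)=(\w+)(?:\s+<[^>]+>)?")


@dataclass
class Reading:
    pos: str        # part of speech taken from the LEMMA block
    lemma: str      # dictionary form between angle brackets
    features: dict  # NAME -> value pairs from GRAMM_MEANING


def parse_token(line):
    """Return (surface form, list of readings) for one token line."""
    surface = line.split(None, 1)[0]
    readings = []
    for pos, lemma, meaning in READING_RE.findall(line):
        features = {}
        for part in meaning.split("+"):
            match = FEATURE_RE.fullmatch(part.strip())
            if match:
                features[match.group(1)] = match.group(2)
        readings.append(Reading(pos, lemma, features))
    return surface, readings


def is_ambiguous(readings):
    """A word form is ambiguous when it has more than one reading."""
    return len(readings) > 1


# Two token lines copied from the slide.
SAMPLE_SIMTU = (
    "Šimtų TYPE=wordform LEMMA={POS=numeral <šimtas>} "
    "GRAMM_MEANING={POS=numeral + GROUP_OF_NUMERAL=cardinal + "
    "GENDER=masculine + NUMBER=plural + CASE=genitive}"
)
SAMPLE_AR = (
    "ar TYPE=wordform LEMMA={POS=particle <ar>} "
    "GRAMM_MEANING={POS=particle <ar>} LEMMA={POS=conjunction <ar>} "
    "GRAMM_MEANING={POS=conjunction <ar>} "
    "LEMMA={POS=onomatopoeic_interjection <ar>} "
    "GRAMM_MEANING={POS=onomatopoeic_interjection <ar>}"
)

if __name__ == "__main__":
    for sample in (SAMPLE_SIMTU, SAMPLE_AR):
        surface, readings = parse_token(sample)
        print(surface, [r.pos for r in readings], is_ambiguous(readings))
```

Keeping all readings in a list, rather than picking one, leaves the ambiguity visible so that a later disambiguation step can choose among them.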
Morphological Analyzer / Synthesizer for Lithuanian Foundations and projects • Lithuanian State Science and Studies Foundation: 1994, reg. no. 94-299/4D, contract no. 41; 1995, reg. no. 95-241/7E, contract no. 159. • "Lithuanian language recognition and generation at morphological level", in the National Lithuanian language committee program "Lithuanian language in informational society 2000-2006"
The End Thank You