Towards a Morphological Analyzer for Old Norse
Introduction
• Goal: a computer program that analyzes the morphological structure of Old Norse words and generates declension tables
• Two analyzers, A1 and A2; both output all possible declension paradigms for the input word
• A1: takes headwords from a dictionary database or from manual input
• A2: takes inflected words from saga texts
• [Show sample query A1 without details]
Morpholog. Analyzer - CHLT 2003
Broader Context (1)
Input (mss) → Marked-up transcript → Normalized text → Analyzer → Output (declension tables)
Broader Context (2)
CHLT Project
• Scandinavian Section, UCLA (Prof. Timothy Tangherlini)
  • Developing Old Norse morphological analyzer
• Det Arnamagnæanske Institut, Københavns Universitet (Matthew Driscoll)
  • XML markup of Old Norse texts
Computational Environment
• Written in Perl
• Database: MySQL
• Server: Apache
• Running on a Linux machine
• http://ecampusdev.humnet.ucla.edu/curban
Linguistic Environment
• Zoega’s Dictionary of Old Norse - Icelandic
  • augmented with additional headwords from the Old Norse dictionary project, Ordbog over det norrøne prosasprog (ONP), at Københavns Universitet
• Fornrit normalization
• Verification of performance: comparison with forms in Bower (1994), Gordon (1956)
• Focus on non-poetic lexicon
Analyzer Structure (General)
In: (head)word
• MySQL database
• Analyzer
  • Find root
  • Find endings
  • Apply sound changes
Out: declension(s)
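The three analyzer steps (find root, find endings, apply sound changes) can be illustrated with a minimal Python sketch. This is not the project's Perl code. The endings table, the function names, and the simplified u-umlaut rule (the last "a" of the root becomes "ö" before an ending containing "u") are assumptions made for illustration, and the "ö" spelling is assumed as well; normalized editions often print "ǫ" instead.

```python
# Illustrative endings for one paradigm (strong masculine a-stem nouns).
# The table layout and names are assumptions, not the project's data.
A_STEM_MASC = {
    ("nom", "sg"): "r", ("acc", "sg"): "", ("dat", "sg"): "i", ("gen", "sg"): "s",
    ("nom", "pl"): "ar", ("acc", "pl"): "a", ("dat", "pl"): "um", ("gen", "pl"): "a",
}


def find_root(headword, nom_sg_ending):
    """Strip the nominative singular ending from the headword, if present."""
    if nom_sg_ending and headword.endswith(nom_sg_ending):
        return headword[: -len(nom_sg_ending)]
    return headword


def apply_u_umlaut(root, ending):
    """Simplified u-umlaut: last 'a' in the root becomes 'ö' before an ending with 'u'."""
    if "u" not in ending:
        return root
    i = root.rfind("a")
    if i == -1:
        return root
    return root[:i] + "ö" + root[i + 1:]


def decline(headword, endings):
    """Return a dict mapping (case, number) to the inflected form."""
    root = find_root(headword, endings[("nom", "sg")])
    return {
        slot: apply_u_umlaut(root, ending) + ending
        for slot, ending in endings.items()
    }
```

For example, `decline("armr", A_STEM_MASC)` yields "arms" for the genitive singular and "örmum" for the dative plural, while "hestr" has no "a" in its root and so keeps it unchanged in "hestum". A real analyzer would need many more sound changes than this single rule.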
Analyzer Structure (Database)
• Tables exist for:
  • adjectives (regular endings, exceptions)
  • articles (free, suffixed)
  • dictionary
  • nouns (regular endings, exceptions)
  • possessive pronouns
  • verbs (regular endings, exceptions, anomalous, strong_ablaut)
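The split between regular endings and exceptions for nouns might look roughly like the sketch below. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs without a server. The table names, column names, and demo rows are hypothetical; the slides name only the categories of tables, not their schema.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical noun tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE noun_endings (
            paradigm TEXT, gram_case TEXT, num TEXT, ending TEXT);
        CREATE TABLE noun_exceptions (
            headword TEXT, gram_case TEXT, num TEXT, form TEXT);
    """)
    # A few regular endings of the strong masculine a-stem paradigm.
    conn.executemany(
        "INSERT INTO noun_endings VALUES (?, ?, ?, ?)",
        [
            ("a_stem_masc", "nom", "sg", "r"),
            ("a_stem_masc", "gen", "sg", "s"),
            ("a_stem_masc", "nom", "pl", "ar"),
        ],
    )
    # Irregular plural forms of 'maðr' stored as whole words.
    conn.executemany(
        "INSERT INTO noun_exceptions VALUES (?, ?, ?, ?)",
        [("maðr", "nom", "pl", "menn"), ("maðr", "acc", "pl", "menn")],
    )
    return conn


def lookup_form(conn, headword, root, paradigm, gram_case, num):
    """Prefer a stored exception; otherwise attach the regular ending to the root."""
    row = conn.execute(
        "SELECT form FROM noun_exceptions "
        "WHERE headword = ? AND gram_case = ? AND num = ?",
        (headword, gram_case, num),
    ).fetchone()
    if row is not None:
        return row[0]
    row = conn.execute(
        "SELECT ending FROM noun_endings "
        "WHERE paradigm = ? AND gram_case = ? AND num = ?",
        (paradigm, gram_case, num),
    ).fetchone()
    return None if row is None else root + row[0]
```

Keeping exceptions in their own table means the regular paradigms stay small, and an irregular word such as "maðr" overrides only the cells where it actually deviates.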
A1 Structure (Specific 1)
Input:
• Head word
• Declension information
• Part of speech
• Translation
A1 Structure (Specific 2-1)
A1 pseudocode (nouns):
• Translate declension info into MySQL format
• Extract most likely endings from words in declension info
• Determine root of head word
• Create MySQL statement
A1 Structure (Specific 2-2)
• Receive all declension paradigms that fit declension information
• Apply regular sound changes
• Replace exceptional forms
• Output results
• [Show sample queries with details]
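The query-building part of the A1 pseudocode (translate the declension info, create the statement, receive every paradigm that fits) can be sketched as follows. This is a Python illustration using sqlite3 as a stand-in for MySQL, not the project's Perl. It assumes that a noun entry's declension info has the form "(-s, -ar)", read as genitive singular and nominative plural endings. The table `noun_paradigms`, its columns, and the paradigm names are hypothetical.

```python
import re
import sqlite3


def build_paradigm_db():
    """In-memory table of hypothetical paradigms keyed by two diagnostic endings."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE noun_paradigms (name TEXT, gender TEXT, gen_sg TEXT, nom_pl TEXT)"
    )
    conn.executemany(
        "INSERT INTO noun_paradigms VALUES (?, ?, ?, ?)",
        [
            ("a_stem_masc", "m", "s", "ar"),
            ("i_stem_masc", "m", "ar", "ir"),
            ("u_stem_masc", "m", "ar", "ir"),
        ],
    )
    return conn


def parse_declension_info(info):
    """Turn '(-s, -ar)' into {'gen_sg': 's', 'nom_pl': 'ar'}."""
    endings = re.findall(r"-(\w*)", info)
    return dict(zip(["gen_sg", "nom_pl"], endings))


def matching_paradigms(conn, info):
    """Return the names of all paradigms compatible with the declension info."""
    wanted = parse_declension_info(info)
    # Column names come from the fixed key list above, never from user input;
    # the ending values are passed as bound parameters.
    where = " AND ".join(f"{col} = ?" for col in wanted)
    sql = "SELECT name FROM noun_paradigms"
    if where:
        sql += f" WHERE {where}"
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, list(wanted.values()))]
```

With this demo data, "(-s, -ar)" matches a single paradigm, while "(-ar, -ir)" matches two. Returning both mirrors the stated behavior of outputting all possible declension paradigms rather than guessing one.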
Outlook (1): Accomplishments
• Zoega in electronic, parsable format
• [Show sample of complex Zoega entry]
• A1 outputs paradigms for all parts of speech in Zoega
Outlook (2): A1 Performance
Outlook (3): Next Steps
• Improve A1 performance: general, compound words, etc.
• Expand databases of exceptions
• Improve verification method
• Implement A2 beyond experimental stage
• Connect analyzers to XML-tagged text