
Creation of a Russian-English Translation Program



Presentation Transcript


  1. Creation of a Russian-English Translation Program
     Karen Shiells

  2. Purpose
     • Object-oriented approach
     • Interactive machine translation
     • Designed for aid, not independent translation
     • Explore algorithms used in machine translation
     • Identify grammatical obstacles to translation
     • Create a base to expand later

  3. Scope of Study
     • Machine translation is and will be imperfect
     • Modern translation uses statistical methods
     • Project is limited to:
        • Separating base words from morphological endings
        • Constructing syntax trees from source text
        • Generating simple English output from tree
        • Identifying words already known to the program
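The first project task, separating base words from morphological endings, can be sketched roughly as follows. This is a minimal Python illustration, not the presentation's implementation: the ending list, the `split_ending` name, and the minimum stem length are all assumptions, and the words are transliterated.

```python
# Hypothetical, heavily abbreviated ending inventory (transliterated Russian).
NOUN_ENDINGS = ["ami", "ov", "am", "oj", "a", "u", "y", "e", "i"]

def split_ending(word, endings=NOUN_ENDINGS, min_stem=2):
    """Return (stem, ending), trying the longest known endings first."""
    for ending in sorted(endings, key=len, reverse=True):
        # Refuse to strip an ending if almost nothing would be left as a stem.
        if word.endswith(ending) and len(word) - len(ending) >= min_stem:
            return word[:-len(ending)], ending
    return word, ""  # no known ending: treat the whole word as the stem

print(split_ending("knigu"))    # stem "knig", ending "u"
print(split_ending("knigami"))  # stem "knig", ending "ami"
print(split_ending("stol"))     # no ending stripped
```

Trying longer endings first keeps "ami" from being misread as the shorter ending "i".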

  4. Other Research
     • Part-of-speech tagging:
        • Uses probability to identify parts of speech
        • Applied to unknown words and structures
        • Complex labeling systems, beyond conventional
     • Translation algorithms:
        • Massive dictionaries store words and information
        • Aided by verb categorization
        • Omit unknown words and translate without
        • Usually comprehensible, but require human revision

  5. Old Methods
     • Direct Translation
        • First method
        • Rearranges sentences without parsing
        • Based on rules of transfer for specific languages
     • Interlingua
        • From era of international languages
        • Uses one representation as an intermediary
        • Intermediary is usually a constructed language
        • Easier to add language pairs

  6. Syntactic Transfer
     • Similar to interlingua
     • Generates syntax tree using specific parser
     • Rearranges tree to fit target structure
     • Uses specific generation method to form output
     • Entire algorithm specific to one language pair
     • Best quality translations
     • Relatively new
     • Not as common in commercial software
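The rearrange-then-generate part of syntactic transfer might look roughly like the sketch below. It assumes a parser has already produced the source tree; the `Node` class, the `SUBJ`/`VERB`/`OBJ` labels, the `TARGET_ORDER` rule, and the toy lexicon (which folds English articles into the entries) are all hypothetical simplifications rather than the presentation's design.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    label: str                      # a role such as "SUBJ", or a word at a leaf
    children: list = field(default_factory=list)

# Hypothetical transfer rule: English clauses want subject-verb-object order.
TARGET_ORDER = {"SUBJ": 0, "VERB": 1, "OBJ": 2}

def transfer(tree):
    """Return a copy of the tree with children reordered for the target language."""
    kids = [transfer(child) for child in tree.children]
    # Stable sort: unknown labels keep their relative order after known ones.
    kids.sort(key=lambda c: TARGET_ORDER.get(c.label, len(TARGET_ORDER)))
    return Node(tree.label, kids)

def generate(tree, lexicon):
    """Read the leaves left to right and look each one up in the lexicon."""
    if not tree.children:
        return [lexicon.get(tree.label, tree.label)]
    words = []
    for child in tree.children:
        words.extend(generate(child, lexicon))
    return words

# Source order is object-verb-subject: "knigu chitaet malchik".
source = Node("S", [Node("OBJ", [Node("knigu")]),
                    Node("VERB", [Node("chitaet")]),
                    Node("SUBJ", [Node("malchik")])])
lexicon = {"knigu": "a book", "chitaet": "reads", "malchik": "the boy"}
print(" ".join(generate(transfer(source), lexicon)))  # the boy reads a book
```

Because both the reordering rule and the lexicon are tied to one source and one target language, the sketch also shows why the whole algorithm is specific to a language pair.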

  7. Alternative Structures
     • Valency
        • Stores number of complements for each word
        • Type of complements not specified
        • Occupies less space in dictionary
     • Phrase-Structure Representation
        • Most familiar: noun phrase, verb phrase, etc.
        • Breaks sentence into superstructures
        • Puts terminal symbols only in leaves
        • Non-terminal symbols for branches
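To make the two representations concrete, here is a small sketch under stated assumptions: the valency counts include the subject (conventions differ), the verbs are transliterated, and the names `VALENCY`, `valency_satisfied`, `TREE`, and `terminals` are invented for illustration.

```python
# Valency: only a complement count per word, with no complement types.
VALENCY = {"spat'": 1, "chitat'": 2, "dat'": 3}  # sleep, read, give

def valency_satisfied(verb, complements):
    """True when the number of complements found matches the stored count."""
    return len(complements) == VALENCY[verb]

# Phrase structure: non-terminals label the branches, words sit only in leaves.
TREE = ("S",
        ("NP", "malchik"),
        ("VP", ("V", "chitaet"), ("NP", "knigu")))

def terminals(tree):
    """Collect the leaf words of a nested-tuple phrase-structure tree."""
    if isinstance(tree, str):
        return [tree]
    _label, *children = tree
    words = []
    for child in children:
        words.extend(terminals(child))
    return words

print(valency_satisfied("chitat'", ["malchik", "knigu"]))  # True
print(terminals(TREE))
```

The valency table is a single integer per word, which is why it occupies less dictionary space than storing full complement structures.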

  8. Dependency Trees
     • Uses words as nodes, not just leaves
     • Examples:
        • Verb dependent on subject
        • Objects dependent on verb
        • Adjectives dependent on nouns
        • Prepositions vary by type of prepositional phrase
     • Easier to verify agreement between words
     • Occupies less space
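A dependency tree along these lines could be sketched as below, with every word as a node and the verb hanging off the subject as in the examples above. The `Word` class, its attribute names, and the `check_adjectives` helper are assumptions made for illustration; the morphological tags are supplied by hand.

```python
from dataclasses import dataclass, field

@dataclass
class Word:
    text: str
    pos: str
    case: str = ""
    number: str = ""
    gender: str = ""
    dependents: list = field(default_factory=list)

def agrees(head, dep):
    """Adjective-noun agreement: same case, number, and gender."""
    return (head.case, head.number, head.gender) == (dep.case, dep.number, dep.gender)

def check_adjectives(node):
    """Return every (adjective, head) pair in the tree that fails agreement."""
    errors = []
    for dep in node.dependents:
        if dep.pos == "adj" and not agrees(node, dep):
            errors.append((dep.text, node.text))
        errors.extend(check_adjectives(dep))
    return errors

# "malenkij malchik chitaet novuju knigu" (the little boy reads a new book)
subject = Word("malchik", "noun", "nom", "sg", "m", [
    Word("malenkij", "adj", "nom", "sg", "m"),
    Word("chitaet", "verb", dependents=[
        Word("knigu", "noun", "acc", "sg", "f", [
            Word("novuju", "adj", "acc", "sg", "f")])])])

print(check_adjectives(subject))  # []
```

Because each adjective is stored directly under its noun, agreement is checked along a single edge rather than by searching the sentence.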

  9. Object Orientation
     • Object-oriented approach allows more flexibility
        • Endings, cases, and declensions are classes
        • Fewer hard-coded rules
        • Methods for locating dependents are in classes
     • Modular design allows gradual changes
        • Changes in lexical analysis do not affect parsing
        • Changes in dictionary do not affect translation
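One way endings and declensions might be modeled as classes is sketched here. The class names, fields, and the `analyze` method are hypothetical; only the singular forms and the nominative plural of a hard-stem noun like "komnata" (room) are listed, in transliteration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ending:
    text: str
    case: str
    number: str

class Declension:
    """A named set of endings; the rules live in data, not in branching code."""

    def __init__(self, name, endings):
        self.name = name
        self.endings = endings

    def analyze(self, word):
        """Return every (stem, Ending) reading this declension allows for word."""
        return [(word[:-len(e.text)], e) for e in self.endings
                if word.endswith(e.text) and len(word) > len(e.text)]

FIRST = Declension("first (hard stem)", [
    Ending("a", "nominative", "singular"),
    Ending("y", "genitive", "singular"),
    Ending("e", "dative", "singular"),
    Ending("u", "accusative", "singular"),
    Ending("oj", "instrumental", "singular"),
    Ending("e", "prepositional", "singular"),
    Ending("y", "nominative", "plural"),
])

for stem, ending in FIRST.analyze("komnaty"):
    print(stem, ending.case, ending.number)
```

"komnaty" comes back with two readings (genitive singular and nominative plural), so a later stage still has to choose between them. Supporting another declension means adding another `Declension` instance rather than new conditional logic.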

  10. Verb Typing
     • Divides verbs into categories, for example:
        • Transitive
        • Intransitive
        • Directional or non-directional motion
     • Condenses structure storage
        • Dictionary stores only type of a verb
        • Particular structures taken from general
     • Code can apply to general structures, not specific
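The storage saving from verb typing can be illustrated with a short sketch: each verb stores only a type name, and the complement structure is looked up from the type. The frames listed for each type, and all identifiers, are assumptions chosen for illustration rather than the presentation's actual categories.

```python
# General structures, stored once per verb type (hypothetical frames).
VERB_FRAMES = {
    "transitive": ("subject", "direct_object"),
    "intransitive": ("subject",),
    "directional_motion": ("subject", "destination"),
    "nondirectional_motion": ("subject",),
}

# The dictionary keeps only the type name for each verb (transliterated).
VERB_TYPES = {
    "chitat'": "transitive",
    "spat'": "intransitive",
    "idti": "directional_motion",
    "khodit'": "nondirectional_motion",
}

def complements_for(verb):
    """Derive a verb's particular structure from its general type."""
    return VERB_FRAMES[VERB_TYPES[verb]]

def missing_complements(verb, found_roles):
    """List the roles the verb expects that the parser has not found yet."""
    return [role for role in complements_for(verb) if role not in found_roles]

print(complements_for("chitat'"))
print(missing_complements("chitat'", {"subject"}))
```

Code that consumes `complements_for` only ever sees the general frame, so adding a verb is a one-line dictionary change.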

  11. Dictionary
     • Open, save, add, remove, and search functions
     • Stores:
        • Russian nominative
        • English nominatives
        • Part of speech
        • Noun/pronoun attributes
        • Verb types
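A dictionary with these five operations and fields might be sketched as follows. The JSON file format, the class and method names (including `open_file`, named to avoid shadowing Python's built-in `open`), and the attribute layout are all assumptions; the presentation does not say how entries are stored.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class Entry:
    russian: str                     # Russian nominative (transliterated here)
    english: list                    # one or more English nominatives
    pos: str                         # part of speech
    attributes: dict = field(default_factory=dict)  # noun/pronoun attributes
    verb_type: str = ""              # only meaningful for verbs

class Dictionary:
    def __init__(self):
        self.entries = {}

    def add(self, entry):
        self.entries[entry.russian] = entry

    def remove(self, russian):
        self.entries.pop(russian, None)

    def search(self, russian):
        """Return the Entry for a known word, or None if it is unknown."""
        return self.entries.get(russian)

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump([asdict(e) for e in self.entries.values()], f,
                      ensure_ascii=False, indent=2)

    @classmethod
    def open_file(cls, path):
        dictionary = cls()
        with open(path, encoding="utf-8") as f:
            for raw in json.load(f):
                dictionary.add(Entry(**raw))
        return dictionary
```

Since `search` returns `None` for unknown words, a caller can use it directly to identify which words the program already knows.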

  12. Translator
     • Uses transliteration for ease of testing
     • Can be easily converted to Unicode Cyrillic
     • Debugging output to terminal window
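Converting transliterated text to Unicode Cyrillic could be done with a table-driven pass like the one below. The transliteration scheme is an assumption, since the presentation does not specify one, and the table covers only part of the alphabet; multi-letter keys are matched before single letters.

```python
# Hypothetical Latin-to-Cyrillic table; the program's real scheme is unknown.
TABLE = {
    "shch": "щ", "zh": "ж", "kh": "х", "ts": "ц", "ch": "ч", "sh": "ш",
    "ju": "ю", "ja": "я",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е", "z": "з",
    "i": "и", "j": "й", "k": "к", "l": "л", "m": "м", "n": "н", "o": "о",
    "p": "п", "r": "р", "s": "с", "t": "т", "u": "у", "f": "ф", "y": "ы",
    "'": "ь",
}
KEYS = sorted(TABLE, key=len, reverse=True)  # longest keys first

def to_cyrillic(text):
    """Greedy longest-match conversion; unknown characters pass through."""
    out, i = [], 0
    while i < len(text):
        for key in KEYS:
            if text.startswith(key, i):
                out.append(TABLE[key])
                i += len(key)
                break
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

print(to_cyrillic("mal'chik chitaet novuju knigu"))
```

Greedy matching is ambiguous for some sequences (for example "ts" could also stand for т followed by с), so a production scheme would need an unambiguous encoding.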

  13. Results
     • Subject, verb, direct object translated
        • Subject is first nominative
        • Verb matched by gender, number, and person
        • Direct object is first accusative
     • Adjectives matched to nouns
        • Matched by case, number, and gender
     • Word order not considered
        • Word order should be accounted for, but isn't
        • Adjectives should attach to the nearest noun, not to any matching one
        • Prepositional objects should be nearby
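The role-finding rules, and the proximity issue with adjectives, can be illustrated with a sketch over pre-analyzed tokens. The token layout and function names are hypothetical, and `nearest_agreeing_noun` shows a possible proximity-aware variant, not what the program currently does.

```python
def first_with_case(tokens, case):
    """Subject = first nominative noun; direct object = first accusative noun."""
    return next((t for t in tokens if t["pos"] == "noun" and t["case"] == case), None)

def nearest_agreeing_noun(tokens, adj_index):
    """Among nouns agreeing in case, number, and gender, pick the closest one."""
    adj = tokens[adj_index]
    candidates = [(abs(i - adj_index), i) for i, t in enumerate(tokens)
                  if t["pos"] == "noun"
                  and all(t[k] == adj[k] for k in ("case", "number", "gender"))]
    # Ties go to the earlier noun, an arbitrary choice in this sketch.
    return tokens[min(candidates)[1]] if candidates else None

def tok(text, pos, case="", number="", gender=""):
    return {"text": text, "pos": pos, "case": case, "number": number, "gender": gender}

# "staryj professor i novyj student" (the old professor and the new student)
tokens = [tok("staryj", "adj", "nom", "sg", "m"),
          tok("professor", "noun", "nom", "sg", "m"),
          tok("i", "conj"),
          tok("novyj", "adj", "nom", "sg", "m"),
          tok("student", "noun", "nom", "sg", "m")]

print(first_with_case(tokens, "nom")["text"])       # professor
print(nearest_agreeing_noun(tokens, 3)["text"])     # student
```

Both nouns agree with "novyj", so matching on case, number, and gender alone cannot tell them apart; only proximity attaches it to "student".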

  14. Conclusions
     • Part-of-speech guessing could be added easily
        • When a subordinate is not found, add to list
        • For each unmatched word, prompt user
        • Allow selection between subordinates not found
     • Verb typing would be harder, but helpful
        • Restricting complements makes matching more precise
        • More efficient, not searching for all possible complements
     • Prepositions could be associated with nouns
     • Even in inflecting languages, word order matters
        • Subordinates should be located by proximity
        • Multiple functions use the same inflections

