
FROM BITS TO BOTS: Women Everywhere, Leading the Way

FROM BITS TO BOTS: Women Everywhere, Leading the Way. Lenore Blum, Anastassia Ailamaki, Manuela Veloso, Sonya Allin, Bernardine Dias, Ariadna Font Llitjós. School of Computer Science, Carnegie Mellon University. AVENUE: Automatic Machine Translation for low-density languages.


Presentation Transcript


  1. FROM BITS TO BOTS: Women Everywhere, Leading the Way. Lenore Blum, Anastassia Ailamaki, Manuela Veloso, Sonya Allin, Bernardine Dias, Ariadna Font Llitjós. School of Computer Science, Carnegie Mellon University

  2. AVENUE: Automatic Machine Translation for low-density languages. Ariadna Font Llitjós, Language Technologies Institute

  3. Automatic Machine Translation [Diagram; labels: interlingua, interpretation, transfer rules, corpus-based methods, analysis, generation]

  4. Low-density languages
     • Not endangered languages, but languages with little or no presence on the web and few or no linguistic resources
     • AVENUE is currently working with:
       • Mapudungun [Chile]
       • Inupiaq [Alaska]
       • Aymara, Quechua and Aguaruna [Peru]
       • Siona [Colombia]

  5. Mapudungun for the Mapuche. Chile: official language Spanish; population ~15 million. Mapuche people: ~1/2 million; language: Mapudungun

  6. The language: Mapudungun • Oral tradition (170 hours of recorded speech in the medical domain) • Just a few written texts exist • Need to standardize the alphabet, determine phoneme set and writing rules, develop an electronic dictionary • We provide them with linguistic and technical advice + tools such as a morphological analyzer, parser and ultimately an MTS • We work in collaboration with a local team in Temuco

  7. Our last meeting in Temuco, May 2002

  8. New approach to MT • Fully automatic (no human intervention) • Very little electronic data available → elicitation corpus • Machine learning techniques • Seeded version space (SVS) algorithm to automatically learn transfer rules • Interactive and automatic refinement of transfer rules

  9. Elicitation corpus sample
     …
     \spa Una mujer se quedó en casa
     \map Kie domo mlewey ruka mew
     \eng One woman stayed at home.
     \spa Vi una mujer
     \map Pen kie domo
     \eng I saw one woman.
     \spa Hay suficiente comida para una mujer
     \map Mley iagel i yochiluwam kie domo
     \eng There is enough food for one woman.
     …
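The sample above uses backslash-tagged fields (\spa, \map, \eng). As a minimal sketch of how such records could be read into aligned triples, the following assumes that every record begins with a \spa field and that field text never contains a backslash-initial token; the function name `parse_elicitation` is hypothetical and is not part of the AVENUE tools.

```python
from typing import Dict, List, Optional


def parse_elicitation(text: str) -> List[Dict[str, str]]:
    """Group backslash-tagged fields into records.

    Assumption (not from the slides): a new record starts at each \\spa tag.
    """
    records: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    tag: Optional[str] = None
    for token in text.split():
        if token.startswith("\\") and len(token) > 1:
            tag = token[1:]
            # Open a new record on \spa, or if no record is open yet.
            if tag == "spa" or current is None:
                current = {}
                records.append(current)
            current[tag] = ""
        elif current is not None and tag is not None:
            # Append the word to the field currently being read.
            current[tag] = (current[tag] + " " + token).strip()
    return records


# Two records copied from the slide's sample.
sample = r"""
\spa Una mujer se quedó en casa
\map Kie domo mlewey ruka mew
\eng One woman stayed at home.
\spa Hay suficiente comida para una mujer
\map Mley iagel i yochiluwam kie domo
\eng There is enough food for one woman.
"""

records = parse_elicitation(sample)
for rec in records:
    print(rec["spa"], "|", rec["map"], "|", rec["eng"])
```

Because the parser splits on whitespace, it works equally well when the fields run together on one line, as they do in a flattened transcript.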

  10. Automatic Learning of a Transfer-based MTS [Diagram; components: elicitation corpus, SVS algorithm, tentative transfer rules, transfer module, SL sentences, (tentative) TL sentences, rule refinement module; people named: Katharina Probst, Erik Peterson, Ariadna Font]

  11. Interactive and Automatic rule refinement: 1. Given an MTS, translate sentences and present them to the users for minimal correction (interface design, MT error classification). 2. Determine blame assignment. 3. Structure learning, as opposed to binary feedback, to automatically refine the existing rules.

  12. Interactive Learning
     • Translation Correction Tool, a web application
     • Bilingual informants (no knowledge of linguistics assumed)
     • User-friendly and intuitive interface
     • Can naïve users reliably pinpoint the source of errors? Is MT error classification realistic?
     • Need for user studies:
       • Spanish - English
       • English - Spanish
       • English - Chinese

  13. Structure learning. Learn a mapping between incorrect structures and correct structures: "She saw high woman" → "She saw the tall woman" • Given user feedback (correction + error classification) and blame assignment, modify the appropriate transfer rule(s) to obtain the correct translation • Need to evaluate based on cross-validation: the number of sentences it can translate correctly (elicitation corpus)

  14. A simple example
     • Spanish SLS: Ella vio a la mujer alta
     • English TLS: She saw high woman
     • Corrected TLS: She saw the tall woman
     • MT error classification: missing determiner + wrong lexical selection
     • Blame assignment (NP rule that generated the direct object + selectional restrictions)
     • Rule refinement: the Noun Phrase (NP) rule that generated the error:
       • NP -> Adj N
     • needs to be refined into 2 different cases:
       • NP -> Det Adj N[sg] (the tall woman)
       • NP -> (Det) Adj N[pl] ((the)? tall women)
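The refinement in this example can be illustrated with a small sketch. It covers only the "missing determiner" part of the correction (not the lexical selection of "tall" over "high"), and it assumes that the head noun is the last symbol on the rule's right-hand side. The classes `Sym` and `Rule` and the function `refine_missing_determiner` are hypothetical names chosen for illustration; they do not reflect the AVENUE rule formalism or its refinement module.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Sym:
    """One right-hand-side symbol, e.g. Det, (Det), N[sg]."""
    cat: str
    number: Optional[str] = None  # "sg", "pl", or None if unconstrained
    optional: bool = False

    def __str__(self) -> str:
        s = self.cat + (f"[{self.number}]" if self.number else "")
        return f"({s})" if self.optional else s


@dataclass(frozen=True)
class Rule:
    lhs: str
    rhs: Tuple[Sym, ...]

    def __str__(self) -> str:
        return f"{self.lhs} -> " + " ".join(str(s) for s in self.rhs)


def refine_missing_determiner(rule: Rule) -> List[Rule]:
    """Split a rule into a singular case (Det required) and a plural
    case (Det optional). Assumes the head noun is the last symbol."""
    head = rule.rhs[-1]
    body = rule.rhs[:-1]
    singular = Rule(rule.lhs, (Sym("Det"),) + body + (Sym(head.cat, "sg"),))
    plural = Rule(
        rule.lhs,
        (Sym("Det", optional=True),) + body + (Sym(head.cat, "pl"),),
    )
    return [singular, plural]


original = Rule("NP", (Sym("Adj"), Sym("N")))
refined = [str(r) for r in refine_missing_determiner(original)]
for line in refined:
    print(line)
# prints:
# NP -> Det Adj N[sg]
# NP -> (Det) Adj N[pl]
```

Keeping rules immutable (`frozen=True`) means the original rule survives the refinement unchanged, which makes it straightforward to compare translations before and after on held-out sentences.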

  15. AVENUE project members
     LTI team:
       Researchers: Jaime Carbonell, Lori Levin, Alon Lavie, Ralf Brown
       Ph.D. students: Ariadna Font Llitjós, Christian Monson, Erik Peterson, Katharina Probst
     AVENUE External Project Coordinator: Rodolfo M Vega
     Chilean team: Eliseo Cañulef, Luis Caniupil Huaiquiñir, Hugo Carrasco, Marcela Collio Calfunao, Rosendo Huisca, Cristian Carrillan Anton, Hector Painequeo, Salvador Cañulef, Flor Caniupil, Claudio Millacura

  16. Thanks! For more information: http://www.cs.cmu.edu/~aria/avenue/
