Interlingua Methodology

Interlingua Methodology. Directly obtain the meaning of the source sentence. Do target sentence generation from the meaning representation. John gave the book to Mary. Meaning representation: give-action: agent: john object: the book receiver: mary.

Presentation Transcript


  1. Interlingua Methodology Directly obtain the meaning of the source sentence. Do target sentence generation from the meaning representation. John gave the book to Mary. Meaning representation: give-action: agent: john object: the book receiver: mary
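
The frame on this slide can be written down as a small data structure. The sketch below is only an illustration: the class name `GiveAction` and the function `realize_english` are hypothetical names, not part of any system described in the slides, and the generator handles just this one sentence pattern.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GiveAction:
    """Language-neutral frame for a 'give' event (slots taken from the slide)."""
    agent: str
    obj: str
    receiver: str


def realize_english(m: GiveAction) -> str:
    """Generate an English sentence from the frame (hypothetical toy generator)."""
    return f"{m.agent.capitalize()} gave {m.obj} to {m.receiver.capitalize()}."


# Meaning representation of "John gave the book to Mary."
frame = GiveAction(agent="john", obj="the book", receiver="mary")
print(realize_english(frame))
```

A generator for another language would read the same frame and apply its own word order and morphology, which is the point of the interlingua approach.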

  2. Competing approaches • Direct • Transfer based

  3. Direct approach
     • Word replacements: I like mangoes → maOM AcCa laga Aama (I like (root) mangoes)
     • Morphology: maOM AcCa lagata Aama (I like mangoes)
     • Syntactic re-arrangement: maOM Aama AcCa lagata hO (I mangoes like)
     • Semantic embellishment: mauJao Aama AcCa lagata hO (I (dative) mangoes like)

  4. Transfer Based Source sentence processed for parsing, chunking etc. [S [NP I] [VP [V like] [NP mangoes]]]

  5. Transfer Based Transfer structures obtained for the target sentence. [S [NP I] [VP [NP mangoes] [V like]]]

  6. Transfer Based Morphology and language specific modifications. [S [NP mauJao] [VP [NP Aama] [V AcCa lagataa hO]]]
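
The structural step of the transfer approach (slides 4 and 5) can be sketched as a tree rewrite. This is a minimal illustration under stated assumptions: trees are nested tuples of the form `(label, child, ...)`, and the only transfer rule is "inside a VP, move the verb after its object NP". The function names are hypothetical.

```python
def is_leaf(tree):
    """A leaf is (label, word), e.g. ("NP", "I")."""
    return len(tree) == 2 and isinstance(tree[1], str)


def transfer_svo_to_sov(tree):
    """Recursively reorder VP -> V NP into VP -> NP V."""
    if is_leaf(tree):
        return tree
    label, *children = tree
    children = [transfer_svo_to_sov(c) for c in children]
    if label == "VP" and [c[0] for c in children] == ["V", "NP"]:
        children.reverse()
    return (label, *children)


def leaves(tree):
    """Read the words off the tree from left to right."""
    if is_leaf(tree):
        return [tree[1]]
    return [w for child in tree[1:] for w in leaves(child)]


source = ("S", ("NP", "I"), ("VP", ("V", "like"), ("NP", "mangoes")))
target = transfer_svo_to_sov(source)
print(" ".join(leaves(target)))
```

The output still uses English words; lexical substitution and the morphology of slide 6 would be separate steps applied to the reordered tree.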

  7. MT Architectures: Vauquois' triangle

  8. Relation Between the Transfer and the Interlingua Models [Diagram: source language words → parsing → source language parse tree → interpretation → interlingua → generation → target language parse tree → generation → target language words; transfer links the source and target parse trees directly.]

  9. State of Affairs • Systran reports 19 different language pairs. • Only 8 are adequate for their intended use. • Even fewer are capable of quality written or spoken text translation.

  10. ENGLISH-SPANISH-ENGLISH • ...In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province • ... en ese imperio, el arte de la cartografía logró tal perfección que el mapa de una sola provincia ocupó la totalidad de una ciudad, y el mapa del imperio, la totalidad de una provincia • ... in that empire, the art of the cartography obtained such perfection that the map of a single province occupied the totality of a city, and the map of the empire, the totality of a province Provided by Systran on 19/11/02

  11. ENGLISH-KOREAN-ENGLISH • ...In that Empire, the Art of Cartography attained such Perfection that the map of a single Province occupied the entirety of a City, and the map of the Empire, the entirety of a Province • 저 제국안에, 단순한 지방의 지도가 도시의 완전을 점유했다 고 Cartography의 예술은 같은 얀벽,및 제국, 지방의 완전의 지도 를 달성했다 • Inside that empire, the map of the region where it is simple occupied the perfection of the city the art of the Cartography is same, yan it attained the map of of perfection of the wall and empire and region Provided by Systran on 19/11/02

  12. UNL Based MT: the scenario [Diagram: English, Russian, Hindi and French each connect to UNL through enconversion (natural language to UNL) and deconversion (UNL to natural language).]

  13. Universal Networking Language • Common language for computers to express information written in natural language (Uchida et al. 2000) • Applications: • Electronic language to overcome the language barrier • Information distribution system

  14. UNL Example [Graph: arrange, with edges agt → John, obj → meeting, plc → residence]
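
One plausible reading of this slide's graph is that "arrange" is the head node, linked to John by agt, to meeting by obj and to residence by plc. Under that assumption, the graph can be stored as labelled edges. The triple layout and the helper `outgoing` are illustrative choices, not a UNL file format.

```python
# Each edge is (relation label, head node, dependent node).
# The assignment of labels to nodes is an assumed reading of the slide.
edges = [
    ("agt", "arrange", "John"),
    ("obj", "arrange", "meeting"),
    ("plc", "arrange", "residence"),
]


def outgoing(graph, head):
    """Return {relation: dependent} for every edge leaving `head`."""
    return {rel: dep for rel, h, dep in graph if h == head}


print(outgoing(edges, "arrange"))
```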

  15. Components of the UNL System • Universal Word • Relation Labels • Attributes

  16. Universal Word [saayaa] "shadow(icl>darkness)"; the place was now in shadow [laoSamaa~] "shadow(icl>iota)"; not a shadow of doubt about his guilt [saMkot] "shadow(icl>hint)" ; the shadow of the things to come [Cayaa] "shadow(icl>deterrant)"; a shadow over his happiness
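
The entries on this slide behave like a dictionary keyed by the restricted Universal Word, so that one English headword maps to several disambiguated senses. The sketch below copies the entries as they appear on the slide (including the transliterated Hindi strings); the names `UW_TO_HINDI`, `senses` and `restriction` are hypothetical.

```python
# Universal Word -> Hindi headword, copied verbatim from the slide.
UW_TO_HINDI = {
    "shadow(icl>darkness)": "saayaa",
    "shadow(icl>iota)": "laoSamaa~",
    "shadow(icl>hint)": "saMkot",
    "shadow(icl>deterrant)": "Cayaa",
}


def senses(headword):
    """All Universal Words whose headword part (before the first '(') matches."""
    return sorted(uw for uw in UW_TO_HINDI if uw.split("(", 1)[0] == headword)


def restriction(uw):
    """The constraint list between the outer parentheses, e.g. 'icl>hint'."""
    return uw[uw.index("(") + 1 : uw.rindex(")")]


for uw in senses("shadow"):
    print(restriction(uw), "->", UW_TO_HINDI[uw])
```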

  17. Universal Word (foreign concepts) [aput] "snow(icl>thing)"; [pukak] "snow(aoj<salt like)"; [mauja] "snow(aoj<soft, aoj<deep)"; [massak] "snow(aoj<soft)"; [mangokpok] "snow(aoj<watery)";

  18. Relation agt (agent)
      Agt defines a thing which initiates an action: agt (do, thing)
      Syntax: agt[":"<Compound UW-ID>] "(" {<UW1>|":"<Compound UW-ID>} "," {<UW2>|":"<Compound UW-ID>} ")"
      Detailed definition: Agent is defined as the relation between UW1 (a do) and UW2 (a thing), where UW2 initiates UW1, or UW2 is thought of as having a direct role in making UW1 happen.
      Examples and readings:
      • agt(break(icl>do), John(icl>person)): John breaks
      • agt(translate(icl>do), computer(icl>machine)): computer translates
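
The two example expressions on this slide have the shape label(UW1, UW2). A minimal parser for that shape is sketched below. It is an assumption-laden illustration: the function name `parse_relation` is hypothetical, and the optional compound UW-ID part of the syntax is not handled.

```python
def parse_relation(expr):
    """Split 'agt(UW1, UW2)' into (label, UW1, UW2).

    The separating comma is the one at parenthesis depth 0 inside the outer
    brackets, so commas inside a UW's own constraint list are skipped.
    """
    expr = expr.strip()
    label, sep, rest = expr.partition("(")
    if not sep or not rest.endswith(")"):
        raise ValueError(f"not a binary relation: {expr!r}")
    body = rest[:-1]  # drop the closing parenthesis of the relation
    depth = 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            return label, body[:i].strip(), body[i + 1:].strip()
    raise ValueError(f"no top-level comma in: {expr!r}")


print(parse_relation("agt(break(icl>do), John(icl>person))"))
```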

  19. Attributes • Used to describe what is said from the speaker's point of view. • In particular captures number, tense, aspect and modality information.

  20. Example Attributes
      • I see a flower. UNL: obj(see(icl>do), flower(icl>thing))
      • I saw flowers. UNL: obj(see(icl>do).@past, flower(icl>thing).@pl)
      • Did I see flowers? UNL: obj(see(icl>do).@past.@interrogative, flower(icl>thing).@pl)
      • Please see the flowers. UNL: obj(see(icl>do).@request, flower(icl>thing).@pl.@definite)
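
In the examples on this slide, attributes are appended to a Universal Word as a chain of `.@name` suffixes. A small sketch of separating the two parts follows; the function name `split_attributes` is hypothetical and the code relies only on the `.@` pattern visible in the examples.

```python
def split_attributes(node):
    """Split 'see(icl>do).@past.@interrogative' into the UW and its attributes."""
    uw, *attrs = node.split(".@")
    return uw, attrs


print(split_attributes("see(icl>do).@past.@interrogative"))
print(split_attributes("flower(icl>thing).@pl"))
print(split_attributes("see(icl>do)"))
```

A node without attributes yields an empty list, which corresponds to the plain "I see a flower" example.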

  21. The Analyser Machine [Figure: the Enconverter uses the analysis rules and the dictionary to operate on a node list (..., ni-1, ni, ni+1, ni+2, ni+3, ...) and a node-net.]

  22. Strategy for Analysis • Morphological Analysis • Syntactico-Semantic Analysis

  23. Analysis of a simple sentence
      <<A Report of John’s genius reached King’s ears>>
      The article and noun are combined and the attribute @indef is added to the noun.
      <<[Report][of] John’s genius reached King’s ears>>
      Right shift to put the preposition with the succeeding noun.
      <</Report/ [of][John’s] genius reached King’s ears>>
      John’s being a possessive noun, shift right.
      <</Report/ /of/ [John’s][genius] reached King’s ears>>
      These two nouns are resolved into relation pos and the first noun is deleted.

  24. Simple sentence (continued)
      <</Report/ [of][genius] reached King’s ears>>
      The preposition of is then combined with the noun and a dynamic attribute OFRES is added to the entry of genius.
      <<[Report][of genius] reached King’s ears>>
      Using the attribute OFRES, these two nouns are resolved to relation mod and the second noun is deleted.
      <<[Report][reached] King’s ears>>
      Shift right again and resolve King’s ears; relation pof is generated.
      <</Report/ [reached][ears]>>
      Relation obj is generated here, and then relation agt is generated between Report and ears.
      <</reached/>>

  25. UNL as Interlingua and Language Divergence (Dave, Parikh, Bhattacharyya, JMT, 2003) • Language divergence stands for the discrepancy in representation due to the inherent characteristics of the languages. • Syntactic Divergence • Lexical Semantic Divergence

  26. Issue of free word order
      jaIma nao caaorI krnaovaalao laD,ko kao laazI sao maara.
      jaIma nao laazI sao caaorI krnaovaalao laD,ko kao maara.
      caaorI krnaovaalao laD,ko kao jaIma nao laazI sao maara.
      caaorI krnaovaalao laD,ko kao laazI sao jaIma nao maara.
      laazI sao jaIma nao caaorI krnaovaalao laD,ko kao maara.
      • Use is made of the fact that in Hindi postpositions stay adjacent to nouns (as opposed to the preposition stranding divergence).
      • Flexibility in parsing: hit and preserve the predicate till the end.

  27. Conjunct and Compound verbs
      Typical Indian language phenomenon. Compound for verb-verb, conjunct for other POS+verb.
      • H: vah gaanao lagaI. E: She started singing.
      • H: calao jaaAao. E: Go away.
      • H: $k jaaAao. E: Stop there.
      • H: Jauk jaaAao. E: Bend down.
      Possibility of combinatorial explosion in the lexicon. Possible solution: wordnet?

  28. Use of Lexical Resources • Automatic generation of the UW to language dictionary (Verma and Bhattacharyya, Global Wordnet Conference, Czech Republic, 2004) • Universal Word generation • Semantic attribute generation • Heavy use of wordnets and ontologies

  29. Languages under Study

  30. Conclusions • Predicate preservation strategy used for English, Hindi, Marathi, Bengali (Spanish being added). • Focus on morphology for Marathi. • Focus on the kaarak (case) system for Bengali. • Extremely hungry for lexical knowledge.

  31. Conclusions • Work is going on in the creation of Indian language wordnets (Hindi and Marathi at IIT Bombay; Dravidian at Anna University). • Interlingua has the attractive possibility of being used as a knowledge representation and applied to applications such as summarization, text clustering and meaning-based multilingual search engines.

  32. Generation of the Hindi Case System in an Interlingua based MT Framework. Debasri Chakrabarti, Sunil Kumar Dubey, Pushpak Bhattacharyya. Computer Science and Engineering Department, Indian Institute of Technology, Bombay, Mumbai, 400076, India. debasri,dubey,pb@cse.iitb.ac.in

  33. Introduction • Role of the case marker in a language • plays an important role in the structure of a sentence • helps to impart the meaning and naturalness • Example *मोटे तौर पर कृषि भूमि की जुताई, फसलों की रुपाई, कटाई, पालतू पशु प्रजनन, पालन, दुग्ध-व्यवसाय और वनीकरण सम्मिलित होता है । In a broad sense, agriculture includes cultivation of the soil and growing and harvesting crops and breeding and raising livestock and dairying and forestry.

  34. The Case System in Hindi • Hindi is characterized by a rich subsystem of case • Example: राम ने रवि को किताब दी। Ram Erg Ravi Dat book Nom give + past. Ram gave a book to Ravi. • Hindi has the following cases: nominative, ergative, accusative, instrumental, dative, genitive, locative

  35. Language Universal Case Feature

  36. Language Universal Case Feature

  37. Case features of Hindi

  38. Case features of Hindi

  39. Case features of Hindi

  40. Case features of Hindi

  41. Nominative ~ Ergative alternation in the agent position • The agent of an action may bear either nominative case or ergative case • Ergative case appears in Hindi with: • simple past form • perfective aspect

  42. Examples • राम ने रवि को पीटा। Ram erg Ravi acc beat + past. Ram beat Ravi. • राम ने रवि को पीटा था। Ram erg Ravi acc beat + past perfect. Ram had beaten Ravi. • राम ने रवि को पीटा है। Ram erg Ravi acc beat + present perfect. Ram has beaten Ravi.

  43. Observations • There is a correlation between the ergative case and the aspectual property of the main verb • This is morphologically overt on the verb • Simple Past Tense: पीटा • Perfective Aspect: पीटा था • Morphological Rule • Simple Past Tense: V + आ → ने • Perfective Aspect: V + आ + (Tense morphology) → ने
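
The morphological rule on this slide can be read as a first approximation for generation: if the main verb is in the simple past or carries perfective aspect, mark the agent with ने. The sketch below encodes only that approximation. The function name and the string labels for tense and aspect are hypothetical, and the later slides show that the real alternation also depends on the verb and on whether the action is deliberate.

```python
def agent_case_marker(tense: str, aspect: str) -> str:
    """Return the ergative marker when the verb form licenses it, else ''.

    Assumed labels: tense in {"past", "present", "future"},
    aspect in {"simple", "perfective", "habitual", "progressive"}.
    """
    if aspect == "perfective" or (tense == "past" and aspect == "simple"):
        return "ने"
    return ""  # nominative agent: no overt marker


def mark_agent(agent: str, tense: str, aspect: str) -> str:
    """Attach the marker (if any) to the agent noun."""
    marker = agent_case_marker(tense, aspect)
    return f"{agent} {marker}" if marker else agent


print(mark_agent("राम", "past", "simple"))         # context of पीटा
print(mark_agent("राम", "past", "perfective"))     # context of पीटा था
print(mark_agent("राम", "present", "habitual"))
```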

  44. Nominative ~ Ergative Alternation • Some complex phenomena • Nominative case can appear on the agent even with the aspectual features mentioned above • Is the nominative ~ ergative alternation subject to transitivity? • Cross-linguistically, transitivity determines nom ~ erg • In Hindi, three types of patterns occur independent of transitivity

  45. Nominative ~ Ergative Alternation • Three patterns are: • only nom agents • only erg agents • either nom or erg agents • Examples of intransitive verbs • Only nom agents i) राम गिरा। Ram fell down. Ram + nom fall + past. ii) *राम ने गिरा। Ram erg fall + past.

  46. Intransitive Verbs • Only erg agents i) राम ने प्रतीक्षा की। Ram waited. Ram erg wait + past. ii) *राम प्रतीक्षा किया। Ram + nom wait + past. • Either nom or erg agents i) राम खेला। Ram played. Ram + nom play + past. ii) राम ने खेला। Ram erg play + past.

  47. Transitive Verbs • Only nom agents i) राम शीशा लाया। Ram brought the glass. Ram + nom glass bring + past. ii) *राम ने शीशा लाया। Ram erg glass bring + past. • Only erg agents i) राम ने शीशा तोड़ा। Ram broke the glass. Ram erg glass break + past. ii) *राम शीशा तोड़ा। Ram + nom glass break + past.

  48. Transitive Verbs • Either nom or erg agents i) राम ने समझा कि घर मेरा है। Ram erg think + past that house mine is. Ram thought that the house is mine. ii) राम समझा कि घर मेरा है। Ram think + past that house mine is.

  49. Inferences • Ergative case in Hindi is semantically driven • Action performed deliberately: ergative case • Action performed non-deliberately: nominative case • Examples of deliberate and non-deliberate action: राम गिरा। Ram fell down. Ram + nom fall + past. राम ने मोहन को गिराया। Ram made Mohan fall down. Ram erg Mohan acc cause to fall down.

  50. Accusative ~ Nominative Alternation in the Object • Primary objects in Hindi are • either accusative: को • or nominative (uninflected): Ө • Examples: राम ने चावल खाया। Ram ate rice. Ram erg rice + nom eat + past. राम ने रावण को मारा। Ram killed Ravan. Ram erg Ravan acc kill + past.
