The NESPOLE Interchange Format (IF). Lori Levin, Emanuele Pianta, Donna Gates, Kay Peterson, Dorcas Wallace, Herve Blanchon, Roldano Cattoni, Jean-Philippe Gibaud, Chad Langley, Alon Lavie, Nadia Mana, Fabio Pianesi.
Outline • Approaches to MT: Interlingua, Transfer, Direct. • The NESPOLE! Interlingua. • Overview and motivation • Linguistic coverage • Tools and resources. • Evaluating an interlingua • Coverage: how do we measure coverage of the domain? • Reliability: an analyzer written by one person in Italy must work with a generator written by someone in Korea whom they have never met. • Scalability: moving to broader semantic domains without a constant increase in the amount of work.
What is an interlingua? • Representation of meaning or speaker intention. • Sentences that are equivalent for the translation task have the same interlingua representation. The room costs 100 Euros per night. The room is 100 Euros per night. The price of the room is 100 Euros per night.
Vauquois MT Triangle
• Interlingua: give-information+personal-data (name=alex_waibel)
• Transfer: [s [vp accusative_pronoun “chiamare” proper_name]] → [s [np [possessive_pronoun “name”]] [vp “be” proper_name]]
• Direct: Mi chiamo Alex Waibel → My name is Alex Waibel.
Other Approaches to Machine Translation • Direct: • Very little analysis of the source language. • Transfer: • Analysis of the source language. • The structure of the source language input may not be the same as the structure of the target language sentence. • Transfer rules relate source language structures to target language structures.
Note • Some transfer systems may produce a more detailed meaning representation than some interlingua systems. • The difference is whether translation equivalents in the source and target languages are related by a single canonical representation.
Multilingual Translation with an Interlingua
• Analyzers: Chinese, French, Italian, English, German, Japanese, Catalan, Korean, Spanish, Arabic
• Chinese (input sentence): San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4
• Interlingua: give-information+onset+body-state (body-state-spec=pain, time=(interval=3d, relative=before))
• Generators: Arabic, Spanish, Catalan, Korean, Chinese, French, Italian, Japanese, English, German
• Chinese (paraphrase): wo3 yi3 jin1 tong4 le4 san1 tian1
• English (output sentence): The pain started three days ago.
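The analyzer/generator split on this slide can be sketched in a few lines of Python. This is only an illustration: the lookup tables stand in for real analysis and generation grammars, and the names `ANALYZERS`, `GENERATORS`, `translate`, and `paraphrase` are hypothetical, not part of NESPOLE. The sentences and the interlingua string are the ones shown on the slide.

```python
# Interlingua string copied from the slide.
PAIN_IF = ("give-information+onset+body-state "
           "(body-state-spec=pain, time=(interval=3d, relative=before))")

# Toy analyzers: source sentence -> interlingua (a real analyzer is a grammar).
ANALYZERS = {
    "chinese": {"San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4": PAIN_IF},
}

# Toy generators: interlingua -> target sentence (a real generator is a grammar).
GENERATORS = {
    "english": {PAIN_IF: "The pain started three days ago."},
    "chinese": {PAIN_IF: "wo3 yi3 jin1 tong4 le4 san1 tian1"},
}


def translate(text, source, target):
    """Analyze into the interlingua, then generate from it."""
    interlingua = ANALYZERS[source][text]
    return GENERATORS[target][interlingua]


def paraphrase(text, language):
    """Generate back into the source language so the user can confirm the meaning."""
    return translate(text, language, language)


chinese_input = "San1 tian1 qian2, wo3 kai1 shi3 jue2 de2 tong4"
print(translate(chinese_input, "chinese", "english"))
print(paraphrase(chinese_input, "chinese"))
```

Note that no component in the sketch mentions a language pair: each analyzer and generator knows only its own language and the interlingua.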
Multilingual translation with transfer • Transfer-rules-1: Arabic-Catalan • Transfer-rules-2: Catalan-Arabic • Transfer-rules-3: Arabic-Chinese • Transfer-rules-4: Chinese-Arabic • Transfer-rules-5: Arabic-English • Transfer-rules-6: English-Arabic • Etc.
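The list above grows quadratically. As a rough count, assuming one transfer rule set per ordered language pair and, for an interlingua system, one analyzer plus one generator per language, the ten languages in the interlingua diagram compare as follows (function names are illustrative only):

```python
LANGUAGES = ["Arabic", "Catalan", "Chinese", "English", "French",
             "German", "Italian", "Japanese", "Korean", "Spanish"]


def transfer_rule_sets(n):
    """One rule set per ordered (source, target) pair."""
    return n * (n - 1)


def interlingua_components(n):
    """One analyzer and one generator per language."""
    return 2 * n


n = len(LANGUAGES)
print(n, transfer_rule_sets(n), interlingua_components(n))
# Cost of adding an eleventh language under each design:
print(transfer_rule_sets(n + 1) - transfer_rule_sets(n),
      interlingua_components(n + 1) - interlingua_components(n))
```

Under these assumptions, ten languages need 90 transfer rule sets but only 20 interlingua components, and an eleventh language adds 20 rule sets versus 2 components.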
Advantages of Interlingua • Add a new language easily • get all-ways translation to all previous languages by adding one grammar for analysis and one grammar for generation • Mono-lingual development teams. • Paraphrase • Generate a new source language sentence from the interlingua so that the user can confirm the meaning
Disadvantages of Interlingua • “Meaning” is arbitrarily deep. • What level of detail do you stop at? • If it is too simple, meaning will be lost in translation. • If it is too complex, analysis and generation will be too difficult. • Should be applicable to all languages. • Human development time.
Interlingual MT Systems • University of Maryland – Lexical Conceptual Structure (Dorr) • Carnegie Mellon • Kantoo (Mitamura and Nyberg) • Nespole/C-STAR (Waibel, Levin, Lavie) • UNL (Universal Networking Language) • Mikrokosmos (Nirenburg) • Verbmobil – Domain actions (Block)
A Travel Dialogue Translated from Italian
A: Albergo Gabbia D’Oro. Good evening.
B: My name is Anna Maria DeGasperi. I’m calling from Rome. I wish to book two single rooms.
A: Yes.
B: From Monday to Friday the 18th, I’m sorry, to Monday the 21st.
A: Friday the 18th of June.
B: The 18th of July. I’m sorry.
A: Friday the 18th of July to, you were saying, Sunday.
B: No. Through Monday the 21st.
A Travel Dialogue (Continued)
B: So with departure on Tuesday the 22nd.
A: Then leaving on the 22nd. Yes. We have two singles certainly.
B: Yes.
A: Would you like breakfast?
B: Is it possible to have all meals?
A: No. We serve meals only in the evening.
B: Ok. If you can do breakfast and dinner.
A: Ok.
B: Do you need a deposit?
A Travel Dialogue (Continued)
A: You can give me your credit card number.
B: Ok. Just a moment. Ok. My name is Anna Maria DeGasperi. The card is 005792005792.
A: Good.
B: Expiration 2002.
A: 2002. Good. Thank you. We need a confirmation on the 18th of July before 6pm.
B: Goodbye.
A: Thanks. Goodbye.
B: Thanks. Goodbye.
A Non-Task-Oriented Dialogue (We can’t translate this.)
A: Are you cooking?
B: My father is cooking. I’m cleaning. I just finished cleaning the bathroom.
A: Look. What do you know about Monica?
B: I don’t know anything. Look. I don’t know anything.
A: You don’t know anything? I wrote her three weeks ago, but if she hasn’t received the letter, they would have returned it. I hope she received it.
B: Because Celia told me that the address that Monica had given us was wrong. She said that if I was going to write to her, well, ….
From the Spanish CallHome corpus: unlimited conversation between family members.
The Ideal MT System … • Fully automatic • High quality • Domain independent (any topic) … isn’t within the current state of the art.
Design Principles of the Interchange Format • Suitable for task oriented dialogue • Based on speaker’s intent, not literal meaning • Can you pass the salt is represented only as a request for the hearer to perform an action, not as a question about the hearer’s ability. • Abstract away from the peculiarities of any particular language • resolve translation mismatches.
Translation Mismatches • Sentences that are translation-equivalents in two languages do not have the same syntactic structure or predicate-argument structure. (Unitrans; Eurotra) • I like to swim. • I swam across the river. • Sue met with Sam/Sue met Sam.
Design Principles (continued) • Domain independent framework with domain-specific parts • Simple and reliable enough to use: • at multiple research sites with high intercoder agreement. • with widely varying types of parsers and generators. • Allow robust language engines • Underspecification must be possible. • Fragments must be represented.
Speech Acts: Speaker intention vs literal meaning • Can you pass the salt? • Literal meaning: The speaker asks for information about the hearer’s ability. • Speaker intention: The speaker requests the hearer to perform an action.
Remember this term: Domain Action
Domain Actions: Extended, Domain-Specific Speech Acts
• give-information+existence+body-state
  It hurts.
• give-information+onset+body-object
  The rash started three days ago.
• request-information+availability+room
  Are there any rooms available?
• request-information+personal-data
  What is your name?
Domain Actions: Extended, Domain-Specific Speech Acts • In domain: • I sprained my ankle yesterday. • When did the headache start? • Out of domain: • Yesterday I slipped in the driveway on my way to the garage. • The headache started after my boss noticed that I deleted the file.
Formulaic Utterances • Good night. • tisbaH cala xEr: “waking up on good” • Romanization of Arabic from CallHome Egypt
Same intention, different syntax
• rigly bitiwgacny: “my leg hurts”
• candy wagac fE rigly: “I have pain in my leg”
• rigly bitiClimny: “my leg hurts”
• fE wagac fE rigly: “there is pain in my leg”
• rigly bitinqaH calya: “my leg bothers on me”
Romanization of Arabic from CallHome Egypt.
Language Neutrality
• Comes from representing speaker intention rather than literal meaning for formulaic and task-oriented sentences.
• How about …: suggestion
• Why don’t you …: suggestion
• Could you tell me …: request info.
• I was wondering …: request info.
Domain Action Interlingua and Lexical Semantic Interlingua
• Utterance: and how will you be paying for this
• Domain Action representation:
  a:request-information+payment (method=question)
• Lexical Semantic representation:
  predicate: pay
  time: future
  agent: hearer
  product: distance: proximate, type: demonstrative
  manner: question
Complementary Approaches • Domain actions: limited to task oriented sentences • Lexical semantics: less appropriate for formulaic speech acts that should not be translated literally
Components of the Interchange Format
• speaker: a: (agent)
• speech act: give-information
• concept*: +availability+room
• argument*: (room-type=(single & double), time=md12)
• Full representation: a:give-information+availability+room (room-type=(single & double), time=md12)
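To make the four components concrete, here is a minimal Python sketch that splits an IF string into speaker, speech act, concepts, and the raw argument list. It is an assumption-laden illustration, not the NESPOLE tooling: the class name `InterchangeFormat` and the function `parse_if` are hypothetical, and the argument list is kept as an unparsed string.

```python
from dataclasses import dataclass


@dataclass
class InterchangeFormat:
    speaker: str      # e.g. "a" (agent) or "c"
    speech_act: str   # e.g. "give-information"
    concepts: list    # zero or more concepts, e.g. ["availability", "room"]
    arguments: str    # raw text between the outer parentheses, may be empty


def parse_if(text):
    """Split 'speaker:speech-act+concept* (argument*)' into its parts."""
    speaker, rest = text.strip().split(":", 1)
    # Everything before the first "(" is the domain action.
    domain_action, _, arg_text = rest.partition("(")
    speech_act, *concepts = domain_action.strip().split("+")
    arguments = arg_text.strip()
    if arguments.endswith(")"):
        arguments = arguments[:-1]  # drop the closing outer parenthesis
    return InterchangeFormat(speaker.strip(), speech_act, concepts, arguments)


example = parse_if(
    "a:give-information+availability+room (room-type=(single & double), time=md12)"
)
print(example)
```

A speech act on its own, such as `c:negate`, parses to an empty concept list and an empty argument string.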
Components of IF as of February 2002
• 61 speech acts (e.g. give-information): domain independent; 20 are dialog managing
• 108 concepts (e.g. availability, accommodation): mostly domain dependent
• 304 arguments (e.g. room-type, time): domain dependent and independent
• 7,652 values (e.g. single, double, 12th)
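Because the inventories are closed lists, a domain action can be checked mechanically against them. The sketch below is a hedged illustration: `SPEECH_ACTS` and `CONCEPTS` hold only a small sample taken from examples in these slides, not the full 61 speech acts and 108 concepts, and `check_domain_action` is a hypothetical helper. It also ignores any constraints on which concepts may combine.

```python
# Small samples drawn from the slide examples; the real inventories are much larger.
SPEECH_ACTS = {
    "give-information", "request-information", "affirm", "negate",
    "thank", "offer", "suggest", "request-suggestion",
}
CONCEPTS = {
    "availability", "room", "payment", "personal-data", "onset",
    "existence", "body-state", "body-object", "help", "disposition", "arrival",
}


def check_domain_action(domain_action):
    """Return a list of problems; an empty list means every part is known."""
    speech_act, *concepts = domain_action.split("+")
    problems = []
    if speech_act not in SPEECH_ACTS:
        problems.append(f"unknown speech act: {speech_act}")
    for concept in concepts:
        if concept not in CONCEPTS:
            problems.append(f"unknown concept: {concept}")
    return problems


print(check_domain_action("give-information+availability+room"))
print(check_domain_action("give-information+weather"))
```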
Examples
• no that’s not necessary
  c:negate
• yes I am
  c:affirm
• my name is alex waibel
  c:give-information+personal-data (person-name=(given-name=alex, family-name=waibel))
• and how will you be paying for this
  a:request-information+payment (method=question)
• I have a mastercard
  c:give-information+payment (method=mastercard)
Conventional Speech Acts
• thank you.
  c:thank
• can I help you?
  a:offer+help (who=i, to-whom=you)
• <uh> my name is Chad
  c:give-information+personal-data (person-name=(given-name=chad))
Fragments: ellipsis
• <B> and <uh> <hm> in a restaurant.
  a:give-information+concept (conjunction=discourse, location=(restaurant, identifiability=no))
• <uh> which town?
  c:request-information+concept (concept-spec=(town, identifiability=question))
Fragments: abandoned
• You should
  a:suggest+concept (who=you)
• What should I
  c:request-suggestion+concept (who=I)
Coordination of Sentences
I want to go to France and I would prefer to leave today.
c:give-information+disposition+trip (destination=(object-name=france), disposition=(who=i, desire))
c:give-information+disposition+departure (conjunction=discourse, time=(relative-time=today), disposition=(who=i, preference))
Coordination of sentences, reduced
I want to leave Pittsburgh at 2 and return from Rome at 5.
c:give-information+disposition+departure (conjunction=discourse, origin=(object-name=pittsburgh), disposition=(who=i, desire), time=(clock=(hours=2)))
c:give-information+trip (conjunction=discourse, factuality=unspecified, trip-spec=return, origin=(object-name=rome), time=(clock=(hours=5)))
Conjunctive Set
I like festivals and plays.
c:give-information+disposition+event (... event-spec=(operator=conjunct, [(festival, quantity=plural), (play, quantity=plural)]))
Conjunction of modifiers
I prefer red and blue cars.
c:give-information+disposition+vehicle (... vehicle-spec=(car, quantity=plural, color=(operator=conjunct, [red, blue])))
Disjunctive Sets
I prefer hotels or cabins.
c:give-information+disposition+accommodation (... accommodation-spec=(operator=disjunct, [(hotel, quantity=plural), (cabin, quantity=plural)]))
Contrastive Set
I like hotels but not cabins.
c:give-information+disposition+accommodation (... accommodation-spec=(operator=contrast, [(hotel, quantity=plural), (polarity=negative, cabin, quantity=plural)]))
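The conjunctive, disjunctive, and contrastive sets share one shape: an operator followed by a bracketed list of members. A minimal Python sketch of that shape follows. The helper names `render_member` and `render_set` are hypothetical, and only the three operators shown on these slides are accepted. It renders just the set value, not the surrounding arguments that the slides elide with "...".

```python
OPERATORS = {"conjunct", "disjunct", "contrast"}  # the three operators on the slides


def render_member(member):
    """A member is either an atom ('red') or a list of parts to parenthesize."""
    if isinstance(member, str):
        return member
    return "(" + ", ".join(member) + ")"


def render_set(operator, members):
    """Render '(operator=..., [member, member, ...])'."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown set operator: {operator}")
    inner = ", ".join(render_member(m) for m in members)
    return f"(operator={operator}, [{inner}])"


# "red and blue" from the modifier example
print(render_set("conjunct", ["red", "blue"]))
# "hotels but not cabins" from the contrastive example
print(render_set("contrast", [
    ["hotel", "quantity=plural"],
    ["polarity=negative", "cabin", "quantity=plural"],
]))
```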
Attitudes: often a source of mismatches • Disposition • Eventuality • Evidentiality • Feasibility • Knowledge • Obligation • Main verbs in English that occur in other languages as affixes, adverbs, or other constructions that are not clearly bi-clausal.
Disposition
<uhm> <P> and I would like to arrive <P> around September ninth.
c:give-information+disposition+arrival
  (disposition=(who=i, desire), /* attitude */
   conjunction=discourse, /* rhetorical information */
   time=(exactness=approximate, month=9, md=9)) /* time */
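Arguments nest, as the disposition and time values above show. The following Python sketch turns such an argument list into nested dictionaries after removing the `/* ... */` annotations. It is an illustration under stated assumptions: the function names are hypothetical, unnamed values such as `desire` are collected under a made-up `"bare"` key, and bracketed sets are not interpreted.

```python
import re


def strip_comments(text):
    """Remove /* ... */ annotations like the ones on the slide."""
    return re.sub(r"/\*.*?\*/", "", text, flags=re.S)


def split_top_level(text):
    """Split on commas that are not inside parentheses or brackets."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_value(text):
    """Parse '(a=b, c)' into {'a': 'b', 'bare': ['c']}; atoms stay strings."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return text
    result = {}
    for item in split_top_level(text[1:-1]):
        name, eq, value = item.partition("=")
        if eq and not item.startswith(("(", "[")):
            result[name.strip()] = parse_value(value)
        else:
            result.setdefault("bare", []).append(item)  # unnamed value
    return result


raw = ("(disposition=(who=i, desire), /* attitude */ "
       "conjunction=discourse, /* rhetorical information */ "
       "time=(exactness=approximate, month=9, md=9)) /* time */")
args = parse_value(strip_comments(raw))
print(args)
```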
Disposition
• I would like to stay in a hotel.
  disposition=desire
• I hate mushroom picking.
  disposition=dislike
• I am waiting to see the circle.
  disposition=expectation
• But wouldn’t matter.
  disposition=indifferent
• When do you plan on arriving in Pittsburgh?
  disposition=intention
Eventuality
• It is possible I may be arriving earlier.
  give-information+eventuality+arrival (eventuality=possible)
• I’m sure that they will arrive tomorrow.
• Maybe there is something beautiful to see.
• It is not impossible.
Evidentiality: Source of information
• Apparently there are many castles.
  give-information+evidentiality+attraction
• I heard there are many castles.
• I noticed there is a winter package available.
• I’ve been told I must leave before ten.
Feasibility
• You can rent skis at the resort.
  give-information+feasibility+rent+equipment (feasibility=feasible….)
Knowledge
• I didn’t know that Trento has lakes.
  give-information+negation+knowledge+contain+attraction (knowledge=(who=I, polarity=negative), contain=(lake, quantity=plural), attraction-spec=name-trento)
• I know the location of the hotel.