Shalini Gupta - 07305R02 Universal Networking Language
The Problem: Large explosion of data. Linguistic barriers (multilingualism): web content is mostly in English and cannot be accessed without some proficiency in that language. Although India accounts for a large part of the world's population, its proportion of Internet access is very low. There is a need for high-speed translation into different languages.
Solution: Machine Translation. Two approaches: (1) Transfer-based: works on specific pairs of languages, with some text analysis on the source language and some on the target language. (2) Interlingua-based: build a universal language, convert data to the universal language, and deconvert it back. For N languages, the interlingua approach needs only 2N converters, as opposed to N*(N-1) translators for the transfer-based approach.
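The 2N versus N*(N-1) comparison can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and count one system per translation direction.

```python
def transfer_systems(n: int) -> int:
    """Transfer-based: one translator per ordered (source, target) language pair."""
    return n * (n - 1)


def interlingua_systems(n: int) -> int:
    """Interlingua-based: one converter into and one out of the interlingua per language."""
    return 2 * n


if __name__ == "__main__":
    for n in (2, 3, 4, 10):
        print(n, transfer_systems(n), interlingua_systems(n))
```

The two counts are equal at N = 3; from N = 4 onward the interlingua approach needs fewer systems, and the gap grows quadratically.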
UNL: An Interlingua. A language-independent knowledge representation and a vehicle for machine translation. UNL solves the “Information Monopolies” problem. [Figure: Hindi, English, Chinese and French, each connected through the Interlingua (UNL).]
Outline: Introduction; UNL Components; Some Controversial Issues in UNL; Language Divergences between Hindi and English; Conclusion
Introduction to UNL: Proposed by the United Nations University. Enables computers to process information and knowledge across language barriers. Replicates the functions of natural languages in human communication. Enables distributing, receiving and understanding multilingual information. Represents information sentence by sentence.
UNL Graph: Each sentence is converted into a hypergraph, with concepts as nodes and relations as directed arcs. Concepts are called Universal Words (UWs). Word knowledge is represented by UWs, which are language independent. Conceptual knowledge is captured by relating UWs through relations.
Example: John eats rice with a spoon. [Figure: UNL graph of the sentence, with callouts marking a Universal Word, an Attribute, and the Semantic Relations.]
UNL Expression: John eats rice with a spoon. {unl} agt(eat(icl>do).@entry.@present, John(iof>person)) obj(eat(icl>do).@entry.@present, rice(icl>food)) ins(eat(icl>do).@entry.@present, spoon(icl>artifact).@indef) {/unl}
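To make the structure of such an expression concrete, here is a minimal Python sketch that splits one binary relation into its label, its two universal words, and their attributes. It assumes well-formed input with balanced parentheses; the function names are hypothetical and this is not the official UNL tooling.

```python
import re


def split_args(body: str) -> list:
    """Split on commas that are not nested inside parentheses."""
    args, depth, start = [], 0, 0
    for i, ch in enumerate(body):
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "," and depth == 0:
            args.append(body[start:i].strip())
            start = i + 1
    args.append(body[start:].strip())
    return args


def parse_node(text: str) -> tuple:
    """Separate a universal word from its trailing .@attributes."""
    attributes = re.findall(r"\.@(\w+)", text)
    universal_word = re.sub(r"\.@\w+", "", text)
    return universal_word, attributes


def parse_relation(line: str) -> tuple:
    """Parse 'rel(uw1, uw2)' into (rel, (uw1, attrs1), (uw2, attrs2))."""
    label, rest = line.strip().split("(", 1)
    if not rest.endswith(")"):
        raise ValueError("relation must end with a closing parenthesis")
    source, target = split_args(rest[:-1])
    return label, parse_node(source), parse_node(target)


if __name__ == "__main__":
    print(parse_relation("ins(eat(icl>do).@entry.@present, spoon(icl>artifact).@indef)"))
```

Each parsed relation is one directed, labelled arc of the UNL graph, so a list of such tuples is already an edge list for the sentence.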
Types of Universal Word: A UW is the syntactic and semantic unit of UNL. It represents a concept and forms a node in the graph of a UNL expression. Two classes: unit concepts (Basic UWs, Restricted UWs, Extra UWs) and compound concepts (scopes).
Types of Universal Words (UWs): Basic UWs are bare headwords with no constraint list, e.g. house, drink. Restricted UWs are headwords with a constraint list; each represents a more specific concept, or a subset of concepts.
Types of UWs (contd.): The constraint list restricts the range of the concept that a Basic UW represents, e.g. state(icl>country), state(icl>abstract thing). Extra UWs are a special type of Restricted UW. They denote concepts that are not present in English; foreign-language words are used as headwords, e.g. Bharatnatyam(icl>dance).
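The split between headword and constraint list can be sketched in Python as follows. The pattern and function names are assumptions made for illustration. The sketch only distinguishes Basic from Restricted UWs, since recognising an Extra UW would need a list of English headwords.

```python
import re

# Headword, optionally followed by a parenthesised constraint list.
UW_PATTERN = re.compile(r"^(?P<head>[^()]+?)(?:\((?P<constraints>.*)\))?$")


def parse_uw(uw: str) -> tuple:
    """Return (headword, [(relation, value), ...]) for a universal word."""
    match = UW_PATTERN.match(uw.strip())
    if match is None:
        raise ValueError(f"not a universal word: {uw!r}")
    head = match.group("head").strip()
    raw = match.group("constraints")
    constraints = []
    if raw:
        for item in raw.split(","):
            relation, _, value = item.strip().partition(">")
            constraints.append((relation, value))
    return head, constraints


def uw_kind(uw: str) -> str:
    """'basic' if there is no constraint list, otherwise 'restricted'."""
    return "restricted" if parse_uw(uw)[1] else "basic"


if __name__ == "__main__":
    for uw in ("house", "state(icl>country)", "state(icl>abstract thing)"):
        print(uw, parse_uw(uw), uw_kind(uw))
```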
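Compound Concepts: Raju said that [he had opened the window]. [Figure: UNL graph. say(icl>do).@entry.@past has an agt arc to Raju(iof>person) and an obj arc to the scope :01. Inside :01, open(icl>do).@entry.@past.@complete has an agt arc to he and an obj arc to window(icl>obj).]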
Compound Concepts (contd.): A compound concept is a set of binary relations that are grouped together and interpreted as a whole. It is expressed by a scope in UNL expressions. In "Raju said that [he had opened the window]", the part of the sentence within square brackets should be grouped: only when these relations are grouped together and considered as a whole unit can the correct interpretation be obtained.
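One way to picture a scope is as a label shared by the relations that belong together. The Python sketch below encodes the Raju example as plain tuples and groups them by scope. The tuple layout and function names are assumptions for illustration only, with the scope written as :01 where it appears as a node.

```python
from collections import defaultdict

# (relation, scope id or None for the top level, source node, target node)
RELATIONS = [
    ("agt", None, "say(icl>do).@entry.@past", "Raju(iof>person)"),
    ("obj", None, "say(icl>do).@entry.@past", ":01"),
    ("agt", "01", "open(icl>do).@entry.@past.@complete", "he"),
    ("obj", "01", "open(icl>do).@entry.@past.@complete", "window(icl>obj)"),
]


def group_by_scope(relations):
    """Collect the binary relations that make up each compound concept."""
    scopes = defaultdict(list)
    for label, scope, source, target in relations:
        scopes[scope].append((label, source, target))
    return dict(scopes)


def scope_references(relations):
    """Scope ids that are used as a single node (written ':01') in some relation."""
    return {
        node[1:]
        for _, _, source, target in relations
        for node in (source, target)
        if node.startswith(":")
    }


if __name__ == "__main__":
    print(group_by_scope(RELATIONS))
    print(scope_references(RELATIONS))
```

The top-level obj relation points at the whole group :01 rather than at open or window individually, which is what "interpreted as a whole" amounts to in the graph.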
Relations: A UNL relation is expressed as <relation>(<uw1>, <uw2>), where <relation> is one of the relations defined in UNL and <uw1>, <uw2> are universal words. E.g. John broke the window: agt(break(icl>do).@entry.@past, John(iof>person)) obj(break(icl>do).@entry.@past, window(icl>thing)). 41 such relations have been defined.
Attributes: Describe the subjectivity of the sentence and enrich the description given by UWs and relations. E.g. time with respect to the speaker: happened in the past: @past; happening at present: @present; will happen in the future: @future. John broke the window: agt(break(icl>do).@entry.@past, John(iof>person))
UNL Knowledge Base: Defines every possible relation between concepts. It has two important roles: it defines the semantics of Universal Words and it gives linguistic knowledge of concepts. E.g. "The anchor wrote the script": linguistic knowledge tells us that the anchor is a person; semantics tells us that only a person can write a script (the anchor of a ship cannot do so).
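The anchor example can be mimicked with a toy knowledge base in Python. Everything here is invented for illustration: the two anchor UWs, the tiny icl hierarchy, and the rule that the agent of write must be a person are assumptions, not entries from the real UNL Knowledge Base.

```python
# Hypothetical inclusion (icl) links: concept -> more general concept.
ICL = {
    "anchor(icl>person)": "person",
    "anchor(icl>device)": "device",
    "person": "thing",
    "device": "thing",
}

# Hypothetical semantic constraint: which class the agent of a verb must belong to.
AGENT_MUST_BE = {"write(icl>do)": "person"}


def is_a(concept, target_class):
    """Follow icl links upward until the class is found or the chain ends."""
    while concept is not None:
        if concept == target_class:
            return True
        concept = ICL.get(concept)
    return False


def can_be_agent(verb, concept):
    """True if the concept satisfies the verb's agent constraint (or none is recorded)."""
    required = AGENT_MUST_BE.get(verb)
    return required is None or is_a(concept, required)


if __name__ == "__main__":
    for sense in ("anchor(icl>person)", "anchor(icl>device)"):
        print(sense, can_be_agent("write(icl>do)", sense))
```

With such constraints, only the person sense of anchor survives as the agent of write, which is how the knowledge base helps pick the right universal word.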
Controversial Issues: A meaning representation language should provide sufficient means to express knowledge and should be simple. The main expressive device of UNL is restrictions. New expressive means for describing UWs have been proposed.
Semantic Restriction: The UW operator(icl>thing) does not effectively separate its two meanings: long-distance operator, operator(icl>human), and addition operator, operator(icl>abstract thing). Hypernymy and meronymy are mostly used for expressing restrictions; synonymy and antonymy can also be used, e.g. wealth(equ>richness), poor(ant>rich).
Argument Frame Restriction: X borrows Y from Z for W. All four arguments are needed to define the action of borrowing completely. Example: "John borrowed $10000 for 3 years" versus "John has been borrowing money for 3 years". In the first sentence "for 3 years" fills the argument W (the term of the loan); in the second it is a non-argument link giving the duration of the activity. UNL, as a meaning representation language, should be able to draw a distinction between the argument and non-argument links of predicates.
Weakly Differentiated Relations: Some relations seem to be weakly differentiated and are therefore difficult to use consistently, e.g. gol (final state) vs. plt (final place), and src (initial state) vs. plf (initial place). "John went to Brussels" can be described with either gol or plt. The difference is that gol characterizes Brussels as the final state of John, while plt characterizes it as the final place of the whole event.
Redundant Relations: Some relations seem to be based more on the semantic class of the UWs, e.g. mod (modification) vs. man (manner). The difference between them boils down to the semantic class of the starting point of the relation: "answered politely" (man) [to answer]; "a polite answer" (mod) [an answer]. The relations 'man' and 'mod' can be merged.
Divergences between English and Hindi: Constituent Order Divergence: Jim (S) is playing (V) tennis (O) / जिम (S) टैनिस (O) खेल रहा है (V). Adjunction Divergence: The [living in Delhi] boy / दिल्ली में रहनेवाला लड़का. Preposition-Stranding Divergence: Which shop did John go to? / किस दुकान जौन गया में
Divergences (contd.): Null Subject Divergence: जा रहा हूं (going-am). Pleonastic Divergence: It is raining / यह बारिश हो रही है. Conflational Divergence: Jim stabbed him / जिम उसको छुरे से मारा. Promotional Divergence: The play is on / खेल चल रहा है
Conclusion: UNL is an Interlingua for Machine Translation. Studied: components of UNL, controversial issues in UNL, and divergences between English and Hindi.
References: Igor Boguslavsky. Some controversial issues of UNL: linguistic aspects. 2004. Shachi Dave and Pushpak Bhattacharyya. Knowledge extraction from Hindi text. 2001. Shachi Dave, Jignashu Parikh, and Pushpak Bhattacharyya. Interlingua-based English-Hindi machine translation and language divergence. Machine Translation, 16(4):251–304, 2001.
References (contd.): The Universal Networking Language manual. www.undl.org, 2006. H. Uchida and M. Zhu. The Universal Networking Language (UNL) specifications. Technical Report, 2005.
Knowledge Extraction from Hindi Text: EnConverter is a language-independent parser that provides a framework for analysis. A lexicon and analysis rules must be supplied. Analysis Rule: (<PRE>)... <LNODE> <RNODE> (<SUF1>) (<SUF2>) (<SUF3>)... <PRI> Lexicon Entry: [HW] {ID} "UW" (ATTRIB1, ATTRIB2, ...) <FLG,FRE,PRI>;
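As a rough illustration of the lexicon entry layout, the following Python sketch reads one entry of the form [HW] {ID} "UW" (ATTRIB1, ATTRIB2, ...) <FLG,FRE,PRI>; with a regular expression. The sample entry, including its headword, attributes and flag values, is invented for this example, and the parser is an assumption about the format rather than part of EnConverter.

```python
import re

ENTRY = re.compile(
    r'\[(?P<hw>[^\]]*)\]\s*'        # [HW]  headword
    r'\{(?P<id>[^}]*)\}\s*'         # {ID}  identifier, may be empty
    r'"(?P<uw>[^"]*)"\s*'           # "UW"  universal word
    r'\((?P<attrs>[^)]*)\)\s*'      # (ATTRIB1, ATTRIB2, ...)
    r'<(?P<flg>[^,>]*),(?P<fre>[^,>]*),(?P<pri>[^,>]*)>\s*;'  # <FLG,FRE,PRI>;
)


def parse_entry(line: str) -> dict:
    """Parse one lexicon entry into a dictionary of its fields."""
    match = ENTRY.fullmatch(line.strip())
    if match is None:
        raise ValueError(f"not a lexicon entry: {line!r}")
    fields = match.groupdict()
    fields["attrs"] = [a.strip() for a in fields["attrs"].split(",") if a.strip()]
    return fields


if __name__ == "__main__":
    # Hypothetical entry, made up for illustration.
    print(parse_entry('[ghar] {} "house(icl>place)" (N, INANI) <H,0,0>;'))
```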
Knowledge Extraction from Hindi Text (contd.): Each step: Morphological Analysis, Decision, Relation, Lexical Attribute, UNL Attribute
Verbal Concepts: Classes of predicates: actions (have an active initiator, e.g. kill); activities (a set of heterogeneous actions with a common goal, e.g. trade); events (have no agent, e.g. the bridge broke); processes (denote a situation that occupies a certain time span, e.g. the tree grows); states (homogeneous, do not denote a change, e.g. hear, ache)
Classes of predicates (contd.): properties (differ from states in that they are atemporal, e.g. blind, red); relations (specify a relation between two or more things, e.g. love, hate). In UNL, all verbal concepts are grouped into three classes: (icl>do) contains actions and activities; (icl>occur) consists of events and processes; (icl>be) is composed of states, properties and relations.
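The grouping of the seven predicate classes into three UNL restrictions is a simple lookup, sketched here in Python. The dictionary keys and the helper name are chosen for this example only.

```python
# Seven predicate classes mapped onto the three UNL verbal restrictions.
PREDICATE_CLASS_TO_RESTRICTION = {
    "action": "icl>do",
    "activity": "icl>do",
    "event": "icl>occur",
    "process": "icl>occur",
    "state": "icl>be",
    "property": "icl>be",
    "relation": "icl>be",
}


def restricted_uw(headword: str, predicate_class: str) -> str:
    """Build a restricted UW such as 'kill(icl>do)' from a headword and its class."""
    return f"{headword}({PREDICATE_CLASS_TO_RESTRICTION[predicate_class]})"


if __name__ == "__main__":
    for headword, cls in (("kill", "action"), ("grow", "process"), ("ache", "state")):
        print(restricted_uw(headword, cls))
```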
Adjectival Concepts: All adjectival concepts are divided into two classes: predicative (aoj>thing) and restrictive (mod>thing). This does not work well in some situations, e.g. "Wise Greeks diluted wine with water". Restrictive interpretation: ‘Those Greeks who were wise diluted wine with water. Silly ones didn’t’. Non-restrictive (qualificative) interpretation: ‘Greeks were wise. They diluted wine with water’. The distinction at issue is restrictive vs. qualificative.
The restrictive vs. qualificative distinction should be applied to other modifiers also: "The students sitting in the corner are waiting for the professor." "The students(,) who are sitting in the corner(,) are waiting for the professor." "The students in the corner are waiting for the professor." The phrase 'who are sitting' can be restrictive (‘those of the students who are sitting in the corner are waiting for the professor; others are not’) or non-restrictive (‘the students are waiting for the professor; they are sitting in the corner’).