Odia Group’s Progress Report (up to 30th April 2013). Presented by Panchanan Mohanty, University of Hyderabad.
Status of Synset Linking:
• Total Linked Synsets: 36302
    Total Linked Noun Synsets: 27216
    Total Linked Adverb Synsets: 452
    Total Linked Adjective Synsets: 5827
    Total Linked Verb Synsets: 2807
• Total Unique Words found = 53754
Status of Synset Validation:
• Total Validated Synsets: 35284
    Total Noun Synsets validated: 27216
    Total Adverb Synsets validated: 377
    Total Adjective Synsets validated: 5273
    Total Verb Synsets validated: 2418
• Total Unique Words found = 53916
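The linking and validation figures above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the group's tooling: the dictionary names and the `coverage` helper are illustrative assumptions, and only the counts are taken from the two status slides.

```python
# Counts copied from the linking and validation slides.
linked = {"noun": 27216, "adverb": 452, "adjective": 5827, "verb": 2807}
validated = {"noun": 27216, "adverb": 377, "adjective": 5273, "verb": 2418}


def coverage(linked_counts, validated_counts):
    """Return validated/linked as a percentage for each category."""
    return {
        cat: 100.0 * validated_counts[cat] / linked_counts[cat]
        for cat in linked_counts
    }


# The per-category counts should add up to the reported totals.
total_linked = sum(linked.values())
total_validated = sum(validated.values())

for cat, pct in coverage(linked, validated).items():
    print(f"{cat:<10} {validated[cat]:>6} / {linked[cat]:>6}  ({pct:.1f}%)")
print(f"{'total':<10} {total_validated:>6} / {total_linked:>6}")
```

The per-category counts sum to the reported totals (36302 linked, 35284 validated), and all linked noun synsets have been validated.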
Sense Marking Status as on 29th April 2013:
• Total files used = 159
    Sambad newspaper corpus: 147 files
    Articles from Odisha Sahitya Academy: 8 files
    Other articles: 4 files
• Total words = 197935
• Words found or sense-marked words = 101715
• Sense-marked words percentage = 51.39%
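The reported percentage follows directly from the two word counts. As a minimal sketch (the variable and function names here are illustrative assumptions, not the group's scripts), the figure and the file total can be recomputed as follows:

```python
# Figures copied from the sense-marking slide.
total_words = 197935
sense_marked = 101715
files = {
    "Sambad newspaper corpus": 147,
    "Odisha Sahitya Academy articles": 8,
    "Other articles": 4,
}


def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


sense_marked_pct = percentage(sense_marked, total_words)
total_files = sum(files.values())

print(f"Sense-marked: {sense_marked_pct}% of {total_words} words")
print(f"Files used: {total_files}")
```

Both values agree with the slide: 51.39% of the words are sense-marked, and the three sources add up to 159 files.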
Sense-Marking Status…
Added Synsets during Sense-Marking:
• Total number of added words: 1361
• Noun=804, Verb=40, Adjective=394, Adverb=123
• Examples:
    Noun: କଣ୍ଠଶିଳ୍ପୀ-4796, ଜାକଜମକ-9208
    Verb: ଖୋଳି_ଦେବା-4699, ପାସ_କରିବା-4620
    Adjective: ୫୯ଟି-8564, ୬୫ଜଣ-9488
    Adverb: ସମ୍ପୂର୍ଣ୍ଣଭାବେ-6468, ଏକାଠି-28602
In ID 4796, the Odia word କଣ୍ଠଶିଳ୍ପୀ (kaNThasiLpi:) is an appropriate synonym of ଗାୟକ (ga:yaka), the Odia equivalent of Hindi गायक (ga:yak). In ID 9208, the Odia word ଜାକଜମକ (ja:kajamaka) has likewise been found to be an appropriate synonym of ଧୂମଧାମ (dhu:mdha:m), the Odia equivalent of Hindi धूम-धाम (dhu:mdha:m). The new words are marked in red.
In ID 4620, the Odia word ପାସ_କରିବା (pa:skariba:) has also been found to be an appropriate synonym of ଉତ୍ତୀର୍ଣ୍ଣ_ହେବା (utti:rNNaheba:), the Odia equivalent of Hindi उत्तीर्ण होना (utti:rNNahona:). The new words are marked in red.
In ID 8564, the Odia words ୫୯ଟି (aNasaThiTi) and ୫୯ଜଣ (aNasaThijaNa) are appropriate synonyms of ଅଣଷଠି (aNasaThi), the Odia equivalent of Hindi उनसठ (unsaTh). The word ୫୯ଟି (aNasaThiTi) is formed by adding ଟି (Ti) to the adjective ୫୯ (aNasaThi) and is used for things and animals, whereas the word ୫୯ଜଣ (aNasaThijaNa) is formed by adding ଜଣ (jaNa) to the same adjective and is used for human beings. Likewise, the words ୬୫ଟି (pancasaThiTi) and ୬୫ଜଣ (pancasaThijaNa) have been added to ID 9488.
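The formation rule described here, numeral plus ଟି (Ti) for things and animals and numeral plus ଜଣ (jaNa) for human beings, can be written as a small helper. This is only an illustrative sketch: the function name and the `non_human`/`human` labels are assumptions, and it simply concatenates the suffix without handling any other morphology.

```python
# Classifier suffixes as described in the text:
# ଟି (Ti) for things and animals, ଜଣ (jaNa) for human beings.
CLASSIFIERS = {"non_human": "ଟି", "human": "ଜଣ"}


def classified_forms(numeral):
    """Return the numeral with each classifier suffix attached."""
    return {kind: numeral + suffix for kind, suffix in CLASSIFIERS.items()}


for numeral in ("୫୯", "୬୫"):
    forms = classified_forms(numeral)
    print(numeral, "->", forms["non_human"], "/", forms["human"])
```

For ୫୯ this yields ୫୯ଟି and ୫୯ଜଣ, the two forms added to ID 8564.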
In ID 6468, the Odia word ସମ୍ପୂର୍ଣ୍ଣଭାବେ (sampu:rNNabha:be) is an appropriate synonym of ସମ୍ପୂର୍ଣ୍ଣ (sampu:rNNa), the Odia equivalent of Hindi बिल्कुल (bilkul). In ID 28602, the Odia word ଏକାଠି (eka:Thi) has also been found to be an appropriate synonym of ଏକାସାଙ୍ଗରେ (eka: sa:ngare), the Odia equivalent of Hindi साथ-साथ (sa:thsa:th). The new words are marked in red.
Deleted Synsets during Sense-Marking:
• Total number of deleted words: 109
• Examples:
    ବରଫି, ହିମଯୁକ୍ତ-3650
    ଆବାଦୀ, ବସ୍ତି, ପଡ଼ା-5483
In ID 3650, the words ବରଫି and ହିମଯୁକ୍ତ have been deleted. The word ବରଫି (barafi) is the name of ‘a type of sweet’, and the word ହିମଯୁକ୍ତ (himajukta) means ‘added with snow/ice’. Neither gives the same meaning as the words ବରଫାବୃତ, ବରଫାଚ୍ଛନ୍ନ, ହିମାବୃତ, ହିମାଚ୍ଛନ୍ନ (barafa:bruta, barafa:chana, hima:bruta, hima:chana), which mean ‘covered with snow/ice’, so both words have been deleted. In ID 5483, the word ଆବାଦୀ (a:ba:di:) means ‘cultivable’ in Odia, and the words ବସ୍ତି (basti) and ପଡ଼ା (paDa:) mean ‘hamlet’. These words give the same meaning as the Hindi word janasaMkhya:, but in Odia they do not mean the same as ଜନସଂଖ୍ୟା (janasaMkhya:) and ଲୋକସଂଖ୍ୟା (lokasaMkhya:). So these words have been deleted.
Antonym Tool Report:
• Antonyms have been checked using the Antonym Creation Tool developed by Thapar University, and the related issues have been discussed.
Example 1:
• ID-1471, CAT-NOUN
• CONCEPT:: एक पौधे का बीज जिससे तेल निकलता है
• EXAMPLE:: "वह प्रतिदिन नहाने के बाद तिल का तेल लगाता है"
• SYNSET-HINDI:: तिल,पूतधान्य,साराल
• CONCEPT:: ଏକଛୋଟ ଗଛର ମଞ୍ଜି ଯେଉଁଥିରୁ ତେଲ ବାହାରେ
• EXAMPLE:: "ସେ ସବୁଦିନେ ଗାଧୋଇସାରି ରାଶି ତେଲ ଲଗାଏ"
• SYNSET-ORIYA:: ରାଶି, ଖସା, ତିଳ
Antonym ID:
• ID-2629, CAT-NOUN
• CONCEPT:: वह स्थान जहाँ दवाएँ मिलती या बिकती हों
• EXAMPLE:: "निषिद्ध दवा बेंचने के कारण यह औषधालय बंद हो गया"
• SYNSET-HINDI:: औषधालय,दवाख़ाना,दवाखाना,औषध दुकान,दवाघर
• CONCEPT:: ଯେଉଁ ସ୍ଥାନରେ ଔଷଧ ମିଳେ ବା ବିକ୍ରି ହୁଏ
• EXAMPLE:: "ନିଷିଦ୍ଧ ଔଷଧ ବିକ୍ରିକରିବା କାରଣରୁ ସେହି ଔଷଧାଳୟ ବନ୍ଦ ହୋଇଗଲା"
• SYNSET-ORIYA:: ଔଷଧାଳୟ, ଔଷଧଦୋକାନ, ଓଷଦ ଦୋକାନ
Discussion: In this example, the Hindi words til (तिल), pu:tdha:ny (पूतधान्य), and sa:ra:l (साराल), which mean ‘gingelly oil plant and seed’, and aushadha:lay (औषधालय), dava:xa:na: (दवाख़ाना), dava:kha:na: (दवाखाना), aushadhduka:n (औषध दुकान), and dava:ghar (दवाघर), which mean ‘a medicine store’, are mentioned as antonymous. Their Odia equivalents ra:si (ରାଶି), tiLa (ତିଳ), and khasa: (ଖସା) ‘gingelly oil plant and seed’ and ausadha:Laya (ଔଷଧାଳୟ), ausadhadoka:na (ଔଷଧଦୋକାନ), and osadadoka:na (ଓଷଦ ଦୋକାନ) ‘medicine store’ cannot be accepted as antonyms in Odia.
Example 2:
• ID-40, CAT-NOUN
• CONCEPT:: मादा गीदड़
• EXAMPLE:: "जंगल में एक गीदड़ी अपने बच्चे को दूध पिला रही थी"
• SYNSET-HINDI:: गीदड़ी,सियारिन,सियारनी,जंबुकी,जम्बुकी,शृगाली,शृगालिका,लोपापिका,लोमाशिका, लोमसिक, सृगाली, सृगालिका, शिजवा, वामी
• CONCEPT:: ମାଈ ବିଲୁଆ
• EXAMPLE:: "ଜଙ୍ଗଲରେ ଗୋଟିଏ ମାଈ ବିଲୁଆ ତା ଛୁଆକୁ କ୍ଷୀର ପିଆଉଥିଲା"
• SYNSET-ORIYA:: ମାଈ_ବିଲୁଆ, ମାଈ_ଶିଆଳ, ମାଈ_ଶୃଗାଳ
• Antonym of ID 40 is ID 24
• ID-24, CAT-NOUN
• CONCEPT:: मादा शेर
• EXAMPLE:: "शेरनी शेर से अधिक खूँखार होती है / गुरु भक्त शिवाजी समर्थ गुरु रामदास का पेट दर्द ठीक करने के लिए शेरनी का दूध लाए"
• SYNSET-HINDI:: शेरनी,मादा बाघ,मादा व्याघ्र,बाघिन,व्याघ्री
• CONCEPT:: ମାଈ ବାଘ
• EXAMPLE:: "ବାଘୁଣୀ ବାଘଠାରୁ ଅଧିକ ହିଂସ୍ର/ଗୁରୁଭକ୍ତ ଶିବାଜୀ ସମର୍ଥ ଗୁରୁରାମଦାସଙ୍କ ପେଟ ଯନ୍ତ୍ରଣା ଭଲ କରିବାପାଇଁ ବାଘୁଣୀର କ୍ଷୀର ଆଣିଥିଲେ"
• SYNSET-ORIYA:: ବାଘୁଣୀ, ମାଈ ବାଘ
Discussion:
• In this example, the Hindi word gi:dDi: (गीदड़ी) is stated as the antonym of sherni: (शेरनी). The Odia equivalents of the former are ma:i: bilua: (ମାଈ_ବିଲୁଆ), ma:i: sia:La (ମାଈ_ଶିଆଳ), and ma:i: sruga:La (ମାଈ_ଶୃଗାଳ), which mean ‘female jackal’. These are not usually acceptable as antonyms of ba:ghuNi: (ବାଘୁଣୀ) and ma:i: ba:gha (ମାଈ_ବାଘ) ‘female tiger or tigress’.
Example 3: Hindi synset and category wrong
• ID-2234, CAT-NOUN
• CONCEPT:: एक वृक्ष जिसके मीठे फूलों से शराब और अन्य खाद्य वस्तुएँ बनती हैं
• EXAMPLE:: "महुए की लकड़ी मानव के लिए बहुत उपयोगी होती हैं"
• SYNSET-HINDI:: महुआ,मधु,मधुक,महूक,महूख,मधुष्ठील,मधुवृक्ष,मधुशाक,महाद्रुम
• ID-2234, CAT-NOUN
• CONCEPT:: ଯେଉଁ ବୃକ୍ଷର ମିଠା ଫୁଲରୁ ମଦ ଏବଂ ଅନ୍ୟ ଖାଦ୍ୟ ବସ୍ତୁ ତିଆରି ହୁଏ
• EXAMPLE:: "ମହୁଲ କାଠ ମଣିଷପାଇଁ ବହୁତ ଉପଯୋଗୀ ହୋଇଥାଏ"
• SYNSET-ORIYA:: ମହୁଲ, ମହୁଆ
Antonym ID:
• ID-8378, CAT-ADJECTIVE
• CONCEPT:: सामना होने पर संकोचवश होनेवाला
• EXAMPLE:: "वह जब भी मिलता है,मुँह-देखी प्रशंसा करना शुरु कर देता है"
• SYNSET-HINDI:: मुँह-देखा
• ID-8378, CAT-ADJECTIVE
• CONCEPT:: ମୁହାଁମୁହିଁ ହୋଇଗଲେ ମୋହବତିଆ ବ୍ୟବହାର
• EXAMPLE:: "ଯେତେବେଳେ ବି ଦେଖାହୁଏ, ସେ ଉପରଠାଉରିଆ ପ୍ରଶଂସା କରିବା ଆରମ୍ଭ କରିଦିଅନ୍ତି"
• SYNSET-ORIYA:: ଉପରଠାଉରିଆ, ଉପୁରିଆ
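A pairing like the one in Example 3, where a noun synset is linked to an adjective synset, could in principle be flagged automatically before manual review. The sketch below is a hypothetical check, not a description of how the Antonym Creation Tool works: the `Synset` class and `category_mismatches` function are assumptions introduced for illustration, and only the IDs and categories come from the three examples.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Synset:
    """Hypothetical minimal record: a synset ID and its part-of-speech category."""
    synset_id: int
    category: str


def category_mismatches(antonym_pairs):
    """Return the antonym pairs whose two synsets differ in category."""
    return [
        (a, b)
        for a, b in antonym_pairs
        if a.category.lower() != b.category.lower()
    ]


# Antonym pairs from Examples 1 to 3.
pairs = [
    (Synset(1471, "NOUN"), Synset(2629, "NOUN")),
    (Synset(40, "NOUN"), Synset(24, "NOUN")),
    (Synset(2234, "noun"), Synset(8378, "adjective")),
]

flagged = category_mismatches(pairs)
for a, b in flagged:
    print(f"ID {a.synset_id} ({a.category}) <-> ID {b.synset_id} ({b.category})")
```

A check of this kind only catches category mismatches such as the pair 2234 and 8378. The pairs in Examples 1 and 2 share the category NOUN and are wrong on semantic grounds, so they would still need manual checking.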
In this example, the Hindi word प्रयोजनहीनतः (prayojanhinataH) is stated as the antonym of प्रयोजनतः (prayojanataH), and these words have been placed under the sub-category ‘adverb of quality’. But their usage and meaning show that they should come under the sub-category ‘adverb of reason’. Another important fact is that the sub-category ‘adverb of reason’ is not listed in the drop-down menu. Therefore, the sub-categories of adverbs should be revisited.
ଓଡ଼ିଆ ଶବ୍ଦକୁଟୁମ୍ବ (ODIA WORDNET)
Odia Wordnet website link: http://indradhanush.unigoa.ac.in/odiawordnet/
Future Plan:
• An Online Dictionary
• A Thesaurus