Standardization of Lexicon
Team Members: Jaya Saraswati, Gajanan K. Rane, Kunal K. Patel
INTRODUCTION:
• The dictionary is the major source of information in the enconversion and deconversion processes
• The current Hindi dictionary contains about 80,000 common words and uses about 200 morphological, grammatical and semantic attributes
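FORMAT OF THE DICTIONARY:
• General form: [HW]{} "UW(icl>restriction)" (attributes);
• Example: [Am]{} "mango(icl>fruit)" (N,MALE,EDBL,OBJCT,INANI,Na);
• HW: the head word, e.g. Am
• UW: the Universal Word with its restriction, e.g. mango(icl>fruit)
• attributes: the grammatical, morphological and semantic attributes, e.g. N,MALE,EDBL,OBJCT,INANI,Na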
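The entry format above is regular enough to parse mechanically. The following is a minimal sketch, not the project's actual tooling: the names `ENTRY_RE`, `Entry` and `parse_entry` are hypothetical, and it assumes one entry per line with the restriction written inside the Universal Word's parentheses.

```python
import re
from dataclasses import dataclass

# Pattern for one entry: [HW]{} "UW(icl>restriction)" (attr1,attr2,...);
ENTRY_RE = re.compile(
    r'^\[(?P<hw>[^\]]+)\]\{\}\s*'        # head word in square brackets
    r'"(?P<label>[^"(]+)'                # Universal Word label, e.g. mango
    r'(?:\((?P<restriction>.*)\))?"\s*'  # optional restriction, e.g. icl>fruit
    r'\((?P<attrs>[^)]*)\);?\s*$'        # comma-separated attribute list
)

@dataclass
class Entry:
    head_word: str
    label: str
    restriction: str
    attributes: list

def parse_entry(line: str) -> Entry:
    """Parse one dictionary line into its three parts."""
    # Normalize typographic quotes, which appear in some copies of the entries.
    line = line.replace("\u201c", '"').replace("\u201d", '"').strip()
    m = ENTRY_RE.match(line)
    if m is None:
        raise ValueError(f"not a dictionary entry: {line!r}")
    attrs = [a.strip() for a in m.group("attrs").split(",") if a.strip()]
    return Entry(m.group("hw"), m.group("label"),
                 m.group("restriction") or "", attrs)

entry = parse_entry('[Am]{} "mango(icl>fruit)" (N,MALE,EDBL,OBJCT,INANI,Na);')
print(entry.head_word, entry.label, entry.restriction, entry.attributes)
```

Keeping the label and the restriction in separate fields makes it straightforward to compare or replace restrictions later without touching the head word or the attributes.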
THE NEED FOR STANDARDIZING THE DICTIONARIES:
• The dictionary contains Universal Words, which represent concepts present in all languages
• Currently, different dictionaries use different restrictions for the same concept
• Currently, the semantic attributes also differ from one dictionary to another
THE NEED FOR STANDARDIZING THE DICTIONARIES (continued)
e.g.: The boy is running
English Dictionary:
[run]{} "run(icl>walk)" (V,VINT);
[boy]{} "boy(icl>living thing)" (N,ANI,CONCRETE);
UNL: agt(run(icl>walk), boy(icl>living thing))
Hindi Dictionary:
[xOdZ]{} "run(icl>act)" (V,VINT,Va,VOA-MOT);
[ladZak]{} "boy(icl>person)" (N,MALE,ANIMT,MML,PRSN,NAA);
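The mismatch in the example above can be detected automatically by comparing the restriction each dictionary assigns to the same Universal Word label. This is a small illustrative sketch: the function name `find_conflicts` is hypothetical, and the two tables hold only the restrictions copied from the example entries.

```python
# Restrictions per Universal Word label, copied from the example entries.
english = {"run": "icl>walk", "boy": "icl>living thing"}
hindi = {"run": "icl>act", "boy": "icl>person"}

def find_conflicts(dict_a, dict_b):
    """Return {label: (restriction_a, restriction_b)} for every label that
    appears in both dictionaries with different restrictions."""
    conflicts = {}
    for label in sorted(dict_a.keys() & dict_b.keys()):
        if dict_a[label] != dict_b[label]:
            conflicts[label] = (dict_a[label], dict_b[label])
    return conflicts

conflicts = find_conflicts(english, hindi)
for label, (a, b) in conflicts.items():
    print(f"{label}: English has ({a}), Hindi has ({b})")
```

Both labels in the example are reported, which is the situation standardization is meant to remove.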
KNOWLEDGE BASE TO BE USED FOR STANDARDIZING THE DICTIONARIES
• UNU, Tokyo has sent us a knowledge base, which is a hierarchy of concepts
• We have created a set of semantic attributes and incorporated them into the knowledge base, e.g.: "glass": ARTFCT, OBJCT
• Our task is to map each word of the dictionary to a concept provided in the knowledge base
CURRENT ACTIVITIES
• The dictionary is divided into four parts: nouns, verbs, adjectives and adverbs
• For standardizing the noun part, a program has been created that lets the user quickly select a restriction for a dictionary entry
• For each restriction selected, the corresponding semantic attributes are automatically added to the dictionary entry
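The automatic attribute step for nouns might look roughly like the sketch below. It is not the program described above: the table `KB_ATTRIBUTES` and the function `standardize_entry` are hypothetical, and the attribute lists assigned to each restriction are illustrative assumptions rather than the contents of the actual knowledge base.

```python
# Hypothetical excerpt of the knowledge base: restriction -> semantic attributes.
# The real assignments are not given in the slides; these are illustrative only.
KB_ATTRIBUTES = {
    "icl>fruit": ["EDBL", "OBJCT", "INANI"],
    "icl>person": ["ANIMT", "PRSN"],
}

def standardize_entry(head_word, label, grammatical_attrs, restriction,
                      kb=KB_ATTRIBUTES):
    """Build a dictionary entry for the restriction the user selected,
    appending the semantic attributes attached to that restriction."""
    if restriction not in kb:
        raise KeyError(f"restriction not in knowledge base: {restriction}")
    attrs = list(grammatical_attrs)
    for attr in kb[restriction]:
        if attr not in attrs:  # avoid duplicating an attribute already present
            attrs.append(attr)
    return f'[{head_word}]{{}} "{label}({restriction})" ({",".join(attrs)});'

print(standardize_entry("Am", "mango", ["N", "MALE"], "icl>fruit"))
```

Because the attributes come from a single table, every entry that receives the same restriction ends up with the same semantic attributes.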
CURRENT ACTIVITIES (continued)
• Efforts are being made to standardize the verb, adjective and adverb parts of the dictionary automatically
• For the adverb part, adverbs that end with "-ly" are given the restriction (icl>how), while those that do not end with "-ly" are given the restriction (icl>how(obj>thing))
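The adverb rule above is simple enough to state directly in code. This is a minimal sketch of that rule only; the function names are hypothetical, and it assumes the input is the English label of a Universal Word already known to be an adverb.

```python
def adverb_restriction(adverb: str) -> str:
    """Restriction for an adverb label, following the "-ly" rule."""
    if adverb.lower().endswith("ly"):
        return "icl>how"
    return "icl>how(obj>thing)"

def adverb_uw(adverb: str) -> str:
    """Full Universal Word for an adverb label."""
    return f"{adverb}({adverb_restriction(adverb)})"

for word in ["quickly", "fast"]:
    print(adverb_uw(word))
```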
FINAL GOAL
• All the dictionaries should have uniform restrictions and semantic attributes for similar concepts