Results and Evaluation of Hungarian Nominal WordNet v1.0

This paper presents the results and evaluation of the Hungarian Nominal WordNet project. It discusses the automatic methods used, the evaluation process, the combination of results, and future work. The project aims to expand the Hungarian WordNet by creating a nominal database. The evaluation shows the precision and coverage of the different methods used in creating the WordNet. Further work includes increasing precision and coverage, adding multiwords and derivational links, and upgrading to WordNet 2.0.

Presentation Transcript


  1. Results and Evaluation of Hungarian Nominal WordNet v1.0 • Márton Miháltz, MorphoLogic • The 2nd Global WordNet Conference, 2004, Brno

  2. Outline 1. About the Hungarian WordNet project 2. Automatic methods 3. Evaluation 4. Combination of results 5. Future work

  3. Hungarian WN project 1. • Started in 2001 • MA Thesis; MorphoLogic project • 1st (current) phase: nominal database • Minimizing costs: • Expand method

  4. Hungarian WN project 1. [Diagram: Hungarian nouns paired with English synsets; relation types shown: hypernym, meronym, antonym] ember, személy, individuum {human, person} • hozzátartozó, rokon {parent} • test {body} • anya {mother} • apa {father} • kar {arm} • láb {leg}

  5. Hungarian WN project 2. • Started in 2000 • MA Thesis; MorphoLogic project • 1st phase: nominal database • Minimizing costs: • Expand method • Semantic Similarity Hypothesis • Automatic methods

  6. Hungarian WN project 3. • Ambiguity problem: ló → horse, knight → {horse, Equus caballus}; {horse} (gymnastic apparatus); {knight, horse} (chess figure); {knight} (person of noble origin) (avg. 1.71) (avg. 2.16) • 9 disambiguation heuristics • (Atserias et al., 1997)

  7. Hungarian WN Project 4. • Electronic resources: • Princeton WN 1.6 • Hungarian-English bilingual dictionary • 17,000 / 12,400 headwords (WN) • Monolingual (Hungarian) explanatory dictionary • 42,000 nominal entries • 64,000 definitions

  8. Disambiguation Heuristics 1. A) Heuristics based on bilingual dictionary: • Monosemous translation: Hu En1 {ss1} … Hu En1 {ss1} • Variant English words: En2 … • Intersection method: Hu1 En1 {ss1} Hu2 En2 … …
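The bilingual-dictionary heuristics on this slide can be illustrated with a minimal sketch. It assumes one reading of the diagrams: the monosemous heuristic links a Hungarian word to the only synset of a single-sense English translation, and the intersection heuristic links it to any synset shared by at least two of its English translations. The dictionaries, the synset identifiers and both function names are hypothetical toy values, not the project's data or code.

```python
from collections import defaultdict

# Toy data invented for illustration; synset IDs are hypothetical labels.
BILINGUAL = {"ló": ["horse", "knight"], "koala": ["koala"]}
SYNSETS = {
    "horse": ["horse.animal", "horse.gymnastics", "knight.chess"],
    "knight": ["knight.noble", "knight.chess"],
    "koala": ["koala.animal"],
}


def monosemous_links(bilingual, synsets):
    """Link a Hungarian word to the single synset of a one-sense translation."""
    links = set()
    for hu, translations in bilingual.items():
        for en in translations:
            senses = synsets.get(en, [])
            if len(senses) == 1:
                links.add((hu, senses[0]))
    return links


def shared_synset_links(bilingual, synsets):
    """Link a Hungarian word to synsets shared by two or more translations.

    Assumed reading of the intersection method, not confirmed by the slides.
    """
    links = set()
    for hu, translations in bilingual.items():
        counts = defaultdict(int)
        for en in set(translations):
            for ss in set(synsets.get(en, [])):
                counts[ss] += 1
        links.update((hu, ss) for ss, n in counts.items() if n >= 2)
    return links
```

With this toy data, "koala" is resolved by the monosemous rule, while "ló" is linked only to the chess-figure synset that "horse" and "knight" have in common.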

  9. Disambiguation Heuristics 2. A) Heuristics based on bilingual (cont’d): • Identifying derivational hypernyms: • Hungarian endocentric N+N compounds • Humor analyzer: • last segment (head) = hypernym • hangverseny+zongora → zongora • (‘concert+piano’ → ‘piano’) • Conceptual Distance
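The compound rule above (last segment = hypernym) amounts to a one-line lookup once a segmentation is available. The sketch below assumes the segmentation is passed in as a list of strings; it does not call the Humor analyzer, and the function name is hypothetical.

```python
def derivational_hypernym(segments):
    """Return the head (last segment) of a compound as its hypernym candidate.

    `segments` is an already-computed segmentation of a noun, e.g. from a
    morphological analyzer (assumed input). Returns None for non-compounds.
    """
    if len(segments) < 2:
        return None
    return segments[-1]


# Example from the slide: hangverseny+zongora ('concert+piano').
print(derivational_hypernym(["hangverseny", "zongora"]))
```

The rule only holds for endocentric compounds, so in practice the candidate would still need to be checked against the dictionary.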

  10. Disambiguation Heuristics 3. B) Parsing monolingual definitions: • Synonyms: • lélekelemzés_1_1: A tudat alatti lelki jelenségek vizsgálata; pszichoanalízis [psychoanalysis] (‘the study of subconscious mental phenomena; psychoanalysis’) Hu En1 {ss1} Syn En2 … • Hypernyms: • koala_1_1: Ausztráliában honos, fán élő, medvére emlékeztető erszényes emlős. [mammal] (‘marsupial mammal native to Australia, living in trees, resembling a bear’) Hyp Eni1 {ss1} … min Hu Enj1 {ss2} • Latin equivalents: • ló_1_1 [horse]: Vontatásra és lovaglásra haszn., páratlan ujjú patás háziállat (Equus Caballus) (‘odd-toed hoofed domestic animal used for hauling and riding’) … Hu En1 {ss1} … Lat

  11. Disambiguation Heuristics 4. C) Methods for increasing coverage (+9.2%): • Derivational hypernym of hyp./syn.: Hu Hyp/Syn DerivHyp Eng1 ( Eng) Eng2 • Lookup of hyp./syn. in monolingual: Monolingual: monosemous? Hu Hyp/Syn Hyp YES ( Eng) Eng1 Eng2

  12. Results & Validation • Results from 9 unsupervised heuristics: • Total: 13,948 Hung. nouns, 12,085 PWN synsets (22,169 connections) • Different methods: different confidence! • Validation: • Gold standard: 400 nouns chosen at random from biling./Hu • Manual disambiguation (2,201 possible connections) • IAA: 84.7% • Evaluation of 9 result sets against GS • Precision: 49% to 92% • Coverage: 49% to 0.5%
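Scoring one result set against the gold standard can be sketched as follows. The slides do not define the two measures, so this assumes precision is the share of a method's links on gold-standard words that the annotators accepted, and coverage is the share of gold-standard words for which the method proposes any link. All names and data are hypothetical.

```python
def evaluate(proposed, gold_correct, gold_words):
    """Return (precision, coverage) of one result set against a gold standard.

    proposed:     set of (hungarian_word, synset_id) links from one heuristic
    gold_correct: set of links accepted in manual disambiguation
    gold_words:   set of Hungarian words in the gold standard
    Both definitions are assumptions, not taken from the slides.
    """
    # Only links on gold-standard words can be judged.
    judged = {(w, s) for (w, s) in proposed if w in gold_words}
    correct = judged & gold_correct
    precision = len(correct) / len(judged) if judged else 0.0
    covered = {w for (w, _) in judged}
    coverage = len(covered) / len(gold_words) if gold_words else 0.0
    return precision, coverage


# Toy data invented for illustration.
GOLD_WORDS = {"ló", "koala", "anya", "apa"}
GOLD_CORRECT = {("ló", "horse.animal"), ("koala", "koala.animal")}
PROPOSED = {
    ("ló", "horse.animal"),
    ("ló", "horse.gymnastics"),
    ("koala", "koala.animal"),
    ("kar", "arm.body"),  # not a gold-standard word, so it is not judged
}
```

Under these definitions a high-precision heuristic can still have very low coverage, which matches the trade-off the slide reports across the nine result sets.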

  13. Evaluation of Individual Methods

  14. Combining results 1. • Combining different result sets: • 2 different confidence thresholds • Methods 1-4: precision 75% (2,445 n, 2,170 ss) • Methods 1-6: precision 63% (12,275 n, 12,004 ss) • Validating and combining results not included in the previous step • 8 of 13 intersection sets: precision 75% • 9 intersection sets: precision 63%
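One way to picture the threshold-based combination is a greedy union: rank the result sets by their individual precision and keep adding them while the precision of the combined set stays at or above the chosen threshold. This procedure is an assumption made for illustration; the slides give only the resulting figures, not the algorithm. All names and data below are hypothetical.

```python
def combine_by_threshold(result_sets, gold_correct, gold_words, threshold):
    """Greedily union result sets while combined precision >= threshold.

    result_sets: dict mapping a method name to its set of (word, synset) links.
    Returns (combined_links, names_of_methods_used).
    """
    def precision(links):
        judged = {link for link in links if link[0] in gold_words}
        return len(judged & gold_correct) / len(judged) if judged else 0.0

    # Most precise methods first.
    ranked = sorted(result_sets, key=lambda n: precision(result_sets[n]),
                    reverse=True)
    combined, used = set(), []
    for name in ranked:
        candidate = combined | result_sets[name]
        if precision(candidate) < threshold:
            break
        combined, used = candidate, used + [name]
    return combined, used


# Toy data invented for illustration.
GS_WORDS = {"a", "b", "c", "d"}
GS_CORRECT = {("a", 1), ("b", 1), ("c", 1)}
METHODS = {
    "m1": {("a", 1), ("b", 1)},  # precision 1.0
    "m2": {("c", 1), ("d", 1)},  # precision 0.5
    "m3": {("d", 2), ("a", 2)},  # precision 0.0
}
```

Lowering the threshold admits more methods and therefore more links, which is the same direction as the 75% and 63% sets on the slide.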

  15. Combining results 2. • Combination of the 2 base sets & the intersection sets w.r.t. the 2 thresholds

  16. Further Work 1. • Increase precision: • Complete manual checking of words in synsets • Editing of hierarchies • Increase coverage: • Use additional bilingual dictionaries w/ best auto methods • Use Hung. taxonomies from monolingual dict. • Add multiwords • Add derivational links • Upgrade to WN 2.0

  17. Further Work 2. • Funding from IKTA grant (2004-2007?): • Manual supervision • Connect to EuroWordNet Top Ontology/ILI • Do verbs (adjectives, adverbs) • Add special domain: financial terms

  18. Thank you for your attention! Márton Miháltz http://people.inf.elte.hu/mmarcy/huwn/
