Related terms search based on WordNet / Wiktionary and its application in ontology matching

RCDL'2009. St. Petersburg Institute for Informatics and Automation of RAS; Jönköping University, Sweden. Feiyu Lin, A. Krizhanovsky (andrew.krizhanovsky at gmail.com).

Presentation Transcript


  1. Related terms search based on WordNet / Wiktionary and its application in ontology matching • RCDL'2009 • St. Petersburg Institute for Informatics and Automation of RAS • Jönköping University, Sweden • Feiyu Lin, A. Krizhanovsky (andrew.krizhanovsky at gmail.com)

  2. Contents • Wiki and Wiktionary intro • MRD, parser and Wiktionaries comparison • Correlation of relatedness measures • Experiment scheme • Result and comparison • Results, applications and future

  3. Goal • Is it possible to find related terms using the current version of Wiktionary as successfully as with WordNet? • for ontology matching, • for application in text search systems, • etc. • What are the advantages?

  4. Wiki-resources • Distributed users and authors (edit pages) • Centralized storage (e.g. MySQL, Apache, PHP) • Set of hyperlinked articles • Each article has one or more categories (tree) • Example: http://en.wikipedia.org

  5. Wiktionary is a free-content multilingual dictionary

  6. Wiktionary data: +, -, simplicity & complexity • Rich data: thesaurus (synonyms, antonyms), phrase books, etymologies, pronunciations, sample quotations, translations • Fast growing data • Interwiki (additional data) • GNU FDL • Different Wiktionaries have different levels of standardization • The data grows fast but is created by a huge community, so a parser must be very stable

  7. Wiktionary machine-readable dictionary database scheme

  8. Size of Wiktionaries • WordNet (2006): 150,000 words, 115,000 synsets

  9. A shortest path in Russian Wiktionary
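A shortest path between two dictionary entries can be illustrated with a breadth-first search. The sketch below is a minimal illustration and not the project's code: it assumes entries are nodes and semantic relations (such as synonymy) are unweighted links. The class name `ShortestPathSketch`, the method `shortestPath`, and the sample English words are hypothetical; the slide itself shows Russian Wiktionary data.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: BFS over an unweighted graph of related words.
class ShortestPathSketch {

    // Returns the words on a shortest path from 'from' to 'to',
    // or an empty list when no path exists.
    static List<String> shortestPath(Map<String, List<String>> graph,
                                     String from, String to) {
        Map<String, String> parent = new HashMap<>(); // word -> predecessor
        Deque<String> queue = new ArrayDeque<>();
        parent.put(from, null);
        queue.add(from);

        while (!queue.isEmpty()) {
            String current = queue.poll();
            if (current.equals(to)) {
                // Walk the predecessor chain back to the start word.
                LinkedList<String> path = new LinkedList<>();
                for (String w = to; w != null; w = parent.get(w)) {
                    path.addFirst(w);
                }
                return path;
            }
            for (String next : graph.getOrDefault(current, List.of())) {
                if (!parent.containsKey(next)) { // not visited yet
                    parent.put(next, current);
                    queue.add(next);
                }
            }
        }
        return List.of();
    }

    public static void main(String[] args) {
        // Made-up sample relations, for illustration only.
        Map<String, List<String>> graph = Map.of(
                "car", List.of("automobile"),
                "automobile", List.of("car", "vehicle"),
                "vehicle", List.of("automobile"));
        System.out.println(shortestPath(graph, "car", "vehicle"));
    }
}
```

The number of links on the returned path could serve as a simple distance between two terms; whether the authors weight relation types differently is not stated on the slide.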

  10. Correlation of relatedness measures • Correlation between human judgments (353-TC test collection) and relatedness measures based on WordNet, English Wikipedia, and Russian Wiktionary
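Comparing a relatedness measure with human judgments comes down to correlating two lists of scores over the same word pairs. The slide does not say which coefficient was used; the sketch below assumes Spearman rank correlation, computed as the Pearson correlation of average ranks. The class and method names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch: Spearman rank correlation between two score lists.
class SpearmanSketch {

    // Assigns 1-based ranks; tied values get the average of their positions.
    static double[] ranks(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double[] ranks = new double[n];
        int i = 0;
        while (i < n) {
            int j = i;
            // Extend j over the run of values equal to the one at position i.
            while (j + 1 < n && values[order[j + 1]] == values[order[i]]) {
                j++;
            }
            double averageRank = (i + j) / 2.0 + 1.0;
            for (int k = i; k <= j; k++) {
                ranks[order[k]] = averageRank;
            }
            i = j + 1;
        }
        return ranks;
    }

    // Pearson correlation; the result is NaN if either input is constant.
    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double meanX = Arrays.stream(x).sum() / n;
        double meanY = Arrays.stream(y).sum() / n;
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++) {
            double dx = x[i] - meanX;
            double dy = y[i] - meanY;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    static double spearman(double[] human, double[] measure) {
        if (human.length != measure.length || human.length < 2) {
            throw new IllegalArgumentException("need two equal-length lists of at least 2 scores");
        }
        return pearson(ranks(human), ranks(measure));
    }

    public static void main(String[] args) {
        // Made-up scores for four word pairs, for illustration only.
        double[] human = {7.4, 1.3, 5.0, 9.1};
        double[] measure = {0.6, 0.1, 0.7, 0.9};
        System.out.println(spearman(human, measure));
    }
}
```

Rank correlation only asks whether the measure orders word pairs the way people do, so measures on different scales (path lengths, probabilities) can be compared without rescaling.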

  11. Largest eight Wiktionary editions (March 2008)

  12. Application of Machine-readable dictionary (MRD) Thesaurus data: • Related Terms Search • Search request extension (by synonyms) / request reformulation (in search systems) • Request recognition in question-answering systems • Word sense disambiguation Media data (audio + pictures) • Language learning
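Search request extension by synonyms can be sketched as a lookup that adds each query term's synonyms to the query. This is an illustrative sketch only: the in-memory map stands in for thesaurus data from an MRD, and the names `QueryExpansionSketch` and `expand` are hypothetical rather than part of the project's API.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: extend a search request with synonyms of its terms.
class QueryExpansionSketch {

    // Keeps the original term order and appends each term's synonyms
    // right after it; duplicates are dropped.
    static Set<String> expand(List<String> queryTerms,
                              Map<String, List<String>> synonyms) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String term : queryTerms) {
            expanded.add(term);
            expanded.addAll(synonyms.getOrDefault(term, List.of()));
        }
        return expanded;
    }

    public static void main(String[] args) {
        // Made-up thesaurus entries, for illustration only.
        Map<String, List<String>> synonyms = Map.of(
                "car", List.of("automobile", "auto"));
        System.out.println(expand(List.of("car", "rental"), synonyms));
    }
}
```

A real system would likely also need word sense disambiguation before expanding, since adding synonyms of the wrong sense can hurt precision.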

  13. Work plan: done and todo • Russian Wiktionary • Extraction (by RE): definition, relations (synonyms…), translation, audio, graphics • Database API • Visualization (MRD browser) • Quiz & tests (test application) • Russian Wiktionary • Database scheme: definition, relations (synonyms…), translation, audio, graphics • Database API • English Wiktionary

  14. Implementation • Software based on Synarcher code • Java • MySQL or SQLite database • JUnit test framework

  15. Results • The scheme of the experiment for calculating the semantic relatedness measure based on Russian Wiktionary data • The parser of Russian Wiktionary • Database scheme designed • Database API implemented in Java • Compared the results of related terms search based on Wiktionary and WordNet • Project site (Wiki tool kit): http://code.google.com/p/wikokit/

  16. Future work • Finish creating the MRD • Database and software • Russian Wiktionary and English Wiktionary • Visualization (JavaFX) • MRD browser • Quiz & tests (learning application) • Online application (Java Web-start)

  17. Thank you!
