AGROVOC and the NAL Agricultural Thesaurus at the Ontology Alignment Evaluation Initiative
Willem Robert van Hage
TNO Industrie & Techniek / Vrije Universiteit Amsterdam
Overview
• AGROVOC & NALT
• the Goal
  • federated access to heterogeneous data sources
• the Method
  • align the ontologies
• the OAEI workshop (2006 and 2007)
  • a challenge for ontology alignment researchers
  • an opportunity for environmental researchers
AGROVOC & NALT
• AGROVOC
  • 28,000 descriptor terms
  • 10,000 non-descriptor terms
  • ten languages (currently 17 and another 4 under construction)
  • used to index AGRIS/CARIS (and other resources)
• NALT
  • 41,000 descriptor terms
  • 24,000 non-descriptor terms
  • English (currently also Spanish)
  • used to index AGRICOLA, FSRIO, AgNIC, NALDR (and other resources)
AGROVOC & NALT
[Figure with two panels labeled AGROVOC and NALT]
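the Goal
[Diagram of federated access through aligned thesauri. Labels shown: the query "wheat aphids"; the entry "wheat aphid, Russian" with "USE: Diuraphis noxia"; the concept Diuraphis noxia with its Chinese label 麦双尾蚜; the thesauri AGROVOC and NALT; and the collections AGRICOLA, FSRIO, AgNIC, NALDR, and AGRIS/CARIS.]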
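The idea on this slide can be sketched in a few lines of Python. Everything below is a toy illustration: the concept identifiers, the assignment of labels to one thesaurus or the other, and the function name are assumptions made for the example, not actual AGROVOC or NALT data. A query label is resolved to a concept in one thesaurus, and an alignment entry supplies the equivalent concept in the other, so the collections indexed with either thesaurus can be searched.

```python
# Toy label tables (hypothetical identifiers; which thesaurus holds
# which label is an assumption made for this sketch).
NALT_LABELS = {
    "wheat aphid, russian": "nalt:diuraphis_noxia",  # non-descriptor, USE: Diuraphis noxia
    "diuraphis noxia": "nalt:diuraphis_noxia",
}

# One alignment entry: NALT concept -> equivalent AGROVOC concept.
ALIGNMENT = {"nalt:diuraphis_noxia": "agrovoc:diuraphis_noxia"}

# Collections indexed with each thesaurus, as listed on the slides.
COLLECTIONS = {
    "nalt": ["AGRICOLA", "FSRIO", "AgNIC", "NALDR"],
    "agrovoc": ["AGRIS/CARIS"],
}


def federated_targets(query):
    """Return {concept_id: collections to search} for a query label."""
    concept = NALT_LABELS.get(query.strip().lower())
    if concept is None:
        return {}
    targets = {concept: COLLECTIONS["nalt"]}
    mapped = ALIGNMENT.get(concept)
    if mapped is not None:
        # The alignment is what opens up the second set of collections.
        targets[mapped] = COLLECTIONS["agrovoc"]
    return targets


print(federated_targets("Wheat aphid, Russian"))
```

Without the alignment entry, the same query would only reach the collections indexed with the thesaurus that contains the label.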
the Method - OAEI
• OAEI 2006
  • A friendly competition between scientists
  • Six ontology alignment tasks: benchmark, anatomy, conference, directory, jobs, and AGROVOC-NALT
• OAEI 2006 AGROVOC-NALT
  • Five participants: Falcon-AO, PRIOR, RiMOM, COMA++, HMatch
  • The systems align the SKOS versions of the ontologies without manual intervention.
  • The alignments (around 10,000 mappings per participant) are collected, and samples are judged by experts (FAO, USDA, and AI researchers at the EKAW conference).
  • The results are published at: http://www.few.vu.nl/~wrvhage/oaei2006
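Judging a sample instead of all mappings can be illustrated as follows. This is a minimal sketch that assumes a simple random sample and a normal-approximation confidence interval; the slides do not describe the actual sampling scheme used at OAEI, and the function names are made up for the example.

```python
import math
import random


def sample_mappings(mappings, k, seed=0):
    """Draw k mappings uniformly at random (assumed sampling scheme)."""
    rng = random.Random(seed)
    return rng.sample(mappings, k)


def estimate_precision(judgements):
    """Estimate precision from expert judgements (True = correct mapping).

    Returns (estimate, lower, upper) with an approximate 95% interval
    based on the normal approximation to the binomial distribution.
    """
    n = len(judgements)
    if n == 0:
        raise ValueError("need at least one judged mapping")
    p = sum(judgements) / n
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, max(0.0, p - margin), min(1.0, p + margin)


# Illustrative numbers only: 100 judged mappings, 80 of them correct.
p, low, high = estimate_precision([True] * 80 + [False] * 20)
print(f"precision ~ {p:.2f} (95% interval {low:.2f} to {high:.2f})")
```

The interval narrows as more mappings are judged, which is the trade-off between expert time and the reliability of the reported score.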
OAEI 2006 - performance
[Figure: precision and recall charts]
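For readers unfamiliar with the two measures, the sketch below shows how precision and recall of an alignment are commonly computed against a reference set of correct mappings. The pair representation and the example data are illustrative assumptions, not OAEI results.

```python
def precision_recall(found, reference):
    """Compute precision and recall of found mappings against a reference.

    Both arguments are iterables of (source_concept, target_concept) pairs.
    """
    found, reference = set(found), set(reference)
    correct = found & reference
    precision = len(correct) / len(found) if found else 0.0
    recall = len(correct) / len(reference) if reference else 0.0
    return precision, recall


# Made-up mappings: 4 found, 3 in the reference, 2 in common.
found = {("a", "x"), ("b", "y"), ("c", "z"), ("d", "w")}
reference = {("a", "x"), ("b", "y"), ("e", "v")}
print(precision_recall(found, reference))
```

Precision is the fraction of found mappings that are correct; recall is the fraction of correct mappings that were found.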
OAEI 2006 - conclusions
• Alignments that are easy to automate:
  • High consensus topics such as:
    • Latin species names (rigid Linnaean naming scheme)
    • Some geographical names
    • Parts of the medical domain (parts of anatomy and diseases)
• Alignments that are hard to automate:
  • Low consensus topics such as:
    • Different notations of chemicals (e.g. caffeine vs. theine vs. 1,3,7-trimethyl-1H-purine-2,6(3H,7H)-dione)
    • Some other geographical names (e.g. time dependency or administrative vs. physical)
    • Products and processes
    • Concepts that are culturally dependent (e.g. law and economy)
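The contrast between the two groups can be illustrated with a naive matcher that aligns two concepts only when their normalized labels coincide. This is a made-up baseline over assumed toy labels, not a description of how any participating system works.

```python
def normalize(label):
    """Lowercase a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def exact_label_matches(source, target):
    """Match concepts whose normalized labels are identical.

    source and target map a concept id to a list of labels.
    Returns a set of (source_id, target_id) pairs.
    """
    index = {}
    for target_id, labels in target.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(target_id)
    matches = set()
    for source_id, labels in source.items():
        for label in labels:
            for target_id in index.get(normalize(label), ()):
                matches.add((source_id, target_id))
    return matches


# Hypothetical concept ids and labels for illustration.
source = {"s1": ["Diuraphis noxia"], "s2": ["caffeine"]}
target = {"t1": ["diuraphis noxia"], "t2": ["theine"]}
print(exact_label_matches(source, target))
```

The Latin species name matches because both vocabularies follow the same naming scheme. The two names for the same chemical share no label, so a purely lexical approach misses the mapping.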
OAEI 2007
• OAEI 2007
  • at ISWC 2007 in Busan, Korea, November 11th
  • again AGROVOC-NALT (did things improve?)
  • but also AGROVOC-GEMET, NALT-GEMET
• AGROVOC-NALT
  • Falcon-AO, PRIOR+, DSSim, RiMOM
• AGROVOC-GEMET, NALT-GEMET
  • Falcon-AO, PRIOR+, DSSim
• Deadlines
  • submission of results: October 1st
  • evaluation of results: November 11th
• More information can be found at: http://www.few.vu.nl/~wrvhage/oaei2007
References
• AGROVOC: http://www.fao.org/agrovoc
• NALT: http://agclass.nal.usda.gov/agt
• GEMET: http://www.eionet.europa.eu/gemet/
• OAEI 2007: http://oaei.ontologymatching.org/2007
• OAEI 2007: http://www.few.vu.nl/~wrvhage/oaei2007
• OAEI 2006 results: http://www.few.vu.nl/~wrvhage/oaei2006