The Agricultural Ontology Service: multilingual domain ontologies for knowledge management in agriculture. Johannes Keizer, Information Systems Officer, Food and Agriculture Organization of the UN, Library and Documentation Systems Division. APAN 2003, Fukuoka, 23rd January 2003.
FAO’s mandate
• Reducing the number of hungry people by 50% by the year 2015 (World Food Summit 1996)
• WAICENT (World Agricultural Information Center) is FAO’s approach to fighting hunger with information
• FAO itself produces a huge amount of content in its subject area
• It is also within FAO’s mandate to make available useful information from other information providers
• FAO collaborates in information networks
Introduction
It is not difficult to find information on the WWW (if you know what you are looking for).
But it is nearly impossible to extract knowledge or structured information.
The Search Problem
How to evaluate search results?
Precision = Number of Relevant Documents Identified / Total Number of Documents Identified
Recall = Number of Relevant Documents Identified / Number of Relevant Documents in the Collection
Full text search engines might have a high recall (not verifiable), but precision/relevance is desperately low!
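As a minimal sketch of how the two measures are computed, assume a query returns a set of document IDs and that the relevant documents in the collection are known. The IDs below are made up for illustration and do not come from any FAO system.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: IDs of the documents the search engine returned
    relevant:  IDs of all relevant documents in the collection
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were identified
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 8 documents returned, 4 relevant ones exist, 2 overlap.
retrieved = {"d1", "d2", "d3", "d4", "d5", "d6", "d7", "d8"}
relevant = {"d1", "d2", "d9", "d10"}

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

On the open Web the second argument is unknown, which is why the slide calls recall "not verifiable".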
State of Search Systems
• Full text search engines based on statistical text analysis are imprecise by nature
• New systems based only on “machine intelligence” do not show very promising results
• Recognition of meaning (semantic analysis) by machines is only possible by using knowledge organization systems:
  • Agreed metadata schemas
  • Controlled vocabularies
  • Machine-readable encoding
Knowledge Organization Systems: Vocabularies
Existing Thesauri and Knowledge Organization Systems (KOSs)
• Dedicated KOSs: e.g., ASFA thesaurus; the Multilingual Forestry Thesaurus; the Sustainable Development website classification; biological taxonomies such as NCBI and ITIS; other thematic thesauri
• Non-dedicated KOSs: CABI Thesaurus, AGROVOC, NAL Thesaurus, GEMET
Problems
• Insufficient subject + language coverage
• Only very simple encoding of semantic relations
• Common concepts are not declared
• No or very limited interoperability
• Very limited machine readability
• Severe maintenance problems
Ontologies?
• An ontology is a formal knowledge organization system
• It contains concepts (and instances)
• A formal description of the application knowledge
• Definitions of concepts and instances
• Relations between concepts and instances
• Possibility of machine processing
• Nearly everyone tries to build (implicit) ontologies
• Directory structures, navigation trees
• Humans can overcome bad organization by intuition
• Machines have no intuition; machines need formal information
What benefits do we expect from Ontologies?
• Semantic organization of websites
• Knowledge maps
• Guided discovery of knowledge
• Easy retrievability of information without using complicated Boolean logic
• Text processing by machines
• Text mining on the Web (meaning-oriented access)
• Automatic indexing and text annotation tools
• Full text search engines that create meaningful classification (semantic clustering), e.g. recognizing that FAO Schwarz is not related to FAO
• Intelligent search of the Web
• Building dynamic catalogues from machine-readable metadata
• Natural language processing
• Better machine translation
• Queries using natural language
The Example: International Portal on Food Safety, Animal and Plant Health
• Goal: to create an ontology, i.e. an explicit, formal specification of a shared conceptualization of a domain of interest
Ontology: conceptual model
[Diagram: concepts connected by a relationship; each concept carries a label, synonyms, a stem and a description]
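The conceptual model can be sketched as a small data structure. This is only an illustration of the elements named on the slide (concept, label, synonym, stem, description, relationship). The class names, the example synonym and stems, and the description texts are assumptions made here, not taken from the project's ontology editor.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    label: str
    synonyms: list[str] = field(default_factory=list)
    stem: str = ""
    description: str = ""


@dataclass(frozen=True)
class Relationship:
    source: str  # label of the source concept
    name: str    # e.g. "is a step in the process"
    target: str  # label of the target concept


@dataclass
class Ontology:
    concepts: dict[str, Concept] = field(default_factory=dict)
    relationships: set[Relationship] = field(default_factory=set)

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.label] = concept

    def relate(self, source: str, name: str, target: str) -> None:
        # Both ends must already be declared as concepts.
        for label in (source, target):
            if label not in self.concepts:
                raise KeyError(f"unknown concept: {label}")
        self.relationships.add(Relationship(source, name, target))


# Illustrative content only: synonym, stems and descriptions are invented.
onto = Ontology()
onto.add_concept(Concept("risk assessment", synonyms=["assessment of risk"],
                         stem="risk assess",
                         description="Example description text."))
onto.add_concept(Concept("risk analysis", stem="risk analys",
                         description="Example description text."))
onto.relate("risk assessment", "is a step in the process", "risk analysis")
print(len(onto.concepts), "concepts,", len(onto.relationships), "relationship")
```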
Processes to create a Domain Ontology
• Ontology acquisition (2 paths)
  • Creating core ontology from scratch
  • Automatic extraction of ontological knowledge from base vocabulary and domain specific text sources
• Merging into one ontology
• Refinement and Extension
• Evaluation and Assessment
Creation of the core ontology
• Information resources: brainstorming, Codex Alimentarius, SPS Agreement
• 3 subject specialists
• Ontology Editor (SOEP)
• Result: Core Ontology with 67 concepts and 91 relationships
1st Acquisition Approach: Focused Crawling
• Input: Core Ontology (67 concepts, 91 relationships)
• Focused web crawling produced a list of 257 food safety domain web pages
• The pages were grouped into main sites
List of extracted main sites:
• http://www.foodsafety.gov/ (Gateway to Government Food Safety Information)
• http://vm.cfsan.fda.gov/ (Center for Food Safety & Applied Nutrition)
• http://www.inspection.gc.ca/ (Canadian Food Inspection Agency)
• http://www.extension.iastate.edu/foodsafety/ (Iowa State University - Food Safety Project)
• http://www.foodsafety.iastate.edu (Iowa State University - Food Safety Consortium)
• http://www.fsis.usda.gov/ (United States Department of Agriculture, Food Safety and Inspection Service)
• http://www.nal.usda.gov/foodborne/index.html (Foodborne Illness Education Information Center)
• http://www.euro.who.int/foodsafety (World Health Organization – Regional Office for Europe, Food Safety Programme)
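The slide does not say how the 257 crawled pages were grouped into main sites. One simple possibility, shown here purely as an assumption, is to group page URLs by host name. The URLs in the example use placeholder `.example` hosts and are not crawler output.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_site(urls):
    """Group page URLs by host name (one assumed notion of a 'main site')."""
    sites = defaultdict(list)
    for url in urls:
        host = urlsplit(url).netloc.lower()
        sites[host].append(url)
    return dict(sites)


# Placeholder URLs for illustration only.
pages = [
    "http://site-a.example/",
    "http://site-a.example/hygiene/page1.html",
    "http://site-b.example/foodsafety/",
    "http://SITE-B.example/foodsafety/import.html",
]

groups = group_by_site(pages)
for host, members in sorted(groups.items()):
    print(host, len(members))
```

Grouping by host alone would treat http://www.extension.iastate.edu/foodsafety/ and http://www.foodsafety.iastate.edu as two sites, which agrees with the list above, but a real grouping may also have looked at path prefixes.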
Selection of Documents
• Domain Set: Manual selection
  • 11 documents
  • Codex Alimentarius: Description, Code of Ethics, Food Hygiene, Food Import and Export
  • Report of consultation on risk assessment of microbiological hazards in foods
  • Ensuring food quality and safety, Protecting food quality and safety
• Domain Set: Focused Crawler Output
  • 5 documents extracted:
  • http://vm.cfsan.fda.gov/
  • http://www.inspection.gc.ca/
  • http://www.foodsafety.iastate.edu
  • http://www.extension.iastate.edu/foodsafety/
  • http://www.euro.who.int/foodsafety
• Generic documents: Manual Selection
  • 8 documents
  • www.nytimes.com
  • Several documents of the animal feed domain
2nd Acquisition Approach: Thesaurus Pruning
• Thesaurus: AGROVOC, 27365 keywords (entries such as “Rice” with BT …, NT … and RT … links)
• Corpora: food safety documents and generic documents
• Automatic pruning, 5 evaluation runs
• 1632 frequent terms
• Extracted ontological structure: 504 concepts, taxonomic depth 5
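The slide names the inputs (thesaurus, domain documents, generic documents) but not the pruning criterion. A minimal sketch of one plausible heuristic, assumed here and not confirmed by the source, keeps a thesaurus term when it is relatively more frequent in the domain documents than in the generic ones. The function names, the `ratio` threshold and the sample texts are all hypothetical.

```python
import re
from collections import Counter


def term_counts(documents, terms):
    """Count whole-word occurrences of each thesaurus term in the texts."""
    counts = Counter()
    for text in documents:
        lowered = text.lower()
        for term in terms:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    return counts


def prune(terms, domain_docs, generic_docs, ratio=2.0):
    """Keep terms whose share of term occurrences in the domain corpus is
    at least `ratio` times their share in the generic corpus."""
    domain = term_counts(domain_docs, terms)
    generic = term_counts(generic_docs, terms)
    domain_total = sum(domain.values()) or 1
    generic_total = sum(generic.values()) or 1
    kept = []
    for term in terms:
        domain_share = domain[term] / domain_total
        generic_share = generic[term] / generic_total
        if domain[term] > 0 and domain_share >= ratio * generic_share:
            kept.append(term)
    return kept


# Toy data for illustration; not AGROVOC content or project documents.
terms = ["rice", "pathogens", "food", "tractors"]
domain_docs = ["Pathogens in food cause disease. Food hygiene reduces pathogens."]
generic_docs = ["The food was good. Tractors and rice fields. Food prices rose."]

kept_terms = prune(terms, domain_docs, generic_docs)
print(kept_terms)
```

A real pruner would presumably also keep the broader terms (BT) of every retained term so that the hierarchy stays connected; that step is left out of this sketch.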
Merging of Ontologies and Refinement
• Core Ontology: 67 concepts, 91 relationships
• From the 1632 terms of the pruning process: 12 new concepts extracted
• From the ontological structure extracted from AGROVOC: 23 new concepts with hierarchical relationships extracted
• Assembly step: 92 new relationships created
• Result: Food Safety Ontology Prototype with 102 concepts and 183 relationships
Final Prototype
• Core Ontology: 67 concepts, 91 relationships; relationships per concept: 1.36
• Food Safety Ontology Prototype: 102 concepts, 183 relationships; relationships per concept: 1.79
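The figures on the merging and final prototype slides can be cross-checked with a few lines of arithmetic. The sketch below only recomputes numbers already given in the slides; the variable names are chosen here for illustration.

```python
core = {"concepts": 67, "relationships": 91}

# Additions reported on the merging slide.
new_concepts = 12 + 23     # from the pruned term list + from the AGROVOC structure
new_relationships = 92     # created in the assembly step

prototype = {
    "concepts": core["concepts"] + new_concepts,
    "relationships": core["relationships"] + new_relationships,
}


def relationships_per_concept(ontology):
    """Average number of relationships per concept, rounded to 2 decimals."""
    return round(ontology["relationships"] / ontology["concepts"], 2)


print(prototype)
print(relationships_per_concept(core), relationships_per_concept(prototype))
```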
102 Concepts
Agreement on Agriculture; ALOP; ALOP, Codex; ALOP, OIE; ALR; animal byproducts; animal diseases; animal fats; animal feed additives; animal feed contaminants; animal feed ingredients; animal feeding; animal health; animal processing; animal products; animal waste; animals; antibiotics; Bacteria; bakery products; biological agent; CAC; Cartagena Protocol; CCFH; cereal products; cheese; chemical agent; Codex Committees; commodities; Consumer health; diseases; eggs; exposure assessment; fabrication; FAO; fishes; food; food additives; food consumption; food contaminants; food export; food import; food ingredients; food safety; food-borne diseases; fungi; good hygienic practices; hazard; hazard characterization; hazard identification; human health; human nutrition; humans; international agreements; international food trade; international governmental organizations; IPPC; labelling; meat; microorganisms; microorganisms byproducts; microorganisms processing; microorganisms products; microorganisms waste; milk; milk products; milk products; non-pathogens; OIE; packaging; parasites; pathogens; physical agent; plant byproducts; plant diseases; plant feed additives; plant feed contaminants; plant feed ingredients; plant feeding; plant health; plant processing; plant products; plant waste; plants; processed animal products; processed plant products; processed products; processing; risk analysis; risk assessment; risk characterization; risk communication; risk management; slaughter; SPS agreement; standards; sugar; TBT agreement; transport; viruses; WHO; WTO
29 Unique Relationships
adopts; adversely affect; are included in; are produced by; are the source for; can be used as; constitutes; describes; determines; ensures; establishes; govern; has economical impact on; implies; includes; influences; interacts with; is a consequence of; is a step in the process; is comprised of; is established by; is protected by; originate from; refer to; requires; rule; sustains; trades; uses
Current project status
Ontology creation: 2nd application of framework
• 1st acquisition approach: ~100 domain-specific documents, Text-To-Onto, list of frequent terms
• 2nd acquisition approach: AGROVOC, Revised Ontology Pruner, pruned AGROVOC with ~3000 concepts
• Food Safety Ontology Prototype: 102 concepts, 183 relationships
• Merging & Refinement in the Ontology Editor (OIModeler)
Usage Scenario: Ontology based search extension
• Components: Ontology Enabled Search Application, Ontology, Doc base, Search results
• Biosecurity Portal, initial search: Risk assessment
• Prompt: “Mark the terms below, which you might want to include in your search:”
• Terms offered from the ontology: Exposure assessment, Risk characterization, Hazard characterization, Hazard identification, Risk analysis, Risk communication, Risk management (linked through the relationships “is a step in the process” and “interacts with”)
• Extended search: Risk assessment, Risk characterization, Risk analysis
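The search extension in this scenario can be sketched as a lookup of neighbouring concepts followed by a user selection. The triples below are an assumed reading of the diagram: the direction of each edge and which concepts "interact with" each other are not recoverable from the slide text, and the function names are hypothetical.

```python
# (source, relationship, target) triples; an assumed reading of the slide.
TRIPLES = [
    ("hazard identification", "is a step in the process", "risk assessment"),
    ("hazard characterization", "is a step in the process", "risk assessment"),
    ("exposure assessment", "is a step in the process", "risk assessment"),
    ("risk characterization", "is a step in the process", "risk assessment"),
    ("risk assessment", "is a step in the process", "risk analysis"),
    ("risk assessment", "interacts with", "risk management"),
    ("risk assessment", "interacts with", "risk communication"),
]


def related_terms(term, triples):
    """Return {neighbouring concept: relationship} for concepts one edge away."""
    related = {}
    for source, relation, target in triples:
        if source == term:
            related.setdefault(target, relation)
        elif target == term:
            related.setdefault(source, relation)
    return related


def expand_query(term, selected, triples):
    """Extend the search term with the offered terms the user has marked."""
    offered = related_terms(term, triples)
    return [term] + [t for t in selected if t in offered]


# The user searches for "risk assessment" and marks two of the offered terms.
offered = related_terms("risk assessment", TRIPLES)
extended = expand_query("risk assessment",
                        ["risk characterization", "risk analysis"], TRIPLES)
print(sorted(offered))
print(extended)
```

The extended term list would then be passed to the search application, for example as an OR query against the document base.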
Current project status: Application: Ontology Browser for the Ontology on Food Safety, Animal and Plant Health
The Project for an Agricultural Ontology Service
• Only agreed semantic standards guarantee knowledge discovery between different applications
• The definition of knowledge organization systems is resource intensive
• Therefore FAO started initiatives to bring interested partners together
• October 2000: launch of the AGStandards initiative to agree on metadata standards
• July 2001: concept paper on the Agricultural Ontology Service
What does Agricultural Ontology Service mean?
The Agricultural Ontology Service is an approach to organize knowledge organization systems that is
• International: the Internet must become plurilingual
• Multidisciplinary: the area of subjects is broad and needs various inputs
• Cooperative: different expert knowledge has to be associated and used
• Distributed: no central ownership should be looked for
• Coordinated: coordination must ensure reusability and standardization
AOS: Iterative Knowledge Registration
• Agricultural Ontology Service (AOS): federated storage and description facility for components (terms, definitions, relationships)
• A KOS uses components to build an application
• Users search and browse the application using components
• User feedback
• Discussions and choices for amendments to components
Activities up to now
• 4 workshops (Rome, Wallingford, Florida, Copenhagen) and numerous presentations have been organized to discuss the role of ontologies and semantic standards
• Several prototypes for ontology use are in preparation
• The AGROVOC thesaurus has been enhanced, especially in multilinguality
AOS – a “business model”
• A consortium of information providers
• A clearinghouse for semantic standards in the relevant subject areas
• One-stop access to agreed standards (ontologies, metadata schemas, vocabularies…)
• Participation as a consortium in semantic web activities to get funding for specific projects (“Semkos” for EU 6th framework)
• Organization of seminars and workshops to further develop and promote the use of semantic standards
Further Information
• http://www.fao.org/agris/AOS
• http://www.fao.org/agris/AGMES