Johannes Keizer Information Systems Officer Food and Agriculture Organization of the UN

The Agricultural Ontology Service: multilingual domain ontologies for knowledge management in agriculture. Johannes Keizer, Information Systems Officer, Food and Agriculture Organization of the UN, Library and Documentation Systems Division. APAN 2003, Fukuoka, 23rd January 2003.

Presentation Transcript


  1. The Agricultural Ontology Service: multilingual domain ontologies for knowledge management in agriculture. Johannes Keizer, Information Systems Officer, Food and Agriculture Organization of the UN, Library and Documentation Systems Division. APAN 2003, Fukuoka, 23rd January 2003

  2. FAO’s mandate • Reducing the number of hungry people by 50% by the year 2015 (World Food Summit 1996) • WAICENT (World Agricultural Information Center) is FAO’s approach to fighting hunger with information • FAO itself produces a huge amount of content in its subject area • It is also within FAO’s mandate to make available useful information from other information providers • FAO collaborates in information networks

  3. Introduction • It is not difficult to find information on the WWW (if you know what you are looking for) • But it is nearly impossible to extract knowledge or structured information

  4. The Search Problem • How to evaluate search results? • Precision = Number of Relevant Documents Identified / Total Number of Documents Identified • Recall = Number of Relevant Documents Identified / Number of Relevant Documents in the Collection • Full text search engines might have a high recall (not verifiable), but precision/relevance is desperately low!
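
As a minimal sketch of the two measures on this slide, the following computes precision and recall for a single query. The document identifiers and the counts are hypothetical and serve only to illustrate the ratios.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    retrieved: documents the search engine returned
    relevant:  all relevant documents in the collection
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were identified
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query: 20 documents returned, 4 of them relevant,
# out of 5 relevant documents in the whole collection.
retrieved = [f"doc{i}" for i in range(20)]
relevant = ["doc0", "doc1", "doc2", "doc3", "doc99"]

p, r = precision_recall(retrieved, relevant)
print(f"precision={p:.2f} recall={r:.2f}")
```

Recall needs the full set of relevant documents in the collection, which is why the slide calls it "not verifiable" for Web-scale full text search.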

  5. State of Search Systems • Full text search engines based on statistical text analysis are imprecise by nature • New systems based only on “machine intelligence” have not shown very promising results • Recognition of meaning (semantic analysis) by machines is only possible by using knowledge organization systems • Agreed metadata schemas • Controlled vocabularies • Machine readable encoding

  6. Knowledge Organization Systems: Vocabularies • Existing thesauri and Knowledge Organization Systems (KOSs), grouped on the slide as dedicated KOSs, other thematic thesauri and non-dedicated KOSs: AGROVOC, CABI Thesaurus, NAL Thesaurus, GEMET, and e.g. the ASFA thesaurus, the Multilingual Forestry Thesaurus, the Sustainable Development website classification, and biological taxonomies such as NCBI and ITIS • Problems: insufficient subject and language coverage; only very simple encoding of semantic relations; common concepts are not declared; no or very limited interoperability; very limited machine readability; severe maintenance problems

  7. Ontologies? • An ontology is a formal knowledge organization system • It contains concepts (and instances) • A formal description of the application knowledge • Definitions of concepts and instances • Relations between concepts and instances • Possibility of machine processing • Nearly everyone tries to build (implicit) ontologies • Directory structures, navigation trees • Humans can overcome bad organization by intuition • Machines have no intuition; machines need formal information

  8. What benefits do we expect from Ontologies? • Semantic organization of websites • Knowledge maps • Guided discovery of knowledge • Easy retrieval of information without using complicated Boolean logic • Text processing by machines • Text mining on the Web (meaning-oriented access) • Automatic indexing and text annotation tools • Full text search engines that create meaningful classifications (semantic clustering; e.g. FAO Schwarz is not related to FAO) • Intelligent search of the Web • Building dynamic catalogues from machine readable metadata • Natural language processing • Better machine translation • Queries using natural language

  9. The Example: International Portal on Food Safety, Animal and Plant Health • Goal: to create an ontology, that is, an explicit, formal specification of a shared conceptualization of a domain of interest

  10. Ontology: conceptual model [Diagram; element labels: Concept, relationship, label, synonym, stem, description]
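
The elements named on this slide (concept, label, synonym, stem, description, relationship) could be modelled roughly as follows. This is only a sketch: the class and method names are hypothetical, and the sample synonym, stem and description are invented for illustration, not taken from the actual ontology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    label: str                  # preferred label
    description: str = ""
    stem: str = ""              # stemmed form, e.g. for text matching
    synonyms: list = field(default_factory=list)


@dataclass(frozen=True)
class Relationship:
    source: str                 # label of the source concept
    name: str                   # e.g. "is a step in the process"
    target: str                 # label of the target concept


class Ontology:
    def __init__(self):
        self.concepts = {}
        self.relationships = []

    def add_concept(self, concept: Concept) -> None:
        self.concepts[concept.label] = concept

    def relate(self, source: str, name: str, target: str) -> None:
        # Only allow relationships between concepts that are declared.
        for label in (source, target):
            if label not in self.concepts:
                raise KeyError(f"unknown concept: {label}")
        self.relationships.append(Relationship(source, name, target))

    def find(self, term: str) -> Optional[Concept]:
        """Look a concept up by its label or any of its synonyms."""
        wanted = term.lower()
        for concept in self.concepts.values():
            names = [concept.label, *concept.synonyms]
            if wanted in (n.lower() for n in names):
                return concept
        return None


# Illustrative content only (synonym, stem and description are made up).
onto = Ontology()
onto.add_concept(Concept("risk assessment", description="Evaluation of risks"))
onto.add_concept(Concept("hazard identification", stem="hazard identif",
                         synonyms=["hazard ID"]))
onto.relate("hazard identification", "is a step in the process",
            "risk assessment")
print(onto.find("hazard ID").label)
```

Declaring concepts before relating them keeps the "common concepts are declared" property explicit, which is one of the gaps the talk attributes to existing thesauri.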

  11. Ontology: RDFS model, machine readable encoding
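
To give an idea of what a machine readable RDFS encoding might look like, the sketch below writes a concept as N-Triples using only the standard library. The RDF and RDFS vocabulary URIs are the standard W3C ones; the `http://example.org/foodsafety#` namespace, the function names, the German label and the sample class hierarchy are assumptions for illustration, not the encoding shown on the slide.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/foodsafety#"  # hypothetical namespace


def uri(namespace, local_name):
    return f"<{namespace}{local_name}>"


def literal(text, lang):
    # Escape backslashes and quotes, then attach a language tag.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}'


def concept_triples(local_name, labels, parent=None):
    """Return N-Triples lines declaring one concept as an rdfs:Class.

    labels: mapping of language code -> label (multilingual labels)
    parent: local name of the broader class, if any
    """
    subject = uri(EX, local_name)
    lines = [f"{subject} {uri(RDF, 'type')} {uri(RDFS, 'Class')} ."]
    for lang, text in labels.items():
        lines.append(f"{subject} {uri(RDFS, 'label')} {literal(text, lang)} .")
    if parent is not None:
        lines.append(
            f"{subject} {uri(RDFS, 'subClassOf')} {uri(EX, parent)} ."
        )
    return lines


# Illustrative concept with labels in two languages.
lines = concept_triples(
    "Pathogens",
    {"en": "pathogens", "de": "Krankheitserreger"},
    parent="Microorganisms",
)
print("\n".join(lines))
```

One `rdfs:label` per language on a single concept is a simple way to keep the ontology multilingual without duplicating concepts.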

  12. Processes to create a Domain Ontology • Ontology acquisition (2 paths) • Creating core ontology from scratch • Automatic extraction of ontological knowledge from base vocabulary and domain specific text sources • Merging into one ontology • Refinement and Extension • Evaluation and Assessment

  13. Creation of the core ontology • Information resources: brainstorming, Codex Alimentarius, SPS Agreement • 3 subject specialists • Ontology Editor (SOEP) • Result: Core Ontology, 67 concepts, 91 relationships

  14. 1st Acquisition Approach: Focused Crawling • Input: Core Ontology, 67 concepts, 91 relationships • Focused web crawling • List of 257 food safety domain web pages • Grouping into main sites • List of extracted main sites: http://www.foodsafety.gov/ (Gateway to Government Food Safety Information); http://vm.cfsan.fda.gov/ (Center for Food Safety & Applied Nutrition); http://www.inspection.gc.ca/ (Canadian Food Inspection Agency); http://www.extension.iastate.edu/foodsafety/ (Iowa State University, Food Safety Project); http://www.foodsafety.iastate.edu (Iowa State University, Food Safety Consortium); http://www.fsis.usda.gov/ (United States Department of Agriculture, Food Safety and Inspection Service); http://www.nal.usda.gov/foodborne/index.html (Foodborne Illness Education Information Center); http://www.euro.who.int/foodsafety (World Health Organization, Regional Office for Europe, Food Safety Programme)

  15. Selection of Documents • Domain Set: Manual selection • 11 documents • Codex Alimentarius: Description, Code of Ethics, Food Hygiene, Food Import and Export • Report of consultation on risk assessment of microbiological hazards in foods • Ensuring food quality and safety, Protecting food quality and safety • Domain Set: Focused Crawler Output • 5 documents extracted: • http://vm.cfsan.fda.gov/; • http://www.inspection.gc.ca/; • http://www.foodsafety.iastate.edu; • http://www.extension.iastate.edu/foodsafety/; • http://www.euro.who.int/foodsafety • Generic documents: Manual Selection • 8 documents • www.nytimes.com • Several documents of the animal feed domain

  16. 2nd Acquisition Approach: Thesaurus Pruning • Inputs: AGROVOC, 27365 keywords (e.g. Rice: BT …, NT …, RT …); food safety documents; generic documents • Automatic pruning, 5 evaluation runs • 1632 frequent terms • Extracted ontological structure: 504 concepts, taxonomic depth 5
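
The slide does not spell out the pruning algorithm, so the following is only a rough sketch of one plausible reading: keep thesaurus terms that occur clearly more often in the domain documents than in the generic documents, then keep their broader terms (BT) so the hierarchy stays connected. The function names, the `ratio` threshold and the tiny sample thesaurus and documents are all assumptions.

```python
import re
from collections import Counter


def term_frequencies(documents, terms):
    """Relative frequency of each term across a list of text documents."""
    counts = Counter()
    total_words = 0
    for text in documents:
        lowered = text.lower()
        total_words += len(lowered.split())
        for term in terms:
            pattern = r"\b" + re.escape(term.lower()) + r"\b"
            counts[term] += len(re.findall(pattern, lowered))
    return {t: (counts[t] / total_words if total_words else 0.0)
            for t in terms}


def prune(broader, domain_docs, generic_docs, ratio=2.0):
    """Return the thesaurus terms to keep.

    broader: mapping term -> broader term (BT) or None for top terms
    ratio:   a term is 'frequent' if its domain frequency is at least
             `ratio` times its generic frequency (assumed criterion)
    """
    terms = set(broader) | {b for b in broader.values() if b}
    domain = term_frequencies(domain_docs, terms)
    generic = term_frequencies(generic_docs, terms)
    frequent = {t for t in terms
                if domain[t] > 0 and domain[t] >= ratio * generic[t]}
    kept = set(frequent)
    for term in frequent:
        # Walk up the BT chain so kept terms stay attached to a root.
        parent = broader.get(term)
        while parent and parent not in kept:
            kept.add(parent)
            parent = broader.get(parent)
    return kept


# Tiny hypothetical thesaurus and document sets.
broader = {
    "rice": "cereals",
    "cereals": "plant products",
    "plant products": None,
    "pathogens": "microorganisms",
    "microorganisms": None,
    "newspapers": None,
}
domain_docs = [
    "Pathogens in rice can cause food-borne diseases.",
    "Rice must be stored safely to keep pathogens out.",
]
generic_docs = [
    "Newspapers reported on the election.",
    "The newspapers covered sports and weather.",
]
kept = prune(broader, domain_docs, generic_docs)
print(sorted(kept))
```

Comparing against generic documents is what filters out terms that are merely common in any text, which would explain why the slide lists generic documents as a separate input.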

  17. Merging of Ontologies and Refinement • Core Ontology: 67 concepts, 91 relationships • 1632 terms from the pruning process: 12 new concepts extracted • Ontological structure extracted from AGROVOC: 23 new concepts with hierarchical relationships extracted • Assembly step: 92 new relationships created • Result: Food Safety Ontology Prototype, 102 concepts, 183 relationships

  18. Final Prototype • Core Ontology: 67 concepts, 91 relationships, 1.36 relationships per concept • Food Safety Ontology Prototype: 102 concepts, 183 relationships, 1.79 relationships per concept
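
The figures on the merging and final prototype slides can be checked with a few lines of arithmetic. The variable names are hypothetical; the numbers are the ones given in the slides.

```python
core = {"concepts": 67, "relationships": 91}

# Additions reported for the merging and refinement step.
added_concepts = {"from pruned term list": 12, "from AGROVOC structure": 23}
added_relationships = 92

prototype = {
    "concepts": core["concepts"] + sum(added_concepts.values()),
    "relationships": core["relationships"] + added_relationships,
}


def density(ontology):
    """Relationships per concept."""
    return ontology["relationships"] / ontology["concepts"]


print(prototype)
print(round(density(core), 2), round(density(prototype), 2))
```

The sums reproduce the 102 concepts and 183 relationships of the prototype, and the two ratios match the 1.36 and 1.79 shown on the slide.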

  19. 102 Concepts: Agreement on Agriculture; ALOP; ALOP, Codex; ALOP, OIE; ALR; animal byproducts; animal diseases; animal fats; animal feed additives; animal feed contaminants; animal feed ingredients; animal feeding; animal health; animal processing; animal products; animal waste; animals; antibiotics; bacteria; bakery products; biological agent; CAC; Cartagena protocol; CCFH; cereal products; cheese; chemical agent; Codex Committees; commodities; consumer health; diseases; eggs; exposure assessment; fabrication; FAO; fishes; food; food additives; food consumption; food contaminants; food export; food import; food ingredients; food safety; food-borne diseases; fungi; good hygienic practices; hazard; hazard characterization; hazard identification; human health; human nutrition; humans; international agreements; international food trade; international governmental organizations; IPPC; labelling; meat; microorganisms; microorganisms byproducts; microorganisms processing; microorganisms products; microorganisms waste; milk; milk products; milk products; non-pathogens; OIE; packaging; parasites; pathogens; physical agent; plant byproducts; plant diseases; plant feed additives; plant feed contaminants; plant feed ingredients; plant feeding; plant health; plant processing; plant products; plant waste; plants; processed animal products; processed plant products; processed products; processing; risk analysis; risk assessment; risk characterization; risk communication; risk management; slaughter; SPS agreement; standards; sugar; TBT agreement; transport; viruses; WHO; WTO

  20. 29 Unique Relationships: adopts; adversely affect; are included in; are produced by; are the source for; can be used as; constitutes; describes; determines; ensures; establishes; govern; has economical impact on; implies; includes; influences; interacts with; is a consequence of; is a step in the process; is comprised of; is established by; is protected by; originate from; refer to; requires; rule; sustains; trades; uses

  21. Current project status: Ontology creation, 2nd application of framework • Starting point: Food Safety Ontology Prototype, 102 concepts, 183 relationships • 1st acquisition approach: ~100 domain specific documents, Text-To-Onto, list of frequent terms • 2nd acquisition approach: AGROVOC, revised Ontology Pruner, pruned AGROVOC (~3000 concepts) • Merging & Refinement • Ontology Editor (OIModeler)

  22. Usage Scenario: Ontology based search extension • Biosecurity Portal, search: Risk assessment • The ontology enabled search application consults the ontology and prompts: “Mark the terms below, which you might want to include in your search” • Terms offered: Exposure assessment, Risk characterization, Hazard characterization, Hazard identification, Risk analysis, Risk communication, Risk management [Diagram: these terms are linked to Risk assessment by the relationships “is a step in the process” and “interacts with”] • Extended search: Risk assessment, Risk characterization, Risk analysis • Search results are retrieved from the doc base
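
The search extension in this usage scenario could work roughly as sketched below: look up the concepts related to the user's term, offer them for selection, and build an extended query from the chosen ones. The triples are an illustrative guess at the links in the slide's diagram, and combining terms with `OR` is an assumption; neither is confirmed by the source.

```python
from collections import defaultdict

# Illustrative triples (source, relationship, target); the exact links
# in the real ontology may differ.
TRIPLES = [
    ("hazard identification", "is a step in the process", "risk assessment"),
    ("hazard characterization", "is a step in the process", "risk assessment"),
    ("exposure assessment", "is a step in the process", "risk assessment"),
    ("risk characterization", "is a step in the process", "risk assessment"),
    ("risk assessment", "interacts with", "risk management"),
    ("risk assessment", "interacts with", "risk communication"),
    ("risk assessment", "is a step in the process", "risk analysis"),
]


def related_terms(term, triples):
    """Group the direct neighbours of `term` by relationship name."""
    neighbours = defaultdict(set)
    for source, relationship, target in triples:
        if source == term:
            neighbours[relationship].add(target)
        elif target == term:
            neighbours[relationship].add(source)
    return {rel: sorted(terms) for rel, terms in neighbours.items()}


def extend_query(term, selected):
    """Build the extended query from the original and the marked terms."""
    return " OR ".join(f'"{t}"' for t in [term, *selected])


suggestions = related_terms("risk assessment", TRIPLES)
for relationship, terms in suggestions.items():
    print(f"{relationship}: {', '.join(terms)}")

# The user marks two of the suggested terms, as in the slide's example.
query = extend_query("risk assessment",
                     ["risk characterization", "risk analysis"])
print(query)
```

Leaving the choice of terms to the user, rather than expanding automatically, keeps precision under the searcher's control.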

  23. Current project status: Application: Ontology Browser for the Ontology on Food Safety, Animal and Plant Health

  24. The Project for an Agricultural Ontology Service • Only agreed semantic standards guarantee knowledge discovery across different applications • Defining knowledge organization systems is resource intensive • Therefore FAO started initiatives to bring interested partners together • October 2000: launch of the AGStandards initiative to agree on metadata standards • July 2001: concept paper on the Agricultural Ontology Service

  25. What does Agricultural Ontology Service mean? The Agricultural Ontology Service is an approach to organizing knowledge organization systems that is • International: the Internet must become plurilingual • Multidisciplinary: the subject area is broad and needs various inputs • Cooperative: different kinds of expert knowledge have to be brought together and used • Distributed: no central ownership should be sought • Coordinated: coordination must ensure reusability and standardization

  26. AOS: Iterative Knowledge Registration • Agricultural Ontology Service (AOS): federated storage and description facility • Components: terms, definitions, relationships • KOS uses components to build an application • Users search and browse application using components • User feedback • Discussions and choices for amendments to components

  27. Activities up to now • 4 workshops (Rome, Wallingford, Florida, Copenhagen) and numerous presentations have been organized to discuss the role of ontologies and semantic standards • Several prototypes for ontology use are in preparation • The AGROVOC thesaurus has been enhanced, especially in its multilingual coverage

  28. AOS: a “business model” • A consortium of information providers • A clearinghouse for semantic standards in the relevant subject areas • One-stop access to agreed standards (ontologies, metadata schemas, vocabularies…) • Participation as a consortium in semantic web activities to get funding for specific projects (“Semkos” for EU 6th framework) • Organization of seminars and workshops to further develop and promote the use of semantic standards

  29. Further Information http://www.fao.org/agris/AOS http://www.fao.org/agris/AGMES
