Introduction • It has become commonplace to point out how difficult it is to find relevant information on the web
FAO's mandate • FAO's main goal is to halve the number of hungry people by the year 2015 • WAICENT (World Agricultural Information Center) is FAO’s approach to fighting hunger with information • FAO itself produces a huge amount of content in its subject area • It is also within FAO's mandate to make available useful information from other information providers • FAO collaborates in information networks
The Search Problem • How to evaluate search results? • Precision = (number of relevant documents identified) / (total number of documents identified) • Recall = (number of relevant documents identified) / (number of relevant documents in the collection) • Both parameters rank low today!
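The two ratios on this slide can be computed directly from a result list and a set of known relevant documents. The following is a minimal sketch; the function name `precision_recall` and the document identifiers are illustrative assumptions, not part of the original presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query.

    retrieved: documents the search system identified
    relevant:  documents in the collection that are actually relevant
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant documents identified
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: 4 documents returned, 5 relevant ones exist.
retrieved_docs = ["d1", "d2", "d3", "d4"]
relevant_docs = ["d2", "d4", "d7", "d8", "d9"]
p, r = precision_recall(retrieved_docs, relevant_docs)
print(f"precision={p:.2f} recall={r:.2f}")  # precision=0.50 recall=0.40
```

Here two of the four returned documents are relevant (precision 0.5), and two of the five relevant documents were found (recall 0.4).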
The Browse Problem • Topic trees from categorization schemes and thesauri are rigid and not very expressive • Machine-produced clusters are “flexible”, but very imprecise
Approaches to Resource Description and Discovery • Statistical: statistical analysis of texts (word counts) using words as lexical terms; most full text search engines work like this • Semantic: using structured metainformation and formal knowledge description (for example, catalogues using formal metadata schemas, categorization systems and thesauri)
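The difference between the two approaches can be illustrated with a toy example. This is only a sketch under stated assumptions: the scoring functions, the two-entry `THESAURUS`, and the sample sentence are all hypothetical, and thesaurus-based query expansion is used here as a simple stand-in for the semantic approach.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase words."""
    return re.findall(r"[a-z]+", text.lower())


def statistical_score(query, text):
    """Statistical approach: count literal occurrences of the query words."""
    counts = Counter(tokenize(text))
    return sum(counts[word] for word in tokenize(query))


# Hypothetical mini-thesaurus mapping a term to its equivalent terms.
THESAURUS = {"maize": {"corn"}, "corn": {"maize"}}


def semantic_score(query, text, thesaurus=THESAURUS):
    """Thesaurus-assisted approach: also count equivalent terms."""
    counts = Counter(tokenize(text))
    terms = set(tokenize(query))
    for word in list(terms):
        terms |= thesaurus.get(word, set())
    return sum(counts[word] for word in terms)


doc = "Corn yields rose after the corn borer was controlled."
print(statistical_score("maize", doc))  # 0
print(semantic_score("maize", doc))     # 2
```

A purely word-based search for "maize" misses the document entirely, while the controlled vocabulary lets the same query match both occurrences of "corn".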
State of Search Systems • Full text search engines based on statistical text analysis are imprecise by nature • New systems based only on “machine intelligence” do not show very promising results • Recognition of meaning (semantic analysis) by machines is only possible by using knowledge organization systems: agreed metadata schemas, controlled vocabularies, and machine-readable encoding
Knowledge Organization Systems: Metadata Schemas • The different subject gateways are attempts to use traditional semantic description for web resources • Their metadata schemas are tied to the traditional description of a bibliographical record • The most significant effort is the Dublin Core Metadata Initiative (DCMI), which defines a core set of metadata to describe information objects • In the area of agriculture, few agreed standards on metadata application profiles exist
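To make the idea of a core metadata set concrete, the sketch below checks a record against the fifteen elements of the Dublin Core Metadata Element Set. The sample record and the helper name `non_dc_keys` are hypothetical illustrations; they do not represent an FAO or AGStandards application profile.

```python
# The fifteen elements of the Dublin Core Metadata Element Set.
DC_ELEMENTS = {
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
}


def non_dc_keys(record):
    """Return, sorted, the keys of a record that are not Dublin Core elements."""
    return sorted(set(record) - DC_ELEMENTS)


# Hypothetical record describing a web resource.
record = {
    "title": "Example report on forest products",
    "creator": "Example Author",
    "subject": ["forest products"],
    "language": "en",
    "type": "Text",
    "keywords": "forests",  # deliberately not a Dublin Core element
}

print(non_dc_keys(record))  # ['keywords']
```

An agreed application profile would go further than this, for instance by stating which elements are mandatory and which vocabularies may be used for `subject`.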
Knowledge Organization Systems: Vocabularies • Insufficient subject + language coverage • Only very simple encoding of semantic relations • Common concepts are not declared • No or very limited interoperability • Very limited machine readability • Severe maintenance problems • Existing Thesauri and Knowledge Organization Systems (KOSs), dedicated KOSs: e.g., the ASFA thesaurus; the Multilingual Forestry Thesaurus; the Sustainable Development website classification; biological taxonomies such as NCBI and ITIS; other thematic thesauri • Existing Thesauri and Knowledge Organization Systems (KOSs), non-dedicated KOSs: CABI Thesaurus; AGROVOC; NAL Thesaurus; GEMET
Consequences • No common topic trees within a domain, and no cross navigation between applications in a specific domain • Keyword searches are based on statistical text analysis • Automatic indexing systems mostly show poor results • Web crawlers and harvesters perform well only on already structured information sources • The Semantic Web is very far away
Why do we look for Ontologies? • An ontology is a formal knowledge organization system • It contains concepts (and instances) • A formal description of the application knowledge • Definitions of concepts and instances • Relations between concepts and instances • Possibility of machine processing • Nearly everyone tries to build (implicit) ontologies • Directory structures, navigation trees • Humans can overcome bad organization by intuition • Machines have no intuition; machines need formal information
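The ingredients listed on this slide (concepts, instances, definitions, relations, machine processing) can be sketched as a small data structure. This is a deliberately minimal illustration: the class names, the `is_a` relation, and the sample concepts are assumptions made for the example, not the AOS data model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    name: str
    definition: str = ""
    instances: set = field(default_factory=set)


@dataclass
class Ontology:
    concepts: dict = field(default_factory=dict)
    # Relations are stored as (subject, predicate, object) triples.
    relations: set = field(default_factory=set)

    def add_concept(self, name, definition=""):
        self.concepts[name] = Concept(name, definition)

    def relate(self, subject, predicate, obj):
        """Record a relation; both ends must be declared concepts."""
        for name in (subject, obj):
            if name not in self.concepts:
                raise KeyError(f"unknown concept: {name}")
        self.relations.add((subject, predicate, obj))

    def ancestors(self, name):
        """Machine processing: follow 'is_a' relations transitively."""
        found = set()
        frontier = [name]
        while frontier:
            current = frontier.pop()
            for subj, pred, obj in self.relations:
                if subj == current and pred == "is_a" and obj not in found:
                    found.add(obj)
                    frontier.append(obj)
        return found


# Hypothetical content for illustration only.
onto = Ontology()
onto.add_concept("crop", "a plant cultivated for use")
onto.add_concept("cereal", "a grass grown for its grain")
onto.add_concept("maize", "a cereal of the species Zea mays")
onto.relate("cereal", "is_a", "crop")
onto.relate("maize", "is_a", "cereal")

print(sorted(onto.ancestors("maize")))  # ['cereal', 'crop']
```

Because the relations are explicit, a program can infer that maize is a crop without relying on the intuition a human reader would bring to a directory tree.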
What benefits do we expect from Ontologies? • Semantic organization of websites • Knowledge maps • Guided discovery of knowledge • Easy retrieval of information without using complicated Boolean logic • Text processing by machines • Text mining on the Web (meaning-oriented access) • Automatic indexing and text annotation tools • Full text search engines that create meaningful classifications through semantic clustering (e.g., FAO Schwarz is not related to FAO) • Intelligent search of the Web • Building dynamic catalogues from machine-readable metadata • Natural language processing • Better machine translation • Queries using natural language
Guided Browse and Search Facilities (screen mock-up) • Records found: 5 (placeholder results 1 to 5) • What would you like to view? Forest rights issues; Parasites of forests; Pesticides used in forests; Types of forest products; Uses of forest products • You may also be interested in...: Biotopes; Cropping systems using forests; Economics of forest production; Forestry equipment; Soil science • You can further limit by: Geographic area (Africa); Type of resource (Web page)
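The "You can further limit by" options in this mock-up correspond to what is commonly called faceted filtering. The sketch below shows one possible way to implement it; the record fields (`topic`, `area`, `type`), the sample records, and the function names are hypothetical and not taken from an actual FAO system.

```python
from collections import Counter

# Hypothetical records, each described by structured metadata.
records = [
    {"title": "Record 1", "topic": "Uses of forest products", "area": "Africa", "type": "Web page"},
    {"title": "Record 2", "topic": "Uses of forest products", "area": "Asia", "type": "Web page"},
    {"title": "Record 3", "topic": "Parasites of forests", "area": "Africa", "type": "Report"},
    {"title": "Record 4", "topic": "Forest rights issues", "area": "Africa", "type": "Web page"},
]


def limit(items, **facets):
    """Keep only the items whose metadata matches every given facet value."""
    return [
        item for item in items
        if all(item.get(key) == value for key, value in facets.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of a facet."""
    return Counter(item[facet] for item in items)


african_pages = limit(records, area="Africa", type="Web page")
print([item["title"] for item in african_pages])  # ['Record 1', 'Record 4']
print(facet_counts(records, "area")["Africa"])    # 3
```

Such filtering only works when every record carries consistent, machine-readable metadata, which is the point the presentation makes about agreed schemas and vocabularies.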
Context Sensitive Knowledge Access (screen mock-up of an agricultural web page) • Instruction to the user: Use your right mouse button to learn more about an italicized word on the page. • Pop-up for the term “Biosecurity”: management of all biological and environmental risks associated with food and agriculture, including forestry and fisheries • See also: Biosafety; Food Safety; Risk Management • Or are you interested in...: Food Security; Biological Diversity • Page content, Conservation agriculture: Farmers like it because it gives them a means of conserving, improving and making more efficient use of their natural resources • About camels and llamas: Descendants of the same rabbit-sized mammal, they have become two of humanity's most versatile domestic animals • Agribusiness and small farmers: Well managed contract farming contributes to both increased income for producers and higher profits for investors • Toward biosecurity: Biological and environmental risks associated with food and agriculture have intensified with economic globalization • Urban food marketing: In the “century of cities”, a major challenge will be providing adequate quantities of nutritional and affordable food for urban inhabitants • Crop science and ethics: In order to continue their contribution to human development, crop scientists must regain credibility
Why do we need a collaborative approach? • Only agreed semantic standards guarantee knowledge discovery between different applications • The definition of knowledge organization systems is resource intensive • Therefore FAO started initiatives to bring interested partners together • October 2000: launch of the AGStandards initiative to agree on metadata standards • July 2002: concept paper on the Agricultural Ontology Service
What does Agricultural Ontology Service mean? The Agricultural Ontology Service is an approach to organizing knowledge organization systems that is • International: the Internet must become multilingual • Multidisciplinary: the subject area is broad and needs various inputs • Cooperative: different expert knowledge has to be brought together and used • Distributed: no central ownership should be sought • Coordinated: coordination must ensure reusability and standardization
AOS: Iterative Knowledge Registration (cycle diagram) • The Agricultural Ontology Service (AOS) is a federated storage and description facility for components: terms, definitions, relationships • A KOS uses components to build an application • Users search and browse the application using components • User feedback • Discussions and choices for amendments to components, which flow back into the AOS
Activities up to now • The first workshop took place in Rome in November 2001 • A launch group was established with the participation of • Content providers (FAO, CABI) • Solution providers in the agricultural area (ATO-Wageningen, University of Florida) • Ontology development groups (AIFB Karlsruhe, CNR Italy) • The second workshop (January 2002 in Oxford) decided to develop prototypes for ontologies, which are presented at this workshop in Florida
AOS: a “business model” • A consortium of information providers • A clearinghouse for semantic standards in the relevant subject areas • One-stop access to agreed standards (ontologies, metadata schemas, vocabularies…) • Participation as a consortium in Semantic Web activities to get funding for specific projects (Ontoweb) • Organization of seminars and workshops to further develop and promote the use of semantic standards
The Role of IICA/CATIE • Catalyzing the participation of agricultural institutions from Latin America • Organizing a network of Spanish-speaking subject specialists to maintain the vocabularies • Being a partner in a consortium for semantic standards in our subject areas • And?
The Evolution of Knowledge Management (table with columns Pre-Web, Web, Semantic Web) • Information objects. Pre-Web: Books, Magazines, Articles, …; Web: Books, Magazines, Articles, Databases, Webpages; Semantic Web: Defined Electronic Information Elements • Storage. Pre-Web: Libraries/Archives/File Systems; Web: Libraries/Archives/File Systems/Websites; Semantic Web: Electronic Repositories • Catalogues. Pre-Web: Bibliographic Catalogues on Cards or Computers; Web: Bibliographic Catalogues, Machine Index Catalogues; Semantic Web: Machine Readable Metadata Repositories • Indexing. Pre-Web: Human Indexing; Web: Human Indexing, Machine Indexing; Semantic Web: Machine Indexing, Human Indexing • Analysis. Pre-Web: Human reading, checking and classifying; Web: Statistical Analysis by Machines; Semantic Web: Semantic Analysis by Machines • Outputs. Pre-Web: Bibliographies; Web: Bibliographies/Output from Fulltext Search Engines; Semantic Web: Knowledge based specialized web portals • Further elements on the slide: Reviews; Knowledge Mining; Thesauri, Classification Schemes, Glossaries, Ontologies
Further Information • http://www.fao.org/agris/AOS • http://www.fao.org/agris/AGMES • Johannes.Keizer@fao.org • Frehiwot.Fisseha@fao.org