An Intelligent Retrieval System for Chinese Agricultural Scientific Literature
Ping Qian, Xiaolu Su
Scientech Documentation and Information Center, Chinese Academy of Agricultural Sciences, China
{pingq, suxiaolu}@mail.caas.net.cn
Introduction
• Finding the desired information in huge information resources quickly and accurately has become a serious obstacle for people who develop and use network information resources.
• This project attempts to use new theory and technology to explore a solution to this problem.
• Ontology-based knowledge engineering, currently an active research area, provides an important theoretical foundation and applied technology for knowledge discovery and acquisition.
Information Retrieval Based on Ontology
• Build up the domain ontology
• Create the database, referring to the ontology
• Conduct the retrieval with the help of the ontology
• Process the results, then display them
Establishment Process of the System
• Import the classification method based on ontology theory
• Create the agricultural navigation information database
• Create the index database (Agricultural Scientific Literature Database)
• Create the Web information retrieval system
• Display the results
Foundation of Building the Agricultural Scientech Navigation Information Database
• Theory: Ontology
• Data Source: Agricultural Scientech Literature Database (more than 560,000 records)
• Tool: Statistical Analysis
• Standard: Chinese Library Classification Method
Stages of Building the Agricultural Navigation Information Database
• Agricultural Theoretical Classification Tree
• Agricultural Actual Classification Tree
• Class-Keyword Cross Table
• Keyword-Class Cross Table
• Agricultural Navigation Information Database
Agricultural Theoretical Classification Tree
• Component
  • All relevant classes from the Chinese Library Classification Method
• Purpose
  • Solve the problems in creating the actual classification tree:
    • The relation between a class number and its name
    • The hierarchical relation among some class numbers
• Data amount
  • Classes and subclasses: 42,948
  • First-layer classes: 17
Agricultural Actual Classification Tree
• Component:
  • All classes actually used in indexing
• Purpose:
  • Provide the foundation for the navigation information database
  • Reveal the actual distribution of agricultural information, to find new growth points in the development of agricultural sciences
• Data amount:
  • Classes: 21,391, of which:
    • Coordinated classes: 10,748
    • Non-coordinated classes: 10,643
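One way to picture how an actual classification tree could be derived from the theoretical one is sketched below. This is only an illustration under stated assumptions: the table layout (class number mapped to a name and an explicit parent), the function name `build_actual_tree`, and the sample class numbers and names are hypothetical and are not taken from the project.

```python
from collections import Counter

# Hypothetical excerpt of a theoretical classification tree:
# class number -> (class name, parent class number or None).
# Parents are stored explicitly because the hierarchy cannot always
# be read off the class number itself.
THEORETICAL = {
    "S": ("Agricultural sciences", None),
    "S5": ("Crops", "S"),
    "S51": ("Cereal crops", "S5"),
    "S511": ("Rice", "S51"),
    "S512": ("Wheat and barley", "S51"),
}

def build_actual_tree(record_classes, theoretical):
    """Return {class_number: direct record count} for every class that
    indexes at least one record, plus all of its ancestors."""
    direct = Counter(record_classes)
    actual = {}
    for number, count in direct.items():
        if number not in theoretical:
            continue  # unknown class numbers are skipped in this sketch
        actual[number] = actual.get(number, 0) + count
        # Walk upward so the used class stays connected to the root.
        parent = theoretical[number][1]
        while parent is not None:
            actual.setdefault(parent, 0)
            parent = theoretical[parent][1]
    return actual

# Class numbers as they might appear on three indexed records.
records = ["S511", "S511", "S512"]
tree = build_actual_tree(records, THEORETICAL)
```

Classes that no record uses never enter `tree`, which is how the actual tree ends up smaller than the theoretical one.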
Agricultural Actual Classification Tree
• Key point:
  • More than 100,000 class numbers and their corresponding class names
• Solution:
  • Create professional modeled class tables (9)
  • Create modeled class tables (6), among them:
    • General modeled class tables (2)
    • Professional modeled class tables (4)
[Figures: General Compound Class Table; Professional Compound Class Table]
Keyword-Class Cross Table
• Before removing duplicates: about 1,210,000 words
• After removing duplicates: about 320,000 words
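The two cross tables and the duplicate removal can be sketched as follows. This is a minimal illustration, assuming each literature record carries one class number and a list of keywords; the record layout, the function name `build_cross_tables`, and the sample data are hypothetical.

```python
from collections import defaultdict

def build_cross_tables(records):
    """Build a class-keyword table and a keyword-class table.

    `records` is an iterable of (class_number, keywords) pairs.
    Sets are used so that repeated keywords are stored only once.
    """
    class_to_keywords = defaultdict(set)
    keyword_to_classes = defaultdict(set)
    for class_number, keywords in records:
        for keyword in keywords:
            keyword = keyword.strip()
            if not keyword:
                continue  # ignore empty keyword fields
            class_to_keywords[class_number].add(keyword)
            keyword_to_classes[keyword].add(class_number)
    return class_to_keywords, keyword_to_classes

# Illustrative records only.
sample = [
    ("S511", ["rice", "yield"]),
    ("S511", ["rice"]),
    ("S512", ["wheat", "yield"]),
]
class_kw, kw_class = build_cross_tables(sample)

raw_count = sum(len(keywords) for _, keywords in sample)  # with duplicates
unique_count = len(kw_class)                              # after deduplication
```

Looking up a keyword in `kw_class` gives the classes it appears under, while `class_kw` gives the vocabulary observed for a class.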
Agricultural Navigation Information Database
• Determine the regulations for organizing the information
• Make XML files for the navigation information
• Choose the database management system
• Define the database structure
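As an illustration of the XML step, the sketch below serializes a small navigation tree with Python's standard `xml.etree.ElementTree` module. The element name `class`, the attribute names, and the sample nodes are assumptions made for this example; the slides do not show the project's actual XML structure.

```python
import xml.etree.ElementTree as ET

def node_to_element(node):
    """Convert one navigation node (a dict) and its children into an XML element."""
    # Element and attribute names are hypothetical, not the project's schema.
    elem = ET.Element("class", {
        "number": node["number"],
        "name": node["name"],
        "records": str(node["records"]),
    })
    for child in node.get("children", []):
        elem.append(node_to_element(child))
    return elem

# Illustrative navigation data only.
navigation = {
    "number": "S51", "name": "Cereal crops", "records": 120,
    "children": [
        {"number": "S511", "name": "Rice", "records": 80},
        {"number": "S512", "name": "Wheat and barley", "records": 40},
    ],
}

xml_text = ET.tostring(node_to_element(navigation), encoding="unicode")
print(xml_text)
```

Nesting the elements mirrors the class hierarchy directly, so a navigation page can be produced by walking the XML tree.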
The Regulations for Organizing the Information
• Never drop any class or subclass that has records
• Display order: classes with more records are listed first, then from higher class layers to lower ones
• If a node has no records and only one sub-node, the node is deleted and its sub-node moves up one layer
• Subclasses below the third layer are merged up into their third-layer class
• Subclasses with fewer than 30 records are ignored for the time being
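The regulations above can be read as a tree transformation, sketched here under several assumptions: each node is a dict with its own record count and a list of children, "more records" is measured on the node plus its descendants, and subclasses under the threshold are simply left out of the displayed tree. The function names and sample data are hypothetical, and the slides do not state how the rules interact in the real system.

```python
def total(node):
    """Records held by a node plus all of its descendants."""
    return node["records"] + sum(total(c) for c in node["children"])

def organize(node, depth=1, max_depth=3, min_records=30):
    """Return a reorganized copy of `node` following the regulations."""
    if depth >= max_depth:
        # Merge everything below the third layer into the third-layer class.
        return {"number": node["number"], "records": total(node), "children": []}
    children = []
    for child in node["children"]:
        organized = organize(child, depth + 1, max_depth, min_records)
        # Subclasses under the threshold are left out for now (assumption).
        if total(organized) >= min_records:
            children.append(organized)
    # Classes with more records are listed first.
    children.sort(key=total, reverse=True)
    # A node with no records and exactly one sub-node is replaced by that sub-node.
    if node["records"] == 0 and len(node["children"]) == 1 and children:
        return children[0]
    return {"number": node["number"], "records": node["records"], "children": children}

# Illustrative tree only.
root = {"number": "S", "records": 0, "children": [
    {"number": "S1", "records": 0, "children": [
        {"number": "S11", "records": 40, "children": [
            {"number": "S111", "records": 5, "children": []},
        ]},
    ]},
    {"number": "S2", "records": 100, "children": [
        {"number": "S21", "records": 10, "children": []},
        {"number": "S22", "records": 35, "children": []},
    ]},
]}

result = organize(root)
```

In this example `S1` is replaced by `S11` (which absorbs the records of `S111`), `S21` is left out for falling under the threshold, and `S2` is listed before `S11` because it holds more records.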
Database Management System
• Relational database
• XML-enabled database
  • Needs conversion, low efficiency
• Native XML database
  • Software AG Tamino
  • Reads XML data directly
  • Saves data in XML format
Environment for Running JSP and XML
• Java SDK 1.3.1
• Xalan 2.2.0
• Tomcat 3.2
System Framework
• XML DBMS/RDBMS + XML + Java/JSP
• Browser/Server
• Three-layer system structure
Conclusion
• The establishment of the agricultural scientific navigation information database and the development of its Web search system change the traditional retrieval method from one based on keywords to one based on a knowledge organization structure.
• It is also foundational work. The actual classification table and the cross tables between classes and keywords established in the project are valuable Chinese agricultural semantic resources.
• It is useful for further studies on the automatic identification and classification of agricultural information, as well as for constructing a strict agricultural domain ontology.
• This work is just the beginning of the study of ontology and its application in agriculture.
The End
Thank you all