Explore the potential of grid technology in agricultural research for data integration, sharing, and analysis. Implement spreadsheet-based crop data sharing and integration for easy collaboration.
VIRTUAL INTEGRATION OF DISTRIBUTED HETEROGENEOUS DATABASES BASED ON GRID CONCEPT FOR AGRICULTURAL RESEARCH
SABRAO2005
Seishi Ninomiya, snino@affrc.go.jp
National Agricultural Research Center, NARO
What is Grid?
• Concept and technology to share, integrate and coordinate distributed computer resources
• Software and hardware
• Keeping autonomy of distributed resources
• Keeping heterogeneity of distributed resources
• The term was originally used for a framework to realize a virtual supercomputer with distributed CPUs (Computational Grid)
• Data Grid seems to be more promising now
A lot of resources such as data and programs are available, but …….
Users need to obtain them one by one, knowing how to access each
e.g. a Data Grid provides you with a huge, virtually integrated database. We do not need to know where the resources are, how to use them, …
Potential of Grid in agricultural research
• Data integration and comparison among different experiments and locations are highly required, particularly for evaluating environment × genetic effects
• A tremendous number of data sets are left unused once they have been analysed within annual and/or location-specific experiments
• Data are managed by different organizations and are difficult to centralize
• Once they are integrated, we could expect to discover completely unknown facts through data mining
Outline: Implementations of Grid frameworks applicable to agricultural research
• Spreadsheet-based crop data sharing and integration
• Consistent access to heterogeneous meteorological databases
• Integration of crop data and meteorological data
Spreadsheet-based crop data sharing and integration
• Experimental data sets are usually stored using ordinary spreadsheet applications
• But it is not easy to collect and merge them, particularly among different locations
• A data grid based on spreadsheets is promising, particularly for agricultural research
• Table formats are not uniform among different locations
Spreadsheet-based crop data sharing & integration
Multi-location data sharing and integration through daily data management by spreadsheet application
[Figure labels: Internet; Application Server with Servlet Container, EJB Container and Application; DBMS]
Spreadsheet-based crop data sharing & integration
• Once you enter your experimental data in spreadsheet software (e.g. MS Excel), the data automatically become sharable over the Internet among different locations
• No skill is required
• Just a part of everyday data management
• Uniformity of tables is not strictly required
• Low cost on the user side
Basic structure of application based on EJB
[Figure labels: User (Column Definition, Search & Delete, Data Upload, Data Download); User Application (SOAP/XML); Application Server; Servlet Container (Servlet, JSP); EJB Container (Session Bean, Entity Bean); MVC roles (Controller, View, Model); DBMS]
Direct data update with spreadsheet
• A client on a Web service
• Direct data upload from the spreadsheet application
• Use of an MS Excel VB macro + its SOAP toolkit
• Seamless action with daily data management
• One can obtain any combination of records from different locations/experiments
• Data upload & download by spreadsheet files
• Data search/modification/update by Web browser
Definition of data table by spreadsheet
• The structure of the data table is registered by spreadsheet
• In the present version, the only heterogeneity accepted among original data sheets is a different order of items and missing items
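The kind of heterogeneity described here (different item order, missing items) can be sketched as follows. This is a minimal illustration in Python, not the EJB application itself; the registered column names and the sample rows are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

# Hypothetical registered table definition; item names are illustrative only.
REGISTERED_COLUMNS = ["location", "variety", "heading_date", "plant_height_cm"]

Sheet = Tuple[Sequence[str], List[Sequence[object]]]  # (header, rows)


def align_rows(header: Sequence[str], rows: List[Sequence[object]],
               registered: Sequence[str] = REGISTERED_COLUMNS) -> List[dict]:
    """Reorder each row to the registered column order; missing items become None."""
    position = {name: i for i, name in enumerate(header)}
    aligned = []
    for row in rows:
        record = {}
        for name in registered:
            value: Optional[object] = row[position[name]] if name in position else None
            record[name] = value
        aligned.append(record)
    return aligned


def merge_sheets(sheets: List[Sheet]) -> List[dict]:
    """Merge sheets from several locations into one list of uniform records."""
    merged: List[dict] = []
    for header, rows in sheets:
        merged.extend(align_rows(header, rows))
    return merged


# Made-up example: station B uses another column order and lacks heading_date.
station_a = (["location", "variety", "heading_date", "plant_height_cm"],
             [["A", "Koshihikari", "2004-08-05", 95.0]])
station_b = (["variety", "plant_height_cm", "location"],
             [["Koshihikari", 91.5, "B"]])
merged = merge_sheets([station_a, station_b])
```

Columns whose names differ from the registered ones are simply ignored in this sketch, which matches the limitation stated on the slide.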
The application is now being updated to adopt a web ontology as a meta-DB so that it can accept more heterogeneous tables
e.g. plant height: 草丈, 草高, 全高
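The role of such a meta-DB can be approximated by a lookup from local item names to one standard item. The sketch below uses a plain dictionary as a stand-in for the ontology; only the plant height terms come from the slide, and the function names are hypothetical.

```python
from typing import List, Optional

# Stand-in for the web ontology: local term -> standard item name.
# Only the plant height terms are taken from the slide.
SYNONYMS = {
    "plant height": "plant height",
    "草丈": "plant height",
    "草高": "plant height",
    "全高": "plant height",
}


def normalize_item(name: str) -> Optional[str]:
    """Return the standard item name, or None when the term is unknown."""
    return SYNONYMS.get(name.strip().lower())


def normalize_header(header: List[str]) -> List[str]:
    """Map every known column name to its standard item; keep unknown names as-is."""
    return [normalize_item(name) or name for name in header]


normalized = normalize_header(["草丈", "Plant Height", "yield"])
```

A real ontology (e.g. in OWL) would also carry relations and units, which a flat dictionary cannot express.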
Test operations
• Ca. 10,000 records from rice adaptability tests at 20 experimental stations were loaded into the application
• The application was practically operational
• Those who are not good at computers could use it easily
Consistent access to heterogeneous meteorological databases
Solution by Data Broker
• Data brokers provide consistent access to heterogeneous DBs
• Heterogeneity is absorbed by brokers (mediators)
[Figure labels: Rice Growth, Pest Management, Farm Management; MetBroker with Meta Data; Heterogeneous and Autonomous DBs (DB A, DB B, DB C, DB D)]
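The mediator idea can be sketched in a few lines: each database gets a driver that hides its native format, and clients only see one standardized record type. This is an illustrative Python sketch, not MetBroker's actual API; all class names, field names and sample values are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass
class Observation:
    """Standardized record returned to every client (fields are assumptions)."""
    station: str
    date: str
    air_temp_c: float


class Driver(Protocol):
    def fetch(self, station: str) -> List[Observation]: ...


class DbADriver:
    """Hypothetical source that stores dict rows with Celsius temperatures."""

    def __init__(self, rows: List[dict]) -> None:
        self.rows = rows

    def fetch(self, station: str) -> List[Observation]:
        return [Observation(r["stn"], r["day"], r["temp"])
                for r in self.rows if r["stn"] == station]


class DbBDriver:
    """Hypothetical source that stores tuples with Fahrenheit temperatures."""

    def __init__(self, rows: List[tuple]) -> None:
        self.rows = rows

    def fetch(self, station: str) -> List[Observation]:
        # Convert to Celsius so clients never see the native unit.
        return [Observation(name, day, (temp_f - 32.0) * 5.0 / 9.0)
                for name, day, temp_f in self.rows if name == station]


class Broker:
    """Routes a request to the right driver and returns standardized data."""

    def __init__(self) -> None:
        self._drivers: Dict[str, Driver] = {}

    def register(self, db_name: str, driver: Driver) -> None:
        self._drivers[db_name] = driver

    def request(self, db_name: str, station: str) -> List[Observation]:
        return self._drivers[db_name].fetch(station)


broker = Broker()
broker.register("A", DbADriver([{"stn": "S1", "day": "2004-08-01", "temp": 25.0}]))
broker.register("B", DbBDriver([("S2", "2004-08-01", 212.0)]))
obs_a = broker.request("A", "S1")
obs_b = broker.request("B", "S2")
```

Adding a new database then means writing one more driver; client code stays unchanged.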
Database Broker Service
[Figure labels: Client; Data Request; Data Brokerage; Search; Meta Database (where the data are, how to use them, data contents); Data request translated to DB C; Database Driver; Data acquisition; Data Standardization; Data Secondary Processing; Data Summarization (e.g. daily mean from hourly data); Standardized Data; DB A, DB B, DB C, DB D]
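The data summarization step named in the figure (daily mean from hourly data) can be illustrated as follows. This is a minimal Python sketch assuming hourly values arrive as (ISO timestamp, value) pairs; the input format and the sample values are assumptions, not the broker's real interface.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def daily_mean(hourly: Iterable[Tuple[str, float]]) -> Dict[str, float]:
    """Group hourly values by calendar day and average each group."""
    buckets: Dict[str, List[float]] = defaultdict(list)
    for timestamp, value in hourly:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        buckets[day].append(value)
    return {day: mean(values) for day, values in sorted(buckets.items())}


# Made-up hourly air temperatures.
hourly = [
    ("2004-08-01T00:00", 20.0),
    ("2004-08-01T12:00", 30.0),
    ("2004-08-02T06:00", 22.0),
]
result = daily_mean(hourly)
```

A production version would also need a rule for days with too few hourly values, which this sketch does not handle.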
Data Brokers Developed
• Meteorological DBs: MetBroker (23 DBs, >22,000 stations)
• Map DBs: ChizuBroker (3 DBs: Japan, NZ, World)
• Digital elevation DBs: DEMBroker (2 DBs: Japan 50 m, World 1 km)
• Soil DBs: SoilBroker
Adoption of EJB
[Figure, panel "Without EJB": Web Browsers; Servlet Container with Application; DBMS]
[Figure, panel "With EJB": Web Browsers; Servlet Container with Web Application; Web Service engine; Web Service Client; Java Application; EJB Container with Application; DBMS]
Present Coverage of MetBroker
Brokers Provided as Web Services
• Tight linkage based on Java objects: RMI, or HTTP with a wrapper servlet
• XML-based loose linkage: Web service (SOAP/XML)
[Figure labels: Clients; MetBroker, ChizuBroker, DEMBroker; Resource Server; Country Server]
New MetBroker with Web ontology
[Figure labels: Decision-Making Support Services, Operational Products, Simulation Models, Detailed Digital Forecast; Broker with Inference Engine; Metadata database (Item Definition: OWL, Station metadata: RDF); Meteorological databases, each with a DB Wrapper]
Steps shown: 1. Register, 2. Request, 3. Request metadata, 4. Request data
Integration of crop data and meteorological data
• Standardized interface for data exchange
[Figure labels: Rice growth model; HyDRAS; MetBroker; WeatherDB1, WeatherDB2, WeatherDB3]
Crop data and meteorological data
[Figure labels: Crop DB (spreadsheet-based DB); Data Extraction; Crop Data; Location & Date; MetCrop; SOAP/XML; MetBroker; Meteorological DB; Corresponding weather data; XML (crop data & weather data); Models/Analysis]
Crop data and weather data are combined in an XML file
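Combining the two sources by location and date can be sketched as below. The element and attribute names are hypothetical, since the actual MetCrop XML schema is not shown in the slides, and the sample values are made up.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Tuple


def combine(crop_records: List[dict],
            weather_by_key: Dict[Tuple[str, str], dict]) -> str:
    """Attach the weather data matching each crop record's location and date."""
    root = ET.Element("records")
    for rec in crop_records:
        node = ET.SubElement(root, "record",
                             location=rec["location"], date=rec["date"])
        crop = ET.SubElement(node, "crop")
        for item, value in rec["items"].items():
            ET.SubElement(crop, item).text = str(value)
        weather = weather_by_key.get((rec["location"], rec["date"]))
        if weather is not None:  # leave out <weather> when nothing matches
            weather_node = ET.SubElement(node, "weather")
            for item, value in weather.items():
                ET.SubElement(weather_node, item).text = str(value)
    return ET.tostring(root, encoding="unicode")


# Made-up sample data.
crops = [
    {"location": "A", "date": "2004-08-01", "items": {"plant_height_cm": 95.0}},
    {"location": "B", "date": "2004-08-01", "items": {"plant_height_cm": 91.5}},
]
weather = {("A", "2004-08-01"): {"air_temp_c": 24.5}}
xml_text = combine(crops, weather)
```

Keeping location and date as attributes of each record makes the join key explicit for downstream models or analysis tools.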
Conclusions
• A grid-based approach accelerates data integration, supporting several kinds of agricultural data analysis
• Standardized interfaces make development of an integration framework much less laborious, less time-consuming and less costly
• Next steps
• Need to evaluate the scalability of this approach
• Integration framework with other types of data, e.g. molecular data, soil data, variety/line data (dendrogram)
Thank you for your attention
http://www.agmodel.net/