Grid-based Database Integration in AIST Isao KOJIMA Said Mirza Pahlevi Data Intensive Computing Team GTRC,AIST {kojima,mirza}@ni.aist.go.jp
Overview • Background for Database Integration • Distributed and Heterogeneous • Target • Database Discovery • Multi-level, application-specific View • (under an Autonomous/Dynamic environment) • Approach • Bottom-up • We have just started; current results are still limited • GT3.0/OGSA-DAI based tools • Bring external web databases into the Grid environment • Query conversion service to integrate different XML schemas • Demo
Background A.I.S.T.=National Research Institute • Research Information Databases • Online on the Web • Bio/Life science • Geo/Earth science • Chemical/Material • Patent/Bibliographic
AIST & Tsukuba Area • In AIST: nearly 100 online DBs (URLs) • Tsukuba Science City • 96 research institutes • 52 public/governmental research labs • >880 URLs • The number is not so large, but the problem is the same (heterogeneous, distributed)
Current Status • Each database is separate/distributed • Can share some information • Chemical Structure, CAS Registry No. • Latitude, Longitude • Metadata Structure (Dublin Core, GILS, MARC, ...) • Integration/Interconnection is useful • New research aspects/views • Multi Integration View (Organization, Area, Research Domain) • Most of them support form-based queries only • Limitation of Web interfaces • Need to combine with computing • Data Mining • Distributed Computation
Target • Integrate existing database/computing resources within Grid framework • OGSA(OGSI),OGSA-DAI(S) framework • Provide Database Discovery function • Advanced Information Service • Autonomous Resource Management • Provide Application Specific Database View • Schema Integration, Virtualization, Ontology • Database Autonomy/Dynamism
Bottom-Up Approach for Research & Deployment [Figure: layered diagram of the practical bottom-up approach. Labels as they appear: Our Target (Advanced Database Discovery, Application Specific View, Database Autonomy); Workflow; Transaction; Distributed Query Processing; Remote Access; Our Application Field (Scientific Data); Web-Service (WS-XX); GGF-DAIS, OGSA-DAI; Existing external web databases (EC site, search portal, etc.)]
Result: Summary of the Approach & the Problems • Grid Proxy/Mediator database service to access web databases • Provides uniform OGSA-DAI SQL access to external web databases (including join) • Brings Web Databases into the OGSA Environment • How to access the external web by using the OGSA/DAIS framework • Integrate within the OGSA/DAIS framework • How to integrate (bottom-up) different database schemas • Query Conversion Service based on XQuery and XML schema • Provides an integrated view of multiple databases with different XML schemas • CrossSearch over multiple databases (not join)
Grid-based Integration Overview [Figure: layered overview diagram. Labels as they appear: Our Target; Our Application Field (Scientific Data); Workflow; Transaction; Feedback; Query Conversion Grid Service to make CrossSearch; Distributed Query Processing; Proxy/Mediator Grid Database Service to access existing external web DBs; apply; Remote Access; GGF-DAIS, OGSA-DAI; Existing external web databases (EC site, search portal, etc.)]
1: Grid Proxy/Mediator database service for external web databases [Figure: mediator diagram. Labels as they appear: Wrapper (three instances); OGSA Environment; Web Databases; Mediator; OGSA-DAI Compliant Database Access; DBLP; join; siteseer; SQL; Delphion (login); Local/Remote Databases; Proxy Databases]
Architecture [Figure: architecture diagram. Labels as they appear: Data Services Outside the Grid (e-Commerce site, search portal, web databases; XML, HTML); Internet; OGSA-DAI based System; OGSA-based Grid Environment; Mediator; Invoke Mediator; Wrapper; SQL; Grid Database Service; Site-specific query; Proxy relations; Proxy Database; Wrapper Load & Exec; Management Relations; Resource Management Grid Database Service; Resource DB (wrappers, URLs, data formats)]
Features • Globus 3.0/OGSA-DAI based • Compatible with the OGSA-DAI RDB version • Platform independent (DBMS, Wrapper) • Can be combined with wrapper generator tools (XFetch, WebL, ...) • Management functions are Grid services • Wrapper registration/deployment, external database definition • Proxy Relation within the Grid • Works as a cache for the external web DB • SQL conditions are converted to web DB query statements • The external database is handled as a table, not as a SQL function (although that is also possible) • Simple query optimization utilizing proxy relations • Approximate query on the web DB => exact processing is done on the proxy
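The "approximate query => exact processing on the proxy" idea can be illustrated with a minimal Python sketch. It is not the OGSA-DAI based implementation from the slides: the in-memory `WEB_DB` list, the `wrapper_search` function, and the `proxy` table layout are all hypothetical stand-ins, and SQLite plays the role of the proxy database.

```python
import sqlite3

# Stand-in for an external web database that only offers a keyword form.
# The records and the search behaviour are invented for illustration.
WEB_DB = [
    {"title": "Grid database access", "author": "Sato", "year": 2003},
    {"title": "Grid scheduling", "author": "Tanaka", "year": 2001},
    {"title": "XML query rewriting", "author": "Sato", "year": 2002},
]

def wrapper_search(keyword):
    """Hypothetical wrapper: the site-specific query is a coarse keyword match."""
    return [r for r in WEB_DB if keyword.lower() in r["title"].lower()]

def query_via_proxy(keyword, sql_where, params=()):
    """Load approximate results into a proxy relation, then run exact SQL on it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE proxy (title TEXT, author TEXT, year INTEGER)")
    rows = wrapper_search(keyword)  # approximate step on the "web" side
    conn.executemany("INSERT INTO proxy VALUES (:title, :author, :year)", rows)
    # Exact step: the full SQL condition is evaluated on the proxy relation.
    cur = conn.execute(
        f"SELECT title, author, year FROM proxy WHERE {sql_where} ORDER BY year",
        params,
    )
    result = cur.fetchall()
    conn.close()
    return result

results = query_via_proxy("grid", "year >= ?", (2002,))
print(results)
```

Because the proxy relation is an ordinary table, conditions that the web form cannot express (here, the year range) are still answered exactly, and a join with other local tables would work the same way.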
2: Query/Schema Conversion Service • Provides a Schema/Query Conversion Function between different XML schemas • Users can define multiple resources with XML schemas • Databases for DB resources • Users can define relationships/conversions between schemas (a simple kind of Ontology) • XQuery-to-XQuery conversion service based on this information [Figure: conversion service diagram. Labels as they appear: Schema, Location; Conversion Service; Converted XQueries; XQuery; Database Resources; XML Schema Management Service; Resource databases]
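A user-defined relationship between schemas can be pictured as a name-to-name mapping that drives query rewriting. The sketch below is only an illustration under that assumption: the `DC_TO_GILS` table, its element names, and the `convert_query` function are hypothetical, and the rewriting is a plain regular-expression rename of path steps rather than a real XQuery parser.

```python
import re

# Hypothetical element-name correspondences between two metadata schemas.
# These names are illustrative, not taken from the actual conversion service.
DC_TO_GILS = {
    "record": "Rec",
    "title": "Title",
    "creator": "Originator",
    "description": "Abstract",
}

def convert_query(query, mapping):
    """Rename path steps (names directly following '/') using the mapping."""
    def rename(match):
        name = match.group(1)
        return "/" + mapping.get(name, name)  # unmapped names pass through
    return re.sub(r"/([A-Za-z_][\w.-]*)", rename, query)

dc_query = (
    'for $r in /records/record '
    'where contains($r/title, "grid") return $r/creator'
)
gils_query = convert_query(dc_query, DC_TO_GILS)
print(gils_query)
```

This simplification would also rewrite a name that follows a slash inside a string literal, so a real conversion service would need to work on the parsed query rather than on its text.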
Application Prototype • Geographic Metadata Query System • Multiple XML databases • Dublin Core • Dublin Core + Longitude/Latitude • GILS • Application Specific • The current version is not OGSI based (in development) [Figure: Distributed Metadata Query System. Labels as they appear: Converted XQueries; DC; DC+; XQuery; GILS; JMP]
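A cross search over several XML databases differs from a join in that each source is queried in its own schema and the hits are simply collected. The following Python sketch shows that pattern with the standard library only; the `SOURCES` table, its tiny XML documents, and the element names are invented for illustration and do not reproduce the prototype's data.

```python
import xml.etree.ElementTree as ET

# Tiny stand-ins for two metadata databases with different schemas.
# Each entry: (XML text, record element name, title element name).
# All element names and records are hypothetical.
SOURCES = {
    "DC": (
        "<records><record><title>Tsukuba geology map</title></record>"
        "<record><title>Protein index</title></record></records>",
        "record",
        "title",
    ),
    "GILS": (
        "<Recs><Rec><Title>Geology survey of Kanto</Title></Rec></Recs>",
        "Rec",
        "Title",
    ),
}

def cross_search(keyword):
    """Query every source with its own element names and union the hits."""
    hits = []
    for name, (xml_text, rec_tag, title_tag) in SOURCES.items():
        root = ET.fromstring(xml_text)
        for rec in root.iter(rec_tag):
            title = rec.findtext(title_tag, default="")
            if keyword.lower() in title.lower():
                hits.append((name, title))  # tag each hit with its source
    return hits

hits = cross_search("geology")
print(hits)
```

Each hit keeps the name of the database it came from, so results from different schemas can be listed together without being merged into one record.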
Summary & Directions • 2 prototype services • OGSA-DAI compliant grid service to bring web databases into the grid • Lesson: Need for dynamic scheduling to cope with the uncertainty of the external web • Schema/Query conversion service to handle multiple XML schemas • Lesson: Need for a concise set to handle ontology • Directions • Advanced Grid Database Discovery Service • Active/Autonomous Functions for DBMS