OGSA-DAI Usage Scenarios and Behaviour: Determining good practice
Mario Antonioletti, mario@epcc.ed.ac.uk
EPCC, University of Edinburgh
http://www.ogsadai.org.uk
AHM 2004, Data Services and Middleware, 2nd September 2004

OGSA-DAI
• OGSA-DAI is middleware
  • Middleware should be invisible
  • Provide additional functionality or hide complexity
• Allows uniform access to data resources
  • Data resources: relational and XML databases, files, …
• Provides an extensible framework
  • You can extend functionality - fill any gaps
• We think it works well
  • But need feedback
• Recount how OGSA-DAI is being used
  • Some background first …
http://www.ogsadai.org.uk - AHM2004, 2nd September 2004

Basic Operational Model
[Diagram: labelled components are Client, DAISGR, Container, GDSF, GDS, and Data Resource.]

Why OGSA-DAI?
• Why use OGSA-DAI over JDBC?
  • Language independence at the client end
    • Do not need to use Java
  • Platform independence
    • Do not have to worry about connection technology, drivers, etc.
  • Can handle XML resources
  • Can embed additional functionality at the service end
    • Transformations
    • Third party delivery
    • etc.
  • Avoiding unnecessary data movement
• Provision of Metadata is powerful
• Usefulness of the Registry for service discovery
• Dynamic service binding process

More Complex Behaviour
[Diagram: a Client, Containers, GDS and GDT instances, and Data Resources, annotated with three delivery paths:]
• Deliver data back to the client.
• Deliver data to a third party.
• Deliver data to another GDS.
And there's a lot more that you can do …

Usage Patterns
[Diagram: call/response and data-flow sequences for three patterns: Retrieve, Update/Insert, and Pipeline.]
Actors (drawn as OGSI or non-OGSI processes): A - Analyst, C - Consumer, G - GDS, P - Producer
Data: Q - Query, D - Delivery, S - Status, R - Result, U - Update, I - Data id

Activities are the drivers
• Express a task to be performed by a GDS
• Three broad classes of activities:
  • Statement
  • Transformations
  • Delivery
• Extensible:
  • Easy to add new functionality
  • Does not require modification to the service interface
  • Extensions operate within the OGSA-DAI framework
• Functionality:
  • Implemented at the service
  • Work where the data is (no need to move data back)

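The extensibility point above can be illustrated with a small sketch. This is not the OGSA-DAI API: `Activity`, `DataService`, `SelectAll`, `CountRows`, and the activity names are hypothetical stand-ins, and the "data resource" is an in-memory list. The sketch only shows the idea that new functionality is registered with the service while its single `perform()` operation stays unchanged.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class Activity(ABC):
    """Hypothetical base type for one task executed by the service."""

    @abstractmethod
    def run(self, data: Any) -> Any:
        ...


class SelectAll(Activity):
    """Statement-style activity: reads rows from an in-memory 'resource'."""

    def __init__(self, rows: List[dict]) -> None:
        self._rows = rows

    def run(self, data: Any) -> List[dict]:
        return list(self._rows)


class CountRows(Activity):
    """Extension written later, without touching DataService."""

    def run(self, data: List[dict]) -> int:
        return len(data)


class DataService:
    """Stand-in for a GDS: the only public operation is perform()."""

    def __init__(self) -> None:
        self._activities: Dict[str, Activity] = {}

    def register(self, name: str, activity: Activity) -> None:
        self._activities[name] = activity

    def perform(self, name: str, data: Any = None) -> Any:
        return self._activities[name].run(data)


service = DataService()
service.register("selectAll", SelectAll([{"id": 1}, {"id": 2}]))
# New functionality is plugged in; the perform() interface is unchanged.
service.register("countRows", CountRows())

rows = service.perform("selectAll")
count = service.perform("countRows", rows)
```

In this sketch the client calls `perform()` twice for brevity; chaining activities inside one request is a separate concern.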
OGSA-DAI Deck

Building Applications
• Activities are grouped together
  • Perform document
  • Data can flow between activities
• Optimisation
  • Avoids multiple message exchanges
• Can deliver to other GDSs
  • Prerequisite for data integration
• Base middleware for projects requiring data access
  • Some capability for data integration
• That is the theory … now for the practice
  • OGSA-DAI being adopted by a number of projects …

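As a rough illustration of grouping activities so that data flows between them in a single request, the sketch below models a "perform document" as an ordered list of steps. The real perform document is an XML document with its own schema; the `Step` type, the `perform` function, and the sample rows here are assumptions made up for this example.

```python
from typing import Any, Callable, Dict, List, NamedTuple, Optional


class Step(NamedTuple):
    name: str                     # label for this activity's output
    func: Callable[[Any], Any]    # work done by the activity
    source: Optional[str]         # output name this activity consumes


def perform(document: List[Step]) -> Dict[str, Any]:
    """Run every step in order in one call, wiring outputs to inputs."""
    outputs: Dict[str, Any] = {}
    for step in document:
        data = outputs[step.source] if step.source is not None else None
        outputs[step.name] = step.func(data)
    return outputs


# Made-up table standing in for a data resource.
TABLE = [("g1", 3.2), ("g2", 0.4), ("g3", 2.9)]

document = [
    # Statement: select rows above a threshold.
    Step("query", lambda _: [r for r in TABLE if r[1] > 1.0], None),
    # Transformation: render rows as comma-separated lines.
    Step("transform", lambda rows: [f"{g},{v}" for g, v in rows], "query"),
    # Delivery: hand back a single text payload.
    Step("deliver", lambda lines: "\n".join(lines), "transform"),
]

result = perform(document)["deliver"]
```

The client sends one document and receives one result, rather than exchanging a message per activity.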
Who is Using OGSA-DAI?
N2Grid (http://www.cs.univie.ac.at/institute/index.html?project-80=80)
Bridges (http://www.brc.dcs.gla.ac.uk/projects/bridges/)
BioSimGrid (http://www.biosimgrid.org/)
INWA (http://www.epcc.ed.ac.uk/projects/inwa/)
BioGrid (http://www.biogrid.jp/)
AstroGrid (http://www.astrogrid.org/)
eDiaMoND (http://www.ediamond.ox.ac.uk/)
OGSA-DAI (http://www.ogsadai.org.uk)
GEON (http://www.geongrid.org/)
myGrid (http://www.mygrid.org.uk/)
MCS (http://www.isi.edu/~deelman/MCS/)
ODD-Genes (http://www.epcc.ed.ac.uk/oddgenes/)
OGSA-WebDB (http://www.gtrc.aist.go.jp/dbgrid/)
GridMiner (http://www.gridminer.org/)
FirstDig (http://www.epcc.ed.ac.uk/~firstdig/)
GeneGrid (http://www.qub.ac.uk/escience/projects.php#genegrid)
IU RGRBench (http://www.cs.indiana.edu/~plale/projects/RGR/OGSA-DAI.html)

Project classification

Projects using OGSA-DAI
• These projects form a list of case studies
• Need to capture requirements
  • How OGSA-DAI is being used
  • Where it succeeds and where it fails
  • Other issues that arise
  • An on-going process
• Only time to outline salient points from a couple of projects
  • More detail in the paper
  • … but this only gives a top level overview
• On-going process …
  • Solicit more
  • If you have more then please get in touch …

e-Digital MammOgraphy National Database
• Built a prototype of a national database of mammographic images in support of the UK breast screening programme
• Employ Grid technologies to facilitate this process
• Mike Brady gave a keynote that went over the details

[Diagram: per-site software stacks for CHU, KCL, UED, and UCL. Labelled components: Training Application, Data Load, Core API, Training API, Training Services, Core Services, Content Manager, DB2, OGSA-DAI, DB2 Federation, Files, and Database.]

eDiaMoND Findings:
• OGSA-DAI provides a flexible framework
  • Dynamically configure the system through discovery
  • Activities can operate with different levels of granularity
  • Federation can be introduced at various levels
• Upgrading from R3 to R4 broke some things
  • Low level XML issues
• Good documentation on how to extend the framework
  • Extended Activities to access IBM DB2 Content Manager

INWA Objectives
• Innovation Node Western Australia
  • Informing Business & Regional Policy: Grid-enabled fusion of global data and local knowledge
• Project
  • Ran from Nov 2003 to Aug 2004
  • Involved 10 partners (6 UK + 4 Australia)
• Aim
  • Data mine commercially sensitive data
    • Security an absolute MUST
  • Employ Grid technologies
    • Need access to data and computational resources
• Demonstrator using:
  • OGSA-DAI
    • Incorporate data resources
  • Sun DCG's TOG (Transfer-queue Over Globus)
    • Handle job submission to analyse micro array data

INWA
[Diagram: two sites, EPCC (UK) and Curtin (Australia), with users user@edinburgh and user@australia. Labelled components include TOG, Grid Engine, OGSA-DAI, and Data Browser; labelled data sets are Bank data, Telco data, UK Property, and Australian property.]

INWA: Lessons Learned
• Performing Data Integration:
  • TimeZone date problems
• Security issues:
  • Bugs in:
    • JavaCoG in GT3
  • OGSA-DAI could not switch security for Grid data transfers
  • TOG had no security option
  • All of these have been fixed
• Middleware not mature enough for commercial deployment

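The slide does not say what the time zone problems were. One common class of issue when integrating records from the UK and Western Australia is comparing local timestamps that carry no zone information. The sketch below is an assumption-laden illustration of that class only: the timestamps are invented, and fixed UTC offsets are used (UTC+1 for British Summer Time, UTC+8 for Perth) instead of a full time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets chosen for this example (June 2004).
LONDON_BST = timezone(timedelta(hours=1), "BST")
PERTH = timezone(timedelta(hours=8), "AWST")


def to_utc(local_text: str, tz: timezone) -> datetime:
    """Attach the source's zone to a naive local timestamp, then convert to UTC."""
    naive = datetime.strptime(local_text, "%Y-%m-%d %H:%M")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


# Invented records: the same instant as written down at each site.
uk_event = to_utc("2004-06-01 00:30", LONDON_BST)
au_event = to_utc("2004-06-01 07:30", PERTH)

# Compared as plain text, the two records look seven hours apart.
naive_gap_hours = 7.5 - 0.5
# Compared in UTC, they are the same instant, on the previous calendar day.
same_instant = uk_event == au_event
utc_date = uk_event.date().isoformat()
```

Normalising to UTC before joining data sets avoids both the apparent gap and the date mismatch.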
ODD-Genes
• OGSA-DAI Demo for Genetics
• Collaboration between
  • EPCC
  • Scottish Centre for Genomic Technology and Informatics (GTI)
  • Human Genetics Unit (HGU)
• ODD-Genes demonstrates:
  • Perform high-speed batch analysis of microarray data on the Grid
  • Browse the results of previous analyses stored in a database
  • View data from arbitrary databases as HTML
  • Discover related databases on the Grid
  • Perform coupled queries on newly-discovered databases to provide a richer analysis of gene data

ODD-Genes
[Diagram: three sites, GTI, HGU, and EPCC. Labelled components: ODD-Genes Webapp, OGSA-DAI, DAISGR, TOG, Globus, GridEngine, Micro Array Data, and Mouse Genome Information.]
Actors
1. Client
2. EPCC is an example of a computational resource.
3. HGU is an example of a data repository.

ODD-Genes Findings
• Data discovery perceived to be very important
  • Map data views: time -> spatial locations
  • Discovery of new resources
• Transparency to data access
  • @HGU had an XML database
  • @GTI had a relational database
  • Deploy OGSA-DAI and not worry about databases
• Issues
  • Registry maintenance policy
  • Semantics of the discovery process
  • Groups working in the same area but with different schemas, no generic metadata (schemas were the effective metadata)
• Provides an additional tool for researchers

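To make the "schemas were the effective metadata" issue concrete, here is a minimal sketch of discovery by schema matching. Everything in it is hypothetical: the registry contents, the `gds://` handles, the column names, and the `discover` function are invented for illustration and do not reflect the DAISGR interface.

```python
from typing import Dict, List, Set

# Hypothetical registry: service handle -> column names exposed by its schema.
REGISTRY: Dict[str, Set[str]] = {
    "gds://site-a/expression": {"gene_id", "sample", "level"},
    "gds://site-b/annotations": {"gene_id", "chromosome", "description"},
    "gds://site-c/images": {"image_id", "patient"},
}


def discover(shared_column: str) -> List[str]:
    """Return handles of services whose schema contains the given column."""
    return sorted(h for h, cols in REGISTRY.items() if shared_column in cols)


# Two services happen to share a column name, so they are found as related.
related = discover("gene_id")
# A group that named the same concept differently is silently missed.
missed = discover("geneId")
```

Matching on schema names works only while groups happen to agree on naming, which is why the lack of generic metadata shows up as an issue for discovery.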
Other Projects
• AstroGrid
  • Identified (and fixed) a number of bugs
  • Passed on requirements
• FirstDig
  • Identified a number of bugs
  • Have contributed a data browser to OGSA-DAI
• GeneGrid
  • Interfacing Perl through an OGSA-DAI service to access biological databases
  • Requirement for file support
• EdSkyQuery-G
  • Collaboration between OGSA-DAI & Eldas
  • Based on SkyQuery project by Johns Hopkins University, Baltimore, USA

More on Projects
• MSc at Edinburgh looking at data integration scenarios
  • Benchmarking OGSA-DAI
  • Investigating capabilities
    • GDSActivity allowing perform documents to be executed at other GDSs
  • Identifying further requirements for data integration - control flow
    • sequence
    • flow
  • Question as to whether such capabilities should be included in OGSA-DAI or whether OGSA-DAI should interface with other workflow languages
• MSc at Edinburgh looking at C bindings to the OGSA-DAI CTK
  • For language independence, need to provide more of these …
  • Perl, Python … Eiffel!!
• GridMiner
  • Have a really cute logo
  • Have a member of that team currently at NeSC

Conclusions
• Still early days
  • Standardisation process not stabilising quickly enough
  • Infrastructure still developing and prone to change
• OGSA-DAI acting as an enabler
  • Showing people what can be done
  • However, is it cracking a nut with a sledgehammer?
• Usage patterns are similar
  • Call for people to work together to solve similar problems
  • Problems that are not OGSA-DAI specific
    • Metadata, time zones, security, …
• Data discovery perceived to be important
  • Is this in the scope of what OGSA-DAI should be doing?
• Need to talk to users and gather war stories
  • http://www.ogsadai.org.uk/projects
  • On-going process …

Further Information
• The OGSA-DAI Project Site:
  • http://www.ogsadai.org.uk
• The DAIS-WG site:
  • http://cs.man.ac.uk/grid-db
• OGSA-DAI Users Mailing list
  • users@ogsadai.org.uk
  • General discussion on grid DAI matters
• Formal support for OGSA-DAI releases
  • http://www.ogsadai.org.uk/support
  • support@ogsadai.org.uk
• OGSA-DAI training courses