The Grid Data Warehouse. Description of prototype work in progress by AstroGrid. Access-Grid lecture to the Universities of Leeds and Sheffield by Guy Rixon on 2004-02-04. AstroGrid: the UK Virtual Observatory.
The Grid Data Warehouse
Description of prototype work in progress by AstroGrid. Access-Grid lecture to the Universities of Leeds and Sheffield by Guy Rixon on 2004-02-04.
AstroGrid: the UK Virtual Observatory
Seven UK astronomy departments collaborating to build a Virtual Observatory (VO) for the use of the entire astronomical community.
GDW description: access-grid lecture
IVOA: the community of VO projects
Purpose of the virtual observatory
To combine data from all sources into a data grid. Data sets can be images (mainly in files) or tabular (mainly in RDBMS).
[Diagram labels: Bibliographies, Archives, Live feeds, Private files, Data grid]
Example of VO use
“Find brown dwarf candidates: combine optical (e.g. APM catalogue) and IR (e.g. 2MASS) data to select by colour. Combine multi-epoch data to determine proper motions; select high-PM fraction of colour-selected sample. Then use that sample to…”
[Diagram labels: Optical archive, IR archive, Colour sample, 2nd epoch, 3rd epoch, Refined sample]
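The selection in this example can be sketched in a few lines of Python. This is only an illustration, not AstroGrid code. The record layout, the field names, the V-K colour cut of 4.5 and the proper-motion threshold of 0.1 arcsec/yr are all assumptions made for the sketch. It also assumes the optical, IR and multi-epoch measurements have already been matched to the same object.

```python
import math
from dataclasses import dataclass


@dataclass
class Source:
    """One cross-matched object (hypothetical layout, for illustration only)."""
    name: str
    v_mag: float   # optical V magnitude
    k_mag: float   # infrared K magnitude
    ra1: float     # right ascension at first epoch, degrees
    dec1: float    # declination at first epoch, degrees
    epoch1: float  # first epoch, years
    ra2: float     # right ascension at second epoch, degrees
    dec2: float    # declination at second epoch, degrees
    epoch2: float  # second epoch, years


def proper_motion(src: Source) -> float:
    """Total proper motion in arcsec/yr from two epochs (small-angle approximation)."""
    dt = src.epoch2 - src.epoch1
    # Scale the RA offset by cos(dec) so both offsets are true angles on the sky.
    d_ra = (src.ra2 - src.ra1) * math.cos(math.radians(src.dec1)) * 3600.0
    d_dec = (src.dec2 - src.dec1) * 3600.0
    return math.hypot(d_ra, d_dec) / dt


def select_candidates(sources, min_colour=4.5, min_pm=0.1):
    """Colour cut first, then keep the high proper-motion part of that sample."""
    colour_sample = [s for s in sources if s.v_mag - s.k_mag > min_colour]
    return [s for s in colour_sample if proper_motion(s) >= min_pm]


# Made-up sample data: A is red and moving, B is red but static, C is blue and moving.
SAMPLE = [
    Source("A", 20.0, 14.0, 10.0, 0.0, 1990.0, 10.0, 5.0 / 3600.0, 2000.0),
    Source("B", 20.0, 14.0, 11.0, 0.0, 1990.0, 11.0, 0.0, 2000.0),
    Source("C", 15.0, 14.0, 12.0, 0.0, 1990.0, 12.0, 5.0 / 3600.0, 2000.0),
]

if __name__ == "__main__":
    for s in select_candidates(SAMPLE):
        print(s.name, round(proper_motion(s), 3))
```

In the VO setting each list comprehension would correspond to a query against a remote archive or an intermediate stored sample, which is why machine-readable results and writeable storage matter.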
VO as collection of web sites: no good
• Results only go to browser, not to RDBMS or reprocessing
• Each site has a different query protocol
• Results in HTML etc. are not machine-readable
Basic web sites are not sufficient for the VO.
Grid metaphor: electricity supply
• Loadsa complex equipment
• Get your power from any supplier: commodity
• Simple delivery to consumer
Commodities in astronomy data grid
[Diagram labels: Common s/w on desktop; Archives; Bulk data transport; machine-readable results; combined inside grid; Writeable storage; Registry of resources; Algorithms; Metadata transport; (Processors)]
AstroGrid topology
[Diagram labels: Portal, Registry, Workflow, Algorithms, Writeable storage, Archives]
Difficult RDBMS operations
“Select objects with V-K > 4.5…” (i.e. find ‘red’ objects).
• No std. way of storing results in RDBMS
• No std. way of combining DBs
[Diagram labels: Optical archive service (U, B, V, R); IR archive service (J, H, K)]
Need for data warehouse
[Diagram labels: Join across internet (separate RDBMSs); Join inside warehouse DB (RDBMSs); 1000x speed gains]
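The warehouse idea can be illustrated with a small sketch: copy the relevant rows from each archive into one local database, then run the join there. The example uses Python's built-in sqlite3 module purely as a stand-in for the warehouse RDBMS. The table names, column names and sample rows are hypothetical. Matching on a shared object identifier is also a simplifying assumption, since real catalogues would normally be cross-matched by sky position.

```python
import sqlite3


def build_warehouse(optical_rows, ir_rows):
    """Load rows from two (hypothetical) archive extracts into one in-memory DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE optical (obj_id TEXT PRIMARY KEY, v_mag REAL)")
    conn.execute("CREATE TABLE infrared (obj_id TEXT PRIMARY KEY, k_mag REAL)")
    conn.executemany("INSERT INTO optical VALUES (?, ?)", optical_rows)
    conn.executemany("INSERT INTO infrared VALUES (?, ?)", ir_rows)
    conn.commit()
    return conn


def red_objects(conn, min_colour=4.5):
    """Join the two tables inside the warehouse and keep objects with V-K > min_colour."""
    cur = conn.execute(
        "SELECT o.obj_id, o.v_mag - i.k_mag AS v_minus_k "
        "FROM optical AS o JOIN infrared AS i ON o.obj_id = i.obj_id "
        "WHERE o.v_mag - i.k_mag > ? "
        "ORDER BY o.obj_id",
        (min_colour,),
    )
    return cur.fetchall()


# Made-up extracts standing in for what each archive service would deliver.
OPTICAL = [("obj1", 20.0), ("obj2", 15.0), ("obj3", 19.0)]
INFRARED = [("obj1", 14.0), ("obj2", 14.0), ("obj4", 12.0)]

if __name__ == "__main__":
    warehouse = build_warehouse(OPTICAL, INFRARED)
    print(red_objects(warehouse))
```

Once both tables sit in the same database, the join and the colour cut are a single local query, and the result can itself be stored as a new table for later steps.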
GDW topology extends AstroGrid
[Diagram labels: Portal, Registry, Workflow, Warehouse controller, File storage, Archive, Grid-DB (OGSA-DAI), Grid-DB (OGSA-DAI)]
GDW people
• Kona Andrews (Cambridge)
• Elizabeth Auden (MSSL)
• Martin Hill (Edinburgh)
• Tony Linde (Leicester)
• Clive Page (Leicester)
• Guy Rixon (Cambridge)
• Noel Winstanley (Jodrell Bank)
Current system
[Diagram labels: Portal, Registry, Workflow, Warehouse controller, File storage, Archive, Grid-DB (OGSA-DAI), Grid-DB (OGSA-DAI)]
• Link temporarily redirected
• DB tables preloaded; read-only DB
• Link not implemented yet
Next system (3Q2004)
[Diagram labels: Portal, Registry, Workflow, Warehouse controller, File storage, Archive, Grid-DB (OGSA-DAI), Grid-DB (OGSA-DAI)]
• Limited choice
• Two dedicated installations inside AstroGrid; multi-user
• Links implemented properly (GridFTP)
Ultimate system (2005+)
[Diagram labels: Portal, Registry, Workflow, Warehouse controller, File storage, Archive, Grid-DB (OGSA-DAI)]
• AstroGrid
• UK e-Science grid / EGEE
• One node per user; any storage node
Assessment
• Basic idea is sound
• Coding of GDW was quite simple
• Very difficult to get it all integrated
Problems with OGSA-DAI:
• Performance
• Data-size limits
• Can’t get higher functions to work yet
Proceed?
• Yes; need to experiment further
• Still expect to get science out of it
Can one use it?
• Beta testers invited
• Wait for release of “Iteration 4.1” system (soon!)
• Wait for release of “Iteration 5” system (3Q2004) to see GDW useful for science
• AstroGrid final release is at the end of 2004
http://wiki.astrogrid.org/bin/view/Astrogrid/BetaTesting
That’s all folks!