The Australian Virtual Observatory
e-Science Meeting, School of Physics, March 2003
David Barnes
What is a Virtual Observatory?
• A Virtual Observatory (VO) is a distributed, uniform interface to the data archives of the world’s major astronomical observatories.
• A VO is explored with advanced data mining and visualisation tools which exploit the unified interface to enable cross-correlation and combined processing of distributed and diverse datasets.
• VOs will rely on, and provide motivation for, the development of national and international computational and data grids.
Scientific motivation
• Understanding of astrophysical processes depends on multi-wavelength observations and input from theoretical models.
• As telescopes and instruments grow in complexity, surveys generate massive databases which require increasing expertise to comprehend.
• Theoretical modeling codes are growing in sophistication to consume available compute time.
• Major advances in astrophysics will be enabled by transparently cross-matching, cross-correlating and inter-processing otherwise disparate data.
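To make "cross-matching" concrete, here is a minimal sketch of a positional cross-match between two catalogues. It is an illustration only, not code from the Aus-VO project: the function names, the `(name, ra_deg, dec_deg)` tuple layout and the brute-force search are all assumptions made for the example, and a real VO service would use a spatial index rather than comparing every pair.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding just above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(cat_a, cat_b, radius_deg):
    """For each source in cat_a, find the nearest cat_b source within radius_deg.

    Catalogues are lists of (name, ra_deg, dec_deg) tuples (hypothetical layout).
    Brute force, O(len(cat_a) * len(cat_b)); fine for a sketch only.
    Returns a list of (name_a, name_b, separation_deg).
    """
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b)
            if sep <= radius_deg and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1]))
    return matches
```

In a VO setting the two lists would come from different archives (for example a radio catalogue and an optical one) through the same uniform interface, which is what makes a generic routine like this usable across surveys.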
Aus-VO in 2003
• “Phase A” funded AUD 260K by a 2003 ARC grant:
  • The University of Melbourne
  • The University of Sydney
  • CSIRO Australia Telescope National Facility
  • Anglo-Australian Observatory
• Funded common format on-line archive projects:
  • HIPASS: HI spectral line and 1.4-GHz continuum survey
  • SUMSS: 843 MHz continuum survey
  • ATCA archive: spectral line and radio continuum images
  • 2dFGRS: optical spectra of >200K southern galaxies
... thinking about the Aus-VO Grid, having data nodes and compute nodes...
[Diagram: candidate data ("Data"), compute ("CPU") and theory ("Theory") nodes linked by GrangeNet (Grid and Next Generation Network), a 10 Gbit backbone. Labels shown: Parkes, Sydney, ATNF/AAO, 2dFGRS, RAVE, SUMSS, Canberra, ATCA, MSO, APAC, Adelaide, Melbourne, VPAC, Swinburne, HIPASS, Gemini. Several nodes are marked as tentative ("CPU?", "Theory?", "Parkes?", "Gemini?").]
VO Interface & Portal
• Agreement with AstroGrid (UK e-Science project) to be testers for their data publication and portal creation code.
• We are collecting the necessary resources and intend to have an AstroGrid-based portal serving HIPASS catalogue data for demonstration at the IAU General Assembly in July 2003.
The MACHO Grid!
• MACHO: 8-yr lightcurves for >18 million stars
• ANU, APAC and MSO have the data on mass store, and are working on a VOTable XML description of the data (metadata).
• Agreement with San Diego Supercomputer Center to install a storage resource broker (SRB) at ANU, with a view to making the MACHO data available on an international Grid.
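As a rough idea of what a VOTable description looks like, the sketch below builds a tiny document with the element nesting VOTable uses (VOTABLE, RESOURCE, TABLE, FIELD, DATA, TABLEDATA, TR, TD). It is not the actual MACHO metadata: the table name, the column names, the version string and the single row of values are placeholders invented for the example.

```python
import xml.etree.ElementTree as ET

def build_votable(table_name, fields, rows):
    """Serialise a small table as VOTable-style XML.

    fields: list of dicts holding FIELD attributes (name, datatype, unit, ...).
    rows:   list of tuples, one value per field.
    """
    votable = ET.Element("VOTABLE", version="1.0")  # version string is an assumption
    resource = ET.SubElement(votable, "RESOURCE")
    table = ET.SubElement(resource, "TABLE", name=table_name)
    for field in fields:
        ET.SubElement(table, "FIELD", dict(field))  # column metadata
    tabledata = ET.SubElement(ET.SubElement(table, "DATA"), "TABLEDATA")
    for row in rows:
        tr = ET.SubElement(tabledata, "TR")
        for value in row:
            ET.SubElement(tr, "TD").text = str(value)
    return ET.tostring(votable, encoding="unicode")

# Hypothetical columns for a lightcurve index; not the real MACHO schema.
EXAMPLE_FIELDS = [
    {"name": "star_id", "datatype": "char", "arraysize": "*"},
    {"name": "ra", "datatype": "double", "unit": "deg"},
    {"name": "dec", "datatype": "double", "unit": "deg"},
    {"name": "n_epochs", "datatype": "int"},
]
# Placeholder values, made up purely to show the row layout.
EXAMPLE_ROWS = [("example-0001", 80.5, -69.2, 1200)]

if __name__ == "__main__":
    print(build_votable("lightcurve_index", EXAMPLE_FIELDS, EXAMPLE_ROWS))
```

The FIELD elements are the metadata: they tell any VO tool what each column means and in which units, independently of where the bulk lightcurve data are stored.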
Grid-based Visualisation
• ATNF will build a Java PixelCanvas so that AIPS++ visualisation applications can be deployed as Web-Service and Grid-Service Java Applets.
• AIPS++ is modern, OpenSource software for reducing (radio) astronomy data, comprising 1.6M lines of code.
Grid-based Volume Rendering
• Agreement between Melbourne and AstroGrid to develop our existing distributed-data volume rendering code into a fully-fledged Grid-Service.
• The challenge is to interactively render a multi-GB cube at the IAU GA 2003, using GridFTP to transfer the data volume from a remote data warehouse to a remote rendering cluster.
DataGrids for Aus-VO
• Australian archives range from ~10 GB to ~10 TB in processed (reduced) size.
• Providing just the processed images and spectra on-line requires a distributed, high-bandwidth network of data servers: that is, a DataGrid.
• Users may want some simple operations, such as smoothing or filtering, applied at the data server before the data are returned. This is a Virtual DataGrid.
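A "simple operation applied at the data server" could be as small as the boxcar (running mean) smoothing sketched below. This is only an illustration of the idea, assuming a spectrum held as a plain list of channel values; the function name and the edge handling are choices made for the example, not part of any Aus-VO software.

```python
def boxcar_smooth(values, width):
    """Running mean over an odd-sized window.

    Near the ends of the list the window is truncated, so the output
    has the same length as the input.
    """
    if width < 1 or width % 2 == 0:
        raise ValueError("width must be a positive odd integer")
    half = width // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed
```

For example, `boxcar_smooth([0, 0, 3, 0, 0], 3)` spreads the single spike over three channels. Running this on the server means only the smoothed product needs to cross the network.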
ComputeGrids for Aus-VO
• More complex operations may be applied requiring significant processing:
  • source detection and parameterisation
  • reprocessing of raw or intermediate data products with new calibration algorithms
  • combined processing of raw, intermediate or "final product" data from different archives
• These operations require a distributed, high-bandwidth network of computational nodes: that is, a ComputeGrid.
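The first of these operations, source detection and parameterisation, can be sketched in a few lines for a one-dimensional spectrum. The version below is a deliberately simple stand-in, not the algorithm used by any of the surveys named here: it assumes a robust noise estimate from the median absolute deviation (MAD), a fixed `n_sigma` threshold, and a hypothetical output dictionary per source.

```python
import statistics

def detect_sources(spectrum, n_sigma=5.0):
    """Find contiguous runs of channels above median + n_sigma * sigma.

    sigma is estimated as 1.4826 * MAD, which matches the standard
    deviation for Gaussian noise while being insensitive to the sources.
    Returns one dict of simple parameters per detected run.
    """
    median = statistics.median(spectrum)
    mad = statistics.median([abs(v - median) for v in spectrum])
    threshold = median + n_sigma * 1.4826 * mad

    sources, start = [], None
    # The trailing -inf sentinel closes a run that reaches the last channel.
    for i, value in enumerate(list(spectrum) + [float("-inf")]):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            run = spectrum[start:i]
            peak = max(run)
            sources.append({
                "first_channel": start,
                "last_channel": i - 1,
                "peak_channel": start + run.index(peak),
                "peak_value": peak,
                "summed_flux": sum(run),
            })
            start = None
    return sources
```

On real archives the same idea runs over multi-GB image cubes rather than short lists, which is why the slide places such operations on compute nodes rather than on the data servers.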