Dissemination of simulations in the Virtual Observatory • Gerard Lemson • German Astrophysical Virtual Observatory, Max-Planck Institute for extraterrestrial physics
Overview • The Virtual Observatory • Theory/simulations in the VObs • Case study: Millennium database • Storing trees in a relational database • Virtual telescope prototypes • Outlook
Virtual Observatory I • Broad goal • Make the results of astronomical research, both data and applications, more readily available to a larger community, and create value-adding services. (Alex’s talk yesterday) • Facilitate, for results: • communication • checking • (re)use • comparison • combination
Combination: a multi-wavelength view of a galaxy merger • Panels: X-ray, radio, optical • Credits: John Hibbard, http://www.cv.nrao.edu/~jhibbard/n4038/n4038.html; NASA/CXC/SAO/G. Fabbiano et al.
The problem • FIRST, ROSAT, GAIA, 2MASS, SDSS
Work on a solution • FIRST, ROSAT, GAIA, 2MASS, SDSS
Virtual Observatory II • Approach: • online availability of datasets and applications • standardized publication and discovery mechanisms • standardized description through common (meta-)data models • standardized selection mechanisms • standardized formats for transmitted data • value-added services • introduce new technologies • find clever algorithms • Organized in the International Virtual Observatory Alliance (IVOA)
Observations in the VO • Most VO efforts concentrate on observational data sets • simple observables: photons detected at a certain time from a certain area on the sky • long history of archiving • pre-existing standards (FITS) • valuable over long time (digitising 80 yr old plates) • Standards observationally biased • common sky: cone search, SIAP, region • common objects: XMatch • data models: characterisation of sky/time/energy (no polarisation yet)
Theory in the VO: issues • Simulations not so simple • complex observables • no standardisation (not even HDF5) • archiving ad hoc, for local use • Moore’s law makes the useful lifetime relatively short: a few years later one can do better • Current IVOA standards somewhat irrelevant • no common sky • no common objects • requires data models for content, physics, code
“Moore’s law” for N-body simulations • Courtesy Simon White
History of simulations • Toomre & Toomre, 1972 • Courtesy Volker Springel • Di Matteo, Springel and Hernquist, 2005
So why bother publishing simulations? • Simulations are interesting: • In many cases the only way to see processes in action • Complex observations require sophisticated models for interpretation • Bridging the gap between specialisations: not everyone has the expertise required to create simulations, but many can analyse them. • Many use cases do not require the latest/greatest • exposure time calculator • survey design
A possible formation scenario • Courtesy Volker Springel
Detailed observations • Panels: electron density, gas pressure, gas temperature • Courtesy Alexis Finoguenov, Ulrich Briel, Peter Schuecker (MPE)
Detailed predictions • Courtesy Volker Springel
Case study: Simulations in a relational database • Goal: investigate the use of relational databases (RDB) and web services for disseminating results of cosmological N-body simulations. • Why a database? • encapsulation of data in terms of logical structure, no need to know about internals of data storage • standard query language for finding information • advanced query optimizers • forces one to think carefully about data structure • speeds up path from science question to answer • facilitates communication • new ways of thinking about results • links to other efforts (Sloan SkyServer)
The Virgo consortium’s Millennium simulation • Millennium simulation • 10 billion particles, dark matter only • 500 Mpc/h (~2 Gly) periodic box • “concordance model” (as of 2004) initial conditions • 64 snapshots • 350,000 CPU hours • O(30 TB) raw + post-processed data • Post-processing: • dark matter density fields smoothed at various scales (45 * 256^3 grid cells) • dark matter cluster merger trees (~750 million) • galaxy merger trees (~1 billion/catalogue) • De Lucia & Blaizot, 2006 • Bower et al., 2006
The Millennium database + web server • Post-processing results only • SQL Server database • MPA: SQL Server 2000, soon also 2005 • Durham: SQL Server 2005 • Web application (Java in Apache Tomcat web server) • portal: http://www.mpa-garching.mpg.de/millennium/ • public DB access: http://www.g-vo.org/Millennium • private access: http://www.g-vo.org/MyMillennium • MyDB • Access methods • browser with plotting capabilities through VOPlot applet • wget + IDL, R • TOPCAT plugin
Database design: “20 queries” • Return the galaxies residing in halos of mass between 10^13 and 10^14 solar masses. • Return the galaxy content at z=3 of the progenitors of a halo identified at z=0. • Return the complete halo merger tree for a halo identified at z=0. • Find properties of all galaxies in halos of mass 10^14 solar masses at redshift 1 which have had a major merger (mass ratio < 4:1) since redshift 1.5. • Find all the z=3 progenitors of z=0 red ellipticals (i.e. B-V > 0.8, B/T > 0.5). • Find the descendants at z=1 of all LBGs (i.e. galaxies with SFR > 10 Msun/yr) at z=3. • Find all z=3 galaxies which have NO z=0 descendant. • Return all the galaxies within a sphere of radius 3 Mpc around a particular halo. • Find all the z=2 galaxies which were within 1 Mpc of an LBG (i.e. SFR > 10 Msun/yr) at some previous redshift. • Find the multiplicity function of halos depending on their environment (overdensity of the density field smoothed on a certain scale). • Find the dependency of halo formation times on environment.
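As a rough sketch of how the first of these queries might be phrased, the example below uses Python's built-in sqlite3 module as a stand-in for SQL Server. The `halos` and `galaxies` tables, their columns and the sample rows are illustrative assumptions, not the actual Millennium schema, and masses are taken to be in plain solar masses.

```python
import sqlite3

# Hypothetical miniature schema: NOT the real Millennium tables.
con = sqlite3.connect(":memory:")
con.execute("create table halos (haloId integer primary key, mass real)")
con.execute("create table galaxies (galaxyId integer primary key, haloId integer)")
con.executemany("insert into halos values (?, ?)",
                [(1, 5e12), (2, 3e13), (3, 2e14)])
con.executemany("insert into galaxies values (?, ?)",
                [(10, 1), (11, 2), (12, 2), (13, 3)])

# Galaxies residing in halos of mass between 10^13 and 10^14 solar masses.
rows = con.execute("""
    select g.galaxyId, h.haloId, h.mass
    from galaxies g
    join halos h on g.haloId = h.haloId
    where h.mass between 1e13 and 1e14
    order by g.galaxyId
""").fetchall()

for row in rows:
    print(row)
```

With this toy data only the two galaxies in halo 2 are returned; the point is that the science question maps onto a single join plus a range predicate.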
Efficient storage of trees in a relational database • Goal: allow queries for the formation history of any object • No recursion possible, or desired • Method: • depth-first ordering of trees • label each node by its rank in that ordering • store a pointer to the “last progenitor” below each node • all progenitors of a node have a label BETWEEN the label of that node and that of its last progenitor • cluster table on label
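The labelling scheme above can be sketched in a few lines of Python. This is a minimal illustration under assumptions: the toy tree, the node names and the function `label_depth_first` are hypothetical, and the actual pipeline used to build the database is not shown in these slides.

```python
from typing import Dict, List, Tuple

def label_depth_first(progenitors: Dict[str, List[str]], root: str,
                      start: int = 0) -> Dict[str, Tuple[int, int]]:
    """Return {node: (label, last_progenitor_label)} for the tree below root.

    Labels are ranks in a depth-first ordering, so the subtree of any node
    occupies the contiguous label range [label, last_progenitor_label].
    """
    own: Dict[str, int] = {}
    result: Dict[str, Tuple[int, int]] = {}
    counter = start
    # Explicit stack instead of recursion, so deep trees are not a problem.
    stack = [(root, False)]
    while stack:
        node, finished = stack.pop()
        if finished:
            # Every node in this subtree has been labelled by now, so the
            # most recently issued label belongs to the last progenitor.
            result[node] = (own[node], counter - 1)
            continue
        own[node] = counter
        counter += 1
        stack.append((node, True))
        # Push in reverse so the first listed progenitor is visited first.
        for prog in reversed(progenitors.get(node, [])):
            stack.append((prog, False))
    return result

# Hypothetical toy tree: A is the z=0 object, B and C are its progenitors.
tree = {"A": ["B", "C"], "B": ["D", "E"], "C": ["F"]}
labels = label_depth_first(tree, "A")
for name in sorted(labels, key=lambda n: labels[n][0]):
    print(name, labels[name])
```

In this toy tree the progenitors of B are exactly the nodes whose label lies between 1 and 3, which is what allows a plain BETWEEN predicate to replace a recursive traversal.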
Merger trees: select prog.* from galaxies des, galaxies prog where des.galaxyId = 0 and prog.galaxyId between des.galaxyId and des.lastProgenitorId • Leaves: select galaxyId as leaf from galaxies des where galaxyId = lastProgenitorId • Branching points: select descendantId from galaxies des where descendantId != -1 group by descendantId having count(*) > 1
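The three queries on this slide can be tried out on a toy table. The sketch below uses Python's built-in sqlite3 module rather than SQL Server, and the six-row `galaxies` table is made-up data that follows the depth-first labelling convention; only the column names `galaxyId`, `lastProgenitorId` and `descendantId` come from the slide.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""create table galaxies (
    galaxyId integer primary key,   -- depth-first label
    lastProgenitorId integer,       -- largest label in this node's subtree
    descendantId integer            -- -1 for the root
)""")
# Toy tree: 0 -> (1 -> (2, 3), 4 -> (5)); hypothetical data.
con.executemany("insert into galaxies values (?, ?, ?)", [
    (0, 5, -1), (1, 3, 0), (2, 2, 1), (3, 3, 1), (4, 5, 0), (5, 5, 4),
])

def ids(sql, *params):
    """Run a query and return its first column as a list."""
    return [row[0] for row in con.execute(sql, params)]

# Full merger tree of a given galaxy: one range scan, no recursion.
tree_of_1 = ids("""
    select prog.galaxyId
    from galaxies des, galaxies prog
    where des.galaxyId = ?
      and prog.galaxyId between des.galaxyId and des.lastProgenitorId
    order by prog.galaxyId""", 1)

# Leaves: nodes that are their own last progenitor.
leaves = ids("""
    select galaxyId from galaxies
    where galaxyId = lastProgenitorId
    order by galaxyId""")

# Branching points: nodes that are the descendant of more than one galaxy.
branching = ids("""
    select descendantId from galaxies
    where descendantId != -1
    group by descendantId
    having count(*) > 1
    order by descendantId""")

print(tree_of_1, leaves, branching)
```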
Main branches • Roots and leaves: select des.galaxyId as rootId, min(prog.lastProgenitorId) as leafId into rootLeaf from galaxies des, galaxies prog where des.galaxyId = 0 and prog.galaxyId between des.galaxyId and des.lastProgenitorId group by des.galaxyId • Main branch: select rl.rootId, b.* from rootLeaf rl, galaxies b where b.galaxyId between rl.rootId and rl.leafId
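A small runnable sketch of the main-branch construction follows, again with sqlite3 standing in for SQL Server (so `select ... into` becomes `create temp table ... as`). It assumes that the depth-first ordering visits the main progenitor first, so that the main branch occupies the labels from the root down to the first leaf; the toy rows are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""create table galaxies (
    galaxyId integer primary key,
    lastProgenitorId integer,
    descendantId integer
)""")
# Toy tree: 0 -> (1 -> (2, 3), 4 -> (5)); hypothetical data.
con.executemany("insert into galaxies values (?, ?, ?)", [
    (0, 5, -1), (1, 3, 0), (2, 2, 1), (3, 3, 1), (4, 5, 0), (5, 5, 4),
])

# Root and the leaf that ends its main branch. The smallest lastProgenitorId
# in the tree belongs to the first leaf reached by the depth-first ordering.
con.execute("""
    create temp table rootLeaf as
    select des.galaxyId as rootId,
           min(prog.lastProgenitorId) as leafId
    from galaxies des, galaxies prog
    where des.galaxyId = 0
      and prog.galaxyId between des.galaxyId and des.lastProgenitorId
    group by des.galaxyId
""")
root_leaf = con.execute("select rootId, leafId from rootLeaf").fetchall()

# Main branch: all labels between the root and that leaf.
main_branch = [row[0] for row in con.execute("""
    select b.galaxyId
    from rootLeaf rl, galaxies b
    where b.galaxyId between rl.rootId and rl.leafId
    order by b.galaxyId
""")]

print(root_leaf, main_branch)
```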
More database design features • Spatial indices • Peano-Hilbert index links to field (256^3) • Z-curve index (bit interleaved, 256^3) • SQL Server 2005 CLR integration with C# for range queries • Zone index (ix/iy/iz, 50^3): select * from galaxies where snapnum = 63 and ix = 1 and iy = 5 and iz = 20 • Random sampling: select * from galaxies where snapnum = 63 and random between 1000 and 2000
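To make the two simpler spatial indices concrete, here is a minimal Python sketch. It assumes a periodic box of side 500 (in the simulation's length unit) split evenly into 50 zones per axis, and a Z-curve key built from 8 bits per axis with x in the lowest bit of each triple; the function names and the bit order are illustrative choices, not taken from the database itself.

```python
def zone_index(coord: float, box_size: float = 500.0, n_zones: int = 50) -> int:
    """Map a coordinate in a periodic box to its zone number along one axis."""
    return int(coord / box_size * n_zones) % n_zones

def z_curve_index(ix: int, iy: int, iz: int, bits: int = 8) -> int:
    """Interleave the bits of three cell indices into one Z-curve (Morton) key.

    With bits=8 each index runs over 0..255, matching a 256^3 grid, and the
    key fits in 24 bits.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)        # x bit (assumed lowest)
        key |= ((iy >> b) & 1) << (3 * b + 1)    # y bit
        key |= ((iz >> b) & 1) << (3 * b + 2)    # z bit
    return key

# A made-up position that falls in the zone queried on the slide.
position = (15.0, 52.3, 207.9)
zones = tuple(zone_index(c) for c in position)
print(zones)
print(z_curve_index(1, 1, 1), z_curve_index(255, 255, 255))
```

Cells that are close in space tend to receive nearby Z-curve keys, so a spatial selection can be turned into a small number of range scans on a single indexed column.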
Under construction • Batch processing through CasJobs • Mock catalogues • pre-calculated in database • online MoMaF • Utilise PCA for storing photometric predictions • Tree comparisons: statistics of branch lengths, node counts; tree edit distance.
Virtual telescopes • Virtual observations of a virtual universe • Produce data products that are as similar to observational results as possible: • images • spectra • catalogues • Include atmosphere and telescope effects • predict • analyse: easier to add problems than to remove them
Prototype examples • No realistic telescope yet • Planck simulator • http://www.g-vo.org/planck • Mock catalogs through Millennium • http://www.g-vo.org/mpasims/MoMaf2? • Hydro simulations of galaxy clusters • http://www.g-vo.org/hydrosims/
Mock Map Making Facility (MoMaF) • Blaizot, J. et al., Mon. Not. Roy. Astron. Soc. 360 (2005) 159-175
Conclusions and outlook • Simulation data are a valuable addition to the VObs • Especially with interfaces similar to observational ones • IVOA theory interest group standards under development: SNAP, Semantics, Simulation data model • Virtual telescopes provide a perfect use case for testing VObs ideas: • require very different specialisations • not co-located: need distributed treatment • require standards for data structure and service APIs, as well as models linking observations and theory • need high performance computational infrastructure for scientifically meaningful results • Distributed virtual telescope configuration
Acknowledgments • Virgo consortium, in particular: • Volker Springel, Simon White, Gabriella De Lucia, Jeremy Blaizot (MPA, Munich, Germany), • Carlos Frenk, Richard Bower, John Helly (ICC, Durham, UK) • Alex Szalay, Jan van den Berg (JHU) • GAVO is funded by the German Federal Ministry for Education and Research
Relevant references and links • Springel, V., et al. (2005), Simulations of the formation, evolution and clustering of galaxies and quasars, Nature, 435, 629 • Lemson, G. and the Virgo Consortium (2006), Halo and Galaxy Formation Histories from the Millennium Simulation: Public release of a VO-oriented and SQL-queryable database for studying the evolution of galaxies in the LCDM cosmogony, http://xxx.lanl.gov/format/astro-ph/0608019 • Lemson, G. & Springel, V. (2005), Cosmological Simulations in a Relational Database: Modelling and Storing Merger Trees, ASPC, 351, Astronomical Data Analysis Software and Systems XV, http://aspbooks.org/custom/publications/paper/351-0212.html • De Lucia, G. & Blaizot, J. (2006), The hierarchical formation of the brightest cluster galaxies, http://xxx.lanl.gov/format/astro-ph/0606519/ • Bower, R., et al. (2006), The broken hierarchy of galaxy formation, Mon. Not. Roy. Astron. Soc., 370, 645-655 • http://www.mpa-garching.mpg.de/millennium and http://www.g-vo.org/Millennium