Millennium Database
Overview and some first usage experiences
Gerard Lemson and the Virgo Consortium
astro-ph/0608019
TIG session 3+Millennium database

The Virgo consortium’s Millennium simulation
• Millennium simulation
  • 10 billion particles, dark matter only
  • 500 Mpc (~2 Gly) periodic box
  • “concordance model” (as of 2004) initial conditions
  • 64 snapshots
  • 350,000 CPU hours
  • O(30 TB) raw + post-processed data
• Postprocessing:
  • dark matter density fields smoothed at various scales (45 * 256^3 grid cells)
  • dark matter cluster merger trees (~750 million)
  • galaxy merger trees (~1 billion/catalogue)
    • De Lucia & Blaizot, 2006
    • Bower et al., 2006

Dark matter and galaxies
Halos and galaxies
Database design
Database design: “20 queries”
• Return the galaxies residing in halos of mass between 10^13 and 10^14 solar masses.
• Return the galaxy content at z=3 of the progenitors of a halo identified at z=0.
• Return the complete halo merger tree for a halo identified at z=0.
• Find properties of all galaxies in halos of mass 10^14 solar masses at redshift 1 which have had a major merger (mass ratio < 4:1) since redshift 1.5.
• Find all the z=3 progenitors of z=0 red ellipticals (i.e. B-V > 0.8, B/T > 0.5).
• Find the descendants at z=1 of all LBGs (i.e. galaxies with SFR > 10 Msun/yr) at z=3.
• Find all z=3 galaxies which have NO z=0 descendant.
• Return all the galaxies within a sphere of radius 3 Mpc around a particular halo.
• Find all the z=2 galaxies which were within 1 Mpc of an LBG (i.e. SFR > 10 Msun/yr) at some previous redshift.
• Find the multiplicity function of halos depending on their environment (overdensity of the density field smoothed on a certain scale).
• Find the dependency of halo formation times on environment.

Time evolution: merger trees
Merger trees
• All progenitors of a galaxy (here galaxyId = 0):
    select prog.*
    from galaxies des, galaxies prog
    where des.galaxyId = 0
    and prog.galaxyId between des.galaxyId and des.lastProgenitorId
• Leaves:
    select galaxyId as leaf
    from galaxies des
    where galaxyId = lastProgenitorId
• Branching points:
    select descendantId
    from galaxies des
    where descendantId != -1
    group by descendantId
    having count(*) > 1

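The three merger-tree queries rely on galaxy ids being assigned in depth-first order, so that all progenitors of a galaxy fall in the id range from galaxyId to lastProgenitorId. The following is a minimal sketch of that idea on a six-galaxy toy tree, using Python's sqlite3 rather than the SQL Server database behind the real service. The toy data and the helper names (build_toy_db, progenitors, leaves, branching_points) are hypothetical; only the column names and the query shapes come from the slide.

```python
import sqlite3

# Hypothetical toy tree. Columns: galaxyId, descendantId, lastProgenitorId.
# Ids are assigned depth-first, so every progenitor of a galaxy has an id in
# the range [galaxyId, lastProgenitorId]. descendantId = -1 marks the root.
#
#   0 <- 1 <- 2
#        1 <- 3
#   0 <- 4 <- 5
TOY_GALAXIES = [
    (0, -1, 5),
    (1, 0, 3),
    (2, 1, 2),
    (3, 1, 3),
    (4, 0, 5),
    (5, 4, 5),
]


def build_toy_db():
    """Create an in-memory table holding the toy merger tree."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "create table galaxies ("
        "galaxyId integer primary key, "
        "descendantId integer, "
        "lastProgenitorId integer)"
    )
    con.executemany("insert into galaxies values (?, ?, ?)", TOY_GALAXIES)
    return con


def progenitors(con, galaxy_id):
    """Ids in the subtree rooted at galaxy_id (the galaxy itself included)."""
    sql = (
        "select prog.galaxyId from galaxies des, galaxies prog "
        "where des.galaxyId = ? "
        "and prog.galaxyId between des.galaxyId and des.lastProgenitorId "
        "order by prog.galaxyId"
    )
    return [row[0] for row in con.execute(sql, (galaxy_id,))]


def leaves(con):
    """Galaxies without progenitors: their id range contains only themselves."""
    sql = (
        "select galaxyId as leaf from galaxies des "
        "where galaxyId = lastProgenitorId order by galaxyId"
    )
    return [row[0] for row in con.execute(sql)]


def branching_points(con):
    """Galaxies that are the descendant of more than one progenitor."""
    sql = (
        "select descendantId from galaxies des "
        "where descendantId != -1 "
        "group by descendantId having count(*) > 1 "
        "order by descendantId"
    )
    return [row[0] for row in con.execute(sql)]


if __name__ == "__main__":
    con = build_toy_db()
    print(progenitors(con, 1))
    print(leaves(con))
    print(branching_points(con))
```

Note that the range query returns the selected galaxy together with its progenitors, because the between condition is inclusive at both ends.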
More database design features
• Spatial indices
  • Peano-Hilbert index links to field (256^3)
  • Z-curve index (bit interleaved, 256^3)
    • SQL Server 2005 CLR integration with C# for range queries
  • Zone index (ix/iy/iz, 50^3)
      select * from galaxies
      where snapnum = 63
      and ix = 1 and iy = 5 and iz = 20
• Random sampling
    select * from galaxies
    where snapnum = 63
    and random between 1000 and 2000

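As an illustration of two of the spatial indices listed above, here is a minimal Python sketch of a bit-interleaved Z-curve key on a 256^3 grid and of a zone index on a 50^3 grid. Several details are assumptions and not taken from the slides: the bit order of the interleaving (x lowest), the use of the 500 Mpc box size as the coordinate range, and the function names z_curve_key and zone_index. The database's own implementation (C# via SQL Server CLR) may differ.

```python
def z_curve_key(ix, iy, iz, bits=8):
    """Interleave the bits of three cell indices into one integer key.

    bits=8 corresponds to a 256^3 grid. Assumed bit order: for each bit
    position b, x goes to bit 3*b, y to bit 3*b + 1, z to bit 3*b + 2.
    """
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def zone_index(x, y, z, box_size=500.0, n_zones=50):
    """Map a position to integer zone coordinates (ix, iy, iz).

    Assumes positions lie in a periodic box of side box_size that is cut
    into n_zones equal zones per axis (10 length units each by default).
    """
    cell = box_size / n_zones

    def one_axis(value):
        wrapped = value % box_size  # periodic wrap into [0, box_size)
        return min(int(wrapped / cell), n_zones - 1)

    return one_axis(x), one_axis(y), one_axis(z)


if __name__ == "__main__":
    print(z_curve_key(1, 0, 0), z_curve_key(0, 1, 0), z_curve_key(0, 0, 1))
    print(zone_index(15.0, 55.0, 205.0))
```

Nearby cells tend to receive nearby Z-curve keys, which is what lets a single one-dimensional index serve three-dimensional range queries. The zone index instead supports simple equality filters such as ix = 1 and iy = 5 and iz = 20.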
The Millennium database web server
• Web application (Java in Apache Tomcat web server)
  • portal: http://www.mpa-garching.mpg.de/millennium/
  • public DB access: http://www.g-vo.org/Millennium
    • 30 sec / 1000 rows | 30 sec / unlimited rows
  • private access: http://www.g-vo.org/MyMillennium
    • 30 sec / 1000 rows | 420 sec / unlimited rows
    • MyDB, 1 GB, sometimes more
• Access methods
  • browser with plotting capabilities through VOPlot applet
  • wget + IDL, R
  • TOPCAT plugin

Usage statistics
• Up since Aug 2006
• Community notified via preprint server: http://xxx.lanl.gov/abs/astro-ph/0608019
• Obtained from the database log with SQL
  • > 130 registered users
  • almost 1.7 million queries (not all correct)
  • since March 3, > 5 billion rows handled

Usage patterns
• Start with milli-Millennium (1/512 of full)
  • Some download complete set
  • Mainly to test approach, SQL
• Ask for account on full Millennium
• Run into timeout
  • either ask me
  • or cut query in pieces
  • execute via script, using wget (good for the site's hit count!)
• MyDB usage
  • small projects collaborate via results
  • upload own data (when local at MPA, or via me)

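The "cut query in pieces" pattern can be sketched as follows. This is one possible way to do it, not a description of what users actually ran: it splits a query along the ix zone coordinate (0 to 49 on the 50^3 zone grid), so that each piece touches only part of the box and is more likely to finish within the timeout. The function name split_by_zone and its parameters are hypothetical.

```python
def split_by_zone(select_from, where, n_zones=50, chunk=10):
    """Split one query into several, each covering a slab of ix zones.

    select_from: the 'select ... from ...' part of the query.
    where:       the original filter, without the 'where' keyword.
    n_zones:     number of zones along the x axis (50 on a 50^3 zone grid).
    chunk:       number of ix zones covered by each piece.
    """
    pieces = []
    for lo in range(0, n_zones, chunk):
        hi = min(lo + chunk, n_zones) - 1  # inclusive upper zone of the slab
        pieces.append(
            f"{select_from} where {where} and ix between {lo} and {hi}"
        )
    return pieces


if __name__ == "__main__":
    for sql in split_by_zone("select * from galaxies", "snapnum = 63"):
        print(sql)
```

Each resulting string can then be submitted separately from a script, for example with wget as the slide mentions, and the partial results concatenated locally.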
Conclusions
• If you have valuable data (and “if you build it”), “they will come”
• PR helps
  • astro-ph/
  • presentations by owners (Simon White, Volker Springel, Carlos Frenk)
• Users are not stupid
  • can and will learn SQL
  • don’t mind learning SQL (especially when relatively young)
  • come up with interesting solutions on their own
• Documentation important
  • not optimal yet: indexes, internal relationships
• Help desk (i.e. me) helps and is much appreciated
• Possible/planned improvements
  • full upload facility into MyDB
  • mirror machine with CasJobs
  • longer timeouts
  • batch querying
  • collaboration easier