Astronomy toolkits and data structures • Adrian Jenkins • Durham University
Data requirements of cosmological simulations • Adrian Jenkins • Durham University
Talk outline • DiRAC and its major users • New astronomical instruments and missions • Mock catalogues • Millennium simulation and database • Future directions for simulations
DiRAC 2 facility • Cambridge HPC Service: data analytic cluster • Cambridge COSMOS shared memory service • Durham ICC Service: data centric cluster (6720-core iDataPlex) • Edinburgh 6144-node Blue Gene/Q • Leicester IT Services: complexity cluster
DiRAC 2 facility users • Time is allocated by the RAC, which supports large projects (up to 3 years) and smaller allocations. • Large users: • UKQCD • Virgo Consortium (UK) • UKMHD • Horizon, Leicester …
New instruments and missions • JWST • Launch date: ~2017-18 • Cost: >$5 billion • Euclid • Launch date: ~2019 • Cost: ~€500 million
Future large surveys • Photometric • e.g. Pan-STARRS, DES, LSST, Euclid-VIS • Spectroscopic • e.g. BOSS, BigBOSS, Euclid-NIS • Multi-wavelength • e.g. SKA (HI) • Wide-field (>10,000 sq deg), wide redshift range (z=0-3) • Redshift surveys: 10-50 million galaxies • Imaging surveys: ~billions of galaxies
Why build a mock? • Test galaxy formation models • Test algorithms - validation • Test processing pipelines • Assess survey performance (FoM) • Large surveys need mocks now!
Mock catalogues need observables • SFR • SFH • Stellar mass • Cold gas mass • Black hole mass • Images • Full SED (UV, optical, FIR, radio) • Galaxies: stars, gas, AGN
Euclid OU-LE3 requirements for simulations • CSWG • OU-SIM • Cosmological simulators • Instrument simulators
Generic needs from Euclid • Position, redshift • Emission line properties/spectra • Line flux, equivalent width • Broad photometry to AB~24-24.5 • Euclid NIR • Euclid VIS • Pan-STARRS griz • DES grizy • CFHTLS ugriyz • WFCAM ZYJHK • SDSS ugriz • VISTA-VHS-VIDEO ZYJHKs • Photometric redshifts
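The photometric depth quoted above (AB~24-24.5) can be turned into a flux density using the standard AB zero point of 3631 Jy. The following is a minimal sketch; the function names are illustrative and not taken from any Euclid pipeline.

```python
import math

AB_ZERO_POINT_JY = 3631.0  # flux density of an AB = 0 source, in Jy


def ab_mag_to_microjansky(mag):
    """Convert an AB magnitude to a flux density in microjansky."""
    return AB_ZERO_POINT_JY * 10.0 ** (-0.4 * mag) * 1.0e6


def microjansky_to_ab_mag(flux_ujy):
    """Convert a flux density in microjansky back to an AB magnitude."""
    return -2.5 * math.log10(flux_ujy * 1.0e-6 / AB_ZERO_POINT_JY)


# Depths quoted on the slide
for depth in (24.0, 24.5):
    print(f"AB = {depth}: {ab_mag_to_microjansky(depth):.3f} microJy")
```

A mock catalogue that stores fluxes rather than magnitudes can apply the same relation per band to impose the survey depth cut.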
Specific needs: clustering • 1% P(k) accuracy • Covariance estimates: P(k) etc • Initial conditions for reconstruction • Different cosmologies • Different galaxy formation models (vary bias)
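Covariance estimates for P(k) are commonly obtained from the scatter between many independent mocks. The sketch below assumes the power spectra have already been measured and stored as an array of shape (number of mocks, number of k-bins); the function names and the synthetic input are hypothetical and only illustrate the estimator.

```python
import numpy as np


def pk_covariance(pk_mocks):
    """Return the mean P(k) and the sample covariance between k-bins.

    pk_mocks: array-like of shape (n_mocks, n_bins), one row per mock.
    """
    pk_mocks = np.asarray(pk_mocks, dtype=float)
    mean_pk = pk_mocks.mean(axis=0)
    # rowvar=False: columns are the variables (k-bins), rows are realisations.
    # np.cov uses the unbiased 1/(n_mocks - 1) normalisation by default.
    cov = np.atleast_2d(np.cov(pk_mocks, rowvar=False))
    return mean_pk, cov


def fractional_error_on_mean(pk_mocks):
    """Fractional statistical error on the mean P(k) in each k-bin."""
    mean_pk, cov = pk_covariance(pk_mocks)
    n_mocks = len(pk_mocks)
    return np.sqrt(np.diag(cov) / n_mocks) / mean_pk


# Synthetic stand-in for measured spectra: 200 mocks, 10 k-bins, 5% scatter
rng = np.random.default_rng(0)
fake_pk = 1.0e4 * (1.0 + 0.05 * rng.standard_normal((200, 10)))
print(fractional_error_on_mean(fake_pk).max())
```

The number of mocks has to exceed the number of k-bins for the estimated covariance matrix to be invertible, which is one reason clustering analyses ask for large mock ensembles.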
Specific needs: clusters of galaxies • DM haloes with M > 10^13 Msun; r(Δ) for Δ = 2500, 500, 200; velocity dispersion along each axis from DM particles • For each galaxy: host halo ID, central or satellite flag • Simulated images for cluster detection and mass determination through weak/strong lensing
Specific needs: weak lensing • Galaxies and DM to generate kappa map • Galaxy shapes with noise (no IA) • Galaxy shapes with IA • Shear at each galaxy position • Image properties: • mask, bright stars, chip boundaries, CCD defects, ghosts, variations in depth & background
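The halo quantities listed above can be illustrated with a short sketch. It assumes that Δ is defined relative to the critical density at z = 0, so that M = (4/3)π r³ Δ ρ_crit, and that h = 0.7; the slides do not state either choice, and the function names are illustrative.

```python
import numpy as np

RHO_CRIT_H2 = 2.775e11  # critical density at z = 0, in h^2 Msun / Mpc^3


def r_delta(mass_msun, delta, h=0.7):
    """Radius (Mpc) enclosing a mean density of delta times critical.

    Assumes overdensity relative to the z = 0 critical density.
    """
    rho_crit = RHO_CRIT_H2 * h**2
    return (3.0 * mass_msun / (4.0 * np.pi * delta * rho_crit)) ** (1.0 / 3.0)


def velocity_dispersion_axes(velocities):
    """1D velocity dispersion along x, y, z from an (N, 3) particle array."""
    v = np.asarray(velocities, dtype=float)
    return v.std(axis=0)


# Radii for the three overdensities on the slide, at the 10^13 Msun threshold.
# For simplicity the same mass is used for every delta, which is only a rough
# illustration: in a real halo the enclosed mass differs for each overdensity.
for delta in (2500, 500, 200):
    print(delta, r_delta(1.0e13, delta))
```

In a real halo catalogue r(Δ) is measured from the particle distribution around the halo centre rather than derived from a single mass, so this relation is best read as the definition linking the two.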
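One common way to go from a kappa map to shear is the flat-sky Kaiser-Squires relation in Fourier space, γ̂ = [(l1² - l2²) + 2i l1 l2] / |l|² κ̂. The sketch below assumes a small periodic patch on a regular grid; it is not necessarily the method used for the Euclid mocks, and the function name is illustrative.

```python
import numpy as np


def kappa_to_shear(kappa):
    """Flat-sky Kaiser-Squires: convergence map -> (gamma1, gamma2) maps.

    kappa: 2D array indexed as [y, x]; the map is assumed periodic.
    """
    ny, nx = kappa.shape
    ly = np.fft.fftfreq(ny)[:, None]  # wavenumbers along axis 0 (y)
    lx = np.fft.fftfreq(nx)[None, :]  # wavenumbers along axis 1 (x)
    l2 = lx**2 + ly**2
    l2[0, 0] = 1.0  # avoid 0/0; the mean (l = 0) mode is zeroed below

    kappa_hat = np.fft.fft2(kappa)
    gamma1_hat = (lx**2 - ly**2) / l2 * kappa_hat
    gamma2_hat = 2.0 * lx * ly / l2 * kappa_hat
    gamma1_hat[0, 0] = 0.0  # shear is insensitive to the mean convergence
    gamma2_hat[0, 0] = 0.0

    gamma1 = np.fft.ifft2(gamma1_hat).real
    gamma2 = np.fft.ifft2(gamma2_hat).real
    return gamma1, gamma2


# A plane wave along x: expect gamma1 to follow kappa and gamma2 to vanish
n = 64
x = np.arange(n)[None, :] * np.ones((n, 1))
kappa_wave = np.cos(2.0 * np.pi * 3.0 * x / n)
g1, g2 = kappa_to_shear(kappa_wave)
print(np.abs(g1 - kappa_wave).max(), np.abs(g2).max())
```

The shear at each galaxy position would then be obtained by interpolating these maps to the galaxy coordinates, a step not shown here.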
Infrastructure required to make mocks • Require large simulations • To date these have been simulations of dark matter in large cosmological volumes.
Input simulations • Large N-body simulations • Approaching a trillion particles
Future needs • Simulations for Euclid: multi-trillion particle simulations • Produce multi-petabyte datasets • Data growing faster than network capabilities • Need to scale databases up • Ideally would like to serve the raw simulation data, which is two or more orders of magnitude larger.
Summary • Cosmological simulations are required to make the best use of observatories and space missions • The size of the required simulations makes this a big data problem • Databases have proved a very successful way of presenting processed data • Making the raw simulation data public is desirable, but very challenging given financial constraints.
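The data volumes above can be checked with back-of-the-envelope arithmetic. The sketch below assumes 32 bytes per particle (three 4-byte positions, three 4-byte velocities and an 8-byte ID); the particle count, snapshot count and link speed in the example are hypothetical round numbers, not figures from the talk.

```python
BYTES_PER_PARTICLE = 3 * 4 + 3 * 4 + 8  # assumed layout: pos + vel + ID = 32 bytes


def snapshot_bytes(n_particles, bytes_per_particle=BYTES_PER_PARTICLE):
    """Raw size of one particle snapshot in bytes."""
    return n_particles * bytes_per_particle


def transfer_days(n_bytes, link_gbit_per_s):
    """Days needed to move n_bytes over a link running at full speed."""
    seconds = n_bytes * 8 / (link_gbit_per_s * 1.0e9)
    return seconds / 86400.0


# One trillion particles, single snapshot
one_snapshot = snapshot_bytes(10**12)
print(one_snapshot / 1.0e12, "TB per snapshot")

# Hypothetical multi-trillion run: 4e12 particles, 64 snapshots
total = snapshot_bytes(4 * 10**12) * 64
print(total / 1.0e15, "PB in total")
print(transfer_days(total, 10), "days over a 10 Gbit/s link")
```

Under these assumptions a single trillion-particle snapshot is tens of terabytes and a full multi-trillion run reaches petabytes, which takes weeks to move even over a dedicated fast link.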