Data-Intensive Science: Addressing common needs with shared tools
Christopher Stubbs
Professor, Department of Physics, Department of Astronomy
cstubbs@fas.harvard.edu
Storing, analyzing, and exploiting large data sets
• Searching for dark matter and dark energy
• Detailed imaging of brain function
• Searching for new elementary particles
Some common threads
• Ambitious instruments, copious data
• E.g. tens of TB per night from imminent astronomy surveys
• Loosely coupled computing
• Don’t need linked analysis that uses all images
• Diverse applications from common data
• Simulations are an integral aspect
• Build apparatus here, run it elsewhere
• International collaborations
• Computer science aspects
• World’s largest non-proprietary databases
• Clustering, data mining, file system optimization…
CERN, outside Geneva (27 km ring)
Seriously Big Toys
Harvard involvement in ATLAS detector:
• J. DaCosta and G. Brandenberg at CERN now, in shakedown
• Built muon chambers here
• J. Huth plays leadership role in scientific computing for LHC
Event Simulations
• >30 million event simulations are typical
• Pick an interaction
• Propagate through model of the detector
• Measure detection efficiencies
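The simulation loop on this slide (pick an interaction, propagate it through a detector model, measure the detection efficiency) can be sketched as a toy Monte Carlo. This is only an illustration, not the ATLAS simulation chain: the function name, the angular acceptance cut, and the per-hit efficiency are all hypothetical values chosen for the example.

```python
import random


def simulate_efficiency(n_events, acceptance=0.9, hit_efficiency=0.95, seed=0):
    """Toy Monte Carlo estimate of a detection efficiency.

    acceptance and hit_efficiency are hypothetical detector parameters,
    not numbers from the slides.
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_events):
        # "Pick an interaction": one particle with a random direction,
        # cos(theta) uniform in [-1, 1].
        cos_theta = rng.uniform(-1.0, 1.0)
        # "Propagate through model of the detector": the toy detector only
        # covers directions with |cos(theta)| below the acceptance cut.
        in_acceptance = abs(cos_theta) < acceptance
        # A particle inside the acceptance is recorded with fixed probability.
        if in_acceptance and rng.random() < hit_efficiency:
            detected += 1
    # "Measure detection efficiencies": fraction detected, with binomial error.
    efficiency = detected / n_events
    error = (efficiency * (1.0 - efficiency) / n_events) ** 0.5
    return efficiency, error


if __name__ == "__main__":
    eff, err = simulate_efficiency(100_000)
    print(f"efficiency = {eff:.4f} +/- {err:.4f}")
```

For this toy model the expected efficiency is simply acceptance times hit_efficiency, so the estimate should scatter around 0.855. The statistical error shrinks as one over the square root of the number of events, which is one reason tens of millions of simulated events are typical.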
On-the-fly event reconstruction
• Aggregate event statistics
• Find tracks and trigger/store if interesting
• Precise track determination
ATLAS computing
• 5 million lines of code
• 200 developers, worldwide
• 200 collision events per second
• Automated event selection in firmware
• Selected subset of events to disk
• These selected events distributed worldwide to a hierarchy of data centers
Sky Surveys in Astronomy
• Optical: PanSTARRS, 1.4 Gpix, 1.8 m
• Radio: Mileura Wide-Field Array
• 1 km array of 8000 custom antennas
• 128 gigabit/s computing challenge
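To put the 128 gigabit/s figure in more familiar units, the short conversion below turns a bit rate into terabytes per day. It assumes, purely for illustration, that the quoted rate is a sustained raw data rate over a full 24 hours; the slide does not say so, and the function name is made up for this sketch.

```python
def gigabits_per_s_to_terabytes_per_day(rate_gbit_s):
    """Convert a sustained rate in gigabit/s to terabytes/day (decimal units)."""
    bytes_per_second = rate_gbit_s * 1e9 / 8  # 8 bits per byte
    seconds_per_day = 86_400
    return bytes_per_second * seconds_per_day / 1e12


if __name__ == "__main__":
    tb_per_day = gigabits_per_s_to_terabytes_per_day(128)
    print(f"{tb_per_day:.1f} TB/day")
```

Under that assumption, 128 gigabit/s is 16 GB/s, or roughly 1.4 PB per day.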
Our View of the Expanding Universe
• Close, recent; far, ancient
• Expansion causes stretching of light, “redshift”
• Expansion history can be mapped by measuring both distances and redshifts
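The "stretching of light" can be made concrete with the standard definition of redshift, 1 + z = observed wavelength / emitted wavelength. The sketch below applies it to a single spectral line; the function name is hypothetical and the observed wavelength is an invented illustrative value, not a measurement from the talk.

```python
def redshift(observed_nm, rest_nm):
    """Redshift z from 1 + z = observed wavelength / rest-frame wavelength."""
    if rest_nm <= 0:
        raise ValueError("rest wavelength must be positive")
    return observed_nm / rest_nm - 1.0


if __name__ == "__main__":
    h_alpha_rest_nm = 656.28       # hydrogen H-alpha line, rest frame
    h_alpha_observed_nm = 984.42   # illustrative value, stretched by 1.5x
    z = redshift(h_alpha_observed_nm, h_alpha_rest_nm)
    print(f"z = {z:.3f}")
```

A line observed at 1.5 times its rest wavelength corresponds to z of about 0.5, meaning the universe has expanded by a factor of 1.5 since the light was emitted.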
Supernovae are powerful cosmological probes
• Distances to ~6% from brightness
• Redshifts from features in spectra
(Hubble Space Telescope, NASA)
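"Distances from brightness" rests on the distance modulus relation, m - M = 5 log10(d / 10 pc). The sketch below uses it to show what a ~6% distance uncertainty means in magnitudes. It is a simplified illustration that ignores dust extinction, K-corrections, and light-curve shape corrections; the function names are hypothetical, and the example magnitudes are chosen only to make the arithmetic easy to check.

```python
import math


def distance_pc(apparent_mag, absolute_mag):
    """Distance in parsecs from the distance modulus m - M = 5*log10(d / 10 pc)."""
    return 10.0 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


def magnitude_error(fractional_distance_error):
    """Magnitude uncertainty equivalent to a small fractional distance error.

    Differentiating the distance modulus gives sigma_m = (5 / ln 10) * sigma_d / d.
    """
    return 5.0 / math.log(10.0) * fractional_distance_error


if __name__ == "__main__":
    # An object 5 magnitudes fainter than its absolute magnitude sits at 100 pc.
    print(f"{distance_pc(5.0, 0.0):.1f} pc")
    print(f"6% in distance ~ {magnitude_error(0.06):.2f} mag")
```

By this relation, a 6% distance measurement corresponds to knowing the supernova's peak brightness to about 0.13 magnitudes.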
Near Earth Asteroids
• Inventory of solar system is incomplete
• R=1 km asteroids are dinosaur killers
• R=300 m asteroids in ocean wipe out a coastline
• Demanding project: requires mapping the sky down to 24th magnitude every few days, individual exposures not to exceed ~20 sec
• PanSTARRS will detect NEAs to ~400 m
Cosmic Cinematography: Challenges
• The “static” sky: optimal co-adding of images, database issues
• The transient sky: variability classification, asteroid association and orbits, light curve analysis, fusion with other data sets
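One common starting point for co-adding repeated exposures of the "static" sky is an inverse-variance weighted mean, sketched below. This is not necessarily the method used by any survey named here: it assumes the images are already registered to the same pixel grid, share the same point-spread function, and each have a single uniform noise variance. Real pipelines also have to handle PSF matching, masking, and background subtraction. The function name and data layout (nested lists) are hypothetical.

```python
def coadd(images, variances):
    """Inverse-variance weighted mean of aligned images.

    images:    list of 2D lists (rows of pixel values), all the same shape
    variances: one noise variance per image (assumed uniform across the image)
    Returns the co-added image and the variance of each co-added pixel.
    """
    if not images or len(images) != len(variances):
        raise ValueError("need one variance per image")
    weights = [1.0 / v for v in variances]
    weight_sum = sum(weights)
    n_rows, n_cols = len(images[0]), len(images[0][0])
    stacked = [
        [
            sum(w * img[r][c] for w, img in zip(weights, images)) / weight_sum
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]
    # Noisier exposures get lower weight; the combined variance is 1 / sum(w).
    return stacked, 1.0 / weight_sum


if __name__ == "__main__":
    exposure_a = [[1.0, 2.0]]
    exposure_b = [[3.0, 4.0]]
    stacked, variance = coadd([exposure_a, exposure_b], [1.0, 1.0])
    print(stacked, variance)
```

With equal variances this reduces to a plain average, and the noise variance of the stack falls in proportion to the number of exposures.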
A New Approach to Radio Astronomy Hardware
A Brief History of the Universe
[Figure labels: ionized; neutral (H); z~6.2; ionized]
Era of Reionization, “The Gap”
• culmination of structure formation
• first luminous structures
• turning point after the Dark Ages
IIC affords us the opportunity to share resources, tools and know-how
• Shared hardware maximizes effectiveness
• Shared archival data storage, cooperatively
• Reap benefits of sophisticated system administrators and database professionals
• People are quantized, unaffordable for single group
• Learn from each other on technical topics of common interest
• Often large discrepancies across subfields, IIC raises all boats
8K x 8K pixel array
• 16 independent amplifiers
• Each is a 1024 x 2048 subimage