
Data-Intensive Science: Addressing common needs with shared tools

Data-Intensive Science: Addressing common needs with shared tools. Christopher Stubbs Professor Department of Physics Department of Astronomy cstubbs@fas.harvard.edu. Storing, analyzing, and exploiting large data sets. Searching for dark matter and dark energy.


Presentation Transcript


  1. Data-Intensive Science: Addressing common needs with shared tools Christopher Stubbs Professor Department of Physics Department of Astronomy cstubbs@fas.harvard.edu

  2. Storing, analyzing, and exploiting large data sets
     • Searching for dark matter and dark energy
     • Detailed imaging of brain function
     • Searching for new elementary particles

  3. Some common threads
     • Ambitious instruments produce copious data
     • E.g. tens of TB per night from imminent astronomy surveys
     • Loosely coupled computing
     • Don’t need linked analysis that uses all images
     • Diverse applications from common data
     • Simulations are an integral aspect
     • Build apparatus here, run it elsewhere
     • International collaborations
     • Computer science aspects
     • World’s largest non-proprietary databases
     • Clustering, data mining, file system optimization…

  4. 27 km CERN, outside Geneva

  5. Seriously Big Toys. Harvard involvement in ATLAS detector:
     • J. DaCosta and G. Brandenburg at CERN now, in shakedown
     • Built muon chambers here
     • J. Huth plays leadership role in scientific computing for LHC

  6. Event Simulations
     • >30 million event simulations are typical
     • Pick an interaction
     • Propagate through model of the detector
     • Measure detection efficiencies
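The three simulation steps on this slide (pick an interaction, propagate it through a detector model, measure the detection efficiency) can be illustrated with a toy Monte Carlo. This is only a sketch, not the ATLAS simulation chain: the one-dimensional geometry, the `half_width` acceptance, and the `hit_probability` are hypothetical values chosen for illustration.

```python
import random


def simulate_efficiency(n_events, half_width=1.0, hit_probability=0.9, seed=0):
    """Toy Monte Carlo estimate of a detection efficiency.

    All parameters are illustrative assumptions, not ATLAS values.
    Returns (efficiency, binomial standard error).
    """
    rng = random.Random(seed)
    detected = 0
    for _ in range(n_events):
        # Step 1: pick an interaction (here just a random impact position).
        x = rng.uniform(-2.0, 2.0)
        # Step 2: propagate through a toy detector model: the particle is
        # inside the acceptance only if it lands within +/- half_width.
        in_acceptance = abs(x) <= half_width
        # The sensor then fires with a fixed probability.
        if in_acceptance and rng.random() < hit_probability:
            detected += 1
    # Step 3: measure the detection efficiency and its statistical error.
    efficiency = detected / n_events
    error = (efficiency * (1.0 - efficiency) / n_events) ** 0.5
    return efficiency, error


if __name__ == "__main__":
    eff, err = simulate_efficiency(100_000)
    print(f"efficiency = {eff:.4f} +/- {err:.4f}")
```

With these toy settings the analytic answer is 0.5 × 0.9 = 0.45, so the estimate should converge there as `n_events` grows. The statistical error shrinks as one over the square root of `n_events`, which is one reason tens of millions of simulated events are typical.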

  7. On-the-fly event reconstruction
     • Aggregate event statistics
     • Find tracks and trigger/store if interesting
     • Precise track determination

  8. ATLAS computing
     • 5 million lines of code
     • 200 developers, worldwide
     • 200 collision events per second
     • Automated event selection in firmware
     • Selected subset of events to disk
     • These selected events distributed worldwide to a hierarchy of data centers

  9. Sky Surveys in Astronomy
     • Optical: PanSTARRS, 1.4 Gpix, 1.8 m
     • Radio: Mileura Wide-Field Array
     • 1 km array of 8000 custom antennas
     • 128 gigabit/s computing challenge
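To put the 128 gigabit/s figure in perspective, the short calculation below converts a sustained bit rate into a data volume. It assumes decimal units (1 TB = 10^12 bytes) and, purely for illustration, that the rate is sustained for a full 24 hours; the slide does not state a duty cycle.

```python
def daily_volume_terabytes(gigabits_per_second, hours=24.0):
    """Data volume in decimal terabytes for a sustained bit rate.

    The 24-hour default is an illustrative assumption, not a figure
    from the presentation.
    """
    bytes_per_second = gigabits_per_second * 1e9 / 8  # 8 bits per byte
    return bytes_per_second * hours * 3600 / 1e12


if __name__ == "__main__":
    rate = 128  # gigabit/s, from the slide
    print(f"{rate} Gbit/s = {rate / 8:.0f} GB/s")
    print(f"about {daily_volume_terabytes(rate):.1f} TB in 24 hours")
```

Under these assumptions the stream amounts to 16 GB/s, or roughly 1.4 PB per day.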

  10. Our View of the Expanding Universe
      [Figure labels: Close, Far; Recent, Ancient]
      • Expansion causes stretching of light, “redshift”
      • Expansion history can be mapped by measuring both distances and redshifts

  11. Supernovae are powerful cosmological probes
      • Distances to ~6% from brightness
      • Redshifts from features in spectra
      (Image: Hubble Space Telescope, NASA)
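The two measurements on this slide can be sketched with textbook relations: the distance modulus m − M = 5 log10(d / 10 pc) for distances from brightness, and z = λ_observed / λ_rest − 1 for redshifts from spectral features. The function names below are illustrative, and the sketch ignores real-world corrections (dust extinction, K-corrections, light-curve shape) that an actual supernova analysis would need.

```python
import math


def distance_parsecs(apparent_mag, absolute_mag):
    """Distance from the distance modulus m - M = 5 log10(d / 10 pc)."""
    return 10 ** ((apparent_mag - absolute_mag + 5.0) / 5.0)


def fractional_distance_error(mag_error):
    """First-order error propagation: dd/d = (ln 10 / 5) * dm."""
    return math.log(10) / 5.0 * mag_error


def redshift(observed_wavelength, rest_wavelength):
    """Redshift from one spectral feature (same units for both inputs)."""
    return observed_wavelength / rest_wavelength - 1.0


if __name__ == "__main__":
    # A brightness uncertainty of about 0.13 mag maps to about 6% in distance.
    print(f"{fractional_distance_error(0.13):.3f}")
    # A feature with rest wavelength 6000 observed at 7200 gives z = 0.2.
    print(f"{redshift(7200.0, 6000.0):.2f}")
```

Read this way, a ~6% distance precision corresponds to knowing the supernova's brightness to roughly 0.13 magnitudes.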

  12. Schmidt et al, High-z SN Team

  13. Near Earth Asteroids
      • Inventory of solar system is incomplete
      • R=1 km asteroids are dinosaur killers
      • R=300 m asteroids in ocean wipe out a coastline
      • Demanding project: requires mapping the sky down to 24th magnitude every few days, individual exposures not to exceed ~20 sec.
      • PanSTARRS will detect NEAs to ~400 m

  14. Cosmic Cinematography: Challenges
      • The “static” sky: optimal co-adding of images, database issues
      • The transient sky: variability classification, asteroid association and orbits, light curve analysis, fusion with other data sets
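As one illustration of the co-adding challenge, the sketch below stacks exposures with inverse-variance weights. This is a common textbook choice, not necessarily the method used by the survey pipelines discussed here. It assumes the images are already registered to the same pixel grid, share a point-spread function, and each have a single known noise variance; the function name and the list-of-rows image layout are hypothetical.

```python
def coadd(images, variances):
    """Inverse-variance weighted mean of aligned images, pixel by pixel.

    images:    list of 2-D images, each a list of rows of floats
    variances: one noise variance per image (assumed uniform per image)
    Returns (stacked_image, variance_of_the_stack).
    """
    if not images or len(images) != len(variances):
        raise ValueError("need one variance per image")
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    n_rows, n_cols = len(images[0]), len(images[0][0])
    stacked = [
        [
            sum(w * img[r][c] for w, img in zip(weights, images)) / total_weight
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]
    # The variance of a weighted mean with weights 1/variance is 1/sum(weights).
    return stacked, 1.0 / total_weight


if __name__ == "__main__":
    exposures = [[[1.0, 2.0]], [[3.0, 4.0]]]
    stack, var = coadd(exposures, [1.0, 1.0])
    print(stack, var)
```

Noisier exposures contribute less to the stack, and the variance of the result is always lower than that of the best single exposure.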

  15. A New Approach to Radio Astronomy Hardware

  16. A Brief History of the Universe
      [Figure labels: ionized; neutral (H); Era of Reionization; z~6.2; ionized]
      “The Gap”
      • culmination of structure formation
      • first luminous structures
      • turning point after the Dark Ages

  17. BOOLARDY

  18. Lincoln Greenhill (CfA): MWA project

  19. IIC affords us the opportunity to share resources, tools and know-how
      • Shared hardware maximizes effectiveness
      • Shared archival data storage, cooperatively
      • Reap benefits of sophisticated system administrators and database professionals
      • People are quantized, unaffordable for single group
      • Learn from each other on technical topics of common interest
      • Often large discrepancies across subfields, IIC raises all boats.

  20. 8K x 8K pixel array
      • 16 independent amplifiers
      • Each is a 1024 x 2048 subimage
