
From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories

R. Chris Smith, NOAO/CTIO, LSST


Presentation Transcript


  1. From Photons to Petabytes: Astronomy in the Era of Large Scale Surveys and Virtual Observatories
     R. Chris Smith, NOAO/CTIO, LSST
     eScience May 2007

  2. Challenges for the Operational VO
     • Providing Content
       • capturing and archiving data from diverse instruments, AND capturing metadata (system & science) to make that data useful
     • Providing Access
       • implementing the VO standards and services, plus network infrastructure, needed for wide access to the content
       • ensuring not only access, but also long-term support and documentation of datasets & metadata (curation)
     • Providing User Interfaces and Tools
       • developing and operating user interfaces which enable effective scientific use of ALL of the distributed resources of the VO

  3. A Case Study: NOAO Data Management
     • Management of data from all NOAO and some affiliated facilities = CONTENT
       • 3 mountaintops (Cerro Tololo, Cerro Pachon, Kitt Peak)
       • 11 telescopes
       • More than 30 instruments
     • Virtual Observatory “back end” = ACCESS
       • Provide effective access to large volumes (TBs to PBs) of archived ground-based optical & infrared data and data products through VO standard interfaces and networks
     • Virtual Observatory “front end” = UI and TOOLS
       • Enable science by developing VO user interfaces, tools, and services to work with distributed data sources and large volumes of data

  4. Data Management Serving VO Content UI & Tools

  5. BIG Question:
     • How does this model SCALE?
       • Capturing, moving, & processing the data
       • Making the data AVAILABLE through VO interfaces
       • Making the data USEFUL for scientific analysis
     • Why do we worry about scaling?

  6. Turning Photons into Petabytes
     • Today
       • MOSAIC, WFI, IMACS:
         • 64 Mpix cameras
         • ~10 to 20 GB/night
     • Builds up quickly!
       • in only 3 years of two MOSAIC cameras:
         • ~20 TB raw data
         • ~40-60 TB processed
     [Image: IMACS image, Las Campanas Observatory (Danny Steeghs, Jan '04)]
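The "builds up quickly" claim can be sanity-checked with simple arithmetic. The sketch below is illustrative only: it assumes decimal units (1 TB = 1000 GB) and, as an upper bound, that both cameras record data on every night of the year. The function name `accumulated_tb` is hypothetical and not part of any NOAO software.

```python
GB_PER_TB = 1000  # assumption: decimal units


def accumulated_tb(gb_per_night, cameras, years, nights_per_year=365):
    """Upper-bound raw volume in TB, assuming every camera observes every night."""
    return gb_per_night * cameras * years * nights_per_year / GB_PER_TB


# Nightly range (~10 to 20 GB) and "3 years of two MOSAIC cameras" are from the slide.
low = accumulated_tb(10, cameras=2, years=3)
high = accumulated_tb(20, cameras=2, years=3)
print(f"Upper-bound estimate: {low:.1f} to {high:.1f} TB of raw data")
```

Under these assumptions the estimate lands at or above the slide's ~20 TB of raw data, which is what one would expect if not every night yields a full data set.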

  7. Coming Soon: Dark Energy Camera
     • Focal Plane:
       • 64 2K x 4K detectors
       • Plus guiding and WFS
     • 530 Mpix camera

  8. The Data: Dark Energy Survey
     • Each image = 1 GB
     • 350 GB of raw data / night
     • Data must be moved to supercomputer center (NCSA) before next night begins (<24 hours)
       • Need >36 Mbps internationally
     • Data must be processed within ~24 hours
       • Need to inform next night’s observing
     • Total raw data after 5 yrs ~0.2 PB
     • TOTAL Dataset 1 to 5 PB
       • Reprocessing planned using TeraGrid resources
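The >36 Mbps requirement follows from the nightly volume and the 24-hour window. A minimal sketch of that arithmetic, assuming decimal units (1 GB = 10^9 bytes) and ignoring protocol overhead; the function names are illustrative and not taken from any DES software:

```python
def required_mbps(gigabytes: float, hours: float) -> float:
    """Sustained rate (megabits/s) needed to move `gigabytes` within `hours`."""
    bits = gigabytes * 1e9 * 8      # assumption: 1 GB = 10^9 bytes
    return bits / (hours * 3600) / 1e6


def transfer_hours(gigabytes: float, mbps: float) -> float:
    """Time in hours to move `gigabytes` at a sustained `mbps`."""
    bits = gigabytes * 1e9 * 8
    return bits / (mbps * 1e6) / 3600


nightly_gb = 350                    # raw data per night, from the slide
print(f"Minimum rate for a 24 h window: {required_mbps(nightly_gb, 24):.1f} Mbps")
print(f"Transfer time at 36 Mbps: {transfer_hours(nightly_gb, 36):.1f} h")
```

Under these assumptions the bare minimum is about 32 Mbps, and a sustained 36 Mbps moves one night of data in under 22 hours, which leaves a small margin before the next night begins.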

  9. LSST: The Large Synoptic Survey Telescope
     Survey the entire sky every 3 to 5 nights, to simultaneously detect and study:
     • Dark Matter via weak gravitational lensing
     • Dark Energy via thousands of SNe per year
     • Potentially hazardous near-Earth asteroids
     • Tracers of the formation of the solar system
     • Fireworks in the heavens: GRBs, quasars…
     • Periodic and transient phenomena
     • …the unknown
     Massively PARALLEL Astronomy

  10. LSST: The Instrument
      • 8.2m telescope
      • Optimized for WIDE field of view
        • 3.5 degree FOV
        • 3.5 GIGApixel camera
      • Deep images in 15s
      • Able to scan whole sky every 3 to 5 nights

  11. LSST: Deep, Wide, Fast
      [Figure: field of view (FOV) comparison. Keck Telescope (10 m): 0.2 degrees; LSST: 3.5 degrees]

  12. LSST Site: Cerro Pachon, Chile
      [Figure: LSST site plan. Labels: LSST, ~1.5m cal telescope, Support, El Penon, Gemini (South), SOAR]

  13. LSST: Distributed Data Mgmt
      • Mountain Site: data acquisition, temp. storage
      • Base Facility: real time processing
      • Long-Haul Communications: data transport & distribution
      • Archive/Data Access Centers: data processing, long term storage, & public access

  14. LSST: The Data Flow
      • Each image roughly 6.5 GB
      • Cadence: ~1 image every 15s
      • 15 to 18 TB per night
      • ALL must be transferred to U.S. “data center”
        • Mtn-base within image timescale (15s), ~10-20 Gbps
        • Internationally within <24 hours, >2-10 Gbps
      • REAL TIME reduction, analysis, & alerts
        • Send out alerts of transient sources within minutes
        • Provide automatic data quality evaluation, alert to problems
      • Processed data grows to >100 TB per night!
      • Just catalogs = Petabytes per year!
      All Data Public. All Alerts Public.
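The image size, cadence, and nightly volume on this slide can be cross-checked against the quoted link speeds. The following is a rough sketch, assuming decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes) and no protocol overhead; all names are illustrative rather than part of any LSST software.

```python
GB, TB = 1e9, 1e12          # assumption: decimal units


def gbps(n_bytes: float, seconds: float) -> float:
    """Sustained rate (gigabits/s) needed to move n_bytes in the given time."""
    return n_bytes * 8 / seconds / 1e9


image_bytes = 6.5 * GB      # per image, from the slide
cadence_s = 15              # one image every 15 s, from the slide
nightly_tb = (15, 18)       # nightly volume range, from the slide

# Mountain to base: each image must leave before the next one arrives.
mtn_to_base = gbps(image_bytes, cadence_s)

# International link: a full night of data within 24 hours.
international = [gbps(tb * TB, 24 * 3600) for tb in nightly_tb]

# Consistency check: hours of observing implied by the nightly volume.
observing_hours = [tb * TB / image_bytes * cadence_s / 3600 for tb in nightly_tb]

print(f"Mountain to base: {mtn_to_base:.2f} Gbps minimum")
print(f"International: {international[0]:.2f} to {international[1]:.2f} Gbps minimum")
print(f"Implied observing time: {observing_hours[0]:.1f} to {observing_hours[1]:.1f} h")
```

Under these assumptions the minimum rates come out at roughly 3.5 Gbps for the mountain-to-base link and 1.4 to 1.7 Gbps internationally, so the quoted ~10-20 Gbps and >2-10 Gbps include headroom above the bare minimum. The nightly volume also corresponds to roughly 10 to 11.5 hours of continuous 15-second exposures, consistent with a full night of observing.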

  15. LSST Needs
      [Diagram labels: Base, Archive Center, Data Access Center]

  16. Turning Photons into Petabytes: Summary
      • Today, ~10 to 20 GB/night
        • MOSAIC, WFI, IMACS: 64 Mpix cameras
      • Soon, ~300 to 500 GB/night
        • VISTA: 67 Mpix camera
        • VST: 256 Mpix camera
        • DECam/DES: 520 Mpix camera
      • On the horizon, ~15 TB/night
        • LSST Project: 3 Gpix camera
      And these are just survey instruments in Chile!
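To put the three generations side by side, the nightly rates from this summary can be expressed as growth factors. A small sketch, assuming decimal units (15 TB = 15,000 GB); the dictionary keys and the `growth` helper are illustrative names only.

```python
# Nightly raw data rates in GB/night as (low, high), taken from the slide.
rates_gb = {
    "today": (10, 20),
    "soon": (300, 500),
    "horizon": (15_000, 15_000),   # ~15 TB/night, assuming 1 TB = 1000 GB
}


def growth(later, earlier):
    """Smallest and largest ratio between two (low, high) nightly-rate ranges."""
    return later[0] / earlier[1], later[1] / earlier[0]


soon_vs_today = growth(rates_gb["soon"], rates_gb["today"])
horizon_vs_today = growth(rates_gb["horizon"], rates_gb["today"])

print(f"Soon vs. today: {soon_vs_today[0]:.0f}x to {soon_vs_today[1]:.0f}x")
print(f"Horizon vs. today: {horizon_vs_today[0]:.0f}x to {horizon_vs_today[1]:.0f}x")
```

With these inputs, the next generation of cameras produces roughly 15 to 50 times more data per night than today's, and the LSST-scale rate is roughly 750 to 1500 times today's.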

  17. DES, LSST, … the REST of the Science?
      • Ongoing (MOSAIC, WFI, IMACS) and future (DES, LSST, etc.) projects will provide PETABYTES of archived data
      • Only a small fraction of the science potential will be realized by the planned investigations
      • How do we maximize the investment in these datasets and provide for their future scientific use?

  18. VO Challenges: Provider Perspective
      • How do we effectively capture, transport, and manage Petabytes of data?
        • Need advanced IT infrastructure
      • How do we provide effective access to Petabytes of data?
        • Need advanced data mining interfaces
      • Fundamentally IT challenges, in support of the astronomical community

  19. VO Challenges: Scientific Perspective
      • Data Discovery
        • From those Petabytes, what data exist that might help address my scientific query?
      • Data Understanding
        • Which data are best suited for my analysis?
      • Data Movement
        • How do I get the data from where they are to where they are most useful?
      • Data Analysis
        • How do I extract the information I need from the data?

  20. NVO portal @ NOAO
      • Focus on Scientific USER
        • 4 Keys: Data Discovery, Data Understanding, Data Access, Data Analysis
      • First focus on supporting data DISCOVERY
        • Discovery in spatial coordinates: NOAO Sky
        • Discovery in temporal coordinates: Timeline
      • NOAO NVO portals:
        • http://nvo.noao.edu
        • And for South America…
          • http://nvo.ctio.noao.edu
          • Foundation for exploring partnerships with S.A. communities

  21. Summary: VO Challenges
      • In Infrastructure
        • Collect and maintain petabytes of content
        • Provide for effective access, including networks, hardware, and software
      • In User Interaction
        • Provide effective user interfaces
        • Support distributed analysis
          • Support large queries across distributed DBs
          • Support statistical analysis and processing across distributed resources (Grid processing & storage)
      • TOOLS & SERVICES to enable SCIENCE

  22. How? Strategic Partnerships
      • In Local Systems
        • Vendors: Local Storage, Processing, Servers
      • In Remote Systems
        • Distributed computer centers to provide bulk storage, large scale processing
        • Linked together for Grid processing, Grid storage
      • In Connectivity
        • High-speed national and international bandwidth
      • Scientific
        • VO Partners to develop standards, provide tools (IVOA)
        • Developing tools and services optimized for scientific analysis over large datasets (e.g., statistical methods)
