
Computing Facilities & Capabilities



  1. Computing Facilities & Capabilities
  Julian Borrill, Computational Research Division, Berkeley Lab & Space Sciences Laboratory, UC Berkeley

  2. Computing Issues
  • Data Volume
  • Data Processing
  • Data Storage
  • Data Security
  • Data Transfer
  • Data Format/Layout
  It's all about the data.

  3. Data Volume
  • Planck data volume drives (almost) everything
  • LFI:
    • 22 detectors with 32.5, 45 & 76.8 Hz sampling
    • 4 x 10^10 samples per year
    • 0.2 TB time-ordered data + 1.0 TB full detector pointing data
  • HFI:
    • 52 detectors with 200 Hz sampling
    • 3 x 10^11 samples per year
    • 1.3 TB time-ordered data + 0.2 TB full boresight pointing data
  • LevelS (e.g. CTP "Trieste" simulations):
    • 4 LFI detectors with 32.5 Hz sampling
    • 4 x 10^9 samples per year
    • 2 scans x 2 beams x 2 samplings x 7 components + 2 noises
    • 1.0 TB time-ordered data + 0.2 TB full detector pointing data
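
The sample counts and time-ordered-data sizes above follow from simple arithmetic. The sketch below is a rough check only; it assumes 4-byte samples, a 4/6/12 split of the 22 LFI detectors across the three quoted sampling rates, and 3.156 x 10^7 seconds per year — none of which are stated on the slide.

```python
# Back-of-the-envelope check of the slide's data-volume numbers.
# Assumptions (not on the slide): 4-byte samples, LFI detector split of
# 4/6/12 across the 32.5/45/76.8 Hz rates, one year = 3.156e7 seconds.

SECONDS_PER_YEAR = 3.156e7
BYTES_PER_SAMPLE = 4

def samples_per_year(detector_rates):
    """Total samples per year for a list of (n_detectors, rate_hz) pairs."""
    return sum(n * rate for n, rate in detector_rates) * SECONDS_PER_YEAR

lfi = samples_per_year([(4, 32.5), (6, 45.0), (12, 76.8)])   # ~4e10 samples
hfi = samples_per_year([(52, 200.0)])                        # ~3e11 samples

print(f"LFI: {lfi:.1e} samples/yr, {lfi * BYTES_PER_SAMPLE / 1e12:.2f} TB TOD")
print(f"HFI: {hfi:.1e} samples/yr, {hfi * BYTES_PER_SAMPLE / 1e12:.2f} TB TOD")
```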

  4. Data Processing
  • Operation count scales linearly (& inefficiently) with
    • # analyses, # realizations, # iterations, # samples
    • 100 x 100 x 100 x 100 x 10^11 ~ O(10) Eflop (cf. '05 Day in the Life)
  • NERSC
    • Seaborg: 6080 CPU, 9 Tf/s
    • Jacquard: 712 CPU, 3 Tf/s (cf. Magique-II)
    • Bassi: 888 CPU, 7 Tf/s
    • NERSC-5: O(100) Tf/s, first-byte in 2007
    • NERSC-6: O(500) Tf/s, first-byte in 2010
    • Expect allocation of O(2 x 10^6) CPU-hours/year => O(4) Eflop/yr (10 GHz CPUs @ 5% efficiency)
  • USPDC cluster
    • Specification & location TBD, first-byte in 2007/8
    • O(100) CPU x 80% x 9000 hours/year => O(0.4) Eflop/yr (5 GHz CPUs @ 3% efficiency)
  • IPAC small cluster dedicated to ERCSC
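
The Eflop estimates can be reproduced by multiplying out the slide's own numbers. The sketch below assumes one flop per cycle at peak; everything else (clock rates, efficiencies, allocations) is taken directly from the slide.

```python
# Sanity check of the slide's operation-count and throughput estimates.

EXA = 1e18

# Total work: analyses x realizations x iterations x samples x flop-per-sample pass
total_flop = 100 * 100 * 100 * 100 * 1e11        # ~1e19 flop ~ O(10) Eflop
print(f"Total work: {total_flop / EXA:.0f} Eflop")

def eflop_per_year(cpu_hours, clock_hz, efficiency):
    """Sustained Eflop/yr from an allocation, assuming 1 flop/cycle at peak."""
    return cpu_hours * 3600 * clock_hz * efficiency / EXA

nersc = eflop_per_year(2e6, 10e9, 0.05)                  # => O(4) Eflop/yr
uspdc = eflop_per_year(100 * 0.80 * 9000, 5e9, 0.03)     # => O(0.4) Eflop/yr
print(f"NERSC allocation: {nersc:.1f} Eflop/yr, USPDC cluster: {uspdc:.2f} Eflop/yr")
```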

  5. Processing
  • NERSC Seaborg: 9 Tf/s
  • NERSC Jacquard: 3 Tf/s
  • NERSC Bassi: 7 Tf/s
  • ERCSC Cluster: 0.1 Tf/s
  • USPDC Cluster: 0.5 Tf/s
  • NERSC-5 (2007): 100 Tf/s
  • NERSC-6 (2010): 500 Tf/s

  6. Data Storage
  • Archive at IPAC
    • mission data
    • O(10) TB
  • Long-term at NERSC using HPSS
    • mission + simulation data & derivatives
    • O(2) PB
  • Spinning disk at USPDC cluster & at NERSC using NGF
    • current active data subset
    • O(2 - 20) TB
  • Processor memory at USPDC cluster & at NERSC
    • running job(s)
    • O(1 - 10+) GB/CPU & O(0.1 - 10) TB total

  7. Processing + Storage
  • Compute (performance / memory):
    • NERSC Seaborg: 9 Tf/s / 6 TB
    • NERSC Jacquard: 3 Tf/s / 2 TB
    • NERSC Bassi: 7 Tf/s / 4 TB
    • ERCSC Cluster: 0.1 Tf/s / 50 GB
    • USPDC Cluster: 0.5 Tf/s / 200 GB
    • NERSC-5 (2007): 100 Tf/s / 50 TB
    • NERSC-6 (2010): 500 Tf/s / 250 TB
  • Storage:
    • NERSC HPSS: 2/20 PB
    • IPAC Archive: 10 TB
    • NERSC NGF: 20/200 TB
    • USPDC Cluster disk: 2 TB

  8. Data Security
  • UNIX filegroups
    • special account: user planck
    • permissions _r__/___/___
  • Personal keyfob to access planck account
    • real-time grid-certification of individuals
    • keyfobs issued & managed by IPAC
    • single system for IPAC, NERSC & USPDC cluster
  • Allows securing of selected data
    • e.g. mission vs simulation
  • Differentiates access to facilities and to data
    • standard personal account & special planck account
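
A minimal sketch of the filegroup idea described above: data files are owned by a shared planck account and made group-readable only, so access follows membership of the planck filegroup rather than per-user copies. The path and group name below are purely illustrative, not the project's actual layout.

```python
# Illustrative only: group-read-only protection via a shared "planck" filegroup.
# The file path and group name are hypothetical examples.

import grp, os, stat

DATA_FILE = "/project/planck/mission/tod_example.fits"  # hypothetical path

planck_gid = grp.getgrnam("planck").gr_gid              # shared filegroup
os.chown(DATA_FILE, -1, planck_gid)                     # keep owner, set group
os.chmod(DATA_FILE, stat.S_IRUSR | stat.S_IRGRP)        # r--r-----: no world access
```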

  9. Processing + Storage + Security
  (Diagram: the same systems and capacities as slide 7 — with Seaborg shown here at 9 Tf/s / 7 TB — overlaid with a "PLANCK KEYFOB REQUIRED" banner.)

  10. Data Transfer
  • From DPCs to IPAC
    • transatlantic tests being planned
  • From IPAC to NERSC
    • 10 Gb/s over Pacific Wave, CENIC + ESNet
    • tests planned this summer
  • From NGF to/from HPSS
    • 1 Gb/s being upgraded to 10+ Gb/s
  • From NGF to memory (most real-time critical)
    • within NERSC: 8-64 Gb/s depending on system (& support for this)
    • offsite depends on location
      • 10 Gb/s to LBL over dedicated data link on Bay Area MAN
    • fallback exists: stage data on local scratch space
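
To put the quoted link speeds in context, the sketch below estimates wall-clock transfer times for the data volumes from slides 3 and 6. It assumes the full nominal bandwidth is achieved with no protocol overhead, which is optimistic; the example volumes are chosen for illustration.

```python
# Rough transfer-time estimates for the link speeds quoted on the slide.
# Assumes full nominal bandwidth and no protocol overhead (optimistic).

def transfer_hours(data_tb, link_gbps):
    """Hours to move data_tb terabytes over a link_gbps gigabit/s link."""
    bits = data_tb * 1e12 * 8
    return bits / (link_gbps * 1e9) / 3600

# e.g. ~2.5 TB/yr of LFI+HFI data from IPAC to NERSC over 10 Gb/s,
# and a 20 TB active data subset read from NGF at 8 Gb/s.
print(f"2.5 TB over 10 Gb/s: {transfer_hours(2.5, 10):.2f} h")
print(f"20 TB over 8 Gb/s:   {transfer_hours(20, 8):.1f} h")
```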

  11. Processing + Storage + Security + Networks
  (Diagram: the keyfob-protected systems of slide 9 with network links overlaid — an 8 Gb/s HPSS link, several 10 Gb/s links including one from the DPCs marked "?", a 64 Gb/s path from NGF into NERSC compute memory, and a number of links still labelled "?".)

  12. Project Columbia Update
  • Last year we advertised our proposed use of NASA's new Project Columbia (5 x 2048 CPU, 5 x 12 Tf/s), potentially including a WAN-NGF.
  • We were successful in pushing for Ames' connection to the Bay Area MAN, providing a 10 Gb/s dedicated data connect.
  • We were unsuccessful in making much use of Columbia:
    • disk read performance varies from poor to atrocious, effectively disabling data analysis (although simulation is possible).
    • foreign nationals are not welcome, even if they have passed JPL security screening!
  • We have provided feedback to Ames and HQ, but for now we are not pursuing this resource.

  13. Data Formats
  • Once data are on disk they must be read by codes that do not know (or want to know) their format/layout:
    • to analyze LFI, HFI, LevelS, WMAP, etc. data sets
      • both individually and collectively
    • to be able to operate on data while they are being read
      • e.g. weighted co-addition of simulation components
  • M3 provides a data abstraction layer to make this possible
  • Investment in M3 has paid huge dividends this year:
    • rapid (10 min) ingestion of new data formats, such as PIOLIB evolution and WMAP
    • rapid (1 month) development of interface to any compressed pointing, allowing on-the-fly interpolation & translation
    • immediate inheritance of improvements (new capabilities & optimization/tuning) by the growing number of M3-based codes
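
For readers unfamiliar with the data-abstraction idea, the sketch below shows the general pattern in miniature: format-specific readers hide their on-disk layout behind a common read call, and derived readers can transform data on the fly as they are read (e.g. weighted co-addition). This is not the actual M3 interface — all class and method names here are hypothetical Python stand-ins.

```python
# Minimal illustration of a time-ordered-data abstraction layer.
# NOT the real M3 API; names and signatures are invented for this sketch.

from abc import ABC, abstractmethod
import numpy as np

class TODReader(ABC):
    """Common interface: callers never see the on-disk format or layout."""
    @abstractmethod
    def read_tod(self, detector: str, first: int, last: int) -> np.ndarray: ...

class FITSReader(TODReader):
    """Format-specific reader, e.g. for WMAP-style FITS files (hypothetical)."""
    def __init__(self, path): self.path = path
    def read_tod(self, detector, first, last):
        raise NotImplementedError("format-specific I/O would go here")

class CoaddedSimReader(TODReader):
    """Weighted co-addition of simulation components, applied during the read."""
    def __init__(self, components, weights):
        self.components, self.weights = components, weights
    def read_tod(self, detector, first, last):
        return sum(w * c.read_tod(detector, first, last)
                   for c, w in zip(self.components, self.weights))
```

The point of the pattern is the one on the slide: analysis codes written against the common interface inherit every new format and every optimization of the layer for free.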
