NOAA R&D High Performance Computing Colin Morgan, CISSP High Performance Technologies Inc (HPTI) National Oceanic and Atmospheric Administration Geophysical Fluid Dynamics Laboratory, Princeton, NJ
R&D HPCS Background Information – Scientific • Large-scale heterogeneous supercomputing architecture • Provides cutting-edge technology for weather and climate model developers • Models are developed for weather forecasts, storm warnings, and climate change forecasts • 3 R&D HPCS locations • Princeton, NJ • Gaithersburg, MD • Boulder, CO
R&D HPCS Background Information – Supercomputing • Princeton, NJ – GFDL • SGI Altix 4700 cluster • 8,000 cores • 18 PB of data • Gaithersburg, MD • IBM Power6 cluster • ~1,200 Power6 cores • 3 PB of data • Boulder, CO – ESRL • 2 Linux clusters • ~4,000 Xeon Harpertown/Woodcrest cores • ~1 PB of data • Remote computing allocated hours • Oak Ridge National Laboratory – 104 million hours • Argonne National Laboratory – 150 million hours • NERSC – 10 million hours
R&D HPCS Information – Data Requirements • Current data requirements • GFDL current data capacity – 32 PB • GFDL current data total – 18 PB • GFDL – growth of 1 PB every 2 months • Remote compute – 6–8 TB per day of data ingest • Future data requirements • 30–50 TB per day from remote computing • 150–200 PB of total data within the next 3 years How does that much data get transferred?
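For scale, a back-of-envelope conversion (assuming 1 TB = 10^12 bytes and a sustained 24-hour transfer window; these figures are derived here, not taken from the slides) shows what these ingest rates mean in network terms: 8 TB/day × 8 bits/byte ÷ 86,400 s ≈ 0.74 Gb/s sustained, while the projected 50 TB/day works out to ≈ 4.6 Gb/s sustained, which only a 10G-class path can carry with headroom.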
R&D HPCS Information – Current Data Transfer Methods – BBCP • BBCP – transfer rates drop while a file is being closed out • 400–500 Mb/s is the typical transfer rate, limited by disk I/O rather than the network
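As a minimal sketch, a multi-stream bbcp invocation along these lines is typical (the host and paths are hypothetical examples, and the stream count and window size would be tuned per site):

    bbcp -s 8 -w 4M -P 5 output.nc user@remote.example.gov:/archive/incoming/

Here -s 8 opens 8 parallel TCP streams, -w 4M sets the window size, and -P 5 prints a progress report every 5 seconds; multiple streams help mask per-file overhead such as the close-out stall noted above.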
R&D HPCS Information – Future Data Transfer Methods – GridFTP • Require 6–8 TB/day of inbound data ingest from ORNL • ANL and NERSC do not have the same data ingest requirements [Slide diagram: ESNET paths from Oak Ridge, Argonne, and NERSC enter a 1G perimeter switch and perimeter firewall, cross the R&D core switch/firewall, and reach the LAN switch over 10G/20G links. GridFTP servers ntt1 and ntt2 (receive hosts) and ntt3 (IC0/IC9 pull host) sit on 10G VLANs backed by a 100 TB disk cache; the write interface goes to a NetApp, and nodes IC0/IC9 on the SGI cluster pull data over a private VLAN. ntt1–ntt4 run RHEL5, IC0/IC9 run OpenSuse 10.1, the switches are Cisco 6500 series, and the firewall is a Cisco FWSM.]
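As a hedged sketch of how a GridFTP pull might look from one of the receive hosts, using globus-url-copy from the Globus Toolkit (both endpoints and paths are hypothetical, not the actual ORNL or ntt host URLs):

    globus-url-copy -p 8 -tcp-bs 4194304 gsiftp://dtn.ornl.example.gov/data/run042.tar gsiftp://ntt1.example.noaa.gov/cache/incoming/run042.tar

-p 8 requests 8 parallel data streams and -tcp-bs sets a 4 MB TCP buffer; parallel streams are the main reason GridFTP is expected to outperform single-stream tools on high-latency 10G paths.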
R&D HPCS Information – Fiber Networking What are we doing now? What do we plan to do? [Slide map: national and regional research networks – FRGP, ESNET, Internet2, NLR, Bison, MAX, 3ROX, MAGPI, SDN, NyserNet]
R&D HPCS Information – Current Connectivity • Sites: Boulder, CO; Gaithersburg, MD; Princeton, NJ • Commodity Internet – 45 Mb/s • Internet2 – 1 Gb/s • ESNET – 10 Gb/s
R&D HPCS Information – Potential Future Networks • Tooth Fairy $$ • Working on preliminary designs • Design review scheduled for early May • Deployment in Q2 of FY10 • Looking to talk with • ESNET • Internet2 • National LambdaRail • Indiana University Global NOC • Interested GigaPoPs • The primary focus is to provide a high-speed network to NOAA's research facilities
R&D HPCS Information QUESTIONS?