Challenges in Analyzing a High-Resolution Climate Simulation John M. Dennis, Matthew Woitaszek dennis@ucar.edu, mattheww@ucar.edu October 26, 2010 GG in Data-Intensive Discovery
Climate Modeling? • Coupled models: Atmosphere-Ocean-Sea Ice-Land • Challenges of climate: • Need many years --> • Limits resolution --> • Limits parallelism • Community Earth System Model (CESM), formerly the Community Climate System Model (CCSM)
IPCC & AR5 • Intergovernmental Panel on Climate Change (IPCC) “The IPCC assesses the scientific, technical and socio-economical information relevant for the understanding of the risk of human-induced climate change” • The Fifth Assessment Report (AR5) • CESM plans • Control: 1 run x 1300 years • Historical: 40 runs x 55 years • Future: 180 runs x 95 years 20,600 years of simulation GG in Data-Intensive Discovery
Would you like to Supersize that?
Absolutely! But how will it clog up my networks and fill up my archive?
PetaApps project • PetaApps project: Interactive Ensemble • Kinter, Stan (COLA) • Kirtman (U of Miami) • Collins, Yelick (Berkeley) • Bryan, Dennis, Loft, Vertenstein (NCAR) • Bitz (U of Washington) • Ultra-high resolution climate • Explore impact of weather noise on climate • Explore technical/computer science issues • ~99,000 core Cray XT5 system at NICS [Kraken] • Large TeraGrid (TG) allocation: 35M CPU hours
Weddell Sea: ice thickness (1°)
Impact of Supersizing [figure annotations: 16x, 100x]
Outline • Climate Modeling • Production Data challenges: AR5 & IPCC • PetaApps simulation • Generating the Data • Analyzing the Data • Conclusions
Large-scale PetaApps run • 155 year control run • 0.1° Ocean model [3600 x 2400 x 42] (100x current production) • 0.1° Sea-ice model [3600 x 2400 x 20] • 0.5° Atmosphere [576 x 384 x 26] (4x current production) • 0.5° Land [576 x 384] • Statistics: • ~18M CPU hours • 5,844 cores for 4-5 months • ~100 TB of data generated • 0.5 to 1 TB generated per wall-clock day
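As a rough illustration of where the output volume comes from, a minimal sketch; the grid dimensions are from the slide, while single-precision output and the per-day field count are assumptions:

```python
# Rough size of one 3D ocean field snapshot on the 0.1 degree grid.
nx, ny, nz = 3600, 2400, 42        # ocean grid dimensions from the slide
bytes_per_value = 4                # assumes single-precision output

field_bytes = nx * ny * nz * bytes_per_value
print(f"One 3D ocean field: {field_bytes / 2**30:.2f} GiB")   # ~1.35 GiB

# Multiplying by the number of fields written and the output frequency
# across all components is what adds up to ~0.5-1 TB per wall-clock day.
```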
Large-scale PetaApps run (cont'd) • Workflow: • Run on Kraken (NICS) • Transfer output from NICS to NCAR (100-180 MB/sec sustained) • Caused noticeable spikes in TeraGrid network traffic • Archive on HPSS • Data analysis using 55 TB project space at NCAR
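For context, a minimal sketch of what the sustained rate implies for moving a day's worth of output; the 0.5-1 TB/day and 100-180 MB/s figures are from the slides, the rest is arithmetic:

```python
# Time to move one wall-clock day of output at the observed sustained rates.
daily_output_tb = (0.5, 1.0)       # TB generated per wall-clock day (from slide)
rates_mb_s = (100, 180)            # sustained NICS -> NCAR rates (from slide)

for tb in daily_output_tb:
    for rate in rates_mb_s:
        hours = tb * 1e6 / rate / 3600   # 1 TB = 1e6 MB (decimal units)
        print(f"{tb:.1f} TB at {rate} MB/s -> {hours:.1f} hours")
```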
Issues/challenges with runs • Very large variability in I/O performance • 2-10x slowdown common • 300x slowdown was observed • Interference with other jobs?
Outline • Climate Modeling • Production Data challenges: AR5 & IPCC • PetaApps simulation • Generating the Data • Analyzing the Data • Conclusions
Issues with data analysis • Mismatch between the structure of model output and the form of data needed for analysis • Model-generated files: timestep n: U, V, T, PS, CLDLOW -> file.n.nc; timestep m = n + step: U, V, T, PS, CLDLOW -> file.m.nc • Analysis: average T (temperature) over the last 40 years • Subsetting and averaging operations necessary (sketch below)
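A minimal sketch of that subsetting-and-averaging step, assuming monthly per-timestep history files and a variable named T; the file-name pattern, the 40 x 12 monthly layout, and the use of the netCDF4 Python library are illustrative assumptions, not part of the original workflow:

```python
import glob
from netCDF4 import Dataset

# Hypothetical layout: one file per timestep, each holding all ~300 variables.
files = sorted(glob.glob("case.cam2.h0.*.nc"))   # assumed naming pattern
last_40_years = files[-40 * 12:]                 # last 40 years of monthly output

# Accumulate the time mean of T without holding every timestep in memory.
total, count = None, 0
for path in last_40_years:
    with Dataset(path) as nc:
        t = nc.variables["T"][:]                 # subset: read only T
        total = t if total is None else total + t
        count += 1

mean_T = total / count
print("Time-mean T shape:", mean_T.shape)
```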
Issues with data analysis (cont'd) • Slightly different form of input for each analysis/tool • Diversity of analysis tools: • NCL (NCAR Command Language) • NCO (netCDF Operators) • Ferret • IDL • MATLAB • Cascade of intermediate products: “Beware the data cascade, forever your disk will it fill.” -Yoda-
Impact of self-describing format • All variables [~300 variables] for a single timestep • + Matches raw output from model • - Does not match needs of analysis tools • One variable for a single timestep • + Matches analysis tool needs • + Conceptually simple • - Replication of self-describing metadata • - Larger number of files… [~75M files] • One variable, multiple timesteps (conversion sketch below) • + Matches analysis tool needs • - Loss of generality
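A minimal sketch of producing the third layout (one variable, multiple timesteps) from per-timestep files; xarray, the file names, and the variable list are illustrative assumptions rather than the tools used in this project:

```python
import glob
import xarray as xr

# Hypothetical per-timestep history files, each holding all variables.
files = sorted(glob.glob("case.cam2.h0.*.nc"))   # assumed naming pattern
combined = xr.concat([xr.open_dataset(f) for f in files], dim="time")

# One time-series file per variable, so a tool can read a single variable
# across all timesteps without touching the other ~300.
for name in ["T", "U", "V", "PS", "CLDLOW"]:     # assumed subset of variables
    combined[name].to_netcdf(f"case.{name}.timeseries.nc")
```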
Parallelizing Analysis • Parallelizing analysis of atmospheric data • Goal: • Speed up analysis of high-resolution data • Analysis no slower than at low resolution • AMWG diagnostic package • NCO operators [calculating statistics] • NCL plotting • Parallelization using Swift [T. Sines]
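Swift drives the actual parallelization here; as a rough analogy of the same idea (independent per-variable NCO tasks run concurrently), a minimal Python sketch in which the file pattern, variable list, and the specific ncra invocation are assumptions:

```python
import glob
import subprocess
from concurrent.futures import ProcessPoolExecutor

inputs = sorted(glob.glob("case.monthly.*.nc"))  # assumed input file pattern

# Each diagnostic is independent: average one variable over all input files.
# The ncra call is illustrative; the actual AMWG scripts are more involved.
def climatology(var):
    out = f"climo.{var}.nc"
    subprocess.run(["ncra", "-v", var, "-O", *inputs, out], check=True)
    return out

variables = ["T", "U", "V", "PS", "CLDLOW"]      # assumed subset of variables

if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=8) as pool:
        for result in pool.map(climatology, variables):
            print("finished", result)
```

Swift provides the same task-level concurrency plus dependency tracking and a variety of job-submission back-ends, which a plain process pool does not.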
AMWG Diagnostics • Climate Model (Community Earth System Model) • Improved computing = increasing amounts of data • Data generation: multi-threaded • Data analysis: single-threaded
Swift • Workflow system from U of Chicago (http://www.ci.uchicago.edu/swift/index.php) • Allows parallel execution of scripts • Relatively easy to use • Dependency-driven • Variety of job submission options
Issues with Swift • Full generality generates “excessive” copying of data • Climate workflow has a small number of flops relative to input/output data size: • Copy-in data: 15 sec • Calculate: 60 sec • Copy-out data: 15 sec • Used “unpublished” option to perform direct I/O [zero-copy] • Unresolved issue with deleting out-of-scope ‘files’
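A back-of-the-envelope view of why that staging cost matters, using the timings from the slide:

```python
# Per-task timings from the slide.
copy_in, compute, copy_out = 15, 60, 15          # seconds

total = copy_in + compute + copy_out
overhead = (copy_in + copy_out) / total
print(f"Staging overhead: {overhead:.0%} of task time")   # -> 33%
```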
Swift workflow on Dash (8 cores)
Data Analysis Platforms • Twister@NCAR: • 8-core Intel box, GPFS filesystem • Dash@SDSC: • Compute nodes with SSD disk • vSMP node with SSD + ScaleMP • GPFS-WAN • Nautilus@NICS: • 1024-core SGI UltraViolet • GPFS filesystem
Execution time for AMWG diagnostics [figure annotations: nice/interesting result; not-so-interesting result]
Performance issues with AMWG diagnostics • Twister@NCAR: • No specialized hardware • Tiny system • Dash@SDSC: • vSMP system has a network driver problem • 7.7x slowdown: compute node vs. vSMP node • Nautilus@NICS: • Apparently excessive Java exec overhead
Conclusions • PetaApps project [Interactive Ensembles] • 155 year CCSM control run • 0.5° ATM & LND + 0.1° OCN & ICE • 0.5 to 1 TB/day of data to analyze • Increased ability to generate data creates an analysis bottleneck • Encouraging results for parallel workflow with Swift • No transformative benefit yet from expensive hardware! • [Analysis turnaround annotations: Overnight, After coffee break, Nearly instant] • No solution yet to data volume growth
Acknowledgements • NCAR: D. Bailey, F. Bryan, T. Craig, B. Eaton, J. Edwards [IBM], N. Hearn, K. Lindsay, N. Norton, M. Vertenstein • COLA: J. Kinter, C. Stan • U. Miami: B. Kirtman • U.C. Berkeley: W. Collins, K. Yelick (NERSC) • U. Washington: C. Bitz • NICS: M. Fahey, P. Kovatch • ANL: R. Jacob, R. Loy • LANL: E. Hunke, P. Jones, M. Maltrud • LLNL: D. Bader, D. Ivanova, J. McClean (Scripps), A. Mirin • ORNL: P. Worley, and many more… • Grant support: DOE DE-FC03-97ER62402 [SciDAC], DE-PS02-07ER07-06 [SciDAC]; NSF Cooperative Grant NSF01, OCI-0749206 [PetaApps], CNS-0421498, CNS-0420873, CNS-0420985 • Computer allocations: TeraGrid TRAC @ NICS, DOE INCITE @ NERSC, LLNL Grand Challenge • Thanks for assistance: Cray, NICS, and NERSC
Questions? dennis@ucar.edu