Computation Time Analysis - Climate Reanalysis Data • Dipanwita Dasgupta • University of Notre Dame • Graduate Operating Systems
Motivation • Climate Analysis: Why is it important? • Increase in the occurrence of climate hazards • Climate Reanalysis Data • Data-Centric Approach • Climate Network
Dataset • National Centers for Environmental Prediction / National Center for Atmospheric Research (NCEP/NCAR) Reanalysis Dataset • Composed of data at 17 pressure levels • Total of approximately 10,000 grid points • Factors affecting climate
Background • Climate Network Model • Limited to 7 factors affecting climate • This limitation affects predictive modeling • Computation Time
Problem • Computation has 3 steps • Step 1: Reading the data from file • Step 2: Calculation at each level • Step 3: Combining the results • Step 2 is highly computation-intensive • The present code can only handle 20 units of data at a time
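The three steps above can be sketched as a sequential loop over the 17 pressure levels. This is a minimal illustration, not the author's code: the function names are hypothetical, `read_level` returns synthetic numbers in place of the reanalysis file (whose format the slides do not give), and `calculate_level` uses a simple mean as a stand-in for the expensive step 2.

```python
def read_level(level):
    """Step 1 (placeholder): return 20 units of synthetic data for one level."""
    return [float(level + i) for i in range(20)]


def calculate_level(values):
    """Step 2 (placeholder): stand-in for the expensive per-level calculation."""
    return sum(values) / len(values)


def combine(per_level_results):
    """Step 3: gather the (level, result) pairs into one mapping."""
    return dict(per_level_results)


def run_sequential(levels):
    """Run steps 1 and 2 for each level in turn, then combine."""
    return combine((lv, calculate_level(read_level(lv))) for lv in levels)


summary = run_sequential(range(17))  # 17 pressure levels, as in the dataset
```

Keeping steps 1 and 2 behind separate functions makes it easy to time each step on its own and to run them per level elsewhere.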
Actual Work • Analyzed time taken to run on a single machine • Distributed Framework • Steps 1 and 2 (previous slide) for one level are independent of those for every other level • Ran the levels in a distributed fashion • Used the CRC SGE Machine
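Because the per-level work is independent, it can be dispatched level by level. The sketch below shows the idea on a single machine, with a thread pool standing in for the separate SGE jobs used in the actual work. The function names and the placeholder computation are assumptions; for CPU-bound Python code, a process pool or real cluster jobs would be needed to see a speed-up.

```python
from concurrent.futures import ThreadPoolExecutor


def process_level(level):
    """Steps 1 and 2 for one level (both are placeholders)."""
    values = [float(level + i) for i in range(20)]  # step 1: "read" the data
    return level, sum(v * v for v in values)        # step 2: "calculate"


def run_per_level(levels, max_workers=4):
    """Run every level as an independent task, then combine (step 3)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(process_level, levels))


parallel = run_per_level(range(17))
sequential = dict(process_level(lv) for lv in range(17))
```

Since no level depends on another, both orderings produce the same combined result.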
Assumptions • Used only one parameter • Geopotential Height • Only one measure of dispersion • Euclidean Distance • Processing is similar for other parameters as well as for other measures of dispersion
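For reference, the Euclidean distance between two equal-length series can be computed as follows. The slides do not say which series were compared, so the two grid-point series here are made-up geopotential height values used only for illustration.

```python
import math


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length numeric sequences."""
    if len(a) != len(b):
        raise ValueError("series must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Made-up geopotential height values (metres) at two hypothetical grid points.
point_a = [5500.0, 5510.0, 5495.0]
point_b = [5503.0, 5506.0, 5495.0]

distance = euclidean_distance(point_a, point_b)  # sqrt(3**2 + 4**2 + 0**2)
```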
Experimental Set-up • NCEP Reanalysis Dataset • 20 units of longitude • Sequential Execution • Used the school workstation desktop • Distributed Framework • Used opteron.crc.nd.edu
Distributed Framework: Setup • opteron.crc.nd.edu • Submitted Bash script • Ran 10 simulations per level • Took the average
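The measurement procedure (10 runs per level, then the average) can be sketched as a small timing harness. The actual runs were driven by a Bash script submitted to the cluster; this Python version only illustrates the averaging, and `process_level` is a hypothetical placeholder for the real per-level work.

```python
import statistics
import time


def time_once(task, *args):
    """Return the wall-clock seconds taken by one call to task(*args)."""
    start = time.perf_counter()
    task(*args)
    return time.perf_counter() - start


def average_runtime(task, *args, runs=10):
    """Average the runtime of task(*args) over the given number of runs."""
    return statistics.mean([time_once(task, *args) for _ in range(runs)])


def process_level(level):
    """Placeholder for the real per-level work."""
    return sum((level + i) ** 2 for i in range(10_000))


# One averaged timing per pressure level.
averages = {level: average_runtime(process_level, level) for level in range(17)}
```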
Results Analysis • Distributed Framework works better than Sequential Execution • Expected Speed-Up not achieved • Reading data from the file took more time than expected • Reduced time for the other steps
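One way to see why a slow read step caps the speed-up is Amdahl's law. The sketch below assumes, purely for illustration, that file reading does not get faster with more workers (for example, because of contention on a shared file system); the slides do not confirm this, and the fraction used is a made-up value, not a measured result.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speed-up when only part of the runtime scales with workers."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Illustrative only: if half of the runtime were non-scaling reads,
# 17 workers (one per pressure level) could not even double the speed.
bound = amdahl_speedup(parallel_fraction=0.5, workers=17)
```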
Future Work • Optimization of reading data from file • Use various file systems – NFS/AFS • Include more measures of dispersion • Increase the number of parameters