Power-aware Consolidation of Scientific Workflows in Virtualized Environments
Qian Zhu, Jiedan Zhu, Gagan Agrawal
Presented by Bin Ren
Outline
1. Background
2. Motivating Applications
3. Key Points Analysis
4. Design of pSciMapper
5. Experimental Evaluation
6. Conclusion
Background: Scientific Workflow
• A computing process model;
• Input: various data;
• Output: data products for presentation and visualization;
• An important feature: different workflow modules may have different resource requirements.
Background: Cloud Computing
• Some scientific workflows have already been run experimentally in cloud environments;
• Two important characteristics: virtualization technologies, and the pay-as-you-go model;
• An important issue: the tradeoff between power consumption and performance.
Background: Topic of This Work
• Focus: effective management of energy and resource costs for scientific workflows;
• Goal: minimize total power consumption and resource costs without substantial performance degradation;
• Strategy: consolidation of workflow tasks.
Key questions: Why can tasks be placed together without large performance degradation? Why does this strategy save power? How do we decide which tasks to place together?
Motivating Applications: Great Lake Forecasting System (GLFS)
• Used to forecast the meteorological conditions of Lake Erie;
• Structured as a directed acyclic graph of tasks;
• A compute-intensive application;
• An important feature: different workflow modules may have different resource requirements and usage patterns.
Motivating Applications: Resource Usage of GLFS Tasks
[Figures: resource-usage time series for Task 1 and Task 2]
Motivating Applications: Analysis of GLFS Task Resource Usage
• The scientific workflow tasks show periodic behavior with respect to CPU, memory, disk, and network usage;
• The resource usage of a workflow task is significantly smaller than its peak value for more than 80% of the time;
• Resource consumption can depend on the application parameters and the characteristics of the host server.
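The second observation is easy to check on a usage trace. Below is a minimal sketch on a synthetic, bursty CPU trace (the trace and the 50%-of-peak cutoff are illustrative assumptions, not GLFS measurements): a task with brief periodic bursts sits well below its peak most of the time.

```python
import numpy as np

# Hypothetical periodic CPU-usage trace (percent), mimicking the behavior
# described on the slide: short bursts on top of a long low-usage baseline.
t = np.arange(1000)
cpu = 10 + 5 * np.sin(2 * np.pi * t / 100)   # low periodic baseline
cpu[t % 100 < 10] += 80                      # brief periodic bursts

peak = cpu.max()
frac_below_half_peak = np.mean(cpu < 0.5 * peak)
print(f"peak = {peak:.1f}%, below 50% of peak {frac_below_half_peak:.0%} of the time")
```

For this trace the task stays under half its peak roughly 90% of the time, which is exactly the slack that consolidation exploits.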
Key Points Analysis
• Resource usage and power consumption;
• Virtualization and power consumption;
• Consolidation and power consumption.
Two important clarifications:
1. "Unit power" denotes the power consumption measured over each 1-second interval;
2. Even when the system is idle, it still draws 32 W.
Key Points Analysis: Resource Usage and Power Consumption
• CPU usage: the heavier the CPU workload, the higher the unit power consumption; however, the two are not proportional;
• Memory usage: similar to CPU (an important difference: cache misses);
• Disk and network I/O: roughly speaking, each adds a constant amount of unit power to the total.
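These three bullets can be sketched as a toy power model. The model shape and all constants except the 32 W idle draw are illustrative assumptions, not the paper's measured values: CPU power grows sub-linearly with utilization, and I/O contributes a roughly constant term.

```python
# Illustrative unit-power model for the relationships described above.
P_IDLE = 32.0     # idle draw in watts (from the slide)
P_CPU_MAX = 60.0  # assumed extra draw at 100% CPU utilization
P_IO = 5.0        # assumed constant extra draw during disk/network I/O

def unit_power(cpu_util: float, io_active: bool = False) -> float:
    """Estimate instantaneous power (W) for a CPU utilization in [0, 1]."""
    # A sub-linear exponent models the observed non-proportionality:
    # going from 0% to 50% CPU costs more than half of the 0%-to-100% jump.
    p = P_IDLE + P_CPU_MAX * cpu_util ** 0.7
    if io_active:
        p += P_IO
    return p

print(unit_power(0.0))  # idle draw: 32.0
print(unit_power(0.5))  # more than halfway between idle and full draw
```

The sub-linearity is the key point for consolidation: two half-loaded servers draw more total power than one fully loaded server.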
Key Points Analysis: Virtualization and Power Consumption
• For the workloads considered in this work, using virtualization technology does not incur significant power-consumption overhead.
Key Points Analysis: Consolidation and Power Consumption
Total power consumption is the product of unit power (U) and execution time (T):
• Without consolidation: Nor_Power_Con = U_app1 × T_app1 + U_app2 × T_app2;
• With consolidation: Con_Power_Con = U_con × T_con.
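A minimal numerical sketch of the two formulas above, with assumed values (not the paper's measurements): consolidation raises the unit power and stretches the execution time somewhat, but running one server instead of two still lowers the total.

```python
# Assumed unit power (W) and execution time (s) for two applications
# run separately, and for the same pair consolidated on one server.
U_app1, T_app1 = 70.0, 100.0
U_app2, T_app2 = 55.0, 100.0
U_con,  T_con  = 90.0, 115.0   # higher draw, 15% slowdown when consolidated

nor_power_con = U_app1 * T_app1 + U_app2 * T_app2  # two servers
con_power_con = U_con * T_con                      # one consolidated server
print(nor_power_con, con_power_con)                # 12500.0 10350.0
```

Here consolidation saves about 17% of the total energy despite the slowdown, because the two separate servers each pay the full baseline draw.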
Key Points Analysis: Consolidation and Power Consumption (continued)
Important observations:
1. Consolidating workloads with dissimilar resource requirements incurs only a small slowdown in execution time and saves a large amount of total power;
2. Consolidating workloads with similar resource requirements incurs a large slowdown in execution time, and the power consumption may not decrease because of the longer execution time.
Note: the increase in MM's unit power consumption comes partly from cache misses.
Design of pSciMapper: Overview of the Whole System
[Figure: system architecture]
Design of pSciMapper: Explanation
The main algorithm is hierarchical clustering, built from three parts:
• Data modeling;
• A distance metric;
• Evaluation of clustering results.
Design of pSciMapper: Data Modeling
• Predict the resource-usage time series from the application parameters and the hardware specification, using a Hidden Markov Model (HMM);
• Temporal feature extraction into a temporal signature:
  • Peak value: the maximum of the time series;
  • Relative variance: the normalized sample variance;
  • Pattern: a sequence of samples representing the shape of the series.
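The temporal-signature extraction can be sketched as follows. The exact normalization and the pattern-sampling scheme are assumptions for illustration, not the paper's precise definitions:

```python
import numpy as np

def temporal_signature(series, period, n_samples=8):
    """Extract (peak, relative variance, pattern) from a usage time series.
    'period' is the length of one cycle of the task's periodic behavior;
    the pattern is a short, peak-normalized sample of one cycle (assumed
    scheme, not the paper's exact definition)."""
    series = np.asarray(series, dtype=float)
    peak = series.max()
    rel_var = series.var(ddof=1) / (series.mean() ** 2)  # assumed normalization
    idx = np.linspace(0, period - 1, n_samples).astype(int)
    pattern = series[:period][idx] / peak                # scale-free shape
    return peak, rel_var, pattern

t = np.arange(200)
cpu = 20 + 10 * np.sin(2 * np.pi * t / 50)  # synthetic periodic CPU trace
peak, rel_var, pattern = temporal_signature(cpu, period=50)
print(peak, rel_var, len(pattern))
```

The signature compresses a long trace into a few numbers, which is what makes the pairwise distance computation in the next step cheap.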
Design of pSciMapper: Distance Metric
• Dist(i, j): the distance between tasks i and j;
• R: a resource type; with four types (CPU, memory, disk, network), the pair (R1, R2) takes 10 distinct values;
• aff_score(R1, R2): a pre-defined factor scoring the effect of consolidating resource R1 with R2, with values in (0, 1);
• Corr(peak_i, peak_j): the Pearson correlation between the two workloads with respect to a given resource's usage.
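A minimal sketch of such a pairwise distance. The aff_score values and the way the terms are combined are placeholders (the paper pre-defines the scores and gives the exact formula); what the sketch shows is the structure: a sum over the 10 resource pairs of a score weighted by the Pearson correlation of the two tasks' usage traces.

```python
import itertools
import numpy as np

RESOURCES = ["cpu", "mem", "disk", "net"]

# Placeholder affinity scores in (0, 1) for each of the 10 resource pairs;
# the paper defines these, the uniform value here is an assumption.
AFF_SCORE = {pair: 0.5 for pair in
             itertools.combinations_with_replacement(RESOURCES, 2)}

def distance(task_i, task_j):
    """Sum, over the 10 unordered resource pairs, of the pair's score
    weighted by the Pearson correlation of the two usage traces.
    (Illustrative combination, not the paper's exact formula.)"""
    d = 0.0
    for r1, r2 in itertools.combinations_with_replacement(RESOURCES, 2):
        corr = np.corrcoef(task_i[r1], task_j[r2])[0, 1]
        d += AFF_SCORE[(r1, r2)] * corr
    return d

t = np.arange(100)
task_a = {r: 10 + 5 * np.sin(2 * np.pi * t / 20 + k) for k, r in enumerate(RESOURCES)}
task_b = {r: 8 + 4 * np.sin(2 * np.pi * t / 20 + k + 1) for k, r in enumerate(RESOURCES)}
print(distance(task_a, task_b))
```

Note that 4 resource types yield C(4,2) + 4 = 10 unordered pairs, matching the count on the slide.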
Design of pSciMapper: Evaluating a Clustering Result
The objective:
• Map each clustering strategy onto the set of servers;
• Evaluate the execution time and power consumption of each clustering strategy.
The method:
• At the bottom level: Kernel Canonical Correlation Analysis (KCCA);
• At the other levels: Nelder-Mead, an optimization algorithm.
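Nelder-Mead is a derivative-free simplex search, which fits here because the predicted ⟨power, time⟩ objective has no usable gradient. A minimal sketch using SciPy's implementation, where the cost function is a made-up stand-in for the KCCA-predicted values and the tuned knob `x` is a hypothetical CPU share for a consolidated server:

```python
from scipy.optimize import minimize

def cost(params):
    """Toy energy objective (power * time) over a CPU share x in [0, 1].
    Both sub-models are illustrative assumptions, not the paper's."""
    x = min(max(params[0], 0.0), 1.0)  # clamp the share into [0, 1]
    power = 32 + 60 * x * x            # assumed: power grows with share
    time = 100 / (0.2 + 0.8 * x)       # assumed: time shrinks with share
    return power * time

# Derivative-free search from an initial guess of half the CPU.
result = minimize(cost, x0=[0.5], method="Nelder-Mead")
print(result.x[0], cost(result.x))
```

With these toy models the search settles on an intermediate share: giving the consolidated server too little CPU stretches the execution time, while giving it everything wastes power.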
Design of pSciMapper: Workflow Consolidation Algorithm
1. Start from a one-to-one assignment of tasks to servers;
2. Generate the resource-usage time series (HMM) and evaluate time and power with KCCA;
3. Merge clusters according to the distance metric;
4. Find the optimal assignment and re-evaluate time and power with Nelder-Mead;
5. Repeat steps 3 and 4 until the merge threshold is reached: the time degradation becomes too large or the power saving becomes too small.
Optimization: dynamic CPU provisioning.
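The agglomerative loop above can be sketched compactly. Everything model-specific is replaced by placeholders: each task is just a CPU-demand number, the "distance" simply prefers merging dissimilar loads, and the slowdown estimate is a crude contention model (the paper uses the temporal-signature distance, KCCA, and Nelder-Mead instead).

```python
def consolidate(demands, max_slowdown=0.15):
    """Greedy agglomerative consolidation sketch: repeatedly merge the
    'closest' clusters until a merge would exceed the slowdown threshold."""
    clusters = [[d] for d in demands]   # step 1: one task per server

    def dist(a, b):                     # placeholder distance metric:
        return -abs(sum(a) - sum(b))    # dissimilar total loads are 'closer'

    def slowdown(cluster):              # assumed contention model:
        return max(0.0, sum(cluster) - 1.0)  # penalty once demand exceeds 1.0

    while len(clusters) > 1:
        # step 3: pick the closest pair of clusters
        _, i, j = min((dist(a, b), i, j)
                      for i, a in enumerate(clusters)
                      for j, b in enumerate(clusters) if i < j)
        merged = clusters[i] + clusters[j]
        if slowdown(merged) > max_slowdown:  # step 5: threshold reached
            break
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)              # steps 3-4 repeat
    return clusters

print(consolidate([0.6, 0.5, 0.3, 0.2]))
```

With these four demands the loop packs three light tasks onto one server and leaves one alone, stopping when the next merge would overload a server beyond the 15% threshold.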
Design of pSciMapper: Example
Five candidate tasks with different resource profiles:
• C1: CPU moderate, Mem low, Disk low, Net low
• C2: CPU moderate, Mem low, Disk low, Net moderate
• C3: CPU moderate, Mem high, Disk high, Net low
• C4: CPU high, Mem moderate, Disk low, Net low
• C5: CPU low, Mem low, Disk high, Net moderate

Assignment → <power, time>:
• {(C1, C2, C3), S2}, {(C4, C5), S1} → <93.62, 92.87>
• {(C1, C2), S2}, {C3, S5}, {(C4, C5), S1} → <135.11, 88.03>
• {C1, S2}, {C2, S3}, {C3, S5}, {C4, S1}, {C5, S4} → <180.56, 83.93>

A small modification to the figure in the paper: according to the paper's description, the positions of the two times 83.93 and 92.87 should be switched.
Experimental Evaluation: Setup
• Algorithms compared:
  • Without consolidation;
  • pSciMapper + static allocation;
  • pSciMapper + dynamic provisioning;
  • Optimal + work conserving.
• Metrics: normalized total power consumption; execution time.
• Virtualization environment: Xen 3.0.
Experimental Evaluation: Applications
• Two real applications: GLFS and VR;
• Three synthetic applications: SynApp1, SynApp2, and SynApp3.
Experimental Evaluation: Normalized Power Consumption, GLFS
• Four different configurations were tested;
• In all cases, pSciMapper saves power (by as much as 20%);
• Without dynamic provisioning, pSciMapper is slightly worse than the optimal method; with it, pSciMapper does much better.
Experimental Evaluation: Normalized Power Consumption, VR and Synthetic Applications
• Results are similar to GLFS;
• Dynamic provisioning brings little additional improvement for VR, SynApp1, and SynApp2, since it targets CPU and memory (especially CPU).
Experimental Evaluation: Execution Time, GLFS
• The clustering stop threshold is set to a 15% degradation of execution time;
• The results show that the degradation of pSciMapper + dynamic provisioning stays within 12%.
Experimental Evaluation: Execution Time, VR and Synthetic Applications
• Results are similar to GLFS.
Experimental Evaluation: Consolidation Overhead and Scalability
• The overhead of pSciMapper is much smaller than that of the heuristic optimal method;
• pSciMapper scales well.
Conclusion
• Designed and implemented pSciMapper, a power-aware consolidation framework based on hierarchical clustering;
• pSciMapper reduces total power consumption by up to 56%, with at most a 15% slowdown, for the workflow applications;
• pSciMapper incurs low consolidation overhead, so it scales well to large scientific workflow applications.
Thank you for listening! Any questions?