Forecasting with Cyber-physical Interactions in Data Centers Lei Li leili@cs.cmu.edu PDL Seminar
Outline • Overview of time series mining • Time series examples • What problems do we solve • Motivation • Experimental setup • ThermoCast: the forecasting model • Results • Other time series models and algorithms (c) Lei Li 2012
What are co-evolving time series? Correlated multidimensional time sequences with joint temporal dynamics
Motion Capture [Li et al 2008a] • Goal: generate natural human motion • Game ($57B) • Movie industry • Challenge: • Missing values • “naturalness” • [Figure: walking motion, right hand and left hand]
Environmental Monitoring • Problem: early detection of leakage & pollution • Challenge: noise & large data • [Figure: chlorine level in drinking water systems [Li et al 2009]]
Network Security • Challenge: anomaly detection in computer networks & online activity • [Figures: BGP # updates on backbone, from http://datapository.net/; web clicks for TV and web clicks for news, from NTT]
Time Series Mining Problems • Forecasting • Imputation (missing values) • Compression • Segmentation, change/anomaly detection • Clustering • Similarity queries • Scalable/Parallel/Distributed algorithms See my thesis for algorithms covering these problems
Datacenter Monitoring & Management • Goal: save energy in data centers • US alone: $7.4B power consumption (2011) • Challenge: • Huge data (1TB per day) • Complex cyber-physical systems • [Figure: temperature in datacenter]
Typical Data Center Energy Consumption • Google data center [Barroso 09] • LBL data center [LBNL/PUB-945]
Towards Thermal Aware DC Management • Data centers are often over-provisioned, with ≈40% of energy spent on cooling (total=$7.4B) • How can we improve energy efficiency in modern multi-megawatt data centers? • [Figure: JHU data center with Genomote]
Air cycle in DC
Possible Ways for Saving Cooling and Computing Cost • Challenges: • airflow interaction, spatial placement, SLA, … • Possible direction: • Shut down unused machines according to workload • [Figure: example MSN workload]
Towards Data Driven AC control and server management • Reactive energy saving: • slow down cooling fan in CRAC • raise AC temperature set points • Proactive data center management: • predicting temperature distribution and thermal aware placement of workload • Constraints: supply air temperature < threshold; max(active inlet air temperature) < threshold
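The two threshold conditions on this slide can be read as a safety check that any cooling or placement change must pass. Below is a minimal sketch of that check. The function name, argument names, and the idea of passing a per-server "active" flag are illustrative assumptions and do not come from the talk.

```python
from typing import Sequence


def cooling_is_safe(
    supply_temp_c: float,
    inlet_temps_c: Sequence[float],
    active: Sequence[bool],
    supply_limit_c: float,
    inlet_limit_c: float,
) -> bool:
    """Check the two slide constraints (hypothetical helper, not the authors' code).

    1. supply air temperature < threshold
    2. max(inlet air temperature over active servers) < threshold
    """
    if supply_temp_c >= supply_limit_c:
        return False
    # Only servers that are switched on count toward the inlet constraint.
    active_inlets = [t for t, on in zip(inlet_temps_c, active) if on]
    # With no active servers the inlet constraint holds trivially.
    return not active_inlets or max(active_inlets) < inlet_limit_c
```

In this reading, shutting down a hot server removes its inlet reading from the second constraint, which is what leaves room to raise the set point.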
Big Picture: Predictive AC Control and Server Management • [Diagram components: server/workload management, computing energy model, sensor measuring, temperature prediction, cooling energy model, CRAC control]
Experimental setup • Tested in JHU data center with 171 1U servers, instrumented with a network of 80 sensors
Sample measurements
Observations • Temperature difference cycle (max/min temp. on the same rack) is in anti-phase with air velocity cycle. • Middle and bottom sections are coldest; Top is hottest • Shutting down under-utilized servers could reduce energy consumption.
What happens when shutting down servers? • [Figure annotation: shut down]
ThermoCast [Li et al, KDD 2011] • Given: intake temperatures, outtake temperatures, workload for each server, and floor air speed • Goal: forecast the temperature distribution and support thermal aware placement of workload • Approach: a zonal forecasting model • divide the machine room into zones, and each rack into sections.
Assumptions • A0: incompressible air • A1: environmental temperature is constant • A2: supply air temperature is constant within a period • A3: constant server fan speed • A4: vertical air flow at the outtake is negligible • A5: vertical air flow at the intake is linear in height
Sensor measurements & Air interactions
ThermoCast
ThermoCast Model • [Model diagram labels: outlet temp, inlet temp, floor air speed] • Derived from fluid dynamics and thermodynamics together with the assumptions [Li et al, KDD 2011]
Parameter Learning • [Constrained optimization problem ("s.t."); the equation is not preserved in this transcript]
ThermoCast Results • Q1: How accurately can a server learn its local thermal dynamics for prediction? • 2x better, using 90 minutes of training data and predicting 5 minutes ahead • [Figure: AR vs. ThermoCast; labels: 75%, 100%, shutdown]
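For orientation, the "AR" label in the comparison presumably denotes an autoregressive baseline. The sketch below shows a generic AR(p) forecaster fitted by least squares. It is not the ThermoCast model and not necessarily the exact baseline used in the talk. The one-sample-per-minute rate and the function names `fit_ar` and `forecast_ar` are assumptions made for illustration.

```python
import numpy as np


def fit_ar(series, order):
    """Fit x[t] = sum_k a_k * x[t-k] + c by ordinary least squares.

    Returns an array [a_1, ..., a_order, c]. Generic baseline sketch only.
    """
    x = np.asarray(series, dtype=float)
    # Each row holds the `order` previous values (most recent first) plus a 1 for the intercept.
    rows = [np.append(x[t - order:t][::-1], 1.0) for t in range(order, len(x))]
    design = np.vstack(rows)
    target = x[order:]
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


def forecast_ar(series, coef, steps):
    """Iterate the fitted recurrence `steps` samples past the end of `series`."""
    order = len(coef) - 1
    history = list(np.asarray(series, dtype=float)[-order:])
    predictions = []
    for _ in range(steps):
        lags = history[::-1]  # most recent value first, matching fit_ar
        nxt = float(np.dot(coef[:-1], lags) + coef[-1])
        predictions.append(nxt)
        history = history[1:] + [nxt]  # slide the window forward
    return np.array(predictions)
```

Assuming one temperature sample per minute, a 90-minute training window and a 5-minute horizon would correspond to `forecast_ar(window, fit_ar(window, order), steps=5)` with `len(window) == 90`.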
ThermoCast Results • Q2: How long ahead can ThermoCast forecast thermal alarms? • 2x faster • FAR = false alarm rate; MAT = mean look-ahead time
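The slide names the two metrics but does not define them precisely. The sketch below uses one plausible reading, stated here as an assumption: a predicted alarm counts as correct if a real alarm follows within `max_lead` time units, FAR is the fraction of predicted alarms with no such match, and MAT is the average lead time over the matched ones. The function name and the matching rule are hypothetical.

```python
def alarm_metrics(predicted_times, actual_times, max_lead):
    """Return (FAR, MAT) under the assumed definitions described above.

    predicted_times: times at which the forecaster raised an alarm
    actual_times:    times at which a thermal alarm really occurred
    max_lead:        longest lead time that still counts as a match
    """
    leads = []
    false_alarms = 0
    for p in predicted_times:
        # Lead times to every real alarm that follows this prediction closely enough.
        candidates = [a - p for a in actual_times if 0 <= a - p <= max_lead]
        if candidates:
            leads.append(min(candidates))
        else:
            false_alarms += 1
    far = false_alarms / len(predicted_times) if predicted_times else 0.0
    mat = sum(leads) / len(leads) if leads else 0.0
    return far, mat
```

Under these definitions a good forecaster keeps FAR low while pushing MAT up, since a longer look-ahead leaves more time to react.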
Implication on Capacity Gain • Preliminary results comparing workload placement strategies: • 5-minute forecast length • With the same cooling: • Inlet temp with ThermoCast: 13.75 °C • Inlet temp with static profiling: 16.5 °C • Assuming the servers consume 200 W on average (Dell PowerEdge 1950), we gain an extra 26% computing power with the same cooling
Contributions and Impact • Predictability: a hybrid approach that integrates thermodynamics and sensor data • Scalable learning/training thanks to the zonal thermal model • Real data and instrumentation in a data center with practical workload • Projected impact: can handle an extra 26% workload (e.g. PUE 1.5 → PUE 1.4)
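The PUE figures follow from the standard definition, PUE = total facility energy / IT energy, if two simplifying assumptions are made that the slide does not spell out: the extra 26% workload translates into 26% more IT energy, and the non-IT overhead (cooling and the rest) stays fixed. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
def pue_after_extra_load(pue_before: float, extra_it_fraction: float) -> float:
    """PUE after adding IT load while non-IT overhead stays constant (assumption).

    PUE = (IT energy + overhead) / IT energy, with the original IT energy normalized to 1.
    """
    overhead = pue_before - 1.0          # non-IT energy per unit of original IT energy
    it_after = 1.0 + extra_it_fraction   # IT energy after the extra workload
    return (it_after + overhead) / it_after


# (1.26 + 0.5) / 1.26 is about 1.397, which rounds to the 1.4 quoted on the slide.
print(round(pue_after_extra_load(1.5, 0.26), 2))
```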
DynaMMo: imputation/forecasting • Goal: recover the missing values • Details in [Li et al, KDD 2009] • [Figure: sensor 1, sensor 2, …, sensor m over time, with a blackout period]
DynaMMo result • Dataset: CMU Mocap #16 (mocap.cs.cmu.edu) • Methods compared: Spline, MSVD [Srebro’03], Linear Interpolation, our DynaMMo • More results in [Li et al, KDD 2009] • [Figure: reconstruction error (lower is better; ideal marked) vs. average missing length (longer is harder)]
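Linear interpolation is one of the baselines named in this comparison. As a point of reference, here is a minimal sketch of that baseline for a single sensor stream with missing values marked as NaN. It is not DynaMMo, and the function name and NaN convention are assumptions for illustration.

```python
import numpy as np


def interpolate_missing(values):
    """Fill NaN gaps in a 1-D sequence by linear interpolation between known samples.

    Gaps at the start or end are filled with the nearest known value,
    because np.interp clamps outside the known range.
    """
    x = np.asarray(values, dtype=float)
    known = ~np.isnan(x)
    if not known.any():
        raise ValueError("need at least one observed value")
    idx = np.arange(len(x))
    filled = x.copy()
    filled[~known] = np.interp(idx[~known], idx[known], x[known])
    return filled
```

Each sequence is treated independently here, so the baseline cannot use correlations between sensors, and a straight line becomes a poorer guess as the gap grows.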
PLiF and CLDS for clustering • BGP data: hierarchical clustering + PLiF features • Details in [Li et al, VLDB 2010] and [Li & Prakash, ICML 2011]
CLDS Clustering Mocap Data • CLDS two features: accuracy = 93.9% • PCA top 2 components: accuracy = 51.0% • [Figure legend: walking motion, running motion]
WindMine • Goal: find patterns and anomalies from user-click streams
Discoveries by WindMine • [Figure panels: job website, weather, kids, health]
Conclusion • Time series mining has many applications • Data centers consume a great deal of energy, and cooling is a large part of the cost • Sensor networks find use in data center monitoring • ThermoCast: the forecasting model • Other time series models and algorithms • DynaMMo for imputation • PLiF & CLDS for clustering • WindMine for web clicks
References • Lei Li, et al. ThermoCast: A Cyber-Physical Forecasting Model for Data Centers. KDD 2011. • Lei Li, et al. Time Series Clustering: Complex is Simpler. ICML 2011. • Yasushi Sakurai, Lei Li, et al. WindMine: Fast and Effective Mining of Web-click Sequences. SDM 2011. • Lei Li, et al. Parsimonious Linear Fingerprinting for Time Series. VLDB 2010. • Lei Li, et al. DynaMMo: Mining and Summarization of Coevolving Sequences with Missing Values. KDD 2009.
Thanks! contact: Lei Li (leili@cs.cmu.edu) papers, software, datasets on http://www.cs.cmu.edu/~leili