Machine Learning for Adaptive Power Management. Authors: G. Theocharous et al. Presenter: Guangdong Liu. April 1st, 2011. Outline: Introduction, Background, Related Work, Contribution, The Context-Based Solution, Incorporating Stochastic Processes, An Introduction to Machine Learning.
Introduction • Power Management (for laptops) • Motivation: mobile systems face battery-life issues, and high-performance systems face heating issues • Objective: maximize battery life while minimizing annoyance to the user • Approach: place components into various power-saving states • Crucial point: which component should be shut down, and when
Introduction • Existing commercial solutions: Timeout Policies • Turn off a component once the time since it was last used exceeds a predefined threshold • Control the annoyance by varying the threshold • Widely implemented across operating systems • Windows* OS has several built-in power management schemes that let the user choose between different threshold levels
Introduction • Existing commercial solutions: Timeout Policies • Advantages • Simple and robust • Disadvantages • Aggressiveness is fixed by the length of the timeout • Reacting too slowly wastes power during inactivity periods • Reacting too fast annoys the user at an inappropriate time • Non-adaptive
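A timeout policy as described on this slide can be sketched in a few lines of Python. This is only an illustration: the function names, the one-sample-per-second activity trace, and the threshold values are assumptions made for the example, not part of the presentation.

```python
def timeout_policy(idle_seconds, threshold_seconds):
    """Return True if the component should be turned off."""
    return idle_seconds > threshold_seconds


def simulate(activity, threshold_seconds):
    """Replay a trace and return the times at which the component is switched off.

    activity: list of booleans, one per second (True = component was used).
    The trace format is a hypothetical choice for this sketch.
    """
    off_times = []
    idle = 0
    is_on = True
    for t, used in enumerate(activity):
        if used:
            idle = 0
            is_on = True  # any use wakes the component up again
        else:
            idle += 1
            if is_on and timeout_policy(idle, threshold_seconds):
                is_on = False
                off_times.append(t)
    return off_times


trace = [True, False, False, False, False, True, False, False, False, False]
aggressive = simulate(trace, threshold_seconds=3)
conservative = simulate(trace, threshold_seconds=5)
```

A smaller threshold switches the component off more often (more savings, more risk of annoyance), while a larger one may never fire on short idle periods.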
Introduction • Adaptive Power Management • Objective: to make PM “autonomous” • What does “autonomous” refer to in this context? • Intelligently decide when to place a component into various power saving states given the user activity • An instance of autonomic computing systems
Background • Autonomic Computing Systems • Motivation • Current programming paradigms, methods, and management tools are inadequate to handle the scale, complexity, dynamism, and heterogeneity of emerging systems
Background • Analogy with the human body: when we run, it increases our heart and breathing rate without requiring our conscious involvement
Background • Goal of Autonomic Systems • Build a self-managing system with minimal human intervention • Characteristics of Autonomic Systems • Self-Configuring • Self-Adapting • Self-Optimizing • Self-Healing • Self-Protecting • Highly Decentralized • Heterogeneous Architecture
Background • How Autonomic Systems Work • [Figure: an Autonomic Manager (Monitor, Analyze, Plan, Execute, with shared Knowledge) connected to a Managed Element through the interfaces labeled S and E]
Background • Power Management
Background • APM As An Autonomic System
Background • Adaptive Power Management • Objective: to make PM “autonomous” • What does “autonomous” refer to in this context? • Intelligently decide when to place a component into various power saving states given the user activity • Challenges: (1) better tradeoff between high power saving and low user annoyance (2) accurate modeling of real world uncertainty
Background • Uncertainty in APM • Perception Uncertainty • User context cannot be directly observed from sensors, such as keyboard and mouse activity, or the currently active application • Action Uncertainty • Turning off a component can generate uncertainty: (1) reflected by the time it takes to turn a component on and off: placing the machine in standby does not always take the same amount of time (2) reflected by the effects on the user’s context: if the user context is a measure of idleness … if the user context is a measure of the user’s mode of operation
Background • Adaptive Power Management • Idea: take into account two essential things: “Context” and “Uncertainty” • What is “Context” for an APM? • What is “Uncertainty” for an APM? • How can we account for the two factors in APM?
Background • How to handle the uncertainty in APM • Machine Learning • Construct programs that automatically improve with experience • The ability of a program to learn from experience, that is, to modify its execution on the basis of newly acquired information • Using machine learning algorithms, APM systems can automatically learn to map laptop usage patterns into power management actions specifically for individual users
Related Work • Machine Learning Based Prediction in APM • [16] attempts to predict the length of the idle period based on two thresholds, calculated by regression and obtained manually from data, and observes that typically a short idle time is followed by a long active time and vice versa • [7] predicts the future idleness based on “recency”: the future idleness is predicted as an exponentially weighted sum of recent delays • But the threshold parameters are obtained based on non-adaptive “recency”
Related Work • Simple Stochastic Process Approaches in APM • In [1][14][15], a single Markov Decision Process (MDP) or a Semi-MDP (SMDP) is constructed • An SMDP is an MDP in which the next state depends not only on the current state but also on how long the current state has been active • However, it is assumed that the state can be directly observed from sensors without uncertainty • No models for user context or consideration of user annoyance
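The recency-based prediction mentioned for [7] can be illustrated with an exponentially weighted average. The sketch below is an assumption-laden illustration, not the method of [7] itself: the name `predict_idle`, the smoothing factor `alpha`, and the initial prediction are all hypothetical.

```python
def predict_idle(history, alpha=0.5, initial=0.0):
    """Predict the next idle period as an exponentially weighted sum.

    history: observed idle lengths, oldest first.
    alpha:   weight given to the most recent observation (assumed value).
    """
    prediction = initial
    for observed in history:
        # Recent observations dominate; older ones decay geometrically.
        prediction = alpha * observed + (1 - alpha) * prediction
    return prediction


next_idle = predict_idle([10, 20], alpha=0.5)
```

Because `alpha` is fixed in advance, the predictor weights history the same way for every user, which is the non-adaptive aspect the slide criticizes.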
Contribution • Summarize past approaches to the APM problem • Approaches • Develop a context-based approach that maps user patterns to power-saving actions (current work, with experimental results presented in the paper) • Establish a stochastic model-based approach (future work)
The Context-Based Solution • Objective: to make a good tradeoff between the power savings and the perceived performance degradation (the annoyance)
The Context-Based Solution • Metrics of APM actions and annoyance • Quantify the power savings via the power drawn in each state • Laptop on (18.5 W) • LCD off (14 W) • WLAN off (17.5 W) • CPU at low frequency (15 W) • Laptop in standby mode (0.7 W)
The Context-Based Solution • Metrics of APM actions and annoyance • Quantify the annoyance (based on interviews) • Turn down the CPU frequency by mistake (1) • Turn off the WLAN by mistake (3) • Turn off the LCD by mistake (7) • Move to standby mode by mistake (10) • How to detect mistakes • An action is a mistake if the component is needed after it was turned off. For example, if the LCD is turned off and the user then opens a new application, turning it off was a mistake
The Context-Based Solution • Step #1: The Direct Approach • Two kinds of direct policies • Timeout-based policies • Logistic regression, k-nearest-neighbors, and the C4.5 decision tree • For each APM action, a separate classifier is trained on the given data • What is the given data? • Input: the sensor measurements, including active application, keyboard and mouse activity, CPU load, and network traffic • Output: whether to turn the component on or off
The Context-Based Solution • Step#2: Context-Based Policy Learning • The basic idea is to partition the data into contexts, for each of which a separate classifier is trained • Choose the decision threshold for each classifier such that the overall power saving is maximized • In a general case, context could be past idleness or any partitioning of the data that improves performance • Specifically, the paper defines context to be “the time since a component was last active”
The Context-Based Solution • Define Context • Partition the data into 30 categories • Category 1 means that the component was active in the previous step • Category 30 means that the component has been idle for the last 600 time steps • The remaining 28 categories lie in between
The Context-Based Solution • Threshold computation for contexts • Use an optimization algorithm to determine the thresholds for each context classifier • Start by setting the least annoying thresholds for all classifiers • At each step, increase the threshold with the largest ratio of power-savings gain to annoyance increase, subject to a global annoyance constraint
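One way to picture the 30 context categories is a function that maps "time steps since the component was last active" to a category index. The slide only fixes the two endpoints (category 1 and category 30); the equal-width spacing of the 28 intermediate categories below is an assumption of this sketch, as is the function name.

```python
def context_category(idle_steps, max_idle=600, n_categories=30):
    """Map idle time (in time steps) to a context category in 1..n_categories.

    Category 1: the component was active in the previous step.
    Category n_categories: idle for at least max_idle steps.
    Intermediate categories: equal-width buckets (hypothetical choice).
    """
    if idle_steps <= 0:
        return 1
    if idle_steps >= max_idle:
        return n_categories
    n_mid = n_categories - 2
    # Idle times 1 .. max_idle-1 are spread evenly over categories 2 .. n-1.
    return 2 + (idle_steps - 1) * n_mid // (max_idle - 1)


categories = [context_category(t) for t in range(0, 601)]
```

A separate classifier would then be trained on the samples that fall into each category.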
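The greedy threshold search described on this slide might look like the following sketch. It assumes each context offers a discrete list of threshold settings, ordered from least to most aggressive, with a known power saving and annoyance for each; that data layout, the function name, and the numbers are hypothetical and only illustrate the ratio-based selection.

```python
def greedy_thresholds(options, annoyance_budget):
    """Pick one threshold setting per context.

    options[c]: list of (power_saving, annoyance) pairs for context c,
                ordered from the least to the most aggressive setting.
    Returns the index of the chosen setting for each context.
    """
    choice = [0] * len(options)  # start from the least annoying settings
    total_annoyance = sum(opts[0][1] for opts in options)
    while True:
        best, best_ratio, best_cost = None, 0.0, 0.0
        for c, opts in enumerate(options):
            nxt = choice[c] + 1
            if nxt >= len(opts):
                continue  # already at the most aggressive setting
            d_save = opts[nxt][0] - opts[choice[c]][0]
            d_annoy = opts[nxt][1] - opts[choice[c]][1]
            if d_save <= 0 or total_annoyance + d_annoy > annoyance_budget:
                continue  # no gain, or it would break the global constraint
            ratio = d_save / d_annoy if d_annoy > 0 else float("inf")
            if ratio > best_ratio:
                best, best_ratio, best_cost = c, ratio, d_annoy
        if best is None:
            return choice
        choice[best] += 1
        total_annoyance += best_cost


example = [
    [(0, 0), (10, 1), (12, 5)],  # context 0
    [(0, 0), (4, 1), (9, 2)],    # context 1
]
chosen = greedy_thresholds(example, annoyance_budget=3)
```

In this toy run, context 0 stops at its second setting because the third would exceed the budget, while context 1 can afford its most aggressive setting.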
Experiment Setup • Two kinds of baselines • Timeout policy • Naïve classifiers: logistic regression, k-nearest-neighbors, C4.5 decision tree • A new scheme: context-based logistic regression • Data Source • There are 42 traces, collected from 7 users, representing a cumulative 210 hours of usage • Performance is compared at the same annoyance level
Incorporating Stochastic Processes • Model-Based Approaches • Capture the temporal dynamics of user and system state as well as the annoyance and power costs • Decouple the decision-making process from the problem of learning and estimating the model of the environment • Learning a model refers to the process of defining/discovering domain variables and how they relate to each other • Using the models refers to the process of computing decisions given the model
Incorporating Stochastic Processes • Markov Decision Processes • Powerful in domains where actions change the state stochastically and where the reward for an action is usually delayed • Objective • Maximize the agent’s long-term cumulative reward • Markov property • The true state captures all the information needed to describe the system • Given the state of the system, the future is independent of the past
Incorporating Stochastic Processes • Markov Decision Processes
Incorporating Stochastic Processes • Markov Decision Processes • Formal model • <S, A, T, R> • S: a finite set of states • A: a finite set of actions • T(s’|s, a): the transition probability from state s to state s’ under action a • R(s, a): the reward for taking action a in state s • Advantages & Disadvantages • Simplicity and efficiency • Cannot capture unobservable dynamics • Depends on the assumption that the state of the system can be estimated with no errors
Incorporating Stochastic Processes • Dynamic Bayesian Networks • Some of the variables are observed and some are not. An agent reasons about the state of the system indirectly through the observed variables. • A special DBN: the Hidden Markov Model • <S, T, Z, O> • S: a finite set of states • Z: a finite set of observations • T(s’|s): the transition probability from s to s’ • O(z|s): the probability of observing z in s • Advantages & Disadvantages • Captures complex dynamics that are not completely observable • Lacks support for decision making
Incorporating Stochastic Processes • The Pros and Cons of MDP and HMM • MDP: it is a decision-making process, but cannot capture unobservable dynamics • HMM: it can capture complex dynamics, but lacks support for decision making
Incorporating Stochastic Processes • Partially Observable Markov Decision Process • Combines the strengths of HMMs and MDPs • Makes decisions under uncertainty • A POMDP policy computes an action after every observation such that in the long run the utility is maximized • Uncertainty: • The true state of the world is usually unknown • Even if the state is known, some actions may have uncertain consequences
Incorporating Stochastic Processes • Partially Observable Markov Decision Process • A POMDP policy computes an action at every step • Because the system state is unobserved, the policy maps probability distributions over the states, called belief states, to actions • The belief state b(s) represents the agent’s current belief that the true state is s
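To make the <S, A, T, R> formalism concrete, here is a small value-iteration sketch on a toy two-state power-management MDP. Everything specific in it is an assumption for illustration: the state and action names, the transition probabilities, the reward numbers, and the discount factor `gamma` (which the slide does not mention).

```python
def q_value(s, a, V, T, R, gamma):
    """Expected return of taking action a in state s, then following V."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a].items())


def value_iteration(states, actions, T, R, gamma=0.9, tol=1e-6):
    """Return (state values, greedy policy) for a finite MDP."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {s: max(q_value(s, a, V, T, R, gamma) for a in actions)
                 for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            break
    policy = {s: max(actions, key=lambda a: q_value(s, a, V, T, R, gamma))
              for s in states}
    return V, policy


# Toy model (hypothetical numbers): the user is either active or idle,
# and the component is kept on or turned off.
states = ["active", "idle"]
actions = ["on", "off"]
user_dynamics = {"active": {"active": 0.9, "idle": 0.1},
                 "idle": {"active": 0.2, "idle": 0.8}}
T = {s: {a: user_dynamics[s] for a in actions} for s in states}
R = {"active": {"on": -1.0, "off": -10.0},  # power cost vs. annoyance
     "idle": {"on": -1.0, "off": 0.0}}
V, policy = value_iteration(states, actions, T, R)
```

Note that this sketch assumes the user state is directly observable, which is exactly the limitation listed on the slide.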
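The belief state b(s) is typically maintained with a Bayes filter: predict with the transition model, then reweight by the observation likelihood. The sketch below assumes the observation model O(z|s) depends only on the state, as in the HMM slide; the state names, observation names, and probabilities are made up for illustration.

```python
def update_belief(belief, action, observation, T, O):
    """Return the new belief after taking `action` and seeing `observation`.

    belief: dict state -> probability
    T[s][a]: dict next_state -> probability
    O[s]:    dict observation -> probability
    """
    unnormalized = {}
    for s2 in belief:
        # Prediction step: probability of landing in s2.
        predicted = sum(T[s][action].get(s2, 0.0) * belief[s] for s in belief)
        # Correction step: weight by how likely the observation is in s2.
        unnormalized[s2] = O[s2].get(observation, 0.0) * predicted
    norm = sum(unnormalized.values())
    if norm == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s: v / norm for s, v in unnormalized.items()}


# Hypothetical user-presence model.
T = {"present": {"stay_on": {"present": 0.9, "away": 0.1}},
     "away": {"stay_on": {"present": 0.2, "away": 0.8}}}
O = {"present": {"key_press": 0.7, "no_input": 0.3},
     "away": {"key_press": 0.0, "no_input": 1.0}}
prior = {"present": 0.5, "away": 0.5}
after_silence = update_belief(prior, "stay_on", "no_input", T, O)
after_key = update_belief(prior, "stay_on", "key_press", T, O)
```

After a silent step the belief shifts toward "away" without ever becoming certain, which is the kind of perception uncertainty a POMDP policy acts on.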
Incorporating Stochastic Processes • Model-based Approach to APM • Of the models above, only the POMDP is rich enough to capture the two main aspects of APM • APM includes a human user and a complex computer system, neither of which can be assumed to be perfectly observable • APM involves making decisions about which components to turn on and off • Formal model • A: actions, such as turning a component on or off • S: the state space, a combination of system state and user state • T: the transition probabilities • O: the observations, including the various sensors/features
Incorporating Stochastic Processes • Model-based Approach to APM • Main problem • How to construct the state space • A model that fully describes the system or the user context (i.e., the way the user is interacting with the system) would be too complex • Need to balance the complexity of the model against the need to obtain a useful APM policy
Incorporating Stochastic Processes • Model-based Approach to APM • Future work • Develop a more complex model for user context, especially automatic context construction • Explore different ways to measure annoyance, and learn the annoyance from individual user feedback • Study the statistics of the duration between changes in the values of the system and user variables • Consider the initial period in which the system is initializing
Introduction to Machine Learning • K-Nearest-Neighbors • Given an object X, find the K most similar training examples and classify X into the most common category Y among the K neighbors • Compute object similarity using Euclidean distance: $d(x, x') = \sqrt{\sum_i (x_i - x'_i)^2}$
Introduction to Machine Learning • Example: k=6 (6NN) • [Figure: classes Government, Science, Arts]
Introduction to Machine Learning • Decision Trees • [Figure: decision tree over the objects to be classified; node tests include Feathered?, Endothermic?, Volant?, Carnivorous?, Viviparous?, each with YES/NO branches; leaf categories include ratite and raptor]
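A minimal k-nearest-neighbors classifier using Euclidean distance can be written as follows. The function names, the toy feature vectors, and the "on"/"off" labels are assumptions for the example; ties in the vote are broken by whichever label `Counter` encountered first.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_classify(x, training, k):
    """Classify x by majority vote among its k nearest training examples.

    training: list of (feature_vector, label) pairs.
    """
    neighbors = sorted(training, key=lambda ex: euclidean(x, ex[0]))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


# Hypothetical two-feature training set.
training = [((0, 0), "off"), ((0, 1), "off"), ((1, 0), "off"),
            ((5, 5), "on"), ((6, 5), "on"), ((5, 6), "on")]
label_low = knn_classify((0.5, 0.5), training, k=3)
label_high = knn_classify((5.5, 5.5), training, k=3)
```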
Introduction to Machine Learning • Logistic Regression • Assume there are two classes, y = 0 and y = 1; we want to learn the conditional distribution P(y|x) • Let $p_y(x; w)$ be our estimate of P(y|x), where w is a vector of adjustable parameters
Introduction to Machine Learning • Logistic Regression • The logistic model $p_1(x; w) = \frac{1}{1 + \exp(-(w_0 + \sum_{i=1}^{n} w_i x_i))}$ is equivalent to $\ln \frac{p_1(x; w)}{p_0(x; w)} = w_0 + \sum_{i=1}^{n} w_i x_i$ • That is, the log odds of class 1 is a linear function of x • Q: How do we find w?
Introduction to Machine Learning • Logistic Regression • The conditional data likelihood is the probability of the observed Y values in the training data, conditioned on their corresponding X values. We choose parameters w that satisfy $w \leftarrow \arg\max_{w} \prod_{l} P(y^l \mid x^l, w)$ • where w = <w0, w1, …, wn> is the vector of parameters to be estimated, $y^l$ denotes the observed value of Y in the l-th training example, and $x^l$ denotes the observed value of X in the l-th training example
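One common way to search for parameters that maximize the conditional likelihood is gradient ascent on its logarithm. The sketch below is a plain-Python illustration of that idea under assumed settings (learning rate, number of epochs, and a tiny one-feature data set are all hypothetical); the slides do not say which optimizer was used.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(w, x):
    """Estimated P(y = 1 | x) with w = [w0, w1, ..., wn] (w0 is the bias)."""
    return sigmoid(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))


def fit_logistic(X, y, lr=0.1, epochs=1000):
    """Batch gradient ascent on the conditional log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xl, yl in zip(X, y):
            err = yl - predict_proba(w, xl)  # derivative w.r.t. the log odds
            grad[0] += err
            for i, xi in enumerate(xl):
                grad[i + 1] += err * xi
        w = [wi + lr * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical data: larger feature values belong to class 1.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0, 0, 1, 1]
w = fit_logistic(X, y)
```

In the context-based scheme, one such model would be fit per context, with the decision threshold on `predict_proba` chosen separately for each.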