Non-Preemptive Scheduling Policy Design for Tasks with Stochastic Execution Times*
Chris Gill, Associate Professor
Department of Computer Science and Engineering, Washington University, St. Louis, MO, USA
cdgill@cse.wustl.edu
The University of North Carolina at Chapel Hill, Friday November 20, 2009
*Research supported by NSF grants CNS-0716764 (Cybertrust) and CCF-0448562 (CAREER) and driven by numerous contributions from doctoral students Robert Glaubius and Terry Tidwell; undergraduates Braden Sidoti, David Pilla, Justin Meden, and Cameron Cross; and Prof. William D. Smart
Dept. of Computer Science and Engineering
• 24 faculty members and 71 Ph.D. students working in: real-time and embedded systems, robotics, graphics, HCI, AI, bioinformatics, networking, high-performance architectures, chip multi-processors, mobile computing, sensor networks, distributed systems, optimization
• Ph.D. students are fully funded, and we emphasize individual mentorship and interdisciplinary work
• Recent graduates are on faculty at U. Mass, UT-Austin, Rochester, RIT, CMU, Michigan St., and UNC-Charlotte
• Graduate study application deadline for Fall 2010 is January 15: http://www.cse.wustl.edu
Motivation
• Systems are increasingly being designed to interact with the physical world
• This trend offers compelling new research challenges that motivate our work
• Consider, for example, the domain of mobile robotics
[Photo: the robot "Lewis", Media and Machines Laboratory, Washington University in St. Louis]
Motivation
• As in many other systems, resources must be shared among competing tasks
• Fail-safe modes may reduce consequences of resource-induced timing failures, but precise scheduling matters
• The physical properties of some resources motivate new models and techniques
Motivation
• For example, sharing a camera between navigation and surveying tasks
  (1) in general doesn't allow efficient preemption
  (2) involves stochastically distributed durations
• Other scenarios also raise scalability questions, e.g., multi-robot heterogeneous real-time data transmission
System Model Assumptions
• To begin, time is modeled as being discrete
• E.g., some multiple of the Linux jiffy is the time quantum
• Separate tasks require a shared resource
• Access is mutually exclusive (a task binds the resource)
• Binding durations are independent and non-preemptive
• Each task's distribution of durations can be known
• Each task is always available to run
• Goal: precise resource allocation among the tasks
• E.g., 2:1 utilization share targets for tasks A vs. B
• Need a deterministic scheduling policy (one that decides which task gets the resource when) that best fits that goal
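The assumptions above can be made concrete with a small simulation. This is a minimal sketch, not the authors' implementation: the duration histograms, the `dispatch` and `simulate` helpers, and the alternating placeholder policy are all hypothetical names and values chosen for illustration.

```python
import random

# Hypothetical duration histograms (quanta -> probability); not from the talk.
DURATIONS = {
    "A": {1: 0.5, 2: 0.5},
    "B": {1: 0.5, 3: 0.5},
}

def dispatch(task, usage, rng):
    """Bind the resource to `task` for one sampled, non-preemptible duration."""
    values = list(DURATIONS[task])
    weights = list(DURATIONS[task].values())
    duration = rng.choices(values, weights=weights, k=1)[0]
    usage[task] += duration  # the task holds the resource for the whole duration
    return duration

def simulate(policy, steps, seed=0):
    """Run `steps` dispatch decisions and return cumulative usage per task."""
    rng = random.Random(seed)
    usage = {task: 0 for task in DURATIONS}
    for step in range(steps):
        dispatch(policy(step, usage), usage, rng)
    return usage

def alternate(step, usage):
    """Placeholder policy: ignore the state and alternate A, B, A, B, ..."""
    return "A" if step % 2 == 0 else "B"

usage = simulate(alternate, steps=1000)
total = sum(usage.values())
shares = {task: used / total for task, used in usage.items()}
print(usage, shares)
```

A state-blind policy such as `alternate` will generally not land on a chosen share target such as 2:1, which is the gap a share-aware policy is meant to close.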
Towards Optimal Policies
• A Markov decision process (MDP) is a 4-tuple (X, A, C, T) that matches our system model well:
• X: a finite set of states (e.g., utilizations of 8 vs. 17 quanta)
• A: the set of actions (giving the resource to a particular task)
• C: cost function for taking an action in a state
• T: transition function (probability of moving from one state to another state based on the action chosen)
• Solving the MDP gives a policy that maps each state to an action to minimize long-term expected costs
• However, to do that we need a finite set of states
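To illustrate what "solving the MDP" means, here is a minimal value-iteration sketch on a toy two-state MDP. The states, actions, costs, and the discount factor `gamma` are assumptions made for the example; the talk does not state which solution method or optimality criterion was used.

```python
def value_iteration(states, actions, cost, transition, gamma=0.9, sweeps=200):
    """Approximate the minimal expected discounted cost and a greedy policy."""
    V = {x: 0.0 for x in states}

    def q(x, a):
        # Cost of taking a in x, plus discounted expected value of the successor.
        return cost[(x, a)] + gamma * sum(p * V[y] for y, p in transition[(x, a)].items())

    for _ in range(sweeps):
        V = {x: min(q(x, a) for a in actions) for x in states}
    policy = {x: min(actions, key=lambda a: q(x, a)) for x in states}
    return V, policy

# Toy MDP (hypothetical): state 1 is free to stay in, state 0 is not.
states = [0, 1]
actions = ["stay", "move"]
cost = {(0, "stay"): 1.0, (0, "move"): 2.0, (1, "stay"): 0.0, (1, "move"): 2.0}
transition = {
    (0, "stay"): {0: 1.0},
    (0, "move"): {1: 1.0},
    (1, "stay"): {1: 1.0},
    (1, "move"): {0: 1.0},
}

V, policy = value_iteration(states, actions, cost, transition)
print(policy)  # {0: 'move', 1: 'stay'}
```

The one-time cost of moving (2) is cheaper than paying 1 forever, so the policy leaves state 0 even though staying is cheaper in the short term.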
Share Aware Scheduling
• A system state: cumulative resource usage of each task
• Dispatching a task moves the system stochastically through the state space according to that task's duration
[Figure: example state (8,17)]
Share Aware Scheduling
• Utilization target u induces a ray {αu : α ≥ 0} through the state space
• Encode each state's "goodness" (relative to the share) as a cost
• Require that costs grow with distance from the utilization ray
[Figure: utilization ray for u = (1/3, 2/3)]
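One cost function that satisfies this requirement is the L1 distance between a state x and the point on the utilization ray with the same total usage. The choice of L1 is an assumption here, although it does reproduce the example costs quoted in the appendix (c(1,0) = 1 and c(1,0,0) = 4/3 for fair shares). The function name `cost` is illustrative.

```python
from fractions import Fraction

def cost(x, u):
    """L1 distance from state x to the ray point with the same total usage."""
    total = sum(x)
    return sum(abs(xi - total * ui) for xi, ui in zip(x, u))

u = (Fraction(1, 3), Fraction(2, 3))
print(cost((8, 17), u))  # 2/3
print(cost((1, 2), u))   # 0, since (1, 2) lies on the ray
```

Exact rational arithmetic avoids rounding noise when comparing costs of states that should be equal.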
Transition Structure Transitions are state-independent I.e., relative distribution over successor states is the same in each state
Cost Structure States along same line parallel to the utilization ray have equal cost
Equivalence Classes Transition and cost structure thus induce equivalence classes Equivalent states have the same optimal long-term cost and policy!
Periodicity Periodic structure allows us to represent each equivalence class with a single exemplar
Wrapping the State Model
• Remove all but one exemplar from each equivalence class
• Actions and costs remain unchanged
• Remap any dangling transitions (to removed states) to the corresponding exemplar
[Figure: wrapped state space anchored at (0,0)]
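A minimal sketch of the remapping step, assuming the utilization target is rational so that some integer vector (called `period` here, a hypothetical name) lies on the ray. States that differ by a multiple of that vector are equivalent, and this sketch picks as exemplar the one obtained by subtracting the largest multiple that keeps every component non-negative. The talk does not say how exemplars were chosen, so that rule is an assumption.

```python
def wrap(x, period):
    """Map state x to the exemplar of its equivalence class."""
    k = min(xi // pi for xi, pi in zip(x, period))
    return tuple(xi - k * pi for xi, pi in zip(x, period))

period = (1, 2)  # smallest integer point on the ray for u = (1/3, 2/3)
print(wrap((8, 17), period))  # (0, 1)
print(wrap((9, 19), period))  # (0, 1), same class as (8, 17)
print(wrap((1, 2), period))   # (0, 0)
```

Applying `wrap` to every successor state is what redirects dangling transitions back onto retained exemplars.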
Truncating the State Model
• Inexpensive states are nearer the utilization target
• Good policies should keep costs small
• Can truncate the state space by bounding sizes of costs considered
[Figure: retained states bounded by cost contours c(x)]
Bounding the State Model
• Map any dangling transitions produced by truncation to a high-cost absorbing state
• This guarantees that we will be able to find bounded-cost policies if they exist
• Bounded costs also guarantee bounded deviation from the resource share (precision)
A Scheduling Policy Design Approach Iteratively increase the bounds and re-solve the resulting MDP As the bounds increase, the bounded model solution converges towards the optimal wrapped model policy
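The bounded model and the re-solve loop can be sketched end to end for two tasks. Everything specific here is an assumption for illustration: the duration histograms, the L1 cost, the exemplar rule in `wrap`, the absorbing cost of ten times the bound, and the use of discounted value iteration with `gamma = 0.95`.

```python
from fractions import Fraction

U = (Fraction(1, 3), Fraction(2, 3))               # utilization target
PERIOD = (1, 2)                                    # integer point on the ray
DURATIONS = [{1: 0.5, 2: 0.5}, {1: 0.5, 3: 0.5}]   # hypothetical histograms
ABSORB = "absorbing"

def cost(x):
    total = sum(x)
    return sum(abs(xi - total * ui) for xi, ui in zip(x, U))

def wrap(x):
    k = min(xi // pi for xi, pi in zip(x, PERIOD))
    return tuple(xi - k * pi for xi, pi in zip(x, PERIOD))

def build_bounded_model(bound):
    """Enumerate wrapped states with cost <= bound that are reachable from (0, 0)."""
    start = (0, 0)
    states, frontier, transitions = {start}, [start], {}
    while frontier:
        x = frontier.pop()
        for action, histogram in enumerate(DURATIONS):
            successors = {}
            for duration, prob in histogram.items():
                y = list(x)
                y[action] += duration
                y = wrap(tuple(y))
                if cost(y) > bound:
                    y = ABSORB              # dangling transition from truncation
                elif y not in states:
                    states.add(y)
                    frontier.append(y)
                successors[y] = successors.get(y, 0.0) + prob
            transitions[(x, action)] = successors
    return states, transitions

def solve(states, transitions, bound, gamma=0.95, sweeps=500):
    """Discounted value iteration; the absorbing state keeps a fixed high value."""
    V = {x: 0.0 for x in states}
    V[ABSORB] = (10 * bound) / (1 - gamma)
    actions = range(len(DURATIONS))

    def q(x, a):
        return float(cost(x)) + gamma * sum(p * V[y] for y, p in transitions[(x, a)].items())

    for _ in range(sweeps):
        for x in states:
            V[x] = min(q(x, a) for a in actions)
    policy = {x: min(actions, key=lambda a: q(x, a)) for x in states}
    return V, policy

results = {}
for bound in (2, 3, 4):                     # iteratively increase the bound
    states, transitions = build_bounded_model(bound)
    V, policy = solve(states, transitions, bound)
    results[bound] = (states, transitions, V, policy)
    print(bound, len(states), policy[(0, 0)])
```

Comparing the printed policies across bounds is one way to judge when further growth of the model no longer changes the decisions near the ray.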
Automating Model Discovery
ESPI: Expanding State Policy Iteration
1. Start with a policy that only reaches finitely many states from (0,…,0), e.g., always run the most underutilized task
2. Enumerate enough states to evaluate and improve that policy
3. If the policy cannot be improved, stop
4. Otherwise, repeat from (2) with the newly improved policy
Policy Evaluation Envelope
• Enumerate states reachable from the initial state
• Explore state space breadth-first under the current policy, starting from the initial state
[Figure: envelope grown from the initial state (0,0)]
Policy Improvement Envelope Consider alternative actions Close under the current policy using breadth-first expansion Evaluate and improve the policy within this envelope
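The two envelopes can be sketched as breadth-first closures over a wrapped two-task state space. This covers only the state enumeration, not policy evaluation or improvement, and the possible durations, the `wrap` exemplar rule, and the function names are assumptions made for the example.

```python
from collections import deque
from fractions import Fraction

U = (Fraction(1, 3), Fraction(2, 3))
PERIOD = (1, 2)
DURATIONS = [(1, 2), (1, 3)]      # hypothetical possible durations per task
ACTIONS = range(len(DURATIONS))

def wrap(x):
    k = min(xi // pi for xi, pi in zip(x, PERIOD))
    return tuple(xi - k * pi for xi, pi in zip(x, PERIOD))

def successors(x, action):
    """Wrapped states reachable by dispatching `action` in state x."""
    result = []
    for duration in DURATIONS[action]:
        y = list(x)
        y[action] += duration
        result.append(wrap(tuple(y)))
    return result

def most_underused(x):
    """Initial heuristic policy: run the task furthest below its share."""
    total = sum(x)
    return max(ACTIONS, key=lambda i: total * U[i] - x[i])

def close_under_policy(seeds, known, policy):
    """Breadth-first expansion of `known` from `seeds`, following `policy`."""
    queue = deque(seeds)
    while queue:
        x = queue.popleft()
        for y in successors(x, policy(x)):
            if y not in known:
                known.add(y)
                queue.append(y)
    return known

def evaluation_envelope(start, policy):
    return close_under_policy([start], {start}, policy)

def improvement_envelope(eval_env, policy):
    """Add successors of every alternative action, then close under the policy."""
    known, new = set(eval_env), []
    for x in eval_env:
        for action in ACTIONS:
            for y in successors(x, action):
                if y not in known:
                    known.add(y)
                    new.append(y)
    return close_under_policy(new, known, policy)

eval_env = evaluation_envelope((0, 0), most_underused)
improve_env = improvement_envelope(eval_env, most_underused)
print(len(eval_env), len(improve_env))  # 7 14
```

Both envelopes are finite here because the heuristic policy keeps the wrapped state close to the ray, which is the finite-closure property the initial policy needs.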
ESPI Termination As long as the initial policy has finite closure, each ESPI iteration terminates (this is satisfied by starting with the heuristic policy that always runs the most underutilized task) Policy strictly improves at each iteration Anecdotally, ESPI terminates on all of the task scheduling MDPs to which we have applied it
Comparing Design Methods Policy performance is shown normalized and centered on the ESPI solution data Larger bounded state models yield the ESPI solution
What About Scalability? MDP representation allows consistent approximation of the optimal scheduling policy Empirically, bounded model and ESPI solutions appear to be near-optimal However, approach scales exponentially in number of tasks so while it may be good for (e.g.) sharing an actuator, it won’t apply directly to larger task sets
What our Policies Say about Scalability To overcome limitations of MDP based approach, we focus attention on a restricted class of appropriate scheduling policies Examining the policies produced by the MDP based approach gives insights into choosing (and into parameterizing) appropriate policies
Two-task MDP Policy Scheduling policies induce a partition on a 2-D state space with boundary parallel to the share target Establish a decision offset d to identify the partition boundary Sufficient in 2-D, but what about in higher dimensions?
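Time Horizons Suggest a Generalization
• Time horizon: H_t = {x : x1 + x2 + … + xn = t}
[Figure: time horizons H0 through H4 for two tasks, starting at (0,0); time horizons H0, H1, H2 for three tasks, with H2 spanned by (2,0,0), (0,2,0), and (0,0,2); utilization ray u shown in both]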
Three-task MDP Policy
• Action partitions meet along a decision ray that is parallel to the utilization ray
• Action partitions are roughly cone-shaped
[Figure: policy slices at time horizons t = 10, t = 20, t = 30]
Parameterizing a Partition
• Specify a decision offset at the intersection of partitions
• Anchor action vectors at the decision offset to approximate partitions
• A conic policy selects the action vector best aligned with the displacement between the query state and the decision offset
[Figure: query state x and action vectors a1, a2, a3 anchored at the decision offset]
Conic Policy Parameters
• Decision offset d
• Action vectors a1, a2, …, an
• Sufficient to partition each time horizon into n regions
• Allows good policy parameters to be found through local search
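A conic policy lookup can be sketched in a few lines. The interpretation below is an assumption: the displacement is taken within the state's time horizon as x − τ(x)u − d, where τ(x) is the total usage, and "best aligned" is taken to mean the largest dot product with the unit-normalized action vector. The example action vectors point away from each task's vertex, as in the stable policy described in the appendix.

```python
import math

def conic_policy(x, u, offset, action_vectors):
    """Return the index of the action vector best aligned with the displacement."""
    total = sum(x)
    # Displacement of x from the decision ray, within the time horizon of x.
    z = [xi - total * ui - di for xi, ui, di in zip(x, u, offset)]

    def alignment(i):
        a = action_vectors[i]
        norm = math.sqrt(sum(c * c for c in a))
        return sum(ac * zc for ac, zc in zip(a, z)) / norm

    return max(range(len(action_vectors)), key=alignment)

u = (1 / 3, 1 / 3, 1 / 3)
# Action vector i points from vertex i toward the opposite side of the horizon.
vectors = [[u[j] - (1.0 if i == j else 0.0) for j in range(3)] for i in range(3)]

print(conic_policy((5, 1, 3), u, (0.0, 0.0, 0.0), vectors))   # 1
print(conic_policy((5, 1, 3), u, (0.0, -3.0, 3.0), vectors))  # 2
```

With a zero offset and these vectors the policy runs the most underutilized task; a nonzero offset shifts the partition boundaries and can change the decision, as the second call shows.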
Comparing Policies
• Policy found by ESPI (for small numbers of tasks)
• πESPI(x): chooses action at state x per solved MDP
• Simple heuristics (for all numbers of tasks)
• πunderused(x): runs the most underutilized task
• πgreedy(x): minimizes immediate cost from state x
• Conic approach (for all numbers of tasks)
• πconic(x): selects action with best aligned action vector
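The two heuristics can be sketched as follows. The duration histograms and the L1 cost are assumptions, and "immediate cost" is interpreted here as the expected cost of the successor state, which the talk does not spell out.

```python
from fractions import Fraction

U = (Fraction(1, 3), Fraction(2, 3))
HALF = Fraction(1, 2)
DURATIONS = [{1: HALF, 2: HALF}, {1: HALF, 3: HALF}]  # hypothetical histograms

def cost(x):
    total = sum(x)
    return sum(abs(xi - total * ui) for xi, ui in zip(x, U))

def underused(x):
    """Run the task whose usage lags furthest behind its target share."""
    total = sum(x)
    return max(range(len(x)), key=lambda i: total * U[i] - x[i])

def greedy(x):
    """Run the task with the smallest expected cost after one dispatch."""
    def expected_cost(action):
        expected = Fraction(0)
        for duration, prob in DURATIONS[action].items():
            y = list(x)
            y[action] += duration
            expected += prob * cost(y)
        return expected
    return min(range(len(x)), key=expected_cost)

print(underused((0, 0)), greedy((0, 0)))  # 0 1
print(underused((1, 0)), greedy((2, 1)))  # 1 1
```

The two heuristics already disagree at the origin: `underused` breaks the tie toward task 0, while `greedy` prefers task 1 because its expected successor cost is lower (4/3 versus 2).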
Policy Comparison on a 4 Task Problem Task durations: random histograms over [2,32] 100 iterations of Monte Carlo conic parameter search ESPI outperforms, conic eventually approximates well
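An experiment of this shape can be sketched as a random local search over conic parameters, scored by simulation. The talk does not specify the search procedure, so the Gaussian perturbations, step sizes, utilization target, simulation length, and L1 cost below are all assumptions, as are the function names.

```python
import math
import random

def conic_action(x, u, offset, vectors):
    total = sum(x)
    z = [xi - total * ui - di for xi, ui, di in zip(x, u, offset)]
    def alignment(i):
        norm = math.sqrt(sum(c * c for c in vectors[i])) or 1.0
        return sum(a * b for a, b in zip(vectors[i], z)) / norm
    return max(range(len(vectors)), key=alignment)

def average_cost(params, u, histograms, steps=300, seed=1):
    """Simulate the conic policy and return its mean L1 cost per decision."""
    offset, vectors = params
    rng = random.Random(seed)      # same seed for every candidate
    x = [0] * len(u)
    running = 0.0
    for _ in range(steps):
        a = conic_action(x, u, offset, vectors)
        durations = list(histograms[a])
        weights = list(histograms[a].values())
        x[a] += rng.choices(durations, weights=weights, k=1)[0]
        total = sum(x)
        running += sum(abs(xi - total * ui) for xi, ui in zip(x, u))
    return running / steps

def random_histogram(rng, low=2, high=32):
    """Random duration histogram over the integers in [low, high]."""
    weights = [rng.random() for _ in range(low, high + 1)]
    norm = sum(weights)
    return {d: w / norm for d, w in zip(range(low, high + 1), weights)}

def search(u, histograms, iterations=100, seed=0):
    """Keep a perturbed candidate only if it lowers the simulated cost."""
    rng = random.Random(seed)
    n = len(u)
    start = ([0.0] * n,
             [[u[j] - (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)])
    best, best_cost = start, average_cost(start, u, histograms)
    start_cost = best_cost
    for _ in range(iterations):
        offset = [d + rng.gauss(0.0, 1.0) for d in best[0]]
        vectors = [[c + rng.gauss(0.0, 0.1) for c in v] for v in best[1]]
        candidate_cost = average_cost((offset, vectors), u, histograms)
        if candidate_cost < best_cost:
            best, best_cost = (offset, vectors), candidate_cost
    return best, best_cost, start_cost

rng = random.Random(42)
u = (0.4, 0.3, 0.2, 0.1)           # hypothetical 4-task share target
histograms = [random_histogram(rng) for _ in u]
best, best_cost, start_cost = search(u, histograms)
print(start_cost, best_cost)
```

Scoring every candidate on the same random seed makes comparisons less noisy but risks tuning to that one sample path, so a fuller evaluation would re-score the final parameters on fresh seeds.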
Policy Comparison on a Ten Task Problem Repeated the same experiment for 10 tasks ESPI is omitted (intractable here) Conic outperforms greedy & underutilized heuristics
Comparison with Varying #s of Tasks
• 100 independent problems for each # of tasks (average, 95% confidence interval)
• ESPI was tractable only for the 2- and 3-task cases
• Conic approximates ESPI, then outperforms others
Conclusions
• We have developed new techniques for designing non-preemptive scheduling policies for tasks with stochastic resource usage durations
• MDP-based methods provide good approximations to optimal policies for 2 or 3 tasks
• Conic policy performance is competitive with ESPI for smaller problems, and for larger problems improves on underutilized and greedy policies
• Future work will focus on applying and evaluating our results in different cyber-physical systems, and on extending them further in design and verification
For Further Information
• R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Policy Design for Autonomic Systems”, International Journal on Autonomous and Adaptive Communications Systems, 2(3):276-296, 2009
• R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Design and Verification for Open Soft Real-Time Systems”, RTSS 2008
• R. Glaubius, T. Tidwell, B. Sidoti, D. Pilla, J. Meden, C. Gill, and W.D. Smart, “Scalable Scheduling Policy Design for Open Soft Real-Time Systems”, Tech. Report WUCSE-2009-71, 2009 (Under Review for RTAS 2010)
• R. Glaubius, T. Tidwell, C. Gill, and W.D. Smart, “Scheduling Design with Unknown Execution Time Distributions or Modes”, Tech. Report WUCSE-2009-15, 2009
• T. Tidwell, R. Glaubius, C. Gill, and W.D. Smart, “Scheduling for Reliable Execution in Autonomic Systems”, ATC 2008
• C. Gill, W.D. Smart, T. Tidwell, and R. Glaubius, “Scheduling as a Learned Art”, OSPERT, 2008
Project web site: http://www.cse.wustl.edu/~cdgill/Cybertrust/
Thank you! Chris Gill Associate Professor of Computer Science and Engineering
Appendix: Comparison to EDF Scheduling • Earliest-Deadline-First (EDF) scheduling: • Enforces timeliness by meeting task deadlines. • Not share aware. • We introduce deadlines as a function of worst-case execution time. • Miss rate is a function of deadline tightness.
Appendix: Stable Conic Policies
• Guaranteed that stable conic policies exist.
• For example, set each action vector to point opposite its corresponding vertex.
• Induces a vector field that stochastically orbits the decision ray.
[Figure: time horizon with vertices (t,0,0), (0,t,0), (0,0,t)]
Appendix: More Tasks Implies Higher Cost
• Simple problem: fair-share scheduling of n deterministic tasks with unit duration
• Trajectories under round robin scheduling:
• 2 tasks: E{c(x)} = 1/2. Trajectory: (0,0) → (1,0) → (1,1) → (0,0). Costs: c(0,0) = 0; c(1,0) = 1.
• 3 tasks: E{c(x)} = 8/9. Trajectory: (0,0,0) → (1,0,0) → (1,1,0) → (1,1,1) → (0,0,0). Costs: c(0,0,0) = 0; c(1,0,0) = 4/3; c(1,1,0) = 4/3.
• n tasks: E{c(x)} = (n+1)(n-1)/(3n)
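The closed form can be checked numerically. This sketch assumes the cost is the L1 distance from the fair-share ray, which matches the per-state costs listed above, and averages it over the n distinct states of one round-robin cycle; the function name is illustrative.

```python
from fractions import Fraction

def round_robin_average_cost(n):
    """Mean L1 cost over one round-robin cycle of n unit-duration tasks."""
    share = Fraction(1, n)
    x = [0] * n
    costs = []
    for k in range(n):
        total = sum(x)
        costs.append(sum(abs(xi - total * share) for xi in x))
        x[k] += 1          # task k runs for one quantum
    return sum(costs) / n

for n in range(2, 8):
    closed_form = Fraction((n + 1) * (n - 1), 3 * n)
    print(n, round_robin_average_cost(n), closed_form)
```

After k of the n tasks have run, the state has cost 2k(n−k)/n under this metric, and averaging that over k = 0, …, n−1 gives (n+1)(n−1)/(3n).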