Predicting Queue Waiting Time For Individual TeraGrid Jobs
Rich Wolski, Dan Nurmi, John Brevik, Graziano Obertelli, Ryan Garver
Computer Science Department, University of California, Santa Barbara
Problem: Predicting Delay in Batch Queues
• Time in queue is experienced as application delay
• Sounds like an easy problem, but
  • Distribution of load from users is a matter of some debate
  • Scheduling policy is partially hidden
  • Sites need to change the policies dynamically and without warning
  • Job execution times are difficult to predict
• Much research in this area over the past 20 years, but few solutions
• Current commercial systems provide high-variance estimates
  • On-line simulation based on max requested time
  • “Expected” value predictions
  • Most sites simply disable these features
For Scheduling: It’s all about the big Q
• Predictions of the form
  • “What is the maximum time my job will wait with X% certainty?”
  • “What is the minimum time my job will wait with X% certainty?”
• Requires two estimates if certainty is to be quantified
  • Estimate the (1-X) quantile for the distribution of availability => Qx
  • Estimate the upper or lower X% confidence bound on the statistic Qx => Q(x,b)
• If the estimates are unbiased, and the distribution is stationary, future availability duration will be larger than Q(x,b) X% of the time, guaranteed
BMBP: A New Predictive Methodology
• New quantile estimator based on the binomial distribution
  • Requires a carefully engineered numerical system to deal with large-scale combinatorics
• New changepoint detector
  • Binomial method in a time-series context is difficult
  • Need a system to determine
    • Stationary regions in the data
    • Minimum statistically meaningful history in each region
• New clustering methodology
  • More accurate estimates are possible if predictions are made from jobs with similar characteristics
  • Takes dynamic policy changes into account more effectively
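The slides do not show the estimator itself, so the following is only a minimal sketch of a textbook distribution-free bound that uses the binomial distribution in the way the bullets suggest. It assumes the observed waits are independent draws from one stationary region; the function names (`binom_cdf`, `upper_quantile_bound`) are hypothetical and this is not necessarily how BMBP is implemented.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(Binomial(n, p) <= k), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def upper_quantile_bound(waits, q, confidence):
    """Return an observed wait that is >= the q-quantile with the given
    confidence, or None if the history is too short to support the claim.

    The k-th smallest observation is at or above the true q-quantile exactly
    when at most k-1 observations fall below that quantile, and the number of
    observations below it is Binomial(n, q).
    """
    xs = sorted(waits)
    n = len(xs)
    for k in range(1, n + 1):
        if binom_cdf(k - 1, n, q) >= confidence:
            return xs[k - 1]
    return None  # not enough history for this quantile/confidence pair


if __name__ == "__main__":
    history = list(range(59))  # 59 hypothetical wait times, in seconds
    print(upper_quantile_bound(history, 0.95, 0.95))
```

Under these assumptions the `None` branch corresponds to the "minimum statistically meaningful history" bullet: for the 0.95 quantile at 95% confidence, 58 observations are not enough and 59 are, in which case the bound is the largest observed wait.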
Predicting Things Upside Down
• Deadline scheduling: My job needs to start in the next X seconds for the results to be meaningful.
  • Amitava Mujumdar, Tharaka Devaditha, Adam Birnbaum (SDSC)
  • Need to run a 4 minute image reconstruction that completes in the next 8 minutes
• Given a
  • Machine
  • Queue
  • Processor count
  • Run time
  • Deadline
• What is the probability that a job will meet the deadline?
• http://nws.cs.ucsb.edu/batchq/invbqueue.php
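One way to picture the inverted question is to turn a binomial order-statistic bound around: instead of fixing a quantile and asking for a wait time, fix the allowed wait (deadline minus run time) and ask for the largest quantile whose upper confidence bound still fits. The sketch below is an illustration under the assumption of independent, stationary wait observations; `start_probability_lower_bound` is a hypothetical name and this is not claimed to be the method behind the service linked above.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(Binomial(n, p) <= k), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def start_probability_lower_bound(waits, allowed_wait, confidence):
    """Conservative estimate of P(wait <= allowed_wait).

    With k observed waits at or below allowed_wait, the k-th smallest wait is
    an upper bound on the q-quantile (at the given confidence) whenever
    binom_cdf(k - 1, n, q) >= confidence. That expression decreases in q, so
    bisection finds the largest q that still qualifies.
    """
    n = len(waits)
    k = sum(1 for w in waits if w <= allowed_wait)
    if k == 0:
        return 0.0  # no observed job started soon enough
    lo, hi = 0.0, 1.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if binom_cdf(k - 1, n, mid) >= confidence:
            lo = mid
        else:
            hi = mid
    return lo


if __name__ == "__main__":
    # 4-minute run that must finish within 8 minutes: up to 240 s of waiting.
    history = [30, 45, 60, 90, 120, 200, 300, 600, 900, 1800]  # made-up data
    print(start_probability_lower_bound(history, 8 * 60 - 4 * 60, 0.95))
```

The history in the usage example is invented for illustration; in practice it would come from jobs with a similar machine, queue, processor count, and run time.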
See it In Action • http://nws.cs.ucsb.edu/batchq
How Does it Work?
• NWS sensors at each site read batch queue scheduler logs
  • Sanitized:
    • Machine name
    • Queue name
    • Node/core count
    • Max run time
    • Submit time
    • Start time
• Sensors periodically send updated log records to UCSB
• At UCSB
  • NWS log data is extracted
  • Forward and inverted predictions are made asynchronously for all machine/queue/cluster combinations
• Data served through multiple interfaces
  • Web service, HTML, BQP
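The sanitized fields listed above are enough to reconstruct per-queue wait histories. The following is a rough sketch of such a record and grouping step; the class name, field names, and the use of epoch seconds are assumptions made for illustration, not the actual NWS sensor format.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRecord:
    """One sanitized scheduler log entry (field names are hypothetical)."""
    machine: str
    queue: str
    procs: int          # node or core count, depending on the site
    max_run_time: int   # requested wall time, seconds
    submit_time: int    # epoch seconds
    start_time: int     # epoch seconds

    @property
    def wait(self):
        """Queue delay experienced by the job, in seconds."""
        return self.start_time - self.submit_time


def wait_histories(records):
    """Group wait times by (machine, queue), ordered by job start time,
    since a job's wait only becomes observable once it starts."""
    histories = defaultdict(list)
    for rec in sorted(records, key=lambda r: r.start_time):
        histories[(rec.machine, rec.queue)].append(rec.wait)
    return dict(histories)


if __name__ == "__main__":
    recs = [
        JobRecord("ucteragrid", "dque", 8, 3600, 100, 400),
        JobRecord("ucteragrid", "dque", 4, 600, 50, 500),
        JobRecord("ucteragrid", "high", 1, 60, 10, 20),
    ]
    print(wait_histories(recs))
```

Each resulting list is the kind of time series a per-queue predictor would consume; clustering by processor count or requested run time would subdivide these lists further.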
What are the Problems?
• Batch queue scheduler logs are designed to support accounting
  • Each uses a different format and logs different information
  • Accuracy is not considered important
• Not all scheduler-relevant events are logged
  • Node decommissioning/addition
• Static metadata is not provided
  • Queue constraints
  • Cores or nodes scheduled?
  • Number of processing elements (nodes/cores)
• Better information is needed going forward
  • Evaluate scheduling policy changes
  • Urgent computing
  • Co-allocation/advanced reservations
Static Metadata Proposal
• Per Machine
  • Short one-word 'tag' identifying the machine (ex: "ncsateragrid")
  • List of login hostnames that users log in to
  • Hostname of a machine with a static hostname-to-IP mapping (net-accessible services run here)
  • Machine name (ex: "NCSA ia64 TeraGrid")
  • Number of nodes
  • Number of processing elements/node
• Per Queue
  • UNIT of computational elements ("core", "processor", "node" ...)
  • Default queue? (boolean)
  • List of job restrictions placed on a 'normal user' for this queue
    • Max number of computational elements available for request (int)
    • Max walltime request (int)
ANL Example
<machine>
  <tag>ucteragrid</tag>
  <sensorhost>tg-grid.uc.teragrid.org</sensorhost>
  <sensorport>8062</sensorport>
  <totalcores>314</totalcores>
  <loginhosts>
    <host>tg-login.uc.teragrid.org</host>
    <host>tg-login1.uc.teragrid.org</host>
    <host>tg-login2.uc.teragrid.org</host>
  </loginhosts>
  <label>UofC/ANL TeraGrid Cluster</label>
  <defqueue>dque</defqueue>
  <queues>
    <queue>
      <name>dque</name>
      <procunit>cores</procunit>
      <proclimit>2048</proclimit>
      <walllimit>86400</walllimit>
    </queue>
    <queue>
      <name>high</name>
      <procunit>cores</procunit>
      <proclimit>512</proclimit>
      <walllimit>43200</walllimit>
    </queue>
    <queue>
      <name>interactive</name>
      <procunit>nodes</procunit>
      <proclimit>1</proclimit>
      <walllimit>3600</walllimit>
    </queue>
  </queues>
</machine>
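To illustrate how a client might consume metadata in this shape, here is a small sketch that parses a trimmed copy of the example with Python's standard `xml.etree.ElementTree` and checks a job request against the per-queue limits. The element names come from the example above; the helper names (`load_queue_limits`, `request_allowed`) and the assumption that `walllimit` is in seconds are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the machine description, kept short for the example.
EXAMPLE = """<machine>
  <tag>ucteragrid</tag>
  <totalcores>314</totalcores>
  <defqueue>dque</defqueue>
  <queues>
    <queue>
      <name>dque</name><procunit>cores</procunit>
      <proclimit>2048</proclimit><walllimit>86400</walllimit>
    </queue>
    <queue>
      <name>interactive</name><procunit>nodes</procunit>
      <proclimit>1</proclimit><walllimit>3600</walllimit>
    </queue>
  </queues>
</machine>"""


def load_queue_limits(xml_text):
    """Return (default queue name, {queue name: limits dict})."""
    root = ET.fromstring(xml_text)
    limits = {}
    for queue in root.iter("queue"):
        limits[queue.findtext("name")] = {
            "unit": queue.findtext("procunit"),
            "proclimit": int(queue.findtext("proclimit")),
            "walllimit": int(queue.findtext("walllimit")),  # assumed seconds
        }
    return root.findtext("defqueue"), limits


def request_allowed(limits, queue, procs, walltime):
    """True if the request fits within the queue's static limits."""
    lim = limits.get(queue)
    if lim is None:
        return False
    return procs <= lim["proclimit"] and walltime <= lim["walllimit"]


if __name__ == "__main__":
    default_queue, limits = load_queue_limits(EXAMPLE)
    print(default_queue, request_allowed(limits, default_queue, 64, 7200))
```

A predictor could use a check like this to reject requests that a queue would never accept before computing any wait-time bound.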