Scheduler Basics. Juli Rew CISL User Forum May 19, 2005. IBM Scheduling Life of a Job Submit Filter Batch Priority Scheduler Factors Affecting BPS Job Scheduling LoadLeveler Load Sharing Facility Scheduling • LSF Scheduling on Linux Systems
Scheduler Basics
Juli Rew
CISL User Forum
May 19, 2005
Overview
• IBM Scheduling
  - Life of a Job
  - Submit Filter
  - Batch Priority Scheduler
  - Factors Affecting BPS Job Scheduling
  - LoadLeveler
• Load Sharing Facility Scheduling
  - LSF Scheduling on Linux Systems
  - Differences from IBM Scheduling
IBM Scheduling: Life of a Job
[Flow diagram: llsubmit job -> Submit Filter (Requirements Processing) -> BPS (Job Ordering) -> LoadLeveler (Job Execution)]
• Submit Filter: Requirements Not Met -> Reject Job -> Done
• BPS: Build Ordered List of Jobs
• LoadLeveler: Job Starts
  - Requirements Problem -> Staff Rejects Job -> Done
  - Job Completes -> Done
Submit Filter Features
• Checks the LoadLeveler job script for:
  - valid parameters
  - valid queue name
  - consistent combinations of features, e.g., shared/not_shared, tasks_per_node/node options
• Moves jobs with allocation holds to hold queues
• Moves jobs with cutoff projects to standby queue
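The checks listed for the submit filter can be sketched roughly as below. This is only an illustration, not the NCAR filter itself: the queue names in VALID_QUEUES are hypothetical, and the rule that tasks_per_node and total_tasks should not appear together is just one assumed example of an inconsistent combination. The script is assumed to carry its options as "# @ keyword = value" directive lines.

```python
import re

# Hypothetical queue (class) names, for illustration only.
VALID_QUEUES = {"com_reg", "csl_reg", "standby"}

# Matches directive lines of the form "# @ keyword = value".
DIRECTIVE = re.compile(r"^#\s*@\s*(\w+)\s*=\s*(.+?)\s*$")


def parse_directives(script_text):
    """Collect keyword/value pairs from the directive lines of a job script."""
    directives = {}
    for line in script_text.splitlines():
        match = DIRECTIVE.match(line)
        if match:
            directives[match.group(1).lower()] = match.group(2)
    return directives


def check_job(script_text):
    """Return a list of problems; an empty list means the job passes."""
    d = parse_directives(script_text)
    problems = []
    # Valid queue name.
    if d.get("class") not in VALID_QUEUES:
        problems.append(f"unknown queue: {d.get('class')!r}")
    # Valid parameter value (shared/not_shared).
    usage = d.get("node_usage")
    if usage is not None and usage not in ("shared", "not_shared"):
        problems.append(f"bad node_usage: {usage!r}")
    # Consistent combination of features (assumed example rule).
    if "tasks_per_node" in d and "total_tasks" in d:
        problems.append("tasks_per_node and total_tasks given together")
    return problems
```

A job whose list of problems is non-empty would be rejected at submit time; the moves to hold or standby queues are not modeled here.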
Batch Priority Job Scheduler Features
• Written at NCAR
• Orders jobs based on policy
• Creates separate facilities (Community, Climate System Laboratory)
• Further separates jobs into proposal groups (NCAR/UNIV, CCSM/oCSL)
• Hands the final ordered list to LoadLeveler
• Allows for backfilling of jobs to avoid idle resources
Prioritization of Jobs by BPS
• all_spec jobs run with the highest priority and can access all nodes
• Below that, all com and csl jobs divided equally
• Round Robin by Group/User

------------------
     all_spec
------------------
  com        csl
     \      /
     top job

50-50% split, not hard
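One way to picture this ordering is the sketch below. It is an assumption-laden illustration, not the BPS code: the job fields ("name", "user", "facility") are hypothetical, round robin is done per user only, and the soft 50-50 split is modeled simply by alternating com and csl jobs until one side runs out.

```python
from collections import OrderedDict, deque
from itertools import zip_longest


def round_robin_by_user(jobs):
    """Interleave jobs so that each user gets one turn per pass."""
    per_user = OrderedDict()
    for job in jobs:
        per_user.setdefault(job["user"], deque()).append(job)
    ordered = []
    while per_user:
        for user in list(per_user):
            ordered.append(per_user[user].popleft())
            if not per_user[user]:
                del per_user[user]
    return ordered


def order_jobs(jobs):
    """all_spec jobs first, then com and csl jobs alternating."""
    spec = [j for j in jobs if j["facility"] == "all_spec"]
    com = round_robin_by_user([j for j in jobs if j["facility"] == "com"])
    csl = round_robin_by_user([j for j in jobs if j["facility"] == "csl"])
    ordered = list(spec)
    # Alternate between facilities; if one side is empty the other keeps
    # going, which is how the split stays soft rather than hard.
    for a, b in zip_longest(com, csl):
        ordered.extend(j for j in (a, b) if j is not None)
    return ordered
```

With two com jobs from one user, one com job from a second user, one csl job, and one all_spec job, the all_spec job comes first and the remaining jobs alternate between facilities while rotating among users.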
Other Factors Affecting Job Scheduling
• Backfilling
  - Jobs that will not interfere with the start of the highest-priority job are allowed to slip in
  - Sweet spot: < 3 hours and small node count
• Allocation Holds
  - Job flagged if a project/division exceeds its 30-day or 90-day allocation thresholds
  - H1 and H2 jobs reordered at a priority above standby but below non-flagged jobs
• Special Initiatives
  - Nodes reserved for real-time or other special runs
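The backfill test and the hold reordering described above might look something like the following sketch. All names here are hypothetical, times are in hours, and placing H1 and H2 in the same tier is an assumption, since only their position relative to standby and non-flagged jobs is stated.

```python
def can_backfill(job, free_nodes, now, top_job_start):
    """True if the job fits on idle nodes and ends before the top job starts.

    job is a dict with hypothetical keys "nodes" and "wallclock_hours";
    top_job_start is when the highest-priority job is expected to start.
    """
    if job["nodes"] > free_nodes:
        return False
    return now + job["wallclock_hours"] <= top_job_start


# Lower number means higher priority. H1/H2 sit between non-flagged
# ("normal") jobs and standby jobs; treating H1 and H2 alike is assumed.
TIER = {"normal": 0, "H1": 1, "H2": 1, "standby": 2}


def apply_hold_tiers(jobs):
    """Stable sort: order within each tier is preserved."""
    return sorted(jobs, key=lambda j: TIER[j["status"]])
```

Short jobs on few nodes pass the can_backfill test most often, which is consistent with the sweet spot noted above.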
Documentation and Utilities
• batchview command gives snapshot of current ordering
• Basic information on scheduling given at http://www.scd.ucar.edu/docs/ibm/ref/llsched.html
LoadLeveler
• IBM's batch job control system
• Allows jobs to be started, stopped, or cancelled
• Controls allocation of resources (CPU, memory)
• Allows custom scheduler plug-in (e.g., BPS)
• Two mutually exclusive options: LoadLeveler scheduler or custom scheduler
Load Sharing Facility
• Commercial product from Platform Computing
• Currently being used on major Linux platforms
• Also available for IBM, but still in evaluation
• Ability to do Hierarchical Fair-Share Scheduling with Backfill, based on same facility scheme used in BPS
• Community/CSL facility division implemented implicitly within the scheduler rather than explicitly by queue name
• Can schedule among multiple platforms - "Grid"
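The idea of hierarchical fair-share can be illustrated with a small sketch. This is not LSF's actual priority formula: the shares/(1 + used) ratio, the share values, and the usage numbers are all made up for illustration. Only the facility and proposal-group names come from the scheme described for BPS.

```python
def pick_next(children):
    """Walk down the share tree, taking the most under-served child each time.

    Each node is a dict with "shares", "used", and optional "children".
    Returns the path of names from the top level down to a leaf.
    """
    path = []
    while children:
        name, node = max(
            children.items(),
            key=lambda kv: kv[1]["shares"] / (1.0 + kv[1]["used"]),
        )
        path.append(name)
        children = node.get("children", {})
    return path


# Illustrative tree: two facilities with equal shares, each split into
# proposal groups. "used" stands for recent resource usage (made-up units).
TREE = {
    "community": {
        "shares": 50, "used": 10,
        "children": {
            "NCAR": {"shares": 50, "used": 8},
            "UNIV": {"shares": 50, "used": 2},
        },
    },
    "csl": {
        "shares": 50, "used": 30,
        "children": {
            "CCSM": {"shares": 50, "used": 20},
            "oCSL": {"shares": 50, "used": 10},
        },
    },
}
```

Here pick_next(TREE) descends into the community facility, which has used less of its share than csl, and then into UNIV for the same reason. The facility division lives in the share tree rather than in queue names.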