110 likes | 265 Views
David McBride <dwm@doc.ic.ac.uk> Systems Programmer London e-Science Centre Department of Computing, Imperial College. Batch Scheduling at LeSC with Sun Grid Engine. Overview. End-user requirements Brief description of compute hardware Sun Grid Engine software deployment
E N D
David McBride <dwm@doc.ic.ac.uk> Systems Programmer London e-Science Centre Department of Computing, Imperial College Batch Scheduling at LeSC with Sun Grid Engine
Overview • End-user requirements • Brief description of compute hardware • Sun Grid Engine software deployment • Tweaks to the default SGE configuration • Future changes • References for more information and questions.
End-User Requirements • We have many different users: high-energy physicists, bioinfomaticians, chemists, parallel software researchers. • Jobs are many and varied: • Some users run relatively few long running tasks, others submit large clusters of shorter jobs. • Some require several cluster nodes to be co-allocated at runtime (16, 32+ MPI hosts), others simply use a single machine. • Some require lots of RAM.. (1, 2, 4, 8GB+ per machine) • In general users are fairly happy so long as they get a reasonable response time.
Hardware • Saturn: 24-way 750Mhz UltraSparcIII Sun E6800 • 36GB RAM, ~20TB online RAID storage, • 24TB tape library to support long-term offline backups. • Running Solaris 8 • Viking cluster: 260 Node dual P4 Xeon 2Ghz+ • 128 machines with Fast Ethernet; 2x64 machines also with Myrinet • 2 front-end nodes & 2 development nodes. • Running RedHat Linux 7.2 (plus local additions and updates) • Mars cluster: 204 Node dual AMD Opteron 1.8Ghz+ • 128 machines with Gigabit Ethernet; 72 machines also with Infiniband. • Running RedHat Enterprise Linux 3 (plus local refinements) • 4 front-end interactive nodes.
Sun Grid Engine Deployment • Two separate logical SGE installations • Saturn acts as the master node for both cells. • However, Viking is running SGE 5.3 and Mars is running SGE 6.0. • Mars is still ‘in beta’; Viking is still providing the main production service. • When Mars’s configuration is finalized, end-users will be migrated to Mars – Viking will then be reinstalled with the new configuration.
Changes to Default Configuration • Issue 1: • If all the available worker nodes are running long-lived jobs, then a new short-lived job added to the queue will not execute until one of the long-lived jobs has completed. (SGE does not provide a job checkpoint-and-preempt facility.) • Resolution: A subset of nodes are configured to only run short-lived jobs. • Trades slightly reduced cluster utilization for shorter average-case response time for short-lived jobs.
Tweaks to SGE configuration • Issue 2: • If a job is submitted that requires the co-allocation of several cluster nodes simultaneously (eg for a 16-way MPI job) then that job can be starved by a larger number of single-node jobs. • Resolution: Manually intervene to manipulate queues so that the large 16-way job will be scheduled. (SGE 5.3) • Resolution: Upgrade to SGE 6 which uses a more advanced scheduling algorithm (advance reservation with backfill.)
Future Changes • Default requirements for jobs: • Different cluster nodes have different resources; eg some have more memory, fast processors, than others. • Sometimes a low-requirement job will be allocated to one of these more capable machines unnecessarily because the submitter has not specified the job’s requirements. • This can prevent a job which does have high requirements from being run as quickly. • Plan to change the SGE configuration so that a job will, by default, only require the resources of the least-capable node. • Places onus on user to request extra resources if needed.
Future Changes: LCG • We are participating in the Large Hadron Collider Compute Grid as part of the London Tier-2. • This has been non-trivial; the standard LCG distribution only supports PBS-based clusters. • We’ve developed SGE-specific Globus JobManager and Information Reporter components for use with LCG. • We have also been working with the developers to address issues with running on 64bit Linux distributions. • Currently deploying front-end nodes (CE, SE, etc.) to expose Mars as an LCG compute site. • We are also joining the LCG Certification Testbed to provide a SGE-based test site to help ensure future support.
References • London e-Science Centre homepage: • http://www.lesc.ic.ac.uk/ • SGE intergration tools for Globus Toolkit 2, 3, 4 and LCG: • http://www.lesc.ic.ac.uk/projects/sgeindex.html