Condor NT • Condor ported to Win32
Overview • Intro to Condor NT • What does Condor NT do? • How does Condor NT differ from Condor for Unix? • What are the current limitations of Condor NT? • Future Work
Intro to Condor NT • First pre-release at Condor ver 6.1.8 • “Deep port” of Condor • Daemons run as a system Service under the LocalSystem account • Shares as much source code with Condor for Unix as possible
What can it do? • Almost everything Condor for Unix can… • Submit, run, manage queues of jobs • Jobs run “in the background” • Nearly all Condor tools included • ClassAds • Full complement of attributes (load average, RAM, benchmarks, free swap, key/mouse idle times, image size, CPU usage, etc.) • Everything needed for a Central Manager
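To make the role of those attributes concrete, here is a toy sketch of two-sided matching between a job ad and a machine ad. It is not the real ClassAd language or Condor's matchmaker: the ads are plain Python dictionaries, the `matches` helper is hypothetical, and the attribute names and units (`Memory` and `ImageSizeMB` in MB, `KeyboardIdle` in seconds) are assumptions chosen for illustration.

```python
from typing import Any, Dict

Ad = Dict[str, Any]


def matches(job: Ad, machine: Ad) -> bool:
    """Two ads match only if each side's requirements accept the other side."""
    job_ok = job["Requirements"](job, machine)
    machine_ok = machine["Requirements"](machine, job)
    return job_ok and machine_ok


# Hypothetical machine ad: the owner only accepts jobs after 15 idle minutes.
machine_ad: Ad = {
    "OpSys": "WINNT40",
    "Memory": 128,         # MB of RAM (assumed unit)
    "LoadAvg": 0.05,
    "KeyboardIdle": 1800,  # seconds since last key/mouse activity
    "Requirements": lambda me, job: me["KeyboardIdle"] > 15 * 60,
}

# Hypothetical job ad: needs an NT machine with enough RAM for its image.
job_ad: Ad = {
    "Owner": "alice",
    "ImageSizeMB": 64,
    "Requirements": lambda me, m: (
        m["OpSys"] == "WINNT40" and m["Memory"] >= me["ImageSizeMB"]
    ),
}

if __name__ == "__main__":
    print("match:", matches(job_ad, machine_ad))
```

The point of the sketch is only that both sides state requirements over the other side's attributes, which is why a full set of machine attributes matters for owner policy.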
What can it do? (cont) • Support for SMP machines • Several security mechanisms (more later…) • Suspend, continue, soft-kill (WM_CLOSE), hard-kill jobs • Correctly manage multi-process jobs • Send email notifications • Yada, yada, …
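The soft-kill followed by hard-kill escalation can be sketched generically as below. This is a portable stand-in, not Condor NT's code: it uses Python's `subprocess` module, and `Popen.terminate()` does not post `WM_CLOSE` (on POSIX it sends SIGTERM, and on Windows it calls `TerminateProcess`, which is already a hard kill). The function names and the grace period are assumptions.

```python
import subprocess
import sys


def stop_job(proc: subprocess.Popen, grace_seconds: float = 5.0) -> str:
    """Ask the job to exit, then force it if it ignores the request.

    Returns "soft" if the job exited within the grace period, else "hard".
    """
    # Polite request on POSIX (SIGTERM). This is NOT WM_CLOSE on Windows.
    proc.terminate()
    try:
        proc.wait(timeout=grace_seconds)
        return "soft"
    except subprocess.TimeoutExpired:
        proc.kill()  # unconditional kill
        proc.wait()  # reap the process so no zombie is left behind
        return "hard"


def spawn_sleeper() -> subprocess.Popen:
    """Start a stand-in job that just sleeps for a minute."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )


if __name__ == "__main__":
    job = spawn_sleeper()
    print("stopped via", stop_job(job), "kill")
```

A real Win32 soft-kill would need to deliver `WM_CLOSE` to the job's windows, which requires platform APIs outside this sketch.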
What’s missing? • Only VANILLA universe included • No STANDARD, PVM, GLOBUS, or SCHEDULER universes • Note: MPI being done on both Unix and Win32 • Ability to run the job as the submitting user • Ability to access shared volumes as the submitting user • So: who does the job run as, and how does the job get its files?
Job Start on Condor NT • On execute machine, Condor creates • New temporary user account • New temporary working directory • New temporary, non-visible desktop • Permissions (ACLs) set • Files transferred by Condor • Job spawned
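The portable part of that start-up sequence (scratch directory, file transfer, spawn) can be sketched as follows. Creating the temporary user account, the non-visible desktop, and the ACLs requires Win32 APIs and is deliberately left out; `start_job` and the `condor_job_` prefix are hypothetical names, and a local copy stands in for Condor's file transfer.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Sequence, Tuple


def start_job(
    argv: Sequence[str], input_files: Sequence[Path]
) -> Tuple[subprocess.Popen, Path]:
    """Create a scratch directory, copy inputs into it, and spawn the job there."""
    # New temporary working directory (the user account, hidden desktop,
    # and ACL steps from the slide are omitted: they need Win32 calls).
    scratch = Path(tempfile.mkdtemp(prefix="condor_job_"))

    # "Files transferred by Condor": a plain local copy stands in here.
    for src in input_files:
        shutil.copy2(src, scratch / Path(src).name)

    # "Job spawned": run with the scratch directory as its working directory.
    proc = subprocess.Popen(list(argv), cwd=scratch)
    return proc, scratch


if __name__ == "__main__":
    demo_dir = Path(tempfile.mkdtemp())
    demo_input = demo_dir / "input.txt"
    demo_input.write_text("hello")
    code = "print(open('input.txt').read())"
    job, workdir = start_job([sys.executable, "-c", code], [demo_input])
    job.wait()
    shutil.rmtree(workdir)
    shutil.rmtree(demo_dir)
```

Because the job only ever sees its own scratch directory, it does not depend on a shared filesystem being mounted on the execute machine.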
While Job is Running… • Condor watches the job and updates dynamic attributes about the job in the job ClassAd • Disk usage, cpu usage, … • Enforces the machine owner’s policy
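A minimal sketch of such a watch loop, under stated assumptions: the job ad is a plain dictionary, the attribute names (`DiskUsage`, `WallClockSeconds`, `JobStatus`, `ExitCode`) are illustrative, and the owner's policy is reduced to a hypothetical `owner_wants_machine` callback. CPU usage is omitted because measuring it portably needs platform-specific APIs.

```python
import subprocess
import sys
import tempfile
import time
from pathlib import Path
from typing import Any, Callable, Dict


def disk_usage_bytes(directory: Path) -> int:
    """Total size of all regular files under the job's scratch directory."""
    return sum(p.stat().st_size for p in directory.rglob("*") if p.is_file())


def watch_job(
    proc: subprocess.Popen,
    scratch: Path,
    job_ad: Dict[str, Any],
    owner_wants_machine: Callable[[], bool],
    poll_seconds: float = 0.1,
) -> Dict[str, Any]:
    """Poll the job, refresh dynamic attributes, and evict if policy says so."""
    started = time.monotonic()
    while proc.poll() is None:
        job_ad["DiskUsage"] = disk_usage_bytes(scratch)
        job_ad["WallClockSeconds"] = time.monotonic() - started
        if owner_wants_machine():
            # Policy enforcement, reduced here to an immediate kill.
            proc.kill()
            proc.wait()
            job_ad["JobStatus"] = "evicted"
            return job_ad
        time.sleep(poll_seconds)
    # Job exited on its own: record the final state.
    job_ad["DiskUsage"] = disk_usage_bytes(scratch)
    job_ad["WallClockSeconds"] = time.monotonic() - started
    job_ad["JobStatus"] = "exited"
    job_ad["ExitCode"] = proc.returncode
    return job_ad


if __name__ == "__main__":
    workdir = Path(tempfile.mkdtemp())
    job = subprocess.Popen(
        [sys.executable, "-c", "open('out.bin','wb').write(b'x'*1000)"],
        cwd=workdir,
    )
    print(watch_job(job, workdir, {}, owner_wants_machine=lambda: False))
```

A fuller policy would typically suspend or soft-kill before resorting to a hard kill; the immediate `kill()` keeps the sketch short.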
On Job Vacate/Exit… • Condor conditionally transfers any output files back to the submit machine • Can be told filenames, or automatically send back files which have changed • File transfers are atomic • Cleanup
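One way to picture "send back files which have changed" and atomic transfer is the sketch below. It is an assumption about technique, not Condor's implementation: changes are detected by comparing (mtime, size) snapshots taken before and after the job, and atomicity is approximated per file by writing to a temporary name in the destination directory and renaming with `os.replace`. All function names are hypothetical.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List, Tuple

Snapshot = Dict[str, Tuple[int, int]]  # file name -> (mtime_ns, size)


def snapshot(directory: Path) -> Snapshot:
    """Record modification time and size of each file in the directory."""
    snap: Snapshot = {}
    for p in directory.iterdir():
        if p.is_file():
            st = p.stat()
            snap[p.name] = (st.st_mtime_ns, st.st_size)
    return snap


def changed_files(before: Snapshot, after: Snapshot) -> List[str]:
    """Names of files that are new or whose (mtime, size) differs."""
    return sorted(n for n, sig in after.items() if before.get(n) != sig)


def transfer_atomically(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir so a partial file never appears under its final name."""
    final = dest_dir / src.name
    fd, tmp_name = tempfile.mkstemp(
        dir=dest_dir, prefix=src.name + ".", suffix=".part"
    )
    os.close(fd)
    try:
        shutil.copyfile(src, tmp_name)
        os.replace(tmp_name, final)  # rename within one directory
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return final


if __name__ == "__main__":
    work = Path(tempfile.mkdtemp())
    dest = Path(tempfile.mkdtemp())
    (work / "input.txt").write_text("in")
    before = snapshot(work)
    (work / "result.txt").write_text("42")
    for name in changed_files(before, snapshot(work)):
        print("sent back:", transfer_atomically(work / name, dest))
```

If "atomic" is meant for the whole set of output files rather than each file, the same idea extends to staging every file in a temporary directory and committing them together at the end.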
Some points on shared (network) filesystem access • On Condor Unix, VANILLA requires a shared filesystem • Not true on Condor NT • Condor NT can access a shared filesystem • … but only as user “Guest” or only if the share password is provided by the job
Difficulties of running as the user • Forwarding credentials problem • Windows NTLM in NT 4.0 can impersonate the peer on a socket, but only one “jump” • On Windows NT, cannot just setuid() • [Figure: machines A, B, C]
Current Work To Do • Improve situation for access to shared filesystem • As user “condor”, or • As user who submitted the job • Run jobs as the submitting user • On NT 4.0: store the password, forward it encrypted • On Windows 2000: same or PKI
Current Work To Do (cont) • Windows 2000 support • Current release mostly works on Win2k… • Take advantage of Win2k enhancements • Add in Scheduler Universe • And therefore DAGMan support • Add in the MPI Universe
Future Work • Add remaining missing Condor Universes • STANDARD • Requires addition of process checkpointing and/or remote system calls • GLOBUS • Requires Globus Toolkit client libs on Win32 • PVM