Virtual Machine Universe in Condor

Presentation Transcript


  1. Virtual Machine Universe in Condor

  2. What is the VM universe? • A user can submit a virtual machine to Condor as a job • Condor runs the virtual machine and sends back the resulting virtual machine • Supports VMware Server and Xen

  3. Big picture (diagram) • Submit machine: Schedd, Shadow • Execute machine: Startd, Starter, VM GAHP, Virtual Machine

  4. Benefits of the VM universe • Platform independence • Environment independent of the host machine • Checkpoint • Networking in a virtual machine • Snapshot disk • Input CDROM image

  5. Snapshot disk • All modified data is stored in snapshot disks, without changing the original VM disk files. • VM disk files in a shared file system can be safely shared among multiple jobs. • Reduces the disk space needed for results and checkpoints.
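The copy-on-write behaviour described on this slide can be pictured with a small Python sketch. It is only a toy model: `SnapshotDisk`, the block numbers, and the block contents are invented for illustration, and nothing here reflects how VMware actually stores snapshot data.

```python
from types import MappingProxyType


class SnapshotDisk:
    """Toy copy-on-write overlay: reads fall through to the base, writes stay local."""

    def __init__(self, base_blocks):
        self._base = base_blocks   # shared original disk, never modified here
        self._delta = {}           # blocks this job has modified

    def read(self, block):
        if block in self._delta:
            return self._delta[block]
        return self._base[block]

    def write(self, block, data):
        self._delta[block] = data

    def modified_blocks(self):
        return dict(self._delta)


# Hypothetical original VM disk, exposed read-only so that sharing it is safe.
original = {0: "boot", 1: "system", 2: "data"}
base = MappingProxyType(original)

job1 = SnapshotDisk(base)
job2 = SnapshotDisk(base)
job1.write(2, "job1 output")
job2.write(1, "job2 patch")

print(job1.read(2), "|", job2.read(2))   # job1 sees its own write, job2 the original
print(job1.modified_blocks())            # only this delta would need to be sent back
```

Both jobs read the same base, and each job's changes stay in its own delta, which is why the shared disk files remain safe to reuse and why the result is smaller than a full disk copy.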

  6. Submit description file with shared file system • universe = vm • executable = WindowsXP • vm_type = vmware • vm_memory = 256 • vm_checkpoint = TRUE • vm_networking = TRUE • vm_networking_type = dhcp • vmware_dir = /shared/windows_vm • vmware_should_transfer_files = FALSE • vmware_snapshot_disk = TRUE • initialdir = /result1 • Queue • initialdir = /result2 • Queue
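When many jobs share the same VM settings and differ only in their result directory, the submit description can be generated rather than written by hand. The following Python sketch assumes a hypothetical helper, `render_submit`, which is not part of Condor. It follows the pattern the slide uses: shared settings come first, then each job sets its own `initialdir` before a `Queue` line.

```python
def render_submit(common, per_job):
    """Return submit-description text: shared settings, then one Queue per job."""
    lines = [f"{key} = {value}" for key, value in common.items()]
    for overrides in per_job:
        lines.extend(f"{key} = {value}" for key, value in overrides.items())
        lines.append("Queue")
    return "\n".join(lines) + "\n"


# Values copied from the slide above.
common = {
    "universe": "vm",
    "executable": "WindowsXP",
    "vm_type": "vmware",
    "vm_memory": "256",
    "vm_checkpoint": "TRUE",
    "vm_networking": "TRUE",
    "vm_networking_type": "dhcp",
    "vmware_dir": "/shared/windows_vm",
    "vmware_should_transfer_files": "FALSE",
    "vmware_snapshot_disk": "TRUE",
}
per_job = [{"initialdir": "/result1"}, {"initialdir": "/result2"}]

submit_text = render_submit(common, per_job)
print(submit_text)
```

The printed text could be written to a file and passed to `condor_submit`.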

  7. Snapshot disk with shared file system (diagram) • Submit machine: /result1, /result2 • Execute machine 1: Job 1 snapshot disk • Execute machine 2: Job 2 snapshot disk • Shared file system: /windows_vm

  8. Submit description file without shared file system • universe = vm • executable = WindowsXP • vm_type = vmware • vm_memory = 256 • vm_checkpoint = TRUE • vm_networking = TRUE • vm_networking_type = dhcp • vmware_dir = /windows_vm • vmware_should_transfer_files = TRUE • initialdir = /result1 • vmware_snapshot_disk = TRUE • Queue • initialdir = /result2 • vmware_snapshot_disk = FALSE • Queue

  9. Snapshot disk without shared file system (diagram) • Submit machine: /windows_vm • Job 1 submit description: ... vmware_snapshot_disk = TRUE, initialdir = /result1; runs on execute machine 1 with a snapshot disk • Job 2 submit description: ... vmware_snapshot_disk = FALSE, initialdir = /result2; runs on execute machine 2

  10. Snapshot disk without shared file system (diagram, continued) • Submit machine: /windows_vm, Job 1 /result1, Job 2 /result2 • Execute machine 1 (Job 1): snapshot disk • Execute machine 2 (Job 2)

  11. Input CDROM image • Unlike other universes, the VM universe cannot use the input or arguments parameters in a job submit description file. • With input CDROM images, a user can run the same VM several times on different input data sets.

  12. Submit description file with input CDROM image • universe = vm • executable = WindowsXP • vm_type = vmware • vm_memory = 256 • vm_checkpoint = TRUE • vm_networking = TRUE • vm_networking_type = dhcp • vmware_dir = /windows_vm • vmware_should_transfer_files = FALSE • vmware_snapshot_disk = TRUE • initialdir = /result1 • vmware_cdrom_files = a.iso • Queue • initialdir = /result2 • vmware_cdrom_files = a.txt, b.txt • Queue

  13. Input CDROM image (diagram) • Submit machine: a.iso, a.txt, b.txt • Job 1 submit description: ... vmware_cdrom_files = a.iso; VM on execute machine 1 • Job 2 submit description: ... vmware_cdrom_files = a.txt, b.txt; VM on execute machine 2

  14. VMware VM universe • Snapshot disk • Input CDROM image • Can be used on either a Linux or a Windows host

  15. Xen VM universe • No support for snapshot disks • A VM disk file in a shared file system cannot be shared among multiple jobs unless it is read-only. • Input CDROM image • Can be used only on a Linux host

  16. Checkpoint • Periodic checkpoint and vacate checkpoint • All modified VM disk files and a file holding the VM memory are transferred back to the submit machine. • When snapshot disks are used, only the snapshot disk files and the VM memory file are transferred.
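The transfer rule on this slide can be written as a short Python sketch. The function `checkpoint_transfer_list` and all file names below are made up for illustration. The sketch only restates the rule and does not describe how Condor selects files internally.

```python
def checkpoint_transfer_list(modified_disk_files, snapshot_disk_files,
                             memory_file, use_snapshot_disk):
    """Pick the files sent back to the submit machine at checkpoint time."""
    if use_snapshot_disk:
        disks = snapshot_disk_files    # original disk files stay untouched
    else:
        disks = modified_disk_files    # the modified disk files themselves go back
    return sorted(disks) + [memory_file]


# Hypothetical file names.
with_snapshot = checkpoint_transfer_list(
    modified_disk_files=["disk1.img", "disk2.img"],
    snapshot_disk_files=["disk1-snapshot.img"],
    memory_file="vm-memory.bin",
    use_snapshot_disk=True,
)
without_snapshot = checkpoint_transfer_list(
    modified_disk_files=["disk1.img", "disk2.img"],
    snapshot_disk_files=[],
    memory_file="vm-memory.bin",
    use_snapshot_disk=False,
)
print(with_snapshot)
print(without_snapshot)
```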

  17. Suspend • Hard suspend: the memory used by the VM is saved to a file and then released. • Soft suspend: the memory used by the VM is not released; the VM is simply paused, as with SIGSTOP.
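The SIGSTOP comparison for soft suspend can be tried on an ordinary process. The Python sketch below is POSIX-only and uses a sleeping child process as a stand-in for a VM. It illustrates the analogy only and is not how Condor or the VM GAHP suspends a virtual machine. The function name `pause_and_resume` is hypothetical.

```python
import os
import signal
import subprocess
import sys


def pause_and_resume():
    """Stop a child with SIGSTOP, confirm it is stopped, then resume it."""
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"]
    )
    try:
        os.kill(child.pid, signal.SIGSTOP)                 # pause; memory stays allocated
        _, status = os.waitpid(child.pid, os.WUNTRACED)    # wait for the stop report
        stopped = os.WIFSTOPPED(status)
        os.kill(child.pid, signal.SIGCONT)                 # resume where it left off
        running_again = child.poll() is None               # still alive, not exited
        return stopped, running_again
    finally:
        child.kill()
        child.wait()


stopped, running_again = pause_and_resume()
print("stopped:", stopped, "running again:", running_again)
```

A stopped process keeps its memory, which matches the soft-suspend case. Hard suspend has no such simple analogue, because the memory image is written to a file and freed.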

  18. Networking issues when restarting from checkpoint • The MAC and IP addresses of the VM are preserved in the checkpoint. • When the checkpointed VM is restarted, its MAC and IP addresses do not change. • If NAT is used for VM networking, the NAT gateway's MAC and IP addresses may differ from one execute machine to another. • In VMware, if VMware Tools is installed inside the VM, it automatically performs a DHCP renew when the VM is restarted.

  19. Future work • Support snapshot disks in the Xen VM universe • For results, retrieve only the output files from the VM instead of all VM files • Support other virtual machine programs (e.g. QEMU)

  20. Summary • We are testing the VM universe. • Hopefully the VM universe will be included in Condor 6.9.x. • Questions?

  21. Case Study 1: Hierarchical snapshot (diagram) • Shared file system • /windows: -r--r--r-- root:root 10GB (parent disk of /windows_with_matlab) • /windows_with_matlab: -rw-rw-- Todd:Todd 400M (snapshot disk; parent disk of /windows_with_matlab_and_excel) • /windows_with_matlab_and_excel: -rw-rw-- Todd:Todd 200M (snapshot disk)
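The layering in this case study behaves much like a chain of dictionaries, where a lookup falls through to the parent until some layer has the key. The Python sketch below uses `collections.ChainMap` as a toy model of that idea. The keys and values are invented, and the real disk formats work on blocks rather than named entries.

```python
from collections import ChainMap

# Toy contents for each layer; names follow the directories on the slide.
windows = {"os": "Windows XP"}                    # read-only base disk
with_matlab = {"matlab": "installed"}             # snapshot on top of the base
with_matlab_and_excel = {"excel": "installed"}    # snapshot on top of the MATLAB layer
job_snapshot = {}                                 # per-job snapshot, created at run time

# Lookups search left to right; writes only ever land in the first mapping.
disk = ChainMap(job_snapshot, with_matlab_and_excel, with_matlab, windows)
disk["result"] = "job output"

print(disk["os"], "/", disk["matlab"], "/", disk["excel"])
print(job_snapshot)   # the only layer that changed
```

Each layer stores only what differs from its parent, which is consistent with the upper layers on the slide being far smaller than the 10GB base.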

  22. Submit description file for Case Study 1 • universe = vm • executable = WindowsXP • vm_type = vmware • vm_memory = 256 • vm_checkpoint = TRUE • vm_networking = TRUE • vm_networking_type = dhcp • vmware_dir = /windows_with_matlab_and_excel • vmware_should_transfer_files = FALSE • vmware_snapshot_disk = TRUE • Queue

  23. Case Study 2: Vanilla universe with platformVM • universe = vanilla • platformvm = /redhat_linux • executable = /tmp/test.sh • arguments = a.txt • log = vanilla.log • error = vanilla.err • output = vanilla.out • transfer_input_files = /tmp/a.txt • Queue

  24. Convert vanilla universe with platformVM into VM universe • universe = vm • executable = vanillaUniv • vm_type = vmware • vm_memory = 128 • vm_checkpoint = TRUE • vm_networking = TRUE • vm_networking_type = dhcp • vmware_dir = /redhat_linux • vmware_should_transfer_files = FALSE • vmware_snapshot_disk = TRUE • vmware_cdrom_files = /tmp/test.sh, /tmp/a.txt, submitfile.txt • Queue
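The conversion shown on this slide is mechanical enough to sketch in Python. The function `vanilla_to_vm` is hypothetical and only reproduces the mapping visible in the two submit descriptions: the platform VM directory becomes `vmware_dir`, and the executable, the input files, and the original submit file are passed in through the input CDROM. The fixed values (memory size, networking, and so on) are copied from the slide and are not defaults known to Condor.

```python
def vanilla_to_vm(vanilla, submit_file_name):
    """Map a vanilla-universe job with platformvm onto a VM-universe job."""
    cdrom_files = [vanilla["executable"]]
    inputs = vanilla.get("transfer_input_files", "")
    cdrom_files += [name.strip() for name in inputs.split(",") if name.strip()]
    cdrom_files.append(submit_file_name)   # lets the VM see the original job description
    return {
        "universe": "vm",
        "executable": "vanillaUniv",
        "vm_type": "vmware",
        "vm_memory": "128",
        "vm_checkpoint": "TRUE",
        "vm_networking": "TRUE",
        "vm_networking_type": "dhcp",
        "vmware_dir": vanilla["platformvm"],
        "vmware_should_transfer_files": "FALSE",
        "vmware_snapshot_disk": "TRUE",
        "vmware_cdrom_files": ", ".join(cdrom_files),
    }


# The relevant entries from the Case Study 2 job.
vanilla_job = {
    "universe": "vanilla",
    "platformvm": "/redhat_linux",
    "executable": "/tmp/test.sh",
    "transfer_input_files": "/tmp/a.txt",
}
vm_job = vanilla_to_vm(vanilla_job, "submitfile.txt")
for key, value in vm_job.items():
    print(f"{key} = {value}")
print("Queue")
```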

  25. Pre-created platform VMs (diagram) • Shared file system • /windows: -r--r--r-- root:root 10GB, with Condor installed • /freebsd: -r--r--r-- root:root 4GB, with Condor installed • /redhat_linux: -r--r--r-- root:root 8GB, with Condor installed