Implementing Dual-Boot Clusters in a Distributed Environment • Surajit Bose, Technology Services Manager • Dustin King, Systems Imaging Architect
Our Environment • Not central IT • Over 100 computer clusters, mostly unstaffed • Dorms, Row Houses, Graduate Residences • Central and Branch Libraries • Student Centers • Most open 24/7 • Approximately 500 cluster machines • Historically, even mix of Dells and Apples
Our Prior Infrastructure • Campus-wide Kerberos authentication • PXE/Ghost for Windows imaging • Windows machines joined to AD • Domain scripts for Windows maintenance • NetRestore for Mac imaging • Macs bound to LDAP • Radmind for Mac maintenance • Linux server environment
Why Dual-Boot? • Bypass question of optimal platform mix • Improve availability of single-platform software • Provide choice for students • Homogenize inventory • Seemed like a cool thing to try
Desiderata • Network-based full-disk imaging • Platform parity • Manage each platform independently • Ease of switching OS • Non-ridiculous login times • Server-side control • Consistent imaging process across hardware • Shared local storage across OSes
What We Discovered • Managing the reboot cycle is difficult • Existing solutions unsatisfactory for us • BootPicker, NetRestore/WinClone Mac-centric • rEFIt makes management difficult • No network boot environment works for both Dell and Apple machines • Partition order matters
What We Decided • Control boot process with EFI shell environment (SCUBA) • Inter-OS communication via locally stored state file • NetBoot install environment (Genie) • Use convoluted partition scheme • Use Paragon NTFS and MacDrive • Use customized login screens • Nightly maintenance reboots • Server-side tracking of machine state
EFI Shell Environment • Boot to EFI shell • Fits on a flash drive for full-disk imaging • Shell modified to ignore keyboard interrupts • EFI toolkit has network stack, HTTP client, Python • Startup script: validates NVRAM boot options; checks with server; reads and updates local state file; sets nextboot value in NVRAM
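The startup script's four steps can be sketched as a single function. This is a minimal illustration, not the SCUBA code: every name here (`run_startup` and its six callable parameters) is hypothetical, and falling back to the cached server response when the server cannot be reached is an assumption suggested by the state file's caching role.

```python
def run_startup(validate_nvram, fetch_server_flags, load_state,
                save_state, choose_target, set_next_boot):
    """Run the boot-time steps in the order the slide lists them.

    All six arguments are hypothetical callables standing in for the
    real EFI-side operations, which the slides do not show.
    """
    # Step 1: validate the NVRAM boot options.
    validate_nvram()

    # Step 3 (read half): load the locally stored state file.
    state = load_state()

    # Step 2: check with the server. Assumption: on a network failure,
    # reuse the most recent response cached in the state file.
    try:
        server_flags = fetch_server_flags()
    except OSError:
        server_flags = state.get("server_cache", {})
    else:
        state["server_cache"] = server_flags

    # Decide which OS to boot from the combined flags.
    target = choose_target(server_flags, state)

    # Step 3 (update half): write the state file back.
    save_state(state)

    # Step 4: set the nextboot value in NVRAM.
    set_next_boot(target)
    return target
```

Passing the operations in as callables keeps the sequence testable away from real hardware; an actual script would call the EFI toolkit directly.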
Priority of Boot Flags • Required (from server) • Mac Maintenance (from local state file, set by script) • Windows Maintenance (from local state file, set by script) • Requested (from local state file, set by user) • Suggested (from server)
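The priority order above amounts to a first-match lookup across two flag sources. The sketch below assumes a representation the slides do not specify: the key names, the `"mac"`/`"windows"` values, the boolean maintenance flags, and the default target are all hypothetical.

```python
# Flag sources in descending priority, in the order the slide lists them.
PRIORITY = (
    ("server", "required"),
    ("state", "mac_maintenance"),
    ("state", "windows_maintenance"),
    ("state", "requested"),
    ("server", "suggested"),
)

# Assumption: maintenance flags are booleans that imply a target OS;
# the other flags carry the OS name directly.
MAINTENANCE_TARGETS = {
    "mac_maintenance": "mac",
    "windows_maintenance": "windows",
}


def choose_boot_target(server, state, default="mac"):
    """Return (target, reason) for the highest-priority flag that is set."""
    sources = {"server": server, "state": state}
    for source, key in PRIORITY:
        value = sources[source].get(key)
        if not value:
            continue  # flag absent or cleared; try the next one
        return MAINTENANCE_TARGETS.get(key, value), key
    # Assumption: fall back to a fixed default when no flag is set.
    return default, "default"
```

For example, a user's request to boot Mac OS X overrides a server suggestion of Windows, while a server requirement overrides everything else.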
Local State File • Houses maintenance and requested boot flags • Caches most recent response from the server • Has to be writable from both OSes as well as EFI shell environment
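Because three environments write the same file, a plain-text format and a cautious write path are worth illustrating. The slides do not give the file's format, so JSON, the field names, and the write-to-temp-then-rename approach below are all assumptions for the sake of a sketch.

```python
import json
import os
import tempfile


def empty_state():
    """Hypothetical default layout; the real file's fields are not shown."""
    return {
        "mac_maintenance": False,
        "windows_maintenance": False,
        "requested": None,
        "server_cache": {},
    }


def load_state(path):
    """Read the state file, falling back to defaults if missing or corrupt."""
    state = empty_state()
    try:
        with open(path, "r", encoding="utf-8") as handle:
            loaded = json.load(handle)
    except (OSError, ValueError):
        return state  # unreadable or malformed: start from defaults
    if isinstance(loaded, dict):
        state.update(loaded)
    return state


def save_state(path, state):
    """Write to a temporary file first, then rename it over the original."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, sort_keys=True)
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if the write or rename failed.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Treating a missing or malformed file as an empty state means a machine interrupted mid-write can still boot; whether the rename is fully atomic depends on the filesystem and OS, so this is a mitigation rather than a guarantee.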
Genie • Based on NetInstall set built with Mac OS X Server Admin Tools • Bash scripts check server for configuration and manage imaging process • Report progress through iHook
Partition Scheme • EFI System Partition: leave alone per Apple recommendation • FAT: store Windows images and local state file • NTFS: local storage space for users • NTFS: Windows system partition • HFS+: EFI shell environment • HFS+: Mac system partition
Handling Partitions • Mac OS X: Paragon NTFS; remount volumes under /Library/Mounts • Windows XP: MacDrive; some partitions already invisible; remount volumes under c:\stucomp\mnt
Nightly Maintenance • Scripts on each OS write maintenance flags into state file • Windows: Python reboot service; domain startup scripts • Mac: Radmind; iHook
Server-Side Setup • Genie • Background downloads • SCUBA flags • Printer configuration • Imaging request page • Status “database”
Gotchas • Per-seat licensing costs • Mouse and keyboard confusion • NetBoot memory management horror • Windows reboot behavior • Time and Kerberos logins • Permissions on shared volumes • SSH keys
Planned Enhancements • Improve build processes for EFI, NetBoot environments • Increase structural similarity of configuration and management between platforms • Implement PKI for client-server communications • Explore emerging solutions (e.g. XHooks) • Implement cross-platform monitoring system • Reduce power usage on clients • Create documentation • Release as open-source
Acknowledgments • Karl Kuehn, Software Image Developer • Alex Schorsch, Student Developer • Fangling Zhang, Student Developer • Paul Nuyujukian, Student Developer • Ian Comfort, Systems Administrator
Thanks! • surajit@stanford.edu • daking@stanford.edu • Evaluate! http://www.resnetsymposium.org/rspm/evaluation/