Disaster Recovery on a Budget • Douglas Soltesz • VP CIO • Budd Van Lines • @DougSoltesz Santa Clara, CA USA October 2013 1
About these slides • There is more to Disaster Recovery than can be covered during this presentation • These slides are available online • Don’t try to write down all the hyperlinks! • OSS will publish these slides online • http://www.doubleparity.com/OSS2013 • Saving money by using Free software can result in more time spent troubleshooting • Sometimes you get what you pay for
Budd Van Lines Lessons Learned • 3 Major Storms in NJ over past 2 years • Power Loss at NJ HQ • Employees without power • Roadways closed for days • No Gas!!!
4 Steps to Disaster Readiness • Virtualize • Replicate • Automate • Communicate
Step 1 – Virtualize Critical Systems • Servers that are virtualized can be moved easily between disparate hardware
Hypervisors on a Budget • VMware vSphere • Has a free edition, but its limitations can prevent successful automation of failover to the DR site • http://techhead.co/vmware-vsphere-5-1-hypervisor-free-esxi-5-1-limitations/ • Essentials Kits are available and give access to vCLI, vCenter, and PowerShell, but functionality is still limited • http://store.vmware.com/store/vmware/en_US/cat/ThemeID.2485600/categoryID.66192900
Hypervisors on a Budget • Microsoft Hyper-V • Hyper-V is technically included with Windows Server licensing • Can be managed via PowerShell cmdlets • Manage multiple systems with: • Microsoft System Center Virtual Machine Manager (SCVMM) (NOT included / “free”) • 5Nine Manager for Hyper-V (has free edition) • http://www.5nine.com/5nine-manager-for-hyper-v-product.aspx • Others exist but are not free • vtUtilities http://vtutilities.com/
Hypervisors on a Budget • Xen / XCP • Built into many Linux distros • http://wiki.xen.org/wiki/Xen_Overview • XenServer (Citrix) is now open source & includes XenCenter for management • http://www.xenserver.org/ • Many other free management tools • http://wiki.xen.org/wiki/XCP_Management_Tools • http://wiki.xen.org/wiki/Xen_Management_Tools
Hypervisors on a Budget • KVM (Kernel-based Virtual Machine) • Included in mainline Linux as of 2.6.20 • http://www.linux-kvm.org/page/Main_Page • Red Hat offers support & management • SmartOS (Joyent), based on illumos, adds ZFS, DTrace & management (Project FIFO) • http://wiki.smartos.org/display/DOC/Welcome+to+SmartOS • Many other free management tools • http://www.linux-kvm.org/page/Management_Tools
Virtualize on a Budget Summary • Virtualization abstracts a server OS from hardware, aiding Disaster Recovery in many areas • Live / Cold Migration • Gives High Availability during a single host outage or maintenance • VMs are portable between sites during major disasters • Virtualized networking allows seamless failover between switches • Shared storage should be more fault tolerant than local storage
Step 2 - Replicate • Second Datacenter is required • Cloud / Colocation / Branch Office • Connect with IPsec VPN over Internet • Cable Modem / FiOS • Static IPs • Copy of critical VMs • RPO (Recovery Point Objective) • SAN vs VM replication
Replicate – Cloud / Colo / Branch • Branch • Your company is already paying for the space • Hand-me-downs from Primary Site • Your equipment, power, A/C • Colocation • Your equipment; Primary Site hand-me-downs • No need to manage power, A/C, Internet Access • Costs for a full rack around $12k/year • Cloud • No equipment, power, A/C, networking headaches • Only pay for VMs when running • Hard to get SAN to SAN replication • Around $100/TB/Month storage
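The two price points on this slide can be compared with simple arithmetic. The sketch below is only a rough illustration: it uses the slide's approximate figures ($12k/year for a full colocation rack, $100/TB/month for cloud storage) and deliberately ignores equipment, bandwidth and VM runtime charges. The function names are hypothetical.

```python
def monthly_colo_cost(rack_per_year: float) -> float:
    """Convert an annual rack price into a monthly figure."""
    return rack_per_year / 12


def breakeven_tb(rack_per_year: float, cloud_per_tb_month: float) -> float:
    """Replicated capacity (TB) at which cloud storage alone costs as much
    per month as the colocation rack. Ignores every other cost."""
    return monthly_colo_cost(rack_per_year) / cloud_per_tb_month


if __name__ == "__main__":
    # Slide figures: ~$12k/year per rack, ~$100/TB/month cloud storage.
    print(breakeven_tb(12_000, 100))
```

With the slide's numbers the storage-only break-even is 10 TB; real quotes and the cost of hardware at the colo will move it.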
Replicate Critical VMs • VM Based Replication • Often a feature of 3rd party backup software • Hard to get sub-15-minute RPO on VMs • Many rely on hypervisor snapshots • Leaf Coalesce can be an issue in Xen, Hyper-V, (KVM?) • SAN to SAN Replication • Requires same SAN OS on both sides • Can be more expensive • Works with any hypervisor • Lowest RPO can be achieved • VMs are “Crash Consistent”
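RPO is easiest to reason about as "how old is the newest copy at the backup site?". The sketch below is a minimal check under that reading; the 15-minute default mirrors the figure on the slide, and the function names are hypothetical rather than part of any replication product.

```python
from datetime import datetime, timedelta


def data_at_risk(last_replica: datetime, now: datetime) -> timedelta:
    """Window of changes that would be lost if the primary site failed now."""
    return now - last_replica


def rpo_met(last_replica: datetime, now: datetime,
            rpo: timedelta = timedelta(minutes=15)) -> bool:
    """True if the newest replica is no older than the RPO."""
    return data_at_risk(last_replica, now) <= rpo


if __name__ == "__main__":
    last = datetime(2013, 10, 1, 12, 0)
    # Replica is 10 minutes old: within a 15-minute RPO.
    print(rpo_met(last, datetime(2013, 10, 1, 12, 10)))
    # Replica is 40 minutes old: the RPO is breached.
    print(rpo_met(last, datetime(2013, 10, 1, 12, 40)))
```

A check like this could feed monitoring so that a stalled replication job is noticed before a disaster, not during one.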
SAN to SAN Replication on a Budget • Use ZFS (now OpenZFS) • Nexenta and TrueNAS offer ZFS systems with HA, replication and enterprise support • Install Napp-it on OmniOS or OpenIndiana • http://www.napp-it.org • Install FreeNAS (ZFS on BSD) • http://www.freenas.org • Script ZFS send / receive on any OpenZFS system • http://open-zfs.org • http://www.aisecure.net/2012/01/11/automated-zfs-incremental-backups-over-ssh/
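Scripting ZFS send / receive usually means: take a snapshot, send it (incrementally if a previous snapshot exists), and pipe the stream over SSH into zfs receive on the backup system. The sketch below is one possible way to wire that up, not the method from the linked article. The dataset names, the SSH target and the "repl-" snapshot prefix are assumptions for illustration, and it defaults to a dry run that only returns the commands.

```python
import subprocess
from datetime import datetime, timezone


def snapshot_name(dataset, when=None):
    """Timestamped snapshot name; the 'repl-' prefix is an arbitrary choice."""
    when = when or datetime.now(timezone.utc)
    return f"{dataset}@repl-{when:%Y%m%d-%H%M%S}"


def send_cmd(new_snap, prev_snap=None):
    """Full send, or incremental (-i) when a previous snapshot is given."""
    cmd = ["zfs", "send"]
    if prev_snap:
        cmd += ["-i", prev_snap]
    return cmd + [new_snap]


def receive_cmd(remote, target_dataset):
    """Receive on the backup host over SSH; -F rolls the target back first."""
    return ["ssh", remote, "zfs", "receive", "-F", target_dataset]


def replicate(dataset, remote, target, prev_snap=None, dry_run=True):
    """Snapshot, then pipe zfs send into zfs receive on the remote host."""
    new_snap = snapshot_name(dataset)
    steps = [["zfs", "snapshot", new_snap],
             send_cmd(new_snap, prev_snap),
             receive_cmd(remote, target)]
    if dry_run:
        return steps  # nothing is executed, only the planned commands
    subprocess.run(steps[0], check=True)
    sender = subprocess.Popen(steps[1], stdout=subprocess.PIPE)
    subprocess.run(steps[2], stdin=sender.stdout, check=True)
    sender.stdout.close()
    if sender.wait() != 0:
        raise RuntimeError("zfs send failed")
    return steps
```

For example, `replicate("tank/vms", "root@dr-san", "backup/vms", prev_snap="tank/vms@repl-old")` returns the three commands without running them; pass `dry_run=False` only on a system where the names are real.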
The VMware SAN Budd Built • NexentaStor 32TB license w/Gold Support & HA Plugin • JBOD: 2 Supermicro SuperChassis 847E26-RJBOD1 • 2 STEC ZeusRAM Drives for ZIL • 2 OCZ Talos 2 C (240GB) Drives for L2ARC • 36 Seagate Constellation ES SAS 6Gb/s 1-TB HD • ST32000424SS • Set up in 18 Mirrored vDevs (RAID 10) • 2 Controllers • Supermicro Chassis w/X8DAH+-F Motherboard • 144GB RAM • Dual Intel E5606 Xeon Quad Core @ 2.13GHz • LSI 9205-8e SAS Controller • Entire Solution, Running 100 VMs & File Server: $34,000
Notes on Building a ZFS SAN • Don’t skimp on the parts • You’re already saving a ton of $$$ • Always use SAS over SATA • Don’t buy parts on eBay for mission critical data • Build in extra redundancy • Only use equipment on the HCL (hardware compatibility list) • Some vendors will sell you the hardware without the support / software
Step 3 - Automate • How quickly can you fail over to another site? • What is your company's Recovery Time Objective (RTO)? • Have you created a Runbook? • http://en.wikipedia.org/wiki/Runbook • Very few products on the market offer an Automated Runbook for Disaster Recovery • VMware SRM
Runbook Process Example • Failing over critical VMs on a single LUN • Start SAN replication to backup site • Power off critical VMs in assigned order • Unregister critical VMs from hosts • Unmap LUN from hosts • Rerun SAN replication to backup site • Reverse SAN replication backup site to primary • Map LUNs on backup site hosts • Register VMs on backup hosts • Power up VMs on backup host in assigned order • Re-IP VMs if subnets are different • Update routing, DNS, NAT
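The runbook above is an ordered list where each step must succeed before the next one starts. A minimal sketch of that idea follows. The step names are taken from the slide, but the runner, the `placeholder` action and the dry-run default are illustrative assumptions; real actions would call the hypervisor and SAN tooling.

```python
from typing import Callable, List, Tuple

Step = Tuple[str, Callable[[], bool]]


def placeholder() -> bool:
    """Stand-in action; replace with a real call that returns True on success."""
    return True


def run_runbook(steps: List[Step], dry_run: bool = True) -> List[str]:
    """Run steps strictly in order and halt at the first failure.
    Returns the names of the steps that completed (or would run, in dry run)."""
    done = []
    for name, action in steps:
        if dry_run:
            print(f"[dry-run] {name}")
        elif not action():
            raise RuntimeError(f"Runbook halted at step: {name}")
        done.append(name)
    return done


FAILOVER: List[Step] = [(name, placeholder) for name in [
    "Start SAN replication to backup site",
    "Power off critical VMs in assigned order",
    "Unregister critical VMs from hosts",
    "Unmap LUN from hosts",
    "Rerun SAN replication to backup site",
    "Reverse SAN replication backup site to primary",
    "Map LUNs on backup site hosts",
    "Register VMs on backup hosts",
    "Power up VMs on backup host in assigned order",
    "Re-IP VMs if subnets are different",
    "Update routing, DNS, NAT",
]]

if __name__ == "__main__":
    run_runbook(FAILOVER)  # dry run: prints the plan without executing anything
```

Halting on the first failure matters here: reversing replication or mapping LUNs after a failed power-off could put the same VM disks in use at both sites.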
How to Automate Failover • Build a system at the backup site to run failover scripts • Windows VM • Use PowerShell commands with VMware & Hyper-V • Use Plink to script against ZFS systems, KVM & Xen • http://the.earth.li/~sgtatham/putty/0.53b/htmldoc/Chapter7.html • Script commands together using PowerShell or System Center Orchestrator • VMware Orchestrator • http://www.vmware.com/products/vcenter-orchestrator/ • Linux VM • Shell, Python, Perl scripts to automate failover • VMware vMA • https://my.vmware.com/web/vmware/details?downloadGroup=VSP510-VMA-510&productId=285
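Plink (the PuTTY command-line client) lets a Windows failover VM run commands on SSH-only systems such as a ZFS SAN. The sketch below wraps it from Python as one possible approach. It assumes plink is on the PATH, key-based login is already set up, and the host key has been accepted once interactively; the host, user and key file names are placeholders.

```python
import subprocess


def plink_cmd(host, user, key_file, remote_command):
    """Build a non-interactive Plink call (-batch disables all prompts)."""
    return ["plink", "-ssh", "-batch", "-i", key_file,
            f"{user}@{host}", remote_command]


def run_remote(host, user, key_file, remote_command, timeout=300):
    """Run the command remotely and return its stdout.
    Raises CalledProcessError on a non-zero exit status."""
    result = subprocess.run(plink_cmd(host, user, key_file, remote_command),
                            capture_output=True, text=True,
                            timeout=timeout, check=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder values; prints the command instead of connecting anywhere.
    print(plink_cmd("dr-san.example.com", "root", "dr.ppk", "zfs list"))
```

Using `-batch` means a missing key or unknown host key fails fast instead of hanging an unattended failover script on a prompt.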
Automation Commands • VMware • PowerCLI http://www.vmware.com/support/developer/PowerCLI/index.html • vMA vCLI & Perl http://www.vmware.com/support/developer/vima/ • Examples • Unmount NFS datastore • Remove-Datastore -Datastore Datastore -VMHost 10.23.112.234 -Confirm:$false • esxcli storage nfs remove -v NFS_Datastore_Name • Stop VM • Stop-VM -VM VM -Kill -Confirm:$false • esxcli vm process list • esxcli vm process kill --type=[soft,hard,force] --world-id=WorldNumber • Unregister VM • Remove-VM VM • vmware-cmd --server vcenter --vihost esxhost -s unregister path_to_vmx_file
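The esxcli stop-VM commands above work in two stages: list running VMs to find the world ID, then kill by world ID, escalating from soft to hard to force only if needed. The sketch below builds those commands and parses the list output. The parser assumes the output contains "World ID:" and "Display Name:" lines per VM, which should be verified against your ESXi version; all function names are hypothetical.

```python
KILL_TYPES = ("soft", "hard", "force")  # gentlest first


def list_cmd():
    """Command that lists running VM processes on the host."""
    return ["esxcli", "vm", "process", "list"]


def kill_cmd(world_id, kill_type):
    """Command that stops one VM process by world ID."""
    if kill_type not in KILL_TYPES:
        raise ValueError(f"unknown kill type: {kill_type}")
    return ["esxcli", "vm", "process", "kill",
            f"--type={kill_type}", f"--world-id={world_id}"]


def escalation(world_id):
    """Kill commands in escalating order; try each only if the last failed."""
    return [kill_cmd(world_id, t) for t in KILL_TYPES]


def parse_world_ids(list_output):
    """Map display name -> world ID from 'esxcli vm process list' output.
    Assumes 'World ID:' precedes 'Display Name:' within each VM block."""
    worlds = {}
    world_id = None
    for line in list_output.splitlines():
        key, _, value = line.strip().partition(":")
        if key == "World ID":
            world_id = int(value)
        elif key == "Display Name" and world_id is not None:
            worlds[value.strip()] = world_id
            world_id = None
    return worlds
```

A failover script might call `parse_world_ids` on the list output, look up each critical VM by name, and walk `escalation(world_id)` until the VM no longer appears in the list.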
Automation Commands • Hyper-V • PowerShell cmdlets http://technet.microsoft.com/en-us/library/hh848559.aspx • Xen • XenCenter xe CLI http://docs.vmd.citrix.com/XenServer/6.0.0/1.0/en_gb/reference.html • xe http://wiki.xen.org/wiki/XCP_Command_Line_Interface • KVM • Depends on the manager used • Nexenta • http://info.nexenta.com/rs/nexenta/images/NexentaStor_User_Guide_3.1.5.x.pdf • illumos ZFS • http://www.datadisk.co.uk/html_docs/sun/sun_zfs_cs.htm
Step 4 - Communicate • If users cannot make it into the office, how will they work? • Post status updates to Facebook / LinkedIn if the corporate site is down • Forward office lines to cell phones • Deploy Disaster Readiness kits to key employees • Laptop • MiFi • Car Inverter
Step 4 - Communicate • Now that the systems are up and running, how will end users connect in? • VPN – http://openvpn.net • VDI – Host desktops at DR Site • Citrix / Remote Desktop / Terminal Server
Questions? Slides available @ http://www.doubleparity.com/OSS2013