Hyper-V High Availability and Live Migration. Symon Perriman (Technical Evangelist), Jeff Woolsey (Principal Program Manager). Introduction to Hyper-V Jump Start. Agenda: High Availability Planning; Cluster Deployment; Hyper-V Optimization on a Cluster; Cluster Shared Volumes & Live Migration.
Hyper-V High Availability and Live Migration • Symon Perriman, Technical Evangelist • Jeff Woolsey, Principal Program Manager
Agenda • High Availability Planning • Cluster Deployment • Hyper-V Optimization on a Cluster • Cluster Shared Volumes & Live Migration • Hyper-V Replica
Why is HA Important? • Server downtime is inevitable • Servers will go offline due to: • Maintenance • Upgrade (software or hardware) • Update (hotfix, security patch) • Disaster (power outage, accident)
Complete Redundancy In the Box • Disaster Recovery: Hyper-V Replica for asynchronous replication; CSV 2.0 integration with storage arrays for synchronous replication • Application/Service Failover: Hyper-V app monitoring for non-cluster-aware apps; VM guest clustering over iSCSI or Fibre Channel; VM guest teaming of SR-IOV NICs • I/O Redundancy: network load balancing & failover via Windows NIC Teaming; storage Multi-Path IO (MPIO); multi-channel SMB • Physical Node Redundancy: Live Migration for planned downtime; Failover Cluster for unplanned downtime • Hardware Fault: Windows Hardware Error Architecture (WHEA); Reliability, Availability, Serviceability (RAS)
Overview of Failover Clustering [Diagram labels: Public; VMs & Workloads (on each node); Shared Storage]
Host Clustering • Avoids a single point of failure when consolidating • VMs survive host crashes: VMs restarted on another node • Restart VM crashes: VM OS restarted on same node • Recover VM hangs: VM OS restarted on same node • Zero downtime maintenance & patching: live migrate VMs to other hosts • Mobility & load distribution: live migrate VMs to different servers to load balance [Diagram labels: Cluster, SAN]
Guest Clustering • Application Health Monitoring: an app or service within a VM crashes or hangs and moves to another VM • Application Mobility: apps or services move to another VM for maintenance or patching of the guest OS • Virtualized HBAs: iSCSI (2008 R2 & 2012), Fibre Channel (2012) • Combine physical & virtual servers [Diagram labels: Cluster, iSCSI or FC]
Combining Host & Guest Clustering • Best of both worlds for flexibility and protection • VM high availability & mobility between physical nodes • Application & service high availability & mobility between VMs • Increases complexity [Diagram labels: Guest Cluster; two host clusters, each with a SAN; iSCSI or FC]
Increased Scalability • 8,000 VMs across 64 nodes • 1,024 VMs per node • 320 logical processors per host • 64 virtual processors per VM • 4 TB of RAM per host • 1 TB of RAM per VM • 64 TB per virtual disk (.vhdx) • More storage choices: Hyper-V over SMB; Virtual Fibre Channel HBA (guest clustering) [Diagram labels: Scale up, Scale out]
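As a quick sanity check on the figures above, the following sketch encodes the per-VM limits quoted on the slide and works out the average VM density at full cluster scale. The dictionary keys and function names are made up for illustration; only the numbers come from the slide.

```python
# Limits as quoted on the slide; key and function names are hypothetical.
LIMITS = {
    "vcpus_per_vm": 64,
    "ram_tb_per_vm": 1,
    "vhdx_tb": 64,
}
CLUSTER_VMS, CLUSTER_NODES, VMS_PER_NODE = 8000, 64, 1024


def check_vm(vcpus, ram_tb, vhdx_tb):
    """Return the names of any limits the requested VM size exceeds."""
    requested = {
        "vcpus_per_vm": vcpus,
        "ram_tb_per_vm": ram_tb,
        "vhdx_tb": vhdx_tb,
    }
    return [name for name, value in requested.items() if value > LIMITS[name]]


def average_vms_per_node():
    """Average VMs per node when the cluster is at both quoted maximums."""
    return CLUSTER_VMS / CLUSTER_NODES
```

At 64 nodes the cluster-wide limit averages out to 125 VMs per node, well under the 1,024 per-node limit, so the cluster limit is the one that binds first at that size.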
Hyper-V Validation Tests • Faster storage validation: select a specific LUN • Replicated storage for multi-site • New Hyper-V configuration tests, run when the Hyper-V role is installed: Integration Components, Memory Compatibility, Virtual Switch Compatibility, Hyper-V Role Enabled, Network Configuration, Storage Configuration
Upgrading Clusters to Windows Server 2012 • Cluster Migration Wizard • Automated export / import of VMs • Migrate to CSV disks • Storage mapping • Virtual network mapping • Use the same storage or different storage
Virtual Machine Priority • Start Order • Node Maintenance • Running Priority: pre-emption shuts down lower priority VMs • No Auto Start: must be restarted manually • Levels: Low, Medium, High
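The start-order and pre-emption behaviour listed above can be modelled in a few lines. This is a minimal sketch, not the cluster service's actual algorithm: it assumes memory is the only constrained resource, and the numeric priority values, class, and function names are illustrative.

```python
from dataclasses import dataclass

# Illustrative numeric labels for the priority levels named on the slide.
HIGH, MEDIUM, LOW, NO_AUTO_START = 3000, 2000, 1000, 0


@dataclass
class VM:
    name: str
    priority: int
    memory_gb: int
    running: bool = False


def start_by_priority(vms, capacity_gb):
    """Start VMs in descending priority order while memory remains.

    VMs marked NO_AUTO_START are skipped and need a manual start.
    """
    used = sum(v.memory_gb for v in vms if v.running)
    started = []
    for vm in sorted(vms, key=lambda v: -v.priority):
        if vm.running or vm.priority == NO_AUTO_START:
            continue
        if used + vm.memory_gb <= capacity_gb:
            vm.running = True
            used += vm.memory_gb
            started.append(vm.name)
    return started


def preempt_for(new_vm, vms, capacity_gb):
    """Shut down lower-priority VMs (lowest first) until new_vm fits.

    Returns the names of the VMs shut down, or None if new_vm cannot fit
    even after evicting every lower-priority VM (nothing is changed then).
    """
    used = sum(v.memory_gb for v in vms if v.running)
    victims = sorted(
        (v for v in vms if v.running and v.priority < new_vm.priority),
        key=lambda v: v.priority,
    )
    stopped = []
    for victim in victims:
        if used + new_vm.memory_gb <= capacity_gb:
            break
        used -= victim.memory_gb
        stopped.append(victim)
    if used + new_vm.memory_gb > capacity_gb:
        return None
    for victim in stopped:
        victim.running = False
    new_vm.running = True
    return [v.name for v in stopped]
```

For example, on a host with 10 GB free for VMs, starting a 6 GB high-priority VM while two 4 GB VMs (medium and low) are running evicts only the low-priority one.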
Disable Starting Low Priority VMs • ‘Auto Start’ setting configures whether a VM is automatically started on failover • Group property • Disabling marks groups as lower priority • Enabled by default • Disabled VMs need a manual restart to recover after a crash • Also in Windows Server 2008 R2
Keep VMs on Preferred Hosts • ‘Preferred Owners’ • VMs will start on a preferred host • ‘Possible Owners’ • VMs will start on a possible owner only if a preferred owner is not available • If neither a preferred nor a possible owner is available, the VM will move to an active node, but not start
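The placement rules above amount to a three-step fallback. The sketch below follows the rules exactly as the slide states them; the function name and the list-based inputs are assumptions for illustration, not a cluster API.

```python
def place_vm(preferred, possible, active_nodes):
    """Pick a node for a VM and decide whether it should start there.

    preferred / possible: node names in order of preference.
    active_nodes: names of nodes currently up, in a stable order.
    Returns (node, should_start); node is None if no node is active.
    """
    # 1. A preferred owner that is up wins, and the VM starts there.
    for node in preferred:
        if node in active_nodes:
            return node, True
    # 2. Otherwise fall back to a possible owner and start there.
    for node in possible:
        if node in active_nodes:
            return node, True
    # 3. Otherwise the VM moves to some active node but stays stopped.
    for node in active_nodes:
        return node, False
    return None, False
```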
Start VMs on Preferred Hosts • ‘Persistent Mode’ will attempt to place VMs back on the last node they were hosted on during start • Only takes effect when the complete cluster is started up • Prevents overloading the first nodes that start up with large numbers of VMs • Better VM distribution after cold start • Enabled by default for VM groups • Option is hidden from GUI in 2012
Keep VMs off the Same Host • AntiAffinityClassNames • Groups with same AACN try to avoid residing on the same node • Configured by PowerShell directly on the cluster • System Center 2012 VMM has a GUI “Availability Groups” • Enables VM distribution across host nodes for best resource utilization • Scenarios • Separate similar VMs • Guest cluster nodes • DCs or infrastructure servers • Separate tenants • For affinity, use preferred owners
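Because groups with the same class name only "try to avoid" sharing a node, anti-affinity behaves as a soft rule. The sketch below models that as a scoring choice; the function name and the `placements` structure are hypothetical and this is not how the cluster service is implemented.

```python
def pick_node(vm_class, nodes, placements):
    """Choose a node for a VM carrying an anti-affinity class name.

    placements maps node name -> list of class names already hosted there.
    Nodes with fewer VMs of the same class are preferred; ties go to the
    node hosting fewer VMs overall. The rule is soft: if every node already
    hosts the class, the least-conflicted node is still returned.
    """
    def score(node):
        hosted = placements.get(node, [])
        return (hosted.count(vm_class), len(hosted))

    return min(nodes, key=score)
```

With two domain controllers tagged with the same class, the second one lands on a different node whenever one is available, and only doubles up when there is no alternative.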
VM Health Monitoring • Enable VM heartbeat setting • Requires Integration Components (ICs) installed in VM • Health check for VM OS from host: user-mode hangs, system crashes [Diagram labels: Cluster, SAN]
VM Guest Service Monitoring • The host monitors the guest VM • Any application with a service • Uses Service Control Manager • Configurable recovery actions: restart service, reboot VM, move VM
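One way to picture the configurable recovery actions is as an escalation ladder keyed on how many times the service has failed. The ordering below (restart, then reboot, then move) is an assumption made for this sketch; the slide only lists the available actions.

```python
# Assumed escalation order; the slide lists the actions but not a sequence.
ACTIONS = ["restart service", "reboot VM", "move VM"]


def recovery_action(failure_count, actions=ACTIONS):
    """Return the action for the Nth consecutive failure (1-based).

    Failures beyond the end of the ladder keep returning the last action.
    """
    if failure_count < 1:
        raise ValueError("failure_count must be at least 1")
    return actions[min(failure_count, len(actions)) - 1]
```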
Node Drain (Node Maintenance) Mode • Drain all VMs off a node • Supports all cluster roles • Role-specific features: live migration or quick migration for VMs • Uses VM Priority • PowerShell: Suspend-ClusterNode, Resume-ClusterNode
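A drain can be thought of as building a move plan: highest-priority VMs first, each sent to the least-loaded remaining node. The sketch below assumes that VMs at medium priority or above are live migrated and lower ones are quick migrated; that threshold, the numeric priorities, and the function name are illustrative assumptions rather than documented behaviour.

```python
def drain_plan(vms, other_nodes, live_threshold=2000):
    """Plan moves for every VM on a node that is being drained.

    vms: list of (name, priority) on the draining node.
    other_nodes: dict of node name -> current VM count.
    Returns a list of (vm_name, method, target_node), highest priority first.
    """
    if not other_nodes:
        raise ValueError("no other nodes available to receive VMs")
    load = dict(other_nodes)  # copy so the caller's dict is not modified
    plan = []
    for name, priority in sorted(vms, key=lambda v: -v[1]):
        target = min(load, key=load.get)  # least-loaded node right now
        method = "live" if priority >= live_threshold else "quick"
        load[target] += 1
        plan.append((name, method, target))
    return plan
```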
Cluster-Aware Updating • Automated cluster updating • Update Coordinator serially updates all nodes • Windows Update Agent (WUA) • Windows Server Update Services (WSUS) • Windows Update • Workflow: (1) Scan nodes to find which patches are needed (2) Identify node with fewest workloads (3) Move workloads or live migrate VMs to other nodes (4) Call WUA to patch (5) Verify patch is successful (6) Repeat steps 2-5 on next node (7) Repeat on remaining nodes
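The workflow above is a serial loop over the nodes. The following sketch simulates it on an in-memory model of the cluster; the data layout and function name are hypothetical, and "patching" is reduced to clearing a pending list, so it illustrates only the ordering and workload movement.

```python
def cluster_aware_update(nodes):
    """Simulate one serial updating run.

    nodes: dict of node name -> {"workloads": [...], "pending": [...]}.
    Returns the order in which nodes were patched.
    """
    if len(nodes) < 2:
        raise ValueError("need at least two nodes to move workloads")
    # Scan: find the nodes that have patches outstanding.
    remaining = [n for n, state in nodes.items() if state["pending"]]
    order = []
    while remaining:
        # Pick the node with the fewest workloads.
        node = min(remaining, key=lambda n: len(nodes[n]["workloads"]))
        others = [n for n in nodes if n != node]
        # Move each workload to the currently least-loaded other node.
        for workload in list(nodes[node]["workloads"]):
            target = min(others, key=lambda n: len(nodes[n]["workloads"]))
            nodes[target]["workloads"].append(workload)
        nodes[node]["workloads"].clear()
        # Patch (stand-in for the call to the update agent).
        nodes[node]["pending"].clear()
        # Verify before moving on to the next node.
        if nodes[node]["pending"]:
            raise RuntimeError(f"patching failed on {node}")
        order.append(node)
        remaining.remove(node)
    return order
```

Note that this model does not move workloads back after a node is patched; whether and how to fail back is left out of the sketch.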
So You’re Building a Cloud… • I have good processes in place, but what other safeguards can I use to protect my data?
Server Hard Disks Appear on eBay • Real Case: A US Power Company • The company had processes in place to either physically destroy drives or scrub them to U.S. DOD standards • Degaussing • Overwriting the data with a minimum of three specified patterns • Data on drives used in servers contained: • Proprietary company information such as memos and correspondence • Customer data (460,000+) & confidential employee information • According to Gartner, about 1/3 of companies use outside firms to dispose of PCs & servers
HIPAA Breach: Stolen Hard Drives • March 2012: Large medical provider in Tennessee paying $1.5 million to the US Dept. of Health & Human Services • Theft of 57 hard drives that contained protected health information (ePHI) for over 1 million individuals • Secured by: • Security patrols • Biometric scanner • Keycard scanner • Magnetic locks • Keyed locks • “71% of health care organizations have suffered at least one data breach within the last year” (study by Veriphyr)
Critical Safeguard for the Cloud: Encrypted Cluster Volumes • BitLocker encrypted cluster disks • Support for traditional failover disks • Support for Cluster Shared Volumes • Cluster Name Object (CNO) identity used to lock and unlock clustered volumes • Enables physical security for deployments outside of secure datacenters • Branch office deployments • Volume level encryption for compliance requirements • Negligible (<1%) performance impact
Cluster Shared Volumes (CSV) • All cluster nodes can read/write to the CSV volume • LUN ownership by node abstracted from application • Applications fail over without drive ownership changes • No dismounting and remounting of volumes • Faster failover times (less downtime)
New CSV Architecture in Windows Server 2012 • What it delivers • Improved interoperability with file system mini-filter drivers: anti-virus software, backup software (no more redirected mode for backups!) • Infrastructure for application-consistent distributed backups • Integrates with new file system features: support for Offloaded Data Transfer (ODX); spot-fixing integrated to do online correction • Significant performance improvements • Supports BitLocker encrypted volumes • Memory mapped files now supported • No more Active Directory dependencies, for improved performance and resiliency
Your Thoughts on VM Mobility • Don’t provide new features that preclude Live Migration. • I want to be able to securely move any part of a VM anywhere at anytime. No Limits. • No Downtime Servicing • SAN Upgrades/Migrations • When VMs migrate, move the historical data with the VM • Fully Leverage hardware to speed migrations
Improved Live Migration • Live Migration Queuing: in-box tools queue & manage large numbers of VMs • Concurrent Live Migrations: multiple simultaneous LMs for a given source or target
Live Migration • Entire VM memory copied: memory content is copied to the new server • There may be additional incremental data copies until the data on both nodes is essentially identical • PowerShell: Enable-VMMigration, Move-VM [Diagram labels: Live Migrate, SAN, VHD]
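The "incremental copies until both nodes are essentially identical" step is an iterative pre-copy: each round re-sends the pages the running VM dirtied during the previous round. The sketch below is a toy model of that convergence using integer page counts; the parameter names, the threshold, and the round cap are assumptions for illustration, not Hyper-V settings.

```python
def precopy_rounds(total_pages, dirty_rate, bandwidth, threshold, max_rounds=10):
    """Simulate iterative memory copy during a live migration.

    total_pages: pages of VM memory to transfer.
    dirty_rate:  pages the running VM dirties per second.
    bandwidth:   pages the network can copy per second.
    threshold:   when the pages left to send drop to this count, the last
                 round is treated as the final copy before switch-over.
    Returns the number of pages sent in each round.
    """
    to_send = total_pages
    rounds = []
    for _ in range(max_rounds):
        rounds.append(to_send)
        if to_send <= threshold:
            break  # small enough: this was the final copy
        # Pages dirtied while this round was on the wire (integer math):
        # copy time is to_send / bandwidth seconds.
        to_send = min(total_pages, to_send * dirty_rate // bandwidth)
    return rounds
```

When the VM dirties memory more slowly than the network copies it, each round shrinks quickly; when it dirties memory faster, the rounds never shrink and the loop stops at the round cap.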
Live Migration • Client directed to new host: ARP redirects clients to new node • Session state is maintained • No reconnections necessary; clients stay connected to VM • Multiple live migrations can be performed either concurrently or as a queued request • Old VM deleted after success [Diagram labels: SAN, VHD]
Live Migration with SMB • File based storage solution • Storage is not moved, just the running virtual machine • Like live migration in a cluster, without high availability • Requires SMB 3.0
Shared Nothing Live Migration • Ability to live migrate a virtual machine with only an Ethernet cable • The VM is mirrored to the destination first over the network and then the VM is migrated • Live migrate in/out of a cluster • Live migrate between clusters
Storage Migration • Move any part of a running virtual machine with no need to turn it off: VHDs, config files, snapshots • Perform storage upgrades with no downtime • Respond to I/O bottlenecks with no downtime • PowerShell: Move-VMStorage
Storage Migration Architecture [Diagram sequence: Hyper-V virtual machine and its VHD software stack; VHD on the source device, then VHDs on both the source and destination devices]
Storage Migration Architecture • Move-VMStorage "File Server 3" -DestinationStoragePath "K:\File Server 3" [Diagram labels: Hyper-V, Virtual Machine, VHD Software, VHD, Destination Device, Source Device]
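The slides show the architecture only as a diagram, so the mechanism is not spelled out here. One common way to move a disk under a running VM, assumed for this sketch, is to copy blocks in the background while mirroring every new guest write to both source and destination, then switch over once the two match. The function name and the block-list model are hypothetical.

```python
def migrate_storage(source, writes_during_copy):
    """Copy a virtual disk block by block while the VM keeps writing.

    source: list of block values (modified in place by guest writes).
    writes_during_copy: dict mapping a block index i to a guest write
        (target_index, value) that arrives right after block i is copied.
    Returns the destination disk, which should equal the source at the end.
    """
    dest = [None] * len(source)
    for i in range(len(source)):
        dest[i] = source[i]  # background copy of block i
        if i in writes_during_copy:
            target, value = writes_during_copy[i]
            source[target] = value  # the guest write lands on the source
            dest[target] = value    # and is mirrored to the destination
    return dest
```

Mirroring is what keeps already-copied blocks from going stale; a write to a block not yet copied is simply picked up again when the background copy reaches it.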
Disaster Recovery Challenges • Cost • Complexity • Inflexibility • Initial Replication • Distance Requirements
Hyper-V Replica • Disaster recovery scenarios: planned, unplanned, and test failover • Pre-configuration of IP settings for primary/remote location • Key features • Recovery point objective and recovery time objective in minutes • Seamless integration with Hyper-V and clustering • Automatically handles all VM mobility scenarios (e.g. live migration) • Supports heterogeneous storage between primary and recovery • Integrates with Volume Shadow Copy Service (VSS) • PowerShell: Enable-VMReplication, Set-VMReplicationServer
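Asynchronous replication is what makes the recovery point objective non-zero: changes made since the last shipment exist only on the primary. The sketch below is a toy model of that idea with a key-value store standing in for a virtual disk; the class and method names are hypothetical and it does not reflect the Replica wire protocol.

```python
class ReplicaPair:
    """Toy model of a primary VM and its asynchronous replica."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = []  # changes logged on the primary, not yet shipped

    def write(self, key, value):
        """Apply a change on the primary and log it for later shipment."""
        self.primary[key] = value
        self.pending.append((key, value))

    def ship(self):
        """Send the pending log to the replica; returns changes shipped."""
        shipped = len(self.pending)
        for key, value in self.pending:
            self.replica[key] = value
        self.pending.clear()
        return shipped

    def unplanned_failover(self):
        """Primary is gone: the replica state is all there is.

        Returns (replica_state, number_of_changes_lost).
        """
        return dict(self.replica), len(self.pending)

    def planned_failover(self):
        """Primary is still reachable: ship the log first, so nothing is lost."""
        self.ship()
        return dict(self.replica), 0
```

The number of unshipped changes at the moment of an unplanned failover is the data at risk, which is why a shorter shipping interval gives a tighter recovery point.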
Hurricane Sandy: Email from 10/30/12 Good morning; The Hurricane hit our area badly; many downed trees, even on my wife’s car. Flooding and total power cuts were everywhere. We are very grateful that everyone is well. I now want to thank the Microsoft 2012 server team for giving businesses the new replica feature. Two of our clients (both whom cannot be without their IT infrastructure) were flooded entirely, and might take 2 weeks to get back into their businesses. At 7pm last night we failed over their entire domains to the Replica site, and they have been able to continue their daily business with ZERO interruption. “Windows Server 2012 saved their business”.