VMware vCenter Server High Availability

kane-franks

Presentation Transcript


  1. VMware vCenter Server High Availability Product Support Engineering VMware Confidential

  2. Module 2 Lessons
     • Lesson 1 – vCenter Server High Availability
     • Lesson 2 – Distributed Resource Scheduler
     • Lesson 3 – Fault Tolerance Virtual Machines
     • Lesson 4 – Enhanced vMotion Compatibility
     • Lesson 5 – DPM - IPMI
     • Lesson 6 – vApps
     • Lesson 7 – Host Profiles
     • Lesson 8 – Reliability, Availability, Serviceability (RAS)
     • Lesson 9 – Web Access
     • Lesson 10 – vCenter Update Manager
     • Lesson 11 – Guided Consolidation
     • Lesson 12 – Health Status
     VI4 - Mod 2-1 - Slide

  3. Module 2-1 Lessons
     • Lesson 1 – Overview of High Availability
     • Lesson 2 – VMware HA Clusters
     • Lesson 3 – Creating HA Clusters
     • Lesson 4 – Monitoring HA Clusters
     • Lesson 5 – HA Clusters Best Practices
     • Lesson 6 – Troubleshooting VMware HA
     • Lesson 7 – Customizing VMware HA

  4. High Availability Solutions: Unplanned downtime
     • VMware Infrastructure builds fault tolerance capabilities into datacenter infrastructure.
     • These features can be easily configured, thus reducing the cost and complexity of providing higher availability.
     • Key fault-tolerance capabilities built into VMware Infrastructure include:
        • Network interface (NIC) teaming to provide tolerance of individual network card failures
        • Storage multipathing to tolerate storage path failures

  5. High Availability Solutions
     • VMware High Availability and VMware Fault Tolerance, implemented through VMware Infrastructure, offer simple, cost-effective solutions that help mitigate situations that could otherwise make data or services unavailable to users.
     • VMware HA: Checks that ESX/ESXi hosts are functioning. If an ESX/ESXi host fails, another ESX/ESXi host restarts any virtual machines that were running on the server that failed.
     • VMware Fault Tolerance (FT): Checks that individual virtual machines are functioning and deals with failures without any interruption in service. VMware FT creates hidden duplicate copies of running virtual machines, so if a virtual machine fails due to hardware or software failures, the duplicate virtual machine can immediately replace the one that was lost.

  6. High Availability Solutions
     • High availability and fault tolerance are different from other business continuity offerings in that the solution:
        • Exists within a single datacenter. Other solutions exist across physical locations.
        • Uses shared storage for holding the machines' data. Other solutions use multiple copies of the data, which are regularly replicated.
     • Fault tolerance addresses a number of common problematic situations.

  7. Understanding the Resource Allocation Tab in Clusters
     • If the host being used to start a virtual machine is in a cluster, you can view information about reserved resources on the Resource Allocation tab for that cluster.
     • The information for the CPU and Memory reservations indicates that reservations have been made.
     • Summary reservation information displays information about reservations on the cluster root, where all reservations occurred.
     • Individual virtual machines do not actually have any reservations.

  8. VMware HA Cluster Prerequisites
     This section describes the prerequisites for establishing VMware HA clusters.
     • A number of conditions must be established for VMware HA to be used.
     • All virtual machines and their configuration files must reside on shared storage (such as a SAN).
     • Hosts must also be configured to have access to the same virtual machine network.
     • Each host in a VMware HA cluster must have a host name assigned and a static IP address.
     • VMware recommends redundant service console and VMkernel networking.
     NOTE: After you have added a NIC to a host in your VMware HA cluster, you must reconfigure VMware HA on that host.

  9. A note on VMware HA ‘slot’ Calculation
     • Slot calculation is still done by the vCenter HA service.
     • It gives the HA service the capacity of the cluster as a whole.
     • For VirtualCenter 2.x, the VM with the maximum resource consumption was chosen as the basis of the slot calculation.
     • This posed a problem if there was only one heavily resourced virtual machine and the other VMs used far fewer resources.
     • You would get an unfair calculation of remaining resources.
     • This has been changed for vCenter 4.
     • The slot size is shown in the UI.

  10. A note on VMware HA ‘slot’ Calculation
     • When you use the Host failures cluster tolerates option, it is most effective if all virtual machines have a similar CPU and memory requirement.
     • If you have highly variable configurations, consider using the Percentage of cluster resources reserved as failover spare capacity option.
     • When tolerating a specific number of host failures, VMware HA plans for a worst-case scenario by considering all powered-on virtual machines in a cluster and finding the maximum memory and CPU reservations.
     • These maximums are the basis for what is called a slot, which is a logical representation of the largest virtual machine in the cluster.
     • If no reservations are set on a virtual machine, default requirements of 256MB and 256MHz are assigned.

  11. A note on VMware HA ‘slot’ Calculation
     • VMware HA determines how many slots are available in each ESX/ESXi host based on the host’s CPU and memory capacity.
     • VMware HA then determines how many ESX/ESXi hosts could fail with the cluster still having at least as many slots as powered-on virtual machines.
     • When you use the Percentage of cluster resources reserved as failover spare capacity option, each time a request is made to power on a virtual machine, admission control determines the amount of resources the virtual machine needs and how many uncommitted resources remain in the cluster for failovers.
     • If sufficient resources are available, the virtual machine is powered on. This process does not guarantee maintaining a level of service if a number of hosts fail, but it is a more flexible and less conservative approach to assessing whether or not to power on machines.
     • This policy does not use slots. It uses the actual reservations of the virtual machines. If a virtual machine does not have reservations, meaning that the reservation is 0, a default of 256MB and 256MHz is applied. This is controlled by the same HA advanced options used for the failover level policy.
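
The slot arithmetic described on the last two slides can be sketched in a few lines of Python. This is a simplified illustration, not VMware's implementation: the `VM` and `Host` classes and all function names are hypothetical, and removing the largest hosts first is an assumption about what "worst case" means here.

```python
from dataclasses import dataclass

DEFAULT_CPU_MHZ = 256  # default from the slides when no CPU reservation is set
DEFAULT_MEM_MB = 256   # default from the slides when no memory reservation is set


@dataclass
class VM:
    name: str
    cpu_reservation_mhz: int = 0
    mem_reservation_mb: int = 0


@dataclass
class Host:
    name: str
    cpu_mhz: int
    mem_mb: int


def slot_size(vms):
    """Slot = largest CPU and largest memory reservation among powered-on VMs."""
    cpu = max((vm.cpu_reservation_mhz or DEFAULT_CPU_MHZ for vm in vms),
              default=DEFAULT_CPU_MHZ)
    mem = max((vm.mem_reservation_mb or DEFAULT_MEM_MB for vm in vms),
              default=DEFAULT_MEM_MB)
    return cpu, mem


def slots_per_host(host, slot):
    """A host holds as many slots as its scarcer resource allows."""
    cpu, mem = slot
    return min(host.cpu_mhz // cpu, host.mem_mb // mem)


def tolerated_host_failures(hosts, vms):
    """Count how many hosts can fail while slots still cover every VM.

    Assumption: the worst case is modeled by removing the hosts with
    the most slots first.
    """
    slot = slot_size(vms)
    slots = sorted((slots_per_host(h, slot) for h in hosts), reverse=True)
    failures = 0
    while failures < len(slots) and sum(slots[failures + 1:]) >= len(vms):
        failures += 1
    return failures


hosts = [Host(f"esx-{i}", cpu_mhz=8000, mem_mb=16384) for i in range(3)]
vms = [VM("db", 2000, 4096), VM("app"), VM("web-1"), VM("web-2")]
print(slot_size(vms), tolerated_host_failures(hosts, vms))
```

Note how the single large `db` reservation sets the slot size for every VM, which is why the slides recommend the percentage-based policy for clusters with highly variable configurations.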

  12. Monitoring Availability
     You can monitor changes in your high-availability deployment using events and alarms.
     • Use the functionality included in Alarms and Actions to determine what actions are taken when VMware HA events occur.

  13. Creating a VMware HA Cluster
     • Clusters enable a collection of ESX/ESXi hosts to work together.
     • This provides higher levels of availability for virtual machines.
     • You can create a new cluster using the Cluster Creation Wizard.

  14. Admission Control Policy
     VMware HA provides options for what policy is enforced if admission control is enabled.
     • Host failures cluster tolerates: VMware HA reserves a certain amount of resources across a set of hosts. These reserved resources are sufficient to sustain performance even if the specified number of hosts fail.
     • Percentage of cluster resources reserved as failover spare capacity: VMware HA reserves a certain percentage of aggregate resources in the cluster to accommodate failures.
     • Specify a failover host: VMware HA reserves a specific host to accommodate failures. This is a more static solution, where a single host is designated as the host that will be the target for virtual machines if one of the other hosts fails.

  15. Tolerate Some Number of Host Failures
     • You can configure VMware HA to tolerate a specified number of host failures.
     • When using the “Host failures cluster tolerates” option, it is most effective if all virtual machines have a similar CPU and memory requirement.
     • If you have highly variable configurations, consider using the “Percentage of cluster resources reserved as failover spare capacity” option.
     • Each host has some amount of memory and CPU that it can make available for use by virtual machines.
     • Each virtual machine must be guaranteed its CPU and memory reservation requirements.

  16. Tolerate Some Number of Host Failures

  17. Tolerate Some Number of Host Failures

  18. Reserve a Percentage of Cluster Resources
     • You can configure VMware HA to reserve a specific percentage of cluster resources for recovery from host failures.
     • When you use the “Percentage of cluster resources reserved as failover spare capacity” option, each time a request is made to power on a virtual machine, admission control determines the amount of resources the virtual machine would need and how many uncommitted resources remain in the cluster for failovers.
     • This policy does not use slots, but rather it uses the actual reservations of the virtual machines.
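
A minimal sketch of the percentage-based check, assuming that "sufficient resources" means the unreserved share of cluster capacity stays at or above the configured percentage after the new virtual machine is counted. The function names and the tuple layout are hypothetical; the 256MHz/256MB defaults for zero reservations come from the earlier slot slides.

```python
DEFAULT_CPU_MHZ = 256  # applied when a VM's CPU reservation is 0
DEFAULT_MEM_MB = 256   # applied when a VM's memory reservation is 0


def effective(reservation, default):
    """Use the default when the VM has no reservation of its own."""
    return reservation if reservation > 0 else default


def admit_power_on(cluster_cpu_mhz, cluster_mem_mb, running_vms, new_vm, failover_pct):
    """Return True if the new VM can be powered on.

    running_vms is a list of (cpu_reservation_mhz, mem_reservation_mb)
    tuples, and new_vm is one such tuple. Both CPU and memory must keep
    at least failover_pct percent of cluster capacity unreserved.
    """
    all_vms = list(running_vms) + [new_vm]
    cpu_used = sum(effective(cpu, DEFAULT_CPU_MHZ) for cpu, _ in all_vms)
    mem_used = sum(effective(mem, DEFAULT_MEM_MB) for _, mem in all_vms)
    # Integer arithmetic avoids floating-point rounding at the boundary.
    cpu_ok = (cluster_cpu_mhz - cpu_used) * 100 >= cluster_cpu_mhz * failover_pct
    mem_ok = (cluster_mem_mb - mem_used) * 100 >= cluster_mem_mb * failover_pct
    return cpu_ok and mem_ok


running = [(2000, 4000), (0, 0), (3000, 6000)]
print(admit_power_on(10000, 20000, running, (1000, 2000), failover_pct=25))
print(admit_power_on(10000, 20000, running, (3000, 2000), failover_pct=25))
```

In this example the second request is refused because it would leave less than 25 percent of the cluster's CPU capacity unreserved.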

  19. Specify a Failover Host
     • You can configure VMware HA to reserve a specific host as failover capacity.
     • When you use the Specify a failover host option and a host fails, VMware HA first attempts to restart the virtual machines on the reserved host. If this is not possible, for example because of insufficient resources or because the reserved host has itself failed, it attempts to restart the virtual machines on any available host in the cluster.
     • This option does not guarantee a level of availability.
     • It establishes a spare host to use in case of failover.
     • If a failover host is specified, HA admission control prevents users from powering on a virtual machine on the failover host or VMotioning virtual machines to the failover host.
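
The fallback order on this slide can be sketched as follows. The `Host` class, the memory-only fit check, and the "first host that fits" choice among the remaining hosts are all simplifying assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    alive: bool
    free_mem_mb: int


def pick_restart_host(vm_mem_mb: int, failover_host: Host,
                      other_hosts: List[Host]) -> Optional[str]:
    """Try the designated failover host first, then any other host.

    Returns the chosen host name, or None if no host can take the VM.
    """
    for host in [failover_host, *other_hosts]:
        if host.alive and host.free_mem_mb >= vm_mem_mb:
            return host.name
    return None


spare = Host("esx-spare", alive=True, free_mem_mb=8192)
others = [Host("esx-01", True, 2048), Host("esx-02", True, 16384)]
print(pick_restart_host(4096, spare, others))
```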

  20. VM Restart Priority
     • VM restart priority determines the relative order in which virtual machines are restarted after a host failure.
     • Assign higher restart priority to the virtual machines that host the most important services.
     • For example, in the case of a multi-tier application you might opt to rank assignments according to functions hosted on the virtual machines:
        • High: Database servers that will provide data for applications.
        • Medium: Application servers that consume data in the database and provide results on web pages.
        • Low: Web servers that receive user requests, pass queries to application servers, and return results to users.
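
As a small illustration of the multi-tier example, the sketch below orders virtual machines by restart priority. The VM names and the `restart_order` helper are hypothetical; only the High/Medium/Low ranking comes from the slide.

```python
PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}


def restart_order(vms):
    """Sort (name, priority) pairs so higher-priority VMs restart first.

    sorted() is stable, so VMs with equal priority keep their input order.
    """
    return sorted(vms, key=lambda vm: PRIORITY_RANK[vm[1]])


vms = [("web-01", "low"), ("app-01", "medium"), ("db-01", "high"), ("web-02", "low")]
for name, priority in restart_order(vms):
    print(f"{priority:>6}: {name}")
```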

  21. Host Isolation Response
     • The host isolation response determines what happens when a host in a VMware HA cluster loses its service console network (or, in ESXi, VMkernel network) connection but continues running.
     • Values are: Leave VM powered on (the default), Power off VM, and Shut down VM.
     • When a host in an HA cluster loses its console network (or VMkernel network, in ESXi) connectivity, the host is isolated from other hosts in the cluster.
     Virtual Machine Settings
     • You can override the default settings established for the cluster. For each virtual machine, you can establish individual settings for Restart Priority and Isolation Response.
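
The relationship between cluster defaults and per-VM overrides can be pictured as a simple merge. The setting keys and values below are hypothetical labels chosen for this sketch, not identifiers from the vSphere API.

```python
# Hypothetical cluster-wide defaults for the two overridable settings.
CLUSTER_DEFAULTS = {
    "restart_priority": "medium",
    "isolation_response": "leave_powered_on",
}


def effective_settings(cluster_defaults, vm_overrides):
    """Per-VM values win; anything not overridden falls back to the cluster."""
    return {**cluster_defaults, **vm_overrides}


print(effective_settings(CLUSTER_DEFAULTS, {"isolation_response": "shut_down"}))
print(effective_settings(CLUSTER_DEFAULTS, {}))
```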

  22. Virtual Machine Monitoring Sensitivity
     • The degree to which VMware HA is sensitive to virtual machine failures can be configured to different levels.
     • If you select Enable VM Monitoring, the VM Monitoring service uses VMware Tools to evaluate whether each virtual machine in the cluster is running by checking for regular heartbeats from the guest operating system (GOS).
     • If heartbeats stop arriving, the VM Monitoring service determines that the virtual machine has failed, and the virtual machine is rebooted to restore service.
     • Click the Custom box to configure advanced features for Monitoring Sensitivity.
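
Heartbeat-based failure detection can be sketched as a timeout check. The `failure_interval_s` parameter and the example values are assumptions; the slide does not state the actual intervals used at each sensitivity level.

```python
def vms_to_reset(last_heartbeat, now, failure_interval_s):
    """Return the VMs whose last heartbeat is older than the failure interval.

    last_heartbeat maps VM name -> timestamp (seconds) of its latest heartbeat.
    A shorter failure_interval_s corresponds to more aggressive monitoring.
    """
    return sorted(
        name for name, seen in last_heartbeat.items()
        if now - seen > failure_interval_s
    )


heartbeats = {"db-01": 1000.0, "app-01": 1055.0, "web-01": 1058.0}
print(vms_to_reset(heartbeats, now=1060.0, failure_interval_s=30))
```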

  23. Best Practices for Configuring VMware HA Clusters: Networking Best Practices
     • If your switches support the PortFast (or an equivalent) setting, enable it on the physical network switches that connect servers.
     • This helps to prevent a host from incorrectly determining that a network is isolated during the execution of lengthy spanning-tree algorithms.
     • On ESX hosts, HA automatically opens the firewall ports that are needed for it to function. The following ports are opened:
        • Incoming port: TCP/UDP 8042-8045
        • Outgoing port: TCP/UDP 2050-2250
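
When auditing firewall rules between hosts, a helper like the one below can tell whether a port falls into the HA ranges listed on this slide. The function name and the `direction` labels are hypothetical; the port ranges are the ones given above.

```python
# Port ranges from the slide; range() excludes its end, hence the +1.
HA_PORTS = {
    "incoming": range(8042, 8045 + 1),
    "outgoing": range(2050, 2250 + 1),
}


def is_ha_port(port, direction):
    """Return True if the port is in the HA range for the given direction."""
    return port in HA_PORTS[direction]


print(is_ha_port(8043, "incoming"), is_ha_port(8046, "incoming"))
```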

  24. Best Practices for Configuring VMware HA Clusters: Selection of Networks
     • The networks that HA uses by default are:
        • ESX: all service console networks
        • ESXi: all VMkernel networks except the VMotion network, unless there is only one network and it is a VMotion network
     • By default, the network isolation address is the default gateway, so it is a best practice to add a das.isolationaddress[...] for each network.
     • To have HA select networks other than the defaults, use the advanced option das.allowNetwork[...]; HA then uses only networks whose port group names match.
     • On ESXi, use das.AllowVmotionNetworks to override the default exclusion of the VMotion network.

  25. Clusters with both ESX and ESXi hosts
     • In mixed ESX and ESXi clusters, using the das.allowNetwork[...] advanced options may be necessary to ensure compatible networks are selected for hosts.
     • HA configuration enforces that all hosts in the cluster have compatible networks.
     • The first node added to the cluster dictates the networks that all subsequent hosts must have for them to be allowed into the cluster.
     • Networks are deemed compatible if the IP address and subnet mask combine to result in a network that matches another host's.
     • Use das.allowNetwork[...] advanced options to control which networks are to be used to ensure compatibility between all hosts in the cluster.
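
The compatibility rule (IP address combined with subnet mask yields the same network) can be checked with Python's standard `ipaddress` module. The `can_join` helper and the subset interpretation of "networks that all subsequent hosts must have" are assumptions made for this sketch; the addresses are examples.

```python
import ipaddress


def network_of(ip, netmask):
    """Combine an IP address and subnet mask into the network they belong to."""
    # strict=False lets us pass a host address rather than the network address.
    return ipaddress.ip_network(f"{ip}/{netmask}", strict=False)


def can_join(first_host_interfaces, new_host_interfaces):
    """Return True if the new host has every network the first host has.

    Each argument is a list of (ip, netmask) pairs for the HA interfaces.
    """
    required = {network_of(ip, mask) for ip, mask in first_host_interfaces}
    offered = {network_of(ip, mask) for ip, mask in new_host_interfaces}
    return required <= offered


first = [("10.0.1.15", "255.255.255.0")]
print(can_join(first, [("10.0.1.200", "255.255.255.0")]))
print(can_join(first, [("10.0.2.15", "255.255.255.0")]))
```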

  26. Setting Up Networking Redundancy
     • Networking redundancy between cluster nodes is important for VMware HA reliability.
     • Redundant service console networking on ESX4 (or VMkernel networking on ESXi) allows the reliable detection of failures and prevents isolation conditions from occurring.
     NIC Teaming
     • Using a team of two NICs connected to separate physical switches improves the reliability of a service console (or, in ESXi, VMkernel) network.
     • To configure a NIC team for the service console, set the vNICs in the vSwitch configuration to Active or Standby. The recommended parameter settings for the vNICs are:
        • Default load balancing = route based on originating port ID
        • Failback = No

  27. Secondary Service Console Network
     • You can create a secondary service console (or VMkernel port for ESXi), which is attached to a separate virtual switch.
     • The primary service console is used for network and management purposes.
     • With a secondary service console network created, VMware HA sends heartbeats over both the primary and secondary service consoles.
     • When you set up service console redundancy, you must specify an additional isolation response address (das.isolationaddress2) for the service console networks.
     • When you specify a secondary isolation address, you should increase the das.failuredetectiontime setting to 20000 milliseconds or greater.
     • You can add the secondary service console network to the VMotion vSwitch: a virtual switch can be shared between VMotion networks and a secondary service console network.
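
The pairing of das.isolationaddress2 with a longer das.failuredetectiontime lends itself to a small sanity check. The option names and the 20000 ms threshold come from this slide; the `check_isolation_options` helper, the dictionary representation, and the example address are hypothetical.

```python
MIN_DETECTION_MS = 20000  # recommended minimum when a second isolation address is set


def check_isolation_options(options):
    """Return warnings for an HA advanced-options mapping (name -> string value)."""
    warnings = []
    if "das.isolationaddress2" in options:
        detection = options.get("das.failuredetectiontime")
        if detection is None or int(detection) < MIN_DETECTION_MS:
            warnings.append(
                "das.isolationaddress2 is set: raise das.failuredetectiontime "
                f"to {MIN_DETECTION_MS} ms or greater"
            )
    return warnings


options = {
    "das.isolationaddress2": "192.0.2.1",  # example address, not a real gateway
    "das.failuredetectiontime": "15000",
}
print(check_isolation_options(options))
```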

  28. Other VMware HA Cluster Considerations
     • Use larger groups of homogeneous servers to allow higher levels of utilization across a VMware HA-enabled cluster (on average).
     • More nodes per cluster can tolerate multiple host failures while still guaranteeing failover capacities.
     • The failover level policy used in admission control heuristics is conservatively weighted, so that virtual machines on large servers can fail over to smaller servers.

  29. Viewing Information about VMware HA Clusters
     • You can view current settings for a cluster.
     • The cluster Summary page displays summary information for the cluster.

  30. Primary and Secondary Hosts
     • Some hosts in a VMware HA cluster are designated as primary hosts.
     • They maintain information about the cluster such as membership.
     • The first five hosts in the cluster are designated primary hosts, and all subsequent hosts are designated secondary hosts.
     • When you add a host to a VMware HA cluster, that host communicates with an existing primary host in the same cluster to complete its configuration.
     • When a primary host becomes unavailable or is removed from the cluster, VMware HA promotes one of the secondary hosts to primary status.
     • Primary hosts help provide redundancy by replicating the cluster's configuration information and virtual machine states, and are used to initiate failover actions.
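
The role assignment on this slide can be modeled with a toy class. This is only a sketch of the bookkeeping: the `HACluster` class is hypothetical, and promoting the longest-standing secondary is an assumption, since the slide does not say which secondary is chosen.

```python
MAX_PRIMARIES = 5  # the first five hosts become primaries, per the slide


class HACluster:
    def __init__(self):
        self.primaries = []
        self.secondaries = []

    def add_host(self, name):
        """New hosts become primary until five primaries exist."""
        if len(self.primaries) < MAX_PRIMARIES:
            self.primaries.append(name)
        else:
            self.secondaries.append(name)

    def remove_host(self, name):
        """Remove a host; if it was a primary, promote a secondary."""
        if name in self.primaries:
            self.primaries.remove(name)
            if self.secondaries:
                # Assumption: promote the secondary that joined earliest.
                self.primaries.append(self.secondaries.pop(0))
        elif name in self.secondaries:
            self.secondaries.remove(name)


cluster = HACluster()
for i in range(1, 8):
    cluster.add_host(f"esx-{i:02d}")
cluster.remove_host("esx-01")
print(cluster.primaries, cluster.secondaries)
```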

  31. VMware HA Clusters and Maintenance Mode
     • Put a host in maintenance mode in preparation for completing administrative tasks that would otherwise cause unwanted HA responses.
     • Putting a host into maintenance mode effectively disables the HA service.
     • You cannot power on a virtual machine on a host that is in maintenance mode.
     • VMware HA does not fail over any virtual machines to a host that is in maintenance mode.
     • When a host exits maintenance mode, the VMware HA service is reenabled on that host, so it becomes available for failover again.
     • If the host is in a cluster, when it enters maintenance mode the user is given the option to evacuate powered-off virtual machines.

  32. VMware HA Clusters and Disconnected Hosts
     • Users may initiate state changes, such as during network maintenance.
     • An ESX/ESXi host in a cluster may no longer be able to communicate with other hosts in the cluster.
     • That host becomes disconnected.
     • The unresponsive host continues to function, but its state is unknown.
     • When a host is disconnected, VMware HA cannot use it as a guaranteed failover target.
     • VMware HA does not consider disconnected hosts when making calculations related to admission control.
     • When the host becomes reconnected, the host becomes available for failover again.

  33. VMware HA Clusters and Disconnected Hosts
     • The difference between a disconnected host and a host that is not responding is that:
        • A disconnected host has been explicitly disconnected by the user. As part of disconnecting a host, VMware HA is disabled on that host. The virtual machines on that host are not failed over and not considered when the current failover level is computed.
        • If a host is not responding, no other hosts receive heartbeats from it. This might happen, for example, because of a network problem or because the host failed.
     • Disconnected and unresponsive hosts are not included in computations of the current failover level, but any virtual machines running on an unresponsive host will be failed over if the host fails.

  34. Monitoring Individual Virtual Machines
     You can specify behavior for individual virtual machines for:
     • VM Restart Priority: Indicates relative priority for restarting the virtual machine in case of host failure.
     • Host Isolation Response: Specifies what the ESX/ESXi host that has lost connection with its cluster should do with running virtual machines.
     • Monitoring Sensitivity: Specifies how quickly failures are detected. Settings can be changed so certain virtual machines are more or less aggressively monitored. Specific custom values can also be set using advanced options.

  35. Troubleshooting VMware HA
     • If no hosts in a cluster are responding when you attempt to add a new host, VMware HA configuration fails because the new host cannot communicate with any of the primary hosts.
     • Disconnect all hosts that are not responding before adding the new host.
     • After disconnecting all other hosts and adding a new host, that host becomes the first primary host.
     • When other hosts become available again, their VMware HA service is reconfigured and they then become primary or secondary hosts depending on the existing number of primary hosts.

  36. Customizing VMware HA
     After you have established a cluster, you may need to modify settings. There are specific attributes that affect how VMware HA behaves.

  37. Customizing VMware HA

  38. Customizing VMware HA

  39. Set Advanced VMware HA Options
     To precisely customize VMware HA behavior, set advanced VMware HA options.
     Prerequisites
     • You must have a VMware HA cluster for which to modify settings.
     • To modify advanced VMware HA settings, you must have cluster administrator privileges.
     Procedure
     • In the cluster’s Settings dialog box, select VMware HA.
     • Click the Advanced Options button to open the dialog box.
     • Enter each advanced attribute you want to change in a text box.
     • Click OK.

  40. Lesson 2-1 Summary
     • Learn how to create an HA cluster
     • Learn how to monitor an HA cluster
     • Learn how to modify HA cluster settings
     • Learn how to troubleshoot HA clusters

  41. Lab – VMware High Availability
     • Lab 1 Part 1 – Creating a vCenter High Availability (HA) Cluster
     • Lab 1 Part 2 – Adding Hosts to High Availability (HA) Cluster
     • Lab 1 Part 3 – Viewing High Availability (HA) Cluster Settings
     • Lab 1 Part 4 – Modifying High Availability (HA) Cluster Settings
