EMC RecoverPoint/Cluster Enabler for Microsoft Failover Cluster
Disaster Restart: Business challenges and requirements
• Meeting recovery point objective (RPO) and recovery time objective (RTO) requirements with current plan
• Business and/or regulatory needs
• Need to reduce RPO and RTO times
• Application benchmarks from days or hours to minutes or seconds
• Need for continuous operations with no data loss
• Cost of assets and maintenance at disaster recovery site
• Need to maintain software versions (updates and patches)
• Reliability of the disaster recovery plan
• Need to address applications requiring dependent write consistency between and across operating systems
• Need to periodically test to ensure it will work when required
Business Impact of Application and Data Inaccessibility
[Chart: downtime cost plotted against time, with recovery options marked along the curve: hot site/cold site, electronic vaulting, database replication, remote replication, dedicated hot standby, geographical clusters]
Cost of downtime escalates quickly over time
Microsoft Failover Cluster: A high-availability restart solution
• When a node or resource fails, the affected resource groups are automatically restarted on another node where resources are available
[Diagram: a node fails; resource groups shown: Microsoft SQL, Microsoft Exchange, Oracle]
Microsoft failover cluster provides high availability; shared-nothing cluster model
Cluster Enabler 4.0: Features and capabilities overview
• Integrates RecoverPoint and RecoverPoint/SE with Microsoft failover cluster
• Automatic site failover for remote replication operations
• Supports majority node set quorum options: Majority Node Set (MNS), and MNS with File Share Witness
• Supports RecoverPoint continuous remote replication (CRR)
• Using Fibre Channel or Gigabit Ethernet for remote replication
• Up to 400 milliseconds maximum latency for asynchronous replication
• Up to 4 milliseconds maximum latency for synchronous replication
• Supports Windows Server and Server Core for Windows Server 2008
• Up to two nodes per site with Windows Server 2003
• Up to eight nodes per site with Windows Server 2008 and Windows Server 2008 R2
• Supports clustering of up to eight child partitions with Hyper-V
Cluster Enabler 4.0 supports RecoverPoint 3.1.1 or later and any array supported by RecoverPoint and RecoverPoint/SE
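The limits on this slide lend themselves to a quick pre-deployment sanity check. The sketch below is only an illustration: `ReplicationPlan`, `check_plan`, and the dictionary keys are hypothetical names, not part of any EMC or Microsoft tool, and the only values taken from the slide are the latency and node-count limits.

```python
from dataclasses import dataclass

# Limits quoted on the slide. The structure and names are illustrative only.
MAX_LATENCY_MS = {"synchronous": 4, "asynchronous": 400}
MAX_NODES_PER_SITE = {"2003": 2, "2008": 8, "2008 R2": 8}


@dataclass
class ReplicationPlan:
    """Hypothetical description of a planned two-site deployment."""
    windows_version: str      # "2003", "2008" or "2008 R2"
    nodes_per_site: int
    replication_mode: str     # "synchronous" or "asynchronous"
    latency_ms: float         # measured latency between the sites


def check_plan(plan):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []

    node_limit = MAX_NODES_PER_SITE.get(plan.windows_version)
    if node_limit is None:
        problems.append(f"unknown Windows Server version: {plan.windows_version}")
    elif plan.nodes_per_site > node_limit:
        problems.append(
            f"{plan.nodes_per_site} nodes per site exceeds the limit of "
            f"{node_limit} for Windows Server {plan.windows_version}"
        )

    latency_limit = MAX_LATENCY_MS.get(plan.replication_mode)
    if latency_limit is None:
        problems.append(f"unknown replication mode: {plan.replication_mode}")
    elif plan.latency_ms > latency_limit:
        problems.append(
            f"{plan.latency_ms} ms exceeds the {latency_limit} ms limit "
            f"for {plan.replication_mode} replication"
        )

    return problems


if __name__ == "__main__":
    plan = ReplicationPlan("2003", 4, "synchronous", 10)
    for problem in check_plan(plan):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a planner see every violated limit in a single pass.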
Majority Node Set Support
Majority Node Set
• Used as a tie-breaker to avoid split-brain scenarios
• From a cluster-node perspective, each node sees the quorum as a local resource
• Each cluster node stores the configuration information on a local disk
• Each node has access to local disk when it starts up
• Cluster service ensures cluster configuration is consistent on each cluster node
• Changes are replicated across the Majority Node Set
File Share Witness
• External to a cluster, providing an additional quorum vote
• 2- to 4-node cluster can survive up to N-1 node failures
• 4- to 8-node cluster can survive up to N-2 node failures
• Acts as a witness to Majority Node Set
• Enhances geographically dispersed failover clusters
• Recommended that File Share Witness be configured in a third site
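The tie-breaker role of the witness can be seen with simple vote counting. The following is a minimal sketch assuming a plain strict-majority rule in which every node and the witness each hold one vote; `has_quorum` is a hypothetical helper, not a cluster API, and the model is not meant to reproduce the per-cluster-size survivability figures quoted on the slide.

```python
def has_quorum(total_nodes, nodes_up, witness_configured=False, witness_up=False):
    """Return True if the surviving members hold a strict majority of votes.

    Assumption: one vote per node plus one vote for the File Share Witness
    when it is configured.
    """
    if not 0 <= nodes_up <= total_nodes:
        raise ValueError("nodes_up must be between 0 and total_nodes")

    total_votes = total_nodes + (1 if witness_configured else 0)
    votes_present = nodes_up + (1 if witness_configured and witness_up else 0)

    # Strict majority: more than half of all configured votes.
    return 2 * votes_present > total_votes


if __name__ == "__main__":
    # Two sites with two nodes each; one whole site is lost.
    print("no witness:   ", has_quorum(4, 2))
    print("with witness: ", has_quorum(4, 2, witness_configured=True, witness_up=True))
```

In this model an even split between two sites leaves neither side with a majority, which is the situation the third-site witness is there to resolve.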
Cluster Enabler for Microsoft Failover Cluster
[Diagram: Site A and Site B, each with cluster nodes with RecoverPoint/CE installed, connected by a LAN/WAN private interconnect and by RecoverPoint replication; File Share Witness with RecoverPoint/CE installed]
Failover cluster supports up to 8 nodes with Windows Server 2003/2008 using Majority Node Set with and without File Share Witness
Cluster Enabler for Microsoft Failover Cluster: Node failure
Role of major software components
Microsoft failover cluster software
• Protects against server hardware or network connection failures
• Initiates failover actions to a clustered node for resource group restart
Cluster Enabler 4.0 software
• Installed on all cluster nodes and on File Share Witness (if File Share Witness is used)
• Responds to queries from the cluster service that determine cluster behavior
• Determines RecoverPoint state and initiates appropriate RecoverPoint actions using the RecoverPoint API
RecoverPoint software
• CRR provides remote mirroring of production data
• CRR journal retained, allowing for point-in-time recovery outside of cluster operations
Cluster Enabler and Node Failure Event: Failover steps
[Diagram: Site A and Site B, Majority Node Set with File Share Witness, RecoverPoint replication between sites]
• Site A node fails, resulting in heartbeat response timeout
• Cluster reforms between the Site B node and the File Share Witness node
• The Site B node brings resource groups from the Site A node online
• The latest image of the RecoverPoint volumes listed in the resource group is automatically recovered, read/write enabled, and mounted to the Site B node
• Application listed as part of the failed Site A node resource group is restarted
• The Site A node network address is added to the network interface of the Site B node and client traffic is routed to the Site B node
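The failover sequence above can be traced with a toy state model. This is a sketch only, assuming the two-node-plus-witness layout from the diagram; `Node`, `ResourceGroup`, and `fail_over` are hypothetical names, and nothing here calls the real cluster service or the RecoverPoint API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy cluster node; the fields only track state for this illustration."""
    name: str
    address: str
    alive: bool = True
    addresses: list = field(default_factory=list)
    mounted: list = field(default_factory=list)
    running: list = field(default_factory=list)


@dataclass
class ResourceGroup:
    name: str
    volumes: list
    application: str
    owner: Node


def fail_over(group, survivor, witness_up):
    """Walk through the six failover steps and return them as a log."""
    failed = group.owner
    # Step 1: the owning node has stopped answering heartbeats.
    if failed.alive:
        raise ValueError("owner node is still answering heartbeats")
    steps = [f"{failed.name} heartbeat timed out"]

    # Step 2: survivor and witness together hold 2 of the 3 votes.
    if not (survivor.alive and witness_up):
        raise RuntimeError("cannot re-form cluster: no majority")
    steps.append(f"cluster re-formed between {survivor.name} and witness")

    # Step 3: ownership of the resource group moves to the survivor.
    group.owner = survivor
    steps.append(f"{group.name} brought online on {survivor.name}")

    # Step 4: replica volumes are made read/write and mounted.
    survivor.mounted.extend(group.volumes)
    steps.append(f"volumes {group.volumes} mounted on {survivor.name}")

    # Step 5: the application in the resource group is restarted.
    survivor.running.append(group.application)
    steps.append(f"{group.application} restarted")

    # Step 6: the failed node's address is added so clients are rerouted.
    survivor.addresses.append(failed.address)
    steps.append(f"{failed.address} added to {survivor.name}")
    return steps


if __name__ == "__main__":
    site_a = Node("site-a-node", "10.0.0.10", alive=False)
    site_b = Node("site-b-node", "10.0.1.10")
    sql = ResourceGroup("SQL group", ["vol1", "vol2"], "sqlserver", site_a)
    for step in fail_over(sql, site_b, witness_up=True):
        print(step)
```

The ordering matters: the volumes must be mounted before the application restarts, and the address moves last so clients only reconnect once the service is back.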
Disaster Recovery for Hyper-V (new): Automated failover operations for Hyper-V environments
[Diagram: Site A with Prod 1 and Prod 2, Site B with Target 1 and Target 2; cluster nodes with RecoverPoint/CE installed; LAN/WAN private interconnect; Majority Node Set with File Share Witness; RecoverPoint replication between sites]
Hyper-V with failover clusters supports up to 8 nodes with Windows Server 2008 R2
Hyper-V Overview (new): Cluster Enabler 4.0 supports Hyper-V with failover clusters
• Failover of the virtual machine (VM) resource
• RecoverPoint/CE is deployed in the Hyper-V parent partition
• Cluster relocation is at the VM level
• Hyper-V Live Migration and Quick Migration between nodes at the same or different sites
• Live Migration supported with RecoverPoint CRR synchronous replication
• Quick Migration supported with synchronous and asynchronous replication
• Use for planned maintenance, such as VM relocation for hardware upgrades and software upgrades
• Use for VM workload redistribution: move VMs from one physical host to another
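The migration support rules above reduce to a small lookup. The sketch below encodes only what the slide states; `SUPPORTED_MODES` and `migration_allowed` are hypothetical names chosen for illustration.

```python
# Replication modes each migration type is supported with, per the slide.
SUPPORTED_MODES = {
    "live": {"synchronous"},
    "quick": {"synchronous", "asynchronous"},
}


def migration_allowed(migration_type, replication_mode):
    """Return True if the migration type is supported with the replication mode."""
    return replication_mode in SUPPORTED_MODES.get(migration_type, set())


if __name__ == "__main__":
    for migration_type in ("live", "quick"):
        for mode in ("synchronous", "asynchronous"):
            print(migration_type, mode, migration_allowed(migration_type, mode))
```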
Hyper-V Virtual Machine Failure Event (new): Failover steps with Cluster Enabler 4.0
[Diagram: Site A and Site B, Majority Node Set with File Share Witness, RecoverPoint replication between sites]
• Site A Hyper-V physical node fails, resulting in heartbeat response timeout
• Cluster reforms between the Site B node and the File Share Witness node
• The Site B node brings Hyper-V virtual machine resource groups from the Site A node online
• RecoverPoint target volumes for consistency groups listed in affected resource groups are recovered and mounted to the Site B node
• Virtual machines listed as part of the failed Site A node resource group are restarted
• The Site A node network address is added to the network interface of the Site B node and client traffic is routed to the Site B node
Virtual machines can fail over within and between failover cluster nodes
Hyper-V Live Migration (new)
• Planned hardware maintenance on a physical server requires moving the VM to another physical server
[Diagram: Site A and Site B, Majority Node Set with File Share Witness; volumes R1 and R2 at each site kept in step by RecoverPoint CRR synchronous replication]
Live migration can be within the same site or between sites
Multi-Array Support
[Diagram: cluster nodes with RecoverPoint/CE installed at two sites connected over a WAN, RecoverPoint at each site, File Share Witness with RecoverPoint/CE installed; devices for Cluster Group 1 and devices for Cluster Group 2 on separate arrays]
Each named cluster group's associated devices reside in a single RecoverPoint consistency group of the same name
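The naming rule on this slide can be checked mechanically once the group layouts are known. The sketch below assumes both layouts have already been exported as plain dictionaries mapping a group name to its device names; `check_group_layout` is a hypothetical helper and does not query the cluster or RecoverPoint.

```python
def check_group_layout(cluster_groups, consistency_groups):
    """Verify each cluster group maps to one same-named consistency group.

    Both arguments map a group name to an iterable of device names.
    Returns a list of problems; an empty list means the rule holds.
    """
    problems = []
    for name, devices in cluster_groups.items():
        replicated = consistency_groups.get(name)
        if replicated is None:
            problems.append(f"{name}: no consistency group of the same name")
            continue
        missing = sorted(set(devices) - set(replicated))
        if missing:
            problems.append(f"{name}: devices outside the consistency group: {missing}")
    return problems


if __name__ == "__main__":
    cluster = {
        "Cluster Group 1": ["disk1", "disk2"],
        "Cluster Group 2": ["disk3"],
    }
    recoverpoint = {
        "Cluster Group 1": ["disk1"],
    }
    for problem in check_group_layout(cluster, recoverpoint):
        print(problem)
```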
Microsoft Failover Clusters Deployed with Oracle on Windows
[Diagram: Prod 1 and Prod 2 nodes and Target 1 and Target 2 nodes, each running Oracle, connected over a network; Majority Node Set with File Share Witness; RecoverPoint replication between sites]
Failover clusters configured with Oracle Fail Safe
Benefits of Cluster Enabler
Provides rapid site restart with RecoverPoint
• Automatic site failover for common disruptions, including complete site disasters and server, storage, or network-related failures
Minimizes site failback time with RecoverPoint
• Only changes are copied by RecoverPoint or RecoverPoint/SE to resynchronize the primary cluster storage system
Provides multi-array support
• One cluster can span multiple storage arrays at the same or different sites
• Different clusters can share storage arrays
Supports heterogeneous storage arrays
• A mix of arrays can be used
• Storage arrays do not have to be identical between sites