EMC ControlCenter Infrastructure D/R Options Stephen Walsh
Agenda
• Review of Types of Disaster
• Disaster Scenarios
• Planning What Is Required for Disaster Recovery
• Review of Service Alerts and Disaster-Recovery Function
• Prerequisites
• Current Supported Service Alerts
• Service Alert Functionality in a Disaster Recovery
• Supported ControlCenter Disaster-Recovery Scenarios Overview
• Additional Options
• Archive Recommendations
• Questions
Disaster Scenarios
• Hardware Failure
  • Source: Machine, Disk, Controller, etc.
  • Cause: MTBF, Power Fluctuations, GITM
  • Outage: Depends on Parts Availability and Service Contracts
• OS Failure
  • Source: Windows OS
  • Cause: Patches, Memory Error, Kernel Fault, HW Issues, 3rd-Party Application Error
  • Outage: Typically Short; May Require OS Rebuild / Repair and/or Reintegration into the Domain
• Site Loss
  • Cause: Inclement Weather, Fire, Water Damage, Gas Leak, Terrorist Incident
  • Outage: Depends on Planned Contingencies; May Be Complete Loss
Planning What Is Required for ControlCenter Recovery
• During a Disaster
  • How do you currently handle disaster recovery?
  • Will ControlCenter manage the disaster recovery?
  • What are the minimums required to operate post-disaster?
  • What ControlCenter data are you trying to save?
    • Alerting?
    • Array Management?
    • SAN Management?
    • Performance?
    • Utilization Reporting / Trending?
  • What are your minimum downtime requirements? Level of service?
Prerequisites
• Supports 5.1.2 as a new install; an RPQ is required for systems that have been upgraded from 5.x to 5.1.2
• Supports a new 5.2 install or 5.1.2 upgraded to 5.2
• All service packs and patches are covered for 5.1.2 and 5.2
• Installation MUST have been performed against a host registered in DNS with both an A and a PTR record
  • Verify in server.ini, listener.ora, and tnsnames.ora on the repository server
  • Verify in the path %ECCROOT%\ecc_inf\%FQDNHostNameDir% on the repository server
• All hosts that have ControlCenter agents must point to the repository FQDN
  • If IP addresses or short names were captured, D/R is not supported without an RPQ
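The FQDN verification above can be sketched as a simple text scan. This is a minimal illustration, not an EMC-supplied tool: the function name `check_config_text`, the example host name `eccrepo.example.com`, and the file contents are all hypothetical, and the IPv4 pattern is a heuristic that may also flag dotted version strings.

```python
import re

# Heuristic IPv4 matcher; it can also match dotted version numbers,
# so treat every hit as something to review rather than a definite error.
IPV4_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def check_config_text(text, fqdn):
    """Return a list of findings for the contents of one configuration file.

    `fqdn` is the fully qualified name the repository is expected to be
    registered under (a hypothetical value in this sketch).
    """
    findings = []

    # Any IP literal suggests the install captured an address, not the FQDN.
    for ip in sorted(set(IPV4_PATTERN.findall(text))):
        findings.append(f"IP address literal found: {ip}")

    # The expected FQDN should appear somewhere in the file.
    if fqdn.lower() not in text.lower():
        findings.append(f"FQDN not found: {fqdn}")

    # A short name is the first label appearing without a following dot.
    short_name = fqdn.split(".")[0]
    short_pattern = re.compile(
        rf"\b{re.escape(short_name)}\b(?!\.)", re.IGNORECASE
    )
    if short_pattern.search(text):
        findings.append(f"short host name found: {short_name}")

    return findings
```

In practice the text of server.ini, listener.ora, and tnsnames.ora would each be read and passed to the function; an empty result means no IP literal or short name was spotted by this heuristic.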
Current Supported Service Alerts
• SA 491
  • ControlCenter 5.1.2
  • Single Infrastructure Only
  • Contains backup.bat, restore.bat, the required version of Perl, and documentation
  • An RPQ is not required to use it for either Disaster Recovery or System Migration
• SA 495
  • ControlCenter 5.1.2
  • Distributed Infrastructure Only
  • Contains backup.bat, restore.bat, the required version of Perl, and documentation
  • An RPQ is not required to use it for either Disaster Recovery or System Migration
• SA 511
  • ControlCenter 5.2
  • Single and Distributed Infrastructure
  • Contains backup.bat, restore.bat, the required version of Perl, and documentation
  • An RPQ is not required to use it for either Disaster Recovery or System Migration
• If you do not have access to the service alerts site, e-mail svc@emc.com and you will be directed to a local resource who can assist you
Service Alert Functionality in a Disaster Recovery
• The Service Alerts are a scripted capture process that migrates all unique ControlCenter information from the Windows operating system
• Data is migrated to %ControlCenterRoot%\tools\utils\backupfiles
• The capture should be performed after ControlCenter is installed and redone if any additional infrastructure components are installed
• Once the capture has run, %ControlCenterRoot% can be migrated to another host. The destination host must have the same hostname as the original, use the same path, and have the appropriate DNS and TCP/IP modifications made
• A general list of what is captured
  • HKLM\Software\EMC Corporation\*.*
  • HKLM\SYSTEM\CurrentControlSet\Services\EMC ControlCenter*
  • HKLM\Software\Oracle\*.*
  • HKLM\SYSTEM\CurrentControlSet\Services\Oracle*
  • C:\Program Files\Oracle
  • ControlCenter Path Variables
  • ControlCenter Environmental Variables
  • Desktop Shortcuts
  • Other Miscellaneous Files
Supported D/R Scenarios
• Site-to-Site D/R Replication
  • Boot on SAN array and host all ControlCenter infrastructure components on SAN array with remote mirroring capability
  • Boot internal, host all ControlCenter infrastructure components on SAN array with remote mirroring capability
• Local D/R
  • Boot on SAN array and host all ControlCenter infrastructure components on SAN array
  • Boot internal, host all ControlCenter infrastructure components on SAN array
  • Boot internal, host all ControlCenter infrastructure components on separate JBOD drives that can be migrated
• Other Options
  • CNAME Records or Aliasing, MSCS, SRDF/ce
D/R – SRDF/s – All Disks On SYMMETRIX
1. The ControlCenter application is installed on Symmetrix SRDF/s R1 boot and data disks. The physical host names are bound to A/PTR records, with the R2 side running in a "warm" state for application failover.
2. Service Alert 511 is run on the infrastructure hosts, migrating all localized host application data on the boot volume to the ECCRootVolume device. This is a preventative measure in case the O/S is corrupted by the failure. Once complete, the application is ready for a D/R migration to the target site.
[Diagram: active infrastructure on R1 and standby infrastructure on R2, linked by SRDF/s; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
3. Upon environment failure, the SRDF/s link is failed over. The standby systems are rebooted to initialize the R2 devices. The host IPs are modified to reflect their new subnets, and the A/PTR records are modified in DNS to reflect the new IPs. The hosts are rebooted so that the application initializes against the correct DNS entry.
4. Should the operating system be corrupted for any reason, a new system image can be deployed and SA 511 can be run to restore the application into the operating system. Once it has run, the host needs to be rebooted twice: once to initialize the registry and change all services to automatic, and once more to initialize the application.
[Diagram: after failover, the infrastructure runs from the R2 devices over SRDF/s; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
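The DNS change in step 3 amounts to publishing a matching A and PTR pair for each host's new address. The sketch below only computes the record pair; it does not talk to any DNS server. The function name `failover_records`, the host name, and the addresses are hypothetical values chosen for illustration.

```python
import ipaddress


def failover_records(fqdn, new_ip):
    """Build the forward and reverse records for a host's new address.

    Returns (owner name, record type, value) tuples. How they are pushed
    into DNS (dynamic update, zone file edit) is outside this sketch.
    """
    addr = ipaddress.ip_address(new_ip)
    forward_type = "A" if addr.version == 4 else "AAAA"
    return [
        # Forward record: host name -> new IP address.
        (fqdn + ".", forward_type, str(addr)),
        # Reverse record: reversed-address name -> host name.
        (addr.reverse_pointer + ".", "PTR", fqdn + "."),
    ]


# Hypothetical repository host moving to the target site's subnet.
for record in failover_records("eccrepo.example.com", "10.20.30.40"):
    print(record)
```

Keeping the forward and reverse records in step matters here because the prerequisites require the host to be registered with both an A and a PTR record.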
D/R – SRDF/s – Local Boot, ControlCenter Application on SYMMETRIX
1. The ControlCenter application is installed and bound to A/PTR records on hosts with internal boot and Symmetrix RDF pairs for data. The R2 side is running in a "warm" state for application failover.
2. Service Alert 511 is run on the infrastructure hosts, migrating all localized host application data to the Symmetrix R1/R2 device. Once complete, the application is ready for a disaster migration to the target site.
[Diagram: active and standby infrastructure hosts with internal boot disks; data on R1/R2 pairs linked by SRDF/s; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
3. Upon failover of the environment, Service Alert 511 is run on the target side to restore all localized application settings to the internal disks. The hosts are rebooted to re-initialize their registry settings, and the A/PTR records are updated by removing the source hosts' IP addresses and replacing them with the target hosts' IP addresses.
4. The ControlCenter infrastructure hosts are rebooted and resume processing and recovery of queued data. A review of Data Collection Policies (DCPs) is performed to analyze the status of the data in the repository. Typically there is a synchronization delay, as DCPs that did not run during the failure trigger and recollect the data.
[Diagram: after failover, the active infrastructure runs on the target hosts with internal boot and the R2 data devices; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
D/R – Local Boot, ControlCenter Application on SYMMETRIX Disks
1. The ControlCenter application is installed and bound to A/PTR records on hosts with internal boot and Symmetrix STD devices for data.
2. Service Alert 511 is run on the infrastructure hosts, migrating all localized host application data to the Symmetrix STD device. Once complete, the application is ready for a host-based disaster.
[Diagram: active infrastructure hosts with internal boot disks and STD data devices; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
3. Upon a host-based failure, replacement hardware is put into service. A new OS is installed, and the host is given the same FQDN as the original host it is replacing and joins the appropriate AD domain. The HBA is replaced in the zoneset and VCMDB, granting the new host visibility to the ControlCenter application disk.
4. SA 511's restore procedure is run on the restored host, restoring all application data to the OS. The ControlCenter infrastructure hosts are rebooted and resume processing and recovery of queued data. Typically there is a synchronization delay, as DCPs that did not run during the failure trigger and recollect the data.
[Diagram: a standby server takes over the STD data device from the failed host; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts.]
D/R – Local Boot, ControlCenter on SYMMETRIX Disks, CNAME Aliasing
1. The ControlCenter application is installed and bound to A/PTR records, which will later migrate to CNAME (DNS alias) records, on hosts with internal boot and Symmetrix RDF pairs for data. The physical host names are bound to A/PTR records, with the R2 side running in a "warm" state for application failover once the installation is complete.
2. The application CNAME record is created in DNS, binding the source hostnames. Service Alert 511 is run on the infrastructure hosts, migrating all localized host application data to the Symmetrix R1/R2 device. Once complete, the application is ready for a disaster migration to the target site.
[Diagram: active and standby infrastructure hosts with internal boot disks; data on R1/R2 pairs linked by SRDF/s; DDNS/BIND8.1.2 holds the A and PTR records for the physical hosts and the CNAME for the ControlCenter application.]
3. Upon failover of the environment, Service Alert 511 is run on the target side to restore all localized application settings to the internal disks. The hosts are rebooted to re-initialize their registry settings, and the CNAME record is modified by removing the A/PTR of the source hosts and replacing the entries with those of the new target hosts.
4. The ControlCenter infrastructure resumes processing and recovery of queued data. A review of Data Collection Policies (DCPs) is performed to analyze the status of the data in the repository. Typically there is a synchronization delay, as DCPs that did not run during the failure trigger and recollect the data for current-state analysis.
[Diagram: after failover, the CNAME for the ControlCenter application points at the target hosts, which run from internal boot and the R2 data devices.]
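Why aliasing helps can be shown with a toy, in-memory record table: the application name stays constant and only the alias target changes at failover. This is an illustrative sketch, not a DNS client; the `resolve` helper, the names, and the addresses are hypothetical.

```python
def resolve(records, name, max_hops=8):
    """Follow CNAME entries in an in-memory table until an A record is hit."""
    for _ in range(max_hops):
        record_type, value = records[name]
        if record_type == "A":
            return value
        name = value  # CNAME: continue with the alias target
    raise RuntimeError("CNAME chain too long")


# Hypothetical zone contents: one application alias, two physical hosts.
records = {
    "ecc.example.com": ("CNAME", "site-a-host.example.com"),
    "site-a-host.example.com": ("A", "10.1.0.10"),
    "site-b-host.example.com": ("A", "10.2.0.10"),
}

before = resolve(records, "ecc.example.com")

# Failover: only the alias is repointed; the physical host records
# at both sites are left untouched.
records["ecc.example.com"] = ("CNAME", "site-b-host.example.com")

after = resolve(records, "ecc.example.com")
print(before, after)
```

Anything that refers to the application by its alias follows the change without being reconfigured, which is the point of installing against a name that can be repointed.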
Additional Options
• MSCS Clusters
  • Windows 2000 and 2003 MSCS Supported
  • Supports only Active/Passive 2 node
  • Documented solution in Administration Guide Vol 1
  • No RPQ Required
• SRDF/ce Clusters
  • SRDF/ce formerly known as GeoSpan
  • Site Distributed version of MSCS
  • MSCS Documentation in Administration Guide Vol 1. For SRDF/ce, please contact your local account resource
  • No RPQ Required
• SRDF/a
  • RPQ Required
Archive Recommendations
• Follow the service alert applicable to your version
  • Apply the SA when the base install is complete
  • Re-apply it if any additional components are installed (from the install CD)
• Daily backups
  • WLA Archivers
  • %ECCROOT%\StorageScope\xmlrepository
  • Hot backup
  • Exp backup
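A daily file-level backup of a directory such as the StorageScope xmlrepository can be sketched as a dated zip archive. This is an assumption-laden illustration only: the function name `archive_directory`, the archive naming scheme, and the destination path are hypothetical, and the database hot and exp backups are not covered here.

```python
import datetime
import os
import shutil


def archive_directory(source_dir, dest_dir, day=None):
    """Zip `source_dir` into `dest_dir` under a date-stamped name.

    `dest_dir` should live outside `source_dir` so the archive does not
    end up inside the tree it is copying. Returns the archive path.
    """
    day = day or datetime.date.today()
    os.makedirs(dest_dir, exist_ok=True)
    # Hypothetical naming scheme: xmlrepository-YYYY-MM-DD.zip
    base_name = os.path.join(dest_dir, f"xmlrepository-{day.isoformat()}")
    return shutil.make_archive(base_name, "zip", root_dir=source_dir)
```

A scheduled task could call this once a day with the expanded %ECCROOT%\StorageScope\xmlrepository path as `source_dir` and a directory on separate storage as `dest_dir`.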