Business Continuity: Local Module 4.3
Local Replication
After completing this module you will be able to:
• Discuss replicas and the possible uses of replicas
• Explain consistency considerations when replicating file systems and databases
• Discuss host and array based replication technologies
  • Functionality
  • Differences
  • Considerations
  • Selecting the appropriate technology
What is Replication?
• Replica - an exact copy (in all details)
• Replication - the process of reproducing data
[Diagram: Original → Replica]
Possible Uses of Replicas
• Alternate source for backup
• Source for fast recovery
• Decision support
• Testing platform
• Migration
Considerations
• What makes a replica good?
  • Recoverability: considerations for resuming operations with the primary
  • Consistency/restartability: how this is achieved by the various technologies
• Kinds of replicas
  • Point-in-Time (PIT) = finite RPO
  • Continuous = zero RPO
• How does the choice of replication technology tie back into RPO/RTO?
Replication of File Systems
• File system buffers in host memory must be flushed to the physical volume before the replica is created, so that the replica captures a consistent on-disk image.
[Diagram: host buffer and physical volume]
Replication of Database Applications
• A database application may be spread out over numerous files, file systems, and devices, all of which must be replicated.
• Database replication can be performed offline or online.
[Diagram: database data and log devices]
Database: Understanding Consistency
• Databases/applications maintain integrity by following the "Dependent Write I/O Principle"
  • Dependent write: a write I/O that will not be issued by an application until a prior related write I/O has completed
  • A logical dependency, not a time dependency
• Inherent in all Database Management Systems (DBMS)
  • e.g. a page (data) write is a dependent write I/O based on a successful log write
  • Applications can also use this technique
• Necessary for protection against local outages
  • Power failures create a dependent write consistent image
  • A restart transforms the dependent write consistent image into a transactionally consistent one
  • i.e. committed transactions will be recovered; in-flight transactions will be discarded
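To make the ordering concrete, here is a minimal sketch, assuming hypothetical log_device and data_device wrappers (none of these names come from the module or from any particular DBMS): the data page write is issued only after the related log write has completed.

```python
# Minimal sketch of the dependent write I/O principle. The engine, device
# names and record format are illustrative only, not any real DBMS's API.

class SimpleDbEngine:
    def __init__(self, log_device, data_device):
        self.log_device = log_device        # assumed to expose write()/flush()
        self.data_device = data_device      # assumed to expose write_page()

    def commit(self, txn_id, page_updates):
        # 1. The prior write: persist the log record and wait for completion.
        self.log_device.write(f"COMMIT {txn_id}: {sorted(page_updates)}\n")
        self.log_device.flush()             # must complete before step 2 is issued

        # 2. The dependent write: only now are the data pages written.
        #    A failure between steps 1 and 2 leaves a dependent-write
        #    consistent image that recovery can roll forward from the log.
        for page_no, new_bytes in page_updates.items():
            self.data_device.write_page(page_no, new_bytes)
```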
Database Replication: Transactions
[Diagram: the database application issues writes 1 to 4 from its buffer to the data and log devices]
Database Replication: Consistency
[Diagram: all writes (1 to 4) on the data and log devices are captured on the replica: Consistent]
Note: In this example, the database is online.
Database Replication: Consistency
[Diagram: writes 3 and 4 are captured on the replica but writes 1 and 2 are not: Inconsistent]
Note: In this example, the database is online.
Database Replication: Ensuring Consistency
• Offline replication
  • If the database is offline or shut down when the replica is created, the replica will be consistent.
  • In many cases, creating an offline replica may not be viable due to the 24x7 nature of business.
[Diagram: replica of an offline database application: Consistent]
Database Replication: Ensuring Consistency
• Online replication
  • Some database applications allow replication while the application is up and running
  • The production database must be put into a state that allows it to be replicated while it is active
  • Some level of recovery must be performed on the replica to make it consistent
[Diagram: replica of an online database: Inconsistent until recovery is applied]
Database Replication: Ensuring Consistency
[Diagram: after the required recovery (write 5) is applied to the replica, it becomes Consistent]
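As a rough illustration of that recovery step, the sketch below rolls committed transactions forward from the replica's log and discards in-flight ones. The tuple-based log format is purely hypothetical.

```python
# Illustrative only: make a replica consistent by applying its log.
# Committed transactions are rolled forward; in-flight ones are discarded.

def recover_replica(data_pages, log_records):
    """data_pages: dict page_no -> bytes.
    log_records: list of ('UPDATE', txn, page_no, new_bytes) or ('COMMIT', txn)."""
    committed = {txn for op, txn, *rest in log_records if op == 'COMMIT'}
    for op, txn, *rest in log_records:
        if op == 'UPDATE' and txn in committed:
            page_no, new_bytes = rest
            data_pages[page_no] = new_bytes   # roll forward committed work
        # updates belonging to uncommitted (in-flight) transactions are ignored
    return data_pages
```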
Tracking Changes After PIT Creation
[Diagram: at the PIT, Source = Target; later, Source ≠ Target; after a resynch, Source = Target again]
Local Replication Technologies
• Host based
  • Logical Volume Manager (LVM) based mirroring
  • File system snapshots
• Storage array based
  • Full volume mirroring
  • Full volume: Copy on First Access
  • Pointer based: Copy on First Write
Logical Volume Manager: Review
• Host resident software responsible for creating and controlling host level logical storage
• The physical view of storage is converted to a logical view by mapping: logical data blocks are mapped to physical data blocks
• The logical layer resides between the physical layer (physical devices and device drivers) and the application layer (the OS and applications see the logical view of storage)
• Usually offered as part of the operating system or as third party host software
• LVM components:
  • Physical Volumes
  • Volume Groups
  • Logical Volumes
[Diagram: LVM maps logical storage onto physical storage]
Volume Groups
• One or more Physical Volumes form a Volume Group
• LVM manages Volume Groups as a single entity
• Physical Volumes can be added to and removed from a Volume Group as necessary
• Physical Volumes are typically divided into contiguous equal-sized disk blocks
• A host will always have at least one volume group for the operating system
• Application and operating system data are maintained in separate volume groups
[Diagram: Physical Volumes 1, 2 and 3 grouped into a Volume Group of physical disk blocks]
Logical Volumes
[Diagram: Logical Volumes carved out of a Volume Group; each logical disk block maps to a physical disk block on Physical Volumes 1, 2 and 3]
Host Based Replication: Mirrored Logical Volumes
[Diagram: a Logical Volume on the host mirrored across Physical Volume 1 (PVID1) and Physical Volume 2 (PVID2), each carrying a VGDA]
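As a minimal sketch of the idea (the class and method names are illustrative, not any particular LVM's interface), a mirrored logical volume maps each logical block to the same offset on two physical volumes and sends every write to both copies:

```python
# Illustrative sketch of LVM-style mirroring: one logical volume,
# two physical copies. pv_a and pv_b stand in for physical volumes
# (e.g. open file objects); this is not any real LVM's API.

class MirroredLogicalVolume:
    def __init__(self, pv_a, pv_b, block_size=4096):
        self.copies = [pv_a, pv_b]            # the two copies of the mirror
        self.block_size = block_size

    def write_block(self, logical_block, data):
        offset = logical_block * self.block_size
        for pv in self.copies:                # every write goes to both copies
            pv.seek(offset)
            pv.write(data)

    def read_block(self, logical_block):
        pv = self.copies[0]                   # a read can be served by either copy
        pv.seek(logical_block * self.block_size)
        return pv.read(self.block_size)
```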
Host Based Replication: File System Snapshots
• Many LVM vendors allow the creation of file system snapshots while a file system is mounted
• File system snapshots are typically easier to manage than creating mirrored logical volumes and then splitting them
Host (LVM) Based Replicas: Disadvantages
• LVM based replicas add overhead on host CPUs
• If the host devices are already storage array devices, the added redundancy provided by LVM mirroring is unnecessary
  • The devices will already have some RAID protection
• Host based replicas can usually be presented back only to the same server
• Keeping track of changes after the replica has been created places an additional burden on the host
Storage Array Based Local Replication
• Replication performed by the array operating environment
• Replicas are on the same array
[Diagram: Source and Replica devices on the same array, accessed by a Production Server and a Business Continuity Server respectively]
Storage Array Based – Local Replication Example
• Typically, array based replication is done at the array device level
• Storage components used by an application/file system must be mapped back to the specific array devices used; those devices are then replicated on the array
[Diagram: File System 1 on Logical Volume 1 / Volume Group 1 uses host devices c12t1d1 and c12t1d2, which map to Source Vol 1 and Source Vol 2 on Array 1; these are replicated to Replica Vol 1 and Replica Vol 2]
Array Based Local Replication: Full Volume Mirror – Attached
[Diagram: Source (Read/Write) and Target (Not Ready) attached within the array]
Array Based Local Replication: Full Volume Mirror – Detached (PIT)
[Diagram: Source and Target detached at the point-in-time; both are Read/Write]
Array Based Local Replication: Full Volume Mirror
• For future re-synchronization to be incremental, most vendors have the ability to track changes at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
  • Tracking is typically done with some kind of bitmap
• The Target device must be at least as large as the Source device
  • For full volume copies, the minimum amount of storage required is the same as the size of the source
Copy on First Access (COFA)
• The Target device is made accessible for BC tasks as soon as the replication session is started
• The point-in-time is determined by the time of activation
• Can be used in Copy on First Access (deferred) mode or in Full Copy mode
• The Target device must be at least as large as the Source device
Copy on First Access: Deferred Mode
[Diagram: data is copied from Source to Target on the first write to the Source, the first write to the Target, or the first read from the Target]
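The following is a simplified sketch of the deferred mode just pictured (a Python dict stands in for each device and chunk granularity is abstracted away; nothing here is a real array interface): a chunk is copied from Source to Target only when it is first touched after activation.

```python
# Illustrative sketch of Copy on First Access, deferred mode. Chunks are
# copied from Source to Target only on the first write to the Source, the
# first write to the Target, or the first read from the Target.

class CofaSession:
    def __init__(self, source, target):
        self.source = source                  # dict: chunk_no -> data (stand-in device)
        self.target = target
        self.copied = set()                   # chunks already copied to the Target

    def _copy_if_needed(self, chunk_no):
        if chunk_no not in self.copied:
            self.target[chunk_no] = self.source.get(chunk_no)
            self.copied.add(chunk_no)

    def write_source(self, chunk_no, data):
        self._copy_if_needed(chunk_no)        # preserve the PIT image on the Target first
        self.source[chunk_no] = data

    def write_target(self, chunk_no, data):
        self._copy_if_needed(chunk_no)        # bring the original chunk over, then update
        self.target[chunk_no] = data

    def read_target(self, chunk_no):
        self._copy_if_needed(chunk_no)        # a first read also triggers the copy
        return self.target[chunk_no]
```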
Copy on First Access: Full Copy Mode
• On session start, the entire contents of the Source device are copied to the Target device in the background
• Most vendor implementations provide the ability to track changes made to the Source or Target
  • Enables incremental re-synchronization
Array: Pointer Based Copy on First Write
• Targets do not hold actual data; they hold pointers to where the data is located
• The actual storage requirement for the replicas is usually a small fraction of the size of the source volumes
• A replication session is set up between the Source and Target devices and started
• When the session is set up, depending on the specific vendor's implementation, a protection map is created for all the data on the Source device at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
• Target devices are accessible immediately when the session is started
• At the start of the session, the Target device holds pointers to the data on the Source device
Pointer Based Copy on First Write Example
[Diagram: the Target (virtual device) holds pointers to data on the Source device and in the Save Location]
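A simplified sketch of the mechanism (dicts stand in for the Source device and the save location; the names are illustrative, not a vendor API): the first write to a Source chunk after the PIT copies the original data to the save location, and target reads follow the pointer to either the Source or the save location.

```python
# Illustrative sketch of a pointer based Copy on First Write replica.
# The Target is virtual: it holds no data of its own.

class CofwSnapshot:
    def __init__(self, source):
        self.source = source                  # dict: chunk_no -> data
        self.save_location = {}               # pre-update images, filled on first write only

    def write_source(self, chunk_no, data):
        # First write to this chunk after the PIT: copy the original data to
        # the save location (and, conceptually, repoint the Target there).
        if chunk_no not in self.save_location:
            self.save_location[chunk_no] = self.source.get(chunk_no)
        self.source[chunk_no] = data          # later writes to the same chunk copy nothing

    def read_target(self, chunk_no):
        # Follow the pointer: save location if the chunk changed after the
        # PIT, otherwise straight through to the Source device.
        if chunk_no in self.save_location:
            return self.save_location[chunk_no]
        return self.source.get(chunk_no)
```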
Array Replicas: Tracking Changes
• Changes can occur to the Source/Target devices after the PIT has been created
• How, and at what level of granularity, should this be tracked?
  • Too expensive to track changes bit by bit
    • Would require an amount of storage equivalent to the source and the target just to record which bits changed
  • Based on the vendor, some level of granularity is chosen and a bitmap is created (one for the Source and one for the Target)
    • One could choose 32 KB as the granularity
    • For a 1 GB device, changes would be tracked for 32768 32 KB chunks
    • If any bit in a 32 KB chunk changes, the whole chunk is flagged as changed in the bitmap
    • The map for a 1 GB device would only take up 32768 bits / 8 / 1024 = 4 KB of space
Array Replicas: How Changes Are Determined
At PIT:      Source bitmap 0 0 0 0 0 0 0 0    Target bitmap 0 0 0 0 0 0 0 0
After PIT:   Source bitmap 1 0 0 1 0 1 0 0    Target bitmap 0 0 1 1 0 0 0 1
Resynch (logical OR of the two bitmaps): 1 0 1 1 0 1 0 1    (1 = changed, 0 = unchanged)
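A small sketch of this bookkeeping (illustrative names only, using one Python list element per bit for readability): each bitmap flags 32 KB chunks, so a 1 GB device needs 32768 bits, i.e. 4 KB, per map, and the resynch set is the logical OR of the two maps.

```python
# Illustrative sketch of bitmap based change tracking and resynchronization.

CHUNK_SIZE = 32 * 1024                        # 32 KB granularity
DEVICE_SIZE = 1024 * 1024 * 1024              # 1 GB device
NUM_CHUNKS = DEVICE_SIZE // CHUNK_SIZE        # 32768 chunks -> 32768 bits = 4 KB per bitmap

source_bitmap = [0] * NUM_CHUNKS              # flipped to 1 when a Source chunk changes
target_bitmap = [0] * NUM_CHUNKS              # flipped to 1 when a Target chunk changes

def mark_changed(bitmap, byte_offset):
    # Any write anywhere inside a chunk flags the whole chunk as changed.
    bitmap[byte_offset // CHUNK_SIZE] = 1

def chunks_to_resync(src_bits, tgt_bits):
    # Logical OR of the two maps: a chunk changed on either side is stale on
    # the Target and must be copied from the Source during resynchronization.
    return [i for i, (s, t) in enumerate(zip(src_bits, tgt_bits)) if s or t]
```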
Array Replication: Multiple PITs
[Diagram: a single Source device with point-in-time Target devices created at 06:00 A.M., 12:00 P.M., 06:00 P.M., and 12:00 A.M.]
Array Replicas: Ensuring Consistency
[Diagram: a replica that captures all dependent writes (1 to 4) is Consistent; a replica that captures only the later writes (3 and 4) but not the earlier ones (1 and 2) is Inconsistent]
Mechanisms to Hold I/O
• Host based
• Array based
• What if the application straddles multiple hosts and multiple arrays?
Array Replicas: Restore/Restart Considerations
• Production has a failure
  • Logical corruption
  • Physical failure of production devices
  • Failure of the production server
• Solution
  • Restore data from the replica to production
    • The restore is typically done incrementally, and applications can be restarted even before the synchronization is complete, leading to a very small RTO
  OR
  • Start production on the replica
    • Resolve issues with production while continuing operations on the replica
    • After issue resolution, restore the latest data on the replica back to production
Array Replicas: Restore/Restart Considerations
• Before a restore
  • Stop all access to the Production devices and the Replica devices
  • Identify the replica to be used for the restore
    • Based on RPO and data consistency
  • Perform the restore
• Before starting production on the replica
  • Stop all access to the Production devices and the Replica devices
  • Identify the replica to be used for the restart
    • Based on RPO and data consistency
  • Create a "Gold" copy of the replica
    • As a precaution against further failures
  • Start production on the replica
• RTO drives the choice of replication technology
Array Replicas: Restore Considerations
• Full volume replicas
  • Restores can be performed to either the original source device or to any other device of like size
    • Restores to the original source can be incremental in nature
    • Restores to a new device involve a full synchronization
• Pointer based replicas
  • Restores can be performed to the original source or to any other device of like size, as long as the original source device is healthy
    • The target only has pointers
      • Pointers to the source for data that has not been written to after the PIT
      • Pointers to the "save" location for data that was written to after the PIT
    • Thus, to perform a restore, even to an alternate volume, the source must be healthy in order to access the data that the target still points to on the source
Array Replicas: Which Technology?
• Full Volume Replica
  • The replica is a full physical copy of the source device
  • The storage requirement is identical to the source device
  • A restore does not require a healthy source device
  • Activity on the replica has no performance impact on the source device
  • Good for full backup, decision support, development, testing and restore to the last PIT
  • RPO depends on when the last PIT was created
  • RTO is extremely small
Array Replicas: Which Technology? (continued)
• Pointer based - Copy on First Write
  • The replica contains pointers to data
  • The storage requirement is a fraction of the source device (lower cost)
  • A restore requires a healthy source device
  • Activity on the replica will have some performance impact on the source
    • Any first write to the source or target requires data to be copied to the save location and the pointer to be moved to the save location
    • Any read I/O to data not in the save location must be serviced by the source device
  • Typically recommended if the changes to the source are less than 30%
  • RPO depends on when the last PIT was created
  • RTO is extremely small
Array Replicas: Which Technology? (continued)
• Full Volume - COFA Replicas
  • The replica only has data that was accessed
  • A restore requires a healthy source device
  • Activity on the replica will have some performance impact
    • Any first access on the target requires data to be copied to the target before the I/O to/from the target can be satisfied
  • Replicas created with COFA only are typically not as useful as replicas created in full copy mode; the recommendation is to use full copy mode if the technology allows it
Array Replicas: Full Volume vs. Pointer Based
Module Summary
Key points covered in this module:
• Replicas and the possible uses of replicas
• Consistency considerations when replicating file systems and databases
• Host and array based replication technologies
  • Advantages/disadvantages
  • Differences
  • Considerations
  • Selecting the appropriate technology
Check Your Knowledge
• What is a replica?
• What are the possible uses of a replica?
• What is consistency in the context of a database?
• How can consistency be ensured when replicating a database?
• Discuss one host based replication technology.
• What is the difference between full volume mirrors and pointer based replicas?
• What are the considerations when performing restore operations for each replication technology?
Apply Your Knowledge
Upon completion of this topic, you will be able to:
• List EMC's Local Replication solutions for the Symmetrix and CLARiiON arrays
• Describe EMC's TimeFinder/Mirror replication solution
• Describe EMC's SnapView Snapshot replication solution
EMC – Local Replication Solutions
• EMC Symmetrix Arrays
  • EMC TimeFinder/Mirror: full volume mirroring
  • EMC TimeFinder/Clone: full volume replication
  • EMC TimeFinder/Snap: pointer based replication
• EMC CLARiiON Arrays
  • EMC SnapView Clone: full volume replication
  • EMC SnapView Snapshot: pointer based replication
EMC TimeFinder/Mirror - Introduction
• Array based local replication technology for full volume mirroring on EMC Symmetrix storage arrays
  • Creates full volume mirrors of an EMC Symmetrix device within an array
• TimeFinder/Mirror uses special Symmetrix devices called Business Continuance Volumes (BCVs). BCVs:
  • Are devices dedicated for local replication
  • Can be dynamically, non-disruptively established with a Standard device, and subsequently split instantly to create a PIT copy of data
• The PIT copy of data can be used in a number of ways:
  • Instant restore: use BCVs as standby data for recovery
  • Decision support operations
  • Backup: reduce application downtime to a minimum (offline backup)
  • Testing
• TimeFinder/Mirror is available in both open systems and mainframe environments
EMC TimeFinder/Mirror – Operations
• Establish
  • Synchronizes the Standard volume to the BCV volume
  • The BCV is set to a Not Ready state when established
    • The BCV cannot be independently addressed
  • Re-synchronization is incremental
  • BCVs cannot be established to other BCVs
  • The establish operation is non-disruptive to the Standard device
    • Operations to the Standard can proceed as normal during the establish
[Diagram: STD device established to a BCV; a subsequent establish to the BCV is incremental]
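The state sketch below is a conceptual illustration of the establish/split life cycle described above; it is not the TimeFinder CLI or API, and it simplifies by tracking only changes made to the Standard device after a split (a real implementation also refreshes tracks changed on the BCV).

```python
# Conceptual sketch only, not EMC's TimeFinder interface. While established,
# the BCV mirrors the Standard device and is Not Ready to hosts; a split
# turns it into a usable PIT copy; a re-establish copies only changed tracks.

class BcvPair:
    def __init__(self, standard, bcv):
        self.standard = standard              # dict: track_no -> data
        self.bcv = bcv
        self.state = 'split'
        self.changed_tracks = set()           # Standard tracks changed since the split

    def establish(self):
        self.bcv.clear()                      # full establish: copy every track
        self.bcv.update(self.standard)
        self.changed_tracks.clear()
        self.state = 'established'            # BCV is Not Ready while established

    def write_standard(self, track_no, data):
        self.standard[track_no] = data
        if self.state == 'established':
            self.bcv[track_no] = data         # mirrored while established
        else:
            self.changed_tracks.add(track_no) # remembered for the next incremental establish

    def split(self):
        self.state = 'split'                  # BCV now holds an addressable PIT copy

    def incremental_establish(self):
        for track_no in self.changed_tracks:  # copy only what changed since the split
            self.bcv[track_no] = self.standard[track_no]
        self.changed_tracks.clear()
        self.state = 'established'
```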