Business Continuity – Local Replication Module 4.3
Local Replication
Upon completion of this module, you will be able to:
• Discuss replicas and the possible uses of replicas
• Explain consistency considerations when replicating file systems and databases
• Discuss host and array based replication technologies
  • Functionality
  • Differences
  • Considerations
  • Selecting the appropriate technology
What is Replication?
• Replica: an exact copy (in all details)
• Replication: the process of reproducing data
(Figure: Original → Replica)
Possible Uses of Replicas
• Alternate source for backup
• Source for fast recovery
• Decision support
• Testing platform
• Migration
Considerations
• What makes a replica good?
  • Recoverability: considerations for resuming operations with the primary
  • Consistency/restartability: how this is achieved by the various technologies
• Kinds of replicas
  • Point-in-Time (PIT) = finite RPO
  • Continuous = zero RPO
• How the choice of replication technology ties back into RPO/RTO
Replication of File Systems
File system buffers in host memory must be flushed to the physical volume before the replica is created; otherwise the replica will not contain the buffered writes.
(Figure: file system buffer in host memory flushed to the physical volume)
Replication of Database Applications
• A database application may be spread out over numerous files, file systems, and devices, all of which must be replicated
• Database replication can be offline or online
(Figure: the data and log components of the database)
Database: Understanding Consistency
• Databases/applications maintain integrity by following the "Dependent Write I/O Principle"
  • Dependent write: a write I/O that will not be issued by an application until a prior related write I/O has completed
  • A logical dependency, not a time dependency
• Inherent in all Database Management Systems (DBMS)
  • e.g., a page (data) write is a dependent write I/O issued only after a successful log write (see the sketch below)
• Applications can also take advantage of this principle
• Necessary for protection against local outages
  • A power failure leaves behind a dependent-write-consistent image
  • A restart transforms a dependent-write-consistent image into a transactionally consistent one
  • i.e., committed transactions are recovered; in-flight transactions are discarded
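To make the dependent write principle concrete, here is a minimal Python sketch (an illustration added for this module, not vendor or DBMS code; the file layout is hypothetical). The data-page write is not issued until the related log write has completed and reached stable storage:

```python
import os

def dependent_write(log_path, data_path, log_record, data_page):
    """Dependent Write I/O Principle: the data-page write is issued
    only after the related log write has completed durably."""
    # 1. Write the log record and force it to stable storage.
    with open(log_path, "ab") as log:
        log.write(log_record)
        log.flush()
        os.fsync(log.fileno())   # the prior related write has now completed

    # 2. Only now issue the dependent data-page write.
    with open(data_path, "ab") as data:
        data.write(data_page)
        data.flush()
        os.fsync(data.fileno())
```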
Database Replication: Transactions
(Figure: writes 1–4 flow from the database application through the buffer to the data and log devices)
Database Replication: Consistency
(Figure: writes 1–4 are present on both the source and the replica data/log devices; the replica is consistent)
Note: In this example, the database is online.
Database Replication: Consistency
(Figure: writes 3 and 4 reached the replica but writes 1 and 2 did not; the replica is inconsistent)
Note: In this example, the database is online.
Database Replication: Ensuring Consistency
• Offline replication
  • If the database is offline or shut down when the replica is created, the replica will be consistent
  • In many cases, creating an offline replica may not be viable due to the 24x7 nature of business
(Figure: with the database application offline, the source and replica data/log devices are consistent)
Database Replication: Ensuring Consistency
• Online replication
  • Some database applications allow replication while the application is up and running
  • The production database must be put in a state that allows it to be replicated while it is active
  • Some level of recovery must be performed on the replica to make it consistent
(Figure: while the database is online, the replica may be missing in-flight writes and is inconsistent until recovery is applied)
Database Replication: Ensuring Consistency
(Figure: after recovery is applied, writes 1–5 are present on both the source and the replica; the replica is consistent)
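The recovery step that makes an online replica consistent can be sketched as follows (a deliberately simplified illustration with hypothetical structures, not any DBMS's actual recovery code): committed work is rolled forward, in-flight work is discarded.

```python
from dataclasses import dataclass

@dataclass
class LogRecord:
    txn_id: int
    page: int
    value: str
    committed: bool   # True if the transaction committed before the PIT

def recover(data_pages, log):
    """Turn a dependent-write-consistent image into a transactionally
    consistent one: redo committed writes, discard in-flight ones."""
    for rec in log:
        if rec.committed:
            data_pages[rec.page] = rec.value   # roll forward
        # records of uncommitted (in-flight) transactions are discarded
    return data_pages

pages = {1: "old", 2: "old"}
log = [LogRecord(10, 1, "new", committed=True),
       LogRecord(11, 2, "new", committed=False)]
print(recover(pages, log))   # -> {1: 'new', 2: 'old'}
```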
Tracking Changes After PIT Creation
(Figure: at the PIT, Source = Target; later, Source ≠ Target as changes accumulate; after resynchronization, Source = Target again)
Local Replication Technologies
• Host based
  • Logical Volume Manager (LVM) based mirroring
  • File system snapshots
• Storage array based
  • Full volume mirroring
  • Full volume: Copy on First Access
  • Pointer based: Copy on First Write
Logical Volume Manager: Review
• Host-resident software responsible for creating and controlling host-level logical storage
• The physical view of storage is converted to a logical view by mapping: logical data blocks are mapped to physical data blocks
• The logical layer resides between the physical layer (physical devices and device drivers) and the application layer (the OS and applications see the logical view of storage)
• Usually offered as part of the operating system or as third-party host software
• LVM components:
  • Physical Volumes
  • Volume Groups
  • Logical Volumes
(Figure: the LVM sits between logical storage and physical storage)
Volume Groups
• One or more Physical Volumes form a Volume Group
• The LVM manages Volume Groups as a single entity
• Physical Volumes can be added to and removed from a Volume Group as necessary
• Physical Volumes are typically divided into contiguous, equal-sized disk blocks
• A host will always have at least one Volume Group for the Operating System
• Application and Operating System data are maintained in separate Volume Groups
(Figure: three Physical Volumes, each divided into physical disk blocks, forming one Volume Group)
Logical Volumes
(Figure: Logical Volumes carved out of a Volume Group; logical disk blocks on each Logical Volume map to physical disk blocks on the Physical Volumes)
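A small sketch makes the logical-to-physical mapping concrete (illustrative only; the round-robin layout is an assumption for the example, real LVMs use configurable extent-allocation policies):

```python
# Illustrative sketch (not vendor code): how an LVM-style mapping
# can translate a logical block address into a physical one.

class VolumeGroup:
    """A pool of physical volumes, each divided into
    equal-sized physical disk blocks."""
    def __init__(self, physical_volumes, blocks_per_pv):
        self.pvs = physical_volumes          # e.g. ["pv1", "pv2", "pv3"]
        self.blocks_per_pv = blocks_per_pv

class LogicalVolume:
    """Maps contiguous logical blocks onto physical blocks that
    may be spread across the volume group's physical volumes."""
    def __init__(self, vg, num_blocks):
        self.vg = vg
        # Simple round-robin placement across PVs, for illustration.
        self.map = [(vg.pvs[i % len(vg.pvs)], i // len(vg.pvs))
                    for i in range(num_blocks)]

    def locate(self, logical_block):
        """Return (physical_volume, physical_block) for a logical block."""
        return self.map[logical_block]

vg = VolumeGroup(["pv1", "pv2", "pv3"], blocks_per_pv=1024)
lv = LogicalVolume(vg, num_blocks=3000)
print(lv.locate(10))   # -> ('pv2', 3)
```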
Host Based Replication: Mirrored Logical Volumes
(Figure: a host Logical Volume mirrored across Physical Volume 1 (PVID1, VGDA) and Physical Volume 2 (PVID2, VGDA))
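A minimal sketch of the mirroring behavior (an illustration, not an actual LVM implementation): every logical write is duplicated to both physical volumes, and a read can be served from either copy.

```python
# Illustrative sketch of LVM mirroring: the LVM duplicates each
# write to both physical volumes of the mirrored logical volume.

class MirroredLogicalVolume:
    def __init__(self):
        self.pv1 = {}   # physical volume 1: block -> data
        self.pv2 = {}   # physical volume 2: block -> data

    def write(self, block, data):
        # Both mirrors receive the write.
        self.pv1[block] = data
        self.pv2[block] = data

    def read(self, block):
        # Either mirror can satisfy the read.
        return self.pv1.get(block, self.pv2.get(block))

lv = MirroredLogicalVolume()
lv.write(0, b"hello")
assert lv.read(0) == b"hello"
```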
Host Based Replication: File System Snapshots
• Many LVM vendors allow the creation of file system snapshots while the file system is mounted
• File system snapshots are typically easier to manage than creating mirrored logical volumes and then splitting them
Host (LVM) Based Replicas: Disadvantages
• LVM based replicas add overhead on host CPUs
• If the host devices are already storage array devices, the added redundancy provided by LVM mirroring is unnecessary
  • The devices will already have some RAID protection
• Host based replicas can usually be presented back only to the same server
• Keeping track of changes after the replica has been created is difficult
Storage Array Based Local Replication
• Replication is performed by the array operating environment
• Replicas are on the same array
(Figure: source and replica devices in one array, presented to a production server and a business continuity server respectively)
Storage Array Based – Local Replication Example
• Typically, array based replication is done at an array device level
• Need to map the storage components used by an application/file system back to the specific array devices used, then replicate those devices on the array
(Figure: File System 1 on Logical Volume 1 in Volume Group 1, built on host devices c12t1d1 and c12t1d2, maps to Source Vol 1 and Source Vol 2 on Array 1, which are replicated to Replica Vol 1 and Replica Vol 2)
Array Based Local Replication: Full Volume Mirror – Attached
(Figure: while attached, the Source is Read/Write and the Target is Not Ready)
Array Based Local Replication: Full Volume Mirror – Detached (PIT)
(Figure: once detached, both the Source and the Target are Read/Write; the Target holds a point-in-time copy)
Array Based Local Replication: Full Volume Mirror
• For future re-synchronization to be incremental, most vendors provide the ability to track changes at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
  • Tracking is typically done with some kind of bitmap
• The Target device must be at least as large as the Source device
  • For full volume copies, the minimum amount of storage required is the same as the size of the source
Copy on First Access (COFA)
• The Target device is made accessible for BC tasks as soon as the replication session is started
• The point-in-time is determined by the time of activation
• Can be used in Copy on First Access (deferred) mode or in Full Copy mode
• The Target device must be at least as large as the Source device
Copy on First Access: Deferred Mode
(Figure: three cases: a write to the Source, a write to the Target, and a read from the Target; in each case, the first access to a chunk after session start triggers a copy of that chunk's original data from Source to Target)
A sketch of this behavior follows below.
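A minimal sketch of deferred mode, under the assumption that data is tracked in fixed-size chunks (the names are illustrative, not a vendor API):

```python
# Illustrative sketch: Copy on First Access, deferred mode. A chunk
# is copied from source to target only when first touched after
# session start, preserving the PIT image on the target.

class CofaSession:
    def __init__(self, source):
        self.source = dict(source)   # chunk_id -> data at session start
        self.target = {}             # holds only chunks copied so far

    def _copy_if_needed(self, chunk):
        # First access to this chunk: copy the PIT data to the target.
        if chunk not in self.target:
            self.target[chunk] = self.source[chunk]

    def write_source(self, chunk, data):
        self._copy_if_needed(chunk)  # preserve PIT data before overwrite
        self.source[chunk] = data

    def write_target(self, chunk, data):
        self._copy_if_needed(chunk)
        self.target[chunk] = data

    def read_target(self, chunk):
        self._copy_if_needed(chunk)
        return self.target[chunk]
```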
Copy on First Access: Full Copy Mode
• On session start, the entire contents of the Source device are copied to the Target device in the background
• Most vendor implementations provide the ability to track changes:
  • Made to the Source or Target
  • Enables incremental re-synchronization
Array: Pointer Based Copy on First Write
• Targets do not hold actual data, but pointers to where the data is located
• The actual storage requirement for the replicas is usually a small fraction of the size of the source volumes
• A replication session is set up between the Source and Target devices and started
• When the session is set up, based on the specific vendor's implementation, a protection map is created for all the data on the Source device at some level of granularity (e.g., 512 byte block, 32 KB, etc.)
• Target devices are accessible immediately when the session is started
• At the start of the session, the Target device holds pointers to the data on the Source device
Pointer Based Copy on First Write Example
(Figure: the Target virtual device holds pointers that resolve either to the Source device or to the Save Location)
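Building on the figure, a minimal sketch of the mechanism (illustrative data structures, not vendor code): the first write to a chunk after the PIT moves the original data to the save location and repoints the target.

```python
# Illustrative sketch: pointer-based Copy on First Write. The target
# is virtual; each pointer resolves to the source or the save location.

class CofwSession:
    def __init__(self, source):
        self.source = dict(source)   # chunk_id -> current data
        self.save = {}               # PIT data displaced by later writes
        # Initially every target pointer resolves to the source.
        self.pointer = {c: "source" for c in source}

    def write_source(self, chunk, data):
        # First write after the PIT: move the original data to the
        # save location and repoint the target before updating.
        if self.pointer[chunk] == "source":
            self.save[chunk] = self.source[chunk]
            self.pointer[chunk] = "save"
        self.source[chunk] = data

    def read_target(self, chunk):
        # The PIT image is served from the source or the save location.
        if self.pointer[chunk] == "source":
            return self.source[chunk]
        return self.save[chunk]
```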
Array Replicas: Tracking Changes
• Changes can occur to the Source and Target devices after the PIT has been created
• How, and at what level of granularity, should this be tracked?
  • Too expensive to track changes bit by bit
    • Would require an equivalent amount of storage to record which bits changed, for each of the source and the target
  • Based on the vendor, some level of granularity is chosen and a bitmap is created (one for the Source and one for the Target)
    • One could choose 32 KB as the granularity
    • For a 1 GB device, changes would be tracked for 32,768 chunks of 32 KB each
    • If any bit in a 32 KB chunk changes, the whole chunk is flagged as changed in the bitmap
    • The map for a 1 GB device would take up only 32768 bits / 8 / 1024 = 4 KB of space
Array Replicas: How Changes Are Determined
At PIT:       Source bitmap  0 0 0 0 0 0 0 0    Target bitmap  0 0 0 0 0 0 0 0
After PIT:    Source bitmap  1 0 0 1 0 1 0 0    Target bitmap  0 0 1 1 0 0 0 1
Re-synch (Source to Target): logical OR of the two bitmaps = 1 0 1 1 0 1 0 1
(0 = unchanged, 1 = changed)
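The bitmap arithmetic and the OR-based re-synch decision can be sketched in a few lines (assuming the 32 KB granularity used in the example above):

```python
# Illustrative sketch of bitmap-based change tracking. The chunks to
# copy on re-synch are the logical OR of the source and target bitmaps.

CHUNK = 32 * 1024                    # 32 KB tracking granularity

def bitmap_size_bytes(device_bytes):
    """A 1 GB device -> 32768 chunks -> 4096 bytes (4 KB) of bitmap."""
    chunks = device_bytes // CHUNK
    return chunks // 8               # one bit per chunk

def chunks_to_copy(source_bits, target_bits):
    """Re-synch Source -> Target: any chunk changed on either side."""
    return [i for i, (s, t) in enumerate(zip(source_bits, target_bits))
            if s | t]

source = [1, 0, 0, 1, 0, 1, 0, 0]
target = [0, 0, 1, 1, 0, 0, 0, 1]
print(chunks_to_copy(source, target))   # -> [0, 2, 3, 5, 7]
print(bitmap_size_bytes(1 << 30))       # -> 4096
```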
Array Replication: Multiple PITs
(Figure: over a 24-hour clock, one Source device with multiple Target devices holding point-in-time copies taken at 06:00 A.M., 12:00 P.M., 06:00 P.M., and 12:00 A.M.)
Array Replicas: Ensuring Consistency
(Figure: two source/replica pairs: in one, the replica holds all dependent writes 1–4 and is consistent; in the other, some dependent writes are missing and the replica is inconsistent)
Mechanisms to Hold I/O
• Host based
• Array based
• What if the application straddles multiple hosts and multiple arrays?
Array Replicas: Restore/Restart Considerations
• Production has a failure
  • Logical corruption
  • Physical failure of production devices
  • Failure of the production server
• Solution
  • Restore data from the replica to production
    • The restore would typically be done incrementally, and applications can be restarted even before the synchronization is complete, leading to a very small RTO
  OR
  • Start production on the replica
    • Resolve issues with production while continuing operations on the replica
    • After issue resolution, restore the latest data on the replica back to production
Array Replicas: Restore/Restart Considerations
• Before a restore
  • Stop all access to the Production devices and the Replica devices
  • Identify the replica to be used for the restore
    • Based on RPO and data consistency
  • Perform the restore
• Before starting production on the replica
  • Stop all access to the Production devices and the Replica devices
  • Identify the replica to be used for the restart
    • Based on RPO and data consistency
  • Create a "Gold" copy of the replica
    • As a precaution against further failures
  • Start production on the replica
• RTO drives the choice of replication technology
Array Replicas: Restore Considerations
• Full volume replicas
  • Restores can be performed to either the original source device or to any other device of like size
    • Restores to the original source can be incremental
    • Restores to a new device involve a full synchronization
• Pointer based replicas
  • Restores can be performed to the original source or to any other device of like size, as long as the original source device is healthy
    • The target only holds pointers:
      • Pointers to the source for data that has not been written to since the PIT
      • Pointers to the save location for data that was written to since the PIT
    • Thus, to perform a restore to an alternate volume, the source must be healthy, so that data not yet copied over to the target can be accessed (see the sketch below)
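Continuing the CofwSession sketch from earlier (illustrative only), a restore of the PIT image to an alternate device might look like this; note how chunks that never changed must still be read from the source, which is why a healthy source is required:

```python
def restore_to(session, new_device):
    """Restore the PIT image to an alternate device. Chunks never
    written since the PIT still live only on the source, so the
    source device must be healthy for this to succeed."""
    for chunk, where in session.pointer.items():
        if where == "source":
            new_device[chunk] = session.source[chunk]  # needs the source
        else:
            new_device[chunk] = session.save[chunk]    # displaced PIT data
    return new_device

session = CofwSession({0: "a", 1: "b"})
session.write_source(0, "a2")          # chunk 0 moves to the save location
print(restore_to(session, {}))         # -> {0: 'a', 1: 'b'} (the PIT image)
```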
Array Replicas: Which Technology?
• Full volume replica
  • The replica is a full physical copy of the source device
  • The storage requirement is identical to the source device
  • A restore does not require a healthy source device
  • Activity on the replica has no performance impact on the source device
  • Good for full backup, decision support, development, testing, and restore to the last PIT
  • RPO depends on when the last PIT was created
  • RTO is extremely small
Array Replicas: Which Technology? (continued)
• Pointer based – COFW
  • The replica contains pointers to data
  • The storage requirement is a fraction of the source device (lower cost)
  • A restore requires a healthy source device
  • Activity on the replica has some performance impact on the source
    • Any first write to the source or target requires the data to be copied to the save location and the pointer moved to the save location
    • Any read I/O for data not in the save location must be serviced by the source device
  • Typically recommended if the changes to the source are less than 30%
  • RPO depends on when the last PIT was created
  • RTO is extremely small
Array Replicas: Which Technology?
• Full volume – COFA replicas
  • The replica only holds data that has been accessed
  • A restore requires a healthy source device
  • Activity on the replica has some performance impact
    • Any first access to the target requires the data to be copied to the target before the I/O to/from the target can be satisfied
  • Replicas created in COFA-only mode are typically not as useful as replicas created in full copy mode; the recommendation is to use full copy mode if the technology allows it
Array Replicas: Full Volume vs. Pointer Based

                      Full Volume                  Pointer Based
Required storage      100% of source               Fraction of source
Performance impact    None                         Some
RTO                   Very small                   Very small
Restore               Source need not be healthy   Requires a healthy source device
Data change           No limits                    < 30%
Module Summary
Key points covered in this module:
• Replicas and the possible uses of replicas
• Consistency considerations when replicating file systems and databases
• Host and array based replication technologies
  • Advantages/disadvantages
  • Differences
  • Considerations
  • Selecting the appropriate technology
Check Your Knowledge
• What is a replica?
• What are the possible uses of a replica?
• What is consistency in the context of a database?
• How can consistency be ensured when replicating a database?
• Discuss one host based replication technology.
• What is the difference between full volume mirrors and pointer based replicas?
• What are the considerations when performing restore operations for each replication technology?
Apply Your Knowledge
Upon completion of this topic, you will be able to:
• List EMC's Local Replication Solutions for the Symmetrix and CLARiiON arrays
• Describe EMC's TimeFinder/Mirror replication solution
• Describe EMC's SnapView Snapshot replication solution
EMC – Local Replication Solutions
• EMC Symmetrix arrays
  • EMC TimeFinder/Mirror: full volume mirroring
  • EMC TimeFinder/Clone: full volume replication
  • EMC TimeFinder/Snap: pointer based replication
• EMC CLARiiON arrays
  • EMC SnapView Clone: full volume replication
  • EMC SnapView Snapshot: pointer based replication
EMC TimeFinder/Mirror – Introduction
• Array based local replication technology for full volume mirroring on EMC Symmetrix storage arrays
• Creates full volume mirrors of an EMC Symmetrix device within an array
• TimeFinder/Mirror uses special Symmetrix devices called Business Continuance Volumes (BCVs). BCVs:
  • Are devices dedicated to local replication
  • Can be dynamically, non-disruptively established with a Standard device, then subsequently split instantly to create a PIT copy of data
• The PIT copy of data can be used in a number of ways:
  • Instant restore: use BCVs as standby data for recovery
  • Decision support operations
  • Backup: reduce application downtime to a minimum (offline backup)
  • Testing
• TimeFinder/Mirror is available in both open systems and mainframe environments
EMC TimeFinder/Mirror – Operations
• Establish: synchronize the Standard volume to the BCV volume
  • The BCV is set to a Not Ready state when established
    • The BCV cannot be independently addressed
  • Re-synchronization is incremental
  • BCVs cannot be established to other BCVs
  • The establish operation is non-disruptive to the Standard device
    • Operations to the Standard can proceed as normal during the establish
(Figure: STD established to a BCV; the first establish is a full synchronization, subsequent establishes are incremental)
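As a hypothetical sketch of the establish behavior described above (an illustration of the state transitions only, not EMC's software or its command interface):

```python
# Hypothetical sketch of a Standard/BCV pair. The first establish is a
# full synchronization; re-establishes copy only chunks changed since
# the last split. The BCV is Not Ready while established.

class TimeFinderPair:
    def __init__(self):
        self.state = "never_established"
        self.changed = set()            # chunks changed since last split

    def write_std(self, chunk):
        # The Standard stays online; changes are tracked for re-synch.
        self.changed.add(chunk)

    def establish(self):
        if self.state == "never_established":
            to_copy = "all chunks"      # first establish: full sync
        else:
            to_copy = self.changed      # re-establish: incremental
        self.changed = set()
        self.state = "established"      # BCV is now Not Ready
        return to_copy

    def split(self):
        # Split creates the PIT copy; the BCV becomes host-addressable.
        self.state = "split"

pair = TimeFinderPair()
print(pair.establish())    # -> 'all chunks'
pair.split()
pair.write_std(7)
print(pair.establish())    # -> {7} (incremental)
```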