Disk Based Disaster Recovery & Data Replication Solutions

Gavin Cole • Storage Consultant SEE

Agenda • Planning for a Disaster • Using Local Copies for Protection • Using Remote Mirroring for Protection • Conclusion

Disaster: any interruption in the normal access to a valid set of data used by applications and end users to execute mission-critical business processes for an unacceptable period of time. (Jon William Toigo, Chairman, Data Management Institute, 2005)

Source: Ontrack Data Report 2007

The threat of data loss can’t be ignored • Risks: • High cost of data loss and downtime • Insufficient data recovery plans and procedures • 70% of businesses fail after major data loss • Recovery overextends limited staff and financial resources • Needs: • High availability and fast data recovery • Seamless integration with existing IT infrastructures • Interoperability with current and future storage and computing systems

The Cost of Downtime • Forrester Consulting interviewed 138 companies • Estimated cost of 1 hour of downtime • Less than $10,000 / hr – 25% • $10,000 to $100,000 / hr – 33% • $100,000 to $500,000 / hr – 25% • $500,000 to $1 million / hr – 13% • Greater than $1 million / hr – 4% • 67% could not estimate the financial cost of downtime

Key definitions • Timeline: Last safe backup → Disaster → Resumption of normal business • Recovery Point Objective (RPO): the time between the last safe backup and the point in time of the disaster • Recovery Time Objective (RTO): the time elapsed from when the disaster occurred to the resumption of normal business activities
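
The two definitions above reduce to simple date arithmetic. The following is a minimal sketch, assuming a hypothetical incident and hypothetical targets (the timestamps, the 24-hour RPO target and the 4-hour RTO target are illustrative and do not come from the presentation):

```python
from datetime import datetime, timedelta


def achieved_rpo_rto(last_safe_backup, disaster, resumption):
    """Return (data-loss window, outage duration) as timedeltas."""
    if not (last_safe_backup <= disaster <= resumption):
        raise ValueError("expected last_safe_backup <= disaster <= resumption")
    rpo = disaster - last_safe_backup   # work done in this window is lost
    rto = resumption - disaster         # time the business was interrupted
    return rpo, rto


# Hypothetical incident: backup at 02:00, failure at 14:30, service back at 20:30.
backup = datetime(2008, 1, 15, 2, 0)
failure = datetime(2008, 1, 15, 14, 30)
restored = datetime(2008, 1, 15, 20, 30)
rpo, rto = achieved_rpo_rto(backup, failure, restored)

# Hypothetical objectives agreed with the business.
target_rpo = timedelta(hours=24)
target_rto = timedelta(hours=4)
print("RPO", rpo, "met" if rpo <= target_rpo else "missed")
print("RTO", rto, "met" if rto <= target_rto else "missed")
```

In this example 12.5 hours of work are lost, which is inside the assumed RPO target, while the 6-hour outage misses the assumed RTO target.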

Business Continuity • Planning to never go down • Always have access to information • Needs more than a good data recovery strategy • A disaster can be something as simple as a deleted file • Use disk duplication strategies • Mirroring • Snapshot • Remote replication

Complete data protection requires personalized and practical solutions • Figure labels: • Planning inputs: business goals, threats, budget realities, existing assets • Risk: operator error, hardware failure, local disaster, regional disaster • Disruption: seconds, minutes, hours, days • Scale: application, department, data center, enterprise

Seven Key Planning Steps • Business impact assessment • How long can I live without data? • Discovery • What data do I need first? • Budget • What is my data worth? • Role-based teams • Who are the key people? • Data protection • How do I protect what I need? • Logistics • What are the physical requirements? • Testing • Will my plan work?

Volume Copy Terms • Volume Copy: complete point-in-time replication of one volume (source) to another (target) within a storage subsystem • Source: volume that accepts host I/O and stores application data • Target (also called Copy or Clone): volume that maintains a copy of the data from the source • Copy Pair: a source volume and its target

How Volume Copy Works

Snapshot Terms • Snapshot: a logical point-in-time (PiT) image of another volume; the logical equivalent of a complete physical copy • Base Volume: the volume from which the Snapshot will be created • Repository: stores original blocks from the Base before they are overwritten with new data • Diagram labels: Storage System, Logical Disk Space, Physical Disk Space
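
The repository behaviour described above is commonly called copy-on-write. The sketch below models it with in-memory lists and dictionaries; the class and method names are hypothetical and are not taken from any storage product:

```python
class BaseVolume:
    """Volume that serves host I/O and can have snapshots taken of it."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Save the original block for every snapshot before overwriting it.
        for snap in self.snapshots:
            snap.preserve(index)
        self.blocks[index] = data


class Snapshot:
    """Logical point-in-time image; holds only blocks that changed later."""

    def __init__(self, base):
        self.base = base
        self.repository = {}  # block index -> original contents

    def preserve(self, index):
        # Only the first overwrite after the snapshot needs to be saved.
        if index not in self.repository:
            self.repository[index] = self.base.blocks[index]

    def read(self, index):
        # Changed blocks come from the repository, the rest from the base.
        if index in self.repository:
            return self.repository[index]
        return self.base.blocks[index]


base = BaseVolume(["a", "b", "c"])
snap = base.create_snapshot()
base.write(1, "B")
print(base.blocks, [snap.read(i) for i in range(3)], snap.repository)
```

The snapshot consumes physical space only for blocks that were overwritten after it was taken, which is why it is a logical rather than a physical copy.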

Snapshot Flow Chart

Using Volume Copy and Snapshot together • Copying the Snapshot creates a full PiT clone copy while I/O continues to the base volume (LUN)
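
A small simulation can show why the clone stays consistent even though host writes keep arriving during the copy. This is a sketch under simplifying assumptions: blocks are list items, the snapshot is taken when the repository is empty, and the helper names are hypothetical.

```python
base = ["a", "b", "c", "d"]   # live base volume blocks at snapshot time
repository = {}               # block index -> original block at snapshot time


def host_write(index, data):
    """Host write to the base volume; the original block is preserved first."""
    repository.setdefault(index, base[index])
    base[index] = data


def snapshot_read(index):
    """Read the point-in-time image: preserved original if any, else the base block."""
    return repository.get(index, base[index])


host_write(0, "A")            # a write lands before the copy starts
clone = []
for i in range(len(base)):
    if i == 2:
        host_write(3, "D")    # host I/O continues while the copy is running
    clone.append(snapshot_read(i))

print(clone)  # ['a', 'b', 'c', 'd']
print(base)   # ['A', 'b', 'c', 'D']
```

The clone reflects the volume exactly as it was when the snapshot was taken, while the base volume carries both later writes.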

Remote Volume Mirroring • Ongoing, real-time replication of a volume from one storage system to another

Remote Volume Mirroring Components • Primary volume: accepts read and write host I/O • Secondary volume: accepts read host I/O; accepts remote writes of data from the controller owner of the Primary volume

Remote Volume Mirroring Components • Mirror Repository volume: stores mirroring data, such as information about remote writes that have not completed • Mirror Pairs (Primary -> Secondary): V1 -> V1M, V2 -> V2M, V3 -> V3M

Synchronous Replication • The primary disk system acknowledges a host write only when the data has been successfully mirrored • Primary benefit • Ensures remote data is an exact replica of the local data • Note: only effective for campus area replication
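
The campus-area note follows from latency: every host write waits for a round trip to the remote site. The estimate below is a rough sketch that assumes light travels about 200 km per millisecond in optical fibre and a hypothetical 0.5 ms local write time; it ignores protocol overhead and equipment delays, so real figures would be higher.

```python
FIBRE_KM_PER_MS = 200.0  # assumption: roughly two thirds of the speed of light


def sync_write_latency_ms(local_write_ms, distance_km):
    """Host-visible latency of one synchronous write, propagation delay only."""
    round_trip_ms = 2 * distance_km / FIBRE_KM_PER_MS
    return local_write_ms + round_trip_ms


for km in (10, 100, 1000):
    latency = sync_write_latency_ms(0.5, km)
    print(f"{km:>5} km: {latency:.2f} ms per host write")
```

Under these assumptions a 10 km link adds 0.1 ms to each write, while a 1,000 km link adds 10 ms, which dominates the local write time.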

Asynchronous Write Mode • Allows the primary disk system to acknowledge a host write request before the data has been successfully mirrored • Primary benefits • Reduces impact of latency when replicating over longer distances • Provides performance improvement – compared to synchronous – for primary site I/O (disk system and application) • Enables effective replication over longer distances (WAN)
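
The difference from synchronous mode is the point at which the host receives its acknowledgement. The sketch below uses a hypothetical `AsyncMirror` class with an in-memory queue standing in for the record of remote writes that have not yet completed; it is an illustration, not a description of any product's internals.

```python
from collections import deque


class AsyncMirror:
    def __init__(self):
        self.primary = {}
        self.secondary = {}
        self.pending = deque()  # remote writes that have not completed yet

    def host_write(self, index, data):
        self.primary[index] = data
        self.pending.append((index, data))
        return "ack"  # acknowledged before the remote site has the data

    def drain(self, max_writes=None):
        """Send queued writes to the secondary; returns how many were sent."""
        sent = 0
        while self.pending and (max_writes is None or sent < max_writes):
            index, data = self.pending.popleft()
            self.secondary[index] = data
            sent += 1
        return sent


mirror = AsyncMirror()
mirror.host_write(0, "x")
mirror.host_write(1, "y")
print(len(mirror.pending), mirror.secondary)  # two writes queued, secondary still empty
mirror.drain()
print(len(mirror.pending), mirror.secondary)  # queue empty, secondary caught up
```

Anything still in the queue when the primary site is lost has not reached the secondary, so the host-side performance gain is traded against a non-zero recovery point.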

Preserved Write Order • Write operations to the secondary disk system match the I/O completion order on the local disk system • Also referred to as a consistency group • Primary benefit • Maintains data integrity in multi-LUN applications (databases) by eliminating out-of-order updates at the remote side that can cause logical corruption
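
One way to picture preserved write order is a sequence number shared by all volumes in the group, with the secondary applying writes strictly in that sequence. The sketch below assumes this mechanism purely for illustration; the class name, the sequence numbering and the volume names are hypothetical, and real disk systems may implement ordering differently.

```python
class OrderedSecondary:
    """Applies writes for a group of volumes strictly in sequence order."""

    def __init__(self):
        self.volumes = {}   # volume name -> {block index: data}
        self.next_seq = 1   # next sequence number allowed to be applied
        self.buffer = {}    # early arrivals waiting for their turn

    def receive(self, seq, volume, index, data):
        self.buffer[seq] = (volume, index, data)
        # Apply every write whose predecessors have all been applied.
        while self.next_seq in self.buffer:
            vol, idx, dat = self.buffer.pop(self.next_seq)
            self.volumes.setdefault(vol, {})[idx] = dat
            self.next_seq += 1


# A database writes to two LUNs in this completion order at the primary site.
writes = [
    (1, "log", 0, "begin"),
    (2, "data", 7, "new row"),
    (3, "log", 1, "commit"),
]

secondary = OrderedSecondary()
secondary.receive(*writes[2])   # arrives first over the link
secondary.receive(*writes[1])
print(secondary.volumes)        # {} : nothing applied out of order
secondary.receive(*writes[0])
print(secondary.volumes)        # all three writes applied in order
```

At no point does the secondary hold a commit record without the data it refers to, which is the logical corruption the slide warns about.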

Remote Volume Mirroring: Mirror Management • Role Reversal (from secondary to primary or vice versa) is user-initiated • If primary is also base volume for snapshots, role reversal will cause associated snapshots to fail • It is possible to force role change for the local volume if communication to the remote volume is down • Used in disaster recovery scenarios • Can prepare by mapping secondary volumes to hosts using Storage Partitions before they are promoted
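
The role-reversal rules above can be expressed as a small state model. This is a sketch only: `MirrorPair`, `reverse_roles` and the `force` flag are hypothetical names chosen for illustration and do not correspond to a real management interface.

```python
class MirrorPair:
    def __init__(self):
        # Viewed from the disaster recovery site: the local volume is the secondary.
        self.roles = {"local": "secondary", "remote": "primary"}
        self.link_up = True

    def reverse_roles(self, force=False):
        if self.link_up:
            # Normal, user-initiated reversal: both sides swap roles together.
            self.roles["local"], self.roles["remote"] = (
                self.roles["remote"],
                self.roles["local"],
            )
            return
        if not force:
            raise RuntimeError("remote volume unreachable; pass force=True")
        # Forced change: only the local volume changes role. The remote entry
        # keeps its last known role because it cannot be contacted.
        self.roles["local"] = (
            "primary" if self.roles["local"] == "secondary" else "secondary"
        )


pair = MirrorPair()
pair.link_up = False              # simulate loss of the primary site
pair.reverse_roles(force=True)    # promote the local secondary
print(pair.roles["local"])        # primary
```

After a forced promotion both sides may believe they are primary once the link returns, so the model keeps the remote role as "last known" rather than pretending it was updated.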

DR / HA Architecture • Diagram: Cluster 1 and Cluster 2 span Site A and Site B • Volume 1 & 2 replication between the sites • Volume 3 replication between the sites • M = mirror

Architecture Description • Site A and Site B each contain a copy of critical data • Critical data is copied in real time using the disk controllers • Minimal impact on server processing power • OS and key applications are clustered across both sites • If either site fails, the application transparently fails over to the remote site • Customers notice minimal disruption • Two-way disaster protection • Cluster 1 uses volumes V1 and V2: primary business is at Site A, mirrored to Site B for protection • Cluster 2 uses volume V3: primary business is at Site B, mirrored to Site A for protection • Sites are connected by a Fibre Channel network for performance • Could be connected by a long-distance IP network, but there will be a performance impact on replication

Could you survive a disaster? • 35% of companies have a plan • 60% of plans are never tested • Half the companies that suffer losses never recover • $10 – $50 K per MB to re-create data


Thank You Disaster Recovery Planning Gavin Cole gavin.cole@sun.com +33 6 70 72 99 53
