Backup of Distributed MySQL Applications: Taking snapshot of a thousand dancing dolphins • Chander Kant, CEO • Paddy Sreenivasan, VP Engineering • www.zmanda.com • Twitter: @zmanda
Zmanda • Worldwide Leader in Open Source Backup • 500,000+ Protected Systems • Open Source, Open APIs, Open Formats • Smashes traditional backup business model • MySQL Backup Specialist • Zmanda Recovery Manager for MySQL • Zmanda Cloud Backup
Protected by Zmanda: Subscribers of Enterprise Editions • Web and Media • Government • Research & Education • Telecom & Service Providers • Manufacturing & Services
Top 5 MySQL Backup Requirements • Backup live database with minimal impact on application and users • Versatile • Scale Out = Multitude of servers • Scale up = Large Databases with no increase in lock times • Backup of local or remote MySQL servers • Intelligent Recovery • Precise restore to a particular point-in-time or database event • Fast restore in case of failure • Global Enterprise Management • Manage all databases from a single entity • Backup automation from scheduling, monitoring to reporting • Easy to Use and Secure
Zmanda Recovery Manager for MySQL • ZRM remote to MySQL • ZRM local to MySQL • ZRM to MySQL Cluster • Enterprise-wide MySQL backup
Zmanda Recovery Manager (ZRM) for MySQL As easy as What, Where, When and How.
Backups of MySQL Running on Amazon EC2 (diagram labels: Zmanda Management Console, EC2, EBS, S3, Backup Catalog, Full Backups, Incremental Backups)
Blazing Fast Snapshot based Full Backups Scenario: • 100+GB of database growing into Terabytes • 24x7 application (i.e. no backup window) • Active OLTP workload • Need ability to restore to specific database event Solution: • Storage Snapshot + MySQL Logs + Automated point-and-click restore • Solaris 10 x86 • ZFS Snapshot • MySQL Enterprise 5.0 • ZRM • Raw copy speed of 500 GB/hr
Point-in-time Recovery • ZRM creates unified snapshots of the data and the MySQL binary log • For point-in-time recovery between T2 and T3, ZRM reads data from snapshot T2 and replays transactions from Binlog T3 up to the RPO • Note that ZRM can treat an in-place snapshot as a backup (which is ideal for EBS Snapshots)
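The recovery rule on this slide (take the latest snapshot at or before the target, then replay binary log events up to the target) can be sketched as follows. This is a minimal illustration, not ZRM's implementation: the `Snapshot` and `BinlogEvent` types, the integer timestamps, and the function name `plan_point_in_time_restore` are all assumptions made for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Snapshot:
    taken_at: int  # hypothetical: time the snapshot was taken (epoch seconds)
    name: str


@dataclass(frozen=True)
class BinlogEvent:
    timestamp: int  # hypothetical: commit time of the logged transaction
    statement: str


def plan_point_in_time_restore(
    snapshots: List[Snapshot], events: List[BinlogEvent], target: int
) -> Tuple[Snapshot, List[BinlogEvent]]:
    """Pick the base snapshot and the binlog events to replay on top of it."""
    ordered = sorted(snapshots, key=lambda s: s.taken_at)
    times = [s.taken_at for s in ordered]
    # Index of the last snapshot taken at or before the target time.
    idx = bisect_right(times, target) - 1
    if idx < 0:
        raise ValueError("no snapshot at or before the target time")
    base = ordered[idx]
    # Replay only events after the snapshot and up to the recovery point.
    replay = [
        e
        for e in sorted(events, key=lambda e: e.timestamp)
        if base.taken_at < e.timestamp <= target
    ]
    return base, replay
```

With snapshots at T1, T2 and T3, a target between T2 and T3 selects T2 as the base and replays only the events logged after T2 and up to the target.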
Backup & DR Needs for a large-scale MySQL Implementation • Application managers desire a point-in-time restore which is coordinated across multiple servers • IT managers want the configuration to be as identical as possible across all nodes, so that replacing a node is simple • Depending on the application, the retention policy could be several years • The overall application should be able to recover from multiple node failures, human errors or sabotage, and geographic problems (disaster, connectivity etc.)
Coordinated Backups vs. Coordinated Restore • Coordinated Backups • Back up all nodes consistent to a specific event • E.g. all rows are backed up until a specific Global Sequence Number (GSN), or a checkpoint event is created specifically for backup purposes • Cleanest backup images, but periodic hiccups
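One way to picture a GSN-based coordinated backup is sketched below. It assumes, purely for illustration, that every row carries a `gsn` field and that the cutoff is the lowest "latest GSN" reported by any node, so that every node has already seen everything up to the cutoff. The slides do not say how the GSN is chosen; both function names are hypothetical.

```python
from typing import Dict, List


def choose_backup_gsn(latest_gsn_per_node: Dict[str, int]) -> int:
    """Pick a cutoff that every node has already reached (assumed policy)."""
    if not latest_gsn_per_node:
        raise ValueError("no nodes reported a GSN")
    return min(latest_gsn_per_node.values())


def rows_for_backup(rows: List[dict], cutoff_gsn: int) -> List[dict]:
    """Keep only rows written at or before the cutoff GSN."""
    return [row for row in rows if row["gsn"] <= cutoff_gsn]
```

Each node would apply the same cutoff, which is what makes the resulting images consistent with one another.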
Coordinated Backups vs. Coordinated Restore • Coordinated Restore • Each individual node is backed up completely independently of the others • No checkpoint event • However, more processing is required at the time of recovery: a consistent restore point must be found across all shards • ZRM can be scripted to identify this point in the backed up binary logs for every shard • Visual log analyzer feature of ZRM helps DBAs to efficiently search for these points • Clock synchronization helps
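A rough sketch of locating the restore point in each shard's binary log, and of why clock synchronization matters. The event tuples, the `max_skew` parameter and the function name are assumptions for this example; this is not a ZRM script or API.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical event record: (timestamp, binlog file name, position in file),
# listed in log order for each shard.
Event = Tuple[int, str, int]


def restore_positions(
    shard_events: Dict[str, List[Event]], target_ts: int, max_skew: int = 0
) -> Dict[str, Optional[Tuple[str, int]]]:
    """For every shard, find the last binlog position safely before target_ts.

    max_skew is the assumed worst-case clock difference between shards; the
    cutoff is moved earlier by that amount so no shard overshoots the target.
    """
    cutoff = target_ts - max_skew
    result: Dict[str, Optional[Tuple[str, int]]] = {}
    for shard, events in shard_events.items():
        chosen: Optional[Tuple[str, int]] = None
        for ts, log_file, position in events:
            if ts > cutoff:
                break  # events are in log order; the rest are too late
            chosen = (log_file, position)
        result[shard] = chosen  # None: nothing to replay for this shard
    return result
```

The tighter the clock synchronization, the smaller `max_skew` can be, and the less work is discarded on restore.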
Case Study: ZRM configuration with MySQL Shards (diagram labels: 100 database nodes, LVM Snapshots, ZRM servers, Consolidated Meta ZRM server, NFS, Converted full and incremental backups, Shared Storage (with Deduplication), Remote Data Center)
Case Study: Restoration Scenarios • Recovery from application errors • Apply transactions for the node (or across nodes) • Recovery from failed disk or node • Apply full backup and incremental backups to latest checkpoint • ZRM provides portable backup images
Backup images • Local full backup image is an LVM snapshot on the local node • The LVM snapshot is converted into regular backups on a weekly basis in the background • The incremental backup data is available over NFS to ZRM meta backup server • The backup images and the catalog from shared storage are replicated to a remote datacenter
Backup policies • The full and incremental backups are compressed, unless deduplication-based storage is deployed • The shared storage for backups can use deduplication
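The compress-unless-deduplicated policy can be illustrated with a small sketch. The likely reasoning, which the slides do not spell out, is that compressed images tend to deduplicate poorly, so compression is skipped when the storage layer deduplicates. The function name and flag are hypothetical; only Python's standard `gzip` module is used.

```python
import gzip


def prepare_backup_image(data: bytes, dedup_storage: bool) -> bytes:
    """Return the bytes to write to backup storage under the stated policy."""
    if dedup_storage:
        # Leave the image uncompressed so the storage layer can deduplicate it.
        return data
    # No deduplication available: compress to save space.
    return gzip.compress(data)
```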
Restoration steps (Operator error) • Identify offending record change • Use Visual Log Analyzer of ZRM on hosts for the record • Reasonable time synchronization is helpful here • Identify prior event for the key • Use Search in Zmanda Management Console • Coordinated Restore Script • Application level script takes input from ZRM and commits new records for all affected nodes.
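The "identify prior event for the key" step amounts to a lookup over logged changes. The sketch below assumes the changes have already been extracted from the binary log into simple records; the `Change` type and `prior_change` function are illustrative names, not part of ZRM or the Zmanda Management Console.

```python
from typing import List, NamedTuple, Optional


class Change(NamedTuple):
    position: int  # hypothetical: binlog position of the change
    key: str       # hypothetical: record key touched by the change
    value: str     # hypothetical: value written by the change


def prior_change(changes: List[Change], offending_position: int) -> Optional[Change]:
    """Return the most recent earlier change to the same key, if any."""
    offending = next(
        (c for c in changes if c.position == offending_position), None
    )
    if offending is None:
        raise ValueError("offending position not found in the log")
    earlier = [
        c
        for c in changes
        if c.key == offending.key and c.position < offending_position
    ]
    return max(earlier, key=lambda c: c.position) if earlier else None
```

An application-level restore script could then re-commit the returned value for the affected key on every node that holds it.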
Restoration steps (Failed node) • Restore failed node to last available backup • Use Meta ZRM server for restoration • If a checkpoint is present, use Visual Log Analyzer of ZRM to identify the last restored checkpoint • Call Application level node synchronization procedure
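Restoring a failed node to the last available backup means applying the most recent full backup followed by every later incremental, in order. A minimal sketch of building that chain, assuming a hypothetical `Backup` record with a timestamp and a kind:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Backup:
    taken_at: int  # hypothetical: time the backup finished
    kind: str      # "full" or "incremental"
    label: str


def restore_chain(backups: List[Backup]) -> List[Backup]:
    """Latest full backup, then all later incrementals in time order."""
    ordered = sorted(backups, key=lambda b: b.taken_at)
    full_indexes = [i for i, b in enumerate(ordered) if b.kind == "full"]
    if not full_indexes:
        raise ValueError("no full backup available")
    start = full_indexes[-1]
    later = [b for b in ordered[start + 1:] if b.kind == "incremental"]
    return [ordered[start]] + later
```

After the chain is applied, the application-level synchronization procedure named on the slide brings the node up to date with its peers.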
Zmanda Cloud Backup (For MySQL on Windows) • Apps: Exchange, SQL Server, Oracle, SharePoint and MySQL • Compliant with EU Data Protection Directive 95/46 • Network Drive support • Logical full backups only • Can backup remote MySQL databases
Zmanda Recovery Manager in Action • More than one million new Athletes created every month. • Each with the ability to customize their avatars, accumulate game credits and buy virtual prizes. • Combination of users, identities, games-in-play, credits and prizes generates a lot of data at a very fast pace — all of which is core to the company's success. • Multiple Storage Engines: InnoDB, MyISAM and Archive • In addition to regular full backups, the company must complete an incremental backup of MySQL every 15 minutes.
Zmanda Recovery Manager in Action “ZRM helps us formalize and automate the backup process for all our production data, and consolidates all backups from different systems into one consistent platform.... Furthermore, the ZRM platform greatly simplified our production systems' recovery scenarios by reducing the number of steps required in the data recovery process.” Franck Leveneur, Senior Data Architect, Six Degrees Games, Inc