HBase User Group Meetup 10/29/12 Jesse Yates HBase Snapshots
So you wanna… • Prevent data loss • Recover to a point in time • Back up your data • Sandbox copy of data
a BIG Problem… • Petabytes of data • 100s of servers • At a single point in time • Millions of writes per second
Built-in • Export • MapReduce job against the HBase API • Output to a single sequence file • Copy Table • MapReduce job against the HBase API • Output to another table Yay • Simple • Heavily tested • Can do point-in-time Boo • Slow • High impact on a running cluster
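For reference, both built-in tools ship with HBase and run as MapReduce jobs from the command line. A minimal sketch, with made-up table and path names:
    # dump a table to sequence files in HDFS
    hbase org.apache.hadoop.hbase.mapreduce.Export myTable /backup/myTable
    # copy a table's contents into another (pre-created) table
    hbase org.apache.hadoop.hbase.mapreduce.CopyTable --new.name=myTableCopy myTable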
Replication • Export all changes by tailing the WAL Yay • Simple • Gets all edits • Minimal impact on the running cluster Boo • Must be on from the beginning • Can't turn it off and catch up later • No built-in point-in-time • Still need an ETL process to get multiple copies
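As a rough sketch of the setup cost (the peer id, ZooKeeper quorum, and table/family names here are hypothetical): replication has to be enabled cluster-wide and opted into per column family before the edits you care about are written:
    # 0.94-era: set hbase.replication=true in hbase-site.xml on both clusters,
    # then, in the HBase shell, point at the slave cluster's ZooKeeper quorum
    add_peer '1', 'zk1,zk2,zk3:2181:/hbase'
    # and opt each column family in
    alter 'myTable', {NAME => 'cf', REPLICATION_SCOPE => 1}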
(Facebook) Solution! [1] • Mozilla did something similar [2] [1] issues.apache.org/jira/browse/HBASE-5509 [2] github.com/mozilla-metrics/akela/blob/master/src/main/java/com/mozilla/hadoop/Backup.java
Facebook Backup • Copy existing HFiles and HLogs Yay • Goes through HDFS • Doesn't impact the running cluster • Fast • distcp is 100% faster than M/R through HBase Boo • Not widely used • Requires hardlinks • Recovery requires WAL replay • Point-in-time needs a filter
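The actual copy is plain HDFS work, so it can be as simple as a distcp of the table's files; the paths and cluster names below are hypothetical:
    hadoop distcp hdfs://prod-cluster:8020/hbase/myTable hdfs://backup-cluster:8020/backups/myTable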
Backup through the ages • Through HBase: Export, Copy Table, Replication • Through HDFS: Facebook backup, HBASE-50 (snapshots)
Hardlink workarounds • HBASE-5547: move deleted HFiles to a .archive directory • HBASE-6610: FileLink, equivalent to Windows link files Enough to get started…
Difficulties • Coordinating many servers • Minimizing unavailability • Minimizing time to restore • Gotta be fast
HBASE-50 & HBASE-6055
Snapshots • Fast: zero-copy of files • Point-in-time semantics • Part of how it's built • Built-in recovery • Make a table from a snapshot • SLA enforcement • Guaranteed max unavailability Coming in HBase 0.96!
Snapshot Types • Offline • Table is already disabled • Globally consistent • Consistent across all servers • Timestamp consistent • Point-in-time according to each server
Offline Snapshots • Table is already disabled • Requires minimal log replay • Especially if the table was cleanly disabled • Captures the state of the table when it was disabled • No need to worry about changing state Yay • Fast! • Simple!
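In the shell support that ships with the feature, an offline snapshot is simply a snapshot of a disabled table; the names here are hypothetical:
    disable 'myTable'
    snapshot 'myTable', 'myTable-offline-snapshot'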
Globally Consistent Snapshots • All regions block writes until everyone agrees to snapshot • Two-phase commit-ish • Time-bound to prevent infinite blocking • Unavailability SLA maintained per region • No flushing: it's fast!
Cross-Server Consistency Problems • General distributed coordination problems • Blocks writes while waiting for all regions • Limited by the slowest region • More servers = higher P(failure) • Stronger guarantees than HBase currently offers • Requires WAL replay to restore the table
Timestamp Consistent Snapshots • All writes up to a TS are in the snapshot • Leverages existing flush functionality • Doesn’t block writes • No WAL replay on recovery
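This flush-based, timestamp-consistent snapshot is the default for an enabled table in the shell; a minimal sketch with a hypothetical table name:
    # no disable needed: each region flushes, then its files are referenced
    snapshot 'myTable', 'myTable-snapshot'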
Diagram: Put/Get/Delete/Mutate → MemStore → timestamp in snapshot? • Yes → Snapshot Store • No → Future Store
Recovery • Export snapshot • Send a snapshot to another cluster • Clone snapshot • Create a new table from a snapshot • Restore table • Roll back a table to a specific state
Export Snapshot • Copy a full snapshot to another cluster • All required HFiles/HLogs • Lots of options • Fancy dist-cp • Fast! • Minimal impact on the running cluster
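As a sketch, the exporter runs as a MapReduce job from the command line; the snapshot name and destination here are made up:
    hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
      -snapshot myTable-snapshot \
      -copy-to hdfs://backup-cluster:8020/hbase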
Clone Table • New table from a snapshot • Create multiple tables from the same snapshot • Exact replica as of the point in time • Full read/write on the new table
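In the shell this is a single command; the clone shares the snapshot's HFiles until they diverge (names hypothetical):
    clone_snapshot 'myTable-snapshot', 'myTable-clone'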
Restore • Replace an existing table with a snapshot • Takes a snapshot of the current table first, just in case • Minimal overhead • Handles creating/deleting regions • Fixes META for you
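Restoring requires the table to be disabled first; a minimal shell sketch with hypothetical names:
    disable 'myTable'
    restore_snapshot 'myTable-snapshot'
    enable 'myTable'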
Goodies • Full support in shell • Distributed Coordination Framework • ‘Ragged Backup’ added along the way • Coming in next CDH • Backport to 0.94?
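For completeness, the shell support also covers snapshot housekeeping; a hypothetical session:
    list_snapshots
    delete_snapshot 'myTable-offline-snapshot'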
Special thanks! • Matteo Bertozzi • All the recovery code • Shell support • Jon Hsieh • Distributed two-phase commit refactor • All our reviewers… • Stack, Ted Yu, Jon Hsieh, Matteo
Thanks! Questions? Jesse Yates @jesse_yates jesse.k.yates@gmail.com