BASIC Regenerating Codes for Distributed Storage Systems

Kenneth Shum (joint work with Minghua Chen, Hanxu Hou and Hui Li)

Windows Azure data centers kshum

Inside a data center http://technoblimp.com

Data distribution
• Encode and distribute a data file to n storage nodes.
[Figure: data file “INC”]

Data collector
• A data collector can retrieve the whole file by downloading from any k storage nodes.
[Figure: the data collector recovers “INC”]

Three kinds of disk failures
• Transient error due to noise corruption
  • Repeat the disk access request
• Disk sector error
  • Partial failure
  • Detected and masked by the operating system
• Catastrophic error
  • Total failure, for instance due to the disk controller
  • The whole disk is regarded as erased
Aug 2013

Frequency of node failures
[Figure: number of failed nodes over a single month in a 3000-node production cluster of Facebook. From “XORing elephants: novel erasure codes for Big Data” by Sathiamoorthy et al.]

Outline of this talk
• Repetition scheme
• Traditional erasure-correcting codes
• Reed-Solomon codes
• Network-coding-based scheme
• BASIC regenerating codes

Distributed storage system
• Encode a data file and distribute it to n disks.
• (n,k) recovery property
  • The data file can be rebuilt from any k disks.
• Repair
  • If a node fails, we regenerate a new node by connecting to any d surviving disks and downloading data from them.
  • The aim is to minimize the repair bandwidth (Dimakis et al., 2007).
• A coding scheme with the above properties is called a regenerating code.

Repetition scheme
• GFS: replicate data 3 times
• Gmail: replicate data 21 times

2x repetition scheme
• Divide the data file into 2 parts, A and B, of 1G each.
• Four nodes store A, B, A and B; the data collector recovers A and B.
• Cannot tolerate double disk failures (for example, losing both copies of A).

Repair is easy for a repetition-based system
• The new node copies A (1G) from the surviving node that stores A.
• Repair bandwidth = 1G

Reed-Solomon code
• Divide the file into 2 parts, A and B.
• Four nodes store A, B, A+B and A+2B; the data collector recovers A and B.
• It can tolerate double disk failures.
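The any-two-of-four recovery on this slide can be checked with a minimal sketch. It assumes each part is a single symbol and that arithmetic is done modulo the prime 257; the slide does not name the field, and the names `P`, `NODES`, `encode`, `decode` and `repair` are hypothetical. The `repair` helper rebuilds a lost symbol by decoding both parts from two helper nodes and re-encoding, so in this sketch a repair reads as much data as a full file download.

```python
from itertools import combinations

P = 257  # assumed prime modulus; the slides do not specify the field

# Coefficients (a, b) of the symbol a*A + b*B stored on each of the 4 nodes.
NODES = [(1, 0), (0, 1), (1, 1), (1, 2)]


def encode(a, b):
    """Return the symbols stored on the four nodes: A, B, A+B, A+2B (mod P)."""
    return [(ca * a + cb * b) % P for ca, cb in NODES]


def decode(i, j, yi, yj):
    """Recover (A, B) from nodes i and j by solving a 2x2 linear system mod P."""
    (a1, b1), (a2, b2) = NODES[i], NODES[j]
    det = (a1 * b2 - a2 * b1) % P
    inv = pow(det, -1, P)  # modular inverse (Python 3.8+)
    a = ((yi * b2 - yj * b1) * inv) % P  # Cramer's rule for A
    b = ((a1 * yj - a2 * yi) * inv) % P  # Cramer's rule for B
    return a, b


def repair(failed, helpers, stored):
    """Rebuild the symbol of a failed node from two helpers: decode, then re-encode."""
    i, j = helpers
    a, b = decode(i, j, stored[i], stored[j])
    return encode(a, b)[failed]


stored = encode(42, 199)
# Every pair of nodes is enough to recover the file.
recovered = [decode(i, j, stored[i], stored[j]) for i, j in combinations(range(4), 2)]
```

All six pairs of coefficient vectors have a nonzero determinant modulo an odd prime, which is why any two nodes suffice here.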

Repair requires essentially decoding the whole file
• The new node rebuilds A by downloading 1G from each of two surviving nodes (B and A+B in the figure).
• Repair bandwidth = 2G

BASIC regenerating code
• BASIC: Binary Addition and Shift Implementable Convolutional
• Divide the data file into 4 parts of 0.5G each.
• Utilization of bit-wise shift in storage was proposed by Piret and Krol (1983), and Qureshi, Foh and Cai (2012).

Download from nodes 1 and 2
[Figure: data collector downloading from nodes 1 and 2; labels: 1G (twice), 0.5G (four times)]

Download from nodes 1 and 3
[Figure: data collector downloading from nodes 1 and 3; same labels]

Download from nodes 1 and 4
[Figure: data collector downloading from nodes 1 and 4; same labels]

Download from nodes 2 and 3
[Figure: data collector downloading from nodes 2 and 3; same labels]

Download from nodes 2 and 4
[Figure: data collector downloading from nodes 2 and 4; same labels]

Download from nodes 3 and 4
[Figure: data collector downloading from nodes 3 and 4; same labels]

Zigzag decoding à la Gollakota and Katabi (2008)
• Want to solve for P1 and P2.
[Figure: two received combinations, one of P1 with P2 and one of P1 with P2’]
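A minimal sketch of zigzag decoding on bit strings may help here. It assumes the two combinations are P1 XOR P2 and P1 XOR (P2 shifted by one bit); the exact combinations used in the talk are not recoverable from the transcript, and the names `combine` and `zigzag_decode` are hypothetical. Packets are held as Python integers, with bit 0 as the first bit of the packet.

```python
def combine(p1, p2):
    """Return y1 = P1 xor P2 and y2 = P1 xor (P2 shifted left by one bit)."""
    return p1 ^ p2, p1 ^ (p2 << 1)


def zigzag_decode(y1, y2, nbits):
    """Recover (P1, P2), each nbits long, by alternating between y2 and y1.

    Bit 0 of y2 is a clean bit of P1 because the shifted P2 has a zero there.
    Each recovered bit is then cancelled from the other combination to expose
    the next unknown bit.
    """
    p1 = 0
    p2 = 0
    prev_p2_bit = 0  # bit of P2 that overlaps bit i of P1 inside y2
    for i in range(nbits):
        p1_bit = ((y2 >> i) & 1) ^ prev_p2_bit  # from y2: p1[i] xor p2[i-1]
        p2_bit = ((y1 >> i) & 1) ^ p1_bit       # from y1: p1[i] xor p2[i]
        p1 |= p1_bit << i
        p2 |= p2_bit << i
        prev_p2_bit = p2_bit
    return p1, p2


y1, y2 = combine(0b1011, 0b0110)
p1, p2 = zigzag_decode(y1, y2, nbits=4)
```

Only XOR and shifts are used, and the shifted combination `y2` is one bit longer than the packets, which is the kind of small storage overhead that shifting introduces.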

Repair of BASIC regenerating code
[Figure: a new node downloading from the surviving nodes; node labels: XOR, bitwise shift and XOR (twice)]
• Repair bandwidth = 1.5G

Repair of BASIC regenerating code (interference alignment)
• Decode the blue and red packets by zigzag decoding.
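As a rough consistency check, the 1.5G repair bandwidth agrees with the minimum-storage point of the repair-bandwidth trade-off of Dimakis et al. (2007). The numbers below assume a file of size B = 2G (four parts of 0.5G), k = 2, and d = 3 helper nodes; d = 3 is an assumption, since the slides do not state how many nodes the new node contacts. Here α is the storage per node and γ is the total repair bandwidth.

```latex
\alpha = \frac{B}{k} = \frac{2\,\mathrm{G}}{2} = 1\,\mathrm{G},
\qquad
\gamma = \frac{d\,B}{k\,(d-k+1)} = \frac{3 \cdot 2\,\mathrm{G}}{2\,(3-2+1)} = 1.5\,\mathrm{G}
```

Under these assumptions each helper sends 0.5G, which sits between the 1G of the repetition example and the 2G of the Reed-Solomon example.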

Comparison of the three examples

Summary
• We can reduce repair bandwidth by network coding.
• BASIC regenerating codes
  • A failed storage node can be repaired by simple bit-wise shift and XOR operations.
  • Small storage overhead due to shifting.

References
• Piret and Krol, MDS convolutional codes, IEEE Trans. on Information Theory, 1983.
• Dimakis, Godfrey, Wainwright and Ramchandran, Network coding for distributed storage systems, INFOCOM, 2007.
• Gollakota and Katabi, Zigzag decoding: combating hidden terminals in wireless networks, Proc. ACM SIGCOMM, 2008.
• Qureshi, Foh and Cai, Optimal solution for the index coding problem using network coding over GF(2), Proc. IEEE Conf. on Sensor, Mesh and Ad Hoc Communications and Networks, 2012.
• Sung and Gong, A zigzag decodable code with MDS property for distributed storage systems, Proc. IEEE Int. Symp. on Information Theory, 2013.
• Hou, Shum, Chen and Li, BASIC regenerating code: binary addition and shift for exact repair, Proc. IEEE Int. Symp. on Information Theory, 2013.

Two modes of repair
• Exact repair
  • The content of the new node is exactly the same as the content of the failed node.
• Functional repair
  • Only requires that the (n,k) recovery property is preserved.
