Introduction to Erasure coding Kenji Kaneda
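Motivation and Purpose of This Talk • The term "erasure coding" appears frequently in papers on fault-tolerant file systems, e.g. Oceanstore, RAID, … • I do not know the details very well • How efficient is the algorithm? • How much effort does an implementation take? • This talk therefore looks into Reed-Solomon coding, one kind of erasure coding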
Outline • Problem Specification • General Strategy • Overview of Reed-Solomon Coding • An Example • Appendix: Galois Fields
Problem Specification (1/2) • Given • n data devices (D1, D2, …, Dn) • Each holds k bytes • m checksum devices (C1, C2, …, Cm) • Each holds k bytes • Example (figure): n=8 data devices D1, …, D8 and m=2 checksum devices C1, C2
Problem Specification (2/2) • Goal • Define the calculation of each Ci such that if up to m of the devices D1, D2, …, Dn, C1, C2, …, Cm fail, the failed devices can be reconstructed from the non-failed devices
An Example Configuration • “n+1-parity” coding (RAID Level 5) • m=1 • $c_{1,j} = d_{1,j} \oplus d_{2,j} \oplus \dots \oplus d_{n,j}$, where $c_{1,j}$ is the j-th byte of C1 and $d_{i,j}$ is the j-th byte of Di • (Figure: data devices D1, D2, …, Dn and a single checksum device C1)
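The parity rule above can be sketched in a few lines of Python. This is only an illustration: the function name `parity_device` and the device contents are made up for the example, and each device is modelled as a `bytes` object of equal length.

```python
from functools import reduce

def parity_device(data_devices):
    """Return C1: byte j of the result is the XOR of byte j of every data device."""
    # zip(*data_devices) walks the devices column by column (one byte position at a time).
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_devices))

# Hypothetical contents: n = 3 data devices holding k = 4 bytes each.
devices = [
    bytes([1, 2, 3, 4]),
    bytes([16, 32, 48, 64]),
    bytes([255, 0, 255, 0]),
]
c1 = parity_device(devices)
print(list(c1))  # [238, 34, 204, 68]
```

Because XOR is its own inverse, XOR-ing C1 with any n-1 of the data devices yields the remaining one, which is why a single parity device tolerates a single failure.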
Outline • Problem Specification • General Strategy • Overview of Reed-Solomon Coding • An Example • Appendix: Galois Fields
General Strategy (1/4) • Partition the storage devices into data devices D1, D2, …, Dn and checksum devices C1, C2, …, Cm
General Strategy (2/4) • Initialize the checksum devices C1, C2, …, Cm
General Strategy (3/4) • Update data and checksum devices • (Figure: an update to one data device triggers an update of every checksum device C1, C2, …, Cm)
General Strategy (4/4) • Recover storage devices from failures
Partitioning of Devices (1/2) • Break up each device into words • Size of each word is w bits • w is chosen by the programmer • (Figure: a device Di of k bytes divided into w-bit words)
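As a rough illustration of the partitioning step, the sketch below splits a device into w-bit words. The helper name `split_into_words` and the two-byte device are assumptions made for this example; the slides do not prescribe a bit order, so most-significant-bits-first is assumed here.

```python
def split_into_words(device, w):
    """Split a bytes object into w-bit words, most significant bits first."""
    total_bits = 8 * len(device)
    if total_bits % w != 0:
        raise ValueError("device size must be a multiple of w bits")
    value = int.from_bytes(device, "big")
    mask = (1 << w) - 1
    count = total_bits // w
    # Word i occupies bits [w*(count-1-i), w*(count-i)) of the big-endian integer.
    return [(value >> (w * (count - 1 - i))) & mask for i in range(count)]

# Hypothetical device of k = 2 bytes, split into words of w = 4 bits.
words = split_into_words(bytes([0x3D, 0x91]), 4)
print(words)  # [3, 13, 9, 1]
```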
Partitioning of Devices (2/2) • Henceforth we assume that each device holds just 1 word (for simplicity) • data words: d1, d2, …, dn • checksum words: c1, c2, …, cm • (Figure: each data device Dj holds the word dj and each checksum device Ci holds the word ci)
Calculation of Checksum • Define a coding function Fi(d1, d2, …, dn) for each checksum device Ci • Fi calculates the checksum word stored on Ci, i.e. ci = Fi(d1, …, dn) for i = 1, …, m • E.g., F1(d1, d2, …, dn) = d1 ⊕ d2 ⊕ … ⊕ dn
Update of Checksum • Define an update function Gi,j(dj, dj’, ci) • Gi,j calculates the new checksum word ci’ on Ci, given the current checksum word ci, when the data word on Dj is updated from dj to dj’ • E.g., G1,j(dj, dj’, c1) = c1 ⊕ dj ⊕ dj’ • (Figure: when d2 on D2 is updated to d2’, each checksum device computes ci’ = Gi,2(d2, d2’, ci))
Recovery from Failure • Restore the word of any failed data device Dj from the words on the non-failed devices • E.g., $d_j = d_1 \oplus \dots \oplus d_{j-1} \oplus d_{j+1} \oplus \dots \oplus d_n \oplus c_1$ • Re-compute any failed checksum device Ci with Fi
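For the single-parity case, the three operations (calculate, update, recover) fit in a few lines. The sketch below is illustrative only: the function names `f1`, `g1` and `recover` and the word values are chosen for this example, and indices are 0-based in the code while the slides count from 1.

```python
from functools import reduce
from operator import xor

def f1(data):
    """Coding function F1: XOR of all data words."""
    return reduce(xor, data)

def g1(d_old, d_new, c1):
    """Update function G1,j: remove the old word from the parity, add the new one."""
    return c1 ^ d_old ^ d_new

def recover(data, j, c1):
    """Restore the data word at index j from the other data words and c1."""
    others = data[:j] + data[j + 1:]
    return reduce(xor, others, c1)

data = [3, 13, 9]                 # hypothetical 4-bit data words
c1 = f1(data)                     # 7
c1 = g1(data[1], 1, c1)           # the second word changes from 13 to 1 -> parity 11
data[1] = 1
print(c1, recover(data, 2, c1))   # 11 9
```

Note that `g1` touches only the changed data word and the checksum word, so an update does not need to read the other data devices.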
Problem Restatement • Given n data words d1, d2, …, dn, all of size w • Define functions F and G to calculate and maintain the checksum words c1, c2, …, cm
Outline • Problem Specification • General Strategy • Overview of Reed-Solomon Coding • An Example • Appendix: Galois Fields
Overview of Reed-Solomon Coding • Using the Vandermonde matrix to calculate and maintain checksum words • Using Gaussian Elimination to recover from failures • Using Galois Fields to perform arithmetic
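Calculating and Maintaining Checksum Words • Define a coding function Fi and an update function Gi,j
Definition of Coding Function (1/2) • Define Fi to be a linear combination of the data words: $c_i = F_i(d_1, d_2, \dots, d_n) = \sum_{j=1}^{n} d_j f_{i,j}$ • Vector representation: $FD = C$
$$\begin{pmatrix} f_{1,1} & f_{1,2} & \cdots & f_{1,n} \\ f_{2,1} & f_{2,2} & \cdots & f_{2,n} \\ \vdots & \vdots & & \vdots \\ f_{m,1} & f_{m,2} & \cdots & f_{m,n} \end{pmatrix} \begin{pmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{pmatrix} = \begin{pmatrix} c_1 \\ c_2 \\ \vdots \\ c_m \end{pmatrix}$$
Definition of Coding Function (2/2) • Define F to be the m×n Vandermonde matrix • $f_{i,j} = j^{i-1}$
$$F = \begin{pmatrix} f_{1,1} & f_{1,2} & \cdots & f_{1,n} \\ f_{2,1} & f_{2,2} & \cdots & f_{2,n} \\ \vdots & \vdots & & \vdots \\ f_{m,1} & f_{m,2} & \cdots & f_{m,n} \end{pmatrix} = \begin{pmatrix} 1 & 1 & \cdots & 1 \\ 1 & 2 & \cdots & n \\ \vdots & \vdots & & \vdots \\ 1 & 2^{m-1} & \cdots & n^{m-1} \end{pmatrix}$$
Definition of Update Function • Subtract out the portion of the checksum word that corresponds to dj • Add the required amount for dj’ • $G_{i,j}(d_j, d_j', c_i) = c_i + f_{i,j}(d_j' - d_j)$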
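The update rule follows directly from the definition of the coding function as a linear combination. A short derivation, using the summation index k (introduced here only to avoid clashing with the updated position j):

```latex
\begin{align*}
c_i  &= \sum_{k=1}^{n} f_{i,k}\, d_k \\
c_i' &= \sum_{k \neq j} f_{i,k}\, d_k + f_{i,j}\, d_j' \\
     &= \bigl(c_i - f_{i,j}\, d_j\bigr) + f_{i,j}\, d_j' \\
     &= c_i + f_{i,j}\,(d_j' - d_j) = G_{i,j}(d_j, d_j', c_i)
\end{align*}
```

For the first checksum word every coefficient is 1, and when addition and subtraction are both XOR this reduces to the parity update c1 ⊕ dj ⊕ dj’ used earlier.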
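Recovering from Failures (1/4) • Define the matrices A and E, where I is the n×n identity matrix: $A = \begin{pmatrix} I \\ F \end{pmatrix}$, $E = \begin{pmatrix} D \\ C \end{pmatrix}$ • Then $AD = E$
Recovering from Failures (2/4) • When devices fail, • Delete the corresponding rows from A and E • (Figure: the full system AD = E before the rows of the failed devices are deleted)
Recovering from Failures (3/4) • When devices fail, • Delete the corresponding rows from A and E • (Figure: the reduced system A’D = E’ after the rows of the failed devices are deleted)
Recovering from Failures (4/4) • The values of D are recovered from A’D = E’ using Gaussian elimination • E.g., if exactly m devices fail, A’ is an n×n matrix and $D = (A')^{-1}E'$ • A’ is non-singular because F is a Vandermonde matrix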
Problem with Arithmetic Operations (1/2) • Domain and range of the computation are binary words of a fixed length w • Not infinite precision real numbers
Problem with Arithmetic Operations (2/2) • The algebra is correct when all the elements are infinite precision real numbers We must make sure that it is correct for the fixed-size words
Naïve Solution and its Problem • Arithmetic over the integers modulo $2^w$ • Division is not defined for all pairs of elements • E.g., 3 ÷ 2 is undefined modulo $2^2$ (= 4), because no x satisfies 2x ≡ 3 (mod 4)
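The failure of division modulo 4 is easy to confirm by brute force. The snippet below is a small check written for this example, not part of the original slides.

```python
w = 2
modulus = 2 ** w  # 4

# 3 / 2 would be an x with 2 * x == 3 (mod 4); no such x exists.
solutions = [x for x in range(modulus) if (2 * x) % modulus == 3]
print(solutions)  # []

# Non-zero elements with no multiplicative inverse modulo 4.
non_invertible = [
    a for a in range(1, modulus)
    if all((a * x) % modulus != 1 for x in range(modulus))
]
print(non_invertible)  # [2]
```

Gaussian elimination has to divide by pivot elements, so a number system in which some non-zero element cannot be inverted is not usable for recovery.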
Our Solution • Perform addition/multiplication over a Galois Field
Mapping Between Elements of GF($2^w$) and Binary Words • $r(x) \in GF(2^w)$ ⇔ a binary word b of size w such that the i-th bit of b equals the coefficient of $x^i$ in r(x) • $r(x) = a_{w-1}x^{w-1} + a_{w-2}x^{w-2} + \dots + a_1 x + a_0$ • $b = a_{w-1}a_{w-2} \dots a_1 a_0$
Examples of Mapping (1/3) • $GF(2^2) = GF(2)[x]/(x^2+x+1)$
Examples of Mapping (2/3) • $GF(2^4) = GF(2)[x]/(x^4+x+1)$
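The mapping for both example fields can be enumerated mechanically by repeatedly multiplying by x and reducing modulo the field polynomial. The sketch below is an illustration under that assumption; the names `poly_str` and `mapping` are invented here, and the polynomials are encoded as bit masks (`0b111` for x^2+x+1, `0b10011` for x^4+x+1).

```python
def poly_str(b):
    """Render the binary word b as a polynomial in x."""
    terms = []
    for i in range(b.bit_length() - 1, -1, -1):
        if (b >> i) & 1:
            terms.append("1" if i == 0 else "x" if i == 1 else f"x^{i}")
    return " + ".join(terms) if terms else "0"

def mapping(w, poly):
    """Rows of (generated element, polynomial, binary word, decimal) for GF(2^w)."""
    rows = [("0", "0", format(0, f"0{w}b"), 0)]
    b = 1
    for j in range(2 ** w - 1):
        rows.append((f"x^{j}", poly_str(b), format(b, f"0{w}b"), b))
        b <<= 1                 # multiply by x
        if b & (1 << w):        # degree reached w: reduce modulo the polynomial
            b ^= poly
    return rows

for row in mapping(4, 0b10011):
    print(*row, sep="\t")
```

For example, the row for x^4 reads `x + 1`, `0011`, `3`, since x^4 = x + 1 modulo x^4+x+1.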
Addition/Subtraction over Binary Elements • Both are the XOR operation • Binary elements: 11 + 7 = 1011 ⊕ 0111 = 1100 = 12 • GF($2^w$): 11 + 7 = $(x^3+x+1) + (x^2+x+1) = x^3+x^2$ = 12
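In code, addition and subtraction are the same one-line operation. The names `gf_add` and `gf_sub` are chosen for this sketch.

```python
def gf_add(a, b):
    """Addition in GF(2^w): coefficient-wise addition modulo 2, i.e. bitwise XOR."""
    return a ^ b

# Every element is its own additive inverse, so subtraction is the same operation.
gf_sub = gf_add

print(gf_add(11, 7))  # 12
print(gf_sub(12, 7))  # 11
```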
Multiplication/Division over Binary Elements (1/4) • Convert the binary words to their polynomial elements • Multiply/divide the polynomials modulo a primitive polynomial q(x) • Convert the result back to a binary element • Binary elements: b1 * b2 = b3 • GF($2^w$): r1(x) * r2(x) = r3(x)
Multiplication/Division over Binary Elements (2/4) • Use two logarithm tables • gflog • Maps a binary element b to the power j such that $x^j$ is equivalent to b • gfilog • Maps a power j to its binary element b • (Table: gflog and gfilog for GF($2^4$))
Multiplication/Division over Binary Elements (3/4) • Convert each binary element to its discrete logarithm • By looking up gflog • Add/subtract the logarithms modulo $2^w - 1$ (note: $x^{2^w-1} \equiv 1 \pmod{q(x)}$) • Convert the result back to a binary element • By looking up gfilog
Multiplication/Division over Binary Elements (4/4) • Binary elements: 3 * 7 = gfilog[gflog[3] + gflog[7]] = gfilog[4 + 10] = gfilog[14] = 9 • GF($2^4$): 3 * 7 = $(x+1)(x^2+x+1) = x^{4+10} = x^{14} = x^3+1$ = 9
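The table-based procedure can be sketched as follows for GF(2^4) with q(x) = x^4+x+1. The function and variable names (`build_tables`, `GFLOG`, `GFILOG`, `gf_mul`, `gf_div`) are chosen for this example, and zero is handled separately because it has no logarithm.

```python
def build_tables(w, poly):
    """Build gflog (element -> power of x) and gfilog (power of x -> element)."""
    size = 1 << w
    gflog = [None] * size          # gflog[0] stays None: log of 0 is undefined
    gfilog = [None] * (size - 1)
    b = 1
    for j in range(size - 1):
        gflog[b] = j
        gfilog[j] = b
        b <<= 1                    # multiply by x
        if b & size:
            b ^= poly              # reduce modulo q(x)
    return gflog, gfilog

W = 4
ORDER = (1 << W) - 1               # 2^w - 1 = 15 non-zero elements
GFLOG, GFILOG = build_tables(W, 0b10011)

def gf_mul(a, b):
    if a == 0 or b == 0:
        return 0
    return GFILOG[(GFLOG[a] + GFLOG[b]) % ORDER]

def gf_div(a, b):
    if b == 0:
        raise ZeroDivisionError("division by zero in GF(2^w)")
    if a == 0:
        return 0
    return GFILOG[(GFLOG[a] - GFLOG[b]) % ORDER]

print(GFLOG[3], GFLOG[7], gf_mul(3, 7), gf_div(9, 7))  # 4 10 9 3
```

With the tables in place, a multiplication costs three lookups and one addition, which is the main reason for preferring them over polynomial arithmetic at run time.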
Summary of Algorithm • Choose w such that $2^w > n + m$ • Set up the tables gflog and gfilog • Set up the matrix F • Calculate the words of the checksum devices • If any number of devices up to m fails, • Choose any n of the remaining devices • Construct the matrices A’ and E’ • Solve for D in A’D = E’
Outline • Problem Specification • General Strategy • Overview of Reed-Solomon Coding • An Example • Appendix: Galois Fields
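An Example • Suppose n=3 and m=3
Step 1~3 • Choose w to be 4 (note: $2^w > n + m$ must hold) • Set up gflog and gfilog • Set up the 3×3 matrix F, defined over GF($2^4$)
$$F = \begin{pmatrix} 1^0 & 2^0 & 3^0 \\ 1^1 & 2^1 & 3^1 \\ 1^2 & 2^2 & 3^2 \end{pmatrix} = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 5 \end{pmatrix}$$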
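The matrix for this example can be generated rather than written by hand. The sketch below uses a shift-and-XOR multiplication instead of the log tables so that it stays self-contained; `gf_mul`, `gf_pow` and `vandermonde` are names made up for this illustration, with the defaults fixed to w = 4 and q(x) = x^4+x+1.

```python
def gf_mul(a, b, w=4, poly=0b10011):
    """Multiply in GF(2^w): carry-less product reduced modulo the field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << w):
            a ^= poly
    return result

def gf_pow(a, e):
    """Raise a to the non-negative integer power e by repeated multiplication."""
    result = 1
    for _ in range(e):
        result = gf_mul(result, a)
    return result

def vandermonde(m, n):
    """m x n matrix with entry f[i][j] = j^(i-1), rows and columns counted from 1."""
    return [[gf_pow(j, i - 1) for j in range(1, n + 1)] for i in range(1, m + 1)]

F = vandermonde(3, 3)
print(F)  # [[1, 1, 1], [1, 2, 3], [1, 4, 5]]
```

The last entry is 5 rather than 9 because 3 * 3 is computed in the field: (x+1)(x+1) = x^2+1.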
Step 4 • Calculate each word of the checksum devices using FD = C • d1=3, d2=13, d3=9 • c1 = (1)(d1) ⊕ (1)(d2) ⊕ (1)(d3) = 7 • c2 = (1)(d1) ⊕ (2)(d2) ⊕ (3)(d3) = 2 • c3 = (1)(d1) ⊕ (4)(d2) ⊕ (5)(d3) = 9
Step 5 • Change d2 to 1 • D2 sends the difference (1 − 13) = (0001 ⊕ 1101) = 12 to each checksum device • c1 = 7 ⊕ (1)(12) = 11 • c2 = 2 ⊕ (2)(12) = 9 • c3 = 9 ⊕ (4)(12) = 12
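The checksum words of this step can be reproduced with a short script. As before, this is a sketch: `gf_mul` is a shift-and-XOR multiplication for GF(2^4) with x^4+x+1, and `checksums` is a helper named for this example.

```python
def gf_mul(a, b, w=4, poly=0b10011):
    """Multiply in GF(2^w): carry-less product reduced modulo the field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << w):
            a ^= poly
    return result

def checksums(F, D):
    """Compute C = F D, with XOR as addition and gf_mul as multiplication."""
    C = []
    for row in F:
        c = 0
        for f, d in zip(row, D):
            c ^= gf_mul(f, d)
        C.append(c)
    return C

F = [[1, 1, 1], [1, 2, 3], [1, 4, 5]]
D = [3, 13, 9]
C = checksums(F, D)
print(C)  # [7, 2, 9]
```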
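The same update can be checked in code. The sketch assumes the update rule c’ = c + f(d’ − d) with XOR for both addition and subtraction; `update` is a name chosen here, and the coefficients 1, 2, 4 are the second column of F.

```python
def gf_mul(a, b, w=4, poly=0b10011):
    """Multiply in GF(2^w): carry-less product reduced modulo the field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << w):
            a ^= poly
    return result

def update(c, f, d_old, d_new):
    """Update function G: c' = c + f * (d_new - d_old), where + and - are XOR."""
    return c ^ gf_mul(f, d_new ^ d_old)

C = [7, 2, 9]            # checksum words before the change
column = [1, 2, 4]       # coefficients f[i][2] of the updated data word d2
delta = 1 ^ 13           # 12, the value that D2 sends
C_new = [update(c, f, 13, 1) for c, f in zip(C, column)]
print(delta, C_new)      # 12 [11, 9, 12]
```

Only the difference 12 has to travel to the checksum devices; neither the old nor the new value of d2 is needed there.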
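Step 6 • D2, D3, and C3 are lost
$$AD = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 1 \\ 1 & 2 & 3 \\ 1 & 4 & 5 \end{pmatrix} D = \begin{pmatrix} 3 \\ 1 \\ 9 \\ 11 \\ 9 \\ 12 \end{pmatrix} = E$$
Step 7 • D2, D3, and C3 are lost, so their rows are deleted from A and E
$$A'D = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 3 \end{pmatrix} D = \begin{pmatrix} 3 \\ 11 \\ 9 \end{pmatrix} = E'$$
Step 7 • Recovery
$$D = (A')^{-1}E' = \begin{pmatrix} 1 & 0 & 0 \\ 2 & 3 & 1 \\ 3 & 2 & 1 \end{pmatrix} \begin{pmatrix} 3 \\ 11 \\ 9 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \\ 9 \end{pmatrix}$$
• c3 = (1)(3) ⊕ (4)(1) ⊕ (5)(9) = 12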
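The recovery step can be sketched as Gauss-Jordan elimination carried out in GF(2^4). This is a minimal illustration, not a tuned implementation: `gf_mul`, `gf_inv` and `solve` are names invented here, the inverse is found by brute force (acceptable for a 16-element field), and the code assumes A’ is non-singular.

```python
def gf_mul(a, b, w=4, poly=0b10011):
    """Multiply in GF(2^w): carry-less product reduced modulo the field polynomial."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & (1 << w):
            a ^= poly
    return result

def gf_inv(a):
    """Multiplicative inverse in GF(2^4), found by exhaustive search."""
    for x in range(1, 16):
        if gf_mul(a, x) == 1:
            return x
    raise ZeroDivisionError("0 has no inverse")

def solve(A, E):
    """Solve A D = E over GF(2^4); assumes A is square and non-singular."""
    n = len(A)
    M = [row[:] + [e] for row, e in zip(A, E)]      # augmented matrix
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        inv = gf_inv(M[col][col])
        M[col] = [gf_mul(inv, v) for v in M[col]]   # scale pivot row to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                # Row subtraction is XOR in GF(2^w).
                M[r] = [v ^ gf_mul(factor, p) for v, p in zip(M[r], M[col])]
    return [M[r][n] for r in range(n)]

A_prime = [[1, 0, 0], [1, 1, 1], [1, 2, 3]]   # rows of D1, C1, C2
E_prime = [3, 11, 9]                          # surviving words d1, c1, c2
D = solve(A_prime, E_prime)
c3 = gf_mul(1, D[0]) ^ gf_mul(4, D[1]) ^ gf_mul(5, D[2])
print(D, c3)  # [3, 1, 9] 12
```

Once D is known, the lost checksum word c3 is simply recomputed from the third row of F, as in the last line of the slide.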