Failure Resilience
Failure Resilience in the Peer-to-Peer System OceanStore
Speaker: Corinna Richter
Outline
• Introduction
• An Overview of OceanStore
• Failure Resilience in OceanStore
• Byzantine Fault Protocol
• Proactive Threshold Signatures
• Erasure Coding
• Summary
• Questions
20/08/2014 Corinna Richter: Failure Resilience
Introduction
• Failure resilience: “A system responds according to the specification in spite of a limited number of faults”
• Availability
• Reliability
• How does this work in open peer-to-peer systems?
• Specific problems
• Solutions in OceanStore
OceanStore: Basics
• “Internet-scale, persistent data store designed for incremental scalability, secure sharing and long-term durability”
• Infrastructure is constantly changing and untrusted except in aggregate
[Figure: clients, inner rings, replicas, and archival storage. Source: John Kubiatowicz]
OceanStore: Inner Ring
• Primary replica for one data object
• serializes update actions for this object
• checks the correctness of the update
• “knows” the current version of the object
• implemented by a group of servers: distributed load, no “single point of failure”
• How are correct decisions reached if some hosts are faulty?
Byzantine Fault Protocol - Problem
• Byzantine faults vs. fail-stop processes
• Fail-stop (omission, crash): no reaction
• Byzantine faults: the reaction might be faulty
• How many faulty processes are tolerable?
• How can all correct processes (of the primary replica) reach the same decision?
• Illustration: the Byzantine Generals Problem
Byzantine Fault Problem - Model
[Figure: two scenarios with a commander and the processes P1 and P2 exchanging “Go!” and “Stop!” messages]
• Who is the traitor?
• The Byzantine fault problem has a solution only if fewer than one third of the processes are faulty!
Byzantine Fault Problem - A “proof” by intuition
[Figure: a client sends update X to the primary replica]
• f = 3, n = ?
• Up to 3 answers may be delayed or faulty
• The client cannot wait for more than n - 3 messages
• 3 of these n - 3 messages may still be faulty
• So we must have (n - 3) - 3 > 3, i.e. n > 9
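The argument on this slide uses f = 3. As a sketch, the same reasoning can be written for an arbitrary number f of faulty hosts, using only the slide's symbols n and f; the labels under the braces are paraphrases, not part of the original slides.

```latex
\begin{align*}
  \underbrace{(n - f)}_{\text{answers the client can wait for}}
  \;-\; \underbrace{f}_{\text{of these may be faulty}} &> f \\
  n - 2f &> f \\
  n &> 3f \quad\Longleftrightarrow\quad n \ge 3f + 1
\end{align*}
```

For f = 3 this gives n > 9, matching the slide, and it agrees with the earlier statement that fewer than one third of the processes may be faulty.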
Byzantine Fault Protocol
• Example: order of updates, i.e. the position of update X?
• Round 1: P1 sends its decision to the other n-1 processes
• Round 2: each of the n-1 processes sends the value it received to n-2 processes
• Round i: use the majority of round i-1
• After round f+1: P2: (i, i, k) => i and P3: (i, i, k) => i
[Figure: processes P1 to P4 exchanging the values i and k]
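The rounds above correspond to the classic oral-messages algorithm for the Byzantine Generals Problem. The following is a minimal simulation sketch of that algorithm, not OceanStore's actual protocol. The function names, the process numbering, and the way a traitor lies (it sends "k" to even-numbered receivers and "i" to odd-numbered ones) are assumptions made only for illustration.

```python
from collections import Counter


def majority(values):
    """Return the most common value in the list."""
    return Counter(values).most_common(1)[0][0]


def send(sender, receiver, value, traitors):
    """Deliver a value; a traitor (assumed behaviour) lies depending on the receiver."""
    if sender in traitors:
        return "k" if receiver % 2 == 0 else "i"
    return value


def om(commander, lieutenants, value, m, traitors):
    """Oral-messages algorithm OM(m): return the value each lieutenant settles on."""
    # Round 1: the commander sends its value to every lieutenant.
    received = {p: send(commander, p, value, traitors) for p in lieutenants}
    if m == 0:
        return received
    # Next round: every lieutenant relays what it received to the others via OM(m-1).
    relayed = {p: [] for p in lieutenants}
    for q in lieutenants:
        others = [p for p in lieutenants if p != q]
        for p, v in om(q, others, received[q], m - 1, traitors).items():
            relayed[p].append(v)
    # Each lieutenant takes the majority of its own value and the relayed ones.
    return {p: majority([received[p]] + relayed[p]) for p in lieutenants}


if __name__ == "__main__":
    # n = 4, f = 1: P1 proposes position "i", P4 is faulty.
    print(om(1, [2, 3, 4], "i", 1, traitors={4}))
    # The commander P1 itself is faulty; P2, P3 and P4 still agree with each other.
    print(om(1, [2, 3, 4], "i", 1, traitors={1}))
```

With one traitor among four processes, f + 1 = 2 rounds are enough for the correct processes to agree, which is the situation the slide's (i, i, k) => i example shows for P2.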
Byzantine Fault Problem - Solution in OceanStore
• How can a system guarantee that fewer than one third of the hosts are faulty?
• Other systems: reboot of a secure partition at regular intervals
• OceanStore: dynamically exchange the servers of the inner ring
• Responsible Party
• chooses the hosts for the inner ring
• analyses the stability of the hosts
• a system can have several Responsible Parties
BFP with signed messages: OceanStore
• Symmetric keys vs. asymmetric keys:
• MACs for the internal communication of the inner ring
• Public-key cryptography for the communication with others
• Proactive Threshold Signatures:
• One public key for all n hosts of the inner ring
• generate n = 3f+1 private key shares
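As a rough illustration of the symmetric-key half of this slide, the sketch below authenticates a message between two inner-ring hosts with an HMAC from Python's standard library. The slides do not say which MAC construction or key distribution OceanStore uses, so HMAC-SHA256, the shared key, and the message text are all assumptions.

```python
import hashlib
import hmac


def tag(key: bytes, message: bytes) -> bytes:
    """Compute a MAC over the message with the shared symmetric key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, received_tag: bytes) -> bool:
    """Check a received MAC in constant time."""
    return hmac.compare_digest(tag(key, message), received_tag)


if __name__ == "__main__":
    shared_key = b"hypothetical-inner-ring-key"  # assumed, for illustration only
    msg = b"update X at position i"
    t = tag(shared_key, msg)
    print(verify(shared_key, msg, t))                        # genuine message
    print(verify(shared_key, b"update X at position k", t))  # tampered message
```

A MAC is cheap to compute but can only be checked by someone who holds the shared key, which is why the slide reserves public-key signatures for communication with hosts outside the inner ring.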
Proactive Threshold Signatures - BFP in OceanStore
• f+1 private key shares are combined into a full signature
• at least one of these f+1 messages comes from a correct host
• all correct hosts work deterministically
• Exchanging the servers
• no interruption: the public key stays unchanged
• generate a new set of private key shares and delete the old set
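The two properties on this slide (any f+1 of the n = 3f+1 shares suffice, and shares can be renewed without changing the public key) can be illustrated with Shamir secret sharing over a prime field. This is a swapped-in technique chosen for brevity: it shares a secret number rather than producing threshold signatures, and it is not the scheme OceanStore uses. The field size and all function names are assumptions.

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime; the field size is an arbitrary choice here


def _eval_poly(coeffs, x):
    """Evaluate a polynomial (lowest-degree coefficient first) at x, modulo PRIME."""
    result = 0
    for c in reversed(coeffs):
        result = (result * x + c) % PRIME
    return result


def make_shares(secret, f, rng):
    """Split the secret into n = 3f + 1 shares; any f + 1 of them reconstruct it."""
    coeffs = [secret] + [rng.randrange(PRIME) for _ in range(f)]
    return {x: _eval_poly(coeffs, x) for x in range(1, 3 * f + 2)}


def refresh(shares, f, rng):
    """Proactive step: add shares of a random polynomial whose constant term is 0."""
    zero = [0] + [rng.randrange(PRIME) for _ in range(f)]
    return {x: (y + _eval_poly(zero, x)) % PRIME for x, y in shares.items()}


def reconstruct(points):
    """Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for xi, yi in points.items():
        num, den = 1, 1
        for xj in points:
            if xj != xi:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


if __name__ == "__main__":
    rng = random.Random(0)
    old = make_shares(123456789, f=1, rng=rng)        # n = 4 shares
    new = refresh(old, f=1, rng=rng)
    print(reconstruct({x: old[x] for x in (2, 4)}))   # any f + 1 = 2 old shares
    print(reconstruct({x: new[x] for x in (1, 3)}))   # same secret after refresh
```

Because the refresh polynomial is zero at x = 0, the shared secret (and therefore the corresponding public key) is unchanged, while a mix of old and new shares no longer interpolates to it. That is why deleting the old set matters when servers are exchanged.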
OceanStore: Update
[Figure labels: Write object x, Primary Replica, Archival storage, Secondary Dissemination Tree, Other users. Source: S. Rhea, P. Eaton, D. Geels, H. Weatherspoon, B. Zhao, and J. Kubiatowicz]
Erasure Coding - Motivation
• Data availability must be guaranteed
• Omission of hosts, crashes, etc.
• Redundancy of the data
• replicated, distributed data storage on several hosts
• Problem of naive replication
• inefficient with respect to the total storage consumed
• Solution: erasure coding
Erasure Coding
• Idea: divide one block of data into m fragments and encode these into n fragments (n > m). Distribute these n fragments arbitrarily across the hosts.
• r = m/n is the rate of encoding
• Storage costs are multiplied by n/m
• Example: m = 16, n = 32, r = 1/2, storage costs × 2
[Figure: m = 16 fragments are encoded into 32 fragments on distributed servers]
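To make the idea concrete, here is a toy erasure code: m data fragments plus one XOR parity fragment, so n = m + 1 and any single lost fragment can be rebuilt. This single-parity code is an assumption chosen for brevity and is much weaker than the m = 16, n = 32 example above, where any m fragments suffice. The function names are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally long byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(block: bytes, m: int) -> list:
    """Split the block into m equal fragments and append one parity fragment."""
    if len(block) % m != 0:
        raise ValueError("block length must be a multiple of m in this sketch")
    size = len(block) // m
    fragments = [block[i * size:(i + 1) * size] for i in range(m)]
    return fragments + [reduce(xor_bytes, fragments)]


def decode(fragments: list, m: int) -> bytes:
    """Rebuild the block from n = m + 1 fragments, at most one of which is None."""
    missing = [i for i, frag in enumerate(fragments) if frag is None]
    if len(missing) > 1:
        raise ValueError("a single-parity code tolerates only one lost fragment")
    fragments = list(fragments)
    if missing:
        present = [frag for frag in fragments if frag is not None]
        fragments[missing[0]] = reduce(xor_bytes, present)
    return b"".join(fragments[:m])


if __name__ == "__main__":
    frags = encode(b"OceanStore!!", m=4)  # n = 5, rate r = 4/5, storage cost x 5/4
    frags[2] = None                       # one host is unreachable
    print(decode(frags, m=4))
```

The rate and storage overhead follow the slide's formulas directly: r = m/n = 4/5 here, so the stored data grows by n/m = 1.25 instead of doubling as it would with a second full replica.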
Erasure Coding: Efficiency
• POND: Cauchy Reed-Solomon code with m = 16 and n = 32
• The data can be reconstructed from any m fragments
• complex algorithm for encoding and decoding
• Data availability is determined by the possible combinations of available fragments
• increased by a factor of 4000 for n = 32
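Why "any m of n fragments" helps availability can be sketched with a simple binomial model. The model assumes every host is reachable independently with the same probability p, which is a simplification not stated in the slides, and the value of p below is hypothetical. The sketch does not attempt to reproduce the factor of 4000 quoted above.

```python
from math import comb


def availability(n: int, m: int, p: float) -> float:
    """P(at least m of n fragments are reachable) with independent host uptime p."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(m, n + 1))


if __name__ == "__main__":
    p = 0.9  # assumed per-host availability
    # Both schemes below store twice the original data volume.
    print("2 full replicas:     ", availability(n=2, m=1, p=p))
    print("16-of-32 erasure code:", availability(n=32, m=16, p=p))
```

Under this model the erasure-coded block is unavailable only if more than half of the 32 hosts fail at once, whereas two replicas are lost as soon as both of their hosts fail.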
Erasure Coding: Disadvantages
• The primary replica has to compute the encoding and decoding of the fragments
• Very expensive operation!
• Decode only if there is no secondary replica for this object: whole-block caching
OceanStore: Dissemination Tree
• Tree structure for one data object
• root: primary replica
• nodes: secondary replicas in cache
• publication of updates down the tree
• self-organising structure
[Figure: a primary replica at the root with secondary replicas below it]
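The structure described above can be sketched as a small tree in which the root pushes each new version to its children. The class and method names are hypothetical, and the sketch leaves out the self-organising part (how replicas join or leave the tree), which the slides do not detail.

```python
class Replica:
    """One node of a dissemination tree for a single data object."""

    def __init__(self, name):
        self.name = name
        self.version = 0
        self.children = []

    def add_child(self, child):
        """Attach a secondary replica below this node and return it."""
        self.children.append(child)
        return child

    def push_update(self, version):
        """Apply an update locally, then publish it down the tree."""
        self.version = version
        for child in self.children:
            child.push_update(version)


if __name__ == "__main__":
    primary = Replica("primary")  # root: the inner ring
    a = primary.add_child(Replica("secondary-a"))
    b = a.add_child(Replica("secondary-b"))
    primary.push_update(1)
    print(a.version, b.version)
```

Updates enter only at the root, so every secondary replica receives them in the order the primary replica serialized them.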
Summary
• OceanStore: Internet-scale, global, persistent data store
• interesting solutions for failure resilience in peer-to-peer systems
• Proactive Threshold Signatures
• Byzantine Fault Protocol
• Erasure Coding
• Results of a prototype implementation
• Threshold signatures are not efficient to compute
• Further research based on the OceanStore API
Failure Resilience
Questions?