
Depot: Cloud Storage with minimal Trust



  1. COSC 7388 – Advanced Distributed Computing. Presentation by Sushil Joshi. Depot: Cloud Storage with Minimal Trust

  2. Agenda: Introduction; Typical Key-Value Store; Fork-Join-Causality Consistency; Architecture of Depot; Basic Protocol; Properties Provided by Depot; Experimental Evaluation.

  3. Introduction: Depot is a cloud storage system that minimizes trust. Storage Service Providers (SSPs) are fault-prone (software bugs, malicious insiders, operator errors, natural disasters). Depot eliminates trust for safety and minimizes trust for liveness and availability.

  4. Typical Key-Value Storage: GET and PUT APIs are available to customers. Most services store and retrieve data based on a primary key only. Such stores are not built on an RDBMS, since typical use cases do not require the complex querying and management facilities an RDBMS provides, and the excess functionality requires extra hardware and extra manpower. An RDBMS also chooses consistency over availability, and its partitioning scheme cannot be used for load balancing.
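
The GET/PUT interface above can be sketched as a minimal in-memory store; class and method names here are illustrative, not from the Depot paper:

```python
# Minimal sketch of a key-value store exposing only GET and PUT,
# as in the typical services described above: lookup by primary
# key only, no secondary indexes or complex queries.

class KeyValueStore:
    def __init__(self):
        self._data = {}  # primary key -> value

    def put(self, key, value):
        # Store or overwrite the value under its primary key.
        self._data[key] = value

    def get(self, key):
        # Retrieve by primary key only; returns None if absent.
        return self._data.get(key)

store = KeyValueStore()
store.put("user:42", b"profile-bytes")
print(store.get("user:42"))  # b'profile-bytes'
```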

  5. Consistency vs. Availability: Strong consistency and high data availability cannot be obtained together. Availability can be achieved with replicas and by allowing concurrent writes, but this leads to conflicting changes that need to be resolved. The problem: when are those conflicts resolved, and who resolves them? Eventually consistent: all replicas receive all updates eventually.

  6. Version Evolution of an Object: Vector clocks are used for version reconciliation; Sx, Sy, Sz are replicas of the data store. A write handled by Sx produces D1 ([Sx,1]); a second write handled by Sx produces D2 ([Sx,2]); concurrent writes handled by Sy and Sz produce D3 ([Sx,2][Sy,1]) and D4 ([Sx,2][Sz,1]); a reconciling write handled by Sx produces D5 ([Sx,3][Sy,1][Sz,1]).
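
The version-evolution example can be sketched with a few vector-clock operations; the function names are illustrative, but D3, D4, and D5 match the clocks above:

```python
# Sketch of vector-clock comparison and reconciliation for the
# example above (replicas Sx, Sy, Sz). Clocks are dicts mapping
# replica id -> counter.

def descends(a, b):
    """True if clock a dominates b (a has seen everything b has)."""
    return all(a.get(n, 0) >= c for n, c in b.items())

def concurrent(a, b):
    """Neither clock dominates the other: conflicting versions."""
    return not descends(a, b) and not descends(b, a)

def merge(a, b):
    """Pairwise maximum: the clock of a reconciled version."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}

d3 = {"Sx": 2, "Sy": 1}    # write handled by Sy
d4 = {"Sx": 2, "Sz": 1}    # write handled by Sz
assert concurrent(d3, d4)  # conflict: must be reconciled

d5 = merge(d3, d4)
d5["Sx"] += 1              # the reconciling write is handled by Sx
print(d5)                  # reconciled clock for D5
```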

  7. Gossip-Based Protocol: Every second, each peer chooses a random peer for a gossip exchange, which is used to propagate membership changes. Mappings stored at different nodes are reconciled during the same gossip exchange. Partitioning and placement information also propagates via the gossip-based protocol.
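
A gossip round of this kind can be sketched as follows; the reconciliation rule (keep the highest version per node) and all names are illustrative assumptions, not the exact protocol:

```python
# Sketch of gossip rounds: each node picks a random peer and the two
# reconcile their membership maps, keeping the freshest entry per node.
import random

def reconcile(a, b):
    # Each map: node_id -> version counter; keep the larger version.
    merged = dict(a)
    for node, ver in b.items():
        if merged.get(node, -1) < ver:
            merged[node] = ver
    return merged

def gossip_round(views):
    # views: node_id -> that node's current membership map
    for node in list(views):
        peer = random.choice([n for n in views if n != node])
        merged = reconcile(views[node], views[peer])
        views[node] = dict(merged)   # exchange is bidirectional:
        views[peer] = dict(merged)   # both sides end up reconciled

views = {"A": {"A": 3}, "B": {"B": 5}, "C": {"C": 1}}
for _ in range(10):  # a few rounds suffice for 3 nodes
    gossip_round(views)
print(views["A"])
```

After a handful of rounds every node's view converges to the same membership map, which is the property the protocol relies on.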

  8. Fork-Join-Causality (FJC) Consistency:
  Definition 1: An observer graph is an execution and an edge assignment.
  Definition 2: An execution is a set of read and write vertices. A read vertex is an (n, s, oID, val) tuple; a write vertex is an (n, s, oID, wl) tuple.
  Definition 3: An edge assignment for an execution is a set of directed edges connecting the vertices of the execution.
  Definition 4: A consistency check for consistency semantics C is the set of conditions that an observer graph must satisfy to be called consistent with respect to C.
  Definition 5: An execution α is C-consistent iff there exists an edge assignment for α such that the resulting observer graph satisfies C's consistency check.
  Definition 6: Vertex u precedes vertex v in observer graph G if there is a directed path from u to v in G. If u does not precede v and v does not precede u, then u and v are concurrent.
  Definition 7: An operation u is said to be observed by a correct node p in G if either p executes u or p executes an operation v such that u precedes v.
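
Definition 6's precedence relation is plain graph reachability, which can be sketched directly; the graph and vertex labels below are illustrative:

```python
# Sketch of Definition 6: u precedes v iff there is a directed path
# from u to v in the observer graph; if neither precedes the other,
# the two operations are concurrent.

def precedes(edges, u, v):
    """edges: dict mapping vertex -> set of successor vertices."""
    stack, seen = [u], set()
    while stack:
        x = stack.pop()
        for y in edges.get(x, ()):
            if y == v:
                return True
            if y not in seen:
                seen.add(y)
                stack.append(y)
    return False

def concurrent(edges, u, v):
    return not precedes(edges, u, v) and not precedes(edges, v, u)

# w1 -> w2 -> r1, with w3 branching off w1 (unordered w.r.t. w2)
g = {"w1": {"w2", "w3"}, "w2": {"r1"}}
assert precedes(g, "w1", "r1")   # path w1 -> w2 -> r1
assert concurrent(g, "w2", "w3") # no path either way
```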

  9. Fork-Join-Causality (FJC) Consistency: (a) an execution with a faulty node p2, and (b) an observer graph that is FJC- and FCC-consistent.

  10. Fork-Join-Causality (FJC) Consistency: An execution is FJC-consistent if the following hold in an observer graph G: serial ordering at each correct node, and reads by correct nodes return the latest preceding or concurrent writes. The observer graph in (b) is both FJC- and FCC-consistent because FJC and FCC do not require a total ordering of p2's operations.

  11. Architecture of Depot Arrows between servers indicate replication and exchange.

  12. Basic Protocol: In the event of an update to a key's value, a server exchanges an "update" with the other servers. Format: dVV, {key, H(value), localClock@NodeID, H(history)}, signed by the node. The logical clock is advanced on every update at NodeID and also on every successful update received from a peer (advanced beyond the peer's value). H(value) is a collision-resistant hash of the value, sent instead of the whole value. H(history) is a collision-resistant hash of the most recent update by each node known to the writer at the instant of issuing the update.
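
Constructing such an update can be sketched as below. This is an assumption-laden illustration: field names are invented, the history is represented as a plain dict, and an HMAC stands in for the node's public-key signature:

```python
# Sketch of building the signed update described above:
# {key, H(value), localClock@NodeID, H(history)}, signed by the node.
import hashlib, hmac, json

NODE_ID = "n1"
SECRET = b"demo-key"  # stand-in for the node's real signing key

def H(data: bytes) -> str:
    # Collision-resistant hash; the update carries H(value), not value.
    return hashlib.sha256(data).hexdigest()

def make_update(key, value, local_clock, history):
    body = {
        "key": key,
        "h_value": H(value),                      # hash, not the value
        "clock": f"{local_clock}@{NODE_ID}",      # localClock@NodeID
        # Hash over the most recent update known from each node.
        "h_history": H(json.dumps(history, sort_keys=True).encode()),
    }
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return body

u = make_update("photo:1", b"...", 7, {"n1": 6, "n2": 3})
print(u["clock"])  # 7@n1
```

A receiver would recompute the hashes and check the signature before accepting the update, which is what lets Depot detect forged or inconsistent histories.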

  13. Example of a series of writes

  14. At the end of W5, the history covers W0, W1, W2, W3, and W5.

  15. Properties Provided by Depot: Fork-Join-Causal Consistency; Eventual Consistency; Availability and Durability; Bounded Staleness; Integrity and Authorization; Data Recovery; Evicting Faulty Nodes.

  16. Baseline Variants for Experimental Evaluation: the baseline variants used for comparison with Depot.

  17. Mean and standard deviation for GETs and PUTs of various object sizes in Depot and four baseline variants Experimental Evaluation

  18. 99th Percentile for GETs and PUTs of various object sizes in Depot and four baseline variants Experimental Evaluation

  19. Baseline (B), B+Hash (H), B+H+Sig (S), B+H+S+Store (St), and Depot (D) in 100/0 (GET) and 0/100 (PUT) workloads with 10KB objects. Per Request Average Resource Use

  20. The labels indicate the absolute per-request averages. (C) and (S) indicate resource use at clients and servers, respectively. Per Request Average Resource Use

  21. (C-S) and (S-S) are client-server and server-server network use, respectively. For storage costs, we report the cost of storing a version of an object. Per Request Average Resource Use

  22. Dollar cost to GET 1TB of data, PUT 1TB of data, or store 1TB of data for 1 month. Each object has a small key and a 10KB value. 1TB of PUTs or GETs corresponds to 10^8 operations, and 1TB of storage corresponds to 10^8 objects. Evaluated Dollar Cost
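
The operation counts above follow from the object size; a quick check of the unit arithmetic, assuming decimal units (1 TB = 10^12 bytes, 10 KB = 10^4 bytes):

```python
# 1 TB of 10KB objects -> number of PUT/GET operations (or stored objects).
TB = 10**12
OBJECT_SIZE = 10 * 10**3  # 10 KB per object
print(TB // OBJECT_SIZE)  # 100000000, i.e. 10**8
```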

  23. The effect of total server failure (t=300s) on staleness Effect of Total Server Failure

  24. The effect of total server failure (t=300s) on GET Latency Effect of Total Server Failure

  25. Questions?

  26. References: [1] Prince Mahajan, Srinath Setty, Sangmin Lee, Allen Clement, Lorenzo Alvisi, Mike Dahlin, and Michael Walfish. Depot: Cloud Storage with Minimal Trust (extended version). [2] Giuseppe DeCandia, Deniz Hastorun, Madan Jampani, Gunavardhan Kakulapati, Avinash Lakshman, Alex Pilchin, Swaminathan Sivasubramanian, Peter Vosshall, and Werner Vogels. Dynamo: Amazon's Highly Available Key-value Store.
