Efficient Fork-Linearizable Access to Untrusted Shared Memory

  1. Efficient Fork-Linearizable Access to Untrusted Shared Memory • Presented by: Alex Shraer (Technion) • Joint work with: Christian Cachin (IBM Zurich) and Abhi Shelat (University of Virginia) • IBM Zurich Research Laboratory

  2. Data in a storage system • Users store data on remote storage • Data integrity: • Single user – hashing • Merkle hash trees for large data volume • Multi user – digital signatures • Public key infrastructure is needed • Data consistency? • What if the server is faulty?

  3. Model • Asynchronous system • Clients C1,…, Cn • correct • communicate only with the server via reliable links • each has a public/private key pair (stored data is signed) • each client executes read/write operations sequentially • Server S • emulates n SWMR (single-writer, multi-reader) registers: client Ci writes only to register i • CANNOT BE TRUSTED: possibly faulty

  4. Consistency Model? Attempt 1: linearizable shared memory • Requires a read to return the latest written value • Impossible when the server may be faulty • σ is linearizable if there exists a sequential permutation π of σ that preserves 1. the real-time order of σ and 2. the sequential specification

  5. Attempt 2: Sequential consistency • Read does not have to return latest value • For every process, the order in which its operations take effect must be preserved • Example: [Figure: timeline of C1 and C2 with the operations write(1, v), write(1, u), read(1) → u, read(1) → v]

  6. Sequential consistency – Cont. • Still impossible to implement! • Proof: [Figure: timeline. C1: write(1, u), read(2) → ⊥. C2: write(2, v), read(1) → ⊥]
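
The history in the proof figure can be checked mechanically. The following is a minimal sketch, not part of the original slides: it assumes each client first writes its own register and then reads the other one, as the figure labels suggest, and it brute-forces every interleaving to see whether any of them explains the returned values. The function names and the tuple encoding of operations are illustrative assumptions.

```python
from itertools import permutations

# Per-client histories read off the figure (assumed order: write, then read).
# An operation is (kind, register, value); None stands for the initial value ⊥.
HISTORY = {
    "C1": [("write", 1, "u"), ("read", 2, None)],
    "C2": [("write", 2, "v"), ("read", 1, None)],
}

def respects_program_order(order, history):
    """True if every client's operations appear in the order it issued them."""
    for client, ops in history.items():
        positions = [order.index((client, i)) for i in range(len(ops))]
        if positions != sorted(positions):
            return False
    return True

def satisfies_register_spec(order, history):
    """True if every read returns the latest value written before it."""
    memory = {}
    for client, i in order:
        kind, register, value = history[client][i]
        if kind == "write":
            memory[register] = value
        elif memory.get(register) != value:
            return False
    return True

def is_sequentially_consistent(history):
    """Try every interleaving; feasible only for tiny histories."""
    ops = [(c, i) for c, client_ops in history.items()
           for i in range(len(client_ops))]
    return any(
        respects_program_order(order, history)
        and satisfies_register_spec(order, history)
        for order in permutations(ops)
    )

print(is_sequentially_consistent(HISTORY))  # False
```

No interleaving works: C1's read of ⊥ forces C2's write to come later, and C2's read of ⊥ forces C1's write to come later, which contradicts both program orders at once.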

  7. Fork-Linearizability • Previously defined: (1) [Mazières, Shasha PODC 2002] (2) [Oprea, Reiter DISC 2006] • The definition we use, similar to (2): a sequence of events σ is fork-linearizable if for each client Ci there exists a subsequence σi of σ consisting only of completed operations and a sequential permutation πi of σi such that • All operations of client Ci are in σi • πi preserves the real-time order of σi • πi satisfies the sequential specification • If an operation op is in both πi and πk, then the sequence of operations that precede op is the same in both • Every client sees linearizable memory • By telling one lie to C1 & C2 and another to C3, the server “forks” their views [Figure: fork tree with one branch for C1 & C2 and another for C3; operations W1, R1, R2, W2, R3]
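
The last condition of the definition (views that share an operation must agree on everything before it) is easy to test on small examples. The sketch below is illustrative only: it checks that one condition and none of the others, and the operation identifiers and example views are hypothetical, not taken from the slides.

```python
def common_prefix_condition(views):
    """views maps a client name to its sequence (list of operation ids).

    Returns True if, for every operation that appears in two views,
    the operations preceding it are identical in both views.
    """
    clients = list(views)
    for a in range(len(clients)):
        for b in range(a + 1, len(clients)):
            va, vb = views[clients[a]], views[clients[b]]
            for op in set(va) & set(vb):
                if va[: va.index(op)] != vb[: vb.index(op)]:
                    return False
    return True

# Hypothetical forked views: C1 and C2 share a branch, C3 was told another story.
forked = {
    "C1": ["w1", "w2", "r1"],
    "C2": ["w1", "w2", "r1", "r2"],
    "C3": ["w1", "r3"],
}

# Hypothetical "join": C3 later sees w2 although its view already diverged.
joined = {
    "C1": ["w1", "w2", "r1"],
    "C3": ["w1", "r3", "w2"],
}

print(common_prefix_condition(forked))  # True
print(common_prefix_condition(joined))  # False
```

In the second example w2 is preceded by [w1] in one view and by [w1, r3] in the other, so the views were forked and then joined, which the definition forbids.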

  8. New on Fork-Linearizability • Fork-Linearizable Byzantine emulation (simplified) • every execution is fork-linearizable • if the server is correct: • every operation completes • every execution is linearizable • Global Fork-Linearizability: a simpler and equivalent definition

  9. Some Motivation • Fork-linearizability is the strongest known consistency notion that can be enforced with a possibly faulty server • Guarantees a linearizable view for each client, and linearizable executions when the server is correct • The server can hide operations from clients, but nothing worse! • If the server forks the views of C1 and C2, their views are forked ever after (no join), i.e., they do not see each other’s further updates • otherwise the run is not fork-linearizable, which can be detected by the clients (unlike in linearizability or sequential consistency) • Fork-linearizability is not a weak condition • Linearizability is stronger • (New) Sequential consistency is not! (proof in previous slides) [Figure: diagram relating Linearizable, Fork-Linearizable and Seq. Consistent]

  10. Emulating fork-linearizable memory requires waiting • Theorem: Every protocol has executions with a correct server where Ci must wait for Cj • Formal proof in the paper. The idea: • by contradiction, assume that no waiting is necessary • r’(1) must return v since w’(1, v) might have completed • The server can cause this run to be indistinguishable from Run 1 • v cannot be returned • r’(1) can return neither u nor v in Run 1, so it must wait… • Run 1 (correct server): C1: w(1, u), w’(1, v); C2: r(1) → u, r’(1) → ? • Run 2 (faulty server): C1: w(1, u), w’(1, v); C2: r(1) → u, r’(1) → v

  11. Protocols • Trivial method: Sign the complete history • Server sends history with all signatures • Client verifies all operations and signatures • Client adds its operation and signs new history • Message size proportional to system age • [Mazières, Shasha PODC 2002]: Use n “version vectors” • A blocking protocol and a concurrent protocol • Communication complexity Ω(n²) • Message size ~400MB for 10’000 users • Our results: • A blocking protocol and a concurrent protocol • Communication complexity O(n) • Message size ~40KB for 10’000 users
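
The two message sizes quoted above are consistent with simple arithmetic. The sketch below assumes 4 bytes per version-vector entry and counts only the vectors themselves (no signatures or headers); both are assumptions made here for illustration, since the slides do not state how the figures were derived.

```python
BYTES_PER_ENTRY = 4  # assumption: 32-bit counters, not stated in the slides

def message_size_bytes(n_clients, vectors_per_message):
    """Size of a message carrying `vectors_per_message` vectors of n entries."""
    return vectors_per_message * n_clients * BYTES_PER_ENTRY

n = 10_000
quadratic = message_size_bytes(n, vectors_per_message=n)  # n vectors of n entries
linear = message_size_bytes(n, vectors_per_message=1)     # a single vector (assumed)

print(quadratic / 1e6, "MB")  # 400.0 MB
print(linear / 1e3, "KB")     # 40.0 KB
```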

  12. Lock-Step with Correct Server (simplified) [Figure: message sequence between C1, C2 and a correct server, annotated with three-entry version vectors; the empty message fields carried version vectors in the slide. C1: <SUBMIT, WRITE, 1>, <REPLY, >, <COMMIT, , (val,1)>. C2: <SUBMIT, READ, 1>, <REPLY, , > with (val,1), <COMMIT, >]

  13. Lock-Step with Faulty Server (simplified) [Figure: message sequence between C1, C2 and a faulty server, annotated with three-entry version vectors; the empty message fields carried version vectors in the slide. C1: <SUBMIT, WRITE, 1>, <REPLY, >, <COMMIT, , (val,1)>. C2: <SUBMIT, READ, 1>, <REPLY, > with (⊥,0), <COMMIT, >]

  14. What happened? • Example 1: start (0 0 0) → write1(1, val) signs (1 0 0) → read2(1) → val signs (1 1 0) • Example 2: start (0 0 0) → write1(1, val) signs (1 0 0); read2(1) → ⊥ signs (0 1 0) • The ≥ relation: V ≥ V’ if for all j, V[j] ≥ V’[j] • B reads stale data of A ⇒ B signs a version structure which cannot be ordered with what A signed • Proof idea, based on the “No-Join” property: no operation signs a version structure V s.t. V ≥ VA and V ≥ VB • Subsequences σi can be constructed from the branch of Ci in the fork-tree
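
The ≥ relation and the "cannot be ordered" observation can be sketched in a few lines. The vectors below are read off the digits in the two examples on this slide (assumed to be three-entry version vectors, one entry per client); the function names are illustrative and not taken from the paper.

```python
def geq(v, w):
    """V >= W if V[j] >= W[j] for every index j."""
    assert len(v) == len(w)
    return all(a >= b for a, b in zip(v, w))

def comparable(v, w):
    """Two version vectors can be ordered if one dominates the other."""
    return geq(v, w) or geq(w, v)

def is_join(v, v_a, v_b):
    """True if V dominates both V_A and V_B (what "No-Join" rules out after a fork)."""
    return geq(v, v_a) and geq(v, v_b)

# Example 1: the read sees the write, so the signed vectors are ordered.
write_vec, read_vec = (1, 0, 0), (1, 1, 0)
print(comparable(write_vec, read_vec))  # True

# Example 2: the read returned stale data, so the signed vectors are incomparable.
v_a, v_b = (1, 0, 0), (0, 1, 0)
print(comparable(v_a, v_b))             # False

# A later vector dominating both branches would be a join of the forked views.
print(is_join((1, 1, 0), v_a, v_b))     # True
```

Incomparable signed vectors are the evidence of a fork: once v_a and v_b exist, a correct client would never sign a vector such as (1, 1, 0) that dominates both.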

  15. Increasing concurrency • Any protocol will sometimes block… • Concurrent protocol – details in the paper • Allow operations to complete concurrently • A read must wait for a previously scheduled concurrent write to the same register to complete (at most one such operation) • Message size: O(n)

  16. Summary of Results • On the notion of Fork-Linearizability • Global Fork-Linearizability • Fork-Linearizable Byzantine emulations • Comparing Fork-Linearizability with seq. consistency • Communication-efficient protocols • Lock-Step • Concurrent • A proof that any Fork-Linearizable Byzantine emulation must sometimes block • As in [MS02] and in our concurrent protocols • Questions?

