Explore the benefits of the Actor model for concurrent and distributed programming, including lightweight processes, asynchronous message passing, and coordination-free consistency.
Anna: A KVS For Any Scale • Chenggang Wu, Jose M. Faleiro, Yihan Lin, Joseph M. Hellerstein • UC Berkeley, Yale University, Columbia University • Presented by Cheng Li, Advanced Data Systems Lab, USTC • 2018-3-23
Concurrent programming • Programs designed as a collection of interacting computational processes that may be executed in parallel (Wikipedia) • Inter-dependent processes • executing simultaneously • affecting each other's work • must exchange information to do so
Threads & Shared Memory • Smallest executable unit is a thread • Communication is implemented by sharing variables
Threads & Shared Memory • Cons: • Complicated & error-prone client code • Not extendable to distributed programming • Threads are heavy-weight and do not scale well • Examples: • Standard concurrency libraries in Java, C#, etc.
Distributed programming • Figure: a client runs a nested transaction T = openTransaction; a.withdraw(10); b.withdraw(20); c.deposit(10); d.deposit(20); closeTransaction • Each call opens a sub-transaction (T1 to T4) on one of the servers X, Y, Z that hold accounts A, B, C, D
Distributed programming • Distributed transactions across machines • Strong assumptions: ACID properties • High cost for coordinating on conflicts and tolerating faults
Solutions • Wait-free execution • Replacing Threads + shared memory with Actor model • Coordination-free consistency • From strong ACID to weak ACI properties with replication
Preliminaries 1: Actor Model • Smallest executable unit is an actor • An actor is a concurrency primitive that does not share any resources with other actors • Communication is implemented by actors sending each other messages
Actors and event-driven programming • No “sleeping” or “waking up” • Actors become passive when no messages of interest remain to be processed • Passive actors activate immediately when an interesting message arrives
Asynchronous message passing • Each actor has a “mailbox” • If a message arrives while the actor is busy, it is stored in the actor's mailbox • When the actor reaches a point where it must wait for new messages to continue, it picks up the first suitable message and proceeds
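The mailbox behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `CounterActor`, its message tuples, and `demo` are hypothetical names, the actor gets its own OS thread (real actor runtimes multiplex many actors per thread), and messages are taken in plain FIFO order rather than by "first suitable message" selection.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, reachable only through its mailbox."""

    _STOP = object()  # sentinel message that shuts the actor down

    def __init__(self):
        self._mailbox = queue.Queue()  # messages wait here while the actor is busy
        self._total = 0                # private state, never shared directly
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()  # passive until a message arrives
            if message is self._STOP:
                break
            kind, payload = message
            if kind == "add":
                self._total += payload
            elif kind == "get":
                payload.put(self._total)   # reply by sending a message back

    def stop(self):
        self.send(self._STOP)
        self._thread.join()


def demo():
    actor = CounterActor()
    for n in (1, 2, 3):
        actor.send(("add", n))         # senders never block on the actor
    reply = queue.Queue()              # reply channel owned by the caller
    actor.send(("get", reply))
    total = reply.get()
    actor.stop()
    return total
```

Because the only way to touch `_total` is to send a message, no lock is needed around it; the mailbox serializes all access.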
Benefits of Actor model • Light-weight processes • 1 code-level thread = 1 operating system thread • 1 operating system thread = 1 active actor + any number of inactive actors • Natural extension to distributed environments • Local and remote actors “look the same” • Additional lookup and identification required
Preliminaries 2: ACI properties • ACID: Atomicity, Consistency, Isolation, Durability • ACI: Associativity, Commutativity, Idempotence
Preliminaries 2: ACI properties • Replicate data at many nodes • Performance: local reads • Fault-tolerance: no data loss unless all replicas fail or become unreachable • Availability: data still available unless all replicas fail or become unreachable • Scalability: load balance across nodes for reads • Updates • Push to all replicas • Consistency: expensive!
Conflicts • Updating replicas may lead to different results → inconsistent data • Figure: timelines for replicas s1 (3, 7, 5), s2 (3, 7, 5), and s3 (5, 7, 3)
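One way to read the conflict slide is sketched below. It assumes, purely for illustration, that the numbers are overwriting writes to a single object and that each replica applies them in the order they arrive; `apply_writes` and the variable names are hypothetical.

```python
def apply_writes(writes, initial=None):
    """Apply overwriting writes in arrival order; the last write wins locally."""
    value = initial
    for w in writes:
        value = w
    return value


# Arrival orders per replica (assumed reading of the figure).
arrival_order = {
    "s1": [3, 7, 5],
    "s2": [3, 7, 5],
    "s3": [5, 7, 3],
}

final_values = {name: apply_writes(ws) for name, ws in arrival_order.items()}

# Same set of writes, different orders, so the replicas end up disagreeing.
diverged = len(set(final_values.values())) > 1
```

All three replicas saw the same three writes, yet `s3` finishes with a different value because plain overwrites do not commute.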
Strong Consistency • All replicas execute updates in the same total order • Deterministic updates: same update on same objects → same result • Figure: timelines for replicas s1, s2, and s3, which coordinate on the order of the updates 3, 7, and 5
Strong Consistency • All replicas execute updates in the same total order • Deterministic updates: same update on same objects → same result • Requires coordination and consensus to decide on the total order of operations • N-way agreement, basically serializing updates → very expensive!
Eventual Consistency • If no new updates are made to an object, all replicas will eventually converge to the same value • Update locally and propagate • No consensus in the foreground → scales well for both reads and writes • Exposes intermediate state • Assumes eventual, reliable delivery • On conflict • Arbitrate & roll back
Eventual Consistency • If no new updates are made to an object, all replicas will eventually converge to the same value • Move consensus to background • However: • High complexity • Unclear semantics if the application reads data that is later rolled back!
Strong Eventual Consistency • Like eventual consistency but with deterministic outcomes of concurrent updates • No need for background consensus • No need to roll back • Available, fault-tolerant, scalable • But not general; works only for a subset of updates
CRDTs: Conflict-free Replicated Data Types • Data types whose operations (written ⊔ here) are • Associative: A ⊔ (B ⊔ C) = (A ⊔ B) ⊔ C • Commutative: A ⊔ B = B ⊔ A • Idempotent: A ⊔ A = A
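The algebraic laws can be checked by brute force on small samples. The sketch below uses two merge functions that are commonly used as CRDT building blocks (integer max and set union) plus a plain overwrite for contrast; idempotence is included because it is the "I" in the ACI properties from the preliminaries. All function names are hypothetical.

```python
from itertools import product


def merge_max(a, b):
    """Merge for a grow-only integer: keep the larger value."""
    return max(a, b)


def merge_union(a, b):
    """Merge for a grow-only set: keep everything either side has seen."""
    return a | b


def overwrite(a, b):
    """Plain assignment: the right-hand value replaces the left-hand one."""
    return b


def is_aci(merge, samples):
    """Check associativity, commutativity, and idempotence on all sample triples."""
    for a, b, c in product(samples, repeat=3):
        if merge(a, merge(b, c)) != merge(merge(a, b), c):
            return False  # not associative
        if merge(a, b) != merge(b, a):
            return False  # not commutative
        if merge(a, a) != a:
            return False  # not idempotent
    return True


ints = [0, 3, 5, 7]
sets = [frozenset(), frozenset({1}), frozenset({1, 2}), frozenset({3})]

max_ok = is_aci(merge_max, ints)
union_ok = is_aci(merge_union, sets)
overwrite_ok = is_aci(overwrite, ints)  # fails: overwrite(3, 5) != overwrite(5, 3)
```

A passing check on samples is evidence, not proof, but a failing check (as for `overwrite`) is a genuine counterexample.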
Portfolio of CRDTs • Counter: Unlimited, Non-negative • Map: Set of Registers • Register: Last-Writer Wins • Set: Grow-Only, 2P, Observed-Remove • …
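As one concrete entry from the portfolio, here is a minimal sketch of a counter that supports both increments and decrements, built in the style usually called a PN-Counter. Treating this as the "Unlimited" counter on the slide is an assumption, and the class and method names are hypothetical.

```python
class PNCounter:
    """Counter CRDT sketch: per-replica increment and decrement tallies."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.incs = {}  # replica id -> total increments issued by that replica
        self.decs = {}  # replica id -> total decrements issued by that replica

    def increment(self, n=1):
        self.incs[self.replica_id] = self.incs.get(self.replica_id, 0) + n

    def decrement(self, n=1):
        self.decs[self.replica_id] = self.decs.get(self.replica_id, 0) + n

    def value(self):
        return sum(self.incs.values()) - sum(self.decs.values())

    def merge(self, other):
        """Element-wise max of both tallies: associative, commutative, idempotent."""
        for mine, theirs in ((self.incs, other.incs), (self.decs, other.decs)):
            for rid, count in theirs.items():
                mine[rid] = max(mine.get(rid, 0), count)


def demo():
    a, b = PNCounter("a"), PNCounter("b")
    a.increment(5)   # update applied locally at replica a
    b.increment(2)   # concurrent updates at replica b
    b.decrement(3)
    a.merge(b)       # exchange state in both directions
    b.merge(a)
    a.merge(b)       # merging again changes nothing (idempotence)
    return a.value(), b.value()
```

Each replica only ever writes its own slot, so the element-wise max never loses an update regardless of the order in which states are exchanged.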
Observed-Remove Set • Sequential specification: • {true} add(e) {e ∈ S} • {true} remove(e) {e ∉ S} • {true} add(e) || remove(e) {????} • linearizable? • add wins? • remove wins? • last writer wins? • error state?
Observed-Remove Set • add(e): A ≔ A ∪ {(e, α)}, where α is a fresh unique tag • remove(e): R ≔ R ∪ {(e, –) ∈ A}, i.e. remove every tagged copy of e observed so far • lookup(e) = ∃ (e, –) ∈ A \ R • merge(S, S') = (A ∪ A', R ∪ R') • Concurrent case, add wins: {true} add(e) || remove(e) {e ∈ S}
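The four definitions translate almost directly into code. The following is a minimal sketch, assuming random UUIDs as the unique tags; `ORSet` and `demo` are hypothetical names, and tombstones are never garbage-collected here.

```python
import uuid


class ORSet:
    """Observed-Remove Set sketch: A holds tagged adds, R holds tombstones."""

    def __init__(self):
        self.added = set()    # A: (element, unique tag) pairs
        self.removed = set()  # R: pairs that were observed and then removed

    def add(self, e):
        # Every add gets a fresh tag, so it is distinct from all earlier adds.
        self.added.add((e, uuid.uuid4().hex))

    def remove(self, e):
        # Tombstone only the tagged copies this replica has observed so far.
        self.removed |= {pair for pair in self.added if pair[0] == e}

    def lookup(self, e):
        return any(el == e for el, _tag in self.added - self.removed)

    def merge(self, other):
        self.added |= other.added
        self.removed |= other.removed


def demo():
    s1, s2 = ORSet(), ORSet()
    s1.add("x")
    s2.merge(s1)      # both replicas now observe the first copy of "x"

    s1.add("x")       # concurrent: a new tagged copy at s1 ...
    s2.remove("x")    # ... while s2 removes only the copy it has seen

    s1.merge(s2)
    s2.merge(s1)
    after_concurrent = (s1.lookup("x"), s2.lookup("x"))

    s1.remove("x")    # sequential remove: every copy has been observed
    s2.merge(s1)
    after_sequential = (s1.lookup("x"), s2.lookup("x"))
    return after_concurrent, after_sequential
```

The concurrent add survives because its tag was never observed by the remove, which is exactly the add-wins postcondition `{e ∈ S}`.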
Summary • Anna is motivated by • Threads + shared memory is costly and not applicable to distributed environments • ACID properties are too strong • High performance at various scales is demanded • Anna is an efficient KVS relying on: • Actor model, a nice abstraction for scaling concurrent and distributed programs • CRDTs, which eliminate coordination by restricting the types of objects that support concurrent updates
Open questions? • Do you think this work is novel? • Theory innovations? • Engineering innovations? • Is it sufficient to offer multi-key transactions with only atomicity and durability?
Thanks for listening! Cheng Li 2018-3-23
Outline • Background • Introduction to CRDTs • A case for scaling relational databases • Summary
Two-tier Application Model • Observation: Side effects are encapsulated into a sequence of DB statements • Figure: three app servers in front of one database
Two-tier Application Model • Observation: Side effects are encapsulated into a sequence of DB statements • Insight: We can model the database using commutative replicated data types (CRDTs) • Figure: three app servers in front of one CRDT database
Leveraging CRDTs • Transform each DB statement into one or more CRDT operations • Programmers only annotate the schema with CRDTs: • DB Table → Set • DB Field → Counter / Rewritable value • Transaction: [CRDT_OP1; CRDT_OP2; CRDT_OP3; …]
Experimental Setting • Local cluster • Maximum of 10 nodes • Clients spread across 5 nodes • TPC-C benchmark • Gold standard for database transaction processing • TPC-C standard + read-dominant workloads • Baselines • MySQL-Cluster (sharding) • Galera-Cluster (full replication)
Performance Evaluation • Figure: TPC-C standard workload, 5 replicas
Scalability Evaluation • Figure: TPC-C standard workload