EventWave: Programming Model and Runtime Support for Tightly-Coupled Elastic Cloud Applications
Wei-Chiu Chuang, Bo Sang, Sunghwan Yoo, Rui Gu, Charles Killian, Milind Kulkarni
Motivation
[Figure: clients connect to a server that hosts a world of buildings and rooms. Plot: # clients and response time over time.]
Motivation
• Scale up
• Elasticity is hard
[Figure: clients and server. Plot: # clients and response time over time.]
Objectives
A programming model that supports:
• Stateful computation
• Transparent elasticity
• Simple sequential semantics
Related Work
• Data Flow: MapReduce [Dean et al. OSDI '04], Dryad [Isard et al. EuroSys '07], CIEL [Murray et al. NSDI '11]
• Live Migration: Live Migration of Virtual Machines [Clark et al. NSDI '05], Zephyr [Elmore et al. SIGMOD '11]
• Scalable programming model: Orleans [Bykov et al. SoCC '11]
Limitations noted: stateless (no stateful computation); no simple sequential semantics; no transparent elasticity; does not change scale ("split"/"merge" state); transactional, reconcile conflicts
Event Driven Systems
• Typical event-driven systems are not scalable.
[Figure: a client sends events 1, 2, and 3 to the server's event queue. Event 1 commits, then Event 2 commits, then Event 3 commits.]
Context
• Scalability comes from parallelism
• Partition program state into `contexts`
• An event accesses one or more contexts
• Events accessing disjoint contexts can run in parallel
Contexts enable implicit parallelism
[Figure: context tree with a world containing buildings, which contain rooms and a hallway.]
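The disjointness rule above can be illustrated with a small sketch. This is not EventWave's scheduler: `parallel_batches` and the string context names are hypothetical, and the hierarchy between contexts is ignored here. Each event is modelled only as the set of contexts it accesses.

```python
from typing import List, Set


def parallel_batches(events: List[Set[str]]) -> List[List[int]]:
    """Group events, in arrival order, into batches whose context sets
    are pairwise disjoint. Events in one batch could run in parallel."""
    batches: List[List[int]] = []
    busy: Set[str] = set()  # contexts used by the current batch
    for index, contexts in enumerate(events):
        if batches and busy.isdisjoint(contexts):
            batches[-1].append(index)
        else:
            # Conflict with the current batch: start a new one.
            batches.append([index])
            busy = set()
        busy |= contexts
    return batches


events = [{"Room<1>"}, {"Room<2>"}, {"Room<1>", "Room<2>"}]
batches = parallel_batches(events)
```

The first two events touch different rooms and land in the same batch; the third touches both rooms and has to wait for the next batch.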
EventWave
• Stateful
• Sequential semantics
• Parallelism
Enforce sequential ordering: Event 2 cannot commit until Event 1 commits
[Figure: events 1, 2, and 3 execute in Contexts 1, 2, and 3. Each event has a finish point and a commit point; the commits occur in event order.]
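A minimal sketch of the ordering rule, assuming events carry consecutive integer IDs starting at 1. The `CommitSequencer` class is hypothetical and only shows the idea that an event may finish at any time but commits only after all earlier events have committed.

```python
class CommitSequencer:
    """Lets events finish in any order but commits them in ID order."""

    def __init__(self) -> None:
        self.next_to_commit = 1   # assumption: IDs start at 1
        self.finished = set()     # finished but not yet committed
        self.committed = []       # commit order so far

    def finish(self, event_id: int) -> list:
        """Record that an event finished; commit every event that is
        now unblocked. Returns the full commit order so far."""
        self.finished.add(event_id)
        while self.next_to_commit in self.finished:
            self.finished.remove(self.next_to_commit)
            self.committed.append(self.next_to_commit)
            self.next_to_commit += 1
        return list(self.committed)


seq = CommitSequencer()
after_2 = seq.finish(2)  # Event 2 finishes first but cannot commit yet
after_3 = seq.finish(3)
after_1 = seq.finish(1)  # Event 1 finishes; 1, 2, 3 commit in order
```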
Access Multiple Contexts
• A player can move from one room to another
  • Remove it from source room
  • Insert it into destination room
An event may access multiple contexts
[Figure: context tree (world, buildings, rooms) with the player lists of Room 1 and Room 2, which hold Bob and Alice.]
Access Multiple Contexts
• Must ensure
  • Sequential semantics
  • Parallelism
To be scalable, events cannot access contexts arbitrarily
Event 2 can't start before Event 1 finishes
[Figure: events 1 and 2 across Contexts 1, 2, and 3, marking where Event 1 finishes and where Event 2 commits.]
Hierarchical Contexts
• Contexts are not completely independent
• The world has many buildings
• A building has many rooms
[Figure: context tree with world at the root, Building<1> and Building<2> below it, and Room<1> and Room<2> under each building.]
Wave of Events
• Must access contexts from top to bottom
The hierarchical access enables parallelism
[Figure, repeated on each step below: context tree with world at the root, Building<1> and Building<2> below it, and Room<1> and Room<2> under each building.]
Wave of Events: example
• Move a player from room 1 to room 2
• Allow the next event to access Building<1>
• Enter Room<1>
• Remove player; release exclusive access
• Enter Room<2>
• Insert player; event finishes, releasing all contexts
• Event commits, releasing snapshot
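The move-a-player walk can be sketched as a top-to-bottom traversal. This is a simplified, single-threaded illustration: the `Context` class, `move_player`, and the exact points at which contexts are released are assumptions, and the commit and snapshot steps are not modelled.

```python
class Context:
    """A piece of program state with exclusive access by one event."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.players = set()
        self.holder = None  # ID of the event holding exclusive access

    def enter(self, event_id: int, log: list) -> None:
        if self.holder is not None:
            raise RuntimeError(f"{self.name} is held by event {self.holder}")
        self.holder = event_id
        log.append(f"enter {self.name}")

    def release(self, log: list) -> None:
        self.holder = None
        log.append(f"release {self.name}")


def move_player(building, src, dst, player, event_id):
    """Start at the parent context, then descend into each room."""
    log = []
    building.enter(event_id, log)
    src.enter(event_id, log)
    src.players.remove(player)   # remove player from the source room
    src.release(log)             # release exclusive access early
    dst.enter(event_id, log)
    dst.players.add(player)      # insert player into the destination
    dst.release(log)             # event finishes, releasing all contexts
    building.release(log)
    return log


building = Context("Building<1>")
room1, room2 = Context("Room<1>"), Context("Room<2>")
room1.players.add("Bob")
log = move_player(building, room1, room2, "Bob", event_id=1)
```

Releasing Room<1> before entering Room<2> is what would let a later event that only needs Room<1> proceed without waiting for the whole move to finish.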
Distributed Execution
• Scale more by executing events across multiple nodes
• Map contexts
• Head node
[Figure: context tree (world; Building<1>, Building<2>; Room<1> through Room<4>) mapped onto physical nodes, with a head node.]
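One way to picture a context mapping is a table from context name to physical node. The sketch below assumes a round-robin assignment and path-style context names such as `Building<1>/Room<1>`; both are illustrative choices, and the slides do not say how EventWave assigns contexts.

```python
from typing import Dict, Iterable, List


def assign_round_robin(contexts: List[str], nodes: List[str]) -> Dict[str, str]:
    """Hypothetical policy: spread contexts over nodes in turn."""
    return {ctx: nodes[i % len(nodes)] for i, ctx in enumerate(contexts)}


def nodes_for_event(event_contexts: Iterable[str],
                    mapping: Dict[str, str]) -> List[str]:
    """Physical nodes an event must visit, given the contexts it accesses."""
    return sorted({mapping[ctx] for ctx in event_contexts})


contexts = [
    "Building<1>/Room<1>",
    "Building<1>/Room<2>",
    "Building<2>/Room<3>",
    "Building<2>/Room<4>",
]
mapping = assign_round_robin(contexts, ["node-a", "node-b"])
visited = nodes_for_event(["Building<1>/Room<1>", "Building<1>/Room<2>"], mapping)
```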
Distributed Execution
• Logical Node: a set of physical nodes
[Figure: a server logical node, client logical node #1, and client logical node #2.]
Elasticity
• Request nodes from cloud scheduler
• Update context mapping
• Transfer contexts to the new node
[Figure: context tree (world; Building<1>, Building<2>; Room<1>, Room<2> under each building) before and after contexts move to the new node.]
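The mapping update on scale-out can be sketched as follows. The balancing rule here (move contexts off overloaded nodes until the new node holds an even share) is an assumption for illustration only; `scale_out` is a hypothetical helper, and requesting the node from the cloud scheduler is outside the sketch.

```python
from collections import Counter
from typing import Dict, List, Tuple


def scale_out(mapping: Dict[str, str],
              new_node: str) -> Tuple[Dict[str, str], List[str]]:
    """Return an updated context mapping and the contexts to transfer."""
    new_mapping = dict(mapping)              # leave the input untouched
    nodes = set(mapping.values()) | {new_node}
    target = len(mapping) // len(nodes)      # even share per node
    load = Counter(mapping.values())
    moved: List[str] = []
    for ctx in sorted(mapping):
        if len(moved) >= target:
            break
        owner = new_mapping[ctx]
        if load[owner] > target:             # only take from busy nodes
            new_mapping[ctx] = new_node
            load[owner] -= 1
            moved.append(ctx)
    return new_mapping, moved


old = {f"Room<{i}>": "node-a" for i in range(1, 5)}
new, to_transfer = scale_out(old, "node-b")
```

The returned list is what a runtime would then have to transfer to the new node.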
Evaluation
Does it scale?
• Microbenchmarks
• Scalability
What is the cost of migration?
• Microbenchmarks
• Migration latency
Case study
• Multi-player game server
In the paper: key-value store
Microbenchmark-Scalability
Setup
• One logical node, fixed context mapping
• EC2 Small Instances
  • 1 vCPU, 1.7 GB RAM, 160 GB local disk
• Distribute 160 contexts to physical nodes
Measures
• Throughput
Microbenchmark-Scalability
Takeaway: throughput grows with the number of nodes
[Plot legend: P: workload]
Microbenchmark-Migration Latency
Setup
• 2 x 8-core 2.0 GHz Xeon, 8 GB RAM
• 1 Gb Ethernet connection
• Scale does not change
Microbenchmark-Migration Latency
Measure
• Throughput of events
[Plot annotations: migrate a 100 MB context; the migration event commits; finished events must wait for the migration event.]
Multi-player Game Server
Setup
• Server logical node
  • 1 x Extra Large Instance (head)
  • 64 x Small Instances
• Client logical nodes
  • 128 clients on 16 EC2 Small Instances
Measure
• Latency
Multi-player Game Server
Synthetic workload
[Plot annotations: server contexts spread to 64 physical nodes; server contexts merge to 1 physical node.]
Conclusion
• Elasticity is crucial for cloud applications.
• Our programming model enables transparent elasticity for tightly-coupled applications.
• Case studies show EventWave is efficient.
http://www.macesystems.org
Language Construct
state_variables {
  Hallway hw;
  vector<Room> rooms;
}
context Hallway {
  int x;
}
context Room<int> {
  int y;
}
Declare implicit parallelism
Mace [Killian et al. PLDI '07]
[Figure: Hallway, Room[0], Room[1], … shown alongside the contexts Hallway, Room<0>, Room<1>.]
Event Handler
upcall deliver(Message m) { }
Annotation: specify what context to access
upcall [Room<m.roomID>] deliver(Message m) { }
[Figure: Message(roomID = 2) is delivered to context Room<2>.]
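The effect of the annotation can be mimicked in Python as an analogy. This is not Mace or EventWave syntax: the `upcall` decorator, the `HANDLERS` table, and `dispatch` are hypothetical names. The sketch only shows that the target context is computed from the message before the handler body runs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple


@dataclass
class Message:
    roomID: int


# handler name -> (function computing the context, handler body)
HANDLERS: Dict[str, Tuple[Callable, Callable]] = {}


def upcall(context_of: Callable):
    """Attach a context annotation to an event handler."""
    def register(handler: Callable) -> Callable:
        HANDLERS[handler.__name__] = (context_of, handler)
        return handler
    return register


@upcall(lambda m: f"Room<{m.roomID}>")
def deliver(m: Message) -> str:
    return f"delivered to room {m.roomID}"


def dispatch(name: str, m: Message) -> Tuple[str, str]:
    """Resolve the annotated context first, then run the handler."""
    context_of, handler = HANDLERS[name]
    return context_of(m), handler(m)


result = dispatch("deliver", Message(roomID=2))
```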
Key-value store
Setup
• 2 x 8-core 2.0 GHz Xeon, 8 GB RAM
• 1 Gb Ethernet connection
Measure
• Latency
Context Migration
• Update context-node mapping
• Copy context state
• Replicate context state
[Figure: the head node, the old node, and the new node around migration event M. Event 1 goes to the old node; Event 3 goes to the new node.]
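A rough sketch of the routing around a migration, under the assumption that events are totally ordered by integer ID and that the migration event M sits between Event 1 and Event 3 in that order. `route_event`, `migrate`, and the per-node `stores` dictionary are hypothetical names for illustration.

```python
import copy
from typing import Dict


def route_event(event_id: int, migration_id: int,
                old_node: str, new_node: str) -> str:
    """Events ordered before the migration event go to the old node;
    later events go to the new node (assumed ordering by ID)."""
    return old_node if event_id < migration_id else new_node


def migrate(context: str, mapping: Dict[str, str],
            stores: Dict[str, dict], new_node: str) -> str:
    """Update the context-node mapping, then copy the context state.
    The old copy is kept so earlier events can still run on it."""
    old_node = mapping[context]
    mapping[context] = new_node
    stores.setdefault(new_node, {})[context] = copy.deepcopy(
        stores[old_node][context])
    return old_node


mapping = {"Room<1>": "old"}
stores = {"old": {"Room<1>": {"players": ["Bob"]}}}
previous = migrate("Room<1>", mapping, stores, "new")
first = route_event(1, migration_id=2, old_node="old", new_node="new")
third = route_event(3, migration_id=2, old_node="old", new_node="new")
```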