Using Simulation to Explore Distributed Key-Value Stores for Extreme-Scale System Services
Ke Wang, Abhishek Kulkarni, Michael Lang, Dorian Arnold, Ioan Raicu
USRC @ Los Alamos National Laboratory
Datasys @ Illinois Institute of Technology
CS @ Indiana University
CS @ University of New Mexico
November 20th, 2013, at IEEE/ACM Supercomputing (SC) 2013
Current HPC System Services
• Extreme scale
• Lack of decomposition for insight
• Many services have centralized designs
• The impact of service architecture choices is an open question
Long Term Goals
• Modular component design for composable services
• Explore the design space for HPC services
• Evaluate the impacts of different design choices
Contributions
• A taxonomy for classifying HPC system services
• A simulation tool to explore distributed key-value store (KVS) design choices for large-scale system services
• An evaluation of KVS design choices for extreme-scale systems using both synthetic and real workload traces
Outline
• Introduction & Motivation
• Key-Value Store Taxonomy
• Key-Value Store Simulation
• Evaluation
• Conclusions & Future Work
Distributed System Services
• Job launch, resource management systems
• System monitoring
• I/O forwarding, file systems
• Function call shipping
• Key-value stores
Key Issues in Distributed System Services
• Scalability
• Dynamicity
• Fault tolerance
• Consistency
Key-Value Stores and HPC
• Large volume of data and state information
• Distributed NoSQL data stores used as building blocks
• Examples:
  • Resource management (job and node status info)
  • Monitoring (system activity logs)
  • File systems (metadata)
  • SLURM++, MATRIX [1], FusionFS [2]

[1] K. Wang, I. Raicu. “Paving the Road to Exascale through Many Task Computing”, Doctor Showcase, IEEE/ACM Supercomputing 2012 (SC12)
[2] D. Zhao, I. Raicu. “Distributed File Systems for Exascale Computing”, Doctor Showcase, IEEE/ACM Supercomputing 2012 (SC12)
Outline
• Introduction & Motivation
• Key-Value Store Taxonomy
• Key-Value Store Simulation
• Evaluation
• Conclusions & Future Work
HPC KVS Taxonomy: Why?
• Decomposition
• Categorization
• Suggestion
• Implication
HPC KVS Taxonomy: Components
• Service model: functionality
• Data model: distribution and management of data
• Network model: how the components are connected
• Recovery model: how to deal with component failures
• Consistency model: how rapidly data modifications propagate
Centralized Architectures
• Data model: centralized
• Network model: aggregation tree
• Recovery model: fail-over
• Consistency model: strong
Distributed Architectures
• Data model: distributed with partitioning
• Network model: fully connected, or partial knowledge
• Recovery model: consecutive replicas
• Consistency model: strong or eventual
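The partitioned data model above can be sketched with consistent hashing, where each key has a home server plus consecutive successors holding its replicas. This is an illustrative sketch, not the simulator's actual code; `Ring`, `home`, and `replicas` are hypothetical names, and the hash merely stands in for the 64-bit key space used later in the evaluation.

```python
import hashlib
from bisect import bisect_left

def key_hash(key: str) -> int:
    # Map a key into a 64-bit ID space (illustrative choice of hash)
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")

class Ring:
    """Hash-ring key partitioning with consecutive replicas (hypothetical sketch)."""

    def __init__(self, server_ids):
        self.ids = sorted(server_ids)  # server positions on the ring

    def home(self, key: str) -> int:
        # First server at or after the key's hash, wrapping around the ring
        i = bisect_left(self.ids, key_hash(key)) % len(self.ids)
        return self.ids[i]

    def replicas(self, key: str, n: int):
        # The home server plus its n-1 consecutive successors
        i = self.ids.index(self.home(key))
        return [self.ids[(i + j) % len(self.ids)] for j in range(n)]
```

With a fully connected network model, every server can compute `home()` for any key and reach the owner in one hop; with partial knowledge, a lookup must be routed toward the owner over several hops.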
Outline
• Introduction & Motivation
• Key-Value Store Taxonomy
• Key-Value Store Simulation
• Evaluation
• Conclusions & Future Work
KVS Simulation Design
• Discrete-event simulation built on PeerSim
  • Evaluated alternatives: OMNeT++, OverSim, SimPy
• Configurable number of servers and clients
• Different architectures
• Two parallel queues in each server:
  • Communication queue (send/receive requests)
  • Processing queue (process requests locally)
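The two-queue server design can be sketched as a tiny discrete-event loop. The real simulator is built on PeerSim (Java); this Python sketch, its event names, and its costs are purely illustrative.

```python
import heapq

events = []   # min-heap of (time, seq, kind, payload)
seq = 0

def schedule(t, kind, payload=None):
    global seq
    heapq.heappush(events, (t, seq, kind, payload))
    seq += 1

comm_free_at = 0     # when the communication queue is next free
proc_free_at = 0     # when the processing queue is next free
SEND, PROC = 1, 5    # illustrative per-request receive and processing costs

completed = []       # (request, finish_time)

def run():
    global comm_free_at, proc_free_at
    while events:
        t, _, kind, p = heapq.heappop(events)
        if kind == "recv":                    # request arrives at the comm queue
            start = max(t, comm_free_at)
            comm_free_at = start + SEND
            schedule(comm_free_at, "proc", p) # hand off to the processing queue
        elif kind == "proc":                  # request enters the processing queue
            start = max(t, proc_free_at)
            proc_free_at = start + PROC
            completed.append((p, proc_free_at))

for i in range(3):
    schedule(0, "recv", f"req{i}")
run()
```

Because the two queues drain in parallel, a request can be received while an earlier one is still being processed; the processing queue here is the bottleneck, which is the regime the evaluation later calls "request processing messages dominate".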
Simulation Cost Model
The time to resolve a query locally (tLR) and the time to resolve a remote query (tRR) are given by:

tLR = CS + SR + LP + SS + CR

For a fully connected topology:
tRR = tLR + 2 × (SS + SR)

For a partially connected topology:
tRR = tLR + 2k × (SS + SR)

where k is the number of hops needed to find the predecessor.
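Plugging illustrative numbers into the model makes the fully vs. partially connected gap concrete. The values below are invented for the example, and the expansion of CS/SR/LP/SS/CR as client send, server receive, local processing, server send, and client receive is an assumption, not stated on the slide.

```python
# Illustrative per-step costs (arbitrary time units). CS/SR/LP/SS/CR are
# presumably client send, server receive, local processing, server send,
# and client receive -- an assumption, not stated on the slide.
CS, SR, LP, SS, CR = 5.0, 5.0, 10.0, 5.0, 5.0

t_lr = CS + SR + LP + SS + CR        # time to resolve a query locally

def t_rr(k: int = 1) -> float:
    """Time to resolve a remote query; k is the number of hops needed to
    find the server holding the key (k = 1 when fully connected)."""
    return t_lr + 2 * k * (SS + SR)
```

With these numbers, a local query costs 30 units, a remote query in a fully connected topology costs 50, and a 3-hop lookup in a partially connected topology costs 90, which is why partial connectivity shows higher latency in the evaluation.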
Failure/Recovery Model
• Defines what to do when a node fails
• Defines how a node's state is recovered when it rejoins after a failure

[Diagram: a ring of six servers s0–s5, each holding two consecutive replicas (ri,1, ri,2) of its predecessors' data. Clients notify EM of failures; when the first and then the second replica of s0's data go down, the surviving servers re-replicate the data onto their successors, and when s0 comes back its data is replicated back and the stale copies are removed.]
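The consecutive-replica scheme above can be sketched as follows: each data item lives on its primary server plus the next servers on the ring, and when a primary fails, its successor adopts the item (so replicas remain on consecutive live servers). All names (`Cluster`, `owners`, `fail`) and the toy placement function are hypothetical.

```python
NREP = 2  # replication degree: primary + one consecutive replica

class Cluster:
    """Toy model of consecutive-replica placement and fail-over (hypothetical)."""

    def __init__(self, names):
        self.alive = list(names)   # servers in ring order
        self.primary = {}          # key -> primary server

    def put(self, key):
        # deterministic toy placement instead of a real hash function
        self.primary[key] = self.alive[sum(map(ord, key)) % len(self.alive)]

    def owners(self, key):
        # the primary plus its consecutive successors hold the replicas
        i = self.alive.index(self.primary[key])
        return [self.alive[(i + j) % len(self.alive)] for j in range(NREP)]

    def fail(self, server):
        # keys whose primary died are adopted by the next live server,
        # which re-replicates them onto its own successor
        idx = self.alive.index(server)
        succ = self.alive[(idx + 1) % len(self.alive)]
        self.alive.remove(server)
        for key, p in self.primary.items():
            if p == server:
                self.primary[key] = succ
```

Re-replication after a failure is exactly the extra message traffic the evaluation later measures as failure overhead.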
Consistency Model
• Strong consistency
  • Every replica observes every update in the same order
  • Clients send requests to a dedicated server (the primary replica)
• Eventual consistency
  • Requests are sent to a randomly chosen replica (the coordinator)
  • Three key parameters: N (replicas), R (read quorum), W (write quorum), satisfying R + W > N
  • Uses Dynamo-style version clocks [G. DeCandia, 2007] to track different versions of data and detect conflicts
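A minimal sketch of the quorum arithmetic behind those parameters: R + W > N guarantees that every read quorum intersects every write quorum, so at least one replica answering a read has seen the latest write. Versions are plain integers here for brevity; the simulator, like Dynamo, uses version clocks to order and reconcile them. Function names are hypothetical.

```python
def quorums_intersect(n: int, r: int, w: int) -> bool:
    # R + W > N means any R read replicas overlap any W written replicas
    return r + w > n

def read_latest(responses, r):
    """Coordinator keeps the newest version among the first r replica
    responses; each response is (replica_id, version_number)."""
    return max(responses[:r], key=lambda rv: rv[1])

# Dynamo-style configuration: 3 replicas, read quorum 2, write quorum 2
N, R, W = 3, 2, 2
assert quorums_intersect(N, R, W)
```

The extra replica round-trips needed to assemble R and W quorums are one source of the eventual-consistency overhead seen in the results.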
Outline
• Introduction & Motivation
• Key-Value Store Taxonomy
• Key-Value Store Simulation
• Evaluation
• Conclusions & Future Work
Evaluation
• Evaluate the overheads of:
  • Different architectures, with a focus on distributed ones
  • Different models
• Lightweight simulations: the largest experiments needed 25 GB of RAM and 40 minutes of walltime
• Workloads:
  • Synthetic workload over a 64-bit key space
  • Real workload traces from 3 representative system services: job launch, system monitoring, and I/O forwarding
Validation
• Validated against ZHT [1] (left) and Voldemort (right)
  • ZHT on BG/P, up to 8K nodes (32K cores)
  • Voldemort on the PRObE Kodiak cluster, up to 800 nodes

[1] T. Li, X. Zhou, K. Brandstatter, D. Zhao, K. Wang, A. Rajendran, Z. Zhang, I. Raicu. “ZHT: A Light-weight Reliable Persistent Dynamic Scalable Zero-hop Distributed Hash Table”, IEEE International Parallel & Distributed Processing Symposium (IPDPS) 2013
Fully Connected vs. Partially Connected
• Partial connectivity incurs higher latency due to the additional routing
• The fully connected topology responds faster (twice as fast at extreme scale)
Replication Overhead
• Adding replicas always involves overhead
• Replicas have a larger impact on fully connected topologies than on partially connected ones
Failure Effect
• Higher failure frequency introduces more overhead, but the dominating cost is still the client request-processing messages
Combined Overhead
• Eventual consistency has more overhead than strong consistency
Real Workloads
• Job launch and I/O forwarding: eventual consistency performs worse; both the request types and the keys are almost uniformly randomly distributed (URD)
• Monitoring: eventual consistency works better; all requests are “put”

[Results shown for both fully connected and partially connected topologies.]
From Simulation to Real Services
• ZHT (distributed key/value storage): the DKVS implementation
• MATRIX (runtime system): DKVS is used to keep task metadata
• SLURM++ (job management system): DKVS is used to store task & resource information
• FusionFS (distributed file system): DKVS is used to maintain file/directory metadata
Outline
• Introduction & Motivation
• Key-Value Store Taxonomy
• Key-Value Store Simulation
• Evaluation
• Conclusions & Future Work
Conclusions
• Key-value stores are a building block for system services
• A service taxonomy is important
• A simulation framework enables studying services
• Distributed architectures are needed
• Replication adds overhead
• The fully connected topology works well, as long as request-processing messages dominate
• Consistency tradeoffs:

[Spectrum: strong consistency suits read-intensive workloads (performance); eventual and weak consistency suit write-intensive workloads (availability).]
Future Work
• Extend the simulator to cover more of the taxonomy
• Explore other recovery models:
  • Log-based
  • Information dispersal algorithms
• Explore other consistency models
• Explore using DKVS in the development of:
  • A general building-block library
  • A distributed monitoring system service
  • A distributed message queue system
Acknowledgements
• DOE contract: DE-FC02-06ER25750
• Part of NSF award: CNS-1042543 (PRObE)
• Collaboration with the FusionFS project under NSF grant NSF-1054974
• BG/P resources from ANL
• Thanks to Tonglin Li, Dongfang Zhao, and Hakan Akkan
More Information
• More information: http://datasys.cs.iit.edu/~kewang/
• Contact: kwang22@hawk.iit.edu
• Questions?
Related Work
• Service simulation
  • Peer-to-peer network simulation
  • Telephony simulations
  • Simulation of consistency
  • Problem: none focus on HPC, or on combining the distributed features
• Taxonomies
  • Investigations of distributed hash tables, and an algorithm taxonomy
  • Grid computing workflow taxonomy
  • Problem: none of them drive features in a simulation