Distributed Application Coordination CMSC 491/691 Hadoop-Based Distributed Computing Spring 2014 Adam Shook
What is it? • Apache ZooKeeper is an effort to develop and maintain an open-source server which enables highly reliable distributed coordination. • Simple • Replicated • Ordered • Fast
Provides • Configuration Information • Distributed Synchronization • Group Services • Each of these services is used in some form by distributed applications
Interface • ZooKeeper provides a very simple interface to a highly reliable and distributed service • Powerful abstractions can be built from this very simple interface • Currently interfaces are in Java and C • Want to expand to Python, Perl, and REST.
The Core • Shared hierarchical name space of data registers, called znodes • Unlike file systems, provides clients with high throughput, low latency, highly available, and ordered access to znodes
znodes • Meta-information: • Configuration • Status Information • Location Information • Whatever you want (that’s small)
znodes • Each node acts as both a file and a directory • 1 MB maximum per znode • Persistent vs. Ephemeral • Sequential znodes • Full paths • An optional “chroot” suffix can be appended to the connection string • e.g., “127.0.0.1:3000,127.0.0.1:3002/app/a”
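A minimal sketch of how the chroot suffix in the connection string above can be read: everything before the first "/" is the server list, and everything from the "/" onward is the root that client paths are resolved against. The function names `parse_connect_string` and `resolve` are hypothetical helpers for illustration, not part of the ZooKeeper client.

```python
DEFAULT_PORT = 2181  # ZooKeeper's default client port, used when none is given


def parse_connect_string(connect_string):
    """Split 'host:port,host:port/chroot' into (servers, chroot)."""
    # The chroot, if present, starts at the first '/' after the server list.
    slash = connect_string.find("/")
    if slash == -1:
        hosts_part, chroot = connect_string, None
    else:
        hosts_part, chroot = connect_string[:slash], connect_string[slash:]
    servers = []
    for entry in hosts_part.split(","):
        host, _, port = entry.strip().partition(":")
        servers.append((host, int(port) if port else DEFAULT_PORT))
    return servers, chroot


def resolve(chroot, path):
    """Map a path as the client sees it to the real path on the server."""
    if chroot is None:
        return path
    return chroot if path == "/" else chroot + path


servers, chroot = parse_connect_string("127.0.0.1:3000,127.0.0.1:3002/app/a")
print(servers)                      # [('127.0.0.1', 3000), ('127.0.0.1', 3002)]
print(chroot)                       # /app/a
print(resolve(chroot, "/foo/bar"))  # /app/a/foo/bar
```

With this connection string, a client that reads or writes "/foo/bar" is really operating on "/app/a/foo/bar", which keeps each application inside its own subtree.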
Watchers • Tied to each znode • One-time trigger • Sent to the client • Applies to the data for which the watch was set
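To make "one-time trigger" concrete, here is a toy in-memory model. `ToyZNode` is a hypothetical stand-in written for this example and is not the ZooKeeper client API; it only mimics the behavior that a watch fires once and must be set again to see later changes.

```python
class ToyZNode:
    """In-memory stand-in for a single znode (not the ZooKeeper client API)."""

    def __init__(self, data):
        self.data = data
        self._watchers = []

    def get(self, watcher=None):
        # Reading with a watcher registers it for the NEXT change only.
        if watcher is not None:
            self._watchers.append(watcher)
        return self.data

    def set(self, data):
        self.data = data
        # Fire each pending watcher once, then forget it (one-time trigger).
        pending, self._watchers = self._watchers, []
        for watcher in pending:
            watcher("NodeDataChanged")


events = []
node = ToyZNode("v1")
node.get(watcher=events.append)  # read and leave a watch
node.set("v2")                   # the watch fires
node.set("v3")                   # no watch is registered any more
print(events)                    # ['NodeDataChanged']
```

The notification only says that something changed. A client that wants the new value reads the znode again, and sets a new watch in that same read if it wants to hear about the next change.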
That’s It • In a nutshell • Very basic service, from which powerful abstractions can be built • Let’s talk about how good it is! • That is, if you don’t have any questions right now… • You can ask. I don’t bite • Really • Promise
Use Case: Location Data • Servers store their machine hostnames as ephemeral znodes • /app1/machine1 • /app1/machine87 • /app1/machine4 • When a server is added, it creates a new znode • When a server is removed, its znode is deleted • When a server fails, ZK deletes its ephemeral node • Allows for dynamic throttling of resources • Clients can choose a hostname from the children of /app1 to connect to • Clients set a child watch on /app1; if a server goes down, the client receives a notification and can choose a new server
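The flow on this slide can be sketched with a toy in-memory registry. `ToyRegistry` and `ToyClient` are hypothetical classes for illustration only; `session_closed` models ZooKeeper removing a failed server's ephemeral node, and the client simply picks the first hostname in sorted order, which is an assumption made to keep the example deterministic.

```python
class ToyRegistry:
    """Models the children of /app1 (not the ZooKeeper client API)."""

    def __init__(self):
        self.children = {}          # child name -> owning session id
        self._child_watchers = []   # pending one-time child watches

    def create_ephemeral(self, name, session):
        self.children[name] = session
        self._fire()

    def get_children(self, watcher=None):
        if watcher is not None:
            self._child_watchers.append(watcher)
        return sorted(self.children)

    def session_closed(self, session):
        # Models ZooKeeper deleting the ephemeral nodes of a dead session.
        dead = [n for n, s in self.children.items() if s == session]
        for name in dead:
            del self.children[name]
        if dead:
            self._fire()

    def _fire(self):
        pending, self._child_watchers = self._child_watchers, []
        for watcher in pending:
            watcher()


class ToyClient:
    def __init__(self, registry):
        self.registry = registry
        self.server = None
        self.pick_server()

    def pick_server(self):
        # Re-read the children and re-register the one-time child watch.
        servers = self.registry.get_children(watcher=self.pick_server)
        if self.server not in servers:
            self.server = servers[0] if servers else None


registry = ToyRegistry()
registry.create_ephemeral("machine1", session=1)
registry.create_ephemeral("machine87", session=2)
registry.create_ephemeral("machine4", session=3)

client = ToyClient(registry)
print(client.server)         # machine1
registry.session_closed(1)   # machine1 fails; its ephemeral node disappears
print(client.server)         # machine4
```

The client never polls: it learns about the failure through the child watch and re-registers the watch each time it re-reads the list.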
Command Line Interface • Interactive usage of the namespace in a shell • create [path] [data] • delete [path] • get [path] • set [path] [data] • ls [path] • rmr [path] • A number of other commands… • Tab completion!
API • Current and stable v3.4.6 (March 2014) • Requires only a list of ZK servers to connect • IMO, good but messy interface • Recommend building a nice wrapper API for getting/setting POD types and handling exceptions
Recipes! • We are going to talk about these: • Configuration • Distributed Locks • Distributed Queue
Configuration • Configuration is often driven through key/value pairs stored in a file • Can get messy when configuration is dynamic • Implementation is very straightforward, as it is what ZooKeeper was designed for • Each full-pathed znode is the key and the data associated with the znode is the value
Variables • Static Variables • Those ones that are probably never going to change (not as much fun) • Dynamic Variables • Changed by hand via command line or by the application itself • Track status of processes • Update historical data
Use of Watchers • Applications can change configuration on the fly for some variables • Whenever a variable changes, those watching its node can receive the changed variable and make the correct changes • Very useful for long-running applications that require the most up-to-date information
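A rough sketch of the configuration recipe: the full znode path is the key, the znode data is the value, and a long-running application refreshes its cached value whenever its watch fires. `ToyConfigStore`, `LongRunningApp`, and the path "/app1/config/batch_size" are all hypothetical names invented for this example; the store is an in-memory model, not the ZooKeeper client.

```python
class ToyConfigStore:
    """Maps full znode paths to values, with one-time watches per path."""

    def __init__(self):
        self._data = {}       # full znode path -> value
        self._watchers = {}   # full znode path -> pending one-time callbacks

    def set(self, path, value):
        self._data[path] = value
        # Remove the pending watches before firing so callbacks can re-register.
        for callback in self._watchers.pop(path, []):
            callback(path)

    def get(self, path, watcher=None):
        if watcher is not None:
            self._watchers.setdefault(path, []).append(watcher)
        return self._data[path]


class LongRunningApp:
    def __init__(self, store, path):
        self.store = store
        self.value = None
        self.refresh(path)

    def refresh(self, path):
        # Re-read the value and re-register the watch in the same call.
        self.value = self.store.get(path, watcher=self.refresh)


store = ToyConfigStore()
store.set("/app1/config/batch_size", "100")
app = LongRunningApp(store, "/app1/config/batch_size")
store.set("/app1/config/batch_size", "250")   # changed by hand or by another process
store.set("/app1/config/batch_size", "500")
print(app.value)                              # 500
```

Because watches are one-time triggers, the application sees the second change only because `refresh` sets a new watch every time it reads.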
Distributed Locks • A means to have distributed processes retrieve a lock for some operation • Throttled updating of database • Your use case here! • Exists in ZooKeeper's recipes directory and is distributed with the release -- src/recipes/lock
Algorithm • Define a znode to hold the lock, say “/dlock” • mypath = create(“/dlock/lock-”), with the sequence and ephemeral flags set • children = getChildren(“/dlock”), no watch • If mypath has the lowest number suffix in children, the client holds the lock; exit • Otherwise, call exists() with the watch flag set on the node in children with the next lowest sequence number • e.g., if mypath is “/dlock/lock-6” and children contains 3, 4, 6, 7, call exists() on “/dlock/lock-4” • If exists() returns false, go back to the getChildren() step • If true, wait for the watch trigger before going back to the getChildren() step
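The decision at the heart of the lock algorithm above, "do I hold the lock, and if not, which node do I watch?", can be sketched as a pure function. `sequence_number` and `node_to_watch` are hypothetical helper names. The sketch follows the slide's simplified full paths; a real server appends a zero-padded sequence number, which `int()` parses the same way.

```python
def sequence_number(path):
    """Return the numeric suffix after the last '-' in a znode name."""
    return int(path.rsplit("-", 1)[1])


def node_to_watch(mypath, children):
    """Return None if mypath holds the lock, else the path to call exists() on."""
    mine = sequence_number(mypath)
    lower = [c for c in children if sequence_number(c) < mine]
    if not lower:
        return None  # lowest suffix: this client holds the lock
    # Watch only the node immediately ahead of us in line.
    return max(lower, key=sequence_number)


children = ["/dlock/lock-3", "/dlock/lock-4", "/dlock/lock-6", "/dlock/lock-7"]
print(node_to_watch("/dlock/lock-6", children))  # /dlock/lock-4
print(node_to_watch("/dlock/lock-3", children))  # None
```

Watching only the next lowest node, rather than the whole "/dlock" directory, means that releasing the lock wakes exactly one waiting client instead of all of them.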
Distributed Queues • A means to allow clients to asynchronously add elements to a queue and have a single processor application dequeue and process them. • I can’t remember the last time I needed a queue • Maybe you have a few
Algorithm • Designate a znode to hold the queue, say “/dqueue” • Enqueue: create(“/dqueue/queue-”), with the sequence and ephemeral flags set • Returns the real node path /dqueue/queue-X, where X is a monotonically increasing number • Dequeue: getChildren(“/dqueue”), with the watch set to true • Process these nodes with the lowest number first • No need to call getChildren() again until the current received list is exhausted • If no children are in the queue, wait for the watch notification before checking again
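A toy in-memory version of the enqueue and dequeue steps above. `ToyQueueNode` and `drain` are hypothetical names, and the counter inside `enqueue` stands in for the sequence number the server would assign; the watch is left out to keep the sketch short.

```python
class ToyQueueNode:
    """Models the children of /dqueue (not the ZooKeeper client API)."""

    def __init__(self):
        self._next_seq = 0
        self.children = {}   # child name -> payload

    def enqueue(self, payload):
        # The "server" appends a monotonically increasing, zero-padded number.
        name = "queue-%010d" % self._next_seq
        self._next_seq += 1
        self.children[name] = payload
        return "/dqueue/" + name

    def get_children(self):
        return list(self.children)

    def delete(self, name):
        return self.children.pop(name)


def drain(queue):
    """Fetch the child list once, then process the lowest number first."""
    processed = []
    names = sorted(queue.get_children(), key=lambda n: int(n.rsplit("-", 1)[1]))
    for name in names:
        processed.append(queue.delete(name))
    return processed


queue = ToyQueueNode()
first_path = queue.enqueue("a")
queue.enqueue("b")
queue.enqueue("c")
print(first_path)    # /dqueue/queue-0000000000
print(drain(queue))  # ['a', 'b', 'c']
```

Ordering comes entirely from the sequence suffix, so producers never need to coordinate with each other.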
Priority Queue Extension • Two simple modifications to this algorithm! • When enqueuing, pathnames end with queue-ZZ, where ZZ is the priority of the element • The lower the number, the higher the priority • When dequeuing, if the watch notification is triggered on the “/dqueue” node, the client needs to call getChildren() again and re-sort by priority.
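A small sketch of the re-sort step. It assumes, for illustration only, that the suffix after "queue-" holds just the priority digits; `priority_of` and `next_batch` are hypothetical helper names.

```python
def priority_of(name):
    """Read the ZZ priority from a name of the form queue-ZZ."""
    return int(name.rsplit("-", 1)[1])


def next_batch(children):
    """Order children so that the lowest number (highest priority) comes first."""
    return sorted(children, key=priority_of)


# First getChildren() result, sorted by priority.
pending = next_batch(["queue-20", "queue-05", "queue-10"])
current = pending.pop(0)
print(current)   # queue-05

# A watch notification fires on /dqueue: a priority-1 element was added,
# so the client fetches the children again and re-sorts.
pending = next_batch(["queue-20", "queue-10", "queue-01"])
print(pending)   # ['queue-01', 'queue-10', 'queue-20']
```

Without the re-fetch, the client would work through its stale list and serve "queue-10" ahead of the newly arrived "queue-01".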
Other Recipes • Group membership • Barriers • Two-phase commit • Leader Election
References • http://zookeeper.apache.org