distributed systems and protocols
• distributed systems:
  • use components located at networked computers
  • use message-passing to coordinate computations
• distributed system characteristics
  • concurrency of components
  • no global clock
  • components fail independently
• asynchronous protocols
  • process knows nothing after sending a message
• synchronous protocols
  • paired request and reply (whiteboard diagram)
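A minimal sketch of the two styles over UDP sockets (the function names and the timeout value are illustrative assumptions, not part of any standard):

    import socket

    def synchronous_call(sock, server_addr, request, timeout=2.0):
        """Synchronous: send a request and block until the paired reply arrives."""
        sock.settimeout(timeout)
        sock.sendto(request, server_addr)
        reply, _ = sock.recvfrom(4096)   # the caller learns something: the reply
        return reply

    def asynchronous_send(sock, server_addr, message):
        """Asynchronous: fire and forget; the process knows nothing after sending."""
        sock.sendto(message, server_addr)

    # usage (illustrative):
    # sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)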
remote procedure call (RPC)
• class of synchronous protocols
• process-to-process
• may be implemented on top of UDP, or in place of it
• generic RPC functions:
  • name space for remote procedures
    • flat: needs a central authority
    • hierarchical: per machine, per process, ...
  • way to multiplex calls and replies
    • unique message IDs
    • boot IDs
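A sketch of how unique message IDs plus a boot ID multiplex calls and replies; the field layout here is an illustrative assumption, not any particular RPC standard:

    import itertools, time

    BOOT_ID = int(time.time())       # changes on every restart, so replies to
                                     # requests from before a crash never match
    _msg_ids = itertools.count(1)

    def make_call(procedure, args):
        """Build a request; (BOOT_ID, msg_id) uniquely identifies it."""
        return {"boot_id": BOOT_ID, "msg_id": next(_msg_ids),
                "proc": procedure, "args": args}

    def match_reply(pending, reply):
        """Multiplexing: route a reply to its open call, drop stale ones."""
        key = (reply["boot_id"], reply["msg_id"])
        return pending.pop(key, None)    # None: no open call, discard reply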
additional RPC functions:
• reliable message delivery
  • ACKs, timeouts
• multiple “channels”
  • concurrency of “open” messages
  • no “in-order” guarantee
• fragmentation and reassembly
  • avoids TCP overhead
  • no “sliding window”
  • ACKs & NACKs
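A sketch of the reliability and fragmentation pieces; send_packet and recv_ack are assumed (hypothetical) transport primitives:

    MTU = 1400                           # illustrative fragment size

    def fragments(payload, mtu=MTU):
        """Split a payload into (index, total, chunk) fragments for reassembly."""
        chunks = [payload[i:i + mtu] for i in range(0, len(payload), mtu)]
        return [(i, len(chunks), c) for i, c in enumerate(chunks)]

    def send_reliably(send_packet, recv_ack, fragment, retries=5, timeout=1.0):
        """Stop-and-wait per fragment: no sliding window, just ACK or retry."""
        for _ in range(retries):
            send_packet(fragment)
            if recv_ack(timeout):        # True on ACK; False on timeout or NACK
                return True
        raise TimeoutError("fragment never acknowledged")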
Lamport's logical clocks
• operated per process
• monotonically increasing time counter
  • typically an integer, incremented at each event
• Lamport timestamps
  • event e; process pi; clock value Li(e); happened-before relation: →
  • rule LC1: pi increments Li before timestamping each event
  • rule LC2:
    • to message m, append the time t = Li
    • when process pj receives message (m, t), pj does:
      • compute Lj := max(Lj, t)
      • apply rule LC1
      • timestamp the event e = receive(m)
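Rules LC1 and LC2 translate almost directly into code; a minimal Python sketch (the class and method names are illustrative):

    class LamportClock:
        def __init__(self):
            self.time = 0                  # monotonically increasing counter

        def event(self):
            self.time += 1                 # LC1: increment before each event
            return self.time

        def send(self, message):
            t = self.event()               # LC2(a): piggyback t = Li on m
            return (message, t)

        def receive(self, message_with_t):
            m, t = message_with_t
            self.time = max(self.time, t)  # LC2(b): Lj := max(Lj, t) ...
            return m, self.event()         # ... then LC1 timestamps receive(m)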
global state
• distributed garbage collection
  • garbage: not referenced within the system
  • must consider communication channels, too
• distributed deadlock detection
  • waits-for relationship
• distributed termination detection
  • must consider communication channels, too
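For deadlock detection, the stable predicate is a cycle in the waits-for relation; a small sketch, assuming the graph has already been assembled from per-process reports:

    def has_deadlock(waits_for):
        """waits_for: dict mapping a process to the set of processes it waits for."""
        visiting, done = set(), set()

        def dfs(p):
            visiting.add(p)
            for q in waits_for.get(p, ()):
                if q in visiting:          # back edge: a waits-for cycle
                    return True
                if q not in done and dfs(q):
                    return True
            visiting.discard(p)
            done.add(p)
            return False

        return any(dfs(p) for p in waits_for if p not in done)

    # e.g. has_deadlock({"p1": {"p2"}, "p2": {"p3"}, "p3": {"p1"}}) -> True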
• run: a total ordering of all the events in the global history, consistent with each local history's ordering →i, for i = 1, 2, ..., n
• linearization (consistent run): a run that is consistent with the → relation on the global history
• global state predicate:
  • maps the domain of global process states to {True, False}
  • stable (garbage, deadlock, termination)
  • unstable (ongoing)
safety and liveness
• safety:
  • deadlock is an undesirable predicate P
  • S0 is the system's initial state
  • we want P to evaluate to False for all states reachable from S0
• liveness:
  • termination is a desirable predicate P'
  • for any linearization L, P' evaluates to True for some state reachable from S0
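A minimal sketch of the safety side, assuming a finite state space and a hypothetical next_states() transition function:

    from collections import deque

    def safe(S0, next_states, P):
        """Safety: P (e.g. deadlock) is False in every state reachable from S0."""
        seen, frontier = {S0}, deque([S0])
        while frontier:
            s = frontier.popleft()
            if P(s):
                return False             # an undesirable state is reachable
            for t in next_states(s):
                if t not in seen:
                    seen.add(t)
                    frontier.append(t)
        return True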
shared resources
• critical section (CS) problem:
  • resource shared by distributed processes
  • solution based solely on message passing
• mutual exclusion solution: requirements
  • ME1 (safety): at most one process executing in the CS at a time
  • ME2 (liveness): process requests to enter/exit the CS eventually succeed
mutual exclusion
• mutex (mutual exclusion object):
  • resource shared by distributed processes
  • two states: {locked, unlocked}
• arbitration/scheduling: FIFO, priority, ...
• fairness
  • absence of starvation
  • ordering of process entry to the CS
    • who gets the lock next?
    • but, no global clocks
    • ... and, recall the deadlock problem
• ME3 (→ ordering): entry to the CS is granted in → order
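One classic algorithm meeting ME1 through ME3 is Ricart-Agrawala, which stamps every request with a Lamport timestamp and grants entry in → order; a condensed sketch, where the send() transport is an assumed primitive:

    class RicartAgrawala:
        def __init__(self, pid, peers, send):
            self.pid, self.peers, self.send = pid, peers, send
            self.clock = 0                   # Lamport clock (rules LC1/LC2)
            self.state = "RELEASED"          # RELEASED | WANTED | HELD
            self.request_ts = None
            self.replies = 0
            self.deferred = []               # peers whose replies we owe

        def request_cs(self):
            self.clock += 1
            self.state, self.replies = "WANTED", 0
            self.request_ts = (self.clock, self.pid)   # total order, pid breaks ties
            for q in self.peers:
                self.send(q, ("request", self.request_ts))

        def on_request(self, sender, ts):
            self.clock = max(self.clock, ts[0]) + 1
            if self.state == "HELD" or (self.state == "WANTED"
                                        and self.request_ts < ts):
                self.deferred.append(sender) # our earlier request wins: ME3
            else:
                self.send(sender, ("reply",))

        def on_reply(self):
            self.replies += 1
            if self.replies == len(self.peers):
                self.state = "HELD"          # all peers consented: ME1 holds

        def release_cs(self):
            self.state = "RELEASED"
            for q in self.deferred:          # unblock deferred requesters: ME2
                self.send(q, ("reply",))
            self.deferred = []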
peer-to-peer node lookup
• distributed index of distributed resources
• example algorithm: Chord
  • each peer's IP address is hashed to an m-bit key
  • the hash function is shared among peers
  • text example: 160-bit keys
  • some small fraction of keys is appropriated by real nodes
  • linear ascending search (modulo 2^m)
    • effectively, a circle of 2^m keys
  • for k, some key:
    • successor(k) := key of the next actual node at or after k on the circle
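A small sketch of successor() over a sorted list of the keys actually taken by nodes (the node list and the small m are illustrative):

    import bisect

    M = 6                                    # small m for illustration: 2**6 keys

    def successor(k, nodes, m=M):
        """Key of the next actual node at or after k, wrapping mod 2**m."""
        k %= 2 ** m
        i = bisect.bisect_left(nodes, k)
        return nodes[i % len(nodes)]         # wrap around the circle

    # e.g. successor(3, [1, 8, 14, 21, 32, 38, 42, 48]) -> 8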
Chord lookup
• index stored materials by resource name:
  • hash(name) => key
  • send the (name, file-IP-address) tuple to node(successor(hash(name)))
• find/retrieve materials by resource name:
  • hash(name) => key
  • contact successor(hash(name))
  • ask for the (name, file-IP-address) tuple
    • there may be multiple records
  • ask file-IP-address for the name materials
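A sketch of publish and find in terms of the successor() above, with SHA-1 standing in for the shared hash and a plain dict standing in for the per-node record stores (both are illustrative assumptions):

    import hashlib

    def key_of(name, m=M):
        return int(hashlib.sha1(name.encode()).hexdigest(), 16) % (2 ** m)

    def publish(index, name, file_ip, nodes):
        # store the (name, file-IP-address) tuple at node(successor(hash(name)))
        index.setdefault(successor(key_of(name), nodes), []).append((name, file_ip))

    def find(index, name, nodes):
        # contact successor(hash(name)); there may be multiple matching records
        records = index.get(successor(key_of(name), nodes), [])
        return [ip for n, ip in records if n == name]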
Chord infrastructure
• each participating node n stores:
  • the IP address of node(successor(key-of-n))
  • possibly the IP address of its predecessor as well
  • the set of resource records submitted by peers: {(name, file-IP-address)}
• “finger table” for node k
  • m entries: i = 0, ..., (m-1)
  • {(start_i, IP-address(successor(start_i)))}
  • start_i = (k + 2^i) mod 2^m
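Building node k's finger table straight from the start_i formula, reusing successor() from the earlier sketch:

    def finger_table(k, nodes, m=M):
        table = []
        for i in range(m):                   # entries 0 .. m-1
            start = (k + 2 ** i) % (2 ** m)
            table.append((start, successor(start, nodes)))
        return table

    # e.g. finger_table(1, [1, 8, 14, 21, 32, 38, 42, 48]) ->
    # [(2, 8), (3, 8), (5, 8), (9, 14), (17, 21), (33, 38)]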
Chord lookup example
• idea: the resource advertiser (repository) and the resource searcher both hash the resource name to the same index key
• search by sending a packet around the circle
  • without finger tables: n/2 lookups on average (pure linear search)
  • with finger tables: about log2(n) lookups (binary indexing)
    • forward to the finger that most closely precedes key(name)
• examples:
  • look up key 3 at node 1
  • look up key 14 at node 1
  • look up key 16 at node 1
• after each hop, the search starts anew further around the circle
• the search is over when a node knows the key sought lies between itself and its successor; that node returns its successor's IP address
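A sketch of that loop: hop to the finger that most closely precedes the key until the key falls between a node and its successor; the fingers and succ maps are assumed to be precomputed from the earlier sketches:

    def between(x, a, b, m=M):
        """True if x lies on the half-open arc (a, b] going clockwise."""
        a, b, x = a % 2 ** m, b % 2 ** m, x % 2 ** m
        return (a < x <= b) if a < b else (x > a or x <= b)

    def lookup(key, start_node, fingers, succ, m=M):
        n = start_node
        while not between(key, n, succ[n], m):      # not yet between n and successor
            candidates = [f for _, f in fingers[n] if between(f, n, key - 1, m)]
            # closest preceding finger, else fall back to the plain successor
            n = max(candidates, key=lambda f: (f - n) % 2 ** m) if candidates else succ[n]
        return succ[n]                              # node responsible for key

    # usage (illustrative), matching the examples above:
    # nodes   = [1, 8, 14, 21, 32, 38, 42, 48]
    # fingers = {n: finger_table(n, nodes) for n in nodes}
    # succ    = {n: successor(n + 1, nodes) for n in nodes}
    # lookup(3, 1, fingers, succ)  -> 8
    # lookup(14, 1, fingers, succ) -> 14 (via node 8)
    # lookup(16, 1, fingers, succ) -> 21 (via node 14)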