
CS614: Time Instead of Timeout

CS614: Time Instead of Timeout. Ken Birman February 6, 2001. What we’re after. A general means for distributed communication Letting n processes coordinate an action such as resource management or even replicating a database. Paper was first to tackle this issue


Presentation Transcript


  1. CS614: Time Instead of Timeout Ken Birman February 6, 2001

  2. What we’re after • A general means for distributed communication • Letting n processes coordinate an action such as resource management or even replicating a database. • Paper was first to tackle this issue • Includes quite a few ideas, only some of which are adequately elaborated

  3. Earlier we saw… • Distributed consensus is impossible in an asynchronous system with even one faulty process. • It is impossible to determine whether a process has failed or is merely “slow”. • Solution 1: Timeouts • Can easily be added to asynchronous algorithms to provide guarantees about slowness. • Assumption: Timeout implies failure.

  4. Asynchronous Synchronous • Start with an asynchronous algorithm that isn’t fault-tolerant • Add timeout to each message receipt • Assumes bounds on the message transmission time and processing time • Exceeding the bound implies failure • Easy to “bullet-proof” a protocol. • Practical if bounds are very conservative

  5. Example: Resource Allocation [Figure: P sends “I want Resource X” to Q; Q replies “Yes” or “No, In Use”; P’s timeout = 2δ]
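The timeout version of this exchange can be sketched in a few lines. This is only an illustration under assumptions: the names `request_resource`, `make_q`, and `DELTA` are hypothetical, Q is a local function rather than a remote process, and the 2δ wait is the one shown on the slide.

```python
import queue

DELTA = 0.05  # assumed bound (seconds) on one-way delay plus processing


def request_resource(send, replies, delta=DELTA):
    """Ask Q for the resource and wait at most 2*delta for the reply.

    Returns Q's answer, or "failed" if nothing arrived within the bound.
    """
    send("I want Resource X")
    try:
        return replies.get(timeout=2 * delta)
    except queue.Empty:
        # Slide assumption: exceeding the bound implies failure.
        return "failed"


def make_q(replies, in_use, crashed=False):
    """Build a toy process Q that answers on the `replies` queue."""
    def handle(_msg):
        if crashed:
            return  # a crashed Q never answers
        replies.put("no, in use" if in_use else "yes")
    return handle
```

Note that a reply is needed even when the answer is "yes"; the NULL-message idea on the next slide removes exactly that message.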

  6. Null messages • Notice that if a message doesn’t contain real data, we can sometimes skip sending it • For example: if resource isn’t in use, I could skip sending the reply and after δ time interpret your “inaction” as a NULL message • Lamport is very excited by this option • A system might send billions of NULL messages per second! And do nothing on receiving them!! Billions and billions…

  7. Another Synchronous System • Round Based • Each round characterized by time needed to receive and process all messages.

  8. Lamport’s version: • Use Physical Clocks • Also fault-tolerant realtime atomic broadcast • Assumptions about time lead to conclusions other than failure • Passage of time can also have “positive” value • Provides generality for distributed computing problems • State machines • Resource acquisition and locking • Expense?

  9. Assumptions • Bounded message delay δ • Requires bandwidth guarantees. • A message delayed by > δ treated as failure. • Clock Synchronization • Clock times differ by less than ε. • Use clock synchronization algorithms (could be costly; revisit in next lecture). • Any process can determine message origin (e.g. using HMAC signatures) • Network cannot be partitioned

  10. An Algorithm…
      If send message queue not empty
          Send m with timestamp Ti
      If receive message queue not empty
          If queue contains exactly one message m from j with timestamp Ti - (δ + ε)
              Then Received Message = m
              Else Received Message = NULL
      • Implies Δ = (δ + ε)
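The receive rule above can be sketched as follows. The function name, the tuple layout of the inbox, and the use of integer clock ticks (to avoid comparing floats for equality) are assumptions made for the sketch, not part of the slides.

```python
NULL = None  # stands for "nothing was sent": the NULL message


def receive_at(local_time, inbox, sender, delta, epsilon):
    """Return what `sender` sent with timestamp local_time - (delta + epsilon).

    `inbox` is a list of (sender, timestamp, payload) tuples that have
    arrived so far. Exactly one match is delivered; zero or several
    matches are treated as NULL, as in the slide's rule.
    """
    wanted = local_time - (delta + epsilon)  # i.e. local_time - Delta
    matches = [p for (s, ts, p) in inbox if s == sender and ts == wanted]
    return matches[0] if len(matches) == 1 else NULL
```

With δ = 2 and ε = 1, a message stamped 10 is acted on when the receiver's clock reads 13; at any other tick the receiver sees NULL without anything having crossed the network.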

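  11. Example [Figure: timelines for processes i, j and j’; i sends Message M at Ti; the delivery points are marked Ti+Δ, Tj+Δ and Tj’+Δ on the respective timelines; the clocks are offset from one another by at most ε]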

  12. More • This can be expressed more elegantly as a broadcast algorithm (more later). • Can inductively extend definition to allow for “routing” across path of length n • Δ = (n·δ + ε) • To tolerate f failstop failures, will need f + 1 disjoint paths. • To tolerate f Byzantine Failures, will need 2·f + 1 disjoint paths. • Transmitting NULL message easy: do nothing.
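The two rules of thumb on this slide are simple enough to write down directly. The function names are hypothetical; the formulas are the ones stated above.

```python
def delivery_delay(path_length, delta, epsilon):
    """Delta for a message routed across `path_length` hops: n*delta + epsilon."""
    return path_length * delta + epsilon


def disjoint_paths_needed(f, byzantine=False):
    """Disjoint paths required to tolerate f failures.

    f + 1 for failstop failures, 2*f + 1 for Byzantine failures.
    """
    return 2 * f + 1 if byzantine else f + 1
```

For example, with δ = 10 ms and ε = 2 ms a three-hop path gives Δ = 32 ms, and tolerating two Byzantine failures requires five disjoint paths.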

  13. Even More • For good guarantees, need close synchronization. • Message arrives between Tmessage - ε and Tmessage + δ + ε (on the receiver’s clock) • Thus, need to wait (δ + ε).
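One way to see where the arrival window comes from is to compute the receiver's clock reading at the moment of arrival. This is a sketch under the slide's assumptions (clock offset at most ε in either direction, transit time between 0 and δ); the function names are hypothetical.

```python
def receiver_clock_at_arrival(t_message, clock_offset, transit):
    """Receiver's clock reading when a message stamped t_message arrives.

    clock_offset: receiver clock minus sender clock, |clock_offset| <= epsilon
    transit:      time spent in the network, 0 <= transit <= delta
    """
    return t_message + clock_offset + transit


def arrival_window(t_message, delta, epsilon):
    """Earliest and latest possible arrival, measured on the receiver's clock."""
    return (t_message - epsilon, t_message + delta + epsilon)
```

The earliest case is a receiver clock that runs ε behind with zero transit time; the latest is a clock ε ahead with the full δ in transit, which is why the receiver waits until its clock reads Tmessage + δ + ε before acting.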

  14. Synchronization required? • A means to reliably broadcast to all other processes. • For process P broadcasting message M at time Tp, every (correct) process must receive the message by time Tp + Δ • For correct j and j’, either both receive the message, by Tj + Δ and Tj’ + Δ respectively, or neither does.

  15. = Atomic Broadcast • Atomicity • All correct processors receive the same messages. • Same order • All messages delivered in same order to all processors. • Termination • All updates delivered by T + Δ.
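The three properties can be read as a check over delivery logs. The sketch below is a simplification: it assumes a single global time scale (ignoring the ε clock skew), assumes the logs belong to correct processes only, and uses a hypothetical log layout.

```python
def check_atomic_broadcast(logs, send_times, delta):
    """Check atomicity, order and termination on recorded deliveries.

    logs:       {process: [(deliver_time, msg_id), ...]} in delivery order
    send_times: {msg_id: time the broadcast was initiated}
    """
    sequences = [[m for _, m in log] for log in logs.values()]
    # Atomicity + same order: every process delivered the same sequence.
    same_sequence = all(seq == sequences[0] for seq in sequences)
    # Termination: every broadcast message was delivered, and by T + Delta.
    all_delivered = all(set(seq) == set(send_times) for seq in sequences)
    on_time = all(t <= send_times[m] + delta
                  for log in logs.values() for t, m in log)
    return same_sequence and all_delivered and on_time
```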

  16. Lamport’s Assumption • Somebody implements Atomic Broadcast black box. • Next slide summarizes options • Lamport briefly explains that previous point to point algorithm is strong enough. • Only assumes ability to send along a path correctly.

  17. Atomic Broadcast: [CASD]* • Describes 3 atomic broadcast algorithms. • All based on Diffusion (Flooding) • Varying degrees of protection • 1. Tolerant of omission failures • Δ = πδ + dδ + ε • 2. Works in presence of Clock Failures • Δ = π(δ + ε) + dδ + ε • 3. Works in presence of Byzantine Failures • Δ = π(δ + ε) + dδ + ε • δ much larger than previous for message authentication
      * F. Cristian, H. Aghili, R. Strong and D. Dolev, "Atomic Broadcast: From Simple Message Diffusion to Byzantine Agreement", in Proc. 15th Int. Symp. on Fault-Tolerant Computing, June 1985.
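To get a feel for the three bounds, the formulas can be evaluated directly. The slide does not define π and d; the sketch assumes π is the number of failures to be tolerated and d the network diameter, so treat that reading, and the function names, as assumptions.

```python
def casd_omission(pi, d, delta, epsilon):
    """Protocol 1 (omission failures): pi*delta + d*delta + epsilon."""
    return pi * delta + d * delta + epsilon


def casd_clock_or_byzantine(pi, d, delta, epsilon):
    """Protocols 2 and 3: pi*(delta + epsilon) + d*delta + epsilon.

    For the Byzantine protocol the same formula applies, but delta itself
    is much larger because each hop must authenticate the message.
    """
    return pi * (delta + epsilon) + d * delta + epsilon
```

With π = 2, d = 3, δ = 10 and ε = 1 the bounds are 51 and 53 time units; the difference between the first two protocols is the extra π·ε term.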

  18. State Machine • General model for computation (State Machine = Computer!) • Describe computation in terms of state + transformations on the state

  19. State Machines • Multiple replicas in lock-step • Number of replicas bounded (below) by fault-tolerance objectives • Failstop model • Failover, ≥ f + 1 replicas • Byzantine model • Voting, ≥ 2·f + 1 replicas
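The difference between failover and voting can be sketched as below. The minimum replica counts used here (f + 1 and 2·f + 1) are the standard ones for these failure models and match the disjoint-path counts on slide 12; the function names are hypothetical.

```python
from collections import Counter


def replicas_needed(f, byzantine=False):
    """Minimum replicas to mask f failures: f + 1 failstop, 2*f + 1 Byzantine."""
    return 2 * f + 1 if byzantine else f + 1


def vote(outputs, f):
    """Return the output reported by at least f + 1 replicas, else None.

    With 2*f + 1 replicas and at most f faulty ones, the f + 1 correct
    replicas agree, so their value always reaches the threshold.
    """
    if not outputs:
        return None
    value, count = Counter(outputs).most_common(1)[0]
    return value if count >= f + 1 else None
```

In the failstop model no vote is needed: any replica that answers is correct, so the client simply fails over to the next live one.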

  20. State Machine: Implementation
      Let CLOCK = current time
      While ( TRUE )
          Execute Message[CLOCK - Δ]
          Execute Local Processing(CLOCK)
          Generate and Send Message[CLOCK]
      • If there exist multiple messages with same time stamp, create an ordering based on sending process.

  21. State Machine (Cont.) • If we use our broadcast algorithm, all processes will get message by Tsender + Δ • Using the sending process id to break ties ensures everyone executes messages in same order.
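The loop and the tie-breaking rule can be sketched as a small simulation. It assumes integer clock ticks, that the broadcast layer has already placed every message in the replica's list by its timestamp + Δ, and a caller-supplied `apply` function; all names are hypothetical.

```python
def run_state_machine(messages, delta, start, end, apply, state):
    """Run one replica from tick `start` to tick `end` inclusive.

    messages: list of (timestamp, sender_id, payload), in any arrival order
    apply:    function (state, sender_id, payload) -> new state
    At each tick the replica executes the messages stamped CLOCK - delta,
    ordered by sender id so that every replica breaks ties the same way.
    """
    for clock in range(start, end + 1):
        due = sorted((m for m in messages if m[0] == clock - delta),
                     key=lambda m: m[1])
        for _, sender, payload in due:
            state = apply(state, sender, payload)
    return state
```

Two replicas that receive the same messages in different network orders still execute them in the same sequence, which is the point of slide 21.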

  22. State Machines for Distributed Applications • Resource allocation • All processes maintain list of which process has resource “locked”. • Lock expires after Δ’ seconds • Requests for resource are broadcast to all • Rules govern who is granted lock (followed by all correct processes) • Ensure no starvation • Maintain consistency of resource locking

  23. Example: Resource Allocation [Figure: timelines for processes i, j and j’ starting at Ti, Tj and Tj’; Request R and Request R’ are broadcast among them; Wait Time: Δ]
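A replica's locking rules might look like the sketch below. The specific policy (serve requests in timestamp order, lower process id first among equal timestamps, one holder at a time, lock released after a fixed lifetime Δ’) is an assumption chosen to satisfy the properties listed on slide 22; the slides do not spell out the rules, and the names are hypothetical.

```python
def grant_locks(requests, delta, lock_lifetime, horizon):
    """Return the (grant_time, process_id) decisions one replica makes.

    requests:      iterable of (timestamp, process_id), in any arrival order
    delta:         wait before a request may be acted on (the Delta window)
    lock_lifetime: ticks after which a granted lock expires (Delta')
    """
    requests = list(requests)
    grants, pending, free_at = [], [], 0
    for clock in range(horizon + 1):
        # Requests stamped clock - delta are now known to every replica.
        pending.extend(sorted(r for r in requests if r[0] == clock - delta))
        if pending and clock >= free_at:
            _, pid = pending.pop(0)  # FIFO queue, so nobody starves
            grants.append((clock, pid))
            free_at = clock + lock_lifetime
    return grants
```

Because the decision depends only on timestamps and ids, every correct replica computes the same grant sequence without exchanging acknowledgements.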

  24. Comparison • No explicit acknowledgement needed • Would be needed in traditional asynchronous algorithm • But here, requesting process knows that any conflicting request would arrive within T + Δ window.

  25. Key: • Non-occurrence of an event (a non-request) carries information: we can safely lock the resource! • Cost is the delay, as message sits in “holding pen.” • Concern about scalability in n: • We always see n requests in each Δ time period, so Δ will grow in n. Not addressed • Must bound request processing time so that all can be satisfied (else could starve process with higher id hence lower priority)

  26. More on Comparison: Resource Allocation
      Timeout: Max Delay: 2·δ; Average Delay: 2·δexp; Messages: n + dependent on failure mode
      Time [Lamport]: Max Delay: Δ = δ + ε; Average Delay: Δ = δ + ε; Messages: dependent on failure mode
      • But is request processing time the “real” issue?
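Plugging numbers into the comparison makes the trade-off concrete. The values below are made up for illustration, δexp is read as the expected (typical) one-way delay, and the function names are hypothetical.

```python
def timeout_delays(delta, delta_expected):
    """Timeout scheme: worst case 2*delta, typical case 2*delta_expected."""
    return {"max": 2 * delta, "average": 2 * delta_expected}


def lamport_delays(delta, epsilon):
    """Time-based scheme: always waits Delta = delta + epsilon."""
    wait = delta + epsilon
    return {"max": wait, "average": wait}


# Illustrative numbers only: a conservative bound of 100 ms, a typical
# delay of 10 ms, and clocks synchronized to within 5 ms.
timeout = timeout_delays(delta=100, delta_expected=10)
lamport = lamport_delays(delta=100, epsilon=5)
```

When the bound δ is far more conservative than the typical delay, the time-based scheme wins on worst-case delay (105 ms against 200 ms here) but loses on the average (105 ms against 20 ms), since every request waits out the full Δ.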

  27. Characterizing ε • ε proportional to δvar • Low level algorithms can achieve good clock synchronization. • δvar small for low-level algorithms • δvar large for high-level algorithms • Variance added by traversing low levels of protocol stack

  28. Summary… • An application expressed as state machine transitions can easily be turned into a distributed algorithm. • An event-based implementation can easily be created from the transitions.

  29. Other State Machine uses • Distributed Semaphores • Transaction Commit • State Machine synchronization core on top of distributed apps. • Entire application need not be distributed state machine.

  30. Ideas in this paper • Coordination and passing of time modeled as synchronous execution of steps of a state machine • Absence of a message becomes NULL message after delay Δ • Notion of dynamic membership (vague) • Broadcast to drive state machine (vague) • State transfer for restart (vague) • Scalability in n (not addressed) • Fault-tol. (ignores application semantics) • Δ-T behavior (real-time mechanism)

  31. Discussion • How far can we take the state machine model? • Can it be made to scale well? • Extreme clock synchronization dependence, practical? Worth it? • Possibly large waiting time for each message, dependent upon worst case message delivery latency
