Optimizing Buffer Management for Reliable Multicast Zhen Xiao AT&T Labs – Research Joint work with Ken Birman and Robbert van Renesse
Why important? • Many applications desire reliable or semi-reliable delivery. • IP multicast is best-effort. • Buffering is necessary for retransmission. • Buffer space is limited! How can the available buffer space be used most efficiently?
Previous Work • RMTP: Buffer all messages on repair servers. • Impractical for long-lived sessions. • SRM: Regenerate messages at the application. • Buffer management at the application level remains a challenge. • Stability Detection: Buffer messages until they are stable (i.e. received by all members in the group). • It takes a long time to achieve stability in a large multicast group. • Bimodal Multicast: Buffer messages for a fixed amount of time. • Optimization: buffer messages on a sub-group of members.
Talk Overview • RRMP: Randomized Reliable Multicast Protocol • Error recovery algorithm in RRMP: Infocom 2001 • Buffering algorithms in RRMP: DSN 2002 • Feedback based short-term buffering • Randomized long-term buffering • Simulation results • Summary
RRMP: Randomized Reliable Multicast Protocol Key idea: combine previous work on randomized error recovery with the Bimodal Multicast protocol and hierarchical error recovery similar to that employed by tree-based protocols. • Group receivers into a hierarchy. • Do not use any repair server. • parent region: the least upstream region of a receiver in the hierarchy. • Each receiver maintains group membership information about receivers in its region and receivers in its parent region.
Two-phase Error Recovery Assume a receiver p detects a message loss. • local loss: the loss affects a fraction of receivers in p’s region. • regional loss: the loss affects all receivers in p’s region. Local recovery: a receiver tries to recover the loss from randomly selected neighbors. Remote recovery: some receivers in the region request retransmissions from the parent region.
[Figure: multicast tree showing sender s, routers, and receivers, including receivers p and q.]
Overview of Buffering Scheme Error recovery and the buffering that serves it: • Local recovery: short-term buffering. • Remote recovery: long-term buffering. Short-term buffering: when a message is first introduced into the system. Long-term buffering: when almost all receivers in a region have received the message.
Feedback-based Short-term Buffering idea: a member uses the retransmission requests it receives as feedback to estimate how many members in the region still miss the message. n: the size of a region. p: the fraction of members in this region missing a message. The probability that a member will not receive any request, when each of the np members missing the message sends a request to a randomly selected member of the region: $\left(1 - \frac{1}{n-1}\right)^{np}$. As $n \to \infty$, this probability can be approximated by $e^{-p}$. idle message: no request for this message has been received for a time interval T. (T is the idle threshold.) Short-term buffering: buffer a received message until it becomes idle. Result: messages most needed in the system stay in the buffer longer. No extra traffic overhead!
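The idle rule can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `prob_no_request` and `ShortTermBuffer` are hypothetical, times are in milliseconds, and the sketch assumes that the idle timer starts when the message is received and restarts on every retransmission request. The probability function assumes each of the round(n·p) members missing a message sends one request to a uniformly chosen other member of the region.

```python
import math


def prob_no_request(n: int, p: float) -> float:
    """Probability that one member receives no request (assumed model:
    round(n * p) requesters, each picking one of the other n - 1 members)."""
    missing = round(n * p)
    return (1.0 - 1.0 / (n - 1)) ** missing


class ShortTermBuffer:
    """Keeps a message until no request for it arrives for idle_threshold ms."""

    def __init__(self, idle_threshold: int):
        self.idle_threshold = idle_threshold
        self.last_activity = {}  # msg_id -> time of receipt or of last request

    def on_receive(self, msg_id, now: int) -> None:
        # Assumption: the idle timer starts at receipt.
        self.last_activity[msg_id] = now

    def on_request(self, msg_id, now: int) -> bool:
        """Record a retransmission request; True if the message is still buffered."""
        if msg_id in self.last_activity:
            self.last_activity[msg_id] = now  # feedback: someone still needs it
            return True
        return False

    def expire_idle(self, now: int) -> list:
        """Drop and return every message that has been idle for the threshold."""
        idle = [m for m, t in self.last_activity.items()
                if now - t >= self.idle_threshold]
        for m in idle:
            del self.last_activity[m]
        return idle


if __name__ == "__main__":
    # With n = 100 and half the region missing the message, the exact value
    # is already close to the large-n approximation exp(-p).
    print(prob_no_request(100, 0.5), math.exp(-0.5))
```

Because a request refreshes the timer, a message that many members still miss keeps being requested and so stays buffered, while a message nobody asks for expires after one threshold interval.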
Simulation Results • Short-term buffering in a local region. • 100 members in the region, fully connected. • RTT between any two members: 10ms. • idle threshold: 40ms. • Outcome of IP multicast: select a random subset of members to hold a message initially. • Measure how long these members buffer the message.
[Figure: multicast tree showing sender s, routers, and receivers p and q; three receivers are labeled "idle", and the reply shown is "Sorry, you are out of luck!"]
Randomized Long-term Buffering idea: provide long-term buffering for an idle message at a small subset of receivers in each region. Load balancing: spread the load of buffering across all receivers in a region. Randomized algorithm: each member independently tosses a coin to decide whether to become a long-term bufferer. C: the expected number of long-term bufferers, so in a region of n members each coin comes up "buffer" with probability C / n. Saving in buffer space: a factor of n / C. Network dynamics: message transfer
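The coin toss can be sketched in a few lines of Python. This is an illustration under one assumption: every member uses the same probability C / n, which is what makes the expected number of bufferers equal to C. The function name `choose_long_term_bufferers` and the integer member ids are hypothetical.

```python
import random


def choose_long_term_bufferers(n: int, c: float, rng: random.Random) -> list:
    """Each of the n members independently keeps the idle message
    with probability c / n, so the expected number of keepers is c."""
    keep_prob = c / n
    return [member for member in range(n) if rng.random() < keep_prob]


if __name__ == "__main__":
    rng = random.Random(1)
    n, c, trials = 100, 6, 2000
    counts = [len(choose_long_term_bufferers(n, c, rng)) for _ in range(trials)]
    # The average number of bufferers per idle message should be close to C.
    print(sum(counts) / trials)
```

Since the coins are tossed independently for each message, different messages end up at different members, which is how the buffering load spreads across the region.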
The probability that k members buffer an idle message for different values of C, the expected number of long-term bufferers.
The probability that no member buffers an idle message decreases exponentially with C
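The two plotted quantities can be recomputed with a short Python sketch, assuming each of the n members becomes a long-term bufferer independently with probability C / n. Under that assumption the number of bufferers is binomial, and the probability of having no bufferer is (1 - C/n)^n, which is at most e^{-C}. The function names are hypothetical.

```python
import math


def prob_k_bufferers(n: int, c: float, k: int) -> float:
    """Binomial probability that exactly k of n members buffer the message."""
    q = c / n
    return math.comb(n, k) * q ** k * (1 - q) ** (n - k)


def prob_no_bufferer(n: int, c: float) -> float:
    """Probability that no member in the region buffers the message."""
    return (1 - c / n) ** n


if __name__ == "__main__":
    n = 100
    for c in (2, 4, 6, 8):
        # Compare the exact value with the exponential bound exp(-C).
        print(c, prob_no_bufferer(n, c), math.exp(-c))
```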
[Figure: multicast tree showing sender s, routers, and receivers p and q; speech labels: "help!" and "Do you have the msg?" (repeated to several receivers).]
Search Overhead • Evaluate penalty in recovery time due to search for a bufferer in a region with 100 members. • RTT between any two members: 10ms. • Assume a remote request arrives at a random member. • Simulation repeated 100 times with different random seeds. • Question I: how does the search time change with the number of bufferers? • Question II: how does the search time change with the region size?
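A rough feel for Question I can be had from a small Python simulation. The slides do not spell out the search procedure, so this sketch assumes the simplest one: the member that receives the remote request queries the other members one at a time in random order until it finds a bufferer. It also assumes a fixed number of bufferers per trial. All names are hypothetical, and the output is a probe count, not the recovery time measured in the talk.

```python
import random


def probes_until_bufferer(n: int, bufferers: set, rng: random.Random):
    """Count members queried after the request lands on a random member.
    Returns 0 if that member is itself a bufferer, None if nobody buffers."""
    if not bufferers:
        return None
    first = rng.randrange(n)
    if first in bufferers:
        return 0
    others = [m for m in range(n) if m != first]
    rng.shuffle(others)  # assumed search order: uniformly random, no repeats
    for probes, member in enumerate(others, start=1):
        if member in bufferers:
            return probes


def mean_probes(n: int, c: int, trials: int, seed: int) -> float:
    """Average probe count when exactly c random members hold the message."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        bufferers = set(rng.sample(range(n), c))
        total += probes_until_bufferer(n, bufferers, rng)
    return total / trials


if __name__ == "__main__":
    for c in (2, 4, 6, 8, 10):
        print(c, mean_probes(100, c, 100, seed=c))
```

If probes were issued strictly one after another, multiplying the probe count by the 10ms RTT would give an upper estimate of the added delay; querying several members in parallel would shorten it.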
Summary • Efficient buffer management is essential for reliable multicast in a large group. • Two-phase buffering addresses the variance in delivery latency in a large group. • Retransmission requests can be used as feedback to allocate buffer space adaptively. • Randomization spreads the load of buffering among all members in a group.