This thesis by student Jia-Hui Huang explores overlay multicast mechanisms, focusing on Topology-Aware Grouping and End System Multicast, which aim to improve multicast efficiency and reliability. The study outlines the design and simulation of these mechanisms and summarizes the results.
Overlay Multicast Mechanism Student : Jia-Hui Huang Adviser : Kai-Wei Ke Date : 2006/5/9
Outline • Introduction • Topology-Aware Grouping • End system multicast • Simulation • Summary
Introduction • Drawbacks of IP multicast • Requires routers to maintain per-group state • Reliability, congestion control, and flow control are more difficult to provide • Overlay multicast • Builds an overlay multicast tree on top of the IP layer • Unicasts data along tree links • Also called application-level multicast
Overlay multicast mechanisms • Topology-Aware Grouping (TAG) • End system multicast (ESM) • Narada
TAG (1/2) • Exploits underlying network topology information • Uses path overlap among members to reduce • Delay • Link stress • Each TAG node maintains the IP addresses and paths of its parent and children in a family table (FT)
TAG (2/2) • Definitions • $P(A, B)$: a path from node A to node B • $P(S, A)$: the spath of A, where S is the root of the tree • The length of a path $P$, or $len(P)$, is the number of routers in the path • $A \prec B$ if $P(S, A)$ is a prefix of $P(S, B)$, where S is the root of the tree
Complete path matching (1/2) • Similar to longest prefix matching • The algorithm considers three mutually exclusive conditions • 1. Select a child A of C such that the spath of A is a prefix of the spath of N • 2. Select the children of C whose spaths have the spath of N as a prefix • 3. No child of C satisfies 1 or 2 • N: the new member • C: the node being examined
Complete path matching (2/2) • The algorithm recurses until condition 2 or 3 is met • Tree management • Member join • Member leave • Fault resilience • Parent and children periodically exchange messages • Child failure: the parent discards the child from its FT • Parent failure: the child rejoins
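The three conditions of complete path matching can be sketched as a short join routine. This is a minimal illustration, not the TAG implementation: the `Node` class, `is_prefix`, and `join` are hypothetical names, spaths are assumed to be plain lists of router names, and condition 1 is assumed to take precedence when a child has exactly the same spath as the new member.

```python
from dataclasses import dataclass, field

@dataclass(eq=False)
class Node:
    name: str
    spath: list                      # routers on the path from root S to this node
    children: list = field(default_factory=list)

def is_prefix(p, q):
    """True if router sequence p is a prefix of router sequence q."""
    return q[:len(p)] == p

def join(root, new):
    """Attach `new` to the tree and return the node chosen as its parent."""
    c = root
    while True:
        # Condition 1: some child's spath is a prefix of new's spath; descend.
        child = next((a for a in c.children if is_prefix(a.spath, new.spath)), None)
        if child is not None:
            c = child
            continue
        # Condition 2: children whose spaths extend new's spath move under new.
        adopted = [a for a in c.children if is_prefix(new.spath, a.spath)]
        # Condition 3 is the case where `adopted` is empty.
        c.children = [a for a in c.children if a not in adopted] + [new]
        new.children.extend(adopted)
        return c

# Hypothetical members: D2 joins first, then D1 (which adopts D2), then N.
s = Node("S", [])
d1, d2, n = Node("D1", ["R1"]), Node("D2", ["R1", "R2"]), Node("N", ["R1", "R2", "R3"])
parents = [join(s, d2).name, join(s, d1).name, join(s, n).name]
```

In this example D1 ends up between S and D2 because its spath is a prefix of D2's, which is the path-overlap grouping the slides describe.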
ESM • Shifts multicast features to end systems • Group membership • Multicast routing • Packet duplication • Uses a self-organizing and fully distributed algorithm • Narada algorithm • Two steps of the Narada algorithm • Construct a mesh • Construct per-source spanning trees on the mesh
ESM Concept • Link stress ($S_i$): number of identical copies of a packet carried by a physical link • Distance ($d_i$) • Resource usage • [Figure: example physical topology with hosts A, B, C, D and link delays, and the complete virtual graph among the hosts] • IP multicast resource usage: 30 • IP unicast resource usage: 57 • End system multicast resource usage: 32
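The three totals on this slide can be reproduced with a small calculation, assuming resource usage is the sum over physical links of distance times stress ($\sum_i d_i \cdot S_i$). The topology below is an assumption: the figure did not survive extraction, so the two-router layout, the link delays, and the names `DELAY`, `ROUTE`, and `resource_usage` are hypothetical, chosen only because they give the slide's numbers. Stress is counted per undirected link.

```python
from collections import Counter

# Hypothetical physical topology; delays chosen to reproduce the slide's totals.
DELAY = {
    frozenset({"A", "R1"}): 1,
    frozenset({"B", "R1"}): 1,
    frozenset({"R1", "R2"}): 25,
    frozenset({"C", "R2"}): 1,
    frozenset({"D", "R2"}): 2,
}

# Physical route taken by each host-to-host unicast transfer used below.
ROUTE = {
    ("A", "B"): ["A", "R1", "B"],
    ("A", "C"): ["A", "R1", "R2", "C"],
    ("A", "D"): ["A", "R1", "R2", "D"],
    ("C", "D"): ["C", "R2", "D"],
}

def resource_usage(transfers):
    """Sum of d_i * S_i over physical links, where S_i is the link stress."""
    stress = Counter()
    for transfer in transfers:
        hops = ROUTE[transfer]
        for u, v in zip(hops, hops[1:]):
            stress[frozenset({u, v})] += 1
    return sum(DELAY[link] * s for link, s in stress.items())

unicast = resource_usage([("A", "B"), ("A", "C"), ("A", "D")])   # A sends to each host
esm = resource_usage([("A", "B"), ("A", "C"), ("C", "D")])       # C relays to D
ip_multicast = sum(DELAY.values())                               # one copy per link
print(ip_multicast, unicast, esm)  # 30 57 32
```

Under these assumptions, end system multicast costs only slightly more than IP multicast because the expensive R1 to R2 link is crossed once instead of twice.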
Narada Design (1/2) • Objectives of the Narada algorithm • Self-organizing • Overlay efficiency • Self-improving • Components of the Narada algorithm • Group management • Mesh performance • Data delivery
Narada Design (2/2) • Why the algorithm uses two steps (mesh, then tree) • Group management functions are abstracted out and handled at the mesh • Distributed heuristics can repair mesh partitions • Standard routing algorithms can be leveraged to construct the data delivery trees • [Figure: mesh and tree]
Group management (1/5) • Membership is managed in a distributed manner • Every member maintains a list of the other members in the group • The list must be updated when a member joins, leaves, or fails • Refresh message mechanism • Each member periodically generates a refresh message with a sequence number • Refresh messages are disseminated along the mesh
Group management (2/5) • Member i keeps track of the following information for every other member k in the group • Address of member k • Last sequence number received from k • Time at which that sequence number was first received • Reducing the overhead of refresh messages • Each member periodically exchanges its knowledge of group membership with its neighbors
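The per-member state listed above can be sketched as a small table keyed by member address. This is an illustration under assumptions, not Narada's code: `Entry`, `MemberTable`, `on_refresh`, and `silent_members` are hypothetical names, and the timeout value is a free parameter.

```python
from dataclasses import dataclass

@dataclass
class Entry:
    last_seq: int       # highest sequence number seen from member k
    first_seen: float   # local time when that sequence number first arrived

class MemberTable:
    def __init__(self):
        self.entries = {}   # member address k -> Entry

    def on_refresh(self, k, seq, now):
        """Record a refresh from k; return True if it carried a newer sequence number."""
        entry = self.entries.get(k)
        if entry is None or seq > entry.last_seq:
            self.entries[k] = Entry(seq, now)
            return True
        return False        # stale or duplicate, nothing to update

    def silent_members(self, now, timeout):
        """Members whose latest sequence number is at least `timeout` old."""
        return sorted(k for k, e in self.entries.items()
                      if now - e.first_seen >= timeout)

table = MemberTable()
table.on_refresh("k1", 1, now=0.0)
table.on_refresh("k2", 1, now=0.0)
table.on_refresh("k1", 2, now=6.0)
silent = table.silent_members(now=10.0, timeout=5.0)
```

Only updates that returned `True` would need to be passed on to neighbors, which is one way to keep refresh overhead down.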
Group management (3/5) • Three operations of group management • Member join • Member leave and failure • Repairing mesh partitions • Member join process • Assumes the joining member can obtain a list of some group members • The joining member randomly selects members from the list and sends them join messages • A join message requests that the sender be added as a neighbor of the recipient • The process repeats until the member successfully joins the group • The refresh message mechanism is then used to obtain group information
Group management (4/5) • Member leave and failure • A member must notify its neighbors before leaving • The leave information is propagated to the rest of the group members • Abrupt failure • Detected by neighbors when they stop receiving refresh messages • Neighbors propagate this information to the other members • Example: failure of node C • [Figure: mesh of members A through G]
Group management (5/5) • Repairing mesh partitions • Member failure may cause a partition • Each member maintains a queue of members from which it has stopped receiving refresh messages for at least a given time • Each member periodically runs a scheduling algorithm that probes the member at the head of the queue and removes it from the queue
Mesh performance (1/3) • The constructed mesh can be suboptimal because • Neighbors are selected at random when joining • Links added during partition repair may not be useful in the long term • Underlying network conditions may vary • A utility mechanism adds or drops links dynamically to improve mesh quality
Mesh performance (2/3) • The utility function depends on which performance metric is specified • Example: latency and bandwidth (conferencing applications) • Addition of links • Every member periodically probes some random members that are not its neighbors • It evaluates the utility of adding a link to each probed member • It adds the link if the utility exceeds a given threshold
Mesh performance (3/3) • Dropping of links • Every member periodically computes the cost of its link to every neighbor using the cost algorithm • The cost of a link between i and j, as perceived by i, is the number of group members for which i uses j as the next hop • The member picks its lowest-cost link and drops it if the cost falls below a threshold
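The link-dropping rule can be sketched from member i's point of view as follows. This is a simplified illustration: `link_costs` and `link_to_drop` are hypothetical names, the routing table is assumed to be a plain mapping from each group member to the neighbor used as next hop, and only i's own perception of the cost is considered.

```python
from collections import Counter

def link_costs(next_hop, neighbors):
    """Cost of each link (i, j) as seen by i: how many members i reaches via j."""
    used = Counter(next_hop.values())
    return {j: used[j] for j in neighbors}

def link_to_drop(next_hop, neighbors, threshold):
    """Return the lowest-cost neighbor if its cost is below `threshold`, else None."""
    costs = link_costs(next_hop, neighbors)
    if not costs:
        return None
    j = min(sorted(costs), key=costs.get)   # sorted() makes ties deterministic
    return j if costs[j] < threshold else None

# Hypothetical routing table of member i: destination -> next-hop neighbor.
next_hop = {"B": "B", "C": "B", "D": "D", "E": "B"}
costs = link_costs(next_hop, ["B", "D", "F"])
drop = link_to_drop(next_hop, ["B", "D", "F"], threshold=1)
```

Here the link to F carries no routes at all, so it is the one proposed for removal.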
Data delivery • Per-source trees are constructed from the reverse shortest paths between each recipient and the source
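As a rough illustration of how a per-source tree falls out of shortest paths on the mesh, the sketch below computes each member's parent toward a given source. It is a centralized stand-in using Dijkstra's algorithm rather than the distributed routing Narada runs on the mesh; `source_tree` is a hypothetical name, and link costs are assumed symmetric so that the reverse shortest path equals the forward one.

```python
import heapq

def source_tree(mesh, source):
    """Return {member: parent} for the delivery tree rooted at `source`.

    `mesh` maps each member to {neighbor: cost}; costs are assumed symmetric.
    """
    dist = {source: 0}
    parent = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue                      # outdated queue entry
        for v, w in mesh[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                parent[v] = u             # v receives the source's data from u
                heapq.heappush(heap, (nd, v))
    return parent

# Hypothetical four-member mesh with overlay link latencies.
mesh = {
    "A": {"B": 2, "C": 27},
    "B": {"A": 2, "D": 29},
    "C": {"A": 27, "D": 3},
    "D": {"B": 29, "C": 3},
}
tree = source_tree(mesh, "A")
```

In this mesh D is reached through C (total 30) rather than through B (total 31), so C forwards A's packets to D.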
Simulation (1/2) • Properties of simulation topology • Power-law • Larger number of low-degree routers than high-degree routers • Small-world • Avg. shortest distance between two randomly chosen nodes is approximately six hops
Simulation (2/2) • Property of the constructed overlay tree • High-degree, high-bandwidth routers are more likely to be traversed by overlay links near the source • Simulation metrics • Number of hops vs. overlay tree level • Relative delay penalty (RDP) • Longest latency • Mean bandwidth
Number of hops vs. overlay tree level • The number of hops decreases as the host level increases
Relative delay penalty (RDP) • ESM < MDDBST < TAG
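For reference, RDP is commonly defined as the ratio of the delay between two hosts along the overlay to the unicast delay between them, so a value of 1 means the overlay adds no delay. The helper below is a minimal sketch under that definition; `rdp` is a hypothetical name and the delays are illustrative values, not simulation results.

```python
def rdp(overlay_delay, unicast_delay):
    """Relative delay penalty: overlay delay divided by direct unicast delay."""
    if unicast_delay <= 0:
        raise ValueError("unicast delay must be positive")
    return overlay_delay / unicast_delay

# Illustrative values: a packet relayed over two overlay hops (27 + 3)
# compared with a direct unicast path of delay 28.
penalty = rdp(27 + 3, 28)
```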
Longest Latency • Latency and RDP for ESM decrease as more hosts join • Lower-latency paths become available • ESM > TAG > MDDBST
Mean Bandwidth • Trade-off between latency and bottleneck bandwidth • MDDBST > TAG > ESM
Summary • Both the delay and the number of hops between parent and child decrease as the tree level increases • Overlay designs must balance the trade-off between delay and bandwidth
References • Yang-hua Chu, Sanjay G. Rao, Srinivasan Seshan, and Hui Zhang, “A Case for End System Multicast,” IEEE Journal on Selected Areas in Communications, vol. 20, no. 8, Oct. 2002, pp. 1456-1471. • Sherlia Y. Shi, Jonathan S. Turner, and Marcel Waldvogel, “Dimensioning Server Access Bandwidth and Multicast Routing in Overlay Networks,” Proceedings of NOSSDAV 2001. • Minseok Kwon and Sonia Fahmy, “Topology-Aware Overlay Networks for Group Communication,” Proceedings of NOSSDAV'02, May 2002. • Minseok Kwon and Sonia Fahmy, “Characterizing Overlay Multicast Networks,” IEEE International Conference on Network Protocols, pp. 61
Outline • Introduction • Dimensioning server multicast routing • Topology-Aware Grouping • End system multicast • Simulation • Summary
Dimensioning server multicast routing (1/2) • Uses the AMcast network architecture • Application servers are deployed in the network • A star topology is formed from each server to its end users • End users send and receive exactly one copy of each packet • Work is shifted from the source to all servers • Routing algorithms are designed for two objectives
Dimensioning server multicast routing (2/2) • Delay optimization • Minimum diameter, degree-bounded spanning tree (MDDBST) • Load balancing • Bounded diameter, residual-balanced spanning tree (BDRBST) • The two objectives are orthogonal
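MDDBST (1/4) • Definition • Given • $G = (V, E)$: undirected complete graph • $d_{\max}(v)$: degree bound of node $v$ • $c(e)$: cost of edge $e$ • Find a spanning tree $T$ of $G$ such that • For each $v \in V$, the degree of $v$ in $T$ satisfies $d_T(v) \le d_{\max}(v)$ • The diameter of $T$ (the cost of its longest simple path) is minimized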
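The two quantities in this definition, tree diameter and the degree constraint, can be checked for any candidate tree with a few lines of code. The sketch below only evaluates a given tree; it does not search for the optimal one. `tree_diameter` and `within_degree_bound` are hypothetical names, the tree is assumed to be an adjacency map with positive edge costs, and the example tree is illustrative.

```python
def tree_diameter(adj):
    """Cost of the longest simple path in a tree; `adj` maps node -> {neighbor: cost}."""
    def farthest(start):
        best = (0, start)
        stack, seen = [(start, 0)], {start}
        while stack:
            u, d = stack.pop()
            if d > best[0]:
                best = (d, u)
            for v, w in adj[u].items():
                if v not in seen:
                    seen.add(v)
                    stack.append((v, d + w))
        return best

    # Two sweeps: the farthest node from any start is one end of a longest path.
    _, end = farthest(next(iter(adj)))
    return farthest(end)[0]

def within_degree_bound(adj, d_max):
    """True if every node's degree in the tree is at most its bound."""
    return all(len(nbrs) <= d_max[v] for v, nbrs in adj.items())

# Hypothetical spanning tree over five servers.
tree = {
    "A": {"B": 1, "E": 4},
    "B": {"A": 1, "C": 9, "D": 10},
    "C": {"B": 9},
    "D": {"B": 10},
    "E": {"A": 4},
}
diameter = tree_diameter(tree)
feasible = within_degree_bound(tree, {v: 3 for v in tree})
```

A search procedure for MDDBST would compare candidate trees by `tree_diameter` while rejecting any for which `within_degree_bound` is false.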
MDDBST (2/4) • Longest path from u to any other node in T
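MDDBST (3/4) • [Figure: worked example on a complete graph over nodes A, B, C, D, E with edge costs 1 to 10]
MDDBST (4/4) • [Figure: worked example continued, showing a tree over nodes A, B, C, D, E with edge costs 1, 4, 9, and 10]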
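BDRBST (1/3) • Definition • Given • $G = (V, E)$: undirected complete graph • $d_{\max}(v)$: degree bound of node $v$ • $c(e)$: cost of edge $e$ • $B$: cost bound • Find a spanning tree $T$ of $G$ such that • For each $v \in V$, the degree of $v$ in $T$ satisfies $d_T(v) \le d_{\max}(v)$ • The diameter of $T$ (the cost of its longest simple path) is less than $B$ • The residual bandwidth is maximized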
BDRBST (2/3) • Introduces a balance factor M • The algorithm is similar to MDDBST • Main differences • Select a set of the M smallest nodes • From that set, select the node with the largest residual bandwidth (smallest degree) as the parent node • Special cases • M = 1: the algorithm is the same as MDDBST • M = number of servers: only load balancing is considered
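The role of the balance factor M can be illustrated with the parent-selection step alone. This sketch rests on an assumption the slide does not state: "smallest" is taken to mean smallest tentative path cost. `choose_parent` and the candidate tuples are hypothetical.

```python
def choose_parent(candidates, m):
    """Pick a parent from (node, path_cost, residual_degree) tuples.

    Keeps the m candidates with the smallest path cost, then returns the one
    with the largest residual degree (ties go to the lower path cost).
    """
    nearest = sorted(candidates, key=lambda c: (c[1], c[0]))[:m]
    return max(nearest, key=lambda c: (c[2], -c[1]))[0]

# Hypothetical candidates: A is closest but nearly full, C is farthest but lightly loaded.
candidates = [("A", 5, 1), ("B", 7, 3), ("C", 9, 4)]
picks = [choose_parent(candidates, m) for m in (1, 2, 3)]
```

With M = 1 the closest node always wins, which matches the delay-only behavior described for MDDBST; with M equal to the number of candidates only the residual degree matters.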
BDRBST (3/3) • Increases system capacity at the cost of increased end-to-end delay • Small values of M provide good load balance while still meeting the diameter bound
Family table (FT) …...
Topology-aware definition • [Figure: tree rooted at S with routers R1 to R5 and members D1 to D5; the path from S to D5 is the spath of D5]
Path match conditions • [Figure: three panels illustrating Condition 1, Condition 2, and Condition 3, each showing root S, the examined node C, its children (A1, A2, A3), and the new member N]
CPM member join • [Figure: CPM join process; the new member exchanges request/reply messages with the root S and then with successive members, with path matching performed at each step]