Load Balancing and Stability Issues in Algorithms for Service Composition. Bhaskaran Raman & Randy H. Katz, U.C. Berkeley. INFOCOM 2003.
Load Balancing and Stability Issues in Algorithms for Service Composition
Bhaskaran Raman & Randy H. Katz, U.C. Berkeley
INFOCOM 2003
Outline
• One-line comment
• Motivation
• Assumed environment & challenges
• Proposed mechanisms & experiments
• Critique
One-line comment
• The paper proposes mechanisms for scalable and stable service composition in a wide-area network
Motivation (1/2)
• Service composition
  • Enables quick & flexible development of new applications
  • Reuses existing services
• Service Scenario 1
[Figure: Service Scenario 1. Diagram labels: France, Korea, RRS, "By SKT", Translation, "Many, many users!"]
Motivation (2/2)
• Service Scenario 2
• Scenario 1
  • Many users → issues of scale
• Scenario 2
  • Multimedia session → availability (failure detection & recovery)
[Figure: Service Scenario 2, a long-lived multimedia session. Diagram labels: VoD Server, 9 o'clock News, Transcoder]
Assumed Environment
• Services are deployed at service clusters
  • The clusters' mechanisms for handling failures and sharing load are leveraged
• Each service cluster has a cluster manager
  • Performs the monitoring & computation required for management
• Service clusters form an overlay network
  • Stretches across the wide-area Internet
• Services are deployed by multiple service providers
  • Service clusters may be spread across many different ASes
Assumed Environment & Challenges
• System characteristics
  • Wide-area service overlay network
  • Many client sessions
  • Sessions last for a long time
• Requirements
  • Scalability
    • Balance load among replicas
  • Stability
    • Rapid failure detection & recovery
Proposed Mechanism – Scalability
• Load balancing
  • Load definition & load balancing mechanism
  • Metric for load estimation: LIAC (Least Inverse Available Capacity)
  • Side effect of LIAC
    • No cost for intermediary nodes
    • Path length comparison (8000 paths)
[Figure: service path diagram. Labels: Service 0, Service 1, Exit node, cost]
Proposed Mechanism – Scalability
• Enhanced metric for load estimation
  • Assign a cost to all links
  • Cost: proportional to the inverse of the AC (available capacity) of the downstream node, in keeping with LIAC
  • Effect of the new metric: shorter paths & good load balancing
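The difference between plain LIAC and the enhanced metric can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the `Hop` type, the function names, the example capacities, and the scaling factor `alpha` are all hypothetical, and the link cost is assumed to be `alpha / AC` of the downstream node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hop:
    name: str
    ac: float            # available capacity (AC) of this node, must be > 0
    runs_service: bool   # False for a pure forwarding (intermediary) node


def liac_cost(path):
    """Plain LIAC: only service-hosting nodes contribute 1/AC."""
    return sum(1.0 / hop.ac for hop in path if hop.runs_service)


def enhanced_cost(path, alpha=0.5):
    """LIAC plus a per-link cost of alpha / AC(downstream node).

    Both alpha and this exact form are assumptions made for illustration.
    """
    link_cost = sum(alpha / hop.ac for hop in path)
    return liac_cost(path) + link_cost


# Two hypothetical candidate paths to the same exit node.
short_path = (Hop("A", 4, True), Hop("exit", 10, False))
long_path = (
    Hop("r1", 10, False),
    Hop("r2", 10, False),
    Hop("B", 5, True),
    Hop("r3", 10, False),
    Hop("exit", 10, False),
)
candidates = [short_path, long_path]

best_liac = min(candidates, key=liac_cost)          # ignores intermediary hops
best_enhanced = min(candidates, key=enhanced_cost)  # penalizes every link
```

With these made-up numbers, plain LIAC prefers the five-hop path because its service replica is slightly less loaded, while the enhanced metric prefers the two-hop path once every link carries a cost.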
Proposed Mechanism – Scalability
• Load balancing
  • Load information dissemination
• Propagating load information
  • Simple periodic flooding: incurs load oscillation
  • Reduce the link-state update period?
    • No! It increases the overhead
  • On-demand link-state updates?
    • No! They add load during an already overloaded period
Proposed Mechanism – Scalability
• Piggybacking
  • Feed back load information along the established service path
  • Low control overhead
[Figure: service path diagram. Labels: Service 0, Service 1, Exit node]
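One way to picture piggybacking is a feedback message that travels back along an already established path and collects load information on the way. The sketch below is a toy model under assumptions: the `Node` class, the `send_feedback` function, the direction of travel, and the capacity values are hypothetical and are not taken from the paper.

```python
class Node:
    def __init__(self, name, available_capacity):
        self.name = name
        self.available_capacity = available_capacity
        # Node name -> last available capacity learned from piggybacked feedback.
        self.load_table = {}


def send_feedback(path):
    """Walk an established path from its last node back to its first.

    Each node first records what the message already carries (the load of
    the nodes behind it on the path), then appends its own load.
    """
    piggybacked = {}
    for node in reversed(path):
        node.load_table.update(piggybacked)
        piggybacked[node.name] = node.available_capacity
    return piggybacked


# Hypothetical three-node path.
first = Node("exit", 10.0)
service0 = Node("service0", 4.0)
service1 = Node("service1", 7.0)
send_feedback([first, service0, service1])
# first.load_table now holds the load of both service nodes,
# learned without any extra flooding round.
```

No separate control message is generated here: the load entries ride on a message that follows the session's own path, which is the source of the low control overhead.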
Proposed Mechanism – Stability
• Failure detection & recovery
  • End-to-end recovery
    • Deliver a failure notification to an exit node → reconstruct the service path
  • Local recovery
    • Failure notice → find an alternate path
Proposed Mechanism – Stability
• Failure detection & recovery
  • Heartbeat mechanism for failure monitoring
    • 300 ms period
    • Packet losses are correlated within 1 sec
  • Timeout value
    • Distinguishes temporary failures from long-term failures
    • Trade-off between early detection and false detection
    • The appropriate value was found empirically through experiment
Proposed Mechanism – Stability
• Failure detection & recovery
  • Measured the failure gap period of a wide-area Internet path
    • Exchanged heartbeats for a week
    • US-Berlin-Australia
  • 1.8 sec for the timeout value
    • Early detection & an acceptable false detection rate
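A timeout-based heartbeat monitor using the two numbers from the slides (300 ms period, 1.8 sec timeout) might look like the sketch below. The class and method names are hypothetical, and timestamps are passed in explicitly as integer milliseconds so the example stays deterministic; a real monitor would read a clock.

```python
HEARTBEAT_PERIOD_MS = 300   # heartbeat period from the slides
TIMEOUT_MS = 1800           # timeout value from the slides

# A 1.8 s timeout spans this many heartbeat periods.
PERIODS_PER_TIMEOUT = TIMEOUT_MS // HEARTBEAT_PERIOD_MS


class HeartbeatMonitor:
    def __init__(self, timeout_ms=TIMEOUT_MS):
        self.timeout_ms = timeout_ms
        self.last_seen = {}  # peer name -> time of last heartbeat, in ms

    def on_heartbeat(self, peer, now_ms):
        """Record that a heartbeat from `peer` arrived at `now_ms`."""
        self.last_seen[peer] = now_ms

    def failed_peers(self, now_ms):
        """Peers whose silence has lasted strictly longer than the timeout."""
        return sorted(
            peer
            for peer, seen in self.last_seen.items()
            if now_ms - seen > self.timeout_ms
        )


monitor = HeartbeatMonitor()
monitor.on_heartbeat("berlin", 0)
monitor.on_heartbeat("australia", 1500)
still_ok = monitor.failed_peers(1800)   # silence equals the timeout, not yet exceeded
suspected = monitor.failed_peers(2000)  # "berlin" has been silent for 2000 ms
```

A shorter timeout would flag "berlin" sooner but would also misclassify brief loss bursts as failures, which is the early-detection versus false-detection trade-off the slides describe.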
Proposed Mechanism – Stability
• Recovery time
  • End-to-end recovery vs. local recovery
    • End-to-end recovery takes slightly longer
      • Failure notification is the only additional cost
    • End-to-end recovery gives a better path after reconstruction
      • Globally optimized path
Critique
• Strong points
  • Simplified the problem well
    • Scalability → load balancing → estimation & propagation of load info
    • Stability → failure detection & recovery → timeout value selection
  • Emulation strategy
    • Lower cost than a real experiment
    • More realistic than simulation
• Weak points
  • Didn’t consider bandwidth in the metric
    • Target applications are bandwidth sensitive
  • Only applicable to service paths
    • There can be requests in the form of graphs
  • Limitation of piggybacking
  • Length of composition is limited to 2
    • If the length gets longer, path length will be more important
  • The α value should be selected carefully