Beyond Server Selection: Challenges in Multiple-Origin Content Distribution

Mostafa H. Ammar • College of Computing, Georgia Institute of Technology, Atlanta, GA • ammar@cc.gatech.edu

Contributors • Ellen Zegura • Hyewon Jun • Christos Gkantsidis • Pradnya Karbhari • Matt Sanders • Li Zou

Multiple-Origin Content Distribution Systems • Content is Replicated • Authoritative • Grass-roots (Peer-to-Peer) • Content is Re-constituted

Challenges • Server Selection: the benefit of content replication can only be realized with proper selection • Multipoint-to-point sessions: on their way to becoming a dominant communication paradigm in a network that was designed for point-to-point connections

Talk Outline • Server Selection • Application-Layer Anycasting • Selection vs Binding • Multipoint-to-point sessions • Impact of Parallel Downloading • Per Session Rate Allocation Please forgive lack of references

Talk Outline • Server Selection • Application-Layer Anycasting • Application vs Network-Layer Anycasting • Multipoint-to-point sessions • Impact of Parallel Downloading • Per Session Rate Allocation

Server Replication • Server Selection Problem: how does a client determine which of the replicated servers to access? • Interested in Wide-Area Replication

Anycasting • Network-Layer Anycasting in RFC 1546 • Anycast IP addresses • Network-layer metrics • Per-packet selection

Application-Layer Anycasting • Group of servers identified by Anycast Name • Clients request service from group identified by name • Automatic connection to a “good” server

An Architecture [Figure: a client asks the Resolver "Green Service?" and is answered "Go to server y"; the diagram shows Server y, the Resolver, an Orange Server Group, and a Green Server Group]

Resolver • “Close” to client • Maintains • Anycast group membership • Selection-enabling information • Client may provide filter that tells resolver how to select • DNS-like hierarchy of resolvers

Web Server Selection • An instantiation of architecture • Criterion: Best Response Time • [client request, last byte received] • includes path and server delays • Problem: Maintaining response time estimate for each server in anycast group at resolver
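
The resolver behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the system used in the talk: the class name `AnycastResolver`, its methods, and the in-memory table of response-time estimates are all hypothetical, and how the estimates are obtained is deliberately left open.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class AnycastResolver:
    """Toy resolver: anycast name -> member servers and response-time estimates."""
    # groups[name][server] = latest response-time estimate in seconds
    # (hypothetical layout, chosen for this sketch only)
    groups: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def update(self, name: str, server: str, response_time: float) -> None:
        """Record a new estimate, however it was obtained."""
        self.groups.setdefault(name, {})[server] = response_time

    def resolve(self, name: str,
                client_filter: Optional[Callable[[str, float], bool]] = None) -> str:
        """Return the member with the lowest estimate that passes the client's filter."""
        members = self.groups.get(name, {})
        candidates = {server: t for server, t in members.items()
                      if client_filter is None or client_filter(server, t)}
        if not candidates:
            raise LookupError(f"no eligible server in anycast group {name!r}")
        return min(candidates, key=candidates.get)


resolver = AnycastResolver()
resolver.update("green", "server-x", 0.42)
resolver.update("green", "server-y", 0.18)
print(resolver.resolve("green"))  # server-y
```

The optional `client_filter` stands in for the client-supplied filter mentioned on the Resolver slide; a real resolver would also need to age out stale estimates.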

Response Time Estimation Alternatives • Probe • Push • User-Experience • Developed a Hybrid Push/Probe Technique

Wide-Area Experiments • Servers: UCLA, GT x2, WU • Clients: UMD x4, GT x16 • Resolvers: UMD, GT [Figure: map of the WU, UMD, GT, and UCLA sites]

Anycasting vs. Random Selection

What if Anycasting is popular?

Checkpoint • Appropriate guidance of clients to servers is an important infrastructure function • Client-perceived as well as global performance can be improved with the appropriate selection technology • What about a network-layer anycasting infrastructure?

Talk Outline • Server Selection • Application-Layer Anycasting • Application vs Network-layer Anycasting • Multipoint-to-point sessions • Impact of Parallel Downloading • Per Session Rate Allocation

Selection vs Binding

Selection vs Binding • Selection: A function that returns the instantaneous server choice. • Binding: An application-level function which decides on the use of a particular server.

Spectrum Of Binding

Spectrum of Binding (2) • Initial Binding (IB): Select one server and stay with it for the lifetime of the connection. • Periodic Binding (PB): Periodically select a server and switch to the new server. • Continuous Binding (CB): Select the best server per packet to react quickly to changes in server performance.
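
The three binding policies can be compared with a small sketch. It is a simplification under explicit assumptions: time is divided into slots rather than packets, `rates[t][s]` is a hypothetical table of the rate server `s` would deliver in slot `t`, and migration overhead is ignored entirely. All function names are illustrative.

```python
from typing import List, Sequence


def best_server(rates_at_t: Sequence[float]) -> int:
    """Selection: index of the server with the highest rate right now."""
    return max(range(len(rates_at_t)), key=lambda s: rates_at_t[s])


def bound_servers(rates: List[Sequence[float]], period: int) -> List[int]:
    """Binding: the server used in each slot when selection is re-run every `period` slots."""
    chosen, current = [], 0
    for t, rates_at_t in enumerate(rates):
        if t % period == 0:
            current = best_server(rates_at_t)
        chosen.append(current)
    return chosen


def total_received(rates: List[Sequence[float]], period: int) -> float:
    """Data received over the whole connection (migration overhead ignored)."""
    servers = bound_servers(rates, period)
    return sum(rates[t][s] for t, s in enumerate(servers))


# Two servers whose ranks swap during the connection (made-up numbers).
rates = [[5, 1], [1, 5], [1, 5], [5, 1]]
initial = total_received(rates, period=len(rates))  # IB: select once at the start
periodic = total_received(rates, period=2)          # PB: re-select every 2 slots
continuous = total_received(rates, period=1)        # CB: re-select every slot
print(initial, periodic, continuous)
```

Because the sketch charges nothing for switching servers, it can only show the upper bound on what Periodic or Continuous Binding might gain; it says nothing about whether the gain survives real migration costs.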

Design Space App-Layer Anycasting Our Own Server Migration Protocol The desirability of a network-layer anycasting infrastructure depends on whether Continuous Binding can be shown to outperform Initial Binding

Migration of a CB Client

Simulation Topology

Initial vs. Continuous Binding [Figure panels: server rank change every [1,10] sec; server rank change every [51,60] sec] • Despite the overhead of migration, Continuous Binding is able to improve performance when the connection is long-lived.

Heterogeneous Binding Increasing use of either scheme over the other by all clients with long-lived connections leads to overall performance degradation!

Checkpoint • Network-layer anycasting allows for efficient continuous binding • Continuous binding outperforms initial binding in some long transfer, highly-dynamic situations • Did not account for overhead of selection function • But we have something more sinister to worry about ….

Talk Outline • Server Selection • Application-Layer Anycasting • Application vs Network-layer Anycasting • Multipoint-to-point sessions • Impact of Parallel Downloading • Fairness

Motivation • Traditional data retrieval: over a point-to-point connection from a single server to a single client • Current trend: retrieval over multiple point-to-point connections from multiple servers to a single client • Examples: CDNs, replicated servers, caches, parallel file downloads, web traffic, MD-CDNs

What is a Session? • Definition of multipoint-to-point session: • A set of point-to-point connections started from multiple servers to a single client in order to transfer an application-level object

Typical Sessions in the Internet

Typical Sessions

Talk Outline • Server Selection • Application-Layer Anycasting • Application vs Network-layer Anycasting • Multipoint-to-point sessions • Impact of Parallel Downloading • Per Session Rate Allocation

Impact of Parallel Downloading Question 1: How much can a single user gain by parallel downloading? Question 2: What happens if all users perform parallel downloading? Question 3: How do parallel downloading users affect single downloading users?

Aggressiveness pays off. • For a ~7MB file: • Best rate: ~3Mbps. • 4x faster than single server. [Figure: time (in sec) vs. number of servers]

Wide deployment of Parallel Downloading • More Connections • Number of competing flows increases. • More requests at the server (but, for a shorter period of time). • More Overhead • Fixed overhead is paid multiple times: Cost of a request = {size, rate, etc.}-Dependent cost + Fixed Cost.
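
The overhead argument can be made concrete with a little arithmetic. The sketch below assumes the dependent cost is simply proportional to the bytes requested and that the file is split evenly across servers; both are simplifications introduced here, and the function name and parameters are hypothetical.

```python
def total_request_cost(file_bytes: float, num_servers: int,
                       fixed_cost: float, cost_per_byte: float) -> float:
    """Total cost of fetching one file in equal chunks from `num_servers` servers.

    Cost of one request = size-dependent cost + fixed cost (assumed linear in size).
    """
    if num_servers < 1:
        raise ValueError("need at least one server")
    chunk_bytes = file_bytes / num_servers
    per_request = fixed_cost + cost_per_byte * chunk_bytes
    return num_servers * per_request


single = total_request_cost(1000, 1, fixed_cost=10.0, cost_per_byte=0.01)
parallel = total_request_cost(1000, 4, fixed_cost=10.0, cost_per_byte=0.01)
print(single, parallel)
```

Under this model the size-dependent part of the total stays the same however the file is split, while the fixed part grows linearly with the number of servers contacted.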

Many aggressive clients are harmful!

Aggressive clients can hurt simple clients

Summary • There is a strong local incentive for a client to use parallel downloading. • But if everyone does it, there is evidence that global performance suffers. • We need per-session rate allocation.

Talk Outline • Server Selection • Application-Layer Anycasting • Application vs Network-layer Anycasting • Multipoint-to-point sessions • Impact of Parallel Downloading • Per-Session Rate Allocation

Our Goal • To develop algorithms to achieve rate allocations which are fair to all sessions • Some challenges: • Data path of each session forms a tree • Every session has multiple bottlenecks • Partial sharing of bottlenecks between sessions • Inter-session and Intra-session fairness

Focus on Static Sessions • For purposes of rate allocation, connections start and terminate at approximately the same time • Examples: parallel file downloads, multimedia streaming using MD-CDNs

Current Rate Allocation Approach • Max-min fairness, TCP fairness • Problems with allocating rate on a per-connection basis: • sessions with more connections get higher rate allocation than sessions with fewer connections • this is not a fair rate allocation from a session point of view
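
The bias toward sessions with many connections is easy to reproduce. The sketch below computes per-connection max-min fair rates with the standard progressive-filling procedure and then sums them per session. The link names, the `(session, index)` connection naming, and the capacities are made up for illustration; every link on a path is assumed to appear in `capacity`.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Conn = Tuple[str, int]  # (session name, connection index); naming is illustrative


def max_min_rates(capacity: Dict[str, float],
                  paths: Dict[Conn, List[str]]) -> Dict[Conn, float]:
    """Per-connection max-min fair rates via progressive filling."""
    remaining = dict(capacity)
    active = set(paths)
    rates: Dict[Conn, float] = {}
    while active:
        # Equal share each link can still offer to its unfrozen connections.
        share = {}
        for link, cap in remaining.items():
            users = sum(1 for c in active if link in paths[c])
            if users:
                share[link] = cap / users
        if not share:
            break  # remaining connections cross no capacitated link
        bottleneck = min(share, key=share.get)
        rate = share[bottleneck]
        # Freeze every connection crossing the tightest link at that share.
        for conn in [c for c in active if bottleneck in paths[c]]:
            rates[conn] = rate
            active.discard(conn)
            for link in paths[conn]:
                remaining[link] -= rate
    return rates


def session_totals(rates: Dict[Conn, float]) -> Dict[str, float]:
    """Sum connection rates per session."""
    totals: Dict[str, float] = defaultdict(float)
    for (session, _), rate in rates.items():
        totals[session] += rate
    return dict(totals)


# Session A opens 4 connections, session B opens 1, all over one 10-unit link.
paths = {("A", i): ["bottleneck"] for i in range(4)}
paths[("B", 0)] = ["bottleneck"]
print(session_totals(max_min_rates({"bottleneck": 10.0}, paths)))
```

Each connection is treated identically, so the session that opened four connections ends up with four times the rate of the single-connection session.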

Proposed Session Fair Approaches (1) • Normalized rate session fairness • rate allocation is based on weight of each connection • weights wi,j are assigned to each connection j in each session i, subject to the constraint: • this constraint ensures that total session rates are fair with respect to each other
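
As a rough illustration of weighting, the sketch below shares a single link in proportion to connection weights. The slide's actual constraint is not reproduced here; the sketch assumes, purely for illustration, that the weights within each session sum to one and are split equally among that session's connections. The function names are hypothetical.

```python
from typing import Dict, Tuple

Conn = Tuple[str, int]  # (session name, connection index); naming is illustrative


def equal_session_weights(connections_per_session: Dict[str, int]) -> Dict[Conn, float]:
    """Assumed weighting: each session's weights sum to 1, split equally."""
    return {(session, j): 1.0 / n
            for session, n in connections_per_session.items()
            for j in range(n)}


def weighted_link_shares(capacity: float,
                         weights: Dict[Conn, float]) -> Dict[Conn, float]:
    """Share one link's capacity among connections in proportion to their weights."""
    total_weight = sum(weights.values())
    return {conn: capacity * w / total_weight for conn, w in weights.items()}


weights = equal_session_weights({"A": 4, "B": 1})
rates = weighted_link_shares(10.0, weights)
rate_a = sum(r for (session, _), r in rates.items() if session == "A")
rate_b = rates[("B", 0)]
print(rate_a, rate_b)
```

On this single shared link the two sessions receive equal totals regardless of how many connections each opens. With several bottlenecks the weighted shares would have to be computed network-wide, which this sketch does not attempt.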

Proposed Session Fair Approaches (2) • Per-link session fairness • rate allocation at each link on a per-session basis • each session then allocates this rate amongst the connections that traverse that link • this ensures fair allocation of session rates
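
For a single link, the two-level allocation described above might look like the following sketch. Splitting a session's share equally among its connections is an assumption made here; the slide leaves that choice to the session. The function name and input format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Conn = Tuple[str, int]  # (session name, connection index); naming is illustrative


def per_link_session_shares(capacity: float,
                            connections_on_link: Iterable[Conn]) -> Dict[Conn, float]:
    """Split a link's capacity equally among sessions, then within each session."""
    by_session: Dict[str, List[Conn]] = defaultdict(list)
    for conn in connections_on_link:
        by_session[conn[0]].append(conn)
    if not by_session:
        return {}
    session_share = capacity / len(by_session)
    # Assumed intra-session policy: equal split among the session's connections.
    return {conn: session_share / len(conns)
            for conns in by_session.values()
            for conn in conns}


shares = per_link_session_shares(12.0, [("A", 0), ("A", 1), ("A", 2), ("B", 0)])
print(shares)
```

Here the link's capacity is divided between sessions A and B first, so A's three connections together receive the same amount as B's single connection.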

Example - Connection fair

Example - Normalized rate session fair

Example - Per-link session fair

Simulation Model and Fairness Measures • 100-node and 600-node topologies generated using GT-ITM • varying percentages of clients and servers • sessions with 1, 4, or 15 connections, in varying percentages • fairness measures: variance, mean, maximum, and minimum of session rates, and fairness index
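
The slide does not say which fairness index was used. One common choice is Jain's fairness index, and the sketch below assumes that definition; the function name and the sample rates are illustrative only.

```python
from typing import Sequence


def jain_fairness_index(session_rates: Sequence[float]) -> float:
    """Jain's index: (sum x)^2 / (n * sum x^2); equals 1.0 when all rates are equal."""
    if not session_rates:
        raise ValueError("need at least one session rate")
    total = sum(session_rates)
    sum_of_squares = sum(r * r for r in session_rates)
    if sum_of_squares == 0:
        return 1.0  # convention chosen here: an all-zero allocation counts as equal
    return total * total / (len(session_rates) * sum_of_squares)


print(jain_fairness_index([5.0, 5.0]))  # equal session rates
print(jain_fairness_index([8.0, 2.0]))  # one session favoured over the other
```

The index ranges from 1/n (one session gets everything) up to 1 (all sessions get the same rate), which makes it convenient to report alongside the variance, mean, maximum, and minimum of session rates.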
