Dynamic Replica Placement for Scalable Content Delivery
Yan Chen, Randy H. Katz, John D. Kubiatowicz
{yanchen, randy, kubitron}@CS.Berkeley.EDU
EECS Department, UC Berkeley
Motivation Scenario
[Diagram: a data plane with a Web content server (data source), CDN servers holding replicas, caches, and clients, over a network plane; links contrast "always update" with "adaptive coherence".]
Goal and Challenges
Provide content distribution to clients with good Quality of Service (QoS) while keeping resource consumption of the underlying infrastructure efficient and balanced
• Dynamic choice of number and location of replicas
  • Clients' QoS constraints
  • Servers' capacity constraints
• Efficient update dissemination
  • Delay
  • Bandwidth consumption
• Scalability: millions of objects, clients, and servers
• No global network topology knowledge
Previous Work (Replica Placement)
• Focused on static replica placement
  • Clients' distributions and access patterns known in advance
  • Assume global IP network topology
• Data location via DNS redirection
  • Highly inefficient (this is a "hack")
  • Centralized CDN name server cannot record replica locations
Previous Work (Info Dissemination)
• No inter-domain IP multicast
• Application-level multicast (ALM) unscalable
  • Root maintains state for all children (Narada, Overcast, ALMI, RMX)
  • Root handles all "join" requests (Bayeux)
  • Root split is a common solution, but suffers consistency overhead
Solutions for Dissemination Tree
• Peer-to-peer overlay location services with good scalability and locality
• Simultaneous replica placement and tree construction
Peer-to-peer Routing and Location Services
• Properties needed by the tree-building algorithms
  • Distributed, scalable location with guaranteed success
  • Search with locality
• P2P routing and location service used: Tapestry (see the routing sketch below)
  • CAN, Chord, Pastry offer insufficient locality or flexibility to place objects
  • http://www.cs.berkeley.edu/~ravenben/tapestry
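To make "search with locality" concrete, here is a minimal sketch of Plaxton-style digit-by-digit overlay routing, the mechanism underlying Tapestry's location service. The names (`shared_suffix_len`, `next_hop`, `neighbor_table`) are illustrative assumptions, not Tapestry's real API.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Number of trailing hex digits that two node IDs share."""
    n = 0
    while n < len(a) and a[-1 - n] == b[-1 - n]:
        n += 1
    return n

def next_hop(current_id: str, dest_id: str, neighbor_table: dict):
    """Pick a neighbor that matches one more suffix digit of dest_id.

    neighbor_table maps (level, digit) -> the network-closest known
    node ID with that suffix match; choosing the closest entry at each
    level is what gives Tapestry-routed paths their locality.
    """
    level = shared_suffix_len(current_id, dest_id)
    digit = dest_id[-1 - level]                 # next digit to match
    return neighbor_table.get((level, digit))   # None => root for this ID reached
```

Each hop fixes one more digit of the destination ID, so routing takes a logarithmic number of hops and tends to stay within the client's network neighborhood early on.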
Simultaneous Replica Placement and Tree Construction
• Static replica placement + IP multicast
  • Modeled as a global optimization problem
  • Designed a greedy algorithm with a log N approximation factor (a sketch follows this list)
  • Used as the near-optimal case for comparison
• Dynamic replica placement + application-level multicast
  • Search for qualified local replicas first
  • Place new replicas on the Tapestry overlay path
  • Two approaches: naïve and smart
• Soft-state tree maintenance
  • Each node only maintains state for its parent and direct children
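A hedged sketch of the greedy set-cover style algorithm implied by the log N approximation claim: repeatedly pick the candidate server that covers the most still-unserved clients within the latency bound, up to its load bound. The function signature and data layout are assumptions; the paper's actual algorithm may differ in detail.

```python
def greedy_place(clients, servers, latency, latency_bound, load_bound):
    """latency[(s, c)] -> network latency in ms; returns {server: [clients]}."""
    unserved = set(clients)
    placement = {}
    while unserved:
        best, best_cover = None, []
        for s in servers:
            if s in placement:
                continue
            # clients this server could serve, truncated at its load bound
            cover = [c for c in unserved
                     if latency[(s, c)] <= latency_bound][:load_bound]
            if len(cover) > len(best_cover):
                best, best_cover = s, cover
        if best is None:        # remaining clients cannot be covered
            break
        placement[best] = best_cover
        unserved -= set(best_cover)
    return placement
```

Greedy set cover is the classic source of the log N approximation factor, which is why it serves well as the static baseline.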
Dynamic Replica Placement: naïve
[Diagram: client c sends a join request toward server s along the Tapestry overlay path; the data plane sits above the Tapestry mesh in the network plane, with a parent candidate at a proxy node.]
Dynamic Replica Placement: naïve (continued)
[Diagram: same scenario, highlighting the first placement choice, i.e., the first qualified parent candidate (proxy) on the Tapestry overlay path from c toward s; a sketch follows.]
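A minimal sketch of the naïve scheme as the slides describe it: the join request travels from client c toward server s along the Tapestry overlay path, and a replica is placed at the first overlay node that satisfies both the client's latency bound and the load bound. The node attributes (`load`, `children`, `has_replica`) and the parameter layout are assumptions for illustration.

```python
def naive_join(c, s, path, latency, latency_bound, load_bound):
    """path: Tapestry overlay hops from c toward s, nearest first;
    latency(a, b) -> ms. Nodes carry .load, .children, .has_replica."""
    for node in path:
        if latency(node, c) <= latency_bound and node.load < load_bound:
            node.has_replica = True    # first qualified hop gets the replica
            node.children.append(c)
            node.load += 1
            return node
    s.children.append(c)               # fall back: serve from the server itself
    s.load += 1
    return s
```

Taking the first qualified hop is simple but eager: it tends to create many replicas near clients, which the smart scheme below avoids.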
Dynamic Replica Placement: smart
• Aggressive search
• Greedy load distribution
[Diagram: client c joins via a proxy; parent candidates include the proxy's parent, siblings, and client/server children along the Tapestry overlay path between c and s (data plane over network plane).]
Dynamic Replica Placement: smart (continued)
• Aggressive search
• Lazy placement
• Greedy load distribution
[Diagram: same scenario, highlighting the first placement choice among the parent candidates; a sketch of all three ideas follows.]
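A hedged sketch combining the smart scheme's three ideas from the slides: (1) aggressive search asks the proxy's parent, siblings, and children for an existing qualified replica before placing a new one; (2) lazy placement, when a new replica is unavoidable, puts it at the qualified hop closest to the server so it can cover more future clients; (3) greedy load distribution picks the least-loaded qualified parent. `proxy.nearby()` and the node attributes are assumed helpers, not the paper's actual interfaces.

```python
def smart_join(c, proxy, s, path, latency, latency_bound, load_bound):
    def qualified(n):
        return latency(n, c) <= latency_bound and n.load < load_bound

    # (1) aggressive search of the nearby region of the tree
    candidates = [n for n in proxy.nearby() if qualified(n)]
    if not candidates:
        # (2) lazy placement: the qualified hop closest to the server,
        # so the new replica can serve future clients in this region
        on_path = [n for n in path if qualified(n)]
        chosen = on_path[-1] if on_path else s
        chosen.has_replica = True
        candidates = [chosen]
    # (3) greedy load distribution among the parent candidates
    parent = min(candidates, key=lambda n: n.load)
    parent.children.append(c)
    parent.load += 1
    return parent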
Evaluation Methodology
• Network topology
  • 5000-node network with GT-ITM transit-stub model
  • 500 d-tree server nodes, 4500 clients join in random order
• Dissemination Tree server deployment
  • Random d-tree
  • Backbone d-tree (choose backbone routers and subnet gateways first)
• Constraints
  • 50 ms latency bound and 200 clients/server load bound
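The setup above, collected into a single configuration sketch for anyone reproducing the simulation; the values come from the slides, while the key names are assumptions.

```python
SIM_CONFIG = {
    "topology": "GT-ITM transit-stub",
    "num_network_nodes": 5000,
    "num_dtree_servers": 500,
    "num_clients": 4500,
    "client_join_order": "random",
    "server_deployment": ("random d-tree", "backbone d-tree"),
    "latency_bound_ms": 50,
    "load_bound_clients_per_server": 200,
}
```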
Four Approaches for Comparison
• Overlay dynamic naïve placement (overlay_naïve)
• Overlay dynamic smart placement (overlay_smart)
• Static placement on overlay network (overlay_static)
• Static placement on IP network (IP_static)
Number of Replicas Deployed and Load Distribution
• Overlay_smart deploys far fewer replicas than overlay_naïve, and comes very close to IP_static
• Overlay_smart distributes load better than overlay_naïve and overlay_static, and comes very close to IP_static
Multicast Performance
• 85% of overlay_smart paths have a Relative Delay Penalty (RDP) below 4
• Bandwidth consumed by overlay_smart is very close to IP_static and much less than overlay_naïve
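For readers unfamiliar with the metric: RDP is the standard overlay-multicast measure, the ratio of the delay a client experiences through the dissemination tree to the direct unicast IP delay from the source. A tiny illustration (the function name is ours):

```python
def rdp(tree_delay_ms: float, unicast_delay_ms: float) -> float:
    """Relative Delay Penalty: overlay tree delay over direct IP delay."""
    return tree_delay_ms / unicast_delay_ms

# e.g., a client reached in 120 ms over the tree vs. 40 ms by unicast
# has RDP = 3.0, within the "85% below 4" result reported above.
```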
Tree Construction Traffic
• Includes "join" requests, "ping" messages, replica placement, and parent/child registration
• Overlay_smart consumes three to four times as much traffic as overlay_naïve, and overlay_naïve's traffic is quite close to IP_static
• Tree construction is a far less frequent event than update dissemination
Conclusions
• Peer-to-peer networks can be used to construct CDNs
• Dissemination Tree: a dynamic content distribution network with good QoS, efficiency, and load balancing
  • P2P location service to improve scalability and locality
  • Simultaneous dynamic replica placement and tree construction
• In particular
  • Use Tapestry to contact a nearby region of the tree and select a parent
  • Lazy placement of new replicas on the Tapestry overlay path
  • Close to the optimal number of replicas, good load distribution, and low multicast delay and bandwidth penalties, at the price of reasonable construction traffic
Future Work
• Evaluate with more diverse topologies and real workloads
• Dynamic replica deletion/migration to adapt to shifts in users' interests
• Implementation for OceanStore, a global-scale persistent data storage system