Part III: Overlays, peer-to-peer Jinyang Li
Overlays are everywhere • The Internet is an overlay on top of telephone networks • An overlay: a network on top of the Internet • Endpoints (instead of routers) are the nodes • Multi-hop IP paths through routers are the links • Instant deployment!
What can overlays do? • Routing • Improve routing robustness (e.g. convergence speed) • Multicast • Anonymous communication • New applications • Peer-to-peer file sharing and lookup • Content distribution networks • Peer-to-peer live streaming • Your imagination is the limit
Why overlays? • Internet is ossified • IPv6 proposed in 1992, still not widely deployed • Multicast (1988), QoS (early 90s) etc. • Avoid burdening routers with new features • End hosts are cheap and capable • Copy and store files • Perform expensive cryptographic operations • Perform expensive coding/decoding operations • …
Today’s class • Overlays that take over routers’ jobs • Resilient Overlay Networks (RON) • Application-level multicast (NICE)
RON’s motivation • Internet routing is not reliable
Internet routing is unsatisfactory • Slow to detect outages and recoveries • Unable to use multiple redundant paths • Unable to detect badly performing paths • Applications have no control over paths Q: Why can’t we fix BGP? Q2: Hasn’t multi-homing already solved the fault-tolerance problem?
BGP converges slowly • Given a failure, BGP can take up to 15 minutes to converge; sometimes it never does. [Feamster]
RON in a nutshell • A small set (<100) of nodes form an overlay on top of the scalable, BGP-based IP routing substrate • What failures does RON target? • Outages: configuration/software errors, broken links • Performance failures: severe congestion, DoS attacks
RON’s goals • Fast failure detection and recovery • Detect & fail-over within seconds • Applications influence path selection • Applications define failures • Applications define path metrics • Expressive and fine-grained policies • Who and what applications are allowed to use what paths
Why would RON work? • RON testbed study (2003): • About 60% of failures occurred within two hops of the edge • RON routes around many link “failures” • If there exists a node whose paths to S and D do not contain the failed link • RON cannot route around an access-link failure
RON design • Nodes sit in different ASes • Per-node components: conduit (application interface), forwarder, router, prober • RON library with a performance database, application-specific routing tables, and a policy routing module • Link-state routing protocol disseminates info using RON itself!
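As a concrete illustration of the prober/performance-database pair, here is a minimal sketch (names like `PerfDB` and `record_probe` are illustrative, not RON's actual API) that smooths probe outcomes into a per-path loss estimate with an exponentially weighted moving average:

```python
# Hypothetical sketch of a RON-style performance database: each probe
# outcome updates an EWMA estimate of the loss rate of one virtual link.

class PerfDB:
    def __init__(self, alpha=0.1):
        self.alpha = alpha          # weight given to each new sample
        self.loss = {}              # (src, dst) -> estimated loss rate

    def record_probe(self, src, dst, lost):
        """Record one probe outcome: lost=1 if the probe was dropped."""
        old = self.loss.get((src, dst), 0.0)
        self.loss[(src, dst)] = (1 - self.alpha) * old + self.alpha * lost

db = PerfDB()
for _ in range(50):
    db.record_probe("A", "B", 0)    # 50 successful probes: estimate stays 0
db.record_probe("A", "B", 1)        # one loss nudges the estimate up
print(round(db.loss[("A", "B")], 3))   # -> 0.1
```

The EWMA is one common smoothing choice; the router module can then feed these per-link estimates into its application-specific routing tables.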
RON reduces loss rate • Compare the 30-min average loss rate on the Internet with the 30-min average loss rate with RON • RON’s loss rate is never more than 30%
RON routes around failures • 30-minute average loss rates • 6,825 “path hours” represented here • 5 “path hours” of 100% loss (complete outage) • 38 “path hours” of TCP outage (>= 30% loss) • RON routed around all of these! • One indirection hop provides almost all the benefit!
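The one-hop-indirection idea can be sketched in a few lines: compare the direct path against every single-intermediate detour, composing loss rates multiplicatively. This is an illustrative toy, not RON's actual selection code; the loss table is made up:

```python
# Choose the direct path or one intermediate node, whichever has the
# lowest estimated loss. Two independent hops with loss p1, p2 compose
# as 1 - (1 - p1) * (1 - p2).

def best_path(src, dst, nodes, loss):
    """loss[(a, b)] is the estimated loss rate of the direct path a->b."""
    best = ([src, dst], loss[(src, dst)])
    for mid in nodes:
        if mid in (src, dst):
            continue
        p = 1 - (1 - loss[(src, mid)]) * (1 - loss[(mid, dst)])
        if p < best[1]:
            best = ([src, mid, dst], p)
    return best

loss = {("A", "B"): 0.30,                 # direct path is badly congested
        ("A", "C"): 0.01, ("C", "B"): 0.01}
path, p = best_path("A", "B", ["A", "B", "C"], loss)
print(path, round(p, 4))    # -> ['A', 'C', 'B'] 0.0199
```

One healthy intermediate drops the effective loss from 30% to about 2%, which is why a single indirection hop captures almost all the benefit.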
Lessons of RON • End hosts know better about performance and outages than routers • Internet routing trades off scalability for performance and fast failover • A small amount of redundancy goes a long way
RON’s tradeoff • BGP chooses scalability • Routing overlays (e.g., RON) choose performance (fast convergence etc.) and flexibility (application-specific metrics & policies) • Can we have all three?
Open questions • Efficiency • Generates redundant traffic on access links • Scaling • Probing traffic is O(N^2) • Can a RON be made to scale to >50 nodes? • Is a 1000-node RON much better than a 50-node one? • Interaction of overlays and the IP network • Interaction of multiple overlays
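To see why O(N^2) probing limits scale, a back-of-the-envelope calculation helps. The probe size and rate here are assumptions for illustration (64-byte probes, one per second per directed pair), not numbers from the paper:

```python
# Aggregate probe traffic across the whole overlay, assuming every node
# probes every other node once per second with a 64-byte packet.

def probe_overhead_bps(n, probe_bytes=64, rate_hz=1.0):
    pairs = n * (n - 1)                 # directed pairs: O(N^2)
    return pairs * probe_bytes * 8 * rate_hz

print(probe_overhead_bps(50))      # -> 1254400.0   (~1.25 Mbps total)
print(probe_overhead_bps(1000))    # -> 511488000.0 (~511 Mbps total)
```

Going from 50 to 1000 nodes multiplies aggregate probe traffic by roughly 400x, which is why RONs stay small.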
Application-level multicast • A.k.a. overlay multicast, end-host multicast
Why multicast? • Send the same stream of data to many hosts • Internet radio/TV/conference • Stock quote dissemination • Multiplayer network games • An efficient way to send data to many hosts
Naïve approach is wasteful • Sender’s outgoing link carries n copies of data • 128Kbps mp3 stream, 10,000 listeners = 1.28Gbps
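The slide's arithmetic, spelled out (a trivial sketch of the unicast-to-everyone cost):

```python
# Naive unicast: the sender's outgoing link carries one copy per listener.

def sender_uplink_bps(stream_kbps, listeners):
    return stream_kbps * 1000 * listeners

print(sender_uplink_bps(128, 10_000))   # -> 1280000000 (1.28 Gbps)
```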
IP multicast service model • Mimic LAN broadcast • Anyone can send, everyone hears • Use multicast address • 224.0.0.0 -- 239.255.255.255 (2^28 addresses) • Each address is called a “group” • End hosts register with routers to receive packets
Basic multicast techniques • Construct trees • Why trees? (why not meshes?) • How many trees? • Shared vs. source-specific trees • Criteria for a “good” tree? • Who builds trees? • Routers vs. end hosts
IP multicast • Routers construct multicast trees for packet replication and forwarding • Efficient (low latency, no dup pkts on links)
IP multicast: augmenting DV • How to broadcast using DV routing tables without loops? • Idea: the shortest paths from S to all nodes form a tree • RPF rule: a router duplicates and forwards a packet from source S only if it arrived via the router’s own shortest path back to S
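The RPF check itself is tiny: the router consults its ordinary DV table and compares the packet's arrival interface to the next hop it would use toward the source. A minimal sketch (the table layout and interface names are made up for illustration):

```python
# Reverse-path-forwarding acceptance test: forward a packet from source S
# out all other interfaces only if it arrived on the interface this
# router would itself use to reach S.

def rpf_accept(arrival_iface, dv_table, source):
    """dv_table maps destination -> (next_hop_iface, cost)."""
    return dv_table[source][0] == arrival_iface

# Hypothetical router with two interfaces and a converged DV table
dv = {"S": ("if0", 3), "D": ("if1", 2)}
print(rpf_accept("if0", dv, "S"))   # -> True  (on shortest path: forward)
print(rpf_accept("if1", dv, "S"))   # -> False (off-path copy: drop)
```

Because the check reuses state DV routing already maintains, RPF adds no per-group routing state for plain broadcast.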
Reverse path flooding (RPF) • [Figure: example topology of four routers a, b, c, d (links of cost 1, 1, 1, and 10) with each router’s distance-vector table] • C does not forward packets from A and vice versa • However, link a <--> c sees two packets
Reverse path broadcast (RPB) • RPF causes every “upstream” router on a LAN (link) to send a copy • RPB: only one router sends a copy • Routers listen to each other’s DV advertisements • Only the one with the lowest hop count sends
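The RPB election can be sketched as a one-line minimum over the routers attached to the LAN. Breaking cost ties by router ID is an assumption added here to make the choice deterministic; the slide only specifies "lowest hop count":

```python
# RPB: among the routers attached to a LAN, only the one with the lowest
# cost to the source forwards onto that LAN (ties broken by router ID).

def designated_forwarder(routers):
    """routers: list of (router_id, cost_to_source) for one LAN."""
    return min(routers, key=lambda r: (r[1], r[0]))[0]

lan = [("r2", 4), ("r1", 3), ("r3", 3)]
print(designated_forwarder(lan))   # -> r1 (lowest cost; wins tie by ID)
```

Each router can run this locally because it already hears its neighbors' DV advertisements, so no extra protocol messages are needed.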
IP multicast: augmenting DV • Requires symmetric paths • Needs to prune unnecessary broadcast packets to achieve multicast [Deering et al., SIGCOMM 1988; TOCS 1990]
IP multicast: augmenting LS • Basic LS: each router floods changes in its links’ state • LS w/ multicast: routers monitor local multicast group membership, and membership changes are flooded too • Routers use Dijkstra to compute shortest-path trees • How expensive is it to compute trees for N nodes, E edges, G groups?
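A rough answer to the cost question on the slide: with a binary-heap Dijkstra at O((N + E) log N) per source tree and one tree per group, the work scales with G. The numbers below are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope work estimate for per-group shortest-path trees:
# roughly G * (N + E) * log2(N) heap operations with binary-heap Dijkstra.

import math

def sp_tree_work(n, e, g):
    return g * (n + e) * math.log2(n)

# Hypothetical network: 1000 routers, 10,000 links, 500 active groups
print(round(sp_tree_work(1000, 10_000, 500)))   # tens of millions of ops
```

The multiplicative factor of G (and the need to recompute on every membership change) is one reason LS-based multicast strains routers.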
IP multicast has not taken off • Requires support from routers • Do ISPs have incentives to support multicast? • Not scalable • Routers keep state for every active group! • Multicast group addresses cannot be aggregated • Group membership changes much more frequently than links going up and down • Difficult to provide congestion/flow control, reliability and security
Overlay multicast • Multicast code run on end hosts • End hosts can copy&store data • No change to IP infrastructure needed • Easy to implement complex functionalities: flow control, security, layered multicast etc. • Less efficient: higher delay, duplicate pkts per link
Overlay multicast challenge • How can hosts form an efficient tree? • Hosts do not know all that routers know • What’s wrong with a random tree? • Stretch: packets travel farther than they have to • Stress: packets traverse links multiple times • A particular concern for access links and cross-country links
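Stretch and stress can both be computed from an overlay tree once you know which underlay links each overlay edge crosses. The topology and numbers below are made up purely to illustrate the two metrics:

```python
# Stretch: overlay path latency / direct unicast latency.
# Stress: number of copies of a packet each underlay link carries.

from collections import Counter

# Each overlay edge: (latency, underlay links it crosses)
overlay_edges = {
    ("S", "A"): (10, ["l1"]),
    ("A", "B"): (10, ["l1", "l2"]),   # doubles back over underlay link l1
}
direct_latency = {"B": 12}            # underlay latency S -> B

# Stretch for receiver B along the overlay path S -> A -> B
overlay_lat = 10 + 10
print(overlay_lat / direct_latency["B"])   # -> 1.666... (67% farther)

# Stress per underlay link
stress = Counter()
for _lat, links in overlay_edges.values():
    for link in links:
        stress[link] += 1
print(stress["l1"])                        # -> 2 (l1 carries two copies)
```

A good overlay tree keeps both numbers near 1; a random tree can make either arbitrarily bad.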
Cluster-based trees (NICE) • A hierarchy of clusters • Each cluster consists of [k, 3k-1] members • Log N depth • Ordinary members reside in 1 cluster; a cluster head also joins the next layer up, so heads reside in 2 clusters, heads of heads in 3 clusters
Cluster-based trees (NICE) • Each node knows all members of its cluster(s)
Cluster-based trees • Cluster nodes by latency • Packets do not travel too far out of the way • Not perfect • Packets are sent to cluster heads (who are in the middle), so they might overshoot
NICE in action • How to join a hierarchy? • Which is the right cluster? • How long does join take? • How to split/merge clusters? • What if a cluster head fails?
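The join walk behind the first question can be sketched as a descent through the hierarchy: starting at the top-layer head, the newcomer probes the heads one layer down and recurses into the closest one, taking O(log N) steps. This is a simplification of NICE (real NICE also handles splits, merges, and head failures), with made-up latencies:

```python
# Simplified NICE join: descend from the top-layer cluster head to the
# closest cluster head one layer down, until reaching a bottom cluster.

def nice_join(new_node, top_head, children, rtt):
    """children[h] lists h's cluster-head children one layer down;
    rtt(a, b) returns the measured latency between a and b."""
    head = top_head
    while children.get(head):                       # still layers below
        head = min(children[head], key=lambda h: rtt(new_node, h))
    return head                                     # bottom-layer cluster

# Hypothetical two-layer hierarchy with illustrative latencies
children = {"root": ["h1", "h2"], "h1": [], "h2": []}
lat = {("x", "h1"): 50, ("x", "h2"): 5}
print(nice_join("x", "root", children, lambda a, b: lat[(a, b)]))  # -> h2
```

Each step probes only one cluster's heads (O(k) nodes), so the whole join costs O(k log N) probes rather than probing every member.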
When does clustering not work well? • Key assumption: low latency is transitive • As a node descends the tree to join, it assumes the children of a close-by cluster head are also close-by • [Figure: MIT, Harvard, and Boston U, connected through Cogent and MCI; MIT & Harvard peer with each other]
Lessons • Where should a functionality reside? Routers vs. end hosts • End hosts • Scalability vs. Performance • Flexibility • Instant deployment! • Routers • Efficiency
Project draft report • You should be able to reuse your draft for the final report • You should have the related-work section complete by now • You should have a complete plan • Most of the system design • Most of the experiment design • If you have preliminary graphs, include them and try to explain them
The sandwich method for explanation • An easy example illustrating the basic idea • Detailed explanations of challenges and how your system addresses them • Does it work in general environments?