Internet Routing Renata Teixeira Laboratoire LIP6 CNRS and Université Pierre et Marie Curie
What is routing? • A famous quotation from RFC 791: “A name indicates what we seek. An address indicates where it is. A route indicates how we get there.” -- Jon Postel
Challenges for Internet routing • Scale: with over 1 billion destinations, routers can’t store every destination in their routing tables, and exchanging full routing tables would swamp links • Administrative autonomy: the Internet is a network of networks, and each network admin may want to control routing in its own network
Scalability: Classless Inter-Domain Routing (CIDR) • IP prefix represents a network (group of destinations) • Represented by two 32-bit numbers (address + mask) • Hierarchy in address allocation • Address allocation by ARIN/RIPE/APNIC and by ISPs • Today, routing tables contain ~150,000-200,000 prefixes • IP address 12.4.0.0 = 00001100 00000100 00000000 00000000 • IP mask 255.254.0.0 = 11111111 11111110 00000000 00000000 • The first 15 bits are the network prefix; the remaining 17 bits identify hosts • Usually written as 12.4.0.0/15
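The address/mask pair on this slide can be checked with Python's standard `ipaddress` module. This is only an illustrative sketch; the helper `to_bits` and the two test destinations are made up for the example.

```python
import ipaddress

# The prefix from the slide: 12.4.0.0 with a 15-bit mask.
net = ipaddress.ip_network("12.4.0.0/15")

def to_bits(addr):
    """Render an IPv4 address as four space-separated 8-bit groups."""
    return " ".join(f"{octet:08b}" for octet in addr.packed)

address_bits = to_bits(net.network_address)
mask_bits = to_bits(net.netmask)
print("address:", address_bits)
print("mask:   ", mask_bits)
print("netmask:", net.netmask)
print("addresses covered:", net.num_addresses)

# Membership test: does a destination fall inside the prefix?
# (Both destinations are hypothetical.)
inside = ipaddress.ip_address("12.5.200.1") in net
outside = ipaddress.ip_address("12.6.0.1") in net
print(inside, outside)
```

A /15 leaves 17 host bits, so the prefix spans 12.4.0.0 through 12.5.255.255.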
Reduce routing table size • Without CIDR: Big ISP announces 232.71.0.0, 232.71.1.0, 232.71.2.0, …, 232.71.255.0 to the Internet as separate entries • With CIDR: Big ISP announces the single prefix 232.71.0.0/16, which covers all of them
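The aggregation step can be sketched with `ipaddress.collapse_addresses` from the standard library. The slide does not give prefix lengths for the individual networks; the sketch assumes each one is a /24.

```python
import ipaddress

# Assumption: each listed network (232.71.0.0 ... 232.71.255.0) is a /24.
specifics = [ipaddress.ip_network(f"232.71.{i}.0/24") for i in range(256)]

# Merge contiguous networks into the smallest equivalent set of prefixes.
aggregated = list(ipaddress.collapse_addresses(specifics))

print(len(specifics), "entries ->", len(aggregated), "entry:", aggregated[0])
```

Under that assumption, 256 routing table entries shrink to the one /16 the slide shows.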
Autonomy: network of networks • Internet = interconnection of Autonomous Systems (AS) • Distinct regions of administrative control • Routers/links managed by a single “institution” • Service provider, company, university, etc. [Figure: DT, AS 1, AS 2, AS 3, and the LIP6 network]
Hierarchical routing • Inter-AS routing (Border Gateway Protocol) determines AS path and egress point • Intra-AS routing (Interior Gateway Protocol) determines path from ingress to egress • Most common: OSPF, IS-IS [Figure: DT, AS 1, AS 2, AS 3, and the LIP6 network]
Outline • Intra-domain routing • Shortest path routing • Link state: OSPF/IS-IS • Distance vector: RIP • Inter-domain routing • Challenges and goals • AS relationships • Path vector: Border Gateway Protocol (BGP)
Shortest-path routing • Path-selection model • Destination-based • Load-insensitive (e.g., static link weights) • Minimum hop count or sum of link weights [Figure: example graph with link weights]
Two types of routing algorithms • Link-state algorithm • Uses global information • Based on Dijkstra’s algorithm • Distance-vector algorithm • Information is distributed • Based on Bellman-Ford
Link-state routing • Each router keeps track of its incident links • Whether the link is up or down and the cost on the link • Each router broadcasts the link state • To give every router a complete view of the graph • Each router runs Dijkstra’s algorithm • To compute the shortest paths and construct the forwarding table • Example protocols • Open Shortest Path First (OSPF) • Intermediate System – Intermediate System (IS-IS)
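The last step, where each router runs Dijkstra's algorithm over the full graph and derives a forwarding table, might look like the sketch below. The four-node topology and its weights are hypothetical, and the "forwarding table" is simplified to a map from destination to next-hop neighbor.

```python
import heapq

def dijkstra_next_hops(graph, source):
    """Return ({dest: cost}, {dest: next_hop}) for shortest paths from source."""
    dist = {source: 0}
    next_hop = {}
    # Heap entries: (cost so far, node, first hop used to reach that node).
    heap = [(0, source, None)]
    done = set()
    while heap:
        cost, node, first = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry for an already settled node
        done.add(node)
        if first is not None:
            next_hop[node] = first
        for nbr, weight in graph[node].items():
            new_cost = cost + weight
            if nbr not in done and new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                # A direct neighbor of the source is its own first hop.
                hop = nbr if node == source else first
                heapq.heappush(heap, (new_cost, nbr, hop))
    return dist, next_hop

# Hypothetical link-state database: symmetric links with static weights.
graph = {
    "A": {"B": 2, "C": 1},
    "B": {"A": 2, "C": 3, "D": 1},
    "C": {"A": 1, "B": 3, "D": 5},
    "D": {"B": 1, "C": 5},
}
dist, table = dijkstra_next_hops(graph, "A")
print(dist)
print(table)
```

Every router runs the same computation on the same database with itself as `source`, which is what makes hop-by-hop forwarding consistent.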
Detecting topology changes • Beaconing • Periodic “hello” messages in both directions • Detect a failure after a few missed “hellos” • Performance trade-offs • Detection speed • Overhead on link bandwidth and CPU • Likelihood of false detection
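A minimal sketch of hello-based failure detection, assuming a fixed hello interval and a fixed number of tolerated misses. The default values below are illustrative assumptions, not taken from the slides or from any particular protocol configuration.

```python
def link_is_up(last_hello_time, now, hello_interval=10.0, missed_limit=3):
    """Treat the link as down once `missed_limit` hello intervals pass in silence.

    hello_interval and missed_limit are hypothetical defaults.
    """
    dead_interval = hello_interval * missed_limit
    return (now - last_hello_time) < dead_interval

# Last hello heard at t=100 s.
print(link_is_up(100.0, 125.0))  # still within the dead interval
print(link_is_up(100.0, 130.0))  # dead interval has elapsed
```

Shrinking `hello_interval` speeds up detection but raises bandwidth and CPU overhead; shrinking `missed_limit` raises the chance of declaring a healthy link dead.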
Broadcasting the link state • Flooding • Node sends link-state information out its links • And then the next node sends out all of its links • … except the one where the information arrived [Figure: flooding from X through nodes A, B, C, D, panels (a) to (d)]
Broadcasting the link state • Reliable flooding • Ensure all nodes receive link-state information • … and that they use the latest version • Challenges • Packet loss • Out-of-order arrival • Solutions • Acknowledgments and retransmissions • Sequence numbers • Time-to-live for each packet
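The sketch below combines the flooding rule (send on every link except the one the information arrived on) with sequence numbers so that duplicates and stale copies are dropped. It does not model acknowledgments, retransmissions, or time-to-live, and the topology connecting X, A, B, C, and D is a guess, since the figure's links are not recoverable.

```python
from collections import deque

def flood(links, origin, seq, databases):
    """Flood a link-state advertisement (origin, seq); return transmissions used.

    links:     {node: set of neighbors}
    databases: {node: {origin: highest sequence number seen}}
    """
    transmissions = 0
    databases[origin][origin] = seq
    queue = deque((origin, nbr) for nbr in sorted(links[origin]))
    while queue:
        sender, receiver = queue.popleft()
        transmissions += 1
        if databases[receiver].get(origin, -1) >= seq:
            continue  # duplicate or stale copy: drop it, do not re-flood
        databases[receiver][origin] = seq
        for nbr in sorted(links[receiver]):
            if nbr != sender:  # every link except the arrival link
                queue.append((receiver, nbr))
    return transmissions

# Hypothetical topology using the node names from the slide.
links = {
    "X": {"A"},
    "A": {"X", "B", "C"},
    "B": {"A", "D"},
    "C": {"A", "D"},
    "D": {"B", "C"},
}
databases = {node: {} for node in links}
first = flood(links, "X", seq=1, databases=databases)
repeat = flood(links, "X", seq=1, databases=databases)
print(first, repeat, databases)
```

The second call re-sends an advertisement that every node already holds, so it is dropped at the first hop instead of circulating again.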
When to initiate flooding • Topology change • Link or node failure • Link or node recovery • Configuration change • Link cost change • Periodically • Refresh the link-state information • Typically (say) 30 minutes • Corrects for possible corruption of the data
Convergence • Getting consistent routing information to all nodes • E.g., all nodes having the same link-state database • Consistent forwarding after convergence • All nodes have the same link-state database • All nodes forward packets on shortest paths • The next router on the path forwards to the next hop [Figure: example graph with link weights]
Transient disruptions • Detection delay • A node does not detect a failed link immediately • … and forwards data packets into a “blackhole” • Depends on timeout for detecting lost hellos [Figure: example graph with link weights]
Transient disruptions • Inconsistent link-state database • Some routers know about failure before others • The shortest paths are no longer consistent • Can cause transient forwarding loops [Figure: two copies of the example graph with link weights]
Convergence delay • Sources of convergence delay • Detection latency • Flooding of link-state information • Shortest-path computation • Creating the forwarding table • Performance during convergence period • Lost packets due to blackholes and TTL expiry • Looping packets consuming resources • Out-of-order packets reaching the destination • Very bad for VoIP, online gaming, and video
Reducing convergence delay • Faster detection • Smaller hello timers • Link-layer technologies that can detect failures • Faster flooding • Flooding immediately • Sending link-state packets with high-priority • Faster computation • Faster processors on the routers • Incremental Dijkstra algorithm • Faster forwarding-table update • Data structures supporting incremental updates
Bellman-Ford algorithm • Define distances at each node x • dx(y) = cost of least-cost path from x to y • Update distances based on neighbors • dx(y) = min {c(x,v) + dv(y)} over all neighbors v • Example: du(z) = min{c(u,v) + dv(z), c(u,w) + dw(z)} [Figure: example graph with nodes s, t, u, v, w, x, y, z and link weights]
Distance vector algorithm • c(x,v) = cost of direct link from x to v • Node x maintains costs of direct links c(x,v) • Dx(y) = estimate of least cost from x to y • Node x maintains distance vector Dx = [Dx(y): y ∈ N] • Node x maintains its neighbors’ distance vectors • For each neighbor v, x maintains Dv = [Dv(y): y ∈ N] • Each node v periodically sends Dv to its neighbors • And neighbors update their own distance vectors • Dx(y) ← min over neighbors v of {c(x,v) + Dv(y)} for each node y ∈ N • Over time, the distance vector Dx converges
Distance vector algorithm • Iterative, asynchronous: each local iteration is caused by a local link cost change or a distance vector update message from a neighbor • Distributed: each node notifies neighbors only when its DV changes; neighbors then notify their neighbors if necessary • Each node: wait for (change in local link cost or message from neighbor), recompute estimates, and if DV to any destination has changed, notify neighbors
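The update rule Dx(y) ← min over v of {c(x,v) + Dv(y)} can be sketched as a simulation. For simplicity the sketch runs in synchronous rounds with static link costs, whereas the algorithm on the slide is asynchronous; the three-node topology and its costs are hypothetical.

```python
INF = float("inf")

def distance_vector(costs):
    """costs: {x: {v: c(x,v)}} for direct links. Returns ({x: {y: D_x(y)}}, rounds)."""
    nodes = sorted(costs)
    # Initially each node knows only the cost of its direct links.
    D = {x: {y: (0 if x == y else costs[x].get(y, INF)) for y in nodes}
         for x in nodes}
    rounds = 0
    changed = True
    while changed:
        changed = False
        rounds += 1
        # Vectors as last advertised to neighbors in the previous round.
        advertised = {x: dict(D[x]) for x in nodes}
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                best = min(costs[x][v] + advertised[v][y] for v in costs[x])
                if best < D[x][y]:  # static costs: estimates only improve
                    D[x][y] = best
                    changed = True
    return D, rounds

# Hypothetical topology: x-y costs 4, y-z costs 1, x-z costs 50.
costs = {
    "x": {"y": 4, "z": 50},
    "y": {"x": 4, "z": 1},
    "z": {"x": 50, "y": 1},
}
D, rounds = distance_vector(costs)
print(D, rounds)
```

Each node reads only its neighbors' vectors, never the full topology, which is the key difference from link-state routing.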
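Changes in a link cost • The cost of link x-y rises from 4 to 60 • Dy(x) = min{c(y,x) + Dx(x), c(y,z) + Dz(x)} = min{60+0, 1+5} = 6 • Routing loop between y and z • This process will continue for 44 iterations! • Usually called the count-to-infinity problem [Figure: nodes X, Y, Z with link costs X-Y 4 (changing to 60), Y-Z 1, X-Z 50, and the resulting sequence of distance-table updates]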
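The 44 iterations can be reproduced with a small simulation. It assumes the costs suggested by the slide's equation and figure labels (c(y,x) rises from 4 to 60, c(y,z) = 1, c(z,x) = 50) and that y and z strictly alternate updates; the function name and structure are illustrative.

```python
def count_to_infinity(c_yx_new=60, c_yz=1, c_zx=50, d_y_old=4):
    """Return (updates made while y and z point at each other, final D_y(x), final D_z(x))."""
    d_z = c_yz + d_y_old              # before the change z reaches x via y: 5
    d_y = min(c_yx_new, c_yz + d_z)   # y's first recomputation: min(60, 6) = 6
    looping_updates = 0
    turn = "z"
    while True:
        if turn == "z":
            via_y = c_yz + d_y
            if via_y > c_zx:
                d_z = c_zx            # z finally prefers its direct link: loop ends
                break
            d_z = via_y               # z still routes via y
        else:
            d_y = min(c_yx_new, c_yz + d_z)  # y still routes via z
        looping_updates += 1
        turn = "y" if turn == "z" else "z"
    d_y = min(c_yx_new, c_yz + d_z)   # y's final estimate once z is stable
    return looping_updates, d_y, d_z

result = count_to_infinity()
print(result)
```

Each exchange raises the estimate by only 1, so the loop persists until the estimate climbs from 6 past the cost of z's direct link.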
Routing Information Protocol (RIP) • Distance vector protocol • Nodes send distance vectors every 30 seconds • … or, when an update causes a change in routing • Link costs in RIP • All links have cost 1 • Valid distances of 1 through 15 • … with 16 representing infinity • Small “infinity” means a smaller “counting to infinity” problem • RIP is limited to fairly small networks • E.g., used in the Princeton campus network
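The effect of a small "infinity" can be sketched in a few lines, assuming hop-count metrics as described on the slide. The function names are illustrative and not part of any RIP implementation.

```python
RIP_INFINITY = 16  # metric value meaning "unreachable"

def rip_update(neighbor_metric):
    """Metric for a route learned from a neighbor: add link cost 1, cap at 16."""
    return min(neighbor_metric + 1, RIP_INFINITY)

def is_reachable(metric):
    """Valid distances are 1 through 15."""
    return metric < RIP_INFINITY

# A route that keeps bouncing between two routers reaches "infinity" quickly.
metric = 1
steps = 0
while is_reachable(metric):
    metric = rip_update(metric)
    steps += 1
print(steps, metric)
```

Counting to infinity therefore stops after at most 15 increments, at the price of limiting paths to 15 hops.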
Link state vs. distance vector • Message complexity: LS: with n nodes and E links, O(nE) messages sent; DV: exchange between neighbors only • Speed of convergence: LS: O(n²) algorithm requires O(nE) messages; DV: convergence time varies, may have routing loops and the count-to-infinity problem • Robustness (what happens when a router malfunctions?): LS: a node can advertise an incorrect link cost, and each node computes only its own table; DV: a node can advertise an incorrect path cost, and each node’s table is used by others, so the error propagates
Conclusions: Intra-domain routing • Routing is a distributed algorithm • React to changes in the topology • Compute the shortest paths • Two main shortest-path algorithms • Dijkstra link-state routing (e.g., OSPF and IS-IS) • Bellman-Ford distance vector routing (e.g., RIP) • Convergence process • Changing from one topology to another • Transient periods of inconsistency across routers
Outline • Inter-domain routing • AS relationships • Path-vector routing • Border Gateway Protocol (BGP) • Incremental, prefix-based, path-vector protocol • Internal vs. external BGP • Business relationships and traffic engineering with BGP • BGP convergence delay
Hierarchy of autonomous systems • Large, tier-1 provider with a nationwide backbone • At the “core” of the Internet, don’t have providers • Medium-sized regional provider with smaller backbone • Small network run by a single company or university [Figure: Large, Big, Medium1, Medium2, Univ]
Connections between networks [Figure: networks Big, Large, Medium, and Small, connected through gateway routers, access routers, Internet exchange points (IXP), and private peering; customers attach via dial-in access or as commercial customers]
Single-homed customers • Univ has only one connection to the Internet [Figure: Big, Large, Medium1, Medium2, Univ]
Multi-homed customers • Same provider: e.g., Medium2 to Big • Different providers: e.g., Medium2 to Big and Huge [Figure: Big, Large, Medium1, Medium2, Univ]
Customer-provider relationship • Customer needs to be reachable from everyone • Provider exports routes learned from customer to everyone • Customer does not want to provide transit service • Customer does not export from one provider to another • Univ is a customer of Medium1 • Medium2 is a customer of Big and Large [Figure: Big, Large, Medium1, Medium2, Univ; traffic to/from Univ is carried, transit traffic is not allowed]
Peer-peer relationship • Peers exchange traffic between customers • AS exports only customer routes to a peer • AS exports a peer’s routes only to its customers • Big and Large are peers • Big and Medium1 are peers [Figure: Big, Large, Medium1, Medium2, Univ; customers exchange traffic; Big doesn’t provide transit for its peers]
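The export rules from the customer-provider and peer-peer slides can be condensed into one predicate. This is a sketch: the function name and string labels are illustrative, and exporting provider-learned routes to customers is an assumption (the slides imply it, since customers must be able to reach everyone, but do not state it).

```python
def should_export(learned_from, export_to):
    """Decide whether a route learned from one neighbor is exported to another.

    Both arguments describe what the neighbor is to us:
    'customer', 'peer', or 'provider'.
    """
    if learned_from == "customer":
        return True  # customer routes are exported to everyone
    # Routes learned from peers or providers go only to customers;
    # the provider-to-customer case is an assumption, see the lead-in.
    return export_to == "customer"

for src in ("customer", "peer", "provider"):
    for dst in ("customer", "peer", "provider"):
        print(f"{src:>8} -> {dst:<8} {should_export(src, dst)}")
```

The `provider -> provider` case returning False is exactly "customer does not export from one provider to another".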
Peering provides shortcuts • Peering also allows connectivity between the customers of “Tier 1” providers [Figure: ASes linked by peer-peer and provider-customer relationships]
How are peering decisions made? • Peer: reduces upstream transit costs; can increase end-to-end performance; may be the only way to connect your customers to some part of the Internet (“Tier 1”) • Don’t peer: you would rather have customers; peers are usually your competition; peering relationships may require periodic renegotiation
Interconnected ASes • Forwarding table is configured by both intra- and inter-AS routing algorithms • Intra-AS sets entries for internal dests • Inter-AS & intra-AS set entries for external dests [Figure: AS 1 (routers 1a-1d), AS 2 (2a-2c), AS 3 (3a-3c); the intra-AS and inter-AS routing algorithms both feed the forwarding table]
Inter-AS tasks • Suppose a router in AS1 receives a datagram whose dest is outside of AS1 • Router should forward packet towards one of the gateway routers, but which one? • AS1 needs to learn which dests are reachable through AS2 and which through AS3 • AS1 needs to propagate this reachability info to all routers in AS1 • Job of inter-AS routing! [Figure: AS 1 (routers 1a-1d), AS 2 (2a-2c), AS 3 (3a-3c)]
Example: Setting forwarding table in router 1d • Suppose AS1 learns (via inter-AS protocol) that prefix x is reachable via AS3 (gateway 1a) but not via AS2. • Inter-AS protocol propagates reachability info to all internal routers. • Router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1a. • Puts in forwarding table entry (x,I). [Figure: AS 1 (routers 1a-1d), AS 2 (2a-2c), AS 3 (3a-3c)]
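The two lookups in this example (inter-AS: which gateway reaches x; intra-AS: which local interface leads to that gateway) can be sketched as a simple join. The dictionary layout, the function name, and the second interface "J" are hypothetical.

```python
def build_external_entries(reachable_via, interface_toward):
    """Combine inter-AS and intra-AS information at one router.

    reachable_via:    {prefix: gateway router}, learned via the inter-AS protocol
    interface_toward: {router: outgoing interface on the least-cost path},
                      computed by the intra-AS protocol at this router
    """
    return {prefix: interface_toward[gateway]
            for prefix, gateway in reachable_via.items()}

# View from router 1d, following the slide: x is reachable via gateway 1a,
# and interface I is on 1d's least-cost path to 1a.
reachable_via = {"x": "1a"}
interface_toward = {"1a": "I", "1b": "J"}  # "J" is a made-up second interface
forwarding_table = build_external_entries(reachable_via, interface_toward)
print(forwarding_table)
```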
Example: Choosing among multiple ASes • Now suppose AS1 learns from the inter-AS protocol that prefix x is reachable from AS3 and from AS2. • To configure the forwarding table, router 1d must determine towards which gateway it should forward packets for dest x. • This is also the job of the inter-AS routing protocol! [Figure: AS 1 (routers 1a-1d), AS 2 (2a-2c), AS 3 (3a-3c)]
Challenges for inter-domain routing • Scale • Prefixes: 150,000-200,000, and growing • ASes: 30,000 visible ones, and growing • Privacy • ASes don’t want to divulge internal topologies • … or their business relationships with neighbors • Policy • No Internet-wide notion of a link cost metric • Need control over where you send traffic • … and who can send traffic through you
Shortest-path routing is restrictive • All traffic must travel on shortest paths • All nodes need common notion of link costs • Incompatible with commercial relationships [Figure: National ISP1 and ISP2, Regional ISP1 to ISP3, Cust1 to Cust3; one path marked YES and another marked NO]
Link-state routing is problematic • Topology information is flooded • High bandwidth and storage overhead • Forces nodes to divulge sensitive information • Entire path computed locally per node • High processing overhead in a large network • Minimizes some notion of total distance • Works only if policy is shared and uniform • Typically used only inside an AS • E.g., OSPF and IS-IS
Distance vector is on the right track • Advantages • Hides details of the network topology • Nodes determine only “next hop” toward the dest • Disadvantages • Minimizes some notion of total distance, which is difficult in an inter-domain setting • Slow convergence due to the counting-to-infinity problem (“bad news travels slowly”) • Idea: extend the notion of a distance vector
Path-vector routing • Extension of distance-vector routing • Support flexible routing policies • Avoid count-to-infinity problem • Key idea: advertise the entire path • Distance vector: send distance metric per dest d • Path vector: send the entire path for each dest d [Figure: node 1 advertises “d: path(1)” to node 2, which advertises “d: path(2,1)” to node 3; data traffic flows in the opposite direction toward d]
Faster loop detection • Node can easily detect a loop • Look for its own node identifier in the path • E.g., node 1 sees itself in the path “3, 2, 1” • Node can simply discard paths with loops • E.g., node 1 simply discards the advertisement [Figure: advertisements “d: path(1)”, “d: path(2,1)”, and “d: path(3,2,1)” passing among nodes 1, 2, and 3]
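Path-vector propagation and the loop check can be sketched in a few lines, using the node numbers from the slide. The function names are illustrative, and a path is modeled as a plain list of node identifiers.

```python
def accept_advertisement(my_id, path):
    """Discard any advertised path that already contains our own identifier."""
    return my_id not in path

def readvertise(my_id, path):
    """Prepend our identifier before passing the path on to neighbors."""
    return [my_id] + path

path_from_1 = [1]                            # d: path(1)
path_from_2 = readvertise(2, path_from_1)    # d: path(2,1)
path_from_3 = readvertise(3, path_from_2)    # d: path(3,2,1)

print(accept_advertisement(3, path_from_2))  # node 3 is not in "2, 1"
print(accept_advertisement(1, path_from_3))  # node 1 sees itself in "3, 2, 1"
```

Because the check needs only the advertisement itself, a node rejects a looping route immediately instead of waiting for a metric to count up.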