Routing Jinyang Li
Administrivia • Hand in PS1 to Hui Zhang before you leave! • Email me project team list by Oct-1
Routing basics • Model the network as a graph • Goal: Find a (best) path from A to B • Your favorite path finding algorithm? • Breadth first search • Bellman-Ford • Dijkstra • Floyd-Warshall • Routing protocols must be decentralized
Challenges • Network topology is dynamic • Links go up and down • Nodes go up and down • Link costs (metrics) change • Nodes might have stale information • Nodes might have different information
Basic decentralized routing algorithms • Distance-vector (DV) • Link state (LS)
Distance vector routing • Based on Bellman-Ford • Nodes only keep path metrics to all destinations • Neighboring nodes exchange path metrics
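Distance vector routing • Example topology (link costs): a-b 1, a-c 10, b-c 1, c-d 1 • Initial routing tables (destination: next hop, metric) • Node a: a: a, 0; b: b, 1; c: c, 10 • Node b: a: a, 1; b: b, 0; c: c, 1 • Node c: a: a, 10; b: b, 1; c: c, 0; d: d, 1 • Node d: c: c, 1; d: d, 0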
DV: routing table update • Topology and initial tables as on the previous slide • a receives b's vector (a: 1, b: 0, c: 1) and adds the a-b link cost; its table becomes a: a, 0; b: b, 1; c: b, 2 • b receives c's vector (a: 10, b: 1, c: 0, d: 1) and adds the b-c link cost; its table becomes a: a, 1; b: b, 0; c: c, 1; d: c, 2
DV: routing table update • When does DV find best paths? • Static topology, synchronous exchange • A node learns the best path ≤ x hops after x rounds of exchange • DV converges after n rounds if the longest shortest path is n hops
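The update rule and the convergence claim above can be tried out on the four-node example topology. The following is a minimal sketch, not the lecture's own code: the function names are made up for illustration, exchanges are assumed to be synchronous, and rounds are counted after each node has already filled in its directly connected neighbors.

```python
INF = float("inf")

# Link costs from the example topology (a-b 1, a-c 10, b-c 1, c-d 1).
LINKS = {("a", "b"): 1, ("a", "c"): 10, ("b", "c"): 1, ("c", "d"): 1}

def neighbors(links):
    """Build {node: {neighbor: cost}} from an undirected link map."""
    nbrs = {}
    for (u, v), cost in links.items():
        nbrs.setdefault(u, {})[v] = cost
        nbrs.setdefault(v, {})[u] = cost
    return nbrs

def initial_tables(nbrs):
    """Each node starts knowing only itself and its direct neighbors."""
    tables = {}
    for node, adj in nbrs.items():
        tables[node] = {node: (node, 0)}
        for n, cost in adj.items():
            tables[node][n] = (n, cost)
    return tables

def dv_round(nbrs, tables):
    """One synchronous exchange: every node reads its neighbors' old tables."""
    new = {node: dict(t) for node, t in tables.items()}
    changed = False
    for node, adj in nbrs.items():
        for n, link_cost in adj.items():
            for dest, (_, metric) in tables[n].items():
                candidate = link_cost + metric
                # DV update rule: keep the route only if it is strictly better.
                if candidate < new[node].get(dest, (None, INF))[1]:
                    new[node][dest] = (n, candidate)
                    changed = True
    return new, changed

def run_dv(links):
    """Repeat rounds until no table changes; return tables and round count."""
    nbrs = neighbors(links)
    tables = initial_tables(nbrs)
    rounds = 0
    while True:
        tables, changed = dv_round(nbrs, tables)
        if not changed:
            return tables, rounds
        rounds += 1

tables, rounds = run_dv(LINKS)
print(tables["a"], rounds)
```

Under these assumptions node a ends with the table a: a, 0; b: b, 1; c: b, 2; d: b, 3, the same table shown for a on the later slides.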
DV under dynamics • DV update rule • reduce path metric if get a better one from nbr • Always correct w/ static topology • Might be wrong when topology changes • My path metric is based on new topology • Neighbor’s path metric could be for old topology
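DV: count-to-infinity • Converged tables (destination: next hop, metric) on the topology a-b 1, a-c 10, b-c 1, c-d 1 • Node a: a: a, 0; b: b, 1; c: b, 2; d: b, 3 • Node b: a: a, 1; b: b, 0; c: c, 1; d: c, 2 • Node c: a: b, 2; b: b, 1; c: c, 0; d: d, 1 • Node d: a: c, 3; b: c, 2; c: c, 1; d: d, 0 • The b-c link fails • b's table becomes a: a, 1; b: b, 0; c: inf; d: inf • c's table becomes a: a, 10; b: inf; c: c, 0; d: d, 1 • b receives a's old vector (a: 0, b: 1, c: 2, d: 3) and updates to a: a, 1; b: b, 0; c: a, 3; d: a, 4 • a receives b's new vector (a: 1, b: 0, c: 3, d: 4) and updates to a: a, 0; b: b, 1; c: b, 4; d: b, 5 • Incorrect update of path metric based on old topology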
(Partial) solutions to count-to-infinity • Make infinity a finite number (e.g. 64) • Split-horizon • Do not advertise routes you learnt from neighbor N back to N • Split-horizon with poison reverse • Advertise routes you learnt from neighbor N back to N with an infinite metric
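The difference between the two split-horizon variants shows up in the vector a node builds for one particular neighbor. A minimal sketch follows, assuming tables map each destination to a (next hop, metric) pair as on the slides; the function name and the `mode` strings are hypothetical.

```python
INF = float("inf")

def advertise(table, to_neighbor, mode="plain"):
    """Build the distance vector sent to one neighbor.

    table maps destination -> (next_hop, metric).
    mode is "plain", "split_horizon" or "poison_reverse" (illustrative names).
    """
    vector = {}
    for dest, (next_hop, metric) in table.items():
        learnt_from_neighbor = next_hop == to_neighbor
        if learnt_from_neighbor and mode == "split_horizon":
            continue                      # stay silent about this route
        if learnt_from_neighbor and mode == "poison_reverse":
            vector[dest] = INF            # advertise it as unreachable
        else:
            vector[dest] = metric
    return vector

# Node a's converged table from the example topology.
A_TABLE = {"a": ("a", 0), "b": ("b", 1), "c": ("b", 2), "d": ("b", 3)}

print(advertise(A_TABLE, "b"))
print(advertise(A_TABLE, "b", "split_horizon"))
print(advertise(A_TABLE, "b", "poison_reverse"))
```

With either variant, b never hears a's stale routes to c and d, so the two-node loop in the count-to-infinity example cannot form. Loops through three or more nodes are still possible, which is why the slide calls these partial solutions.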
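Path vector: no more count-to-infinity • Each table entry carries the full path (destination: path, metric) • Node a: a: a, 0; b: b, 1; c: b,c, 2; d: b,c,d, 3 • Node b: a: a, 1; b: b, 0; c: c, 1; d: c,d, 2 • Node c: a: b,a, 2; b: b, 1; c: c, 0; d: d, 1 • Node d: a: c,b,a, 3; b: c,b, 2; c: c, 1; d: d, 0 • The b-c link fails • b's table becomes a: a, 1; b: b, 0; c: inf; d: inf • c's table becomes a: a, 10; b: inf; c: c, 0; d: d, 1 • b receives a's vector (a: a, 0; b: b, 1; c: b,c, 2; d: b,c,d, 3); the paths to c and d already contain b, so b's table stays a: a, 1; b: b, 0; c: inf; d: inf • Discard old info based on path vector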
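The discard step can be written as a loop check on each advertised path. This is a minimal sketch with hypothetical names; it assumes a path lists the nodes from the sender's next hop to the destination, and that the sender's route to itself has an empty path (a simplification of the "a: a, 0" entry on the slide).

```python
def import_vector(me, neighbor, link_cost, vector):
    """Return the routes from a neighbor's path vector that `me` may use.

    vector maps destination -> (path, metric), where path is the list of
    nodes from the sender's next hop to the destination.
    """
    usable = {}
    for dest, (path, metric) in vector.items():
        if me in path:
            continue  # the path already goes through me: stale or looping
        usable[dest] = ([neighbor] + path, link_cost + metric)
    return usable

# a's vector after the b-c link has failed but before a has heard about it.
A_VECTOR = {
    "a": ([], 0),
    "b": (["b"], 1),
    "c": (["b", "c"], 2),
    "d": (["b", "c", "d"], 3),
}

print(import_vector("b", "a", 1, A_VECTOR))
```

Node b keeps only the route to a itself, so its entries for c and d stay at infinity instead of counting upward.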
DV Summary • Periodic exchange among neighbors • Each update has O(N) size, N is the # of nodes (routable prefixes) • Convergence delays • Explicit path vector info speeds up convergence
Link State Routing • In DV, topology is implicit in the routing tables • Convergence is delayed when using old topology for updates • LSR: make topology explicit!
Link State Routing • Each node keeps link state information (complete topology) • Each node computes paths based on topology using a centralized algorithm (Dijkstra) • Example on the topology a-b 1, a-c 10, b-c 1, c-d 1: a computes a: a, 0; b: b, 1; c: b, 2; d: b, 3
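A node that holds the complete topology can compute its table locally. The sketch below runs Dijkstra's algorithm on the example topology and records the first hop for each destination; the function and variable names are illustrative, not taken from any routing implementation.

```python
import heapq

# Link costs from the example topology (a-b 1, a-c 10, b-c 1, c-d 1).
LINKS = {("a", "b"): 1, ("a", "c"): 10, ("b", "c"): 1, ("c", "d"): 1}

def dijkstra_table(links, source):
    """Return {destination: (first_hop, metric)} as seen from `source`."""
    nbrs = {}
    for (u, v), cost in links.items():
        nbrs.setdefault(u, []).append((v, cost))
        nbrs.setdefault(v, []).append((u, cost))

    table = {}
    heap = [(0, source, source)]  # (metric, node, first hop on the path)
    while heap:
        metric, node, first_hop = heapq.heappop(heap)
        if node in table:
            continue  # already settled with a shorter or equal metric
        table[node] = (first_hop, metric)
        for n, cost in nbrs[node]:
            if n not in table:
                # Leaving the source, the first hop is the neighbor itself.
                hop = n if node == source else first_hop
                heapq.heappush(heap, (metric + cost, n, hop))
    return table

print(dijkstra_table(LINKS, "a"))
```

For source a this yields a: a, 0; b: b, 1; c: b, 2; d: b, 3, matching the table on the slide.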
Link state updates • Both ends of a link flood link state to the entire network • Immediately upon change • Periodically with a long period • LS seq # distinguishes old LS from new ones • Old LS times out eventually
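The roles of the sequence number and the timeout can be sketched as a small link-state database. The field names (`link`, `seq`, `cost`, `heard_at`) and both function names are hypothetical; real protocols such as OSPF add further rules (sequence wraparound, aging fields) that are left out here.

```python
def receive_ls(db, update, now):
    """Store a link-state update if it is newer than what we hold.

    Returns True when the update is new and should be flooded onward,
    False when it is a duplicate or older than the stored copy.
    """
    link = update["link"]
    known = db.get(link)
    if known is not None and update["seq"] <= known["seq"]:
        return False
    db[link] = {"seq": update["seq"], "cost": update["cost"], "heard_at": now}
    return True

def expire_old(db, now, max_age):
    """Drop entries that have not been refreshed within max_age seconds."""
    for link in [l for l, entry in db.items() if now - entry["heard_at"] > max_age]:
        del db[link]

db = {}
print(receive_ls(db, {"link": ("b", "c"), "seq": 1, "cost": 1}, now=0))    # new
print(receive_ls(db, {"link": ("b", "c"), "seq": 1, "cost": 1}, now=5))    # duplicate
print(receive_ls(db, {"link": ("b", "c"), "seq": 2, "cost": 7}, now=10))   # newer
```

Returning False for duplicates is also what stops a flood from circulating forever: a node forwards each (link, seq) pair at most once.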
Link State vs. DV • Routing state • LS: O(E) to keep complete topology • DV: O(N) to keep path metrics to all nodes • Routing message overhead • LS: O(E*E) floods each LS to entire network • DV: O(E*N) to exchange routing tables on all links • LS converges faster than DV • Does LS guarantee loop-free forwarding?
Common link metrics • What’s the “cost” of different links? • 1 (every link costs the same, so path cost is hop count) • Latency • Bandwidth • Queue length • …
Other routing algorithms? • LS/DV find optimal paths • Both incur substantial message overhead • Trade off path optimality for lower overhead? • Compact routing: O( ) state, 3 times longer paths in the worst case • Geographic routing: constant state
Routing on the Internet -- from algorithms to protocols
Intra-domain routing • Goal: • Find best paths between all intra-AS networks • Traffic engineering to load balance different paths • Popular IGP (interior gateway) protocols: • OSPF (LS) • IS-IS (LS) • RIP (DV)
Inter-domain routing • Goal: • Provide reachability for different ASes • Comply with policies of different ASes • BGP: path vector based on ASes
BGP • Routing policies • Protocol operations • Disseminating BGP routes within an AS • BGP challenges • Policy interactions • Multi-homing • Security
Inter-AS topology is not simply a graph • Figure: AT&T, another ISP, small ISP, NYU, customer • NYU pays $ to small ISP • Small ISP pays $$ to AT&T • Peering link: free for traffic between customers of two peering ISPs only
BGP export policy: what to reveal to neighbors? • If you tell N about A --> you agree to forward traffic from N to A • If you do not want to forward traffic to A, don’t tell others about it • Always advertise customer routes • Carrying traffic for customer brings $$$ • Advertise non-customer routes to customers only • If you advertise non-customer routes to another provider/peer, you are carrying traffic for nothing!
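The export rules above reduce to a two-line predicate. This is a minimal sketch, assuming each neighbor is labeled with one of three relationship strings; the labels and the function name are illustrative.

```python
def should_export(learnt_from, advertise_to):
    """Decide whether to tell a neighbor about a route.

    Both arguments are relationship labels: "customer", "peer" or "provider".
    """
    if learnt_from == "customer":
        return True                      # customer routes go to everyone
    return advertise_to == "customer"    # other routes go to customers only

for src in ("customer", "peer", "provider"):
    for dst in ("customer", "peer", "provider"):
        print(f"route from {src:8} -> advertise to {dst:8}: {should_export(src, dst)}")
```

The only combinations that return False are the four where neither end is a customer, which is exactly the "carrying traffic for nothing" case.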
BGP import policies: which route to use? • Not simply shortest path! • Different preferences for routes from different ASes • Customer > peer > provider • Customers pay for their traffic • Avoid paying providers by using peer routes
Example BGP routes
>show ip bgp 216.165.108.8
BGP routing table entry for 216.165.0.0/17, version 221058
Paths: (41 available, best #39, table Default-IP-Routing-Table)
  Not advertised to any peer
  4513 701 7018 12 12, (aggregated by 12 192.76.177.66)
    209.10.12.125 from 209.10.12.125 (209.10.12.125)
      Origin IGP, metric 4103, localpref 100, valid, external, atomic-aggregate
….
• Figure annotations: Longest matching prefix (216.165.0.0/17) • AS Path (4513 701 7018 12 12) • Next hop (209.10.12.125) • High values are better (localpref)
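The lookup of 216.165.108.8 above returns the entry for 216.165.0.0/17 because that is the longest prefix covering the address. A minimal sketch of the matching rule using Python's standard `ipaddress` module follows; apart from 216.165.0.0/17, the prefixes in the table are made up for illustration.

```python
import ipaddress

def longest_match(prefixes, address):
    """Return the most specific prefix containing `address`, or None."""
    addr = ipaddress.ip_address(address)
    networks = [ipaddress.ip_network(p) for p in prefixes]
    matches = [net for net in networks if addr in net]
    if not matches:
        return None
    return str(max(matches, key=lambda net: net.prefixlen))

# Hypothetical table; only 216.165.0.0/17 appears in the router output above.
TABLE = ["0.0.0.0/0", "216.0.0.0/8", "216.165.0.0/17", "216.165.128.0/17"]

print(longest_match(TABLE, "216.165.108.8"))
```

A linear scan is enough to show the rule; routers use tries or hardware lookup tables to do the same thing at line rate.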
Route selection based on attributes (compared in this order) • Local Pref: used to prefer customer > peer > provider; high values are better • ASPATH: prefer paths with lowest # of ASes • MED: tell others to choose one exit point over another; low values are better • IGP path cost: lower values are better; leads to “hot potato” routing • Router ID
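The attribute list can be read as a lexicographic comparison. The sketch below encodes it as a sort key, with several stated simplifications: the `Route` class and its field names are hypothetical, MED is compared across all routes (real BGP compares it only between routes from the same neighboring AS), the lowest router ID is assumed to win the final tie-break, and steps that the slide does not list are omitted.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Route:
    local_pref: int
    as_path: tuple
    med: int
    igp_cost: int
    router_id: str  # dotted quad, e.g. "10.0.0.1"

def preference_key(route):
    """Smaller key means more preferred."""
    return (
        -route.local_pref,                                  # high is better
        len(route.as_path),                                 # fewer ASes is better
        route.med,                                          # low is better
        route.igp_cost,                                     # low is better
        tuple(int(x) for x in route.router_id.split(".")),  # final tie-break
    )

def best_route(routes):
    return min(routes, key=preference_key)

customer = Route(200, (7, 8, 9, 12), 0, 50, "10.0.0.3")
peer = Route(100, (5, 12), 0, 10, "10.0.0.1")
print(best_route([customer, peer]))
```

The customer route wins despite its longer AS path, because local preference is compared first.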
Hot potato routing • All ASes want to get rid of external traffic asap • Hot potato routing causes asymmetric traffic • Figure labels: MED=500, MED=100, blue AS’ preferred route
BGP operations • A router establishes a BGP session with its neighbors over TCP • Neighbors might be many hops away • Two neighbors exchange • UPDATE (announcements, withdrawal) • KEEPALIVE
Disseminating routes within an AS • Routers inside an AS establish iBGP sessions to learn external routes • Routers establish eBGP sessions between different ASes
Challenges of route dissemination • Loop free • Routers should not disagree on how to route • Complete • Each router chooses route as if it knows all external routes from all eBGP sessions • Scalable
A strawman that works: full mesh dissemination • Each router establishes an iBGP session with all eBGP speaking routers • Complete: all routers know all routes • Loop free: all routers know the same set of routes • Not scalable: requires e(e-1)/2 + ei iBGP sessions among e eBGP routers and i non-eBGP routers
A simple route reflector setup • RR learns routes from eBGP sessions • RR tells clients best route for each prefix over iBGP • Requires e+i BGP sessions • Clients and the reflector exchange less traffic • All loads are on one router • Not all clients get best routes if there are multiple egress routers • Figure labels: Route Reflector, Reflector client
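To get a feel for the scalability gap, the two session counts from the slides can be evaluated side by side. This is a small sketch that takes the slide formulas as given; the function names and the example sizes are illustrative.

```python
def full_mesh_sessions(e, i):
    """e(e-1)/2 sessions among the eBGP routers plus e sessions per internal router."""
    return e * (e - 1) // 2 + e * i

def single_reflector_sessions(e, i):
    """Session count quoted on the route reflector slide."""
    return e + i

for e, i in [(4, 16), (10, 90), (20, 480)]:
    print(e, i, full_mesh_sessions(e, i), single_reflector_sessions(e, i))
```

With 10 eBGP routers and 90 internal routers, the full mesh needs 945 sessions while the single-reflector count is 100.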
A problematic RR topology setup • RR1 learns best route to prefix D from eBGP • RR2 learns equally good route to prefix D from eBGP • RR1 tells clients its best route to D, next hop RR1 • RR2 tells clients its best route to D, next hop RR2 • Figure labels: RR1, RR2, two reflector clients, link labels 1, 1, 3, 3, 2
BGP • Routing policies • Protocol operations • Disseminating BGP routes within an AS • BGP challenges • Policy interactions • Multi-homing • Security
When policy goes against shortest path… • Each AS prefers two-hop route via its clockwise neighbor • Figure labels: AS0, AS1, AS2, AS3
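Why this policy is dangerous can be checked by brute force. The sketch below is a toy model under stated assumptions: AS0 is taken to be the destination, AS1 to AS3 sit in a ring around it, the two-hop route via the clockwise neighbor exists only while that neighbor uses its direct route, and any longer route ranks below the direct one. All names are illustrative.

```python
from itertools import product

# Clockwise neighbor of each AS in the ring (assumed orientation).
CLOCKWISE = {1: 2, 2: 3, 3: 1}

def best_choice(asn, state):
    """Pick "via" (two-hop route) if the clockwise neighbor goes direct."""
    return "via" if state[CLOCKWISE[asn]] == "direct" else "direct"

def step(state):
    """All ASes re-run route selection at once against the old state."""
    return {asn: best_choice(asn, state) for asn in state}

def stable_states():
    """Enumerate all 8 assignments and keep those where nobody wants to change."""
    found = []
    for choices in product(["direct", "via"], repeat=3):
        state = dict(zip(CLOCKWISE, choices))
        if step(state) == state:
            found.append(state)
    return found

print(stable_states())  # no assignment is stable

state = {1: "direct", 2: "direct", 3: "direct"}
for _ in range(4):
    state = step(state)
    print(state)
```

No assignment satisfies all three ASes at once, so under this model the routes keep flipping between direct and two-hop.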
Shortest path routing always converges • Why? • Shortest paths form a DAG (directed acyclic graph) from all nodes to a destination. • When policies override shortest path, there’s danger…
Ensuring convergence • Global policy check • Each AS submits its policy & neighbors to a global registry • Centrally check for bad policy interactions • Checking is NP-complete • Topology might change • Gao/Rexford (today’s paper) • AS graphs are hierarchical • Restrict the set of allowed policies
Gao’s observation • AS graphs are not just any graph • Provider-customer relationships form a DAG • Figure legend: peering link, provider-customer link
Gao’s rule for convergence • Do not go against DAG edges • Customer routes > provider, peer routes • If peering links do not cause cycles… • Customer, peer routes > provider routes • Figure labels: a peering link that might cause cycle, a peering link that will not cause cycle
Gao’s rule for convergence • Gao’s rule matches ISPs’ incentives • ISP incentives: customer > peer > provider • Gao’s: customer > peer, provider
BGP • Routing policies • Protocol operations • Disseminating BGP routes within an AS • BGP challenges • Policy interactions • Multi-homing • Security
BGP and multi-homing • “stub” AS uses 2 links to the same ISP • “stub” AS uses 2 links to different ISPs • Transit AS uses 2 providers & peers
Stub AS with a single ISP • Resilient to a single link failure • announce d/19 on both links • Balance load between two links • split prefix, announce sub-prefix on different links • No need for a public AS number for stub • Figure labels: Stub AS, d/19; Announce d1/20; Announce d2/20; Announce one route to d/19
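Splitting d/19 into d1/20 and d2/20 is plain prefix arithmetic. The following sketch uses Python's standard `ipaddress` module; the concrete prefix 10.0.0.0/19 is a made-up stand-in for d/19, and the function name is hypothetical.

```python
import ipaddress

def split_prefix(prefix):
    """Split a prefix into its two halves, one bit longer each."""
    low, high = ipaddress.ip_network(prefix).subnets(prefixlen_diff=1)
    return str(low), str(high)

d1, d2 = split_prefix("10.0.0.0/19")
print(d1, d2)
```

Each half would be announced on a different link for load balancing. To also keep the resilience property from the slide, the covering /19 can be announced on both links, so traffic falls back to the longer-lived aggregate if one link fails.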
Stub AS with multiple ISPs • Resilient to one ISP failure • announce prefix over both links in primary/backup setup • Balance load between two ISPs • split prefix and announce sub-prefix on each link • Need a public AS number • Figure: Stub AS 12, d/19, announces d/19 with ASPATH 12 to AS 5 and with ASPATH 12 12 12 to AS 6; AS 5 announces d/19 with 5 12; AS 6 announces d/19 with 6 12 12 12
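The primary/backup setup in the figure relies on AS path prepending: the stub repeats its own AS number on the backup link so that remote ASes see a longer path there. A minimal sketch with hypothetical helper names, assuming a remote AS breaks the tie on AS path length alone:

```python
def announce(own_asn, copies=1):
    """AS path as originated by the stub; copies > 1 means prepending."""
    return [own_asn] * copies

def forward(asn, path):
    """An AS adds its own number in front before passing the route on."""
    return [asn] + path

def shortest(paths):
    """Pick the path with the fewest AS hops."""
    return min(paths, key=len)

via_as5 = forward(5, announce(12))            # primary link
via_as6 = forward(6, announce(12, copies=3))  # backup link, prepended twice
print(via_as5, via_as6, shortest([via_as5, via_as6]))
```

Prepending only influences ASes that reach the AS path length step; an AS whose local preference favors the route via AS 6 will still use it, which is one reason inbound traffic is hard to control predictably.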
Service providers multi-home • Load balance transit traffic on many prefixes (inter-domain traffic engineering) • Control both outbound and inbound traffic • Redundancy • Primary/backup etc. • Challenge: scalability and predictability