580 likes | 690 Views
Network Layer: Real World. UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred , Kurose&Ross (some edited), Caesar&many others (also edited). host, router network layer functions:. IP protocol addressing conventions datagram format packet handling conventions.
E N D
Network Layer: Real World UIUC CS438: Communication Networks Summer 2014 Fred Douglas Slides: Fred, Kurose&Ross(some edited), Caesar&many others (also edited)
host, router network layer functions: • IP protocol • addressing conventions • datagram format • packet handling conventions forwarding table The Internet network layer Out-of-band “control” traffic transport layer: TCP, UDP Names used in routing • ICMP protocol • error reporting • router signaling Control plane • routing protocols • path selection • RIP, OSPF, BGP Data plane Writes into Goes through IP datagram IP datagram link layer physical layer
Routing: Inter- and intra-domain • Intra-domain • Get the packet to the right computer • Paths are e.g. “router 1, router 2, router 3, host x” • Run on a network owned by one entity (e.g. UIUC) • Inter-domain • Get the packet to the right network • (To then be taken care of by intra-domain routing) • Paths are e.g. “ISP 1, ISP 2, ISP 3, ISP 4” • Runs among entities (UIUC, Comcast, Deutch Telekom, governments, … ) • This is what glues the internet together
The “Domains”: Autonomous Systems • AS is a network under a single administrative control • currently over 30,000 ASes • Think AT&T, France Telecom, UIUC, IBM, etc. • ASes are sometimes called “domains”. • Hence, “interdomain routing” • Each AS is assigned a unique identifier • 16 bit AS Number (ASN) • We will come back to this in inter-domain routing
Intra-Domain Routing • This is basically what we covered last time. • OSPF and IS-IS, the most common intra-domain protocols, are essentially the Link State protocol, including the “areas” optimization. • What about addresses outside our network? • A “border router” speaks both inter- and intra-domain routing, and tells the routers inside that it takes care of outside addresses
00001100 00100010 10011110 00000101 IP Addresses (IPv4) • Unique 32-bit number associated with a host • Represented with the dotted-quad notation, e.g., 12.34.158.5: 12 34 158 5
01010000 01000100 00010011 01110011 11110000 10110111 00110011 00000111 Examples 80.19.240.51 • What address is this? • How would you represent 68.115.183.7?
History of Internet Addressing • Always dotted-quad notation (for IPv4) • Always network/host address split • But nature of that split has changed over time • Originally ONLY one of: (see next slide) • 1.2.3.4 • 1.2.3.* (Class C) • 1.2.*.* (Class B) • 1.*.*.* (Class A) • Now: variable divisions: “Classless interdomain routing” (slide after)
0 0 network host “Classful” Addressing 8 126 nets ~16M hosts Class A (1.*.*.*) 0 16 network 1 0 host ~16K nets ~65K hosts Class B (1.2.*.*) 0 24 network host 1 1 0 ~2M nets 254 hosts Class C (1.2.3.*) Problem: Networks only come in three sizes!
00001100 00100010 10011110 00000101 Classless Addressing • 32 bits are partitioned into a prefix and suffix components • Prefix is the network component; suffix is host component • Network prefix is used for inter-domain routing • Terminology: 12.34.158.0/23 represents a “slash 23” network; it has a 23 bit prefix and 29 host addresses • Its “net mask” is 255.255.254.0: 1 bit if the bit is part of the prefix 12 34 158 5 Network (23 bits) Host (9 bits)
CIDR (example) • Suppose a network has fifty computers • allocate 6 bits for host addresses (since 25 < 50 < 26) • remaining 32 - 6 = 26 bits as network prefix • E.g., 128.23.9/26 is a “slash 26” network • Flexible boundary between network and host bits means the boundary must be explicitly specified with the network address • informally, “slash 26” 128.23.9/26 • formally, represent length of prefix with a 32-bit mask: 256.256.256.192where all network prefix bits set to “1” and host suffix bits to “0”
Classful vs. Classless addresses • Example: an organization needs 500 addresses. • A single class C address not enough (254 hosts). • Instead a class B address is allocated. (~65K hosts) • That’s overkill, a huge waste! • CIDR allows an arbitrary prefix-suffix boundary • Hence, organization allocated a single /23 address (equivalent of 2 class C’s) • Maximum waste: 50%
Allocation Done Hierarchically • Much like DNS! (ICANN does DNS, too.) • Internet Corporation for Assigned Names and Numbers (ICANN) gives large blocks to… • Regional Internet Registries (e.g., ARIN), which give blocks to • ARIN American Registry for Internet Numbers • Large institutions (ISPs), which give addresses to… • Individuals and smaller institutions • FAKE Example: ICANN ARIN AT&T UIUC CS
CIDR: Addresses allocated in contiguous prefix chunks Recursively break down chunks as get closer to host : : : 12.0.0.0/15 12.3.0.0/22 12.2.0.0/16 12.3.4.0/24 : : 12.3.0.0/16 12.3.254.0/23 : : 12.0.0.0/8 12.253.0.0/19 12.253.32.0/19 12.253.64.0/19 12.253.0.0/16 12.253.64.108/30 : 12.253.96.0/18 12.253.128.0/17
FAKE Example in More Detail • ICANN gives ARIN several /8s • ARIN gives AT&T one /8, 128.0.0.0/8 • Network Prefix: 10000000 • AT&T gives UIUC a /16, 128.174.0.0/16 • Network Prefix: 1000000010101110 • UCB gives CS a /24, 128.174.64.0/24 • Network Prefix: 1000000010101110 01000000 • EECS gives me a specific address 128.174.64.16 • Address: 1000000010101110 0100000000010000
IP addressing scalable routing? • Hierarchical address allocation helps routing scalability if allocation matches topological hierarchy
IP addressing scalable routing? a.b.*.* is this way a.c.*.* is this way France Telecom AT&Ta.0.0.0/8 LBLa.c.0.0/16 UCBa.b.0.0/16
IP addressing scalable routing? Can add new hosts/networks without updating the routing entries at France Telecom a.*.*.* is this way France Telecom AT&Ta.0.0.0/8 foo.coma.d.0.0/16 LBLa.b.0.0/16 UCBa.c.0.0/16
IP addressing scalable routing? ESNet must maintain routing entries for both a.*.*.* and a.c.*.* ESNet AT&Ta.0.0.0/8 LBLa.b.0.0/16 UCBa.c.0.0/16
IP addressing scalable routing? • Hierarchical address allocation helps routing scalability if allocation matches topological hierarchy • Problem: may not be able to aggregate addresses for “multi-homed” networks • Two competing forces in scalable routing • aggregation reduces number of routing entries • multi-homing increases number of entries
Dot-com implosion; Internet bubble bursts Advent of CIDR allows aggregation: linear growth Initial growth super-linear; no aggregation Back in business Internet boom: multihoming drives superlinear growth Growth in Routed Prefixes (1989-2005)
Same Table, Extended to Present Linear growth Superlinear growth What Happened Here? Stock Market Crash of 2008
Summary of Addressing • Hierarchicaladdressing • Critical forscalablesystem • Don’t require everyone to know everyone else • Reduces amount of updating when something changes • Non-uniform hierarchy • Useful for heterogeneous networks of different sizes • Class-based addressing was far too coarse • Classless InterDomain Routing (CIDR) more flexible
NAT: network address translation rest of Internet local network (e.g., home network) 10.0.0/24 10.0.0.1 10.0.0.4 10.0.0.2 138.76.29.7 10.0.0.3 datagrams with source or destination in this network have 10.0.0/24 address for source, destination (as usual) alldatagrams leaving local network have same single source NAT IP address: 138.76.29.7,different source port numbers
NAT: network address translation implementation: NAT router must: • First outgoing packet: (private IP addrsrc, src port #) (public IP addrsrc, new src port #) • remember (in NAT translation table) that mapping for future packets • incoming datagrams:look up in table, and (public IP addrdest, dest port #) (private IP addrdest, dest old port #)
3 1 2 4 S: 10.0.0.1, 3345 D: 128.119.40.186, 80 S: 138.76.29.7, 5001 D: 128.119.40.186, 80 1:host 10.0.0.1 sends datagram to 128.119.40.186, 80 2:NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table S: 128.119.40.186, 80 D: 10.0.0.1, 3345 S: 128.119.40.186, 80 D: 138.76.29.7, 5001 NAT: network address translation NAT translation table WAN side addr LAN side addr 138.76.29.7, 5001 10.0.0.1, 3345 …… …… 10.0.0.1 10.0.0.4 10.0.0.2 138.76.29.7 10.0.0.3 4:NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345 3:reply arrives dest. address: 138.76.29.7, 5001
NAT: network address translation • 16-bit port-number field: • 60,000 simultaneous connections with a single LAN-side address! • NAT is “controversial”: • routers should only process up to layer 3 • violates end-to-end argument • NAT possibility must be taken into account by app designers, e.g., P2P applications • address shortage should instead be solved by IPv6
Recall from Lecture 3 “Autonomous System (AS)” or “Domain”Region of a network under a single administrative entity “Border Routers” An “end-to-end” route “Interior Routers”
Administrative structure shapes Interdomainrouting • ASes want freedom to pick routes based on policy • “My traffic can’t be carried over my competitor’s network” • “I don’t want to carry A’s traffic through my network” • Not expressible as Internet-wide “shortest path”! • ASes want autonomy • Want to choose their own internal routing protocol • Want to choose their own policy • ASes want privacy • choice of network topology, routing policies, etc.
Choice of Routing Algorithm Link State (LS) vs. Distance Vector (DV)? • LS offers no privacy -- global sharing of all network information (neighbors, policies) • LS limits autonomy -- need agreement on metric, algorithm • DV is a decent starting point • per-destination advertisement gives providers a hook for finer-grained control over whether/which routes to advertise • but DV wasn’t designed to implement policy • and is vulnerable to loops if shortest paths not taken The “Border Gateway Protocol” (BGP) extends distance-vector ideas to accommodate policy
Topology and policy is shaped by the business relationships between ASes • Three basic kinds of relationships between ASes • AS A can be AS B’s customer • AS A can be AS B’s provider • AS A can be AS B’s peer • Business implications • Customer pays provider • Peers don’t pay each other • Exchange roughly equal traffic
Business Relationships Relations between ASes Business Implications • Customers pay provider • Peers don’t pay each other customer provider peer peer
Why peer? A E.g., D and E talk a lot B C Peering saves B and C money D E Relations between ASes Business Implications • Customers pay provider • Peers don’t pay each other customer provider peer peer
Routing Follows the Money! Provider Customer Q Peer Peer • ASesprovide “transit” between their customers • Peers do not provide transit between other peers A B C D E F traffic not allowed traffic allowed
Routing Follows the Money! Provider Customer Q Peer Peer • An AS only carries traffic to/from its own customers over a peering link A B C D E F
Routing Follows the Money! Provider Customer Peer Peer • Routes are “valley free” C A F
In Short • AS topology reflects business relationships between Ases • Business relationships between ASes impact which routes are acceptable • BGP Policy: Protocol design that allows ASes to control which routes are used
Interdomain Routing: Setup • Uses path vector routing: Border Gateway Protocol (BGP) • Nodes are Autonomous Systems (ASes) • Internals of each AS are hidden • Links represent both physical links and business relationships • Destinations are IP prefixes (12.0.0.0/8)
2 3 1 Differences between BGP and PV (1) not picking shortest path routes • BGP selects the best route based on policy, not shortest distance (least cost) • Reminder: how do we avoid loops, again? • Look for your own name in the paths you receive Node 2 may prefer“2, 3, 1” over “2, 1”
Differences between BGP and DV (2) BGP may aggregate routes • For scalability, BGP may aggregate routes for different prefixes a.*.*.* is this way AT&Ta.0.0.0/8 foo.coma.d.0.0/16 LBLa.b.0.0/16 UCBa.c.0.0/16
Differences between BGP and DV (3) Selective route advertisement • For policy reasons, an AS may choose not to advertise a route to a destination • Hence, reachability is not guaranteed even if graph is connected AS 1 AS 3 Example: AS#2 does not want to carry traffic between AS#1 and AS#3 AS 2
advertisements traffic Typical Export Policy: Peer-Peer Case • Peers exchange traffic between their customers • AS exports only customer routes to a peer • AS exports a peer’s routes only to its customers providers peer peer d customers
advertisements traffic Typical Export: Customer-Provider • Customer pays provider for access to Internet • Provider exports its customer routes to everybody • Customer exports provider routes only to its customers Traffic to customer Traffic from customer d provider provider customer d customer
Typical Route Selection Policy • In decreasing order of priority • make/save money (send to customer > peer > provider) • maximize performance (smallest AS path length) • minimize use of my network bandwidth (“hot potato”) • …
Propagating BGP Info within the AS • BGP (Border Gateway Protocol):the de facto inter-domain routing protocol • “glue that holds the Internet together” • BGP provides each AS a means to: • eBGP: obtain subnet reachability information from neighboring ASs. • iBGP: propagate reachability information to all AS-internal routers. • determine “good” routes to other networks based on reachability information and policy. • allows subnet to advertise its existence to rest of Internet: “I am here”
2c 2b 1b 1d 3c 1c BGP message 3a 3b 2a 1a AS1 BGP basics • BGP session:two BGP routers (“peers”) exchange BGP messages: • advertising pathsto different destination network prefixes (“path vector” protocol) • exchanged over semi-permanent TCP connections • when AS3 advertises a prefix to AS1: • AS3 promises it will forward datagrams towards that prefix • AS3 can aggregate prefixes in its advertisement AS3 other networks other networks AS2
2c 2b 1b 1d 1c 3a 3b 2a 1a BGP basics: distributing path information • using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1. • 1c can then use iBGP do distribute new prefix info to all routers in AS1 • 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session • when router learns of new prefix, it creates entry for prefix in its forwarding table. eBGP session iBGP session AS3 other networks other networks AS2 AS1
Putting it Altogether:How Does an Entry Get Into a Router’s Forwarding Table? • Answer is complicated! • Ties together hierarchical routing (Section 4.5.3) with BGP (4.6.3) and OSPF (4.6.2). • Provides nice overview of BGP!