A Hierarchical IPv4 Framework Patrick Frejborg pfrejborg@gmail.com 24 Feb 2009
Why hIPv4? • Addressing RFC 4984: It is commonly recognized that today’s Internet routing and addressing system is facing serious scaling problems. The ever increasing user population, as well as multiple other factors including multi-homing, traffic engineering, and policy routing, have been driving the growth of the Default Free Zone (DFZ) routing table size at an increasing and potentially alarming rate. While it has been long recognized that the existing routing architecture may have serious scalability problems, effective solutions have yet to be identified, developed, and deployed.
Influence sources • The Locator ID Separation Protocol development work at IRTF • MPLS solutions, mainly the shim header that made it possible to create new services on top of an IP backbone • Anycast Rendezvous Point (RP) with Multicast Source Discovery Protocol (MSDP) • IPv6 installations at Enterprises • Why would enterprises migrate to IPv6 – what will they gain? • Bigger migration project than Y2K – for what reason? • Applications have to be ported to IPv6, a lot of work to be done – who will sponsor it? • Shortage of IPv4 addresses is not an enterprise problem – they will use NAT instead! • PSTN architecture • Haven’t seen or heard that the PSTN will soon run out of decimal numbers and that we have to migrate to hexadecimal keypads, have you? • Not aware of scalability issues with SS7 either – hidden prefixes are used between PSTN switches to solve routing issues
So, what if… • What if we borrow concepts from existing solutions and glue them together? • Basic ideas and goals in LISP are definitely interesting, especially the Routing Locators (RLOC) and Endpoint ID (EID) concept • MPLS forwarding and shim header concept • Anycast RP • Numbering architecture from the PSTN, i.e. country and national destination code concepts are ported to the IPv4 world – an “Internet country” is an Autonomous System or an area of a service provider! • The trade-offs are • New hardware is needed at some spots in the Internet • Minor software upgrade for Internet routers • Extensions are needed for DNS and DHCP • Extension to the current IPv4 stack at hosts, but most applications continue to use the IPv4 socket API (stream and datagram sockets) • Raw socket applications need to be enhanced
Some basic rules (1) • Allocate a globally unique IPv4 block for RLOC allocations; hereafter called the Global RLOC Block (GRB) • Assign one RLOC for each Autonomous System (AS) or service provider, this AS or service provider area is called a RLOC realm • Only GRB prefixes are exchanged between RLOC realms • A multihomed enterprise with an AS number will have a RLOC assigned and thus is a RLOC realm • Regional Internet Registries will allocate Provider Independent IP addresses for enterprises – both single and multihomed. This assignment is unique in the country/countries where the IP block is deployed • Residential/consumer customers will use Provider Aggregatable IP addresses
Some basic rules (2) • Introduce extensions to current protocols • DNS; add RLOC record for each host • DHCP; add RLOC option for a scope • Current IGP and BGP are still valid routing protocols • Define a “shim” header that contains RLOC and EID information. The new shim header is called a LISP header • When the LISP header is inserted into an IPv4 datagram, the new header combination is called an hIPv4 header • Introduce new functionalities; routing is still done upon the IPv4 forwarding plane • LISP Switch Router (LSR); in a certain situation the LSR shall swap the IPv4 and LISP headers • The RLOC identifier is configured as an Anycast address on one or several LSRs within a RLOC realm • Intermediate routers need to support hIPv4 in the control plane in order to reply to ICMP requests
Outcome, when hIPv4 is fully implemented • Gaining several “recyclable” IPv4 address blocks • PI block allocations are unique within the country or countries of deployment • PA addresses are only locally significant within the RLOC realm • Creating hierarchy at the control plane • Only GRB prefixes are announced between RLOC realms • Multihomed enterprises will only advertise their assigned RLOC to the service providers • Single homed PI addresses are installed in the RIB of the local RLOC realm • PA addresses are installed in the RIB of the local RLOC realm • Current size of the Default Free Zone (DFZ) RIB is decreased • No or minor changes to the current DFZ topology • No new signaling protocols or overlay topology are introduced – instead AS destination based routing with IPv4 as the forwarding plane!
Client -> Server (diagram) • DNS lookup www.foo.com? returns A-record: 10.2.2.2, RLOC: 172.16.0.5 • Client IPv4 API: S:10.1.1.1 D:10.2.2.2 • Client towards remote LSR, IPv4 header: S:10.1.1.1 D:172.16.0.5, LISP header: R:172.16.0.3 E:10.2.2.2 • SWAP at the remote LSR • Remote LSR towards server, IPv4 header: S:172.16.0.3 D:10.2.2.2, LISP header: R:172.16.0.5 E:10.1.1.1 • Server IPv4 API: S:10.1.1.1 D:10.2.2.2
Server -> Client (diagram) • Server IPv4 API: S:10.2.2.2 D:10.1.1.1 • Server towards remote LSR, IPv4 header: S:10.2.2.2 D:172.16.0.3, LISP header: R:172.16.0.5 E:10.1.1.1 • SWAP at the remote LSR • Remote LSR towards client, IPv4 header: S:172.16.0.5 D:10.1.1.1, LISP header: R:172.16.0.3 E:10.2.2.2 • Client IPv4 API: S:10.2.2.2 D:10.1.1.1
The hIPv4 header • Version 4 is still valid, but new protocol IDs are needed for the current IPv4 protocols (ICMP, IGMP, TCP, UDP, IP in IP, GRE, ESP, AH, etc.) so that the stack can identify whether an IPv4 or an hIPv4 header is applied • Forwarding network devices will calculate the IPv4 header checksum at each hop • Hosts shall calculate the TCP and UDP pseudoheader checksum including the RLOC and EID values • Since the remote LSR will swap the IPv4 and LISP headers, the TCP checksum will be bogus, unless…
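A minimal sketch of the pseudoheader idea above, in Python. The slide does not define the pseudoheader layout, so `hipv4_pseudo_header` is a hypothetical layout (the classic IPv4 pseudoheader followed by the RLOC and EID fields); only the checksum arithmetic itself is the standard Internet checksum.

```python
import ipaddress
import struct


def internet_checksum(data: bytes) -> int:
    """Standard 16-bit one's-complement Internet checksum."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def hipv4_pseudo_header(src: str, dst: str, rloc: str, eid: str,
                        protocol: int, length: int) -> bytes:
    """ASSUMED layout: src, dst, zero, protocol, length, then RLOC and EID."""
    packed = [ipaddress.IPv4Address(a).packed for a in (src, dst, rloc, eid)]
    return (packed[0] + packed[1]
            + struct.pack("!BBH", 0, protocol, length)
            + packed[2] + packed[3])


def transport_checksum(pseudo_header: bytes, segment: bytes) -> int:
    """Checksum over pseudoheader plus segment (checksum field zeroed)."""
    return internet_checksum(pseudo_header + segment)
```

Under this assumed layout, swapping the IPv4 addresses with the RLOC and EID values only reorders the same 16-bit words, so `transport_checksum` returns the same value before and after a swap. For example, the tuples (S:10.1.1.1, D:172.16.0.5, R:172.16.0.3, E:10.2.2.2) and (S:172.16.0.3, D:10.2.2.2, R:172.16.0.5, E:10.1.1.1) give equal checksums for the same segment. The protocol number passed in is a placeholder, since the new protocol IDs are not specified here.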
LSR functionality • The assigned RLOC shall be configured as an Anycast address and announced to the Internet • When the destination address in the IPv4 header of the hIPv4 packet is equal to the RLOC at the remote LSR, then • verify the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation • replace the source address in the IPv4 header with the RLOC address of the LISP header • replace the destination address in the IPv4 header with the EID address of the LISP header • replace the RLOC address in the LISP header with the destination address of the IPv4 header • replace the EID address in the LISP header with the source address of the IPv4 header • decrement the TTL by one • calculate the IP and TCP/UDP checksums, including the RLOC and EID values in the pseudoheader calculation • forward the datagram upon the destination address of the IPv4 header
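The swap steps above can be sketched as follows. This is an illustration only: `HIPv4Packet` and `lsr_process` are hypothetical names, the "replace" steps are read as operating on the original field values (a simultaneous swap), and checksum verification and recalculation are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class HIPv4Packet:
    src: str   # IPv4 header source address
    dst: str   # IPv4 header destination address
    rloc: str  # RLOC field of the LISP (shim) header
    eid: str   # EID field of the LISP (shim) header
    ttl: int


def lsr_process(pkt: HIPv4Packet, local_rloc: str) -> Optional[HIPv4Packet]:
    """Swap the IPv4 and LISP header fields when the packet targets this LSR."""
    if pkt.dst != local_rloc:
        return pkt  # not addressed to this RLOC; ordinary forwarding is out of scope
    if pkt.ttl <= 1:
        return None  # TTL expired; a real device would drop and signal via ICMP
    return HIPv4Packet(
        src=pkt.rloc,   # source <- RLOC of the LISP header
        dst=pkt.eid,    # destination <- EID of the LISP header
        rloc=pkt.dst,   # RLOC <- old destination (this LSR's RLOC)
        eid=pkt.src,    # EID <- old source
        ttl=pkt.ttl - 1,
    )
```

Feeding in the Client -> Server example values (S:10.1.1.1, D:172.16.0.5, R:172.16.0.3, E:10.2.2.2) with `local_rloc="172.16.0.5"` yields S:172.16.0.3, D:10.2.2.2, R:172.16.0.5, E:10.1.1.1, with the TTL reduced by one.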
The hIPv4 stack functionalities • The IPv4 socket API still uses the tuples • RLOC identifiers are provided by DHCP and DNS schemas • The hIPv4 stack must assemble the outgoing datagram with • local IP address -> src IP address • remote IP address -> EID • local RLOC -> RLOC • remote RLOC -> dst IP address • The hIPv4 stack must present the headers of the incoming datagram to the IPv4 socket API as • src IP address -> remote RLOC • dst IP address -> local IP address • RLOC -> local RLOC • EID -> remote IP address
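A small sketch of the two mappings above, assuming the incoming datagram has already been swapped by the remote LSR. The type and function names (`WireHeaders`, `Endpoints`, `assemble_outgoing`, `present_incoming`) are hypothetical and not part of any existing stack.

```python
from typing import NamedTuple


class WireHeaders(NamedTuple):
    src: str   # IPv4 header source
    dst: str   # IPv4 header destination
    rloc: str  # LISP header RLOC
    eid: str   # LISP header EID


class Endpoints(NamedTuple):
    local_ip: str
    remote_ip: str
    local_rloc: str
    remote_rloc: str


def assemble_outgoing(ep: Endpoints) -> WireHeaders:
    """Map what the socket API knows onto the outgoing hIPv4 headers."""
    return WireHeaders(src=ep.local_ip, dst=ep.remote_rloc,
                       rloc=ep.local_rloc, eid=ep.remote_ip)


def present_incoming(hdr: WireHeaders) -> Endpoints:
    """Map incoming hIPv4 headers back to the view used by the socket API."""
    return Endpoints(local_ip=hdr.dst, remote_ip=hdr.eid,
                     local_rloc=hdr.rloc, remote_rloc=hdr.src)
```

With the Client -> Server example, the client stack turns local 10.1.1.1, remote 10.2.2.2, local RLOC 172.16.0.3 and remote RLOC 172.16.0.5 into S:10.1.1.1 D:172.16.0.5 R:172.16.0.3 E:10.2.2.2. The server stack sees S:172.16.0.3 D:10.2.2.2 R:172.16.0.5 E:10.1.1.1 on the wire and presents remote IP 10.1.1.1 to the application.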
Src IP = Dst IP considerations • Since source and destination addresses are only locally significant within a RLOC realm, there is a slight chance that the source and destination addresses at the API will be the same when connections are established between RLOC realms. • The connection is still unique, since two processes communicating over TCP form a logical connection that is uniquely identifiable by the tuple involved, that is, by the combination of <local_IP_address, local_port, remote_IP_address, remote_port>
Src IP = Dst IP considerations (diagram) • DNS lookup www.foo.com? returns A-record: 10.2.2.2, RLOC: 172.16.0.5 • Client IPv4 API: S:10.2.2.2 D:10.2.2.2 • Client towards remote LSR, IPv4 header: S:10.2.2.2 D:172.16.0.5, LISP header: R:172.16.0.4 E:10.2.2.2 • SWAP at the remote LSR • Remote LSR towards server, IPv4 header: S:172.16.0.4 D:10.2.2.2, LISP header: R:172.16.0.5 E:10.2.2.2 • Server IPv4 API: S:10.2.2.2 D:10.2.2.2
“Identical connection situation” • Since source and destination addresses are only locally significant within a RLOC realm, there is a slight chance that the source address, destination address and source port at the API will be the same when two clients residing in separate RLOC realms contact a server in a third RLOC realm. • A connection is normally unique, since two processes communicating over TCP form a logical connection that is uniquely identifiable by the tuple involved, that is, by the combination of <local_IP_address, local_port, remote_IP_address, remote_port> • But if the source port from both clients has the same value, the connection is no longer unique! • The solution is that the hIPv4 stack must accept only one unique connection based on the RLOC information; the “identical connection” is not allowed and the client is informed by an ICMP notification
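One way to picture the rule above is a connection table keyed by the usual four-tuple that also remembers the remote RLOC. This is a sketch under that assumption; `ConnectionTable` and `accept` are hypothetical names, and the ICMP notification is represented only by the `False` return value.

```python
from typing import Dict, Tuple

# (local_ip, local_port, remote_ip, remote_port)
FourTuple = Tuple[str, int, str, int]


class ConnectionTable:
    """Tracks which remote RLOC owns each four-tuple seen by the socket API."""

    def __init__(self) -> None:
        self._owner: Dict[FourTuple, str] = {}

    def accept(self, conn: FourTuple, remote_rloc: str) -> bool:
        """Return True if accepted, False for an 'identical connection'."""
        owner = self._owner.get(conn)
        if owner is None:
            self._owner[conn] = remote_rloc  # first use of this four-tuple
            return True
        # Same four-tuple from another RLOC realm must be refused;
        # the stack would then send the ICMP notification.
        return owner == remote_rloc

    def close(self, conn: FourTuple) -> None:
        self._owner.pop(conn, None)
```

For instance, a client at 10.1.1.1 port 40000 behind RLOC 172.16.0.3 is accepted by the server at 10.2.2.2 port 80, while a second client with the same address and port behind RLOC 172.16.0.4 is refused until the first connection is closed or it picks another source port. The port numbers here are illustrative.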
“Identical connection situation” (diagram) • Both clients look up www.foo.com? and receive A-record: 10.2.2.2, RLOC: 172.16.0.5 • Client 1 IPv4 API: S:10.1.1.1 D:10.2.2.2 • Client 1 towards remote LSR, IPv4 header: S:10.1.1.1 D:172.16.0.5, LISP header: R:172.16.0.3 E:10.2.2.2 • After SWAP, IPv4 header: S:172.16.0.3 D:10.2.2.2, LISP header: R:172.16.0.5 E:10.1.1.1 • Client 2 IPv4 API: S:10.1.1.1 D:10.2.2.2 • Client 2 towards remote LSR, IPv4 header: S:10.1.1.1 D:172.16.0.5, LISP header: R:172.16.0.4 E:10.2.2.2 • After SWAP, IPv4 header: S:172.16.0.4 D:10.2.2.2, LISP header: R:172.16.0.5 E:10.1.1.1 • Server IPv4 API, for both connections: S:10.1.1.1 D:10.2.2.2
Traceroute considerations • The routers and devices in the path to the remote RLOC realm need to support ICMP extensions • ICMP services are deployed in the control plane; the forwarding plane remains intact • That is, a software upgrade is needed for the control plane • The hIPv4 ICMP extensions shall be compatible with RFC 4884
Traceroute, 1 (intra-AS) traceroute www.foo.com A-record: 10.2.2.2, RLOC: 172.16.0.5 S:10.2.2.2 D:10.1.1.1 ICMP extensions S:10.1.1.1 D:172.16.0.5 R:172.16.0.3 E:10.2.2.2 S:172.16.0.3 D:10.1.1.1 R:172.16.0.5 E:OIF ICMP extensions IPv4 API IPv4 header LISP header
Traceroute, 2 (inter-AS) traceroute www.foo.com A-record: 10.2.2.2, RLOC: 172.16.0.5 S:10.2.2.2 D:10.1.1.1 ICMP extensions S:10.1.1.1 D:172.16.0.5 R:172.16.0.3 E:10.2.2.2 S:10.1.1.1 D:172.16.0.5 S:OIF D:172.16.0.3 R:172.16.0.3 E:10.2.2.2 R:172.16.0.1 E:10.1.1.1 ICMP extensions S:172.16.0.1 D:10.1.1.1 R:172.16.0.3 E:OIF ICMP extensions SWAP IPv4 API IPv4 header LISP header
Traceroute, 3 (target-AS) traceroute www.foo.com A-record: 10.2.2.2, RLOC: 172.16.0.5 S:10.2.2.2 D:10.1.1.1 ICMP extensions S:10.1.1.1 D:172.16.0.5 SWAP R:172.16.0.3 E:10.2.2.2 S:172.16.0.3 D:10.2.2.2 R:172.16.0.5 E:10.1.1.1 S:10.1.1.1 D:172.16.0.5 S:OIF D:172.16.0.3 R:172.16.0.3 E:10.2.2.2 R:172.16.0.5 E:10.1.1.1 S:OIF D:172.16.0.3 ICMP extensions R:172.16.0.5 E:10.1.1.1 S:172.16.0.5 D:10.1.1.1 ICMP extensions R:172.16.0.3 E:OIF ICMP extensions SWAP IPv4 API IPv4 header LISP header
Multicast considerations • The source address (S) for a group (G) is no longer visible outside the local RLOC realm (only GRB prefixes are seen), therefore Reverse Path Forwarding (RPF) is only valid within the local RLOC realm • In order to enable RPF globally for an (S,G), the multicast enabled LSR (mLSR) in the source RLOC realm must replace the source address with the local RLOC identifier • The LSR in the source RLOC realm shall act as an Anycast RP with MSDP capabilities • The mLSR will decide which multicast groups are announced to other ASes • The receiver will locate the source via MSDP; the shared tree can then be established to the mLSR • The Source Specific Multicast schema will need an extension; RLOC and EID options shall be added to SSM
Multicast forwarding S:10.1.1.1 G:225.5.5.5 S:10.1.1.1 D:225.5.5.5 S:10.1.1.1 D:225.5.5.5 S:172.16.0.3 D:225.5.5.5 S:172.16.0.3 D:225.5.5.5 R:172.16.0.3 E:10.1.1.1 S:172.16.0.3 D:225.5.5.5 S:10.1.1.1 D:225.5.5.5 R:172.16.0.3 E:10.1.1.1 R:172.16.0.3 E:10.1.1.1 R:172.16.0.3 E:10.1.1.1 SWAP IPv4 API IPv4 header LISP header
RTCP receiver reports S:10.1.1.1 D:225.5.5.5 S:10.1.1.1 G:225.5.5.5 S:10.2.2.2 D:172.16.0.3 S:10.2.2.2 D:172.16.0.3 R:172.16.0.5 E:10.1.1.1 S:172.16.0.5 D:10.1.1.1 R:172.16.0.5 E:10.1.1.1 S:172.16.0.5 D:10.1.1.1 R:172.16.0.3 E:10.2.2.2 R:172.16.0.3 E:10.2.2.2 SWAP IPv4 API IPv4 header LISP header
Traffic Engineering considerations • Load balancing is influenced by the placement of LSRs within a RLOC realm; LSR provides “nearest routing” schema • A service provider can have several RLOC assigned; traffic engineering and filtering can be done upon RLOC addresses • If needed an RLOC identifier based Traffic Engineering solution can perhaps be developed. Establish explicit routing paths upon RLOC information, that is create explicit paths that can be engineered via specific RLOC realms.
Path MTU Discovery considerations • Since the hIPv4 header is assembled at the host, the hIPv4 packet will use current PMTUD mechanisms • The network will not see any difference between the sizes of an IPv4 and an hIPv4 datagram
SIP considerations • SIP uses the local IP address of the host in the messages • In SDP for the target of the media • In the Contact of a REGISTER as the target for incoming INVITE • In the Via of a request as the target for a response • Since SIP carries the IP addresses of hosts, it has caused a lot of problems in NAT environments – hIPv4 can mitigate the pain since it will reduce the need for NAT • SIP needs to be extended to support the hIPv4 framework, i.e. carry RLOC information in the SIP messages • A new SDP attribute is needed to provide the RLOC information to the remote UA • Add a RLOC Extension Header Field for SIP
SIP considerations, INVITE S:172.16.0.4 D:10.2.2.2 sip.foo.com? SWAP R:172.16.0.5 E:10.3.3.3 A-record: 10.3.3.3, RLOC: 172.16.0.4 S:10.1.1.1 D:172.16.0.4 INVITE: bob@10.2.2.2 SDP a=10.1.1.1 SDP m=45668 RTP SDP l=172.16.0.3 R:172.16.0.3 E:10.3.3.3 INVITE: bob@foo.com SDP a=10.1.1.1 SDP m=45668 RTP SDP l=172.16.0.3 SWAP S:172.16.0.3 D:10.3.3.3 S:10.3.3.3 D:172.16.0.5 R:172.16.0.4 E:10.1.1.1 R:172.16.0.4 E:10.2.2.2 INVITE: bob@foo.com SDP a=10.1.1.1 SDP m=45668 RTP SDP l=172.16.0.3 INVITE: bob@10.2.2.2 SDP a=10.1.1.1 SDP m=45668 RTP SDP l=172.16.0.3 bob@10.2.2.2;R=172.16.0.5
SIP considerations, 200 OK SWAP S:172.16.0.4 D:10.1.1.1 R:172.16.0.3 E:10.3.3.3 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 S:10.2.2.2 D:172.16.0.4 R:172.16.0.5 E:10.3.3.3 S:10.3.3.3 D:172.16.0.3 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 R:172.16.0.4 E:10.1.1.1 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 S:10.2.2.2 D:172.16.0.4 S:172.16.0.5 D:10.3.3.3 R:172.16.0.5 E:10.3.3.3 R:172.16.0.4 E:10.2.2.2 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 SWAP
SIP considerations, RTP INVITE: bob@10.2.2.2 SDP a=10.1.1.1 SDP m=45668 RTP SDP l=172.16.0.3 200 OK SDP a=10.2.2.2 SDP m=35678 RTP SDP l=172.16.0.5 S:10.2.2.2 D:172.16.0.3 S:10.1.1.1 D:172.16.0.5 R:172.16.0.5 E:10.1.1.1 RTP R:172.16.0.3 E:10.2.2.2 RTP
Mobility considerations • Site mobility: a site wishes to change its attachment point to the Internet without changing its IP address block. The change of attachment point is possible when PI addresses are allocated to the site. Only the local RLOC identifier needs to be changed. • Host mobility: Alex C. Snoeren’s and Hari Balakrishnan’s “An End-to-End Approach to Host Mobility” is interesting. Since the IPv4 stack needs to be enhanced anyway, studies should be carried out to see if the “TCP connection method” can be implemented in the hIPv4 stack. • Another interesting host mobility solution is the “Reliable Network Connections” paper by Victor C. Zandy and Barton P. Miller. Studies should be carried out to see whether rocks and racks can be integrated into the hIPv4 stack. http://pages.cs.wisc.edu/~zandy/rocks/
Transition considerations • Upgrades of host stacks, DNS & DHCP databases, security devices and network devices can be carried out in parallel without change of topology or major network breaks • LSRs can be added to an AS or a service provider area when commercially available in order to create a RLOC realm • When the hIPv4 framework is ready at a RLOC realm, the RLOC record can be added for those hosts in the DNS, one by one. • Legacy IPv4 clients will still use the legacy IPv4 schema, but when a hIPv4 client receives a DNS response with an RLOC (not matching the local RLOC) it can use the hIPv4 framework to reach the server. Intra-RLOC realm connections (remote RLOC=local RLOC) will use legacy IPv4 connections – there is no added value in using the hIPv4 framework inside a RLOC realm. • When will the Internet migrate from a flat to a hierarchical topology? • Possible tipping point #1; when the RIB of the DFZ is getting close to the capabilities of current hardware – who will pay for the upgrade? Or will the service provider only accept GRB prefixes from other providers and avoid capital expenses? • Possible tipping point #2; when the exhaustion of IPv4 addresses is causing enough problems for enterprises • Both customer and provider have a common interest that the Internet is available and affordable!
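The client-side choice between legacy IPv4 and hIPv4 described above can be sketched as a single predicate. The function name `use_hipv4` is hypothetical, and `None` is used here to mean "no RLOC known" (a legacy host, or a DNS response without an RLOC record).

```python
from typing import Optional


def use_hipv4(local_rloc: Optional[str], remote_rloc: Optional[str]) -> bool:
    """Decide whether a connection should use the hIPv4 framework."""
    if local_rloc is None or remote_rloc is None:
        return False  # legacy client, or server without an RLOC record
    # Same RLOC realm: legacy IPv4 is used, hIPv4 adds no value there.
    return remote_rloc != local_rloc
```

A client in realm 172.16.0.3 that resolves a server with RLOC 172.16.0.5 would use hIPv4; the same client reaching a server in its own realm, or a server with no RLOC record, would fall back to legacy IPv4.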
Security considerations • Hijacking of prefixes by longest match from another RLOC realm is no longer possible since the source prefix is separated by a locator. • In order to execute a hijack of a certain prefix, the whole RLOC realm must be routed via a bogus RLOC realm. Studies should be carried out with the Secure Inter-Domain Routing (SIDR) working group to see if the RLOC identifiers can be protected from hijacking.
Carrots for Everyone, Long Term • Enterprises • No need to learn a new protocol, only RLOC concept is introduced • Minimize porting of applications to a new protocol, IPv4 socket API is extended • Get Provider Independent addresses without multihoming requirement, i.e. achieve site mobility • When hosts are upgraded to support the hIPv4 framework, NAT solutions can be removed • Internet Service Providers • No need to learn new routing protocols • Remove IPv4 address constraints • Hierarchical BGP, smaller RIB for each RLOC realm • Internal prefix flaps are not seen in other RLOC realms, only GRB state changes are reflected globally – “update churn” is reduced