VS (Virtual Subnet), draft-xu-virtual-subnet-03. Xiaohu Xu <xuxh@huawei.com>, IETF 79, Beijing.
VS Overview
• VS aims to be a practical and scalable data center network architecture that meets the following objectives:
  • Maximize Bandwidth Utilization:
    • Use L3 routing to overcome the limitations of STP.
  • Layer-2 Connectivity Service:
    • Servers of a given service domain communicate just as if they were on the same LAN or subnet.
  • Service Domain Isolation:
    • For performance isolation and security reasons, servers of different service domains should be isolated from each other, just as if they were separated by VLANs.
  • Broadcast Flooding Suppression:
    • Keep the scope of broadcast flooding (e.g., ARP broadcast traffic, unknown unicast traffic) as small as possible.
VS Overview (cont)
• VS provides an IP-only L2VPN service for server interconnection in data center networks, mainly by combining L3VPN with ARP proxy [RFC 925], a technique invented by Jon Postel.
• On the PE control plane:
  • Host routes (i.e., /32) for local CE hosts are generated automatically from learnt ARP entries.
  • Host routes for remote CE hosts are learnt by using the existing L3VPN technology to distribute the above local CE host routes across PEs.
  • Acting as an ARP proxy, the PE returns its own MAC in response to an ARP request that a local CE host sends for a remote CE host.
• On the PE data plane:
  • Use the L3VPN forwarding mechanism WITHOUT ANY CHANGE.
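
The control-plane bullets above can be illustrated with a small sketch. It is not taken from the draft: the `HostRoute` type and both helper functions are hypothetical names, and the addresses are borrowed from the unicast communication example slide. It only shows how local ARP entries could become /32 routes that another PE then sees as BGP routes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HostRoute:
    prefix: str    # always a /32 host prefix
    next_hop: str  # "Local" or the name of the advertising PE
    protocol: str  # "ARP" for local routes, "BGP" for remote ones


def local_routes(arp_table):
    """Turn learnt ARP entries (IP -> MAC) into local /32 host routes."""
    return [HostRoute(f"{ip}/32", "Local", "ARP") for ip in sorted(arp_table)]


def import_remote(routes, advertising_pe):
    """Re-express another PE's local routes as BGP routes pointing at that PE."""
    return [HostRoute(r.prefix, advertising_pe, "BGP") for r in routes]


# Hypothetical ARP tables for the two sites of VPN Blue.
pe1_arp = {"1.1.1.1": "MAC(A)", "1.1.1.2": "MAC(C)"}
pe2_arp = {"1.1.1.3": "MAC(B)", "1.1.1.4": "MAC(D)"}

# VRF Blue on PE-1: its own local routes plus the routes advertised by PE-2.
pe1_vrf = local_routes(pe1_arp) + import_remote(local_routes(pe2_arp), "PE-2")
for route in pe1_vrf:
    print(route.prefix, route.next_hop, route.protocol)
```

In a real deployment the distribution step is done by BGP/MPLS VPN signalling rather than a function call; the sketch only mirrors the resulting table contents.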
Unicast Communication Example
[Figure: Hosts A (1.1.1.1) and C (1.1.1.2) attach through a ToR switch to PE-1; Hosts B (1.1.1.3) and D (1.1.1.4) attach through a ToR switch to PE-2. PE-1 and PE-2 act as ARP proxies and are connected by an MPLS/IP backbone. Both sites belong to VPN Blue: 1.1.1.0/24.]

VRF Blue on PE-1:
  Prefix        Next-hop   Protocol
  1.1.1.1/32    Local      ARP
  1.1.1.2/32    Local      ARP
  1.1.1.3/32    PE-2       BGP
  1.1.1.4/32    PE-2       BGP

VRF Blue on PE-2:
  Prefix        Next-hop   Protocol
  1.1.1.1/32    PE-1       BGP
  1.1.1.2/32    PE-1       BGP
  1.1.1.3/32    Local      ARP
  1.1.1.4/32    Local      ARP

ARP cache on Host A:
  IP       MAC
  IP(C)    MAC(C)
  IP(B)    MAC(PE-1)
  IP(D)    MAC(PE-1)

ARP cache on Host B:
  IP       MAC
  IP(D)    MAC(D)
  IP(A)    MAC(PE-2)
  IP(C)    MAC(PE-2)

Packet IP(A)->IP(B) along the path:
  Host A to PE-1:   VLAN ID, MAC(A)->MAC(PE-1)
  PE-1 to PE-2:     VPN Label, Tunnel to PE-2
  PE-2 to Host B:   VLAN ID, MAC(PE-2)->MAC(B)
Local CE Host Discovery
• Local CE hosts are discovered through ARP learning.
• The PE periodically sends unicast ARP requests to the learnt local CE hosts to keep their ARP entries from expiring.
• To ensure the PE has learnt all local CE hosts, especially after a reboot, an ARP scan should be performed at least once after rebooting:
  • Option 1 (available today):
    • The PE sends to its local site an ARP request for each IP address within the configured IP subnet in turn.
  • Option 2 (requires extensions to existing ARP):
    • The PE sends to its local site an ARP request for the limited broadcast address (i.e., 255.255.255.255) or the All-Systems multicast group address (i.e., 224.0.0.1).
    • Any CE host receiving such an ARP request should respond with an ARP reply containing its IP and MAC addresses.
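
Option 1 amounts to walking the configured subnet. The sketch below, using only Python's standard `ipaddress` module, lists the addresses a PE would have to query; the function name and the PE's own address (1.1.1.254) are assumptions made for illustration, and sending the actual ARP requests is left out.

```python
import ipaddress


def arp_scan_targets(subnet, own_ip):
    """List every usable host address in the subnet except the PE's own."""
    network = ipaddress.ip_network(subnet)
    own = ipaddress.ip_address(own_ip)
    return [str(host) for host in network.hosts() if host != own]


# Hypothetical configuration: VPN Blue subnet, PE interface at 1.1.1.254.
targets = arp_scan_targets("1.1.1.0/24", "1.1.1.254")
print(len(targets), targets[0], targets[-1])
```

The cost of Option 1 grows with the subnet size (253 requests for this /24), which is presumably part of the motivation for Option 2.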
ARP Reduction
• Besides ARP learning, the PE should perform the ARP proxy [RFC 925] function:
  • For an ARP request for a local CE host, discard it.
  • For an ARP request for a remote CE host, return the PE's own MAC in an ARP reply.
  • For an ARP request for an unknown CE host (i.e., no matching VRF entry found), discard it.
• ARP broadcast traffic from CE hosts is limited to the local VPN site:
  • ARP broadcast traffic is not flooded across PEs.
• An ARP update for a CE host (e.g., triggered by VM mobility) does not trigger any BGP update as long as that CE host is still attached to its original PE and VRF instance (e.g., VM mobility within the VPN site).
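
The three proxy rules above reduce to one lookup in the VRF. The following sketch assumes the VRF is modelled as a plain dictionary from host IP to next hop ("Local" or a remote PE name); the function name and the `PE_MAC` constant are hypothetical.

```python
from typing import Optional

PE_MAC = "MAC(PE-1)"  # placeholder for the proxying PE's own MAC address


def proxy_arp(target_ip: str, vrf: dict) -> Optional[str]:
    """Return the MAC to place in an ARP reply, or None to discard the request."""
    next_hop = vrf.get(target_ip)
    if next_hop is None:
        return None    # unknown CE host: no matching VRF entry, discard
    if next_hop == "Local":
        return None    # local CE host: the host itself will answer, discard
    return PE_MAC      # remote CE host: reply with the PE's own MAC


# VRF Blue on PE-1, as in the unicast communication example.
vrf_pe1 = {"1.1.1.1": "Local", "1.1.1.2": "Local",
           "1.1.1.3": "PE-2", "1.1.1.4": "PE-2"}
print(proxy_arp("1.1.1.3", vrf_pe1))  # remote host
print(proxy_arp("1.1.1.2", vrf_pe1))  # local host
print(proxy_arp("1.1.1.9", vrf_pe1))  # unknown host
```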
CE Multi-homing
• CE multi-homing is an important feature for redundancy and load balancing, especially in data center networks.
• Multiple equal-cost host routes with different BGP next-hops (i.e., remote PEs) for a given multi-homed CE host can be used to achieve maximum capacity for server interconnection.
• CE hosts can be multi-homed to PEs via intermediary bridges (e.g., ToR switches) in the following way:
  • VRRP is enabled on the PEs of a given redundancy group, and only the VRRP master acts as ARP proxy and responds with its VIRTUAL MAC.
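
The slides do not say how traffic is spread over the equal-cost host routes. One common approach, assumed here purely for illustration, is to hash each flow onto one of the available BGP next-hops so that packets of the same flow stay on the same path. The function name and the second remote PE ("PE-3") are hypothetical.

```python
import hashlib


def pick_next_hop(src_ip, dst_ip, next_hops):
    """Map a flow onto one of several equal-cost next-hops by hashing it."""
    key = f"{src_ip}->{dst_ip}".encode()
    digest = hashlib.sha256(key).digest()
    # Use the first four bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:4], "big") % len(next_hops)
    return next_hops[index]


# Hypothetical case: Host B (1.1.1.3) is multi-homed to PE-2 and PE-3.
equal_cost = ["PE-2", "PE-3"]
print(pick_next_hop("1.1.1.1", "1.1.1.3", equal_cost))
```

A cryptographic hash is used only because it is deterministic across runs; real forwarding hardware typically uses a much cheaper hash over more header fields.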
CE Mobility (e.g., VM Mobility)
• CE mobility within a VPN site:
  • The PE just needs to update the corresponding ARP entry.
  • No BGP update is triggered.
• CE mobility across VPN sites:
  • Upon learning via BGP a host route for one of its local CE hosts, the PE should immediately send an ARP request to that host to determine whether the host is still connected to it.
  • If not, the PE should delete the corresponding ARP entry and host route for that CE host, and withdraw the corresponding BGP route it advertised before.
  • Otherwise, the case is treated as CE multi-homing.
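
The cross-site mobility check can be sketched as a single decision. In the sketch below the ARP probe is passed in as a callable (`probe`), since sending a real ARP request is outside its scope; the function name, the returned action strings, and the set used for local routes are all hypothetical.

```python
def handle_bgp_route_for_local_host(ip, arp_table, local_routes, probe):
    """React to a BGP host route for an IP this PE currently holds as local.

    probe(ip) stands in for the unicast ARP request and returns True
    if the host answered, i.e. it is still attached to this PE.
    """
    if probe(ip):
        return "multi-homed"     # host reachable via both PEs: keep both routes
    arp_table.pop(ip, None)      # host has moved away: delete the ARP entry
    local_routes.discard(ip)     # ... and the local host route
    return "withdraw"            # caller should withdraw the BGP route


arp_table = {"1.1.1.1": "MAC(A)", "1.1.1.2": "MAC(C)"}
local_routes = {"1.1.1.1", "1.1.1.2"}

# Host A has moved to another site and no longer answers the probe.
action = handle_bgp_route_for_local_host(
    "1.1.1.1", arp_table, local_routes, probe=lambda ip: False)
print(action, sorted(local_routes))
```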
Multicast/Broadcast
• MVPN technology can be used directly, without any change, to distribute customer multicast traffic among PEs:
  • Inclusive multicast distribution tree
  • Selective multicast distribution tree
• Customer broadcast traffic can be processed as a special customer multicast group.
Next-steps • Any comments?
IPLS vs. VS (CE Reachability Advertisement)
• In IPLS, MAC reachability is advertised via LDP:
  • LDP sessions face a scalability challenge in a large, fully meshed data center network.
  • Adding new PEs requires configuration on all remote PEs.
• In VS, IP reachability is advertised via BGP:
  • BGP sessions scale well with the help of the route reflector (RR) mechanism.
  • Adding new PEs only requires configuration on the RRs.
• The forwarding table size on the PE is the same for both IPLS and VS:
  • Neither host routes nor MAC routes are aggregatable.
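
The scaling argument can be made concrete with simple session counts. The sketch below assumes a full mesh in which every pair of PEs shares one session, and a route reflector design in which every PE peers with each RR; sessions between the RRs themselves are ignored. The function names and the PE count are illustrative only.

```python
def full_mesh_sessions(pe_count):
    """Sessions needed when every pair of PEs peers directly."""
    return pe_count * (pe_count - 1) // 2


def route_reflector_sessions(pe_count, rr_count=1):
    """Sessions needed when every PE peers only with each route reflector."""
    return pe_count * rr_count


for n in (10, 100, 1000):
    print(n, full_mesh_sessions(n), route_reflector_sessions(n, rr_count=2))
```

The full mesh grows quadratically with the number of PEs, while the RR design grows linearly, which is the point the comparison above is making.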
IPLS vs. VS (ARP Reduction)
• In IPLS, the ARP storm issue is not solved completely:
  • ARP packets, including unicast ARP replies, are forwarded from attachment circuits to "multicast" PWs, and ARP packets received from the "multicast" PWs are flooded to all CE hosts.
  • Keeping ARP caches consistent across different PE routers is a hard problem.
• In VS, ARP proxy on the PE routers limits ARP traffic to the scope of a single site.
IPLS vs. VS (CE Multi-homing)
• IPLS prohibits connecting a common LAN or VLAN to more than one PE router.
  • That is, IPLS cannot support redundancy and load balancing of PE-CE connections.
• VS supports CE multi-homing natively.
IPLS vs. VS (Intermediary Bridge's MAC Table Size)
• In IPLS, the intermediary bridges between PEs and CEs have to learn the MAC addresses of all CE hosts (both local and remote):
  • An IP frame received over a unicast PW is prepended with the PE router's own local MAC address before being transmitted on the appropriate attachment circuit. However, a packet sent from a local CE host to a remote CE host carries the remote CE host's MAC as its destination MAC, rather than the local PE router's MAC. The bridges therefore never learn the remote hosts' MAC addresses from received frames, and unknown unicast flooding on these Ethernet bridges will happen sooner or later.
  • To avoid flooding unknown unicast frames, these bridges are configured not to age out learned MAC entries.
• In VS, the intermediary bridges only need to learn the MAC addresses of local CE hosts and local PE routers.