Real world FabricPath Deployment at IBM Data Centers
Santiago Freitas, CCIE#18776 (R&S / SP), Consulting Systems Engineer, Cisco-IBM Global Team, safreita@cisco.com
Lasse Leegaard, IT Architect, AT&T, lasse@intl.att.com
Cisco Confidential
What?
• IBM has achieved tangible benefits by migrating its infrastructure from Catalyst 6500 to a Nexus 2000/5000/7000 architecture.
• IBM has adopted FabricPath on the Nexus 5K and 7K, and MPLS L3VPN on the Nexus 7K.
• The solution was extensively tested at Cisco ECATS.
• FabricPath was a key differentiator when competing with Juniper.
• We learned a lot from this deployment.
Session Objectives
At the end of the session, you should be able to:
• Articulate to your customers the business benefits that IBM has achieved by migrating to a Nexus 2K/5K/7K architecture.
• Explain why IBM adopted FabricPath.
• Understand the tests performed to validate the solution before deployment.
• Understand IBM's future direction and how IBM plans to get there.
IBM Nordic Strategic Outsourcing
One of the company's largest Integrated Market Teams (IMT) globally
• IBM SO provides outsourcing services that offer management of applications and other IT components in either an onsite or hosted arrangement.
• Eight Data Centers located in Denmark, Sweden and Finland.
• Serves around 200 customers
  • Some have dedicated infrastructures
  • Over 100 are served by a shared, multitenant infrastructure
Overview of the Network Infrastructure
• ~150 Cisco 6500/7600
• ~50 Cisco 7200/7300
• ~3000 VLANs and 290 virtual firewalls
• ~26000 Ethernet ports
One of the Access Blocks Reached EoL
A portion of the shared infrastructure was approaching end of life:
- 22 Access Switches
- 4080 access ports
  • 1G or 2x1G uplinks
- 2 pairs of FWSM in Service Switches
Cisco and AT&T performed an EoL analysis.
Factual discussion: vital to demonstrate the need for a full network refresh.
Business Benefits of the Nexus-based Solution
Why IBM chose to deploy Nexus and FabricPath
Significant OPEX savings when compared with the existing infrastructure:
• Reduced the power consumption by 61%
• Reduced the rack space used by network switches by 60%
• Reduced the number of managed devices in the network by 38.5% (from 26 to 16)
Easier to scale: supports more access blocks on the same Core devices, and is therefore less expensive per customer port.
Reduction in the time to onboard and configure the network for new customers.
CAPEX savings: the Next Generation DC based on Cisco Nexus and FabricPath was 46% cheaper than building a similar architecture using Catalyst 6500.
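The managed-device figure above can be checked directly from the two counts on the slide. The sketch below is only an arithmetic check; the helper name `percent_reduction` is made up for illustration, and the power and rack-space percentages are not recomputed because the slides do not give their underlying before/after values.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100


# Managed devices went from 26 to 16 according to the slide.
managed_device_reduction = percent_reduction(26, 16)
print(f"Managed devices reduced by {managed_device_reduction:.1f}%")
```

The same helper could be applied to power or rack-space measurements if the raw before/after numbers were available.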
Juniper!!!
Yes, we did consider doing it differently
• Like-for-like (EX8200/4500/4200 + MX routing + 6500/FWSM firewall)
• QFabric (QFabric switching + MX routing + 6500/FWSM firewall)
• No FCoE capable hardware
• 10G server density not impressive
• FCoE development is beginning to catch up
• However, Nexus has more/longer field exposure than Juniper kit in this area.
• Organizational inertia and training would have to be overcome
What did IBM actually deploy?
2x Nexus 7010
• M1/F1 combination
• MPLS L3 VPN PE
12x Nexus 5548UP
• Across 3 DCs
70x Nexus 2200
• 3360 access ports
2x 6500 Service chassis for FWSM modules
(Diagram labels: MPLS Backbone, FabricPath)
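The deployment numbers above are internally consistent, which a few lines of arithmetic can show. This is a rough sketch under two assumptions that the slides do not state: that access ports are spread evenly across the FEXes, and that FEXes are not counted as separately managed devices because they are configured from their parent switch. Under those assumptions the totals line up with the 16 managed devices quoted on the benefits slide.

```python
# Figures taken from the deployment slide.
NEXUS_7010 = 2
NEXUS_5548UP = 12
NEXUS_2200_FEX = 70
SERVICE_CHASSIS_6500 = 2
ACCESS_PORTS = 3360

# Assumption: ports are spread evenly across the fabric extenders.
ports_per_fex = ACCESS_PORTS / NEXUS_2200_FEX

# Assumption: FEXes are managed through their parent switch and are
# therefore left out of the managed-device count.
managed_devices = NEXUS_7010 + NEXUS_5548UP + SERVICE_CHASSIS_6500

print(f"{ports_per_fex:.0f} access ports per FEX")
print(f"{managed_devices} managed devices")
```

An even 48 ports per FEX fits a 48-port model such as the Nexus 2248 used in the test topology, though the slides only say "Nexus 2200" for the production deployment.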
FabricPath Flexibility
The Network Can Evolve With No Disruption
• Need more edge ports? → Add more leaf switches
• Need more bandwidth? → Add more links and spines
(Diagram labels: L3, FabricPath)
Why did IBM adopt FabricPath?
vPC and traditional STP topologies were considered
• Better utilization of links
• Increased agility
  • New PODs and/or links for more capacity can be added non-disruptively
  • Any VLAN anywhere
• Simplicity of configuration
  • Much simpler to implement and configure than vPC
• Very fast convergence: sub-second in most cases
• Need to route over the fabric
  • Layer 3 over FabricPath
FabricPath enablement
Was that really it?

    vpc domain 11
      role priority 100
      peer-keepalive destination 10.1.20.46 source 10.1.20.45
      peer-gateway
      auto-recovery
      fabricpath switch-id 1000

    fabricpath domain default
      spf-interval 50 50 50
      lsp-gen-interval 50 50 50
      root-priority 255 / 254 (N7K)

    fabricpath switch-id 1

    install feature-set fabricpath
    feature-set fabricpath

    vlan 3865
      mode fabricpath

    spanning-tree mst configuration
      name IBMMST02
      revision 10
      instance 1 vlan 1-2048
      instance 2 vlan 204

    interface Ethernet1/5
      switchport mode fabricpath
MPLS L3 VPN on Nexus 7000
Works together with the rest of the infrastructure
• Nexus 7010 as the MPLS L3VPN PE.
• Customer VLANs are mapped into a VRF/VPN in the Aggregation Layer.
• Remote sites are 6500, 7600, 7300 and 7200, working well with the rest of the infrastructure.
• Advantage over Juniper, which would have required an extra layer.
Migration plan
How to get from here to there (or from there to here depending on your point of view)
(Diagram labels: MPLS P, VPLS PE, MPLS PE + Aggregation, L3, VLANs, FabricPath, FW/LB service + Access)
ECATS End of Test Report
Cisco Enhanced Customer Aligned Testing Services - http://ecats
See the Additional Resources slide for a link to the report.
• 36 major test areas
• Detailed results
• DDTS/bugs found and workarounds
• Technical notes
• Convergence summary table
• HW and SW utilized
• Lessons learned
• Configuration files
ECATS testing experience
Cisco Enhanced Customer Aligned Testing Services - http://ecats
• Vital to the success of this deployment.
• Gave us experience with the technology before using it in production
• Testing overlapped with rollout
• Reduced the risk of introducing new technology
Testing Topology
Remote Site
• Agg/MPLS PEs (7600)
• L2/L3 Aggregation
• ISIS / MP-BGP / LDP
• Access Layer Cat6500 (Layer 2)
ISIS and MPLS in the core
Site Under Test
• Nexus 7010 as Agg/MPLS PE (L2/L3)
• vPC+ at the Core for Active/Active HSRP
• Nexus 5548UP/Nexus 2248 as Access
• FabricPath
• Servers attached with vPC+
• OSPF/BGP over FP
Testing Topology and Scale Numbers
Hardware and Software Versions and Scale Numbers (For Your Reference)
Scale and traffic
• 300 VLANs
• 300 SVIs and 300 HSRP
• 200 VRFs / MPLS L3 VPN
• 3000 MAC addresses injected
• IMIX Ethernet Traffic
• 4Gbps within Nexus Access Block (East-West)
• 800Mbps towards remote site (North-South)
• A full mix of bi-directional traffic paths (Inter-VLAN, Intra-VLAN, Inter-VRF)
Access Layer
• Nexus 5548UP: NX-OS 5.1(3)N1(1a)
• FEX Nexus 2248
Core
• Nexus 7010: NX-OS 5.2(3a)
• 2x M1 8x 10GE (N7K-M108X2-12L)
• 2x F1 32x 1/10GE (N7K-F132XP-15)
Remote Site PE
• 7609 with RSP-720: IOS 15.1(1)S
Convergence Times
Convergence Summary: sub-second on FabricPath link failures

Failover Test                                        Failure      Recovery
Layer 3 link failure on Core towards remote site     64 ms        30 ms
M1 line card failure on Core (North-South)           950 ms       75 ms
FabricPath link failures (multiple tests)            117 ms       241 ms
F1 line card failure on Core                         1380 ms      319 ms
Core node failure (power off N7010)                  2584 ms      2703 ms
Access node failure (vPC+ attached servers)          316.52 ms    181 ms
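To make the "sub-second in most cases" claim concrete, the measured failure times above can be filtered against a one-second threshold. This is a small illustrative sketch: the dictionary keys are paraphrased test names, the values are the numbers from the convergence summary, and the function name `sub_second` and the 1000 ms threshold are choices made here for illustration.

```python
# (failure_ms, recovery_ms) per failover test, copied from the convergence summary.
CONVERGENCE_MS = {
    "Layer 3 link failure on core towards remote site": (64, 30),
    "M1 line card failure on core (North-South)": (950, 75),
    "FabricPath link failures (multiple tests)": (117, 241),
    "F1 line card failure on core": (1380, 319),
    "Core node failure (power off N7010)": (2584, 2703),
    "Access node failure (vPC+ attached servers)": (316.52, 181),
}


def sub_second(results, threshold_ms=1000):
    """Return the names of tests whose failure convergence beat the threshold."""
    return [
        name
        for name, (failure_ms, _recovery_ms) in results.items()
        if failure_ms < threshold_ms
    ]


fast = sub_second(CONVERGENCE_MS)
print(f"{len(fast)} of {len(CONVERGENCE_MS)} failure scenarios converged in under one second")
for name in fast:
    print(f"  {name}")
```

With these figures, only the F1 line card failure and the full core node power-off exceed one second.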
Dynamic Routing Protocol and FabricPath
You can run OSPF and BGP over FabricPath; you can't over vPC

    CE3-2851-RK18#sh ip ospf neighbor
    Neighbor ID     Pri   State           Dead Time   Address         Interface
    10.10.101.5       1   FULL/BDR        00:00:36    10.10.101.5     GigabitEthernet0/1
    10.10.101.7       1   FULL/DR         00:00:33    10.10.101.7     GigabitEthernet0/1
    10.10.101.8       0   2WAY/DROTHER    00:00:30    10.10.101.8     GigabitEthernet0/1
    CE3-2851-RK18#

• The OSPF CE routers CE-3 and CE-4 were configured with the "ip ospf priority 0" interface configuration so they don't participate in the DR/BDR election process
• FULL OSPF neighborships are formed with both Core1 and Core2
• Traffic is still forwarded even when crossing the peer-link
• FabricPath doesn't have the same limitations as vPC
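A check like the one on this slide can be scripted when repeating the test. The sketch below parses the `show ip ospf neighbor` text shown above and lists the neighbors in FULL state. It is an illustration only: the parser handles just this six-column layout, the function name `parse_ospf_neighbors` is hypothetical, and treating 10.10.101.5 and 10.10.101.7 as the two core switches is an assumption based on the slide's statement that FULL adjacencies form with Core1 and Core2.

```python
SHOW_OUTPUT = """\
Neighbor ID     Pri   State           Dead Time   Address         Interface
10.10.101.5       1   FULL/BDR        00:00:36    10.10.101.5     GigabitEthernet0/1
10.10.101.7       1   FULL/DR         00:00:33    10.10.101.7     GigabitEthernet0/1
10.10.101.8       0   2WAY/DROTHER    00:00:30    10.10.101.8     GigabitEthernet0/1
"""


def parse_ospf_neighbors(text):
    """Parse neighbor rows; skips the header and anything that is not a data row."""
    neighbors = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly six columns and a numeric priority.
        if len(fields) != 6 or not fields[1].isdigit():
            continue
        state, _, role = fields[2].partition("/")
        neighbors.append(
            {
                "neighbor_id": fields[0],
                "priority": int(fields[1]),
                "state": state,
                "role": role,
                "address": fields[4],
                "interface": fields[5],
            }
        )
    return neighbors


neighbors = parse_ospf_neighbors(SHOW_OUTPUT)
full = [n["neighbor_id"] for n in neighbors if n["state"] == "FULL"]
print("FULL adjacencies:", full)
```

The third neighbor stays in 2WAY/DROTHER with priority 0, which matches the slide's note that the CE routers were kept out of the DR/BDR election.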
Technical Lessons Learned
It would be a session on its own…
Details are on the hidden slides and on the Additional Resources page
No show-stopper DDTS
• One cosmetic, one catastrophic but with an easy workaround (already fixed), and one unreproducible.
Several technical lessons learned in the areas of:
• Peer-Link Failure and vPC+ attached devices
• MAC Learning with vPC+ domain
• Multidestination tree and vPC+
• MAC Learning on N7K with M1/F1 for L2 Traffic
Further developments
Where do we see the rest of the infrastructure going?
Evolution Plan
(Diagram labels: SAN A / SAN B; MPLS P Routers, Layer 3 / MPLS; MPLS PE/Agg Switches; VPLS PE; five access blocks, each labelled Layer 2 IPv4, Services Switches (FWSM/ACE/NAM), Up to 20 Access Switches)
Evolution Plan
Management, Orchestration, Provisioning, Automation: Dynamic Infrastructure
2^12 = 4096 VLANs… 2^24 = 16777216 Segment IDs
(Diagram labels: SAN A / SAN B; Storage FC/FCoE/NAS; VPLS PE; MPLS P Routers, Layer 3 / MPLS; MPLS PE/Agg Switches; access blocks labelled Layer 2 IPv4 or Layer 2 IPv4/IPv6, Services Switches (FWSM/ACE/NAM), Up to 20 Access Switches)
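The two figures on this slide are simply the sizes of a 12-bit and a 24-bit identifier space. The short sketch below reproduces them and shows the ratio between the two; the variable names are illustrative, and the slide itself does not say which segmentation technology the 24-bit Segment ID refers to.

```python
VLAN_ID_BITS = 12      # width of the 802.1Q VLAN ID field
SEGMENT_ID_BITS = 24   # width of the Segment ID quoted on the slide

vlan_ids = 2 ** VLAN_ID_BITS
segment_ids = 2 ** SEGMENT_ID_BITS
growth_factor = segment_ids // vlan_ids

print(f"VLAN IDs:      {vlan_ids}")
print(f"Segment IDs:   {segment_ids}")
print(f"Growth factor: {growth_factor}x")
```

For a shared infrastructure already running around 3000 VLANs, the 12-bit space leaves little headroom, which is presumably why the larger ID space appears on the evolution slide.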
Key Takeaways
The key takeaways of this presentation are:
• IBM has achieved OPEX and CAPEX savings by migrating to a Nexus 2K/5K/7K architecture in its Data Centers.
• IBM has adopted FabricPath and is very happy with its flexibility, ease of implementation and use, and convergence time.
• FabricPath was extensively tested and validated at Cisco ECATS.
• FabricPath and MPLS on N7K were differentiators against Juniper.
• You can reuse the lessons learned and additional resources available from this deployment to position FabricPath to your customers.
Additional Resources
You can find the following additional information at this link:
http://bock-bock.cisco.com/wiki/User:Safreita:FabricPath_Testing
• Customer Requirements and Business Case for Catalyst 6500 -> Nexus and FabricPath
• Joint Technical Plan of Record (test requirements)
• Detailed Test Plan
• Complete End of Test Report (including detailed test results and configurations)
• Lessons Learned Presentation
• INTERNAL Case Study of IBM Nordic Adoption of Nexus and FabricPath
• EXTERNAL version of the Case Study