This presentation discusses design trends, the OmniPoP design model, VLANs, and switch requirements and capabilities for the future of GigaPOPs.
OmniPoP: GigaPOP-of-GigaPOPs Design and Future Debbie Montano dmontano@force10networks.com Internet2 Member Meeting – April 2007
Special Note Regarding Forward Looking Statements This presentation contains forward-looking statements that involve substantial risks and uncertainties, including but not limited to, statements relating to goals, plans, objectives and future events. All statements, other than statements of historical facts, included in this presentation regarding our strategy, future operations, future financial position, future revenues, projected costs, prospects and plans and objectives of management are forward-looking statements. The words “anticipates,” “believes,” “estimates,” “expects,” “intends,” “may,” “plans,” “projects,” “will,” “would” and similar expressions are intended to identify forward-looking statements, although not all forward-looking statements contain these identifying words. Examples of such statements include statements relating to products and product features on our roadmap, the timing and commercial availability of such products and features, the performance of such products and product features, statements concerning expectations for our products and product features and projections of revenue or other financial terms. These statements are based on the current estimates and assumptions of management of Force10 as of the date hereof and are subject to risks, uncertainties, changes in circumstances, assumptions and other factors that may cause the actual results to be materially different from those reflected in our forward looking statements. We may not actually achieve the plans, intentions or expectations disclosed in our forward-looking statements and you should not place undue reliance on our forward-looking statements. In addition, our forward-looking statements do not reflect the potential impact of any future acquisitions, mergers, dispositions, joint ventures or investments we may make. We do not assume any obligation to update any forward-looking statements.
Any information contained in our product roadmap is intended to outline our general product direction and it should not be relied on in making purchasing decisions. The information on the roadmap is (i) for information purposes only, (ii) may not be incorporated into any contract and (iii) does not constitute a commitment, promise or legal obligation to deliver any material, code, or functionality. The development, release and timing of any features or functionality described for our products remains at our sole discretion.
Topics • Design Trends • OmniPoP Design • Layer 2 Model • VLANs • Switch Requirements & Capabilities
R&E Design Trends • R&E utilizes services at all three layers: • Layer 1: Fiber/Lambdas • Layer 2: Ethernet/VLANs • Layer 3: IP • HOPI – Hybrid Optical & Packet Infrastructure • VLANs to provide dedicated paths/bandwidth • Regional Aggregation • Number of VLANs continues to increase. • Number of Peers continues to increase • Segmentation is a major trend
OmniPOP Design • OmniPOP – Layer 2 Only • 10 GbE & 1 GbE connections for each member • 10 GbE connections to R&E networks • Multiple VLANs per connection • VLANs for aggregating router-to-router peering connections • GigaPOP-to-GigaPOP • University-to-University • University-to-GigaPOP • University/GigaPOP-to-National Network • VLANs which extend through to end researchers/sites • (VLANs can be used for control plane) • CIC also has Chicago Fiber Ring – Layer 1
VLANs • Local Area Network (LAN) • Virtual LAN (VLAN) • Create independent logical networks • Isolate traffic/groups of users in different VLANs • Create VLANs across multiple devices • Multiple VLANs across a single port • Institute of Electrical & Electronics Engineers (IEEE) • IEEE 802.3 - Ethernet standards • IEEE 802.1 - higher layer LAN protocol standards • IEEE 802.1Q - Virtual LAN (VLAN) standard • 802.1Q limit: 4096 VLAN IDs (12-bit VLAN ID field; IDs 0 and 4095 are reserved, leaving 4094 usable)
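The 4096 figure follows from the 12-bit VLAN ID field in the 802.1Q tag. The following is a minimal Python sketch of that arithmetic, assuming the standard tag layout (16-bit TPID 0x8100, then 3-bit priority, 1-bit drop-eligible/CFI flag, 12-bit VLAN ID). The function name build_vlan_tag is illustrative and does not come from the slides.

```python
import struct

TPID_8021Q = 0x8100   # EtherType value that marks an 802.1Q-tagged frame
VID_BITS = 12         # width of the VLAN ID field in the tag

total_ids = 2 ** VID_BITS      # 4096 possible values
usable_ids = total_ids - 2     # VID 0 and VID 4095 are reserved by the standard


def build_vlan_tag(vid: int, priority: int = 0, dei: int = 0) -> bytes:
    """Pack a 4-byte 802.1Q tag: TPID followed by priority | DEI | VID."""
    if not 1 <= vid <= 4094:
        raise ValueError("usable VLAN IDs are 1-4094")
    if not 0 <= priority <= 7:
        raise ValueError("priority is a 3-bit field (0-7)")
    if dei not in (0, 1):
        raise ValueError("DEI is a single bit")
    tci = (priority << 13) | (dei << 12) | vid
    return struct.pack("!HH", TPID_8021Q, tci)


if __name__ == "__main__":
    print(total_ids, usable_ids)
    print(build_vlan_tag(2000).hex())
```

Any ID in the OmniPOP range 2000-2499 fits comfortably inside the 12-bit field.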
Model • StarLight is a Layer 2 service and does not mandate Layer 3 peering to a central router. It supports open peering amongst its members. • You provide a GigE [or 10GigE] connection from a router in your AS to the StarLight Force10 switch. • Individual peering sessions are bilaterally negotiated. • Your router must support 802.1Q tagged VLANs. • StarLight configures 1 point-to-point VLAN for each of your peerings at your request, or the peer's request. • If you desire to peer with, say, five other StarLight connectors, we will assign 5 VLAN IDs and configure them on our switch, giving you five individual point-to-point connections. Reference: http://www.startap.net/starlight/CONNECT/connectPeering.html
StarLight Bilateral 802.1Q VLANs • StarLight provides bilateral 802.1Q VLANs to peering participants for three primary reasons: • The MTU of the peerings is decided between the peers. StarLight does not enforce a common MTU size for all peers; instead, each peering MTU is individually negotiated between the peers. • By eliminating a common IEEE 802 broadcast domain, IP multicast is not flooded to all participants. This avoids the problems with PIM Asserts in an interdomain peering mesh. • IPv4, IPv4 multicast, and IPv6 are transparent to StarLight. The services you enable on your peerings are between you and each of your peers. They're not limited by anything StarLight supports.
Matching MTUs • Maximum Transmission Unit (MTU) or Media Transmission Unit • The largest physical packet size, measured in bytes, that a network can transmit. Any messages larger than the MTU are divided into smaller packets before being sent • Link-layer MTU is the frame size of an Ethernet packet • IP (Layer 3) MTU is the size used for IP fragmentation • All VLAN members must use the same IP MTU value • Force10 defines Link MTU as the entire Ethernet packet (Ethernet header + frame check sequence + payload) • E.g. Max link MTU 9252 B; Max IP MTU 9234 B (the 18 B difference is the 14 B Ethernet header plus the 4 B frame check sequence) • Link MTU on OmniPOP switch set to Max (9252)
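The relationship between the two MTU values can be sketched in Python. This is only an illustration of the arithmetic implied by the figures on the slide (9252 B link MTU versus 9234 B IP MTU); the function names are hypothetical. The optional vlan_tags parameter is an assumption: the slides do not say whether a given platform counts the 4-byte 802.1Q tag against the link MTU.

```python
ETH_HEADER_BYTES = 14   # destination MAC (6) + source MAC (6) + EtherType (2)
FCS_BYTES = 4           # frame check sequence
VLAN_TAG_BYTES = 4      # one 802.1Q tag


def link_mtu_for(ip_mtu: int, vlan_tags: int = 0) -> int:
    """Link MTU needed to carry an IP packet of ip_mtu bytes.

    vlan_tags is an assumption for illustration; it defaults to 0 so the
    result matches the untagged figures quoted on the slide.
    """
    return ip_mtu + ETH_HEADER_BYTES + FCS_BYTES + vlan_tags * VLAN_TAG_BYTES


def ip_mtu_for(link_mtu: int, vlan_tags: int = 0) -> int:
    """Largest IP packet that fits inside a given link MTU."""
    return link_mtu - ETH_HEADER_BYTES - FCS_BYTES - vlan_tags * VLAN_TAG_BYTES


if __name__ == "__main__":
    print(link_mtu_for(9234))   # slide's maximum IP MTU
    print(ip_mtu_for(9252))     # slide's maximum link MTU
    print(link_mtu_for(9000))   # a 9000-byte IP MTU, untagged
```

With the link MTU at its maximum on the switch, the IP MTU that peers agree on for each VLAN is the value that actually has to match.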
I2 Recommendation - IP MTU Internet2-wide Recommendation on IP MTU: • Engineers throughout all components of the extended Internet2 infrastructure, including its campus LANs, its gigaPoPs, its backbone(s), and exchange points, are encouraged to support, wherever practical, an… • IP MTU of 9000 bytes. • Recommended by the Joint Engineering Team (JET) for federal research networks.
IP MTU Rationale The rationale for this recommendation includes the following points: • Applications, including but not limited to bulk TCP, benefit from being able to send 8K (i.e., 8 times 1024) bytes of payload plus various headers. An IP MTU of 9000 would satisfy this application need. • A growing number of routers, switches, and host NICs support IP packets of at least 9000. • Very few routers, switches, and host NICs support IP packets of more than 9500. Thus, there is comparatively little motivation for a value much more than 9000. • There is anecdotal evidence that Path MTU discovery would be more reliable if a given agreed-on value were commonly used. This relates to weaknesses in current Path MTU discovery technology. • 9000 is an easy number to remember. • It is stressed that this is an interim recommendation. Engineers are also encouraged to explore the benefits and practicalities of much larger MTUs, up to the full 64 KBytes permitted for an IPv4 datagram.
OmniPOP VLANs • OmniPOP selected VLAN IDs 2000-2499 • Each connecting institution tries to set aside this range • IEEE 802.1Q standard limit: 4096 VLAN IDs • Need to select a range of VLAN IDs not already in use by connecting institutions and gigaPOPs • Multiple VLANs supported on 1 physical network connection.
VLAN IDs • reserve the first 20 for broadcast vlans • (we'll need some v4 & v6 address space to talk with each other on each of these): • 2000 v4 unicast, jumbo frames • 2001 v4 multicast, jumbo frames • 2002 v6 unicast, jumbo frames • 2003 v6 multicast, jumbo frames • 2004 v4 unicast, 1500B frames • 2005 v4 multicast, 1500B frames • 2006 v6 unicast, 1500B frames • 2007 v6 multicast, 1500B frames • 2008-2019 reserved
2099-2070: NLR • 2099 NLR-U of Chicago 10G • 2098 NLR-U of Chicago 1G backup • 2097 NLR-U of Ill-Chicago 10G • 2096 NLR-U of Ill-Chicago 1G backup • 2095 NLR-UIUC 10G • 2094 NLR-UIUC 1G backup • 2093 NLR-Indiana 10G • 2092 NLR-Indiana 1G backup • 2091 NLR-Iowa 10G • 2090 NLR-Iowa 1G backup • 2089 NLR-U Michigan 10G • 2088 NLR-U Michigan 1G backup • 2087 NLR-Michigan State 10G • 2086 NLR-Michigan State 1G backup • 2085 NLR-U Minnesota 10G • 2084 NLR-U Minnesota 1G backup • 2083 NLR-Northwestern 10G • 2082 NLR-Northwestern 1G backup • 2081 NLR-Ohio State 10G • 2080 NLR-Ohio State 1G backup • 2079 NLR-Purdue 10G • 2078 NLR-Purdue 1G backup • 2077 NLR-Wisconsin 10G • 2076 NLR-Wisconsin 1G backup • 2075 • 2074 • 2073 • 2072 • 2071 • 2070
2069-2040: Internet2 • 2069 Internet2-U of Chicago 10G • 2068 Internet2-U of Chicago 1G backup • 2067 Internet2-U of Ill-Chicago 10G • 2066 Internet2-U of Ill-Chicago 1G backup • 2065 Internet2-UIUC 10G • 2064 Internet2-UIUC 1G backup • 2063 Internet2-Indiana 10G • 2062 Internet2-Indiana 1G backup • 2061 Internet2-Iowa 10G • 2060 Internet2-Iowa 1G backup • 2059 Internet2-U Michigan 10G • 2058 Internet2-U Michigan 1G backup • 2057 Internet2-Michigan State 10G • 2056 Internet2-Michigan State 1G backup • 2055 Internet2-U Minnesota 10G • 2054 Internet2-U Minnesota 1G backup • 2053 Internet2-Northwestern 10G • 2052 Internet2-Northwestern 1G backup • 2051 Internet2-Ohio State 10G • 2050 Internet2-Ohio State 1G backup • 2049 Internet2-Purdue 10G • 2048 Internet2-Purdue 1G backup • 2047 Internet2-Wisconsin 10G • 2046 Internet2-Wisconsin 1G backup • 2045 Internet2-IndianaCPS • 2044 • 2043 • 2042 • 2041 • 2040
2150-2215 Intra-CIC Point-to-Point VLANs • 2150 UOC-UIC • 2151 UOC-UIUC • 2152 UOC-Indiana • 2153 UOC-Iowa • 2154 UOC-Michigan • 2155 UOC-MSU • 2156 UOC-UMN • 2157 UOC-NU • 2158 UOC-OSU • 2159 UOC-Purdue • 2160 UOC-Wisconsin • 2161 UIC-UIUC • 2162 UIC-Indiana • 2163 UIC-Iowa • 2164 UIC-Michigan • 2165 UIC-MSU • 2166 UIC-UMN • 2167 UIC-NU • 2168 UIC-OSU • 2169 UIC-Purdue • 2170 UIC-Wisconsin • 2171 UIUC-Indiana • 2172 UIUC-Iowa • 2173 UIUC-Michigan • 2174 UIUC-MSU • 2175 UIUC-UMN • 2176 UIUC-NU • 2177 UIUC-OSU • 2178 UIUC-Purdue • 2179 UIUC-Wisc • 2180 IU-Iowa
2150-2215 Intra-CIC Point-to-Point VLANs (Cont) • 2181 IU-Michigan • 2182 IU-MSU • 2183 IU-UMN • 2184 IU-NU • 2185 IU-OSU • 2186 IU-Purdue • 2187 IU-Wisc • 2188 Iowa-Michigan • 2189 Iowa-MSU • 2190 Iowa-UMN • 2191 Iowa-NU • 2192 Iowa-OSU • 2193 Iowa-Purdue • 2194 Iowa-Wisc • 2195 Michigan-MSU • 2196 Michigan-UMN • 2197 Michigan-NU • 2198 Michigan-OSU • 2199 Michigan-Purdue • 2200 Michigan-Wisconsin • 2201 MSU-UMN • 2202 MSU-NU • 2203 MSU-OSU • 2204 MSU-Purdue • 2205 MSU-Wisconsin • 2206 UMN-NU • 2207 UMN-OSU • 2208 UMN-Purdue • 2209 UMN-Wisconsin • 2210 NU-OSU • 2211 NU-Purdue • 2212 NU-Wisconsin • 2213 OSU-Purdue • 2214 OSU-Wisconsin • 2215 Purdue-Wisconsin • 2216 Iowa10G-Iowa1G
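The intra-CIC table is a full mesh: twelve institutions need 12 x 11 / 2 = 66 point-to-point VLANs, which is exactly the size of the 2150-2215 range. The Python sketch below reproduces the numbering by walking the pairs in list order. The member ordering and the helper name allocate_mesh_vlans are inferred from the table for illustration, and institution names are normalized (the slides mix "IU"/"Indiana" and "Wisc"/"Wisconsin").

```python
from itertools import combinations

# Member order inferred from the slide's table; names normalized for illustration.
CIC_MEMBERS = [
    "UOC", "UIC", "UIUC", "Indiana", "Iowa", "Michigan",
    "MSU", "UMN", "NU", "OSU", "Purdue", "Wisconsin",
]
FIRST_VLAN = 2150


def allocate_mesh_vlans(members, first_vlan):
    """Assign one VLAN ID per unordered pair of members, in list order."""
    return {
        vid: f"{a}-{b}"
        for vid, (a, b) in enumerate(combinations(members, 2), start=first_vlan)
    }


if __name__ == "__main__":
    plan = allocate_mesh_vlans(CIC_MEMBERS, FIRST_VLAN)
    print(len(plan), min(plan), max(plan))
    for vid in (2150, 2161, 2200, 2215):
        print(vid, plan[vid])
```

VLAN 2216 (Iowa10G-Iowa1G) falls outside this pattern because it joins two connections of the same institution.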
OmniPOP Switch Design • E1200 Overview • 1 GbE and 10 GbE ports • Upgrade path to higher densities & 100 GbE • Resiliency Architecture • Switching Protocols
E1200 Overview • Cable management • Redundant fans and fan modules • Passive copper backplane: 5 Tbps capacity, 100 GbE ready • Redundant Route Processor Modules (RPMs): 1+1 • No central clock • 14 line card slots • Redundant power supplies: 1+1 DC • Redundant Switch Fabric Modules (SFMs): 8:1, 1.6875 Tbps capacity
Line Rate & High Density Ports OmniPOP includes: • 4-port 10GbE Line Rate cards • 16-port 10GbE High Density cards • 4:1 lookup oversubscribed 10 GbE ports • Functions as a line-rate card if every fourth XFP is used • Provides • Flexibility & control • Can balance ports/chassis use with bandwidth requirements • Room for growth – up to 224 10 GbE ports per E1200 chassis
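The port counts on this slide can be checked with a few lines of Python. This is a sketch of the arithmetic only, assuming the 14 line card slots quoted for the E1200 and reading "every fourth XFP" as one line-rate port per group of four oversubscribed ports; the variable names are illustrative.

```python
LINE_CARD_SLOTS = 14          # E1200 line card slots
PORTS_PER_HD_CARD = 16        # 16-port 10 GbE high density card
OVERSUBSCRIPTION = 4          # 4:1 lookup oversubscription

# Every port populated: maximum density, oversubscribed.
max_ports = LINE_CARD_SLOTS * PORTS_PER_HD_CARD

# Only every fourth XFP populated: each card behaves as a line-rate card.
line_rate_ports_per_card = PORTS_PER_HD_CARD // OVERSUBSCRIPTION
line_rate_ports = LINE_CARD_SLOTS * line_rate_ports_per_card

if __name__ == "__main__":
    print(max_ports, line_rate_ports_per_card, line_rate_ports)
```

The two totals correspond to the 224 ports quoted here and to the line-rate count a chassis reaches with one populated XFP in four.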
Path to 100 GbE & Higher Density [Roadmap figure; only the labels survive, and the pairing of dates with milestones is not recoverable] • Line cards: 1st generation “EtherScale”; 2nd generation “TeraScale”; 3rd generation, high density 10 GbE and very high density GbE; 4th generation, high density 100 GbE • Line card callouts: 90-port GbE, 16-port 10 GbE • Switch fabric modules: 1st generation (SFM), 112.5 Gbps/slot; 2nd generation (SFM3), 225 Gbps/slot; 3rd generation, 337.5 Gbps/slot • Chassis: E-Series E1200 and E-Series E600; 5 Tbps passive copper backplane, 100 GbE ready • Fabric capacities shown: E1200 1.6875 Tbps and 3.375 Tbps; E600 900 Gbps and 1.8 Tbps • Port counts shown: E1200 28 x 10 GbE / 336 x GbE, 56 x 10 GbE / 672 x 1 GbE, 56 x 10 GbE / 672/1260 x GbE, 56/224 x 10 GbE / 672/1260 x GbE; E600 14 x 10 GbE / 196 x GbE, 28 x 10 GbE / 336 x GbE, 28 x 10 GbE / 336/630 x GbE, 28/112 x 10 GbE / 336/630 x GbE • Dates shown: 1999 – 2002, January 2002, October 2002, September 2004, April 2005, October 2005, March 2006, 2008*, 200x* (* planned)
Higher Speeds Drive Switch/Router Requirements • Driving architectural requirements • Massive hardware and software scalability • >200 Gbps/slot switch fabric capacity • Support for several thousand interfaces • Multi-processor, distributed architectures • Fast packet processing at line-rate • 100 GbE is ~149 Mpps or 1 packet every 6.7 ns (10 GbE is only ~14.9 Mpps or 1 packet every 67 ns)
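The packet-rate figures come from the smallest Ethernet frame on the wire. A minimal Python sketch of the calculation follows, assuming the standard 64-byte minimum frame plus 8 bytes of preamble/start-of-frame delimiter and a 12-byte inter-frame gap; the function names are illustrative.

```python
MIN_FRAME_BYTES = 64        # minimum Ethernet frame, including FCS
PREAMBLE_SFD_BYTES = 8      # preamble + start-of-frame delimiter
INTERFRAME_GAP_BYTES = 12   # mandatory idle time between frames


def max_packets_per_second(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Worst-case packet rate: back-to-back frames of the given size."""
    wire_bits = (frame_bytes + PREAMBLE_SFD_BYTES + INTERFRAME_GAP_BYTES) * 8
    return link_bps / wire_bits


def nanoseconds_per_packet(link_bps: float, frame_bytes: int = MIN_FRAME_BYTES) -> float:
    """Time budget to process one packet at line rate."""
    return 1e9 / max_packets_per_second(link_bps, frame_bytes)


if __name__ == "__main__":
    for name, bps in (("10 GbE", 10e9), ("100 GbE", 100e9)):
        mpps = max_packets_per_second(bps) / 1e6
        print(f"{name}: {mpps:.1f} Mpps, {nanoseconds_per_packet(bps):.2f} ns per packet")
```

Each minimum-size frame occupies 84 bytes (672 bits) of wire time, so the per-packet budget shrinks by a factor of ten when moving from 10 GbE to 100 GbE.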
Higher Speeds Drive Density • 100 Gbps Ethernet will benefit all • Drives 10 GbE port density up and cost down • Possible line-rate combinations • 1 x 100 GbE port • 10 x 10 GbE ports • 100 x 1 GbE ports • And even more oversubscribed port density… • The more things change the more they stay the same….
100 GbE Ready Chassis = No Forklift Upgrade • Current backplane can scale to 5 Tbps and 337.5 Gbps/slot with future components • Designed and tested for 5 Tbps • Advanced fiberglass materials improve transmission characteristics • Unique conductor layers decouple 5 Tbps signal from power traces • Engineered trace geometry for channel stability and 25 Gbps channels • Force10 has 19 patents awarded and more than 60 patents pending on its switching technology * No other vendor has openly discussed testing their backplanes for future capacity
Backplane Considerations • Slot Capacity • Switching Capacity • Performance • Signal coding • BER • Impacts SerDes Design • Design and technology drives scalability • Advanced fiberglass materials • Unique conductor layers • Engineered trace geometries
Power Considerations • End User Restrictions? • Total system wattage? • Input power quality not specified • Higher speeds require lower noise
Thermal Management Considerations • Cooling capacity per slot? • Front to back filtered airflow for carrier deployments • Cooling redundancy • Heat can affect material performance which affects high-speed signaling performance
100 GbE Ready Chassis = No Forklift Upgrade • Chassis designed for 100 GbE and high density 10 GbE • Backplane and channel signaling for higher internal speeds • Lower system BER • Connectors • N+1 switch fabric • Reduced EMI • Clean power routing architecture • Thermal and cooling • Cable management
Resiliency Architecture • Manageability and Serviceability: Hitless Software Upgrade; Hot Swap; Logging and Tracing; One Software Image • Protocol Resiliency: OSPF/BGP Restart; RSTP, MST; VRRP • Link Resiliency: LAG; ECMP; LFS/WAN PHY; BFD • HA Software Architecture: Modular OS (NetBSD); 3 CPU (L2, L3, CP); Line Card CPU; HA Software • Resilient Hardware Architecture: Hardware Redundancy; Distributed Forwarding; Hitless Failover; DoS Protection
Backplane/Fabric Scalability and Reliability • Completely passive copper backplane • Scalability tested to 5 Tbps • Passive copper backplanes more reliable than optical backplanes • Force10 holds many patents on backplane design and manufacturing technology • 1.6875 Tbps fabric • 8:1 SFM redundancy provides OpEx savings [Figure labels: Route Processor Module; Line Card; Ternary CAM; 1GE/10GE MAC; Buffer/Traffic Management; Backplane Scheduler; 56.25 Gbps per slot; Passive Copper Backplane; 1.6875 Tbps Switch Fabric]
Modularity Enables Predictability • Multi-CPUs with modular OS for routing and management (control plane) • Distributed ASICs for line rate forwarding and packet processing (data plane) • Independent data and control paths (among CPUs)
Force10 TeraScale Architecture: Embedded Security & Catastrophic Failure Prevention • Traffic going to each CPU is prioritized & rate limited • More than 1,000,000 ACLs with no performance impact • CPU utilization >85% triggers internal protection mechanisms [Figure labels: Route Processor Module (Switching, Routing, Management); Line Card; Ternary CAM; ACL; 1GE/10GE MAC; Buffer/Traffic Management; Backplane Scheduler; Passive Copper Backplane; 1.6875 Tbps Switch Fabric]
Distributed and Hitless Forwarding • Each line card has distributed CAMs for L2/L3 forwarding, ACL, and QoS lookup • Always 5 lookups, which allows line-rate packet processing • FIB-based architecture scales extremely well compared with flow-based architectures (no slow path, no flow cache thrashing) • Hardware-based ACLs with sequence numbers ensure no security holes • CPUs calculate best paths and download them to line cards synchronously • Line cards make independent forwarding decisions • During RPM failover, line cards continue to forward without packet loss (hitless)
Switching • 4K VLANs • 128K MAC + L2 ACL Entries per Port Pipe • 802.1Q VLAN Tagging • VLAN Stacking • 802.1p VLAN Prioritization • 802.3ad Link Aggregation w/ LACP • 802.1D Spanning Tree Protocol • RRR Rapid Root Redundancy (STP) • Force10 VLAN Redundancy Protocol (FVRP) • MSTP (802.1s), RSTP (802.1w) • Filtering/Load balancing on L3 header • 802.3ac Frame Extension for VLAN tagging [Figure labels: Route Processor 2, a separate module for system management; protocols run as individual processes: MAC Manager, Spanning Tree, ARP Manager, Link Aggr. (LAG), VRRP, ICMP, PPP, System Manager; IPC; Kernel Layer]
MAC Learning & Filtering • MAC Learning Limit to Control Number of Entries Learnt • MAC Addresses Received Beyond Limit are Discarded • L2 ACL Entry to Prevent Discarded MACs from Forwarding Traffic • Counter to Measure Dropped Addresses • L2 Interface Filtering • Standard MAC ACL: Source MAC Address • Extended MAC ACL: SA, DA, Ethernet Frame Type, VLAN • Standard IP ACL: Source IP Address • Extended IP ACL: IP-DA, IP-SA, Protocol Type, Destination Port, Source Port
Link Aggregation 802.3ad • Up to 16 Links in a LAG • Up to 256 LAGs Per System • No dependency on Slot • Any slot any port • Must be like ports – all 1 GE or all 10GE • Adding or Deleting Ports from a LAG Does not Require LAG/System Reset • Map traffic on to a Link Based on: • L2 header: MAC-DA, MAC-SA • L3 header within L2 packet: IP-DA, IP-SA, protocol type, destination port, source port
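Mapping traffic onto a LAG member by header fields is usually done by hashing the fields and taking the result modulo the number of links, so that all packets of one flow stay on one link. The Python sketch below illustrates that idea only. The slides do not describe the hash Force10 uses, so the CRC32 choice, the field encoding, and the name pick_lag_member are assumptions.

```python
import zlib

MAX_LINKS_PER_LAG = 16  # limit quoted on the slide


def pick_lag_member(flow_fields, link_count: int) -> int:
    """Return the index of the LAG member link for a flow.

    flow_fields is any tuple of header values, e.g. (MAC-DA, MAC-SA) for an
    L2 hash or (IP-DA, IP-SA, protocol, dst port, src port) for an L3 hash.
    CRC32 is a stand-in for whatever hash the hardware implements.
    """
    if not 1 <= link_count <= MAX_LINKS_PER_LAG:
        raise ValueError("a LAG holds between 1 and 16 links")
    key = "|".join(str(field) for field in flow_fields).encode()
    return zlib.crc32(key) % link_count


if __name__ == "__main__":
    flow = ("192.0.2.10", "198.51.100.7", "tcp", 443, 51512)
    print(pick_lag_member(flow, link_count=4))
```

Because the result depends only on the header fields, packets within a flow are not reordered, while different flows spread across the member links.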
Enhanced STP Support • 802.1D Standard Spanning Tree • STP can be disabled per interface • STP Portfast Support on Switch Ports Connected to End Hosts • STP BPDU Guard • Disable port if BPDU received on a portfast port. • Rapid Root Redundancy (RRR) Protocol • Sub 50 msec STP recovery from root port failures • Multiple Spanning tree protocol (802.1s) for rapid convergence and efficient link utilization • RSTP (802.1w)
Thank You Debbie Montano dmontano@force10networks.com Questions?