OmniSwitch 9000 - Product architecture
Guillaume Ivaldi, OS9000 Product Line Manager
Agenda
• Introduction / OS9000 product overview
• Block architecture (CMM / Network Interfaces / Backplanes)
• Presentation of the Packet Processor
• Presentation of the Switching Fabric & resiliency mechanism
• Advanced network policies (QoS / ACLs)
Highest performance & availability
• Powerful solution for backbone and data center deployments
• Highest performance
  - 768 Gbps switching capacity over a 1.92 Tbps backplane
  - 570 Mpps throughput
  - Wire rate at 10/100/1000/10,000 Mbps
  - Advanced L2 / L3 / L4 switching (unicast and multicast, IPv4 and IPv6)
• Built for the highest availability
  - Smart Continuous Switching enabled by a fully distributed architecture
  - Everything 100% resilient
• Designed for large gigabit aggregation and server farm efficiency
  - Up to 96 ports of 10 Gigabit Ethernet or 384 Gigabit Ethernet ports
  - Native embedded server load balancing
  - Jumbo frame (9K) support
A modular system
• 10-slot chassis (8 NI slots)
  - Backplane architecture
  - Fan trays (hot swappable)
• 18-slot chassis (16 NI slots)
  - Backplane architecture
  - Fan trays (hot swappable)
• 2 CMMs (hot swappable)
  - Management + switching fabric
• 8/16 NIs (hot swappable)
  - 24-port 10/100/1000
  - 24-port 1000-X SFP
  - 2/6-port 10000-X XFP
• 3/4 PSUs (hot swappable)
  - N+1 redundancy
  - 110/220 V input (AC)
  - 48 V input (DC)
OS9000 - HW Roll-out
• 1st Release (Dec. 05): 8-slot chassis & modules
  - OS-9700-CMM: support of the 8-slot chassis (OS9700 only)
  - OS9-GNI-C24: 24-port GigE RJ45 (10/100/1000)
  - OS9-GNI-U24: 24-port GigE SFP
  - OS9-XNI-U2: 2-port 10GigE XFP
• 2nd Release (Jul. 06): 4-slot chassis
  - OS-9600-CMM: support of the 4 & 8-slot chassis (OS9600/9700)
• 3rd Release (Oct. 06): 16-slot chassis & modules
  - OS-9800-CMM: support of the 16-slot chassis (OS9800 only)
  - OS9-XNI-U6: 6-port 10GigE XFP (High Density 10G)
  - OS9-GNI-P24: 24-port GigE RJ45 (10/100/1000), PoE
• 4th Release (Q1 07): modules
  - OS9-GNI-C48T: 48-port GigE (10/100/1000) using MRJ21 (8-port)
  - OS9-GNI-C20L: 20-port RJ45 (10/100, SW upgradeable to 10/100/1000) + 2-port GigE SFP
Key figures
• Switch Fabric operating in load sharing (true switching capacity)
  - 12 Gbps / slot with a single CMM installed
  - 24 Gbps / slot with dual CMMs installed
• Network Interfaces support local switching (aka U-turn)
  - Up to 35.7 Mpps / slot
• Port density
  - OS-9600: 96 (192) x GigE / 8 (24) x 10GigE
  - OS-9700: 192 (384) x GigE / 16 (48) x 10GigE
  - OS-9800: 384 (768) x GigE / 32 (96) x 10GigE
• Hardware specifications
  - L2: 16K hosts – 4K VLANs
  - L3: 8K hosts (*) – 12K LPM (*) – 4K local interfaces (single MAC)
    (*) 1 entry per IPv4 record, 2 entries per IPv6 record
  - QoS: 2K network policies for L1/L2/L3/L4 classification, 8 HW-based priorities per port
Agenda
• Introduction / OS9000 product overview
• Block architecture (CMM / Network Interfaces / Backplanes)
• Presentation of the Packet Processor
• Presentation of the Switching Fabric & resiliency mechanism
• Advanced network policies (QoS / ACLs)
OS9700-CMM (aka OS9600-CMM)
[Block diagram: Processor board (CPU, RAM, Flash, USB, Ethernet switch) and Fabric board (SwitchFabric, 8 x 12G channels to NI slots #1 to #8, link to redundant CMM) – logically independent, but physically one board]
• New generation of CMM – 192 Gbps (142.8 Mpps)
• Fabric board, providing 12 Gbps per channel
  - OS9700: 1 channel per slot
  - OS9600: 2 channels per slot
• IPC (inter-processor communication) over its own dedicated & independent Ethernet bus
• Processor board
  - CPU: Freescale 833 MHz
  - 8 MB boot Flash
  - RAM: 256 MB (DDR SDRAM)
  - Int. Flash: CF 128 MB
  - Ext. Flash: USB (future)
  - EMP: RJ45 (10/100/1000)
  - Console: RJ45
• New AOS LEDs
  - OK1, OK2
  - Control
  - Fabric
  - Temp, Fan & PSU
OS9800-CMM (AOS 6.1.3R01)
[Block diagram: Processor board (CPU, RAM, Flash, USB, Ethernet switch) and Fabric board (SwitchFabric, 16 x 12G channels to NI slots #1 to #16, link to redundant CMM) – logically independent, but physically one board]
• New generation of CMM – 384 Gbps (285 Mpps)
• Fabric board, providing 12 Gbps per channel
  - OS9800: 1 channel per slot
• IPC (inter-processor communication) over its own dedicated & independent Ethernet bus
• Processor board
  - CPU: Freescale 833 MHz
  - 8 MB boot Flash
  - RAM: 256 MB (DDR SDRAM)
  - Int. Flash: CF 128 MB
  - Ext. Flash: USB (future)
  - EMP: RJ45 (10/100/1000)
  - Console: RJ45
• New AOS LEDs
  - OK1, OK2
  - Control
  - Fabric
  - Temp, Fan & PSU
OS9-GNI-C24
[Block diagram: CPU + RAM, Standard Fwd Engine, PHYs, channels to primary & secondary CMM, PoE feed/socket]
• 24 ports 10/100/1000 using RJ45 connectors
• Leveraging both fabrics, if installed
  - Throughput of 35.7 Mpps (U-turns supported)
  - 2 x 12 Gbps channels
• Standard Packet Processor
  - On-chip buffering
  - Integrated tables
  - 8 HW queues per port
• Dedicated CPU
  - Freescale 833 MHz
  - Distributed processing
OS9-GNI-P24 (AOS 6.1.3R01)
[Block diagram: CPU + RAM, Standard Fwd Engine, PHYs, channels to primary & secondary CMM, PoE control, PoE feed/socket]
• 24 ports 10/100/1000 using RJ45 connectors
• Leveraging both fabrics, if installed
  - Throughput of 35.7 Mpps (U-turns supported)
  - 2 x 12 Gbps channels
• Standard Packet Processor
  - On-chip buffering
  - Integrated tables
  - 8 HW queues per port
• Dedicated CPU
  - Freescale 833 MHz
  - Distributed processing
OS9-GNI-U24
[Block diagram: CPU + RAM, Standard Fwd Engine, channels to primary & secondary CMM, PoE feed]
• 24 ports GigE using LC connectors (SFP)
• Leveraging both fabrics, if installed
  - Throughput of 35.7 Mpps (U-turns supported)
  - 2 x 12 Gbps channels
• Supported SFPs
  - SX, LX, LH, Extended Gig
• Standard Packet Processor
  - On-chip buffering
  - Integrated tables
  - 8 HW queues per port
• Dedicated CPU
  - Freescale 833 MHz
  - Distributed processing
OS9-XNI-U2
[Block diagram: CPU + RAM, PHY, Standard Fwd Engine, channels to primary & secondary CMM]
• 2 ports 10GigE using XFP connectors
• Leveraging both fabrics, if installed
  - Throughput of 29.8 Mpps (U-turns supported)
  - 2 x 12 Gbps channels
• Supported XFPs
  - SR, LR
• Standard Packet Processor
  - On-chip buffering
  - Integrated tables
  - 8 HW queues per port
• Dedicated CPU
  - Freescale 833 MHz
  - Distributed processing
OS9-XNI-U6 (AOS 6.1.3R01)
[Block diagram: CPU + RAM, two Fwd Engines (HD MAC) behind a local fabric (Fwd Engine), PHYs, channels to primary & secondary CMM]
• 6 ports 10GigE using XFP connectors
• Leveraging both fabrics, if installed
  - Throughput of 2 x 35.7 Mpps (U-turns supported on each fwd engine)
  - 2 x 12 Gbps channels
• Supported XFPs
  - SR, LR
• Standard Packet Processors in a dual Packet Processor design
  - Full QoS/ACL & statistics
  - 8 HW queues per port
• Dedicated CPU
  - Freescale 833 MHz
  - Distributed processing
OS-9700 Backplane (1)
• High-speed connections from each NI to each CMM: 8 lanes (bidirectional)
  - Data path: FBUS+ (3.75 GHz per lane – 4 lanes / 15 Gbps 'raw')
  - Control path: BBUS+ (1.25 GHz per lane – 1 lane)
[Diagram: active FBUS+ links from the Fwd Engine on each NI (NI-1 to NI-8) to the fabric on CMM-A and on CMM-B]
OS-9700 Backplane (2)
• High-speed connections from each NI to each CMM: 8 lanes (bidirectional)
  - Data path: FBUS+ (3.75 GHz per lane – 4 lanes / 15 Gbps 'raw')
  - Control path: BBUS+ (1.25 GHz per lane – 1 lane)
[Diagram: active BBUS+ links from the CPU on each NI (NI-1 to NI-8) to the pilot switch on CMM-A, standby BBUS+ links to the pilot switch on CMM-B]
OS-9600 Backplane
• High-speed connections from each NI to the CMM: 9 lanes (bidirectional)
  - Data path: FBUS+ (3.75 GHz per lane – 4 lanes / 15 Gbps 'raw')
    - Four additional lanes available for future use
    - Each lane capable of supporting 7.5 GHz
  - Control path: BBUS+ (1.25 GHz per lane – 1 lane)
    - Secondary BBUS+ connection unused in the OS9600
[Diagram: active FBUS+ links from the Fwd Engine and active BBUS+ links from the CPU on each NI (NI-1 to NI-4) to the fabric and pilot switch on CMM-A]
Agenda
• Introduction / OS9000 product overview
• Block architecture (CMM / Network Interfaces / Backplanes)
• Presentation of the Packet Processor
• Presentation of the Switching Fabric & resiliency mechanism
• Advanced network policies (QoS / ACLs)
Inside the Forwarding Engine
[Block diagram: Parser feeding the Security, Switching, Routing and Classification Engines, then Buffer Management, Traffic Management and Modification, with on-chip memory, 24 x GigE + 2 x 10GigE ports and 2 x 12GigE (FBUS+) fabric interfaces]
• A common Fwd Engine for OS6850 / OS9000
• In-chip memory tables
  - L2 MAC addresses
  - L3 host / forwarding
  - QoS / ACLs
• In-chip buffer
  - Smart buffer allocation
• Non-blocking design
  - U-turn support
• Packet forwarding: 35.7 Mpps
Ingress/Egress block of ports
• Types of blocks depend on the module – available types are:
  - 24 x GigE ports (SerDes/SGMII) for copper (PHY required) or fiber
  - 2 x 10GigE (XAUI) for 10G XFP
  - 2 x fabric interfaces (FBUS+, 12 Gbps)
• Fabric interfaces are used in load sharing
  - Load sharing is obtained by 'aggregating' the two fabric interfaces
    - each one going to a different fabric (9700/9800)
    - each one going to the same fabric (9600)
  - Load balancing uses hashing (see the sketch below)
    - IP traffic: IP & TCP/UDP (src & dst)
    - Non-IP traffic: MAC (src & dst)
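As a rough illustration of the load-balancing rule above, here is a minimal Python sketch of hash-based channel selection. The dict field names and the use of Python's built-in hash are illustrative stand-ins for the deterministic hardware hash, not the actual ASIC behavior.

```python
def select_fabric_channel(pkt: dict, num_channels: int = 2) -> int:
    """Pick an FBUS+ channel from flow fields (hypothetical dict keys)."""
    if "ip_src" in pkt and "ip_dst" in pkt:
        # IP traffic: hash source/destination IP plus TCP/UDP ports
        key = (pkt["ip_src"], pkt["ip_dst"],
               pkt.get("l4_src", 0), pkt.get("l4_dst", 0))
    else:
        # Non-IP traffic: hash source/destination MAC
        key = (pkt["mac_src"], pkt["mac_dst"])
    # Python's hash() is process-salted; real hardware uses a fixed hash
    return hash(key) % num_channels

flow = {"ip_src": "10.0.0.1", "ip_dst": "10.0.0.2", "l4_src": 1024, "l4_dst": 80}
# All packets of a flow land on the same channel, preserving per-flow order
# while different flows spread across both fabric interfaces.
assert select_fabric_channel(flow) == select_fabric_channel(flow)
```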
Parser
• Parses the first 128 bytes of each packet
  - Dedicated parsers for FBUS+ ports (FBUS+ header)
  - Full parsing only needed on the original ingress Ethernet ports
• The information is used for:
  - Subsequent engines: switching, routing & classification
  - sFlow
• Extra information available on the fabric interface
  - The FBUS+ header carries information extracted by the ingress parser, such as the packet type, the MC/UC bit, etc.
Security Engine
• DoS attack detection (packets are dropped based on the following conditions; a few are sketched in code below)
  - The conditions are individually programmable; however, the checks are turned off in the first release (most become individually controllable with 6.1.3)
  - SIP = DIP for IPv4/IPv6 packets
  - TCP packets with control flags = 0 and sequence number = 0
  - TCP packets with FIN, URG and PSH bits set, and sequence number = 0
  - TCP packets with SYN and FIN bits set
  - TCP source port = TCP destination port
  - First TCP fragment without the full TCP header (less than 20 bytes)
  - TCP header with a fragment offset value of 1
  - UDP source port = UDP destination port
  - ICMP ping packets with a payload larger than the programmed ICMP maximum size
  - Fragmented ICMP packets
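To make the flag conditions concrete, here is a short Python sketch of a few of the checks above. The packet is a plain dict with hypothetical field names, and real hardware evaluates all conditions in parallel rather than sequentially.

```python
# TCP flag bits (standard values)
TCP_FIN, TCP_SYN, TCP_PSH, TCP_URG = 0x01, 0x02, 0x08, 0x20

def is_dos_packet(pkt: dict) -> bool:
    """Return True if the packet trips one of the sketched checks."""
    if "ip_src" in pkt and pkt["ip_src"] == pkt.get("ip_dst"):
        return True                               # SIP == DIP
    if pkt.get("l4_src") is not None and pkt["l4_src"] == pkt.get("l4_dst"):
        return True                               # TCP/UDP src port == dst port
    flags, seq = pkt.get("tcp_flags"), pkt.get("tcp_seq")
    if flags is not None:
        if flags == 0 and seq == 0:
            return True                           # null scan
        xmas = TCP_FIN | TCP_URG | TCP_PSH
        if flags & xmas == xmas and seq == 0:
            return True                           # Xmas scan
        if flags & (TCP_SYN | TCP_FIN) == (TCP_SYN | TCP_FIN):
            return True                           # illegal SYN+FIN combination
    return False

print(is_dos_packet({"ip_src": "1.2.3.4", "ip_dst": "1.2.3.4"}))        # True
print(is_dos_packet({"tcp_flags": TCP_SYN | TCP_FIN, "tcp_seq": 99}))   # True
```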
Switching Engine
• Provides the switching information
  - VLAN type select
  - VLAN look-up
  - L2 unicast look-up (VLAN+MAC; see the sketch below)
  - L2 multicast look-up (non-IP multicast)
• Standard Fwd Engine – key figures
  - 4K VLANs
  - 256 Spanning Trees
  - 16K MACs
  - 1K mobile/authenticated MACs
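A minimal sketch of the L2 unicast look-up keyed on (VLAN, MAC). A Python dictionary stands in for the on-chip hash table; the 16K limit comes from the figures above, everything else is illustrative.

```python
class L2Table:
    """Sketch of the 16K-entry L2 table; a dict replaces the on-chip hash."""
    MAX_ENTRIES = 16 * 1024

    def __init__(self):
        self._entries = {}                    # (vlan, mac) -> (module, port)

    def learn(self, vlan: int, mac: str, module: int, port: int) -> None:
        if len(self._entries) < self.MAX_ENTRIES:
            self._entries[(vlan, mac)] = (module, port)

    def lookup(self, vlan: int, mac: str):
        # A miss means the frame is flooded within the VLAN (None here)
        return self._entries.get((vlan, mac))

table = L2Table()
table.learn(10, "00:11:22:33:44:55", module=3, port=7)
print(table.lookup(10, "00:11:22:33:44:55"))  # (3, 7)
print(table.lookup(20, "00:11:22:33:44:55"))  # None: the VLAN is part of the key
```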
Routing Engine
• Provides the routing information (IPv4/IPv6)
  - L3 unicast look-up
  - L3 multicast look-up
  - LPM: Longest Prefix Match – wire rate from the first packet (see the sketch below)
  - Look-up switch logic
• Standard Fwd Engine – key figures
  - 12K LPM / 8K hosts
    - 1 entry per IPv4 entry (LPM/host)
    - 2 entries per IPv6 entry (LPM/host)
  - Max of 8K next hops
  - 4K interfaces
  - 128 tunnels
  - IPX handled in software
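The LPM rule can be shown in a few lines of Python: among all prefixes containing the destination, the longest one wins. The linear scan is purely didactic; the ASIC resolves the same decision in hardware at wire rate.

```python
import ipaddress

ROUTES = {                            # prefix -> next hop (illustrative)
    "10.0.0.0/8":  "next-hop-A",
    "10.1.0.0/16": "next-hop-B",
    "10.1.2.0/24": "next-hop-C",
}

def lpm_lookup(dst: str):
    """Return the next hop of the most specific matching prefix."""
    addr = ipaddress.ip_address(dst)
    best_len, best_nh = -1, None
    for prefix, nh in ROUTES.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and net.prefixlen > best_len:
            best_len, best_nh = net.prefixlen, nh
    return best_nh

print(lpm_lookup("10.1.2.3"))    # next-hop-C: the /24 beats the /16 and /8
print(lpm_lookup("10.9.9.9"))    # next-hop-A: only the /8 matches
```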
Classification Engine
• Provides support for
  - Access Control Lists (filtering)
  - QoS (prioritization / priority mapping)
  - Policers
  - Counters
  - Server Load Balancing / redirect
• Provides the classification according to user-defined rules
• Appropriate actions are applied
  - Drop packets
  - Change queuing (priority/destination)
  - Ingress policing (64 Kbps)
  - Egress shaping (64 Kbps)
• Standard Fwd Engine: 2K policies
Buffer Management
• Manages a total of 16K 128-byte buffers
• Tracks the number of buffers in use
  - Information used to control:
    - Queuing for incoming packets
    - Scheduling for outgoing packets
• Controls the maximum memory for:
  - Each port
  - Each of the 8 priorities
• Priority-aware buffer management complements priority queuing (see the sketch below)
• Congestion control
  - Monitoring of ingress buffers
  - Monitoring of egress buffers
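A hedged sketch of how per-port and per-priority caps over the shared pool of 16K 128-byte buffers could be enforced. The cap values here are invented for illustration; they are not the real thresholds, which AOS pre-configures.

```python
from collections import defaultdict

TOTAL_BUFFERS = 16 * 1024        # 16K 128-byte buffers (from the slide)
PORT_CAP = 2048                  # invented per-port cap, for illustration
PRIO_CAP = 512                   # invented per-priority cap, for illustration

used_total = 0
used_by_port = defaultdict(int)
used_by_prio = defaultdict(int)  # keyed by (port, priority)

def admit(port: int, priority: int, nbuf: int) -> bool:
    """Accept a packet needing nbuf buffers only if no limit is exceeded."""
    global used_total
    if (used_total + nbuf > TOTAL_BUFFERS
            or used_by_port[port] + nbuf > PORT_CAP
            or used_by_prio[(port, priority)] + nbuf > PRIO_CAP):
        return False             # drop on congestion
    used_total += nbuf
    used_by_port[port] += nbuf
    used_by_prio[(port, priority)] += nbuf
    return True

print(admit(port=1, priority=7, nbuf=12))   # True while under all caps
```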
Traffic Management
• Main features
  - Queuing
    - Supports 8 COS queues per egress port on the chip
    - AOS pre-configures thresholds to control the queue lengths
  - Scheduling options
    - Strict Priority
    - Weighted Round Robin
    - Deficit Round Robin
  - Shaping (see the token-bucket sketch below)
    - Per port (see classification for per-flow)
    - Controls the rate at which packets are sent
    - Example: 25 Mbps rate on a 100 (or 1000) Mbps port
• Buffer Management works with Traffic Management to provide COS
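As an illustration of per-port shaping, a small token-bucket sketch follows, configured for the slide's 25 Mbps example. The burst size and class layout are assumptions, not the chip's actual shaper parameters.

```python
import time

class PortShaper:
    """Token-bucket egress shaper (illustrative, not the AOS model)."""
    def __init__(self, rate_bps: int, burst_bytes: int):
        self.rate = rate_bps / 8.0               # token refill in bytes/second
        self.burst = float(burst_bytes)          # bucket depth
        self.tokens = float(burst_bytes)
        self.last = time.monotonic()

    def try_send(self, pkt_len: int) -> bool:
        now = time.monotonic()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= pkt_len:
            self.tokens -= pkt_len               # packet leaves immediately
            return True
        return False                             # stays queued until tokens refill

# 25 Mbps shaping on a gigabit port; 9216-byte burst lets one jumbo frame pass
shaper = PortShaper(rate_bps=25_000_000, burst_bytes=9216)
print(shaper.try_send(1500))                     # True: the bucket starts full
```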
Modification
• Packets are modified for:
  - VLAN tag insertion/removal
  - L3 routed packet modification
  - Tunneling
Agenda
• Introduction / OS9000 product overview
• Block architecture (CMM / Network Interfaces / Backplanes)
• Presentation of the Packet Processor
• Presentation of the Switching Fabric & resiliency mechanism
• Advanced network policies (QoS / ACLs)
Fabric Resiliency through Load Balancing
• Based on a non-blocking shared-memory switching fabric
  - OS9600 / OS9700 feature an 8-port FBUS+ (12 Gbps) fabric
  - OS9800 features a 16-port FBUS+ (12 Gbps) fabric
• Nominal throughput achieved with hash-based load balancing
  - IP traffic: IP SA + IP DA + TCP/UDP port numbers
  - Non-IP traffic: MAC SA + MAC DA
  - The load-sharing mechanism provides a fairly even distribution (similar to the mechanism used for link aggregation)
• Fabric resiliency
  - Hardware checks detect complete failures and board removals
  - Software checks detect unusual error rates on the FBUS+ links
  - Load balancing is reconfigured when needed
Failover (1)
• CMM removal
  - Each CMM includes two independent sub-modules:
    - Processing sub-module (aka CPM)
    - Switching Fabric module (aka CFM)
  - Presence pins indicate module removal
  - Automatic take-over by the remaining CPM (if not already primary)
  - NI card CPUs detect the removal through an interrupt
    - Each NI card switches to the active control plane switch (BBUS+)
    - Each NI card turns off fabric load sharing and sends all traffic to the remaining fabric card
  - Packets in flight and in fabric memory are lost
  - Packets to and from the remaining CMM are not affected
• Worst case data loss: a few milliseconds
Failover (2)
• Primary CPU failure / take-over
  - A SW crash or processor failure on the primary CMM triggers a failover to the standby CMM processor
  - The failing CPU reboots and becomes the standby CPU
  - NI/line card CPUs detect the failure through a backplane signal and switch to the new control plane switch
  - Fabric load sharing continues => no data plane interruption
• Fabric failure
  - Detected by monitoring error rates & data traffic on the FBUS+ links
  - Whenever a CFM failure is detected, that fabric card is disabled
  - Fabric load sharing is disabled; NIs use the remaining CFM only
  - The second fabric card continues to operate normally
Agenda
• Introduction / OS9000 product overview
• Block architecture (CMM / Network Interfaces / Backplanes)
• Presentation of the Packet Processor
• Presentation of the Switching Fabric & resiliency mechanism
• Advanced network policies (QoS / ACLs)
Fuji QoS
• QoS policies
  - Classification on L1/L2/L3/L4 (IPv6 support in a future release)
  - Enqueuing in one of the 8 COS queues
• Actions
  - Drop frames
  - Change queuing priority
  - Update TOS/DiffServ and/or 802.1p priority tags
    - 802.1p/TOS/DiffServ marking
    - 802.1p/TOS/DiffServ mapping
  - Per-COS max bandwidth (64 Kbps)
  - Statistics (# of packets, # of bytes)
  - Ingress policing / egress shaping
  - Multi-action support
OS-9000 HW Classification
• One Packet Processor per module
• Each Packet Processor features 16 integrated classifiers
  - Support for 128 rules per classifier (total of 2K rules)
  - Support for 16 L4 ranges (to limit consumption of entries)
  - Support for mask/key on any field (=> support of "don't care"; see the sketch below)
  - Capability of concatenating the results of 2 classifiers (IPv6)
• Field-1 selections
  - Ingress Port Bitmap
  - Src-port & Dest-port
  - TCP/UDP src & dst
  - Outer & inner VID
  - Ethertype & Outer VID
  - Ethertype & IP Protocol
  - Lookup Status
• Field-2 selections
  - IP-SA, IP-DA, IP-Prot, L4 src, L4 dst, DSCP, IP-Flag, TCP-Control, TTL
  - IP-SA, IP-DA, IP-Prot, L4 Range, L4 dst, DSCP, IP-Flag, TCP-Control, TTL
  - IP-SA, IP-DA, IP-Prot, L4 src, L4 Range, DSCP, IP-Flag, TCP-Control, TTL
  - IPv6-SA
  - IPv6-DA
  - IPv6-DA (64b prefix), NH, TC, FL, TTL, TCP Control
  - MAC-DA, MAC-SA, Ethertype, Outer VID
  - MAC-SA, IP-SA, Ethertype, Outer VID
  - MAC-DA, IP-DA, Ethertype, Outer VID
  - User Defined 1 (4 x 4 bytes – among the first 128 bytes)
  - User Defined 2 (4 x 4 bytes – among the first 128 bytes)
• Field-3 selections
  - IP-Info, Opcode, Format
  - Src-port
  - Dest-port
  - Lookup Status
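The mask/key ("don't care") matching can be illustrated with a tiny TCAM-style rule class: each rule keeps a value and a mask, and only the unmasked bits must match. The key layout and rule set below are hypothetical, not the chip's actual field encoding.

```python
class Rule:
    """TCAM-style entry: only bits set in mask must match value."""
    def __init__(self, value: int, mask: int, action: str):
        self.value, self.mask, self.action = value, mask, action

    def matches(self, key: int) -> bool:
        return (key & self.mask) == (self.value & self.mask)

# Hypothetical key layout: [ ip_proto (8 bits) | l4_dst (16 bits) ]
TCP = 6
rules = [
    Rule(value=(TCP << 16) | 80, mask=0xFFFFFF, action="queue-7"),  # TCP to port 80
    Rule(value=0, mask=0, action="default"),    # mask 0 = all bits "don't care"
]

def classify(key: int) -> str:
    for rule in rules:           # first match wins => explicit rule precedence
        if rule.matches(key):
            return rule.action
    return "drop"

print(classify((TCP << 16) | 80))   # -> queue-7
print(classify((17 << 16) | 53))    # UDP/53 -> default
```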
Content Aware Processor Allocation
• A new content-aware processor is allocated (based on classification):
  - When running out of entries (128 entries per content-aware processor)
  - When needing a different lookup type (L2 vs L3)
  - When enforcing a precedence (to avoid implicit precedence)
De-queuing
• Choice between 3 algorithms
  - Strict Priority
    - Starting with the highest priority first, queues are serviced until empty
  - Weighted Round Robin (packet based)
    - Each queue indicates how many packets are to be serviced per interval
    - Weight configurable 0-15
    - A value of 0 indicates the queue is to be treated as Strict Priority
  - Deficit Round Robin (bandwidth based; see the sketch below)
    - Each queue indicates how many chunks (10 KB) are to be serviced per time interval
    - Weight configurable 0-15
    - A value of 0 indicates the queue is to be treated as Strict Priority
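Below is a compact sketch of the Deficit Round Robin option, including the weight-0 = Strict Priority convention from the slide. The 10 KB chunk size is taken from the slide; the rest of the structure is illustrative.

```python
from collections import deque

CHUNK = 10_000                                   # bytes of credit per weight unit (10 KB)

def drr_round(queues, weights, deficits):
    """One scheduling round. queues: list of deques of packet lengths."""
    sent = []
    # Weight 0 => strict priority: service those queues to empty first.
    for i in sorted(range(len(queues)), key=lambda i: weights[i] != 0):
        if weights[i] == 0:
            while queues[i]:
                sent.append((i, queues[i].popleft()))
            continue
        deficits[i] += weights[i] * CHUNK        # add this round's credit
        while queues[i] and queues[i][0] <= deficits[i]:
            deficits[i] -= queues[i][0]          # spend credit per packet sent
            sent.append((i, queues[i].popleft()))
        if not queues[i]:
            deficits[i] = 0                      # idle queues keep no credit
    return sent

qs = [deque([1500, 1500]), deque([9000])]
print(drr_round(qs, weights=[2, 1], deficits=[0, 0]))
```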
Key Tables in the Fwd Engine (1)
• Port tables: provide port-specific settings
  - Specify per-port options, including default VLAN, mirroring, learning mode (CPU managed or not), default port priority, subnet-based VLANs, MAC-based VLANs, VLAN precedence, etc.
• VLAN tables: obtain the VLAN ID for untagged frames
  - Supports MAC-based VLANs (1K entries)
  - Supports IP subnet-based VLANs (256 entries)
  - Supports protocol-based VLANs (16 entries)
• L2 MAC table: used for both SA & DA lookups (16K entries)
  - MAC SA – verifies the MAC/VLAN combination and triggers learning
  - MAC DA – finds the destination module (chip number) and port number
Key Tables in the Fwd Engine (2)
• Pre-loaded LPM table enables handling the 1st packet in HW
• L3 tables: routing decisions (IPv4 / IPv6)
  - LPM table (12K entries)
    - Match on destination network (remote / local)
    - If local destination, then a host table look-up is required
    - Match on variable-length network/subnet
    - Match on the entire destination address
  - Host table (8K entries)
    - Local destination – exact match
  - Interface table (4K entries)
• An IP DA look-up results in a pointer to either
  - a single next hop, or
  - multiple next hops (ECMP; see the sketch below)
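When the look-up returns multiple next hops (ECMP), a per-flow hash keeps packet order while spreading load, much like the fabric load sharing described earlier. This sketch uses hypothetical names and Python's built-in hash for brevity; the real table layout differs.

```python
ECMP_GROUP = ["nh-1", "nh-2", "nh-3", "nh-4"]   # next-hop group from the LPM result

def pick_next_hop(ip_src: str, ip_dst: str, l4_src: int, l4_dst: int) -> str:
    """Hash the flow into the group so one flow always takes one path."""
    idx = hash((ip_src, ip_dst, l4_src, l4_dst)) % len(ECMP_GROUP)
    return ECMP_GROUP[idx]

print(pick_next_hop("10.1.2.3", "192.168.0.9", 1024, 443))
```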
NI processor: OS7 versus OS9
• OS7000: UltraSPARC II
  - 171.6 MIPS
  - PC100 SDRAM: 0.8 GBps transfer rate
  - System bus speed: 143 MHz
  - ICache: 8 KB
  - DCache: 8 KB
  - L2 cache: N/A
• OS9000: Freescale MPC8540
  - 1926 MIPS
  - PC166 DDR2 SDRAM: 2.65 GBps transfer rate
  - System bus speed: 833 MHz
  - ICache: 32 KB
  - DCache: 32 KB
  - L2 cache: 256 KB
Physical Layer Interfaces supported
• FBUS+/XAUI – 12 Gbps / 10 Gbps
• GbE SerDes – 1 Gbps
• SGMII – 10/100/1000 Mbps
• PCI – 66 MHz