Explore the evolution of high performance networking, from Ultranet and HIPPI to Infiniband and 10 Gigabit Ethernet. Learn about physical layers, protocols, and network management. Discover the development of Infiniband and its specifications. Gain insights into the handling of data and the future of high performance networking.
TODAY
• 1 High Performance Networking as a sign of its time: a historical overview of hardware and protocols
• 2 Yesterday's high performance networks: Ultranet, HIPPI, Fibre Channel, Myrinet, Gigabit Ethernet
• 3 GSN (the first 10 Gbit/s network, and secure): Physical Layer, Error Correction, ST Protocol, SCSI-ST
• 4 Infiniband (the imitating 2.5 – 30 Gbit/s interconnect): Physical Layer, Protocols, Network Management
• 5 SONET and some facts about DWDM, 10 Gigabit Ethernet: Physical Layers, Coupling to the WAN
Arie van Praag, CERN IT/ADC, 1211 Geneva 23, Switzerland. E-mail: a.van.praag@cern.ch
Infiniband History
What happened in the background: is it almost a joke?
Summer 1996, in order of time:
• 1 A meeting near Chicago to define HIPPI-6400 policies.
• 2 Participants: all major computer manufacturers and some potential users.
• 3 Most participants receive the progress made on 10 Gbit/s HIPPI-6400 very well.
• 4 One major player states carefully that he has no interface available for his products and will go his own way.
• 5 He copies the HIPPI-6400 PHY, doubles the size of a micropacket and makes some more minor modifications.
• 6 He takes up the ST protocol, makes some modifications and calls it RDMA.
• 7 Future I/O is born.
Also in Summer 1996:
• DIGITAL (now COMPAQ, now HP) participates in every new technology, including Future I/O, and already has its own Memory Channel.
• SUN is very advanced with Fibre Channel as a system interconnect.
• INTEL advances NGIO, a high-speed serial back-plane bus.
• IBM is always willing to send some people to see what happens.
INFINIBAND: The Standards MIXER
[Figure: F.C., NGIO, PCI, GSN, FIO, RIO and other inputs, with Sun, Intel, HP, Compaq and IBM, are all mixed into INFINIBAND.]
Infiniband Development
• Development of the Infiniband specifications started in 1998.
• Objectives: a flexible high performance interconnect with network capabilities.
• Physical connection: full duplex; copper cable or fiber optics.
• Bandwidth: bandwidth on demand, defined as 1X – 2X – 4X – 12X.
• Flexible interface possibilities, from the internal bus and back-plane to the network connection. Has this been seen before? Remember ATM.
• Networking by means of bandwidth-adaptable crossbar switches.
• Flexible network management, with or without programmable nodes.
The INFINIBAND community has a very large number of members, from computer manufacturers and software houses to silicon foundries, but it did not connect to any standards organization; hence it is an industry norm.
Infiniband Roadmaps 2000
[Figure: roadmap chart.]
INFINIBAND PHY
Physical specifications:
• Bandwidth: basic 2.5 Gbit/s. Payload: ???
• Multiplied bandwidth: basic, 4X, 12X.
• 3 different standard speeds: 2.5 Gbit/s, 10 Gbit/s, 30 Gbit/s.
• Line coding: 8B/10B.
• Copper connections by 1, 4 or 12 differential pairs (100 Ohm, PECL levels).
• Fiber connections: 1 pair on ST connector, or 4 or 12 parallel fiber cable.
• Distance covered: copper 17 m; fiber 100 m or 10 km.
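The slide leaves the payload rate open. As a rough sketch, assuming the quoted 2.5, 10 and 30 Gbit/s figures are line signalling rates and that 8B/10B coding is the only overhead considered (packet headers and CRCs are ignored), the usable data rate per link width could be estimated as follows. The function and constant names are illustrative only.

```python
# Minimal sketch: usable data rate after 8B/10B line coding.
# Assumption: the slide's speeds are signalling rates on the wire.

BASE_LINE_RATE_GBPS = 2.5  # basic (1X) rate from the slide


def payload_rate_gbps(line_rate_gbps: float,
                      data_bits: int = 8,
                      code_bits: int = 10) -> float:
    """Return the data rate left once each 8 data bits cost 10 line bits."""
    return line_rate_gbps * data_bits / code_bits


if __name__ == "__main__":
    for lanes in (1, 4, 12):
        line_rate = lanes * BASE_LINE_RATE_GBPS
        print(f"{lanes:2d}X: line {line_rate:5.1f} Gbit/s, "
              f"data {payload_rate_gbps(line_rate):5.1f} Gbit/s")
```

Under these assumptions one fifth of every link speed goes to the line code before any protocol overhead is counted.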
Infiniband Electrical & Optical Specification
Electrical:
• Differential line termination: 100 ohm (against common mode)
• Common mode level: 1 V max., 0.5 V min.
• Voltage levels (peak-peak differential): 1.6 V max., 1 V min.
• Rise and fall time: 100 ps
• Remember: the HIPPI-800 PHY uses ECL. This is PECL.
Optical, short range:
• IB-1X-SX: wavelength 850 nm, connector Dual LC
• IB-4X-SX: wavelength 850 nm, connector Single MPO
• IB-12X-SX: wavelength 850 nm, connector Dual MPO
• Short range distance: 2 m – 75 m with 62.5/125 fiber, 2 m – 125 m with 50/125 fiber
Optical, long range:
• IB-1X-LX: wavelength 1300 nm, connector Dual LC
• Long range distance: 2 m – 10 km with single mode fiber
• Remember: the HIPPI-6400 PHY (GSN) pioneered 12X parallel fiber.
Multichannel Data Handling
1X, 4X, 12X, or 2.5 Gbit/s, 10 Gbit/s and 30 Gbit/s: how is the data handled?
INFINIBAND calls it "STRIPING". In reality it is interleaved transmission.
[Figure: the bytes of a packet (Byte 0 to Byte 12) distributed over the lanes of a 1X, 4X and 12X link.]
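The interleaving can be illustrated with a small sketch. It assumes the simplest possible scheme, in which byte i of the stream goes to lane i modulo the number of lanes; the actual lane ordering, padding and control symbols of the Infiniband PHY are not reproduced here, and the function names are hypothetical.

```python
# Minimal sketch of byte interleaving ("striping") over parallel lanes.
# Assumption: plain round-robin, byte i travels on lane i % lanes.


def stripe(data: bytes, lanes: int) -> list:
    """Split a byte stream into one byte string per lane."""
    if lanes < 1:
        raise ValueError("need at least one lane")
    return [data[lane::lanes] for lane in range(lanes)]


def unstripe(lane_data: list) -> bytes:
    """Reassemble the original stream from the per-lane byte strings."""
    lanes = len(lane_data)
    out = bytearray(sum(len(chunk) for chunk in lane_data))
    for lane, chunk in enumerate(lane_data):
        out[lane::lanes] = chunk  # put every lanes-th byte back in place
    return bytes(out)


if __name__ == "__main__":
    packet = bytes(range(13))  # Byte 0 .. Byte 12, as in the figure
    for width in (1, 4, 12):
        print(width, [list(chunk) for chunk in stripe(packet, width)])
```

With a 4X link, lane 0 carries bytes 0, 4, 8 and 12, lane 1 carries bytes 1, 5 and 9, and so on, which is why "interleaved transmission" describes it better than "striping".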
Infiniband Basic Data Structures
Message data is split into IB packets. Four packet layouts exist (header and payload sizes in bytes, CRC sizes in bits):
• Local packets (within the subnet): Local Routing Header 8 | Transport Header 12 | Packet Payload 0 – 4096 | Invariant CRC 32 | Variant CRC 16
• Global packets (between the subnets): Local Routing Header 8 | Global Routing Header 40 | Transport Header 12 | Packet Payload 0 – 4096 | Invariant CRC 32 | Variant CRC 16
• Raw packets with RAW header: Local Routing Header 8 | Raw Header 3 | Other Transport Header 12 | Packet Payload 0 – 4096 | Variant CRC 16
• Raw packets with IPv6 header: Local Routing Header 8 | IPv6 Routing Header 40 | Other Transport Header 12 | Packet Payload 0 – 4096 | Variant CRC 16
IB Full Data Structure 1
Packet layout (header and payload sizes in bytes, CRC sizes in bits): Local Routing Header (LRH) 8 | Global Routing Header (GRH) 40 | Base Transport Header (BTH) 12 | Extended Transport Header(s) ((X)ETH) 4 – 44 | Immediate Buffer 4 | Packet Payload (IBPP) 0 – 4096 /4 | Invariant CRC (ICRC) 32 | Variant CRC (VCRC) 16
Local Routing Header fields (width in bits):
• VL = Virtual Lane: 4
• SL = Service Level: 4
• Reserved: 2
• LNH = Link Next Header(s): 2
• DLID = Destination Local ID: 16
• Reserved: 5
• PktLen = Packet Length (32-bit words): 11
• SLID = Source Local ID: 16
Global Routing Header fields (width in bits):
• IPVer = IP Version: 4
• Flow label = Flow Label (for packets with special handling): 20
• PayLen = Payload Length (BTH to VCRC): 16
• NxtHdr = Next Header: 8
• HopLmt = Hop Limit (limits the number of subnet hops): 8
• SGID = Source Global ID: 128
• DGID = Destination Global ID: 128
Note that the Global Header is laid out such that either IPv4 or IPv6 will fit.
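To make the bit layout of the Local Routing Header concrete, here is a minimal parsing sketch. The field list on the slide adds up to 60 bits; the sketch assumes that the remaining 4 bits of the 8-byte header are a link version field (called `lver` here) sitting next to VL, and that fields are packed most significant bit first in the order listed. The class and function names are illustrative, not taken from any library.

```python
# Minimal sketch: pack and unpack the 8-byte Local Routing Header (LRH).
# Assumptions: big-endian bit order, fields in the slide's order, plus a
# 4-bit link version field ("lver") filling the bits the slide omits.
import struct
from typing import NamedTuple


class LRH(NamedTuple):
    vl: int       # Virtual Lane, 4 bits
    lver: int     # link version, 4 bits (assumed, not on the slide)
    sl: int       # Service Level, 4 bits
    lnh: int      # Link Next Header, 2 bits
    dlid: int     # Destination Local ID, 16 bits
    pkt_len: int  # Packet Length in 32-bit words, 11 bits
    slid: int     # Source Local ID, 16 bits


def parse_lrh(raw: bytes) -> LRH:
    if len(raw) != 8:
        raise ValueError("LRH is exactly 8 bytes")
    b0, b1, dlid, w2, slid = struct.unpack(">BBHHH", raw)
    return LRH(vl=b0 >> 4, lver=b0 & 0xF,
               sl=b1 >> 4, lnh=b1 & 0x3,        # 2 reserved bits skipped
               dlid=dlid,
               pkt_len=w2 & 0x7FF,               # 5 reserved bits skipped
               slid=slid)


def build_lrh(h: LRH) -> bytes:
    return struct.pack(">BBHHH",
                       ((h.vl & 0xF) << 4) | (h.lver & 0xF),
                       ((h.sl & 0xF) << 4) | (h.lnh & 0x3),
                       h.dlid, h.pkt_len & 0x7FF, h.slid)


if __name__ == "__main__":
    header = LRH(vl=1, lver=0, sl=3, lnh=2, dlid=0x0001, pkt_len=16, slid=0x0002)
    wire = build_lrh(header)
    print(wire.hex(), parse_lrh(wire))
```

Reserved bits are written as zero and ignored on input, so a build followed by a parse returns the same field values.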
IB Full Data Structure 2
Packet layout (header and payload sizes in bytes, CRC sizes in bits): Local Routing Header (LRH) 8 | Global Routing Header (GRH) 40 | Base Transport Header (BTH) 12 | Extended Transport Header(s) ((X)ETH) 4 – 44 | Immediate Buffer 4 | Packet Payload (IBPP) 0 – 4096 /4 | Invariant CRC (ICRC) 32 | Variant CRC (VCRC) 16
Base Transport Header fields (width in bits):
• Opcode: 8
• SE = Solicited Event: 1
• M = MigReq: 1
• PadCnt = Pad Count: 2
• TVer = Transport Header Type: 4
• P_KEY = Partition Key: 16
• Reserved (variant CRC): 8
• DESTQP = Destination Queue Pair No: 24
• A = Acknowledge Request: 1
• Reserved (invariant CRC): 7
• PSN = Packet Sequence Number: 24
Extended Transport Headers (size in bytes):
• RDETH = Reliable Datagram Extended Transport Header: 8
• DETH = Datagram Extended Transport Header: 16
• RETH = RDMA Extended Transport Header: 16
• AETH = ACK Extended Transport Header: 4
• AtomicETH = Atomic Extended Transport Header: 28
• AtomicACKETH = AtomicACK Extended Transport Header: 8
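The Base Transport Header fields listed above add up to 96 bits, that is the full 12 bytes. A minimal decoding sketch follows, assuming the fields are packed most significant bit first in exactly the order given on the slide; the function name and the dictionary keys are illustrative choices.

```python
# Minimal sketch: decode the 12-byte Base Transport Header (BTH).
# Assumption: big-endian packing in the field order shown on the slide.
import struct


def parse_bth(raw: bytes) -> dict:
    if len(raw) != 12:
        raise ValueError("BTH is exactly 12 bytes")
    opcode, flags, p_key = struct.unpack(">BBH", raw[:4])
    return {
        "opcode": opcode,                          # 8 bits
        "se": flags >> 7,                          # Solicited Event, 1 bit
        "m": (flags >> 6) & 0x1,                   # MigReq, 1 bit
        "pad_cnt": (flags >> 4) & 0x3,             # Pad Count, 2 bits
        "tver": flags & 0xF,                       # 4 bits
        "p_key": p_key,                            # Partition Key, 16 bits
        # raw[4] is the 8 reserved bits
        "dest_qp": int.from_bytes(raw[5:8], "big"),   # 24 bits
        "ack_req": raw[8] >> 7,                    # 1 bit, then 7 reserved
        "psn": int.from_bytes(raw[9:12], "big"),   # 24 bits
    }


if __name__ == "__main__":
    sample = bytes([0x04, 0x90, 0xFF, 0xFF, 0x00,
                    0x00, 0x00, 0x2A, 0x80, 0x00, 0x00, 0x07])
    print(parse_bth(sample))
```

The sample bytes are made up for the demonstration and do not come from a captured packet.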
Infiniband Full Data Structure 3
Packet layout (header and payload sizes in bytes, CRC sizes in bits): Local Routing Header (LRH) 8 | Global Routing Header (GRH) 40 | Base Transport Header (BTH) 12 | Extended Transport Header(s) ((X)ETH) 4 – 44 | Immediate Buffer 4 | Packet Payload (IBPP) 0 – 4096 /4 | Invariant CRC (ICRC) 32 | Variant CRC (VCRC) 16
• The Invariant Cyclic Redundancy Code (ICRC) covers all parts of the packet that do not change during transfer, whatever the number of hubs, switches or routers the packet passes. This CRC field is present in all packets except RAW packets, as their fields are not always known. The polynomial used is the same as specified for Ethernet: CRC32 0x04C11DB7, initial value 0xFFFFFFFF.
• The Variant Cyclic Redundancy Code (VCRC) covers all parts of the packet from the first byte of the LRH to the last byte before the VCRC. As a number of the fields change depending on the number of hubs, switches or routers passed during transfer, the VCRC must be regenerated at every passage through such a device. The polynomial used is the same as specified for the HIPPI-6400 PHY: 0x100B, initial value 0xFFFF.
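Both checks can be illustrated with one generic bitwise CRC routine parameterised by the polynomials and initial values quoted above. This is only a sketch: it processes bits most significant first and applies no final inversion, whereas the real specification also defines bit ordering, which header fields are masked for the ICRC, and the final complement, none of which are reproduced here. The function name is hypothetical.

```python
# Minimal sketch: generic MSB-first CRC with the slide's parameters.
# Assumptions: no bit reflection and no final XOR; field masking for the
# invariant CRC is not modelled.

ICRC_POLY, ICRC_INIT, ICRC_WIDTH = 0x04C11DB7, 0xFFFFFFFF, 32
VCRC_POLY, VCRC_INIT, VCRC_WIDTH = 0x100B, 0xFFFF, 16


def crc_msb_first(data: bytes, poly: int, width: int, init: int) -> int:
    """Bit-by-bit CRC over data, most significant bit of each byte first."""
    mask = (1 << width) - 1
    top_bit = 1 << (width - 1)
    crc = init & mask
    for byte in data:
        crc ^= byte << (width - 8)      # feed the next byte into the top
        for _ in range(8):
            if crc & top_bit:
                crc = ((crc << 1) ^ poly) & mask
            else:
                crc = (crc << 1) & mask
    return crc


if __name__ == "__main__":
    packet = bytes(range(32))           # made-up packet contents
    print(hex(crc_msb_first(packet, ICRC_POLY, ICRC_WIDTH, ICRC_INIT)))
    print(hex(crc_msb_first(packet, VCRC_POLY, VCRC_WIDTH, VCRC_INIT)))
```

A switch that rewrites a variant field would rerun only the 16-bit calculation over the modified packet, while the 32-bit value travels unchanged end to end.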
Infiniband Data Structure Examples
• Simple Packet (send): LRH BTH IBPP ICRC VCRC
• Simple Packet with Global Route: LRH GRH BTH IBPP ICRC VCRC
• Acknowledge Packet: LRH BTH AETH ICRC VCRC
• RDMA Request Packet: LRH BTH RETH ICRC VCRC
• RDMA Response Packet: LRH BTH AETH IBPP ICRC VCRC
• RDMA Write Request: LRH BTH RETH IBPP ICRC VCRC
• Datagram Packet: LRH BTH DETH IBPP ICRC VCRC
• Reliable Datagram Packet: LRH BTH RDETH DETH IBPP ICRC VCRC
• Atomic (CmpSwap) Packet: LRH BTH AtomicETH ICRC VCRC
• Atomic Acknowledge Packet: LRH BTH AtomicACK ICRC VCRC
• Raw Packet: LRH RWH IBPP VCRC
• Raw Packet with IPv6 Route Header: LRH IPv6 IBPP VCRC
That is up to 156 bytes of header data for 4096 bytes of message data per packet, or up to 4 % packing overhead.
Like most modern networks, INFINIBAND has a number of parameters that are set by "OPCODES". In this case they are transmitted with either administrative packets or with the Atomic class of packets. Only the RDMA class has a large number of specific link parameters in its header.
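The overhead figure can be checked with a few lines of arithmetic. The sketch below takes the 156-byte worst-case header total and the 4096-byte payload from the slide as given and expresses the overhead relative to the payload; the function name and the smaller payload sizes in the loop are illustrative only.

```python
# Minimal sketch: packing overhead of header bytes relative to payload.
# The 156-byte and 4096-byte figures are taken from the slide as given.

MAX_HEADER_BYTES = 156
MAX_PAYLOAD_BYTES = 4096


def overhead_percent(header_bytes: int, payload_bytes: int) -> float:
    """Header bytes as a percentage of the payload bytes they accompany."""
    if payload_bytes <= 0:
        raise ValueError("payload must be non-empty")
    return 100.0 * header_bytes / payload_bytes


if __name__ == "__main__":
    for payload in (256, 1024, MAX_PAYLOAD_BYTES):
        print(f"{payload:5d} byte payload: "
              f"{overhead_percent(MAX_HEADER_BYTES, payload):5.1f} % overhead")
```

For a full 4096-byte payload this gives about 3.8 %, consistent with the "up to 4 %" on the slide; shorter payloads with the same headers cost proportionally more.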
IB Protocol Support
Infiniband has its own data format, which is close to IPv6, and its own error checking. Other protocols are encapsulated.
• IPv4 is supported in the RAW DATA format. TCP is not supported; it is replaced by Infiniband's own error detection (ICRC, VCRC).
• IPv6 is supported in the RAW DATA format. TCP is not supported; it is replaced by Infiniband's own error detection (ICRC, VCRC).
• SCSI is mentioned as an encapsulation; a standard is being worked out by ANSI T10 and should be nearly equivalent to SST and iSCSI. The preferred transfer mode is RDMA.
• FTP is mentioned but not defined.
• UDP is not mentioned but can be encapsulated; again the most appropriate is the RAW data format.
INFINIBAND Examples
[Figure labels: Host(s), HCA, SW, IBA-IP Router, WANs, IBA-LAN Switch/NIC, LANs, Legacy SANs, SAN Storage.]
[Figure: Infiniband implementation in a processor or server environment. Labels: CPU, Mem, HCA, SX, PCI - IBA, PCI I/O adapters, Ext. IBA interface(s), TCA, Native IBA I/O adapters.]
In IB every node can be a network administrator node, so a real problem may already occur at this level due to the complex relations.
From: IB I/O Infrastructure
Some IB Network Properties
• IB presents itself as a secure network, but uses error detection and retransmits: HENCE HIGH LATENCY.
• Every node or interface in a subnet or global net can be network manager, with continued master-slave functions: HENCE COMPLEX MANAGEMENT.
• Many data formats, resulting in a large number of variable header fields: HENCE SOFTWARE LATENCY.
• RDMA brings fast memory-to-memory transfers, but putting all the DMA data in the headers enlarges the packets unnecessarily: HENCE INCREASED TRANSFER LATENCY.
• Every transfer must be acknowledged, including over the 10 km fiber connection: HENCE DISTANCE LATENCY.
IB Cook Book
• 1 Take some IB processors
• 2 Take a crate with an IB back-panel
• 3 Put the processors into the crate
• 4 Add IB X4 interfaces
• 5 Add Gigabit Ethernet
• 6 Add a Fibre Channel storage connection
• 7 Put the crates in a rack
• You have a very powerful unit with up to 10 Gbit/s communication bandwidth.
[Figure labels: Gigabit Ethernet; Fibre Channel (to storage); Total 96 Processors.]
Some more IB products
• PCI 64/66 interfaces to 8 x IB X1 at 2.5 Gbit/s
• IB switch prototype with 8 x IB X1 and 1 x IB X4
• One of a number of silicon chips, with its architecture
The Future of Infiniband
Infiniband has even more options and more protocols than FC. With over 2000 pages it is one of the most complicated standards known. Where will it be successful?
• Remember Fibre Channel: too many options, too many protocols, not very profitable using TCP/IP. As it was adopted by disk manufacturers, it crystallized out on the storage market. INFINIBAND is not directly storage oriented with a disk interface.
• The interface standards have advanced as well: PCI-X 2 = 2.1 GByte/s, 4.2 GByte/s; HyperTransport = 12.8 GByte/s; PCI-Express = 2.5 GByte/s. I/O and processor back-plane developments will stay with popular PCI interfaces and processor-oriented interconnects that are fast enough to drive 10 Gbit/s bandwidth.
• Networks are TCP/IP oriented. Read: Ethernet variants.
• Intel and Microsoft pulled out of the INFINIBAND Trade Association.
Hence INFINIBAND will be a niche market product. Where? The high-performance BLADE SERVER market.
Standards & Popularity (made in 1995 and extended in 2000)
[Figure: popularity plotted over the years 1985 to 2005 for Ethernet, 100Base-T, Fibre Channel, ATM, HIPPI, HIPPI-Serial, Gigabit Ethernet, GSN, Infiniband and PCI / PCI-X / PCI-X2.]
References:
• InfiniBand(TM) Architecture Specification, Vol. 1, Vol. 2, Release 1a, June 19, 2001, InfiniBand Trade Association. http://www.infinibandta.org/specs
• Realizing the full potential of server, switch & I/O blades with InfiniBand Architecture, Document WP120801100, December 2001, Mellanox Technology. http://www.mellanox.com/products/shared/BladesArchWP110.pdf
• High Performance Networking, at CERN and Elsewhere, Arie van Praag, CERN, IT/PDP, CERN-IT-2002-002, 15 June 2001.
• InfiniBand Trade Association, the organization responsible for the standard. http://www.infinibandta.org/home
END Part 4
Coming next: 5 10 Gigabit Ethernet (the name everybody knows): Physical Layers, SONET protocol and problems, Coupling to the WAN