Longbow InfiniBand Extension
Dr. David T Southwell, President & CEO
Agenda
• The fundamentals of InfiniBand flow control
• InfiniBand range limitations – two mechanisms
• Longbow InfiniBand range extension technology
• Potential applications at CERN
Obsidian Research Corporation - CERN
InfiniBand flow control
• InfiniBand is credit based
• On initialisation, each fabric end point declares its capacity to receive data
• This capacity is described as its “buffer credit”
• As buffers are freed up, end points post messages updating their credit status
• InfiniBand is therefore lossless – data is never thrown away, since…
• …InfiniBand flow control happens before the transmission, not after it!
• Note that the buffer credit mechanism applies to every point-to-point link (it is not end-to-end)
• This mechanism is in contrast to Ethernet’s loss-based “flow control”:
• On network over-subscription, packets are simply thrown away
• Detected packet loss triggers retransmissions and adjustments to the injection rate
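The credit mechanism described above can be sketched as a toy model of a single point-to-point link. This is only an illustration of the idea, not InfiniBand's actual packet format or credit accounting; the class `CreditLink` and its methods are hypothetical names introduced here.

```python
from collections import deque


class CreditLink:
    """Toy model of one point-to-point link with credit-based flow control."""

    def __init__(self, receiver_buffers):
        # Credits the receiver declares at initialisation (its "buffer credit").
        self.credits = receiver_buffers
        self.rx_queue = deque()
        self.sent = 0
        self.stalled = 0

    def try_send(self, packet):
        # Flow control happens BEFORE transmission: with no credit the
        # sender waits, so the receiver never has to drop a packet.
        if self.credits == 0:
            self.stalled += 1
            return False
        self.credits -= 1
        self.rx_queue.append(packet)
        self.sent += 1
        return True

    def receiver_drain(self, n=1):
        # The receiver frees up to n buffers and returns one credit for each.
        freed = 0
        while self.rx_queue and freed < n:
            self.rx_queue.popleft()
            self.credits += 1
            freed += 1
        return freed


def demo():
    link = CreditLink(receiver_buffers=2)
    results = [link.try_send(p) for p in ("a", "b", "c")]  # third send has no credit
    link.receiver_drain(1)                                 # one buffer is freed
    results.append(link.try_send("c"))                     # the retry now succeeds
    return results, link.sent, link.stalled, len(link.rx_queue)
```

In `demo()` the third packet is held back rather than dropped, and goes through once the receiver has returned a credit. A loss-based scheme would instead transmit it, discard it at the full receiver and rely on a later retransmission.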
InfiniBand range limitations
• As commercialised today, InfiniBand addresses the cluster/supercomputer market…
• High equipment packing density (rack-to-rack connections are short)
• InfiniBand switches cascade easily (very low latency), so multi-hop is OK
• High port count switches (large ICs)
• NASA’s “Columbia” – 10,240 Itaniums (NUMALink+InfiniBand interconnect)
Mechanism (1) – physical layer
• These applications are served by standard InfiniBand cables:
• Balanced copper cables (twin-axial, shielded, with tight impedance control)
• Cheaper than optics, but with a range < 20 m (RF losses) at 2.5 Gbit/s (“SDR”)
• At DDR (5 Gbit/s) and QDR (10 Gbit/s) rates per channel, cables get even shorter
• There exists a parallel optic multi-mode fibre solution (simple E-O-E)
• More expensive (especially the parallel fibre bundles themselves)
• Self-limits at ~200 m
• A good solution for longer inter-rack runs or for links between floors
• MPO will see more use at DDR (today) and QDR (soon!) rates
Mechanism (2) – link layer
• Optimised for a short signal flight time; small buffers are used inside the ICs:
• Facilitates switch IC implementation, but limits effective range to ~300 m
Undersized buffers restrict the sustained data flow rate – in this case data is only moving in phases 1 and 5! The inefficiency is caused by an inability to keep the pipe full: the receive credits cannot be restored fast enough to avoid breaking up the burst. The longer the flight time, the lower the effective transfer rate. This limits the useful length of an InfiniBand link no matter what the physical transport is capable of. (N.B. this has no impact on copper InfiniBand links – receive buffers >> 2x wire data capacity.)
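The effect of flight time on throughput can be estimated with a simplified model: a sender limited to `buffer_bytes` of credit can deliver at most that much data per round trip. The sketch below assumes roughly 5 ns per metre of propagation delay in fibre and a 4x SDR link carrying 8 Gbit/s of data after 8b/10b encoding. The 4096-byte buffer in the usage note is a hypothetical figure chosen for illustration, not a value from any real switch IC, and the model ignores serialisation and credit-processing time.

```python
FIBRE_DELAY_S_PER_M = 5e-9   # assumption: ~5 ns per metre in glass fibre
LINE_RATE_BPS = 8e9          # assumption: 4x SDR data rate after 8b/10b encoding


def round_trip_s(length_m):
    """Time for data to reach the far end plus the credit update to return."""
    return 2 * length_m * FIBRE_DELAY_S_PER_M


def effective_rate_bps(length_m, buffer_bytes):
    """Sustained rate when at most buffer_bytes may be outstanding per round trip."""
    rtt = round_trip_s(length_m)
    if rtt == 0:
        return LINE_RATE_BPS
    # Either the line rate or the credit window is the bottleneck.
    return min(LINE_RATE_BPS, buffer_bytes * 8 / rtt)
```

For example, `effective_rate_bps(300, 4096)` stays at the full line rate under these assumptions, while `effective_rate_bps(3000, 4096)` falls to a fraction of it and keeps falling as the link gets longer, whatever the physical transport can carry.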
Longbow Technology
• Obsidian has developed a technology that performs InfiniBand encapsulation over 10GbE, Packet Over SONET/SDH and ATM WANs at 4x InfiniBand speeds: Longbow XR.
• Looks like a 2-port InfiniBand switch to the InfiniBand fabric
• Designed for 100,000 km+ ranges; prototypes publicly tested over 1,500 km and 8,500 km OC-192c networks (SC|04, OFC’05, SC|05)
• 950+ MBytes/s sustained performance in a single logical flow
• ~4% CPU load (Opteron 242s using RDMA transport)
• IPv6 Packet Over SONET & ATM modes
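To put the quoted figures in perspective, the arithmetic below compares 950 MBytes/s with the data rate of a 4x SDR link and estimates how much data is in flight over one round trip at the tested distances. It is a back-of-envelope sketch, not a description of Longbow's internals. It assumes about 5 ns per metre of fibre delay, takes 1 MByte as 10^6 bytes, and treats the quoted network length as the one-way fibre path, ignoring equipment latency.

```python
LANES = 4
SDR_SIGNAL_BPS = 2.5e9          # per-lane SDR signalling rate
ENCODING_EFFICIENCY = 8 / 10    # 8b/10b line code used at SDR rates
FIBRE_DELAY_S_PER_M = 5e-9      # assumption: ~5 ns per metre in glass fibre


def data_rate_bytes_per_s():
    """Usable data rate of a 4x SDR link in bytes per second."""
    return LANES * SDR_SIGNAL_BPS * ENCODING_EFFICIENCY / 8


def fraction_of_data_rate(sustained_bytes_per_s):
    """Share of the 4x SDR data rate that a sustained flow achieves."""
    return sustained_bytes_per_s / data_rate_bytes_per_s()


def in_flight_bytes(length_km, rate_bytes_per_s):
    """Data sent during one round trip, i.e. before any credit can return."""
    rtt = 2 * length_km * 1e3 * FIBRE_DELAY_S_PER_M
    return rate_bytes_per_s * rtt
```

Under these assumptions 950 MBytes/s is 95% of the 4x SDR data rate, and `in_flight_bytes(8500, 950e6)` comes to roughly 80 MBytes. That is the order of buffering a credit-based link would need at that distance to keep the pipe full.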
Longbow Transport
Longbow @ SC|05
The Obsidian Longbow XR
• Transparent to InfiniBand hardware, stacks and applications
• Very user-friendly long-haul wire-speed InfiniBand data pump
• Compatible with all InfiniBand equipment and stacks, including OpenFabrics
• High availability architecture – telecom grade equipment
• A managed device (HTTP GUI, SSH CLI, SNMP) – 10/100 Ethernet/serial console
• Also encapsulates two Gigabit Ethernet channels along with the 4x SDR InfiniBand channel
Potential Application… ATLAS
In collaboration with Dr. Bryan Caron (University of Alberta, Canada), Bill St. Arnaud (Canarie Inc. - Canada’s high performance research network) and others, Obsidian will soon launch a multi-stage Long Haul InfiniBand project which will demonstrate reliable 10 Gbit/s transfer of bulk data back and forth across the Atlantic.
CERN would be the preferred end point for such a demonstration. Canarie has confirmed that the entire lightpath would be available for sustained streaming demonstrations.
Longbow Campus and Metro
• Obsidian also sees applications for the range extension technology over SONET/SDH networks for Metro Area Networks (up to 120 km), and for dark fibre campus applications (up to 10 km).
• Remote InfiniBand storage (replication, distributed SAN)
• Visualisation applications; tap directly and natively into distant clusters
• Aggregate remote InfiniBand clusters into larger compute resources
• Campus and Metro versions are currently in development. They will be optimised for latency and the more efficient use of smaller networks.
Conclusions
InfiniBand is becoming a critical element in high performance computing architectures.
With demonstrated, uncompromising long haul capability, InfiniBand and Longbow technology may represent an excellent long term platform for globally distributing the relentless data streams LHC will emit during its lifetime.
InfiniBand, global optical network transports and Longbow technologies will scale in performance over time to continue to offer a compelling system-level solution that will present a stable interface to the applications software.
Thank you for your attention.
http://www.obsidianresearch.com
(P.S. Thanks for Web too!)