Mapping of Scalable RDMA Protocols to ASIC/FPGA Platforms
Yosef Gavriel Tirat-Gefen, PhD, Senior Member IEEE
Chief Scientist, Castel Systems Inc. & Dept. of Physics and Astronomy, George Mason University, Fairfax, VA
yosefgavriel@computer.org
Presentation Overview
• Motivation
• TCP off-loading
• Zero-copying
• RDMA protocol
• RDMA protocol stack
• Structure of an RDMA card
• Results
• Conclusion
Motivation
[Diagram: supercomputers / server farms, terabyte storage, and workstations interconnected over a WAN; enabling high-bandwidth WAN applications]
Applications
• Distributed command and control
• Signal processing (e.g. RADAR)
• Real-time sharing of intelligence data
• Distributed large-scale computation/simulation of aerospace problems
• Extension of storage area networks over a wide area network (WAN)
• Enabling technology for modern supercomputing installations
Traditional TCP/IP Networking
[Diagram: two hosts, each with a stack of Application/O.S., TCP, Layer 3 (IP), Layer 2 (MAC), and Layer 1 (PHY), communicating through a router that operates at Layers 1-3]
Standard Data Flow on TCP/IP
[Diagram: data moves from Application A's memory space into the TCP buffer/stack memory space, down through L3/L2/L1, across the WAN/LAN, and back up through the receiver's TCP buffer/stack into Application B's memory space]
Standard Data Flow on TCP/IP
• Traditional TCP/IP copies data from the application into a TCP memory buffer (and from the TCP buffer back to the application on receive; see the sketch below)
• The host CPU loses cycles to this buffer copying
• The CPU becomes overwhelmed at rates above 2.5 Gbps
• TCP/IP off-loading helps, but it does not solve the problem on the receiver side
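To make the buffer copy concrete, here is a minimal sketch in C (an illustration, not material from the original slides) of a conventional sockets receive loop: the NIC has already DMAed the packets into kernel memory, and each recv() call then copies the payload a second time into the application buffer, which is the per-byte CPU cost described above.

```c
/* Minimal sketch of a conventional sockets receive path (illustrative
 * only).  The payload already sits in the kernel's TCP buffers; recv()
 * copies it again into the application buffer in user space. */
#include <sys/types.h>
#include <sys/socket.h>

ssize_t drain_socket(int sock, char *app_buf, size_t app_len)
{
    size_t total = 0;
    while (total < app_len) {
        /* Kernel-to-user copy: this is the work the host CPU pays for
         * on every received byte. */
        ssize_t n = recv(sock, app_buf + total, app_len - total, 0);
        if (n <= 0)
            return n;           /* error (-1) or connection closed (0) */
        total += (size_t)n;
    }
    return (ssize_t)total;
}
```

A TOE moves the TCP protocol processing off the host, but as the next slides note, this final copy into the application buffer remains.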
TCP/IP Off-load Processing
[Diagram: the TCP layer is removed from the host Application/O.S. stack and mapped to hardware, a TCP/IP off-load engine (TOE) sitting above Layer 3 (IP), Layer 2 (MAC), and Layer 1 (PHY)]
Zero-Copying and TCP Off-load Processing
[Diagram: the host CPU, its cache, and the receive buffer in host main memory, connected to a TOE/NIC card whose TCP off-load processor and network buffer face the WAN/LAN]
Zero-Copying and TCP Off-load Processing
• Zero-copying is still not achieved: the receive buffer must still be copied back into the application's memory space
• TCP/IP off-loading is not scalable
• RDMA protocols provide a solution (see the sketch below)
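As an illustration of the RDMA model (a sketch under assumptions, not the authors' implementation), the fragment below uses the libibverbs API: the application buffer is registered, i.e. pinned, so that the NIC can DMA directly from it, and an RDMA WRITE places the data straight into the remote application's registered memory without involving the remote CPU. Queue-pair setup and the exchange of the peer's remote_addr/rkey are assumed to have been done out of band, e.g. over a TCP control connection.

```c
/* Sketch of a zero-copy transfer with libibverbs (illustrative only;
 * connection setup and rkey exchange are assumed to have happened
 * elsewhere). */
#include <stdint.h>
#include <infiniband/verbs.h>

int rdma_write_example(struct ibv_pd *pd, struct ibv_qp *qp,
                       void *local_buf, size_t len,
                       uint64_t remote_addr, uint32_t rkey)
{
    /* Register (pin) the application buffer so the NIC can DMA from it
     * directly; this is what removes the intermediate copy. */
    struct ibv_mr *mr = ibv_reg_mr(pd, local_buf, len,
                                   IBV_ACCESS_LOCAL_WRITE);
    if (!mr)
        return -1;

    struct ibv_sge sge = {
        .addr   = (uintptr_t)local_buf,
        .length = (uint32_t)len,
        .lkey   = mr->lkey,
    };

    /* RDMA WRITE: the local NIC sends the data and the remote NIC
     * places it in the peer's registered buffer; the remote CPU and
     * its TCP buffers are bypassed entirely. */
    struct ibv_send_wr wr = {
        .sg_list    = &sge,
        .num_sge    = 1,
        .opcode     = IBV_WR_RDMA_WRITE,
        .send_flags = IBV_SEND_SIGNALED,
    };
    wr.wr.rdma.remote_addr = remote_addr;
    wr.wr.rdma.rkey        = rkey;

    struct ibv_send_wr *bad_wr = NULL;
    return ibv_post_send(qp, &wr, &bad_wr);
    /* The MR must stay registered until the work request completes. */
}
```

In practice the registration is done once at startup, since pinning pages is expensive; it is shown inline here only to make the zero-copy requirement visible.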
RDMA Data Flow for WAN Applications
[Diagram: host CPU A and host CPU B, each with an application memory space in host memory, exchange data directly between application buffers through RDMA NIC cards across the WAN]
Scalable WAN-RDMA for Bandwidths Above 10 Gbps
[Diagram: a WAN RDMA NIC card with an RDMA engine, Tx and Rx buffers, a DMA channel to the host, and MAC/PHY stages driving 10 Gbps links into a WAN carrying more than 10 Gbps]
The RDMA Protocol Layers and Our Prototype
• Running on the host CPU: ULP (e.g. iSCSI, NFS)
• FPGA implementation: RDMA, DDP, MPA (see the framing sketch below), SCTP/TCP, Layer 3 (e.g. IP)
• FPGA and off-the-shelf MAC/PHY chips: Layer 2 (MAC), Layer 1 (PHY)
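To give a feel for the MPA layer that sits between DDP and TCP (a rough sketch based on the MPA framing rules in RFC 5044, not the authors' firmware; stream markers are omitted and crc32c() is a hypothetical helper), the fragment below frames one DDP segment into an FPDU: a 2-byte ULPDU length, the payload, padding to a 32-bit boundary, and a CRC32c trailer.

```c
/* Rough sketch of MPA framing (illustrative; markers omitted and the
 * crc32c() helper is assumed to exist elsewhere). */
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>                 /* htons, htonl */

uint32_t crc32c(const uint8_t *buf, size_t len);   /* hypothetical helper */

/* Frames one DDP segment (ULPDU) into an MPA FPDU in 'out';
 * returns the total FPDU length. */
size_t mpa_frame_fpdu(const uint8_t *ulpdu, uint16_t ulpdu_len, uint8_t *out)
{
    size_t off = 0;

    uint16_t len_be = htons(ulpdu_len);            /* ULPDU_Length field */
    memcpy(out + off, &len_be, sizeof(len_be));
    off += sizeof(len_be);

    memcpy(out + off, ulpdu, ulpdu_len);           /* the DDP segment */
    off += ulpdu_len;

    while (off % 4 != 0)                           /* pad to a 32-bit boundary */
        out[off++] = 0;

    uint32_t crc_be = htonl(crc32c(out, off));     /* CRC32c over the FPDU */
    memcpy(out + off, &crc_be, sizeof(crc_be));
    off += sizeof(crc_be);

    return off;
}
```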
Overall Hardware/Firmware Organization of the WAN RDMA Card
[Block diagram of the IP/firmware module: a PCI Express/HyperTransport host interface; an RDMA protocol engine with Rx and Tx memory controllers and memory banks; an SCTP protocol engine; a Layer 3 (IP) processor; a data-stream split/join unit; and four SAR units, each feeding a 10GE/OC-192 framer and PHY]
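The slides do not describe the split/join unit in detail, so the following is purely a software analogy (an assumption for illustration, not the card's firmware) of what a data-stream split unit does: stripe an outgoing payload across the four framer/PHY lanes in fixed-size segments so that the aggregate rate can exceed any single 10 Gbps link.

```c
/* Software analogy of a data-stream split unit (purely illustrative;
 * the striping granularity and queue sizes are made-up numbers, and
 * bounds checks are omitted for brevity). */
#include <stdint.h>
#include <string.h>

#define NUM_LANES 4                 /* four 10GE/OC-192 framer+PHY lanes */
#define SEG_BYTES 256               /* hypothetical striping granularity */

struct lane_queue {
    uint8_t buf[64 * 1024];
    size_t  used;
};

/* Stripes 'len' bytes of 'data' round-robin across the lane queues;
 * the join unit on the receive side reassembles in the same order. */
void split_stream(struct lane_queue q[NUM_LANES],
                  const uint8_t *data, size_t len)
{
    size_t off = 0;
    unsigned lane = 0;
    while (off < len) {
        size_t chunk = (len - off < SEG_BYTES) ? len - off : SEG_BYTES;
        memcpy(q[lane].buf + q[lane].used, data + off, chunk);
        q[lane].used += chunk;
        off  += chunk;
        lane  = (lane + 1) % NUM_LANES;
    }
}
```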
Present Results
• Currently using Virtex-II / Virtex-II Pro (Xilinx) as target devices for our cores
• Data indicate that most of the key cores will fit in a single FPGA device (Virtex-II)
• The aggregate of all cores spans several FPGAs
• Inter-device communication is an issue; the PCB design requires care
• We are currently trying to fit most of the cores into one FPGA
• Most of the cores will be made available free of charge to researchers in non-profit or government organizations
Conclusion
• The advent of HyperTransport/PCI Express and the VITA (embedded computing) standards will enable local I/O bandwidths above 10 Gbps
• Extending the RDMA protocol enables large bandwidths over wide area networks
• The proposed cores will meet the natural growth of bandwidth requirements in commercial, defense, and aerospace applications