Design and Characterization of TMD-MPI Ethernet Bridge
Kevin Lam
Professor Paul Chow
Background
• In embedded development, Field Programmable Gate Arrays (FPGAs) allow hard and soft elements to be mixed
• Issue: communication between components
• MPI (Message Passing Interface) is a standard for inter-process communication in parallel software
• TMD-MPI implements a hardware interface for MPI
  • Unified method of communication between all components
  • Abstracts the communication medium
[Diagram: TMD-MPI; labels: Application, FPGA, Embedded CPU, Hardware Engines]
Research Objective
• Attempt to create a module that routes MPI messages through on-board Ethernet hardware
• Design Objectives:
  • Reliability
  • High Bandwidth
  • Low Latency
Multi-Board TMD-MPI system (your desk!)
Design Specification
• Behaviour
  • FSL interface to on-board network (via NetIF module)
  • Transparent inter-board communication via Ethernet
  • FIFO behaviour
• Assumptions
  • Only two systems connected, by short Ethernet cable
  • Reliable, in-order delivery of packets
Design Specification
[Diagram: two FPGAs; labels: Logical FSL, FSL-Ethernet (one per FPGA), FSL]
Design Specification: Packet Format
• Basic Ethernet framing
  • Source/Destination MAC address
  • Type code: set to an unused value
  • CRC checksum performed by Ethernet hardware
• Length header to account for smaller-than-minimum packets
  • Set to 0 if data is longer than minimum packet length
• Followed by data payload, up to maximum Ethernet frame length
• No handshake protocol implemented
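The packet format above could be modelled in software roughly as follows. This is a minimal sketch, not the bridge's actual logic: the slides do not give the type code, the width of the length header, or the exact threshold at which the length is set to 0, so `ETHERTYPE` (an IEEE local-experimental value), the 2-byte big-endian `LEN_FMT`, and the helper names `build_frame` and `parse_frame` are all assumptions. The CRC is omitted because the slides say the Ethernet hardware computes it.

```python
import struct

ETHERTYPE = 0x88B5   # assumption: the slides only say "an unused value"
LEN_FMT = "!H"       # assumption: 2-byte big-endian length header
LEN_SIZE = struct.calcsize(LEN_FMT)
MIN_PAYLOAD = 46     # minimum Ethernet payload, bytes
MAX_PAYLOAD = 1500   # maximum Ethernet payload, bytes


def build_frame(dst: bytes, src: bytes, data: bytes) -> bytes:
    """Build a frame (without CRC) carrying `data` behind a length header."""
    if len(dst) != 6 or len(src) != 6:
        raise ValueError("MAC addresses must be 6 bytes")
    if not data or LEN_SIZE + len(data) > MAX_PAYLOAD:
        raise ValueError("data must be non-empty and fit in one frame")
    body_len = LEN_SIZE + len(data)
    if body_len >= MIN_PAYLOAD:
        # Long enough: length field is 0, receiver uses the frame size.
        length, pad = 0, b""
    else:
        # Too short: record the real length, then pad to the minimum.
        length, pad = len(data), bytes(MIN_PAYLOAD - body_len)
    header = dst + src + struct.pack("!H", ETHERTYPE)
    return header + struct.pack(LEN_FMT, length) + data + pad


def parse_frame(frame: bytes):
    """Recover (dst, src, data) from a frame produced by build_frame."""
    dst, src = frame[:6], frame[6:12]
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE:
        raise ValueError("not a bridge frame")
    (length,) = struct.unpack(LEN_FMT, frame[14:14 + LEN_SIZE])
    body = frame[14 + LEN_SIZE:]
    data = body if length == 0 else body[:length]
    return dst, src, data
```

Without the length header, a receiver could not tell payload from padding in a short frame, which is why the header is only meaningful for smaller-than-minimum packets.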
Implementation
• Base design around Xilinx echo example
• First send/receive data independently
• Test using PC as second ‘board’
• Develop FSL interface, test with MicroBlaze
• Merge FSL interface to Ethernet module
• Add protocol headers, implement data packing
Testing
• Initial testing done using MicroBlaze processors
• High speed testing done using custom cores
[Diagram: Board 1 and Board 2; labels: MicroBlaze 1, MicroBlaze 2, FSL-Enet (one per board), UART (one per board), PC]
Design Evaluation
• Reliability testing using two dedicated tester cores
• Transmitting sequence numbers, checking for mismatch/out of order
[Diagram: Board 1 and Board 2; labels: Test Core 1, Test Core 2, FSL-Enet (one per board), GPIO (one per board), MicroBlaze (one per board), UART (one per board), PC]
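The sequence-number check can be illustrated in software. The sketch below is a hypothetical model of what a receiving test core might do, not the actual hardware: the function name `check_sequence`, the 32-bit counter width, and the choice to resynchronise on each received value are assumptions.

```python
def check_sequence(received, start=0, width_bits=32):
    """Return (index, expected, got) for every word that breaks the sequence.

    After a mismatch the checker resynchronises on the value it actually
    received, so a single lost word is reported once, not on every later word.
    """
    mask = (1 << width_bits) - 1
    errors = []
    expected = start & mask
    for index, value in enumerate(received):
        if value != expected:
            errors.append((index, expected, value))
        expected = (value + 1) & mask
    return errors


if __name__ == "__main__":
    print(check_sequence([0, 1, 2, 3]))   # in-order stream
    print(check_sequence([0, 1, 3, 4]))   # word 2 was lost
```

An empty result corresponds to a reliable, in-order run; any entry indicates a dropped, duplicated, or reordered word.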
Design Evaluation: Bandwidth
• Uses 2 dedicated hardware cores
• Core A writes data to the FSL-Ethernet core whenever the FSL is not full
• Core B reads data from the FSL-Ethernet core whenever the FSL has data
• Core A counts the number of clock cycles elapsed to write a certain amount of data to the FSL
• Result read by MicroBlaze and displayed via UART
• Measured constant 962.5 Mbit/s data throughput rate
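Converting Core A's cycle count into a throughput figure is a simple calculation, sketched below. The slides do not state the word width or the counter clock for this test, so the 32-bit FSL word and the 125 MHz clock (borrowed from the latency measurement) are assumptions, as is the function name `throughput_mbit_s`.

```python
def throughput_mbit_s(words_written, cycles_elapsed,
                      clock_hz=125e6, bits_per_word=32):
    """Throughput in Mbit/s from a word count and an elapsed cycle count.

    clock_hz and bits_per_word are assumed values, not taken from the slides.
    """
    if cycles_elapsed <= 0:
        raise ValueError("cycles_elapsed must be positive")
    seconds = cycles_elapsed / clock_hz
    return words_written * bits_per_word / seconds / 1e6
```

Under these assumptions, writing N words in C cycles gives N × 32 × 125 / C Mbit/s, so the MicroBlaze only needs the two counters to report the rate.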
Design Evaluation: Latency
• Uses 2 dedicated hardware cores
• Core A sends 1 word of data to Core B via FSL-Ethernet
• Core B replies with 1 word
• Core A measures the clock cycles elapsed between sending its data and receiving the response
• Result read by MicroBlaze and displayed via UART
• Several trials yield a round trip of 598-599 cycles (at 125 MHz)
• One-way latency is approximately 300 cycles, or 2.4 microseconds
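The one-way figure follows from the round-trip cycle count and the 125 MHz clock (8 ns per cycle). The short sketch below reproduces the arithmetic, assuming the two directions take equal time; the function name `one_way_latency_us` is illustrative.

```python
CLOCK_HZ = 125e6  # counter clock quoted on the slide


def one_way_latency_us(round_trip_cycles, clock_hz=CLOCK_HZ):
    """Half the round trip, converted from clock cycles to microseconds.

    Assumes both directions of the link take the same time.
    """
    return (round_trip_cycles / 2) / clock_hz * 1e6


if __name__ == "__main__":
    for cycles in (598, 599):
        print(f"{cycles} cycles round trip -> "
              f"{one_way_latency_us(cycles):.3f} us one way")
```

Both measured values round to the 2.4 microseconds quoted above.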
Conclusions
• Bandwidth sufficient for VGA video streaming
• Latency may be an issue for applications that pass data back and forth frequently
• No handshaking protocol
  • Receiver in streaming applications must not be slower than the sender
  • Messages cannot be sent until both boards are powered on and connected
• Future versions should attempt to implement handshaking
  • Would affect complexity and bandwidth
  • Would make the system much more robust
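The constraint that the receiver must not be slower than the sender can be illustrated with a toy model. The sketch below is not a model of the actual hardware: it assumes a bounded receive buffer of hypothetical depth that silently drops words when full, which is one plausible consequence of having no handshake. All names and parameters are illustrative.

```python
def simulate_stream(ticks, send_per_tick, recv_per_tick, depth):
    """Count words dropped by a bounded FIFO with no backpressure.

    Each tick the sender pushes `send_per_tick` words, then the receiver
    pops up to `recv_per_tick` words. Words arriving at a full buffer are
    dropped (an assumption about the failure mode).
    """
    occupancy = 0
    dropped = 0
    for _ in range(ticks):
        accepted = min(send_per_tick, depth - occupancy)
        dropped += send_per_tick - accepted
        occupancy += accepted
        occupancy -= min(recv_per_tick, occupancy)
    return dropped


if __name__ == "__main__":
    print("matched rates:", simulate_stream(100, 1, 1, depth=16))
    print("slow receiver:", simulate_stream(100, 2, 1, depth=16))
```

In this model a matched receiver never loses data, while a slower one starts losing words as soon as the buffer fills. A handshake would let the receiver throttle the sender instead.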
Acknowledgements
Professor Paul Chow
Vince Mirian
Sami Sadaka
Tony Zhou