Hardware ‘flow control’
How we can activate our NIC’s ability to avoid overwhelming the capacities of its ‘link partner’
Our ‘txburst.c’ demo
• This module is intended to let us explore a problem that can arise in using our ‘nic.c’ character-mode device-driver
• It lets the user trigger the transmission of multiple Ethernet packets in a single ‘burst’
• When using the ‘cat’ command, our ‘nic.c’ device-driver cannot seem to keep up with the amount of arriving packet-data
In-class demonstration
• Install ‘nic.ko’ on one of our anchor-cluster machines and execute the ‘cat’ command:
    $ cat /dev/nic
• Compile our ‘txburst.c’ module and install it on another of the anchor-cluster stations, then execute the following ‘cat’ command:
    $ cat /proc/txburst
Timeout for this classroom demonstration
Some packets got ‘lost’
• The packets in the transmitted burst arrive too rapidly for our device-driver to service all of them, hence ‘lost’ packets!
• Modern Ethernet controllers (like the 82573L) offer a convenient way for the hardware to assist a device-driver in overcoming this ‘data-congestion’ problem
• It’s the IEEE 802.3 ‘flow control’ standard
How it works
[Figure: An Overview of the IEEE 802.3 Flow Control Sequence (courtesy of Cisco Systems online documentation)]
Format of a PAUSE frame
01 80 C2 00 00 01     destination: a special reserved multicast-address
source MAC-address    (6 bytes)
88 08                 a special reserved frame Type
00 01                 the standard ‘pause’ opcode
delay-time            (2 bytes) the desired maximum duration of the ‘pause’, expressed in units of 512 ‘bit-times’
00 00 ... 00          42 bytes of zero padding
frame checksum        (4 bytes)
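The frame layout above can be sketched in C as a buffer-filling routine. This is only an illustration, not code from ‘nic.c’ or ‘txburst.c’: the names build_pause_frame and PAUSE_FRAME_LEN are hypothetical, and the sketch assumes the NIC appends the frame checksum itself, so only the first 60 bytes are prepared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { PAUSE_FRAME_LEN = 60 };  // hypothetical name: length without the 4-byte checksum

// Fill 'frame' (PAUSE_FRAME_LEN bytes) with a PAUSE frame.
// Assumption: the hardware appends the frame checksum on transmission.
static void build_pause_frame(uint8_t *frame, const uint8_t src_mac[6],
                              uint16_t quanta)
{
    static const uint8_t dst_mac[6] = { 0x01, 0x80, 0xC2, 0x00, 0x00, 0x01 };

    assert(frame != NULL && src_mac != NULL);
    memset(frame, 0, PAUSE_FRAME_LEN);      // also supplies the 42 pad bytes
    memcpy(frame + 0, dst_mac, 6);          // reserved multicast-address
    memcpy(frame + 6, src_mac, 6);          // sender's MAC-address
    frame[12] = 0x88;  frame[13] = 0x08;    // reserved frame Type
    frame[14] = 0x00;  frame[15] = 0x01;    // the 'pause' opcode
    frame[16] = (uint8_t)(quanta >> 8);     // delay-time, high byte first,
    frame[17] = (uint8_t)(quanta & 0xFF);   // in units of 512 bit-times
}
```

Calling build_pause_frame(buf, mac, 0x0680) would prepare a frame requesting a pause of 0x0680 units; a delay-time of zero tells the link partner it may resume at once.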
Automatic XOFF/XON
• In principle it would be possible for the device-driver programmer to include code that would transmit a PAUSE frame
• But it’s much simpler to just delegate that functionality to the network hardware and avoid consuming CPU-time setting it up
• The 82573L makes it easy for a driver to “turn on” the ‘flow control’ mechanism
The ‘Flow Control’ registers
enum {
    E1000_FCAL  = 0x0028,   // Flow Control Address Low
    E1000_FCAH  = 0x002C,   // Flow Control Address High
    E1000_FCT   = 0x0030,   // Flow Control frame Type
    E1000_FCTTV = 0x0170,   // Flow Control Tx Timer Value
    E1000_FCRTL = 0x2160,   // Flow Control Rx Threshold Low
    E1000_FCRTH = 0x2168,   // Flow Control Rx Threshold High
};
Packet Buffer Allocation
PBA = 32-KB FIFO, divided into a Transmit FIFO and a Receive FIFO
The default allocation (000C 0014): 20-KB for TX, 12-KB for RX
• When the space consumed in the Rx FIFO reaches the high-water mark (FCRTH), the NIC transmits an XOFF frame, asking its Link-Partner to PAUSE sending packets
• When enough data drains from the FIFO that the space consumed drops beneath the low-water mark (FCRTL), the NIC transmits an XON frame, asking its Link-Partner to RESUME sending packets
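The high-water/low-water behavior can be modeled in software as a small state machine. The sketch below is a model for reasoning about the mechanism, not a description of the 82573L’s internal logic; the type and function names (fc_state, fc_event, fc_update) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum fc_event { FC_NONE, FC_SEND_XOFF, FC_SEND_XON };

struct fc_state {
    uint32_t high_water;   // bytes; plays the role of the high-water mark
    uint32_t low_water;    // bytes; plays the role of the low-water mark
    bool     paused;       // true after XOFF was sent, until XON is sent
};

// Report which frame (if any) the model would send, given the number
// of bytes currently consumed in the Rx FIFO.
static enum fc_event fc_update(struct fc_state *fc, uint32_t fifo_used)
{
    assert(fc != NULL && fc->low_water < fc->high_water);
    if (!fc->paused && fifo_used >= fc->high_water) {
        fc->paused = true;             // FIFO nearly full: ask partner to stop
        return FC_SEND_XOFF;
    }
    if (fc->paused && fifo_used < fc->low_water) {
        fc->paused = false;            // FIFO drained: partner may resume
        return FC_SEND_XON;
    }
    return FC_NONE;                    // between the marks: no change
}
```

Keeping the two marks apart gives hysteresis: while the fill level sits between them, the model sends nothing, so it does not flood the link with alternating XOFF and XON frames.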
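Programming details
// setting up the 82573L Flow Control registers
iowrite32( 0x00C28001, io + E1000_FCAL );   // address bytes 01 80 C2 00
iowrite32( 0x00000100, io + E1000_FCAH );   // address bytes 00 01
iowrite32( 0x00008808, io + E1000_FCT );    // frame Type 88 08
iowrite32( 0x00000680, io + E1000_FCTTV );  // pause time, in 512 bit-time units
iowrite32( 0x800047F8, io + E1000_FCRTL );  // low-water mark; bit 31 enables XON
iowrite32( 0x00004800, io + E1000_FCRTH );  // high-water mark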
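To see what these register values mean, the following user-space sketch decodes two of them. The helper names (pause_bit_times, pause_nanoseconds, fc_address) are hypothetical, and the byte order assumed for FCAL/FCAH is inferred from the values shown above rather than taken from the driver source.

```c
#include <assert.h>
#include <stdint.h>

// One PAUSE time unit is 512 bit-times.
static uint64_t pause_bit_times(uint16_t fcttv)
{
    return (uint64_t)fcttv * 512u;
}

// Pause duration in nanoseconds for a given link speed in Mbps.
static uint64_t pause_nanoseconds(uint16_t fcttv, uint32_t link_mbps)
{
    assert(link_mbps > 0);
    return pause_bit_times(fcttv) * 1000u / link_mbps;
}

// Unpack the FCAL/FCAH register values into the six address bytes,
// assuming the lowest-order register byte holds the first address byte.
static void fc_address(uint32_t fcal, uint32_t fcah, uint8_t mac[6])
{
    for (int i = 0; i < 4; i++)
        mac[i] = (uint8_t)(fcal >> (8 * i));
    for (int i = 0; i < 2; i++)
        mac[4 + i] = (uint8_t)(fcah >> (8 * i));
}
```

With FCTTV = 0x0680 the requested pause is 1664 units, or 851,968 bit-times, which is roughly 0.85 milliseconds on a 1000Mbps link. Unpacking 0x00C28001 and 0x00000100 yields the reserved multicast-address 01 80 C2 00 00 01.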
The ‘Flow Control’ statistics
enum {
    E1000_XONRXC  = 0x4048,   // XON Received Count
    E1000_XONTXC  = 0x404C,   // XON Transmitted Count
    E1000_XOFFRXC = 0x4050,   // XOFF Received Count
    E1000_XOFFTXC = 0x4054,   // XOFF Transmitted Count
};
Device Control (0x0000)
bit 31      PHYRST = Phy Reset
bit 30      VME = VLAN Mode Enable
bit 28      TFCE = Tx Flow-Control Enable
bit 27      RFCE = Rx Flow-Control Enable
bit 26      RST = Device Reset
bit 20      ADVD3WUC = Advertise Cold Wake Up Capability
bit 18      D/UD = Dock/Undock status
bit 12      FRCDPLX = Force Duplex
bit 11      FRCSPD = Force Speed
bits 9-8    SPEED (00=10Mbps, 01=100Mbps, 10=1000Mbps, 11=reserved)
bit 6       SLU = Set Link Up
bit 2       GIOMD = GIO Master Disable
bit 0       FD = Full-Duplex
The remaining bits are reserved (R=0, except bit 3 which is shown as R=1)
We used 0x040C0241 to initiate a ‘device reset’ operation (82573L)
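A small sketch of how the two flow-control enable bits relate to the value quoted above. The CTRL_* constant names and the function ctrl_enable_flow_control are hypothetical; the bit positions follow the register diagram on this slide.

```c
#include <assert.h>
#include <stdint.h>

// Bit masks taken from the Device Control register layout (names are ours).
#define CTRL_FD    (1u << 0)    // Full-Duplex
#define CTRL_SLU   (1u << 6)    // Set Link Up
#define CTRL_RST   (1u << 26)   // Device Reset
#define CTRL_RFCE  (1u << 27)   // Rx Flow-Control Enable
#define CTRL_TFCE  (1u << 28)   // Tx Flow-Control Enable

// Return a Device Control value with both flow-control enables set
// and the reset bit cleared, leaving all other bits unchanged.
static uint32_t ctrl_enable_flow_control(uint32_t ctrl)
{
    ctrl &= ~CTRL_RST;                 // do not trigger another reset
    ctrl |= CTRL_RFCE | CTRL_TFCE;     // turn on Rx and Tx flow control
    return ctrl;
}
```

Applied to the reset value 0x040C0241, this produces 0x180C0241. In a driver the result would be written back with iowrite32 to offset 0x0000; whether ‘txburst.c’ builds its value this way is not shown here.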
Receive Control (0x0100)
bits 30-27  FLXBUF = Flexible Buffer size
bit 26      SECRC = Strip Ethernet CRC
bit 25      BSEX = Buffer Size Extension
bit 23      PMCF = Pass MAC Control Frames
bit 22      DPF = Discard Pause Frames
bit 20      CFI = Canonical Form Indicator bit-value
bit 19      CFIEN = Canonical Form Indicator Enable
bit 18      VFE = VLAN Filter Enable
bits 17-16  BSIZE = Receive Buffer Size
bit 15      BAM = Broadcast Accept Mode
bits 13-12  MO = Multicast Offset
bits 11-10  DTYP = Descriptor Type
bits 9-8    RDMTS = Rx-Descriptor Minimum Threshold Size
bits 7-6    LBM = Loopback Mode
bit 5       LPE = Long Packet reception Enable
bit 4       MPE = Multicast Promiscuous Enable
bit 3       UPE = Unicast Promiscuous Enable
bit 2       SBP = Store Bad Packets
bit 1       EN = Receive Enable
The remaining bits are reserved (R=0)
We used 0x0480801E in RCTL to prepare the ‘receive engine’ for flow control
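The RCTL value quoted above can be rebuilt from the individual bit fields, which makes it easier to see which options are on. The RCTL_* names and the function rctl_for_flow_control are hypothetical; the bit positions come from the register layout on this slide.

```c
#include <assert.h>
#include <stdint.h>

// Bit masks from the Receive Control register layout (names are ours).
#define RCTL_EN     (1u << 1)    // Receive Enable
#define RCTL_SBP    (1u << 2)    // Store Bad Packets
#define RCTL_UPE    (1u << 3)    // Unicast Promiscuous Enable
#define RCTL_MPE    (1u << 4)    // Multicast Promiscuous Enable
#define RCTL_BAM    (1u << 15)   // Broadcast Accept Mode
#define RCTL_DPF    (1u << 22)   // Discard Pause Frames
#define RCTL_PMCF   (1u << 23)   // Pass MAC Control Frames
#define RCTL_SECRC  (1u << 26)   // Strip Ethernet CRC

// Compose the RCTL value from the options that are switched on.
static uint32_t rctl_for_flow_control(void)
{
    return RCTL_SECRC | RCTL_PMCF | RCTL_BAM
         | RCTL_MPE | RCTL_UPE | RCTL_SBP | RCTL_EN;
}
```

The composed value equals 0x0480801E: PMCF is set and DPF is clear, which is the combination that exercise #3 asks about.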
Transmit Control (0x0400)
bit 28      MULR = Multiple Request Support
bits 27-26  TXCSCMT = TxDescriptor Minimum Threshold
bit 25      UNORTX = Underrun No Re-Transmit
bit 24      RTLC = Retransmit on Late Collision
bit 22      SWXOFF = Software XOFF Transmission
bits 21-12  COLD = Collision Distance (=0x3F)
bits 11-4   CT = Collision Threshold (=0xF)
bit 3       PSP = Pad Short Packets
bit 1       EN = Transmit Enable
The remaining bits are reserved (R=0)
We used 0x0103F0F8 in TCTL to set up the ‘transmit engine’ before enabling it (82573L)
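As with RCTL, the TCTL value can be assembled from its fields. The sketch below uses hypothetical names (TCTL_* and tctl_compose) with bit positions read from the register layout on this slide.

```c
#include <assert.h>
#include <stdint.h>

// Bit masks and field shifts from the Transmit Control layout (names are ours).
#define TCTL_EN          (1u << 1)    // Transmit Enable
#define TCTL_PSP         (1u << 3)    // Pad Short Packets
#define TCTL_CT_SHIFT    4            // Collision Threshold, bits 11-4
#define TCTL_COLD_SHIFT  12           // Collision Distance, bits 21-12
#define TCTL_SWXOFF      (1u << 22)   // Software XOFF Transmission
#define TCTL_RTLC        (1u << 24)   // Retransmit on Late Collision

// Combine the two multi-bit fields with a set of single-bit flags.
static uint32_t tctl_compose(uint32_t ct, uint32_t cold, uint32_t flags)
{
    assert(ct <= 0xFFu && cold <= 0x3FFu);   // fields are 8 and 10 bits wide
    return (ct << TCTL_CT_SHIFT) | (cold << TCTL_COLD_SHIFT) | flags;
}
```

tctl_compose(0xF, 0x3F, TCTL_RTLC | TCTL_PSP) gives 0x0103F0F8, the setup value quoted above; adding TCTL_EN yields 0x0103F0FA for the enabled engine, and adding TCTL_SWXOFF sets bit 22.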
Our ‘txburst.c’ again
• The statement that enables flow control in the Device Control register was originally “commented out” for our earlier demo
• Now we restore that statement as part of the executable code during initialization
• This time we observe a different effect!
Timeout for this second classroom demonstration
In-class exercise #1
• Can you reduce the value in the FCTTV-register (to PAUSE for a briefer time) and still avoid losing any transmitted packets?
• How small can FCTTV be?
In-class exercise #2
• Is it necessary to turn on BOTH of the bits in the Device Control register that enable the controller’s hardware flow control?
• The RFCE-bit (bit #27)
• The TFCE-bit (bit #28)
In-class exercise #3
• Must the ‘receive’ engine be enabled?
• Must the PMCF-bit (bit #23) be turned on in the RCTL register? PMCF = Pass MAC Control Frames
• Could the DPF-bit (bit #22) be turned on? DPF = Discard Pause Frames
Out-of-class exercise
• Can you design module-code that would demonstrate the use of the SWXOFF-bit (bit #22) in the Transmit Control register?