Investigating NIC Capabilities for Packet Checksum

What is a packet checksum? Here we investigate the NIC’s capabilities for computing checksums and using them to detect errors.

Gigabit Ethernet frame-format
Fields, in transmission order: Preamble | Start-Of-Frame Delimiter | Destination Address | Source Address | Type/Length | Data | Frame Check Sequence | Carrier Extension
The frame is extended to occupy a slot-time of 512 bytes

Some lowest-level details • The frame’s Preamble consists of 7 bytes of an alternating bit-pattern of 1’s and 0’s • The Start-of-Frame Delimiter is a one-byte bit-pattern which continues the alternation of 1’s and 0’s until the final bit is reached (which ‘breaks’ the pattern-of-alternation)
The 64-bit Preamble and SFD (the final byte is the Start-of-Frame Delimiter):
10101010 10101010 10101010 10101010 10101010 10101010 10101010 10101011
In hexadecimal notation:
0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAA 0xAB

The FCS field • The Frame Check Sequence is a four-byte integer, usually computed by the hardware according to a sophisticated mathematical error-detection scheme known as Cyclic-Redundancy Check (CRC): The CRC is calculated by performing a modulo 2 division of the data by a generator polynomial and recording the remainder after division
CRC-32 = x^{32} + x^{26} + x^{23} + x^{22} + x^{16} + x^{12} + x^{11} + x^{10} + x^{8} + x^{7} + x^{5} + x^{4} + x^{2} + x + 1
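The modulo 2 division described above can be sketched in software. This is only an illustration, not the NIC's circuit: it assumes the conventions commonly used with this polynomial (bits processed least-significant first, so the polynomial appears in its reflected form 0xEDB88320, with an all-ones starting value and a final inversion). The function name `crc32_sketch` is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32 (illustrative sketch).
 * 0xEDB88320 is the bit-reversed form of the generator polynomial
 * x^32 + x^26 + ... + x + 1 (the x^32 term is implicit). */
uint32_t crc32_sketch(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* assumed initial value */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                      /* bring in the next byte */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ 0xEDB88320u;  /* "subtract" the divisor (XOR) */
            else
                crc >>= 1;
        }
    }
    return ~crc;                             /* assumed final inversion */
}
```

With these conventions, the nine ASCII bytes "123456789" give the widely quoted check value 0xCBF43926.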

The CRCERRS register • Our Intel Pro1000 ethernet controller has a statistical counter register (the first one, at offset 0x4000 in the memory-mapped I/O space) which counts any received frames which arrived with a CRC-error indicated • Our ‘nic2.c’ was programmed to ‘strip’ the 4-byte FCS field from all received frames, by setting the SECRC-bit in register RCTL

Another (simpler) checksum
The Pro1000’s ‘legacy-format’ Receive-Descriptor (16 bytes):
Base-address (64 bits) | Packet-length | Packet-checksum | status | errors | VLAN tag
The device-driver initializes the ‘base-address’ field with the physical address of a packet-buffer. The network controller will ‘write-back’ the values for the other fields when it has transferred a received packet’s data into the packet-buffer
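One way to picture the 16-byte descriptor is as a C structure. This is a sketch under assumptions: the field widths (16-bit length, checksum and VLAN tag; 8-bit status and errors) are inferred from the 16-byte total, and the struct name and the `desc_status` and `desc_errors` member names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Sketch of a legacy-format receive descriptor (assumed field widths). */
struct rx_desc_sketch {
    uint64_t base_address;   /* set by the driver: physical address of buffer */
    uint16_t packet_length;  /* written back by the NIC */
    uint16_t packet_chksum;  /* written back by the NIC */
    uint8_t  desc_status;    /* hypothetical name */
    uint8_t  desc_errors;    /* hypothetical name */
    uint16_t vlan_tag;
};

/* Compile-time checks that the layout matches the 16-byte picture. */
_Static_assert(sizeof(struct rx_desc_sketch) == 16, "descriptor is 16 bytes");
_Static_assert(offsetof(struct rx_desc_sketch, packet_chksum) == 10,
               "checksum follows the 64-bit address and 16-bit length");
```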

What is this ‘packet checksum’? • According to the Intel documentation, the packet checksum is “the unadjusted 16-bit ones complement of the packet” (page 24) • Now what exactly does that mean? • And why did Intel’s hardware designers believe that device-driver software would need this value to be available for every ethernet packet that the NIC receives?

The idea of ‘complements’ • Whenever a whole object gets divided into two parts, those pieces are often referred to as complements • Example: complementary angles in a right-triangle ABC (right angle at C): ∠ABC + ∠BAC = 90°

Set Theory Venn Diagram B A Within the ‘universe’ represented here by the box, the orange set B is the complement of the green set A.

Arithmetic • Among the digits in our ‘base ten’ system: • the values 1 and 9 are complements • the values 2 and 8 are complements • the values 3 and 7 are complements • the values 4 and 6 are complements • Complements are useful when performing subtractions in modular arithmetic: 4 - 6 (mod 10) = 4 + 4 (mod 10)
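The subtraction rule above can be checked with a few lines of C. This is only a sketch for single digits; the helper names `digit_complement` and `sub_mod10` are hypothetical.

```c
/* Complement of a decimal digit: the value that sums with d to 10
 * (0 is treated as its own complement, since 0 + 0 = 0 mod 10). */
unsigned digit_complement(unsigned d)
{
    return (10u - d % 10u) % 10u;
}

/* Subtract b from a (mod 10) by ADDING the complement of b. */
unsigned sub_mod10(unsigned a, unsigned b)
{
    return (a % 10u + digit_complement(b)) % 10u;
}
```

For example, `sub_mod10(4, 6)` adds 4 + 4 and yields 8, which is 4 - 6 (mod 10).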

Two’s complement • Digital computers use modular arithmetic in the ‘base two’ number system • Two 8-bit numbers are ‘complements’ if their sum equals 2^8 (= 256):
00000001 is the twos-complement of 11111111
00000010 is the twos-complement of 11111110
00000011 is the twos-complement of 11111101
• As a consequence, subtractions can be done with the same circuits as additions
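The same idea in 8-bit binary can be sketched as follows; the function names are hypothetical, and the casts to `uint8_t` play the role of the modulo-256 wraparound.

```c
#include <stdint.h>

/* Two's complement of an 8-bit value: the number that sums with x to 2^8. */
uint8_t twos_complement8(uint8_t x)
{
    return (uint8_t)(0x100u - x);
}

/* Subtraction done with an adder: a - b == a + complement(b) (mod 256). */
uint8_t sub8_via_add(uint8_t a, uint8_t b)
{
    return (uint8_t)(a + twos_complement8(b));
}
```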

One’s complement • But for some purposes there is a different kind of ‘complement’ that results in better arithmetical properties – it’s known as the ‘diminished radix’ complement • For the case of radix ten, it’s called ‘nine’s complement’, and for the case of radix two it’s called ‘one’s complement’

Nine’s complements • A pair of 3-digit numbers x and y in the radix ten number system would be called nine’s complements if x+y = 10^3 - 1 = 999 • Thus 321 and 678 are nines-complements • A pair of 3-digit numbers x and y in the radix two number system would be called one’s complements if x+y = 2^3 - 1 = 111 (binary) • Thus 110 and 001 are ones-complements
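Both diminished-radix complements for 3-digit numbers can be written in one line each. A small sketch with hypothetical function names; note that in radix two the complement is simply a bit-flip.

```c
/* Nine's complement of a 3-digit decimal number (x in 0..999). */
unsigned nines_complement3(unsigned x)
{
    return 999u - x;
}

/* One's complement of a 3-bit binary number (x in 0..7):
 * subtracting from binary 111 is the same as inverting every bit. */
unsigned ones_complement3(unsigned x)
{
    return ~x & 0x7u;
}
```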

‘end-around-carry’ • When you want to add two binary numbers using the one’s complement system, you can do it by first performing ordinary binary addition, and then adding in the carry-bit:
    10101010   (8-bit number)
  + 11110000   (8-bit number)
  ----------
  1 10011010   (9-bits in the normal total)
  +        1   (apply ‘end-around carry’)
  ----------
    10011011   (8-bits in the ones-complement total)
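The worked addition above translates directly into C. This is a minimal sketch with a hypothetical function name; a wider temporary holds the 9-bit total so the carry can be added back in.

```c
#include <stdint.h>

/* One's complement addition of two 8-bit values with end-around carry. */
uint8_t ones_add8(uint8_t a, uint8_t b)
{
    unsigned sum = (unsigned)a + (unsigned)b;   /* ordinary sum, up to 9 bits */
    sum = (sum & 0xFFu) + (sum >> 8);           /* add the carry-bit back in */
    return (uint8_t)sum;                        /* cannot carry a second time */
}
```

Here `ones_add8(0xAA, 0xF0)` reproduces the example: 10101010 + 11110000 gives 10011011 (0x9B).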

NIC uses ‘one’s complement’ • For network programming nowadays it is common practice for ‘one’s complement’ to be used when computing checksums • It is also common practice for multi-byte integers to be ‘sent out over the wire’ in so-called ‘big-endian’ byte-order (i.e., the most significant byte goes out ahead of the bytes having lesser significance)

Intel’s cpu uses ‘little endian’ • Whenever our x86 processor ‘stores’ a multi-byte value into a series of cells in memory, it puts the least significant byte first (i.e., at the earliest memory address), followed by the remaining bytes having greater significance
Example: with AX = 0x3456, the instruction ‘mov %ax, buf’ leaves these bytes in memory:
buf: 0x56 0x34
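The two byte-orders can be compared without relying on the host CPU by extracting the bytes arithmetically. A sketch with hypothetical helper names: `le_byte` and `be_byte` return the byte that would sit at a given memory offset under each convention.

```c
#include <stdint.h>

/* Byte found at memory offset 'index' (0 or 1) when a 16-bit value
 * is stored little-endian: least significant byte first. */
uint8_t le_byte(uint16_t v, unsigned index)
{
    return (uint8_t)(v >> (8u * index));
}

/* Same, for big-endian ('network') order: most significant byte first. */
uint8_t be_byte(uint16_t v, unsigned index)
{
    return (uint8_t)(v >> (8u * (1u - index)));
}

/* Swapping the two bytes converts between the conventions. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}
```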

Checksum using C
{
    unsigned char   *cp = phys_to_virt( rxring[ rxhead ].base_address );
    unsigned int    nbytes = rxring[ rxhead ].packet_length;
    unsigned int    nwords = ( nbytes / 2 );
    unsigned short  *wp = (unsigned short *)cp;
    unsigned int    i, checksum = 0;

    if ( nbytes & 1 ) { cp[ nbytes ] = 0; ++nwords; }    // pad odd length packet
    for ( i = 0; i < nwords; i++ ) checksum += wp[ i ];   // two's complement sum
    checksum = (checksum & 0xFFFF) + (checksum >> 16);    // do 'end-around carries'
    checksum = (checksum & 0xFFFF) + (checksum >> 16);    // (absorb any carry from the first fold)
    checksum = htons( checksum );   // -- adjustment #1: swap the bytes
    checksum = ~checksum;           // -- adjustment #2: and flip the bits
    checksum &= 0xFFFF;             // mask to lowest 16-bits

    // Let's compare our checksum-calculation with the one done by the PRO1000
    printk( " cpu-computed checksum=%04X ", checksum );
    printk( " nic's rx packet-checksum=%04X ", rxring[ rxhead ].packet_chksum );
    printk( "\n" );
}
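The byte-swap step works because a one's complement sum has a convenient property: summing the words as a little-endian CPU reads them and then swapping the bytes of the result gives the same answer as summing the words in big-endian (wire) order. The user-space sketch below illustrates this outside the driver; all names are hypothetical, and the eight sample bytes are arbitrary test data.

```c
#include <stddef.h>
#include <stdint.h>

/* Fold a 32-bit accumulator to 16 bits using end-around carries. */
static uint16_t fold16(uint32_t sum)
{
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);
    return (uint16_t)sum;
}

/* One's complement sum, reading each byte-pair in big-endian order.
 * The 32-bit accumulator is ample for any packet-sized buffer. */
uint16_t ones_sum_be(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2)
        sum += ((uint32_t)p[i] << 8) | p[i + 1];
    if (i < n)
        sum += (uint32_t)p[i] << 8;          /* odd length: pad with zero */
    return fold16(sum);
}

/* Same sum, reading each byte-pair in little-endian order (as x86 does). */
uint16_t ones_sum_le(const uint8_t *p, size_t n)
{
    uint32_t sum = 0;
    size_t i;
    for (i = 0; i + 1 < n; i += 2)
        sum += p[i] | ((uint32_t)p[i + 1] << 8);
    if (i < n)
        sum += p[i];                         /* odd length: pad with zero */
    return fold16(sum);
}

uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Arbitrary sample data for experimenting with the functions above. */
static const uint8_t sample_bytes[8] = {
    0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7
};
```

For the sample bytes, `ones_sum_be` gives 0xDDF2 and `ones_sum_le` gives 0xF2DD, which is the same value with its bytes swapped.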

In-class demonstration #1 TIMEOUT We will insert into our ‘nic2.c’ device-driver’s ‘read()’ function our C code that computes and displays the “unadjusted 16-bit ones complement sum” for each received packet and compare our calculation with the NIC’s ‘packet_checksum’

Checksum ‘offloading’ • Our Intel 82573L network controller has the capability of performing several useful checksum calculations on normal network packets – if this is desired by a device-driver
Receive CheckSum control register RXCSUM (0x5000):
bits 31..10 = reserved (=0) | bit 9 = TUOFLD | bit 8 = IPOFLD | bits 7..0 = PCSS
Legend: PCSS = Packet Checksum Start (default=0x00) IPOFLD = IP-checksum Off-load Enable (1=yes, 0=no) TUOFLD = TCP/UDP checksum Off-load Enable (1=yes, 0=no)
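A value for this register can be assembled from its three fields. The sketch below follows the bit positions in the diagram (PCSS in bits 7..0, IPOFLD in bit 8, TUOFLD in bit 9); the constant and function names are hypothetical, and writing the value to the device is left to the driver.

```c
#include <stdint.h>

enum {
    RXCSUM_PCSS_MASK = 0xFF,      /* bits 7..0: Packet Checksum Start */
    RXCSUM_IPOFLD    = 1 << 8,    /* bit 8: IP-checksum off-load enable */
    RXCSUM_TUOFLD    = 1 << 9,    /* bit 9: TCP/UDP checksum off-load enable */
};

/* Build an RXCSUM value; reserved bits 31..10 are left at zero. */
uint32_t make_rxcsum(uint8_t pcss, int ipofld, int tuofld)
{
    uint32_t value = pcss & RXCSUM_PCSS_MASK;
    if (ipofld) value |= RXCSUM_IPOFLD;
    if (tuofld) value |= RXCSUM_TUOFLD;
    return value;
}
```

For instance, `make_rxcsum(12, 0, 0)` yields 0x0000000C, a checksum that starts 12 bytes into the packet with both off-load options disabled.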

Using ‘nicecho.c’ • To compensate for the modifications made to the DA and SA fields by our ‘echo.c’, we can omit the first six words (12 bytes) from the checksum-calculations done both by our read() code and by the nic hardware
// we start our addition-loop at i=6 instead of i=0
for ( i = 6; i < nwords; i++ ) checksum += wp[ i ];
AND
// we initialize the RXCSUM register with PCSS=12
iowrite32( 0x0000000C, io + E1000_RXCSUM );

In-class demonstration #2 TIMEOUT We will modify the nic’s RXCSUM register (as well as our own previous checksum computation) and observe the resulting effects

The ‘Legacy’ Transmit-Descriptor (16 bytes):
Base-address (64 bits) | Packet-length | CSO | cmd | status | CSS | special
CSO = CheckSum Offset CSS = CheckSum Start
Command-Byte Format (bit 7 down to bit 0):
IDE | VLE | 0 | 0 | WB | IC | IFCS | EOP
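The transmit descriptor and its command byte can be sketched the same way. Field widths are assumed from the 16-byte total (16-bit length and special; 8-bit CSO, cmd, status and CSS), the bit positions follow the command-byte diagram, and the struct, constant, and helper names are hypothetical.

```c
#include <stdint.h>

/* Sketch of a legacy-format transmit descriptor (assumed field widths). */
struct tx_desc_sketch {
    uint64_t base_address;   /* physical address of the packet buffer */
    uint16_t packet_length;
    uint8_t  cksum_offset;   /* CSO: where to insert the checksum */
    uint8_t  desc_command;   /* command byte, see bits below */
    uint8_t  desc_status;
    uint8_t  cksum_origin;   /* CSS: where the calculation starts */
    uint16_t special;
};

_Static_assert(sizeof(struct tx_desc_sketch) == 16, "descriptor is 16 bytes");

/* Command-byte bits, numbered as in the diagram (bits 5 and 4 are zero). */
enum {
    CMD_EOP  = 1 << 0,
    CMD_IFCS = 1 << 1,
    CMD_IC   = 1 << 2,
    CMD_WB   = 1 << 3,
    CMD_VLE  = 1 << 6,
    CMD_IDE  = 1 << 7,
};

/* Return a copy of 'd' with checksum insertion requested. */
struct tx_desc_sketch with_checksum(struct tx_desc_sketch d,
                                    uint8_t cso, uint8_t css)
{
    d.cksum_offset = cso;
    d.cksum_origin = css;
    d.desc_command |= CMD_IC;
    return d;
}
```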

Our driver’s packet-layout
destn-address | source-address | TYPE/LENGTH | count | -- data -- -- data -- -- data -- -- data -- -- data -- -- data --
Let’s make room for a new 16-bit field at offset 0x0010, by starting our packet’s data-payload at offset 0x0012 instead of offset 0x0010

Further driver modifications… • We can demonstrate ‘Checksum Insertion’ performed by the NIC with these changes:
#define HDR_LEN (14+4)              // two more bytes precede packet's data
enum { E1000_RXCSUM = 0x5000, };    // define symbolic register-offset
ssize_t my_write( struct file *file, const char *buf, size_t len, loff_t *pos )
{
    // add these assignments in our driver's 'write()' function
    txring[ txtail ].cksum_offset = 16;          // where to insert the checksum
    txring[ txtail ].cksum_origin = 12;          // where to start its calculation
    txring[ txtail ].desc_command |= (1<<2);     // IC-bit (Insert Checksum)
}

In-class demonstration #3 TIMEOUT We will modify the packet-layout used in our device-driver’s ‘write()’ and ‘read()’ functions, and then program our TX descriptors to utilize the IC command-option and the CSO and CSS descriptor fields and then observe the resulting effects
