More 82573L details
Getting ready to write and test a character-mode device-driver for our anchor-LAN’s ethernet controllers
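A ‘nic.c’ character driver?
[Diagram: the driver module contains my_isr(), module_init(), module_exit(), and a ‘my_fops’ table whose entries point to the driver’s methods]
  ioctl    ->  my_ioctl()
  open     ->  my_open()
  read     ->  my_read()
  write    ->  my_write()
  release  ->  my_release()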
Statistics registers • The 82573L has several dozen statistical counters which automatically operate to keep track of significant events affecting the ethernet controller’s performance • Most are 32-bit ‘read-only’ registers, and they are automatically cleared when read • Your module’s initialization routine could read them all (to start counting from zero)
Initializing the nic’s counters
• The statistical counters all have address-offsets in the range 0x04000 – 0x04FFF
• You can use a very simple program-loop to ‘clear’ each of these read-only registers

    // Here ‘io’ is the virtual base-address of the nic’s i/o-memory region
    {
        int r;

        // clear all of the Pro/1000 controller’s statistical counters
        for (r = 0x4000; r < 0x4FFF; r += 4)
            ioread32( io + r );
    }
A few ‘counter’ examples
  Offset   Name      Description
  0x4000   CRCERRS   CRC Errors Count
  0x400C   RXERRC    Receive Error Count
  0x4014   SCC       Single Collision Count
  0x4018   ECOL      Excessive Collision Count
  0x4074   GPRC      Good Packets Received
  0x4078   BPRC      Broadcast Packets Received
  0x407C   MPRC      Multicast Packets Received
  0x40D0   TPR       Total Packets Received
  0x40D4   TPT       Total Packets Transmitted
  0x40F0   MPTC      Multicast Packets Transmitted
  0x40F4   BPTC      Broadcast Packets Transmitted
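Because these counters are cleared when read, a driver that wants running totals has to add each reading to a variable of its own. The following is a minimal user-space sketch of that idea, not the course's driver code: ‘fake_regs’, ‘read_and_clear()’, ‘struct nic_stats’ and ‘update_stats()’ are hypothetical names, and in a real module the read would be an ioread32( io + offset ).

```c
#include <stdint.h>

/* Register offsets taken from the counter list above. */
enum { CRCERRS = 0x4000, GPRC = 0x4074, TPR = 0x40D0, TPT = 0x40D4 };

/* Hypothetical stand-in for the NIC's statistics block (0x4000-0x4FFF). */
static uint32_t fake_regs[0x1000 / 4];

/* Simulates a clear-on-read register; a real driver would call ioread32(). */
static uint32_t read_and_clear(unsigned offset)
{
    uint32_t *reg = &fake_regs[(offset - 0x4000) / 4];
    uint32_t value = *reg;

    *reg = 0;
    return value;
}

/* Software totals that survive the hardware's clear-on-read behavior. */
struct nic_stats {
    uint64_t good_rx;
    uint64_t total_tx;
};

/* Adds the counts accumulated since the previous call to the totals. */
static void update_stats(struct nic_stats *s)
{
    s->good_rx  += read_and_clear(GPRC);
    s->total_tx += read_and_clear(TPT);
}
```

Calling update_stats() periodically keeps the 64-bit totals growing even though each hardware counter restarts from zero after every read.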
Ethernet packet layout
• Total size normally can vary from 64 bytes up to 1518 bytes (unless ‘jumbo’ packets and/or ‘undersized’ packets are enabled)
• The NIC expects a 14-byte packet ‘header’ and it appends a 4-byte CRC check-sum
  Offset 0:   destination MAC address (6-bytes)
  Offset 6:   source MAC address (6-bytes)
  Offset 12:  Type/length (2-bytes)
  Offset 14:  the packet’s data ‘payload’ goes here (usually varies from 46 to 1500 bytes)
  After the payload:  Cyclic Redundancy Checksum (4-bytes)
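The header offsets and size arithmetic can be sketched in C as follows. This is an illustration only: ‘struct eth_header’, the ETH_* constants and ‘eth_frame_size()’ are hypothetical names, and the limits assume standard (non-jumbo) ethernet, i.e. a 1500-byte maximum payload and a 64-byte minimum frame.

```c
#include <stddef.h>
#include <stdint.h>

enum {
    ETH_HEADER_LEN  = 14,   /* destination + source + type/length */
    ETH_CRC_LEN     = 4,    /* appended by the NIC */
    ETH_MIN_FRAME   = 64,   /* shorter frames are padded */
    ETH_MAX_PAYLOAD = 1500  /* assumed standard limit, no jumbo packets */
};

/* Hypothetical header layout matching byte offsets 0, 6 and 12. */
struct eth_header {
    uint8_t dst_mac[6];
    uint8_t src_mac[6];
    uint8_t type_len[2];    /* big-endian on the wire */
};

/* Size on the wire for a given payload, or 0 if the payload is too long.
 * Short payloads are padded up to the 64-byte minimum. */
static size_t eth_frame_size(size_t payload_len)
{
    size_t size;

    if (payload_len > ETH_MAX_PAYLOAD)
        return 0;
    size = ETH_HEADER_LEN + payload_len + ETH_CRC_LEN;
    return size < ETH_MIN_FRAME ? ETH_MIN_FRAME : size;
}
```

The struct uses only byte arrays, so the compiler inserts no padding and its size equals the 14-byte header.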
Filter registers • All the modern ethernet controllers have a built-in ‘filtering’ capability which allows the NIC to automatically discard any packets having a destination-address different from the controller’s own unique MAC address • But the 82573L offers a more elaborate filtering mechanism (and can also ‘reject’ packets based on the ‘source’ addresses)
How ‘receive’ works
• We set up memory-buffers where we want received packets to be placed by the NIC
• We also create a list of buffer-descriptors and inform the NIC of its location and size
• Then, when ready, we tell the NIC to ‘Go!’ (i.e., start receiving), but to let us know when these receptions have occurred
[Diagram: a list of buffer-descriptors (descriptor0 – descriptor3) in Random Access Memory, each one pointing to its own packet-buffer (Buffer0 – Buffer3)]
Receive Control (0x0100)
  Bit(s)   Field    Meaning
  31       R        reserved (=0)
  30-27    FLXBUF   Flexible Buffer size
  26       SECRC    Strip Ethernet CRC
  25       BSEX     Buffer Size Extension
  24       R        reserved (=0)
  23       PMCF     Pass MAC Control Frames
  22       DPF      Discard Pause Frames
  21       R        reserved (=0)
  20       CFI      Canonical Form Indicator bit-value
  19       CFIEN    Canonical Form Indicator Enable
  18       VFE      VLAN Filter Enable
  17-16    BSIZE    Receive Buffer Size
  15       BAM      Broadcast Accept Mode
  14       R        reserved (=0)
  13-12    MO       Multicast Offset
  11-10    DTYP     Descriptor Type
  9-8      RDMTS    Rx-Descriptor Minimum Threshold Size
  7-6      LBM      Loopback Mode
  5        LPE      Long Packet reception Enable
  4        MPE      Multicast Promiscuous Enable
  3        UPE      Unicast Promiscuous Enable
  2        SBP      Store Bad Packets
  1        EN       Receive Enable
  0        R        reserved (=0)
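One way to work with this register is to give each single-bit field a mask and build the value in two steps: first the options, then the enable-bit. The sketch below assumes the bit positions from the layout above; the RCTL_* macro names and the two helper functions are illustrative, not taken from the course's code.

```c
#include <stdint.h>

/* Single-bit fields of the Receive Control register (names are illustrative). */
#define RCTL_EN     (1u << 1)    /* Receive Enable */
#define RCTL_SBP    (1u << 2)    /* Store Bad Packets */
#define RCTL_UPE    (1u << 3)    /* Unicast Promiscuous Enable */
#define RCTL_MPE    (1u << 4)    /* Multicast Promiscuous Enable */
#define RCTL_LPE    (1u << 5)    /* Long Packet reception Enable */
#define RCTL_BAM    (1u << 15)   /* Broadcast Accept Mode */
#define RCTL_SECRC  (1u << 26)   /* Strip Ethernet CRC */

/* Builds an option word with the receiver still disabled:
 * accept broadcasts, strip the CRC, optionally accept every address. */
static uint32_t rctl_options(int promiscuous)
{
    uint32_t rctl = RCTL_BAM | RCTL_SECRC;

    if (promiscuous)
        rctl |= RCTL_UPE | RCTL_MPE;
    return rctl;
}

/* Adds the enable-bit, to be written only after the descriptor ring is ready. */
static uint32_t rctl_enable(uint32_t rctl)
{
    return rctl | RCTL_EN;
}
```

Keeping the enable-bit separate reflects the ‘Go!’ step: the options can be programmed while the receive engine is still stopped.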
Registers’ Names
Memory-information registers
  RDBA(L/H) = Receive-Descriptor Base-Address Low/High (64-bits)
  RDLEN = Receive-Descriptor array Length
  RDH = Receive-Descriptor Head
  RDT = Receive-Descriptor Tail
Receive-engine control registers
  RXDCTL = Receive-Descriptor Control Register
  RCTL = Receive Control Register
Notification timing registers
  RDTR = Receive-interrupt packet Delay Timer
  RADV = Receive-interrupt Absolute Delay Value
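Rx-Desc Ring-Buffer
[Diagram: a circular array of descriptors at offsets 0x00, 0x10, 0x20, ... 0x80 from the RDBA base-address; RDLEN gives the array’s length (in bytes); RDH (head) and RDT (tail) mark the boundary between descriptors owned by hardware (nic) and descriptors owned by software (cpu)]
• Circular buffer (128-bytes minimum)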
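The ring arithmetic is simple enough to sketch. The helper names below are hypothetical, and the 16-byte descriptor size is read off the 0x10 spacing of the offsets in the diagram.

```c
#include <stdint.h>

enum { RX_DESC_SIZE = 16 };   /* bytes per descriptor (0x10 spacing) */

/* Value for RDLEN: the descriptor array's length in bytes. */
static uint32_t ring_bytes(uint32_t count)
{
    return count * RX_DESC_SIZE;
}

/* Byte offset of descriptor 'index' from the RDBA base-address. */
static uint32_t desc_offset(uint32_t index)
{
    return index * RX_DESC_SIZE;
}

/* Next head or tail index, wrapping back to 0 at the end of the ring. */
static uint32_t ring_next(uint32_t index, uint32_t count)
{
    return (index + 1) % count;
}
```

With eight descriptors, ring_bytes(8) gives the 128-byte minimum, and advancing past index 7 wraps back to index 0.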
Rx-Descriptor Control (0x2828)
  Bit(s)   Field     Meaning
  31-25    R         reserved (=0)
  24       GRAN      Granularity
  23-22    R         reserved (=0)
  21-16    WTHRESH   Writeback Threshold
  15-14    R         reserved (=0)
  13-8     HTHRESH   Host Threshold
  7-6      R         reserved (=0)
  5-0      PTHRESH   Prefetch Threshold
• Prefetch Threshold – A prefetch operation is considered when the number of valid, but unprocessed, receive descriptors that the ethernet controller has in its on-chip buffer drops below this threshold.
• Host Threshold – A prefetch occurs if at least this many valid descriptors are available in host memory
• Writeback Threshold – This field controls the writing back to host memory of already processed receive descriptors in the ethernet controller’s on-chip buffer which are ready to be written back to host memory
• GRAN (Granularity): 1=descriptor-size, 0=cacheline-size
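A small helper can pack the three thresholds and the granularity bit into one 32-bit value. This sketch assumes the field positions PTHRESH = bits 5-0, HTHRESH = bits 13-8, WTHRESH = bits 21-16 and GRAN = bit 24; ‘make_rxdctl()’ is a hypothetical name.

```c
#include <stdint.h>

/* Packs the Rx-Descriptor Control fields; each threshold is 6 bits wide. */
static uint32_t make_rxdctl(unsigned pthresh, unsigned hthresh,
                            unsigned wthresh, int gran)
{
    return  ((uint32_t)(pthresh & 0x3F))          /* bits 5-0   */
          | ((uint32_t)(hthresh & 0x3F) << 8)     /* bits 13-8  */
          | ((uint32_t)(wthresh & 0x3F) << 16)    /* bits 21-16 */
          | ((uint32_t)(gran ? 1 : 0) << 24);     /* bit 24     */
}
```

For example, make_rxdctl(0, 0, 1, 1) selects descriptor granularity with a writeback threshold of one descriptor.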
Legacy Rx-Descriptor Layout
  Offset   Contents (bit 31 on the left, bit 0 on the right)
  0x0      Buffer-Address low (bits 31..0)
  0x4      Buffer-Address high (bits 63..32)
  0x8      Packet Checksum (bits 31..16) | Packet Length in bytes (bits 15..0)
  0xC      VLAN tag (bits 31..16) | Errors (bits 15..8) | Status (bits 7..0)
Buffer-Address = the packet-buffer’s 64-bit address in physical memory
Packet Length = number of bytes in the data-packet that was received
Packet Checksum = the 16-bit one’s-complement of the entire logical packet
Status = shows if descriptor has been used and if it’s last in a logical packet
Errors = valid only when DD and EOP are set in the descriptor’s Status field
Suggested C syntax

    typedef struct {
        unsigned long long  base_addr;
        unsigned short      pkt_length;
        unsigned short      checksum;
        unsigned char       desc_stat;
        unsigned char       desc_errs;
        unsigned short      vlan_tag;
    } rx_descriptor;
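Since the NIC reads and writes these descriptors directly, the struct’s size and field offsets have to match the hardware layout exactly. One way to have the compiler check this is sketched below with fixed-width types and C11 static assertions; ‘rx_descriptor_fixed’ is a hypothetical variant of the suggested typedef, not a replacement endorsed by the slides.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as the suggested typedef, using fixed-width types. */
typedef struct {
    uint64_t base_addr;    /* offset 0x0: buffer's physical address */
    uint16_t pkt_length;   /* offset 0x8 */
    uint16_t checksum;     /* offset 0xA */
    uint8_t  desc_stat;    /* offset 0xC */
    uint8_t  desc_errs;    /* offset 0xD */
    uint16_t vlan_tag;     /* offset 0xE */
} rx_descriptor_fixed;

/* Compilation fails if the compiler's layout differs from the hardware's. */
_Static_assert(sizeof(rx_descriptor_fixed) == 16, "descriptor must be 16 bytes");
_Static_assert(offsetof(rx_descriptor_fixed, pkt_length) == 8, "length at 0x8");
_Static_assert(offsetof(rx_descriptor_fixed, desc_stat) == 12, "status at 0xC");
_Static_assert(offsetof(rx_descriptor_fixed, vlan_tag) == 14, "vlan tag at 0xE");
```

The field order happens to be naturally aligned, so no packing attribute is needed for the checks to pass on common compilers.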
RxDesc Status-field
  Bit   Field   Meaning
  7     PIF     Passed In-exact Filter (1=yes, 0=no) shows if software must check
  6     IPCS    IPv4 Checksum calculated on packet (1=yes, 0=no)
  5     TCPCS   TCP Checksum calculated in packet (1=yes, 0=no)
  4     UDPCS   UDP Checksum calculated in packet (1=yes, 0=no)
  3     VP      VLAN Packet match (1=yes, 0=no)
  2     IXSM    Ignore Checksum Indications (1=yes, 0=no)
  1     EOP     End Of Packet (1=yes, 0=no) shows if this packet is logically last
  0     DD      Descriptor Done (1=yes, 0=no) shows if nic is finished with descriptor
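In code, the two bits a receive routine cares about most are DD and EOP. A minimal sketch of testing them follows; the RXD_STAT_* macros and both function names are hypothetical, and the bit positions are the ones listed above.

```c
#include <stdint.h>

#define RXD_STAT_DD   0x01   /* bit 0: Descriptor Done */
#define RXD_STAT_EOP  0x02   /* bit 1: End Of Packet */

/* Nonzero once the nic has finished with this descriptor. */
static int rx_desc_done(uint8_t desc_stat)
{
    return (desc_stat & RXD_STAT_DD) != 0;
}

/* Nonzero when the descriptor is done AND holds the end of a packet. */
static int rx_packet_complete(uint8_t desc_stat)
{
    uint8_t both = RXD_STAT_DD | RXD_STAT_EOP;

    return (desc_stat & both) == both;
}
```

A driver would typically poll or be interrupted, then test the status byte of the descriptor at its current ring position before touching the packet buffer.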
RxDesc Error-field
  Bit   Field      Meaning
  7     RXE        Received-data Error (1=yes, 0=no)
  6     IPE        IPv4-checksum error
  5     TCPE       TCP/UDP checksum error (1=yes, 0=no)
  4     reserved   =0
  3     reserved   =0
  2     SEQ        Sequence error (1=yes, 0=no)
  1     SE         Symbol Error (1=yes, 0=no)
  0     CE         CRC Error or alignment error (1=yes, 0=no)
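For debugging output it can help to turn the error byte into readable names. The table-driven sketch below is one possible way to do that; ‘rx_err_bits’ and ‘rx_error_names()’ are hypothetical names, and the masks follow the bit positions above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One entry per defined error bit, highest bit first. */
static const struct { uint8_t mask; const char *name; } rx_err_bits[] = {
    { 0x80, "RXE" }, { 0x40, "IPE" }, { 0x20, "TCPE" },
    { 0x04, "SEQ" }, { 0x02, "SE"  }, { 0x01, "CE"   },
};

/* Writes a space-separated list of the set error flags into buf
 * (truncating if buf is too small) and returns how many flags are set. */
static int rx_error_names(uint8_t errs, char *buf, size_t len)
{
    size_t used = 0;
    size_t i;
    int count = 0;

    if (len > 0)
        buf[0] = '\0';
    for (i = 0; i < sizeof rx_err_bits / sizeof rx_err_bits[0]; i++) {
        if (!(errs & rx_err_bits[i].mask))
            continue;
        count++;
        if (used < len) {
            int n = snprintf(buf + used, len - used, "%s%s",
                             used ? " " : "", rx_err_bits[i].name);
            if (n > 0)
                used += (size_t)n;
        }
    }
    return count;
}
```

The reserved bits 4 and 3 are deliberately absent from the table, so they are ignored if they ever read back as nonzero.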
Network Administration • Some higher-level networking protocols require the Operating System to set up a translation between the ‘hostname’ for a workstation and the hardware-address of its Network Interface Controller • One mechanism for doing this is creation of a specially-named textfile (‘/etc/ethers’) that provides a database for these translations
In-class exercise #1
• We put a file named ‘ethers’ on our course website that offers a template for defining the translation database that software can consult on our ‘anchor’ cluster’s LAN
• One of the eight workstations’ entries has been filled in already:
    00:30:48:8A:30:03 anchor00.cs.usfca.edu
• Can you complete this database by adding the MAC addresses for the other 7 machines?
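To check entries as you add them, a line in this format can be parsed with sscanf(). The sketch below assumes each line is a colon-separated MAC address followed by whitespace and a hostname, as in the filled-in entry; ‘parse_ethers_line()’ is a hypothetical helper, not part of any system library.

```c
#include <stdio.h>

/* Parses one line such as "00:30:48:8A:30:03 anchor00.cs.usfca.edu".
 * Returns 1 on success and fills in mac[] and host[]; returns 0 otherwise. */
static int parse_ethers_line(const char *line, unsigned char mac[6],
                             char host[64])
{
    unsigned int b[6];
    int i;

    /* %2x reads up to two hex digits; %63s leaves room for the terminator */
    if (sscanf(line, "%2x:%2x:%2x:%2x:%2x:%2x %63s",
               &b[0], &b[1], &b[2], &b[3], &b[4], &b[5], host) != 7)
        return 0;
    for (i = 0; i < 6; i++)
        mac[i] = (unsigned char)b[i];
    return 1;
}
```

A line that does not begin with six hex fields is rejected, which catches the most common typing mistakes.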
Our ‘seereset.c’ demo • We created this LKM to demonstrate the sequence of ‘state-changes’ that three of our network controller’s registers undergo in response to initiating a ‘reset’ operation • The programming technique used here is one which we think could be useful in lots of other hardware programming situations where a vendor’s manual may not answer all our questions about how devices work
In-class exercise #2
• Try redirecting the output from this ‘cat’ command to a file, like this:
    $ cat /proc/seereset > seereset.out
• Then edit this textfile, adding a comment to each line which indicates the bit(s) that experienced a ‘change-of-state’ from the line that came before it (thereby providing yourself with a running commentary as to how the NIC proceeds through a ‘reset’)
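If you want to double-check your comments, the bits that differ between two successive register values are exactly the 1-bits of their exclusive-OR. Here is a minimal sketch of that check; ‘list_changed_bits()’ is a hypothetical helper, and it works on two 32-bit values you supply yourself rather than on the demo’s output format, which is not assumed here.

```c
#include <stdint.h>

/* Stores the numbers of the bits that differ between 'before' and 'after'
 * into bits[] (highest bit first) and returns how many bits changed. */
static int list_changed_bits(uint32_t before, uint32_t after, int bits[32])
{
    uint32_t diff = before ^ after;   /* 1 wherever the two values differ */
    int count = 0;
    int bit;

    for (bit = 31; bit >= 0; bit--)
        if (diff & (1u << bit))
            bits[count++] = bit;
    return count;
}
```

Comparing each line’s value with the one before it in this way gives the list of bit-numbers to mention in that line’s comment.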