180 likes | 202 Views
Discover how to fix several design flaws in a software driver, including clear-on-read registers, buffer overflow, lost data, overwritten packets, and filled ring buffers. Solutions are provided to address each problem.
E N D
Fixing some driver problems Most software is discovered to have some ‘design-flaws’ after it has been put into use for awhile
Problem 1 • Statistics registers are ‘clear-on-read’, but ‘get_info()’ function can be entered twice, so if a statistics register is re-read upon a second entry to this function, a zero-value ‘overwrites’ the original statistic-value int my_get_info( char *buf, char **start, off_t off, int count ) { int n_xmit_packets = ioread32( io + E1000_TPT ); int len = 0; len = sprintf( buf+len, “ packets sent = %d \n”, n_xmit_packets );
Fix for problem 1 • We can declare any variables that will be used to store statistics to be ‘static’, then add any new inputs to their prior values int my_get_info( char *buf, char **start, off_t off, int count ) { static int n_xmit_packets = 0; int len = 0; n_xmit_packets += ioread32( io + E1000_TPT ); len = sprintf( buf+len, “ packets sent = %d \n”, n_xmit_packets );
Problem 2 • Our ‘nic.c’ device-driver programmed the hardware to use 2KB buffers for received packets, but our software only allocated 1536-bytes for each of our receive buffers, so if an oversized packet gets transmitted to our NIC, it can cause ‘buffer overflow’ (i.e., possible data-corruption – or even a system-crash!) data-corruption A 2048-byte packet can “overflow” a 1536-byte buffer system-crash
Fix for problem 2 • Insure our driver programs the software in a manner consistent with its programming of the Network Interface hardware! 30 27 25 17 16 RCTL: BSIZE (Buffer-Size) BSEX (Buffer-Size Extension bit) FLEXBUF (size of all receive-buffers in KB, if nonzero) Possible sizes for hardware receive-buffers: If FLEXBUF=0 and BSEX=0: 2048-bytes, 1024-bytes, 512-bytes, 256-bytes If FLEXBUF=0 and BSEX=1: 32768-bytes, 16384-bytes. 8192-bytes, 4096-bytes Othewise: 15K, 14K, 13K, 12K, 11K, 10K, 9K, 8K, 7K, 6K, 5K, 4K, 3K, 2K, 1K
Problem 3 • If an application tries to ‘read’ fewer bytes than are contained in a received packet, the extra bytes get discarded (i.e., ‘lost’), which is NOT how a character-device is supposed to work! excess bytes never do get returned to the application User-space: application-program’s buffer copy_to_user() Kernel-space: device-driver’s packet-buffer
Fix for problem 3 • We have introduced a new static variable (called ‘pickup’) that keeps track of where the ‘extra’ data begins -- so next time the application tries to ‘read()’, our driver can ‘pick up’ from where it had left off before ssize_t my_read( struct file *file, char *buf, size_t len, loff_t *pos ) { static int rxhead = 0; // the current rxring[] array-index static int pickup = 0; // count of bytes returned so far copy_to_user( buf, cp+pickup, len ); pickup += len; if ( pickup >= rxring[ rxhead ].packet_length ) { rxhead = (1 + rxhead) % N_RX_DESC; pickup = 0; }
Problem 4 • A program might call our driver’s ‘write()’ procedure more rapidly than the hardware can transmit previously written packets, so ‘old’ packets not yet sent get ‘overwritten’ by ‘new’ packets – thus data gets ‘lost’! TDH txring: TDT NOTE: the NIC ‘stalls’ when TDT == TDH User’s next packet goes here, but the hardware still isn’t ‘done’ with transmitting previous data put there
Fix for problem 4 • We created an additional ‘wait-queue’ so our driver’s ‘write()’ routine can put a task to sleep until the hardware is ‘done’ with the descriptor indexed by the TDT value wait_quete_head_t wq_xmit; ssize_t my_write( struct file *file, const char *buf, size_t len, loff_t *pos ) { int txtail = ioread32( io + E1000_TDT ); if ( txring[ txtail ].desc_status == 0 ) wait_event_interruptible( wq_xmit, txring[ txtail ].desc_status );
Our ISR’s modification • Our driver’s Interrupt Service Routine has the duty of ‘awakening’ a sleeping writer when the NIC generates a TXDW interrupt (i.e., for Transmit-Descriptor Writeback) irqreturn_t my_isr( int irq, void *dev_id ) { int intr_cause = ioread32( io + E1000_ICR ); if ( intr_cause & (1<<0) // TXDW has occurred wake_up_interruptible( &wq_xmit );
Problem 5 • The hardware might receive packets at a faster rate than the application program desires to read them – causing our ring-buffer to ‘fill up’ with newer data before being adequately drained of its older data RDH rxring: RDT Because we allowed all of our receive-buffers to be ‘owned’ by the network controller, it continues round-and-round receiving everything!
Fix for problem 5 • We only grant ‘ownership’ to some of the receive-buffers at any given time – but we arrange for our interrupt-handler to adjust the RDT value dynamically whenever the the number owned by the NIC falls below a certain threshold (signaled by RXDMT0) 9 8 RCTL: RDMTS RDMTS (Receive Descriptors Minimum Threshold Size): 00=one-half, 01=one-fourth, 10=one-eighth. 11=one-sixteenth
Our ISR’s modification • Our driver’s Interrupt Service Routine has the duty of advancing the RDT register’s value whenever the ‘Minimum Threshold Reached’ event is signaled irqreturn_t my_isr( int irq, void *dev_id ) { int intr_cause = ioread32( io + E1000_ICR ); if ( intr_cause & (1<<4) // RXDMT0 has occurred { int rxtail = ioread32( io + E1000_RDT ); rxtail = (8 + rxtail) % N_RX_DESC; iowrite32( rxtail, io + E1000_RDT ); }
Discussion question • Should our ‘fix’ for problem 5 be modified to employ the controller’s ‘flow control’? • What will happen if an application program stops reading, but the NIC’s link-partner keep on sending out more data? • Does this suggest a use drivers can make of the SWXOFF-bit in register TCTL? 22 TCTL: SW XOFF SWXOFF (Software XOFF) writing ‘1’ causes NIC to send a PAUSE frame
Problem 6 • When we all are doing development of device-drivers for the 82573L controller using our ‘anchor-cluster’ network, any broadcast-packets sent by one driver cause interference with others’ work switched hub
Fix for problem 6 • We implemented VLAN capabilities in our ‘nic2.c’ revised character-mode driver, so students can employ VLAN identification-numbers in their outgoing packets that will cause those packets to be ‘filtered out’ by receiving drivers with different VLAN tags switched hub VLAN 1 VLAN 1 VLAN 2 VLAN 3 VLAN 4
Modification to IOCTL • We needed a convenient way to let user-programs change their driver’s VLAN tag int my_ioctl( struct inode *inode, struct file *file, unsigned int request, unsigned long address ) { switch ( request ) { case 0: // SET destination MAC-address case 1: // GET destination MAC-address case 2: // SET a revised VLAN identifier case 3: // GET the current VLAN identifier }
In-class exercise • Try using our ‘joinvlan.cpp’ demo-program which lets a user change the current VLAN identification in the running ‘nic2.c’ driver • How shall we arrange a scheme for every student to have their own unique VLAN id?