NIC, Kernel Timer and Block Devices. Dr A Sahu Dept of Comp Sc & Engg . IIT Guwahati. Outline. NIC Cards Registers TX/RX, Statistics Counter Network Device Driver (Skeleton) Kernel Counter Jiffies, RTC, kernel timer File System, Block Devices An introduction. Announcement.
NIC, Kernel Timer and Block Devices Dr A Sahu Dept of Comp Sc & Engg. IIT Guwahati
Outline • NIC Cards • Registers TX/RX, Statistics Counter • Network Device Driver (Skeleton) • Kernel Counter • Jiffies, RTC, kernel timer • File System, Block Devices • An introduction
Announcement • There will not be any class in DIWALI week • Wishing you a happy and safe Diwali • Assignment will be uploaded to course website before this week end. • Deadline of assignment 13 Nov 2010 • Assignment will carry 5 marks • You have to show the demo on your lab machine • No Demo, No marks • You will not get any marks by simply submitting assignment
Registers’ Names • Memory-information registers • TDBA(L/H) = Transmit-Descriptor Base-Address Low/High (64-bits) • TDLEN = Transmit-Descriptor array Length • TDH = Transmit-Descriptor Head • TDT = Transmit-Descriptor Tail • Transmit-engine control registers • TXDCTL = Transmit-Descriptor Control Register • TCTL = Transmit Control Register • Notification timing registers • TIDV = Transmit Interrupt Delay Value • TADV = Transmit-interrupt Absolute Delay Value
Registers’ Names • Memory-information registers • RDBA(L/H) = Receive-Descriptor Base-Address Low/High (64-bits) • RDLEN = Receive-Descriptor array Length • RDH = Receive-Descriptor Head • RDT = Receive-Descriptor Tail • Receive-engine control registers • RXDCTL = Receive-Descriptor Control Register • RCTL = Receive Control Register • Notification timing registers • RDTR = Receive-interrupt packet Delay Timer • RADV = Receive-interrupt Absolute Delay Value
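The head and tail registers listed above are indices into a circular descriptor array. Below is a minimal user-space sketch of the index arithmetic, assuming the usual convention that software advances the tail, the controller advances the head, and the ring is empty when the two are equal. The names `ring_pending` and `ring_advance` are hypothetical helpers, not driver or hardware API.

```c
#include <assert.h>
#include <stdint.h>

/* Number of descriptors handed to the hardware but not yet processed:
 * the distance from head to tail, modulo the ring size. */
static uint32_t ring_pending(uint32_t head, uint32_t tail, uint32_t count)
{
    assert(count > 0 && head < count && tail < count);
    return (tail + count - head) % count;
}

/* Next index after idx, wrapping at the end of the ring. */
static uint32_t ring_advance(uint32_t idx, uint32_t count)
{
    assert(count > 0 && idx < count);
    return (idx + 1) % count;
}
```

With an 8-entry ring, head 6 and tail 2 leave four descriptors (6, 7, 0, 1) still owned by the controller.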
Statistics registers • The 82573L has several dozen statistical counters which automatically operate to keep track of significant events affecting the ethernet controller’s performance • Most are 32-bit ‘read-only’ registers, and they are automatically cleared when read • Your module’s initialization routine could read them all (to start counting from zero)
Initializing the NIC’s counters • The statistical counters all have address-offsets in the range 0x04000 – 0x04FFF • You can use a very simple program loop to ‘clear’ each of these read-only registers
// Here ‘io’ is the virtual base address of the NIC’s I/O-memory region
{
    int r;
    // clear all of the Pro/1000 controller’s statistical counters
    for (r = 0x4000; r < 0x4FFF; r += 4)
        ioread32(io + r);
}
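The clear-on-read behaviour can be tried without hardware. The following is a user-space simulation only: `fake_stats`, `fake_ioread32` and `clear_all_counters` are hypothetical stand-ins for the device memory and for `ioread32()`, and the offsets are the ones quoted on the slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STATS_FIRST 0x4000u   /* first statistics offset (from the slide) */
#define STATS_LAST  0x4FFFu   /* last byte of the statistics range */
#define STATS_WORDS ((STATS_LAST + 1u - STATS_FIRST) / 4u)

/* Stand-in for the device: one 32-bit word per statistics register. */
static uint32_t fake_stats[STATS_WORDS];

/* Simulated clear-on-read access: return the count, then zero it. */
static uint32_t fake_ioread32(uint32_t offset)
{
    assert(offset >= STATS_FIRST && offset <= STATS_LAST && offset % 4u == 0);
    size_t i = (offset - STATS_FIRST) / 4u;
    uint32_t value = fake_stats[i];
    fake_stats[i] = 0;
    return value;
}

/* Read every counter once so that later reads count from zero. */
static void clear_all_counters(void)
{
    for (uint32_t r = STATS_FIRST; r <= STATS_LAST; r += 4u)
        (void)fake_ioread32(r);
}
```

Reading a counter twice in a row returns its value and then zero, which is why a single pass over the range is enough to reset every counter.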
A few ‘counter’ examples
0x4000  CRCERRS  CRC Errors Count
0x400C  RXERRC   Receive Error Count
0x4014  SCC      Single Collision Count
0x4018  ECOL     Excessive Collision Count
0x4074  GPRC     Good Packets Received
0x4078  BPRC     Broadcast Packets Received
0x407C  MPRC     Multicast Packets Received
0x40D0  TPR      Total Packets Received
0x40D4  TPT      Total Packets Transmitted
0x40F0  MPTC     Multicast Packets Transmitted
0x40F4  BPTC     Broadcast Packets Transmitted
Connecting to kernel: Device Registration • loopback.c, plip.c, and e100.c in drivers/net/ are examples of network drivers • Device registration: • Allocate the net devices (request resources and offer facilities) • struct net_device *snull_dev[2]; // declared in <linux/netdevice.h> • snull_dev[0] = alloc_netdev(sizeof(struct snull_priv), "sn%d", snull_init); • alloc_etherdev(int sizeof_priv); // wrapper around alloc_netdev • After initialization is complete, register the devices • register_netdev(snull_dev[i]); // returns 0 on success, a negative error code on failure
Private data • struct snull_priv *priv = netdev_priv(dev);
struct snull_priv {
    struct net_device_stats stats;
    int status;
    struct snull_packet *ppool;
    struct snull_packet *rx_queue;
    int rx_enabled, tx_packetlen;
    u8 *tx_packetdata;
    struct sk_buff *skb;
    spinlock_t lock;
};
• Initialization
priv = netdev_priv(dev);
memset(priv, 0, sizeof(struct snull_priv));
spin_lock_init(&priv->lock);
snull_rx_ints(dev, 1); // enable receive interrupts
net_device Structure • Global information • name: name of the device • state: state of the device • struct net_device *next; // pointer to the next device in the global list • init: an initialization function called by register_netdev() • Hardware information • Interface information • Device methods
net_device Structure: Hardware info • Low-level hardware information • base_addr: I/O base address of the network interface • unsigned char irq: dev->irq, the assigned interrupt number (shown by ifconfig) • unsigned char if_port: the port in use on a multiport device (e.g. 10base) • unsigned char dma; // DMA channel allocated by the device (ISA bus) • Device memory information: addresses of the shared memory used by the device • rmem (rx memory), mem (tx memory) • rmem_start, rmem_end, mem_start, mem_end
net_device: Interface information • The init function sets up most of this information, but device-specific fields need to be set up later • Non-Ethernet interfaces can use helper functions • fc_setup, ltalk_setup, fddi_setup • Fibre Channel, LocalTalk, Fiber Distributed Data Interface, Token Ring, High-Performance Parallel Interface (hippi_setup) • Non-default interface fields • hard_header_len, mtu (maximum transfer unit = 1500 octets), tx_queue_len (Ethernet = 1000, plip = 10), unsigned short type, unsigned char addr_len, unsigned char dev_addr[MAX_ADDR_LEN], broadcast[MAX_ADDR_LEN] • flags bit set: mask bits such as loopback, debug, noarp, multicast • Special hardware capabilities the device has: DMA
net_device: Device methods • Fundamental methods • open, stop, hard_start_xmit • hard_header, rebuild_header • tx_timeout, get_stats, set_config • Optional methods • poll, poll_controller, do_ioctl, set_multicast_list • set_mac_address, change_mtu, header_cache, header_cache_update, hard_header_parse • Utility fields (not methods) • trans_start, last_rx, watchdog_timeo, *priv, mc_list, mc_count, xmit_lock, xmit_lock_owner
Kernel Timer • PIT • Jiffies: a global timing counter variable • User-space timing • Timer interrupt ISR • do_timer()
Timing and Timers • Accurate timing crucial for many aspects of OS • Device-related timeouts • File timestamps (created, accessed, written) • Time-of-day (gettimeofday()) • High-precision timers (code profiling, etc.) • Scheduling, CPU usage, etc. • Intel timer hardware • RTC: Real Time Clock • PIT: Programmable Interval Timer • TSC: TimeStamp Counter (cycle counter) • Local APIC Timer: per-CPU alarms • Timer implementations • Kernel timers (dynamic timers) • User “interval” timers (alarm(), setitimer())
“What time is it?” • Need timing measurements to: • Keep track of current time and date for use by e.g. gettimeofday(). • Maintain timers that notify the kernel or a user program that an interval of time has elapsed. • Timing measurements are performed by several hardware circuits, based on fixed frequency oscillators and counters.
How the Kernel keeps time • Kernel keeps time by reading a clock device (oscillator) and maintaining a kernel variable with the current time • Current time accessible to user-mode programs via system calls • gettimeofday() is the usual interface to the current time maintained by system. • Same is also used to determine when the currently running process should be removed from CPU to let others run • Also used to keep track of the amount of time a process runs in user or supervisor mode!
How a user-space program can read the system time
#include <sys/time.h>
struct timeval theTime;
gettimeofday(&theTime, NULL);
// Definition of struct timeval:
struct timeval { long tv_sec; long tv_usec; };
The date command: this command gives the time according to the Gregorian (modern Christian) calendar.
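A common use of `gettimeofday()` is to sample the clock twice and subtract. The sketch below is one way to do that in user space; `timeval_diff_usec` and `now_usec` are hypothetical helper names, and the `_DEFAULT_SOURCE` define is an assumption about building with glibc in a strict standard mode.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Microseconds from *start to *end (negative if end is earlier). */
static long long timeval_diff_usec(const struct timeval *start,
                                   const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Current wall-clock time as microseconds since the epoch. */
static long long now_usec(void)
{
    struct timeval tv;
    int rc = gettimeofday(&tv, NULL);
    assert(rc == 0);
    (void)rc;
    return (long long)tv.tv_sec * 1000000LL + (long long)tv.tv_usec;
}
```

Because `tv_usec` restarts at zero every second, the subtraction has to combine both fields; subtracting only `tv_usec` gives a wrong answer whenever the two samples straddle a second boundary.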
Mechanics of keeping time • The clock ISR • timer_interrupt() in file arch/i386/kernel/time.c calls • do_timer() function in file kernel/sched.c • Increments a counter in the kernel variable called jiffies each time the function (do_timer()) runs. • do_timer() then marks TIMER_BH (bottom-half) for execution in ret_from_sys_call • For the system time, the timer bottom half uses the current value of the kernel variable jiffies to compute the current time. It stores the value in struct timeval xtime, which can be read by kernel functions • sys_gettimeofday()
Hardware Clocks • Real-Time Clock (RTC): • Often integrated with CMOS RAM on separate chip from CPU: e.g., Motorola 146818. • Issues periodic interrupts on IRQ line (IRQ 8) at programmed frequency (e.g., 2-8192 Hz). • In Linux, used to derive time and date. • Kernel accesses RTC through 0x70 and 0x71 I/O ports.
Timestamp Counter (TSC) • Intel Pentium (and up), AMD K6 etc. incorporate a TSC. • Processor’s CLK pin receives a signal from an external oscillator, e.g., a 400 MHz crystal. • TSC register is incremented at each clock signal. • The rdtsc assembly instruction returns the 64-bit timing value. • Most accurate timing method on the above platforms.
The “PIT”s • Programmable Interval Timers (PITs): • e.g., 8254 chip. Already discussed • PIT issues timer interrupts at programmed frequency. • In Linux, PC-based 8254 is programmed to interrupt HZ (=100) times per second on IRQ 0. • HZ is defined in <linux/param.h> • PIT is accessed on ports 0x40-0x43. • Provides the system “heartbeat” or “clock tick”.
jiffies • unsigned long volatile jiffies; • global kernel variable (used by scheduler) • initialized to zero when system reboots • gets incremented during a timer interrupt • so it counts ‘clock-ticks’ since cpu restart • ‘tick-frequency’ is a ‘configuration’ option
jiffies overflow • Won’t overflow for at least 16 months • Linux kernel got modified to ‘fix’ overflow • Now the declaration is in ‘linux/jiffies.h’:
unsigned long long jiffies_64;
and a new instruction in ‘do_timer()’:
(*(u64*)&jiffies_64)++;
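The 16-month figure follows from a 32-bit counter ticking 100 times per second: 2^32 / 100 seconds is about 497 days. The user-space sketch below reproduces that arithmetic and shows a wrap-safe comparison in the style of the kernel's `time_after()` macro. `days_until_wrap` and `ticks_after` are hypothetical names, and the comparison assumes the usual two's-complement conversion to a signed type.

```c
#include <assert.h>
#include <stdint.h>

/* Days until a counter of the given width wraps when it is
 * incremented hz times per second. */
static double days_until_wrap(unsigned bits, unsigned hz)
{
    assert(bits >= 1 && bits <= 64 && hz > 0);
    double ticks = 1.0;
    for (unsigned i = 0; i < bits; i++)
        ticks *= 2.0;                     /* 2^bits ticks per wrap */
    return ticks / hz / 86400.0;
}

/* Wrap-safe "a is later than b" for a 32-bit tick counter: the
 * difference is interpreted as signed, so the test stays correct
 * as long as the two values are less than 2^31 ticks apart. */
static int ticks_after(uint32_t a, uint32_t b)
{
    return (int32_t)(b - a) < 0;
}
```

A tick value of 5 taken just after the wrap still compares as later than 0xFFFFFFF0 taken just before it, which is why code should compare tick values this way rather than with a plain `>`.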
“This’ll only take a jiffy” • jiffies is incremented every timer interrupt. • Number of clock ticks since OS was booted. • Scheduling and preemption done at granularities of time-slices calculated in units of jiffies.
Timer Interrupt Handler • Every timer interrupt: • Update jiffies. • Determine how long a process has been executing and preempt it, if it finishes its allocated timeslice. • Update resource usage statistics. • Invoke functions for elapsed interval timers.
PIT Interrupt Service Routine • Signal on IRQ 0 is generated: • timer_interrupt() is invoked w/ interrupts disabled (SA_INTERRUPT flag is set to denote this). • do_timer() is ultimately executed: • Simply increments jiffies & allocates other tasks to “bottom half handlers”. • Bottom half (bh) handlers update time and date, statistics, execute fns after specific elapsed intervals and invoke schedule() if necessary, for rescheduling processes.
Updating Time and Date • lost_ticks (lost_ticks_system) store total (system) “ticks” since update to xtime, which stores approximate current time. This is needed since bh handlers run at convenient time and we need to keep track of when exactly they run to accurately update date & time. • TIMER_BH refers to the queue of bottom halves invoked as a consequence of do_timer().
Kernel timer syntax • Declare a timer: struct timer_list mytimer; • Initialize this timer:
init_timer(&mytimer);
mytimer.function = mytimeraction;
mytimer.data = (unsigned long)mydata;
mytimer.expires = <jiffies value at which to fire>; // e.g. jiffies + HZ
• Install this timer: add_timer(&mytimer); • Modify this timer: mod_timer(&mytimer, <jifs>); • Delete this timer: del_timer(&mytimer); • Delete it safely: del_timer_sync(&mytimer);
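The life cycle above (fill in expiry, callback and argument; add; fire once; delete) can be modelled in user space. This is a simulation under stated assumptions, not kernel code: `struct sim_timer`, `sim_add_timer`, `sim_del_timer`, `sim_run_timers` and `sim_record` are all hypothetical names, and the tick value is passed in by the caller instead of being read from jiffies.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TIMERS 8

/* Stand-in for the timer fields set on the slide. */
struct sim_timer {
    unsigned long expires;               /* tick at which to fire */
    void (*function)(unsigned long);     /* callback */
    unsigned long data;                  /* argument passed to callback */
};

static struct sim_timer *pending[MAX_TIMERS];
static unsigned long sim_last_data;      /* written by the sample callback */

/* Sample callback: records the argument it was called with. */
static void sim_record(unsigned long data) { sim_last_data = data; }

/* Analogue of add_timer(): remember the timer until it expires. */
static int sim_add_timer(struct sim_timer *t)
{
    assert(t != NULL && t->function != NULL);
    for (size_t i = 0; i < MAX_TIMERS; i++) {
        if (pending[i] == NULL) { pending[i] = t; return 0; }
    }
    return -1;                           /* table full */
}

/* Analogue of del_timer(): returns 1 if the timer was still pending. */
static int sim_del_timer(struct sim_timer *t)
{
    for (size_t i = 0; i < MAX_TIMERS; i++) {
        if (pending[i] == t) { pending[i] = NULL; return 1; }
    }
    return 0;
}

/* Called once per simulated tick: fire and remove expired timers. */
static int sim_run_timers(unsigned long now)
{
    int fired = 0;
    for (size_t i = 0; i < MAX_TIMERS; i++) {
        struct sim_timer *t = pending[i];
        if (t != NULL && (long)(now - t->expires) >= 0) {
            pending[i] = NULL;           /* one-shot: removed before the call */
            t->function(t->data);
            fired++;
        }
    }
    return fired;
}
```

The timer is removed before its callback runs, so deleting it afterwards reports that it was no longer pending; a callback that wants to repeat would have to add the timer again.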
Hardware Clocks (Intel) • RTC • battery backed (packaged with CMOS RAM) • registers to access current date/time (ports 0x70, 0x71) • includes programmable timer (2-8192 Hz) • accessible as /dev/rtc • sampled by kernel (only) on startup • set by “clock” command (synched at shutdown) • TSC time stamp (MSR: model-specific register) • 64-bit counter increments at CPU cycle speed • accessible via user-space assembly instruction rdtsc • provides high-resolution timing capability • kernel determines frequency at boot (calibrate_tsc())
Hardware Clocks (Intel) • PIT • heartbeat timer; drives timer interrupt (tick) • 100 Hz on PC; 1024 Hz on fast chips (alpha, itanium) • patches to change clock speed via /proc! • jiffies: # of ticks since boot • xtime: struct with secs, usecs since Jan 1, 1970 (“epoch”) • CPU Local (APIC) Timers • when available does per-cpu timing (e.g. quantum) • if not available, driven by PIT • 32 bit (instead of PIT 16 bit) so lower frequency possible • decrements in multiples of bus cycles (1, 2, 4, 8, .. 128)
Updating Date/Time • xtime.tv_sec, xtime.tv_usec • seconds since Jan 1, 1970 • update_times() • wall_jiffies: time of last xtime update • update_wall_time(ticks) // handles usec wrap • calc_load(ticks) // load average
void update_times(void)
{
    unsigned long ticks;

    write_lock_irq(&xtime_lock);
    ticks = jiffies - wall_jiffies;
    if (ticks) {
        wall_jiffies += ticks;
        update_wall_time(ticks);
    }
    write_unlock_irq(&xtime_lock);
    calc_load(ticks);
}
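The bookkeeping in update_times() can be exercised in user space. The sketch below is a simplified model with no locking and no load average: it advances a seconds/microseconds pair by the ticks that have accumulated since the last update and carries microseconds into seconds. `struct sim_time`, `sim_update_wall_time` and `sim_update_times` are hypothetical names, and it assumes HZ divides one million evenly.

```c
#include <assert.h>

struct sim_time { long tv_sec; long tv_usec; };

/* Advance *t by `ticks` timer ticks of 1/hz seconds each, carrying
 * microseconds into seconds when tv_usec reaches one million. */
static void sim_update_wall_time(struct sim_time *t, unsigned long ticks,
                                 unsigned long hz)
{
    assert(hz > 0 && 1000000UL % hz == 0);
    unsigned long usec_per_tick = 1000000UL / hz;
    while (ticks--) {
        t->tv_usec += (long)usec_per_tick;
        if (t->tv_usec >= 1000000L) {
            t->tv_usec -= 1000000L;
            t->tv_sec++;
        }
    }
}

/* Apply the ticks accumulated since the last update and move
 * *wall_jiffies forward so they are not counted twice. */
static unsigned long sim_update_times(struct sim_time *t,
                                      unsigned long jiffies,
                                      unsigned long *wall_jiffies,
                                      unsigned long hz)
{
    unsigned long ticks = jiffies - *wall_jiffies;
    if (ticks) {
        *wall_jiffies += ticks;
        sim_update_wall_time(t, ticks, hz);
    }
    return ticks;
}
```

Starting from 5 s and 990000 µs at HZ = 100, three pending ticks add 30000 µs and roll the time over to 6 s and 20000 µs; a second call with the same jiffies value changes nothing.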
Updating System Statistics • checking cpu resource limits • update user and kernel mode ticks for times() • per_cpu_utime, per_cpu_stime • over cpu limit? send SIGXCPU, SIGKILL • updating system load averages <1.0 is good • average tasks in run queue last 1, 5, 15 minutes • includes UNINTERRUPTIBLE (but not pid 0) • kernel profiling • samples eip on each interrupt • activated by kernel option profile= • results exported via /proc/profile (readprofile command) • NMI watchdogs (detecting system freeze) • clever use of APIC to detect freezes (failure to re-enable interrupts) • broadcast NMI periodically, check for increasing interrupt count!
System Calls • gettimeofday(): sec, usec • delay since last bottom half (xtime update) • delay since last interrupt (jiffies update) • samples TSC if available for high-precision • settimeofday(): update xtime (not RTC!) requires root • adjtimex(): gradual clock time change • alarm(), setitimer() • user mode interval timers • three different timers
File System & Block Devices • Block Devices (Disk) • Sector, inode • File systems (Operations) • Read/write, open,close, lseek, type
What is the VFS? • Component in the kernel that handles file systems, directory and file access. • Abstracts common tasks of many file systems. • Presents the user with a unified interface, via the file-related system calls (open, stat, chmod etc.). • Filesystem-specific operations: the VFS vectors them to the filesystem in charge of the file.
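"Vectoring" an operation to the filesystem in charge is typically done with a table of function pointers attached to each open file. The sketch below is a user-space analogy only: `struct sim_file_ops`, `struct sim_file`, `sim_vfs_read` and the two toy filesystems are hypothetical and far simpler than the kernel's real structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Per-filesystem operation table (a tiny analogue of file_operations). */
struct sim_file_ops {
    const char *fs_name;
    long (*read)(char *buf, size_t len);
};

struct sim_file {
    const struct sim_file_ops *ops;      /* filesystem in charge of the file */
};

/* Two toy "filesystems" that fill the buffer differently. */
static long simfs_a_read(char *buf, size_t len)
{
    memset(buf, 'a', len);
    return (long)len;
}

static long simfs_b_read(char *buf, size_t len)
{
    memset(buf, 'b', len);
    return (long)len;
}

static const struct sim_file_ops simfs_a_ops = { "simfs_a", simfs_a_read };
static const struct sim_file_ops simfs_b_ops = { "simfs_b", simfs_b_read };

/* The unified entry point: callers never name a filesystem; the call
 * is vectored through the table attached to the file. */
static long sim_vfs_read(struct sim_file *f, char *buf, size_t len)
{
    assert(f != NULL && f->ops != NULL);
    if (f->ops->read == NULL)
        return -1;                       /* operation not supported */
    return f->ops->read(buf, len);
}
```

The caller uses the same `sim_vfs_read()` call for both files; which implementation runs depends only on the table the file points to.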
Mounting a device • $ mount -t iso9660 -o ro /dev/cdrom /mnt/cdrom • Steps involved: • Find the file system (in the file_systems list). • Find the VFS inode of the directory that is to be the new file system's mount point. • Allocate a VFS superblock and call the file-system-specific read_super function.
Block Device Specific Operations • Operations for block devices • In include/linux/fs.h:
struct block_device_operations {
    int (*open) (struct inode *, struct file *);
    int (*release) (struct inode *, struct file *);
    int (*ioctl) (struct inode *, struct file *, unsigned, unsigned long);
    int (*check_media_change) (kdev_t);
    int (*revalidate) (kdev_t);
};
• In include/linux/blkdev.h:
typedef void (request_fn_proc) (request_queue_t *q);
Generic Block Device Layer • Provides common functionality for all block devices in Linux • Uniform interface (to file system) e.g. bread( ) block_prepare_write( ) block_read_full_page( ), ll_rw_block( ) // low level • buffer management and disk caching • Block I/O requests scheduling • Generates and queues actual I/O requests in a request queue (per device) • Individual device driver services this queue (likely interrupt driven)
Invoking the Lower Layer • Generic block device layer • Generates and queues I/O requests • If the request queue is initially empty, schedule a plug_tq tasklet into the tq_disk task queue • Asynchronous run of task queue tq_disk • Run in a few places (e.g., in kswapd) • Take a request from the queue and call the request_fn function: • q->request_fn(q);
Request Service Routine • To service all I/O requests in the queue • Typical interrupt-driven procedure • Service the first request in the queue • Set up hardware so it raises interrupt when it is done • Return • Interrupt handler tasklet • Remove the just-finished request from the queue • Re-enter the request service routine (to service the next)
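The interrupt-driven cycle above can be walked through in user space. This is a simulation under stated assumptions: `struct sim_queue`, `sim_enqueue`, `sim_request_fn` and `sim_irq_done` are hypothetical names, and the "hardware" is a busy flag that the caller clears by invoking the handler.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_MAX 16

struct sim_queue {
    int sectors[QUEUE_MAX];   /* pending requests, oldest first */
    size_t head, count;
    int busy;                 /* 1 while the "hardware" works on a request */
    int completed;            /* number of finished requests */
};

/* Append a request; returns -1 if the queue is full. */
static int sim_enqueue(struct sim_queue *q, int sector)
{
    if (q->count == QUEUE_MAX)
        return -1;
    q->sectors[(q->head + q->count) % QUEUE_MAX] = sector;
    q->count++;
    return 0;
}

/* Request service routine: start the first request, then return. */
static void sim_request_fn(struct sim_queue *q)
{
    if (q->busy || q->count == 0)
        return;
    q->busy = 1;              /* hardware will "interrupt" when done */
}

/* Interrupt handler: retire the finished request, service the next. */
static void sim_irq_done(struct sim_queue *q)
{
    assert(q->busy && q->count > 0);
    q->head = (q->head + 1) % QUEUE_MAX;
    q->count--;
    q->completed++;
    q->busy = 0;
    sim_request_fn(q);        /* re-enter the service routine */
}
```

The service routine never loops over the queue itself; each completion interrupt pulls in the next request, so the queue drains one request per interrupt.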
Skeleton Block Device • Device operation structure:
static struct block_device_operations xxx_fops = {
    open: xxx_open,
    release: xxx_release,
    ioctl: xxx_ioctl,
    check_media_change: xxx_check_change,
    revalidate: xxx_revalidate,
    owner: THIS_MODULE,
};
After Diwali • Block device driver • 1 class (Lect 36) • Creative Sound Blaster • 1 class (Lect 37) • USB 2.0 • 2 classes (Lect 38-39) • Summary after mid-semester & question patterns • Last class (Lect 40)
Thanks • Ref: Chap 16, LDD 3e, Rubini and Corbet • Wishing you a happy Diwali