CS 6560 Operating System Design Lecture 6: Interrupts: Bottom Halves; Kernel Synchronization
References • Our textbook: Robert Love, Linux Kernel Development, 2nd edition, Novell Press, 2005. • Daniel P. Bovet and Marco Cesati, Understanding the Linux Kernel, 3rd edition, O'Reilly, 2005. (covers 2.6) • The kernel source code and its own documentation.
Plan • Chap 7: Bottom halves • Chap 8: Concurrency Issues and Principles • Chap 9: Concurrency methods in Linux 2.6
Recall: Top and Bottom Halves • Conflicting needs for interrupts: • Return quickly • Get lots of work done • Solution: Break up the interrupt processing into two parts • Top half: returns quickly, after setting up work to be done • Bottom half: scheduled by the kernel to get work done at a later time
Bottom Halves • "Bottom half" is the general term for the mechanisms that defer work generated by interrupt handlers. (It is also the old name of one specific, now-obsolete mechanism for doing this, the BH interface.)
When do Bottom Halves Run? • Two possibilities: • In interrupt context - no sleeping allowed • In process context - sleeping allowed • Each has advantages, so Linux 2.6 offers both types of bottom halves.
Process and Interrupt Contexts • Process Context: Execution of code in the kernel that is scheduled by the process scheduler. • Interrupt Context: Execution of code in the kernel that is responding to an interrupt and is not scheduled by the process scheduler.
Bottom Half Mechanisms In Linux 2.6 • The following are current mechanisms • Softirq • Tasklets • Work queues • Previous versions of the kernel had others that are now obsolete: • BH - removed in 2.5 kernel • Task queues - removed in 2.5 kernel
Softirq • Runs in interrupt context - no blocking allowed • Statically defined at compile time - limited to 32 • Available since the 2.3 version of the kernel • Handles a few special cases, such as network and SCSI disk processing; also runs tasklets • Code is mainly in kernel/softirq.c (and other places, such as include/linux/irq_cpustat.h)
Softirq Properties • A softirq is never preempted by another softirq - only by an interrupt handler. • The same softirq can run on more than one processor at a time.
Implementation of Softirqs • Data structure: softirq_action • Holds a function pointer and a data pointer • Static array of these: softirq_vec[32] • Handler functions have a simple prototype: void softirq_handler(struct softirq_action *)
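A minimal sketch of the data structure and a matching handler, as described for 2.6-era kernels (exact fields are version-dependent; my_softirq_handler is a hypothetical name):

struct softirq_action {
        void (*action)(struct softirq_action *);   /* handler to run */
        void *data;                                /* passed to the handler */
};

static struct softirq_action softirq_vec[32];      /* one slot per softirq */

void my_softirq_handler(struct softirq_action *a)
{
        /* deferred work goes here; runs in interrupt context, must not sleep */
}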
Registering Softirqs • Softirqs are enumerated at compile time. • They are registered via the open_softirq function, which simply places the handler (and its data pointer) into softirq_vec.
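As a sketch, this is how the networking subsystem registers its transmit softirq in 2.6-era kernels (three arguments: enumerated index, handler, data pointer):

/* Store net_tx_action and its data pointer in softirq_vec[NET_TX_SOFTIRQ] */
open_softirq(NET_TX_SOFTIRQ, net_tx_action, NULL);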
When do softirqs run? • When needed, a softirq is marked for running with the raise_softirq function, which turns on its bit in the softirq_pending word. • Pending softirqs are later run by the kernel with the do_softirq function, which is called: • in the return-from-hardware-interrupt code • in the ksoftirqd kernel threads • elsewhere, such as in the networking subsystem
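A sketch of raising a softirq from a top half; the handler itself runs later, when the kernel calls do_softirq():

/* In the interrupt handler: mark the softirq pending */
raise_softirq(NET_TX_SOFTIRQ);

/* Cheaper variant if local interrupts are already disabled */
raise_softirq_irqoff(NET_TX_SOFTIRQ);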
Tasklets • Tasklets are a mechanism implemented on top of softirqs. • While softirqs are reserved for certain purposes, tasklets are the recommended way to do low-overhead bottom-half processing. • They can be created and managed dynamically.
Tasklet Operations • Declaring: DECLARE_TASKLET and DECLARE_TASKLET_DISABLED • Scheduling (mark as pending): tasklet_schedule • Disabling: tasklet_disable and tasklet_disable_nosync • Enabling: tasklet_enable • Removing: tasklet_kill
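A minimal sketch of tasklet use, assuming the 2.6 API (my_tasklet and my_tasklet_handler are hypothetical names):

/* Handler runs in interrupt (softirq) context - it must not sleep */
void my_tasklet_handler(unsigned long data)
{
        /* deferred work goes here */
}

/* Statically declare the tasklet, enabled, with data = 0 */
DECLARE_TASKLET(my_tasklet, my_tasklet_handler, 0);

/* In the interrupt handler (top half): mark it pending */
tasklet_schedule(&my_tasklet);

/* On teardown: prevent rescheduling and wait for any running instance */
tasklet_kill(&my_tasklet);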
Workqueues • Workqueues run in process context and hold a list of work to be done. • Each unit of work has a function and a pointer to some data. • Each workqueue has a name, a list of work, and a set of worker threads, one per CPU; the threads are named after the workqueue. • Initially, there is one default workqueue whose worker threads are called "events"; the individual threads are named "events/0", "events/1", and so on.
Workqueue Operations • Creating workqueues: create_workqueue makes a new workqueue with the given name. • Creating work (give name, function, data): • Statically: DECLARE_WORK • Dynamically: INIT_WORK • Scheduling work: • Generically: queue_work and queue_delayed_work • Default workqueue: schedule_work and schedule_delayed_work • Flushing work: • Generically: flush_workqueue • Default workqueue: flush_scheduled_work • Cancelling work: cancel_delayed_work
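A sketch using the default "events" workqueue, assuming the original 2.6 work-queue API described in LKD (handlers took a void * argument; the API changed in later 2.6 kernels; names here are hypothetical):

/* Handler runs in process context, so it may sleep */
void my_work_handler(void *data)
{
        /* deferred work goes here */
}

/* Statically create the work: name, function, data */
DECLARE_WORK(my_work, my_work_handler, NULL);

/* Queue it on the default workqueue (run by the events/n threads) */
schedule_work(&my_work);

/* Or queue it to run no sooner than HZ ticks (about 1 second) from now */
schedule_delayed_work(&my_work, HZ);

/* Wait until all work queued on the default workqueue has completed */
flush_scheduled_work();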
Kernel Synchronization • Chapters 8 and 9 • Issue: Sharing data in a multiprocessor, preemptive environment • Solutions: • Limit sharing • Use locks
Concurrency Issues • True concurrency: Multiple processors may be running code at the same time, and they may contend for the same data. • Preemption: Running code may be interrupted so that other code runs; the new code may contend for the same data as the code it interrupted.
Blocking and Sleeping • Blocking (sleeping): Code may stop running and wait for an event to occur. • Kernel code may block only when it runs in process context - there must be a process for the scheduler to put to sleep.
Scalability • Adding more processors also adds synchronization overhead, so a quad-core system does not run four times faster than a single core.
Granularity • Granularity: How much data should one lock protect? • Bits • Words • Structures
Danger of Data Contention • The danger of data contention is corruption. • Example: an update is really three steps - read, modify, write - and two CPUs can interleave them (see the sketch below).
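A sketch of the lost-update problem with a hypothetical shared counter and no locking:

int shared_count = 0;             /* hypothetical shared data, unprotected */

/* CPU 0 and CPU 1 both execute this at about the same time: */
void unsafe_increment(void)
{
        int tmp = shared_count;   /* read  - both CPUs may read the same value */
        tmp = tmp + 1;            /* modify */
        shared_count = tmp;       /* write - one of the two updates is lost */
}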
Solutions • Don’t share: • Don’t use global variables. • Have separate copies of variables for each CPU • Carefully design how data is shared and use a variety of locking and synchronization methods.
Basic Principles • Protect data, not code. • Critical region: code that accesses shared data • Typical protection: bracket critical regions with locking calls
Linux 2.6 uses • Per-CPU copies of variables • Preemption disabling • Interrupt disabling • Atomic operations • Spin locks • Semaphores • Completion variables • Seq locks • Barriers
Per CPU Variables • Covered under memory management in LKD, Chapter 11. • Linux 2.4 used an explicit array, indexed by CPU. The function get_cpu gives the index number for the current CPU and disables preemption. The function put_cpu enables preemption. • Linux 2.6 has a percpu interface.
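A sketch of the explicit-array style with get_cpu()/put_cpu() (the array name is hypothetical):

unsigned long my_counters[NR_CPUS];   /* one slot per processor */

int cpu = get_cpu();        /* current CPU index; also disables preemption */
my_counters[cpu]++;         /* safe: we cannot be moved to another CPU here */
put_cpu();                  /* re-enable preemption */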
Atomic Operations • Linux 2.6 has atomic integer variables and atomic bitwise operations
Atomic Integer Operations • ATOMIC_INIT(int i) // at declaration, initialize an atomic_t to i • int atomic_read(atomic_t *v) // atomically returns the value of v • void atomic_set(atomic_t *v, int i) // atomically sets v equal to i • void atomic_add(int i, atomic_t *v) // atomically adds i to v • void atomic_sub(int i, atomic_t *v) // atomically subtracts i from v • void atomic_inc(atomic_t *v) // atomically increments v • void atomic_dec(atomic_t *v) // atomically decrements v • And more - see page 133 for details on atomic_sub_and_test, atomic_add_negative, atomic_dec_and_test, atomic_inc_and_test
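A minimal sketch of an atomic counter (the variable name refcount is illustrative):

atomic_t refcount = ATOMIC_INIT(1);   /* declare and initialize to 1 */

atomic_inc(&refcount);                /* refcount = 2, atomically */
atomic_add(3, &refcount);             /* refcount = 5 */
atomic_sub(4, &refcount);             /* refcount = 1 */

if (atomic_dec_and_test(&refcount))   /* decrement; true if result is zero */
        printk("last reference dropped, value = %d\n", atomic_read(&refcount));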
Atomic Bitwise Operations • Operations • Set, clear and flip individual bits • Test and set, clear, and flip individual bits • Return a bit value • Find first set bit and first cleared bit • Atomic and non-atomic forms (prefix by double underscore) • Danger - bit position can overflow actual data
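A sketch of the bitwise interface operating on an unsigned long (bit positions are illustrative):

unsigned long word = 0;

set_bit(0, &word);                    /* atomically set bit 0 */
clear_bit(1, &word);                  /* atomically clear bit 1 */
change_bit(2, &word);                 /* atomically flip bit 2 */

if (test_and_set_bit(0, &word))       /* atomically set bit 0, return old value */
        ;                             /* bit 0 was already set */

__set_bit(3, &word);                  /* non-atomic variant (double underscore) */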
Spin Locks • Why? • A way of waiting without sleeping • Used in SMP systems • Notes: • Sleeping has high overhead and is not allowed in interrupt context • Spin locks must only be held for a short time - maybe five lines of code
Spin Lock Use • Bracket the critical region with spin_lock and spin_unlock • Example: spin_lock(&my_lock); /* critical section */ spin_unlock(&my_lock);
Spin Lock Implementation • See include/asm-i386/spinlock.h:

static inline void __raw_spin_lock(raw_spinlock_t *lock)
{
        asm volatile("\n1:\t"
                     LOCK_PREFIX " ; decb %0\n\t"   /* atomically decrement the lock byte */
                     "jns 3f\n"                     /* not negative: we got the lock */
                     "2:\t"
                     "rep;nop\n\t"                  /* spin with a pause hint */
                     "cmpb $0,%0\n\t"
                     "jle 2b\n\t"                   /* still held: keep spinning */
                     "jmp 1b\n"                     /* looks free: try to acquire again */
                     "3:\n\t"
                     : "+m" (lock->slock) : : "memory");
}
Spin Lock Methods • spin_lock() // acquires spin lock • spin_unlock() // releases spin lock • spin_lock_irq() // disables local interrupts and acquires spin lock • spin_unlock_irq() // releases spin lock and enables local interrupts • spin_lock_irqsave() // saves and disables local interrupts, then acquires spin lock • spin_unlock_irqrestore() // releases spin lock, then restores local interrupts to their saved state • spin_lock_init() // initializes a dynamically allocated spin lock • spin_trylock() // tries to acquire the lock without spinning; returns nonzero on success • spin_is_locked() // returns nonzero if the lock is currently held
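A sketch of the interrupt-safe form, for data shared with an interrupt handler (my_lock is a hypothetical name; SPIN_LOCK_UNLOCKED is the 2.6-era static initializer):

spinlock_t my_lock = SPIN_LOCK_UNLOCKED;
unsigned long flags;

spin_lock_irqsave(&my_lock, flags);        /* save IRQ state, disable local IRQs, acquire */
/* critical section - also touched by the interrupt handler */
spin_unlock_irqrestore(&my_lock, flags);   /* release, then restore the saved IRQ state */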