CS 498 Lecture 4: An Overview of Linux Kernel Structure. Jennifer Hou, Department of Computer Science, University of Illinois at Urbana-Champaign. Reading: Chapters 1-2, The Linux Networking Architecture: Design and Implementation of Network Protocols in the Linux Kernel.
Outline • Overview of the Kernel Structure • Activities in the Linux Kernel • Locking • Kernel Modules • /proc File System • Memory Management • Timing
Structure of Linux Kernel [Figure: layered diagram. User space (applications and tools) sits above the system call interface. Kernel components, each shown with its functionality, software support, hardware support, and hardware: • Process management: multitasking; scheduler; architecture-specific code; CPU • Memory management: virtual memory; memory manager; RAM • File systems: files, directories; file system types; block devices; hard disk, CD, floppy • Device drivers: device access; character devices; terminals • Network: network functionality; network protocols; network drivers; network adapter]
Overview of the Kernel Structure • Process management • The scheduler handles all the active, waiting, and blocked processes. • Memory management • Is responsible for allocating memory to each process and for protecting allocated memory against access by other processes. • File system • In UNIX, almost everything is handled over the file system interface. • Device drivers can be addressed as files • /proc file system allows us to access data and parameters in the kernel
Overview of the Kernel Structure • Device drivers • Abstract from the underlying hardware and allow us to access the hardware with well-defined APIs • The use of kernel modules allows device drivers to be dynamically loaded/unloaded • Networks • Incoming packets are asynchronous events and have to be collected and identified before a process can handle them. • Most network operations cannot be allocated to a specific process. Instead, interrupts and timers are used extensively.
Features of Linux Kernel • Is a Monolithic kernel • The entire functionality is contained in one kernel. • In contrast, in microkernels (e.g., Mach kernel and Windows NT), only memory management, IPC, and other hardware-related functions are contained in the kernel. The remaining functionality is moved to independent processes/threads running outside the OS. • + accessing resources directly from within the kernel, avoiding expensive system calls and context switches. • - OS becomes quite complex. • - The development of new drivers is difficult because of the lack of appropriate interface definitions.
Feature of Linux Kernel • A cure is the use of kernel modules • Linux allows kernel modules to be dynamically loaded into (removed from) the kernel at run time. • This is achieved with the use of well-defined interfaces, e.g., register_netdev(), register_chrdev(), register_blkdev(). • The components shown on dark backgrounds provide interfaces for dynamically registering new functionality. • The run-time performance is guaranteed by having modules run in protected kernel mode.
Activities – Processes and System Calls • Processes operate exclusively in the user address space, and can only access the memory allocated to them. • Violation leads to exceptions. • When a process wants to access devices or use functionality in the kernel, it issues a system call. • The control is transferred to the kernel, which executes the system call on behalf of the user process. • Processes can be interrupted voluntarily (wait on semaphore or sleep) or involuntarily (interrupt).
Other Forms of Activities • Hardware interrupts (hardware IRQs) • Software interrupts (software IRQs) • Tasklets • Question: can a single activity, or multiple instances of an activity, be executed on multiple processors?
Interrupts – Hardware IRQs • Peripherals use hardware interrupts to inform the OS of events (e.g., a packet has arrived at the network adapter); an interrupt handling routine is then called. • The handling routine for a specific interrupt can be registered (de-registered) by request_irq() (free_irq()). • Fast interrupts • Have a very short handling routine (that cannot be interrupted). • Are specified by the flag SA_INTERRUPT in request_irq(). • Slow interrupts • Have a longer handling routine and can be interrupted by other interrupts during their execution. • in_irq() (include/asm/hardirq.h) can be used to check whether or not the current activity is an interrupt-handling routine.
Software Interrupts • Not every operation that needs to be executed in an interrupt can be completed in a few instructions (e.g., a packet that arrives at a network adapter). • To keep interrupt handling short, the routine is usually divided into two parts: • Top-half: handles the most important tasks (e.g., copying the arrived packet to a kernel buffer queue waiting for detailed handling later) • Bottom-half: handles non-time critical operations. It is being scheduled for execution right after the top half is executed (e.g., when a packet arrives, the bottom half is run as a software interrupt NET_RX_SOFTIRQ).
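The top-half/bottom-half split can be illustrated with a small user-space sketch. This is not kernel code: the ring buffer, the names top_half(), bottom_half(), record_packet(), and the queue length are all hypothetical and only model the idea that the top half queues the packet quickly while the bottom half does the slower processing later.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8               /* hypothetical backlog size */

struct packet { int id; };

static struct packet backlog[QUEUE_LEN];
static size_t head, tail, count;  /* ring-buffer state */
int bottom_half_pending;          /* stands in for a raised software interrupt */
int last_handled_id;              /* written by the sample handler below */

/* Top half: do the minimum (queue the packet) and request deferred work.
 * Returns 0 on success, -1 if the backlog is full and the packet is dropped. */
int top_half(struct packet p)
{
    if (count == QUEUE_LEN)
        return -1;
    backlog[tail] = p;
    tail = (tail + 1) % QUEUE_LEN;
    count++;
    bottom_half_pending = 1;
    return 0;
}

/* Bottom half: runs later and does the non-time-critical processing.
 * Returns the number of packets handled. */
int bottom_half(void (*handle)(struct packet))
{
    int handled = 0;

    while (count > 0) {
        assert(head < QUEUE_LEN);
        handle(backlog[head]);
        head = (head + 1) % QUEUE_LEN;
        count--;
        handled++;
    }
    bottom_half_pending = 0;
    return handled;
}

/* Sample handler: just remembers the id of the last packet it saw. */
void record_packet(struct packet p) { last_handled_id = p.id; }
```

Calling top_half() twice and then bottom_half(record_packet) once handles both queued packets in arrival order, which mirrors how several packets may accumulate before the software interrupt runs.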
Software Interrupts • When a system call or a hardware interrupt terminates, the scheduler calls do_softirq(). • do_softirq() schedules software interrupts for execution. • A maximum of 32 software interrupts can be defined in Linux. • NET_RX_SOFTIRQ and NET_TX_SOFTIRQ are two software interrupts. • Multiple software interrupts can run concurrently, and hence need to be reentrant. If critical sections exist in a software interrupt, they have to be protected by locks. • in_softirq() (include/asm/softirq.h) can be used to check whether or not the current activity is a software interrupt.
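The limit of 32 software interrupts fits naturally with one pending bit per interrupt in a 32-bit word. The following user-space sketch models that idea only; the _demo function names, the index DEMO_RX, and the handler table are hypothetical and are not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_SOFTIRQS 32        /* one bit per software interrupt */
#define DEMO_RX 3              /* hypothetical index used in this sketch */

typedef void (*softirq_fn)(void);

static softirq_fn handlers[MAX_SOFTIRQS];
static uint32_t pending;       /* bit n set => software interrupt n is waiting */
int rx_runs;                   /* counts runs of the sample handler */

/* Register a handler for software interrupt nr. */
void open_softirq_demo(unsigned nr, softirq_fn fn)
{
    assert(nr < MAX_SOFTIRQS);
    handlers[nr] = fn;
}

/* Mark software interrupt nr as pending; raising it twice has no extra effect. */
void raise_softirq_demo(unsigned nr)
{
    assert(nr < MAX_SOFTIRQS);
    pending |= UINT32_C(1) << nr;
}

/* Run every pending handler once, lowest index first; returns how many ran. */
int do_softirq_demo(void)
{
    uint32_t todo = pending;
    int ran = 0;

    pending = 0;               /* handlers may raise new software interrupts */
    for (unsigned nr = 0; nr < MAX_SOFTIRQS; nr++) {
        if ((todo & (UINT32_C(1) << nr)) && handlers[nr]) {
            handlers[nr]();
            ran++;
        }
    }
    return ran;
}

/* Sample handler standing in for packet reception work. */
void demo_rx_action(void) { rx_runs++; }
```

Because the pending state is a single bit, raising the same software interrupt several times before do_softirq_demo() runs still results in one handler invocation.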
Tasklets • A more formal mechanism of scheduling software interrupts (and other tasks). • The macro DECLARE_TASKLET(name, func,data) • name: a name for the tasklet_struct data structure • func: the tasklet’s handling routine. • data: a pointer to private data to be passed to func(). • tasklet_schedule(&tasklet_struct) schedules a tasklet for execution. • tasklet_disable() stops a tasklet from running, even if it has been scheduled for execution. • tasklet_enable() reactivates a deactivated tasklet.
Tasklet Example
#include <linux/kernel.h>      /* printk() */
#include <linux/interrupt.h>   /* tasklets */

/* Handling routine of new tasklet */
void test_func(unsigned long);

/* Data of new tasklet */
char test_data[] = "Hello, I am a test tasklet";

DECLARE_TASKLET(test_tasklet, test_func, (unsigned long) &test_data);

void test_func(unsigned long data)
{
    /* The log level is a string prefix, so no comma follows KERN_DEBUG */
    printk(KERN_DEBUG "%s\n", (char *) data);
}
...
tasklet_schedule(&test_tasklet);
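How scheduling, disabling, and enabling interact can be modeled in user space. This is a simplified sketch under the assumption that a disabled tasklet stays scheduled until it is enabled again; the struct demo_tasklet type and all demo_ functions are hypothetical stand-ins, not the kernel API.

```c
#include <assert.h>

struct demo_tasklet {
    void (*func)(unsigned long);
    unsigned long data;
    int scheduled;       /* set by schedule, cleared when the tasklet runs */
    int disable_count;   /* > 0 means the tasklet must not run */
};

unsigned long demo_total;    /* written by the sample handler below */

/* Mark the tasklet for execution; scheduling twice before it runs is a no-op. */
void demo_tasklet_schedule(struct demo_tasklet *t) { t->scheduled = 1; }

void demo_tasklet_disable(struct demo_tasklet *t) { t->disable_count++; }

void demo_tasklet_enable(struct demo_tasklet *t)
{
    assert(t->disable_count > 0);
    t->disable_count--;
}

/* Called from the simulated software-interrupt context.
 * Returns 1 if the handler ran, 0 otherwise. */
int demo_tasklet_run(struct demo_tasklet *t)
{
    if (!t->scheduled || t->disable_count > 0)
        return 0;        /* not scheduled, or scheduled but disabled */
    t->scheduled = 0;
    t->func(t->data);
    return 1;
}

/* Sample handler: adds its private data to a running total. */
void demo_add(unsigned long data) { demo_total += data; }
```

A tasklet that is scheduled and then disabled does not run until demo_tasklet_enable() is called, after which the next run executes the handler exactly once.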
Bit Operations • test_and_set_bit(nr, void *addr) sets the bit in position nr in the unsigned long variable pointed to by addr. The previous value of the bit is returned. • test_and_clear_bit(nr, void *addr) clears the bit in position nr in the variable pointed to by addr. • test_and_change_bit(nr, void *addr) • set_bit(nr, void *addr) • clear_bit(nr, void *addr) • change_bit(nr, void *addr) • test_bit(nr, void *addr)
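The "return the previous value" semantics of these operations can be shown in user space with C11 atomics. This is a sketch of the semantics only, not the kernel's architecture-specific implementation; the demo_ names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Atomically set bit nr in *word and return its previous value (0 or 1). */
int demo_test_and_set_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < BITS_PER_WORD);
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_or(word, mask) & mask) != 0;
}

/* Atomically clear bit nr in *word and return its previous value (0 or 1). */
int demo_test_and_clear_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < BITS_PER_WORD);
    unsigned long mask = 1UL << nr;
    return (atomic_fetch_and(word, ~mask) & mask) != 0;
}

/* Return the current value of bit nr without modifying it. */
int demo_test_bit(unsigned nr, atomic_ulong *word)
{
    assert(nr < BITS_PER_WORD);
    return (int)((atomic_load(word) >> nr) & 1UL);
}
```

The returned previous value is what makes these operations usable as simple locks: only the caller that sees 0 from demo_test_and_set_bit() actually changed the bit.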
Locking -- spinlock • A mechanism for busy-wait locks. • spin_lock_init(&my_spinlock) • Initializes the spinlock to the unlocked state. • spin_lock(spinlock_t *my_spinlock) • Tries to set the spinlock my_spinlock. If it is not free, the caller keeps testing (busy waiting) until the lock is released. • spin_unlock(spinlock_t *my_spinlock) • Releases a lock. • spin_is_locked(spinlock_t *my_lock) returns the current value of the lock (a non-zero value means the lock is set). • spin_trylock(spinlock_t *my_lock) sets the spinlock if it is currently unlocked and returns a non-zero value; otherwise, it returns zero immediately without waiting.
Spinlock Example
#include <linux/spinlock.h>

spinlock_t my_spinlock;

spin_lock_init(&my_spinlock);

// One thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
...
// Another thread
spin_lock(&my_spinlock);
// Critical section
spin_unlock(&my_spinlock);
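The busy-wait behavior can be sketched in user space with a C11 atomic_flag. The type demo_spinlock_t and the demo_ functions are hypothetical and only imitate the semantics described for spin_lock(), spin_trylock(), and spin_unlock(); the kernel's spinlocks are implemented differently.

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct { atomic_flag locked; } demo_spinlock_t;

/* Put the lock into the unlocked state. */
void demo_spin_lock_init(demo_spinlock_t *l)
{
    atomic_flag_clear(&l->locked);
}

/* Returns 1 if the lock was acquired, 0 if it was already held. */
int demo_spin_trylock(demo_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Busy-wait until the lock becomes free, then take it. */
void demo_spin_lock(demo_spinlock_t *l)
{
    while (!demo_spin_trylock(l))
        ;   /* spin */
}

/* Release the lock so that a spinning caller can proceed. */
void demo_spin_unlock(demo_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A second demo_spin_trylock() on a held lock fails immediately, whereas demo_spin_lock() would keep spinning, which is why spinlocks should only protect short critical sections.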
Read-Write Spinlocks • Some data structures, such as the list of registered network devices (dev_base), do not change frequently but are subject to many read accesses. A read-write spinlock is used for them to improve run-time performance. • read_lock(): if there is no lock or only a read lock, then the critical section can be immediately accessed. If there is a write lock, then we have to wait. • read_unlock(): A read activity leaves the critical section. If a write activity is waiting and there exists no other read activity, it gains access. • write_lock(): if there is a (read/write) lock, we have to wait; otherwise, we set an exclusive lock. • write_unlock(): releases the exclusive lock.
Kernel Modules • Each kernel module implements init_module() and cleanup_module(). • To load a kernel module into the kernel space manually, use insmod modulename.o [argument]. In turn, the following system calls are called: • sys_create_module() allocates memory to accommodate the module in the kernel space. • sys_get_kernel_syms() returns the kernel’s symbol table to resolve the missing references within the module to kernel symbols. • sys_init_module() copies the module’s object code into the kernel address space and calls the module’s init_module(). • Example: insmod wvlan_cs eth=1 network_name="mywavelan"
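The admission rules for readers and writers can be written down as a small state model. This sketch is single-threaded and uses non-blocking "try" variants so that the rules are easy to check; it is not atomic and not the kernel's rwlock_t, and all demo_ names are hypothetical.

```c
#include <assert.h>

struct demo_rwlock {
    int readers;   /* number of activities inside a read section */
    int writer;    /* 1 while a writer holds the lock */
};

/* A reader may enter unless a writer holds the lock. Returns 1 on success. */
int demo_read_trylock(struct demo_rwlock *l)
{
    if (l->writer)
        return 0;
    l->readers++;
    return 1;
}

void demo_read_unlock(struct demo_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

/* A writer needs exclusive access: no readers and no other writer. */
int demo_write_trylock(struct demo_rwlock *l)
{
    if (l->writer || l->readers > 0)
        return 0;
    l->writer = 1;
    return 1;
}

void demo_write_unlock(struct demo_rwlock *l)
{
    assert(l->writer);
    l->writer = 0;
}
```

Two readers can hold the lock at once, a writer is refused while any reader is active, and readers are refused while the writer holds the lock.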
Kernel Modules • rmmod modulename • Removes the specified module from the kernel address space. In turn, the system call sys_delete_module() is called, which in turn calls cleanup_module(). • lsmod lists all currently loaded modules and their dependencies and reference counts. • modinfo gives the information about a module. The information is set by the macros MODULE_DESCRIPTION, MODULE_AUTHOR in the module’s source.
#include <linux/module.h>   // Needed by all modules
#include <linux/kernel.h>   // Needed for KERN_ALERT
#include <linux/init.h>     // Needed for the macros
init_module and cleanup_module • init_module(): runs all initialization tasks such as reserving memory, creating entries in the /proc directory, initializing data structures, and registering the module’s functionality. • cleanup_module(): cleans up the work environment of the module (unregisters the module’s functionality, frees the memory it allocated, and removes the dependencies between the module and other parts of the kernel). • Need to ensure the reference count for the module is zero.
Module Usage Count • Linux keeps a usage count for every module in order to determine whether the module can be safely removed. • Three macros are defined in <linux/module.h> • MOD_INC_USE_COUNT: increment the count for the current module • MOD_DEC_USE_COUNT: decrement the count • MOD_IN_USE: evaluates to be true if the count is not zero.
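The role of the usage count in deciding whether a module may be removed can be sketched in user space. The struct demo_module type and the demo_ functions are hypothetical; they only mirror the behavior described for MOD_INC_USE_COUNT, MOD_DEC_USE_COUNT, and MOD_IN_USE.

```c
#include <assert.h>

struct demo_module {
    const char *name;
    int use_count;     /* number of current users of the module */
};

void demo_mod_inc_use_count(struct demo_module *m) { m->use_count++; }

void demo_mod_dec_use_count(struct demo_module *m)
{
    assert(m->use_count > 0);
    m->use_count--;
}

/* Non-zero while at least one user still holds the module. */
int demo_mod_in_use(const struct demo_module *m)
{
    return m->use_count != 0;
}

/* Returns 0 if removal may proceed, -1 if the module is still in use. */
int demo_try_remove(const struct demo_module *m)
{
    return demo_mod_in_use(m) ? -1 : 0;
}
```

A typical pattern is to increment the count when a device is opened and decrement it when the device is closed, so that removal is refused while the device is open.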
Passing Module Parameters • MODULE_PARM(var, type) designates the variable var as a parameter of the module, and a value can be assigned to this parameter during loading. Possible types are: • b: byte; h: short (two bytes); i: integer; l: long; s: string. • MODULE_PARM_DESC(var, desc) adds a description (desc) for the parameter var. • MODULE_DESCRIPTION(desc) contains a description of the module. • EXPORT_SYMBOL(name) exports and adds a function or variable of the kernel to the symbol table.
Makefile for Kernel Modules
TARGET  := mymodule
WARN    := -W -Wall
INCLUDE := -isystem /lib/modules/`uname -r`/build/include
CFLAGS  := -O -DMODULE -D__KERNEL__ ${WARN} ${INCLUDE}
CC      := gcc-3.0

${TARGET}.o: ${TARGET}.c

.PHONY: clean
clean:
	rm -rf ${TARGET}.o

Check out the Linux Kernel Module Programming Guide for details: http://www.tldp.org/LDP/lkmpg/2.4/html/index.html
printk • printk(KERN_INFO "I am in trouble, giving up on %p\n", ptr); • The log level is a string prefix of the format string, so no comma follows it. • There are 8 possible log levels: • KERN_EMERG: Used for emergency messages. • KERN_ALERT: A situation requiring immediate action. • KERN_CRIT: Critical conditions related to serious hardware/software failure. • KERN_ERR: Used to report error conditions. • KERN_WARNING: Warnings about problematic situations that do not create serious problems. • KERN_NOTICE: Situations that are normal, but still worthy of note. • KERN_INFO: Informational messages. • KERN_DEBUG: Used for debugging messages. • If the priority value is less than the integer variable console_loglevel, the message is displayed on the console. • If both klogd and syslogd are running, kernel messages are appended to /var/log/messages, independent of console_loglevel.
Reserving/Releasing Memory In the Kernel • kmalloc(size, priority): attempts to reserve consecutive memory space with a size of size bytes in the kernel memory. • GFP_KERNEL: is used when the requesting activity can be interrupted (may sleep) during the reservation. • GFP_ATOMIC: is used when the memory request must be atomic, i.e., must not sleep (for example, in an interrupt handler). • kfree(objp): releases the memory space reserved at address objp.
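The console filtering rule can be sketched in user space. The sketch assumes that the KERN_* macros expand to string prefixes of the form "<n>" with n from 0 (KERN_EMERG) to 7 (KERN_DEBUG), and that a message without a prefix gets a default level of 4; the demo_ function names and DEMO_DEFAULT_LEVEL are hypothetical.

```c
#include <assert.h>

#define DEMO_DEFAULT_LEVEL 4   /* assumption: level used when no "<n>" prefix is present */

/* Extract the level from a "<n>" prefix; fall back to the default otherwise. */
int demo_msg_level(const char *msg)
{
    assert(msg != 0);
    if (msg[0] == '<' && msg[1] >= '0' && msg[1] <= '7' && msg[2] == '>')
        return msg[1] - '0';
    return DEMO_DEFAULT_LEVEL;
}

/* A message reaches the console only if its level is numerically
 * below console_loglevel (lower numbers are more urgent). */
int demo_shown_on_console(const char *msg, int console_loglevel)
{
    return demo_msg_level(msg) < console_loglevel;
}
```

With console_loglevel set to 7, an informational message (level 6) is displayed while a debugging message (level 7) is not.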
Reserving/Releasing Memory In the Kernel • copy_from_user(to, from, count) copies count bytes from the address from in the user address space to the address to in the kernel address space. • copy_to_user(to, from, count) copies count bytes from the address from in the kernel address space to the address to in the user address space. • access_ok() checks that a user-space address range is valid for the current process before it is accessed; it does not guarantee that the corresponding pages are resident in physical memory.
Memory Caches • Linux allows us to create caches of memory spaces of a specific size, called slab caches. • kmem_cache_create(name, size, offset, flags, ctor, dtor) creates a slab cache of memory spaces, each size bytes long. • name points to a string containing the name of the slab cache; offset is usually set to 0. • flags specifies additional options, e.g., SLAB_HWCACHE_ALIGN (aligns to the size of the first level cache in the CPU). • ctor, dtor: specify a constructor and a destructor used to initialize or clean up the reserved memory spaces. • Example: skbuff_head_cache = kmem_cache_create("skbuff_head_cache", sizeof(struct sk_buff), 0, SLAB_HWCACHE_ALIGN, skb_headerinit, NULL);
Memory Caches • kmem_cache_destroy(cachep): releases the slab cache cachep. • kmem_cache_shrink(cachep): is called by the kernel when the kernel itself requires memory space and has to reduce the cache. • kmem_cache_alloc(cachep, flags): is used to request a memory space from the slab cache cachep. If the slab cache is empty, then kmalloc() is used to reserve new memory space. • kmem_cache_free(cachep, ptr): frees the memory space that begins at ptr and gives it back to the cache cachep.
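The benefit of a cache of equally sized objects is that a freed object can be handed out again without going back to the general allocator. The following user-space sketch models this with a free list on top of malloc(); it is a strong simplification (no slabs, no constructors, no alignment options), and struct demo_cache and the demo_cache_ functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct demo_cache {
    size_t obj_size;     /* size of every object in this cache */
    void *free_list;     /* singly linked list of released objects */
};

void demo_cache_init(struct demo_cache *c, size_t size)
{
    /* Each free object stores the next-pointer in its own memory. */
    c->obj_size = size < sizeof(void *) ? sizeof(void *) : size;
    c->free_list = NULL;
}

/* Hand out a cached object if one is available, else get fresh memory. */
void *demo_cache_alloc(struct demo_cache *c)
{
    void *obj = c->free_list;

    if (obj)
        c->free_list = *(void **)obj;   /* reuse a released object */
    else
        obj = malloc(c->obj_size);      /* cache empty: fall back to malloc */
    return obj;
}

/* Give an object back to the cache instead of freeing it. */
void demo_cache_free(struct demo_cache *c, void *obj)
{
    assert(obj != NULL);
    *(void **)obj = c->free_list;
    c->free_list = obj;
}

/* Release every cached object back to the system. */
void demo_cache_destroy(struct demo_cache *c)
{
    while (c->free_list) {
        void *next = *(void **)c->free_list;
        free(c->free_list);
        c->free_list = next;
    }
}
```

Allocating, freeing, and allocating again returns the same address the second time, which is the reuse effect that makes such caches attractive for frequently created structures like packet headers.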