‘Dynamic’ kernel patching: How you could add your own system-calls to Linux without editing and recompiling the kernel
System calls • System Calls are the basic OS mechanism for providing privileged kernel services to application programs (e.g., fork(), clone(), execve(), read(), write(), signal(), getpid(), waitpid(), gettimeofday(), setitimer(), etc.) • Linux implements over 300 system calls • To understand how system calls work, we can try creating one of our own design
‘Open Source’ philosophy • Linux source-code is publicly available • In principle, anyone could edit the sources to add their own new functions into Linux • In practice, it is inconvenient to do this • The steps needed involve reconfiguring, recompiling, and reinstalling your kernel • For novices these steps are treacherous! • Any error risks data-loss and down-time
Alternative to edit/recompile • Linux modules offer an alternative method for modifying the OS kernel’s functionality • It’s safer, and vastly more convenient, since error-recovery only needs a reboot, and minimal system knowledge suffices • The main hurdle to be overcome concerns the issue of ‘linking’ module code to some non-exported Linux kernel data-structures
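Invoking kernel services [Diagram: in user-mode (restricted privileges), an application program reaches the standard runtime libraries by ‘call’ and ‘ret’; the libraries enter the Linux kernel with ‘int 0x80’, and the kernel returns with ‘iret’. In kernel-mode (unrestricted privileges), the Linux kernel and an installable module transfer control to each other by ‘call’ and ‘ret’.]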
The system-call jump-table • There are approximately 300 system-calls • Any specific system-call is selected by its ID-number (it’s placed into register %eax) • It would be inefficient to use if-else tests or even a switch-statement to transfer to the service-routine’s entry-point • Instead an array of function-pointers is directly accessed (using the ID-number) • This array is named ‘sys_call_table[]’
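The jump-table idea can be tried out in ordinary user-space C. The following is only a minimal sketch: the routine names, the table size, and the one-argument signature are made up for illustration and are not the kernel's own. It shows an array of function-pointers indexed directly by an ID-number, with no if-else chain or switch-statement.

```c
/* Every service routine in this sketch takes one argument and returns a long. */
typedef long (*service_fn)(long arg);

/* Hypothetical stand-ins for kernel service routines. */
static long demo_ni_call(long arg) { (void)arg; return -1; } /* "not implemented" */
static long demo_double(long arg)  { return 2 * arg; }
static long demo_negate(long arg)  { return -arg; }

#define DEMO_NR_CALLS 3

/* The jump-table: entry i holds the address of the routine for ID-number i. */
static service_fn demo_call_table[DEMO_NR_CALLS] = {
    demo_ni_call,   /* ID-number 0 */
    demo_double,    /* ID-number 1 */
    demo_negate,    /* ID-number 2 */
};

/* Select the routine with one indexed call through the table. */
long demo_dispatch(unsigned long id, long arg)
{
    if (id >= DEMO_NR_CALLS)
        return -1;                      /* reject an out-of-range ID-number */
    return demo_call_table[id](arg);
}
```

Calling `demo_dispatch(1, 21)` runs `demo_double` and returns 42; the cost of the lookup does not grow with the number of table entries.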
Assembly language (.data)
        .section .data
sys_call_table:
        .long   sys_restart_syscall
        .long   sys_exit
        .long   sys_fork
        .long   sys_read
        .long   sys_write
        // …etc (from ‘arch/i386/kernel/entry.S’)
The ‘jump-table’ idea
[Diagram: each entry of ‘sys_call_table’ points to a service-routine in .section .text]
        index   entry points to
        0       sys_restart_syscall
        1       sys_exit
        2       sys_fork
        3       sys_read
        4       sys_write
        5       sys_open
        6       sys_close
        7, 8    …etc…
Assembly language (.text)
        .section .text
system_call:
        // copy parameters from registers onto stack…
        call    *sys_call_table(, %eax, 4)      // indirect call through table entry number %eax
        jmp     ret_from_sys_call

ret_from_sys_call:
        // perform rescheduling and signal-handling…
        iret    // return to caller (in user-mode)
Changing the jump-table • To install our own system-call function, we just need to change an entry in the Linux ‘sys_call_table[]’ array, so it points to our own module function, but save the former entry somewhere (so we can restore it if we remove our module from the kernel) • But we first need to find ‘sys_call_table[]’ -- and there are two easy ways to do that
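The save, replace, and restore sequence described above can be rehearsed safely in user space before touching a real kernel. This sketch uses a made-up four-entry table and hypothetical routine names; `demo_install()` plays the part of a module's initialization and `demo_remove()` the part of its cleanup.

```c
typedef long (*service_fn)(long arg);

/* Hypothetical stand-ins: the former entry and our replacement function. */
static long old_service(long arg) { return arg; }
static long my_service(long arg)  { return arg + 1; }

#define DEMO_SLOT 2     /* the table entry we choose to take over */

static service_fn demo_table[4] = {
    old_service, old_service, old_service, old_service
};
static service_fn saved_entry;  /* the former entry, kept for later restoration */

/* Analogous to module installation: remember the old entry, then replace it. */
void demo_install(void)
{
    saved_entry = demo_table[DEMO_SLOT];
    demo_table[DEMO_SLOT] = my_service;
}

/* Analogous to module removal: put the former entry back. */
void demo_remove(void)
{
    demo_table[DEMO_SLOT] = saved_entry;
}

/* Invoke whatever routine the table currently holds for this ID-number. */
long demo_call(unsigned int id, long arg)
{
    return demo_table[id](arg);
}
```

Only entry `DEMO_SLOT` changes behavior between `demo_install()` and `demo_remove()`; every other entry keeps calling its original routine throughout.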
Finding the jump-table • Older versions of Linux (prior to 2.4.18) used to ‘export’ the ‘sys_call_table[]’ as a global symbol, but current versions keep this table’s address private (for security) • But often during kernel-installation there is a ‘System.map’ file that gets put into the ‘/boot’ directory and – assuming it matches your compiled kernel – it holds the kernel address for the ‘sys_call_table[]’ array
Using ‘uname’ and ‘grep’
• You can use the ‘uname’ command to find out which kernel-version is running:
        $ uname -r
• Then you can use the ‘grep’ command to find ‘sys_call_table’ in your System.map file, like this:
        $ grep sys_call_table /boot/System.map-2.6.22.5cslabs
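Each line of a System.map file has the form "address type symbol". As a rough illustration of what the ‘grep’ step yields, here is a user-space C sketch that pulls the address out of one such line. The function name `map_line_address` is made up for this example. It compares the symbol name exactly, because a plain substring search could also match other symbols whose names merely contain ‘sys_call_table’.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse one System.map line of the form "<hex address> <type> <symbol>".
 * Returns 1 and stores the address in *addr when the symbol name matches
 * exactly; returns 0 for a different symbol or a malformed line.
 */
int map_line_address(const char *line, const char *symbol, unsigned long *addr)
{
    unsigned long value;
    char type;
    char name[128];

    /* the space before %c skips the blank that separates the fields */
    if (sscanf(line, "%lx %c %127s", &value, &type, name) != 3)
        return 0;
    if (strcmp(name, symbol) != 0)
        return 0;
    *addr = value;
    return 1;
}
```

Given the line `c0251500 D sys_call_table`, the function stores 0xC0251500. The type letter shown here is only an example and may differ on your kernel.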
The ‘vmlinux’ file • Your compiled kernel (uncompressed) is left in the ‘/usr/src/linux’ directory • It is an ELF-format (executable) file • It contains .text and .data sections • You can examine your ‘vmlinux’ kernel with the ‘objdump’ system-utility • You can pipe the output through the ‘grep’ utility to locate the ‘sys_call_table’ symbol
Executable versus Linkable
        Linkable File                       Executable File
        ELF Header                          ELF Header
        Program-Header Table (optional)     Program-Header Table
        Section 1 Data                      Segment 1 Data
        Section 2 Data                      Segment 2 Data
        Section 3 Data                      Segment 3 Data
        …                                   …
        Section n Data                      Segment n Data
        Section-Header Table                Section-Header Table (optional)
Where is ‘sys_call_table[ ]’?
• This is how you use ‘objdump’ and ‘grep’ to find the ‘sys_call_table[]’ address:
        $ cd /usr/src/linux
        $ objdump -t vmlinux | grep sys_call_table
Exporting ‘sys_call_table’
• Once you know the address of your kernel’s ‘sys_call_table[]’, you can write a module to export that address to other modules, e.g.:
        #include <linux/module.h>       // for EXPORT_SYMBOL()

        // declare global variable
        unsigned long *sys_call_table;
        EXPORT_SYMBOL(sys_call_table);

        int init_module( void )
        {
                // example value; use the address found for your own kernel
                sys_call_table = (unsigned long *)0xC0251500;
                return 0;
        }
Avoid hard-coded constant • You probably don’t want to ‘hard code’ the sys_call_table’s value in your module: if you ever recompile your kernel, or use a differently configured kernel, you’d have to remember to edit your module and then recompile it, or risk a corrupted system! • There’s a way to supply the required value as a module-parameter during ‘insmod’
Module parameters
        char *svctable;                         // declare global variable
        module_param( svctable, charp, 0444 );

        // Then you install your module like this:
        $ /sbin/insmod myexport.ko svctable=c0251500

        // Linux will assign the address of your input string “c0251500” to the ‘svctable’ pointer
simple_strtoul()
• There is a kernel function you can use, in your ‘init_module()’ function, that will convert a string of hexadecimal digits into an ‘unsigned long’:
        int init_module( void )
        {
                unsigned long myval;
                myval = simple_strtoul( svctable, NULL, 16 );
                sys_call_table = (unsigned long *)myval;
                return 0;
        }
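Because a mistyped address would make the module point at the wrong memory, it may be worth rehearsing the conversion with some input checking. The sketch below is a user-space analogue, not kernel code: it uses the standard C library's `strtoul()` in place of the kernel's `simple_strtoul()`, and the function name `parse_hex_address` is made up for this example. Unlike the slide's call, which passes NULL as the second argument, it examines the end-pointer to reject strings that contain non-hex characters.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Convert a string of hexadecimal digits to an unsigned long.
 * Returns 0 and stores the value in *out on success; returns -1 when the
 * string is empty, has trailing non-hex characters, or is out of range.
 */
int parse_hex_address(const char *text, unsigned long *out)
{
    char *end;
    unsigned long value;

    if (text == NULL || *text == '\0')
        return -1;

    errno = 0;
    value = strtoul(text, &end, 16);    /* base 16, as on the slide */
    if (errno != 0 || *end != '\0')
        return -1;                      /* overflow, or unparsed characters remain */

    *out = value;
    return 0;
}
```

With the input `"c0251500"` the function stores 0xC0251500; with `"c025zz00"` it reports failure instead of silently returning the value of the leading digits.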
Shell scripts • It’s inconvenient – and risks typing errors – if you must manually search ‘vmlinux’ and then type in the sys_call_table[]’s address every time you want to install your module • Fortunately this sequence of steps can be readily automated – by using a shell-script • We have created an example: ‘myscript’
shell-script format
• First line:
        #!/bin/sh
• Some assignment-statements:
        version=$(uname -r)
        mapfile=/boot/System.map-$version
• Some commands (useful while debugging):
        echo $version
        echo $mapfile
The ‘cut’ command • You can use the ‘cut’ operation on a line of text to remove the parts you don’t want • An output-line from the ‘grep’ program can be piped in as an input-line to ‘cut’ • You supply a command-line argument to the ‘cut’ program, to tell it which parts of the character-array you wish to retain • For example: cut -c1-8 • Only characters 1 through 8 will be retained (‘cut’ numbers character positions starting from 1), which are the eight hex digits of the address
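To make the character-range idea concrete, here is a small user-space C sketch of what selecting a range of character positions does to one line. The function name `cut_chars` is hypothetical, and the sketch handles only a single first-to-last range, not the full option syntax of the real ‘cut’ program. Positions are counted from 1.

```c
#include <string.h>

/*
 * Copy characters first..last (1-based, inclusive) of 'line' into 'out',
 * stopping early at the end of the line or when 'out' is full.
 * Returns the number of characters copied; 'out' is always NUL-terminated
 * when outsize is at least 1.
 */
size_t cut_chars(const char *line, size_t first, size_t last,
                 char *out, size_t outsize)
{
    size_t len = strlen(line);
    size_t n = 0;

    if (outsize == 0)
        return 0;
    if (first >= 1 && first <= last) {
        for (size_t i = first; i <= last && i <= len && n + 1 < outsize; i++)
            out[n++] = line[i - 1];     /* position i is array index i-1 */
    }
    out[n] = '\0';
    return n;
}
```

Applied to the line `c0251500 D sys_call_table` with the range 1 to 8, it leaves just `c0251500` in the output buffer.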
Finishing up • Our ‘myscript’ concludes by executing the command which installs our ‘myexport.ko’ module into the kernel, and automatically supplies the required module-parameter • If your ‘/boot’ directory doesn’t happen to have the ‘System.map’ file in it, you can extract the ‘sys_call_table[]’ address from the uncompressed ‘vmlinux’ kernel-binary
The ‘objdump’ program
• The ‘vmlinux’ file contains a Symbol-Table section that includes ‘sys_call_table’
• You can display that Symbol-Table using the ‘objdump’ command with the -t flag:
        $ objdump -t /usr/src/linux/vmlinux
• You can pipe the output into ‘grep’ to find the ‘sys_call_table’ symbol-value
• You can use ‘cut’ to isolate the address
Which entry can we change? • We would not want to risk disrupting the normal Linux behavior through unintended alterations of some vital system-service • But a few entries in ‘sys_call_table[]’ are no longer being used by the newer kernels • If documented as being ‘obsolete’ it would be reasonably safe for us to ‘reuse’ an array-entry for our own purposes • For example: system-call 17 is ‘obsolete’
‘newcall.c’ • We created this module to demonstrate the ‘dynamic kernel patching’ technique • It installs a function for system-call 17 • This function increments the value stored in a variable of type ‘int’ whose address is supplied as a function-argument • We wrote the ‘try17.cpp’ demo to test it!
Recently… an extra obstacle • Some recent versions of the Linux kernel (including ours in the classroom and labs) place the ‘sys_call_table[]’ array, as the default configuration-option, in ‘read-only’ memory within kernel-space, despite the already existing protections of ‘ring 0’ • This creates an added obstacle to alterations, even by privileged code
page-frame attributes
[Diagram: the virtual memory address of our ‘sys_call_table[]’ array, split into three fields]
        0xC0251500 = 1100 0000 00 | 10 0101 0001 | 0101 0000 0000
[CR3 locates the Page-Directory, which leads to the Page-Tables, which lead to the Page-Frame holding ‘sys_call_table’]
        frame attributes in the page-table entry:  bit 2 = S/U,  bit 1 = R/W,  bit 0 = P
We cannot modify entries in ‘sys_call_table[]’ unless its page-frame is ‘writable’
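The three-field split of the address can be checked with a few shifts and masks. This user-space sketch assumes classic 32-bit i386 paging with 4 KB pages (a 10-bit page-directory index, a 10-bit page-table index, and a 12-bit offset), which matches the grouping of the binary digits shown for 0xC0251500. The struct and function names are made up for illustration.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address (assumes 4 KB pages, no PAE). */
struct va_parts {
    uint32_t dir_index;     /* bits 31..22: index into the page-directory */
    uint32_t table_index;   /* bits 21..12: index into the page-table     */
    uint32_t offset;        /* bits 11..0:  byte offset in the page-frame */
};

/* Split a virtual address into its directory index, table index, and offset. */
struct va_parts split_address(uint32_t vaddr)
{
    struct va_parts p;

    p.dir_index   = vaddr >> 22;
    p.table_index = (vaddr >> 12) & 0x3FF;
    p.offset      = vaddr & 0xFFF;
    return p;
}
```

For 0xC0251500 this gives a directory index of 0x300, a table index of 0x251, and an offset of 0x500.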
Tweak page-frame attributes • Our ‘newcall.c’ module needs to be sure it can modify entry 17 in ‘sys_call_table[]’ • So it locates the page-table entry for the page-frame containing ‘sys_call_table[]’ and sets its ‘writable’ bit to be ‘TRUE’ • But it preserves the previous value of that entry, so it can be restored if we remove our ‘newcall.ko’ object from the kernel
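The bit manipulation involved is small enough to rehearse on an ordinary integer. The sketch below works on a simulated 32-bit page-table entry in user space; it does not locate or touch any real page-table, and the function names are hypothetical. The bit positions follow the i386 layout (P in bit 0, R/W in bit 1, U/S in bit 2).

```c
#include <stdint.h>

#define PTE_PRESENT   0x1u      /* bit 0: P   */
#define PTE_WRITABLE  0x2u      /* bit 1: R/W */
#define PTE_USER      0x4u      /* bit 2: U/S */

/* Set the R/W bit in *pte and return the entry's previous value. */
uint32_t pte_make_writable(uint32_t *pte)
{
    uint32_t previous = *pte;

    *pte = previous | PTE_WRITABLE;
    return previous;
}

/* Put back the value that pte_make_writable() returned. */
void pte_restore(uint32_t *pte, uint32_t previous)
{
    *pte = previous;
}
```

Saving the whole previous entry, rather than just clearing the bit again on removal, means the restore step is still correct if the page-frame was already writable beforehand. Real kernel code would also have to take into account translations the processor has cached in its TLB, which this sketch ignores.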
In-class exercise #1 • Write a kernel module (named ‘unused.c’) which will create a pseudo-file that reports how many ‘unimplemented’ system-calls are still available. The total number of locations in the ‘sys_call_table[]’ array is given by a defined constant, NR_syscalls, so you can just search the array to count how many entries match ‘sys_ni_syscall’ (it’s the value found initially in location 17)
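As a warm-up for the counting step only, the search can be tried on a stand-in table in user space. Everything in this sketch is hypothetical: the table, its five entries, and the routine names are invented, and the kernel-module and pseudo-file parts of the exercise are left untouched.

```c
#include <errno.h>
#include <stddef.h>

typedef long (*service_fn)(void);

/* Hypothetical stand-ins for an unimplemented and an implemented routine. */
long demo_ni_syscall(void) { return -ENOSYS; }
long demo_real_call(void)  { return 0; }

#define DEMO_NR_SYSCALLS 5      /* stand-in for the kernel's NR_syscalls */

service_fn demo_table[DEMO_NR_SYSCALLS] = {
    demo_real_call, demo_ni_syscall, demo_real_call,
    demo_ni_syscall, demo_real_call
};

/* Count how many of the first n table entries point to 'unused'. */
size_t count_unused(const service_fn *table, size_t n, service_fn unused)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (table[i] == unused)
            count++;
    return count;
}
```

Here `count_unused(demo_table, DEMO_NR_SYSCALLS, demo_ni_syscall)` returns 2, since two of the five stand-in entries point to the "not implemented" routine.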