Digging into ext2 Nezer J. Zaidenberg
Agenda
• Linux kernel module programming (summary from TLDP)
• Abstract (virtual) functions in C
• Initial code review and work on ext2
• How to start working on ex3 + more digging methods
• Some clues on ex3
References
• The Linux kernel - www.kernel.org (or local mirror)
• The Linux Documentation Project – Kernel module programming – 2.6, www.tldp.org (http://www.tldp.org/LDP/lkmpg/2.6/html/lkmpg.html)
• Under the KERNEL tree
  • Documentation/kbuild
  • fs/*.c
  • fs/ext2/*.c
• UNIX Filesystems
• Understanding the Linux Kernel
• The Linux Kernel (from tldp.org) has a fairly reasonable description of ext2.h and file systems in general, but it refers to version 2.0 of Linux (so the concepts are correct but some files have different names!)
Hello world module (1/2) in kernel 2.6
    #include <linux/module.h>   /* Needed by all modules */
    #include <linux/kernel.h>   /* Needed for KERN_INFO */
    #include <linux/init.h>     /* Needed for the macros */

    static int __init hello_2_init(void)
    {
        printk(KERN_INFO "Hello, world 2\n");
        return 0;
    }
Hello world module (2/2)
    static void __exit hello_2_exit(void)
    {
        printk(KERN_INFO "Goodbye, world 2\n");
    }

    module_init(hello_2_init);
    module_exit(hello_2_exit);
Makefile for kernel module (recipe lines must start with a tab)
    obj-m += hello-2.o

    all:
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

    clean:
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Makefile for multiple objects (recipe lines must start with a tab)
    obj-m += startstop.o
    startstop-objs := start.o stop.o

    all:
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) modules

    clean:
    	make -C /lib/modules/$(shell uname -r)/build M=$(PWD) clean
Implementing abstract functions in C
    class base {
    public:
        virtual void foo();
        virtual void bar();
        Widget W;
    };

    class derived : public base {   // extends class base
    public:
        void foo();                 // overrides base::foo
    };
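Calling base and derived functions
    base b;
    derived d;
    base *ptr = &b;
    (*ptr).foo();   // calls base::foo
    (*ptr).bar();   // calls base::bar
    ptr = &d;
    (*ptr).foo();   // calls derived::foo
    (*ptr).bar();   // calls base::bar (derived does not override it)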
The virtual implementation
• When virtual functions are implemented in memory, every object carries a pointer to a “virtual table” (one table per class)
• The virtual table holds a pointer to the implementation of each virtual function
• When we call (*ptr).foo(), the call goes through the virtual table and looks up the entry for foo
• Once the entry is found, the right function is called
Abstract functions in C
• The fact that C has no virtual table built for us by the compiler does not mean we cannot build one ourselves. (If you think it through, everything in C++/Java compiles to assembler (close to C) with no classes either)
• We just have to build the virtual table ourselves…
• To every struct widget we can add a struct widget_operations * that holds pointers to the functions that operate on widget (and can be implemented as we want)
• Since the C compiler lacks the class and inheritance mechanism, we have to do lots of things ourselves, such as
  • Deliver the this pointer to the functions
  • Initialize the struct widget_operations
  • Call the function in a slightly awkward way (b.foo() becomes b.ops->foo(&b))
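The points above can be sketched in plain user-space C. This is only an illustration: struct widget, struct widget_operations, and the base_/derived_ functions are hypothetical names chosen to mirror the C++ slides, not kernel code.

```c
#include <assert.h>
#include <stddef.h>

struct widget;   /* forward declaration so the table can refer to it */

/* Hand-built "virtual table": one function pointer per virtual method. */
struct widget_operations {
    int (*foo)(struct widget *self);   /* self plays the role of "this" */
    int (*bar)(struct widget *self);
};

struct widget {
    const struct widget_operations *ops;  /* selects the implementation */
    int value;
};

/* "base" implementations (hypothetical behaviour, for illustration only) */
static int base_foo(struct widget *self) { return self->value; }
static int base_bar(struct widget *self) { return self->value + 1; }

/* "derived" overrides foo only */
static int derived_foo(struct widget *self) { return self->value * 10; }

static const struct widget_operations base_ops = {
    .foo = base_foo,
    .bar = base_bar,
};

static const struct widget_operations derived_ops = {
    .foo = derived_foo,
    .bar = base_bar,      /* "inherited" by reusing the base pointer */
};

/* Equivalent of w.foo(): look up the pointer, pass the object explicitly. */
static int widget_foo(struct widget *w)
{
    assert(w != NULL && w->ops != NULL);
    return w->ops->foo(w);
}

static int widget_bar(struct widget *w)
{
    assert(w != NULL && w->ops != NULL);
    return w->ops->bar(w);
}
```

A caller builds an object with, for example, struct widget d = { &derived_ops, 4 }; and then widget_foo(&d) reaches derived_foo while widget_bar(&d) still reaches base_bar.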
The abstract interface
    struct file_operations {
        struct module *owner;
        loff_t (*llseek) (struct file *, loff_t, int);
        ssize_t (*read) (struct file *, char __user *, size_t, loff_t *);
        ssize_t (*aio_read) (struct kiocb *, char __user *, size_t, loff_t);
        ssize_t (*write) (struct file *, const char __user *, size_t, loff_t *);
        ssize_t (*aio_write) (struct kiocb *, const char __user *, size_t, …
Complete struct (and others) found under the kernel source tree in include/linux/fs.h
Filling the abstract interface (GCC)
    struct file_operations fops = {
        read: device_read,
        write: device_write,
        open: device_open,
        release: device_release
    };
Filling the abstract interface (C99)
    struct file_operations fops = {
        .read = device_read,
        .write = device_write,
        .open = device_open,
        .release = device_release
    };
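One consequence of the C99 form is worth seeing in isolation: members that are not named in a designated initializer are zero-initialized, so the unfilled function pointers are NULL and a caller has to check before calling through them. The sketch below uses a hypothetical struct demo_operations in user space; the fallback return values are assumptions made for the example, not the kernel's actual behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much smaller cousin of struct file_operations. */
struct demo_operations {
    int (*open)(void);
    int (*read)(char *buf, size_t len);
    int (*release)(void);
};

/* Fills the buffer with 'x' and reports how many bytes were produced. */
static int demo_read(char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        buf[i] = 'x';
    return (int)len;
}

/* Only .read is named; .open and .release are implicitly NULL. */
static const struct demo_operations demo_ops = {
    .read = demo_read,
};

/* Caller side: treat a missing open() as "nothing to do". */
static int call_open(const struct demo_operations *ops)
{
    assert(ops != NULL);
    if (!ops->open)
        return 0;
    return ops->open();
}

/* Caller side: a missing read() is reported as an error (-1 here). */
static int call_read(const struct demo_operations *ops, char *buf, size_t len)
{
    assert(ops != NULL);
    if (!ops->read)
        return -1;
    return ops->read(buf, len);
}
```

With demo_ops as defined, call_open(&demo_ops) takes the default path because .open was never filled in, while call_read(&demo_ops, buf, 4) reaches demo_read.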
fs/ext2/super.c (lines 1426-1434)
    static void __exit exit_ext2_fs(void)
    {
        unregister_filesystem(&ext2_fs_type);
        destroy_inodecache();
        exit_ext2_xattr();
    }

    module_init(init_ext2_fs)
    module_exit(exit_ext2_fs)
fs/ext2/super.c (lines 1407-1415)
    static int __init init_ext2_fs(void)
    {
        int err = init_ext2_xattr();
        if (err)
            return err;
        err = init_inodecache();
        if (err)
            goto out1;
        err = register_filesystem(&ext2_fs_type);
fs/ext2/super.c (lines 1416-1424, continued)
        if (err)
            goto out;
        return 0;
    out:
        destroy_inodecache();
    out1:
        exit_ext2_xattr();
        return err;
    }
What have we just seen
• Init the module
• Exit
• Register/unregister the file system
• It would be perfectly reasonable for your module to just call register_filesystem and unregister_filesystem in its init and exit functions (you don’t REALLY need the inode cache, xattr, etc.)
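The init function above undoes its earlier steps in reverse order when a later step fails, using goto labels. A user-space sketch of the same pattern follows; step_init, step_undo, and the trace array are hypothetical stand-ins so the order of calls can be observed, and they are not kernel functions.

```c
#include <assert.h>

/* Records which steps ran: +id for an init, -id for an undo. */
static int trace[8];
static int ntrace;
static int fail_at;     /* id of the step that should fail, 0 for none */

static void record(int step)
{
    assert(ntrace < 8);
    trace[ntrace++] = step;
}

/* Hypothetical init step: fails only when its id matches fail_at. */
static int step_init(int id)
{
    record(id);
    return fail_at == id ? -1 : 0;
}

static void step_undo(int id)
{
    record(-id);
}

/* Same shape as init_ext2_fs: three steps, unwound in reverse on error. */
static int demo_init(void)
{
    int err = step_init(1);     /* plays the role of init_ext2_xattr() */
    if (err)
        return err;
    err = step_init(2);         /* plays the role of init_inodecache() */
    if (err)
        goto out1;
    err = step_init(3);         /* plays the role of register_filesystem() */
    if (err)
        goto out;
    return 0;
out:
    step_undo(2);
out1:
    step_undo(1);
    return err;
}

/* Resets the trace, chooses the failing step, and runs the sequence. */
static int run_demo(int which_fails)
{
    ntrace = 0;
    fail_at = which_fails;
    return demo_init();
}
```

Calling run_demo(3) makes the last step fail, and the trace then shows steps 1, 2, 3 followed by the undo of 2 and the undo of 1. Each label undoes exactly the steps that had already succeeded.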
struct ext2_fs_type (fs/ext2/super.c, lines 1399-1405)
    static struct file_system_type ext2_fs_type = {
        .owner = THIS_MODULE,
        .name = "ext2",
        .get_sb = ext2_get_sb,
        .kill_sb = kill_block_super,
        .fs_flags = FS_REQUIRES_DEV,
    };
What have we just seen
• We have 5 fields
• .owner = THIS_MODULE (so while this file system is mounted we cannot unload this module)
• .name = “ext2” (the name that will be given to mount with -t)
• .get_sb = what to do to get the super block (calls an ext2 function)
• .kill_sb = how to free the super block (calls a Linux function)
• .fs_flags = FS_REQUIRES_DEV (this file system is backed by a device)
ext2_get_sb (from fs/ext2/super.c, lines 1288-1292)
    static int ext2_get_sb(struct file_system_type *fs_type,
            int flags, const char *dev_name, void *data, struct vfsmount *mnt)
    {
        return get_sb_bdev(fs_type, flags, dev_name, data, ext2_fill_super, mnt);
    }
What have we just seen
• We call the standard Linux kernel function that gets the super block for a file system based on a BLOCK device. This function initializes the block device
• We pass this function, as an argument, a function that fills the file system's private super block (reads it from disk), called ext2_fill_super
Continuing the dig… fs/super.c
• You’ll find get_sb_bdev at lines 751-813 (note it’s a different file)
• This function does a lot we don’t really care about (initializing the block device, etc.), but in lines 795-800 you will find:
    795 error = fill_super(s, data, flags & MS_SILENT ? 1 : 0);
    796 if (error) {
    797     up_write(&s->s_umount);
    798     deactivate_super(s);
    799     goto error;
    800 }
• Let’s take it from there
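The hand-off seen here (our get_sb calls a generic helper, which calls back into our fill_super) can be sketched in user space. Everything below is a simplified stand-in: struct demo_super, generic_get_sb, and the uxfs_ functions are hypothetical and only imitate the shape of get_sb_bdev and ext2_fill_super.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct super_block. */
struct demo_super {
    unsigned magic;
    void *fs_info;
};

/* Same shape as the fill_super argument: (sb, data, silent). */
typedef int (*fill_super_t)(struct demo_super *sb, void *data, int silent);

/* "Generic" layer, playing the role of get_sb_bdev: it prepares the
 * super block, then lets the file system fill in its own part. */
static int generic_get_sb(struct demo_super *sb, void *data,
                          fill_super_t fill_super)
{
    int error;

    assert(sb != NULL && fill_super != NULL);
    sb->magic = 0;
    sb->fs_info = NULL;

    error = fill_super(sb, data, 0);
    if (error) {
        sb->magic = 0;      /* generic layer cleans up after a failure */
        return error;
    }
    return 0;
}

/* File-system-specific part: here "data" simply carries a magic number. */
static int uxfs_fill_super(struct demo_super *sb, void *data, int silent)
{
    (void)silent;
    if (!data)
        return -1;
    sb->magic = *(unsigned *)data;
    return 0;
}

/* Our entry point: delegate to the generic layer, passing our callback. */
static int uxfs_get_sb(struct demo_super *sb, void *data)
{
    return generic_get_sb(sb, data, uxfs_fill_super);
}
```

The control flow is our uxfs_get_sb, then the generic helper, then our uxfs_fill_super. A failure in the callback is reported back through the generic layer, which is the part shown in lines 795-800.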
Back to fs/ext2/super.c (excerpts from lines 738-760)
    static int ext2_fill_super(struct super_block *sb, void *data, int silent)
    {
        struct buffer_head *bh;
        …
        sbi = kzalloc(sizeof(*sbi), GFP_KERNEL);
        if (!sbi)
            return -ENOMEM;
        sb->s_fs_info = sbi;
        sbi->s_sb_block = sb_block;
        …
fs/ext2/super.c (lines 779-784)
        if (blocksize != BLOCK_SIZE) {
            logic_sb_block = (sb_block*BLOCK_SIZE) / blocksize;
            offset = (sb_block*BLOCK_SIZE) % blocksize;
        } else {
            logic_sb_block = sb_block;
        }
fs/ext2/super.c (lines 786-795)
        if (!(bh = sb_bread(sb, logic_sb_block))) {
            printk ("EXT2-fs: unable to read superblock\n");
            goto failed_sbi;
        }
        /*
         * Note: s_es must be initialized as soon as possible because
         * some ext2 macro-instructions depend on its value
         */
        es = (struct ext2_super_block *) (((char *)bh->b_data) + offset);
        sbi->s_es = es;
fs/ext2/super.c (excerpts from lines 798-1045)
        …
        if (sb->s_magic != EXT2_SUPER_MAGIC)
            goto cantfind_ext2;
        …
        sb->s_op = &ext2_sops;
        …
        root = ext2_iget(sb, EXT2_ROOT_INO);
fs/ext2/super.c (lines 299-314)
    static const struct super_operations ext2_sops = {
        .alloc_inode = ext2_alloc_inode,
        .destroy_inode = ext2_destroy_inode,
        .write_inode = ext2_write_inode,
        .delete_inode = ext2_delete_inode,
        .put_super = ext2_put_super,
        .write_super = ext2_write_super,
        .statfs = ext2_statfs,
        .remount_fs = ext2_remount,
        .clear_inode = ext2_clear_inode,
        .show_options = ext2_show_options,
    #ifdef CONFIG_QUOTA
        .quota_read = ext2_quota_read,
        .quota_write = ext2_quota_write,
    #endif
    };
What have we just seen
• The function finds the exact block to read, based on the block device's block size and the block size in which the super block location is expressed
• The function reads the super block via the sb_bread API for block devices (the data is read into a struct buffer_head)
• The function stores private data
• The function checks the magic number (you should do so too)
• The function sets the super block ops
• The function reads the root directory
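The first and fourth points can be worked through with numbers. The sketch below repeats the block/offset arithmetic from lines 779-784 in user space and then checks the ext2 magic number (0xEF53, stored little-endian at byte 56 of the super block) in an in-memory image. The DEMO_ names, locate_sb, and has_ext2_magic are hypothetical helpers for illustration; a real module would go through sb_bread instead of indexing a buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_BLOCK_SIZE   1024u    /* unit of sb_block (BLOCK_SIZE in the slides) */
#define DEMO_MAGIC_OFFSET 56u      /* byte offset of s_magic in the ext2 super block */
#define DEMO_EXT2_MAGIC   0xEF53u

struct sb_location {
    unsigned long logic_sb_block;  /* device block that holds the super block */
    unsigned long offset;          /* byte offset of the super block inside it */
};

/* Same arithmetic as the ext2_fill_super excerpt. */
static struct sb_location locate_sb(unsigned long sb_block,
                                    unsigned long blocksize)
{
    struct sb_location loc;

    assert(blocksize != 0);
    if (blocksize != DEMO_BLOCK_SIZE) {
        loc.logic_sb_block = (sb_block * DEMO_BLOCK_SIZE) / blocksize;
        loc.offset = (sb_block * DEMO_BLOCK_SIZE) % blocksize;
    } else {
        loc.logic_sb_block = sb_block;
        loc.offset = 0;
    }
    return loc;
}

/* Reads a little-endian 16-bit value regardless of host byte order. */
static uint16_t read_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

/* Returns 1 if the image carries the ext2 magic where the super block
 * is expected, 0 otherwise (including when the image is too short). */
static int has_ext2_magic(const unsigned char *image, size_t image_len,
                          unsigned long sb_block, unsigned long blocksize)
{
    struct sb_location loc = locate_sb(sb_block, blocksize);
    size_t pos = loc.logic_sb_block * blocksize + loc.offset + DEMO_MAGIC_OFFSET;

    if (pos + 2 > image_len)
        return 0;
    return read_le16(image + pos) == DEMO_EXT2_MAGIC;
}
```

For example, with sb_block = 1 and a 4096-byte device block size, the super block sits in device block 0 at offset 1024. With a 1024-byte block size it sits in block 1 at offset 0. Both describe the same byte position, 1024, on the device.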
Saving private data
• Linux struct super_block has a void * called s_fs_info
• Linux struct inode has a void * called i_private
• Those pointers can be used to save private data (your file system's own structures)
• You can read that data back through the same pointers later
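A user-space sketch of the same idea: a generic structure carries a void * that the file system points at its own data and later casts back. struct demo_super_block, struct uxfs_sb_info, and the helper names are hypothetical; kernel code would allocate with kzalloc and kfree rather than calloc and free.

```c
#include <assert.h>
#include <stdlib.h>

/* Stand-in for struct super_block: only the private-data pointer. */
struct demo_super_block {
    void *s_fs_info;
};

/* Hypothetical per-file-system data hung off s_fs_info. */
struct uxfs_sb_info {
    unsigned long nblocks;
    unsigned long ninodes;
};

/* Accessor: recover the typed pointer from the generic structure. */
static struct uxfs_sb_info *UXFS_SB(struct demo_super_block *sb)
{
    assert(sb != NULL);
    return sb->s_fs_info;
}

/* Allocate the private data and attach it (done once, at fill_super time). */
static int uxfs_attach_info(struct demo_super_block *sb,
                            unsigned long nblocks, unsigned long ninodes)
{
    struct uxfs_sb_info *sbi = calloc(1, sizeof(*sbi));

    if (!sbi)
        return -1;
    sbi->nblocks = nblocks;
    sbi->ninodes = ninodes;
    sb->s_fs_info = sbi;
    return 0;
}

/* Release the private data (done once, when the super block goes away). */
static void uxfs_release_info(struct demo_super_block *sb)
{
    free(sb->s_fs_info);
    sb->s_fs_info = NULL;
}
```

Any later operation that receives the super block can call UXFS_SB(sb) to reach the file system's own fields without the generic layer knowing their type.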
Include files and structs
• include/linux/fs.h has most struct definitions, including
  • struct super_block (lines 1106-1176)
  • struct inode (lines 623-688)
  • struct block_device (lines 550-580)
• struct buffer_head is found at line 60 of buffer_head.h
• Operations
  • super_operations: fs.h 1358
  • inode_operations: fs.h 1311
  • file_operations: fs.h 1281
  • address_space_operations: fs.h 486 (for mmap)
We will often go from our file system code to Linux code and back
• Our uxfs_get_sb -> calls Linux get_sb_bdev -> which calls our uxfs_fill_sb
• Our inode ops -> are filled with Linux do_sync_read -> which calls our get_block
• This makes sense, so don’t be surprised when you see it
How to start working (from scratch) on ex3
• Learn:
  • Build the hello world module
  • Build a blank (do nothing) file system module
• Think:
  • What are you going to hold in your file system? Think about the super block and inodes
• Implement:
  • Write mkfs.uxfs (a user program)
  • Write fsdebug (a user program that prints/manipulates your file system)
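As a starting point for the two user programs, here is a minimal sketch of the core of an mkfs-style writer and an fsdebug-style reader. The on-disk layout is entirely hypothetical (a 512-byte block, a four-field super block in block 0, an invented magic value, host byte order); the exercise's real layout is for you to design.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define UXFS_DEMO_MAGIC 0x58465355u   /* invented value, not a real magic number */
#define UXFS_DEMO_BLOCK 512u          /* assumed block size for this sketch */

/* Hypothetical on-disk super block, stored at the start of block 0. */
struct uxfs_demo_super {
    uint32_t magic;
    uint32_t nblocks;
    uint32_t ninodes;
    uint32_t first_data_block;
};

/* mkfs side: zero every block, then write the super block into block 0. */
static int uxfs_demo_mkfs(FILE *img, uint32_t nblocks, uint32_t ninodes)
{
    unsigned char block[UXFS_DEMO_BLOCK];
    struct uxfs_demo_super sb;
    uint32_t i;

    assert(img != NULL);
    memset(block, 0, sizeof block);
    rewind(img);
    for (i = 0; i < nblocks; i++)
        if (fwrite(block, sizeof block, 1, img) != 1)
            return -1;

    sb.magic = UXFS_DEMO_MAGIC;
    sb.nblocks = nblocks;
    sb.ninodes = ninodes;
    sb.first_data_block = 1;          /* assumption: data starts after block 0 */
    memcpy(block, &sb, sizeof sb);

    rewind(img);
    if (fwrite(block, sizeof block, 1, img) != 1)
        return -1;
    return fflush(img) == 0 ? 0 : -1;
}

/* fsdebug side: read the super block back and verify the magic number. */
static int uxfs_demo_read_super(FILE *img, struct uxfs_demo_super *out)
{
    assert(img != NULL && out != NULL);
    rewind(img);
    if (fread(out, sizeof *out, 1, img) != 1)
        return -1;
    return out->magic == UXFS_DEMO_MAGIC ? 0 : -1;
}
```

A real mkfs.uxfs would open the device or image file named on the command line and also lay out inodes and the root directory. Getting the super block round trip right in user space first makes the later kernel-side read much easier to debug.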
Ex 3
• Move to the kernel
• Make your file system module mountable (mount your floppy)
  • (make sure you read the SB correctly)
• Implement directories (so we can ls the root of the file system)
• Implement files (via read/write/lseek)
• Implement mmap
• You’re done; you can work on the bonus now
More digging methods + some advice
• Install the kernel-devel-2.6.27.5 RPM from ftp://mirror.isoc.org.il/fedora/releases/10/Fedora/i386/os/Packages/kernel-devel-2.6.27.5-117.fc10.i686.rpm
  • rpm -hiv --force kernel-devel-2.6.27.5-117.fc10.i686.rpm
• Download the complete kernel source from www.kernel.org
• You can add a printk to each function that ext2 calls in each operation as you traverse the file
Some clues on ex3
• You can use the UNIX Filesystems solutions (but some porting is required)
• You can use the ext2 solution (which is a complete solution + all bonuses) but it is “encumbered” with many performance optimizations you don’t need
• If ext2 has its own implementation of a function you need -> you should probably implement it as well
• If ext2 uses the Linux default implementation -> it should be OK for you too
• All ext2 functions are prefixed with ext2
• Use something to help you navigate the kernel (I use vi with ctags, grep and cscope)
minix
• Minix is a “mini-Unix” (an OS built by Andrew Tanenbaum for educational purposes, which later inspired Linus Torvalds to write Linux). It has a Unix-like file system MUCH simpler than ext2
• I am NOT familiar enough with minix to recommend it, but from what I browsed it has all the features you need for the ex (+ some extra) and is much simpler than ext2
• The implementation of minixfs under Linux was authored by Linus Torvalds in 1991-1992
For those of you who want to base your work on MINIX FS
• Ignore everything that has to do with minix versions (for simplicity assume minix version 1)
• Ignore everything that has to do with aio (your file system is not required to support aio)
• Ignore everything that has to do with inode_cache (a performance optimization that is not required)
• The rest is pretty much your homework… (with some bonuses such as bitmaps)
More minix
• Zones = blocks
• Zmap = block map
• Minix (including its file system) is documented in Andrew Tanenbaum's book Operating Systems (latest edition with Albert Woodhull). That documentation describes the MINIX code for minixfs (which is different from the Linux code); still, it can help if you don’t understand the structure
Some VMware and Linux admin clues
• Install with rpm -hiv (package you downloaded) (may require --force if you are installing an older package)
• yum install emacs (or eclipse or anything)