
LKCD Linux Kernel Crash Dumps




  1. LKCD: Linux Kernel Crash Dumps
     Matt D. Robinson
     matt@aparity.com

  2. LKCD Overview
     • Description
     • Kernel Implementation
     • Configuration
     • Invocation/Kernel State
     • User-Level Analysis (lcrash)
     • lcrash Example Output
     • Future Development/Evolution
     Version 1.0

  3. Description
     LKCD is a set of kernel and application code for configuring, capturing, and analyzing system crash dumps. These slides cover the kernel side of LKCD at a high level, with a brief introduction to the user-level analysis tools.

  4. Kernel Implementation
     • dump.o is the primary kernel driver; it can be built either as a module or directly into the kernel
     • The dump driver is dormant until it is invoked, either for configuration or for dumping
     • The configuration of the dump device determines what occurs on invocation
     • Both disruptive and non-disruptive dumping are available

  5. Kernel Implementation
     • Dump compression is available through modules (or standalone): GZIP or RLE
     • The dump driver is accessed through /dev/dump (device pair 227,0)
     • panic() or die_if_kernel() will invoke the dumping process
       • Dumping only occurs if dumps are configured
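To illustrate why a scheme as simple as RLE can pay off for memory dumps, here is a minimal run-length encoder and decoder. It uses a generic (count, value) byte-pair format chosen purely for illustration, not LKCD's actual RLE format, and the function names `rle_encode` and `rle_decode` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Encode src[0..n) as (count, value) byte pairs, with count in 1..255.
 * Returns the encoded length, or 0 if dst (capacity cap) is too small.
 * NOTE: illustrative format only; this is NOT LKCD's on-disk RLE format. */
size_t rle_encode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    size_t i = 0;
    while (i < n) {
        unsigned char value = src[i];
        size_t run = 1;
        /* Extend the run while the byte repeats, up to the 255 limit. */
        while (i + run < n && src[i + run] == value && run < 255)
            run++;
        if (out + 2 > cap)
            return 0;
        dst[out++] = (unsigned char)run;
        dst[out++] = value;
        i += run;
    }
    return out;
}

/* Expand (count, value) pairs back into bytes.
 * Returns the decoded length, or 0 if dst (capacity cap) is too small. */
size_t rle_decode(const unsigned char *src, size_t n,
                  unsigned char *dst, size_t cap)
{
    assert(src != NULL && dst != NULL);
    size_t out = 0;
    for (size_t i = 0; i + 1 < n; i += 2) {
        size_t run = src[i];
        if (out + run > cap)
            return 0;
        for (size_t k = 0; k < run; k++)
            dst[out++] = src[i + 1];
    }
    return out;
}
```

In this format a zero-filled 4096-byte page encodes to 34 bytes (16 runs of 255 plus one run of 16), which is the kind of saving that makes run-length encoding attractive for pages with long runs of identical bytes.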

  6. Kernel Implementation
     • The current dump path uses the existing I/O subsystem for dumping
     • Disks (primarily swap) are used for now; the future direction will be MUCH different
     Dump path diagram (functions in the order listed on the slide):
       panic(), die_if_kernel()
       dump()
       dump_execute()
       dump_add_page()
       dump_write_pages()
       dump_compress_page()
       I/O Subsystem (Disk, Network, Etc.)

  7. Configuration
     Dump configuration takes place via ioctl() to the kernel driver:
     • DIOSDUMPLEVEL
       • DUMP_LEVEL_NONE: Don’t dump any pages
       • DUMP_LEVEL_ALL: Dump all memory pages
       • DUMP_LEVEL_KERN: Dump just kernel-level pages
     • DIOSDUMPFLAGS
       • DUMP_FLAGS_NONE: No flags set
       • DUMP_FLAGS_NONDISRUPT: Try to continue standard system operation after a dump takes place

  8. Configuration
     • DIOSDUMPCOMPRESS
       • DUMP_COMPRESS_NONE: Raw dump format
       • DUMP_COMPRESS_RLE: Use RLE compression
       • DUMP_COMPRESS_GZIP: Use GZIP compression
     • DIOSDUMPDEV
       • This is the device to dump to (for example, /dev/sda4)
     Each configuration parameter depends on the system state, on whether dump compression is loaded into the kernel, etc.
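The ioctl() settings listed on the configuration slides could be applied from user space roughly as follows. This is a minimal sketch, not LKCD's actual tooling: the `SKETCH_` request codes and setting values are hypothetical placeholders (the real ones are defined by the LKCD kernel header), and passing each setting by value as the third ioctl() argument, including the dump device as a device number, is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/ioctl.h>

/* HYPOTHETICAL request codes: placeholders, not the values LKCD defines. */
enum {
    SKETCH_DIOSDUMPDEV      = 1,
    SKETCH_DIOSDUMPLEVEL    = 2,
    SKETCH_DIOSDUMPFLAGS    = 3,
    SKETCH_DIOSDUMPCOMPRESS = 4
};

/* HYPOTHETICAL setting values, named after the options on the slides. */
enum { SKETCH_LEVEL_NONE, SKETCH_LEVEL_KERN, SKETCH_LEVEL_ALL };
enum { SKETCH_FLAGS_NONE = 0, SKETCH_FLAGS_NONDISRUPT = 1 };
enum { SKETCH_COMPRESS_NONE, SKETCH_COMPRESS_RLE, SKETCH_COMPRESS_GZIP };

struct dump_config {
    unsigned long dev;  /* assumed: device number of the dump device */
    int level;
    int flags;
    int compress;
};

/* Returns 1 if every field holds one of the sketch values above, else 0. */
int dump_config_valid(const struct dump_config *cfg)
{
    assert(cfg != NULL);
    return cfg->level >= SKETCH_LEVEL_NONE && cfg->level <= SKETCH_LEVEL_ALL
        && (cfg->flags == SKETCH_FLAGS_NONE
            || cfg->flags == SKETCH_FLAGS_NONDISRUPT)
        && cfg->compress >= SKETCH_COMPRESS_NONE
        && cfg->compress <= SKETCH_COMPRESS_GZIP;
}

/* Issues one ioctl() per setting on an open dump-device descriptor.
 * Returns 0 on success, or -1 if the config is invalid or any ioctl()
 * fails (in which case errno is left as set by ioctl()). */
int apply_dump_config(int fd, const struct dump_config *cfg)
{
    if (!dump_config_valid(cfg))
        return -1;

    const struct { unsigned long request; unsigned long value; } steps[] = {
        { SKETCH_DIOSDUMPDEV,      cfg->dev },
        { SKETCH_DIOSDUMPCOMPRESS, (unsigned long)cfg->compress },
        { SKETCH_DIOSDUMPFLAGS,    (unsigned long)cfg->flags },
        { SKETCH_DIOSDUMPLEVEL,    (unsigned long)cfg->level },
    };
    for (size_t i = 0; i < sizeof steps / sizeof steps[0]; i++) {
        if (ioctl(fd, steps[i].request, steps[i].value) == -1)
            return -1;
    }
    return 0;
}
```

A caller would open /dev/dump, fill in a `struct dump_config`, and call `apply_dump_config(fd, &cfg)`. The order of the four ioctl() calls here is arbitrary; the slides do not say whether the driver requires a particular order.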

  9. User-Level Analysis (lcrash)
     Linux Crash (lcrash) is used for analyzing system crash dumps. It is a powerful tool that helps support and engineering personnel find solutions to kernel crashes:
     • Evaluates CPU state
       • Mode, register settings, etc.
     • Displays all tasks
       • Includes which task is running on a given CPU
     • Stack trace for each running task
       • This is accomplished WITHOUT frame pointers built into the kernel (-fomit-frame-pointer)
     • Allows for memory dumping, struct analysis, finding symbols, etc.
     • lcrash is amazingly versatile for problem analysis
     • Crash dump reports can be created automatically on boot-up after a system crash

  10. lcrash Example Output
      >> stat | head
      sysname    : Linux
      nodename   : crashme.atmyhouse.com
      release    : 2.4.8
      version    : #9 SMP Mon Dec 10 00:05:19 PST 2001
      machine    : i686
      domainname : (none)
      LOG_BUF:
      >> dump log_buf 10
      0xc0332c60: 4c3e343c 78756e69 72657620 6e6f6973 : <4>Linux version
      0xc0332c70: 342e3220 2820382e 746f6f72 74617740 : 2.4.8 (root@cra
      0xc0332c80: 79657265 70612e65 : shme.atm
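The `dump log_buf` output pairs each row of 32-bit words with the text those bytes spell. Because the dump comes from an i686 (little-endian) machine, the lowest-order byte of each word is the first character. A minimal sketch of that conversion follows; `words_to_ascii` is a hypothetical helper name, not part of lcrash.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/* Convert 32-bit words, as printed in a hex dump of little-endian memory,
 * back to the characters they hold. Non-printable bytes become '.'.
 * out must have room for 4 * count + 1 bytes. */
void words_to_ascii(const uint32_t *words, size_t count, char *out)
{
    assert(words != NULL && out != NULL);
    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        for (int shift = 0; shift < 32; shift += 8) {
            /* Lowest-order byte first: it sits at the lowest address. */
            unsigned char byte = (unsigned char)((words[i] >> shift) & 0xffu);
            out[pos++] = isprint(byte) ? (char)byte : '.';
        }
    }
    out[pos] = '\0';
}
```

Passing the four words of the first row (0x4c3e343c, 0x78756e69, 0x72657620, 0x6e6f6973) yields `<4>Linux version`, matching the text column on that row.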

  11. lcrash Example Output
      >> task
      ADDR       UID  PID PPID STATE FLAGS CPU NAME
      ======================================================================
      0xc02e4000   0    0    0     0     0   - swapper
      0xdfffc000   0    1    0     0 0x100   - init
      0xdfff2000   0    2    1     1  0x40   - keventd
      0xdffee000   0    3    0     0  0x40   - ksoftirqd_CPU0
      [ . . . ]
      0xde47a000   0  867    1     1 0x100   - mingetty
      0xda0fe000   0 1017  660     0 0x140   - sshd
      0xd9c06000   0 1018 1017     1 0x100   - bash
      0xde4b4000   0 1101 1018     0 0x100   0 insmod
      ======================================================================
      31 active task structs found

  12. lcrash Example Output
      >> t 0xda0fe000
      =========================================================
      STACK TRACE FOR TASK: 0xda0fe000(sshd)
      0 schedule+1040 [0xc0111250]
      1 schedule_timeout+121 [0xc0110d89]
      2 do_select+506 [0xc014251a]
      3 sys_select+820 [0xc01428c4]
      4 system_call+44 [0xc0106ed4]
      =========================================================
      >> fsym panic_timeout
      ADDR       OFFSET TYPE        NAME
      ============================================================
      0xc0332804      0 GLOBAL_DATA panic_timeout
      ============================================================
      1 symbol found
      >> od panic_timeout
      0xc0332804: 00000005 : ....

  13. lcrash Example Output
      >> px ((struct task_struct *)0xd8abf000).thread.esp0
      0x15a159
      >> px ((struct task_struct *)0xd8abf000).thread.debugreg[0]
      0x0
      >> whatis user_struct
      struct user_struct {
          atomic_t __count;
          atomic_t processes;
          atomic_t files;
          struct user_struct *next;
          struct user_struct **pprev;
          uid_t uid;
      };
      >> px (struct user_struct *)(((struct task_struct *)0xd8abf000).user).uid
      0xfffff000

  14. Future Development/Evolution
      • The 2.5 implementation of LKCD will use dump methods to allow multiple dumping paths through the kernel (multiple devices!)
      • Low-level device drivers will register their own set of dump functions, so that each driver does what it thinks is correct
      • lcrash and the other LKCD utilities will be extended to support this functionality
      • LKCD will be extended to work on other operating systems (such as FreeBSD)
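The idea of drivers registering their own dump functions can be sketched as a table of function pointers. Everything below is hypothetical and only illustrates the pattern: the `dump_methods` structure, the `register_dump_methods` and `dump_page_to_all` functions, and the two stand-in "drivers" are invented for this example and do not come from LKCD.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL per-driver dump methods; names and signatures are
 * illustrative only and are not taken from LKCD. */
struct dump_methods {
    const char *name;
    /* Writes one page to the driver's dump target; returns 0 on success. */
    int (*write_page)(const void *page, size_t len);
};

#define MAX_DUMP_TARGETS 4

static const struct dump_methods *dump_targets[MAX_DUMP_TARGETS];
static size_t dump_target_count;

/* Called by a driver to offer itself as a dump path.
 * Returns 0, or -1 if the methods are incomplete or the table is full. */
int register_dump_methods(const struct dump_methods *m)
{
    if (m == NULL || m->write_page == NULL
        || dump_target_count == MAX_DUMP_TARGETS)
        return -1;
    dump_targets[dump_target_count++] = m;
    return 0;
}

/* Sends one page down every registered path.
 * Returns how many paths accepted it. */
size_t dump_page_to_all(const void *page, size_t len)
{
    assert(page != NULL);
    size_t accepted = 0;
    for (size_t i = 0; i < dump_target_count; i++) {
        if (dump_targets[i]->write_page(page, len) == 0)
            accepted++;
    }
    return accepted;
}

/* Two stand-in "drivers" that only count the bytes they are handed. */
static size_t disk_bytes, net_bytes;

static int disk_write_page(const void *page, size_t len)
{
    (void)page;
    disk_bytes += len;
    return 0;
}

static int net_write_page(const void *page, size_t len)
{
    (void)page;
    net_bytes += len;
    return 0;
}

const struct dump_methods disk_dump_methods = { "disk", disk_write_page };
const struct dump_methods net_dump_methods  = { "net",  net_write_page };
```

With both stand-ins registered, a single call to `dump_page_to_all()` reaches the disk path and the network path, so the dump core does not need to know how either device writes its data.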

  15. Questions/Comments?
