Thursday, April 16, 2009

Linux kernel crash dumps with kdump

Kdump is the official GNU/Linux kernel crash dumping mechanism, and it is part of the vanilla kernel. Before it, there were projects such as LKCD that did the same job, but they were not part of the mainline kernel, so you had to patch the kernel yourself or rely on your Linux distribution to include them. LKCD in particular was difficult to configure, especially when it came to choosing the device to dump to.

Kexec (read what it is useful for and how to use it) was first mentioned in the GNU/Linux kernel changelog for version 2.6.7. The kexec tool is a prerequisite for the kdump mechanism. Kdump itself was first mentioned in the changelog for version 2.6.13.

How does it work? When the kernel crashes, a new, so-called capture kernel is booted via the kexec tool. The memory of the crashed kernel is left intact, so the capture kernel is able to capture it. In detail, the first kernel needs to reserve some memory for the capture kernel, which uses that region for booting. The consequence is that the memory available to the first kernel is reduced by the size of the reserved region.
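To make the reservation concrete: according to the kernel's kdump documentation, the region is requested with the crashkernel= boot parameter, for example crashkernel=128M@16M. The following is only a minimal sketch of reading that value back from a command line string. The function names are made up for this illustration, and it handles only the simple size[@offset] form with decimal numbers. It does not handle hexadecimal offsets or the range syntax.

```python
import re

# Multipliers for the K/M/G suffixes used in the simple size[@offset] form.
_UNITS = {"": 1, "K": 1024, "M": 1024 ** 2, "G": 1024 ** 3}


def parse_size(text):
    """Convert a string such as '128M' into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip(), re.IGNORECASE)
    if match is None:
        raise ValueError("unsupported size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2).upper()]


def crashkernel_reservation(cmdline):
    """Return (size_bytes, offset_bytes_or_None), or None if not set.

    Hypothetical helper: only the simple crashkernel=size[@offset]
    form is understood here.
    """
    for token in cmdline.split():
        if token.startswith("crashkernel="):
            value = token[len("crashkernel="):]
            size, _, offset = value.partition("@")
            return parse_size(size), (parse_size(offset) if offset else None)
    return None
```

On a running system the input string would typically be the content of /proc/cmdline. The returned size is the amount by which the memory of the first kernel is reduced.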

When the capture kernel is booted, the old memory can be captured from the following virtual files:
  • /proc/vmcore - memory content in ELF format
  • /dev/oldmem - really raw memory image!
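Since the ELF-format dump is an ordinary ELF core image, its first bytes can be inspected like those of any other ELF file. The sketch below is an illustration only, with a made-up function name. It decodes the fixed-size ELF header from a byte string and reports whether it describes a core file and how many program headers it declares. It does not read the memory content itself.

```python
import struct

ET_CORE = 4  # e_type value for core files in the ELF specification


def describe_elf_header(data):
    """Decode the basic fields of an ELF header held in `data` (bytes)."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF image")
    ei_class, ei_data = data[4], data[5]
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unknown ELF class or byte order")
    endian = "<" if ei_data == 1 else ">"
    # Header fields following the 16-byte e_ident array (ELF32 vs ELF64).
    layout = "HHIIIIIHHHHHH" if ei_class == 1 else "HHIQQQIHHHHHH"
    fmt = endian + layout
    if len(data) < 16 + struct.calcsize(fmt):
        raise ValueError("truncated ELF header")
    fields = struct.unpack_from(fmt, data, 16)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "little_endian": ei_data == 1,
        "is_core": fields[0] == ET_CORE,   # e_type
        "machine": fields[1],              # e_machine
        "program_headers": fields[9],      # e_phnum
    }
```

In the capture kernel, the input could be the first 64 bytes read from /proc/vmcore, opened in binary mode. Real analysis of a dump is normally left to dedicated tools.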

Next, we will look at how to initialize the kdump mechanism, how to configure it, and how to invoke it for testing purposes.

1 comment:

Robin Garner said...

On the face of it, since (according to redhat) you need ~128MB memory to boot kdump, you wouldn't enable this by default on a small-memory machine. But on a virtual machine, I'm thinking that the hypervisor wouldn't actually allocate physical memory to the kdump region until the virtual machine crashed. This would make it viable to configure on even a 512MB memory virtual machine like some of our more stripped-down virtual servers.

Have you looked into this at all ?