LLDB FreeBSD live kernel debugging support

By Michał Górny

January 7, 2022


Moritz Systems have been contracted by the FreeBSD Foundation to continue our work on modernizing the LLDB debugger’s support for FreeBSD.

The primary goal of our contract is to bring kernel debugging into LLDB. The complete Project Schedule is divided into six milestones, each taking approximately one month:

1. Improve LLDB compatibility with the GDB protocol: fix LLDB implementation errors, implement missing packets, except registers.
2. Improve LLDB compatibility with the GDB protocol: support gdb-style flexible register API.
3. Support for debugging via serial port.
4. libkvm-portable and support for debugging kernel core files in LLDB, on amd64 + arm64 platform. Support for other platforms as time permits.
5. Support for debugging the running kernel on amd64 + arm64 platform. Support for other platforms as time permits.
6. Extra month for kgdb work, processing patches on LLDB reviews or miscellaneous tasks, as time permits. Examples of misc tasks: access to extended system and process information, starting processes via shell, $_siginfo support.

The previous part of our work was focused on inspecting a snapshot of the kernel state from a vmcore. This could either be a crash dump or the memory captured from a running system. However, it is also possible to inspect (and modify) the kernel memory of a running FreeBSD system.

For this milestone, we have focused on extending the newly-added FreeBSDKernel LLDB plugin to support not only vmcores but also inspecting the live system state. We have also extended the generic plugin features to make it more useful as a kernel state debugging tool.

Live kernel debugging via VM

libkvm and libfbsdvmcore relation

Following our earlier development, there are two kernel debugging backend libraries available:

libkvm, which is part of FreeBSD’s base system and can be used to debug both the live kernel and vmcores. However, it is not portable and has very limited support for cross-architecture debugging.
libfbsdvmcore, which is a stand-alone library for vmcore debugging. It is portable and supports cross-architecture vmcores but needs to be explicitly installed on the system.

So far LLDB supported libfbsdvmcore only. Adding support for live kernel debugging required extending it to support libkvm.

While adding support for libkvm, it also made sense to use it as a fallback library for vmcores. While it does not support cross-architecture vmcores like libfbsdvmcore does, it is capable of covering the most common use case on FreeBSD systems that do not feature libfbsdvmcore yet.

Furthermore, since libkvm uses the same API for the live kernel and vmcores, we get vmcore fallback support with no additional code.

In the previous post, a program for reading a variable from a kernel core dump was presented. The following snippet modifies that program to support processing either a vmcore (given as argv[1]) or live kernel memory (when no argument is supplied). Again, it is very simple and relies on a hardcoded memory address that will only work correctly for a specific kernel build.

#include <sys/types.h> /* for ssize_t */
#include <fcntl.h> /* for O_RDONLY */
#include <kvm.h>
#include <limits.h> /* for _POSIX2_LINE_MAX */
#include <stdint.h> /* for uintptr_t */
#include <stdio.h>

/* obtained via readelf(1) */
static uintptr_t hz_addr = 0xffffffff81ed838c;

int main(int argc, char *argv[]) {
  kvm_t *kvm;
  char errbuf[_POSIX2_LINE_MAX];
  int hz;
  ssize_t rd;

  kvm = kvm_open2(/*execfile=*/ "/boot/kernel/kernel",
                  /*corefile=*/ argc > 1 ? argv[1] : "/dev/mem",
                  /*flags=*/ O_RDONLY,
                  /*errbuf=*/ errbuf,
                  /*resolver=*/ NULL);
  if (kvm == NULL) {
    printf("Failed to open kernel / core dump: %s\n", errbuf);
    return 1;
  }

  rd = kvm_read2(kvm, hz_addr, &hz, sizeof(hz));
  if (rd != sizeof(hz)) {
    printf("Failed to read hz: %s\n", kvm_geterr(kvm));
    kvm_close(kvm);
    return 1;
  }

  printf("hz = %d\n", hz);

  kvm_close(kvm);
  return 0;
}

The enhanced FreeBSDKernel plugin links to libfbsdvmcore if available, and to libkvm when on FreeBSD. Then it attempts to use them in order to open the specified core file. In order to debug the live kernel, /dev/mem needs to be used in place of the core file, e.g.:

lldb --core /dev/mem /boot/kernel/kernel

Grabbing the process list from kernel

The initial implementation of the FreeBSDKernel plugin supported only looking up the crashing thread. In order to make inspecting the system state easier and improve feature parity with KGDB, we have enhanced the plugin to load the complete process and thread list.

Structures of process/thread lists

The FreeBSD kernel uses a linked list of struct proc instances for the process list. The initial list member is pointed to by the allproc variable, and the subsequent elements are linked via the p_list fields (with a null value indicating the end of the list). Other interesting members of struct proc are:

p_pid containing the process identifier
p_comm containing the process name
p_threads linking to the process' thread list

The threads belonging to a process are stored in a linked list of struct thread instances. The initial thread is pointed to by the p_threads member of the respective struct proc. Subsequent threads are linked via the td_plist members (with a null value terminating the list). Other interesting struct thread fields are:

td_tid containing the thread identifier
td_name containing the thread name
td_oncpu containing the CPU number if the thread is on CPU, -1 otherwise
td_pcb linking to the thread’s PCB structure
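
The traversal described above can be sketched in a few lines of C. This is only an illustration: the structures below are simplified stand-ins that model just the fields named in the text, plain next pointers replace the kernel's list macros, and for_each_thread, thread_visitor and count_on_cpu are hypothetical names rather than anything found in the kernel or in LLDB.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins: the real struct proc / struct thread are far larger
 * and are laid out differently. */
struct thread {
  int td_tid;              /* thread identifier */
  const char *td_name;     /* thread name */
  int td_oncpu;            /* CPU number, or -1 when not on a CPU */
  struct thread *td_plist; /* next thread of the same process, or NULL */
};

struct proc {
  int p_pid;                /* process identifier */
  const char *p_comm;       /* process name */
  struct thread *p_threads; /* first thread of this process */
  struct proc *p_list;      /* next process, or NULL */
};

/* Hypothetical callback type, invoked once per thread. */
typedef void (*thread_visitor)(const struct proc *p, const struct thread *td,
                               void *ctx);

/* Follow allproc -> p_list and, for each process, p_threads -> td_plist.
 * Returns the number of threads visited. */
size_t for_each_thread(const struct proc *allproc, thread_visitor visit,
                       void *ctx) {
  assert(visit != NULL);
  size_t count = 0;
  for (const struct proc *p = allproc; p != NULL; p = p->p_list) {
    for (const struct thread *td = p->p_threads; td != NULL;
         td = td->td_plist) {
      visit(p, td, ctx);
      count++;
    }
  }
  return count;
}

/* Example visitor: counts the threads that are currently on a CPU.
 * ctx must point to a size_t counter. */
void count_on_cpu(const struct proc *p, const struct thread *td, void *ctx) {
  (void)p;
  if (td->td_oncpu != -1)
    ++*(size_t *)ctx;
}
```

A debugger cannot dereference these pointers directly, since they are kernel addresses; each step of the walk becomes a memory read through the core-reading library instead.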

However, td_pcb is not always the correct PCB to inspect. The following algorithm needs to be used to grab the PCB:

1. If the thread is the crashing thread (i.e. td_tid is equal to the dumptid variable), then the dumppcb variable should be used instead.
2. If the thread is on CPU (i.e. td_oncpu != -1), then the td_oncpu-th element of the stoppcbs array should be used instead.
3. Otherwise, td_pcb should be used.

An interesting feature of the FreeBSD kernel is that it defines a number of helper symbols that can be used to inspect these structures and arrays without the necessity of having the DWARF debug info for the kernel. These are:

proc_off_p_* symbols containing offsets for the struct proc fields listed above
thread_off_td_* symbols containing offsets for the struct thread fields listed above
pcb_size symbol defining the size of the PCB structure, and therefore of a single stoppcbs array element
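
To illustrate why these offsets are sufficient, here is a minimal C sketch that extracts a 32-bit field (such as p_pid, assuming it is 32 bits wide) from a raw memory image using nothing but a structure offset and a field offset. The read_i32_field helper is hypothetical, and the memory image is modelled as a plain byte buffer in host byte order; in a real debugger both the offset value and the bytes would come from reads through the core-reading library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a 32-bit field located field_off bytes into the structure that starts
 * obj_off bytes into mem. Returns 0 on success, or -1 if the field would fall
 * outside the buffer. Byte order is assumed to match the host. */
int read_i32_field(const uint8_t *mem, size_t mem_len, size_t obj_off,
                   size_t field_off, int32_t *out) {
  assert(mem != NULL && out != NULL);
  if (obj_off > mem_len || field_off > mem_len - obj_off ||
      sizeof(*out) > mem_len - obj_off - field_off)
    return -1;
  /* memcpy avoids unaligned access and strict-aliasing problems. */
  memcpy(out, mem + obj_off + field_off, sizeof(*out));
  return 0;
}
```

With a helper of this kind, no structure definition is needed at all: the field offset is simply the value stored at the corresponding proc_off_p_* or thread_off_td_* symbol.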

With all this information, LLDB can now display the list of all threads belonging to all running processes. Similarly to GDB, it does not support grouping them by process but instead displays them as a flat list of threads.

(lldb) thread list
Process 0 stopped
* thread #1: tid = 100670, 0xffffffff80c09ade kernel`doadump + 46, name = '(pid 800) sysctl (crashed)'
  thread #2: tid = 100679, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 795) csh'
  thread #3: tid = 100647, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 794) getty'
  thread #4: tid = 100676, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 793) getty'
  thread #5: tid = 100700, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 792) getty'
  [...]
  thread #28: tid = 100211, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 29) vmdaemon'
  thread #29: tid = 100210, 0xffffffff81057988 kernel`cpustop_handler + 40, name = '(pid 28) pagedaemon/dom0 (on CPU 4)'
  thread #30: tid = 100214, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 28) pagedaemon/laundry: dom0'
  thread #31: tid = 100215, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 28) pagedaemon/uma'
  [...]
  thread #195: tid = 100136, 0xffffffff80c3c8c8 kernel`sched_switch + 1768, name = '(pid 12) intr/irq1: atkbd0'
  thread #196: tid = 100137, 0xffffffff81062820 kernel`fork_trampoline, name = '(pid 12) intr/irq12: psm0'
  thread #197: tid = 100138, 0xffffffff81062820 kernel`fork_trampoline, name = '(pid 12) intr/irq7: ppc0'
  thread #198: tid = 100139, 0xffffffff81062820 kernel`fork_trampoline, name = '(pid 12) intr/swi0: uart'
  thread #199: tid = 100675, 0xffffffff81062820 kernel`fork_trampoline, name = '(pid 12) intr/irq9: acpi0 intsmb0'
  thread #200: tid = 100003, 0xffffffff81057988 kernel`cpustop_handler + 40, name = '(pid 11) idle/idle: cpu0 (on CPU 0)'
  thread #201: tid = 100004, 0xffffffff81057988 kernel`cpustop_handler + 40, name = '(pid 11) idle/idle: cpu1 (on CPU 1)'
  thread #202: tid = 100005, 0xffffffff81057988 kernel`cpustop_handler + 40, name = '(pid 11) idle/idle: cpu2 (on CPU 2)'
  [...]
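
The thread names in the listing above combine the process identifier, the process name, the thread name and the thread state. The following C sketch reproduces that naming scheme as it can be inferred from the output; the rule that the thread name is omitted when it equals the process name is an assumption drawn from entries such as csh, and format_thread_name is a hypothetical helper rather than the plugin's actual code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compose "(pid N) comm[/td_name][ (crashed)| (on CPU n)]" into buf.
 * Returns 0 on success, or -1 if the result did not fit. */
int format_thread_name(char *buf, size_t len, int pid, const char *comm,
                       const char *td_name, int td_oncpu, int crashed) {
  assert(buf != NULL && comm != NULL && td_name != NULL);

  /* Append the thread name only when it adds information (assumed rule). */
  const char *sep = "", *name = "";
  if (td_name[0] != '\0' && strcmp(td_name, comm) != 0) {
    sep = "/";
    name = td_name;
  }

  /* The crashed marker takes precedence over the CPU marker. */
  char state[32] = "";
  if (crashed)
    snprintf(state, sizeof(state), " (crashed)");
  else if (td_oncpu != -1)
    snprintf(state, sizeof(state), " (on CPU %d)", td_oncpu);

  int n = snprintf(buf, len, "(pid %d) %s%s%s%s", pid, comm, sep, name, state);
  return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```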

Building inputs for the test suite

The easiest way to test the FreeBSDKernel plugin is to supply it with some example kernel executables and vmcores. Unfortunately, the kernels and vmcores from regular FreeBSD setups are pretty large. For example, the FreeBSD 13.0 amd64 kernel is 28 MiB, and a minidump from a VM limited to 128 MiB of RAM is 103 MiB. Multiply that by a few architectures (and at least amd64 having both minidump and full size vmcore) and the LLDB source tree suddenly becomes huge.

There are three main approaches to achieving smaller test inputs:

Configure and build a smaller kernel, then boot a smaller system and obtain vmcores from that setup.
Post-process the real kernel image and/or vmcore to achieve smaller size while preserving the essential data.
Construct an artificial kernel image and/or vmcore.

It should be noted that in order to test the plugin properly, the kernel executable needed to be an ELF file that was correctly recognized by libfbsdvmcore and libkvm, and that contained the symbols needed by the libraries and the plugin.

The vmcore needed to contain memory segments corresponding to the symbols, in particular the process and thread lists. The complexity of synthesizing the data was increased by the fact that we needed to account not only for the actual in-memory structures but also the container format, including the sparse layout of minidumps.

Generating LLDB FreeBSDKernel tests

For the kernel image, we chose the third approach, i.e. synthesizing the ELF executable using the existing yaml2obj tool from LLVM. First, we used the obj2yaml tool to create a human-readable YAML description of the real kernel executables. Due to the large size of the resulting YAML, rather than attempting to remove unnecessary data from it, we used it to create a new simpler YAML from scratch.

We have limited the resulting file to the absolute necessities: a proper ELF header, minimal .bss and .rodata sections and all the necessary symbols declared within them. For an example, please see the kernel-amd64.yaml file.

For the vmcore, we used a combination of the first two options. For a start, we attempted to boot a VM with as little memory as possible, in order to naturally limit the size of the generated vmcore. Afterwards, we post-processed the resulting file using tools we created for this purpose. In order to avoid reimplementing the respective file formats from scratch, we decided to hack on top of libfbsdvmcore and LLDB instead. The respective tools and patches can be found in the tools subdirectory of the FreeBSDKernel test suite.

The first processing step is to prune the process list of “uninteresting” processes. For our purposes, we consider it sufficient to have at least one thread of each of the three PCB categories mentioned earlier:

the crashing thread
a thread on a CPU
a thread matching neither of the above

While technically we could prune the thread lists as well, a few extra threads are not a problem. Therefore, we iterate through all processes and determine whether they have at least one interesting thread. Those that do not are unlinked from the list.

To achieve this, we patch libfbsdvmcore to support overwriting memory in vmcores and LLDB to perform the process list search and modification. While the resulting vmcore still contains all instances of struct proc, the uninteresting processes are no longer linked into allproc. For example, our original vmcore contained 652 threads in total, while the pruned version is limited to 16 threads. This allowed us to shrink the compressed sparse vmcore from 132 KiB to 18 KiB.
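
The unlinking step can be illustrated with an in-memory C sketch. It rests on several assumptions: the structures are simplified stand-ins using plain next pointers, and a thread is treated as interesting when its PCB category has not been seen yet, which is only one possible reading of the criterion. The prune_processes and categorize names are hypothetical; the actual tooling performs the equivalent writes on the vmcore through the patched libfbsdvmcore.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures (not the real layout). */
struct thread { int td_tid; int td_oncpu; struct thread *td_plist; };
struct proc { int p_pid; struct thread *p_threads; struct proc *p_list; };

enum pcb_category { PCB_CRASHED, PCB_ON_CPU, PCB_OTHER, PCB_CATEGORIES };

static enum pcb_category categorize(const struct thread *td, int dumptid) {
  if (td->td_tid == dumptid)
    return PCB_CRASHED;
  if (td->td_oncpu != -1)
    return PCB_ON_CPU;
  return PCB_OTHER;
}

/* Unlink every process that contributes no not-yet-seen PCB category.
 * Returns the number of processes left on the list. */
size_t prune_processes(struct proc **allproc, int dumptid) {
  assert(allproc != NULL);
  bool seen[PCB_CATEGORIES] = {false};
  size_t kept = 0;
  struct proc **link = allproc; /* the pointer that leads to the current proc */

  while (*link != NULL) {
    struct proc *p = *link;
    bool interesting = false;
    for (const struct thread *td = p->p_threads; td != NULL;
         td = td->td_plist) {
      enum pcb_category c = categorize(td, dumptid);
      if (!seen[c]) {
        seen[c] = true;
        interesting = true;
      }
    }
    if (interesting) {
      link = &p->p_list; /* keep it and advance */
      kept++;
    } else {
      *link = p->p_list; /* unlink; the struct itself stays in memory */
    }
  }
  return kept;
}
```

As in the real vmcore, the unlinked structures are not erased; they simply stop being reachable from the head of the list.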

The second processing step is making the vmcore sparse. We zero out most of the file, leaving only the essential data. To achieve this, we patch libfbsdvmcore to output the file areas actually read in a machine-readable form. Then we feed the resulting list to a copy-sparse tool that copies only the file header and these areas to a new file, effectively leaving the remainder sparse (unallocated if supported, zero-filled otherwise).
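
The idea behind such a range-copying tool can be sketched in C as follows. This is not the actual copy-sparse tool: the range structure and the copy_ranges function are hypothetical, the list of ranges is passed in as an array rather than parsed from the library's output, and the sketch assumes a POSIX-like system where seeking past the end of the output and then writing leaves a gap that reads back as zeros.

```c
#include <assert.h>
#include <stdio.h>

/* One byte range of the input file that must be preserved. */
struct range {
  long offset;
  long length;
};

/* Copy only the listed ranges from in to the same offsets in out.
 * Everything between the ranges is never written. Returns 0 on success,
 * or -1 on an I/O error or a range extending past the end of the input. */
int copy_ranges(FILE *in, FILE *out, const struct range *ranges, size_t n) {
  assert(in != NULL && out != NULL && (ranges != NULL || n == 0));
  char buf[4096];
  for (size_t i = 0; i < n; i++) {
    long left = ranges[i].length;
    if (fseek(in, ranges[i].offset, SEEK_SET) != 0 ||
        fseek(out, ranges[i].offset, SEEK_SET) != 0)
      return -1;
    while (left > 0) {
      size_t want = left < (long)sizeof(buf) ? (size_t)left : sizeof(buf);
      size_t got = fread(buf, 1, want, in);
      if (got == 0 || fwrite(buf, 1, got, out) != got)
        return -1;
      left -= (long)got;
    }
  }
  return fflush(out) == 0 ? 0 : -1;
}
```

Note that the output ends at the end of the last copied range, so a complete tool would additionally have to extend the file to the original length, and would need 64-bit offsets for inputs larger than a long can address.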

Finally, since we cannot rely on being able to store, distribute and recreate sparse files reliably, the final step is to compress the file using bzip2. The compression provides an easy and portable way of reducing the file to a very small size. Most importantly, the test suite can easily uncompress the file using the Python standard library.

This processing makes it possible to reduce the size of the amd64 “full memory” vmcore from 128 MiB to only 13 KiB, and the minidump from 103 MiB to 18 KiB.

For completeness, we have published the original vmcores and the kernels used to create them in the freebsd-test-vmcores repository.

Changes pushed to LLDB

Summary

Previously, we added a new FreeBSDKernel LLDB plugin that made it possible to inspect FreeBSD vmcores. It used libfbsdvmcore, a new library based on FreeBSD’s libkvm, in order to portably read vmcores in “full memory” ELF and minidump formats. This made it possible to use it on operating systems other than FreeBSD, and to read vmcores cross-architecture.

Now we have extended this plugin to support the native FreeBSD libkvm. Firstly, this makes it possible to open /dev/mem and inspect the state of the currently running kernel without the necessity to dump core. Secondly, libkvm can be used to open native vmcores when libfbsdvmcore is not available on the system.

Afterwards, we enabled the plugin to inspect running processes and threads. It reads the kernel allproc list to find all the processes and their nested thread lists, and uses them to construct a complete list of all threads in the system. The debugger can then inspect the backtraces and register states corresponding to each thread.

Finally, we worked on kernel image and vmcore inputs to the test suite. We were able to replace the real kernel executable with an ELF file synthesized from relatively simple YAML. We created tooling to process vmcores and convert them into sparse files containing only the essential content, suitable for efficient compression. We were able to achieve kernel YAML sizes around 4 KiB each, and vmcore sizes up to 18 KiB.

Future plans

This milestone concludes the baseline part of our contract. The sixth milestone has been primarily intended to provide time for additional work coming up throughout the previous milestones.

We are planning to use the remaining time to perform more testing of FreeBSD kernel debugging and fix bugs that might surface during this time. As time permits, we can also focus on miscellaneous tasks such as:

access to extended system and process information
starting processes via shell
$_siginfo support