LLDB FreeBSD kernel debugging support summary

By Michał Górny

January 20, 2022

Moritz Systems have been contracted by the FreeBSD Foundation to continue our work on modernizing the LLDB debugger’s support for FreeBSD.

The primary goal of our contract is to bring kernel debugging into LLDB. The complete Project Schedule is divided into six milestones, each taking approximately one month:

Improve LLDB compatibility with the GDB protocol: fix LLDB implementation errors, implement missing packets, except registers.
Improve LLDB compatibility with the GDB protocol: support gdb-style flexible register API.
Support for debugging via serial port.
libkvm-portable and support for debugging kernel core files in LLDB, on the amd64 and arm64 platforms. Support for other platforms as time permits.
Support for debugging the running kernel on the amd64 and arm64 platforms. Support for other platforms as time permits.
Extra month for kgdb work, processing patches on LLDB reviews or miscellaneous tasks – as time permits. Examples of misc tasks: access to extended system and process information, starting processes via shell, $_siginfo support.

The remaining part of our contract consisted of finishing the support for connecting to the GDB server in the FreeBSD kernel. The additional time was used to implement file(1) support for recognizing minidump files and to start work on siginfo support. In this report, we’d like to explain all of these tasks, as well as summarize our prior work.

FreeBSD Kernel debugging: recap

Kernel debugging methods

FreeBSD provides a number of tools dedicated to debugging the kernel. We have focused on the solutions utilizing an external debugger, originally a GDB fork called KGDB. Our goal was to provide feature parity between mainstream LLDB and KGDB, and therefore to make it possible to use a permissively licensed debugger to work with the FreeBSD kernel.

FreeBSD kernel debugging types

The kernel debugging methods we were working on can be divided into two main categories: post-mortem debugging and live kernel debugging. Post-mortem debugging means analyzing a past system state recorded in the form of a core dump, usually at the time of a crash. Live kernel debugging involves inspecting the immediate state of a running FreeBSD system.

FreeBSD supports creating a vmcore at the time of a kernel panic or upon the administrator’s request. At the time of writing, three types of vmcores can be created:

full memory dumps that contain the complete contents of the system’s physical memory
minidumps that contain only the memory pages in use by the kernel
textdumps that contain captured debugger output

For a userspace debugger such as LLDB, only the first two types are meaningful. The minidump format is used by default, and it is supported on more FreeBSD architectures than the legacy full memory dump format.

The traditional method of recording core dumps involves storing them in the swap space (provided it is large enough) and then copying into a file on the next boot. Modern FreeBSD kernels also support compression and encryption of core dumps, as well as a netdump mechanism that can be used to send the core dump to a remote network server.

As for live kernel debugging, FreeBSD supports both inspecting a live system from userspace and running a GDB Remote Serial Protocol stub for the purpose of online debugging. In the former case, the special device /dev/mem can be used to inspect the system memory. In the latter, the GDB stub needs to be enabled and the online kernel debugger needs to be started to expose the ability to debug the kernel remotely.

The GDB stub primarily supports access via the serial port. However, modern kernel versions also support debugging over the network using netgdb.

Kernel debugging in LLDB

Depending on the exact kernel debugging method, LLDB provides two main debugging facilities: the FreeBSDKernel plugin and the GDB Remote Serial Protocol interface.

The FreeBSDKernel plugin has been written to specifically support FreeBSD vmcores and live kernel debugging from userspace. The plugin can utilize two libraries: a standalone portable libfbsdvmcore dedicated to processing FreeBSD vmcores and/or FreeBSD’s libkvm providing support for native vmcores and live kernel debugging.

The GDB Remote Serial Protocol is supported generically by LLDB, and used natively for communication between lldb-server and the LLDB client. Throughout our contract, the GDB protocol support has received a number of enhancements to improve its compatibility with the original gdbserver, as well as other protocol implementations such as the ones provided by QEMU and the FreeBSD kernel. Furthermore, proper support for communicating over the serial port has been added.

Debugging the FreeBSD Kernel over the GDB protocol

The FreeBSD kernel includes an embedded GDB stub that can be driven by a remote GDB Remote Serial Protocol client such as GDB or LLDB. The stub can be used in place of the online kernel debugger DDB. It can either be activated explicitly from the running system, or implicitly when the kernel crashes.

The prerequisite for remote GDB is a kernel with GDB support compiled in. The GDB support is disabled by default in the GENERIC kernels on the stable and release branches. If you are using a release version of FreeBSD, you will need to build your own kernel. A simple approach is to search git log for the commit removing the debug options (e.g. bfd15705156b for the 13.x branch) and re-add them. Then build and install the kernel.

The kernel (along with its debug symbols) needs to be copied to the machine that will run the client.

The serial port that will be used for kernel debugging needs to have its flags set to 0x80. If this is not done, the kernel will not expose support for the gdb debugger. On amd64, this is done via the /boot/device.hints file:

hint.uart.0.flags="0x80"

After booting a properly configured system, sysctl should indicate that the gdb stub is available:

# sysctl debug.kdb.available
debug.kdb.available: ddb gdb

The GDB backend can be enabled via sysctl:

# sysctl debug.kdb.current=gdb
debug.kdb.current: ddb -> gdb

Now the GDB stub will start waiting for a remote connection if the kernel crashes or the debugger is entered explicitly via sysctl:

# sysctl debug.kdb.enter=1

Note that the system will appear to hang for all practical purposes until a debugger actually connects. The connection can be established using LLDB, e.g. from a Linux client:

(lldb) process connect serial:///dev/ttyS0

Note that in order for register lookups to work, LLDB needs to be provided with the kernel executable.

file(1) support for minidumps

file(1) is a tool commonly used to recognize file formats based on their contents. Prior to our work, the magic database used by file(1) lacked rules for FreeBSD minidumps.

The minidump header is defined for each architecture separately. However, there are some common features that are useful for defining the recognition rules.

On all architectures but powerpc, the minidump header starts with:

struct minidumphdr {
        char magic[24];         // "minidump FreeBSD/<arch>", NUL-padded
        uint32_t version;       // at offset 24, native endianness
        // ...
};

On powerpc, it starts with:

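struct minidumphdr {
        char magic[32];         // "minidump FreeBSD/<arch>", NUL-padded
        char mmu_name[32];      // at offset 32, NUL-padded
        uint32_t version;       // at offset 64, native endianness
        // ...
};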

In all cases, the file starts with a magic string consisting of the literal minidump FreeBSD/, followed by the architecture name and a null terminator.

The version field is written in native endianness. The highest value at the time of writing is 3, so the most significant byte of the field is always zero. It is therefore reasonable to guess the endianness from the first byte of the field as stored in the file: it is zero in a big endian dump and nonzero in a little endian one.

The resulting magic(4) rule is:

0       string  minidump\040FreeBSD/    FreeBSD kernel minidump
>17     string  powerpc
>>17    string  >\0                     for %s,
>>32    string  >\0                     %s,
>>>64   byte    0                       big endian,
>>>>64  belong  x                       version %d
>>>64   default x                       little endian,
>>>>64  lelong  x                       version %d
>17     default x
>>17    string  >\0                     for %s,
>>>24   byte    0                       big endian,
>>>>24  belong  x                       version %d
>>>24   default x                       little endian,
>>>>24  lelong  x                       version %d

The syntax is fairly straightforward. The first column specifies the file offset, prefixed with one > per nesting level. The second column specifies the type to match, and the third the value to match against. The fourth column specifies the string to print, if any.

The rule set starts with a literal match of minidump FreeBSD/ at offset 0. If the value matches, FreeBSD kernel minidump is output and the successive rules (that have at least one >) are processed.

The next rule matches powerpc at offset 17, to special-case PPC coredumps. If it matches, the following rules with at least two >s are processed. They print the architecture name and MMU name from their respective offsets.

The next nested rules check the first byte of version field in order to determine the endianness. The default type applies if no other rule at the same level matched, making the whole rule set resemble a switch-case construct. The most deeply nested rule prints the minidump version, decoded using the appropriate endianness.

The other half of the rule set applies the same logic for non-PPC coredumps. There is no MMU name there, and the version is at a different offset.
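The same decision logic can be sketched in Python. This is only an illustration of the offsets and the endianness guess described above, not the code used by file(1); the function name describe_minidump and the returned dictionary layout are made up for this example.

```python
import struct

# 17 bytes long, so the architecture name starts at offset 17.
MAGIC_PREFIX = b"minidump FreeBSD/"


def _cstring(buf: bytes) -> str:
    """Decode a NUL-padded field up to its first NUL byte."""
    return buf.split(b"\0", 1)[0].decode("ascii")


def describe_minidump(header: bytes) -> dict:
    """Mirror the magic(4) rule set on the first bytes of a file."""
    if not header.startswith(MAGIC_PREFIX):
        raise ValueError("not a FreeBSD kernel minidump")

    is_powerpc = header[17:].startswith(b"powerpc")
    # powerpc: magic[32] and mmu_name[32] precede version; elsewhere only magic[24].
    magic_size = 32 if is_powerpc else 24
    version_offset = 64 if is_powerpc else 24
    if len(header) < version_offset + 4:
        raise ValueError("header too short")

    arch = _cstring(header[17:magic_size])
    mmu = _cstring(header[32:64]) if is_powerpc else None

    # The version is small, so a zero first byte implies big endian.
    big_endian = header[version_offset] == 0
    (version,) = struct.unpack_from(">I" if big_endian else "<I", header, version_offset)

    return {
        "arch": arch,
        "mmu": mmu,
        "endianness": "big" if big_endian else "little",
        "version": version,
    }


if __name__ == "__main__":
    # Synthetic header for demonstration; a real dump would be read from disk.
    sample = b"minidump FreeBSD/amd64".ljust(24, b"\0") + struct.pack("<I", 3)
    print(describe_minidump(sample))
```

Only the first 68 bytes of a file are needed for this check, which is why the rule is cheap enough for a magic database.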

As a result, file(1) now correctly recognizes minidumps and reports their architecture, endianness and version:

minidumps/amd64:   FreeBSD kernel minidump for amd64, little endian, version 3
minidumps/arm64:   FreeBSD kernel minidump for arm64, little endian, version 2
minidumps/i386:    FreeBSD kernel minidump for i386, little endian, version 2
minidumps/ppc64be: FreeBSD kernel minidump for powerpc64, mmu_radix, big endian, version 2
minidumps/ppc64le: FreeBSD kernel minidump for powerpc64, mmu_radix, little endian, version 2

Preliminary siginfo support

siginfo support in the operating system

Signals provide a way to asynchronously notify the program about specific events. The baseline support for signals as defined by the C standard is rather simple. The program’s signal handler only receives a signal number, i.e. a general category of the event reported. In some cases, additional signal information can be retrieved using a specific API; e.g. for SIGCHLD, the wait() family of functions can be used to determine which child is being reported.

POSIX extends the signal interface with a siginfo_t structure providing additional signal information. The program itself can obtain this information through the extended sa_sigaction handler prototype of sigaction(2). The common ptrace(2) implementations also provide methods of obtaining this information for the debugged program, as well as overwriting it in some instances:

PTRACE_GETSIGINFO and PTRACE_SETSIGINFO on Linux
PT_LWPINFO on FreeBSD
PT_GET_SIGINFO and PT_SET_SIGINFO on NetBSD

The exact contents of siginfo_t vary between operating systems, and in some cases between architectures of the same system. Commonly, the structure contains:

si_signo: the signal number
si_code: additional code identifying the signal sub-category
si_errno: errno value associated with the signal (if any)

Further members depend on the signal delivered. Usually a union is used to share storage between mutually exclusive members. For example, a SIGSEGV includes the faulting memory address as si_addr, while SIGCHLD includes the child process’s PID as si_pid, UID as si_uid and exit status (if exited) as si_status.
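To make the members above concrete, here is a small sketch that sends a signal to the current process and reads back its siginfo. It assumes a system where Python exposes signal.sigtimedwait (e.g. Linux or FreeBSD; it is not available everywhere), and the helper name receive_own_signal is invented for this example.

```python
import os
import signal


def receive_own_signal(sig=signal.SIGUSR1, timeout=1.0):
    """Send `sig` to ourselves and return its siginfo (None on timeout)."""
    # Block the signal first so that it stays pending instead of being delivered.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {sig})
    try:
        os.kill(os.getpid(), sig)
        # sigtimedwait() consumes the pending signal and returns its siginfo.
        return signal.sigtimedwait({sig}, timeout)
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)


if __name__ == "__main__":
    info = receive_own_signal()
    if info is not None:
        # si_code values are system-specific, so it is printed rather than interpreted.
        print(info.si_signo, info.si_code, info.si_errno, info.si_pid, info.si_uid)
```

For a signal sent with kill(2), si_pid and si_uid identify the sender, which here is the process itself.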

siginfo in the GDB Remote Serial Protocol

The GDB Remote Serial Protocol exposes the ability to retrieve and write siginfo via the qXfer:siginfo:read::... and qXfer:siginfo:write::... packets respectively. The packets transmit the raw siginfo_t contents without any processing. Therefore, the server does not need to be aware of the platform’s siginfo_t layout, but the client must be able to interpret the structure’s contents.

The support for these packets needs to be indicated in the server’s qSupported response. The respective feature flags are qXfer:siginfo:read and qXfer:siginfo:write.
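As an illustration of what travels over the wire, the sketch below frames a qXfer:siginfo:read request and parses a qSupported reply. It follows the general packet framing of the GDB Remote Serial Protocol ($payload#checksum, with hexadecimal numbers); the helper names are hypothetical and this is not LLDB code.

```python
def frame_packet(payload: str) -> bytes:
    """Wrap a payload as $payload#xx, xx being the modulo-256 sum of its bytes in hex."""
    data = payload.encode("ascii")
    checksum = sum(data) % 256
    return b"$" + data + b"#" + f"{checksum:02x}".encode("ascii")


def parse_qsupported(reply: str) -> dict:
    """Split a qSupported reply into {feature: value}, value being '+', '-', '?' or a string."""
    features = {}
    for item in reply.split(";"):
        if not item:
            continue
        if "=" in item:
            name, value = item.split("=", 1)
            features[name] = value
        elif item[-1] in "+-?":
            features[item[:-1]] = item[-1]
    return features


def siginfo_read_request(offset: int, length: int) -> bytes:
    """Build a request for `length` bytes of siginfo starting at `offset`."""
    # The annex is empty for siginfo, hence the two consecutive colons.
    return frame_packet(f"qXfer:siginfo:read::{offset:x},{length:x}")


if __name__ == "__main__":
    features = parse_qsupported("PacketSize=1000;qXfer:siginfo:read+")
    if features.get("qXfer:siginfo:read") == "+":
        print(siginfo_read_request(0, 0x80))
```

A client would only send the request after seeing the + flag for the feature, since a server is free to omit it.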

siginfo support in LLDB

Signal information workflow

According to the plan, getting siginfo is going to consist of four steps:

The client verifies that the server supports qXfer:siginfo:read via its qSupported packet.
The client calls an appropriate platform plugin to synthesize siginfo_t for the respective system-architecture combination. This makes it possible to inspect siginfo even if debug information is not available.
The client requests siginfo from the server using the qXfer:siginfo:read packet.
The server calls the respective process plugin to obtain siginfo (via ptrace(2) on Unix derivatives), then sends it to the client.

The advantage of this approach is that the individual parts of the pipeline can be implemented and tested separately. More specifically, the implementation work consists of the following steps:

Implementing server support for qXfer:siginfo:read. This involves adding an extension flag to process plugins, a method to obtain siginfo and the minimal integration necessary for lldb-server to support this qXfer variant. The new feature is tested by delivering a signal to a test program, then issuing a qXfer:siginfo:read packet and manually verifying the return value.
Implementing siginfo_t type synthesis. This involves adding a Platform method to obtain the type. The new feature is tested by comparing offsets and sizes within the synthesized siginfo_t with the values obtained from a real system.
Implementing a siginfo getter API. This involves adding a new API method that issues a client qXfer:siginfo:read call and combines the raw result with the synthesized type. This can be tested by mocking the server and verifying that the API call returns the expected value.
Implementing a dedicated LLDB command to get siginfo. This can also be tested using a mocked server and verifying the command output.
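The client side of this workflow, together with the mocked-server testing idea, can be sketched roughly as follows. This is not the LLDB implementation: send_packet is a hypothetical callable that exchanges one packet payload, the mock server is invented for the example, and the decoding assumes a little endian layout starting with three ints (si_signo, si_errno, si_code), as found on Linux and FreeBSD. Run-length encoding of replies is not handled.

```python
import struct

ESCAPE = 0x7D  # '}' marks an escaped byte: the real value is next_byte XOR 0x20


def escape(data: bytes) -> bytes:
    """Escape the bytes that have a special meaning in RSP: '#', '$', '}' and '*'."""
    out = bytearray()
    for byte in data:
        if byte in b"#$}*":
            out += bytes([ESCAPE, byte ^ 0x20])
        else:
            out.append(byte)
    return bytes(out)


def unescape(data: bytes) -> bytes:
    """Undo the binary escaping applied by escape()."""
    out = bytearray()
    it = iter(data)
    for byte in it:
        if byte == ESCAPE:
            byte = next(it) ^ 0x20
        out.append(byte)
    return bytes(out)


def read_siginfo(send_packet, features: dict, chunk: int = 0x40) -> bytes:
    """Fetch the raw siginfo_t in chunks; 'm' replies mean more data, 'l' the last chunk."""
    if features.get("qXfer:siginfo:read") != "+":
        raise RuntimeError("server does not support qXfer:siginfo:read")
    raw = bytearray()
    while True:
        reply = send_packet(f"qXfer:siginfo:read::{len(raw):x},{chunk:x}")
        kind, body = reply[:1], unescape(reply[1:])
        if kind not in (b"m", b"l") or (kind == b"m" and not body):
            raise RuntimeError(f"unexpected reply: {reply!r}")
        raw += body
        if kind == b"l":
            return bytes(raw)


def decode_common_fields(raw: bytes) -> dict:
    """Decode the three leading ints (assumed layout, see the lead-in)."""
    si_signo, si_errno, si_code = struct.unpack_from("<iii", raw, 0)
    return {"si_signo": si_signo, "si_errno": si_errno, "si_code": si_code}


def make_mock_server(blob: bytes):
    """Return a send_packet() stand-in that serves `blob` as the siginfo object."""
    def send_packet(payload: str) -> bytes:
        offset, length = (int(x, 16) for x in payload.rsplit(":", 1)[1].split(","))
        last = offset + length >= len(blob)
        return (b"l" if last else b"m") + escape(blob[offset:offset + length])
    return send_packet


if __name__ == "__main__":
    fake = struct.pack("<iii", 11, 0, 1) + bytes(68)  # made-up contents
    raw = read_siginfo(make_mock_server(fake), {"qXfer:siginfo:read": "+"})
    print(decode_common_fields(raw))
```

Keeping the transport behind a single callable is what makes the mocked-server tests cheap: the same read_siginfo() runs unchanged against a real connection or a canned blob.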

Changes pushed to LLDB

Summary

When we started this contract, LLDB had limited usefulness as a FreeBSD kernel debugger. While it used the GDB Remote Serial Protocol internally, it was not fully compatible with the implementation in GDB itself and other stubs. The support for communicating over the serial port was limited and did not include the ability to set communication parameters. LLDB did not support minidumps, and the “full memory” dumps were incorrectly interpreted as userspace cores.

Throughout our work, the protocol used by LLDB was aligned to match GDB better. Whenever backwards compatibility with the prior lldb-server versions wasn’t important, the protocol was changed in place. In other cases, new features were implemented to improve compatibility. This was particularly important regarding signal number transmission where it was necessary to implement two modes: one maintaining compatibility with GDB, and the other using prior LLDB semantics to preserve its support for more signal numbers.

Furthermore, LLDB was improved to handle register definitions from foreign servers better. Major changes to the internal design were made in order to make the register API more flexible. LLDB gained the ability to fill the missing portions of register information, including adding partial registers (e.g. eax for rax on amd64) and reconstructing composite registers (e.g. ymm0 from xmm0 and ymm0h). Finally, basic register definitions were embedded in LLDB to handle servers that do not supply target.xml at all.
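The relationship between these registers can be illustrated on raw little endian register bytes, as used on amd64. This is a conceptual sketch only; the helper names are invented and do not correspond to LLDB’s internal register API.

```python
def partial_register(parent: bytes, offset: int, size: int) -> bytes:
    """Return `size` bytes of a parent register starting at byte `offset`."""
    return parent[offset:offset + size]


def composite_register(*parts: bytes) -> bytes:
    """Concatenate parts, lowest bytes first, into one wider register."""
    return b"".join(parts)


# Made-up register contents for demonstration.
rax = (0x1122334455667788).to_bytes(8, "little")
eax = partial_register(rax, 0, 4)   # low 32 bits of rax
ah = partial_register(rax, 1, 1)    # bits 8..15 of rax

xmm0 = bytes(range(16))             # low 128 bits of ymm0
ymm0h = bytes(range(16, 32))        # high 128 bits of ymm0
ymm0 = composite_register(xmm0, ymm0h)

print(hex(int.from_bytes(eax, "little")), hex(ah[0]), len(ymm0))
```

A server that reports only rax, xmm0 and ymm0h therefore still carries enough information for the client to present eax and ymm0.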

Afterwards, the support for the serial port was greatly enhanced. The original mode relying on hardcoded communication parameters was replaced by a flexible serial:// URI, providing the ability to easily and consistently configure serial port parameters on both the client and the server end. This also involved writing a proper serial port abstraction for a variety of platforms, including FreeBSD, Linux and NetBSD.

LLDB became more compatible with a variety of GDB stubs. This includes both the gdbserver supplied as part of the GDB distribution and the stubs used e.g. by QEMU or the FreeBSD kernel. The register set fallbacks that were implemented made it possible to read backtraces and register values from the kernel GDB stub.

A new plugin was created to handle FreeBSD vmcores and live kernel debugging. The plugin can utilize either the standard FreeBSD libkvm library to read native core dumps and live kernel memory, or the newly introduced libfbsdvmcore to read core dumps from any FreeBSD architecture. This library has been created as a cross-platform, cross-architecture alternative to libkvm with a clean API. This effort also involved building a framework for creating small test inputs from kernel images and vmcores.

Finally, a number of miscellaneous changes and bugfixes were made to a variety of projects. An improvement to the run-length encoding (RLE) support in the GDB stub of the FreeBSD kernel was submitted. file(1) has been enhanced to recognize FreeBSD minidumps. The work on siginfo support in LLDB was started.

All of this work was merged into mainline LLDB and is going to be included in the LLVM 14 release. FreeBSD users will gain the ability to use LLDB as a permissively licensed alternative to KGDB. It is also worth noting that unlike KGDB, which is a customized version of GDB, the LLDB support for kernel debugging is now an integral part of the LLVM suite, effectively reducing the maintenance burden on FreeBSD.