LLDB core dump support improvements

By Michał Górny

May 6, 2021 - 10 minutes read - 2048 words

BSD contract debugger FreeBSD LLDB LLVM

Moritz Systems have been contracted by the FreeBSD Foundation to continue our work on modernizing the LLDB debugger’s support for FreeBSD.

The complete Project Schedule is divided into four milestones, each taking approximately one month:

M1 Switch all the non-x86 CPUs to the LLDB FreeBSD Remote-Process-Plugin.
M2 Iteration over regression tests on ARM64 and fixing known bugs, marking the non-trivial ones for future work. Remove the old local-only Process-Plugin.
M3 Implement follow-fork and follow-vfork operations on par with the GNU GDB support. Cover the functionality with LLDB regression tests.
M4 Implement SaveCore functionality for FreeBSD and enhance the regression testing of core files in LLDB. Update the FreeBSD manual.

The final part of our project involved improvements to core dump support in LLDB. This includes extending regression tests to verify core dump support better, fixing any bugs we might find and eventually implementing the support for creating core dumps of running programs on FreeBSD.

Core dumps

Introduction to core dumps

A core dump is a record of a program’s state at a particular point in time. Most commonly, a core dump is created when a program terminates abnormally, e.g. due to memory access violation (a segfault) or a call to abort(3). It is used for post-mortem debugging, i.e. trying to assess why the program has terminated.

On modern Unix derivatives, including FreeBSD, Linux and NetBSD, core dumps are written in the ELF format, the same format as used for executable programs. On these systems, core dumps have no sections and two kinds of segments: PT_LOAD segments with the contents of the program’s memory, and a PT_NOTE segment containing the remaining metadata. The notes contained within include a map of loaded libraries, information on open file descriptors and threads, and the contents of registers.

A few operating systems can additionally create core dumps of a stopped program without requiring it to crash. The NetBSD kernel provides a PT_DUMPCORE ptrace(2) request to do that, and the FreeBSD kernel has recently gained a PT_COREDUMP request providing a similar function. Besides that, a few debuggers can assemble an artificial core dump by inspecting the program’s data.

ELF file format

ELF core

Typically, the ELF file consists of the following elements:

an ELF header located at the beginning of the file
a program header containing the list of segments
the program data
a section header containing the list of sections
optionally, any trailer data

The ELF header is the only element with a fixed location. It is used to identify the file format, as well as the executable type, operating system, architecture and ABI. It also stores the offsets of the program and section headers, permitting them to be placed anywhere in the file.
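As a rough illustration of the fields mentioned above, the following sketch reads them from a buffer holding the start of a 64-bit ELF file. It relies on the Elf64_Ehdr type and constants from <elf.h>; the struct elf_summary type and the read_elf64_header() name are made up for this example, and the code assumes the file uses the host byte order.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical summary type for this sketch; not part of any ELF API. */
struct elf_summary {
    unsigned char osabi;      /* e_ident[EI_OSABI], e.g. ELFOSABI_FREEBSD */
    unsigned short type;      /* e_type, ET_CORE for core dumps */
    unsigned short machine;   /* e_machine, e.g. EM_X86_64 */
    unsigned long long phoff; /* file offset of the program header */
    unsigned long long shoff; /* file offset of the section header, 0 if none */
};

/* Returns 0 on success, -1 if buf does not start with a 64-bit ELF header.
 * Assumes the file was written in the host byte order. */
static int read_elf64_header(const unsigned char *buf, size_t len,
                             struct elf_summary *out)
{
    Elf64_Ehdr eh;

    assert(buf != NULL && out != NULL);
    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, buf, sizeof(eh)); /* copy to avoid unaligned access */

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0)
        return -1; /* missing "\177ELF" magic */
    if (eh.e_ident[EI_CLASS] != ELFCLASS64)
        return -1; /* this sketch handles 64-bit files only */

    out->osabi = eh.e_ident[EI_OSABI];
    out->type = eh.e_type;
    out->machine = eh.e_machine;
    out->phoff = eh.e_phoff;
    out->shoff = eh.e_shoff;
    return 0;
}
```

For a core dump, one would expect type to equal ET_CORE and shoff to be zero, since core dumps carry no sections.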

The program and section headers contain the lists of program segments and sections, respectively. Segments and sections provide two overlapping views of the program data. Usually, the program header is placed before the program data, while the section header is placed after it.

Segments describe how the program data is to be loaded into memory, and they are used by the dynamic loader to load the program. Sections describe information needed for linking and relocation, and therefore are used by the link editor. Core dumps use segments only.

The program data is referenced only through program and section headers. Its organization is entirely arbitrary. Furthermore, additional data (a trailer) can be freely appended to an ELF executable without changing its behavior. This is used e.g. to create self-extracting archives and installers.

Segments can have different types. The two types used in core dumps are PT_LOAD and PT_NOTE. PT_LOAD segments are used to describe the program’s memory, i.e. map data from the core file into the program’s memory addresses. The PT_NOTE segment holds notes that carry additional information.
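To make the segment layout more concrete, here is a minimal sketch that walks the program header of a 64-bit ELF core held in memory and counts the PT_LOAD and PT_NOTE entries. The struct core_segments type and the count_core_segments() name are hypothetical; the sketch assumes host byte order and uses the Elf64_Ehdr and Elf64_Phdr definitions from <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical result type for this sketch. */
struct core_segments {
    unsigned load; /* number of PT_LOAD segments (memory contents) */
    unsigned note; /* number of PT_NOTE segments (metadata) */
};

/* Returns 0 on success, -1 if buf is not a well-formed 64-bit ELF core.
 * Assumes the file was written in the host byte order. */
static int count_core_segments(const unsigned char *buf, size_t len,
                               struct core_segments *out)
{
    Elf64_Ehdr eh;

    assert(buf != NULL && out != NULL);
    if (len < sizeof(eh))
        return -1;
    memcpy(&eh, buf, sizeof(eh));

    if (memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 || eh.e_type != ET_CORE)
        return -1;
    if (eh.e_phentsize != sizeof(Elf64_Phdr) || eh.e_phoff > len)
        return -1;

    out->load = 0;
    out->note = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        uint64_t off = eh.e_phoff + (uint64_t)i * sizeof(ph);

        /* Reject program header entries that run past the buffer. */
        if (off > len || len - off < sizeof(ph))
            return -1;
        memcpy(&ph, buf + off, sizeof(ph));

        if (ph.p_type == PT_LOAD)
            out->load++;
        else if (ph.p_type == PT_NOTE)
            out->note++;
    }
    return 0;
}
```

A real reader would go on to use p_offset, p_filesz and p_vaddr of each PT_LOAD entry to map file contents to addresses; that part is left out here.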

ptrace(2) API for creating core dumps

NetBSD provides a PT_DUMPCORE request to create a core dump from a stopped tracee. The request takes two arguments: a pointer to the path at which the core dump should be created, and the length of that path. For example, it can be invoked as:

/* The addr argument of ptrace(2) is a non-const pointer, hence the array. */
char path[] = "/tmp/mycore";
ptrace(PT_DUMPCORE, pid, path, strlen(path));

FreeBSD provides a PT_COREDUMP request for the same purpose. The actual parameters are passed via struct ptrace_coredump. The two arguments to the request are, respectively, a pointer to the structure and its size. Passing the size makes it possible to extend the structure in the future.

struct ptrace_coredump is defined as:

/* Argument structure for PT_COREDUMP */
struct ptrace_coredump {
  int      pc_fd;    /* File descriptor to write dump to. */
  uint32_t pc_flags; /* Flags PC_* */
  off_t    pc_limit; /* Maximum size of the coredump,
                        0 for no limit. */
};

/* Flags for PT_COREDUMP pc_flags */
#define PC_COMPRESS 0x00000001 /* Allow compression */
#define PC_ALL      0x00000002 /* Include non-dumpable entries */

This request provides greater control over the core dump. The data is written to the specified file descriptor (which must be open for writing). It can optionally be compressed. The user can also specify a size limit; the request fails if the core dump would exceed it.

An example invocation follows:

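struct ptrace_coredump cd;
/* Open (or create) the output file; error handling omitted for brevity. */
cd.pc_fd = open("/tmp/mycore", O_WRONLY|O_CREAT, 0666);
cd.pc_flags = 0; /* neither PC_COMPRESS nor PC_ALL */
cd.pc_limit = 0; /* no size limit */
/* Pass the structure and its size to the request. */
ptrace(PT_COREDUMP, pid, (void*)&cd, sizeof(cd));
close(cd.pc_fd);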

gcore tool

gcore(1) is a commonly present tool that can be used to create a core dump of a running program. On FreeBSD and NetBSD, gcore(1) is provided by the base system. On Linux, it is usually installed as part of GDB.

Normally, the tool uses ptrace(2) to attach to the running process and then dumps its state. Effectively, it reimplements the core dumping algorithm in userspace. The newer versions of gcore(1) on FreeBSD can optionally use PT_COREDUMP instead, using the -k option.

Core dump notes on various systems

Notes are the primary means of storing program state information in core dumps. Each note is keyed by a combination of a textual name (‘owner’) and an enumerated type. Notes are either program-specific or thread-specific. The method of distinguishing threads differs per operating system.
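The keying by name and type can be illustrated with a short sketch that walks the contents of a PT_NOTE segment and counts the notes matching a given owner and type. The count_notes() name is hypothetical. The sketch assumes host byte order and that name and descriptor fields are padded to 4 bytes, which is the layout commonly seen in core files; it uses the Elf64_Nhdr header type from <elf.h>.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Round a note field size up to the assumed 4-byte padding. */
static uint64_t pad4(uint32_t n)
{
    return ((uint64_t)n + 3) & ~(uint64_t)3;
}

/* Counts notes in buf[0..len) whose owner name and type both match.
 * Returns the count, or -1 if a note runs past the end of the buffer. */
static int count_notes(const unsigned char *buf, size_t len,
                       const char *owner, uint32_t type)
{
    size_t pos = 0;
    int matches = 0;

    assert(buf != NULL && owner != NULL);
    while (pos < len) {
        Elf64_Nhdr nh;

        if (len - pos < sizeof(nh))
            return -1;
        memcpy(&nh, buf + pos, sizeof(nh));
        pos += sizeof(nh);

        uint64_t name_sz = pad4(nh.n_namesz);
        uint64_t desc_sz = pad4(nh.n_descsz);
        if (name_sz > len - pos || desc_sz > len - pos - name_sz)
            return -1;

        /* n_namesz includes the terminating NUL of the owner name. */
        const char *name = (const char *)(buf + pos);
        if (nh.n_namesz > 0 && name[nh.n_namesz - 1] == '\0' &&
            strcmp(name, owner) == 0 && nh.n_type == type)
            matches++;

        pos += (size_t)(name_sz + desc_sz); /* skip name and descriptor */
    }
    return matches;
}
```

For instance, count_notes(notes, size, "FreeBSD", NT_PRSTATUS) would count the NT_PRSTATUS notes written by FreeBSD, while the same call with "CORE" would match the Linux equivalents.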

We are going to consider amd64 core dumps on FreeBSD, Linux and NetBSD for comparison.


On FreeBSD, all notes are named FreeBSD. The core dump starts with an NT_PRPSINFO structure that holds generic program information, such as its name, arguments and PID. It is followed by a series of notes for every program thread. Every series starts with an NT_PRSTATUS note holding the thread identifier, current signal and a dump of general-purpose registers (GPRs). It is followed by additional notes for FPU registers (NT_FPREGSET), thread name (NT_THRMISC), additional signal information (NT_PTLWPINFO) and the XSAVE dump (NT_X86_XSTATE). Each subsequent NT_PRSTATUS note marks the beginning of information for the next thread. The last thread is followed by additional notes with process information (NT_PROCSTAT_PROC), open files (NT_PROCSTAT_FILES), memory map (NT_PROCSTAT_VMMAP), auxiliary vector (NT_PROCSTAT_AUXV) and other metadata.

On Linux, all notes are named CORE, with the exception of the XSAVE dump, which is named LINUX. The core dump starts with a status note for the main process thread (NT_PRSTATUS), including general program and signal information, and a dump of GPRs. It is followed by additional process data (NT_PRPSINFO) and signal data (NT_SIGINFO), the auxiliary vector (NT_AUXV), mapped files (NT_FILE), then the FPU (NT_FPREGSET) and XSAVE (NT_X86_XSTATE) dumps. Each subsequent thread is described by an NT_PRSTATUS note followed by its respective FPU and XSAVE dumps.

NetBSD explicitly splits notes into process-specific and thread-specific ones. The process notes are named NetBSD-CORE, while thread-specific notes carry an additional @<lwp-id> suffix. The core dump starts with a procinfo structure carrying all signal and process information, including the LWP ID of the thread receiving the signal. It is followed by the auxiliary vector. Afterwards, register dumps for every thread are included. Note that NetBSD stores the GPR dump in a separate PT_GETREGS note rather than combining it with process information.

To summarize: notes are split into process-specific and thread-specific. On FreeBSD and Linux, each thread is described by a series of successive notes starting with an NT_PRSTATUS note containing thread and signal information and a GPR dump. Each thread can describe a separate signal. Linux’s NT_PRSTATUS note also repeats process information in every thread, most likely for historical reasons.

On NetBSD, each thread is identified by a unique note name. Thread-specific information carries explicit register dumps. The signal information is carried in the procinfo structure that also identifies to which thread the signal was sent. Only one signal can be described at a time.
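The NetBSD naming scheme lends itself to a small illustration. The sketch below classifies a note name as process-specific or thread-specific and extracts the LWP ID from the suffix. The enum, the classify_netbsd_note() name and the assumption that the ID is written as a positive decimal number are ours, made for the sake of the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical classification used only in this sketch. */
enum note_scope { NOTE_OTHER = -1, NOTE_PROCESS = 0, NOTE_THREAD = 1 };

/* Classifies a NetBSD core note name. For thread-specific notes
 * ("NetBSD-CORE@<lwp-id>"), stores the LWP ID in *lwp_id.
 * Assumes the ID is a positive decimal number. */
static enum note_scope classify_netbsd_note(const char *name, long *lwp_id)
{
    static const char prefix[] = "NetBSD-CORE";
    const size_t plen = sizeof(prefix) - 1;
    const char *digits;
    char *end;
    long id;

    assert(name != NULL && lwp_id != NULL);
    if (strncmp(name, prefix, plen) != 0)
        return NOTE_OTHER;
    if (name[plen] == '\0')
        return NOTE_PROCESS; /* plain "NetBSD-CORE" */
    if (name[plen] != '@')
        return NOTE_OTHER;

    digits = name + plen + 1;
    if (*digits < '0' || *digits > '9')
        return NOTE_OTHER; /* reject signs and whitespace */

    errno = 0;
    id = strtol(digits, &end, 10);
    if (*end != '\0' || errno != 0 || id <= 0)
        return NOTE_OTHER;

    *lwp_id = id;
    return NOTE_THREAD;
}
```

A reader built on this idea could group register notes by the returned LWP ID instead of relying on note order, as is necessary on FreeBSD and Linux.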

Core dump support in LLDB

The elf-core plugin

LLDB uses a special process plugin called elf-core to process core dumps. This way, the API used to access core dumps is consistent with the one used to debug local processes, and the amount of special casing for core dumps is kept to a minimum.

The elf-core plugin provides integrated support for core dumps on all supported platforms using the ELF file format. At the moment of writing, this includes FreeBSD, Linux, NetBSD and OpenBSD. The core parser reuses as much code as possible, with dedicated routines for platform-specific data.

The layouts of platform-specific notes are defined inside LLDB itself. This makes it possible to read core dumps from any supported platform on any other platform. This also means that all core dump tests are run on every platform.

SaveCore() functionality

LLDB provides the ability to create a core dump of a running executable via the process save-core command. Originally, this functionality was implemented through object file plugins. Each of the plugins provided routines for writing a specific core file format. Unfortunately, support for writing ELF core dumps was not present.

We extended this command to support writing core dumps via process plugins. This made it possible to expose the PT_COREDUMP and PT_DUMPCORE requests on FreeBSD and NetBSD respectively. If the process plugin indicates support for creating coredumps, LLDB prefers using its API over trying to use an object file plugin.

To implement this function, we had to extend the gdb-remote protocol. We have implemented a new qSaveCore packet. Support for the feature (including platform support for saving cores) is indicated by a qSupported response containing qSaveCore+. The request packet has the form:

qSaveCore[;path-hint:<path-hint>]

It optionally accepts a hint of the path where the core dump should be created on the server. If the hint is not provided or the server cannot use the specified path, a temporary file is created instead. The server replies with:

core-path:<core-path>

This indicates the actual path used for the core dump. If the client is connected to a remote server, it then uses the vFile packets to transfer the core dump to the local system and remove the original file.

An example packet exchange looks like this:

1620306759.175889969 <  42> read packet: $qSaveCore;path-hint:2f746d702f74657374#d2
1620306759.177997112 <  32> send packet: $core-path:2f746d702f74657374#04
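The exchange above can be reproduced with a short sketch. It hex-encodes the path, as the example packets suggest, and frames the payload in the usual gdb-remote form of $, payload, # and a two-digit checksum equal to the sum of the payload bytes modulo 256. The function names are made up for illustration and this is not how LLDB itself implements the packet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Frames payload as "$<payload>#<checksum>". Returns the packet
 * length, or -1 if it does not fit into out. */
static int frame_packet(const char *payload, char *out, size_t outsz)
{
    unsigned sum = 0;
    int n;

    assert(payload != NULL && out != NULL);
    for (const char *p = payload; *p != '\0'; p++)
        sum += (unsigned char)*p;
    n = snprintf(out, outsz, "$%s#%02x", payload, sum & 0xffu);
    return (n < 0 || (size_t)n >= outsz) ? -1 : n;
}

/* Builds a qSaveCore request; path_hint may be NULL to omit the hint. */
static int build_qsavecore(const char *path_hint, char *out, size_t outsz)
{
    static const char key[] = ";path-hint:";
    char payload[512] = "qSaveCore";
    size_t pos = strlen(payload);

    if (path_hint != NULL) {
        size_t need = sizeof(key) - 1 + 2 * strlen(path_hint);

        if (need >= sizeof(payload) - pos)
            return -1; /* hint too long for this sketch's buffer */
        memcpy(payload + pos, key, sizeof(key) - 1);
        pos += sizeof(key) - 1;
        for (const char *p = path_hint; *p != '\0'; p++) {
            /* Two hex digits per byte; snprintf also writes a NUL. */
            snprintf(payload + pos, 3, "%02x", (unsigned)(unsigned char)*p);
            pos += 2;
        }
    }
    payload[pos] = '\0';
    return frame_packet(payload, out, outsz);
}

static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decodes the path from an unframed "core-path:<hex>" reply payload.
 * Returns 0 on success, -1 on malformed input or a too-small buffer. */
static int parse_core_path(const char *payload, char *out, size_t outsz)
{
    static const char key[] = "core-path:";
    const char *hex;
    size_t hexlen;

    assert(payload != NULL && out != NULL);
    if (strncmp(payload, key, sizeof(key) - 1) != 0)
        return -1;
    hex = payload + sizeof(key) - 1;
    hexlen = strlen(hex);
    if (hexlen % 2 != 0 || hexlen / 2 >= outsz)
        return -1;
    for (size_t i = 0; i < hexlen; i += 2) {
        int hi = hexval(hex[i]), lo = hexval(hex[i + 1]);

        if (hi < 0 || lo < 0)
            return -1;
        out[i / 2] = (char)(hi * 16 + lo);
    }
    out[hexlen / 2] = '\0';
    return 0;
}
```

Calling build_qsavecore("/tmp/test", ...) yields the 42-byte request shown in the log, and parse_core_path() on the reply payload recovers /tmp/test.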

List of relevant commits

Changes merged upstream

Summary

This post concludes the next portion of our work sponsored by the FreeBSD Foundation. During the past four months, we have been working to improve the LLDB support for FreeBSD userland debugging and getting LLDB closer to being a full GDB replacement.

During the first two months, our primary focus was additional architecture support. The new FreeBSD plugin, based on a client-server architecture, was originally written for the x86 platform. We extended it to cover all other architectures that were supported by the legacy plugin, and eventually removed the legacy plugin. We put special focus on the ARM64 platform, reaching feature parity with AMD64.

Afterwards, we focused on adding support for fork(2) and vfork(2) events in the debugger. This not only made it possible to debug forked processes but also prevented crashes when the debugged parent set breakpoints on code shared with its children.

Finally, we worked on improving the support for core dumps. We created tests to verify that core dumps from FreeBSD, Linux and NetBSD platforms are read correctly. We fixed a few bugs related to reading FreeBSD and NetBSD core dumps. We also participated in adding a PT_COREDUMP request to FreeBSD and implemented its support along with the necessary gdb-remote protocol changes in LLDB.