FreeBSD Remote Process Plugin is now the default in LLDB
By Michał Górny, Kamil Rytarowski
Moritz Systems have been contracted by the FreeBSD Foundation to modernize the LLDB debugger’s support for FreeBSD. We are working on a new plugin utilizing the more modern client-server layout that is already used by Darwin, Linux, NetBSD and (unofficially) OpenBSD. The new plugin is going to gradually replace the legacy one.
The Project Schedule is divided into three milestones, each taking approximately one month:
- M1 Introduce new FreeBSD Remote Process Plugin for x86_64 with basic support and upstream to LLVM.
- M2 Ensure and add the mandated features in the project (process launch, process attach (pid), process attach (name), userland core files, breakpoints, watchpoints, threads, remote debugging) for FreeBSD/amd64 and FreeBSD/i386.
- M3 Iterate over the LLDB tests. Detect and as time permits fix bugs. Ensure bug reports for each non-fixed and known problem. Add missing man pages and update the FreeBSD Handbook.
In the previous report, we announced the completion of the project’s first milestone, that is, upstreaming the first functional version of the plugin. We described the differences between the legacy plugin model and the modern client-server model, listed the major components involved in platform support and detailed a few problems found while implementing support for x87 FPU registers.
This time we would like to announce the completion of the second milestone. We have reached feature parity with the original FreeBSD plugin on the amd64 and i386 architectures, which made it possible to enable the new plugin by default on these two targets. In this article, we would like to share a few facts related to the work we have been doing in the past months, and in particular explain more differences between FreeBSD and NetBSD.
Thread identifiers across platforms
A single process can have one or more threads. While process identifiers are standardized by POSIX, the same is not true about threads. Their implementations vary greatly across different platforms, in particular FreeBSD, NetBSD and Linux all use different approaches. The portable threads (pthread) library uses opaque types to avoid relying on a specific implementation.
Linux uses a combined namespace for process and thread identifiers. The first thread of a process has the same identifier as the process itself, while the remaining threads get identifiers that are globally unique, and do not collide with the identifiers of other processes or threads. The ptrace(2) requests that operate on threads accept thread identifiers in place of the PID.
Historically, NetBSD used separate process and thread namespaces. Thread identifiers, or Lightweight Process (LWP) identifiers in BSD terminology, were local to the process and were not unique between different processes. Starting with the upcoming release of NetBSD 10.0, thread identifiers will be globally unique, similarly to the Linux approach. However, for compatibility with existing code and more flexibility in the future, the syscalls will continue to require that both the process and the thread identifier are passed explicitly. For this reason, the ptrace(2) requests working on a specific thread generally dedicate the numeric data argument to passing the LWP ID.
FreeBSD splits a single namespace into two ranges, for process and thread identifiers respectively. Identifiers up to PID_MAX are used to identify processes, while the identifiers above this value are used for threads. All thread identifiers are globally unique and disjoint from process identifiers. The requests that operate on threads accept thread identifiers in place of the PID.
The pthread library uses an opaque pthread_t type to represent thread identifiers, which are normally initialized by pthread_create(). The identifier of the current thread can be obtained with pthread_self() and compared to other identifiers using pthread_equal(). However, there is no portable way of printing it or using it outside of the current process.
GNU/Linux provides a non-portable gettid() function (<unistd.h>) that can be used to obtain the numeric thread identifier, which can then be passed e.g. to tgkill(). This function was added in glibc 2.30. For compatibility with older versions of libc, the SYS_gettid syscall can be invoked directly instead.
On NetBSD, the numeric identifier of the current thread can be obtained using _lwp_self() (<lwp.h>). This identifier needs to be combined with the process ID, e.g. before passing it to ptrace(2) calls.
On FreeBSD, the numeric identifier of the current thread can be obtained using pthread_getthreadid_np() (<pthread_np.h>).
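The three platform-specific calls described above can be wrapped behind one function. The following is only a minimal sketch: the helper name current_thread_id() is our own invention for illustration, and on Linux it uses the raw SYS_gettid syscall so that it also builds against glibc older than 2.30.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* exposes syscall() in <unistd.h> on glibc */
#endif
#include <unistd.h>

#if defined(__linux__)
#include <sys/syscall.h>
#elif defined(__FreeBSD__)
#include <pthread.h>
#include <pthread_np.h>
#elif defined(__NetBSD__)
#include <lwp.h>
#endif

/*
 * Hypothetical helper (not part of any libc): returns the numeric
 * identifier of the calling thread, or -1 on an unsupported platform.
 */
long current_thread_id(void)
{
#if defined(__linux__)
    /* Shares a namespace with PIDs; equals getpid() in the first thread. */
    return (long)syscall(SYS_gettid);
#elif defined(__FreeBSD__)
    /* Taken from the range above PID_MAX. */
    return (long)pthread_getthreadid_np();
#elif defined(__NetBSD__)
    /* LWP ID; must be combined with the PID for ptrace(2). */
    return (long)_lwp_self();
#else
    return -1;
#endif
}
```

On FreeBSD the program has to be linked against the threading library (for example with -pthread) for pthread_getthreadid_np() to be available.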
Launching and attaching to processes
There are two primary ways to hook up a debugger to a process. You can either have the debugger start the program and therefore debug it from its entry point, or you can attach the debugger to a process that is already running. The second approach is especially useful for investigating unexpected issues that occur while a program is in use.
Launching a program inside the debugger is very similar to the low-level POSIX method of starting a child executable: call fork(2), prepare the child environment and then execute the actual program via execvp(3) or a similar function. The difference is that the forked child issues a PT_TRACE_ME request just before executing the actual program, which lets the debugger take control over it. This implies that the debugger needs to automatically step through this final action before giving control to the user.
Attaching to a running process is done using the PT_ATTACH request. The request takes the PID of the process of interest, attaches it to the debugger and stops it as a result.
In both cases, the debugger calls waitpid(2) or a similar function on the process in order to confirm that it has stopped. On both FreeBSD and NetBSD, it then invokes the PT_SET_EVENT_MASK request to enable reporting of the events of interest, e.g. new or terminating threads. Finally, it obtains a list of all threads of the running process.
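The launch sequence can be sketched in a few lines of C. This is an illustration under stated assumptions rather than the code LLDB uses: the helper names trace_me() and launch_traced() are hypothetical, the program is started without arguments, and the child is simply killed once the initial stop has been observed, where a real debugger would set up the event mask and resume it. Linux spells the request PTRACE_TRACEME and declares ptrace() with a different prototype, hence the small wrapper.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: ask the kernel to let our parent trace us. */
static int trace_me(void)
{
#if defined(__linux__)
    return ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1 ? -1 : 0;
#else
    return ptrace(PT_TRACE_ME, 0, NULL, 0);
#endif
}

/*
 * Hypothetical helper: start `program` (looked up in PATH, no arguments)
 * under tracing. Returns the signal that stopped the child after exec,
 * or -1 if the child could not be started or traced.
 */
int launch_traced(const char *program)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Child: request tracing, then replace ourselves with the program. */
        char *const argv[] = { (char *)program, NULL };
        if (trace_me() == 0)
            execvp(program, argv);
        _exit(127); /* reached only if tracing or exec failed */
    }

    /* Parent (the "debugger"): wait for the child to stop. */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFSTOPPED(status))
        return -1;
    int stop_signal = WSTOPSIG(status);

    /* A real debugger would continue from here; the sketch just cleans up. */
    kill(pid, SIGKILL);
    waitpid(pid, &status, 0);
    return stop_signal;
}
```

Calling launch_traced("true") is expected to return SIGTRAP, since the kernel stops a traced process with that signal once the exec completes.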
The method of getting the thread list differs between FreeBSD and NetBSD. On FreeBSD, PT_GETNUMLWPS is invoked first to get the number of active threads, then PT_GETLWPLIST is used to get the list of thread identifiers. On NetBSD, PT_LWPNEXT is called repeatedly to get information about successive threads.
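The FreeBSD variant of this two-step query might look roughly as follows. The helper list_thread_ids() and the thread_id_t typedef are hypothetical names introduced for this sketch; on other systems the function only reports that the request is unavailable. The NetBSD loop over PT_LWPNEXT is not shown.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ptrace.h>

#if defined(__FreeBSD__)
typedef lwpid_t thread_id_t;
#else
typedef int thread_id_t; /* placeholder so the sketch builds elsewhere */
#endif

/*
 * Hypothetical helper: store the thread IDs of the traced, stopped
 * process `pid` in `ids` (room for `capacity` entries). Returns the
 * number of IDs stored, or -1 with errno set.
 */
int list_thread_ids(pid_t pid, thread_id_t *ids, int capacity)
{
#if defined(__FreeBSD__)
    /* Step 1: ask how many threads the process currently has. */
    int count = ptrace(PT_GETNUMLWPS, pid, NULL, 0);
    if (count == -1)
        return -1;
    if (count > capacity) {
        errno = ENOMEM;
        return -1;
    }
    /* Step 2: fetch the identifiers themselves. */
    return ptrace(PT_GETLWPLIST, pid, (caddr_t)ids, count);
#else
    (void)pid;
    (void)ids;
    (void)capacity;
    errno = ENOSYS; /* these requests are FreeBSD-specific */
    return -1;
#endif
}
```

The call fails for any process that is not currently traced by the caller, so it is meant to be used only after a successful launch or attach.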
SIGTRAP on FreeBSD and NetBSD
Generally, when a non-ignored signal is about to be delivered to a debugged process, the process stops, and the waitpid(2) call (or a similar one) that the debugger uses to monitor the process reports the signal. The debugger is then responsible for deciding how to handle the signal and for resuming the process.
The same mechanism is used to inform the debugger about other events related to the debugged process, such as:
- breakpoint and watchpoint hits
- single-stepping traps
- spawning new processes
- starting and exiting threads
- replacing the process via exec(3)
- syscall entry and exit
More precisely, whenever such an event occurs, the kernel generates an artificial SIGTRAP signal that causes the process to be stopped. Some of the events are signaled unconditionally, while others need to be explicitly requested by setting the event mask via PT_SET_EVENT_MASK. By inspecting the detailed signal/LWP information, the debugger can determine which event has occurred.
The level of detail of SIGTRAP data differs between FreeBSD and NetBSD. On FreeBSD, the event and signal data can be found in the structure returned by PT_LWPINFO. On NetBSD, the signal data is obtained via PT_GET_SIGINFO, while additional event information is provided by PT_GET_PROCESS_STATE.
The following table illustrates the events and the method of reporting them.
| Event | FreeBSD pl_siginfo.si_code | FreeBSD pl_flags | NetBSD psi_siginfo.si_code | NetBSD pe_report_event |
|---|---|---|---|---|
| Breakpoint | TRAP_BRKPT | | TRAP_BRKPT | |
| Generic trace | TRAP_TRACE | | TRAP_TRACE | |
| Hardware DR trap | | | TRAP_DBREG | |
| DTrace-induced trap | TRAP_DTRACE | | (FreeBSD-specific) | |
| Capabilities protective trap | TRAP_CAP | | (FreeBSD-specific) | |
| LWP (thread) created | | PL_FLAG_BORN | TRAP_LWP | PTRACE_LWP_CREATE |
| LWP (thread) exited | | PL_FLAG_EXITED | TRAP_LWP | PTRACE_LWP_EXIT |
| Syscall entry | | PL_FLAG_SCE | TRAP_SCE | |
| Syscall exit | | PL_FLAG_SCX | TRAP_SCX | |
| exec(3) | | PL_FLAG_EXEC | TRAP_EXEC | |
| fork(2) (parent) | | PL_FLAG_FORKED | TRAP_CHLD | PTRACE_FORK |
| fork(2) (child) | | PL_FLAG_CHILD | TRAP_CHLD | PTRACE_FORK |
| vfork(2) (parent) | | PL_FLAG_VFORKED | TRAP_CHLD | PTRACE_VFORK |
| vfork(2) (child) | | PL_FLAG_CHILD | TRAP_CHLD | PTRACE_VFORK |
| vfork(2) (parent resumed) | | PL_FLAG_VFORK_DONE | TRAP_CHLD | PTRACE_VFORK_DONE |
| posix_spawn(3) (parent and child) | (NetBSD-specific) | | TRAP_CHLD | PTRACE_POSIX_SPAWN |

Note: fork(2), vfork(2) and posix_spawn(3) signals are issued both from the parent (forking) process, upon reaching the syscall, and from the child process, before executing the first instruction. The vfork(2) syscall blocks the parent until the child exits or execs, and the kernel issues an additional signal to the parent when that happens and it is about to resume execution. The clone(2) function causes the same signal as fork(2) or vfork(2), depending on its arguments.
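To make the FreeBSD side of the table more concrete, the sketch below queries PT_LWPINFO for a stopped, traced process and maps the result to a short description. It is an illustration only: describe_freebsd_stop() is a hypothetical helper, it covers just a subset of the events listed above, and on non-FreeBSD systems it returns NULL because the structure and flags do not exist there.

```c
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ptrace.h>

/*
 * Hypothetical helper: describe why the traced process `pid` stopped.
 * Returns NULL if the information cannot be obtained.
 */
const char *describe_freebsd_stop(pid_t pid)
{
#if defined(__FreeBSD__)
    struct ptrace_lwpinfo info;

    /* Fails unless `pid` is stopped and traced by the caller. */
    if (ptrace(PT_LWPINFO, pid, (caddr_t)&info, sizeof(info)) == -1)
        return NULL;

    /* Events that are reported through pl_flags. */
    if (info.pl_flags & PL_FLAG_BORN)       return "thread created";
    if (info.pl_flags & PL_FLAG_EXITED)     return "thread exited";
    if (info.pl_flags & PL_FLAG_EXEC)       return "exec";
    /* Checked before PL_FLAG_FORKED, which vfork sets as well. */
    if (info.pl_flags & PL_FLAG_VFORKED)    return "vfork (parent)";
    if (info.pl_flags & PL_FLAG_FORKED)     return "fork (parent)";
    if (info.pl_flags & PL_FLAG_CHILD)      return "fork or vfork (child)";
    if (info.pl_flags & PL_FLAG_VFORK_DONE) return "vfork (parent resumed)";
    if (info.pl_flags & PL_FLAG_SCE)        return "syscall entry";
    if (info.pl_flags & PL_FLAG_SCX)        return "syscall exit";

    /* Events that are reported through pl_siginfo.si_code. */
    if ((info.pl_flags & PL_FLAG_SI) && info.pl_siginfo.si_signo == SIGTRAP) {
        switch (info.pl_siginfo.si_code) {
        case TRAP_BRKPT: return "breakpoint";
        case TRAP_TRACE: return "generic trace";
        default:         break;
        }
    }
    return "other stop";
#else
    (void)pid;
    return NULL; /* struct ptrace_lwpinfo is FreeBSD-specific */
#endif
}
```

A debugger would typically call such a function right after waitpid(2) reports a stop, before deciding whether to resume the process or hand control to the user.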
Achieving Milestone 2 and updating tests
The most important goal for Milestone 2 was to reach feature parity with the legacy plugin, and ensure that there are no major regressions that would prevent LLDB from being used to actually debug programs. We have finally reached that point.
In order to reach this point, we had to implement missing features and fix a few nasty bugs. Most notably, we had to implement threading support and XSTATE-based register support. The watchpoint implementation originally written for NetBSD worked fine for FreeBSD but we wanted to provide a single, reusable mixin-style class rather than copying the same code to a third plugin (NetBSD watchpoint support was a modified version of the Linux code). While at it, we tried to make the code more readable.
Figuring out why attaching to processes did not work was particularly challenging. It turned out to be the result of two different issues in the plugin code. Firstly, setting the process state to stopped while attaching caused lldb-server to prematurely try to emit the respective state packet. Since the attach method had not returned yet, the server crashed because it did not have the process handle. Secondly, we were re-listing the process’s threads too late in the code, after marking all threads as stopped. As a result, the threads reported to LLDB were not marked as stopped.
Attaching to a process by name was broken for both FreeBSD plugins, as well as for NetBSD, possibly due to earlier changes in LLDB. The plugin code responsible for searching running processes involved an optimization that delayed fetching the process name until its other properties (such as PID, UID, GID…, if requested) had been tested. However, the test also attempted to match the process name before it was read, and therefore always failed. We have fixed it to skip the process name during the first verification.
Finally, we wanted to discover why the expression parser did not work correctly. The causal chain involved the parser engine claiming that it could not allocate memory, the inability to find the mmap function, shared libraries missing from the process’s module list and, finally, the method responsible for getting information on memory regions. Since the legacy plugin did not implement this method at all, we have decided to disable it for the time being, and we postponed further investigation of this bug until the next milestone.
After resolving all these issues, we have decided to swap the default plugin. Now, the (new) remote plugin is used by default with amd64 and i386 targets. The legacy plugin can be forced there by setting the FREEBSD_LEGACY_PLUGIN environment variable to any value. It is also used on other architectures to which the new plugin has not been ported yet (arm, arm64, mips, ppc).
Changes merged upstream
- [lldb] Enable FreeBSDRemote plugin by default and update test status
- [lldb] [test] Update XFAILs/skips for FreeBSD
- [lldb] [test/Shell] Pass -pthread to host toolchain on FreeBSD too
- [lldb] [test] Remove xfail from tests that pass on FreeBSD
- [lldb] [Process/FreeBSDRemote] Fix “Fix attaching via lldb-server”
- [lldb] [Plugins/FreeBSDRemote] Disable GetMemoryRegionInfo()
- [lldb] [Process/FreeBSDRemote] Remove GetSharedLibraryInfoAddress override
- [lldb] [Process/FreeBSDRemote] Fix attaching via lldb-server
- [lldb] [Host/{free,net}bsd] Fix process matching by name
- [lldb] [Process/FreeBSDRemote] Implement thread GetName()
- [lldb] [Process/FreeBSD] Fix missing namespace qualifier
- [lldb] [Process/FreeBSDRemote] Enable watchpoint support
- [lldb] [Process/Linux] Reuse NativeRegisterContextWatchpoint_x86
- [lldb] [test/Register] Use initial state for write tests
- [lldb] [Process/FreeBSDRemote] Fix #include for i386 compat
- [lldb] Split out NetBSD/x86 watchpoint impl for unification
- [lldb] [Process/FreeBSDRemote] Initial multithreading support
- [lldb] [Process/FreeBSDRemote] Support YMM reg via PT_*XSTATE
- [lldb] [test/Register] Add read/write tests for multithreaded process
- [lldb] [Process/FreeBSDRemote] Fix double semicolon
- [lldb] [Process/FreeBSDRemote] Kill process via PT_KILL
- [lldb] [Process/FreeBSD] Mark methods override in RegisterContext*
Plan for the next milestone
The third milestone focuses on resolving issues and updating documentation. We have already marked most of the failing tests as ‘expected failures’, and we have established that some tests produce unstable results. First, we would like to go through these tests and attempt to fix the underlying bugs.
We are going to go through the open LLDB bug reports and establish whether they are still valid. We are going to close those that have been fixed already, and file new bugs for known issues that have not been reported yet.
We are also going to ensure that the FreeBSD documentation regarding LLDB is complete and up-to-date. This primarily involves adding missing manpages for LLDB tools.