FreeBSD Remote Process Plugin: Final Milestone Achieved
By Kamil Rytarowski, Michał Górny
Moritz Systems have been contracted by the FreeBSD Foundation to modernize the LLDB debugger’s support for FreeBSD. We are working on a new plugin utilizing the more modern client-server layout that is already used by Darwin, Linux, NetBSD and (unofficially) OpenBSD. The new plugin is going to gradually replace the legacy one.
The Project Schedule was divided into three milestones, each taking approximately one month:
- M1 Introduce new FreeBSD Remote Process Plugin for x86_64 with basic support and upstream to LLVM.
- M2 Ensure and add the mandated features in the project (process launch, process attach (pid), process attach (name), userland core files, breakpoints, watchpoints, threads, remote debugging) for FreeBSD/amd64 and FreeBSD/i386.
- M3 Iterate over the LLDB tests. Detect and as time permits fix bugs. Ensure bug reports for each non-fixed and known problem. Add missing man pages and update the FreeBSD Handbook.
In the previous report, we announced the completion of the project’s second milestone, that is, achieving feature parity with the legacy plugin and enabling the new plugin by default on 32-bit and 64-bit x86. We explained how different platforms express process and thread identifiers and how SIGTRAP is used to deliver event notifications to the debugger. We also described the two alternative approaches to hooking the debugger up to a process: either launching it, or attaching to a running process.
The third milestone focused on fixing bugs and on updating the test suite state and the documentation. We are proud to announce that this stage is finished as well, and therefore the whole contract has been completed on time and successfully. In this article, we would like to briefly summarize our work and describe some of the more interesting areas of focus in detail.
A race condition while copying watchpoints to new threads
The primary goal in the third milestone was to go through failing tests and either fix them, or at least document the failures and mark the respective tests as expected to fail. The first really interesting problem surfaced while we were investigating the commands/watchpoints/multiple_threads test. The purpose of the test is to verify that watchpoints work when the respective variables are altered by a non-main thread.
Originally, the test was done in two variants: with the watchpoint being set before starting the new thread, and after starting it. The first variant was supposed to verify whether LLDB correctly copies existing watchpoints to new threads as they are being started. The second variant verified whether the watchpoint command correctly adds the new watchpoint to all running threads.
What’s important here is that hardware-assisted watchpoints on x86 are configured by altering the state of the Debug Registers (DRs). Like other register sets, the values of DRs are thread-local, and therefore the debugger needs to set them separately for every thread. Furthermore, new threads inherit the DR state from parent threads on FreeBSD, and our original watchpoint code relied on new threads having the correct DR values at start.
However, there is a catch. The new thread is not reported to the debugger until it is actually ready to start. In the meantime, the DRs have already been copied from the parent thread, and the parent thread continues execution. In fact, it is entirely feasible that the process is stopped due to a breakpoint in the parent thread before the new thread is actually reported ready. This creates ample opportunity for the user to set a new watchpoint, and this is precisely what happened to us during the test.
At this point, the debugger is not yet aware that another thread is being created. However, the kernel has already copied the Debug Register values from the parent thread. As a result, the new thread is created with the old DR values, while the debugger assumes that it has the new values instead.
We have reported this confusing behavior to the FreeBSD Bugzilla. For the time being, we’ve changed the plugin to explicitly copy DRs when a new thread is reported, therefore guaranteeing that any changes during the problematic period are propagated. We have also extended the original test to cover three scenarios: watchpoint set before requesting the new thread, watchpoint set immediately after requesting it (i.e. falling into our race condition) and watchpoint set after waiting for the new thread to actually start running (i.e. covering the original intent).
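The race and the workaround described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it does not use ptrace(2) or any LLDB code, and every name in it (sim_kernel, sim_debugger, kernel_create_thread and so on) is hypothetical. The "kernel" copies debug registers when a thread is created, while the "debugger" can only update threads that have already been reported to it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_DR 8       /* x86 debug registers DR0..DR7 */
#define MAX_THREADS 8  /* arbitrary limit for this simulation */

/* Hypothetical stand-in for a per-thread debug register set. */
struct sim_dbregs { uint64_t dr[NUM_DR]; };

/* Simulated kernel state: one debug register set per thread. */
struct sim_kernel { struct sim_dbregs regs[MAX_THREADS]; };

/* Simulated debugger state: the DR values it wants on every thread,
 * and the threads that have been reported to it so far. */
struct sim_debugger {
    struct sim_dbregs wanted;
    int known[MAX_THREADS];
};

void sim_init(struct sim_kernel *k, struct sim_debugger *d) {
    memset(k, 0, sizeof(*k));
    memset(d, 0, sizeof(*d));
    d->known[0] = 1; /* the main thread is known from the start */
}

/* Thread creation: DRs are inherited from the parent immediately,
 * but the debugger is not notified yet. */
void kernel_create_thread(struct sim_kernel *k, int parent, int child) {
    assert(parent >= 0 && parent < MAX_THREADS);
    assert(child >= 0 && child < MAX_THREADS);
    k->regs[child] = k->regs[parent];
}

/* Setting a watchpoint: only threads the debugger knows about are updated. */
void debugger_set_watchpoint(struct sim_debugger *d, struct sim_kernel *k,
                             int slot, uint64_t addr) {
    assert(slot >= 0 && slot < 4);
    d->wanted.dr[slot] = addr;
    d->wanted.dr[7] |= (uint64_t)1 << (2 * slot); /* local enable bit in DR7 */
    for (int t = 0; t < MAX_THREADS; t++)
        if (d->known[t])
            k->regs[t] = d->wanted;
}

/* New-thread report: with copy_explicitly set, the debugger rewrites the
 * thread's DRs instead of trusting the values inherited at creation time. */
void debugger_on_thread_reported(struct sim_debugger *d, struct sim_kernel *k,
                                 int tid, int copy_explicitly) {
    assert(tid >= 0 && tid < MAX_THREADS);
    d->known[tid] = 1;
    if (copy_explicitly)
        k->regs[tid] = d->wanted;
}

/* Returns 1 if the given thread has an enabled watchpoint on addr in slot. */
int thread_watches(const struct sim_kernel *k, int tid, int slot, uint64_t addr) {
    return k->regs[tid].dr[slot] == addr &&
           ((k->regs[tid].dr[7] >> (2 * slot)) & 1);
}
```

Calling kernel_create_thread(), then debugger_set_watchpoint(), then debugger_on_thread_reported() with copy_explicitly set to 0 leaves the new thread without the watchpoint, which mirrors the failure seen in the test. With copy_explicitly set to 1, the new thread ends up with the same DR values as the other threads.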
Simplifying the register reading and writing logic
The original register reading and writing logic in the new plugin was inspired by the code present in the NetBSD plugin. It roughly consisted of a large switch-case construct that mapped enumeration values into appropriate operations on system structures. There were three large switches in total: one for reading register values, one for writing register values and one for mapping enumeration values from the i386 to the amd64 platform. Furthermore, the first two needed large separate variants for i386 and amd64.
At the same time, LLDB already carried another set of register information that was created via macros by inspecting struct field offsets and sizes. Unlike the plugin logic, it did not use system structures but instead inlined them. This is because the same structures are used to access core dumps, and avoiding system headers makes it possible to compile the code and inspect FreeBSD core dumps on other systems.
Unlike NetBSD, the Linux plugin actually reused the offsets and sizes from this data to access register sets. We have decided to follow suit, and replace the aforementioned custom logic with accesses based on offset and size values, and this allowed us to reduce code duplication significantly. We have also added platform-specific tests that verify that the offsets and sizes are correct, compared to system structures.
What’s even more important is that this change improved maintainability a lot. We had hit cryptic bugs that turned out to be caused by the wrong integer type being used inside the switch-case. Storing the sizes inside a list makes it possible to easily verify their correctness and avoid future bugs due to size mismatches.
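The idea of table-driven register access can be sketched in a few lines of C. The layout below (struct demo_gpr) is a deliberately tiny, hypothetical register set rather than the real FreeBSD struct reg, and the helper names (find_reg, read_reg, write_reg) are made up for this illustration; they are not LLDB APIs. The point is that a single generic reader and writer replace per-register switch cases, and that a size mismatch is rejected instead of silently truncating or overrunning.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified register set; NOT the real struct reg. */
struct demo_gpr {
    uint64_t rax;
    uint64_t rbx;
    uint64_t rip;
    uint16_t ds;
    uint16_t es;
};

/* One table entry per register: name, byte offset and byte size. */
struct reg_info {
    const char *name;
    size_t offset;
    size_t size;
};

#define DEMO_REG(field)                                   \
    { #field, offsetof(struct demo_gpr, field),           \
      sizeof(((struct demo_gpr *)0)->field) }

static const struct reg_info demo_regs[] = {
    DEMO_REG(rax), DEMO_REG(rbx), DEMO_REG(rip), DEMO_REG(ds), DEMO_REG(es),
};

/* Look a register up by name; returns NULL if it is not in the table. */
const struct reg_info *find_reg(const char *name) {
    for (size_t i = 0; i < sizeof(demo_regs) / sizeof(demo_regs[0]); i++)
        if (strcmp(demo_regs[i].name, name) == 0)
            return &demo_regs[i];
    return NULL;
}

/* Copy a register out of a raw register set buffer.
 * Returns the number of bytes copied, or 0 if the sizes do not fit. */
size_t read_reg(const void *regset, size_t regset_len,
                const struct reg_info *ri, void *out, size_t out_len) {
    assert(regset != NULL && ri != NULL && out != NULL);
    if (ri->offset + ri->size > regset_len || out_len != ri->size)
        return 0;
    memcpy(out, (const unsigned char *)regset + ri->offset, ri->size);
    return ri->size;
}

/* Copy a value into a raw register set buffer.
 * Returns the number of bytes written, or 0 if the sizes do not fit. */
size_t write_reg(void *regset, size_t regset_len,
                 const struct reg_info *ri, const void *in, size_t in_len) {
    assert(regset != NULL && ri != NULL && in != NULL);
    if (ri->offset + ri->size > regset_len || in_len != ri->size)
        return 0;
    memcpy((unsigned char *)regset + ri->offset, in, ri->size);
    return ri->size;
}
```

In this sketch, trying to write a 32-bit value into the 16-bit ds entry fails with a return value of 0. That is the kind of size mismatch that was hard to spot inside a hand-written switch-case.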
Fixing cases of the legacy plugin being wrongly used
The process plugins in LLDB are split into two kinds: client plugins and server plugins. Client plugins are used by the LLDB client, while server plugins are used by lldb-server to implement the remote protocol. The legacy FreeBSD plugin is a client plugin: it is loaded by LLDB and used to debug a program. The modern FreeBSD plugin is a server plugin: it is loaded by the LLDB server and used to implement the GDB remote protocol. Another plugin called gdb-remote provides the glue between the client and the server. It is loaded by the client, spawns lldb-server and fulfills the client’s requests by communicating with the server.
Therefore, by switching between the legacy and remote FreeBSD plugins, we are actually switching between using the legacy client plugin and the gdb-remote plugin that spawns lldb-server with the remote FreeBSD plugin. Our original switching logic (based on the prior art from the Windows plugin) consisted of two pieces: a boolean switch in PlatformFreeBSD and code blocking the legacy plugin from being loaded when the new plugin should be used. However, we established that the latter is not really necessary, and we removed it when we changed the preferred plugin.
During the final testing period, we found and fixed two cases where the wrong plugin could still be selected: when choosing the plugin for process connect, and when attaching to a running process.
The process connect command is supposed to iterate through all available process plugins, find one that initializes successfully and use it to establish a connection to the server. However, it lacked any means of actually determining whether the plugin in consideration supported remote connections at all. This was acceptable for non-transitional platforms that had only one candidate client plugin. However, on FreeBSD it could randomly choose either the legacy plugin or the gdb-remote plugin. To resolve this, we have added explicit filtering for remote connection support, using an approach similar to the one used for determining core file support.
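The effect of the added filter can be shown with a minimal sketch. The structure and function below (demo_plugin, pick_connect_plugin) are hypothetical and much simpler than LLDB's plugin manager; in particular, the capability is modeled as a plain flag, which is an assumption made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a process plugin. */
struct demo_plugin {
    const char *name;
    int can_init;         /* plugin initializes successfully */
    int supports_remote;  /* plugin can connect to a remote server */
};

/* Return the first plugin that initializes. When filter_remote is non-zero,
 * plugins without remote connection support are skipped first. */
const struct demo_plugin *pick_connect_plugin(const struct demo_plugin *plugins,
                                              size_t count, int filter_remote) {
    assert(plugins != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (filter_remote && !plugins[i].supports_remote)
            continue;
        if (plugins[i].can_init)
            return &plugins[i];
    }
    return NULL;
}
```

With a list that contains a local-only legacy plugin ahead of gdb-remote, the unfiltered call returns the legacy plugin simply because it comes first and initializes. The filtered call always returns gdb-remote, regardless of ordering.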
The plugin used for launching and attaching processes was supposed to be controlled by the aforementioned boolean switch. If the new plugin was to be used, the method returned true and the launch/attach implementation from PlatformPOSIX was being used. Otherwise, it returned false and the legacy plugin kicked in.
The PlatformPOSIX::DebugProcess() method used to launch programs explicitly forced the gdb-remote plugin. However, the PlatformPOSIX::Attach() method did not specify the plugin name and could therefore use either. To fix this, we’ve updated it to force gdb-remote consistently within the class.
The interaction between dynamic loader and the debugger
The dynamic loader is the system component responsible for loading shared libraries that are used by the program. This includes both loading the linked libraries as specified by the DT_NEEDED entries in the ELF dynamic section, and loading additional modules at runtime via dlopen(3).
The dynamic linker provides an r_debug structure that can be used by the debugger to inspect its state, as well as to monitor events, that is, the loading and unloading of shared libraries. The r_debug structure is consistent across most Unix systems (with Solaris being an exception). On FreeBSD, it is declared in <sys/link_elf.h> as:
struct r_debug {
    int r_version;              /* Currently '1' */
    struct link_map *r_map;     /* list of loaded images */
    void (*r_brk)(struct r_debug *, struct link_map *);
                                /* pointer to break point */
    enum {
        RT_CONSISTENT,          /* things are stable */
        RT_ADD,                 /* adding a shared library */
        RT_DELETE               /* removing a shared library */
    } r_state;
    void *r_ldbase;             /* Base address of rtld */
};
The r_version field specifies the structure version. The newest releases of FreeBSD and NetBSD both use version 0 of the SVR4 rendezvous protocol. Linux uses version 1, and the future releases of FreeBSD and NetBSD will use it too. The only difference between the two versions is the presence of the r_ldbase field. It is worth noting that using version 1 has the additional advantage of clearly indicating that the structure has been initialized.
The r_map field is a pointer to the head of a linked list of link_map structures providing information about the currently loaded shared libraries.
The r_brk field provides the address of a function that is called by the dynamic loader on state changes. The debugger is expected to set a breakpoint on this function in order to act on these events.
The r_state field indicates the current dynamic loader state. There are three states defined: consistent, indicating that a new stable state has been achieved; add, indicating that the loader is about to load new libraries; and delete, indicating that it is about to unload libraries.
Finally, r_ldbase specifies the memory address at which the dynamic loader itself is loaded.
When the dynamic linker is about to load a new module, it triggers the r_brk breakpoint (called the rendezvous breakpoint in LLDB) with an r_state of add. When it is about to unload a module, it triggers the breakpoint with an r_state of delete. In both cases, r_map does not include the new modules yet. The debugger can use this to save the current list of modules for comparison. After the modules are loaded or unloaded, the breakpoint is hit again, with the consistent r_state. At this point, LLDB updates its loaded module list.
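The snapshot-and-compare handling described above can be sketched as a small state machine. This is an illustration only: it works on plain arrays of module names instead of reading link_map entries from the inferior's memory, and all names in it (demo_tracker, on_rendezvous_hit, the DEMO_ constants) are hypothetical rather than taken from LLDB.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_MODS 32 /* arbitrary limit for this sketch */

/* Mirrors the three r_state values in a self-contained way. */
enum demo_state { DEMO_CONSISTENT, DEMO_ADD, DEMO_DELETE };

/* The module list remembered from the previous breakpoint hit. */
struct demo_tracker {
    const char *snapshot[MAX_MODS];
    size_t count;
};

/* How many modules appeared and disappeared since the snapshot. */
struct demo_diff {
    size_t added;
    size_t removed;
};

static int contains(const char *const *list, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i], name) == 0)
            return 1;
    return 0;
}

/* Handle one rendezvous breakpoint hit. On add/delete hits the current list
 * is only saved. On a consistent hit it is first compared to the saved list. */
struct demo_diff on_rendezvous_hit(struct demo_tracker *t, enum demo_state state,
                                   const char *const *current, size_t n_current) {
    struct demo_diff diff = {0, 0};
    assert(t != NULL && n_current <= MAX_MODS);
    if (state == DEMO_CONSISTENT) {
        for (size_t i = 0; i < n_current; i++)
            if (!contains(t->snapshot, t->count, current[i]))
                diff.added++;
        for (size_t i = 0; i < t->count; i++)
            if (!contains(current, n_current, t->snapshot[i]))
                diff.removed++;
    }
    /* Remember the list seen at this hit for the next comparison. */
    for (size_t i = 0; i < n_current; i++)
        t->snapshot[i] = current[i];
    t->count = n_current;
    return diff;
}
```

A tracker that starts empty and receives a consistent hit without a preceding add hit reports every module in the list as added, because there is nothing in the snapshot to compare against.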
One curious difference between Linux and FreeBSD is how the initial set of shared libraries (DT_NEEDED) is reported. On Linux, it is reported at the very beginning of the program via a regular add-consistent series of hits. On the first (add state) breakpoint hit, the module list contains only the dynamic loader itself. On the second (post-add) hit, it contains all the shared libraries. On FreeBSD, there is only one (consistent) breakpoint hit during which all the shared libraries are already present in r_map.
LLDB’s POSIX Dynamic Loader plugin was originally written with the Linux behavior in mind, particularly expecting an explicit add event for the dynamically loaded shared libraries. As a result, it failed to include the DT_NEEDED libraries in the loaded module list on FreeBSD. A side effect of this is that it also did not skip the dynamic loader itself on Linux.
We have prepared a patch adding all libraries from the initial breakpoint hit that resolved the FreeBSD problem and therefore unblocked enabling memory map support. However, we had to revert it since it caused the dynamic loader module to be loaded twice on Linux. We have established that this is caused by the module being loaded using two different paths (the ld-linux-x86-64.so.2 symlink and the actual ld-2.32.so file), and LLDB relying on an exact path match for deduplication.
Other significant changes and fixes
Besides the problems we’ve described in detail above, the final milestone work included a few more important fixes, notably:
- Removing thread name caching that caused LLDB not to reflect thread name changes during the process's runtime.
- Adding support for exec() events.
- Fixing handling of user-raised SIGTRAP.
- Adding fip and fdp registers on amd64 that provide convenient access to the full 64-bit values of these FPU registers (this is a follow-up on the FIP/FDP register problems from our first report).
- Translating ftag to its full value, consistently with GDB behavior (this is a follow-up on the ftag register problems from our first report).
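To make the ftag item more concrete, here is a sketch of how an abridged x87 tag can be expanded to the full 16-bit tag word. It assumes the value being translated is the 8-bit abridged tag stored by FXSAVE (one bit per physical register, set when the register is not empty), and it follows the usual classification of x87 values into valid, zero, special and empty. The type and function names (demo_st_reg, abridged_to_full_tag) are hypothetical, and this is not a copy of the LLDB implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified view of one 80-bit x87 register: a 64-bit significand and a
 * 16-bit word holding the sign bit and the 15-bit exponent. */
struct demo_st_reg {
    uint64_t significand;
    uint16_t sign_exp;
};

/* Two-bit tag values used in the full tag word. */
enum { TAG_VALID = 0, TAG_ZERO = 1, TAG_SPECIAL = 2, TAG_EMPTY = 3 };

/* Classify the contents of a non-empty register. */
static unsigned classify(const struct demo_st_reg *r) {
    uint16_t exponent = r->sign_exp & 0x7fff;
    int integer_bit = (int)((r->significand >> 63) & 1);
    if (exponent == 0x7fff)
        return TAG_SPECIAL;                 /* infinity or NaN */
    if (exponent == 0)
        return r->significand == 0 ? TAG_ZERO : TAG_SPECIAL; /* zero or denormal */
    return integer_bit ? TAG_VALID : TAG_SPECIAL;            /* normal or unnormal */
}

/* Expand the abridged tag. Tag bits are indexed by physical register number,
 * while st[] is indexed by stack position, so the TOP field of the status
 * word (bits 11..13) is needed to map one onto the other. */
uint16_t abridged_to_full_tag(uint8_t abridged, uint16_t status_word,
                              const struct demo_st_reg st[8]) {
    unsigned top = (status_word >> 11) & 7;
    uint16_t full = 0;
    assert(st != 0);
    for (unsigned phys = 0; phys < 8; phys++) {
        unsigned tag = TAG_EMPTY;
        if (abridged & (1u << phys)) {
            unsigned st_index = (phys + 8 - top) & 7; /* ST(i) lives in (top + i) mod 8 */
            tag = classify(&st[st_index]);
        }
        full |= (uint16_t)(tag << (2 * phys));
    }
    return full;
}
```

With all registers empty, the abridged tag 0x00 expands to 0xffff. With TOP equal to 7 and ST(0) holding 1.0, the abridged tag 0x80 expands to 0x3fff, since only the top two bits describe a valid register.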
Digest of the changes
The final results of running the LLDB regression test suite on FreeBSD 13.0-CURRENT amd64:
Unsupported : 453
Passed : 1766
Expectedly Failed: 4
These test results reflect the pristine LLVM development branch (revision 25c40a45999e59e3b2902cd91373cd47e7a93488) with the dynamic loader patch applied.
For comparison, the results on Linux 5.9.13 x86_64:
Unsupported : 326
Passed : 1904
Expectedly Failed: 1
We have ensured that all non-fixed and known problems have documented Problem Reports in LLVM’s Bugzilla.
To find annotated failing or skipped tests, try:
find lldb/test/API -type f \
-exec grep -i '\(expectedFail\|skipIf\).*freebsd' {} +
The lldb-server program has been documented in the form of a manual page. Originally, the lldb.1 file contributed by The FreeBSD Foundation was written in raw troff format, but upstream recently rewrote it in Sphinx format, and it is now generated on the fly during the build.
The FreeBSD Handbook was patched accordingly to mention the LLDB remote debugging capabilities. We expect to see this change merged once LLDB 12.0 is released.
Changes merged upstream
- [lldb] [docs] Add a manpage for lldb-server
- [lldb] [test] Remove duplicate xfail for Testtypedef
- [lldb] [test] Fix continue_to_breakpoint() args in TestThreadStepOut
- [lldb] [Process/FreeBSDRemote] Implement GetLoadedModuleFileSpec() and GetFileLoadAddress()
- [lldb] [Platform/POSIX] Use gdb-remote plugin when attaching
- [lldb] [test] Link FreeBSD test failures to bugs
- [lldb] [test] Reenable two passing tests on FreeBSD
- [lldb] [test] Restore Windows-skip on ‘process connect’ tests
- [lldb] Prevent ‘process connect’ from using local-only plugins
- [lldb] [Process/FreeBSDRemote] Fix regset names and related tests
- [lldb] [test] Fix qRegisterInfo lldb-server tests to handle missing registers
- [lldb] [Process/Utility] Declare register overlaps between ST and MM
- [lldb] [Process/FreeBSD] Add missing ‘override’ kws to POSIXStopInfo
- [lldb] Reland “Use translated full ftag values”
- [lldb] [test/Register] XFAIL x86-fp-write on Darwin
- Revert “[LLDB] Fixing lldb/test/Shell/Register/x86-fp-write.test”
- [lldb] Use translated full ftag values
- [lldb] Add explicit 64-bit fip/fdp registers on x86_64
- [lldb] [test] Un-XFAIL tests on freebsd/i386
- [lldb] [test] Un-XFAIL TestMultipleDebuggers.py
- [lldb] [test] Mark command-process-connect.test XFAIL
- [lldb] [test] Pass -mmmx to x86-gp-write test explicitly
- [lldb] [Process/FreeBSDRemote] Optimize regset pointer logic
- [lldb] [Process/FreeBSDRemote] Modernize and simplify YMM logic
- [lldb] [Process/FreeBSDRemote] Access debug registers via offsets
- [lldb] [Process/FreeBSDRemote] Access FPR via RegisterInfo offsets
- [lldb] [Process/FreeBSDRemote] Access GPR via reginfo offsets
- [lldb] [test] Add a minimal test for x86 dbreg reading
- [lldb] [Process/Utility] Fix DR offsets for FreeBSD
- [lldb] [Process/NetBSD] Copy the recent improvements from FreeBSD
- [lldb] [Process/FreeBSDRemote] Explicitly copy dbregs to new threads
- [lldb] [Process/FreeBSDRemote] Correct DS/ES/FS/GS register sizes
- [lldb] [Process/FreeBSDRemote] Fix handling user-generated SIGTRAP
- [lldb] [test] Rename ‘.categories’ to ‘categories’
- [lldb] [test] Skip ObjC-based tests via ‘objc’ category
- [lldb] [Process/NetBSD] Correct DS/ES/FS/GS register sizes
- [lldb] [Host/freebsd] Set Arg0 for ‘platform process list -v’
- [llvm] [Support] Fix segv if argv0 is null in getMainExecutable()
- [lldb] [test] Extend watchpoint test to wait for thread to start
- [lldb] [Process/FreeBSDRemote] Handle exec() from inferior
- [lldb] [test] Use skipUnlessDarwin for tests specific to Darwin
- [lldb] [test] Un-skip one of TestRaise signals on fbsd
- [lldb] [test] Avoid double negation in llgs/debugserver logic
- [lldb] [Process/FreeBSDRemote] Remove thread name caching
- [lldb] [test] Fix TestGdbRemoteThreadName code on FreeBSD
Summary of the third and the last milestone
The third milestone finalizes our current contract with the FreeBSD Foundation. The introduced changes are expected to be shipped with LLDB 12.0, and where applicable in FreeBSD 13.0.
During our work, the FreeBSD Project gained numerous important improvements: in the kernel, in the userland base libraries (the dynamic loader) and in the FreeBSD support of the LLVM toolchain. The overall experience of FreeBSD/LLDB developers and advanced users on this rock solid Operating System reached the state known from other environments. Furthermore, the FreeBSD-specific work resulted in generic improvements, enhancing the LLDB support for Linux and NetBSD.
Now, after concluding the FreeBSD work, we are also planning to use our new experience to merge improvements back to the NetBSD plugin, which was used as a starting point for the whole FreeBSD work.
This work was sponsored by The FreeBSD Foundation and we are grateful for this great development challenge from the FreeBSD Project.