Improving GDB protocol compatibility in LLDB
By Michał Górny
Moritz Systems have been contracted by the FreeBSD Foundation to continue our work on modernizing the LLDB debugger’s support for FreeBSD.
The primary goal of our contract is to bring kernel debugging into LLDB. The complete Project Schedule is divided into six milestones, each taking approximately one month:
-
Improve LLDB compatibility with the GDB protocol: fix LLDB implementation errors, implement missing packets, except registers.
-
Improve LLDB compatibility with the GDB protocol: support gdb-style flexible register API.
-
Support for debugging via serial port.
-
libkvm-portable and support for debugging kernel core files in LLDB, on amd64 + arm64 platform. Support for other platforms as time permits.
-
Support for debugging the running kernel on amd64 + arm64 platform. Support for other platforms as time permits.
-
Extra month for kgdb work, processing patches on LLDB reviews or miscellaneous tasks – as time permits. Examples of misc tasks: access to extended system and process information, starting processes via shell, $_siginfo support.
We have completed the first milestone, that is the work on improving protocol-level compatibility of LLDB with GDB, with the exception of register packets that are planned for the next milestone.
The purpose of this effort is to make it possible for LLDB to be used as a UI to other implementations of the GDB remote protocol, including the original gdbserver implementation, as well as other implementations provided e.g. by QEMU.
Resolving protocol incompatibilities
The protocol used for communication between GDB and gdbserver is documented in the GDB Remote Serial Protocol appendix to the GDB manual. It is a line-based plaintext protocol that permits both partial implementation and custom extensions. An extended and modified flavor of this protocol is used for the communication between LLDB and lldb-server.
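The packet framing that appears in the gdbserver logs later in this post (for example `$OK#9a` and the empty "unsupported" reply `$#00`) can be sketched as follows. This is a minimal illustration: it assumes plain ASCII payloads and ignores binary escaping and run-length encoding, and the function names are hypothetical, not taken from LLDB.

```python
def frame_packet(payload: str) -> str:
    """Wrap a payload as $payload#xx, where xx is the sum of the payload bytes modulo 256."""
    checksum = sum(payload.encode("ascii")) % 256
    return f"${payload}#{checksum:02x}"


def unframe_packet(packet: str) -> str:
    """Check the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or packet[0] != "$" or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    expected = frame_packet(payload)[-2:]
    if expected != packet[-2:].lower():
        raise ValueError("checksum mismatch")
    return payload


print(frame_packet("OK"))  # $OK#9a
print(frame_packet(""))    # $#00, the reply to an unsupported packet
```

An empty payload is how a server signals that it does not recognize a packet, which is what makes the fallbacks described in this post possible.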
The incompatibilities between the flavors used by GDB and LLDB could be classified into two main groups:
-
Incompatible packet implementations, i.e. both GDB and LLDB use the same packets but have slight differences in the actual implementation. In some cases, these differences affect only specific corner cases; e.g. LLDB used to interpret file descriptors as decimal, while GDB used hexadecimal notation. This causes no problems as long as FDs stay below 10, since single-digit values read the same in both notations. These issues are generally solved by changing LLDB to conform to the GDB protocol.
-
Use of unsupported packets, i.e. LLDB uses custom (or obsolete) packets that are not uniformly supported by other implementations. This is generally solved by providing a fallback to GDB packets, or eliminating LLDB extensions entirely if they are redundant.
Many of our fixes effectively broke compatibility between different versions of LLDB and lldb-server. However, upstream developers have agreed that at this point mixing different tool versions is not really supported, and the gain from preserving backwards compatibility does not justify the effort. If the need arises, we are ready to work on improving backwards compatibility in the future.
Summary of incompatibilities found
The first step of our work was to audit the source code of the LLDB gdb-remote plugin for incompatibilities. We compared the LLDB implementation against the GDB documentation or, where the documentation was insufficient, against the GDB source code.
The following table summarizes the incompatibilities we’ve found:
Packet | Available in | Incompatibility | Fix
---|---|---|---
vFile:open | both | GDB uses hexadecimals, LLDB decimals for parameters and return values; incompatible mode constants | sync to GDB
vFile:pread | both | GDB uses hexadecimals, LLDB decimals for parameters and return values | sync to GDB
vFile:pwrite | both | GDB uses hexadecimals, LLDB decimals for parameters and return values | sync to GDB
vFile:close | both | GDB uses hexadecimals, LLDB decimals for parameters and return values | sync to GDB
vFile:size | LLDB | LLDB extension | fallback via fstat
vFile:mode | LLDB | LLDB extension | fallback via fstat
vFile:exists | LLDB | LLDB extension | fallback via open
vFile:MD5 | LLDB | LLDB extension | none (Darwin only)
vFile:fstat | GDB | not supported by LLDB | implemented
A | LLDB | removed from gdbserver; LLDB uses decimals instead of hexadecimals | prefer vRun
vRun | GDB | new GDB packet for launching processes | implemented
qLaunchSuccess | LLDB | LLDB extension | made unnecessary with vRun
QEnvironment | LLDB | LLDB extension | fallback to QEnvironmentHexEncoded
Incompatibility details
vFile packets: incompatible parameters and responses
One of the incompatibilities between LLDB and GDB that was already known at the time of discussing the project was the implementation of vFile packets in LLDB. The incompatibilities were:
-
GDB used hexadecimal notation for file descriptors, offsets, sizes, return values and error numbers, while LLDB used decimal notation.
-
GDB and LLDB used different constants (bitmaps) for file opening modes. GDB used constants based on open(2) flags, while LLDB used a custom bitmap.
Solving the first problem was rather straightforward. We have decided to update not only packets common to GDB and LLDB but also LLDB-specific protocol extensions, to avoid confusion resulting from inconsistent use of decimal and hexadecimal notation. LLDB now uses hexadecimal notation completely, in conformance with the GDB protocol.
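To make the notation mismatch concrete, here is a small sketch of how the same vFile:pread request looks in both notations, and how a hexadecimal-only server would misread a decimal one. The helper name and its `base` parameter are hypothetical and serve only this illustration.

```python
def format_pread(fd: int, count: int, offset: int, base: int = 16) -> str:
    """Build a vFile:pread payload using hexadecimal (GDB) or decimal (old LLDB) numbers."""
    fmt = "x" if base == 16 else "d"
    return f"vFile:pread:{fd:{fmt}},{count:{fmt}},{offset:{fmt}}"


hex_packet = format_pread(5, 64, 0)           # GDB-conformant
dec_packet = format_pread(5, 64, 0, base=10)  # pre-fix LLDB behaviour
print(hex_packet)  # vFile:pread:5,40,0
print(dec_packet)  # vFile:pread:5,64,0

# A server that expects hexadecimal reads the decimal "64" as 0x64.
misread_count = int(dec_packet.split(":")[2].split(",")[1], 16)
print(misread_count)  # 100
```

The file descriptor 5 is identical in both notations, which is why the mismatch can go unnoticed until a value of 10 or more is transmitted.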
The file open mode problem was harder to solve, as the constants used by LLDB are used throughout the project and not just in the gdb-remote packets. The constants differed not only in the exact values but also in semantics:
-
in LLDB, ‘read’ and ‘write’ modes were represented by individual (non-zero) bits, and ‘read/write’ mode was enabled through their union
-
in GDB, there were three individual constants: ‘read only’, ‘write only’ and ‘read/write’, with the first one being represented by zero
In order to align the constants, we have changed LLDB’s internal semantics to follow those of GDB, that is to explicitly recognize and use the ‘read/write’ constant and not to rely on any of the three states being identifiable by a simple bitwise intersection.
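The difference in semantics can be illustrated with a short sketch. The numeric values below are the open flags as I recall them from the File-I/O section of the GDB manual; treat them as an assumption to verify against the manual. The `describe_open_flags` helper is hypothetical.

```python
# Assumed GDB File-I/O open flag values (verify against the GDB manual).
GDB_O_RDONLY = 0x0
GDB_O_WRONLY = 0x1
GDB_O_RDWR = 0x2
GDB_O_APPEND = 0x8
GDB_O_CREAT = 0x200
GDB_O_TRUNC = 0x400
GDB_O_EXCL = 0x800

ACCESS_MASK = 0x3  # the access mode is a 2-bit field, not a pair of independent bits


def describe_open_flags(flags: int) -> str:
    """Decode GDB-style open flags into a readable description."""
    access_names = {GDB_O_RDONLY: "read only", GDB_O_WRONLY: "write only", GDB_O_RDWR: "read/write"}
    parts = [access_names.get(flags & ACCESS_MASK, "invalid access mode")]
    for bit, name in ((GDB_O_APPEND, "append"), (GDB_O_CREAT, "create"),
                      (GDB_O_TRUNC, "truncate"), (GDB_O_EXCL, "exclusive")):
        if flags & bit:
            parts.append(name)
    return ", ".join(parts)


print(describe_open_flags(0x202))  # read/write, create
print(describe_open_flags(0x0))    # read only
```

Since 'read only' is zero, testing `flags & GDB_O_RDONLY` is always false; the access mode has to be extracted with the mask and compared as a whole.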
While working on this, we have also introduced a series of tests, both verifying that the LLDB client generates correct protocol packets, and that the server handles client packets correctly. We have covered all of the reasonably possible file opening modes, as well as the most common error conditions.
To use the platform commands, it is necessary to connect via the platform connect API rather than the gdb-remote command. This can be done e.g. via the following commands:
(lldb) platform select remote-gdb-server
Platform: remote-gdb-server
Connected: no
(lldb) platform connect connect://127.0.0.1:1234
Platform: remote-gdb-server
Hostname: (null)
Connected: yes
(lldb) platform file open /tmp/test.txt
File Descriptor = 5
(lldb) platform file read -c 64 5
Return = 36
Data = "some test data longer than 10 bytes
"
(lldb) platform file close 5
file 5 closed.
This results in the following GDB protocol exchange (from gdbserver debug output):
$ gdbserver --remote-debug --multi 127.0.0.1:1234
Listening on port 1234
Remote debugging from host 127.0.0.1, port 37792
[getpkt: discarding char '+']
getpkt ("QStartNoAckMode"); [sending ack]
[sent ack]
[noack mode enabled]
putpkt ("$OK#9a"); [noack mode]
[getpkt: discarding char '+']
getpkt ("qHostInfo"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qGetWorkingDir"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qQueryGDBServer"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("vFile:open:2f746d702f746573742e747874,00000202,00000000"); [no ack sent]
putpkt ("$F5#7b"); [noack mode]
getpkt ("vFile:pread:5,40,0"); [no ack sent]
putpkt ("$F24;some test data longer than 10 bytes
#79"); [noack mode]
getpkt ("vFile:close:5"); [no ack sent]
putpkt ("$F0#76"); [noack mode]
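The exchange above can be decoded by hand with a few lines of Python. This is a sketch for reading the logs, not LLDB's parser: the function names are hypothetical, and the binary escaping that real attachments may use is ignored.

```python
def parse_vfile_open(payload: str):
    """Split a vFile:open request into (path, flags, mode); all fields are hexadecimal."""
    prefix = "vFile:open:"
    if not payload.startswith(prefix):
        raise ValueError("not a vFile:open packet")
    path_hex, flags, mode = payload[len(prefix):].split(",")
    return bytes.fromhex(path_hex).decode(), int(flags, 16), int(mode, 16)


def parse_vfile_reply(payload: bytes):
    """Split an 'F result[,errno][;attachment]' reply into its three parts."""
    if not payload.startswith(b"F"):
        raise ValueError("not a vFile reply")
    header, _, attachment = payload[1:].partition(b";")
    fields = header.decode("ascii").split(",")
    result = int(fields[0], 16)
    error = int(fields[1], 16) if len(fields) > 1 else None
    return result, error, attachment


path, flags, mode = parse_vfile_open(
    "vFile:open:2f746d702f746573742e747874,00000202,00000000")
print(path, hex(flags), mode)  # /tmp/test.txt 0x202 0

result, error, data = parse_vfile_reply(b"F24;some test data longer than 10 bytes\n")
print(result, len(data))  # 36 36
```

The reply `F24` is hexadecimal for 36, matching the `Return = 36` line printed by LLDB in the session above.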
vFile packets: LLDB extensions
LLDB implements four custom vFile packets that are not implemented by GDB:
-
vFile:size that can be used to directly obtain the size of a remote file.
-
vFile:mode that can be used to directly obtain the mode bits (permissions) of a remote file.
-
vFile:exists that can be used to directly query whether a remote file exists.
-
vFile:MD5 that can be used to directly obtain an MD5 checksum of a remote file without transferring its contents.
GDB implements a vFile:fstat packet that can be used to obtain metadata of an open file, and that was not supported by LLDB. The packet returns a binary-encoded copy of normalized struct stat. It should be noted that, with the exception of size fields, all the data is encoded as 32-bit integers, therefore it is unsuitable for reliably transmitting all the data from a 64-bit fstat(2).
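As an illustration, the attachment of a vFile:fstat reply could be unpacked as below. The field order and sizes follow my reading of the struct stat description in the File-I/O part of the GDB manual (big-endian, 4-byte integers except for three 8-byte size fields, 64 bytes in total, which agrees with the `F40;` reply length seen in gdbserver logs). Treat the layout as an assumption; `RemoteStat` and `parse_fstat_attachment` are hypothetical names.

```python
import struct
from typing import NamedTuple


class RemoteStat(NamedTuple):
    st_dev: int
    st_ino: int
    st_mode: int
    st_nlink: int
    st_uid: int
    st_gid: int
    st_rdev: int
    st_size: int      # 64-bit
    st_blksize: int   # 64-bit
    st_blocks: int    # 64-bit
    st_atime: int
    st_mtime: int
    st_ctime: int


# Big-endian: seven 32-bit fields, three 64-bit fields, three 32-bit timestamps.
STAT_FORMAT = ">7I3Q3I"


def parse_fstat_attachment(data: bytes) -> RemoteStat:
    """Unpack the binary struct stat carried by a vFile:fstat reply."""
    if len(data) != struct.calcsize(STAT_FORMAT):
        raise ValueError(f"expected {struct.calcsize(STAT_FORMAT)} bytes, got {len(data)}")
    return RemoteStat(*struct.unpack(STAT_FORMAT, data))


# Synthetic sample: a regular file with mode 0644 and a size of 35 bytes.
sample = struct.pack(STAT_FORMAT, 0, 0, 0o100644, 1, 1000, 1000, 0, 35, 4096, 8, 0, 0, 0)
st = parse_fstat_attachment(sample)
print(st.st_size, oct(st.st_mode & 0o7777))  # 35 0o644
```

The 32-bit fields are the reason inode numbers, device numbers and timestamps from a 64-bit host may be truncated in transit.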
For completeness, it is worth noting that lldb-server includes a stub vFile:stat packet. It is unclear whether this was meant as a possible LLDB extension, or a typo for vFile:fstat.
To improve compatibility between LLDB and GDB, we have implemented vFile:fstat packet support both in LLDB client and server. Furthermore, we have extended the client support for vFile:size, vFile:mode and vFile:exists requests with fallbacks to vFile:fstat and vFile:open, therefore enabling the relevant client methods to work with remote gdbserver.
We have not implemented a fallback for vFile:MD5 since it is used only on Darwin.
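The fallback for vFile:size can be sketched roughly as follows. This is a simplified model rather than LLDB's actual code: `send` stands for a hypothetical transport callback that takes a packet payload and returns the reply payload, the reply to vFile:size is assumed to be `F` followed by a hexadecimal size, and the struct stat layout is the assumed GDB File-I/O layout with st_size as the eighth field.

```python
import struct
from typing import Callable, List, Optional

STAT_FORMAT = ">7I3Q3I"  # assumed GDB File-I/O struct stat layout
ST_SIZE_INDEX = 7


def get_file_size(send: Callable[[str], bytes], path: str) -> Optional[int]:
    """Try the LLDB-specific vFile:size first, then fall back to open + fstat + close."""
    path_hex = path.encode().hex()
    reply = send(f"vFile:size:{path_hex}")
    if reply:  # a non-empty reply means the packet is supported
        size = int(reply[1:].split(b",")[0].decode("ascii"), 16)
        return size if size >= 0 else None
    reply = send(f"vFile:open:{path_hex},00000000,00000000")
    fd = int(reply[1:].split(b",")[0].decode("ascii"), 16)
    if fd < 0:
        return None
    try:
        _, _, attachment = send(f"vFile:fstat:{fd:x}").partition(b";")
        if len(attachment) != struct.calcsize(STAT_FORMAT):
            return None
        return struct.unpack(STAT_FORMAT, attachment)[ST_SIZE_INDEX]
    finally:
        send(f"vFile:close:{fd:x}")  # always release the remote descriptor


sent: List[str] = []


def fake_gdbserver(packet: str) -> bytes:
    """Stand-in for gdbserver: knows open/fstat/close but not vFile:size."""
    sent.append(packet.split(":")[1])
    if packet.startswith("vFile:open:"):
        return b"F5"
    if packet == "vFile:fstat:5":
        return b"F40;" + struct.pack(STAT_FORMAT, 0, 0, 0o100644, 1, 1000, 1000, 0, 35, 4096, 8, 0, 0, 0)
    if packet == "vFile:close:5":
        return b"F0"
    return b""  # empty reply: unsupported packet


print(get_file_size(fake_gdbserver, "/tmp/test.txt"))  # 35
print(sent)  # ['size', 'open', 'fstat', 'close']
```

The fallbacks for vFile:mode and vFile:exists follow the same pattern, reading a different field or stopping after a successful open.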
We have covered all the added functions with tests, both for the client (including fallback cases) and the server. We also added a few convenience commands to utilize the new packets. Example session:
(lldb) platform get-size /tmp/test.txt
File size of /tmp/test.txt (remote): 35
(lldb) platform get-permissions /tmp/test.txt
File permissions of /tmp/test.txt (remote): 0o0644
(lldb) platform file-exists /tmp/test.txt
File /tmp/test.txt (remote) exists
The corresponding gdbserver debug output (including explicit unsupported responses followed by fallback logic):
getpkt ("vFile:size:2f746d702f746573742e747874"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("vFile:open:2f746d702f746573742e747874,00000000,00000000"); [no ack sent]
putpkt ("$F5#7b"); [noack mode]
getpkt ("vFile:fstat:5"); [no ack sent]
putpkt ("$F40;"); [noack mode]
getpkt ("vFile:close:5"); [no ack sent]
putpkt ("$F0#76"); [noack mode]
getpkt ("vFile:mode:2f746d702f746573742e747874"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("vFile:open:2f746d702f746573742e747874,00000000,00000000"); [no ack sent]
putpkt ("$F5#7b"); [noack mode]
getpkt ("vFile:fstat:5"); [no ack sent]
putpkt ("$F40;"); [noack mode]
getpkt ("vFile:close:5"); [no ack sent]
putpkt ("$F0#76"); [noack mode]
getpkt ("vFile:exists:2f746d702f746573742e747874"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("vFile:open:2f746d702f746573742e747874,00000000,00000000"); [no ack sent]
putpkt ("$F5#7b"); [noack mode]
getpkt ("vFile:close:5"); [no ack sent]
putpkt ("$F0#76"); [noack mode]
A and vRun packets
Both gdbserver and lldb-server can be started without a debugged process attached (in fact, this is how lldb-server is started by LLDB itself). In this case, the client needs to specify the executable to run and its command-line arguments (argv). This can be done using A and vRun packets.
Originally, GDB used the A packet but the modern versions have switched to vRun instead (and gdbserver does not support A anymore). Before our changes, LLDB used a slightly incompatible version of the A packet, and did not support the vRun packet at all.
Both packets are roughly equivalent. The A packet uses a rather baroque syntax:
A arglen,argnum,arg[,arglen,argnum,arg...]
For each successive command-line argument, the packet specifies its length and index. This information is entirely redundant since packets are transmitted in order, and arguments themselves are hex-encoded, therefore their length can be clearly determined.
The newer vRun packet avoids all this redundancy. It has the form of:
vRun;arg[;arg...]
In both packets, the first argument serves simultaneously as the path to the executable and as argv[0].
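To compare the two forms, here is a sketch that builds both payloads from an argv list. The helper names are hypothetical. For the A packet I assume that arglen counts the characters of the hex-encoded argument, which is my reading of the GDB manual; the `base` parameter only illustrates the decimal versus hexadecimal difference mentioned in the summary table.

```python
from typing import List


def build_vrun(argv: List[str]) -> str:
    """vRun;arg[;arg...] with every argument hex-encoded."""
    return "vRun;" + ";".join(arg.encode().hex() for arg in argv)


def build_a_packet(argv: List[str], base: int = 16) -> str:
    """A arglen,argnum,arg[,...]; arglen is assumed to be the length of the hex-encoded argument."""
    fmt = "x" if base == 16 else "d"
    fields = []
    for index, arg in enumerate(argv):
        encoded = arg.encode().hex()
        fields.append(f"{len(encoded):{fmt}},{index:{fmt}},{encoded}")
    return "A" + ",".join(fields)


print(build_vrun(["/bin/ls", "-l"]))               # vRun;2f62696e2f6c73;2d6c
print(build_a_packet(["/bin/ls", "-l"]))           # Ae,0,2f62696e2f6c73,4,1,2d6c
print(build_a_packet(["/bin/ls", "-l"], base=10))  # A14,0,2f62696e2f6c73,4,1,2d6c
```

The hexadecimal and decimal variants of the A packet already differ for a seven-character path, while the vRun form carries no numbers that could be misread.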
To resolve the incompatibility between GDB and LLDB, and at the same time optimize LLDB, we have implemented the vRun packet according to the specification and made LLDB use it by default, with fallback to the A packet.
Given that GDB does not support the latter packet and its description is incomplete in the documentation, we’ve decided that changing LLDB’s implementation of the A packet does not bring any tangible gain and does not outweigh the time spent on researching the historical GDB implementation and changing LLDB to match.
The LLDB implementation of the A packet expects an OK response. However, the vRun implementation follows GDB and returns a stop packet immediately.
LLDB issues an additional qLaunchSuccess packet to query whether the process was started successfully. This packet is not supported by gdbserver; instead we assume success if the vRun command succeeded and returned a stop reason.
To launch a process via remote server, it is important to select the target before connecting to the remote:
(lldb) file a.out
Current executable set to '/home/mgorny/git/llvm-project/build/a.out' (x86_64).
(lldb) gdb-remote 1234
(lldb) process launch
Process 10875 launched: '/home/mgorny/git/llvm-project/build/a.out' (x86_64)
Process 10875 stopped
* thread #1, stop reason = signal SIGTRAP
frame #0: 0x0000555555555051 a.out`_start + 1
a.out`_start:
-> 0x555555555051 <+1>: inl %dx, %eax
0x555555555052 <+2>: movq %rdx, %r9
0x555555555055 <+5>: popq %rsi
0x555555555056 <+6>: movq %rsp, %rdx
The relevant gdbserver debug output:
$ gdbserver --remote-debug --multi 127.0.0.1:1234
Listening on port 1234
Remote debugging from host 127.0.0.1, port 37796
[getpkt: discarding char '+']
getpkt ("QStartNoAckMode"); [sending ack]
[sent ack]
[noack mode enabled]
putpkt ("$OK#9a"); [noack mode]
[getpkt: discarding char '+']
getpkt ("qSupported:xmlRegisters=i386,arm,mips,arc;multiprocess+"); [no ack sent]
putpkt ("$PacketSize=47ff;QPassSignals+;QProgramSignals+;QStartupWithShell+;QEnvironmentHexEncoded+;QEnvironmentReset+;QEnvironmentUnset+;QSetWorkingDir+;QCatchSyscalls+;qXfer:libraries-svr4:read+;augmented-libraries-svr4-read+;qXfer:auxv:read+;qXfer:siginfo:read+;qXfer:siginfo:write+;qXfer:features:read+;QStartNoAckMode+;qXfer:osdata:read+;multiprocess+;fork-events+;vfork-events+;exec-events+;QNonStop+;QDisableRandomization+;qXfer:threads:read+;ConditionalTracepoints+;TraceStateVariables+;TracepointSource+;DisconnectedTracing+;StaticTracepoints+;InstallInTrace+;qXfer:statictrace:read+;qXfer:traceframe-info:read+;EnableDisableTracepoints+;QTBuffer:size+;tracenz+;ConditionalBreakpoints+;BreakpointCommands+;QAgent+;Qbtrace:bts+;Qbtrace-conf:bts:size+;Qbtrace:pt+;Qbtrace-conf:pt:size+;Qbtrace:off+;qXfer:btrace:read+;qXfer:btrace-conf:read+;swbreak+;hwbreak+;qXfer:exec-file:read+;vContSupported+;QThreadEvents+;no-resumed+#c8"); [noack mode]
getpkt ("QThreadSuffixSupported"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QListThreadsInStopReply"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qHostInfo"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("vCont?"); [no ack sent]
putpkt ("$vCont;c;C;t;s;S;r#be"); [noack mode]
getpkt ("qVAttachOrWaitSupported"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QEnableErrorStrings"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qProcessInfo"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qC"); [no ack sent]
putpkt ("$E01#a6"); [noack mode]
getpkt ("qfThreadInfo"); [no ack sent]
putpkt ("$E01#a6"); [noack mode]
getpkt ("QSetSTDIN:2f6465762f7074732f34"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QSetSTDOUT:2f6465762f7074732f34"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QSetSTDERR:2f6465762f7074732f34"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QSetDisableASLR:1"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QSetDetachOnError:1"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QLaunchArch:x86_64"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QEnvironment:G4SAIDXSDATA=/usr/share/Geant4-10.3.0/data/G4SAIDDATA1.1"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QEnvironmentHexEncoded:4734534149445853444154413d2f7573722f73686172652f4765616e74342d31302e332e302f646174612f47345341494444415441312e31"); [no ack sent]
[QEnvironmentHexEncoded received '4734534149445853444154413d2f7573722f73686172652f4765616e74342d31302e332e302f646174612f47345341494444415441312e31']
[Environment variable to be set: 'G4SAIDXSDATA=/usr/share/Geant4-10.3.0/data/G4SAIDDATA1.1']
putpkt ("$OK#9a"); [noack mode]
[...]
getpkt ("vRun;2f686f6d652f6d676f726e792f6769742f6c6c766d2d70726f6a6563742f6275696c642f612e6f7574"); [no ack sent]
Process /home/mgorny/git/llvm-project/build/a.out created; pid = 10875
putpkt ("$T0506:0*,;07:e0ccf*"7f0* ;10:50d0fcf7ff7f0* ;thread:p2a7b.2a7b;core:2;#65"); [noack mode]
getpkt ("qLaunchSuccess"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qProcessInfo"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("qC"); [no ack sent]
putpkt ("$QCp2a7b.2a7b#8a"); [noack mode]
getpkt ("qfThreadInfo"); [no ack sent]
putpkt ("$mp2a7b.2a7b#63"); [noack mode]
getpkt ("qsThreadInfo"); [no ack sent]
putpkt ("$l#6c"); [noack mode]
getpkt ("?"); [no ack sent]
putpkt ("$T0506:0*,;07:e0ccf*"7f0* ;10:50d0fcf7ff7f0* ;thread:p2a7b.2a7b;core:2;#65"); [noack mode]
[...]
QEnvironment and QEnvironmentHexEncoded packets
LLDB sends a series of environment packets to set environment variables before launching the remote process. Normally, LLDB uses two packets for that: QEnvironment and QEnvironmentHexEncoded. The former packet sends the environment variable in raw key=value form, the latter encodes it as a hex string. LLDB chooses between the two depending on whether the variable contains any characters that need to be escaped.
However, GDB supports only QEnvironmentHexEncoded (i.e. QEnvironment is an LLDB optimization). For this reason, we have modified the environment sending logic to handle an ‘unsupported’ response to QEnvironment and fall back to QEnvironmentHexEncoded in that case.
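The fallback logic can be sketched as follows. This is a simplified model, not LLDB's code: `send` is a hypothetical transport callback returning the reply payload, and the set of characters treated as needing escaping (the protocol's special characters plus non-printable ones) is an assumption.

```python
from typing import Callable, List

SPECIAL_CHARS = set("$#}*")  # assumed: characters with special meaning in the protocol


def send_environment(send: Callable[[str], str], variable: str) -> str:
    """Send one key=value pair; return the name of the packet that was accepted."""
    needs_hex = any(c in SPECIAL_CHARS or not c.isprintable() for c in variable)
    if not needs_hex:
        reply = send(f"QEnvironment:{variable}")
        if reply == "OK":
            return "QEnvironment"
        if reply != "":
            raise RuntimeError(f"QEnvironment failed: {reply}")
        # Empty reply: the server does not know QEnvironment, so fall back.
    reply = send("QEnvironmentHexEncoded:" + variable.encode().hex())
    if reply != "OK":
        raise RuntimeError(f"QEnvironmentHexEncoded failed: {reply!r}")
    return "QEnvironmentHexEncoded"


log: List[str] = []


def fake_gdbserver(packet: str) -> str:
    """Stand-in for gdbserver: accepts only the hex-encoded form."""
    log.append(packet)
    return "OK" if packet.startswith("QEnvironmentHexEncoded:") else ""


print(send_environment(fake_gdbserver, "LOGNAME=mgorny"))  # QEnvironmentHexEncoded
print(log[-1])  # QEnvironmentHexEncoded:4c4f474e414d453d6d676f726e79
```

A client could also remember the first unsupported response and skip QEnvironment for the remaining variables, saving one round trip per variable.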
Example environment variable transmission demonstrating the fallback:
getpkt ("QEnvironment:G4SAIDXSDATA=/usr/share/Geant4-10.3.0/data/G4SAIDDATA1.1"); [no ack sent]
putpkt ("$#00"); [noack mode]
getpkt ("QEnvironmentHexEncoded:4734534149445853444154413d2f7573722f73686172652f4765616e74342d31302e332e302f646174612f47345341494444415441312e31"); [no ack sent]
[QEnvironmentHexEncoded received '4734534149445853444154413d2f7573722f73686172652f4765616e74342d31302e332e302f646174612f47345341494444415441312e31']
[Environment variable to be set: 'G4SAIDXSDATA=/usr/share/Geant4-10.3.0/data/G4SAIDDATA1.1']
putpkt ("$OK#9a"); [noack mode]
getpkt ("QEnvironmentHexEncoded:4c4f474e414d453d6d676f726e79"); [no ack sent]
[QEnvironmentHexEncoded received '4c4f474e414d453d6d676f726e79']
Signal number transmission
The signals are transmitted via the GDB Remote Protocol using their numbers. This means that special care needs to be taken when connecting to a server running on a remote platform using different signal codes. GDB and LLDB use different approaches to resolve this problem.
GDB defines a standard set of signals and their corresponding numbers. The remote gdbserver translates between platform’s signal numbers and sends the standardized codes over the protocol. The client is not required to have any knowledge of the remote platform’s signal layout. However, the protocol is unable to transmit signals not in the predefined set.
LLDB uses the remote host’s signal numbers instead. The client obtains the remote platform identification from the server, and uses it to process the signal numbers correctly. This includes a custom jSignalsInfo packet transmitting remote signal information.
To provide the best compatibility, we have decided to use standardized GDB signal codes when interacting with a foreign server implementation and native signal codes when interacting with lldb-server (indicated via the native-signals+ qSupported property). Most importantly, this preserves the ability to convey signals that are not part of the standard set defined by GDB (e.g. SIGSTKFLT).
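A sketch of the resulting client-side decision is shown below. The two tables are small illustrative excerpts: the GDB numbers are taken from my recollection of GDB's signal list and the native ones from Linux on x86-64, so both should be treated as assumptions. The function name is hypothetical.

```python
from typing import Dict, Iterable, Optional

# Excerpt of GDB's standardized signal numbers (assumed values).
GDB_SIGNALS: Dict[int, str] = {
    1: "SIGHUP", 2: "SIGINT", 5: "SIGTRAP", 10: "SIGBUS",
    11: "SIGSEGV", 30: "SIGUSR1", 31: "SIGUSR2",
}

# Excerpt of native Linux x86-64 signal numbers (assumed values).
LINUX_X86_64_SIGNALS: Dict[int, str] = {
    1: "SIGHUP", 2: "SIGINT", 5: "SIGTRAP", 7: "SIGBUS",
    10: "SIGUSR1", 11: "SIGSEGV", 12: "SIGUSR2", 16: "SIGSTKFLT",
}


def signal_name_from_wire(signo: int, server_features: Iterable[str],
                          native_signals: Dict[int, str]) -> Optional[str]:
    """Interpret a signal number from a stop reply according to the server's flavor."""
    if "native-signals+" in server_features:
        return native_signals.get(signo)  # lldb-server: remote host numbering
    return GDB_SIGNALS.get(signo)         # foreign server: GDB numbering


# The same wire value means different signals depending on the server.
print(signal_name_from_wire(10, ["native-signals+"], LINUX_X86_64_SIGNALS))  # SIGUSR1
print(signal_name_from_wire(10, [], LINUX_X86_64_SIGNALS))                   # SIGBUS
```

With native numbering, value 16 resolves to SIGSTKFLT in this excerpt, a signal the post names as missing from the GDB set.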
errno transmission
The vFile packets include the numerical errno value on errors. Similarly to signal codes, these values are not guaranteed to be consistent across systems.
GDB translates the system errno values into its own predefined constants. LLDB used to transmit the raw system value and did not account for cross-platform mismatches.
We have implemented GDB-style translation of errno values in LLDB.
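The idea of the translation can be sketched in a few lines. The table lists the errno constants as I recall them from the File-I/O section of the GDB manual, so treat the values as an assumption; `to_gdb_errno` is a hypothetical helper, not the function used in LLDB.

```python
import errno

# Assumed GDB File-I/O errno values (verify against the GDB manual).
GDB_ERRNO = {
    "EPERM": 1, "ENOENT": 2, "EINTR": 4, "EBADF": 9, "EACCES": 13,
    "EFAULT": 14, "EBUSY": 16, "EEXIST": 17, "ENODEV": 19, "ENOTDIR": 20,
    "EISDIR": 21, "EINVAL": 22, "ENFILE": 23, "EMFILE": 24, "EFBIG": 27,
    "ENOSPC": 28, "ESPIPE": 29, "EROFS": 30, "ENAMETOOLONG": 91,
}
GDB_EUNKNOWN = 9999


def to_gdb_errno(host_errno: int) -> int:
    """Map a host errno value to the protocol constant, by symbolic name."""
    name = errno.errorcode.get(host_errno)
    return GDB_ERRNO.get(name, GDB_EUNKNOWN)


print(to_gdb_errno(errno.ENOENT))        # 2
print(to_gdb_errno(errno.ENAMETOOLONG))  # 91
print(to_gdb_errno(errno.ENOTEMPTY))     # 9999 (not in the predefined set)
```

Mapping by name rather than by number is what makes the result independent of the host; ENAMETOOLONG, for instance, has different numeric values on Linux and FreeBSD.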
Changes merged upstream
- [lldb] Make WatchpointList iterable
- [lldb] [gdb-remote client] Avoid zero padding PID/TID in H packet
- [lldb] [gdb-remote] Add eOpenOptionReadWrite for future gdb compat
- [lldb] [gdb-remote] Sync vFile:open mode constants with GDB
- [lldb] [gdb-remote] Use hexadecimal numbers in vFile packets for GDB compliance
- [lldb] [test] Fix TestGdbRemotePlatformFile with non-022 umask
- [lldb] [test] Mark new vFile tests as XFAIL on Windows
- [lldb] [test] Use Windows-friendly modes in vFile O_CREAT tests
- [lldb] [test] Mark vFile tests as LLGS-specific
- [lldb] [test] Skip all vFile tests on Windows
- [lldb] [Commands] Remove ‘append’ from ‘platform file open’ mode
- [lldb] [Commands] Fix reporting errors in ‘platform file read/write’
- [lldb] [gdb-server] Add tests for more vFile packets
- [lldb] [gdb-remote] Implement fallback to vFile:stat for GetFileSize()
- [lldb] Add new commands and tests for getting file perms & exists
- [lldb] [gdb-remote] Add fallbacks for vFile:mode and vFile:exists
- [lldb] [gdb-server] Implement the vFile:fstat packet
- [lldb] [gdb-remote] Implement vRun packet
- [lldb] [gdb-remote] Support QEnvironment fallback to hex-encoded
- [lldb] [gdb-remote] Use standardized GDB errno values
Changes pending review
Future plans
The next step of our work is to improve the protocol-wise compatibility between LLDB and GDB with regard to the packets used to read and write registers. At the moment, LLDB and GDB have their own register abstraction layers that are partially incompatible.
GDB’s register support could be said to be more low-level than LLDB’s. For example, the x86 XSAVE instruction exposes the contents of YMM registers in two parts — the lower half as an XMM register, and the higher half separately. GDB preserves this split in the packet layout and recombines the two parts into a dump of YMM register on the client. LLDB recombines them on the server and sends the complete YMM register instead (with some redundancy to the XMM register that is also transmitted).
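The client-side recombination described above amounts to concatenating the two halves. The sketch below assumes that register contents are handled as raw bytes in the target's little-endian order, so the lower (XMM) half comes first; `combine_ymm` is a hypothetical helper.

```python
def combine_ymm(xmm: bytes, ymmh: bytes) -> bytes:
    """Join the lower 128 bits (XMM) and the upper 128 bits (YMMh) into a 256-bit YMM value."""
    if len(xmm) != 16 or len(ymmh) != 16:
        raise ValueError("both halves must be exactly 16 bytes")
    return xmm + ymmh  # little-endian byte order: low half first


low = bytes(range(16))        # stand-in for the transmitted xmm0 contents
high = bytes(range(16, 32))   # stand-in for the transmitted ymm0h contents
ymm0 = combine_ymm(low, high)
print(ymm0.hex())
```

Interpreted as a little-endian integer, the upper 128 bits of the result equal the YMMh half, which is the layout a YMM register dump is expected to have.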
At the moment, LLDB partially supports using GDB’s register layouts. However, this support has a few limitations:
-
LLDB does not provide convenient register aliases like GDB does, e.g. it is impossible to register read ymm0 when connecting to gdbserver. Instead, the client needs to read xmm0 and ymm0h and manually recombine the value.
-
There are cross-platform compatibility problems. For example, LLDB connecting to a gdbserver debugging a 32-bit executable on amd64 is unable to read registers.
Our goal is to resolve these problems, making it possible to use LLDB in place of GDB without any inconveniences. Resolving cross-platform compatibility problems is especially important, as using gdbserver to remotely debug embedded hardware is an important feature.
As the support for GDB-style register layouts is improved, it will probably make sense to adjust the LLDB layouts as well. For example, we will be able to remove the YMM register recombining logic from the server plugins, and remove the resulting redundancy from transmitted packets.