How Debuggers Work: Getting and Setting x86 Registers, Part 2: XSAVE
By Michał Górny
In the previous part of this article, I have described the basic methods
of getting and setting the baseline registers of 32-bit and 64-bit x86
CPUs. I have covered General Purpose Registers, baseline Floating-Point
Registers and Debug Registers along with their ptrace(2)
interface.
In the second part, I would like to discuss the XSAVE
family
of instructions. I will describe the different variants of this
instruction as well as explain the differences between them and their limitations.
Afterwards, I will compare the ptrace(2)
API used to access its data
on Linux, FreeBSD and NetBSD. Other systems such as OpenBSD
or DragonFly BSD do not provide requests to retrieve or set extended
registers, so the comparison may help them design their own APIs.
As I’ve explained earlier, the discussed instructions are necessary to implement context switching — the mechanism used by the Operating System to run multiple threads and processes quasi-simultaneously on the same processor. In order to perform that, the kernel needs to be able to save the values of all registers used by the program, and restore them afterwards. This information is also exposed to debuggers in order to provide them with means to introspect and alter the state of debugged programs.
The instructions described in the first part were sufficient to cover the registers used up to the early generations of Intel Core CPUs. However, as the next generations of processors introduced new instruction sets, it eventually became necessary to introduce new registers as well. In 2011, the AVX extensions, present first in Intel’s Sandy Bridge and afterwards in AMD’s Bulldozer microarchitecture, doubled the size of the earlier XMM registers, creating 16 new YMM registers.
The new registers can be used to store vectors of data twice as large, and perform operations on all of their elements simultaneously. This is particularly useful for heavy computations, for example in multimedia or cryptographic applications. Examples of programs that can explicitly take advantage of AVX instructions to improve their performance include the FFmpeg media decoding and encoding library and the OpenCV image manipulation library.
As applications start using the new registers, it becomes necessary
for the kernel to be able to save and restore them as part of context
switching — otherwise the programs would lose data! The XSAVE
instruction set serves exactly that purpose. It was introduced
in the newer versions of Intel Core microarchitecture (2008). It is
used both in the 64-bit and 32-bit mode (although 32-bit programs can use
only a subset of the exposed registers).
The XSAVE
instruction extends the format used by FXSAVE
to
include additional register sets. However, unlike the earlier saving
instructions, it is not strictly limited to a fixed data set. Instead,
it makes it possible to introduce support for new CPU extensions without
the need to add yet another XSAVE
variant or breaking
compatibility with existing software. Furthermore, it accounts
for the possibility that some processors may choose not to implement
interim instruction sets.
The State Components
XSAVE
revolves around the concept of State Components. A state
component represents a single subset of data that can be saved or
restored independently. There are two special state components
corresponding to the original FXSAVE
instruction: the x87 state
component, and the SSE state component. Further instruction sets
introduce one or more components each.
In modern processors, there are two kinds of state components: user state components and supervisor state components. The former group represents regular registers that are accessible to userspace programs, while the latter comprises privileged registers that should not be exposed to regular programs.
The individual state components are controlled via the State Component
Bitmap. This bitmap is used by XSAVE
to determine which
instruction sets to save, and by XRSTOR
to determine which to
restore (or reset). Enabling the respective bits causes additional
data to be saved to the memory, effectively requiring larger storage
area.
In order to make it possible to save a particular state component or to use the respective registers in a program, the kernel needs to enable its tracking in one of the control registers. These control registers are XCR0 for user components, and IA32_XSS for supervisor components. Both use the same bit numbers as the state component bitmap.
Bit | Instr. set | User SC (XCR0) | Supervisor SC (IA32_XSS) | Size (bytes) |
---|---|---|---|---|
0 | x87 | x87 state | reserved | 512 (legacy region, shared by bits 0 and 1) |
1 | SSE | SSE state | reserved | (part of the legacy region) |
2 | AVX | YMM_Hi128 | reserved | 256 |
3 | MPX | BNDREGS | reserved | 64 |
4 | MPX | BNDCSR | reserved | 16 |
5 | AVX-512 | opmask | reserved | 64 |
6 | AVX-512 | ZMM_Hi256 | reserved | 512 |
7 | AVX-512 | Hi16_ZMM | reserved | 1024 |
8 | PT | reserved | PT | 72 |
9 | PKRU | PKRU | reserved | 4 |
13 | HDC | reserved | HDC | 8 |
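To make the bit numbering concrete, here is a minimal sketch that maps a state-component bitmap to the component names from the table above. The function names (`xstate_name`, `xstate_bit_by_name`, `xstate_print`) are made up for this illustration and do not come from any system header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Component names indexed by bit number, following the table in this
 * article; NULL marks bits the table does not describe. */
static const char *const xstate_names[] = {
    "x87", "SSE", "YMM_Hi128", "BNDREGS", "BNDCSR", "opmask",
    "ZMM_Hi256", "Hi16_ZMM", "PT", "PKRU", NULL, NULL, NULL, "HDC",
};
#define XSTATE_NAME_COUNT (sizeof(xstate_names) / sizeof(xstate_names[0]))

/* Return the name of state component `bit`, or NULL if it is unknown. */
const char *xstate_name(unsigned bit)
{
    return bit < XSTATE_NAME_COUNT ? xstate_names[bit] : NULL;
}

/* Return the bit number of the named component, or -1 if it is unknown. */
int xstate_bit_by_name(const char *name)
{
    assert(name != NULL);
    for (unsigned bit = 0; bit < XSTATE_NAME_COUNT; ++bit)
        if (xstate_names[bit] != NULL && strcmp(xstate_names[bit], name) == 0)
            return (int)bit;
    return -1;
}

/* Print the names of all components whose bits are set in `bitmap`;
 * returns how many of them were recognized. */
int xstate_print(uint64_t bitmap)
{
    int count = 0;
    for (unsigned bit = 0; bit < 64; ++bit) {
        if (!(bitmap & (UINT64_C(1) << bit)))
            continue;
        const char *name = xstate_name(bit);
        if (name != NULL) {
            printf("%s%s", count ? " " : "", name);
            ++count;
        }
    }
    printf("\n");
    return count;
}
```

For example, `xstate_print(0x07)` prints `x87 SSE YMM_Hi128`, which is the set of components a kernel would typically enable on an AVX-capable processor.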
The XSAVE Area Format
The data format used by the XSAVE
instruction is called the XSAVE
Area. The XSAVE Area consists of three parts: the 512-byte legacy
region that is the same as used by FXSAVE
instruction, followed
by the 64-byte XSAVE header containing information about the data
present in the XSAVE Area, followed by the variably sized extended
region used to store additional state components.
Similarly to FXSAVE
, all XSAVE
instructions have their -64
counterparts (e.g. XSAVE64
) that differ in the way FIP and FDP
registers are saved in the legacy region. More information on this,
along with a table describing the legacy region in detail, can be found
in the previous part of the article,
FXSAVE vs FXSAVE64 section.
The XSAVE header currently contains two 64-bit fields whose values
correspond to the state-component bitmaps: XSTATE_BV and XCOMP_BV.
XSTATE_BV is written by XSAVE
to indicate that a particular state
component has been written to the extended region, and read by
XRSTOR
to determine whether the component is to be restored
from this region (bit set) or reset to the default state (bit clear).
XCOMP_BV is written by the compacting variants of XSAVE
to indicate
that the compact form of XSAVE Area is being used and which components
are present in it, and read by XRSTOR
to distinguish this format.
Offset (bytes) | Size (bytes) | Field |
---|---|---|
0 | 8 | XSTATE_BV |
8 | 8 | XCOMP_BV |
16 | 48 | reserved |
The extended region can be written either in the standard or compact
format. In the standard format, each state component is placed
at a fixed offset defined by the processor (and available via
CPUID
). If some of the state components are skipped, the relevant
portion of XSAVE Area is gapped to preserve offsets of the successive
components. In the compact format, the skipped components do not take
up space, and the remaining components are shifted to minimize space
usage. Therefore, the offsets depend on the components actually being
written, and need to be calculated by software for every invocation.
Standard format | Compact format |
---|---|
Legacy area (512 bytes) | Legacy area (512 bytes) |
XSAVE header (64 bytes) | XSAVE header (64 bytes) |
YMM_Hi128 (256 bytes) | YMM_Hi128 (256 bytes) |
unused (MPX + AVX-512) (1680 bytes) | PT (72 bytes) |
PT (72 bytes) | (not allocated) |
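The offset calculation for the compact format can be sketched as follows. This is a simplified illustration: the sizes are hard-coded from the table earlier in this article rather than queried via CPUID, and the sketch ignores the optional 64-byte alignment that CPUID may request for individual components. The names `xstate_sizes` and `xsave_compact_offset` are invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* The extended region starts after the legacy region and the header. */
#define XSAVE_EXTENDED_START 576u

/* Component sizes in bytes, indexed by bit number. ASSUMPTION: values
 * are copied from the table in this article; real code should ask CPUID. */
static const uint32_t xstate_sizes[64] = {
    [2] = 256, [3] = 64, [4] = 16, [5] = 64, [6] = 512,
    [7] = 1024, [8] = 72, [9] = 4, [13] = 8,
};

/*
 * Return the offset of component `bit` within a compact-format area
 * that contains the components listed in `xcomp_bv`, or 0 if the
 * component is not part of the area.
 * SIMPLIFICATION: per-component alignment requirements are ignored.
 */
uint32_t xsave_compact_offset(uint64_t xcomp_bv, unsigned bit)
{
    uint32_t offset = XSAVE_EXTENDED_START;

    assert(bit >= 2 && bit < 63);
    if (!(xcomp_bv & (UINT64_C(1) << bit)))
        return 0;
    /* every present component with a lower bit number comes first */
    for (unsigned i = 2; i < bit; ++i)
        if (xcomp_bv & (UINT64_C(1) << i))
            offset += xstate_sizes[i];
    return offset;
}
```

With only AVX and PT present (bitmap `0x104`), PT lands at offset 832, directly after YMM_Hi128. With all of bits 2 through 8 present, the same function yields 2512, which matches the position of PT in the standard-format column of the table above.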
Invoking XSAVE
There are a few preliminary steps that need to be done before invoking
any of the XSAVE
family of instructions. I will briefly list them
now.
Firstly, the support for the instruction needs to be verified
via CPUID
. Strictly speaking, the same is also true for FXSAVE
.
Secondly, the state tracking needs to be enabled. This means setting appropriate state component bits in XCR0 for user state components, and in IA32_XSS for supervisor state components. The appropriate XSAVE bit also needs to be set in the Control Register CR4. All of this is done by the kernel.
Thirdly, a buffer large enough for the XSAVE Area needs to be obtained.
The program should use CPUID
instruction to obtain the needed
size. The buffer needs to be aligned to 64 bytes. Usually, it may
be convenient to zero the buffer first, to avoid having to be careful
e.g. about XSAVE
leaving unused XSTATE_BV bytes unmodified.
Finally, the requested state component bitmap needs to be put into
the register pair EDX:EAX (the higher 32 bits into EDX, lower into EAX
— this is a common i386 convention for 64-bit integers). Once this
is done, XSAVE
can be invoked.
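The EDX:EAX convention amounts to splitting the 64-bit bitmap into two 32-bit halves. A minimal sketch, using made-up helper names:

```c
#include <assert.h>
#include <stdint.h>

/* The two halves of a 64-bit value as passed in the EDX:EAX pair. */
struct edx_eax {
    uint32_t eax;  /* lower 32 bits */
    uint32_t edx;  /* higher 32 bits */
};

/* Split a requested-component bitmap into its register halves. */
struct edx_eax split_bitmap(uint64_t bitmap)
{
    struct edx_eax regs;
    regs.eax = (uint32_t)(bitmap & 0xffffffffu);
    regs.edx = (uint32_t)(bitmap >> 32);
    return regs;
}

/* Join the two halves back into a 64-bit bitmap. */
uint64_t join_bitmap(uint32_t eax, uint32_t edx)
{
    return ((uint64_t)edx << 32) | eax;
}
```

For the bitmap `0x07` (x87, SSE and AVX), EAX receives 7 and EDX receives 0. All components defined so far fit in the lower half.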
Afterwards, another series of CPUID
calls is necessary to obtain
offsets or sizes and alignment requirements to process the contents
of the XSAVE Area.
The listing below presents a simple program that calls XSAVE
three
times with different register sets modified.
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdlib.h>
#include <stdio.h>

/* layout of the XSAVE Area: legacy region, header, extended region */
struct xsave {
    uint8_t legacy_area[512];
    union {
        struct {
            uint64_t xstate_bv;
            uint64_t xcomp_bv;
        };
        uint8_t header_area[64];
    };
    uint8_t extended_area[];
};

int main() {
    uint32_t buf_size = 0;
    uint32_t avx_offset = 0;
    uint8_t avx_bytes[32];
    struct xsave* buf[3];
    int i;

    /* test pattern to be loaded into YMM0 */
    for (i = 0; i < (int)sizeof(avx_bytes); ++i)
        avx_bytes[i] = i;

    __asm__ __volatile__ (
        /* check CPUID support for XSAVE and AVX */
        "mov $0x01, %%eax\n\t"
        "cpuid\n\t"
        "mov $0x04000000, %%eax\n\t" /* bit 26 - XSAVE */
        "and %%ecx, %%eax\n\t"
        "jz .cpuid_end\n\t"
        "mov $0x10000000, %%eax\n\t" /* bit 28 - AVX */
        "and %%ecx, %%eax\n\t"
        "jz .no_avx\n\t"
        /* get AVX offset */
        "mov $0x0d, %%eax\n\t"
        "mov $0x02, %%ecx\n\t"
        "cpuid\n\t"
        "mov %%ebx, %1\n\t"
        "\n"
        ".no_avx:\n\t"
        /* get XSAVE area size for current XCR0 */
        "mov $0x0d, %%eax\n\t"
        "xor %%ecx, %%ecx\n\t"
        "cpuid\n\t"
        "mov %%ebx, %0\n\t"
        "\n"
        ".cpuid_end:\n\t"
        : "=m"(buf_size), "=m"(avx_offset)
        :
        : "%eax", "%ebx", "%ecx", "%edx"
    );

    if (buf_size == 0) {
        printf("no xsave support\n");
        return 1;
    }

    printf("has avx: %s\n", avx_offset != 0 ? "yes" : "no");
    printf("xsave area size: %" PRIu32 " bytes\n", buf_size);

    /* the XSAVE Area must be aligned to 64 bytes */
    for (i = 0; i < 3; ++i) {
        buf[i] = aligned_alloc(64, buf_size);
        assert(buf[i]);
    }

    __asm__ __volatile__ (
        /* request x87, SSE and AVX state (EDX:EAX = 0x07) */
        "mov $0x07, %%eax\n\t"
        "xor %%edx, %%edx\n\t"
        /* 1st save: initial state */
        "xsave (%0)\n\t"
        /* 2nd save: after touching an MMX register */
        "movd %%eax, %%mm0\n\t"
        "xsave (%1)\n\t"
        /* 3rd save: after loading YMM0 (only if AVX is available) */
        "and %3, %3\n\t"
        "jz .xsave_end\n\t"
        "vmovups (%3), %%ymm0\n\t"
        "xsave (%2)\n\t"
        "\n"
        ".xsave_end:\n\t"
        :
        : "r"(buf[0]), "r"(buf[1]), "r"(buf[2]),
          "c"(avx_offset != 0 ? avx_bytes : 0)
        : "%eax", "%edx", "%mm0", "%ymm0", "memory"
    );

    printf("XSTATE_BV (initial): %#018" PRIx64 "\n",
           buf[0]->xstate_bv);
    printf("XSTATE_BV (with MMX): %#018" PRIx64 "\n",
           buf[1]->xstate_bv);
    if (avx_offset != 0) {
        printf("XSTATE_BV (with AVX): %#018" PRIx64 "\n",
               buf[2]->xstate_bv);
        printf("YMM0 most significant quadword: %#018" PRIx64 "\n",
               *((uint64_t*)(((char*)buf[2]) + avx_offset)));
    }

    for (i = 0; i < 3; ++i)
        free(buf[i]);

    return 0;
}
On my NetBSD (9.99.74 amd64) system with Ryzen 5 3600, this program writes the following output:
has avx: yes
xsave area size: 832 bytes
XSTATE_BV (initial): 000000000000000000
XSTATE_BV (with MMX): 0x0000000000000001
XSTATE_BV (with AVX): 0x0000000000000007
YMM0 most significant quadword: 0x1716151413121110
The variants of the XSAVE instruction
XSAVE
(Intel Core, 2008) is the original instruction of the family.
It saves the requested user state components (requests for supervisor state
components are ignored) into the XSAVE Area. All requested components
are written (if available), independently of whether they are actually
being used or not. The extended region of the XSAVE Area is written
in the standard format, and skipped components result in gaps.
XSAVEOPT
(Sandy Bridge, 2011) is a version of XSAVE
that supports two optimizations:
the init optimization, and the modified optimization. The init
optimization means that the requested state component will not
be written if it has not been changed compared to its initial state.
The modified optimization means that if XSAVEOPT
is writing
to the same memory area that was passed to XRSTOR
previously,
then the state component will not be written if it has not been modified
since it has been last restored. This assumes that the XSAVE Area
is not modified by the user between the two instructions. These two
optimizations can improve context switching performance by avoiding
unnecessary writes.
XSAVEC
(Skylake, 2015) is a version of XSAVE
that uses the compact XSAVE Area
format. Therefore, only those components that were explicitly requested
are saved into the XSAVE Area, in a packed format. It also uses
the init optimization in order to skip writing the components that
were not modified compared to their initial state. The XSAVEC
instruction can improve performance and might reduce memory usage
by skipping unnecessary components.
Finally, XSAVES
(Skylake, 2015) is a version of XSAVE
that combines the ability
to save supervisor components, compact format, and both init
and modified optimizations. The components are written only if they
were modified since their initial state, and since the previous
XRSTORS
invocation. This variant provides the best performance,
and is capable of reducing the memory footprint.
XRSTOR
is the restoring counterpart of XSAVE
, XSAVEOPT
and XSAVEC
. It automatically determines the XSAVE Area format
from the header region. XRSTORS
is the restoring counterpart
of XSAVES
.
All of the aforementioned instructions take the requested component bitmap in the EDX:EAX register pair.
All of the instructions except for XSAVES
and XRSTORS
can be
executed by unprivileged processes. XSAVES
and XRSTORS
can only
be executed by the kernel.
Variant | User state comp. | Supervisor state comp. | Standard format | Compact format | Init optimization | Modified optimization |
---|---|---|---|---|---|---|
XSAVE | ✓ | ✗ | ✓ | ✗ | ✗ | ✗ |
XSAVEOPT | ✓ | ✗ | ✓ | ✗ | ✓ | ✓ |
XSAVEC | ✓ | ✗ | ✗ | ✓ | ✓ | ✗ |
XSAVES | ✓ | ✓ | ✗ | ✓ | ✓ | ✓ |
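The differences between the variants can be summarized as a toy model that computes which components a given variant would write. This is only a sketch of the rules described above, not a faithful emulation: real processors have further special cases, and all names below are invented for the example. The caller is assumed to pass an all-ones `modified` bitmap whenever the modified optimization does not apply (for instance because a different memory area is used).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum xsave_variant { V_XSAVE, V_XSAVEOPT, V_XSAVEC, V_XSAVES };

/* One row of the comparison table. */
struct variant_traits {
    bool supervisor;    /* can save supervisor state components */
    bool compact;       /* writes the compact format */
    bool init_opt;      /* skips components in their initial state */
    bool modified_opt;  /* skips components unmodified since restore */
};

static const struct variant_traits traits[] = {
    [V_XSAVE]    = { false, false, false, false },
    [V_XSAVEOPT] = { false, false, true,  true  },
    [V_XSAVEC]   = { false, true,  true,  false },
    [V_XSAVES]   = { true,  true,  true,  true  },
};

/*
 * Toy model of component selection:
 *   requested - bitmap passed in EDX:EAX
 *   user_mask - components enabled in XCR0
 *   sup_mask  - components enabled in IA32_XSS
 *   in_use    - components that are not in their initial state
 *   modified  - components changed since the matching restore
 */
uint64_t components_written(enum xsave_variant variant, uint64_t requested,
                            uint64_t user_mask, uint64_t sup_mask,
                            uint64_t in_use, uint64_t modified)
{
    assert(variant >= V_XSAVE && variant <= V_XSAVES);
    const struct variant_traits *t = &traits[variant];
    /* requests for components the variant cannot save are ignored */
    uint64_t result = requested & (user_mask | (t->supervisor ? sup_mask : 0));

    if (t->init_opt)
        result &= in_use;
    if (t->modified_opt)
        result &= modified;
    return result;
}

/* Whether the variant produces the compact XSAVE Area format. */
bool writes_compact_format(enum xsave_variant variant)
{
    return traits[variant].compact;
}
```

With x87, SSE and AVX enabled and requested, but only x87 in use, plain `XSAVE` writes all three components while `XSAVEOPT` writes only the x87 state.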
The ptrace(2) API
The ptrace(2)
API used for other register sets is based
on the concept of filling a fixed size struct. Therefore, it does not
map cleanly onto the XSAVE
instruction that can return data of variable
length. While it is technically possible to simply keep adding new
ptrace(2)
requests as the kernel gains support for successive
state components, it seems better to embrace the idea and create an API
that is extensible as well. This is at least what Linux, FreeBSD
and NetBSD have done.
The Linux ptrace(2) API
Linux 2.6.34 added two new ptrace(2)
requests: PTRACE_GETREGSET
and PTRACE_SETREGSET
that provide a generic way to get or set any register
set. They take an NT_*
constant identifying the desired
register set as their third argument (addr
), and a struct iovec
that encapsulates the buffer’s address and length as their fourth
argument (data
). The getter writes the actual data length back into
the structure. An interesting advantage of this solution is that
the same constants are used for these two requests and to identify notes
in core dump files.
The available constants correspond to all regular register sets,
including the Linux-specific user register structure, FSAVE and FXSAVE.
However, the most interesting to us is NT_X86_XSTATE
— this is how
the XSAVE Area is exposed.
The kernel only exposes methods to copy from and into the XSAVE Area.
The program needs to call CPUID
itself in order to determine
the buffer size and component offsets.
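Once the raw XSAVE Area has been fetched, a full YMM register has to be assembled from two places: the lower half lives in the XMM slot of the legacy region, the upper half in the YMM_Hi128 component. The sketch below shows the idea for a standard-format buffer. The function name is hypothetical, and `avx_offset` is assumed to be the value returned in EBX by CPUID leaf 0x0d, subleaf 2.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XSAVE_XMM_OFFSET 160u  /* XMM0 within the legacy region */
#define XMM_SIZE 16u           /* bytes per XMM / YMM_Hi128 slot */

/*
 * Assemble the 32-byte value of YMM register `index` from a raw,
 * standard-format XSAVE Area. The low 16 bytes come from the XMM
 * slot, the high 16 bytes from the YMM_Hi128 component.
 */
void xsave_read_ymm(const uint8_t *area, uint32_t avx_offset,
                    unsigned index, uint8_t out[32])
{
    assert(area != NULL && out != NULL);
    assert(index < 16);  /* 16 registers in 64-bit mode */
    memcpy(out, area + XSAVE_XMM_OFFSET + index * XMM_SIZE, XMM_SIZE);
    memcpy(out + XMM_SIZE, area + avx_offset + index * XMM_SIZE, XMM_SIZE);
}
```

A real debugger would additionally consult XSTATE_BV first: if the SSE or AVX bit is clear, the corresponding half is in its initial (zeroed) state and the buffer contents must not be trusted.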
The FreeBSD ptrace(2) API
FreeBSD has three dedicated ptrace(2)
requests related to XSAVE:
PT_GETXSTATE_INFO
, PT_GETXSTATE
and PT_SETXSTATE
.
PT_GETXSTATE_INFO
takes a pointer to struct ptrace_xstate_info
as the third argument (addr
), and its size as the fourth argument
(data
). It fills the structure with the enabled XSAVE component
bitmap and the maximum XSAVE Area length.
PT_GETXSTATE
and PT_SETXSTATE
take a pointer to the buffer
as the third argument (addr
) and its size as the fourth argument
(data
). The buffer uses the same layout as the XSAVE Area itself.
While FreeBSD provides an explicit API to get the buffer size, working
on the XSAVE Area itself still requires querying CPUID
to determine
the component offsets.
The NetBSD ptrace(2) API
NetBSD gained a ptrace(2)
API to access the XSAVE Area last
year. It consists of two requests, PT_GETXSTATE
and PT_SETXSTATE
. Both requests take a struct iovec
that
encapsulates a pointer to struct xstate
and its (current) size,
as the third argument (addr
). Similarly to other register requests
on NetBSD, the fourth argument (data
) specifies the LWP (thread)
identifier.
Unlike the other two systems, NetBSD does not use the raw XSAVE Area
but instead normalizes it into struct xstate
. The caller does not
need to worry about allocating an appropriately sized buffer or determining
the layout of the XSAVE Area. The current size of struct xstate
covers all currently supported components, and since it is passed along
with the request, new fields can be added without breaking backwards
compatibility. Furthermore, the kernel can switch to using XSAVES
in the future without changing the user-visible struct xstate
.
In addition, the NetBSD structure provides an explicit field to control
XSAVE Area updates more precisely. This makes it possible to issue
a partial PT_SETXSTATE
without having to copy the existing values
for everything else from PT_GETXSTATE
.
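The idea of a partial update can be illustrated with a toy structure that carries a request bitmap next to the register data. This is not the actual NetBSD implementation or its `struct xstate`; the structure, constants and function below are hypothetical and heavily simplified.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_X87       (UINT64_C(1) << 0)
#define TOY_SSE       (UINT64_C(1) << 1)
#define TOY_YMM_HI128 (UINT64_C(1) << 2)

/* HYPOTHETICAL, simplified stand-in for a normalized register struct. */
struct toy_xstate {
    uint64_t rfbm;            /* which components the request covers */
    uint8_t x87[80];          /* eight 80-bit x87 registers */
    uint8_t xmm[16][16];      /* lower halves of the YMM registers */
    uint8_t ymm_hi[16][16];   /* upper halves of the YMM registers */
};

/*
 * Apply `req` to the state of a thread: only the components named in
 * req->rfbm are overwritten, everything else is left untouched.
 */
void toy_setxstate(struct toy_xstate *thread, const struct toy_xstate *req)
{
    assert(thread != NULL && req != NULL);
    if (req->rfbm & TOY_X87)
        memcpy(thread->x87, req->x87, sizeof(thread->x87));
    if (req->rfbm & TOY_SSE)
        memcpy(thread->xmm, req->xmm, sizeof(thread->xmm));
    if (req->rfbm & TOY_YMM_HI128)
        memcpy(thread->ymm_hi, req->ymm_hi, sizeof(thread->ymm_hi));
}
```

With such an interface, a debugger that only wants to change the upper half of a YMM register fills in that one component and sets a single bit, rather than fetching the whole state first.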
Finally, NetBSD implements translation from both FSAVE
and FXSAVE
, making it possible to use PT_GETXSTATE
and PT_SETXSTATE
unconditionally on all x86 systems, going as far back
as NetBSD/i386 supports. It is therefore a good replacement for
both PT_GETFPREGS
and PT_GETXMMREGS
, eliminating
the inconsistency between i386 and amd64.
You can read more about the design of NetBSD XSAVE support in the LLDB: watchpoints, XSTATE in ptrace() and core dumps report.
An Example Multiplatform Program
The listing below provides an example program that reads a YMM register
via the XSTATE ptrace(2)
API and then writes a modified value back.
The program is using conditional #if
blocks to provide compatibility
with FreeBSD, NetBSD and Linux. As such, it primarily demonstrates
the differences between the interfaces provided by these Operating
Systems.
#include <sys/types.h>
#include <sys/ptrace.h>
#include <sys/uio.h>
#include <sys/wait.h>
#if defined(__NetBSD__)
#  include <x86/cpu_extended_state.h>
#  include <x86/specialreg.h>
#elif defined(__FreeBSD__)
#  include <x86/fpu.h>
#  include <x86/specialreg.h>
#elif defined(__linux__)
#  include <linux/elf.h>
#else
#  error "unsupported platform"
#endif
#include <assert.h>
#include <inttypes.h>
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>
#include <cpuid.h>

/* print a YMM register given as its lower (XMM) and upper halves */
void print_ymm(const char* name,
               uint8_t xmm[16],
               uint8_t ymm_hi[16]) {
    int i;
    printf("%20s: {", name);
    for (i = 0; i < 16; ++i)
        printf(" 0x%02x", xmm[i]);
    for (i = 0; i < 16; ++i)
        printf(" 0x%02x", ymm_hi[i]);
    printf(" }\n");
}

int main() {
    /* verify that AVX is supported */
    uint32_t eax, ebx, ecx, edx;
    if (!__get_cpuid(0x01, &eax, &ebx, &ecx, &edx) ||
            !(ecx & bit_AVX)) {
        printf("AVX not supported\n");
        return 1;
    }

#if !defined(__NetBSD__)
    /* get the YMM offset for systems using the raw XSAVE Area */
    assert (__get_cpuid_count(0x0d, 0x02, &eax, &ebx, &ecx, &edx));
    uint32_t avx_offset = ebx;
#endif
#if defined(__linux__)
    /* get the size of the XSAVE Area */
    assert (__get_cpuid_count(0x0d, 0x00, &eax, &ebx, &ecx, &edx));
    uint32_t xsave_size = ebx;
#endif

    int ret;
    pid_t pid = fork();
    assert(pid != -1);
    if (pid == 0) {
        /* child -- debugged program */
        uint8_t avx_bytes[32];
        int i;
        for (i = 0; i < (int)sizeof(avx_bytes); ++i)
            avx_bytes[i] = i;

        /* request tracing */
#if !defined(__linux__)
        ret = ptrace(PT_TRACE_ME, 0, NULL, 0);
#else
        ret = ptrace(PTRACE_TRACEME, 0, NULL, 0);
#endif
        assert(ret != -1);

        print_ymm("in child, initial", avx_bytes, avx_bytes+16);
        /* load YMM0, stop for the debugger, then store YMM0 back */
        __asm__ __volatile__ (
            "vmovups (%0), %%ymm0\n\t"
            "int3\n\t"
            "vmovups %%ymm0, (%0)\n\t"
            :
            : "b"(avx_bytes)
            : "%ymm0", "memory"
        );
        print_ymm("in child, modified", avx_bytes, avx_bytes+16);
        _exit(0);
    }

    /* parent -- the debugger */
    /* wait for the child to become ready for tracing */
    pid_t waited = waitpid(pid, &ret, 0);
    assert(waited == pid);
    assert(WIFSTOPPED(ret));
    assert(WSTOPSIG(ret) == SIGTRAP);

    /* get registers */
#if defined(__NetBSD__)
    struct xstate xst;
    struct iovec iov = { &xst, sizeof(xst) };
    ret = ptrace(PT_GETXSTATE, pid, &iov, 0);
#elif defined(__FreeBSD__)
    struct ptrace_xstate_info info;
    ret = ptrace(PT_GETXSTATE_INFO, pid,
                 (caddr_t)&info, sizeof(info));
    assert(ret == 0);
    char buf[info.xsave_len];
    ret = ptrace(PT_GETXSTATE, pid, buf, sizeof(buf));
#elif defined(__linux__)
    char buf[xsave_size];
    struct iovec iov = { buf, sizeof(buf) };
    ret = ptrace(PTRACE_GETREGSET, pid, NT_X86_XSTATE, &iov);
#endif
    assert(ret == 0);

    /* SSE+AVX registers should have been requested */
#if defined(__NetBSD__)
    assert(xst.xs_rfbm & XCR0_SSE);
    assert(xst.xs_rfbm & XCR0_YMM_Hi128);
#elif defined(__FreeBSD__)
    assert(info.xsave_mask & XFEATURE_ENABLED_SSE);
    assert(info.xsave_mask & XFEATURE_ENABLED_YMM_HI128);
#endif

    /* SSE+AVX registers should be in modified state */
#if defined(__NetBSD__)
    assert(xst.xs_xstate_bv & XCR0_SSE);
    assert(xst.xs_xstate_bv & XCR0_YMM_Hi128);
#elif defined(__FreeBSD__)
    /* the XSAVE header follows the 512-byte legacy region */
    struct xstate_hdr* xst = (struct xstate_hdr*)&buf[512];
    assert(xst->xstate_bv & XFEATURE_ENABLED_SSE);
    assert(xst->xstate_bv & XFEATURE_ENABLED_YMM_HI128);
#elif defined(__linux__)
    /* XSTATE_BV is the first field of the XSAVE header */
    uint64_t xstate_bv = *((uint64_t*)&buf[512]);
    assert(xstate_bv & 2); /* SSE */
    assert(xstate_bv & 4); /* YMM_Hi128 */
#endif

    /* locate both halves of YMM0 */
#if defined(__NetBSD__)
    uint8_t* xmm = xst.xs_fxsave.fx_xmm[0].xmm_bytes;
    uint8_t* ymm_hi = xst.xs_ymm_hi128.xs_ymm[0].ymm_bytes;
#elif defined(__FreeBSD__)
    uint8_t* xmm =
        ((struct savexmm*)buf)->sv_xmm[0].xmm_bytes;
    uint8_t* ymm_hi =
        ((struct ymmacc*)&buf[avx_offset])[0].ymm_bytes;
#elif defined(__linux__)
    /* XMM0 is stored at offset 160 of the legacy region */
    uint8_t* xmm = (uint8_t*)&buf[160];
    uint8_t* ymm_hi = (uint8_t*)&buf[avx_offset];
#endif

    print_ymm("from PT_GETXSTATE", xmm, ymm_hi);
    int i;
    for (i = 0; i < 16; ++i) {
        xmm[i] += 0x80;
        ymm_hi[i] += 0x80;
    }
    print_ymm("set via PT_SETXSTATE", xmm, ymm_hi);

    /* update the registers and resume the program */
#if defined(__NetBSD__)
    ret = ptrace(PT_SETXSTATE, pid, &iov, 0);
#elif defined(__FreeBSD__)
    ret = ptrace(PT_SETXSTATE, pid, buf, sizeof(buf));
#elif defined(__linux__)
    ret = ptrace(PTRACE_SETREGSET, pid, NT_X86_XSTATE, &iov);
#endif
    assert(ret == 0);
    ret = ptrace(PT_CONTINUE, pid, (void*)1, 0);
    assert(ret == 0);

    /* wait for the child to exit */
    waited = waitpid(pid, &ret, 0);
    assert(waited == pid);
    assert(WIFEXITED(ret));
    assert(WEXITSTATUS(ret) == 0);

    return 0;
}
Summary
The FXSAVE
instruction can store x86 registers up to the XMM
registers introduced with SSE. The XSAVE
and XRSTOR
family
of instructions can be used to save and restore the registers introduced
by newer instruction sets, e.g. the YMM registers introduced by AVX.
XSAVE
is specifically designed to allow introducing new register
sets without breaking backwards compatibility or requiring new variants
of the instruction.
The XSAVE
instructions are harder to use than the methods
described in the first part of the article. The user needs to specify
the requested State Components. Depending on the XSAVE Area format
used by the instruction, the user also needs to obtain or compute
the appropriate buffer size and State Component offsets.
The additional variants of the XSAVE
instruction primarily provide
optimizations that aim to improve the performance of context switching.
These include skipping register sets that are in their initial state
or that have not been modified since the last XRSTOR
call, as well
as using a more compact XSAVE Area format. The privileged XSAVES
variant introduces additional State Components that are only available
to the supervisor.
The new instructions required an appropriately extensible ptrace(2)
API. Unlike the requests for earlier register sets, the API for XSTATE
varies greatly between FreeBSD, NetBSD and Linux. Both FreeBSD
and Linux expose the raw XSAVE Area, while NetBSD normalizes it into
a well-defined struct xstate
. However, all three systems share
the concept of explicitly specifying the buffer size, in order
to support future extensions.
This concludes the two-part article on working with register sets
via the ptrace(2)
API. You should now have a rough idea why it is
necessary to save and restore the state of all registers on a system,
how the kernel does it on x86 and how the results are exposed to
the debugger. While the article was concerned only with x86,
and primarily on FreeBSD, NetBSD and Linux, this should give you good
foundations for further research. Many other architectures share
very similar concepts, and large parts of the ptrace(2)
API
are very similar across different architectures and Operating Systems
from the UNIX family.