Commit abbe628

publish: linux coredumpts part 1 (#553)
### Summary

Linux coredumps -- part 1!

### Test Plan

- [x] Check links
- [x] Check images
1 parent 82a3948 commit abbe628

1 file changed

+107
-94
lines changed

_drafts/linux_coredump.md renamed to _posts/2025-02-14-linux-coredumps-part-1.md

Lines changed: 107 additions & 94 deletions
@@ -1,22 +1,23 @@
11
---
2-
title: Coredumps at Memfault Part 1 - Introduction to Linux Coredumps
2+
title: Linux Coredumps (Part 1) - Introduction
33
description:
44
"The basics of Linux coredumps, how they're used at Memfault, and how they're
55
captured."
66
author: blake
7+
tags: [linux, coredumps, memfault, debugging]
78
---
89

910
One of the core features of the Memfault Linux SDK is the ability to capture and
10-
analyze crashes. Since the inception of the SDK we've been slowly expanding our
11+
analyze crashes. Since the inception of the SDK, we've been slowly expanding our
1112
crash capture and analysis capabilities. Starting from the standard ELF
12-
coredump, we've added support for capturing only the stack memory, and even
13+
coredump, we've added support for capturing only the stack memory and even
1314
capturing just the stack trace with no registers and locals present. This
14-
article series will give you a high level overview of that journey, and give you
15-
a deeper understanding of how coredumps work on Linux.\*\*\*\*
15+
article series will give you a high-level overview of that journey and a deeper
16+
understanding of how coredumps work on Linux.
1617

1718
<!-- excerpt start -->
1819

19-
In this article we'll start by taking a look at how a Linux coredump is
20+
In this article, we'll start by taking a look at how a Linux coredump is
2021
formatted, how you capture them, and how we use them at Memfault.
2122

2223
<!-- excerpt end -->
@@ -27,43 +28,43 @@ formatted, how you capture them, and how we use them at Memfault.
2728

2829
## What is a Linux Coredump
2930

30-
A linux coredump represents a snapshot of the crashing process' memory. It can
31+
A Linux coredump represents a snapshot of the crashing process' memory. It can
3132
be loaded into programs like GDB to inspect the state of the process at the time
3233
of crash. It is written as an ELF[^elf_format] file. The entirety of the ELF
3334
format is outside the scope of this article, but we will touch on a few of the
3435
more important bits when looking at an ELF core file.
3536

36-
## What triggers a cordump
37+
## What Triggers a Coredump
3738

3839
Coredumps are triggered by certain signals generated by or sent to a program.
3940
The full list of signals can be found in the signal man page[^man_signal]. Here
4041
are the signals that cause a coredump:
4142

42-
- SIGABRT: Abnormal termination of the program, such as a call to abort.
43-
- SIGBUS: Bus error (bad memory access).
44-
- SIGFPE: Floating-point exception.
45-
- SIGILL: Illegal instruction.
46-
- SIGQUIT: Quit from keyboard.
47-
- SIGSEGV: Invalid memory reference.
48-
- SIGSYS: Bad system call.
49-
- SIGTRAP: Trace/breakpoint trap.
50-
51-
Of these the most common culprits you'll likely see are `SIGSEGV`, `SIGBUS`, and
52-
`SIGABRT`. These are signals that will be generated when a program tries to
53-
access memory that it doesn't have access to, tries to dereference a null
54-
pointer, or when the program calls `abort`. These typically indicate a fairly
55-
serious bug in either your program, or the libraries that it uses.
56-
57-
Coredumps are very useful in these situations, as generally you're going to want
58-
to inspect the running state of the process a the time of crash. From the
59-
coredump you can get a backtrace of the crashing thread, the values of the
43+
- `SIGABRT`: Abnormal termination of the program, such as a call to abort.
44+
- `SIGBUS`: Bus error (bad memory access).
45+
- `SIGFPE`: Floating-point exception.
46+
- `SIGILL`: Illegal instruction.
47+
- `SIGQUIT`: Quit from keyboard.
48+
- `SIGSEGV`: Invalid memory reference.
49+
- `SIGSYS`: Bad system call.
50+
- `SIGTRAP`: Trace/breakpoint trap.
51+
52+
Of these, the most common culprits you'll likely see are `SIGSEGV`, `SIGBUS`,
53+
and `SIGABRT`. These signals will be generated when a program tries to access
54+
memory that it doesn't have access to, tries to dereference a null pointer, or
55+
when the program calls `abort`. These typically indicate a fairly serious bug in
56+
either your program or the libraries that it uses.
57+
58+
Coredumps are very useful in these situations, as generally, you're going to
59+
want to inspect the running state of the process at the time of crash. From the
60+
coredump, you can get a backtrace of the crashing thread, the values of the
6061
registers at the time of crash, and the values of the local variables at each
6162
frame of the backtrace.
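
To see one of these signals in action, here is a minimal sketch that launches a child Python interpreter, lets it call `os.abort()`, and reports which signal terminated it. The helper name `crash_signal` is made up for this illustration, and whether a core file is actually written depends on the kernel and limit configuration described in the next section.

```python
import signal
import subprocess
import sys


def crash_signal(child_code: str) -> str:
    """Run child_code in a fresh interpreter and name the signal that killed it."""
    result = subprocess.run(
        [sys.executable, "-c", child_code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    # subprocess reports death-by-signal as a negative return code.
    if result.returncode >= 0:
        return "exited normally"
    return signal.Signals(-result.returncode).name


if __name__ == "__main__":
    # os.abort() raises SIGABRT in the child, one of the core-dumping signals.
    print(crash_signal("import os; os.abort()"))
```

The same helper can be pointed at other failure modes to see which of the signals listed above they map to.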
6263

63-
## How are coredumps enabled/collected
64+
## How are Coredumps Enabled/Collected
6465

6566
Enabling coredumps on your Linux device requires a few configuration options. To
66-
start with you'll need the following options enabled on your kernel at a
67+
start with, you'll need the following options enabled on your kernel at a
6768
minimum:
6869

6970
```c
@@ -74,30 +75,42 @@ CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
7475
These settings will enable the kernel to generate coredumps, as well as set the
7576
default mappings that are present in the coredump. `man core`[^man_core]
7677
provides a good overview of the options available to you when configuring
77-
coredumps.
78+
coredumps. It's worth noting that these options are enabled for most distros by
79+
default.
7880

79-
### core_pattern
81+
In addition to the kernel configuration, you'll need to set the `ulimit` for the
82+
process that you want to capture a coredump for. The `ulimit` command is used to
83+
set the resource limits for a process. The `core` resource limit is the one
84+
we're interested in. This sets the maximum size of a coredump that can be
85+
generated by a process. To make things easy, you can set it to unlimited with
86+
the following command:
87+
88+
```bash
89+
ulimit -c unlimited
90+
```
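
A process can also adjust its own core limit programmatically. The sketch below, which assumes a Linux system and uses a hypothetical helper name `raise_core_limit`, raises the soft `RLIMIT_CORE` to whatever the hard limit allows. That only equals "unlimited" when the hard limit itself is unlimited.

```python
import resource


def raise_core_limit() -> tuple[int, int]:
    """Raise the soft RLIMIT_CORE to the hard limit and return the new pair."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # An unprivileged process may raise its soft limit only up to the hard limit.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)


if __name__ == "__main__":
    soft, hard = raise_core_limit()
    print("core limit:", "unlimited" if soft == resource.RLIM_INFINITY else soft)
```

Resource limits are inherited by child processes, so a supervisor that does this before spawning a service achieves the same effect as running `ulimit -c` in the launching shell.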
91+
92+
### `core_pattern`
8093

8194
The kernel provides an interface for controlling where and how coredumps are
8295
written. The `/proc/sys/kernel/core_pattern`[^man_core] file provides two
8396
methods for capturing coredumps from crashed processes. A coredump can be
84-
written directly to a file by providing a path directly to it. For example if we
85-
wanted to write the core file to our `/tmp` directory with both the process name
86-
and the pid we would write the following to `/proc/sys/kernel/core_pattern`.
97+
written directly to a file by providing its path. For example, if we wanted to
98+
write the core file to our `/tmp` directory with both the process name and the
99+
pid, we would write the following to `/proc/sys/kernel/core_pattern`.
87100

88101
```bash
89102
/tmp/core.%e.%p
90103
```
91104

92-
In this example `%e` expands to the name of the crashing process, and `%p`
105+
In this example, `%e` expands to the name of the crashing process, and `%p`
93106
expands to the PID of the crashing process. More information on the available
94107
expansions can be found in the `man core`[^man_core] page.
95108

96-
We can also pipe a coredump directly to a program. This is useful when we want
109+
We can also pipe a coredump directly to a program, which is useful when we want
97110
to modify the coredump in flight. The coredump is streamed to the provided
98-
program via `stdin`. The configuration is similar to saving directly to a file
111+
program via `stdin`. The configuration is similar to saving directly to a file,
99112
except the first character must be a `|`. This is how we capture coredumps in
100-
the Memfault SDK, and will be covered more in depth later in the article.
113+
the Memfault SDK, and will be covered in more depth later in this article.
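
The two modes described above can be told apart by the first character of the pattern. The following sketch is purely illustrative: the function names are hypothetical, and `expand_template` only mimics the `%e`, `%p`, and `%%` specifiers rather than everything the kernel supports.

```python
def parse_core_pattern(pattern: str) -> tuple[str, str]:
    """Classify a core_pattern value as ("pipe", command) or ("file", template)."""
    pattern = pattern.strip()
    if pattern.startswith("|"):
        return "pipe", pattern[1:]
    return "file", pattern


def expand_template(template: str, comm: str, pid: int) -> str:
    """Expand only %e, %p, and %%; other specifiers are left untouched."""
    out = []
    i = 0
    while i < len(template):
        ch = template[i]
        if ch == "%" and i + 1 < len(template):
            spec = template[i + 1]
            if spec == "e":
                out.append(comm)
            elif spec == "p":
                out.append(str(pid))
            elif spec == "%":
                out.append("%")
            else:
                out.append("%" + spec)  # not handled in this sketch
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


if __name__ == "__main__":
    with open("/proc/sys/kernel/core_pattern") as f:
        mode, value = parse_core_pattern(f.read())
    print(mode, value)
```

For the file example above, `expand_template("/tmp/core.%e.%p", "myapp", 1234)` yields `/tmp/core.myapp.1234`.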
101114

102115
#### `procfs` Shallow Dive
103116

@@ -106,32 +119,32 @@ program that is being piped to exits, we have access to the `procfs` of the
106119
crashing process. But what is `procfs`, and how does it help us with a coredump?
107120

108121
`procfs` gives us direct, usually read-only, access to some of the kernel's data
109-
structures[^man_proc]. This can be system wide information, or information about
110-
individual processes. For our purposes we are interested mostly in the
111-
information about the process that is currently crashing. We can get direct read
112-
only access to all mapped memory by address through
113-
`/proc/<pid>/mem`[^man_proc_pid_mem], or look at the command line arguments of
114-
the process through `/proc/<pid>/cmdline`[^man_proc_pid_cmdline].
122+
structures[^man_proc]. This can be system-wide information, or information about
123+
individual processes. We are mostly interested in information about the process
124+
that is currently crashing. We can get direct, read-only access to all mapped
125+
memory by address through `/proc/<pid>/mem`[^man_proc_pid_mem], or look at the
126+
command line arguments of the process through
127+
`/proc/<pid>/cmdline`[^man_proc_pid_cmdline].
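
As a rough illustration of these two `procfs` entries, the sketch below reads a process's arguments and a range of its memory. The function names are hypothetical, and `read_memory` only works when the caller has ptrace-level permission over the target and the address lies inside a mapped, readable region.

```python
import os


def read_cmdline(pid: int) -> list[str]:
    """Return the argv of a process from /proc/<pid>/cmdline."""
    with open(f"/proc/{pid}/cmdline", "rb") as f:
        raw = f.read()
    # Arguments are separated, and normally terminated, by NUL bytes.
    parts = raw.split(b"\0")
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode(errors="replace") for part in parts]


def read_memory(pid: int, address: int, length: int) -> bytes:
    """Read up to length bytes at a virtual address from /proc/<pid>/mem."""
    with open(f"/proc/{pid}/mem", "rb", buffering=0) as mem:
        mem.seek(address)
        return mem.read(length)


if __name__ == "__main__":
    print(read_cmdline(os.getpid()))
```

Because `/proc/<pid>/mem` is addressed by virtual address, a reader can seek straight to any mapped range without walking the data that precedes it.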
115128

116129
## Elf Core File Layout
117130

118131
Linux coredumps use a subset of the ELF format. The coredump itself is a
119132
snapshot of the crashing process' memory, as well as some metadata to help
120-
debuggers understand the state of the process at the time of crash. We will
133+
debuggers understand the state of the process at the time of the crash. We will
121134
touch on the most important aspects of the core file in this article. We will
122-
not be doing an exhaustive dive into the ELF format, however, if you are
135+
not be doing an exhaustive dive into the ELF format; however, if you are
123136
interested in learning more about the ELF format, the ELF File
124137
Format[^elf_format] is a great resource.
125138

126139
![]({% img_url linux-coredump/elf-core-layout.png %})
127140

128141
### ELF Header
129142

130-
The above image gives us a very high level view of the layout of a coredump. To
131-
start, the ELF header outlines the layout of the file and source of the file. We
132-
can see if the producing system was 32-bit or 64-bit, little or big endian, and
133-
the architecture of the system. Additionally it shows the offset to the program
134-
headers. Here is the layout of the ELF header[^elf_format]:
143+
The above image gives us a very high-level view of the layout of a coredump. To
144+
start, the ELF header outlines the layout of the file and the source of the
145+
file. We can see if the producing system was 32-bit or 64-bit, little or big
146+
endian, and the architecture of the system. Additionally, it shows the offset to
147+
the program headers. Here is the layout of the ELF header[^elf_format]:
135148

136149
```c
137150
typedef struct {
@@ -158,20 +171,20 @@ discussion are broken down below:
158171
- `e_ident`: This field is an array of bytes that identify the file as an ELF
159172
file.
160173
- `e_type`: This field tells us what type of file we are looking at. For our
161-
purposes this will always be `ET_CORE`.
174+
purposes, this will always be `ET_CORE`.
162175
- `e_machine`: This field tells us the architecture of the system that produced
163176
the file. Common values here are
164177
[`EM_ARM`](https://github.com/torvalds/linux/blob/c45323b7560ec87c37c729b703c86ee65f136d75/include/uapi/linux/elf-em.h#L26)
165-
for 32 bit ARM, and
178+
for 32-bit ARM, and
166179
[`EM_AARCH64`](https://github.com/torvalds/linux/blob/c45323b7560ec87c37c729b703c86ee65f136d75/include/uapi/linux/elf-em.h#L46)
167180
for aarch64.
168181
- `e_phoff`: This field tells us the offset to the program headers.
169182
- `e_phentsize`: This field tells us the size of each program header.
170183

171184
### Program Headers and Segments
172185

173-
The meat of our coredump exists in the program headers. There are a wide variety
174-
of program header types defined in the Elf File Format[^elf_format]. From the
186+
The meat of our coredump exists in the program headers. A wide variety of
187+
program header types are defined in the ELF File Format[^elf_format]. From the
175188
perspective of the coredump, however, we are primarily interested in the
176189
`PT_NOTE` and `PT_LOAD` program headers.
177190

@@ -192,8 +205,8 @@ typedef struct {
192205

193206
Here is a brief breakdown of the fields we care about in the program header:
194207

195-
- `p_type`: This field tells us what type of segment we are looking at. For our
196-
purposes this will be either `PT_NOTE` or `PT_LOAD`.
208+
- `p_type`: This field tells us what type of segment we are looking at. This
209+
will be either `PT_NOTE` or `PT_LOAD` for our purposes.
197210
- `p_offset`: This field tells us the offset from the beginning of the file
198211
where the segment starts.
199212
- `p_vaddr`: This field tells us the virtual address where the segment is
@@ -204,22 +217,22 @@ Here is a brief breakdown of the fields we care about in the program header:
204217
- `p_memsz`: This field tells us the size of the segment in memory.
205218
- `p_align`: This field tells us the alignment of the segment.
206219

207-
We'll start by taking a look at the format of the `PT_NOTE` segments. Below is
208-
the layout of a `PT_NOTE` segment.
220+
We'll start by looking at the format of the `PT_NOTE` segments. Below is the
221+
layout of a `PT_NOTE` segment.
209222

210223
![]({% img_url linux-coredump/elf-note-layout.png %})
211224

212-
The first two fields of the segment are fairly self explanatory, they represent
213-
the size of both the name and the descriptor. The `name` field is a string that
214-
represents the type of note. The `desc` field is a structure that contains the
225+
The first two fields of the segment are fairly self-explanatory: they represent
226+
the size of both the name and the descriptor. The `name` field is a string
227+
representing the type of note. The `desc` field is a structure that contains the
215228
actual data of the note. The `type` field is an unsigned integer that tells us
216229
what kind of note we are looking at. It's
217230
worth noting that the `name` field works as a kind of namespace for the type
218231
field. Two notes with the same type field can be differentiated by their name
219232
field.
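
To make the note layout concrete, here is a sketch that splits the raw bytes of a `PT_NOTE` segment into individual notes. It assumes little-endian data and the usual 4-byte padding of the name and descriptor. In the standard on-disk note header, the three 4-byte integers (name size, descriptor size, type) come first, followed by the padded name and then the padded descriptor. The function names are hypothetical.

```python
import struct


def align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_notes(segment: bytes) -> list[tuple[str, int, bytes]]:
    """Split a PT_NOTE segment into (name, type, desc) tuples."""
    notes = []
    offset = 0
    while offset + 12 <= len(segment):
        namesz, descsz, note_type = struct.unpack_from("<III", segment, offset)
        offset += 12
        # The name is NUL-terminated and padded to a 4-byte boundary.
        name = segment[offset:offset + namesz].rstrip(b"\0").decode()
        offset += align4(namesz)
        desc = segment[offset:offset + descsz]
        offset += align4(descsz)
        notes.append((name, note_type, desc))
    return notes
```

Two notes that share a type value but carry different names come back as distinct entries, which is the namespacing behavior described above.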
220233

221234
The `PT_LOAD` segment is a bit more straightforward. This represents a segment
222-
of memory that was loaded into the process at the time of crash. These can
235+
of memory that was loaded into the process at the time of the crash. These can
223236
represent either the stack, heap, or any other segment of memory that was loaded
224237
into the process.
225238

@@ -235,20 +248,20 @@ offering on MCU and Android, we needed a few basic things:
235248

236249
Based on what we've learned about Linux core files so far, they are an obvious
237250
fit for these requirements. We can use an established system to route
238-
information about crashed processes, add metadata that helps gives us
239-
information the device in question, and do all of this without making any source
240-
modifications to anything running on the system. For this reason our first pass
241-
at coredumps leave them largely untouched from what the kernel provides. The
242-
only addition is a note that contains the metadata we use to identify devices
243-
and the version of software they're running on. This takes advantage of the fact
244-
that the `PT_NOTE` segment is a free form segment that can be used to add any
245-
metadata we want to the coredump.
251+
information about crashed processes, add metadata that helps give us information
252+
about the device in question, and do all of this without making any source
253+
modifications to anything running on the system. For this reason, our first pass
254+
at coredumps leaves them largely untouched compared to what the kernel provides.
255+
The only addition is a note that contains the metadata we use to identify
256+
devices and the version of software they're running on. This takes advantage of
257+
the fact that the `PT_NOTE` segment is a free-form segment that can be used to
258+
add any metadata we want to the coredump.
246259

247260
This allows us to gather additional information about the process that crashed,
248261
and more easily stream memory to avoid unnecessary allocations or memory usage.
249262

250-
Now that we've covered all the background information we can start to dive into
251-
the innards of the `memfault-core-handler`. First we use the pipe operation that
263+
Now that we've covered all the background information, we can dive into the
264+
innards of the `memfault-core-handler`. First, we use the pipe operation that
252265
was outlined earlier.
253266
[Here](https://github.com/memfault/memfault-linux-sdk/blob/49adfe0ce0cb6082360012b0f0092a31e8030048/meta-memfault/recipes-memfault/memfaultd/files/memfaultd/src/coredump/mod.rs#L14)
254267
is the pattern we write to `/proc/sys/kernel/core_pattern` to pipe the coredump
@@ -258,27 +271,27 @@ to our handler:
258271
|/usr/sbin/memfault-core-handler -c /path/to/config %P %e %I %s
259272
```
260273

261-
This tells the kernel to pipe the coredump to our handler, and provides the
274+
This tells the kernel to pipe the coredump to our handler and provides the
262275
handler with the PID of the crashing process (`%P`), the name of the crashing
263-
process (%e), the UID of the crashing process (`%I`), and the signal that caused
264-
the crash (`%s`).
276+
process (`%e`), the TID of the crashing thread (`%I`), and the signal that
277+
caused the crash (`%s`).
265278

266-
When a crash occurs the kernel will write the coredump to the `stdin` of the
279+
When a crash occurs, the kernel will write the coredump to the `stdin` of the
267280
handler. The handler will then read all the program headers into memory. This
268-
sets us up to do two things. First we'll read all of the `PT_NOTE` segments and
281+
sets us up to do two things. First, we'll read all of the `PT_NOTE` segments and
269282
save them in memory. For the first iteration of the handler, we won't do
270283
anything further with them until we write them to a file. They'll become more
271284
important in later articles as we get into more of the special sauce of the
272285
handler.
273286

274287
The next thing the handler does is read all of the memory ranges for each
275-
`PT_LOAD` segment in the coredump. Instead of storing this in memory we'll
276-
stream it directly to the output file from `/proc/<pid>/mem`. This is done to
277-
reduce the memory footprint of the handler, and prevent any issues where we
278-
would potentially need to seek backwards in the stream. As mentioned before,
279-
`stdin` is a one way stream, and we can't seek backwards in it.
288+
`PT_LOAD` segment in the coredump. Instead of storing this in memory, we'll
289+
stream it directly to the output file from `/proc/<pid>/mem`. We do this to
290+
reduce the memory footprint of the handler and prevent any issues where we would
291+
potentially need to seek backwards in the stream. As mentioned before, `stdin`
292+
is a one-way stream, and we can't seek backwards in it.
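
The streaming idea can be sketched as follows. This is not the `memfault-core-handler` implementation, just an illustration in Python with hypothetical names: `mem` stands in for an open `/proc/<pid>/mem` file, and `segments` is a list of `(vaddr, size)` pairs taken from the `PT_LOAD` program headers.

```python
import io
from typing import BinaryIO, Iterable

CHUNK_SIZE = 64 * 1024


def stream_segments(
    mem: BinaryIO, out: BinaryIO, segments: Iterable[tuple[int, int]]
) -> int:
    """Copy (vaddr, size) ranges from a seekable memory file to out in chunks."""
    total = 0
    for vaddr, size in segments:
        mem.seek(vaddr)
        remaining = size
        while remaining > 0:
            # Never hold more than CHUNK_SIZE bytes of the segment in memory.
            chunk = mem.read(min(CHUNK_SIZE, remaining))
            if not chunk:
                break  # short read; a real handler would need to pad or record this
            out.write(chunk)
            remaining -= len(chunk)
            total += len(chunk)
    return total


if __name__ == "__main__":
    fake_mem = io.BytesIO(bytes(range(256)))
    sink = io.BytesIO()
    print(stream_segments(fake_mem, sink, [(16, 8)]))
```

Because each range is located by seeking in the memory file rather than in `stdin`, the ranges can be written in whatever order the output file needs.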
280293

281-
After we've written all of the `PT_LOAD` segments to the output file we should
294+
After we've written all of the `PT_LOAD` segments to the output file, we should
282295
have an ELF coredump that is largely the same as what the kernel would have
283296
written. The only difference is that we've added a note to the coredump, the
284297
contents of which we won't cover in this article, as it's not particularly
@@ -290,25 +303,25 @@ our previous ELF layout diagram with the changes we've made.
290303
![]({% img_url linux-coredump/elf-core-layout-annotated.png %})
291304

292305
And there we have it! We've copied our coredump over from `stdin` with a few
293-
minor changes. Now you're probably wondering, why did we go through all of this
306+
minor changes. Now, you're probably wondering: why did we go through all of this
294307
trouble to end up with a file that's largely the same as what the kernel would
295-
have produced? Well for one it allows us to add metadata to the coredump, but it
296-
also sets the stage for more advanced coredump handling in the future that we'll
297-
cover in the the next article.
308+
have produced? Well, for one, it allows us to add metadata to the coredump, but
309+
it also sets the stage for more advanced coredump handling in the future that
310+
we'll cover in the next article.
298311

299312
## Conclusion
300313

301-
We've covered the basics of coredumps on Linux, and how they're used in the
314+
We've covered the basics of coredumps on Linux and how they're used in the
302315
Memfault SDK. You should now have a pretty good idea of how things look under
303-
the hood. While the baseline coredumps are useful, and a known commodity, there
316+
the hood. While the baseline coredumps are useful and a known commodity, there
304317
are a few things that aren't great about them. The biggest issue is that they
305-
can be quite large for processes that have many threads, or do a large amount of
306-
memory allocation. This can be a large problem for embedded devices that may not
307-
have a lot of room to store large files. In the next article we'll take a look
308-
at the steps we've taken to reduce the size of coredumps.
318+
can be quite large for processes that have many threads or do a large amount of
319+
memory allocation. This can be a significant problem for embedded devices that
320+
may not have a lot of room to store large files. In the next article, we'll take
321+
a look at the steps we've taken to reduce the size of coredumps.
309322

310323
In the meantime, if you'd like to poke around the source code for the coredump
311-
handler you can find it
324+
handler, you can find it
312325
[here](https://github.com/memfault/memfaultd/tree/main/memfaultd/src/cli/memfault_core_handler).
313326

314327
<!-- Interrupt Keep START -->
