1
1
---
2
- title : Coredumps at Memfault Part 1 - Introduction to Linux Coredumps
2
+ title : Linux Coredumps (Part 1) - Introduction
3
3
description :
4
4
" The basics of Linux coredumps, how they're used at Memfault, and how they're
5
5
captured."
6
6
author : blake
7
+ tags : [linux, coredumps, memfault, debugging]
7
8
---
8
9
9
10
One of the core features of the Memfault Linux SDK is the ability to capture and
10
- analyze crashes. Since the inception of the SDK we've been slowly expanding our
11
+ analyze crashes. Since the inception of the SDK, we've been slowly expanding our
11
12
crash capture and analysis capabilities. Starting from the standard ELF
12
- coredump, we've added support for capturing only the stack memory, and even
13
+ coredump, we've added support for capturing only the stack memory and even
13
14
capturing just the stack trace with no registers and locals present. This
14
- article series will give you a high level overview of that journey, and give you
15
- a deeper understanding of how coredumps work on Linux.\*\*\*\*
15
+ article series will give you a high-level overview of that journey and a deeper
16
+ understanding of how coredumps work on Linux.
16
17
17
18
<!-- excerpt start -->
18
19
19
- In this article we'll start by taking a look at how a Linux coredump is
20
+ In this article, we'll start by taking a look at how a Linux coredump is
20
21
formatted, how you capture them, and how we use them at Memfault.
21
22
22
23
<!-- excerpt end -->
@@ -27,43 +28,43 @@ formatted, how you capture them, and how we use them at Memfault.
27
28
28
29
## What is a Linux Coredump
29
30
30
- A linux coredump represents a snapshot of the crashing process' memory. It can
31
+ A Linux coredump represents a snapshot of the crashing process' memory. It can
31
32
be loaded into programs like GDB to inspect the state of the process at the time
32
33
of crash. It is written as an ELF[ ^ elf_format ] file. The entirety of the ELF
33
34
format is outside the scope of this article, but we will touch on a few of the
34
35
more important bits when looking at an ELF core file.
35
36
36
- ## What triggers a cordump
37
+ ## What Triggers a Coredump
37
38
38
39
Coredumps are triggered by certain signals generated by or sent to a program.
39
40
The full list of signals can be found in the signal man page[ ^ man_signal ] . Here
40
41
are the signals that cause a coredump:
41
42
42
- - SIGABRT: Abnormal termination of the program, such as a call to abort.
43
- - SIGBUS: Bus error (bad memory access).
44
- - SIGFPE: Floating-point exception.
45
- - SIGILL: Illegal instruction.
46
- - SIGQUIT: Quit from keyboard.
47
- - SIGSEGV: Invalid memory reference.
48
- - SIGSYS: Bad system call.
49
- - SIGTRAP: Trace/breakpoint trap.
50
-
51
- Of these the most common culprits you'll likely see are ` SIGSEGV ` , ` SIGBUS ` , and
52
- ` SIGABRT ` . These are signals that will be generated when a program tries to
53
- access memory that it doesn't have access to, tries to dereference a null
54
- pointer, or when the program calls ` abort ` . These typically indicate a fairly
55
- serious bug in either your program, or the libraries that it uses.
56
-
57
- Coredumps are very useful in these situations, as generally you're going to want
58
- to inspect the running state of the process a the time of crash. From the
59
- coredump you can get a backtrace of the crashing thread, the values of the
43
+ - ` SIGABRT ` : Abnormal termination of the program, such as a call to abort.
44
+ - ` SIGBUS ` : Bus error (bad memory access).
45
+ - ` SIGFPE ` : Floating-point exception.
46
+ - ` SIGILL ` : Illegal instruction.
47
+ - ` SIGQUIT ` : Quit from keyboard.
48
+ - ` SIGSEGV ` : Invalid memory reference.
49
+ - ` SIGSYS ` : Bad system call.
50
+ - ` SIGTRAP ` : Trace/breakpoint trap.
51
+
52
+ Of these, the most common culprits you'll likely see are ` SIGSEGV ` , ` SIGBUS ` ,
53
+ and ` SIGABRT ` . These signals will be generated when a program tries to access
54
+ memory that it doesn't have access to, tries to dereference a null pointer, or
55
+ when the program calls ` abort ` . These typically indicate a fairly serious bug in
56
+ either your program or the libraries that it uses.
57
+
58
+ Coredumps are very useful in these situations, as you're generally going to
59
+ want to inspect the running state of the process at the time of crash. From the
60
+ coredump, you can get a backtrace of the crashing thread, the values of the
60
61
registers at the time of crash, and the values of the local variables at each
61
62
frame of the backtrace.
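
To make the signal list concrete, here is a minimal Python sketch that spawns a child process, lets it call `abort`, and reports which signal terminated it. The helper name `terminating_signal` is our own invention for this illustration. Whether a core file is actually written still depends on the resource limit and `core_pattern` settings on the machine.

```python
import signal
import subprocess
import sys
from typing import Optional


def terminating_signal(child_code: str) -> Optional[str]:
    """Run child_code in a fresh Python interpreter and return the name of
    the signal that killed it, or None if it exited on its own."""
    proc = subprocess.run(
        [sys.executable, "-c", child_code],
        stderr=subprocess.DEVNULL,
    )
    # On POSIX, subprocess reports death-by-signal as a negative return code.
    if proc.returncode < 0:
        return signal.Signals(-proc.returncode).name
    return None
```

Calling `terminating_signal("import os; os.abort()")` should report `SIGABRT`, one of the signals in the list above.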
62
63
63
- ## How are coredumps enabled/collected
64
+ ## How are Coredumps Enabled/Collected
64
65
65
66
Enabling coredumps on your Linux device requires a few configuration options. To
66
- start with you'll need the following options enabled on your kernel at a
67
+ start with, you'll need the following options enabled on your kernel at a
67
68
minimum:
68
69
69
70
``` c
@@ -74,30 +75,42 @@ CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS=y
74
75
These settings will enable the kernel to generate coredumps, as well as set the
75
76
default mappings that are present in the coredump. ` man core ` [ ^ man_core ]
76
77
provides a good overview of the options available to you when configuring
77
- coredumps.
78
+ coredumps. It's worth noting that these options are enabled for most distros by
79
+ default.
78
80
79
- ### core_pattern
81
+ In addition to the kernel configuration, you'll need to set the ` ulimit ` for the
82
+ process that you want to capture a coredump for. The ` ulimit ` command is used to
83
+ set the resource limits for a process. The ` core ` resource limit is the one
84
+ we're interested in. This sets the maximum size of a coredump that can be
85
+ generated by a process. To make things easy, you can set it to unlimited with
86
+ the following command:
87
+
88
+ ``` bash
89
+ ulimit -c unlimited
90
+ ```
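
A process can also adjust the same limit for itself. The sketch below uses Python's standard `resource` module, assuming a POSIX system; the function name `raise_core_limit` is hypothetical. It raises the soft limit only as far as the hard limit, which is the most an unprivileged process is allowed to do.

```python
import resource


def raise_core_limit() -> tuple:
    """Raise this process's soft RLIMIT_CORE to its hard limit and return
    the resulting (soft, hard) pair."""
    _soft, hard = resource.getrlimit(resource.RLIMIT_CORE)
    # The hard limit is the ceiling for the soft limit; if the hard limit is
    # resource.RLIM_INFINITY this is equivalent to `ulimit -c unlimited`.
    resource.setrlimit(resource.RLIMIT_CORE, (hard, hard))
    return resource.getrlimit(resource.RLIMIT_CORE)
```

If the hard limit is itself `0`, this call cannot help, and the limit has to be raised by whatever launches the process.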
91
+
92
+ ### ` core_pattern `
80
93
81
94
The kernel provides an interface for controlling where and how coredumps are
82
95
written. The ` /proc/sys/kernel/core_pattern ` [ ^ man_core ] file provides two
83
96
methods for capturing coredumps from crashed processes. A coredump can be
84
- written directly to a file by providing a path directly to it . For example if we
85
- wanted to write the core file to our ` /tmp ` directory with both the process name
86
- and the pid we would write the following to ` /proc/sys/kernel/core_pattern ` .
97
+ written directly to a file by providing its path. For example, if we wanted to
98
+ write the core file to our ` /tmp ` directory with both the process name and the
99
+ pid, we would write the following to ` /proc/sys/kernel/core_pattern ` .
87
100
88
101
``` bash
89
102
/tmp/core.%e.%p
90
103
```
91
104
92
- In this example ` %e ` expands to the name of the crashing process, and ` %p `
105
+ In this example, ` %e ` expands to the name of the crashing process, and ` %p `
93
106
expands to the PID of the crashing process. More information on the available
94
107
expansions can be found in the ` man core ` [ ^ man_core ] page.
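
To illustrate the expansion, here is a toy Python version that handles only `%e`, `%p`, and the literal `%%`. It is a simplified model of what the kernel does, not the kernel's implementation: every other specifier is dropped here, whereas the kernel supports many more. The function name `expand_core_pattern` is made up for this example.

```python
def expand_core_pattern(pattern: str, exe: str, pid: int) -> str:
    """Expand %e, %p and %% in a core_pattern-style template."""
    values = {"e": exe, "p": str(pid), "%": "%"}
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%":
            # Take the character after '%' (empty if '%' is the last one).
            spec = pattern[i + 1:i + 2]
            # Specifiers this sketch does not model, and a trailing '%',
            # expand to nothing.
            out.append(values.get(spec, ""))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With the pattern from above, `expand_core_pattern("/tmp/core.%e.%p", "myapp", 1234)` produces `/tmp/core.myapp.1234`.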
95
108
96
- We can also pipe a coredump directly to a program. This is useful when we want
109
+ We can also pipe a coredump directly to a program, which is useful when we want
97
110
to modify the coredump in flight. The coredump is streamed to the provided
98
- program via ` stdin ` . The configuration is similar to saving directly to a file
111
+ program via ` stdin ` . The configuration is similar to saving directly to a file,
99
112
except the first character must be a ` | ` . This is how we capture coredumps in
100
- the Memfault SDK, and will be covered more in depth later in the article.
113
+ the Memfault SDK, and will be covered in more depth later in this article.
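
As a rough idea of what a piped handler looks like, the Python sketch below copies a stream to a file in fixed-size chunks. It is not the Memfault handler, which is written in Rust. The script path in the comment, the output location, and the names `copy_stream` and `main` are all assumptions made for illustration.

```python
from typing import BinaryIO


def copy_stream(src: BinaryIO, dst: BinaryIO, chunk_size: int = 64 * 1024) -> int:
    """Copy src to dst in chunks and return the number of bytes copied."""
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # end of stream: the kernel has finished writing
            break
        dst.write(chunk)
        total += len(chunk)
    return total


def main(argv: list, stdin: BinaryIO) -> None:
    # Hypothetical core_pattern entry: |/usr/local/bin/save_core.py %p
    pid = argv[1]
    with open(f"/tmp/core.{pid}", "wb") as out:
        copy_stream(stdin, out)
```

A real script would call `main(sys.argv, sys.stdin.buffer)`. Reading in chunks keeps memory use flat regardless of how large the coredump is.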
101
114
102
115
#### ` procfs ` Shallow Dive
103
116
@@ -106,32 +119,32 @@ program that is being piped to exits, we have access to the `procfs` of the
106
119
crashing process. But what is ` procfs ` , and how does it help us with a coredump?
107
120
108
121
` procfs ` gives us direct, usually read-only, access to some of the kernel's data
109
- structures[ ^ man_proc ] . This can be system wide information, or information about
110
- individual processes. For our purposes we are interested mostly in the
111
- information about the process that is currently crashing. We can get direct read
112
- only access to all mapped memory by address through
113
- ` /proc/<pid>/mem ` [ ^ man_proc_pid_mem ] , or look at the command line arguments of
114
- the process through ` /proc/<pid>/cmdline ` [ ^ man_proc_pid_cmdline ] .
122
+ structures[ ^ man_proc ] . This can be system-wide information, or information about
123
+ individual processes. We are mostly interested in information about the process
124
+ that is currently crashing. We can get direct, read-only access to all mapped
125
+ memory by address through ` /proc/<pid>/mem ` [ ^ man_proc_pid_mem ] , or look at the
126
+ command line arguments of the process through
127
+ ` /proc/<pid>/cmdline ` [ ^ man_proc_pid_cmdline ] .
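
For example, `/proc/<pid>/cmdline` holds the arguments as a run of NUL-terminated strings. A small Python sketch of reading it might look like this; the function names are ours, and the decoding policy (replacing invalid bytes) is an arbitrary choice for the example.

```python
from pathlib import Path


def parse_cmdline(raw: bytes) -> list:
    """Split the NUL-separated contents of /proc/<pid>/cmdline into arguments."""
    if not raw:
        # Kernel threads and zombies expose an empty cmdline.
        return []
    if raw.endswith(b"\0"):
        raw = raw[:-1]  # drop the terminator of the last argument
    return [arg.decode(errors="replace") for arg in raw.split(b"\0")]


def read_cmdline(pid: int) -> list:
    """Read and parse the command line of a running process (Linux only)."""
    return parse_cmdline(Path(f"/proc/{pid}/cmdline").read_bytes())
```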
115
128
116
129
## ELF Core File Layout
117
130
118
131
Linux coredumps use a subset of the ELF format. The coredump itself is a
119
132
snapshot of the crashing process' memory, as well as some metadata to help
120
- debuggers understand the state of the process at the time of crash. We will
133
+ debuggers understand the state of the process at the time of the crash. We will
121
134
touch on the most important aspects of the core file in this article. We will
122
- not be doing an exhaustive dive into the ELF format, however, if you are
135
+ not be doing an exhaustive dive into the ELF format; however, if you are
123
136
interested in learning more about the ELF format, the ELF File
124
137
Format[ ^ elf_format ] is a great resource.
125
138
126
139
![ ] ({% img_url linux-coredump/elf-core-layout.png %})
127
140
128
141
### ELF Header
129
142
130
- The above image gives us a very high level view of the layout of a coredump. To
131
- start, the ELF header outlines the layout of the file and source of the file. We
132
- can see if the producing system was 32-bit or 64-bit, little or big endian, and
133
- the architecture of the system. Additionally it shows the offset to the program
134
- headers. Here is the layout of the ELF header[ ^ elf_format ] :
143
+ The above image gives us a very high-level view of the layout of a coredump. To
144
+ start, the ELF header outlines the layout of the file and the source of the
145
+ file. We can see if the producing system was 32-bit or 64-bit, little or big
146
+ endian, and the architecture of the system. Additionally, it shows the offset to
147
+ the program headers. Here is the layout of the ELF header[ ^ elf_format ] :
135
148
136
149
``` c
137
150
typedef struct {
@@ -158,20 +171,20 @@ discussion are broken down below:
158
171
- ` e_ident ` : This field is an array of bytes that identify the file as an ELF
159
172
file.
160
173
- ` e_type ` : This field tells us what type of file we are looking at. For our
161
- purposes this will always be ` ET_CORE ` .
174
+ purposes, this will always be ` ET_CORE ` .
162
175
- ` e_machine ` : This field tells us the architecture of the system that produced
163
176
the file. Common values here are
164
177
[ ` EM_ARM ` ] ( https://github.com/torvalds/linux/blob/c45323b7560ec87c37c729b703c86ee65f136d75/include/uapi/linux/elf-em.h#L26 )
165
- for 32 bit ARM, and
178
+ for 32-bit ARM, and
166
179
[ ` EM_AARCH64 ` ] ( https://github.com/torvalds/linux/blob/c45323b7560ec87c37c729b703c86ee65f136d75/include/uapi/linux/elf-em.h#L46 )
167
180
for aarch64.
168
181
- ` e_phoff ` : This field tells us the offset to the program headers.
169
182
- ` e_phentsize ` : This field tells us the size of each program header.
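
The fields above can be pulled out of a file with a few lines of Python. This is a minimal sketch that assumes a 64-bit little-endian file and the standard `Elf64_Ehdr` layout; the function name and the returned dictionary are our own choices. It also returns `e_phnum`, the number of program headers, which sits alongside `e_phentsize` in the header.

```python
import struct

ET_CORE = 4        # e_type value for core files
EM_AARCH64 = 183   # e_machine value for aarch64

# Elf64_Ehdr, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx
ELF64_EHDR = struct.Struct("<16sHHIQQQIHHHHHH")


def parse_elf64_header(data: bytes) -> dict:
    """Return the header fields discussed above from the start of an ELF file."""
    fields = ELF64_EHDR.unpack_from(data)
    e_ident = fields[0]
    if e_ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[4] is the class (2 = 64-bit), e_ident[5] the byte order (1 = little).
    if e_ident[4] != 2 or e_ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian files")
    return {
        "e_type": fields[1],
        "e_machine": fields[2],
        "e_phoff": fields[5],
        "e_phentsize": fields[9],
        "e_phnum": fields[10],
    }
```

For a coredump, `e_type` should come back as `ET_CORE`.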
170
183
171
184
### Program Headers and Segments
172
185
173
- The meat of our coredump exists in the program headers. There are a wide variety
174
- of program header types defined in the Elf File Format[ ^ elf_format ] . From the
186
+ The meat of our coredump exists in the program headers. A wide variety of
187
+ program header types are defined in the ELF File Format[ ^ elf_format ] . From the
175
188
perspective of the coredump, however, we are primarily interested in the
176
189
` PT_NOTE ` and ` PT_LOAD ` program headers.
177
190
@@ -192,8 +205,8 @@ typedef struct {
192
205
193
206
Here is a brief breakdown of the fields we care about in the program header:
194
207
195
- - ` p_type ` : This field tells us what type of segment we are looking at. For our
196
- purposes this will be either ` PT_NOTE ` or ` PT_LOAD ` .
208
+ - ` p_type ` : This field tells us what type of segment we are looking at. This
209
+ will be either ` PT_NOTE ` or ` PT_LOAD ` for our purposes.
197
210
- ` p_offset ` : This field tells us the offset from the beginning of the file
198
211
where the segment starts.
199
212
- ` p_vaddr ` : This field tells us the virtual address where the segment is
@@ -204,22 +217,22 @@ Here is a brief breakdown of the fields we care about in the program header:
204
217
- ` p_memsz ` : This field tells us the size of the segment in memory.
205
218
- ` p_align ` : This field tells us the alignment of the segment.
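
A sketch of walking the program header table in Python follows, again assuming the 64-bit little-endian `Elf64_Phdr` layout. The function name is hypothetical, and `e_phoff` and `e_phnum` are taken from the ELF header.

```python
import struct

PT_LOAD = 1
PT_NOTE = 4

# Elf64_Phdr, little-endian: p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align
ELF64_PHDR = struct.Struct("<IIQQQQQQ")


def parse_program_headers(data: bytes, e_phoff: int, e_phnum: int) -> list:
    """Decode e_phnum program headers starting at file offset e_phoff."""
    headers = []
    for index in range(e_phnum):
        (p_type, _p_flags, p_offset, p_vaddr, _p_paddr,
         p_filesz, p_memsz, p_align) = ELF64_PHDR.unpack_from(
            data, e_phoff + index * ELF64_PHDR.size)
        headers.append({
            "p_type": p_type,
            "p_offset": p_offset,
            "p_vaddr": p_vaddr,
            "p_filesz": p_filesz,
            "p_memsz": p_memsz,
            "p_align": p_align,
        })
    return headers
```

Filtering the result on `p_type == PT_LOAD` gives the memory ranges captured in the coredump.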
206
219
207
- We'll start by taking a look at the format of the ` PT_NOTE ` segments. Below is
208
- the layout of a ` PT_NOTE ` segment.
220
+ We'll start by looking at the format of the ` PT_NOTE ` segments. Below is the
221
+ layout of a ` PT_NOTE ` segment.
209
222
210
223
![ ] ({% img_url linux-coredump/elf-note-layout.png %})
211
224
212
- The first two fields of the segment are fairly self explanatory, they represent
213
- the size of both the name and the descriptor. The ` name ` field is a string that
214
- represents the type of note. The ` desc ` field is a structure that contains the
225
+ The first two fields of the segment are fairly self-explanatory: they represent
226
+ the size of both the name and the descriptor. The ` name ` field is a string
227
+ representing the type of note. The ` desc ` field is a structure that contains the
215
228
actual data of the note. The ` type ` field tells us what type of note we are
216
229
looking at. It is an unsigned integer that represents the type of note. It's
217
230
worth noting that the ` name ` field works as a kind of namespace for the type
218
231
field. Two notes with the same type field can be differentiated by their name
219
232
field.
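
The note layout can be decoded with a short loop. The Python sketch below assumes little-endian data and that `name` and `desc` are each padded to a 4-byte boundary, which is how we understand Linux core files to be laid out; the function names are made up for the example.

```python
import struct


def _align4(n: int) -> int:
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_notes(segment: bytes) -> list:
    """Decode a PT_NOTE segment into (name, type, desc) tuples."""
    notes = []
    offset = 0
    while offset + 12 <= len(segment):
        # Fixed part of each note: name size, descriptor size, type.
        namesz, descsz, note_type = struct.unpack_from("<III", segment, offset)
        offset += 12
        name = segment[offset:offset + namesz].rstrip(b"\0").decode()
        offset += _align4(namesz)
        desc = segment[offset:offset + descsz]
        offset += _align4(descsz)
        notes.append((name, note_type, desc))
    return notes
```

Two notes that share a type value but carry different names come back as distinct entries, which is the namespacing behaviour described above.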
220
233
221
234
The ` PT_LOAD ` segment is a bit more straightforward. This represents a segment
222
- of memory that was loaded into the process at the time of crash. These can
235
+ of memory that was loaded into the process at the time of the crash. These can
223
236
represent either the stack, heap, or any other segment of memory that was loaded
224
237
into the process.
225
238
@@ -235,20 +248,20 @@ offering on MCU and Android, we needed a few basic things:
235
248
236
249
Based on what we've learned about Linux core files so far, they are an obvious
237
250
fit for these requirements. We can use an established system to route
238
- information about crashed processes, add metadata that helps gives us
239
- information the device in question, and do all of this without making any source
240
- modifications to anything running on the system. For this reason our first pass
241
- at coredumps leave them largely untouched from what the kernel provides. The
242
- only addition is a note that contains the metadata we use to identify devices
243
- and the version of software they're running on. This takes advantage of the fact
244
- that the ` PT_NOTE ` segment is a free form segment that can be used to add any
245
- metadata we want to the coredump.
251
+ information about crashed processes, add metadata that helps give us information
252
+ about the device in question, and do all of this without making any source
253
+ modifications to anything running on the system. For this reason, our first pass
254
+ at coredumps leaves them largely untouched compared to what the kernel provides.
255
+ The only addition is a note that contains the metadata we use to identify
256
+ devices and the version of software they're running on. This takes advantage of
257
+ the fact that the ` PT_NOTE ` segment is a free-form segment that can be used to
258
+ add any metadata we want to the coredump.
246
259
247
260
This allows us to gather additional information about the process that crashed,
248
261
and more easily stream memory to avoid unnecessary allocations or memory usage.
249
262
250
- Now that we've covered all the background information we can start to dive into
251
- the innards of the ` memfault-core-handler ` . First we use the pipe operation that
263
+ Now that we've covered all the background information, we can dive into the
264
+ innards of the ` memfault-core-handler ` . First, we use the pipe operation that
252
265
was outlined earlier.
253
266
[ Here] ( https://github.com/memfault/memfault-linux-sdk/blob/49adfe0ce0cb6082360012b0f0092a31e8030048/meta-memfault/recipes-memfault/memfaultd/files/memfaultd/src/coredump/mod.rs#L14 )
254
267
is the pattern we write to ` /proc/sys/kernel/core_pattern ` to pipe the coredump
@@ -258,27 +271,27 @@ to our handler:
258
271
| /usr/sbin/memfault-core-handler -c /path/to/config %P %e %I %s
259
272
```
260
273
261
- This tells the kernel to pipe the coredump to our handler, and provides the
274
+ This tells the kernel to pipe the coredump to our handler and provides the
262
275
handler with the PID of the crashing process (` %P ` ), the name of the crashing
263
- process (%e ), the UID of the crashing process (` %I ` ), and the signal that caused
264
- the crash (` %s ` ).
276
+ process (` %e ` ), the TID of the thread that triggered the dump (` %I ` ), and the signal that
277
+ caused the crash (` %s ` ).
265
278
266
- When a crash occurs the kernel will write the coredump to the ` stdin ` of the
279
+ When a crash occurs, the kernel will write the coredump to the ` stdin ` of the
267
280
handler. The handler will then read all the program headers into memory. This
268
- sets us up to do two things. First we'll read all of the ` PT_NOTE ` segments and
281
+ sets us up to do two things. First, we'll read all of the ` PT_NOTE ` segments and
269
282
save them in memory. For the first iteration of the handler, we won't do
270
283
anything further with them until we write them to a file. They'll become more
271
284
important in later articles as we get into more of the special sauce of the
272
285
handler.
273
286
274
287
The next thing the handler does is read all of the memory ranges for each
275
- ` PT_LOAD ` segment in the coredump. Instead of storing this in memory we'll
276
- stream it directly to the output file from ` /proc/<pid>/mem ` . This is done to
277
- reduce the memory footprint of the handler, and prevent any issues where we
278
- would potentially need to seek backwards in the stream. As mentioned before,
279
- ` stdin ` is a one way stream, and we can't seek backwards in it.
288
+ ` PT_LOAD ` segment in the coredump. Instead of storing this in memory, we'll
289
+ stream it directly to the output file from ` /proc/<pid>/mem ` . We do this to
290
+ reduce the memory footprint of the handler and prevent any issues where we would
291
+ potentially need to seek backwards in the stream. As mentioned before, ` stdin `
292
+ is a one-way stream, and we can't seek backwards in it.
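
The streaming step can be sketched in Python as a seek followed by a bounded chunked copy. This is only an illustration of the idea, not the handler's Rust code; `stream_range` and its parameters are hypothetical names. In practice `mem` would be `/proc/<pid>/mem` opened in binary mode, with `start` and `size` taken from a `PT_LOAD` header's `p_vaddr` and `p_memsz`.

```python
from typing import BinaryIO


def stream_range(mem: BinaryIO, out: BinaryIO, start: int, size: int,
                 chunk_size: int = 4096) -> int:
    """Copy size bytes beginning at offset start from mem to out, one chunk
    at a time, and return the number of bytes actually copied."""
    mem.seek(start)
    remaining = size
    copied = 0
    while remaining > 0:
        chunk = mem.read(min(chunk_size, remaining))
        if not chunk:
            break  # short read: the rest of the range was not available
        out.write(chunk)
        copied += len(chunk)
        remaining -= len(chunk)
    return copied
```

Only one chunk is held in memory at a time, so the cost of copying a segment does not grow with its size. Unlike `stdin`, the memory file is seekable, so segments can be read in any order.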
280
293
281
- After we've written all of the ` PT_LOAD ` segments to the output file we should
294
+ After we've written all of the ` PT_LOAD ` segments to the output file, we should
282
295
have an ELF coredump that is largely the same as what the kernel would have
283
296
written. The only difference is that we've added a note to the coredump, the
284
297
contents of which we won't cover in this article, as it's not particularly
@@ -290,25 +303,25 @@ our previous ELF layout diagram with the changes we've made.
290
303
![ ] ({% img_url linux-coredump/elf-core-layout-annotated.png %})
291
304
292
305
And there we have it! We've copied our coredump over from ` stdin ` with a few
293
- minor changes. Now you're probably wondering, why did we go through all of this
306
+ minor changes. Now, you're probably wondering: why did we go through all of this
294
307
trouble to end up with a file that's largely the same as what the kernel would
295
- have produced? Well for one it allows us to add metadata to the coredump, but it
296
- also sets the stage for more advanced coredump handling in the future that we'll
297
- cover in the the next article.
308
+ have produced? Well, for one, it allows us to add metadata to the coredump, but
309
+ it also sets the stage for more advanced coredump handling in the future that
310
+ we'll cover in the next article.
298
311
299
312
## Conclusion
300
313
301
- We've covered the basics of coredumps on Linux, and how they're used in the
314
+ We've covered the basics of coredumps on Linux and how they're used in the
302
315
Memfault SDK. You should now have a pretty good idea of how things look under
303
- the hood. While the baseline coredumps are useful, and a known commodity, there
316
+ the hood. While the baseline coredumps are useful and a known commodity, there
304
317
are a few things that aren't great about them. The biggest issue is that they
305
- can be quite large for processes that have many threads, or do a large amount of
306
- memory allocation. This can be a large problem for embedded devices that may not
307
- have a lot of room to store large files. In the next article we'll take a look
308
- at the steps we've taken to reduce the size of coredumps.
318
+ can be quite large for processes that have many threads or do a large amount of
319
+ memory allocation. This can be a significant problem for embedded devices that
320
+ may not have a lot of room to store large files. In the next article, we'll take
321
+ a look at the steps we've taken to reduce the size of coredumps.
309
322
310
323
In the meantime, if you'd like to poke around the source code for the coredump
311
- handler you can find it
324
+ handler, you can find it
312
325
[ here] ( https://github.com/memfault/memfaultd/tree/main/memfaultd/src/cli/memfault_core_handler ) .
313
326
314
327
<!-- Interrupt Keep START -->