You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Note that this function is memory-safe only if the `box` parameter follows some invariants, in particular:
88
-
- The list of `message`s is circular (the code never tests the `next` field to see if it can be a null pointer)
89
-
- Each `message` points to a `buffer` whose size corresponds to `box->length`.
90
+
1. Each `message` points to a `buffer` whose size corresponds to `box->length`.
91
+
2. The list of `message`s is circular (the code never tests the `next` field to see if it can be a null pointer)
92
+
90
93
91
94
So, if we try to analyze this code as is, Codex will correctly report that the code is not memory safe. Indeed, a main feature of Codex is that the analysis is sound: if there is a spatial memory safety issue, it should report it.
92
95
93
96
Luckily, it is easy to express the required invariants in our type system. It suffices to copy the header file, and edit it as follows:
94
97
95
98
```c
96
-
structmessage(len) {
99
+
structmessage(len) {
97
100
struct message(len)+ next;
98
101
char[len]+ buffer;
99
102
};
100
103
101
-
∃ mlen:integer with self > 0.
102
-
struct message_box {
104
+
∃ mlen:integer with self > 0.
105
+
struct message_box {
103
106
(integer with self = mlen) length;
104
107
struct message(mlen)+ first;
105
108
};
@@ -123,12 +126,20 @@ Now, this updated header file is not for the C compiler, but is used
123
126
by our Codex tool, that can now verify that `zeros_buffer` is
124
127
memory-safe (as it does not report any alarm) automatically. Note that
125
128
this proof relies on the hypothesis that the `box` argument of
126
-
`zeros_buffer` correspond to the memory layout described by the types;
127
-
but that in any analyzed function that would call `zeros_buffer`, we
128
-
would check this hypothesis. Thus, if you verify all the functions in
129
-
a program, we prove it memory-safe.
130
-
131
-
Finally, this verification of `zeros_buffer` can be make not only on
129
+
`zeros_buffer` correspond to the memory layout described by the types.
130
+
This assumption is checked in any analyzed function that would call `zeros_buffer`.
131
+
Thus, if you verify all the functions in a program, we prove it memory-safe.
132
+
133
+
{: .note }
134
+
While codex **ensures spatial memory safety** (no invalid pointer read/write),
135
+
it does **not ensure termination**.
136
+
Even with our given types, the `zeros_buffer` function may loop infinitely.
137
+
Indeed, we cannot express the invariant stating the list is circular. It is sort
138
+
of implied by the constraints that the `next` pointer is never null, since memory
139
+
is finite, the list will eventually reach a loop. However, we may have a lasso-shape,
140
+
where the first few `message`s are not part of that loop.
141
+
142
+
Finally, this verification of `zeros_buffer` can be made not only on
132
143
the C source code, but also on the compiled machine code, i.e. Codex
133
144
can perform type-checking of both C and machine code automatically!
134
145
@@ -140,7 +151,7 @@ inspired by that of C, as the basis for this abstraction. While
140
151
initial versions of this type system have been proposed in
141
152
[VMCAI'22](/papers/2022-vcmai-lightweight-shape-analysis.html) and
142
153
used in [RTAS'21](/papers/2021-rtas-no-crash-no-exploit.html), this
143
-
paper extends it significatively with new features like support for
154
+
paper extends it significantly with new features like support for
144
155
union, parameterized, and existential types. The paper shows how to
145
156
combine all these features to encode many complex low-level idioms,
146
157
such as flexible array members or discriminated unions using a memory
@@ -154,7 +165,7 @@ memory area whose length corresponds to the contents of this
154
165
integer". Thus, a type system that can be used to guarantee memory
155
166
safety must use dependent types. This makes type checking particularly
156
167
complex, which is why we use abstract interpretation to type-check the
157
-
program. Abstract interpretation also allows automatical infererence
168
+
program. Abstract interpretation also allows automatic inference
158
169
of other kinds of program invariants (beyond those expressed by the
159
170
type system), that helps the overall analysis to type-check the
0 commit comments