Skip to content

Commit 2e6e0e8

Browse files
author
Daniel Kroening
committed
manual: add section on the property generator
1 parent 39559f4 commit 2e6e0e8

File tree

2 files changed

+308
-3
lines changed

2 files changed

+308
-3
lines changed

doc/cprover-manual/index.md

Lines changed: 5 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -12,17 +12,19 @@
1212

1313
## 4. [Test Suite Generation](test-suite/)
1414

15-
## 5. Modeling
15+
## 5. [Program Properties](properties/)
16+
17+
## 6. Modeling
1618

1719
[Nondeterminism](modeling/nondeterminism/),
1820
[Assumptions](modeling/assumptions/),
1921
[Pointers](modeling/pointers/),
2022
[Floating Point](modeling/floating-point/)
2123

22-
## 6. Build Systems
24+
## 7. Build Systems
2325

2426
[Integration into Build Systems with goto-cc](goto-cc/),
2527
[Integration with Visual Studio builds](visual-studio/)
2628

27-
## 7. [The CPROVER API Reference](api/)
29+
## 8. [The CPROVER API Reference](api/)
2830

doc/cprover-manual/properties.md

Lines changed: 303 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,303 @@
1+
## Automatically Generating Properties
2+
3+
### What is a "Property"?
4+
5+
We have mentioned *properties* several times so far, but we never
6+
explained *what* kind of properties CBMC can verify. We cover
7+
this topic in more detail in this section.
8+
9+
CBMC uses
10+
[assertions](http://en.wikipedia.org/wiki/Assertion_(computing)) to
11+
specify program properties. Assertions are properties of the state of
12+
the program when the program reaches a particular program location.
13+
Assertions are often written by the programmer by means of the `assert`
14+
macro.
15+
16+
In addition to the assertions written by the programmer, assertions for
17+
specific properties can also be generated automatically by CBMC, often
18+
relieving the programmer from expressing properties that should hold in any
19+
well-behaved program.
20+
21+
CBMC comes with an assertion generator, which performs a conservative [static
22+
analysis](http://en.wikipedia.org/wiki/Static_code_analysis) to determine
23+
program locations that potentially contain a bug. Due to the imprecision of
24+
the static analysis, it is important to emphasize that these generated
25+
assertions are only *potential* bugs, and that the Model Checker first needs
26+
to confirm that they are indeed genuine bugs.
27+
28+
The assertion generator can generate assertions for the verification of
29+
the following properties:
30+
31+
- **Buffer overflows.** For each array access, check whether the upper
32+
and lower bounds are violated.
33+
34+
- **Pointer safety.** Search for `NULL`-pointer dereferences or
35+
dereferences of other invalid pointers.
36+
37+
- **Memory leaks.** Check whether the program constructs dyanamically
38+
allocated data structures that are subsequently inaccessible.
39+
40+
- **Division by zero.** Check whether there is a division by zero in
41+
the program.
42+
43+
- **Not-a-Number.** Check whether floating-point computation may
44+
result in [NaNs](http://en.wikipedia.org/wiki/NaN).
45+
46+
- **Arithmetic overflow.** Check whether a numerical overflow occurs
47+
during an arithmetic operation.
48+
49+
- **Undefined shifts.** Check for shifts with excessive distance.
50+
51+
We refrain from explaining the properties above in detail. Most of them
52+
relate to behaviors that are left undefined by the respective language
53+
semantics. For a discussion on why these behaviors are usually very
54+
undesirable, read [this](http://blog.regehr.org/archives/213) blog post
55+
by John Regehr.
56+
57+
All the properties described above are *reachability* properties. They
58+
are always of the form
59+
60+
"*Is there a path through the program such that property ... is
61+
violated?*"
62+
63+
The counterexamples to such properties are always program paths. Users
64+
of the Eclipse plugin can step through these counterexamples in a way
65+
that is similar to debugging programs. The installation of this plugin
66+
is explained [here](http://www.cprover.org/eclipse-plugin/).
67+
68+
### Using goto-instrument
69+
70+
The goto-instrument static analyzer operates on goto-binaries, which is
71+
a binary representation of control-flow graphs. The goto-binary is
72+
extracted from program source code using goto-cc, which is explained
73+
[here](../goto-cc/). Given a goto-program, goto-instrument operates
74+
as follows:
75+
76+
1. A goto-binary is read in.
77+
2. The specified static analyses are performed.
78+
3. Any potential bugs found are transformed into corresponding
79+
assertions, and are added into the program.
80+
4. A new goto-binary (with assertions) is written to disc.
81+
82+
As an example, we begin with small C program we call `expr.c` (taken
83+
from [here](http://www.spinroot.com/uno/)):
84+
85+
```C
86+
int *ptr;
87+
88+
int main(void) {
89+
if (ptr)
90+
*ptr = 0;
91+
if (!ptr)
92+
*ptr = 1;
93+
}
94+
```
95+
96+
The program contains an obvious NULL-pointer dereference. We first
97+
compile the example program with goto-cc and then instrument the
98+
resulting goto-binary with pointer checks.
99+
100+
```
101+
goto-cc expr.c -o in.gb goto-instrument in.gb out.gb --pointer-check
102+
```
103+
104+
We can now get a list of the assertions that have been generated as
105+
follows:
106+
107+
```
108+
goto-instrument out.gb --show-properties
109+
```
110+
111+
Using CBMC on `out.gb`, we can obtain a counterexample
112+
trace for the NULL-pointer dereference:
113+
114+
```
115+
cbmc out.gb
116+
```
117+
118+
The goto-instrument program supports the following checks:
119+
120+
Flag | Check
121+
-----------------------------|----------------------------------------------
122+
`--no-assertions` | ignore user assertions
123+
`--bounds-check` | add array bounds checks
124+
`--div-by-zero-check` | add division by zero checks
125+
`--pointer-check` | add pointer checks
126+
`--signed-overflow-check` | add arithmetic over- and underflow checks
127+
`--unsigned-overflow-check` | add arithmetic over- and underflow checks
128+
`--undefined-shift-check` | add range checks for shift distances
129+
`--nan-check` | add floating-point NaN checks
130+
`--uninitialized-check` | add checks for uninitialized locals (experimental)
131+
`--error-label label` | check that given label is unreachable
132+
133+
#### Generating function bodies
134+
135+
Sometimes implementations for called functions are not available in the goto
136+
program, or it is desirable to replace bodies of functions with certain
137+
predetermined stubs (for example to confirm that these functions are never
138+
called, or to indicate that these functions will never return). For this purpose
139+
goto-instrument provides the `--generate-function-body` option, that takes a
140+
regular expression (in [ECMAScript syntax]
141+
(http://en.cppreference.com/w/cpp/regex/ecmascript)) that describes the names of
142+
the functions to generate. Note that this will only generate bodies for
143+
functions that do not already have one; If one wishes to replace the body of a
144+
function with an existing definition, the `--remove-function-body` option can be
145+
used to remove the body of the function prior to generating a new one.
146+
147+
The shape of the stub itself can be chosen with the
148+
`--generate-function-body-options` parameter, which can take the following
149+
values:
150+
151+
Option | Result
152+
-----------------------------|-------------------------------------------------------------
153+
`nondet-return` | Do nothing and return a nondet result (this is the default)
154+
`assert-false` | Make the body contain an assert(false)
155+
`assume-false` | Make the body contain an assume(false)
156+
`assert-false-assume-false` | Combines assert-false and assume-false
157+
`havoc` | Set the contents of parameters and globals to nondet
158+
159+
The various combinations of assert-false and assume-false can be used to
160+
indicate that functions shouldn't be called, that they will never return or
161+
both.
162+
163+
Example: We have a program like this:
164+
165+
```C
166+
// error_example.c
167+
#include <stdlib.h>
168+
169+
void api_error(void);
170+
void internal_error(void);
171+
172+
int main(void)
173+
{
174+
int arr[10] = {1,2,3,4,5, 6, 7, 8, 9, 10};
175+
int sum = 0;
176+
for(int i = 1; i < 10; ++i)
177+
{
178+
sum += arr[i];
179+
}
180+
if(sum != 55)
181+
{
182+
// we made a mistake when calculating the sum
183+
internal_error();
184+
}
185+
if(rand() < 0)
186+
{
187+
// we think this cannot happen
188+
api_error();
189+
}
190+
return 0;
191+
}
192+
```
193+
194+
Now, we can compile the program and detect that the error functions are indeed
195+
called by invoking these commands
196+
197+
```
198+
goto-cc error_example.c -o error_example.goto
199+
# Replace all functions ending with _error
200+
# (Excluding those starting with __)
201+
# With ones that have an assert(false) body
202+
goto-instrument error_example.goto error_example_replaced.goto \
203+
--generate-function-body '(?!__).*_error' \
204+
--generate-function-body-options assert-false
205+
cbmc error_example_replaced.goto
206+
```
207+
208+
Which gets us the output
209+
210+
> ** Results:
211+
> error_example.c function api_error
212+
> [api_error.assertion.1] line 4 assertion false: FAILURE
213+
>
214+
> error_example.c function internal_error
215+
> [internal_error.assertion.1] line 5 assertion false: FAILURE
216+
>
217+
> ** 2 of 2 failed (2 iterations)
218+
> VERIFICATION FAILED
219+
220+
As opposed to the verification success we would have gotten without the
221+
instrumentation step.
222+
223+
The havoc option takes further parameters `globals` and `params` with this
224+
syntax: `havoc[,globals:<regex>][,params:<regex>]` (where the square brackets
225+
indicate an optional part). The regular expressions have the same format as the
226+
those for the `--generate-function-body` option and indicate which globals and
227+
function parameters should be set to nondet. All regular expressions require
228+
exact matches (i.e. the regular expression `a|b` will match 'a' and 'b' but not
229+
'adrian' or 'bertha').
230+
231+
Example: With a C program like this
232+
233+
```C
234+
struct Complex {
235+
double real;
236+
double imag;
237+
};
238+
239+
struct Complex AGlobalComplex;
240+
int do_something_with_complex(struct Complex *complex);
241+
```
242+
243+
And the command line
244+
245+
```
246+
goto-instrument in.goto out.goto
247+
--generate-function-body do_something_with_complex
248+
--generate-function-body-options
249+
'havoc,params:.*,globals:AGlobalComplex'
250+
```
251+
252+
The goto code equivalent of the following will be generated:
253+
254+
```C
255+
int do_something_with_complex(struct Complex *complex)
256+
{
257+
if(complex)
258+
{
259+
complex->real = nondet_double();
260+
complex->imag = nondet_double();
261+
}
262+
AGlobalComplex.real = nondet_double();
263+
AGlobalComplex.imag = nondet_double();
264+
return nondet_int();
265+
}
266+
```
267+
268+
A note on limitations: Because only static information is used for code
269+
generation, arrays of unknown size and pointers will not be affected by this;
270+
Which means that for code like this:
271+
272+
```C
273+
struct Node {
274+
int val;
275+
struct Node *next;
276+
};
277+
278+
void do_something_with_node(struct Node *node);
279+
```
280+
281+
Code like this will be generated:
282+
283+
```C
284+
void do_something_with_node(struct Node *node)
285+
{
286+
if(node)
287+
{
288+
node->val = nondet_int();
289+
node->next = nondet_0();
290+
}
291+
}
292+
```
293+
294+
Note that no attempt to follow the `next` pointer is made. If an array of
295+
unknown (or 0) size is encountered, a diagnostic is emitted and the array is not
296+
further examined.
297+
298+
Some care must be taken when choosing the regular expressions for globals and
299+
functions. Names starting with `__` are reserved for internal purposes; For
300+
example, replacing functions or setting global variables with the `__CPROVER`
301+
prefix might make analysis impossible. To avoid doing this by accident, negative
302+
lookahead can be used. For example, `(?!__).*` matches all names not starting
303+
with `__`.

0 commit comments

Comments
 (0)