Rework memory allocation failure checks and options #5401

danpoe · 2020-06-30T15:30:04Z

Each commit message has a non-empty body, explaining why the change was made.
Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
n/a My commit message includes data points confirming performance improvements (if claimed).
My PR is restricted to a single feature or bugfix.
n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.

NlightNFotis

LGTM so far, only one question

NlightNFotis · 2020-06-30T15:55:21Z

src/analyses/goto_check.h

@@ -58,7 +58,7 @@ void goto_check(
  " --no-built-in-assertions     ignore assertions in built-in library\n" \
  " --enum-range-check           checks that all enum type expressions have values in the enum range\n" /* NOLINT(whitespace/line_length) */ \
  " --pointer-primitive-check    checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */ \
-  " --allocate-size-check        checks that not more memory is allocated than can be addressed by cbmc" /* NOLINT(whitespace/line_length) */
+  << help_entry("--allocate-size-check", "checks that not more memory is allocated than can be addressed by cbmc") /* NOLINT(whitespace/line_length) */


What is the help_entry function. Where was it introduced? Is there a CBMC equivalent?

It's defined in parse_options.cpp. I've introduced it a few weeks ago. It's supposed to make writing the help entries easier, by automatically wrapping the description at 80 characters, and indenting it by 30 characters.

martin-cs

I believe this works and is a good implementation of the idea but I don't see why it is needed. To my mind, having the modelling of malloc, including the checks in the library makes way more sense. Why is this PR an improvement?

danpoe · 2020-06-30T17:34:35Z

@martin-cs: The PR essentially does two things:

It fixes the issue raised by @hannes-steffenhagen-diffblue's on the previous PR (see Add malloc-may-fail option to goto-check #5212)
The checking of the allocated size (flag --allocate-size-check) is now done for __CPROVER_allocate(), which is used by the models. This means the models are simpler, the check is only present when the flag is given, and it will also work for plain __CPROVER_allocate() statements. That this is implemented in goto_checkt is consistent with how other primitives like __CPROVER_r_ok, etc. are handled.

smowton

You should reword the PR, this isn't a refactor, which I would expect to change no behaviour. Approving to remove my codeowner block (I don't mind either way), but you should certainly win the argument with @martin-cs before merging :)

smowton · 2020-06-30T18:38:09Z

src/analyses/goto_check.h

@@ -56,7 +57,8 @@ void goto_check(
  " --nan-check                  check floating-point for NaN\n" \
  " --no-built-in-assertions     ignore assertions in built-in library\n" \
  " --enum-range-check           checks that all enum type expressions have values in the enum range\n" /* NOLINT(whitespace/line_length) */ \
-  " --pointer-primitive-check    checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */
+  " --pointer-primitive-check    checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */ \
+  " --allocate-size-check        checks that not more memory is allocated than can be addressed by cbmc" /* NOLINT(whitespace/line_length) */


checks that no individual allocation exceeds CBMC's addresss space limit? After all total alloc can still be very large

smowton · 2020-06-30T19:07:33Z

src/cbmc/cbmc_parse_options.cpp

-    // NOLINTNEXTLINE(whitespace/line_length)
-    " --malloc-may-fail            allow malloc calls to return a null pointer\n"
+    << help_entry(
+      "--allocate-size-null",


overly-large-allocation-returns-null? The other option would then be overly-large-allocation-asserts

martin-cs · 2020-07-01T09:56:00Z

Is #5121 the right issue? I don't see the link here.

I think the model way of checking these is better because it is more flexible. It can be turned on and off during analysis, the limits can be varied, etc. But I appreciate that this is just my opinion. More importantly than where it is, having all of the modelling / instrumentation for memory allocation in one place is a good thing IMHO.

danpoe · 2020-07-01T12:40:52Z

Ah, it's #5212. I think it depends on whether one views this as a modelling issue or not. The flag --allocate-size-check checks that an implementation limit of cbmc isn't hit. The most straightforward way of doing this is to instrument a precondition before every __CPROVER_allocate() invocation. On the other hand, whether malloc() can nondeterministically return null is more of a modelling question, and the code for this is still part of the model.

martin-cs · 2020-07-01T12:53:28Z

I appreciate the distinction and it is a useful thing to note.

#5210 looks more relevant to me.

Sorry to be awkward but this feels like something that needs a bit more discussion. If this is checking an internal limit of the 'base + offset' pointer model so that we don't generate spurious models then...

A. Why should it be an option? Why would you want it turned off?
B. Wouldn't 'assert then assume' be the right way to handle this because we can't reasonably proceed with an allocation that is too large.
C. Doesn't this also affect global (or even stack allocated!) objects above a certain size?

danpoe · 2020-07-01T13:18:50Z

Yes, #5210 is the first of the two related PRs. The issue though that this one fixes was pointed out by @hannes-steffenhagen-diffblue on #5212.

A. If you know the limit will not be hit you don't need to turn on the option (or if you're using cbmc for bug finding). Currently cbmc adds no checks by default, and all of them need to be explicitely enabled (e.g., --pointer-check, etc.)
B. Not necessarily. It's done like that for all goto_checkt checks, e.g. array bounds checks, even though one could question how much sense it makes to continue to execute after an out of bounds access.
C. Yes, I think it also affects those.

There's also other cbmc primitives that have preconditions for which checks need to be explicitely enabled (see http://cprover.diffblue.com/memory-primitives.html).

The option adds an assertion to check that not more memory is allocated with __CPROVER_allocate() than cbmc can address. Since the primitive is used by the models of malloc(), etc., this also guarantees that those function don't allocate more memory than can be addressed.

Adapt the help messages of all the tools to work with the macro HELP_GOTO_CHECK which is now using help_entry() for the entry for --overly-large-allocation-check

Adapt the malloc failure regression tests to use the new options --overly-large-allocation-check, --overly-large-allocation-returns-null, and --allocate-may-fail

Due to the changes to the malloc models, less assertions are now generated by default. Thus, adapt the regexes `<n> of <m> failed` in the test.desc files

chrisr-diffblue

Looks generally good to me, but I have two requests before I'm happy to approve :-)

Get an 'OK' from @martin-cs
Update the user docs - I think some of the removed command line options were possibly documented, if so, they'll need removing from the docs too, plus add user docs for the new options too.

danpoe · 2020-07-03T06:43:53Z

@martin-cs Would you be ok with this going in for now? I agree it's not a perfect or complete solution, it just would help us as it fixes an issue for us.

martin-cs · 2020-07-07T13:22:50Z

@danpoe Sorry for the lag. I think it is possible to sort this out but let's just check we are talking about the same thing. This program shows the problem:

#define BIG_NUMBER   // something bigger than 1 << (sizeof(void*)*CHAR_BITS - (--object-bits n))

char big_array[BIG_NUMBER];
char other_array[256];

int main (int argc, char **argv) {
  char *b = big_array;
  char *c = other_array;

  *(b + (BIG_NUMBER - 1) = '\0';  // Should modify big_array not other_array

  char *p = malloc(BIG_NUMBER);
  char *q = malloc(256);

  *(p + (BIG_NUMBER - 1)) = '\0'; // Should modify the first allocated block, not the second one

  return 0;
}

So the root cause of the problem is the split between object and offset in the representation of pointers in the default back-end? If the offset is too high then it can overflow into the object, resulting in incorrect answers.

The patch you are proposing will add a command-line option to CBMC which will insert assertions everywhere __CPROVER_alloc is used. When the flag is used it will indicate when the program is able to dynamically allocate a block of memory that /might/ have this problem.

Have I understood the problem?

Ideally we need to solve the actual root of the problem which is the overflow (or underflow) from offset to object. The back-end effectively already tracks when a pointer is becoming invalid as it is one of the overflow bits in the adder when modifying the offset. It would take a little thought and care to expose this to the front-end / instrumentation and you would need to decide if the problem was A. creating an invalid pointer or using an invalid pointer, B. whether making the pointer valid again is possible.
A quicker fix would be to prevent objects that are too large being created assuming that bounds checks are used. This doesn't actually solve the problem as offset overflow can modify a pointer so that it is still in-bounds for another object. Regardless, this should be possible by checking global and local variables and putting assert/assume before calls to __CPROVER_alloc. As it is an internal function and only used in a few place I think this makes more sense to do in the modelled C library. As long as --object-bits is converted into a global variable this should be a trivial patch.
I think there is an important question about what the default should be. I can't see a reason not to have this on by default. Ordered compare with constants, particularly powers of 2, are about as cheap as you can get. If it is on by default then in the model makes more sense to me.

HTH

danpoe · 2020-07-16T14:22:02Z

Yes, I agree with your analysis, the allocation of a block for which not all parts can be addressed (in the realm of cbmc) is not in itself a problem. So yes the best fix would probably be to catch the pointer overflows instead.

It turns out that --pointer-overflow-check only checks for overflows of the complete bitvector, as opposed to overflows of the offset part only. The offset part does not overflow into the the object bits part though but wraps around.

Btw, if the result of pointer arithmetic points outside the pointed-to memory block, the behavior is undefined (except pointing to one past the last element which is allowed). See, e.g., here: https://blog.regehr.org/archives/1395.

So a possible solution would be:

add an always on check which verifies that the offset part of a pointer does not overflow
modify --pointer-overflow-check such that it verifies that the result of pointer arithmetic does not point outside the pointed-to object

martin-cs · 2020-07-19T11:40:31Z

I like the proposed course of action. Thanks for giving this the extra consideration. Limitations in how we handle pointers should not conceal bugs, so I like the always on check for offset overflow. I can't see what the use of checking the overflow of a whole pointer is, so modifying --pointer-overflow-check sounds good to me. The proposed semantics seem good to me because it catches an extra kind of UB. I think having it as an option still makes sense because I can see cases (sort of like 1.B. above) where people assume that any kind of pointer arithmetic is fine as long as they only access in the object.

martin-cs · 2020-07-19T11:50:03Z

@danpoe Might your suggestion also address #5096 ?

danpoe requested review from chrisr-diffblue, hannes-steffenhagen-diffblue, kroening, martin-cs, peterschrammel, pkesseli, smowton, tautschnig and thk123 as code owners June 30, 2020 15:30

NlightNFotis approved these changes Jun 30, 2020

View reviewed changes

martin-cs reviewed Jun 30, 2020

View reviewed changes

smowton approved these changes Jun 30, 2020

View reviewed changes

danpoe changed the title ~~Refactor memory allocation failure checks and options~~ Reword memory allocation failure checks and options Jul 1, 2020

danpoe changed the title ~~Reword memory allocation failure checks and options~~ Rework memory allocation failure checks and options Jul 1, 2020

danpoe added the aws Bugs or features of importance to AWS CBMC users label Jul 1, 2020

danpoe force-pushed the refactor/malloc branch from 252be8c to fa4b305 Compare July 2, 2020 13:12

danpoe added 5 commits July 2, 2020 14:13

Adapt usages of the HELP_GOTO_CHECK macro

0ee13c9

Adapt the help messages of all the tools to work with the macro HELP_GOTO_CHECK which is now using help_entry() for the entry for --overly-large-allocation-check

Simplify malloc models for allocation failures

c466524

Adapt existing regression tests for new command line options

610483b

Adapt the malloc failure regression tests to use the new options --overly-large-allocation-check, --overly-large-allocation-returns-null, and --allocate-may-fail

Adapt unrelated regression tests

02172cc

Due to the changes to the malloc models, less assertions are now generated by default. Thus, adapt the regexes `<n> of <m> failed` in the test.desc files

danpoe force-pushed the refactor/malloc branch from fa4b305 to 02172cc Compare July 2, 2020 13:15

chrisr-diffblue reviewed Jul 2, 2020

View reviewed changes

danpoe mentioned this pull request Jul 20, 2020

Confusing results with --pointer-overflow-check #5426

Closed

This was referenced Sep 9, 2020

false positives when --malloc-may-fail and --malloc-fail-null are used #5490

Closed

goto-cc incompatible with cbmc's --malloc-may-fail and --malloc-fail-null #5492

Closed

feliperodri removed the aws Bugs or features of importance to AWS CBMC users label May 13, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Rework memory allocation failure checks and options #5401

Rework memory allocation failure checks and options #5401

danpoe commented Jun 30, 2020 •

edited

Loading

NlightNFotis left a comment

NlightNFotis Jun 30, 2020

danpoe Jun 30, 2020

martin-cs left a comment

danpoe commented Jun 30, 2020 •

edited

Loading

smowton left a comment •

edited

Loading

smowton Jun 30, 2020

danpoe Jul 2, 2020

smowton Jun 30, 2020

danpoe Jul 2, 2020

martin-cs commented Jul 1, 2020

danpoe commented Jul 1, 2020

martin-cs commented Jul 1, 2020

danpoe commented Jul 1, 2020 •

edited

Loading

chrisr-diffblue left a comment

danpoe commented Jul 3, 2020

martin-cs commented Jul 7, 2020

danpoe commented Jul 16, 2020 •

edited

Loading

martin-cs commented Jul 19, 2020

martin-cs commented Jul 19, 2020

Rework memory allocation failure checks and options #5401

Are you sure you want to change the base?

Rework memory allocation failure checks and options #5401

Conversation

danpoe commented Jun 30, 2020 • edited Loading

NlightNFotis left a comment

Choose a reason for hiding this comment

NlightNFotis Jun 30, 2020

Choose a reason for hiding this comment

danpoe Jun 30, 2020

Choose a reason for hiding this comment

martin-cs left a comment

Choose a reason for hiding this comment

danpoe commented Jun 30, 2020 • edited Loading

smowton left a comment • edited Loading

Choose a reason for hiding this comment

smowton Jun 30, 2020

Choose a reason for hiding this comment

danpoe Jul 2, 2020

Choose a reason for hiding this comment

smowton Jun 30, 2020

Choose a reason for hiding this comment

danpoe Jul 2, 2020

Choose a reason for hiding this comment

martin-cs commented Jul 1, 2020

danpoe commented Jul 1, 2020

martin-cs commented Jul 1, 2020

danpoe commented Jul 1, 2020 • edited Loading

chrisr-diffblue left a comment

Choose a reason for hiding this comment

danpoe commented Jul 3, 2020

martin-cs commented Jul 7, 2020

danpoe commented Jul 16, 2020 • edited Loading

martin-cs commented Jul 19, 2020

martin-cs commented Jul 19, 2020

danpoe commented Jun 30, 2020 •

edited

Loading

danpoe commented Jun 30, 2020 •

edited

Loading

smowton left a comment •

edited

Loading

danpoe commented Jul 1, 2020 •

edited

Loading

danpoe commented Jul 16, 2020 •

edited

Loading