-
Notifications
You must be signed in to change notification settings - Fork 273
Rework memory allocation failure checks and options #5401
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
danpoe
commented
Jun 30, 2020
•
edited
Loading
edited
- Each commit message has a non-empty body, explaining why the change was made.
- Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
- The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
- Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
- n/a My commit message includes data points confirming performance improvements (if claimed).
- My PR is restricted to a single feature or bugfix.
- n/a White-space or formatting changes outside the feature-related changed lines are in commits of their own.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM so far, only one question
src/analyses/goto_check.h
Outdated
@@ -58,7 +58,7 @@ void goto_check( | |||
" --no-built-in-assertions ignore assertions in built-in library\n" \ | |||
" --enum-range-check checks that all enum type expressions have values in the enum range\n" /* NOLINT(whitespace/line_length) */ \ | |||
" --pointer-primitive-check checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */ \ | |||
" --allocate-size-check checks that not more memory is allocated than can be addressed by cbmc" /* NOLINT(whitespace/line_length) */ | |||
<< help_entry("--allocate-size-check", "checks that not more memory is allocated than can be addressed by cbmc") /* NOLINT(whitespace/line_length) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the help_entry
function. Where was it introduced? Is there a CBMC equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's defined in parse_options.cpp
. I've introduced it a few weeks ago. It's supposed to make writing the help entries easier, by automatically wrapping the description at 80 characters, and indenting it by 30 characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this works and is a good implementation of the idea but I don't see why it is needed. To my mind, having the modelling of malloc, including the checks in the library makes way more sense. Why is this PR an improvement?
@martin-cs: The PR essentially does two things:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should reword the PR, this isn't a refactor, which I would expect to change no behaviour. Approving to remove my codeowner block (I don't mind either way), but you should certainly win the argument with @martin-cs before merging :)
src/analyses/goto_check.h
Outdated
@@ -56,7 +57,8 @@ void goto_check( | |||
" --nan-check check floating-point for NaN\n" \ | |||
" --no-built-in-assertions ignore assertions in built-in library\n" \ | |||
" --enum-range-check checks that all enum type expressions have values in the enum range\n" /* NOLINT(whitespace/line_length) */ \ | |||
" --pointer-primitive-check checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */ | |||
" --pointer-primitive-check checks that all pointers in pointer primitives are valid or null\n" /* NOLINT(whitespace/line_length) */ \ | |||
" --allocate-size-check checks that not more memory is allocated than can be addressed by cbmc" /* NOLINT(whitespace/line_length) */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
checks that no individual allocation exceeds CBMC's addresss space limit? After all total alloc can still be very large
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
src/cbmc/cbmc_parse_options.cpp
Outdated
// NOLINTNEXTLINE(whitespace/line_length) | ||
" --malloc-may-fail allow malloc calls to return a null pointer\n" | ||
<< help_entry( | ||
"--allocate-size-null", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
overly-large-allocation-returns-null? The other option would then be overly-large-allocation-asserts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
Is #5121 the right issue? I don't see the link here. I think the model way of checking these is better because it is more flexible. It can be turned on and off during analysis, the limits can be varied, etc. But I appreciate that this is just my opinion. More importantly than where it is, having all of the modelling / instrumentation for memory allocation in one place is a good thing IMHO. |
Ah, it's #5212. I think it depends on whether one views this as a modelling issue or not. The flag |
I appreciate the distinction and it is a useful thing to note. #5210 looks more relevant to me. Sorry to be awkward but this feels like something that needs a bit more discussion. If this is checking an internal limit of the 'base + offset' pointer model so that we don't generate spurious models then... A. Why should it be an option? Why would you want it turned off? |
Yes, #5210 is the first of the two related PRs. The issue though that this one fixes was pointed out by @hannes-steffenhagen-diffblue on #5212. A. If you know the limit will not be hit you don't need to turn on the option (or if you're using cbmc for bug finding). Currently cbmc adds no checks by default, and all of them need to be explicitely enabled (e.g., There's also other cbmc primitives that have preconditions for which checks need to be explicitely enabled (see http://cprover.diffblue.com/memory-primitives.html). |
The option adds an assertion to check that not more memory is allocated with __CPROVER_allocate() than cbmc can address. Since the primitive is used by the models of malloc(), etc., this also guarantees that those function don't allocate more memory than can be addressed.
Adapt the help messages of all the tools to work with the macro HELP_GOTO_CHECK which is now using help_entry() for the entry for --overly-large-allocation-check
Adapt the malloc failure regression tests to use the new options --overly-large-allocation-check, --overly-large-allocation-returns-null, and --allocate-may-fail
Due to the changes to the malloc models, less assertions are now generated by default. Thus, adapt the regexes `<n> of <m> failed` in the test.desc files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks generally good to me, but I have two requests before I'm happy to approve :-)
- Get an 'OK' from @martin-cs
- Update the user docs - I think some of the removed command line options were possibly documented, if so, they'll need removing from the docs too, plus add user docs for the new options too.
@martin-cs Would you be ok with this going in for now? I agree it's not a perfect or complete solution, it just would help us as it fixes an issue for us. |
@danpoe Sorry for the lag. I think it is possible to sort this out but let's just check we are talking about the same thing. This program shows the problem:
So the root cause of the problem is the split between object and offset in the representation of pointers in the default back-end? If the offset is too high then it can overflow into the object, resulting in incorrect answers. The patch you are proposing will add a command-line option to CBMC which will insert assertions everywhere __CPROVER_alloc is used. When the flag is used it will indicate when the program is able to dynamically allocate a block of memory that /might/ have this problem. Have I understood the problem?
HTH |
Yes, I agree with your analysis, the allocation of a block for which not all parts can be addressed (in the realm of cbmc) is not in itself a problem. So yes the best fix would probably be to catch the pointer overflows instead. It turns out that Btw, if the result of pointer arithmetic points outside the pointed-to memory block, the behavior is undefined (except pointing to one past the last element which is allowed). See, e.g., here: https://blog.regehr.org/archives/1395. So a possible solution would be:
|
I like the proposed course of action. Thanks for giving this the extra consideration. Limitations in how we handle pointers should not conceal bugs, so I like the always on check for offset overflow. I can't see what the use of checking the overflow of a whole pointer is, so modifying |