-
Notifications
You must be signed in to change notification settings - Fork 273
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
String solver: support bounded-length intermediate strings #5246
base: develop
Are you sure you want to change the base?
String solver: support bounded-length intermediate strings #5246
Conversation
75e58c3
to
94f5a3c
Compare
This adds support for the string solver to use constant-sized arrays to store variable-length strings (which may be cheaper since smaller formulas are generally needed to represent them and their constraints to the solver) This is off-by-default for now since it has not been heavily studied: I expect constant-sized arrays would be expensive in the case where strings are numerous and sparsely constrained, but cheaper when there are many index expressions leading to an explosion of Ackermann constraints.
94f5a3c
to
8b50dcf
Compare
@@ -126,7 +126,7 @@ class prop_conv_solvert : public conflict_providert, | |||
|
|||
propt ∝ | |||
|
|||
messaget log; | |||
mutable messaget log; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is this related to the changes?
array_typet make_char_array_type( | ||
const typet &element_type, | ||
const typet &index_type, | ||
optionalt<std::size_t> max_length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ const
const typet &index_type, | ||
optionalt<std::size_t> max_length) | ||
{ | ||
exprt array_size = max_length && *max_length <= max_concrete_char_array_length |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ const
/// with indefinite size assuming generally the wasted space is not worth it. | ||
const std::size_t max_concrete_char_array_length = 256; | ||
|
||
array_typet make_char_array_type( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please document
/// \file | ||
/// Size limit to choose between variable- and fixed-size char arrays | ||
|
||
#include "max_concrete_char_array.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the file be rather be named max_concrete_char_array_length?
// already has an interval associated (necessarily the previous interval), | ||
// just drop this interval and therefore effectively extend the one to | ||
// our right to cover the whole intervening space: | ||
bool should_drop_interval = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
const
else | ||
{ | ||
// This interval already extends to a defined index; leave it alone. | ||
intervals_it++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ inverting the if and else branches could make it easier to read since the else is much shorter and when reading this line we forgot what the if condition was.
@@ -2034,3 +2210,57 @@ static bool is_valid_string_constraint( | |||
|
|||
return true; | |||
} | |||
|
|||
static bool check_string_limit( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs documentation
@@ -50,7 +54,24 @@ Author: Alberto Griggio, [email protected] | |||
" --max-nondet-string-length n bound the length of nondet (e.g. input) strings.\n" /* NOLINT(*) */ \ | |||
" Default is " + std::to_string(MAX_CONCRETE_STRING_SIZE - 1) + "; note that\n" /* NOLINT(*) */ \ | |||
" setting the value higher than this does not work\n" /* NOLINT(*) */ \ | |||
" with --trace or --validate-trace.\n" /* NOLINT(*) */ | |||
" with --trace or --validate-trace.\n" /* NOLINT(*) */ \ | |||
" --max-intermediate-string-length n bound the length of intermediate\n" /* NOLINT(*) */ \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
⛏️ new line after n would be easier to read
@@ -133,6 +133,28 @@ class interval_sparse_arrayt final : public sparse_arrayt | |||
/// Complexity is logarithmic in the number of entries. | |||
exprt at(std::size_t index) const; | |||
|
|||
/// Iterators over interval boundaries | |||
typedef std::map<std::size_t, exprt>::const_iterator const_iterator; // NOLINT | |||
const_iterator begin() const |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these iterators can be confusing since one could think we are iterating over all indices rather than intervals.
By making restrict_sparse_array_to_indices
a method of this class (or sparse_array), I think you wouldn't have to define them.
This adds support for the string solver to use constant-sized arrays to store
variable-length strings (which may be cheaper since smaller formulas are generally needed
to represent them and their constraints to the solver)
This is off-by-default for now since it has not been heavily studied: I expect constant-sized
arrays would be expensive in the case where strings are numerous and sparsely constrained,
but cheaper when there are many index expressions leading to an explosion of Ackermann
constraints.
I originally wrote this to try to avoid the huge explosion of Ackermann constraints that arose when the string solver created a large index set. That is successfully achieved, as constant-sized arrays compile with cost proportional to the number of accesses rather than the quadratic complexity of the unbounded arrays it usually uses; however in practice any formula that featured a large enough index set that the Ackermann constraints were prohibitively numerous also generates a huge array of instances of universal string constraints, so this does not solve the problem by itself.
I successfully prototyped a followup that compresses the index sets by only instantiating universal constraints at distinct indices for a particular model, having observed hundreds of constraints that usually evaluated to address the same concrete indices in any viable model. However this requires more work, so I'm submitting this groundwork in the meantime.