Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

String solver: support bounded-length intermediate strings #5246

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Feb 26, 2020

This adds support for the string solver to use constant-sized arrays to store
variable-length strings (which may be cheaper since smaller formulas are generally needed
to represent them and their constraints to the solver)

This is off-by-default for now since it has not been heavily studied: I expect constant-sized
arrays would be expensive in the case where strings are numerous and sparsely constrained,
but cheaper when there are many index expressions leading to an explosion of Ackermann
constraints.

I originally wrote this to try to avoid the huge explosion of Ackermann constraints that arose when the string solver created a large index set. That is successfully achieved, as constant-sized arrays compile with cost proportional to the number of accesses rather than the quadratic complexity of the unbounded arrays it usually uses; however in practice any formula that featured a large enough index set that the Ackermann constraints were prohibitively numerous also generates a huge array of instances of universal string constraints, so this does not solve the problem by itself.

I successfully prototyped a followup that compresses the index sets by only instantiating universal constraints at distinct indices for a particular model, having observed hundreds of constraints that usually evaluated to address the same concrete indices in any viable model. However this requires more work, so I'm submitting this groundwork in the meantime.

This adds support for the string solver to use constant-sized arrays to store
variable-length strings (which may be cheaper since smaller formulas are generally needed
to represent them and their constraints to the solver)

This is off-by-default for now since it has not been heavily studied: I expect constant-sized
arrays would be expensive in the case where strings are numerous and sparsely constrained,
but cheaper when there are many index expressions leading to an explosion of Ackermann
constraints.
@smowton smowton force-pushed the smowton/feature/string-solver-support-fixed-sized-arrays branch from 94f5a3c to 8b50dcf Compare February 27, 2020 11:38
@@ -126,7 +126,7 @@ class prop_conv_solvert : public conflict_providert,

propt ∝

messaget log;
mutable messaget log;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How is this related to the changes?

array_typet make_char_array_type(
const typet &element_type,
const typet &index_type,
optionalt<std::size_t> max_length)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ const

const typet &index_type,
optionalt<std::size_t> max_length)
{
exprt array_size = max_length && *max_length <= max_concrete_char_array_length
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ const

/// with indefinite size assuming generally the wasted space is not worth it.
const std::size_t max_concrete_char_array_length = 256;

array_typet make_char_array_type(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please document

/// \file
/// Size limit to choose between variable- and fixed-size char arrays

#include "max_concrete_char_array.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the file be rather be named max_concrete_char_array_length?

// already has an interval associated (necessarily the previous interval),
// just drop this interval and therefore effectively extend the one to
// our right to cover the whole intervening space:
bool should_drop_interval =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const

else
{
// This interval already extends to a defined index; leave it alone.
intervals_it++;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ inverting the if and else branches could make it easier to read since the else is much shorter and when reading this line we forgot what the if condition was.

@@ -2034,3 +2210,57 @@ static bool is_valid_string_constraint(

return true;
}

static bool check_string_limit(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

needs documentation

@@ -50,7 +54,24 @@ Author: Alberto Griggio, [email protected]
" --max-nondet-string-length n bound the length of nondet (e.g. input) strings.\n" /* NOLINT(*) */ \
" Default is " + std::to_string(MAX_CONCRETE_STRING_SIZE - 1) + "; note that\n" /* NOLINT(*) */ \
" setting the value higher than this does not work\n" /* NOLINT(*) */ \
" with --trace or --validate-trace.\n" /* NOLINT(*) */
" with --trace or --validate-trace.\n" /* NOLINT(*) */ \
" --max-intermediate-string-length n bound the length of intermediate\n" /* NOLINT(*) */ \
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ new line after n would be easier to read

@@ -133,6 +133,28 @@ class interval_sparse_arrayt final : public sparse_arrayt
/// Complexity is logarithmic in the number of entries.
exprt at(std::size_t index) const;

/// Iterators over interval boundaries
typedef std::map<std::size_t, exprt>::const_iterator const_iterator; // NOLINT
const_iterator begin() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these iterators can be confusing since one could think we are iterating over all indices rather than intervals.
By making restrict_sparse_array_to_indices a method of this class (or sparse_array), I think you wouldn't have to define them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants