Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dominator analysis: analyse on basic block granularity #5042

Open
wants to merge 6 commits into
base: develop
Choose a base branch
from

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Aug 20, 2019

Our dominator analysis could produce very large data structures because it computed dominator sets on an instruction granularity. In typical programs the basic-block graph is of course considerably smaller, which combined with the worst-case n-squared space complexity of dominator sets can save a great deal of memory.

Of course there are knock-on consequences for users: most are simple, natural_loopst needs a little restructuring, and accelerate et al need some awkward functions to accommodate their desire to edit the program while keeping the analysis up to date (if this is too awful we could insist that it fully recomputes the analysis after each change).

Accelerate has no tests, so it's hard to know if it's still working, but the Java front-end uses dominator analysis for local variable analysis, so the JBMC testsuite provides good testing of cfg_dominatorst (but not natural_loopst)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks fine to me, mostly nitpicks (and because the second commit has a bunch of changes all at once it's a bit hard to review but since a lot of these need to be dragged through pretty winded code paths I'm not sure if this even can be sensibly broken up further)

Target target) const
std::list<Target> get_successors(
Target target,
std::function<bool(const Target &)> is_instruction_out_of_range)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is already a template and you're not storing the function anywhere there's no real reason to use std::function here, you could just add the function type as another template parameter (I'm not sure if it's going to matter in this particular instance but since this avoids type erasure it can under some circumstance dramatically affect optimisation, for example in this case the difference is in one case generating a straight loop with virtual function calls and another that's partially unrolled/vectorised).

const auto &this_var_dom_block =
dominator_analysis.cfg[this_var_dom_block_index].block;
// Only consider placing variable declarations at the head of
// a basic block (which conveniently is always a safe choice, even

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what this means (i.e. why is it a safe choice and why would it not be a safe choice?)

@@ -26,20 +26,44 @@ template <class P, class T, bool post_dom>
class cfg_dominators_templatet
{
public:
typedef std::set<T> target_sett;
typedef std::set<std::size_t> target_sett;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this is referring to the node indices? In that case, shouldn't this use node_indext?

⛏️ (As a side note, for cases like this I'd also prefer a "strong typedef" i.e. wrapping it in a struct just to avoid accidentally doing things with them they were not meant to do - like comparing them to unrelated indices or adding them together, things like that).

/// Returns true if basic block \p lhs [post]dominates \p rhs
bool basic_block_dominates(std::size_t lhs, std::size_t rhs) const
{
return cfg[rhs].dominators.count(lhs);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ > 0 for additional verbosity? Never a big fan of implicit to-bool conversion myself


bool basic_block_reachable(std::size_t block) const
{
return !cfg[block].dominators.empty();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this be false for the first block?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All blocks dominate themselves by definition

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(except ones not reachable from the entry point, which are never visited by cfg_dominatorst)

return post_dom;
}

UNREACHABLE; // Entry map is inconsistent with block members?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ I think in this case it might be better to just use a regular invariant and put the comment on that instead

if(lhs == rhs)
return true;

const auto lhs_block = cfg.entry_map.at(lhs);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⛏️ Maybe call these block_id or something of the sort

@smowton smowton force-pushed the smowton/feature/dominator-analysis-use-basic-blocks branch from 1a28c15 to c1cac71 Compare August 21, 2019 16:24
@smowton
Copy link
Contributor Author

smowton commented Aug 21, 2019

@hannes-steffenhagen-diffblue take another look, I had to make the get-basic-block-from-instruction operation cheaper to get acceptable performance, for which I introduced dense_integer_mapt.

Preliminary performance figures, using the same benchmarking example as you did in #1787 (comment)

Examples:

  • straightline is like v1 in the linked comment: a long straight-line function.
  • small, medium and largeish are 100, 1000 and 3000 repeats of example v2 in the linked comment (a function with lots of small basic blocks)
  • blocks is a version of medium with basic blocks of size 3

As with the linked comment, I used dependency-graph to benchmark as a pass that both computes postdoms and then hammers it with queries (58% of runtime is spent in control_dependencies of which almost all of that is postdom queries).

Results:

  • Straightline: 4.5s -> 0.5s (down 89%)
  • Blocks: 7.5s -> 6.0s (down 20%)
  • Small: 0.05s -> 0.04s (unmeasurable)
  • Medium: 4.6s -> 3.3s (down 28%)
  • Largeish: 107s -> 61s (down 43%)

So up front this looks to me like we should get a huge win for mostly-straight-line code (as you'd expect from using basic-block form), and moderate wins (20-30%) from code consisting of 2-3 line basic blocks. Unlike #1787 we continue to use Cooper's dominator algorithm, but we shrink the sizes of the sets it tracks using basic-block form, which particularly shows its worth when we deal with large GOTO programs.

@smowton
Copy link
Contributor Author

smowton commented Aug 21, 2019

re: breaking up the PR, a lot of this now introduces and uses a sensible API for dominators, cf. the original code which merrily rummaged in both cfg_dominatorst and cfgt's fields. If this is looking like it might go in then I'll submit those first as cleanups.

@smowton
Copy link
Contributor Author

smowton commented Aug 21, 2019

Best case against the Java testsuite: regression/jbmc/coreModels, down from 3.2s to 1.7s. Total jbmc-CORE test suite time down from 67 to 59 seconds.

@smowton
Copy link
Contributor Author

smowton commented Aug 22, 2019

Have so far split out #5045 #5051 #5055, which should be pretty uncontroversial, and #5049, which floats the dense_integer_mapt concept for cfgt by itself, without the basic-blocks innovation introduced here. I will make one more PR in the morning, to factor out this PR's cleanup changes to natural_loopst, then this one will sit on top of the whole stack.

@smowton smowton force-pushed the smowton/feature/dominator-analysis-use-basic-blocks branch from c1cac71 to a8cbb76 Compare August 27, 2019 15:48
This is required because otherwise a cfgt<..., goto_programt::targett> wouldn't be able to
accept const_targett as an argument to get_node_index(...) etc, which is undesirable. In
particular its own entry_mapt::keys() would not be usable. In general this ought to be some
constified version of 'I', but since base `cfgt` is currently restricted to either targett
or const_targett then statically specifying const_targett suffices.
This gives a map-like interface but vector-like lookup cost, ideal for
mapping keys that have a bijection with densely packed integers, such as
the location numbers of a well-formed goto_modelt or goto_programt.
This saves a lot of allocations, memory, and time when the cfg is frequently queried.
@smowton smowton force-pushed the smowton/feature/dominator-analysis-use-basic-blocks branch from a8cbb76 to 1f25b16 Compare August 27, 2019 16:06
In order to accept both lvalue and rvalue arguments, their argument must be deduced, not a
class template argument (and therefore concretely specified).
Previously this used instruction granularity, which especially combined with a naive
dominator set (not a dominator tree) can waste a great deal of time and memory computing
and storing the dominator sets.

Now that cfg_dominatorst works on a basic-block granularity, natural_loopst does as well.
@smowton smowton force-pushed the smowton/feature/dominator-analysis-use-basic-blocks branch from 1f25b16 to 370fc30 Compare August 27, 2019 16:32
@martin-cs
Copy link
Collaborator

@smowton I am listed as a reviewer for this and am happy to do so but I think it might be best to wait until things have been rebased / merged. Correct me if I am wrong.

@smowton
Copy link
Contributor Author

smowton commented Aug 28, 2019

Agreed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants