Commit 0e2c128

Auto merge of #76044 - ecstatic-morse:dataflow-lattice, r=oli-obk
Support dataflow problems on arbitrary lattices

This PR implements the last of the proposed extensions I mentioned in the design meeting for the original dataflow refactor. It extends the current dataflow framework to work with arbitrary lattices, not just `BitSet`s. This is a prerequisite for dataflow-enabled MIR const-propagation. Personally, I am skeptical of the usefulness of doing const-propagation pre-monomorphization, since many useful constants only become known after monomorphization (e.g. `size_of::<T>()`) and users have a natural tendency to hand-optimize the rest. It's probably worth experimenting with, however, and others have shown interest. cc `@rust-lang/wg-mir-opt`.

The `Idx` associated type is moved from `AnalysisDomain` to `GenKillAnalysis` and replaced with an associated `Domain` type that must implement `JoinSemiLattice`. Like before, each `Analysis` defines the "bottom value" for its domain, but can no longer override the dataflow join operator. Analyses that want to use set intersection must now use the `lattice::Dual` newtype. `GenKillAnalysis` impls have an additional requirement that `Self::Domain: BorrowMut<BitSet<Self::Idx>>`, which effectively means that they must use `BitSet<Self::Idx>` or `lattice::Dual<BitSet<Self::Idx>>` as their domain.

Most of these changes were mechanical. However, because a `Domain` is no longer always a powerset of some index type, we can no longer use an `IndexVec<BasicBlock, GenKillSet<A::Idx>>` to store cached block transfer functions. Instead, we use a boxed `dyn Fn` trait object. I discuss a few alternatives to the current approach in a commit message.

The majority of new lines of code are to preserve existing Graphviz diagrams for those unlucky enough to have to debug dataflow analyses. I find these diagrams incredibly useful when things are going wrong and considered regressing them unacceptable, especially the pretty-printing of `MovePathIndex`s, which are used in many dataflow analyses. This required a parallel `fmt` trait used only for printing dataflow domains, as well as a refactoring of the `graphviz` module now that we cannot expect the domain to be a `BitSet`. Some features did have to be removed, such as the gen/kill display mode (which I didn't use but existed to mirror the output of the old dataflow framework) and line wrapping.

Since I had to rewrite much of it anyway, I took the opportunity to switch to a `Visitor` for printing dataflow state diffs instead of using cursors, which are error prone for code that must be generic over both forward and backward analyses. As a side-effect of this change, we no longer have quadratic behavior when writing graphviz diagrams for backward dataflow analyses.

r? `@pnkfelix`
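To make the `JoinSemiLattice` and `lattice::Dual` description above concrete, here is a minimal sketch. The trait shape (an in-place `join` that reports whether anything changed) and the `SmallSet` type are assumptions made for illustration; the actual definitions live in the `lattice` module, which this page does not show.

```rust
/// Assumed shape of a join-semilattice: `join` computes the least upper
/// bound of `self` and `other` in place and returns `true` if `self` changed.
pub trait JoinSemiLattice: Eq {
    fn join(&mut self, other: &Self) -> bool;
}

/// Hypothetical stand-in for a bitset over at most 64 elements.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct SmallSet(pub u64);

impl JoinSemiLattice for SmallSet {
    // For a plain set, join is union; the bottom value is the empty set.
    fn join(&mut self, other: &Self) -> bool {
        let old = self.0;
        self.0 |= other.0;
        self.0 != old
    }
}

/// Newtype that flips the ordering of the wrapped lattice.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Dual<T>(pub T);

impl JoinSemiLattice for Dual<SmallSet> {
    // In the dual lattice, join is intersection; the bottom value is the
    // full set.
    fn join(&mut self, other: &Self) -> bool {
        let old = self.0.0;
        self.0.0 &= other.0.0;
        self.0.0 != old
    }
}
```

With this shape, an analysis that needs "must" semantics (intersection at merge points) picks `Dual<SmallSet>` as its domain instead of overriding the join operator.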
2 parents 9fe551a + b015109 commit 0e2c128

File tree

23 files changed, +947 -678 lines changed
Cargo.lock

+1
@@ -3756,6 +3756,7 @@ dependencies = [
  "itertools 0.8.2",
  "log_settings",
  "polonius-engine",
+ "regex",
  "rustc_apfloat",
  "rustc_ast",
  "rustc_attr",

compiler/rustc_index/src/bit_set.rs

+37-17
@@ -28,13 +28,20 @@ pub const WORD_BITS: usize = WORD_BYTES * 8;
 /// will panic if the bitsets have differing domain sizes.
 ///
 /// [`GrowableBitSet`]: struct.GrowableBitSet.html
-#[derive(Clone, Eq, PartialEq, Decodable, Encodable)]
-pub struct BitSet<T: Idx> {
+#[derive(Eq, PartialEq, Decodable, Encodable)]
+pub struct BitSet<T> {
     domain_size: usize,
     words: Vec<Word>,
     marker: PhantomData<T>,
 }
 
+impl<T> BitSet<T> {
+    /// Gets the domain size.
+    pub fn domain_size(&self) -> usize {
+        self.domain_size
+    }
+}
+
 impl<T: Idx> BitSet<T> {
     /// Creates a new, empty bitset with a given `domain_size`.
     #[inline]
@@ -52,11 +59,6 @@ impl<T: Idx> BitSet<T> {
         result
     }
 
-    /// Gets the domain size.
-    pub fn domain_size(&self) -> usize {
-        self.domain_size
-    }
-
     /// Clear all elements.
     #[inline]
     pub fn clear(&mut self) {
@@ -75,12 +77,6 @@ impl<T: Idx> BitSet<T> {
         }
     }
 
-    /// Efficiently overwrite `self` with `other`.
-    pub fn overwrite(&mut self, other: &BitSet<T>) {
-        assert!(self.domain_size == other.domain_size);
-        self.words.clone_from_slice(&other.words);
-    }
-
     /// Count the number of set bits in the set.
     pub fn count(&self) -> usize {
         self.words.iter().map(|e| e.count_ones() as usize).sum()
@@ -243,6 +239,21 @@ impl<T: Idx> SubtractFromBitSet<T> for BitSet<T> {
     }
 }
 
+impl<T> Clone for BitSet<T> {
+    fn clone(&self) -> Self {
+        BitSet { domain_size: self.domain_size, words: self.words.clone(), marker: PhantomData }
+    }
+
+    fn clone_from(&mut self, from: &Self) {
+        if self.domain_size != from.domain_size {
+            self.words.resize(from.domain_size, 0);
+            self.domain_size = from.domain_size;
+        }
+
+        self.words.copy_from_slice(&from.words);
+    }
+}
+
 impl<T: Idx> fmt::Debug for BitSet<T> {
     fn fmt(&self, w: &mut fmt::Formatter<'_>) -> fmt::Result {
         w.debug_list().entries(self.iter()).finish()
@@ -363,7 +374,7 @@ const SPARSE_MAX: usize = 8;
 ///
 /// This type is used by `HybridBitSet`; do not use directly.
 #[derive(Clone, Debug)]
-pub struct SparseBitSet<T: Idx> {
+pub struct SparseBitSet<T> {
     domain_size: usize,
     elems: ArrayVec<[T; SPARSE_MAX]>,
 }
@@ -464,18 +475,27 @@ impl<T: Idx> SubtractFromBitSet<T> for SparseBitSet<T> {
 /// All operations that involve an element will panic if the element is equal
 /// to or greater than the domain size. All operations that involve two bitsets
 /// will panic if the bitsets have differing domain sizes.
-#[derive(Clone, Debug)]
-pub enum HybridBitSet<T: Idx> {
+#[derive(Clone)]
+pub enum HybridBitSet<T> {
     Sparse(SparseBitSet<T>),
     Dense(BitSet<T>),
 }
 
+impl<T: Idx> fmt::Debug for HybridBitSet<T> {
+    fn fmt(&self, w: &mut fmt::Formatter<'_>) -> fmt::Result {
+        match self {
+            Self::Sparse(b) => b.fmt(w),
+            Self::Dense(b) => b.fmt(w),
+        }
+    }
+}
+
 impl<T: Idx> HybridBitSet<T> {
     pub fn new_empty(domain_size: usize) -> Self {
         HybridBitSet::Sparse(SparseBitSet::new_empty(domain_size))
     }
 
-    fn domain_size(&self) -> usize {
+    pub fn domain_size(&self) -> usize {
         match self {
             HybridBitSet::Sparse(sparse) => sparse.domain_size,
             HybridBitSet::Dense(dense) => dense.domain_size,
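The hand-written `Clone` impl above replaces `overwrite` with `clone_from`, which lets a caller reuse an existing allocation, and avoids the `T: Clone` bound that `#[derive(Clone)]` would add. The following standalone sketch illustrates that pattern. `MiniBitSet` and its methods are hypothetical names, not the rustc type. Note that `copy_from_slice` panics when the two slices differ in length, so this sketch resizes the word vector by word count before copying; that sizing is a choice made for the sketch.

```rust
use std::marker::PhantomData;

const WORD_BITS: usize = 64;

/// Hypothetical miniature of a dense bitset; `T` is only a marker type.
pub struct MiniBitSet<T> {
    domain_size: usize,
    words: Vec<u64>,
    marker: PhantomData<T>,
}

impl<T> MiniBitSet<T> {
    /// Creates an empty set able to hold bits `0..domain_size`.
    pub fn new_empty(domain_size: usize) -> Self {
        let num_words = (domain_size + WORD_BITS - 1) / WORD_BITS;
        MiniBitSet { domain_size, words: vec![0; num_words], marker: PhantomData }
    }

    pub fn domain_size(&self) -> usize {
        self.domain_size
    }

    pub fn insert(&mut self, bit: usize) {
        assert!(bit < self.domain_size);
        self.words[bit / WORD_BITS] |= 1u64 << (bit % WORD_BITS);
    }

    pub fn contains(&self, bit: usize) -> bool {
        assert!(bit < self.domain_size);
        self.words[bit / WORD_BITS] & (1u64 << (bit % WORD_BITS)) != 0
    }
}

// Manual impl so that `MiniBitSet<T>: Clone` holds even when `T` is not `Clone`.
impl<T> Clone for MiniBitSet<T> {
    fn clone(&self) -> Self {
        MiniBitSet {
            domain_size: self.domain_size,
            words: self.words.clone(),
            marker: PhantomData,
        }
    }

    fn clone_from(&mut self, from: &Self) {
        // Make the destination exactly as long as the source (in words), so
        // the copy below cannot panic on a length mismatch. The existing
        // allocation is reused whenever its capacity suffices.
        self.words.resize(from.words.len(), 0);
        self.domain_size = from.domain_size;
        self.words.copy_from_slice(&from.words);
    }
}
```

A caller that repeatedly resets a scratch state, as the cursor and the `SwitchInt` handling in this commit do, can call `scratch.clone_from(&source)` in a loop without allocating on each iteration.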

compiler/rustc_mir/Cargo.toml

+1
@@ -14,6 +14,7 @@ itertools = "0.8"
 tracing = "0.1"
 log_settings = "0.1.1"
 polonius-engine = "0.12.0"
+regex = "1"
 rustc_middle = { path = "../rustc_middle" }
 rustc_attr = { path = "../rustc_attr" }
 rustc_data_structures = { path = "../rustc_data_structures" }

compiler/rustc_mir/src/dataflow/framework/cursor.rs

+25-16
@@ -4,6 +4,7 @@ use std::borrow::Borrow;
 use std::cmp::Ordering;
 
 use rustc_index::bit_set::BitSet;
+use rustc_index::vec::Idx;
 use rustc_middle::mir::{self, BasicBlock, Location};
 
 use super::{Analysis, Direction, Effect, EffectIndex, Results};
@@ -26,7 +27,7 @@ where
 {
     body: &'mir mir::Body<'tcx>,
     results: R,
-    state: BitSet<A::Idx>,
+    state: A::Domain,
 
     pos: CursorPosition,
 
@@ -46,17 +47,16 @@ where
 {
     /// Returns a new cursor that can inspect `results`.
     pub fn new(body: &'mir mir::Body<'tcx>, results: R) -> Self {
-        let bits_per_block = results.borrow().entry_set_for_block(mir::START_BLOCK).domain_size();
-
+        let bottom_value = results.borrow().analysis.bottom_value(body);
         ResultsCursor {
             body,
             results,
 
-            // Initialize to an empty `BitSet` and set `state_needs_reset` to tell the cursor that
+            // Initialize to the `bottom_value` and set `state_needs_reset` to tell the cursor that
             // it needs to reset to block entry before the first seek. The cursor position is
             // immaterial.
             state_needs_reset: true,
-            state: BitSet::new_empty(bits_per_block),
+            state: bottom_value,
             pos: CursorPosition::block_entry(mir::START_BLOCK),
 
             #[cfg(debug_assertions)]
@@ -68,23 +68,21 @@ where
         self.body
     }
 
-    /// Returns the `Analysis` used to generate the underlying results.
+    /// Returns the underlying `Results`.
+    pub fn results(&self) -> &Results<'tcx, A> {
+        &self.results.borrow()
+    }
+
+    /// Returns the `Analysis` used to generate the underlying `Results`.
     pub fn analysis(&self) -> &A {
         &self.results.borrow().analysis
     }
 
     /// Returns the dataflow state at the current location.
-    pub fn get(&self) -> &BitSet<A::Idx> {
+    pub fn get(&self) -> &A::Domain {
         &self.state
     }
 
-    /// Returns `true` if the dataflow state at the current location contains the given element.
-    ///
-    /// Shorthand for `self.get().contains(elem)`
-    pub fn contains(&self, elem: A::Idx) -> bool {
-        self.state.contains(elem)
-    }
-
     /// Resets the cursor to hold the entry set for the given basic block.
     ///
     /// For forward dataflow analyses, this is the dataflow state prior to the first statement.
@@ -94,7 +92,7 @@ where
         #[cfg(debug_assertions)]
         assert!(self.reachable_blocks.contains(block));
 
-        self.state.overwrite(&self.results.borrow().entry_set_for_block(block));
+        self.state.clone_from(&self.results.borrow().entry_set_for_block(block));
         self.pos = CursorPosition::block_entry(block);
         self.state_needs_reset = false;
     }
@@ -202,12 +200,23 @@ where
     ///
     /// This can be used, e.g., to apply the call return effect directly to the cursor without
     /// creating an extra copy of the dataflow state.
-    pub fn apply_custom_effect(&mut self, f: impl FnOnce(&A, &mut BitSet<A::Idx>)) {
+    pub fn apply_custom_effect(&mut self, f: impl FnOnce(&A, &mut A::Domain)) {
         f(&self.results.borrow().analysis, &mut self.state);
         self.state_needs_reset = true;
     }
 }
 
+impl<'mir, 'tcx, A, R, T> ResultsCursor<'mir, 'tcx, A, R>
+where
+    A: Analysis<'tcx, Domain = BitSet<T>>,
+    T: Idx,
+    R: Borrow<Results<'tcx, A>>,
+{
+    pub fn contains(&self, elem: T) -> bool {
+        self.get().contains(elem)
+    }
+}
+
 #[derive(Clone, Copy, Debug)]
 struct CursorPosition {
     block: BasicBlock,
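The new `contains` method above is only available when the analysis' `Domain` is a `BitSet<T>`, which is expressed by a second `impl` block with an associated-type equality bound. A reduced sketch of that technique follows. Every name in it (`Analysis`, `Cursor`, `SetAnalysis`, `CountAnalysis`) is a simplified, hypothetical stand-in, and `BTreeSet` takes the place of `BitSet` so the example needs only the standard library.

```rust
use std::collections::BTreeSet;

/// Hypothetical, much-reduced analysis trait with an associated domain type.
pub trait Analysis {
    type Domain;
    fn bottom_value(&self) -> Self::Domain;
}

/// Hypothetical cursor holding one dataflow state of the analysis' domain.
pub struct Cursor<A: Analysis> {
    analysis: A,
    state: A::Domain,
}

// Methods available for every domain.
impl<A: Analysis> Cursor<A> {
    pub fn new(analysis: A) -> Self {
        let state = analysis.bottom_value();
        Cursor { analysis, state }
    }

    pub fn get(&self) -> &A::Domain {
        &self.state
    }

    pub fn apply_custom_effect(&mut self, f: impl FnOnce(&A, &mut A::Domain)) {
        f(&self.analysis, &mut self.state);
    }
}

// Extra convenience method, available only when the domain is a set.
impl<A, T> Cursor<A>
where
    A: Analysis<Domain = BTreeSet<T>>,
    T: Ord,
{
    pub fn contains(&self, elem: &T) -> bool {
        self.get().contains(elem)
    }
}

/// Example analysis whose domain is a set, so `contains` is available.
pub struct SetAnalysis;

impl Analysis for SetAnalysis {
    type Domain = BTreeSet<u32>;
    fn bottom_value(&self) -> Self::Domain {
        BTreeSet::new()
    }
}

/// Example analysis whose domain is not a set; `contains` does not exist here.
pub struct CountAnalysis;

impl Analysis for CountAnalysis {
    type Domain = u32;
    fn bottom_value(&self) -> Self::Domain {
        0
    }
}
```

Calling `contains` on a `Cursor<CountAnalysis>` is a compile error rather than a runtime failure, which is the point of splitting the method into its own `impl` block.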

compiler/rustc_mir/src/dataflow/framework/direction.rs

+14-14
@@ -18,7 +18,7 @@ pub trait Direction {
     /// `effects.start()` must precede or equal `effects.end()` in this direction.
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -27,7 +27,7 @@ pub trait Direction {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -55,9 +55,9 @@ pub trait Direction {
         tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         block: (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>;
 }
@@ -72,7 +72,7 @@ impl Direction for Backward {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -112,7 +112,7 @@ impl Direction for Backward {
 
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -224,9 +224,9 @@ impl Direction for Backward {
         _tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         (bb, _bb_data): (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        mut propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        mut propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>,
     {
@@ -281,7 +281,7 @@ impl Direction for Forward {
 
     fn apply_effects_in_block<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
     ) where
@@ -321,7 +321,7 @@ impl Direction for Forward {
 
     fn apply_effects_in_range<A>(
         analysis: &A,
-        state: &mut BitSet<A::Idx>,
+        state: &mut A::Domain,
         block: BasicBlock,
         block_data: &mir::BasicBlockData<'tcx>,
         effects: RangeInclusive<EffectIndex>,
@@ -428,9 +428,9 @@ impl Direction for Forward {
         tcx: TyCtxt<'tcx>,
         body: &mir::Body<'tcx>,
         dead_unwinds: Option<&BitSet<BasicBlock>>,
-        exit_state: &mut BitSet<A::Idx>,
+        exit_state: &mut A::Domain,
         (bb, bb_data): (BasicBlock, &'_ mir::BasicBlockData<'tcx>),
-        mut propagate: impl FnMut(BasicBlock, &BitSet<A::Idx>),
+        mut propagate: impl FnMut(BasicBlock, &A::Domain),
     ) where
         A: Analysis<'tcx>,
     {
@@ -499,7 +499,7 @@ impl Direction for Forward {
                     // MIR building adds discriminants to the `values` array in the same order as they
                     // are yielded by `AdtDef::discriminants`. We rely on this to match each
                     // discriminant in `values` to its corresponding variant in linear time.
-                    let mut tmp = BitSet::new_empty(exit_state.domain_size());
+                    let mut tmp = analysis.bottom_value(body);
                     let mut discriminants = enum_def.discriminants(tcx);
                     for (value, target) in values.iter().zip(targets.iter().copied()) {
                         let (variant_idx, _) =
@@ -508,7 +508,7 @@ impl Direction for Forward {
                                  from that of `SwitchInt::values`",
                         );
 
-                        tmp.clone_from(exit_state);
+                        tmp.clone_from(exit_state);
                         analysis.apply_discriminant_switch_effect(
                             &mut tmp,
                             bb,