rust/src/librustc_mir/dataflow/at_location.rs

244 lines
8.1 KiB
Rust
Raw Normal View History

//! A nice wrapper to consume dataflow results at several CFG
//! locations.
use rustc::mir::{BasicBlock, Location};
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
use rustc_data_structures::bit_set::{BitIter, BitSet, HybridBitSet};
2019-02-08 06:28:15 +09:00
use crate::dataflow::{BitDenotation, BlockSets, DataflowResults};
use crate::dataflow::move_paths::{HasMoveData, MovePathIndex};
use std::iter;
/// A trait for "cartesian products" of multiple FlowAtLocation.
///
/// There's probably a way to auto-impl this, but I think
/// it is cleaner to have manual visitor impls.
pub trait FlowsAtLocation {
/// Reset the state bitvector to represent the entry to block `bb`.
fn reset_to_entry_of(&mut self, bb: BasicBlock);
/// Reset the state bitvector to represent the exit of the
/// terminator of block `bb`.
///
/// **Important:** In the case of a `Call` terminator, these
/// effects do *not* include the result of storing the destination
/// of the call, since that is edge-dependent (in other words, the
/// effects don't apply to the unwind edge).
fn reset_to_exit_of(&mut self, bb: BasicBlock);
2019-02-08 14:53:55 +01:00
/// Builds gen and kill sets for statement at `loc`.
///
/// Note that invoking this method alone does not change the
/// `curr_state` -- you must invoke `apply_local_effect`
/// afterwards.
fn reconstruct_statement_effect(&mut self, loc: Location);
2019-02-08 14:53:55 +01:00
/// Builds gen and kill sets for terminator for `loc`.
///
/// Note that invoking this method alone does not change the
/// `curr_state` -- you must invoke `apply_local_effect`
/// afterwards.
fn reconstruct_terminator_effect(&mut self, loc: Location);
/// Apply current gen + kill sets to `flow_state`.
///
/// (`loc` parameters can be ignored if desired by
/// client. For the terminator, the `stmt_idx` will be the number
/// of statements in the block.)
fn apply_local_effect(&mut self, loc: Location);
}
/// Represents the state of dataflow at a particular
/// CFG location, both before and after it is
/// executed.
///
/// Data flow results are typically computed only as basic block
/// boundaries. A `FlowInProgress` allows you to reconstruct the
/// effects at any point in the control-flow graph by starting with
/// the state at the start of the basic block (`reset_to_entry_of`)
/// and then replaying the effects of statements and terminators
/// (e.g., via `reconstruct_statement_effect` and
/// `reconstruct_terminator_effect`; don't forget to call
/// `apply_local_effect`).
pub struct FlowAtLocation<'tcx, BD>
where
BD: BitDenotation<'tcx>,
{
base_results: DataflowResults<'tcx, BD>,
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
curr_state: BitSet<BD::Idx>,
stmt_gen: HybridBitSet<BD::Idx>,
stmt_kill: HybridBitSet<BD::Idx>,
}
impl<'tcx, BD> FlowAtLocation<'tcx, BD>
where
BD: BitDenotation<'tcx>,
{
/// Iterate over each bit set in the current state.
pub fn each_state_bit<F>(&self, f: F)
where
F: FnMut(BD::Idx),
{
2018-03-06 01:00:44 +00:00
self.curr_state.iter().for_each(f)
}
/// Iterate over each `gen` bit in the current effect (invoke
/// `reconstruct_statement_effect` or
/// `reconstruct_terminator_effect` first).
pub fn each_gen_bit<F>(&self, f: F)
where
F: FnMut(BD::Idx),
{
2018-03-06 01:00:44 +00:00
self.stmt_gen.iter().for_each(f)
}
pub fn new(results: DataflowResults<'tcx, BD>) -> Self {
let bits_per_block = results.sets().bits_per_block();
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
let curr_state = BitSet::new_empty(bits_per_block);
let stmt_gen = HybridBitSet::new_empty(bits_per_block);
let stmt_kill = HybridBitSet::new_empty(bits_per_block);
FlowAtLocation {
base_results: results,
curr_state: curr_state,
stmt_gen: stmt_gen,
stmt_kill: stmt_kill,
}
}
/// Access the underlying operator.
pub fn operator(&self) -> &BD {
self.base_results.operator()
}
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
pub fn contains(&self, x: BD::Idx) -> bool {
self.curr_state.contains(x)
}
/// Returns an iterator over the elements present in the current state.
2019-02-08 06:28:15 +09:00
pub fn iter_incoming(&self) -> iter::Peekable<BitIter<'_, BD::Idx>> {
2018-03-06 01:00:53 +00:00
self.curr_state.iter().peekable()
}
/// Creates a clone of the current state and applies the local
/// effects to the clone (leaving the state of self intact).
/// Invokes `f` with an iterator over the resulting state.
2018-03-06 01:00:53 +00:00
pub fn with_iter_outgoing<F>(&self, f: F)
where
2019-02-08 06:28:15 +09:00
F: FnOnce(BitIter<'_, BD::Idx>),
{
let mut curr_state = self.curr_state.clone();
curr_state.union(&self.stmt_gen);
curr_state.subtract(&self.stmt_kill);
2018-03-06 01:00:53 +00:00
f(curr_state.iter());
}
}
impl<'tcx, BD> FlowsAtLocation for FlowAtLocation<'tcx, BD>
where BD: BitDenotation<'tcx>
{
fn reset_to_entry_of(&mut self, bb: BasicBlock) {
self.curr_state.overwrite(self.base_results.sets().on_entry_set_for(bb.index()));
}
fn reset_to_exit_of(&mut self, bb: BasicBlock) {
self.reset_to_entry_of(bb);
self.curr_state.union(self.base_results.sets().gen_set_for(bb.index()));
self.curr_state.subtract(self.base_results.sets().kill_set_for(bb.index()));
}
fn reconstruct_statement_effect(&mut self, loc: Location) {
2018-03-06 00:58:01 +00:00
self.stmt_gen.clear();
self.stmt_kill.clear();
2017-12-24 00:45:53 +02:00
{
let mut sets = BlockSets {
on_entry: &mut self.curr_state,
gen_set: &mut self.stmt_gen,
kill_set: &mut self.stmt_kill,
};
self.base_results
.operator()
.before_statement_effect(&mut sets, loc);
}
self.apply_local_effect(loc);
let mut sets = BlockSets {
New `ActiveBorrows` dataflow for two-phase `&mut`; not yet borrowed-checked. High-level picture: The old `Borrows` analysis is now called `Reservations` (implemented as a newtype wrapper around `Borrows`); this continues to compute whether a `Rvalue::Ref` can reach a statement without an intervening `EndRegion`. In addition, we also track what `Place` each such `Rvalue::Ref` was immediately assigned to in a given borrow (yay for MIR-structural properties!). The new `ActiveBorrows` analysis then tracks the initial use of any of those assigned `Places` for a given borrow. I.e. a borrow becomes "active" immediately after it starts being "used" in some way. (This is conservative in the sense that we will treat a copy `x = y;` as a use of `y`; in principle one might further delay activation in such cases.) The new `ActiveBorrows` analysis needs to take the `Reservations` results as an initial input, because the reservation state influences the gen/kill sets for `ActiveBorrows`. In particular, a use of `a` activates a borrow `a = &b` if and only if there exists a path (in the control flow graph) from the borrow to that use. So we need to know if the borrow reaches a given use to know if it really gets a gen-bit or not. * Incorporating the output from one dataflow analysis into the input of another required more changes to the infrastructure than I had expected, and even after those changes, the resulting code is still a bit subtle. * In particular, Since we need to know the intrablock reservation state, we need to dynamically update a bitvector for the reservations as we are also trying to compute the gen/kills bitvector for the active borrows. * The way I ended up deciding to do this (after also toying with at least two other designs) is to put both the reservation state and the active borrow state into a single bitvector. That is why we now have separate (but related) `BorrowIndex` and `ReserveOrActivateIndex`: each borrow index maps to a pair of neighboring reservation and activation indexes. As noted above, these changes are solely adding the active borrows dataflow analysis (and updating the existing code to cope with the switch from `Borrows` to `Reservations`). The code to process the bitvector in the borrow checker currently just skips over all of the active borrow bits. But atop this commit, one *can* observe the analysis results by looking at the graphviz output, e.g. via ```rust #[rustc_mir(borrowck_graphviz_preflow="pre_two_phase.dot", borrowck_graphviz_postflow="post_two_phase.dot")] ``` Includes doc for `FindPlaceUses`, as well as `Reservations` and `ActiveBorrows` structs, which are wrappers are the `Borrows` struct that dictate which flow analysis should be performed.
2017-12-01 12:32:51 +01:00
on_entry: &mut self.curr_state,
gen_set: &mut self.stmt_gen,
kill_set: &mut self.stmt_kill,
};
self.base_results
.operator()
.statement_effect(&mut sets, loc);
}
fn reconstruct_terminator_effect(&mut self, loc: Location) {
2018-03-06 00:58:01 +00:00
self.stmt_gen.clear();
self.stmt_kill.clear();
2017-12-24 00:45:53 +02:00
{
let mut sets = BlockSets {
on_entry: &mut self.curr_state,
gen_set: &mut self.stmt_gen,
kill_set: &mut self.stmt_kill,
};
self.base_results
.operator()
.before_terminator_effect(&mut sets, loc);
}
self.apply_local_effect(loc);
let mut sets = BlockSets {
New `ActiveBorrows` dataflow for two-phase `&mut`; not yet borrowed-checked. High-level picture: The old `Borrows` analysis is now called `Reservations` (implemented as a newtype wrapper around `Borrows`); this continues to compute whether a `Rvalue::Ref` can reach a statement without an intervening `EndRegion`. In addition, we also track what `Place` each such `Rvalue::Ref` was immediately assigned to in a given borrow (yay for MIR-structural properties!). The new `ActiveBorrows` analysis then tracks the initial use of any of those assigned `Places` for a given borrow. I.e. a borrow becomes "active" immediately after it starts being "used" in some way. (This is conservative in the sense that we will treat a copy `x = y;` as a use of `y`; in principle one might further delay activation in such cases.) The new `ActiveBorrows` analysis needs to take the `Reservations` results as an initial input, because the reservation state influences the gen/kill sets for `ActiveBorrows`. In particular, a use of `a` activates a borrow `a = &b` if and only if there exists a path (in the control flow graph) from the borrow to that use. So we need to know if the borrow reaches a given use to know if it really gets a gen-bit or not. * Incorporating the output from one dataflow analysis into the input of another required more changes to the infrastructure than I had expected, and even after those changes, the resulting code is still a bit subtle. * In particular, Since we need to know the intrablock reservation state, we need to dynamically update a bitvector for the reservations as we are also trying to compute the gen/kills bitvector for the active borrows. * The way I ended up deciding to do this (after also toying with at least two other designs) is to put both the reservation state and the active borrow state into a single bitvector. That is why we now have separate (but related) `BorrowIndex` and `ReserveOrActivateIndex`: each borrow index maps to a pair of neighboring reservation and activation indexes. As noted above, these changes are solely adding the active borrows dataflow analysis (and updating the existing code to cope with the switch from `Borrows` to `Reservations`). The code to process the bitvector in the borrow checker currently just skips over all of the active borrow bits. But atop this commit, one *can* observe the analysis results by looking at the graphviz output, e.g. via ```rust #[rustc_mir(borrowck_graphviz_preflow="pre_two_phase.dot", borrowck_graphviz_postflow="post_two_phase.dot")] ``` Includes doc for `FindPlaceUses`, as well as `Reservations` and `ActiveBorrows` structs, which are wrappers are the `Borrows` struct that dictate which flow analysis should be performed.
2017-12-01 12:32:51 +01:00
on_entry: &mut self.curr_state,
gen_set: &mut self.stmt_gen,
kill_set: &mut self.stmt_kill,
};
self.base_results
.operator()
.terminator_effect(&mut sets, loc);
}
fn apply_local_effect(&mut self, _loc: Location) {
self.curr_state.union(&self.stmt_gen);
self.curr_state.subtract(&self.stmt_kill);
}
}
impl<'tcx, T> FlowAtLocation<'tcx, T>
where
T: HasMoveData<'tcx> + BitDenotation<'tcx, Idx = MovePathIndex>,
{
pub fn has_any_child_of(&self, mpi: T::Idx) -> Option<T::Idx> {
// We process `mpi` before the loop below, for two reasons:
// - it's a little different from the loop case (we don't traverse its
// siblings);
// - ~99% of the time the loop isn't reached, and this code is hot, so
// we don't want to allocate `todo` unnecessarily.
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
if self.contains(mpi) {
return Some(mpi);
}
let move_data = self.operator().move_data();
let move_path = &move_data.move_paths[mpi];
let mut todo = if let Some(child) = move_path.first_child {
vec![child]
} else {
return None;
};
while let Some(mpi) = todo.pop() {
Merge indexed_set.rs into bitvec.rs, and rename it bit_set.rs. Currently we have two files implementing bitsets (and 2D bit matrices). This commit combines them into one, taking the best features from each. This involves renaming a lot of things. The high level changes are as follows. - bitvec.rs --> bit_set.rs - indexed_set.rs --> (removed) - BitArray + IdxSet --> BitSet (merged, see below) - BitVector --> GrowableBitSet - {,Sparse,Hybrid}IdxSet --> {,Sparse,Hybrid}BitSet - BitMatrix --> BitMatrix - SparseBitMatrix --> SparseBitMatrix The changes within the bitset types themselves are as follows. ``` OLD OLD NEW BitArray<C> IdxSet<T> BitSet<T> -------- ------ ------ grow - grow new - (remove) new_empty new_empty new_empty new_filled new_filled new_filled - to_hybrid to_hybrid clear clear clear set_up_to set_up_to set_up_to clear_above - clear_above count - count contains(T) contains(&T) contains(T) contains_all - superset is_empty - is_empty insert(T) add(&T) insert(T) insert_all - insert_all() remove(T) remove(&T) remove(T) words words words words_mut words_mut words_mut - overwrite overwrite merge union union - subtract subtract - intersect intersect iter iter iter ``` In general, when choosing names I went with: - names that are more obvious (e.g. `BitSet` over `IdxSet`). - names that are more like the Rust libraries (e.g. `T` over `C`, `insert` over `add`); - names that are more set-like (e.g. `union` over `merge`, `superset` over `contains_all`, `domain_size` over `num_bits`). Also, using `T` for index arguments seems more sensible than `&T` -- even though the latter is standard in Rust collection types -- because indices are always copyable. It also results in fewer `&` and `*` sigils in practice.
2018-09-14 15:07:25 +10:00
if self.contains(mpi) {
return Some(mpi);
}
let move_path = &move_data.move_paths[mpi];
if let Some(child) = move_path.first_child {
todo.push(child);
}
// After we've processed the original `mpi`, we should always
// traverse the siblings of any of its children.
if let Some(sibling) = move_path.next_sibling {
todo.push(sibling);
}
}
return None;
}
}