[MIR] Implement overflow checking
The initial set of changes is from @Aatch's #33255 PR, rebased on master, plus:
Added an `Assert` terminator to MIR, to simplify working with overflow and bounds checks.
With this terminator, error cases can be accounted for directly, instead of looking for lang item calls.
It also keeps the MIR slimmer, with no extra explicit blocks for the actual panic calls.
Warnings can be produced when the `Assert` is known to always panic at runtime, e.g.:
```rust
warning: index out of bounds: the len is 1 but the index is 3
--> <anon>:1:14
1 |> fn main() { &[std::io::stdout()][3]; }
|> ^^^^^^^^^^^^^^^^^^^^^^
```
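As a rough illustration of why a dedicated terminator makes such warnings easy to produce, here is a minimal sketch over a toy model. The `Operand` and `Terminator` types and the `always_panics` and `sample_body` helpers are hypothetical and do not reflect rustc's actual definitions.

```rust
// Toy model of a MIR-like terminator. These types are illustrative
// assumptions, not rustc's real definitions.
#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Const(bool),
    Local(usize),
}

#[allow(dead_code)]
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Goto { target: usize },
    // Continue to `target` if `cond == expected`; otherwise panic with `msg`.
    Assert { cond: Operand, expected: bool, msg: String, target: usize },
    Return,
}

// Collect the messages of assertions whose condition is a known constant
// that differs from `expected`, i.e. checks that always fail at runtime.
fn always_panics(terminators: &[Terminator]) -> Vec<String> {
    terminators
        .iter()
        .filter_map(|t| match t {
            Terminator::Assert { cond: Operand::Const(c), expected, msg, .. }
                if c != expected =>
            {
                Some(msg.clone())
            }
            _ => None,
        })
        .collect()
}

// Hypothetical body: one bounds check known to fail, one unknown until runtime.
fn sample_body() -> Vec<Terminator> {
    vec![
        Terminator::Assert {
            cond: Operand::Const(false),
            expected: true,
            msg: "index out of bounds: the len is 1 but the index is 3".to_string(),
            target: 1,
        },
        Terminator::Assert {
            cond: Operand::Local(0),
            expected: true,
            msg: "attempt to add with overflow".to_string(),
            target: 2,
        },
        Terminator::Goto { target: 3 },
        Terminator::Return,
    ]
}

fn main() {
    for msg in always_panics(&sample_body()) {
        println!("warning: {}", msg);
    }
}
```

Because the failure case is part of the terminator itself, the scan needs no knowledge of panic lang items or extra panic blocks.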
Generalized the `OperandValue::FatPtr` optimization to any aggregate pair of immediates.
This allows us to generate the same IR for overflow checks as old trans, not something worse.
For example, addition on `i16` calls `llvm.sadd.with.overflow.i16`, which returns `{i16, i1}`.
However, the Rust type `(i16, bool)` has to be `{i16, i8}`, because only an immediate `bool` is `i1`.
But if we split the pair into an `i16` and an `i1`, we can pass them around as such for free.
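The same value-plus-flag shape is visible at the library level. The sketch below uses the standard `overflowing_add` method, which returns `(i16, bool)`. The `checked_add_i16` wrapper is a hypothetical helper for illustration, not code from this PR.

```rust
// `overflowing_add` returns the wrapped result together with an overflow
// flag, the same shape as the `{i16, i1}` pair described above.
fn checked_add_i16(a: i16, b: i16) -> Result<i16, String> {
    let (value, overflowed): (i16, bool) = a.overflowing_add(b);
    if overflowed {
        Err(format!("attempt to add with overflow: {} + {}", a, b))
    } else {
        Ok(value)
    }
}

fn main() {
    println!("{:?}", checked_add_i16(1, 2));
    println!("{:?}", checked_add_i16(i16::MAX, 1));
    // In memory a `bool` occupies a whole byte, which is why the in-memory
    // form of the pair needs an `i8` slot rather than an `i1`.
    println!("size_of::<bool>() = {}", std::mem::size_of::<bool>());
}
```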
The latest addition is a rebase of #34054, updated to work for pairs too. Closes #34054, fixes #33873.
Last but not least, the `#[rustc_inherit_overflow_checks]` attribute was introduced to control the
overflow checking behavior of generic or `#[inline]` functions, when translated in another crate.
It is **not** intended to be used by crates other than `libcore`, which is in the unusual position of
being distributed as only an optimized build with no checks, even when used from debug mode.
Before MIR-based translation, this worked out fine, as the decision for overflow was made at
translation time, in the crate being compiled, but MIR stored in `rlib` has to contain the checks.
To avoid always generating the checks and slowing everything down, a decision was made to
use an attribute in the few spots of `libcore` that need it (see #33255 for previous discussion):
* `core::ops::{Add, Sub, Mul, Neg, Shl, Shr}` implementations for integers, which have `#[inline]` methods and can be used in generic abstractions from other crates
* `core::ops::{Add, Sub, Mul, Neg, Shl, Shr}Assign` same as above, for augmented assignment
* `pow` and `abs` methods on integers, which intentionally piggy-back on built-in multiplication and negation, respectively, to get overflow checks
* `core::iter::{Iterator, Chain, Peek}::count` and `core::iter::Enumerate::{next, nth}`, which are also documented as panicking on overflow; the overflow comes from the addition used to count the elements of an iterator in a `usize`
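
The overflow cases these checks guard against can be observed without depending on the build mode, using the standard `checked_*` methods. This is only a sketch of which inputs overflow. It does not use or demonstrate the `#[rustc_inherit_overflow_checks]` attribute itself.

```rust
fn main() {
    // Add, Neg and Shl each have inputs whose result does not fit.
    assert_eq!(i8::MAX.checked_add(1), None);
    assert_eq!(i8::MIN.checked_neg(), None);
    assert_eq!(1u32.checked_shl(32), None);
    // `pow` and `abs` overflow through multiplication and negation.
    assert_eq!(2i32.checked_pow(31), None);
    assert_eq!(i32::MIN.checked_abs(), None);
    // In-range inputs are unaffected.
    assert_eq!(2i32.checked_pow(30), Some(1 << 30));
    assert_eq!((-5i32).checked_abs(), Some(5));
    println!("all overflow cases behaved as expected");
}
```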
generate fewer basic blocks for variant switches
CC #33567
Adds a new field to TestKind::Switch that tracks the variants that are actually matched against. The other candidates target a common "otherwise" block.
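A minimal sketch of the idea, assuming a plain `bool` slice in place of whatever representation the real field uses. The `switch_targets` helper is hypothetical.

```rust
// Assign a target block index to each enum variant. `matched[i]` is true
// when some candidate actually tests variant `i`. Matched variants get
// their own block; all others share a single "otherwise" block.
// Returns (target per variant, total number of target blocks).
fn switch_targets(matched: &[bool]) -> (Vec<usize>, usize) {
    let mut next_block = 0;
    let mut otherwise: Option<usize> = None;
    let mut targets = Vec::with_capacity(matched.len());
    for &is_matched in matched {
        if is_matched {
            targets.push(next_block);
            next_block += 1;
        } else {
            // Allocate the shared "otherwise" block once, on first use.
            let block = match otherwise {
                Some(b) => b,
                None => {
                    let b = next_block;
                    next_block += 1;
                    otherwise = Some(b);
                    b
                }
            };
            targets.push(block);
        }
    }
    (targets, next_block)
}

fn main() {
    // Five variants, of which only the first and fourth are matched.
    let (targets, count) = switch_targets(&[true, false, false, true, false]);
    println!("targets = {:?}, blocks = {}", targets, count);
}
```

With one block per variant this switch would need five targets; sharing the "otherwise" block brings it down to three.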
This introduces a `DropAndReplace` terminator as a fix for #30380. That terminator
is supposed to be translated by desugaring during drop elaboration, which is
not implemented in this commit, so this breaks `-Z orbit` temporarily.
Fixes to mir dataflow
This collects a bunch of changes to `rustc_borrowck::borrowck::dataflow` (which others have pointed out should probably migrate to some crate that isn't tied to the borrow-checker -- but I have not attempted that here, especially since there are competing approaches to dataflow that we should also evaluate).
These changes:
1. Provide a family of related analyses: MovingOutStatements (which is what the old AST-based dataflow computed), as well as MaybeInitialized, MaybeUninitialized, and DefinitelyInitialized.
* (The last two are actually inverses of each other; we should pick one and drop the other.)
2. Fix bugs in the pre-existing analysis implementation, which was untested, so some obvious bugs went unnoticed. That brings us to the third point:
3. Add a unit test infrastructure for the MIR dataflow analysis.
* The tests work by adding a new intrinsic that is able to query the analysis state for a particular expression (technically, a particular L-value).
* See the examples in compile-fail/mir-dataflow/inits-1.rs and compile-fail/mir-dataflow/uninits-1.rs
* These tests only check the results for MaybeInitialized, MaybeUninitialized, and DefinitelyInitialized; I am not sure if it will be feasible to generalize this testing strategy to the MovingOutStatements dataflow operator.
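
To make the difference between the "maybe" and "definitely" analyses concrete, here is a toy forward dataflow solver. It is a sketch under simplifying assumptions: one bit per local in a `u8`, a hand-built CFG, and hypothetical names (`Block`, `Join`, `solve`, `diamond`) that do not correspond to the code in `rustc_borrowck`.

```rust
// Toy forward dataflow; each bit of a `u8` stands for one local.
struct Block {
    init: u8,          // locals initialized in this block
    moved: u8,         // locals moved out of (de-initialized) in this block
    preds: Vec<usize>, // predecessor block indices
}

#[derive(Clone, Copy)]
enum Join {
    Union,        // "maybe": true on at least one incoming path
    Intersection, // "definitely": true on every incoming path
}

// Iterate to a fixed point and return the exit state of every block.
fn solve(blocks: &[Block], join: Join) -> Vec<u8> {
    // Identity element of the join: empty set for union, full set for intersection.
    let start = match join {
        Join::Union => 0u8,
        Join::Intersection => !0u8,
    };
    let mut out = vec![start; blocks.len()];
    let mut changed = true;
    while changed {
        changed = false;
        for (i, b) in blocks.iter().enumerate() {
            // A block without predecessors is the entry: nothing is initialized yet.
            let entry = if b.preds.is_empty() {
                0
            } else {
                b.preds.iter().fold(start, |acc, &p| match join {
                    Join::Union => acc | out[p],
                    Join::Intersection => acc & out[p],
                })
            };
            let new_out = (entry & !b.moved) | b.init;
            if new_out != out[i] {
                out[i] = new_out;
                changed = true;
            }
        }
    }
    out
}

// Diamond CFG: bb0 initializes x (bit 0); bb1 initializes y (bit 1);
// bb2 moves out of x; bb3 joins bb1 and bb2.
fn diamond() -> Vec<Block> {
    vec![
        Block { init: 0b01, moved: 0, preds: vec![] },
        Block { init: 0b10, moved: 0, preds: vec![0] },
        Block { init: 0, moved: 0b01, preds: vec![0] },
        Block { init: 0, moved: 0, preds: vec![1, 2] },
    ]
}

fn main() {
    let cfg = diamond();
    println!("maybe-init at bb3:      {:#04b}", solve(&cfg, Join::Union)[3]);
    println!("definitely-init at bb3: {:#04b}", solve(&cfg, Join::Intersection)[3]);
}
```

At the join, both locals are maybe-initialized because each is initialized on some path, but neither is definitely initialized because each is missing on the other path.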
(The crucial thing these changes are working toward (but are not yet
in this commit) is a way to pretty-print MIR without having the
`NodeId` for that MIR in hand.)
Some simple improvements to MIR pretty printing
In short, this PR changes the MIR printer so that it:
* places an empty line between the MIR for each item
* does *not* write an empty line before the first BB when there are no
var decls
* aligns the "// Scope" comments 50 chars in (makes the output more
readable)
* prints the scope comments as "// scope N at ..." instead of "//
Scope(N) at ..."
* prints a prettier scope tree:
* no more unbalanced delimiters!
* no more "Parent" entry (these convey no useful information)
* drop the "Scope()" and just print scope IDs
* no braces when the scope is empty
In action: https://gist.github.com/jonas-schievink/1c11226cbb112892a9470ce0f9870b65
Only break critical edges where actually needed
Currently, to prepare for MIR trans, we break _all_ critical edges,
although we only actually need to do this for edges originating from a
call that gets translated to an invoke instruction in LLVM.
This has the unfortunate effect of undoing a bunch of the things that
SimplifyCfg has done. A particularly bad case arises when you have a
C-like enum with N variants and a derived PartialEq implementation.
In that case, the match on the (&lhs, &rhs) tuple gets translated into
nested matches with N arms each and a basic block each, resulting in N²
basic blocks. SimplifyCfg reduces that to roughly 2*N basic blocks, but
breaking the critical edges means that we go back to N².
In nickel.rs, there is such an enum with roughly N=800. So we get about
640K basic blocks or 2.5M lines of LLVM IR. LLVM takes a while to
reduce that to the final "disr_a == disr_b".
So before this patch, we had 2.5M lines of IR with 640K basic blocks,
which took about 3.6s in LLVM to get optimized and translated.
After this patch, we get about 650K lines with about 1.6K basic blocks
and spent a little less than 0.2s in LLVM.
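
For readers unfamiliar with the term, an edge is critical when its source block has more than one successor and its target block has more than one predecessor. The sketch below finds such edges in a toy CFG and then keeps only those leaving an invoke-like call. The adjacency-list representation and the `is_invoke` flags are assumptions made for illustration, not the data structures used by the actual pass.

```rust
// Toy CFG: `succs[b]` lists the successors of block `b`.
// An edge is critical when its source has several successors and its
// target has several predecessors.
fn critical_edges(succs: &[Vec<usize>]) -> Vec<(usize, usize)> {
    let mut pred_count = vec![0usize; succs.len()];
    for targets in succs {
        for &t in targets {
            pred_count[t] += 1;
        }
    }
    let mut edges = Vec::new();
    for (src, targets) in succs.iter().enumerate() {
        if targets.len() > 1 {
            for &t in targets {
                if pred_count[t] > 1 {
                    edges.push((src, t));
                }
            }
        }
    }
    edges
}

// Keep only the critical edges whose source ends in an invoke-like call.
fn edges_to_break(succs: &[Vec<usize>], is_invoke: &[bool]) -> Vec<(usize, usize)> {
    critical_edges(succs)
        .into_iter()
        .filter(|&(src, _)| is_invoke[src])
        .collect()
}

// bb0: switch -> bb1, bb3; bb1: invoke -> bb3 (return), bb2 (unwind).
fn sample_cfg() -> (Vec<Vec<usize>>, Vec<bool>) {
    let succs = vec![vec![1, 3], vec![3, 2], vec![], vec![]];
    let is_invoke = vec![false, true, false, false];
    (succs, is_invoke)
}

fn main() {
    let (succs, is_invoke) = sample_cfg();
    println!("all critical edges: {:?}", critical_edges(&succs));
    println!("edges to break:     {:?}", edges_to_break(&succs, &is_invoke));
    // Block counts quoted in the text for N = 800.
    let n: u64 = 800;
    println!("N^2 = {}, 2*N = {}", n * n, 2 * n);
}
```

In the sample CFG the switch edge `bb0 -> bb3` is critical but is left alone, since only the edge leaving the invoke needs to be split.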
cc #33111
r? @Aatch
mir: don't attempt to promote Unpromotable constant temps.
Fixes #33537. This was a non-problem in regular functions, but we also promote in `const fn`s.
There we always qualify temps so you can't depend on `Unpromotable` temps being `NOT_CONST`.
mir: drop temps outside-in by scheduling the drops inside-out.
It was backwards all along, but only noticeable with multiple drops in one rvalue scope. Fixes #32433.
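
As a language-level illustration of why the order matters when one statement owns several temporaries, the sketch below records the order in which two temporaries are dropped. It shows the general rule that temporaries are dropped in reverse order of creation at the end of the statement. It is not claimed to be the exact case from the linked issue, and `Noisy`, `touch` and `drop_order` are hypothetical names.

```rust
use std::cell::RefCell;
use std::rc::Rc;

// A value that records its name in a shared log when dropped.
struct Noisy {
    name: &'static str,
    log: Rc<RefCell<Vec<&'static str>>>,
}

impl Drop for Noisy {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

fn touch(_a: &Noisy, _b: &Noisy) {}

// Create two temporaries in one statement and report their drop order.
fn drop_order() -> Vec<&'static str> {
    let log = Rc::new(RefCell::new(Vec::new()));
    touch(
        &Noisy { name: "first", log: log.clone() },
        &Noisy { name: "second", log: log.clone() },
    );
    // Both temporaries were dropped at the end of the statement above.
    let result = log.borrow().clone();
    result
}

fn main() {
    println!("drop order: {:?}", drop_order());
}
```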