NLL: improve inference with flow results, represent regions with bitsets, and more
This PR begins with a number of edits to the NLL code and then includes a large number of smaller refactorings (these refactorings ought not to change behavior). There are a lot of commits here, but each is individually simple. The goal is to land everything up to but not including the changes to how we handle closures, which are conceptually more complex.
The NLL specific changes are as follows (in order of appearance):
**Modify the region inferencer's approach to free regions.** Previously, for each free region (lifetime parameter) `'a`, it would compute the set of other free regions that `'a` outlives (e.g., if we have `where 'a: 'b`, then this set would be `{'a, 'b}`). Then it would mark those free regions as "constants" and report an error if inference tried to extend `'a` to include any other region (e.g., `'c`) that is not in that outlives set. In this way, the value of `'a` would never grow beyond the maximum that could type check. The new approach is to allow `'a` to grow larger. Then, after the fact, we check over the value of `'a` and see what other free regions it is required to outlive, and we check that those outlives relationships are justified by the where clauses in scope etc.
**Modify constraint generation to consider maybe-init.** When we have a "drop-live" variable `x` (i.e., a variable that will be dropped but will not be otherwise used), we now consider whether `x` is "maybe initialized" at that point. If not, then we know the drop is a no-op, and we can allow its regions to be dead. Due to limitations in the fragment code, this currently only works at the level of entire variables.
**Change representation of regions to use a `BitMatrix`.** We used to use a `BTreeSet`, which was rather silly. We now use a MxN matrix of bits, where `M` is the number of variables and `N` is the number of possible elements in each set (size of the CFG + number of free regions).
The remaining commits (starting from
extract the `implied_bounds` code into a helper function ") are all "no-op" refactorings, I believe.
~~One concern I have is with the commit "with -Zverbose, print all details of closure substs"; this commit seems to include some "internal" stuff in the mir-dump files, such as internal interner numbers, that I fear may vary by platform. Annoying. I guess we will see.~~ (I removed this commit.)
As for reviewer, @arielb1 has been reviewing the PRs, and they are certainly welcome to review this one too. But I figured it'd maybe be good to have more people taking a look and being familiar with this code, so I'll "nominate" @pnkfelix .
r? @pnkfelix
In particular, if we see a variable is DROP-LIVE, but it is not
MAYBE-INIT, then we can ignore the drop. This leavess attempt to use
more complex refinements of the idea (e.g., for subpaths or subfields)
to future work.
Rather than declaring some region variables to be constant, and
reporting errors when they would have to change, we instead populate
each free region X with a minimal set of points (the CFG plus end(X)),
and then we let inference do its thing. This may add other `end(Y)`
points into X; we can then check after the fact that indeed `X: Y`
holds.
This requires a bit of "blame" detection to find where the bad
constraint came from: we are currently using a pretty dumb
algorithm. Good place for later expansion.
Fix CopyPropagation regression (2)
Remaining part of MIR copyprop regression by (I think) #45380, which I missed in #45753.
```rust
fn foo(mut x: i32) -> i32 {
let y = x;
x = 123; // `x` is assigned only once in MIR, but cannot be propagated to `y`
y
}
```
So any assignment to an argument cannot be propagated.
create a drop ladder for an array if any value is moved out
r? @arielb1
first commit for fix https://github.com/rust-lang/rust/issues/34708 (note: this still handles the subslice case in a very broken manner)
This simplifies analysis and borrow-checking because liveness at the
resume point can always be simply propagated.
Later on, the "dead" Resumes are removed.
incr.comp.: Some preparatory work for caching more query results.
This PR
* adds and updates some encoding/decoding routines for various query result types so they can be cached later, and
* adds missing `[input]` annotations for a few `DepNode` variants.
The situation around having to explicitly mark dep-nodes/queries as inputs is not really satisfactory. I hope we can find a way of making this more fool-proof in the future.
r? @nikomatsakis
avoid type-live-for-region obligations on dummy nodes
Type-live-for-region obligations on DUMMY_NODE_ID cause an ICE, and it
turns out that in the few cases they are needed, these obligations are not
needed anyway because they are verified elsewhere.
Fixes#46069.
Beta-nominating because this is a regression for our new beta.
r? @nikomatsakis
To handle packed structs with destructors (which you'll think are a rare
case, but the `#[repr(packed)] struct Packed<T>(T);` pattern is
ever-popular, which requires handling packed structs with destructors to
avoid monomorphization-time errors), drops of subfields of packed
structs should drop a local move of the field instead of the original
one.
cc #27060 - this should deal with that issue after codegen of drop glue
is updated.
The new errors need to be changed to future-compatibility warnings, but
I'll rather do a crater run first with them as errors to assess the
impact.
Type-live-for-region obligations on DUMMY_NODE_ID cause an ICE, and it
turns out that in the few cases they are needed, these obligations are not
needed anyway because they are verified elsewhere.
Fixes#46069.
Add a MIR pass to lower 128-bit operators to lang item calls
Runs only with `-Z lower_128bit_ops` since it's not hooked into targets yet.
This isn't really useful on its own, but the declarations for the lang items need to be in the compiler before compiler-builtins can be updated to define them, so this is part 1 of at least 3.
cc https://github.com/rust-lang/rust/issues/45676 @est31 @nagisa
move closure kind, signature into `ClosureSubsts`
Instead of using side-tables, store the closure-kind and signature in the substitutions themselves. This has two key effects:
- It means that the closure's type changes as inference finds out more things, which is very nice.
- As a result, it avoids the need for the `freshen_closure_like` code (though we still use it for generators).
- It avoids cyclic closures calls.
- These were never meant to be supported, precisely because they make a lot of the fancy inference that we do much more complicated. However, due to an oversight, it was previously possible -- if challenging -- to create a setup where a closure *directly* called itself (see e.g. #21410).
We have to see what the effect of this change is, though. Needs a crater run. Marking as [WIP] until that has been assessed.
r? @arielb1
* The overflow-checking shift items need to take a full 128-bit type, since they need to be able to detect idiocy like `1i128 << (1u128 << 127)`
* The unchecked ones just take u32, like the `*_sh?` methods in core
* Because shift-by-anything is allowed, cast into a new local for every shift
Refactor type memory layouts and ABIs, to be more general and easier to optimize.
To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
* `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
* `Tagged` has its variants discriminated by an integer tag, including C `enum`s
* `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
* `Union` has all its fields at offset `0`
* `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
* `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
* `Uninhabited` corresponds to no values, associated with unreachable control-flow
* `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
* `ScalarPair` has two "scalar components", but only applies to the Rust ABI
* `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
* `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s
Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
* `Option<!>` is 0 bytes
* `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
* `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
* `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
* `enum E { A(bool), B, C, D }` is 1 byte
Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.
Closes#44426, fixes#5977, fixes#14540, fixes#43278.