typeck aggregate rvalues in MIR type checker
This branch is an attempt to land content by @spastorino and @Nashenas88 that was initially landed on nll-master while we waited for https://github.com/rust-lang/rust/pull/45825 to land.
The biggest change it contains is that it extends the MIR type-checker to also type-check MIR aggregate rvalues (at least partially). Specifically, it checks that the operands provided for each field have the right type.
It does not yet check that their well-formedness predicates are met. That is https://github.com/rust-lang/rust/issues/45827. It also does not check other kinds of rvalues (that is https://github.com/rust-lang/rust/issues/45959). @spastorino is working on those issues now.
r? @arielb1
MIR-borrowck: Some minor fixes
- Remove parens when printing dereference (fix#45185)
- Change argument type of `autoderef` to `bool`
- Change argument type of `field_index` to `Field`
move closure kind, signature into `ClosureSubsts`
Instead of using side-tables, store the closure-kind and signature in the substitutions themselves. This has two key effects:
- It means that the closure's type changes as inference finds out more things, which is very nice.
- As a result, it avoids the need for the `freshen_closure_like` code (though we still use it for generators).
- It avoids cyclic closures calls.
- These were never meant to be supported, precisely because they make a lot of the fancy inference that we do much more complicated. However, due to an oversight, it was previously possible -- if challenging -- to create a setup where a closure *directly* called itself (see e.g. #21410).
We have to see what the effect of this change is, though. Needs a crater run. Marking as [WIP] until that has been assessed.
r? @arielb1
Fix typo in MIR "cannot move out of borrowed content"
I believe this all we need to change (#46018). Anyway, do let me know if there is anything else that needs to changed as well!
* The overflow-checking shift items need to take a full 128-bit type, since they need to be able to detect idiocy like `1i128 << (1u128 << 127)`
* The unchecked ones just take u32, like the `*_sh?` methods in core
* Because shift-by-anything is allowed, cast into a new local for every shift
Refactor type memory layouts and ABIs, to be more general and easier to optimize.
To combat combinatorial explosion, type layouts are now described through 3 orthogonal properties:
* `Variants` describes the plurality of sum types (where applicable)
* `Single` is for one inhabited/active variant, including all C `struct`s and `union`s
* `Tagged` has its variants discriminated by an integer tag, including C `enum`s
* `NicheFilling` uses otherwise-invalid values ("niches") for all but one of its inhabited variants
* `FieldPlacement` describes the number and memory offsets of fields (if any)
* `Union` has all its fields at offset `0`
* `Array` has offsets that are a multiple of its `stride`; guarantees all fields have one type
* `Arbitrary` records all the field offsets, which can be out-of-order
* `Abi` describes how values of the type should be passed around, including for FFI
* `Uninhabited` corresponds to no values, associated with unreachable control-flow
* `Scalar` is ABI-identical to its only integer/floating-point/pointer "scalar component"
* `ScalarPair` has two "scalar components", but only applies to the Rust ABI
* `Vector` is for SIMD vectors, typically `#[repr(simd)]` `struct`s in Rust
* `Aggregate` has arbitrary contents, including all non-transparent C `struct`s and `union`s
Size optimizations implemented so far:
* ignoring uninhabited variants (i.e. containing uninhabited fields), e.g.:
* `Option<!>` is 0 bytes
* `Result<T, !>` has the same size as `T`
* using arbitrary niches, not just `0`, to represent a data-less variant, e.g.:
* `Option<bool>`, `Option<Option<bool>>`, `Option<Ordering>` are all 1 byte
* `Option<char>` is 4 bytes
* using a range of niches to represent *multiple* data-less variants, e.g.:
* `enum E { A(bool), B, C, D }` is 1 byte
Code generation now takes advantage of `Scalar` and `ScalarPair` to, in more cases, pass around scalar components as immediates instead of indirectly, through pointers into temporary memory, while avoiding LLVM's "first-class aggregates", and there's more untapped potential here.
Closes#44426, fixes#5977, fixes#14540, fixes#43278.
MIR-borrowck: emit "`foo` does not live long enough" instead of borrow errors
Fixes#45360. As of writing, contains deduplication of existing errors.
r? @nikomatsakis
MIR: hide .rodata constants vs by-ref ABI clash in trans.
Back in #45380, constants were copied into locals during MIR creation to ensure that arguments ' memory can be used by the callee, if the constant is placed in `.rodata` and the ABI passes it by-ref.
However, there are several drawbacks (see https://github.com/rust-lang/rust/pull/45380#discussion_r150447709), most importantly the complication of constant propagation (UB if a constant ends up in `Call` arguments) and inconveniencing analyses.
Instead, I've modified the `rustc_trans` implementation of calls to copy an `Operand::Constant` argument locally if it's not immediate, and added a test that segfaults without the copy.
cc @dotdash @arielb1
incr.comp.: Implement query result cache and use it to cache type checking tables.
This is a spike implementation of caching more than LLVM IR and object files when doing incremental compilation. At the moment, only the `typeck_tables_of` query is cached but MIR and borrow-check will follow shortly. The feature is activated by running with `-Zincremental-queries` in addition to `-Zincremental`, it is not yet active by default.
r? @nikomatsakis
integrate MIR type-checker with NLL inference
This branch refactors NLL type inference so that it uses the MIR type-checker to gather constraints. Along the way, it also refactors how region constraints are gathered in the normal inference context mildly. The new setup is like this:
- What used to be `region_inference` is split into two parts:
- `region_constraints`, which just collects up sets of constraints
- `lexical_region_resolve`, which does the iterative, lexical region resolution
- When `resolve_regions_and_report_errors` is invoked, the inference engine converts the constraints into final values.
- In the MIR type checker, however, we do not invoke this method, but instead periodically take the region constraints and package them up for the NLL solver to use later.
- This allows us to track when and where those constraints were incurred.
- We also remove the central fulfillment context from the MIR type checker, instead instantiating new fulfillment contexts at each point. This allows us to capture the set of obligations that occurred at a particular point, and also to ensure that if the same obligation arises at two points, we will enforce the region constraints at both locations.
- The MIR type checker is also enhanced to instantiate late-bound-regions with fresh variables and handle a few other corner cases that arose.
- I also extracted some of the 'outlives' logic from the regionck, which will be needed later (see future work) to handle the type-outlives relationships.
One concern I have with this branch: since the MIR type checker is used even without the `-Znll` switch, I'm not sure if it will impact performance. One simple fix here would be to only enable the MIR type-checker if debug-assertions are enabled, since it just serves to validate the MIR. Longer term I hope to address this by improving the interface to the trait solver to be more query-based (ongoing work).
There is plenty of future work left. Here are two things that leap to mind:
- **Type-region outlives.** Currently, the NLL solver will ICE if it is required to handle a constraint like `T: 'a`. Fixing this will require a small amount of refactoring to extract the implied bounds code. I plan to follow a file-up bug on this (hopefully with mentoring instructions).
- **Testing.** It's a good idea to enumerate some of the tricky scenarios that need testing, but I think it'd be nice to try and parallelize some of the actual test writing (and resulting bug fixing):
- Same obligation occurring at two points.
- Well-formedness and trait obligations of various kinds (which are not all processed by the current MIR type-checker).
- More tests for how subtyping and region inferencing interact.
- More suggestions welcome!
r? @arielb1
We are heading towards deeper integration with the region inference
system in infcx; in particular, prior to the creation of the
`RegionInferenceContext`, it will be the "owner" of the set of region
variables.
check_unsafety: fix unused unsafe block duplication
The duplicate error message is later removed by error message
deduplication, but it still appears on beta and is still a bug.
r? @eddyb
always add an unreachable branch on matches to give more info to llvm
As part of https://github.com/djzin/rustc-optimization I discovered that some simple enum optimizations (src/unary/three_valued_enum.rs and src/unary/four_valued_enum.rs in the repo) are not applied - and the reason for this is that we erase the info that the discriminant of an enum is one of the options by putting the last one in an "otherwise" branch. This patch adds an extra branch so that LLVM can know what the possibilities are for the discriminant, which fixes the three- and four- valued cases.
Note that for whatever reason, this doesn't fix the case of 2 variants (most notably `Option` and `Result` have 2 variants) - a pass re-ordering might fix this or we may wish to add "assume" annotations on discriminants to force it to optimize.
MIR-borrowck: don't ICE for cannot move from array error
Closes#45694
compile-fail test E0508 now gives
```text
error[E0508]: cannot move out of type `[NonCopy; 1]`, a non-copy array (Ast)
--> .\src\test\compile-fail\E0508.rs:18:18
|
18 | let _value = array[0]; //[ast]~ ERROR E0508
| ^^^^^^^^
| |
| cannot move out of here
| help: consider using a reference instead: `&array[0]`
error[E0508]: cannot move out of type `[NonCopy; 1]`, a non-copy array (Mir)
--> .\src\test\compile-fail\E0508.rs:18:18
|
18 | let _value = array[0]; //[ast]~ ERROR E0508
| ^^^^^^^^ cannot move out of here
error: aborting due to 2 previous errors
```
MIR-borrowck: fix diagnostics for closures
Emit notes for captured variables in the same manner as AST borrowck.
```
error[E0499]: cannot borrow `x` as mutable more than once at a time (Ast)
--> $DIR/borrowck-closures-two-mut.rs:24:24
|
23 | let c1 = to_fn_mut(|| x = 4);
| -- - previous borrow occurs due to use of `x` in closure
| |
| first mutable borrow occurs here
24 | let c2 = to_fn_mut(|| x = 5); //~ ERROR cannot borrow `x` as mutable more than once
| ^^ - borrow occurs due to use of `x` in closure
| |
| second mutable borrow occurs here
25 | }
| - first borrow ends here
error[E0499]: cannot borrow `x` as mutable more than once at a time (Mir)
--> $DIR/borrowck-closures-two-mut.rs:24:24
|
23 | let c1 = to_fn_mut(|| x = 4);
| -- - previous borrow occurs due to use of `x` in closure
| |
| first mutable borrow occurs here
24 | let c2 = to_fn_mut(|| x = 5); //~ ERROR cannot borrow `x` as mutable more than once
| ^^ - borrow occurs due to use of `x` in closure
| |
| second mutable borrow occurs here
25 | }
| - first borrow ends here
```
Fixes#45362.
Fix MIR CopyPropagation errneously propagating assignments to function args
Compiling this code with MIR CopyPropagation activated will result in printing `5`,
because CopyProp errneously propagates the assignment of `5` to all `x`:
```rust
fn bar(mut x: u8) {
println!("{}", x);
x = 5;
}
fn main() {
bar(123);
}
```
If a local is propagated, it will result in an ICE at trans due to an use-before-def:
```rust
fn dummy(x: u8) -> u8 { x }
fn foo(mut x: u8) {
x = dummy(x); // this will assign a local to `x`
}
```
Currently CopyProp conservatively gives up if there are multiple assignments to a local,
but it is not took into account that arguments are already assigned from the beginning.
This PR fixes the problem by preventing propagation of assignments to function arguments.
extend NLL with preliminary support for free regions on functions
This PR extends https://github.com/rust-lang/rust/pull/45538 with support for free regions. This is pretty preliminary and will no doubt want to change in various ways, particularly as we add support for closures, but it's enough to get the basic idea in place:
- We now create specific regions to represent each named lifetime declared on the function.
- Region values can contain references to these regions (represented for now as a `BTreeSet<RegionIndex>`).
- If we wind up trying to infer that `'a: 'b` must hold, but no such relationship was declared, we report an error.
It also does a number of drive-by refactorings.
r? @arielb1
cc @spastorino
add TerminatorKind::FalseEdges and use it in matches
impl #45184 and fixes#45043 right way.
False edges unexpectedly affects uninitialized variables analysis in MIR borrowck.
The macro now takes a format string. It no longer defaults to using the
type name. Didn't seem worth going through contortions to maintain. I
also changed most of the debug formats to be `foo[N]` instead of `fooN`.
Also, factor out `do_mir_borrowck`, which is the code that actually
performs the MIR borrowck from within the scope of an inference context.
This change should be a pure refactoring.
This will be important in next commit, since the input types will be
tagged not with `'gcx` but rather `'tcx`. Also, using the region-erased,
lifted types enables better caching.
Initially MIR differentiated between arguments and locals, which
introduced a need to add extra copies assigning the argument to a
local, even for simple bindings. This differentiation no longer exists,
but we're still creating those copies, bloating the MIR and LLVM IR we
emit.
Additionally, the current approach means that we create debug info for
both the incoming argument (marking it as an argument), and then
immediately shadow it a local that goes by the same name. This can be
confusing when using e.g. "info args" in gdb, or when e.g. a debugger
with a GUI displays the function arguments separately from the local
variables, especially when the binding is mutable, because the argument
doesn't change, while the local variable does.
Mention Clone and refs in --explain E0382
I followed the discussion in #42446 and came up with these additions.
- Mention references before going into traits. They're probably more likely solutions.
- Mention `Clone` before `Copy`. Cloning has wider applicability and `#derive[Copy, Clone]` makes more sense after learning about `Clone`.
The language is not great, any suggestions there would be appreciated ✨