`replace_bound_vars` fast path: check predicates, don't check consts
split out from #98900
`ty::Const` doesn't have precomputed type flags, so
computing `has_vars_bound_at_or_above` for constants
requires us to visit the const and its contained types
and constants. A noop fold should be pretty much equally as
fast so removing it prevents us from walking the constant twice
in case it contains bound vars.
r? `@jackh726`
`arena > Rc` for query results
The `Rc`s have to live for the whole duration as their count cannot go below 1 while stored as part of the query results.
By storing them in an arena we should save a bit of memory because we don't have as many independent allocations and also don't have to clone the `Rc` anymore.
Use constant eval to do strict mem::uninit/zeroed validity checks
I'm not sure about the code organisation here, I just dumped the check in rustc_const_eval at the root. Not hard to move it elsewhere, in any case.
Also, this means cranelift codegen intrinsics lose the strict checks, since they don't seem to depend on rustc_const_eval, and I didn't see a point in keeping around two copies.
I also left comments in the is_zero_valid methods about "uhhh help how do i do this", those apply to both methods equally.
Also rustc_codegen_ssa now depends on rustc_const_eval... is this okay?
Pinging `@RalfJung` since you were the one who mentioned this to me, so I'm assuming you're interested.
Haven't had a chance to run full tests on this since it's really warm, and it's 1AM, I'll check out any failures/comments in the morning :)
Revert "Highlight conflicting param-env candidates"
This reverts #98794, commit 08135254dc.
Seems to have caused an incremental compilation bug. The root cause of the incr comp bug is somewhat unrelated but is triggered by this PR, so I don't feel comfortable with having this PR in the codebase until it can be investigated further. Fixes#99233.
Allow destructuring opaque types in their defining scopes
fixes#96572
Before this PR, the following code snippet failed with an incomprehensible error, and similar code just ICEd in mir borrowck.
```rust
type T = impl Copy;
let foo: T = (1u32, 2u32);
let (a, b) = foo;
```
The problem was that the last line created MIR projections of the form `foo.0` and `foo.1`, but `foo`'s type is `T`, which doesn't have fields (only its hidden type does). But the pattern supplies enough type information (a tuple of two different inference types) to bind a hidden type.
Remove some usages of `guess_head_span`
No need to pass things through `guess_head_span` if they already point to the head span.
Only major change is that we point to the head span of `enum`s on some errors now, which I prefer.
r? `@cjgillot`
interpret: get rid of MemPlaceMeta::Poison
This is achieved by refactoring the projection code (`{mplace,place,operand}_{downcast,field,index,...}`) so that we no longer need to call `assert_mem_place` in the operand handling.
Move abstract const to middle
Moves AbstractConst (and all associated methods) to rustc middle for use in `rustc_infer`.
This allows for const resolution in infer to use abstract consts to walk consts and check if
they are resolvable.
This attempts to resolve the issue where `Foo<{ concrete const }, generic T>` is incorrectly marked as conflicting, and is independent from the other issue where nested abstract consts must be resolved.
r? `@lcnr`
`ty::Const` doesn't have precomputed type flags, so
computing `has_vars_bound_at_or_above` for constants
requires us to visit the const and its contained types
and constants. A noop fold should be pretty much equally as
fast so removing it prevents us from walking the constant twice
in case it contains bound vars.
Implement `for<>` lifetime binder for closures
This PR implements RFC 3216 ([TI](https://github.com/rust-lang/rust/issues/97362)) and allows code like the following:
```rust
let _f = for<'a, 'b> |a: &'a A, b: &'b B| -> &'b C { b.c(a) };
// ^^^^^^^^^^^--- new!
```
cc ``@Aaron1011`` ``@cjgillot``
Pull Derefer before ElaborateDrops
_Follow up work to #97025#96549#96116#95887 #95649_
This moves `Derefer` before `ElaborateDrops` and creates a new `Rvalue` called `VirtualRef` that allows us to bypass many constraints for `DerefTemp`.
r? `@oli-obk`
Lower let-else in MIR
This MR will switch to lower let-else statements in MIR building instead.
To lower let-else in MIR, we build a mini-switch two branches. One branch leads to the matching case, and the other leads to the `else` block. This arrangement will allow temporary lifetime analysis running as-is so that the temporaries are properly extended according to the same rule applied to regular `let` statements.
cc https://github.com/rust-lang/rust/issues/87335Fix#98672
don't allow ZST in ScalarInt
There are several indications that we should not ZST as a ScalarInt:
- We had two ways to have ZST valtrees, either an empty `Branch` or a `Leaf` with a ZST in it.
`ValTree::zst()` used the former, but the latter could possibly arise as well.
- Likewise, the interpreter had `Immediate::Uninit` and `Immediate::Scalar(Scalar::ZST)`.
- LLVM codegen already had to special-case ZST ScalarInt.
So I propose we stop using ScalarInt to represent ZST (which are clearly not integers). Instead, we can add new ZST variants to those types that did not have other variants which could be used for this purpose.
Based on https://github.com/rust-lang/rust/pull/98831. Only the commits starting from "don't allow ZST in ScalarInt" are new.
r? `@oli-obk`
There are several indications that we should not ZST as a ScalarInt:
- We had two ways to have ZST valtrees, either an empty `Branch` or a `Leaf` with a ZST in it.
`ValTree::zst()` used the former, but the latter could possibly arise as well.
- Likewise, the interpreter had `Immediate::Uninit` and `Immediate::Scalar(Scalar::ZST)`.
- LLVM codegen already had to special-case ZST ScalarInt.
So instead add new ZST variants to those types that did not have other variants
which could be used for this purpose.
Clarify MIR semantics of storage statements
Seems worthwhile to start closing out some of the less controversial open questions about MIR semantics. Hopefully this is fairly non-controversial - it's what we implement already, and I see no reason to do anything more restrictive. cc ``@tmiasko`` who commented on this when it was discussed in the original PR that added these docs.
Miscellaneous inlining improvements
Add `#[inline]` to a few trivial non-generic methods from a perf report
that otherwise wouldn't be candidates for inlining.
don't succeed `evaluate_obligation` query if new opaque types were registered
fixes#98608fixes#98604
The root cause of all this is that in type flag computation we entirely ignore nongeneric things like struct fields and the signature of function items. So if a flag had to be set for a struct if it is set for a field, that will only happen if the field is generic, as only the generic parameters are checked.
I now believe we cannot use type flags to handle opaque types. They seem like the wrong tool for this.
Instead, this PR replaces the previous logic by adding a new variant of `EvaluatedToOk`: `EvaluatedToOkModuloOpaqueTypes`, which says that there were some opaque types that got hidden types bound, but that binding may not have been legal (because we don't know if the opaque type was in its defining scope or not).
Highlight conflicting param-env candidates
This could probably be further improved by noting _why_ equivalent param-env candidates (modulo regions) leads to ambiguity.
Fixes#98786
Rollup of 9 pull requests
Successful merges:
- #97917 (Implement ExitCodeExt for Windows)
- #98844 (Reword comments and rename HIR visiting methods.)
- #98979 (interpret: use AllocRange in UninitByteAccess)
- #98986 (Fix missing word in comment)
- #98994 (replace process exit with more detailed exit in src/bootstrap/*.rs)
- #98995 (Add a test for #80471)
- #99002 (suggest adding a derive for #[default] applied to variants)
- #99004 (Add a test for #70408)
- #99017 (Replace boolean argument for print_where_clause with an enum to make code more clear)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
interpret: use AllocRange in UninitByteAccess
also use nice new format string syntax in `interpret/error.rs`, and use the `#` flag to add `0x` prefixes where applicable.
r? ``@oli-obk``
Make lowering a query
Split from https://github.com/rust-lang/rust/pull/88186.
This PR refactors the relationship between lowering and the resolver outputs in order to make lowering itself a query.
In a first part, lowering is changed to avoid modifying resolver outputs, by maintaining its own data structures for creating new `NodeId`s and so.
Then, the `TyCtxt` is modified to allow creating new `LocalDefId`s from inside it. This is done by:
- enclosing `Definitions` in a lock, so as to allow modification;
- creating a query `register_def` whose purpose is to declare a `LocalDefId` to the query system.
See `TyCtxt::create_def` and `TyCtxt::iter_local_def_id` for more detailed explanations of the design.
This makes it possible to mutably borrow different fields of the MIR
body without resorting to methods like `basic_blocks_local_decls_mut_and_var_debug_info`.
To preserve validity of control flow graph caches in the presence of
modifications, a new struct `BasicBlocks` wraps together basic blocks
and control flow graph caches.
The `BasicBlocks` dereferences to `IndexVec<BasicBlock, BasicBlockData>`.
On the other hand a mutable access requires explicit `as_mut()` call.
Do not fetch HIR to compute variances.
Everything can be done using higher-level queries. This simplifies the code, and should allow better incremental caching.
macros: `LintDiagnostic` derive
- Move `LintDiagnosticBuilder` into `rustc_errors` so that a diagnostic derive can refer to it.
- Introduce a `DecorateLint` trait, which is equivalent to `SessionDiagnostic` or `AddToDiagnostic` but for lints. Necessary without making more changes to the lint infrastructure as `DecorateLint` takes a `LintDiagnosticBuilder` and re-uses all of the existing logic for determining what type of diagnostic a lint should be emitted as (e.g. error/warning).
- Various refactorings of the diagnostic derive machinery (extracting `build_field_mapping` helper and moving `sess` field out of the `DiagnosticDeriveBuilder`).
- Introduce a `LintDiagnostic` derive macro that works almost exactly like the `SessionDiagnostic` derive macro except that it derives a `DecorateLint` implementation instead. A new derive is necessary for this because `SessionDiagnostic` is intended for when the generated code creates the diagnostic. `AddToDiagnostic` could have been used but it would have required more changes to the lint machinery.
~~At time of opening this pull request, ignore all of the commits from #98624, it's just the last few commits that are new.~~
r? `@oli-obk`
Fix repr(align) enum handling
`enum`, for better or worse, supports `repr(align)`. That has already caused a bug in https://github.com/rust-lang/rust/issues/92464, which was "fixed" in https://github.com/rust-lang/rust/pull/92932, but it turns out that that fix is wrong and caused https://github.com/rust-lang/rust/issues/96185.
So this reverts #92932 (which fixes#96185), and attempts another strategy for fixing #92464: special-case enums when doing a cast, re-using the code to load the discriminant rather than assuming that the enum has scalar layout. This works fine for the interpreter.
However, #92464 contained another testcase that was previously not in the test suite -- and after adding it, it ICEs again. This is not surprising; codegen needs the same patch that I did in the interpreter. Probably this has to happen [around here](d32ce37a17/compiler/rustc_codegen_ssa/src/mir/rvalue.rs (L276)). Unfortunately I don't know how to do that -- the interpreter can load a discriminant from an operand, but codegen can only do that from a place. `@oli-obk` `@eddyb` `@bjorn3` any idea?
lints: mostly translatable diagnostics
As lints are created slightly differently than other diagnostics, intended to try make them translatable first and then look into the applicability of diagnostic structs but ended up just making most of the diagnostics in the crate translatable (which will still be useful if I do make a lot of them structs later anyway).
r? ``@compiler-errors``
Change enum->int casts to not go through MIR casts.
follow-up to https://github.com/rust-lang/rust/pull/96814
this simplifies all backends and even gives LLVM more information about the return value of `Rvalue::Discriminant`, enabling optimizations in more cases.
Interpret: AllocRange Debug impl, and use it more consistently
The two commits are pretty independent but it did not seem worth having two PRs for them.
r? ``@oli-obk``
Add method to mutate MIR body without invalidating CFG caches.
In addition to adding this method, a handful of passes are updated to use it. There's still quite a few passes that could in principle make use of this as well, but do not at the moment because they use `VisitorMut` or `MirPatch`, which needs additional support for this.
The method name is slightly unwieldy, but I don't expect anyone to be writing it a lot, and at least it says what it does. If anyone has a suggestion for a better name though, would be happy to rename.
r? rust-lang/mir-opt
CTFE interning: don't walk allocations that don't need it
The interning of const allocations visits the mplace looking for references to intern. Walking big aggregates like big static arrays can be costly, so we only do it if the allocation we're interning contains references or interior mutability.
Walking ZSTs was avoided before, and this optimization is now applied to cases where there are no references/relocations either.
---
While initially looking at this in the context of #93215, I've been testing with smaller allocations than the 16GB one in that issue, and with different init/uninit patterns (esp. via padding).
In that example, by default, `eval_to_allocation_raw` is the heaviest query followed by `incr_comp_serialize_result_cache`. So I'll show numbers when incremental compilation is disabled, to focus on the const allocations themselves at 95% of the compilation time, at bigger array sizes on these minimal examples like `static ARRAY: [u64; LEN] = [0; LEN];`.
That is a close construction to parts of the `ctfe-stress-test-5` benchmark, which has const allocations in the megabytes, while most crates usually have way smaller ones. This PR will have the most impact in these situations, as the walk during the interning starts to dominate the runtime.
Unicode crates (some of which are present in our benchmarks) like `ucd`, `encoding_rs`, etc come to mind as having bigger than usual allocations as well, because of big tables of code points (in the hundreds of KB, so still an order of magnitude or 2 less than the stress test).
In a check build, for a single static array shown above, from 100 to 10^9 u64s (for lengths in powers of ten), the constant factors are lowered:
(log scales for easier comparisons)
![plot_log](https://user-images.githubusercontent.com/247183/171422958-16f1ea19-3ed4-4643-812c-1c7c60a97e19.png)
(linear scale for absolute diff at higher Ns)
![plot_linear](https://user-images.githubusercontent.com/247183/171401886-2a869a4d-5cd5-47d3-9a5f-8ce34b7a6917.png)
For one of the alternatives of that issue
```rust
const ROWS: usize = 100_000;
const COLS: usize = 10_000;
static TWODARRAY: [[u128; COLS]; ROWS] = [[0; COLS]; ROWS];
```
we can see a similar reduction of around 3x (from 38s to 12s or so).
For the same size, the slowest case IIRC is when there are uninitialized bytes e.g. via padding
```rust
const ROWS: usize = 100_000;
const COLS: usize = 10_000;
static TWODARRAY: [[(u64, u8); COLS]; ROWS] = [[(0, 0); COLS]; ROWS];
```
then interning/walking does not dominate anymore (but means there is likely still some interesting work left to do here).
Compile times in this case rise up quite a bit, and avoiding interning walks has less impact: around 23%, from 730s on master to 568s with this PR.
Fix FFI-unwind unsoundness with mixed panic mode
UB maybe introduced when an FFI exception happens in a `C-unwind` foreign function and it propagates through a crate compiled with `-C panic=unwind` into a crate compiled with `-C panic=abort` (#96926).
To prevent this unsoundness from happening, we will disallow a crate compiled with `-C panic=unwind` to be linked into `panic-abort` *if* it contains a call to `C-unwind` foreign function or function pointer. If no such call exists, then we continue to allow such mixed panic mode linking because it's sound (and stable). In fact we still need the ability to do mixed panic mode linking for std, because we only compile std once with `-C panic=unwind` and link it regardless panic strategy.
For libraries that wish to remain compile-once-and-linkable-to-both-panic-runtimes, a `ffi_unwind_calls` lint is added (gated under `c_unwind` feature gate) to flag any FFI unwind calls that will cause the linkable panic runtime be restricted.
In summary:
```rust
#![warn(ffi_unwind_calls)]
mod foo {
#[no_mangle]
pub extern "C-unwind" fn foo() {}
}
extern "C-unwind" {
fn foo();
}
fn main() {
// Call to Rust function is fine regardless ABI.
foo::foo();
// Call to foreign function, will cause the crate to be unlinkable to panic-abort if compiled with `-Cpanic=unwind`.
unsafe { foo(); }
//~^ WARNING call to foreign function with FFI-unwind ABI
let ptr: extern "C-unwind" fn() = foo::foo;
// Call to function pointer, will cause the crate to be unlinkable to panic-abort if compiled with `-Cpanic=unwind`.
ptr();
//~^ WARNING call to function pointer with FFI-unwind ABI
}
```
Fix#96926
`@rustbot` label: T-compiler F-c_unwind