Turn type inhabitedness into a query to fix `exhaustive_patterns` perf
We measured in https://github.com/rust-lang/rust/pull/79394 that enabling the [`exhaustive_patterns` feature](https://github.com/rust-lang/rust/issues/51085) causes significant perf degradation. It was conjectured that the culprit is type inhabitedness checking, and [I hypothesized](https://github.com/rust-lang/rust/pull/79394#issuecomment-733861149) that turning this computation into a query would solve most of the problem.
This PR turns `tcx.is_ty_uninhabited_from` into a query, and I measured a 25% perf gain on the benchmark that stress-tests `exhaustiveness_patterns`. This more than compensates for the 30% perf hit I measured [when creating it](https://github.com/rust-lang/rustc-perf/pull/801). We'll have to measure enabling the feature again, but I suspect this fixes the perf regression entirely.
I'd like a perf run on this PR obviously.
I made small atomic commits to help reviewing. The first one is just me discovering the "revisions" feature of the testing framework.
I believe there's a push to move things out of `rustc_middle` because it's huge. I guess `inhabitedness/mod.rs` could be moved out, but it's quite small. `DefIdForest` might be movable somewhere too. I don't know what the policy is for that.
Ping `@camelid` since you were interested in following along
`@rustbot` modify labels: +A-exhaustiveness-checking
Since `DefIdForest` contains 0 or 1 elements the large majority of the
time, by allocating only in the >1 case we avoid almost all allocations,
compared to `Arc<SmallVec<[DefId;1]>>`. This shaves off 0.2% on the
benchmark that stresses uninhabitedness checking.
Make CTFE able to check for UB...
... by not doing any optimizations on the `const fn` MIR used in CTFE. This means we duplicate all `const fn`'s MIR now, once for CTFE, once for runtime. This PR is for checking the perf effect, so we have some data when talking about https://github.com/rust-lang/const-eval/blob/master/rfcs/0000-const-ub.md
To do this, we now have two queries for obtaining mir: `optimized_mir` and `mir_for_ctfe`. It is now illegal to invoke `optimized_mir` to obtain the MIR of a const/static item's initializer, an array length, an inline const expression or an enum discriminant initializer. For `const fn`, both `optimized_mir` and `mir_for_ctfe` work, the former returning the MIR that LLVM should use if the function is called at runtime. Similarly it is illegal to invoke `mir_for_ctfe` on regular functions.
This is all checked via appropriate assertions and I don't think it is easy to get wrong, as there should be no `mir_for_ctfe` calls outside the const evaluator or metadata encoding. Almost all rustc devs should keep using `optimized_mir` (or `instance_mir` for that matter).
Serialize incr comp structures to file via fixed-size buffer
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.
Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.
The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.
Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.
The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.
Access query (DepKind) metadata through fields
This refactors the access to query definition metadata (attributes such as eval always, anon, has_params) and loading/forcing functions to generate a number of structs, instead of matching on the DepKind enum. This makes access to the fields cheaper to compile. Using a struct means that finding the metadata for a given query is just an offset away; previously the match may have been compiled to a jump table but likely not completely inlined as we expect here.
A previous attempt explored a similar strategy, but using trait objects in #78314 that proved less effective, likely due to higher overheads due to forcing dynamic calls and poorer cache utilization (all metadata is fairly densely packed with this PR).
Add check for `[T;N]`/`usize` mismatch in astconv
Helps clarify the issue in #80506
by adding a specific check for mismatches between [T;N] and usize.
r? `@lcnr`
use PlaceRef more consistently instead of loosely coupled local+projection
Instead of working directly with the `projections` array, use `iter_projections` and `last_projection`. This avoids having to construct new `PlaceRef` from the pieces everywhere.
I only did this for a few files, to see how people think about this. If y'all are happy with this, I'll open an E-mentor issue to complete this. I grepped for `Place::ty_from` to find the places that need adjusting -- this could miss some, but I am not sure what else to grep for.
Add `#[track_caller]` to `bug!` and `register_renamed`
Before:
```
thread 'rustc' panicked at 'compiler/rustc_lint/src/context.rs:267:18: invalid lint renaming of broken_intra_doc_links to rustdoc::broken_intra_doc_links', compiler/rustc_middle/src/util/bug.rs:34:26
```
After:
```
thread 'rustc' panicked at 'src/librustdoc/core.rs:455:24: invalid lint renaming of broken_intra_doc_links to rustdoc::broken_intra_doc_links', compiler/rustc_middle/src/util/bug.rs:35:26
```
The reason I added it to `register_renamed` too is that any panic in
that function will be the caller's fault.
Before:
```
thread 'rustc' panicked at 'compiler/rustc_lint/src/context.rs:267:18: invalid lint renaming of broken_intra_doc_links to rustdoc::broken_intra_doc_links', compiler/rustc_middle/src/util/bug.rs:34:26
```
After:
```
thread 'rustc' panicked at 'src/librustdoc/core.rs:455:24: invalid lint renaming of broken_intra_doc_links to rustdoc::broken_intra_doc_links', compiler/rustc_middle/src/util/bug.rs:35:26
```
The reason I added it to `register_renamed` too is that any panic in
that function will be the caller's fault.