Rename `adjustment::PointerCast` and variants using it to `PointerCoercion`
It makes it sounds like the `ExprKind` and `Rvalue` are supposed to represent all pointer related casts, when in reality their just used to share a little enum variants. Make it clear there these are only coercions and that people who see this and think "why are so many pointer related casts not in these variants" aren't insane.
This enum was added in #59987. I'm not sure whether the variant sharing is actually worth it, but this at least makes it less confusing.
r? oli-obk
It makes it sound like the `ExprKind` and `Rvalue` are supposed to represent all pointer related
casts, when in reality their just used to share a some enum variants. Make it clear there these
are only coercion to make it clear why only some pointer related "casts" are in the enum.
Move `TyCtxt::mk_x` to `Ty::new_x` where applicable
Part of rust-lang/compiler-team#616
turns out there's a lot of places we construct `Ty` this is a ridiculously huge PR :S
r? `@oli-obk`
- Switch TypeId to 128 bits
- Hack around the fact that tracing-subscriber dislikes how TypeId is hashed
- Remove lowering of type_id128 from rustc_codegen_llvm
- Remove unnecessary `type_id128` intrinsic (just change return type of `type_id`)
- Only hash the lower 64 bits of the TypeId
- Reword comment
Replace const eval limit by a lint and add an exponential backoff warning
The lint triggers at the first power of 2 that comes after 1 million function calls or traversed back-edges (takes less than a second on usual programs). After the first emission, an unsilenceable warning is repeated at every following power of 2 terminators, causing it to get reported less and less the longer the evaluation runs.
cc `@rust-lang/wg-const-eval`
fixes#93481closes#67217
When constant evaluation fails because its MIR is tainted by errors,
suppress note indicating that erroneous constant was used, since those
errors have to be fixed regardless of the constant being used or not.
Currently a `{D,Subd}iagnosticMessage` can be created from any type that
impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static,
str>`, which are reasonable. It also includes `&String`, which is pretty
weird, and results in many places making unnecessary allocations for
patterns like this:
```
self.fatal(&format!(...))
```
This creates a string with `format!`, takes a reference, passes the
reference to `fatal`, which does an `into()`, which clones the
reference, doing a second allocation. Two allocations for a single
string, bleh.
This commit changes the `From` impls so that you can only create a
`{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static,
str>`. This requires changing all the places that currently create one
from a `&String`. Most of these are of the `&format!(...)` form
described above; each one removes an unnecessary static `&`, plus an
allocation when executed. There are also a few places where the existing
use of `&String` was more reasonable; these now just use `clone()` at
the call site.
As well as making the code nicer and more efficient, this is a step
towards possibly using `Cow<'static, str>` in
`{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing
the `From<&'a str>` impls to `From<&'static str>`, which is doable, but
I'm not yet sure if it's worthwhile.
Don't validate constants in const propagation
Validation is neither necessary nor desirable.
The constant validation is already omitted at mir-opt-level >= 3, so there there are not changes in MIR test output (the propagation of invalid constants is covered by an existing test in tests/mir-opt/const_prop/invalid_constant.rs).
They're semantically the same, so this means the backends don't need to handle the intrinsic and means fewer MIR basic blocks in pointer arithmetic code.
Evaluate place expression in `PlaceMention`
https://github.com/rust-lang/rust/pull/102256 introduces a `PlaceMention(place)` MIR statement which keep trace of `let _ = place` statements from surface rust, but without semantics.
This PR proposes to change the behaviour of `let _ =` patterns with respect to the borrow-checker to verify that the bound place is live.
Specifically, consider this code:
```rust
let _ = {
let a = 5;
&a
};
```
This passes borrowck without error on stable. Meanwhile, replacing `_` by `_: _` or `_p` errors with "error[E0597]: `a` does not live long enough", [see playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c448d25a7c205dc95a0967fe96bccce8).
This PR *does not* change how `_` patterns behave with respect to initializedness: it remains ok to bind a moved-from place to `_`.
The relevant test is `tests/ui/borrowck/let_underscore_temporary.rs`. Crater check found no regression.
For consistency, this PR changes miri to evaluate the place found in `PlaceMention`, and report eventual dangling pointers found within it.
r? `@RalfJung`
Add offset_of! macro (RFC 3308)
Implements https://github.com/rust-lang/rfcs/pull/3308 (tracking issue #106655) by adding the built in macro `core::mem::offset_of`. Two of the future possibilities are also implemented:
* Nested field accesses (without array indexing)
* DST support (for `Sized` fields)
I wrote this a few months ago, before the RFC merged. Now that it's merged, I decided to rebase and finish it.
cc `@thomcc` (RFC author)
Encode hashes as bytes, not varint
In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.
Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`
Before:
```
( 1) 373418203 (53.7%, 53.7%): 1
( 2) 196240113 (28.2%, 81.9%): 3
( 3) 108157958 (15.6%, 97.5%): 2
( 4) 17213120 ( 2.5%, 99.9%): 4
( 5) 223614 ( 0.0%,100.0%): 9
( 6) 216262 ( 0.0%,100.0%): 10
( 7) 15447 ( 0.0%,100.0%): 5
( 8) 3633 ( 0.0%,100.0%): 19
( 9) 3030 ( 0.0%,100.0%): 8
( 10) 1167 ( 0.0%,100.0%): 18
( 11) 1032 ( 0.0%,100.0%): 7
( 12) 1003 ( 0.0%,100.0%): 6
( 13) 10 ( 0.0%,100.0%): 16
( 14) 10 ( 0.0%,100.0%): 17
( 15) 5 ( 0.0%,100.0%): 12
( 16) 4 ( 0.0%,100.0%): 14
```
After:
```
( 1) 372939136 (53.7%, 53.7%): 1
( 2) 196240140 (28.3%, 82.0%): 3
( 3) 108014969 (15.6%, 97.5%): 2
( 4) 17192375 ( 2.5%,100.0%): 4
( 5) 435 ( 0.0%,100.0%): 5
( 6) 83 ( 0.0%,100.0%): 18
( 7) 79 ( 0.0%,100.0%): 10
( 8) 50 ( 0.0%,100.0%): 9
( 9) 6 ( 0.0%,100.0%): 19
```
The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
Various minor Idx-related tweaks
Nothing particularly exciting here, but a couple of things I noticed as I was looking for more index conversions to simplify.
cc https://github.com/rust-lang/compiler-team/issues/606
r? `@WaffleLapkin`
Unify terminology used in unwind action and terminator, and reflect
the fact that a nounwind panic is triggered instead of an immediate
abort is triggered for this terminator.
And while doing the updates for that, also uses `FieldIdx` in `ProjectionKind::Field` and `TypeckResults::field_indices`.
There's more places that could use it (like `rustc_const_eval` and `LayoutS`), but I tried to keep this PR from exploding to *even more* places.
Part 2/? of https://github.com/rust-lang/compiler-team/issues/606
Since structs are always `VariantIdx(0)`, there's a bunch of files where the only reason they had `VariantIdx` or `vec::Idx` imported at all was to get the first variant.
So this uses a constant for that, and adds some doc-comments to `VariantIdx` while I'm there, since it doesn't have any today.
Updates `interpret`, `codegen_ssa`, and `codegen_cranelift` to consume the new cast instead of the intrinsic.
Includes `CastTransmute` for custom MIR building, to be able to test the extra UB.
Strengthen state tracking in const-prop
Some/many of the changes are replicated between both the const-prop lint and the const-prop optimization.
Behaviour changes:
- const-prop opt does not give a span to propagated values. This was useless as that span's primary purpose is to diagnose evaluation failure in codegen.
- we remove the `OnlyPropagateInto` mode. It was only used for function arguments, which are better modeled by a write before entry.
- the tracking of assignments and discriminants make clearer that we do nothing in `NoPropagation` mode or on indirect places.
Do not ICE when failing to normalize in ConstProp.
There is no reason to delay a bug there, as we bubble up the failure as TooGeneric.
Fixes https://github.com/rust-lang/rust/issues/97728
Support allocations with non-Box<[u8]> bytes
This is prep work for allowing miri to support passing pointers to C code, which will require `Allocation`s to be correctly aligned. Currently, it just makes `Allocation` generic and plumbs the necessary changes through the right places.
The follow-up to this will be adding a type in the miri interpreter which correctly aligns the bytes, using that for the Miri engine, then allowing Miri to pass pointers into these allocations to C calls.
Based off of #100467, credit to ```@emarteca``` for the code
Avoid invoking typeck from borrowck
This PR attempts to reduce direct dependencies between typeck and MIR-related queries. The goal is to have all the information transit either through THIR or through dedicated queries that avoid depending on the whole `TypeckResults`.
In a first commit, we store the type information that MIR building requires into THIR. This avoids edges between mir_built and typeck.
In the second and third commit, we wrap informations around closures (upvars, kind origin and user-provided signature) to avoid borrowck depending on typeck information.
There should be a single remaining borrowck -> typeck edge in the good path, due to inline consts.
Unify validity checks into a single query
Previously, there were two queries to check whether a type allows the 0x01 or zeroed bitpattern.
I am planning on adding a further initness to check in #100423, truly uninit for MaybeUninit, which would make this three queries. This seems overkill for such a small feature, so this PR unifies them into one.
I am not entirely happy with the naming and key type and open for improvements.
r? oli-obk
(This is a large commit. The changes to
`compiler/rustc_middle/src/ty/context.rs` are the most important ones.)
The current naming scheme is a mess, with a mix of `_intern_`, `intern_`
and `mk_` prefixes, with little consistency. In particular, in many
cases it's easy to use an iterator interner when a (preferable) slice
interner is available.
The guiding principles of the new naming system:
- No `_intern_` prefixes.
- The `intern_` prefix is for internal operations.
- The `mk_` prefix is for external operations.
- For cases where there is a slice interner and an iterator interner,
the former is `mk_foo` and the latter is `mk_foo_from_iter`.
Also, `slice_interners!` and `direct_interners!` can now be `pub` or
non-`pub`, which helps enforce the internal/external operations
division.
It's not perfect, but I think it's a clear improvement.
The following lists show everything that was renamed.
slice_interners
- const_list
- mk_const_list -> mk_const_list_from_iter
- intern_const_list -> mk_const_list
- substs
- mk_substs -> mk_substs_from_iter
- intern_substs -> mk_substs
- check_substs -> check_and_mk_substs (this is a weird one)
- canonical_var_infos
- intern_canonical_var_infos -> mk_canonical_var_infos
- poly_existential_predicates
- mk_poly_existential_predicates -> mk_poly_existential_predicates_from_iter
- intern_poly_existential_predicates -> mk_poly_existential_predicates
- _intern_poly_existential_predicates -> intern_poly_existential_predicates
- predicates
- mk_predicates -> mk_predicates_from_iter
- intern_predicates -> mk_predicates
- _intern_predicates -> intern_predicates
- projs
- intern_projs -> mk_projs
- place_elems
- mk_place_elems -> mk_place_elems_from_iter
- intern_place_elems -> mk_place_elems
- bound_variable_kinds
- mk_bound_variable_kinds -> mk_bound_variable_kinds_from_iter
- intern_bound_variable_kinds -> mk_bound_variable_kinds
direct_interners
- region
- intern_region (unchanged)
- const
- mk_const_internal -> intern_const
- const_allocation
- intern_const_alloc -> mk_const_alloc
- layout
- intern_layout -> mk_layout
- adt_def
- intern_adt_def -> mk_adt_def_from_data (unusual case, hard to avoid)
- alloc_adt_def(!) -> mk_adt_def
- external_constraints
- intern_external_constraints -> mk_external_constraints
Other
- type_list
- mk_type_list -> mk_type_list_from_iter
- intern_type_list -> mk_type_list
- tup
- mk_tup -> mk_tup_from_iter
- intern_tup -> mk_tup
Previously, there were two queries to check whether a type allows the
0x01 or zeroed bitpattern.
I am planning on adding a further initness to check, truly uninit for
MaybeUninit, which would make this three queries. This seems overkill
for such a small feature, so this PR unifies them into one.
Miri: basic dyn* support
As usual I am very unsure about the dynamic dispatch stuff, but it passes even the `Pin<&mut dyn* Trait>` test so that is something.
TBH I think it was a mistake to make `dyn Trait` and `dyn* Trait` part of the same `TyKind` variant. Almost everywhere in Miri this lead to the wrong default behavior, resulting in strange ICEs instead of nice "unimplemented" messages. The two types describe pretty different runtime data layout after all.
Strangely I did not need to do the equivalent of [this diff](https://github.com/rust-lang/rust/pull/106532#discussion_r1087095963) in Miri. Maybe that is because the unsizing logic matches on `ty::Dynamic(.., ty::Dyn)` already? In `unsized_info` I don't think the `target_dyn_kind` can be `DynStar`, since then it wouldn't be unsized!
r? `@oli-obk` Cc `@eholk` (dyn-star) https://github.com/rust-lang/rust/issues/102425
There are several `mk_foo`/`intern_foo` pairs, where the former takes an
iterator and the latter takes a slice. (This naming convention is bad,
but that's a fix for another PR.)
This commit changes several `mk_foo` occurrences into `intern_foo`,
avoiding the need for some `.iter()`/`.into_iter()` calls. Affected
cases:
- mk_type_list
- mk_tup
- mk_substs
- mk_const_list
Don't ICE in `might_permit_raw_init` if reference is polymorphic
Emitting optimized MIR for a polymorphic function may require computing layout of a type that isn't (yet) known. This happens in the instcombine pass, for example. Let's fail gracefully in that condition.
cc `@saethlin`
fixes#107999
interpret: rename Pointer::from_addr → from_addr_invalid
This function corresponds to `ptr::invalid` in the standard library; the previous name was not clear enough IMO.
Rename `PointerSized` to `PointerLike`
The old name was unnecessarily vague. This PR renames a nightly language feature that I added, so I don't think it needs any additional approval, though anyone can feel free to speak up if you dislike the rename.
It's still unsatisfying that we don't the user which of {size, alignment} is wrong, but this trait really is just a stepping stone for a more generalized mechanism to create `dyn*`, just meant for nightly testing, so I don't think it really deserves additional diagnostic machinery for now.
Fixes#107696, cc ``@RalfJung``
r? ``@eholk``
Bump bootstrap compiler to 1.68
This also changes our stage0.json to include the rustc component for the rustfmt pinned nightly toolchain, which is currently necessary due to rustfmt dynamically linking to that toolchain's librustc_driver and libstd.
r? `@pietroalbini`
Use stable metric for const eval limit instead of current terminator-based logic
This patch adds a `MirPass` that inserts a new MIR instruction `ConstEvalCounter` to any loops and function calls in the CFG. This instruction is used during Const Eval to count against the `const_eval_limit`, and emit the `StepLimitReached` error, replacing the current logic which uses Terminators only.
The new method of counting loops and function calls should be more stable across compiler versions (i.e., not cause crates that compiled successfully before, to no longer compile when changes to the MIR generation/optimization are made).
Also see: #103877
Rollup of 11 pull requests
Successful merges:
- #106407 (Improve proc macro attribute diagnostics)
- #106960 (Teach parser to understand fake anonymous enum syntax)
- #107085 (Custom MIR: Support binary and unary operations)
- #107086 (Print PID holding bootstrap build lock on Linux)
- #107175 (Fix escaping inference var ICE in `point_at_expr_source_of_inferred_type`)
- #107204 (suggest qualifying bare associated constants)
- #107248 (abi: add AddressSpace field to Primitive::Pointer )
- #107272 (Implement ObjectSafe and WF in the new solver)
- #107285 (Implement `Generator` and `Future` in the new solver)
- #107286 (ICE in new solver if we see an inference variable)
- #107313 (Add Style Team Triagebot config)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
InstCombine away intrinsic validity assertions
This optimization (currently) fires 246 times on the standard library. It seems to fire hardly at all on the big crates in the benchmark suite. Interesting.
- Remove logic that limits const eval based on terminators, and use the
stable metric instead (back edges + fn calls)
- Add unstable flag `tiny-const-eval-limit` to add UI tests that do not
have to go up to the regular 2M step limit
This patch adds a `MirPass` that tracks the number of back-edges and
function calls in the CFG, adds a new MIR instruction to increment a
counter every time they are encountered during Const Eval, and emit a
warning if a configured limit is breached.
...and remove it from `PointeeInfo`, which isn't meant for this.
There are still various places (marked with FIXMEs) that assume all pointers
have the same size and alignment. Fixing this requires parsing non-default
address spaces in the data layout string, which will be done in a followup.
Switching them to `Break(())` and `Continue(())` instead.
libs-api would like to remove these constants, so stop using them in compiler to make the removal PR later smaller.