Migrate memory overlap check from validator to lint
The check attempts to identify potential undefined behaviour, rather
than whether MIR is well-formed. It belongs in the lint not validator.
Follow up to changes from #119077.
`Diagnostic` has 40 methods that return `&mut Self` and could be
considered setters. Four of them have a `set_` prefix. This doesn't seem
necessary for a type that implements the builder pattern. This commit
removes the `set_` prefixes on those four methods.
Make closures carry their own ClosureKind
Right now, we use the "`movability`" field of `hir::Closure` to distinguish a closure and a coroutine. This is paired together with the `CoroutineKind`, which is located not in the `hir::Closure`, but the `hir::Body`. This is strange and redundant.
This PR introduces `ClosureKind` with two variants -- `Closure` and `Coroutine`, which is put into `hir::Closure`. The `CoroutineKind` is thus removed from `hir::Body`, and `Option<Movability>` no longer needs to be a stand-in for "is this a closure or a coroutine".
r? eholk
Remove `DiagCtxt` API duplication
`DiagCtxt` defines the internal API for creating and emitting diagnostics: methods like `struct_err`, `struct_span_warn`, `note`, `create_fatal`, `emit_bug`. There are over 50 methods.
Some of these methods are then duplicated across several other types: `Session`, `ParseSess`, `Parser`, `ExtCtxt`, and `MirBorrowckCtxt`. `Session` duplicates the most, though half the ones it does are unused. Each duplicated method just calls forward to the corresponding method in `DiagCtxt`. So this duplication exists to (in the best case) shorten chains like `ecx.tcx.sess.parse_sess.dcx.emit_err()` to `ecx.emit_err()`.
This API duplication is ugly and has been bugging me for a while. And it's inconsistent: there's no real logic about which methods are duplicated, and the use of `#[rustc_lint_diagnostic]` and `#[track_caller]` attributes vary across the duplicates.
This PR removes the duplicated API methods and makes all diagnostic creation and emission go through `DiagCtxt`. It also adds `dcx` getter methods to several types to shorten chains. This approach scales *much* better than API duplication; indeed, the PR adds `dcx()` to numerous types that didn't have API duplication: `TyCtxt`, `LoweringCtxt`, `ConstCx`, `FnCtxt`, `TypeErrCtxt`, `InferCtxt`, `CrateLoader`, `CheckAttrVisitor`, and `Resolver`. These result in a lot of changes from `foo.tcx.sess.emit_err()` to `foo.dcx().emit_err()`. (You could do this with more types, but it gets into diminishing returns territory for types that don't emit many diagnostics.)
After all these changes, some call sites are more verbose, some are less verbose, and many are the same. The total number of lines is reduced, mostly because of the removed API duplication. And consistency is increased, because calls to `emit_err` and friends are always preceded with `.dcx()` or `.dcx`.
r? `@compiler-errors`
Clean up `check_consts` and misc fixes
1. Remove most of the logic around erroring with trait methods. I have kept the part resolving it to a concrete impl, as that is used for const stability checks.
2. Turning on `effects` causes ICE with generic args, due to `~const Tr` when `Tr` is not `#[const_trait]` tripping up expectation in code that handles generic args, more specifically here:
8681e077b8/compiler/rustc_hir_analysis/src/astconv/generics.rs (L377)
We set `arg_count.correct` to `Err` to correctly signal that an error has already been reported.
3. UI test blesses.
Edit(fmease): Fixes#117244 (UI test is in #119099 for now).
r? compiler-errors
Split coroutine desugaring kind from source
What a coroutine is desugared from (gen/async gen/async) should be separate from where it comes (fn/block/closure).
Separate MIR lints from validation
Add a MIR lint pass, enabled with -Zlint-mir, which identifies undefined or
likely erroneous behaviour.
The initial implementation mostly migrates existing checks of this nature from
MIR validator, where they did not belong (those checks have false positives and
there is nothing inherently invalid about MIR with undefined behaviour).
Fixes#104736Fixes#104843Fixes#116079Fixes#116736Fixes#118990
`IntoDiagnostic` defaults to `ErrorGuaranteed`, because errors are the
most common diagnostic level. It makes sense to do likewise for the
closely-related (and much more widely used) `DiagnosticBuilder` type,
letting us write `DiagnosticBuilder<'a, ErrorGuaranteed>` as just
`DiagnosticBuilder<'a>`. This cuts over 200 lines of code due to many
multi-line things becoming single line things.
And make all hand-written `IntoDiagnostic` impls generic, by using
`DiagnosticBuilder::new(dcx, level, ...)` instead of e.g.
`dcx.struct_err(...)`.
This means the `create_*` functions are the source of the error level.
This change will let us remove `struct_diagnostic`.
Note: `#[rustc_lint_diagnostics]` is added to `DiagnosticBuilder::new`,
it's necessary to pass diagnostics tests now that it's used in
`into_diagnostic` functions.
Currently, `emit_diagnostic` takes `&mut self`.
This commit changes it so `emit_diagnostic` takes `self` and the new
`emit_diagnostic_without_consuming` function takes `&mut self`.
I find the distinction useful. The former case is much more common, and
avoids a bunch of `mut` and `&mut` occurrences. We can also restrict the
latter with `pub(crate)` which is nice.
codegen: panic when trying to compute size/align of extern type
The alignment is also computed when accessing a field of extern type at non-zero offset, so we also panic in that case.
Previously `size_of_val` worked because the code path there assumed that "thin pointer" means "sized". But that's not true any more with extern types. The returned size and align are just blatantly wrong, so it seems better to panic than returning wrong results. We use a non-unwinding panic since code probably does not expect size_of_val to panic.
Renamings:
- find -> opt_hir_node
- get -> hir_node
- find_by_def_id -> opt_hir_node_by_def_id
- get_by_def_id -> hir_node_by_def_id
Fix rebase changes using removed methods
Use `tcx.hir_node_by_def_id()` whenever possible in compiler
Fix clippy errors
Fix compiler
Apply suggestions from code review
Co-authored-by: Vadim Petrochenkov <vadim.petrochenkov@gmail.com>
Add FIXME for `tcx.hir()` returned type about its removal
Simplify with with `tcx.hir_node_by_def_id`
guarantee that char and u32 are ABI-compatible
In https://github.com/rust-lang/rust/pull/116894 we added a guarantee that `char` has the same alignment as `u32`, but there is still one axis where these types could differ: function call ABI. So let's nail that down as well: in a function signature, `char` and `u32` are completely equivalent.
This is a new stable guarantee, so it will need t-lang approval.
remove redundant imports
detects redundant imports that can be eliminated.
for #117772 :
In order to facilitate review and modification, split the checking code and removing redundant imports code into two PR.
r? `@petrochenkov`
detects redundant imports that can be eliminated.
for #117772 :
In order to facilitate review and modification, split the checking code and
removing redundant imports code into two PR.
compile-time evaluation: detect writes through immutable pointers
This has two motivations:
- it unblocks https://github.com/rust-lang/rust/pull/116745 (and therefore takes a big step towards `const_mut_refs` stabilization), because we can now detect if the memory that we find in `const` can be interned as "immutable"
- it would detect the UB that was uncovered in https://github.com/rust-lang/rust/pull/117905, which was caused by accidental stabilization of `copy` functions in `const` that can only be called with UB
When UB is detected, we emit a future-compat warn-by-default lint. This is not a breaking change, so completely in line with [the const-UB RFC](https://rust-lang.github.io/rfcs/3016-const-ub.html), meaning we don't need t-lang FCP here. I made the lint immediately show up for dependencies since it is nearly impossible to even trigger this lint without `const_mut_refs` -- the accidentally stabilized `copy` functions are the only way this can happen, so the crates that popped up in #117905 are the only causes of such UB (in the code that crater covers), and the three cases of UB that we know about have all been fixed in their respective crates already.
The way this is implemented is by making use of the fact that our interpreter is already generic over the notion of provenance. For CTFE we now use the new `CtfeProvenance` type which is conceptually an `AllocId` plus a boolean `immutable` flag (but packed for a more efficient representation). This means we can mark a pointer as immutable when it is created as a shared reference. The flag will be propagated to all pointers derived from this one. We can then check the immutable flag on each write to reject writes through immutable pointers.
I just hope perf works out.
rustc: Harmonize `DefKind` and `DefPathData`
Follow up to https://github.com/rust-lang/rust/pull/118188.
`DefPathData::(ClosureExpr,ImplTrait)` are renamed to match `DefKind::(Closure,OpaqueTy)`.
`DefPathData::ImplTraitAssocTy` is replaced with `DefPathData::TypeNS(kw::Empty)` because both correspond to `DefKind::AssocTy`.
It's possible that introducing `(DefKind,DefPathData)::AssocOpaqueTy` instead could be a better solution, but that would be a much more invasive change.
Const generic parameters introduced for effects are moved from `DefPathData::TypeNS` to `DefPathData::ValueNS`, because constants are values.
`DefPathData` is no longer passed to `create_def` functions to avoid redundancy.
`DefPathData::(ClosureExpr,ImplTrait)` are renamed to match `DefKind::(Closure,OpaqueTy)`.
`DefPathData::ImplTraitAssocTy` is replaced with `DefPathData::TypeNS(kw::Empty)` because both correspond to `DefKind::AssocTy`.
It's possible that introducing `(DefKind,DefPathData)::AssocOpaqueTy` could be a better solution, but that would be a much more invasive change.
Const generic parameters introduced for effects are moved from `DefPathData::TypeNS` to `DefPathData::ValueNS`, because constants are values.
`DefPathData` is no longer passed to `create_def` functions to avoid redundancy.
explain a good reason for why LocalValue does not store the type of the local
As found out by `@lcnr` in https://github.com/rust-lang/rust/pull/112307, storing the type here can lead to subtle bugs when it gets out of sync with the MIR body. That's not the reason why the interpreter does it this way I think, but good thing we dodged that bullet. :)
Currently we always do this:
```
use rustc_fluent_macro::fluent_messages;
...
fluent_messages! { "./example.ftl" }
```
But there is no need, we can just do this everywhere:
```
rustc_fluent_macro::fluent_messages! { "./example.ftl" }
```
which is shorter.
The `fluent_messages!` macro produces uses of
`crate::{D,Subd}iagnosticMessage`, which means that every crate using
the macro must have this import:
```
use rustc_errors::{DiagnosticMessage, SubdiagnosticMessage};
```
This commit changes the macro to instead use
`rustc_errors::{D,Subd}iagnosticMessage`, which avoids the need for the
imports.
Miri: GC the dead_alloc_map too
dead_alloc_map is the last piece of state in the interpreter I can find that leaks. With this PR, all of the long-term memory growth I can find in Miri with programs that do things like run a big `loop {` or run property tests is attributable to some data structure properties in borrow tracking, and is _extremely_ slow.
My only gripe with the commit in this PR is that I don't have a new test for it. I'd like to have a regression test for this, but it would have to be statistical I think because the peak memory of a process that Linux reports is not exactly the same run-to-run. Which means it would have to not be very sensitive to slow leaks (some guesswork suggests for acceptable CI time we would be checking for like 10% memory growth over a minute or two, which is still pretty fast IMO).
Unless someone has a better idea for how to detect a regression, I think on balance I'm fine with manually keeping an eye on the memory use situation.
r? RalfJung
Expand Miri's BorTag GC to a Provenance GC
As suggested in https://github.com/rust-lang/miri/issues/3080#issuecomment-1732505573
We previously solved memory growth issues associated with the Stacked Borrows and Tree Borrows runtimes with a GC. But of course we also have state accumulation associated with whole allocations elsewhere in the interpreter, and this PR starts tackling those.
To do this, we expand the visitor for the GC so that it can visit a BorTag or an AllocId. Instead of collecting all live AllocIds into a single HashSet, we just collect from the Machine itself then go through an accessor `InterpCx::is_alloc_live` which checks a number of allocation data structures in the core interpreter. This avoids the overhead of all the inserts that collecting their keys would require.
r? ``@RalfJung``
interpret: simplify handling of shifts by no longer trying to handle signed and unsigned shift amounts in the same branch
While we're at it, also update comments in codegen and MIR building related to shifts, and fix the overflow error printed by Miri on negative shift amounts.
generator layout: ignore fake borrows
fixes#117059
We emit fake shallow borrows in case the scrutinee place uses a `Deref` and there is a match guard. This is necessary to prevent the match guard from mutating the scrutinee: fab1054e17/compiler/rustc_mir_build/src/build/matches/mod.rs (L1250-L1265)
These fake borrows end up impacting the generator witness computation in `mir_generator_witnesses`, which causes the issue in #117059. This PR now completely ignores fake borrows during this computation. This is sound as thse are always removed after analysis and the actual computation of the generator layout happens afterwards.
Only the second commit impacts behavior, and could be backported by itself.
r? types
patterns: reject raw pointers that are not just integers
Matching against `0 as *const i32` is fine, matching against `&42 as *const i32` is not.
This extends the existing check against function pointers and wide pointers: we now uniformly reject all these pointer types during valtree construction, and then later lint because of that. See [here](https://github.com/rust-lang/rust/pull/116930#issuecomment-1784654073) for some more explanation and context.
Also fixes https://github.com/rust-lang/rust/issues/116929.
Cc `@oli-obk` `@lcnr`
Most notably, this commit changes the `pub use crate::*;` in that file
to `use crate::*;`. This requires a lot of `use` items in other crates
to be adjusted, because everything defined within `rustc_span::*` was
also available via `rustc_span::source_map::*`, which is bizarre.
The commit also removes `SourceMap::span_to_relative_line_string`, which
is unused.
Do not assert in op_to_const.
`op_to_const` is used in `try_destructure_mir_constant_for_diagnostics`, which may encounter invalid constants created by optimizations and debugging.
r? ``@oli-obk``
Fixes https://github.com/rust-lang/rust/issues/117368
Avoid the path trimming ICE lint in error reporting
Types or really anything in MIR should never be formatted without path trimming disabled, because its formatting often tries to construct trimmed paths. In this case, the lint turns a nice error report into an irrelevant ICE.
Support enum variants in offset_of!
This MR implements support for navigating through enum variants in `offset_of!`, placing the enum variant name in the second argument to `offset_of!`. The RFC placed it in the first argument, but I think it interacts better with nested field access in the second, as you can then write things like
```rust
offset_of!(Type, field.Variant.field)
```
Alternatively, a syntactic distinction could be made between variants and fields (e.g. `field::Variant.field`) but I'm not convinced this would be helpful.
[RFC 3308 # Enum Support](https://rust-lang.github.io/rfcs/3308-offset_of.html#enum-support-offset_ofsomeenumstructvariant-field_on_variant)
Tracking Issue #106655.
share some track_caller logic between interpret and codegen
Also move the code that implements the track_caller intrinsics out of the core interpreter engine -- it's just a helper creating a const-allocation, doesn't need to be part of the interpreter core.
- Sort dependencies and features sections.
- Add `tidy` markers to the sorted sections so they stay sorted.
- Remove empty `[lib`] sections.
- Remove "See more keys..." comments.
Excluded files:
- rustc_codegen_{cranelift,gcc}, because they're external.
- rustc_lexer, because it has external use.
- stable_mir, because it has external use.
See through aggregates in GVN
This PR is extracted from https://github.com/rust-lang/rust/pull/111344
The first 2 commit are cleanups to avoid repeated work. I propose to stop removing useless assignments as part of this pass, and let a later `SimplifyLocals` do it. This makes tests easier to read (among others).
The next 3 commits add a constant folding mechanism to the GVN pass, presented in https://github.com/rust-lang/rust/pull/116012. ~This pass is designed to only use global allocations, to avoid any risk of accidental modification of the stored state.~
The following commits implement opportunistic simplifications, in particular:
- projections of aggregates: `MyStruct { x: a }.x` gets replaced by `a`, works with enums too;
- projections of arrays: `[a, b][0]` becomes `a`;
- projections of repeat expressions: `[a; N][x]` becomes `a`;
- transform arrays of equal operands into a repeat rvalue.
Fixes https://github.com/rust-lang/miri/issues/3090
r? `@oli-obk`
Make `ty::print::Printer` take `&mut self` instead of `self`
based on #116815
This simplifies the code by removing all the `self` assignments and
makes the flow of data clearer - always into the printer.
Especially in v0 mangling, which already used `&mut self` in some
places, it gets a lot more uniform.
This simplifies the code by removing all the `self` assignments and
makes the flow of data clearer - always into the printer.
Especially in v0 mangling, which already used `&mut self` in some
places, it gets a lot more uniform.
Implement rustc part of RFC 3127 trim-paths
This PR implements (or at least tries to) [RFC 3127 trim-paths](https://github.com/rust-lang/rust/issues/111540), the rustc part. That is `-Zremap-path-scope` with all of it's components/scopes.
`@rustbot` label: +F-trim-paths
Remove lots of generics from `ty::print`
All of these generics mostly resolve to the same thing, which means we can remove them, greatly simplifying the types involved in pretty printing and unlocking another simplification (that is not performed in this PR): Using `&mut self` instead of passing `self` through the return type.
cc `@eddyb` you probably know why it's like this, just checking in and making sure I didn't do anything bad
r? oli-obk
These are `Self` in almost all printers except one, which can just store
the state as a field instead. This simplifies the printer and allows for
further simplifications, for example using `&mut self` instead of
passing around the printer.
don't UB on dangling ptr deref, instead check inbounds on projections
This implements https://github.com/rust-lang/reference/pull/1387 in Miri. See that PR for what the change is about.
Detecting dangling references in `let x = &...;` is now done by validity checking only, so some tests need to have validity checking enabled. There is no longer inherently a "nodangle" check in evaluating the expression `&*ptr` (aside from the aliasing model).
r? `@oli-obk`
Based on:
- https://github.com/rust-lang/reference/pull/1387
- https://github.com/rust-lang/rust/pull/115524
interpret: clean up AllocBytes
Fixes https://github.com/rust-lang/miri/issues/2836
Nothing has moved here in half a year, so let's just remove these unused stubs -- they need a proper re-design anyway.
r? `@oli-obk`
It's a better name, and lets "active features" refer to the features
that are active in a particular program, due to being declared or
enabled by the edition.
The commit also renames `Features::enabled` as `Features::active` to
match this; I changed my mind and have decided that "active" is a little
better thatn "enabled" for this, particularly because a number of
pre-existing comments use "active" in this way.
Finally, the commit renames `Status::Stable` as `Status::Accepted`, to
match `ACCEPTED_FEATURES`.