Use `load`+`store` instead of `memcpy` for small integer arrays
I was inspired by #98892 to see whether, rather than making `mem::swap` do something smart in the library, we could update MIR assignments like `*_1 = *_2` to do something smarter than `memcpy` for sufficiently-small types that doing it inline is going to be better than a `memcpy` call in assembly anyway. After all, special code may help `mem::swap`, but if the "obvious" MIR can just result in the correct thing that helps everything -- other code like `mem::replace`, people doing it manually, and just passing around by value in general -- as well as makes MIR inlining happier since it doesn't need to deal with all the complicated library code if it just sees a couple assignments.
LLVM will turn the short, known-length `memcpy`s into direct instructions in the backend, but that's too late for it to be able to remove `alloca`s. In general, replacing `memcpy`s with typed instructions is hard in the middle-end -- even for `memcpy.inline` where it knows it won't be a function call -- is hard [due to poison propagation issues](https://rust-lang.zulipchat.com/#narrow/stream/187780-t-compiler.2Fwg-llvm/topic/memcpy.20vs.20load-store.20for.20MIR.20assignments/near/360376712). So because we know more about the type invariants -- these are typed copies -- rustc can emit something more specific, allowing LLVM to `mem2reg` away the `alloca`s in some situations.
#52051 previously did something like this in the library for `mem::swap`, but it ended up regressing during enabling mir inlining (cbbf06b0cd), so this has been suboptimal on stable for ≈5 releases now.
The code in this PR is narrowly targeted at just integer arrays in LLVM, but works via a new method on the [`LayoutTypeMethods`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.LayoutTypeMethods.html) trait, so specific backends based on cg_ssa can enable this for more situations over time, as we find them. I don't want to try to bite off too much in this PR, though. (Transparent newtypes and simple things like the 3×usize `String` would be obvious candidates for a follow-up.)
Codegen demonstrations: <https://llvm.godbolt.org/z/fK8hT9aqv>
Before:
```llvm
define void `@swap_rgb48_old(ptr` noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #1 {
%a.i = alloca [3 x i16], align 2
call void `@llvm.lifetime.start.p0(i64` 6, ptr nonnull %a.i)
call void `@llvm.memcpy.p0.p0.i64(ptr` noundef nonnull align 2 dereferenceable(6) %a.i, ptr noundef nonnull align 2 dereferenceable(6) %x, i64 6, i1 false)
tail call void `@llvm.memcpy.p0.p0.i64(ptr` noundef nonnull align 2 dereferenceable(6) %x, ptr noundef nonnull align 2 dereferenceable(6) %y, i64 6, i1 false)
call void `@llvm.memcpy.p0.p0.i64(ptr` noundef nonnull align 2 dereferenceable(6) %y, ptr noundef nonnull align 2 dereferenceable(6) %a.i, i64 6, i1 false)
call void `@llvm.lifetime.end.p0(i64` 6, ptr nonnull %a.i)
ret void
}
```
Note it going to stack:
```nasm
swap_rgb48_old: # `@swap_rgb48_old`
movzx eax, word ptr [rdi + 4]
mov word ptr [rsp - 4], ax
mov eax, dword ptr [rdi]
mov dword ptr [rsp - 8], eax
movzx eax, word ptr [rsi + 4]
mov word ptr [rdi + 4], ax
mov eax, dword ptr [rsi]
mov dword ptr [rdi], eax
movzx eax, word ptr [rsp - 4]
mov word ptr [rsi + 4], ax
mov eax, dword ptr [rsp - 8]
mov dword ptr [rsi], eax
ret
```
Now:
```llvm
define void `@swap_rgb48(ptr` noalias nocapture noundef align 2 dereferenceable(6) %x, ptr noalias nocapture noundef align 2 dereferenceable(6) %y) unnamed_addr #0 {
start:
%0 = load <3 x i16>, ptr %x, align 2
%1 = load <3 x i16>, ptr %y, align 2
store <3 x i16> %1, ptr %x, align 2
store <3 x i16> %0, ptr %y, align 2
ret void
}
```
still lowers to `dword`+`word` operations, but has no stack traffic:
```nasm
swap_rgb48: # `@swap_rgb48`
mov eax, dword ptr [rdi]
movzx ecx, word ptr [rdi + 4]
movzx edx, word ptr [rsi + 4]
mov r8d, dword ptr [rsi]
mov dword ptr [rdi], r8d
mov word ptr [rdi + 4], dx
mov word ptr [rsi + 4], cx
mov dword ptr [rsi], eax
ret
```
And as a demonstration that this isn't just `mem::swap`, a `mem::replace` on a small array (since replace doesn't use swap since #83022), which used to be `memcpy`s in LLVM changes in IR
```llvm
define void `@replace_short_array(ptr` noalias nocapture noundef sret([3 x i32]) dereferenceable(12) %0, ptr noalias noundef align 4 dereferenceable(12) %r, ptr noalias nocapture noundef readonly dereferenceable(12) %v) unnamed_addr #0 {
start:
%1 = load <3 x i32>, ptr %r, align 4
store <3 x i32> %1, ptr %0, align 4
%2 = load <3 x i32>, ptr %v, align 4
store <3 x i32> %2, ptr %r, align 4
ret void
}
```
but that lowers to reasonable `dword`+`qword` instructions still
```nasm
replace_short_array: # `@replace_short_array`
mov rax, rdi
mov rcx, qword ptr [rsi]
mov edi, dword ptr [rsi + 8]
mov dword ptr [rax + 8], edi
mov qword ptr [rax], rcx
mov rcx, qword ptr [rdx]
mov edx, dword ptr [rdx + 8]
mov dword ptr [rsi + 8], edx
mov qword ptr [rsi], rcx
ret
```
Rollup of 6 pull requests
Successful merges:
- #112081 (Avoid ICE on `#![doc(test(...)]` with literal parameter)
- #112196 (Resolve vars in result from `scrape_region_constraints`)
- #112303 (Normalize in infcx instead of globally for `Option::as_deref` suggestion)
- #112316 (Ensure space is inserted after keyword in `unused_delims`)
- #112318 (Merge method, type and const object safety checks)
- #112322 (Don't mention `IMPLIED_BOUNDS_ENTAILMENT` if signatures reference error)
Failed merges:
- #112251 (rustdoc: convert `if let Some()` that always matches to variable)
r? `@ghost`
`@rustbot` modify labels: rollup
Normalize in infcx instead of globally for `Option::as_deref` suggestion
fixes#112293
The projection may contain inference variables. These inference variables are local to the local inference context. Using `tcx.normalize_erasing_regions` doesn't work here because this method is global and does not have access to the inference context. It's therefore unable to deal with the inference variables. We normalize in the local inference context instead, which knowns about the inference variables.
The test looks a little different than the issue example, I made it more minimal and verified that it still ICEs on nightly.
Also contains a drive-by fix to properly compare the types.
r? `@compiler-errors`
Resolve vars in result from `scrape_region_constraints`
Since we perform `type_op::Normalize` in the local infcx when the new solver is enabled, vars aren't necessarily resolved, which triggers this ICE:
f85ab544df/compiler/rustc_infer/src/infer/nll_relate/mod.rs (L481)
There are more tests that go from ICE -> pass due to this change, but I just added revisions to a few for CI.
r? `@lcnr`
The projection may contain inference variables. These inference
variables are local to the local inference context. Using
`tcx.normalize_erasing_regions` doesn't work here because this method is
global and does not have access to the inference context. It's therefore
unable to deal with the inference variables. We normalize in the local
inference context instead, which knowns about the inference variables.
Show note for type ascription on a local binding interpreted as a constant pattern and not a new variable
Given the code
```rust
pub fn main() {
const y: i32 = 4;
let y: i32 = 3;
}
```
`y` in the let binding is actually interpreted as a constant pattern and is not a new variable, causing confusing diagnostics about refutable patterns in local binding.
This PR extends the note for type ascription of a constant pattern to `AscribeUserType` patterns which have `Constant` subpatterns.
Fixes#112269.
Fix type-inference regression in #112225
The type inference of argument-position closures and async blocks regressed in 1.70 as the evaluation order of async blocks changed, as they are not implicitly wrapped in an identity-function anymore.
Fixes#112225 by making sure the evaluation order stays the same as it used to.
r? `@compiler-errors`
As this was a stable-to-stable regression, it might be worth to consider backporting. Although the workaround for this is trivial as well: Just wrap the async block in another block.
Migrate GUI colors test to original CSS color format
Follow-up of https://github.com/rust-lang/rust/pull/111459.
The update for `browser-ui-test` version is because for hex color conversions, it used a precision of 1 instead of 2, which was problematic.
r? `@notriddle`
Given the code
```rust
pub fn main() {
const y: i32 = 4;
let y: i32 = 3;
}
```
`y` in the let binding is actually interpreted as a constant pattern
and is not a new variable, causing confusing diagnostics about
refutable patterns in local binding.
This commit extends the note for type ascription as a constant pattern
to `AscribeUserType` patterns as well.
Fix bug where private item with intermediate doc hidden re-export was not inlined
This fixes this bug:
```rust
mod private {
/// Original.
pub struct Bar3;
}
/// Hidden.
#[doc(hidden)]
pub use crate::private::Bar3;
/// Visible.
pub use self::Bar3 as Reexport;
```
In this case, `private::Bar3` should be inlined and renamed `Reexport` but instead we have:
```
pub use self::Bar3 as Reexport;
```
and no links.
There were actually two issues: the first one is that we forgot to check if the next intermediate re-export was doc hidden. The second was that we made the `#[doc(hidden)]` attribute inheritable, which shouldn't be possible.
r? `@notriddle`
The type inference of argument-position closures and async blocks
regressed in 1.70 as the evaluation order of async blocks changed, as
they are not implicitly wrapped in an identity-function anymore.
Fixes#112225 by making sure the evaluation order stays the same as it
used to.
Only check inlining counter after recursing.
This PR aims to reduce the strength of https://github.com/rust-lang/rust/pull/105119 even more.
In the current implementation, we check the inline count before recursing. This means that we never actually reach inlining depth 3.
This PR checks the counter after recursion, to give a chance to inline at depth >= 3.
r? `@scottmcm`
cc `@JakobDegen`
only suppress coercion error if type is definitely unsized
we previously suppressed coercion errors when the return type was `dyn Trait` because we expect a far more descriptive `Sized` trait error to be emitted instead, however the code that does this suppression does not consider where-clause predicates since it just looked at the HIR. let's do that instead by creating an obligation and checking if it may hold.
fixes#110683fixes#112208
Fix codegen test suite for bare-metal-like targets
For Ferrocene I needed to run the test suite for custom target with no unwinding and static relocation. Running the tests uncovered ~20 failures due to the test suite not accounting for these options. This PR fixes them by:
* Fixing `CHECK`s to account for functions having extra LLVM IR attributes (in this case `nounwind`).
* Fixing `CHECK`s to account for the `dso_local` LLVM IR modifier, which is [added to every item when relocation is static](f3d597b31c/compiler/rustc_codegen_llvm/src/mono_item.rs (L139-L142)).
* Fixing `CHECK`s to account for missing `uwtables` attributes.
* Added the `needs-unwind` attributes for tests that are designed to check unwinding.
There is no part of Rust CI that checks this unfortunately, and testing whether the PR works locally is kinda hard because you need a target with std enabled but no unwinding and static relocations. Still, this works in my local testing, and if future PRs accidentally break this Ferrocene will take care of sending followup PRs.
suggest `Option::as_deref(_mut)` on type mismatch in option combinator if it passes typeck
Fixes#106342.
This adds a suggestion to call `.as_deref()` (or `.as_deref_mut()` resp.) if typeck fails due to a type mismatch in the function passed to an `Option` combinator such as `.map()` or `.and_then()`.
For example:
```rs
fn foo(_: &str) {}
Some(String::new()).map(foo);
```
The `.map()` method requires its argument to satisfy `F: FnOnce(String)`, but it received `fn(&str)`, which won't pass. However, placing a `.as_deref()` before the `.map()` call fixes this since `&str == &<String as Deref>::Target`
Don't use `can_eq` in `derive(..)` suggestion for missing method
Unsatisfied predicates returned from method probe may reference inference vars from that probe, so drop this extra check I added in #110877 for more accurate derive suggestions...
Fixes#111500
Normalize anon consts in new solver
We don't do any of that `expand_abstract_consts` stuff so this isn't sufficient to make GCE work, but it does allow, e.g. `[(); 1]: Default`, to solve.
r? `@BoxyUwU`
Require that const param tys implement `ConstParamTy`
1. Require that const param tys implement `ConstParamTy` instead of using `search_for_adt_const_param_violation`
2. Add `StructuralPartialEq` as a supertrait for `ConstParamTy`, since we need to make sure that we derive *both* `PartialEq` and `Eq`
3. Implement `ConstParamTy` for tuples up to 12 (or whatever the default for tuples is)
4. Add some custom diagnostics to `ConstParamTy` errors, to avoid regressions from (1.). It's still not as great as it could be -- will point out inline in comments.
r? `@BoxyUwU`
Use translatable diagnostics in `rustc_const_eval`
This PR:
* adds a `no_span` parameter to `note` / `help` attributes when using `Subdiagnostic` to allow adding notes/helps without using a span
* has minor tweaks and changes to error messages