Enable varargs support for calling conventions other than C or cdecl
This patch makes it possible to use varargs for calling conventions,
which are either based on C (efiapi) or C is based on them (sysv64 and win64).
Also pinging ``@phlopsi,`` because he noticed first this oversight when writing a library for UEFI.
Use `br` instead of `switch` in more cases.
`codegen_switchint_terminator` already uses `br` instead of `switch` when there is one normal target plus the `otherwise` target. But there's another common case with two normal targets and an `otherwise` target that points to an empty unreachable BB. This comes up a lot when switching on the tags of enums that use niches.
The pattern looks like this:
```
bb1: ; preds = %bb6
%3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
%4 = sub i8 %3, 2
%5 = icmp eq i8 %4, 0
%_6 = select i1 %5, i64 0, i64 1
switch i64 %_6, label %bb3 [
i64 0, label %bb4
i64 1, label %bb2
]
bb3: ; preds = %bb1
unreachable
```
This commit adds code to convert the `switch` to a `br`:
```
bb1: ; preds = %bb6
%3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
%4 = sub i8 %3, 2
%5 = icmp eq i8 %4, 0
%_6 = select i1 %5, i64 0, i64 1
%6 = icmp eq i64 %_6, 0
br i1 %6, label %bb4, label %bb2
bb3: ; No predecessors!
unreachable
```
This has a surprisingly large effect on compile times, with reductions of 5% on debug builds of some crates. The reduction is all due to LLVM taking less time. Maybe LLVM is just much better at handling `br` than `switch`.
The resulting code is still suboptimal.
- The `icmp`, `select`, `icmp` sequence is silly, converting an `i1` to an `i64` and back to an `i1`. But with the current code structure it's hard to avoid, and LLVM will easily clean it up, in opt builds at least.
- `bb3` is usually now truly dead code (though not always, so it can't be removed universally).
r? `@scottmcm`
`codegen_switchint_terminator` already uses `br` instead of `switch`
when there is one normal target plus the `otherwise` target. But there's
another common case with two normal targets and an `otherwise` target
that points to an empty unreachable BB. This comes up a lot when
switching on the tags of enums that use niches.
The pattern looks like this:
```
bb1: ; preds = %bb6
%3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
%4 = sub i8 %3, 2
%5 = icmp eq i8 %4, 0
%_6 = select i1 %5, i64 0, i64 1
switch i64 %_6, label %bb3 [
i64 0, label %bb4
i64 1, label %bb2
]
bb3: ; preds = %bb1
unreachable
```
This commit adds code to convert the `switch` to a `br`:
```
bb1: ; preds = %bb6
%3 = load i8, ptr %_2, align 1, !range !9, !noundef !4
%4 = sub i8 %3, 2
%5 = icmp eq i8 %4, 0
%_6 = select i1 %5, i64 0, i64 1
%6 = icmp eq i64 %_6, 0
br i1 %6, label %bb4, label %bb2
bb3: ; No predecessors!
unreachable
```
This has a surprisingly large effect on compile times, with reductions
of 5% on debug builds of some crates. The reduction is all due to LLVM
taking less time. Maybe LLVM is just much better at handling `br` than
`switch`.
The resulting code is still suboptimal.
- The `icmp`, `select`, `icmp` sequence is silly, converting an `i1` to an `i64`
and back to an `i1`. But with the current code structure it's hard to avoid,
and LLVM will easily clean it up, in opt builds at least.
- `bb3` is usually now truly dead code (though not always, so it can't
be removed universally).
Don't use usub.with.overflow intrinsic
The canonical form of a usub.with.overflow check in LLVM are separate sub + icmp instructions, rather than a usub.with.overflow intrinsic. Using usub.with.overflow will generally result in worse optimization potential.
The backend will attempt to form usub.with.overflow when it comes to actual instruction selection. This is not fully reliable, but I believe this is a better tradeoff than using the intrinsic in IR.
Fixes#103285.
It's not clear why it was done, and apparently it's no longer necessary now.
Such additions are unpredictable for early doc link resolution and would force us to collect all doc links from all external traits.
ci: Bring back ninja for dist builders
The primary reason for this is that make can result in a substantial under utilization of parallelism (noticed while testing on a workstation), mostly due to the submake structure preventing good dependency tracking and scheduling.
In f758c7b2a78 (Debian 6 doesn't have ninja, so use make for the dist builds) llvm.ninja was disabled due to lack of distro package. This is no longer the case with the CentOS 7 base, so bring ninja back for a performance boost.
rustdoc: Simplify modifications of effective visibility table
It is now obvious that rustdoc only calls `set_access_level` with foreign def ids and `AccessLevel::Public`.
The second commit makes one more step and separates effective visibilities coming from rustc from similar data collected by rustdoc for extern `DefId`s.
The original table is no longer modified and now only contains local def ids as populated by rustc.
cc https://github.com/rust-lang/rust/pull/102026 `@Bryanskiy`
Rollup of 5 pull requests
Successful merges:
- #93582 (Allow `impl Fn() -> impl Trait` in return position)
- #103560 (Point only to the identifiers in the typo suggestions of shadowed names instead of the entire struct)
- #103588 (rustdoc: add missing URL redirect)
- #103689 (Do fewer passes and generally be more efficient when filtering tests)
- #103740 (rustdoc: remove unnecessary CSS `.search-results { padding-bottom }`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rustdoc: remove unnecessary CSS `.search-results { padding-bottom }`
There's nothing underneath it anyway. The conversation on #84462 never really spelled out why it was added.
Do fewer passes and generally be more efficient when filtering tests
Follow-on of the work I started with this PR: https://github.com/rust-lang/rust/pull/99939
Basically, the startup code for libtest is really inefficient, but that's not usually a problem because it is distributed in release and workloads are small. But under Miri which can be 100x slower than a debug build, these inefficiencies explode.
Most of the diff here is making test filtering single-pass. There are a few other small optimizations as well, but they are more straightforward.
With this PR, the startup time of the `iced` tests with `--features=code_asm,mvex` drops from 17 to 2 minutes (I think Miri has gotten slower under this workload since #99939). The easiest way to try this out is to set `MIRI_LIB_SRC` to a checkout of this branch when running `cargo +nightly miri test --features=code_asm,mvex`.
r? `@thomcc`
rustdoc: add missing URL redirect
https://github.com/rust-lang/rust/pull/94753 missed some redirect settings, and one of the missing URL shows up in an error message. This PR adds those redirects.
Point only to the identifiers in the typo suggestions of shadowed names instead of the entire struct
Fixes#103358.
As discussed in the issue, the `Span` of the candidate `Ident` for a typo replacement is stored alongside its `Symbol` in `TypoSuggestion`. Then, the span of the identifier is what the "you might have meant to refer to" note is pointed at, rather than the entire struct definition.
Comments in #103111 and the issue both suggest that it is desirable to:
1. include names defined in the same crate as the typo,
2. ignore names defined elsewhere such as in `std`, _and_
3. include names introduced indirectly via `use`.
Since a name from another crate but introduced via `use` has non-local `def_id`, to achieve this, a suggestion is displayed if either the `def_id` of the suggested name is local, or the `span` of the suggested name is in the same file as the typo itself.
Some UI tests have also been modified to reflect this change.
r? `@cjgillot`
Allow `impl Fn() -> impl Trait` in return position
_This was originally proposed as part of #93082 which was [closed](https://github.com/rust-lang/rust/pull/93082#issuecomment-1027225715) due to allowing `impl Fn() -> impl Trait` in argument position._
This allows writing the following function signatures:
```rust
fn f0() -> impl Fn() -> impl Trait;
fn f3() -> &'static dyn Fn() -> impl Trait;
```
These signatures were already allowed for common traits and associated types, there is no reason why `Fn*` traits should be special in this regard.
`impl Trait` in both `f0` and `f3` means "new existential type", just like with `-> impl Iterator<Item = impl Trait>` and such.
Arrow in `impl Fn() ->` is right-associative and binds from right to left, it's tested by [this test](a819fecb8d/src/test/ui/impl-trait/impl_fn_associativity.rs).
There even is a test that `f0` compiles:
2f004d2d40/src/test/ui/impl-trait/nested_impl_trait.rs (L25-L28)
But it was changed in [PR 48084 (lines)](https://github.com/rust-lang/rust/pull/48084/files#diff-ccecca938872d65ffe8cd1c3ef1956e309fac83bcda547d8b16b89257e53a437R37) to test the opposite, probably unintentionally given [PR 48084 (lines)](https://github.com/rust-lang/rust/pull/48084/files#diff-5a02f1ed43debed1fd24f7aad72490064f795b9420f15d847bac822aa4621a1cR476-R477).
r? `@nikomatsakis`
----
This limitation is especially annoying with async code, since it forces one to write this:
```rust
trait AsyncFn3<A, B, C>: Fn(A, B, C) -> <Self as AsyncFn3<A, B, C>>::Future {
type Future: Future<Output = Self::Out>;
type Out;
}
impl<A, B, C, Fut, F> AsyncFn3<A, B, C> for F
where
F: Fn(A, B, C) -> Fut,
Fut: Future,
{
type Future = Fut;
type Out = Fut::Output;
}
fn async_closure() -> impl AsyncFn3<i32, i32, i32, Out = u32> {
|a, b, c| async move { (a + b + c) as u32 }
}
```
Instead of:
```rust
fn async_closure() -> impl Fn(i32, i32, i32) -> impl Future<Output = u32> {
|a, b, c| async move { (a + b + c) as u32 }
}
```
If the compiler is built with `rpath = false`, then it won't find its
own libraries unless the library search path is set. We already do that
while running the actual compiletests, but #100260 added another rustc
command for getting the target cfg.
Check compiletest suite=codegen mode=codegen (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
thread 'main' panicked at 'error: failed to get cfg info from "[...]/build/x86_64-unknown-linux-gnu/stage1/bin/rustc"
--- stdout
--- stderr
[...]/build/x86_64-unknown-linux-gnu/stage1/bin/rustc: error while loading shared libraries: librustc_driver-a2a76dc626cd02d2.so: cannot open shared object file: No such file or directory
', src/tools/compiletest/src/common.rs:476:13
Now the library path is set here as well, so it works without rpath.