Move iter_results to dyn FnMut rather than a generic
This means that we no longer generate the iteration/locking code for each invocation site of iter_results, but only once per query (roughly), which seems much better: it is a 15% win in instruction counts when compiling the rustc_query_impl crate. I suspect the code where this is used is also pretty cold, and the old solution didn't fully monomorphize either.
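As a rough illustration of the trade-off described above, the sketch below contrasts a generic callback parameter with a `&mut dyn FnMut` one. The `Cache` type, its fields, and the method names are hypothetical stand-ins, not the actual rustc query cache API.

```rust
use std::sync::Mutex;

/// Hypothetical stand-in for a lock-protected query cache.
pub struct Cache {
    entries: Mutex<Vec<(u32, String)>>,
}

impl Cache {
    pub fn new(entries: Vec<(u32, String)>) -> Self {
        Cache { entries: Mutex::new(entries) }
    }

    /// Generic version: the locking and iteration code is instantiated
    /// again for every distinct closure type `F` passed in.
    pub fn iter_generic<F: FnMut(&u32, &String)>(&self, mut f: F) {
        let guard = self.entries.lock().unwrap();
        for (k, v) in guard.iter() {
            f(k, v);
        }
    }

    /// Trait-object version: the body is compiled once, and each callback
    /// invocation goes through dynamic dispatch instead.
    pub fn iter_dyn(&self, f: &mut dyn FnMut(&u32, &String)) {
        let guard = self.entries.lock().unwrap();
        for (k, v) in guard.iter() {
            f(k, v);
        }
    }
}

/// Collects all keys through the trait-object entry point.
pub fn collect_keys(cache: &Cache) -> Vec<u32> {
    let mut keys = Vec::new();
    cache.iter_dyn(&mut |k: &u32, _v: &String| keys.push(*k));
    keys
}
```

The dynamic call adds a small per-item cost, which is usually acceptable when the caller is cold, as the message suspects is the case here.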
Update LLVM for more wasm simd updates
This fixes the temporary regression introduced in #84339 where the wasm
target uses `fpto{s,u}i` intrinsics but the codegen for those intrinsics
with the `+nontrapping-fptoint` LLVM feature wasn't very good (i.e. it
didn't use the wasm instruction). The fixes brought in here fix that and
also implement the second-to-last simd instruction in LLVM.
- Literally, variants are not artificial. We have `yield` statements,
upvars and inner variables in the source code.
- Functionally, we don't want debuggers to suppress the variants. They
contain the state of the generator, which is useful when debugging.
So they shouldn't be marked artificial.
- Debuggers may use artificial flags to find the active variant. In
this case, marking variants artificial will make debuggers not work
properly.
Fixes #79009.
Previously, this caused a bug on NixOS:
1. bootstrap.py would download and patch stage0/cargo
2. bootstrap.py would download nightly cargo, but extract it to
stage0/cargo instead of ci-rustc/cargo.
3. bootstrap.py would fail to build rustbuild because stage0/cargo
wasn't patched.
The "proper" fix is to extract nightly cargo to ci-rustc instead, but it
doesn't seem to be necessary at all, so this just skips downloading it
instead.
Add std::os::unix::fs::chroot to change the root directory of the current process
This is a straightforward wrapper that uses the existing helpers for C
string handling and errno handling.
Having this available is convenient for UNIX utility programs written in
Rust, and avoids having to call the unsafe `libc::chroot` directly and
handle errors manually, in a program that may otherwise be entirely safe
code.
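A minimal usage sketch, assuming a Unix target and a toolchain where `std::os::unix::fs::chroot` is available. The helper name `enter_root` is hypothetical. Changing the root normally requires elevated privileges, so the call is expected to return an error for an unprivileged process.

```rust
use std::io;
use std::os::unix::fs::chroot;
use std::path::Path;

/// Hypothetical helper: confine the current process to `new_root`.
pub fn enter_root(new_root: &Path) -> io::Result<()> {
    // Safe wrapper: errors come back as io::Error instead of raw errno.
    chroot(new_root)?;
    // chroot does not change the working directory, so move into the
    // new root explicitly.
    std::env::set_current_dir("/")?;
    Ok(())
}
```

A caller would typically invoke this once at startup, before dropping privileges, and report the `io::Error` if it fails.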
Reuse `sys::unix::cmath` on other platforms
Reuse `sys::unix::cmath` on all non-`windows` platforms.
`unix` is chosen as the canonical location instead of `unsupported` or `common` because `unsupported` doesn't make sense semantically and `common` is reserved for code that is supported on all platforms. Also `unix` is already the home of some non-`windows` code that is technically not exclusive to `unix` like `unix::path`.
All fields except the discriminant (including `outer_fields`)
should be put into structures inside the variant part, which gives
an equivalent layout but offers us much better integration with
debuggers.
Implement RFC 1260 with feature_name `imported_main`.
This is the second extraction part of #84062 plus additional adjustments.
This (mostly) implements RFC 1260.
However, there's still one test case failure in the extern crate case. Maybe `LocalDefId` doesn't work here? I'm not sure.
cc https://github.com/rust-lang/rust/issues/28937
r? `@petrochenkov`
Rollup of 10 pull requests
Successful merges:
- #84451 (Use flex more consistently)
- #84590 (Point out that behavior might be switched on 2015 and 2018 too one day)
- #84682 (Don't rebind in `transitive_bounds_that_define_assoc_type`)
- #84683 (Minor grammar tweaks for readability to btree internals)
- #84688 (Remove unnecessary CSS rules for search results)
- #84690 (Remove unneeded bottom margin on search results)
- #84692 (Link between std::env::{var, var_os} and std::env::{vars, vars_os})
- #84705 (make feature recommendations optional)
- #84706 (Drop alias `reduce` for `fold` - we have a `reduce` function)
- #84713 (Fix labels for regression issue template)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Fix labels for regression issue template
Each label needs to be separated by a comma (see the ICE issue template
for an example of correct usage).
As a result of this problem, the `regression-untriaged` label has not
been automatically added to issues opened with this template.
See c127530be7 for another example of this.
r? `@Mark-Simulacrum`
Drop alias `reduce` for `fold` - we have a `reduce` function
Searching for "reduce" currently puts the `reduce` alias for `fold`
above the actual `reduce` function. The `reduce` function already has a
cross-reference for `fold`, and vice versa.
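For readers unsure why the two deserve separate search results, here is a small sketch of how `Iterator::fold` and `Iterator::reduce` differ. The function names are illustrative only.

```rust
/// Sum with `fold`: takes an explicit initial value and returns it
/// unchanged for empty input.
pub fn sum_fold(xs: &[i32]) -> i32 {
    xs.iter().fold(0, |acc, x| acc + x)
}

/// Maximum with `reduce`: uses the first element as the seed and
/// returns `None` for empty input.
pub fn max_reduce(xs: &[i32]) -> Option<i32> {
    xs.iter().copied().reduce(|a, b| a.max(b))
}
```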
Link between std::env::{var, var_os} and std::env::{vars, vars_os}
In #84551 I linked between `std::env::{args, args_os}` and this PR does the same but for `std::env::{var, var_os}` and `std::env::{vars, vars_os}`. Now all of `std::env::{var, var_os, vars, vars_os, args, args_os}` should each mention their `_os` or non-`_os` equivalent in the docs so that you can easily navigate between them.
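A brief sketch of why the `_os` and non-`_os` variants are worth cross-linking: `std::env::var` returns a `Result` and fails on non-Unicode values, while `std::env::var_os` returns the raw `OsString`. The helper `lookup` is a hypothetical name used only for this illustration.

```rust
use std::env;
use std::ffi::OsString;

/// Hypothetical helper: look a variable up through both APIs.
/// The first element is `None` if the variable is unset or not valid
/// Unicode; the second is `None` only if it is unset.
pub fn lookup(name: &str) -> (Option<String>, Option<OsString>) {
    (env::var(name).ok(), env::var_os(name))
}
```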
Minor grammar tweaks for readability to btree internals
I was reading through the btree implementation and noticed some grammar that could be improved in node.rs, so here is what I think would be a minor improvement.
Point out that behavior might be switched on 2015 and 2018 too one day
Reword documentation to make it clear that behaviour can be switched on older editions too, one day in the future. It doesn't *have* to be switched, but I think it's good to have it as an option and re-evaluate it a few months/years down the line when e.g. the crates that showed up in crater were broken by different changes in the language already.
cc #25725, #65819, #66145, #84147, and https://github.com/rust-lang/rust/issues/84133#issuecomment-818005314
Use flex more consistently
Builds on #84376, related to #84354.
- Fully replaces `float: right` with `flex` on `.content .out-of-band`.
- Uses `flex` more consistently with existing usage (on `h3`, `h4`, etc.).
Tested on various widths to make sure the pages behave as before.
On arm64 we have seen on several databases that ISB (instruction synchronization
barrier) is better to use than yield in a spin loop. The yield instruction is a
nop. The isb instruction puts the processor to sleep for some short time. isb
is a good equivalent to the pause instruction on x86.
Below is an experiment that shows the effects of yield and isb on Arm64 and the
time of a pause instruction on x86 Intel processors. The micro-benchmarks use
https://github.com/google/benchmark.git
$ cat a.cc
#include <benchmark/benchmark.h>

// Baseline: cost of one scalar increment per iteration.
static void BM_scalar_increment(benchmark::State& state) {
  int i = 0;
  for (auto _ : state)
    benchmark::DoNotOptimize(i++);
}
BENCHMARK(BM_scalar_increment);

// Cost of the Arm64 yield hint.
static void BM_yield(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("yield"::);
}
BENCHMARK(BM_yield);

// Cost of the Arm64 instruction synchronization barrier.
static void BM_isb(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("isb"::);
}
BENCHMARK(BM_isb);

BENCHMARK_MAIN();
$ g++ -o run a.cc -O2 -lbenchmark -lpthread
$ ./run
--------------------------------------------------------------
Benchmark                Time          CPU          Iterations
--------------------------------------------------------------
AWS Graviton2 (Neoverse-N1) processor:
BM_scalar_increment      0.485 ns      0.485 ns     1000000000
BM_yield                 0.400 ns      0.400 ns     1000000000
BM_isb                   13.2 ns       13.2 ns      52993304
AWS Graviton (A-72) processor:
BM_scalar_increment      0.897 ns      0.874 ns     801558633
BM_yield                 0.877 ns      0.875 ns     800002377
BM_isb                   13.0 ns       12.7 ns      55169412
Apple Arm64 M1 processor:
BM_scalar_increment      0.315 ns      0.315 ns     1000000000
BM_yield                 0.313 ns      0.313 ns     1000000000
BM_isb                   9.06 ns       9.06 ns      77259282
static void BM_pause(benchmark::State& state) {
  for (auto _ : state)
    asm volatile("pause"::);
}
BENCHMARK(BM_pause);
Intel Skylake processor:
BM_scalar_increment      0.295 ns      0.295 ns     1000000000
BM_pause                 41.7 ns       41.7 ns      16780553
Tested on Graviton2 aarch64-linux with `./x.py test`.
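To show where such a hint sits in Rust code, here is a minimal spin-lock sketch. It assumes `std::hint::spin_loop` is the standard-library entry point affected by this change (the message above does not name it), and the `SpinLock` type and `contended_count` function are hypothetical, written only for this illustration.

```rust
use std::hint;
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::thread;

/// Hypothetical test-and-set spin lock.
pub struct SpinLock {
    locked: AtomicBool,
}

impl SpinLock {
    pub const fn new() -> Self {
        SpinLock { locked: AtomicBool::new(false) }
    }

    pub fn lock(&self) {
        while self
            .locked
            .compare_exchange_weak(false, true, Ordering::Acquire, Ordering::Relaxed)
            .is_err()
        {
            // Busy-wait hint; which instruction it lowers to depends on
            // the target architecture.
            hint::spin_loop();
        }
    }

    pub fn unlock(&self) {
        self.locked.store(false, Ordering::Release);
    }
}

/// Increments a counter from several threads under the lock.
pub fn contended_count(threads: usize, iters: usize) -> usize {
    let lock = SpinLock::new();
    let counter = AtomicUsize::new(0);
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                for _ in 0..iters {
                    lock.lock();
                    // Deliberately split read and write: the total is only
                    // correct if the lock provides mutual exclusion.
                    let v = counter.load(Ordering::Relaxed);
                    counter.store(v + 1, Ordering::Relaxed);
                    lock.unlock();
                }
            });
        }
    });
    counter.load(Ordering::Relaxed)
}
```

A longer pause per failed attempt, as measured for isb above, means waiting threads retry the atomic less often, which is the effect the benchmark numbers are meant to motivate.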
This means that we're no longer generating the iteration/locking code for each
invocation site of iter_results, rather just once per query.
This is a 15% win in instruction counts when compiling the rustc_query_impl crate.
Revert PR 77885 everywhere
Change to probe-stack=call (instead of inline-or-call) everywhere again, for now.
We had already reverted the change on stable back in PR #83412.
Since then, we've had some movement on issue #83139, but not a 100% fix.
But also since then, we had a bug reported, issue #84667, that looks like outright codegen breakage, rather than problems confined to debuginfo issues. So we are reverting PR #77885 on stable and beta. We'll reland PR #77885 (or some variant) switching back to an LLVM-dependent selection of out-of-line call vs inline-asm, after these other issues have been resolved.