This minor change removes the need to reverse the resulting digits.
Since the reverse is O(|digit_num|) but bounded by 128, it's unlikely
to be noticeable in practice. At the same time, this code is also
one line shorter, so combined with the tiny perf win, why not?
I ran https://gist.github.com/ttsugriy/ed14860ef597ab315d4129d5f8adb191
on an M1 MacBook Air and got a small improvement:
```
Running benches/base_n_benchmark.rs (target/release/deps/base_n_benchmark-825fe5895b5c2693)
push_str/old time: [14.180 µs 14.313 µs 14.462 µs]
Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
4 (4.00%) high mild
1 (1.00%) high severe
push_str/new time: [13.741 µs 13.839 µs 13.973 µs]
Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
3 (3.00%) high mild
5 (5.00%) high severe
```
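For illustration, here's a minimal sketch of the idea (not the actual `base_n` code; the digit table and helper name are made up): emit digits into a fixed buffer back to front, so no reversal is needed.
```
// A minimal sketch of the approach, not the actual `base_n` implementation.
const DIGITS: &[u8] = b"0123456789abcdefghijklmnopqrstuvwxyz";

fn push_base_n(mut n: u128, base: u128, output: &mut String) {
    assert!((2..=DIGITS.len() as u128).contains(&base));
    // 128 bytes is enough even for u128 in base 2.
    let mut buf = [0u8; 128];
    let mut idx = buf.len();
    loop {
        idx -= 1;
        buf[idx] = DIGITS[(n % base) as usize];
        n /= base;
        if n == 0 {
            break;
        }
    }
    // The digits are already in the right order: just append the used suffix.
    output.push_str(std::str::from_utf8(&buf[idx..]).unwrap());
}
```
For example, `push_base_n(35, 36, &mut s)` appends `"z"`.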
Rollup of 9 pull requests
Successful merges:
- #114111 (Improve test case for experimental API remove_matches)
- #114169 (refactor builtin unsize handling, extend comments)
- #114182 (clean up after 113312)
- #114193 (Update lexer emoji diagnostics to Unicode 15.0)
- #114200 (Detect trait upcasting through struct tail unsizing in new solver select)
- #114228 (Check lazy type aliases for well-formedness)
- #114267 (Map RPITIT's opaque type bounds back from projections to opaques)
- #114269 (Migrate GUI colors test to original CSS color format)
- #114286 (Add missing feature gate in multiple_supertrait_upcastable doc)
r? `@ghost`
`@rustbot` modify labels: rollup
Map RPITIT's opaque type bounds back from projections to opaques
An RPITIT in a program's AST is eventually translated into both a projection GAT and an opaque. The opaque is used for default trait methods, like:
```
trait Foo {
    fn bar() -> impl Sized { 0i32 }
}
```
The item bounds for both the projection and the opaque are identical, and both have a *projection* self ty. This is mostly okay, since we can normalize this projection to the opaque within the default trait method body, but it causes two problems:
1. it leads to bugs in places where we don't normalize item bounds, like `deduce_future_output_from_obligations`
2. it leads to extra match arms that are both suspicious looking and also easy to miss
This PR maps the opaque type bounds of the RPITIT's *opaque* back to the opaque's self type to avoid this quirk. Then we can fix the UI test for #108304 (1.) and also remove a bunch of match arms (2.).
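For intuition only, here is a hand-written analogue (this is not the compiler's internal lowering, which produces a projection GAT plus an opaque that cannot be written in surface Rust):
```
// A hand-written analogue for intuition: the RPITIT acts much like an
// associated type with the same bound plus a method returning it.
trait FooDesugared {
    type Bar: Sized;
    fn bar() -> Self::Bar;
}

struct Unit;

impl FooDesugared for Unit {
    type Bar = i32;
    fn bar() -> Self::Bar {
        0
    }
}
```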
Fixes #108304
r? `@spastorino`
Check lazy type aliases for well-formedness
Previously we didn't check whether `T: Mul` holds for a lazy `type Alias<T> = <T as Mul>::Output;`.
Now we do. It only makes sense.
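A minimal sketch of what now gets checked, assuming the unstable `lazy_type_alias` feature (exact bound syntax and diagnostics may differ):
```
#![feature(lazy_type_alias)]
#![allow(incomplete_features)]
use std::ops::Mul;

// Rejected by the well-formedness check: nothing guarantees `T: Mul`.
// type NotWf<T> = <T as Mul>::Output;

// Accepted: the bound makes the right-hand side well-formed.
type Wf<T: Mul> = <T as Mul>::Output;
```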
`@rustbot` label F-lazy_type_alias
r? `@oli-obk`
Move BOLT from `bootstrap` to `opt-dist`
Currently, we use BOLT to optimize LLVM for x64 Linux. The BOLT instrumentation and optimization step is implemented in `bootstrap`, but it was always quite hacky, because BOLT works quite differently from PGO. Rather than building an instrumented artifact, it takes an already-built artifact and instruments it in place. This is not a good fit for the bootstrap caching mechanism, and it meant that we had to run BOLT "on the fly" when packaging LLVM artifacts into the created sysroot.
The BOLT code was also only used by the PGO script (now called `opt-dist`) and nothing else, so it was quite hardcoded for this single use case. And even if someone wanted to use the `--llvm-bolt-profile-[use/generate]` bootstrap flags outside of the PGO script, they would also need to implement profile gathering themselves, as bootstrap does not perform it anyway.
I think it would be more practical to move the BOLT logic to the `opt-dist` tool instead. This simplifies bootstrap, removes the unneeded BOLT caching logic (we now run BOLT exactly once, with no need to check whether it has already been performed each time bootstrap copies `libLLVM.so` around), removes two BOLT-specific bootstrap flags, and removes one special case for building LLVM (instead, `opt-dist` now passes the linker flags with `--set llvm.ldflags`).
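For context, here is a minimal sketch of what driving BOLT from an external tool can look like (the helper name, paths, and profiling step are placeholders, not the actual `opt-dist` code):
```
// A minimal sketch, not the actual `opt-dist` code: the helper name, paths,
// and the profiling step are placeholders; only the general llvm-bolt
// instrument-then-optimize flow is illustrative.
use std::process::Command;

fn bolt_llvm(lib_llvm: &str, profile: &str) -> std::io::Result<()> {
    let instrumented = format!("{lib_llvm}.inst");
    let optimized = format!("{lib_llvm}.bolt");

    // 1. Instrument the already-built shared object.
    Command::new("llvm-bolt")
        .arg(lib_llvm)
        .arg("-instrument")
        .arg("-o")
        .arg(&instrumented)
        .status()?;

    // 2. Run a profiling workload against the instrumented library
    //    (omitted here) to produce `profile`.

    // 3. Apply the gathered profile to produce the optimized library.
    Command::new("llvm-bolt")
        .arg(lib_llvm)
        .arg("-o")
        .arg(&optimized)
        .arg(format!("-data={profile}"))
        .status()?;
    Ok(())
}
```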
There are also a few disadvantages to this new approach:
- We have to guess/find the path to the built `libLLVM.so` file (but currently this is quite easy, it's just in `stage2/lib`).
- We have to provide the BOLT profile externally to bootstrap, so that it is packaged into the reproducible artifacts archive. Doesn't seem like a big deal to me.
- We are depending on some inner behavior of bootstrap in `opt-dist` (namely, that `libLLVM.so` is hardlinked). But we already do that for many other things in the `opt-dist` tool; it's tied quite closely to bootstrap.
I would like to go back to my attempts to also use BOLT for `librustc_driver.so`, and I think that it might be a bit simpler if I also do it from the `opt-dist` tool, so this is the first step towards that.
Anyway, let me know what you think about this. It's just a refactoring in a way, no big deal.
r? bootstrap
Only golden arches
A number of tests in the test suite have applied the somewhat comedic practice of ignoring *every* single target architecture that rustc has ever supported. This is silly when they are clearly tests built around certain assumptions, primarily about the x86-64 architecture, or, in one case, when they are only relevant to a handful of 32-bit targets. In one case this has even resulted in the same architecture being ignored twice!
Document these better, and use a "revision + only-arch" idiom in the test headers to denote the "golden arches" that actually pass these tests.
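For example, a header along these lines (a sketch of the idiom; the exact directive spelling varies per test):
```
// A sketch of the "revision + only-arch" idiom; directive details vary per test.
// revisions: aarch64 x86_64
//[aarch64] only-aarch64
//[x86_64] only-x86_64

fn main() {}
```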
This function has some shared code for the thin LTO and fat LTO cases,
but those cases have so little in common that it's actually clearer to
treat them fully separately.
PR #112946 tweaked the naming of LLVM threads, but messed things up
slightly, resulting in threads on Windows having names like `optimize
module {} regex.f10ba03eb5ec7975-cgu.0`.
This commit removes the extraneous `{} `.
The main loop has a *very* complex condition, which includes two
mentions of `codegen_state`. The body of the loop then immediately
switches on the `codegen_state`.
I find it easier to understand if it's a `loop` and we check for exit
conditions after switching on `codegen_state`. We end up with a tiny bit
of code duplication, but it's clear that (a) we never exit in the
`Ongoing` case, (b) we exit in the `Completed` state only if several
things are true (and there's interaction with LTO there), and (c) we
exit in the `Aborted` state if a couple of things are true. Also, the
exit conditions are all simple conjunctions.
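Schematically, the new shape is something like this (placeholder state, conditions, and transitions, not the actual rustc code):
```
// Schematic only: placeholder state, conditions, and transitions.
enum CodegenState {
    Ongoing,
    Completed,
    Aborted,
}

fn main_loop(mut state: CodegenState, mut work_items: u32, mut running_with_own_token: u32) {
    loop {
        match state {
            // (a) never an exit point
            CodegenState::Ongoing => {}
            // (b) exit only once queued and running work is done
            CodegenState::Completed => {
                if work_items == 0 && running_with_own_token == 0 {
                    break;
                }
            }
            // (c) after an abort, just wait for running work to finish
            CodegenState::Aborted => {
                if running_with_own_token == 0 {
                    break;
                }
            }
        }
        // ... dispatch work and process messages (placeholder transitions) ...
        work_items = work_items.saturating_sub(1);
        running_with_own_token = running_with_own_token.saturating_sub(1);
        if work_items == 0 {
            state = CodegenState::Completed;
        }
    }
}

fn main() {
    main_loop(CodegenState::Ongoing, 3, 1);
}
```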
This loop condition involves `codegen_state`, `work_items`, and
`running_with_own_token`. But the body of the loop cannot modify
`codegen_state`, so repeatedly checking it is unnecessary.
`CodegenContext` is immutable except for the `worker` field; we clone
`CodegenContext` in multiple places, changing the `worker` field each
time. It's simpler to move the `worker` field out of `CodegenContext`.
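In miniature, the resulting pattern looks something like this (placeholder fields and names, not the real `CodegenContext`):
```
// Placeholder types and names: the per-worker value travels alongside the
// shared context instead of being a mutable field cloned into each copy.
use std::sync::Arc;

struct CodegenContext {
    // shared, effectively immutable configuration
    opt_level: u8,
}

fn spawn_work(cgcx: &Arc<CodegenContext>, worker_id: usize) {
    let cgcx = Arc::clone(cgcx);
    std::thread::spawn(move || {
        // the worker id is passed separately; no per-worker clone of the context
        let _ = (cgcx.opt_level, worker_id);
    });
}
```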
Because its usefulness wasn't clear to me, and I initially wondered if
it could be removed. The text is based on the text in #50972, the PR
that added the flag.
It took me some time to understand how the main thread can lend a
jobserver token to an LLVM thread. This commit renames a couple of
things to make it clearer.
- Rename the `LLVMing` variant as `Lending`, because that is a clearer
description of what is happening.
- Rename `running` as `running_with_own_token`, which makes it clearer
that there might be one additional LLVM thread running (with a loaned
token). Also add a comment to its definition.
And rename the `Compiled` variant as `Finished`, because that name makes
it clearer there is nothing left to do, contrasting nicely with the
`Needs*` variants.
`TokenCursor` currently does doc comment desugaring on the fly, if the
`desugar_doc_comment` field is set. This requires also modifying the
token stream on the fly with `replace_prev_and_rewind`.
This commit moves the doc comment desugaring out of `TokenCursor`, by
introducing a new `TokenStream::desugar_doc_comment` method. This
separation of desugaring and iterating makes the code nicer.
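As a rough sketch of the new shape (placeholder types; only the "desugar once, up front, then iterate" idea is taken from the description above):
```
// Placeholder types: a stand-in for the separation described above, where doc
// comments are rewritten into `#[doc = "..."]` form before iteration begins
// instead of being desugared on the fly inside the cursor.
enum Token {
    DocComment(String),
    Other(String),
}

struct TokenStream(Vec<Token>);

impl TokenStream {
    fn desugar_doc_comment(self) -> TokenStream {
        TokenStream(
            self.0
                .into_iter()
                .map(|tok| match tok {
                    Token::DocComment(text) => Token::Other(format!("#[doc = {text:?}]")),
                    other => other,
                })
                .collect(),
        )
    }
}
```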