Sync rustc_codegen_cranelift
I did a large refactoring of the intrinsics module to remove the intrinsic_match macro which is not very clear to other people. This also enables rustfmt to run on this code. While I already did a sync yesterday, I am going to do another sync again to avoid potential conflicts as those will likely be painful to resolve.
r? ``@ghost``
``@rustbot`` label +A-codegen +A-cranelift +T-compiler
Check that RPITs constrained by a recursive call in a closure are compatible
Fixes#99073
Adapts a similar visitor pattern to `find_opaque_ty_constraints` (that we use to check TAITs), but with some changes:
0. Only walk the "OnlyBody" children, instead of all items in the RPIT's defining scope
1. Only walk through the body's children if we found a constraining usage
2. Don't actually do any inference, just do a comparison and error if they're mismatched
----
r? `@oli-obk` -- you know all this impl-trait stuff best... is this the right approach? I can explain the underlying issue better if you'd like, in case that might reveal a better solution. Not sure if it's possible to gather up the closure's defining usages of the RPIT while borrowck'ing the outer function, that might be a better place to put this check...
Add a brief comment explaining why the diagnostic migration lints aren't
included in the `rustc::internal` diagnostic group.
Signed-off-by: David Wood <david.wood@huawei.com>
Remove some redundant checks from BufReader
The implementation of BufReader contains a lot of redundant checks. While any one of these checks is not particularly expensive to execute, especially when taken together they dramatically inhibit LLVM's ability to make subsequent optimizations by confusing data flow increasing the code size of anything that uses BufReader.
In particular, these changes have a ~2x increase on the benchmark that this adds a `black_box` to. I'm adding that `black_box` here just in case LLVM gets clever enough to remove the reads entirely. Right now it can't, but these optimizations are really setting it up to do so.
We get this optimization by factoring all the actual buffer management and bounds-checking logic into a new module inside `bufreader` with a new `Buffer` type. This makes it much easier to ensure that we have correctly encapsulated the management of the region of the buffer that we have read bytes into, and it lets us provide a new faster way to do small reads. `Buffer::consume_with` lets a caller do a read from the buffer with a single bounds check, instead of the double-check that's required to use `buffer` + `consume`.
Unfortunately I'm not aware of a lot of open-source usage of `BufReader` in perf-critical environments. Some time ago I tweaked this code because I saw `BufReader` in a profile at work, and I contributed some benchmarks to the `bincode` crate which exercise `BufReader::buffer`. These changes appear to help those benchmarks at little, but all these sorts of benchmarks are kind of fragile so I'm wary of quoting anything specific.
Use `action/checkout@v3` in the book
As this type of document is often copied/pasted, using a newer version of `actions/checkout` would be better.
changelog: none
Signed-off-by: Yuki Okushi <jtitor@2k36.org>
Update cargo
5 commits in d8d30a75376f78bb0fabe3d28ee9d87aa8035309..85b500ccad8cd0b63995fd94a03ddd4b83f7905b
2022-07-19 13:59:17 +0000 to 2022-07-24 21:10:46 +0000
- Make the empty rustc-wrapper test more explicit. (rust-lang/cargo#10899)
- expand RUSTC_WRAPPER docs (rust-lang/cargo#10896)
- Stabilize Workspace Inheritance (rust-lang/cargo#10859)
- Fix typo in unstable docs: s/PROGJCT/PROJECT/ (rust-lang/cargo#10890)
- refactor(source): Open query API for adding more types of queries (rust-lang/cargo#10883)
Rollup of 8 pull requests
Successful merges:
- #98583 (Stabilize Windows `FileTypeExt` with `is_symlink_dir` and `is_symlink_file`)
- #99698 (Prefer visibility map parents that are not `doc(hidden)` first)
- #99700 (Add a clickable link to the layout section)
- #99712 (passes: port more of `check_attr` module)
- #99759 (Remove dead code from cg_llvm)
- #99765 (Don't build std for *-uefi targets)
- #99771 (Update pulldown-cmark version to 0.9.2 (fixes url encoding for some chars))
- #99775 (rustdoc: do not allocate String when writing path full name)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rustdoc: do not allocate String when writing path full name
No idea if this makes any perf difference, but it just seems like premature pessimisation to use String when str will do.
passes: port more of `check_attr` module
Continues from #99213.
Port more diagnostics in `rustc_passes::check_attr` to using the diagnostic derive and translation machinery.
r? `@compiler-errors`
Add a clickable link to the layout section
The layout section (activated by `--show-type-layout`) is currently not linkable to (outside of chrome's link to text feature). This PR makes it linkable via `#layout`.
Prefer visibility map parents that are not `doc(hidden)` first
Far simpler approach to #98876.
This only fixes the case where the parent is `doc(hidden)`, not where the child is `doc(hidden)` since I don't know how to get the attrs on the import statement given a `ModChild`... I'll try to follow up with that, but this is a good first step.
Stabilize Windows `FileTypeExt` with `is_symlink_dir` and `is_symlink_file`
These calls allow detecting whether a symlink is a file or a directory,
a distinction Windows maintains, and one important to software that
wants to do further operations on the symlink (e.g. removing it).
Optimized vec::IntoIter::next_chunk impl
```
x86_64v1, default
test vec::bench_next_chunk ... bench: 696 ns/iter (+/- 22)
x86_64v1, pr
test vec::bench_next_chunk ... bench: 309 ns/iter (+/- 4)
znver2, default
test vec::bench_next_chunk ... bench: 17,272 ns/iter (+/- 117)
znver2, pr
test vec::bench_next_chunk ... bench: 211 ns/iter (+/- 3)
```
On znver2 the default impl seems to be slow due to different inlining decisions. It goes through `core::array::iter_next_chunk`
which has a deep call tree.
codegen: use new {re,de,}allocator annotations in llvm
This obviates the patch that teaches LLVM internals about
_rust_{re,de}alloc functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
r? `@nikic`
```
test vec::bench_next_chunk ... bench: 696 ns/iter (+/- 22)
x86_64v1, pr
test vec::bench_next_chunk ... bench: 309 ns/iter (+/- 4)
znver2, default
test vec::bench_next_chunk ... bench: 17,272 ns/iter (+/- 117)
znver2, pr
test vec::bench_next_chunk ... bench: 211 ns/iter (+/- 3)
```
The znver2 default impl seems to be slow due to inlining decisions. It goes through `core::array::iter_next_chunk`
which has a deeper call tree.