Remove some redundant checks from BufReader
The implementation of BufReader contains a lot of redundant checks. While any one of these checks is not particularly expensive to execute, especially when taken together they dramatically inhibit LLVM's ability to make subsequent optimizations by confusing data flow increasing the code size of anything that uses BufReader.
In particular, these changes have a ~2x increase on the benchmark that this adds a `black_box` to. I'm adding that `black_box` here just in case LLVM gets clever enough to remove the reads entirely. Right now it can't, but these optimizations are really setting it up to do so.
We get this optimization by factoring all the actual buffer management and bounds-checking logic into a new module inside `bufreader` with a new `Buffer` type. This makes it much easier to ensure that we have correctly encapsulated the management of the region of the buffer that we have read bytes into, and it lets us provide a new faster way to do small reads. `Buffer::consume_with` lets a caller do a read from the buffer with a single bounds check, instead of the double-check that's required to use `buffer` + `consume`.
Unfortunately I'm not aware of a lot of open-source usage of `BufReader` in perf-critical environments. Some time ago I tweaked this code because I saw `BufReader` in a profile at work, and I contributed some benchmarks to the `bincode` crate which exercise `BufReader::buffer`. These changes appear to help those benchmarks at little, but all these sorts of benchmarks are kind of fragile so I'm wary of quoting anything specific.
Update cargo
5 commits in d8d30a75376f78bb0fabe3d28ee9d87aa8035309..85b500ccad8cd0b63995fd94a03ddd4b83f7905b
2022-07-19 13:59:17 +0000 to 2022-07-24 21:10:46 +0000
- Make the empty rustc-wrapper test more explicit. (rust-lang/cargo#10899)
- expand RUSTC_WRAPPER docs (rust-lang/cargo#10896)
- Stabilize Workspace Inheritance (rust-lang/cargo#10859)
- Fix typo in unstable docs: s/PROGJCT/PROJECT/ (rust-lang/cargo#10890)
- refactor(source): Open query API for adding more types of queries (rust-lang/cargo#10883)
Rollup of 8 pull requests
Successful merges:
- #98583 (Stabilize Windows `FileTypeExt` with `is_symlink_dir` and `is_symlink_file`)
- #99698 (Prefer visibility map parents that are not `doc(hidden)` first)
- #99700 (Add a clickable link to the layout section)
- #99712 (passes: port more of `check_attr` module)
- #99759 (Remove dead code from cg_llvm)
- #99765 (Don't build std for *-uefi targets)
- #99771 (Update pulldown-cmark version to 0.9.2 (fixes url encoding for some chars))
- #99775 (rustdoc: do not allocate String when writing path full name)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rustdoc: do not allocate String when writing path full name
No idea if this makes any perf difference, but it just seems like premature pessimisation to use String when str will do.
passes: port more of `check_attr` module
Continues from #99213.
Port more diagnostics in `rustc_passes::check_attr` to using the diagnostic derive and translation machinery.
r? `@compiler-errors`
Add a clickable link to the layout section
The layout section (activated by `--show-type-layout`) is currently not linkable to (outside of chrome's link to text feature). This PR makes it linkable via `#layout`.
Prefer visibility map parents that are not `doc(hidden)` first
Far simpler approach to #98876.
This only fixes the case where the parent is `doc(hidden)`, not where the child is `doc(hidden)` since I don't know how to get the attrs on the import statement given a `ModChild`... I'll try to follow up with that, but this is a good first step.
Stabilize Windows `FileTypeExt` with `is_symlink_dir` and `is_symlink_file`
These calls allow detecting whether a symlink is a file or a directory,
a distinction Windows maintains, and one important to software that
wants to do further operations on the symlink (e.g. removing it).
Optimized vec::IntoIter::next_chunk impl
```
x86_64v1, default
test vec::bench_next_chunk ... bench: 696 ns/iter (+/- 22)
x86_64v1, pr
test vec::bench_next_chunk ... bench: 309 ns/iter (+/- 4)
znver2, default
test vec::bench_next_chunk ... bench: 17,272 ns/iter (+/- 117)
znver2, pr
test vec::bench_next_chunk ... bench: 211 ns/iter (+/- 3)
```
On znver2 the default impl seems to be slow due to different inlining decisions. It goes through `core::array::iter_next_chunk`
which has a deep call tree.
codegen: use new {re,de,}allocator annotations in llvm
This obviates the patch that teaches LLVM internals about
_rust_{re,de}alloc functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
r? `@nikic`
```
test vec::bench_next_chunk ... bench: 696 ns/iter (+/- 22)
x86_64v1, pr
test vec::bench_next_chunk ... bench: 309 ns/iter (+/- 4)
znver2, default
test vec::bench_next_chunk ... bench: 17,272 ns/iter (+/- 117)
znver2, pr
test vec::bench_next_chunk ... bench: 211 ns/iter (+/- 3)
```
The znver2 default impl seems to be slow due to inlining decisions. It goes through `core::array::iter_next_chunk`
which has a deeper call tree.
Fix some broken link fragments.
An exception for link fragments starting with `-` was added in #49590. However, it is not clear what issues were encountered at the time. Perhaps those were fixed in the meantime.
This removes the exception, and fixes a couple of broken links that were skipped due to it.
rustdoc: Add support for `#[rustc_must_implement_one_of]`
This PR adds support for `#[rustc_must_implement_one_of]` attribute added in #92164. There is a desire to eventually use this attribute of `Read`, so making it show up in docs is a good thing.
I "stole" the styling from cfg notes, not sure what would be a proper styling. Currently it looks like this:
![2022-07-14_15-00](https://user-images.githubusercontent.com/38225716/178968170-913c1dd5-8875-4a95-9848-b075a0bb8998.png)
<details><summary>Code to reproduce</summary>
<p>
```rust
#![feature(rustc_attrs)]
#[rustc_must_implement_one_of(a, b)]
pub trait Trait {
fn req();
fn a(){ Self::b() }
fn b(){ Self::a() }
}
```
</p>
</details>
This obviates the patch that teaches LLVM internals about
_rust_{re,de}alloc functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
While we're here, we also emit allocator attributes on
__rust_alloc_zeroed. This should allow LLVM to perform more
optimizations for zeroed blocks, and probably fixes#90032. [This
comment](https://github.com/rust-lang/rust/issues/24194#issuecomment-308791157)
mentions "weird UB-like behaviour with bitvec iterators in
rustc_data_structures" so we may need to back this change out if things
go wrong.
The new test cases require LLVM 15, so we copy them into LLVM
14-supporting versions, which we can delete when we drop LLVM 14.
Rollup of 5 pull requests
Successful merges:
- #99618 (handle consts with param/infer in `const_eval_resolve` better)
- #99666 (Restore `Opaque` behavior to coherence check)
- #99692 (interpret, ptr_offset_from: refactor and test too-far-apart check)
- #99739 (Remove erroneous E0133 code from an error message.)
- #99748 (Use full type name instead of just saying `impl Trait` in "captures lifetime" error)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Use full type name instead of just saying `impl Trait` in "captures lifetime" error
I think this is very useful, especially when there's >1 `impl Trait`, and it just means passing around a bit more info that we already have access to.
Remove erroneous E0133 code from an error message.
This error message is about `derive` and `packed`, but E0133 is for
"Unsafe code was used outside of an unsafe function or block".
r? ``@estebank``
interpret, ptr_offset_from: refactor and test too-far-apart check
We didn't have any tests for the "too far apart" message, and indeed that check mostly relied on the in-bounds check and was otherwise probably not entirely correct... so I rewrote that check, and it is before the in-bounds check so we can test it separately.
Restore `Opaque` behavior to coherence check
Fixes#99663.
This broke in 84c3fcd2a0285c06a682c9b064640084e4c7271b. I'm not exactly certain that adding this behavior back is necessarily correct, but at least the UI test I provided may stimulate some thoughts.
I think delaying a bug here is certainly not correct in the case of opaques -- if we want to change coherence behavior for opaques, then we should at least be emitting a new error.
r? ``@lcnr``
handle consts with param/infer in `const_eval_resolve` better
This PR addresses [this thread here](https://github.com/rust-lang/rust/pull/99449#discussion_r924141230). Was this the change you were looking for ``@lcnr?``
Interestingly, one test has begun to pass. Was that expected?
r? ``@lcnr``
Sync rustc_codegen_cranelift
This time most of the changes are bugfixes. No exciting new features to report. Thanks `@matthiaskrgr` for reporting a bunch of crashes!
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler
Remove reachable coverage without counters
Remove reachable coverage without counters to maintain invariant that
either there is no coverage at all or there is a live coverage counter
left that provides the function source hash.
The motivating example would be a following closure:
```rust
let f = |x: bool| {
debug_assert!(x);
};
```
Which, with span changes from #93967, with disabled debug assertions,
after the final CFG simplifications but before removal of dead blocks,
gives rise to MIR:
```rust
fn main::{closure#0}(_1: &[closure@a.rs:2:13: 2:22], _2: bool) -> () {
debug x => _2;
let mut _0: ();
bb0: {
Coverage::Expression(4294967295) = 1 - 2;
return;
}
...
}
```
Which also makes the initial instrumentation quite suspect, although
this pull request doesn't attempt to address that aspect directly.
Fixes#98833.
r? ``@wesleywiser`` ``@richkadel``
Remove some explicit `self.infcx` for `FnCtxt`, which already derefs into `InferCtxt`
The use of `self.infcx.method_on_infcx` vs `self.method_on_infcx` when `self` is a `FnCtxt` is a bit inconsistent, so I'm moving some `self.infcx` usages I found to just use autoderef
Slightly improve mismatched GAT where clause error
This makes the error reporting a bit more standardized between `where` on GATs and functions.
cc #99206 (`@BoxyUwU),` don't want to mark this as as "fixed" because they're still not perfect, but this is still an improvement IMO so I want to land it incrementally.
regarding "consider adding where clause to trait definition", we don't actually do that for methods as far as i can tell? i could file an issue to look into that maybe.