Make [u8]::reverse() 5x faster
Since LLVM doesn't vectorize the loop for us, do unaligned reads of a larger type and use LLVM's bswap intrinsic to do the reversing of the actual bytes. cfg!-restricted to x86 and x86_64, as I assume it wouldn't help on things like ARMv5.
Also makes [u16]::reverse() a more modest 1.5x faster by loading/storing u32 and swapping the u16s with ROT16.
Thank you ptr::*_unaligned for making this easy :)
Benchmark results (from my i5-2500K):
```text
# Before
test slice::reverse_u8 ... bench: 273,836 ns/iter (+/- 15,592) = 3829 MB/s
test slice::reverse_u16 ... bench: 139,793 ns/iter (+/- 17,748) = 7500 MB/s
test slice::reverse_u32 ... bench: 74,997 ns/iter (+/- 5,130) = 13981 MB/s
test slice::reverse_u64 ... bench: 47,452 ns/iter (+/- 2,213) = 22097 MB/s
# After
test slice::reverse_u8 ... bench: 52,170 ns/iter (+/- 3,962) = 20099 MB/s
test slice::reverse_u16 ... bench: 93,330 ns/iter (+/- 4,412) = 11235 MB/s
test slice::reverse_u32 ... bench: 74,731 ns/iter (+/- 1,425) = 14031 MB/s
test slice::reverse_u64 ... bench: 47,556 ns/iter (+/- 3,025) = 22049 MB/s
```
If you're curious about the assembly, instead of doing this
```
movzx eax, byte ptr [rdi]
movzx ecx, byte ptr [rsi]
mov byte ptr [rdi], cl
mov byte ptr [rsi], al
```
it does this
```
mov rax, qword ptr [rdx]
mov rbx, qword ptr [r11 + rcx - 8]
bswap rbx
mov qword ptr [rdx], rbx
bswap rax
mov qword ptr [r11 + rcx - 8], rax
```
impl Clone for .split_whitespace()
Use custom closure structs for the predicates so that the iterator's
clone can simply be derived. This should also reduce virtual call
overhead by not using function pointers.
Fixes#41655
Bump cargo for rust-lang/cargo#4000
rust-lang/cargo#4000 recently landed, which fixes warnings about using `-Z` when `CARGO_INCREMENTAL` is set while running stable/beta builds. As #41751 has now landed, these warnings will turn to errors in the next release, so getting the cargo fix in place is necessary unless we want people confused about why they can no longer compile anything on stable/beta.
When `RUST_BACKTRACE=1`, remove all frames after
`__rust_maybe_catch_panic`. Tested on `main`, threads, tests and
benches. Cleaning of the top of the stacktrace is let to a future PR.
Fixes#40201
See #41815
The documentation for flt2dec doesn't match up with the actual
implementation, so fix the documentation to align with reality.
Presumably due to the mismatch, the formatting code for floats in
std::fmt can use correspondingly shorter arrays in some places, so fix
those places up as well.
Fixes#41304.
[Doc] improve `thread::Thread` and `thread::Builder` documentations
Part of #29378
- Adds information about the stack_size when using `Builder`. This might be considered too low level, but I assume that if someone wants to create their own builder instead of using `thread::spawn` they may be interested in that info.
- Updates the `thread::Thread` structure doc, mostly by explaining how to get one, the previous example was removed because it was not related to `thread::Thread`, but rather to `thread::Builder::name`.
Not much is present there, mostly because this API is not often used (the only method that seems useful is `unpark`, which is documented in #41809).
incr.comp.: Hash more pieces of crate metadata to detect changes there.
This PR adds incr. comp. hashes for non-`Entry` pieces of data in crate metadata.
The first part of it I like: `EntryBuilder` is refactored into the more generally applicable `IsolatedEncoder` which provides means of encoding something into metadata while also feeding the encoded data into an incr. comp. hash. We already did this for `Entry`, now we are doing it for various other pieces of data too, like the set of exported symbols and so on. The hashes generated there are persisted together with the per-`Entry` hashes and are also used for dep-graph dirtying the same way.
The second part of the PR I'm not entirely happy with: In order to make sure that we don't forget registering a read to the new `DepNodes` introduced here, I added the `Tracked<T>` struct. This struct wraps a value and requires a `DepNode` when accessing the wrapped value. This makes it harder to overlook adding read edges in the right places and works just fine.
However, crate metadata is already used in places where there is no `tcx` yet or even in places where no `cnum` has been assigned -- this makes it harder to apply this feature consistently or implement it ergonomically. The result is not too bad but there's a bit more code churn and a bit more opportunity to get something wrong than I would have liked. On the other hand, wrapping things in `Tracked<T>` already has revealed some bugs, so there's definitely some value in it.
This is still a work in progress:
- [x] I need to write some test cases.
- [x] Accessing the CodeMap should really be dependency tracked too, especially with the new path-remapping feature.
cc @nikomatsakis
dump-mir was causing cycles by invoking item-path-str at bad times
Workaround for now, but probably a better fix is to opt **in** to using the types for impls (if we do that at all; maybe filename/line is better).
Fixes#41697
Fixed argument inference for closures when coercing into 'fn'
This fixes https://github.com/rust-lang/rust/issues/41755. The tests `compile-fail/closure-no-fn.rs` and `compile-fail/issue-40000.rs` were modified. A new test `run-pass/closure_to_fn_coercion-expected-types.rs` was added
r? @nikomatsakis
@bors: r+ 38fe8d2 rollup
1) changed "long way into" to "long way toward"
2) changed "developer lives" to "developers' lives"
3) removed the "either... or..." format from second paragraph because there are more than 2 options
4) Minor revisions to paragraphs 3-6 to make them more consistent in format and to fix minor grammar issues.
Implement the illegal_floating_point_literal_pattern compat lint
Adds a future-compatibility lint for the [breaking-change] introduced by issue #41620 . cc issue #41255 .
rustc: treat const bodies like fn bodies in middle::region.
Allows `T::ASSOC_CONST` to be used without a `T: 'static` bound.
cc @rust-lang/compiler @rust-lang/lang
1) changed "long way into" to "long way toward"
2) changed "developer lives" to "developers' lives"
3) removed the "either... or..." format from second paragraph because there are more than 2 options
4) Minor revisions to paragraphs 3-6 to make them more consistent in format and to fix minor grammar issues.