abort immediately on bad mem::zeroed/uninit
Now that we have non-unwinding panics, let's use them for these assertions. This re-establishes the property that `mem::uninitialized` and `mem::zeroed` will never unwind -- the earlier approach of causing panics here sometimes led to hard-to-debug segfaults when the surrounding code was not able to cope with the unexpected unwinding.
Cc `@bjorn3` I did not touch cranelift but I assume it needs a similar patch. However it has a `codegen_panic` abstraction that I did not want to touch since I didn't know how else it is used.
add ptr::from_{ref,mut}
We have methods to avoid almost all `as` casts around raw pointer handling, except for the initial cast from reference to raw pointer. These new methods close that gap.
(I also moved `null_mut` next to `null` to keep the file consistently organized.)
r? libs-api
Tracking issue: https://github.com/rust-lang/rust/issues/106116
According to Godbolt¹, on x86_64 using binary and produces slightly
better code than using subtraction. Readability of both is pretty
much equivalent so might just as well use the shorter option.
¹ https://rust.godbolt.org/z/9jM3ejbMx
Rollup of 8 pull requests
Successful merges:
- #105584 (add assert messages if chunks/windows are length 0)
- #105602 (interpret: add read_machine_[ui]size convenience methods)
- #105824 (str.lines() docstring: clarify that line endings are not returned)
- #105980 (Refer to "Waker" rather than "RawWaker" in `drop` comment)
- #105986 (Fix typo in reading_half_a_pointer.rs)
- #105995 (Add regression test for #96530)
- #106008 (Sort lint_groups in no_lint_suggestion)
- #106014 (Add comment explaining what the scrape-examples-toggle.goml GUI test is about)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Clarify that raw retags are not permitted in Mir
Not sure when this changed, but documentation and the validator needed to be updated. This also removes raw retags from custom mir.
cc rust-lang/miri#2735
r? `@RalfJung`
Refer to "Waker" rather than "RawWaker" in `drop` comment
In my view this is technically more correct as `Waker` actually implements `Drop` (which calls the `drop` method) whereas `RawWaker` does not.
str.lines() docstring: clarify that line endings are not returned
Previously, the str.lines() docstring stated that lines are split at line endings, but not whether those were returned or not. This new version of the docstring states this explicitly, avoiding the need of getting to doctests to get an answer to this FAQ.
Rename `assert_uninit_valid` intrinsic
It's not about "uninit" anymore but about "filling with 0x01 bytes" so the name should at least try to reflect that.
This is actually not fully correct though, as it does still panic for all uninit with `-Zstrict-init-checks`. I'm not sure what the best way is to deal with that not causing confusion. I guess we could just remove the flag? I don't think having it makes a lot of sense anymore with the direction that we have chose to go. It could be relevant again if #100423 lands so removing it may be a bit over eager.
r? `@RalfJung`
Implement `From<bool>` for f32, f64
As is required for trait implementations, these are insta-stable. Given there is a release tomorrow and this needs FCP, I set 1.65 as the stable version.
`@rustbot` label +A-floating-point +C-feature-request +needs-fcp +relnotes +S-waiting-on-review +T-libs-api -T-libs
Implement va_list and va_arg for s390x FFI
Following the s390x ELF ABI and based on the clang implementation, provide appropriate definitions of va_list in library/core/src/ffi/mod.rs and va_arg handling in compiler/rustc_codegen_llvm/src/va_arg.rs.
Fixes the following test cases on s390x:
src/test/run-make-fulldeps/c-link-to-rust-va-list-fn src/test/ui/abi/variadic-ffi.rs
Fixes https://github.com/rust-lang/rust/issues/84628.
Following the s390x ELF ABI and based on the clang implementation,
provide appropriate definitions of va_list in library/core/src/ffi/mod.rs
and va_arg handling in compiler/rustc_codegen_llvm/src/va_arg.rs.
Fixes the following test cases on s390x:
src/test/run-make-fulldeps/c-link-to-rust-va-list-fn
src/test/ui/abi/variadic-ffi.rs
Fixes https://github.com/rust-lang/rust/issues/84628.
Revert "Replace usage of `ResumeTy` in async lowering with `Context`"
Reverts rust-lang/rust#105250
Fixes: #105501
Following instructions from [forge](https://forge.rust-lang.org/compiler/reviews.html#reverts).
This change introduced a breaking change that is not actionable nor relevant, and is blocking updates to our toolchain. Along with other comments on the CL marking issues that are fixed by reverts, reverting is best until these issues can be resolved
cc. `@Swatinem`
Support call and drop terminators in custom mir
The only caveat with this change is that cleanup blocks are not supported. I would like to add them, but it's not quite clear to me what the best way to do that is, so I'll have to think about it some more.
r? ``@oli-obk``
Previously, the str.lines() docstring stated that lines are split at line
endings, but not whether those were returned or not. This new version of the
docstring states this explicitly, avoiding the need of getting to doctests to
get an answer to this FAQ.
This commit
- Renames `Split*::{as_str -> remainder}` as it seems less confusing
- Makes `remainder` return Option<&str> to distinguish between
"iterator is exhausted" and "the tail is empty"
doc: Fix a few small issues
Hey, while reading through the (awesome) stdlib docs, I found a few minor typos.
* A few typos around generic types (`;` vs `,`)
* Use inline code formatting for code fragments
* One instance of wrong wording
Custom MIR: Many more improvements
Commits are each atomic changes, best reviewed one at a time, with the exception that the last commit includes all the documentation.
### First commit
Unsafetyck was not correctly disabled before for `dialect = "built"` custom MIR. This is fixed and a regression test is added.
### Second commit
Implements `Discriminant`, `SetDiscriminant`, and `SwitchInt`.
### Third commit
Implements indexing, field, and variant projections.
### Fourth commit
Documents the previous commits and everything else.
There is some amount of weirdness here due to having to beat Rust syntax into cooperating with MIR concepts, but it hopefully should not be too much. All of it is documented.
r? `@oli-obk`
Remove wrong note for short circuiting operators
They *are* representable by traits, even if the short-circuiting behaviour requires a different approach than the non-short-circuiting operators. For an example proposal, see the postponed [RFC 2722](https://github.com/rust-lang/rfcs/pull/2722). As it is not accurate, remove most of the note.
They *are* representable by traits, even if the short-circuiting
behaviour requires a different approach than the non-short-circuiting
operators. For an example proposal, see the postponed RFC 2722.
As it is not accurate, reword the note.
attempt to clarify align_to docs
This is not intended the change the docs at all, but `@workingjubilee` said the current docs are incomprehensible to some people so this is an attempt to fix that. No idea if it helps, so -- feedback welcome.
(Please let's not use this to discuss *changing* the spec. Whoever wants to change the spec should please make a separate PR for that.)
Replace usage of `ResumeTy` in async lowering with `Context`
Replaces using `ResumeTy` / `get_context` in favor of using `&'static mut Context<'_>`.
Usage of the `'static` lifetime here is technically "cheating", and replaces the raw pointer in `ResumeTy` and the `get_context` fn that pulls the correct lifetimes out of thin air.
fixes https://github.com/rust-lang/rust/issues/104828 and https://github.com/rust-lang/rust/pull/104321#issuecomment-1336363077
r? `@oli-obk`
Replaces using `ResumeTy` / `get_context` in favor of using `&'static mut Context<'_>`.
Usage of the `'static` lifetime here is technically "cheating", and replaces
the raw pointer in `ResumeTy` and the `get_context` fn that pulls the
correct lifetimes out of thin air.
PartialEq: PERs are homogeneous
PartialEq claims that it corresponds to a PER, but that is only a well-defined statement when `Rhs == Self`. There is no standard notion of PER on a relation between two different sets/types. So move this out of the first paragraph and clarify this.
Adjust inlining attributes around panic_immediate_abort
The goal of `panic_immediate_abort` is to permit the panic runtime and formatting code paths to be optimized away. But while poking through some disassembly of a small program compiled with that option, I found that was not the case. Enabling LTO did address that specific issue, but enabling LTO is a steep price to pay for this feature doing its job.
This PR fixes that, by tweaking two things:
* All the slice indexing functions that we `const_eval_select` on get `#[inline]`. `objdump -dC` told me that originally some `_ct` functions could end up in an executable. I won't pretend to understand what's going on there.
* Normalize attributes across all `panic!` wrappers: use `inline(never) + cold` normally, and `inline` when `panic_immediate_abort` is enabled.
But also, with LTO and `panic_immediate_abort` enabled, this patch knocks ~709 kB out of the `.text` segment of `librustc_driver.so`. That is slightly surprising to me, my best theory is that this shifts some inlining earlier in compilation, enabling some subsequent optimizations. The size improvement of `librustc_driver.so` with `panic_immediate_abort` due to this patch is greater with LTO than without LTO, which I suppose backs up this theory.
I do not know how to test this. I would quite like to, because I think what this is solving was an accidental regression. This only works with `-Zbuild-std` which is a cargo flag, and thus can't be used in a rustc codegen test.
r? `@thomcc`
---
I do not seriously think anyone is going to use a compiler built with `panic_immediate_abort`, but I wanted a big complicated Rust program to try this out on, and the compiler is such.
Add `type_ascribe!` macro as placeholder syntax for type ascription
This makes it still possible to test the internal semantics of type ascription even once the `:`-syntax is removed from the parser. The macro now gets used in a bunch of UI tests that test the semantics and not syntax of type ascription.
I might have forgotten a few tests but this should hopefully be most of them. The remaining ones will certainly be found once type ascription is removed from the parser altogether.
Part of #101728
`#![custom_mir]`: Various improvements
This PR makes a bunch of improvements to `#![custom_mir]`. Ideally this would be 4 PRs, one for each commit, but those would take forever to get merged and be a pain to juggle. Should still be reviewed one commit at a time though.
### Commit 1: Support arbitrary `let`
Before this change, all locals used in the body need to be declared at the top of the `mir!` invocation, which is rather annoying. We attempt to change that.
Unfortunately, we still have the requirement that the output of the `mir!` macro must resolve, typecheck, etc. Because of that, we can't just accept this in the THIR -> MIR parser because something like
```rust
{
let x = 0;
Goto(other)
}
other = {
RET = x;
Return()
}
```
will fail to resolve. Instead, the implementation does macro shenanigans to find the let declarations and extract them as part of the `mir!` macro. That *works*, but it is fairly complicated and degrades debuginfo by quite a bit. Specifically, the spans for any statements and declarations that are affected by this are completely wrong. My guess is that this is a net improvement though.
One way to recover some of the debuginfo would be to not support type annotations in the `let` statements, which would allow us to parse like `let $stmt:stmt`. That seems quite surprising though.
### Commit 2: Parse consts
Reuses most of the const parsing from regular Mir building for building custom mir
### Commit 3: Parse statics
Statics are slightly weird because the Mir primitive associated with them is a reference/pointer to them, so this is factored out separately.
### Commit 4: Fix some spans
A bunch of the spans were non-ideal, so we adjust them to be much more helpful.
r? `@oli-obk`
Add slice to the stack allocated string comment
Precise that the "stack allocated string" is not a string but a string slice.
``@rustbot`` label +A-docs
Stop peeling the last iteration of the loop in `Vec::resize_with`
`resize_with` uses the `ExtendWith` code that peels the last iteration:
341d8b8a2c/library/alloc/src/vec/mod.rs (L2525-L2529)
But that's kinda weird for `ExtendFunc` because it does the same thing on the last iteration anyway:
341d8b8a2c/library/alloc/src/vec/mod.rs (L2494-L2502)
So this just has it use the normal `extend`-from-`TrustedLen` code instead.
r? `@ghost`
Manually implement PartialEq for Option<T> and specialize non-nullable types
This PR manually implements `PartialEq` and `StructuralPartialEq` for `Option`, which seems to produce slightly better codegen than the automatically derived implementation.
It also allows specializing on the `core::num::NonZero*` and `core::ptr::NonNull` types, taking advantage of the niche optimization by transmuting the `Option<T>` to `T` to be compared directly, which can be done in just two instructions.
A comparison of the original, new and specialized code generation is available [here](https://godbolt.org/z/dE4jxdYsa).
Previously, async constructs would be lowered to "normal" generators,
with an additional `from_generator` / `GenFuture` shim in between to
convert from `Generator` to `Future`.
The compiler will now special-case these generators internally so that
async constructs will *directly* implement `Future` without the need
to go through the `from_generator` / `GenFuture` shim.
The primary motivation for this change was hiding this implementation
detail in stack traces and debuginfo, but it can in theory also help
the optimizer as there is less abstractions to see through.
Constify remaining `Layout` methods
Makes the methods on `Layout` that aren't yet unstably const, under the same feature and issue, #67521. Most of them required no changes, only non-trivial change is probably constifying `ValidAlignment` which may affect #102072
Deprecate the unstable `ptr_to_from_bits` feature
I propose that we deprecate the (unstable!) `to_bits` and `from_bits` methods on raw pointers. (With the intent to ~~remove them once `addr` has been around long enough to make the transition easy on people -- maybe another 6 weeks~~ remove them fairly soon after, as the strict and expose versions have been around for a while already.)
The APIs that came from the strict provenance explorations (#95228) are a more holistic version of these, and things like `.expose_addr()` work for the "that cast looks sketchy" case even if the full strict provenance stuff never happens. (As a bonus, `addr` is even shorter than `to_bits`, though it is only applicable if people can use full strict provenance! `addr` is *not* a direct replacement for `to_bits`.) So I think it's fine to move away from the `{to|from}_bits` methods, and encourage the others instead.
That also resolves the worry that was brought up (I forget where) that `q.to_bits()` and `(*q).to_bits()` both work if `q` is a pointer-to-floating-point, as they also have a `to_bits` method.
Tracking issue #91126
Code search: https://github.com/search?l=Rust&p=1&q=ptr_to_from_bits&type=Code
For potential pushback, some users in case they want to chime in
- `@RSSchermer` 365bb68541/arwa/src/html/custom_element.rs (L105)
- `@strax` 99616d1dbf/openexr/src/core/alloc.rs (L36)
- `@MiSawa` 577c622358/crates/kernel/src/timer.rs (L50)
Add slice methods for indexing via an array of indices.
Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.
---
# Update:
- Based on feedback, removed the `&[T]` variant of this API, and removed the requirements for the indices to be sorted.
# Description
This adds the following slice methods to `core`:
```rust
impl<T> [T] {
pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Option<[&mut T; N]>;
}
```
This allows creating multiple mutable references to disjunct positions in a slice, which previously required writing some awkward code with `split_at_mut()` or `iter_mut()`. For the bound-checked variant, the indices are checked against each other and against the bounds of the slice, which requires `N * (N + 1) / 2` comparison operations.
This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many
Care has been taken that the implementation passes miri borrow checks, and generates straight-forward assembly (though this was only checked on x86_64).
# Example
```rust
let v = &mut [1, 2, 3, 4];
let [a, b] = v.get_many_mut([0, 2]).unwrap();
std::mem::swap(a, b);
*v += 100;
assert_eq!(v, &[3, 2, 101, 4]);
```
# Codegen Examples
<details>
<summary>Click to expand!</summary>
Disclaimer: Taken from local tests with the standalone implementation.
## Unchecked Indexing:
```rust
pub unsafe fn example_unchecked(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
slice.get_many_unchecked_mut(indices)
}
```
```nasm
example_unchecked:
mov rcx, qword, ptr, [r9]
mov r8, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
lea rcx, [rdx, +, 8*rcx]
lea r8, [rdx, +, 8*r8]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rax], rcx
mov qword, ptr, [rax, +, 8], r8
mov qword, ptr, [rax, +, 16], rdx
ret
```
## Checked Indexing (Option):
```rust
pub unsafe fn example_option(slice: &mut [usize], indices: [usize; 3]) -> Option<[&mut usize; 3]> {
slice.get_many_mut(indices)
}
```
```nasm
mov r10, qword, ptr, [r9, +, 8]
mov rcx, qword, ptr, [r9, +, 16]
cmp rcx, r10
je .LBB0_7
mov r9, qword, ptr, [r9]
cmp rcx, r9
je .LBB0_7
cmp rcx, r8
jae .LBB0_7
cmp r10, r9
je .LBB0_7
cmp r9, r8
jae .LBB0_7
cmp r10, r8
jae .LBB0_7
lea r8, [rdx, +, 8*r9]
lea r9, [rdx, +, 8*r10]
lea rcx, [rdx, +, 8*rcx]
mov qword, ptr, [rax], r8
mov qword, ptr, [rax, +, 8], r9
mov qword, ptr, [rax, +, 16], rcx
ret
.LBB0_7:
mov qword, ptr, [rax], 0
ret
```
## Checked Indexing (Panic):
```rust
pub fn example_panic(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
let len = slice.len();
match slice.get_many_mut(indices) {
Some(s) => s,
None => {
let tmp = indices;
index_many::sorted_bound_check_failed(&tmp, len)
}
}
}
```
```nasm
example_panic:
sub rsp, 56
mov rax, qword, ptr, [r9]
mov r10, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
cmp r9, r10
je .LBB0_6
cmp r9, rax
je .LBB0_6
cmp r9, r8
jae .LBB0_6
cmp r10, rax
je .LBB0_6
cmp rax, r8
jae .LBB0_6
cmp r10, r8
jae .LBB0_6
lea rax, [rdx, +, 8*rax]
lea r8, [rdx, +, 8*r10]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rcx], rax
mov qword, ptr, [rcx, +, 8], r8
mov qword, ptr, [rcx, +, 16], rdx
mov rax, rcx
add rsp, 56
ret
.LBB0_6:
mov qword, ptr, [rsp, +, 32], rax
mov qword, ptr, [rsp, +, 40], r10
mov qword, ptr, [rsp, +, 48], r9
lea rcx, [rsp, +, 32]
mov edx, 3
call index_many::bound_check_failed
ud2
```
</details>
# Extensions
There are multiple optional extensions to this.
## Indexing With Ranges
This could easily be expanded to allow indexing with `[I; N]` where `I: SliceIndex<Self>`. I wanted to keep the initial implementation simple, so I didn't include it yet.
## Panicking Variant
We could also add this method:
```rust
impl<T> [T] {
fn index_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
}
```
This would work similar to the regular index operator and panic with out-of-bound indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the `Option` variant.
This is implemented in the standalone implementation, and used as basis for the codegen examples here and there.
Pin::new_unchecked: discuss pinning closure captures
Regardless of how the discussion in https://github.com/rust-lang/rust/pull/102737 turns out, pinning closure captures is super subtle business and probably worth discussing separately.
Fix doc example for `wrapping_abs`
The `max` variable is unused. This change introduces the `min_plus` variable, to make the example similar to the one from `saturating_abs`. An alternative would be to remove the unused variable.
add examples to chunks remainder methods.
add examples to chunks remainder methods.
my motivation for adding the examples was to make it very clear that the state of the iterator (in terms of where its cursor lies) has no effect on what remainder returns.
Also fixed some links to rchunk remainder methods.
This moves the stable sort implementation to the core::slice::sort module. By
virtue of being in core it can't access `Vec`. The two `Vec` used by merge sort,
`buf` and `runs`, are modelled as custom types that implement the very limited
required `Vec` interface with the help of provided allocation and free
functions. This is done to allow future re-use of functions and logic between
stable and unstable sort. Such as `insert_head`.
clarify that realloc refreshes pointer provenance even when the allocation remains in-place
This [matches what C does](https://en.cppreference.com/w/c/memory/realloc):
> The original pointer ptr is invalidated and any access to it is undefined behavior (even if reallocation was in-place).
Cc `@rust-lang/wg-allocators`
`VecDeque::resize` should re-use the buffer in the passed-in element
Today it always copies it for *every* appended element, but one of those clones is avoidable.
This adds `iter::repeat_n` (https://github.com/rust-lang/rust/issues/104434) as the primitive needed to do this. If this PR is acceptable, I'll also use this in `Vec` rather than its custom `ExtendElement` type & infrastructure that is harder to share between multiple different containers:
101e1822c3/library/alloc/src/vec/mod.rs (L2479-L2492)
* Fix doc examples for Platforms with underaligned integer primitives.
* Mutable pointer doc examples use mutable pointers.
* Fill out tracking issue.
* Minor formatting changes.
Enforce that `dyn*` coercions are actually pointer-sized
Implement a perma-unstable, rudimentary `PointerSized` trait to enforce `dyn*` casts are `usize`-sized for now, at least to prevent ICEs and weird codegen issues from cropping up after monomorphization since currently we enforce *nothing*.
This probably can/should be removed in favor of a more sophisticated trait for handling `dyn*` conversions when we decide on one, but I just want to get something up for discussion and experimentation for now.
r? ```@eholk``` cc ```@tmandry``` (though feel free to claim/reassign)
Fixes#102141Fixes#102173
Simplify some pointer method implementations
- Make `pointer::with_metadata_of` const (+simplify implementation) (cc #75091)
- Simplify implementation of various pointer methods
r? ```@scottmcm```
----
`from_raw_parts::<T>(this, metadata(self))` was annoying me for a while and I've finally figured out how it should _actually_ be done.
Fix mod_inv termination for the last iteration
On usize=u64 platforms, the 4th iteration would overflow the `mod_gate` back to 0. Similarly for usize=u32 platforms, the 3rd iteration would overflow much the same way.
I tested various approaches to resolving this, including approaches with `saturating_mul` and `widening_mul` to a double usize. Turns out LLVM likes `mul_with_overflow` the best. In fact now, that LLVM can see the iteration count is limited, it will happily unroll the loop into a nice linear sequence.
You will also notice that the code around the loop got simplified somewhat. Now that LLVM is handling the loop nicely, there isn’t any more reasons to manually unroll the first iteration out of the loop (though looking at the code today I’m not sure all that complexity was necessary in the first place).
Fixes#103361
Support `#[track_caller]` on async fns
Adds `#[track_caller]` to the generator that is created when we desugar the async fn.
Fixes#78840
Open questions:
- What is the performance impact of adding `#[track_caller]` to every `GenFuture`'s `poll(...)` function, even if it's unused (i.e., the parent span does not set `#[track_caller]`)? We might need to set it only conditionally, if the indirection causes overhead we don't want.
x86_64 SSE2 fast-path for str.contains(&str) and short needles
Based on Wojciech Muła's [SIMD-friendly algorithms for substring searching](http://0x80.pl/articles/simd-strfind.html#sse-avx2)
The two-way algorithm is Big-O efficient but it needs to preprocess the needle
to find a "critical factorization" of it. This additional work is significant
for short needles. Additionally it mostly advances needle.len() bytes at a time.
The SIMD-based approach used here on the other hand can advance based on its
vector width, which can exceed the needle length. Except for pathological cases,
but due to being limited to small needles the worst case blowup is also small.
benchmarks taken on a Zen2, compiled with `-Ccodegen-units=1`:
```
OLD:
test str::bench_contains_16b_in_long ... bench: 504 ns/iter (+/- 14) = 5061 MB/s
test str::bench_contains_2b_repeated_long ... bench: 948 ns/iter (+/- 175) = 2690 MB/s
test str::bench_contains_32b_in_long ... bench: 445 ns/iter (+/- 6) = 5732 MB/s
test str::bench_contains_bad_naive ... bench: 130 ns/iter (+/- 1) = 569 MB/s
test str::bench_contains_bad_simd ... bench: 84 ns/iter (+/- 8) = 880 MB/s
test str::bench_contains_equal ... bench: 142 ns/iter (+/- 7) = 394 MB/s
test str::bench_contains_short_long ... bench: 677 ns/iter (+/- 25) = 3768 MB/s
test str::bench_contains_short_short ... bench: 27 ns/iter (+/- 2) = 2074 MB/s
NEW:
test str::bench_contains_16b_in_long ... bench: 82 ns/iter (+/- 0) = 31109 MB/s
test str::bench_contains_2b_repeated_long ... bench: 73 ns/iter (+/- 0) = 34945 MB/s
test str::bench_contains_32b_in_long ... bench: 71 ns/iter (+/- 1) = 35929 MB/s
test str::bench_contains_bad_naive ... bench: 7 ns/iter (+/- 0) = 10571 MB/s
test str::bench_contains_bad_simd ... bench: 97 ns/iter (+/- 41) = 762 MB/s
test str::bench_contains_equal ... bench: 4 ns/iter (+/- 0) = 14000 MB/s
test str::bench_contains_short_long ... bench: 73 ns/iter (+/- 0) = 34945 MB/s
test str::bench_contains_short_short ... bench: 12 ns/iter (+/- 0) = 4666 MB/s
```
Make `pointer::byte_offset_from` more generic
As suggested by https://github.com/rust-lang/rust/issues/96283#issuecomment-1288792955 (cc ````@scottmcm),```` make `pointer::byte_offset_from` work on pointers of different types. `byte_offset_from` really doesn't care about pointer types, so this is totally fine and, for example, allows patterns like this:
```rust
ptr::addr_of!(x.b).byte_offset_from(ptr::addr_of!(x))
```
The only possible downside is that this removes the `T` == `U` hint to inference, but I don't think this matter much. I don't think there are a lot of cases where you'd want to use `byte_offset_from` with a pointer of unbounded type (and in such cases you can just specify the type).
````@rustbot```` label +T-libs-api
Fix inconsistent rounding of 0.5 when formatted to 0 decimal places
As described in #70336, when displaying values to zero decimal places the value of 0.5 is rounded to 1, which is inconsistent with the display of other half-integer values which round to even.
From testing the flt2dec implementation, it looks like this comes down to the condition in the fixed-width Dragon implementation where an empty buffer is treated as a case to apply rounding up. I believe the change below fixes it and updates only the relevant tests.
Nevertheless I am aware this is very much a core piece of functionality, so please take a very careful look to make sure I haven't missed anything. I hope this change does not break anything in the wider ecosystem as having a consistent rounding behaviour in floating point formatting is in my opinion a useful feature to have.
Resolves#70336
interpret: support for per-byte provenance
Also factors the provenance map into its own module.
The third commit does the same for the init mask. I can move it in a separate PR if you prefer.
Fixes https://github.com/rust-lang/miri/issues/2181
r? `@oli-obk`
- bump simd compare to 32bytes
- import small slice compare code from memmem crate
- try a few different probe bytes to avoid degenerate cases
- but special-case 2-byte needles
Add `rustc_deny_explicit_impl`
Also adjust `E0322` error message to be more general, since it's used for `DiscriminantKind` and `Pointee` as well.
Also add `rustc_deny_explicit_impl` on the `Tuple` and `Destruct` marker traits.
Remove unused symbols and diagnostic items
As the title suggests, this removes unused symbols from `sym::` and `#[rustc_diagnostic_item]` annotations that weren't mentioned anywhere.
Originally I tried to use grep, to find symbols and item names that are never mentioned via `sym::name`, however this produced a lot of false positives (?), for example clippy matching on `Symbol::as_str` or macros "implicitly" adding `sym::`. I ended up fixing all these false positives (?) by hand, but tbh I'm not sure if it was worth it...
Based on Wojciech Muła's "SIMD-friendly algorithms for substring searching"[0]
The two-way algorithm is Big-O efficient but it needs to preprocess the needle
to find a "criticla factorization" of it. This additional work is significant
for short needles. Additionally it mostly advances needle.len() bytes at a time.
The SIMD-based approach used here on the other hand can advance based on its
vector width, which can exceed the needle length. Except for pathological cases,
but due to being limited to small needles the worst case blowup is also small.
benchmarks taken on a Zen2:
```
16CGU, OLD:
test str::bench_contains_short_short ... bench: 27 ns/iter (+/- 1)
test str::bench_contains_short_long ... bench: 667 ns/iter (+/- 29)
test str::bench_contains_bad_naive ... bench: 131 ns/iter (+/- 2)
test str::bench_contains_bad_simd ... bench: 130 ns/iter (+/- 2)
test str::bench_contains_equal ... bench: 148 ns/iter (+/- 4)
16CGU, NEW:
test str::bench_contains_short_short ... bench: 8 ns/iter (+/- 0)
test str::bench_contains_short_long ... bench: 135 ns/iter (+/- 4)
test str::bench_contains_bad_naive ... bench: 130 ns/iter (+/- 2)
test str::bench_contains_bad_simd ... bench: 292 ns/iter (+/- 1)
test str::bench_contains_equal ... bench: 3 ns/iter (+/- 0)
1CGU, OLD:
test str::bench_contains_short_short ... bench: 30 ns/iter (+/- 0)
test str::bench_contains_short_long ... bench: 713 ns/iter (+/- 17)
test str::bench_contains_bad_naive ... bench: 131 ns/iter (+/- 3)
test str::bench_contains_bad_simd ... bench: 130 ns/iter (+/- 3)
test str::bench_contains_equal ... bench: 148 ns/iter (+/- 6)
1CGU, NEW:
test str::bench_contains_short_short ... bench: 10 ns/iter (+/- 0)
test str::bench_contains_short_long ... bench: 111 ns/iter (+/- 0)
test str::bench_contains_bad_naive ... bench: 135 ns/iter (+/- 3)
test str::bench_contains_bad_simd ... bench: 274 ns/iter (+/- 2)
test str::bench_contains_equal ... bench: 4 ns/iter (+/- 0)
```
[0] http://0x80.pl/articles/simd-strfind.html#sse-avx2
The `max` variable is unused. This change introduces the `min_plus`
variable, to make the example similar to the one from `saturating_abs`.
An alternative would be to remove the unused variable.
Fixed some `_i32` notation in `maybe_uninit`’s doc
This PR just changed two lines in the documentation for `MaybeUninit`:
```rs
let val = 0x12345678i32;
```
was changed to:
```rs
let val = 0x12345678_i32;
```
in two doctests, making the values a tad easier to read.
It does not seem like there are other literals needing this change in the file.
Stabilize const char convert
Split out `const_char_from_u32_unchecked` from `const_char_convert` and stabilize the rest, i.e. stabilize the following functions:
```Rust
impl char {
pub const fn from_u32(self, i: u32) -> Option<char>;
pub const fn from_digit(self, num: u32, radix: u32) -> Option<char>;
pub const fn to_digit(self, radix: u32) -> Option<u32>;
}
// Available through core::char and std::char
mod char {
pub const fn from_u32(i: u32) -> Option<char>;
pub const fn from_digit(num: u32, radix: u32) -> Option<char>;
}
```
And put the following under the `from_u32_unchecked` const stability gate as it needs `Option::unwrap` which isn't const-stable (yet):
```Rust
impl char {
pub const unsafe fn from_u32_unchecked(i: u32) -> char;
}
// Available through core::char and std::char
mod char {
pub const unsafe fn from_u32_unchecked(i: u32) -> char;
}
```
cc the tracking issue #89259 (which I'd like to keep open for `const_char_from_u32_unchecked`).
Use `derive_const` and rm manual StructuralEq impl
This does not change any semantics of the impl except for the const stability. It should be fine because trait methods and const bounds can never be used in stable without enabling `const_trait_impl`.
cc `@oli-obk`
Add small clarification around using pointers derived from references
r? `@RalfJung`
One question about your example from https://github.com/rust-lang/libs-team/issues/122: at what point does UB arise? If writing 0 does not cause UB and the reference `x` is never read or written to (explicitly or implicitly by being wrapped in another data structure) after the call to `foo`, does UB only arise when dropping the value? I don't really get that since I thought references were always supposed to point to valid data?
```rust
fn foo(x: &mut NonZeroI32) {
let ptr = x as *mut NonZeroI32;
unsafe { ptr.cast::<i32>().write(0); } // no UB here
// What now? x is considered garbage when?
}
```
Improve performance of `rem_euclid()` for signed integers
such code is copy from
https://github.com/rust-lang/rust/blob/master/library/std/src/f32.rs and
https://github.com/rust-lang/rust/blob/master/library/std/src/f64.rs
using `r+rhs.abs()` is faster than calc it with an if clause. Bench result:
```
$ cargo bench
Compiling div-euclid v0.1.0 (/me/div-euclid)
Finished bench [optimized] target(s) in 1.01s
Running unittests src/lib.rs (target/release/deps/div_euclid-7a4530ca7817d1ef)
running 7 tests
test tests::it_works ... ignored
test tests::bench_aaabs ... bench: 10,498,793 ns/iter (+/- 104,360)
test tests::bench_aadefault ... bench: 11,061,862 ns/iter (+/- 94,107)
test tests::bench_abs ... bench: 10,477,193 ns/iter (+/- 81,942)
test tests::bench_default ... bench: 10,622,983 ns/iter (+/- 25,119)
test tests::bench_zzabs ... bench: 10,481,971 ns/iter (+/- 43,787)
test tests::bench_zzdefault ... bench: 11,074,976 ns/iter (+/- 29,633)
test result: ok. 0 passed; 0 failed; 1 ignored; 6 measured; 0 filtered out; finished in 19.35s
```
It seems that, default `rem_euclid` triggered a branch prediction, thus `bench_default` is faster than `bench_aadefault` and `bench_aadefault`, which shuffles the order of calculations. but all of them slower than what it was in `f64`'s and `f32`'s `rem_euclid`, thus I submit this PR.
bench code:
```rust
#![feature(test)]
extern crate test;
fn rem_euclid(a:i32,rhs:i32)->i32{
let r = a % rhs;
if r < 0 { r + rhs.abs() } else { r }
}
#[cfg(test)]
mod tests {
use super::*;
use test::Bencher;
use rand::prelude::*;
use rand::rngs::SmallRng;
const N:i32=1000;
#[test]
fn it_works() {
let a: i32 = 7; // or any other integer type
let b = 4;
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
for i in &d {
for j in &n {
assert_eq!(i.rem_euclid(*j),rem_euclid(*i,*j));
}
}
assert_eq!(rem_euclid(a,b), 3);
assert_eq!(rem_euclid(-a,b), 1);
assert_eq!(rem_euclid(a,-b), 3);
assert_eq!(rem_euclid(-a,-b), 1);
}
#[bench]
fn bench_aaabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_aadefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_abs(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_default(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_zzabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_zzdefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
}
```
Add the `#[derive_const]` attribute
Closes#102371. This is a minimal patchset for the attribute to work. There are no restrictions on what traits this attribute applies to.
r? `````@oli-obk`````
Make `Hash`, `Hasher` and `BuildHasher` `#[const_trait]` and make `Sip` const `Hasher`
This PR enables using Hashes in const context.
r? ``@fee1-dead``
This patch allows the usage of the `track_caller` annotation on
generators, as well as sets them conditionally if the parent also has
`track_caller` set.
Also add this annotation on the `GenFuture`'s `poll()` function.
Add support for custom mir
This implements rust-lang/compiler-team#564 . Details about the design, motivation, etc. can be found in there.
r? ```@oli-obk```
Add context to compiler error message
Changed `creates a temporary which is freed while still in use` to `creates a temporary value which is freed while still in use`.
Const Compare for Tuples
Makes the impls for Tuples of ~const `PartialEq` types also `PartialEq`, impls for Tuples of ~const `PartialOrd` types also `PartialOrd`, for Tuples of ~const `Ord` types also `Ord`.
behind the `#![feature(const_cmp)]` gate.
~~Do not merge before #104113 is merged because I want to use this feature to clean up the new test that I added there.~~
r? ``@fee1-dead``
Fix `const_fn_trait_ref_impl`, add test for it
#99943 broke `#[feature(const_fn_trait_ref_impl)]`, this PR fixes this and adds a test for it.
r? ````@fee1-dead````
The type is unsafe and now exposed to the whole crate.
Document it properly and add an unsafe method so the
caller can make it visible that something unsafe is happening.
Implement `std::marker::Tuple`, use it in `extern "rust-call"` and `Fn`-family traits
Implements rust-lang/compiler-team#537
I made a few opinionated decisions in this implementation, specifically:
1. Enforcing `extern "rust-call"` on fn items during wfcheck,
2. Enforcing this for all functions (not just ones that have bodies),
3. Gating this `Tuple` marker trait behind its own feature, instead of grouping it into (e.g.) `unboxed_closures`.
Still needing to be done:
1. Enforce that `extern "rust-call"` `fn`-ptrs are well-formed only if they have 1/2 args and the second one implements `Tuple`. (Doing this would fix ICE in #66696.)
2. Deny all explicit/user `impl`s of the `Tuple` trait, kinda like `Sized`.
3. Fixing `Tuple` trait built-in impl for chalk, so that chalkification tests are un-broken.
Open questions:
1. Does this need t-lang or t-libs signoff?
Fixes#99820
fix a comment in UnsafeCell::new
There are several safe methods that access the inner value: `into_inner` has existed since forever and `get_mut` also exists since recently. So this comment seems just wrong. But `&self` methods return raw pointers and thus require unsafe code (though the methods themselves are still safe).
benchmark result:
```
$ cargo bench
Compiling div-euclid v0.1.0 (/me/div-euclid)
Finished bench [optimized] target(s) in 1.01s
Running unittests src/lib.rs (target/release/deps/div_euclid-7a4530ca7817d1ef)
running 7 tests
test tests::it_works ... ignored
test tests::bench_aaabs ... bench: 10,498,793 ns/iter (+/- 104,360)
test tests::bench_aadefault ... bench: 11,061,862 ns/iter (+/- 94,107)
test tests::bench_abs ... bench: 10,477,193 ns/iter (+/- 81,942)
test tests::bench_default ... bench: 10,622,983 ns/iter (+/- 25,119)
test tests::bench_zzabs ... bench: 10,481,971 ns/iter (+/- 43,787)
test tests::bench_zzdefault ... bench: 11,074,976 ns/iter (+/- 29,633)
test result: ok. 0 passed; 0 failed; 1 ignored; 6 measured; 0 filtered out; finished in 19.35s
```
benchmark code:
```rust
#![feature(test)]
extern crate test;
#[inline(always)]
fn rem_euclid(a:i32,rhs:i32)->i32{
let r = a % rhs;
if r < 0 {
// if rhs is `integer::MIN`, rhs.wrapping_abs() == rhs.wrapping_abs,
// thus r.wrapping_add(rhs.wrapping_abs()) == r.wrapping_add(rhs) == r - rhs,
// which suits our need.
// otherwise, rhs.wrapping_abs() == -rhs, which won't overflow since r is negative.
r.wrapping_add(rhs.wrapping_abs())
} else {
r
}
}
#[cfg(test)]
mod tests {
use super::*;
use test::Bencher;
use rand::prelude::*;
use rand::rngs::SmallRng;
const N:i32=1000;
#[test]
fn it_works() {
let a: i32 = 7; // or any other integer type
let b = 4;
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
for i in &d {
for j in &n {
assert_eq!(i.rem_euclid(*j),rem_euclid(*i,*j));
}
}
assert_eq!(rem_euclid(a,b), 3);
assert_eq!(rem_euclid(-a,b), 1);
assert_eq!(rem_euclid(a,-b), 3);
assert_eq!(rem_euclid(-a,-b), 1);
}
#[bench]
fn bench_aaabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_aadefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_abs(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_default(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_zzabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_zzdefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
}
```
such code is copy from
https://github.com/rust-lang/rust/blob/master/library/std/src/f32.rs
and
https://github.com/rust-lang/rust/blob/master/library/std/src/f64.rs
using r+rhs.abs() is faster than calc it directly.
Bench result:
```
$ cargo bench
Compiling div-euclid v0.1.0 (/me/div-euclid)
Finished bench [optimized] target(s) in 1.01s
Running unittests src/lib.rs (target/release/deps/div_euclid-7a4530ca7817d1ef)
running 7 tests
test tests::it_works ... ignored
test tests::bench_aaabs ... bench: 10,498,793 ns/iter (+/- 104,360)
test tests::bench_aadefault ... bench: 11,061,862 ns/iter (+/- 94,107)
test tests::bench_abs ... bench: 10,477,193 ns/iter (+/- 81,942)
test tests::bench_default ... bench: 10,622,983 ns/iter (+/- 25,119)
test tests::bench_zzabs ... bench: 10,481,971 ns/iter (+/- 43,787)
test tests::bench_zzdefault ... bench: 11,074,976 ns/iter (+/- 29,633)
test result: ok. 0 passed; 0 failed; 1 ignored; 6 measured; 0 filtered out; finished in 19.35s
```
bench code:
```
#![feature(test)]
extern crate test;
fn rem_euclid(a:i32,rhs:i32)->i32{
let r = a % rhs;
if r < 0 { r + rhs.abs() } else { r }
}
#[cfg(test)]
mod tests {
use super::*;
use test::Bencher;
use rand::prelude::*;
use rand::rngs::SmallRng;
const N:i32=1000;
#[test]
fn it_works() {
let a: i32 = 7; // or any other integer type
let b = 4;
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
for i in &d {
for j in &n {
assert_eq!(i.rem_euclid(*j),rem_euclid(*i,*j));
}
}
assert_eq!(rem_euclid(a,b), 3);
assert_eq!(rem_euclid(-a,b), 1);
assert_eq!(rem_euclid(a,-b), 3);
assert_eq!(rem_euclid(-a,-b), 1);
}
#[bench]
fn bench_aaabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_aadefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_abs(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_default(b: &mut Bencher) {
let d:Vec<i32>=(-N..=N).collect();
let n:Vec<i32>=(-N..0).chain(1..=N).collect();
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
#[bench]
fn bench_zzabs(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=rem_euclid(*i,*j);
}
}
res
});
}
#[bench]
fn bench_zzdefault(b: &mut Bencher) {
let mut d:Vec<i32>=(-N..=N).collect();
let mut n:Vec<i32>=(-N..0).chain(1..=N).collect();
let mut rng=SmallRng::from_seed([1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,21]);
d.shuffle(&mut rng);
n.shuffle(&mut rng);
d.shuffle(&mut rng);
b.iter(||{
let mut res=0;
for i in &d {
for j in &n {
res+=i.rem_euclid(*j);
}
}
res
});
}
}
```
The new implementation doesn't use weak lang items and instead changes
`#[alloc_error_handler]` to an attribute macro just like
`#[global_allocator]`.
The attribute will generate the `__rg_oom` function which is called by
the compiler-generated `__rust_alloc_error_handler`. If no `__rg_oom`
function is defined in any crate then the compiler shim will call
`__rdl_oom` in the alloc crate which will simply panic.
This also fixes link errors with `-C link-dead-code` with
`default_alloc_error_handler`: `__rg_oom` was previously defined in the
alloc crate and would attempt to reference the `oom` lang item, even if
it didn't exist. This worked as long as `__rg_oom` was excluded from
linking since it was not called.
This is a prerequisite for the stabilization of
`default_alloc_error_handler` (#102318).
The signature for new was
```
fn new<F>(f: F) -> Lazy<T, F>
```
Notably, with `F` unconstrained, `T` can be literally anything, and just
`let _ = Lazy::new(|| 92)` would not typecheck.
This historiacally was a necessity -- `new` is a `const` function, it
couldn't have any bounds. Today though, we can move `new` under the `F:
FnOnce() -> T` bound, which gives the compiler enough data to infer the
type of T from closure.