Implement `RawWaker` and `Waker` getters for underlying pointers
implement #87021
New APIs:
- `RawWaker::data(&self) -> *const ()`
- `RawWaker::vtable(&self) -> &'static RawWakerVTable`
- ~`Waker::as_raw_waker(&self) -> &RawWaker`~ `Waker::as_raw(&self) -> &RawWaker`
This third one is an auxiliary function to make the two APIs above more useful. Since we can only get `&Waker` in `Future::poll`, without this, we need to `transmute` it into `&RawWaker` (relying on `repr(transparent)`) in order to access its data/vtable pointers.
~Not sure if it should be named `as_raw` or `as_raw_waker`. Seems we always use `as_<something-raw>` instead of just `as_raw`. But `as_raw_waker` seems not quite consistent with `Waker::from_raw`.~ As suggested in https://github.com/rust-lang/rust/pull/91828#discussion_r770729837, use `as_raw`.
Carefully remove bounds checks from some chunk iterator functions
So, I was writing code that requires the equivalent of `rchunks(N).rev()` (which isn't the same as forward `chunks(N)` — in particular, if the buffer length is not a multiple of `N`, I must handle the "remainder" first).
I happened to look at the codegen output of the function (I was actually interested in whether or not a nested loop was being unrolled — it was), and noticed that in the outer `rchunks(n).rev()` loop, LLVM seemed to be unable to remove the bounds checks from the iteration: https://rust.godbolt.org/z/Tnz4MYY8f (this panic was from the split_at in `RChunks::next_back`).
After doing some experimentation, it seems all of the `next_back` in the non-exact chunk iterators have the issue: (`Chunks::next_back`, `RChunks::next_back`, `ChunksMut::next_back`, and `RChunksMut::next_back`)...
Even worse, the forward `rchunks` iterators sometimes have the issue as well (... but only sometimes). For example https://rust.godbolt.org/z/oGhbqv53r has bounds checks, but if I uncomment the loop body, it manages to remove the check (which is bizarre, since I'd expect the opposite...). I suspect it's highly dependent on the surrounding code, so I decided to remove the bounds checks from them anyway. Overall, this change includes:
- All `next_back` functions on the non-`Exact` iterators (e.g. `R?Chunks(Mut)?`).
- All `next` functions on the non-exact rchunks iterators (e.g. `RChunks(Mut)?`).
I wasn't able to catch any of the other chunk iterators failing to remove the bounds checks (I checked iterations over `r?chunks(_exact)?(_mut)?` with constant chunk sizes under `-O3`, `-Os`, and `-Oz`), which makes sense, since these were the cases where it was harder to prove the bounds check correct to remove...
In fact, it took quite a bit of thinking to convince myself that using unchecked_ here was valid — so I'm not really surprised that LLVM had trouble (although compilers are slightly better at this sort of reasoning than humans). A consequence of that is the fact that the `// SAFETY` comment for these are... kinda long...
---
I didn't do this for, or even think about it for, any of the other iteration methods; just `next` and `next_back` (where it mattered). If this PR is accepted, I'll file a follow up for someone (possibly me) to look at the others later (in particular, `nth`/`nth_back` looked like they had similar logic), but I wanted to do this now, as IMO `next`/`next_back` are the most important here, since they're what gets used by the iteration protocol.
---
Note: While I don't expect this to impact performance directly, the panic is a side effect, which would otherwise not exist in these loops. That is, this could prevent the compiler from being able to move/remove/otherwise rework a loop over these iterators (as an example, it could not delete the code for a loop whose body computes a value which doesn't get used).
Also, some like to be able to have confidence this code has no panicking branches in the optimized code, and "no bounds checks" is kinda part of the selling point of Rust's iterators anyway.
Remove deprecated and unstable slice_partition_at_index functions
They have been deprecated since commit 01ac5a97c9
which was part of the 1.49.0 release, so from the point of nightly,
11 releases ago.
Make `char::DecodeUtf16::size_hist` more precise
New implementation takes into account contents of `self.buf` and rounds lower bound up instead of down.
Fixes#88762
Revival of #88763
Add `intrinsics::const_deallocate`
Tracking issue: #79597
Related: #91884
This allows deallocation of a memory allocated by `intrinsics::const_allocate`. At the moment, this can be only used to reduce memory usage, but in the future this may be useful to detect memory leaks (If an allocated memory remains after evaluation, raise an error...?).
impl Not for !
The lack of this impl caused trouble for me in some degenerate cases of macro-generated code of the form `if !$cond {...}`, even without `feature(never_type)` on a stable compiler. Namely if `$cond` contains a `return` or `break` or similar diverging expression, which would otherwise be perfectly legal in boolean position, the code previously failed to compile with:
```console
error[E0600]: cannot apply unary operator `!` to type `!`
--> library/core/tests/ops.rs:239:8
|
239 | if !return () {}
| ^^^^^^^^^^ cannot apply unary operator `!`
```
Simplification of BigNum::bit_length
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
Partially stabilize `maybe_uninit_extra`
This covers:
```rust
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
```
It does not cover the const-ness of `write` under `const_maybe_uninit_write` nor the const-ness of `assume_init_read` (this commit adds `const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
This covers:
impl<T> MaybeUninit<T> {
pub unsafe fn assume_init_read(&self) -> T { ... }
pub unsafe fn assume_init_drop(&mut self) { ... }
}
It does not cover the const-ness of `write` under
`const_maybe_uninit_write` nor the const-ness of
`assume_init_read` (this commit adds
`const_maybe_uninit_assume_init_read` for that).
FCP: https://github.com/rust-lang/rust/issues/63567#issuecomment-958590287.
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
As indicated in the comment, the BigNum::bit_length function could be
optimized by using CLZ, which is often a single instruction instead a
loop.
I think the code is also simpler now without the loop.
I added some additional tests for Big8x3 and Big32x40 to ensure that
there were no regressions.
disable test with self-referential generator on Miri
Running the libcore test suite in Miri currently fails due to the known incompatibility of self-referential generators with Miri's aliasing checks (https://github.com/rust-lang/unsafe-code-guidelines/issues/148). So let's disable that test in Miri for now.
Allow reverse iteration of lowercase'd/uppercase'd chars
The PR implements `DoubleEndedIterator` trait for `ToLowercase` and `ToUppercase`.
This enables reverse iteration of lowercase/uppercase variants of character sequences.
One of use cases: determining whether a char sequence is a suffix of another one.
Example:
```rust
fn endswith_ignore_case(s1: &str, s2: &str) -> bool {
for eob in s1
.chars()
.flat_map(|c| c.to_lowercase())
.rev()
.zip_longest(s2.chars().flat_map(|c| c.to_lowercase()).rev())
{
match eob {
EitherOrBoth::Both(c1, c2) => {
if c1 != c2 {
return false;
}
}
EitherOrBoth::Left(_) => return true,
EitherOrBoth::Right(_) => return false,
}
}
true
}
```
Revert "Temporarily rename int_roundings functions to avoid conflicts"
This reverts commit 3ece63b64e.
This should be okay because #90329 has been merged.
r? `@joshtriplett`
Mark defaulted `PartialEq`/`PartialOrd` methods as const
WIthout it, `const` impls of these traits are unpleasant to write. I think this kind of change is allowed now. although it looks like it might require some Miri tweaks. Let's find out.
r? ```@fee1-dead```
Do array-slice equality via array equality, rather than always via slices
~~Draft because it needs a rebase after #91766 eventually gets through bors.~~
This enables the optimizations from #85828 to be used for array-to-slice comparisons too, not just array-to-array.
For example, <https://play.rust-lang.org/?version=nightly&mode=release&edition=2021&gist=5f9ba69b3d5825a782f897c830d3a6aa>
```rust
pub fn demo(x: &[u8], y: [u8; 4]) -> bool {
*x == y
}
```
Currently writes the array to stack for no reason:
```nasm
sub rsp, 4
mov dword ptr [rsp], edx
cmp rsi, 4
jne .LBB0_1
mov eax, dword ptr [rdi]
cmp eax, dword ptr [rsp]
sete al
add rsp, 4
ret
.LBB0_1:
xor eax, eax
add rsp, 4
ret
```
Whereas with the change in this PR it just compares it directly:
```nasm
cmp rsi, 4
jne .LBB1_1
cmp dword ptr [rdi], edx
sete al
ret
.LBB1_1:
xor eax, eax
ret
```
Constify `bool::then{,_some}`
Note on `~const Drop`: it has no effect when called from runtime functions, when called from const contexts, the trait system ensures that the type can be dropped in const contexts.
This'll still go via slices eventually for large arrays, but this way slice comparisons to short arrays can use the same memcmp-avoidance tricks.
Added some tests for all the combinations to make sure I didn't accidentally infinitely-recurse something.
Minor improvements to `future::join!`'s implementation
This is a follow-up from #91645, regarding [some remarks I made](https://rust-lang.zulipchat.com/#narrow/stream/187312-wg-async-foundations/topic/join!/near/264293660).
Mainly:
- it hides the recursive munching through a private `macro`, to avoid leaking such details (a corollary is getting rid of the need to use ``@`` to disambiguate);
- it uses a `match` binding, _outside_ the `async move` block, to better match the semantics from function-like syntax;
- it pre-pins the future before calling into `poll_fn`, since `poll_fn`, alone, cannot guarantee that its capture does not move (to clarify: I believe the previous code was sound, thanks to the outer layer of `async`. But I find it clearer / more robust to refactorings this way 🙂).
- it uses `@ibraheemdev's` very neat `.ready()?`;
- it renames `Took` to `Taken` for consistency with `Done` (tiny nit 😄).
~~TODO~~Done:
- [x] Add unit tests to enforce the function-like `:value` semantics are respected.
r? `@nrc`
Sync portable-simd to remove autosplats
This PR syncs portable-simd in up to a8385522ad in order to address the type inference breakages documented on nightly in https://github.com/rust-lang/rust/issues/90904 by removing the vector + scalar binary operations (called "autosplats", "broadcasting", or "rank promotion", depending on who you ask) that allow `{scalar} + &'_ {scalar}` to fail in some cases, because it becomes possible the programmer may have meant `{scalar} + &'_ {vector}`.
A few quality-of-life improvements make their way in as well:
- Lane counts can now go to 64, as LLVM seems to have fixed their miscompilation for those.
- `{i,u}8x64` to `__m512i` is now available.
- a bunch of `#[must_use]` notes appear throughout the module.
- Some implementations, mostly instances of `impl core::ops::{Op}<Simd> for Simd` that aren't `{vector} + {vector}` (e.g. `{vector} + &'_ {vector}`), leverage some generics and `where` bounds now to make them easier to understand by reducing a dozen implementations into one (and make it possible for people to open the docs on less burly devices).
- And some internal-only improvements.
None of these changes should affect a beta backport, only actual users of `core::simd` (and most aren't even visible in the programmatic sense), though I can extract an even more minimal changeset for beta if necessary. It seemed simpler to just keep moving forward.
Make `array::{try_from_fn, try_map}` and `Iterator::try_find` generic over `Try`
Fixes#85115
This only updates unstable functions.
`array::try_map` didn't actually exist before; this adds it under the still-open tracking issue #79711 from the old PR #79713.
Tracking issue for the new trait: #91285
This would also solve the return type question in for the proposed `Iterator::try_reduce` in #87054
disable tests in Miri that take too long
Comparing slices of length `usize::MAX` diverges in Miri. In fact these tests even diverge in rustc unless `-O` is passed. I tried this code to check that:
```rust
#![feature(slice_take)]
const EMPTY_MAX: &'static [()] = &[(); usize::MAX];
fn main() {
let mut slice: &[_] = &[(); usize::MAX];
println!("1");
assert_eq!(Some(&[] as _), slice.take(usize::MAX..));
println!("2");
let remaining: &[_] = EMPTY_MAX;
println!("3");
assert_eq!(remaining, slice);
println!("4");
}
```
So, disable these tests in Miri for now.
Fixes 85115
This only updates unstable functions.
`array::try_map` didn't actually exist before, despite the tracking issue 79711 still being open from the old PR 79713.
Fix Iterator::advance_by contract inconsistency
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Methods that were only blocked on `const_panic` have been stabilized.
The remaining methods of `duration_consts_2` are all related to floats,
and as such have been placed behind the `duration_consts_float` feature
gate.
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Stabilize `const_raw_ptr_deref` for `*const T`
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is behind the
same feature gate as mutable references.
closes https://github.com/rust-lang/rust/issues/51911
pub use core::simd;
A portable abstraction over SIMD has been a major pursuit in recent years for several programming languages. In Rust, `std::arch` offers explicit SIMD acceleration via compiler intrinsics, but it does so at the cost of having to individually maintain each and every single such API, and is almost completely `unsafe` to use. `core::simd` offers safe abstractions that are resolved to the appropriate SIMD instructions by LLVM during compilation, including scalar instructions if that is all that is available.
`core::simd` is enabled by the `#![portable_simd]` nightly feature tracked in https://github.com/rust-lang/rust/issues/86656 and is introduced here by pulling in the https://github.com/rust-lang/portable-simd repository as a subtree. We built the repository out-of-tree to allow faster compilation and a stochastic test suite backed by the proptest crate to verify that different targets, features, and optimizations produce the same result, so that using this library does not introduce any surprises. As these tests are technically non-deterministic, and thus can introduce overly interesting Heisenbugs if included in the rustc CI, they are visible in the commit history of the subtree but do nothing here. Some tests **are** introduced via the documentation, but these use deterministic asserts.
There are multiple unsolved problems with the library at the current moment, including a want for better documentation, technical issues with LLVM scalarizing and lowering to libm, room for improvement for the APIs, and so far I have not added the necessary plumbing for allowing the more experimental or libm-dependent APIs to be used. However, I thought it would be prudent to open this for review in its current condition, as it is both usable and it is likely I am going to learn something else needs to be fixed when bors tries this out.
The major types are
- `core::simd::Simd<T, N>`
- `core::simd::Mask<T, N>`
There is also the `LaneCount` struct, which, together with the SimdElement and SupportedLaneCount traits, limit the implementation's maximum support to vectors we know will actually compile and provide supporting logic for bitmasks. I'm hoping to simplify at least some of these out of the way as the compiler and library evolve.
These tests just verify some basic APIs of core::simd function, and
guarantees that attempting to access the wrong things doesn't work.
The majority of tests are stochastic, and so remain upstream, but
a few deterministic tests arrive in the subtree as doc tests.
This stabilizes dereferencing immutable raw pointers in const contexts.
It does not stabilize `*mut T` dereferencing. This is placed behind the
`const_raw_mut_ptr_deref` feature gate.
Add #[must_use] to expensive computations
The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite a bit. I'm open to wording changes.
For some reason clippy flagged four `BTreeSet` methods but didn't say boo about equivalent ones on `HashSet`. I stared at them for a while but I can't figure out the difference so I added the `HashSet` ones in.
```rust
// Flagged by clippy.
alloc::collections::btree_set::BTreeSet<T> fn difference<'a>(&'a self, other: &'a BTreeSet<T>) -> Difference<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn symmetric_difference<'a>(&'a self, other: &'a BTreeSet<T>) -> SymmetricDifference<'a, T>
alloc::collections::btree_set::BTreeSet<T> fn intersection<'a>(&'a self, other: &'a BTreeSet<T>) -> Intersection<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn union<'a>(&'a self, other: &'a BTreeSet<T>) -> Union<'a, T>;
// Ignored by clippy, but not by me.
std::collections::HashSet<T, S> fn difference<'a>(&'a self, other: &'a HashSet<T, S>) -> Difference<'a, T, S>;
std::collections::HashSet<T, S> fn symmetric_difference<'a>(&'a self, other: &'a HashSet<T, S>) -> SymmetricDifference<'a, T, S>
std::collections::HashSet<T, S> fn intersection<'a>(&'a self, other: &'a HashSet<T, S>) -> Intersection<'a, T, S>;
std::collections::HashSet<T, S> fn union<'a>(&'a self, other: &'a HashSet<T, S>) -> Union<'a, T, S>;
```
Parent issue: #89692
r? ```@joshtriplett```
Implement split_array and split_array_mut
This implements `[T]::split_array::<const N>() -> (&[T; N], &[T])` and `[T; N]::split_array::<const M>() -> (&[T; M], &[T])` and their mutable equivalents. These are another few “missing” array implementations now that const generics are a thing, similar to #74373, #75026, etc. Fixes#74674.
This implements `[T; N]::split_array` returning an array and a slice. Ultimately, this is probably not what we want, we would want the second return value to be an array of length N-M, which will likely be possible with future const generics enhancements. We need to implement the array method now though, to immediately shadow the slice method. This way, when the slice methods get stabilized, calling them on an array will not be automatic through coercion, so we won't have trouble stabilizing the array methods later (cf. into_iter debacle).
An unchecked version of `[T]::split_array` could also be added as in #76014. This would not be needed for `[T; N]::split_array` as that can be compile-time checked. Edit: actually, since split_at_unchecked is internal-only it could be changed to be split_array-only.
Make more `From` impls `const` (libcore)
Adding `const` to `From` implementations in the core. `rustc_const_unstable` attribute is not added to unstable implementations.
Tracking issue: #88674
<details>
<summary>Done</summary><div>
- `T` from `T`
- `T` from `!`
- `Option<T>` from `T`
- `Option<&T>` from `&Option<T>`
- `Option<&mut T>` from `&mut Option<T>`
- `Cell<T>` from `T`
- `RefCell<T>` from `T`
- `UnsafeCell<T>` from `T`
- `OnceCell<T>` from `T`
- `Poll<T>` from `T`
- `u32` from `char`
- `u64` from `char`
- `u128` from `char`
- `char` from `u8`
- `AtomicBool` from `bool`
- `AtomicPtr<T>` from `*mut T`
- `AtomicI(bits)` from `i(bits)`
- `AtomicU(bits)` from `u(bits)`
- `i(bits)` from `NonZeroI(bits)`
- `u(bits)` from `NonZeroU(bits)`
- `NonNull<T>` from `Unique<T>`
- `NonNull<T>` from `&T`
- `NonNull<T>` from `&mut T`
- `Unique<T>` from `&mut T`
- `Infallible` from `!`
- `TryIntError` from `!`
- `TryIntError` from `Infallible`
- `TryFromSliceError` from `Infallible`
- `FromResidual for Option<T>`
</div></details>
<details>
<summary>Remaining</summary><dev>
- `NonZero` from `NonZero`
These can't be made const at this time because these use Into::into.
https://github.com/rust-lang/rust/blob/master/library/core/src/convert/num.rs#L393
- `std`, `alloc`
There may still be many implementations that can be made `const`.
</div></details>
Automatic exponential formatting in Debug
Context: See [this comment from the libs team](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-853454204)
---
Makes `"{:?}"` switch to exponential for floats based on magnitude. The libs team suggested exploring this idea in the discussion thread for RFC rust-lang/rfcs#2729. (**note:** this is **not** an implementation of the RFC; it is an implementation of one of the alternatives)
Thresholds chosen were 1e-4 and 1e16. Justification described [here](https://github.com/rust-lang/rfcs/pull/2729#issuecomment-864482954).
**This will require a crater run.**
---
As mentioned in the commit message of 8731d4dfb4, this behavior will not apply when a precision is supplied, because I wanted to preserve the following existing and useful behavior of `{:.PREC?}` (which recursively applies `{:.PREC}` to floats in a struct):
```rust
assert_eq!(
format!("{:.2?}", [100.0, 0.000004]),
"[100.00, 0.00]",
)
```
I looked around and am not sure where there are any tests that actually use this in the test suite, though?
All things considered, I'm surprised that this change did not seem to break even a single existing test in `x.py test --stage 2`. (even when I tried a smaller threshold of 1e6)
The unifying theme for this commit is weak, admittedly. I put together a
list of "expensive" functions when I originally proposed this whole
effort, but nobody's cared about that criterion. Still, it's a decent
way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite
a bit.
Speedup int log10 branchless
This is achieved with a branchless bit-twiddling implementation of the case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
```
Benchmark on an M1 Mac Mini:
```
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
```
The 128 bit case remains ridiculously slow because llvm fails to optimize division by a constant 128-bit value to multiplications. This could be worked around but it seems preferable to fix this in llvm.
From u32 up, table lookup (like suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but requires a hardware `leading_zeros` to be viable, and might clog up the cache.
Add 'core::array::from_fn' and 'core::array::try_from_fn'
These auxiliary methods fill uninitialized arrays in a safe way and are particularly useful for elements that don't implement `Default`.
```rust
// Foo doesn't implement Default
struct Foo(usize);
let _array = core::array::from_fn::<_, _, 2>(|idx| Foo(idx));
```
Different from `FromIterator`, it is guaranteed that the array will be fully filled and no error regarding uninitialized state will be throw. In certain scenarios, however, the creation of an **element** can fail and that is why the `try_from_fn` function is also provided.
```rust
#[derive(Debug, PartialEq)]
enum SomeError {
Foo,
}
let array = core::array::try_from_fn(|i| Ok::<_, SomeError>(i));
assert_eq!(array, Ok([0, 1, 2, 3, 4]));
let another_array = core::array::try_from_fn(|_| Err(SomeError::Foo));
assert_eq!(another_array, Err(SomeError::Foo));
```
Fix Lower/UpperExp formatting for integers and precision zero
Fixes the integer part of #89493 (I daren't touch the floating-point formatting code). The issue is that the "subtracted" precision essentially behaves like extra trailing zeros, but this is not currently reflected in the code properly.
implement advance_(back_)_by on more iterators
Add more efficient, non-default implementations for `feature(iter_advance_by)` (#77404) on more iterators and adapters.
This PR only contains implementations where skipping over items doesn't elide any observable side-effects such as user-provided closures or `clone()` functions. I'll put those in a separate PR.
const fn for option copied, take & replace
Tracking issue: [#67441](https://github.com/rust-lang/rust/issues/67441)
Adding const fn for the copied, take and replace method of Option. Also adding necessary unit test.
It's my first contribution so I am pretty sure I don't know what I'm doing but there's a first for everything!
Constify ?-operator for Result and Option
Try to make `?`-operator usable in `const fn` with `Result` and `Option`, see #74935 . Note that the try-operator itself was constified in #87237.
TODO
* [x] Add tests for const T -> T conversions
* [x] cleanup commits
* [x] Remove `#![allow(incomplete_features)]`
* [?] Await decision in #86808 - I'm not sure
* [x] Await support for parsing `~const` in bootstrapping compiler
* [x] Tracking issue(s)? - #88674
Make `Duration` respect `width` when formatting using `Debug`
When printing or writing a `std::time::Duration` using `Debug` formatting, it previously completely ignored any specified `width`. This is unlike types like integers and floats, which do pad to `width`, for both `Display` and `Debug`, though not all types consider `width` in their `Debug` output (see e.g. #30164). Curiously, `Duration`'s `Debug` formatting *did* consider `precision`.
This PR makes `Duration` pad to `width` just like integers and floats, so that
```rust
format!("|{:8?}|", Duration::from_millis(1234))
```
returns
```
|1.234s |
```
Before you ask "who formats `Debug` output?", note that `Duration` doesn't actually implement `Display`, so `Debug` is currently the only way to format `Duration`s. I think that's wrong, and `Duration` should get a `Display` implementation, but in the meantime there's no harm in making the `Debug` formatting respect `width` rather than ignore it.
I chose the default alignment to be left-aligned. The general rule Rust uses is: numeric types are right-aligned by default, non-numeric types left-aligned. It wasn't clear to me whether `Duration` is a numeric type or not. The fact that a formatted `Duration` can end with suffixes of variable length (`"s"`, `"ms"`, `"µs"`, etc.) made me lean towards left-alignment, but it would be trivial to change it.
Fixes issue #88059.
These functions are unstable, but because they're inherent they still
introduce conflicts with stable trait functions in crates. Temporarily
rename them to fix these conflicts, until we can resolve those conflicts
in a better way.
This is achieved with a branchless bit-twiddling implementation of the
case x < 100_000, and using this as building block.
Benchmark on an Intel i7-8700K (Coffee Lake):
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 165 169 4 2.42% x 0.98
num::int_log::u8_log10_random 438 423 -15 -3.42% x 1.04
num::int_log::u8_log10_random_small 438 423 -15 -3.42% x 1.04
num::int_log::u16_log10_predictable 633 417 -216 -34.12% x 1.52
num::int_log::u16_log10_random 908 471 -437 -48.13% x 1.93
num::int_log::u16_log10_random_small 945 471 -474 -50.16% x 2.01
num::int_log::u32_log10_predictable 1,496 1,340 -156 -10.43% x 1.12
num::int_log::u32_log10_random 1,076 873 -203 -18.87% x 1.23
num::int_log::u32_log10_random_small 1,145 874 -271 -23.67% x 1.31
num::int_log::u64_log10_predictable 4,005 3,171 -834 -20.82% x 1.26
num::int_log::u64_log10_random 1,247 1,021 -226 -18.12% x 1.22
num::int_log::u64_log10_random_small 1,265 921 -344 -27.19% x 1.37
num::int_log::u128_log10_predictable 39,667 39,579 -88 -0.22% x 1.00
num::int_log::u128_log10_random 6,456 6,696 240 3.72% x 0.96
num::int_log::u128_log10_random_small 4,108 3,903 -205 -4.99% x 1.05
Benchmark on an M1 Mac Mini:
name old ns/iter new ns/iter diff ns/iter diff % speedup
num::int_log::u8_log10_predictable 143 130 -13 -9.09% x 1.10
num::int_log::u8_log10_random 375 325 -50 -13.33% x 1.15
num::int_log::u8_log10_random_small 376 325 -51 -13.56% x 1.16
num::int_log::u16_log10_predictable 500 322 -178 -35.60% x 1.55
num::int_log::u16_log10_random 794 405 -389 -48.99% x 1.96
num::int_log::u16_log10_random_small 1,035 405 -630 -60.87% x 2.56
num::int_log::u32_log10_predictable 1,144 894 -250 -21.85% x 1.28
num::int_log::u32_log10_random 832 786 -46 -5.53% x 1.06
num::int_log::u32_log10_random_small 832 787 -45 -5.41% x 1.06
num::int_log::u64_log10_predictable 2,681 2,057 -624 -23.27% x 1.30
num::int_log::u64_log10_random 1,015 806 -209 -20.59% x 1.26
num::int_log::u64_log10_random_small 1,004 795 -209 -20.82% x 1.26
num::int_log::u128_log10_predictable 56,825 56,526 -299 -0.53% x 1.01
num::int_log::u128_log10_random 9,056 8,861 -195 -2.15% x 1.02
num::int_log::u128_log10_random_small 1,528 1,527 -1 -0.07% x 1.00
The 128 bit case remains ridiculously slow because llvm fails to optimize division by
a constant 128-bit value to multiplications. This could be worked around but it seems
preferable to fix this in llvm.
From u32 up, table lookup (like suggested here
https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813) is still
faster, but requires a hardware leading_zero to be viable, and might clog up the
cache.
Stabilize `UnsafeCell::raw_get()`
This PR stabilizes the associated function `UnsafeCell::raw_get()`. The FCP has [already completed](https://github.com/rust-lang/rust/issues/66358#issuecomment-899095068). While there was some discussion about the naming after the close of the FCP, it looks like people have agreed on this name. Still, it would probably be best if a `libs-api` member had a look at this and stated whether more discussion is needed.
While I was at it, I added some tests for `UnsafeCell`, because there were barely any.
Closes#66358.
fix: move test that require mut to another
Adding TODOs for Option::take and Option::copied
TODO to FIXME + moving const stability under normal
Moving const stability attr under normal stab attr
move more rustc stability attributes
Added the `Option::unzip()` method
* Adds the `Option::unzip()` method to turn an `Option<(T, U)>` into `(Option<T>, Option<U>)` under the `unzip_option` feature
* Adds tests for both `Option::unzip()` and `Option::zip()`, I noticed that `.zip()` didn't have any
* Adds `#[inline]` to a few of `Option`'s methods that were missing it
implement TrustedLen for Flatten/FlatMap if the U: IntoIterator == [T; N]
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
resolves#87094
Update Rust Float-Parsing Algorithms to use the Eisel-Lemire algorithm.
# Summary
Rust, although it implements a correct float parser, has major performance issues in float parsing. Even for common floats, the performance can be 3-10x [slower](https://arxiv.org/pdf/2101.11408.pdf) than external libraries such as [lexical](https://github.com/Alexhuszagh/rust-lexical) and [fast-float-rust](https://github.com/aldanor/fast-float-rust).
Recently, major advances in float-parsing algorithms have been developed by Daniel Lemire, along with others, and implement a fast, performant, and correct float parser, with speeds up to 1200 MiB/s on Apple's M1 architecture for the [canada](0e2b5d163d/data/canada.txt) dataset, 10x faster than Rust's 130 MiB/s.
In addition, [edge-cases](https://github.com/rust-lang/rust/issues/85234) in Rust's [dec2flt](868c702d0c/library/core/src/num/dec2flt) algorithm can lead to over a 1600x slowdown relative to efficient algorithms. This is due to the use of Clinger's correct, but slow [AlgorithmM and Bellepheron](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.45.4152&rep=rep1&type=pdf), which have been improved by faster big-integer algorithms and the Eisel-Lemire algorithm, respectively.
Finally, this algorithm provides substantial improvements in the number of floats the Rust core library can parse. Denormal floats with a large number of digits cannot be parsed, due to use of the `Big32x40`, which simply does not have enough digits to round a float correctly. Using a custom decimal class, with much simpler logic, we can parse all valid decimal strings of any digit count.
```rust
// Issue in Rust's dec2fly.
"2.47032822920623272088284396434110686182e-324".parse::<f64>(); // Err(ParseFloatError { kind: Invalid })
```
# Solution
This pull request implements the Eisel-Lemire algorithm, modified from [fast-float-rust](https://github.com/aldanor/fast-float-rust) (which is licensed under Apache 2.0/MIT), along with numerous modifications to make it more amenable to inclusion in the Rust core library. The following describes both features in fast-float-rust and improvements in fast-float-rust for inclusion in core.
**Documentation**
Extensive documentation has been added to ensure the code base may be maintained by others, which explains the algorithms as well as various associated constants and routines. For example, two seemingly magical constants include documentation to describe how they were derived as follows:
```rust
// Round-to-even only happens for negative values of q
// when q ≥ −4 in the 64-bit case and when q ≥ −17 in
// the 32-bitcase.
//
// When q ≥ 0,we have that 5^q ≤ 2m+1. In the 64-bit case,we
// have 5^q ≤ 2m+1 ≤ 2^54 or q ≤ 23. In the 32-bit case,we have
// 5^q ≤ 2m+1 ≤ 2^25 or q ≤ 10.
//
// When q < 0, we have w ≥ (2m+1)×5^−q. We must have that w < 2^64
// so (2m+1)×5^−q < 2^64. We have that 2m+1 > 2^53 (64-bit case)
// or 2m+1 > 2^24 (32-bit case). Hence,we must have 2^53×5^−q < 2^64
// (64-bit) and 2^24×5^−q < 2^64 (32-bit). Hence we have 5^−q < 2^11
// or q ≥ −4 (64-bit case) and 5^−q < 2^40 or q ≥ −17 (32-bitcase).
//
// Thus we have that we only need to round ties to even when
// we have that q ∈ [−4,23](in the 64-bit case) or q∈[−17,10]
// (in the 32-bit case). In both cases,the power of five(5^|q|)
// fits in a 64-bit word.
const MIN_EXPONENT_ROUND_TO_EVEN: i32;
const MAX_EXPONENT_ROUND_TO_EVEN: i32;
```
This ensures maintainability of the code base.
**Improvements for Disguised Fast-Path Cases**
The fast path in float parsing algorithms attempts to use native, machine floats to represent both the significant digits and the exponent, which is only possible if both can be exactly represented without rounding. In practice, this means that the significant digits must be 53-bits or less and the then exponent must be in the range `[-22, 22]` (for an f64). This is similar to the existing dec2flt implementation.
However, disguised fast-path cases exist, where there are few significant digits and an exponent above the valid range, such as `1.23e25`. In this case, powers-of-10 may be shifted from the exponent to the significant digits, discussed at length in https://github.com/rust-lang/rust/issues/85198.
**Digit Parsing Improvements**
Typically, integers are parsed from string 1-at-a-time, requiring unnecessary multiplications which can slow down parsing. An approach to parse 8 digits at a time using only 3 multiplications is described in length [here](https://johnnylee-sde.github.io/Fast-numeric-string-to-int/). This leads to significant performance improvements, and is implemented for both big and little-endian systems.
**Unsafe Changes**
Relative to fast-float-rust, this library makes less use of unsafe functionality and clearly documents it. This includes the refactoring and documentation of numerous unsafe methods undesirably marked as safe. The original code would look something like this, which is deceptively marked as safe for unsafe functionality.
```rust
impl AsciiStr {
#[inline]
pub fn step_by(&mut self, n: usize) -> &mut Self {
unsafe { self.ptr = self.ptr.add(n) };
self
}
}
...
#[inline]
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
// the first character is 'e'/'E' and scientific mode is enabled
let start = *s;
s.step();
...
}
```
The new code clearly documents safety concerns, and does not mark unsafe functionality as safe, leading to better safety guarantees.
```rust
impl AsciiStr {
/// Advance the view by n, advancing it in-place to (n..).
pub unsafe fn step_by(&mut self, n: usize) -> &mut Self {
// SAFETY: same as step_by, safe as long n is less than the buffer length
self.ptr = unsafe { self.ptr.add(n) };
self
}
}
...
/// Parse the scientific notation component of a float.
fn parse_scientific(s: &mut AsciiStr<'_>) -> i64 {
let start = *s;
// SAFETY: the first character is 'e'/'E' and scientific mode is enabled
unsafe {
s.step();
}
...
}
```
This allows us to trivially demonstrate the new implementation of dec2flt is safe.
**Inline Annotations Have Been Removed**
In the previous implementation of dec2flt, inline annotations exist practically nowhere in the entire module. Therefore, these annotations have been removed, which mostly does not impact [performance](https://github.com/aldanor/fast-float-rust/issues/15#issuecomment-864485157).
**Fixed Correctness Tests**
Numerous compile errors in `src/etc/test-float-parse` were present, due to deprecation of `time.clock()`, as well as the crate dependencies with `rand`. The tests have therefore been reworked as a [crate](https://github.com/Alexhuszagh/rust/tree/master/src/etc/test-float-parse), and any errors in `runtests.py` have been patched.
**Undefined Behavior**
An implementation of `check_len` which relied on undefined behavior (in fast-float-rust) has been refactored, to ensure that the behavior is well-defined. The original code is as follows:
```rust
#[inline]
pub fn check_len(&self, n: usize) -> bool {
unsafe { self.ptr.add(n) <= self.end }
}
```
And the new implementation is as follows:
```rust
/// Check if the slice at least `n` length.
fn check_len(&self, n: usize) -> bool {
n <= self.as_ref().len()
}
```
Note that this has since been fixed in [fast-float-rust](https://github.com/aldanor/fast-float-rust/pull/29).
**Inferring Binary Exponents**
Rather than explicitly store binary exponents, this new implementation infers them from the decimal exponent, reducing the amount of static storage required. This removes the requirement to store [611 i16s](868c702d0c/library/core/src/num/dec2flt/table.rs (L8)).
# Code Size
The code size, for all optimizations, does not considerably change relative to before for stripped builds, however it is **significantly** smaller prior to stripping the resulting binaries. These binary sizes were calculated on x86_64-unknown-linux-gnu.
**new**
Using rustc version 1.55.0-dev.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|400k|300K
1|396k|292K
2|392k|292K
3|392k|296K
s|396k|292K
z|396k|292K
**old**
Using rustc version 1.53.0-nightly.
opt-level|size|size(stripped)
|:-:|:-:|:-:|
0|3.2M|304K
1|3.2M|292K
2|3.1M|284K
3|3.1M|284K
s|3.1M|284K
z|3.1M|284K
# Correctness
The dec2flt implementation passes all of Rust's unittests and comprehensive float parsing tests, along with numerous other tests such as Nigel Toa's comprehensive float [tests](https://github.com/nigeltao/parse-number-fxx-test-data) and Hrvoje Abraham [strtod_tests](https://github.com/ahrvoje/numerics/blob/master/strtod/strtod_tests.toml). Therefore, it is unlikely that this algorithm will incorrectly round parsed floats.
# Issues Addressed
This will fix and close the following issues:
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Implementation is based off fast-float-rust, with a few notable changes.
- Some unsafe methods have been removed.
- Safe methods with inherently unsafe functionality have been removed.
- All unsafe functionality is documented and provably safe.
- Extensive documentation has been added for simpler maintenance.
- Inline annotations on internal routines has been removed.
- Fixed Python errors in src/etc/test-float-parse/runtests.py.
- Updated test-float-parse to be a library, to avoid missing rand dependency.
- Added regression tests for #31109 and #31407 in core tests.
- Added regression tests for #31109 and #31407 in ui tests.
- Use the existing slice primitive to simplify shared dec2flt methods
- Remove Miri ignores from dec2flt, due to faster parsing times.
- resolves#85198
- resolves#85214
- resolves#85234
- fixes#31407
- fixes#31109
- fixes#53015
- resolves#68396
- closes https://github.com/aldanor/fast-float-rust/issues/15
Due to #20400 the corresponding TrustedLen impls need a helper trait
instead of directly adding `Item = &[T;N]` bounds.
Since TrustedLen is a public trait this in turn means
the helper trait needs to be public. Since it's just a workaround
for a compiler deficit it's marked hidden, unstable and unsafe.
This only works if arrays are passed directly instead of array iterators
because we need to be sure that they have not been advanced before
Flatten does its size calculation.
Add Integer::log variants
_This is another attempt at landing https://github.com/rust-lang/rust/pull/70835, which was approved by the libs team but failed on Android tests through Bors. The text copied here is from the original issue. The only change made so far is the addition of non-`checked_` variants of the log methods._
_Tracking issue: #70887_
---
This implements `{log,log2,log10}` methods for all integer types. The implementation was provided by `@substack` for use in the stdlib.
_Note: I'm not big on math, so this PR is a best effort written with limited knowledge. It's likely I'll be getting things wrong, but happy to learn and correct. Please bare with me._
## Motivation
Calculating the logarithm of a number is a generally useful operation. Currently the stdlib only provides implementations for floats, which means that if we want to calculate the logarithm for an integer we have to cast it to a float and then back to an int.
> would be nice if there was an integer log2 instead of having to either use the f32 version or leading_zeros() which i have to verify the results of every time to be sure
_— [`@substack,` 2020-03-08](https://twitter.com/substack/status/1236445105197727744)_
At higher numbers converting from an integer to a float we also risk overflows. This means that Rust currently only provides log operations for a limited set of integers.
The process of doing log operations by converting between floats and integers is also prone to rounding errors. In the following example we're trying to calculate `base10` for an integer. We might try and calculate the `base2` for the values, and attempt [a base swap](https://www.rapidtables.com/math/algebra/Logarithm.html#log-rules) to arrive at `base10`. However because we're performing intermediate rounding we arrive at the wrong result:
```rust
// log10(900) = ~2.95 = 2
dbg!(900f32.log10() as u64);
// log base change rule: logb(x) = logc(x) / logc(b)
// log2(900) / log2(10) = 9/3 = 3
dbg!((900f32.log2() as u64) / (10f32.log2() as u64));
```
_[playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6bd6c68b3539e400f9ca4fdc6fc2eed0)_
This is somewhat nuanced as a lot of the time it'll work well, but in real world code this could lead to some hard to track bugs. By providing correct log implementations directly on integers we can help prevent errors around this.
## Implementation notes
I checked whether LLVM intrinsics existed before implementing this, and none exist yet. ~~Also I couldn't really find a better way to write the `ilog` function. One option would be to make it a private method on the number, but I didn't see any precedent for that. I also didn't know where to best place the tests, so I added them to the bottom of the file. Even though they might seem like quite a lot they take no time to execute.~~
## References
- [Log rules](https://www.rapidtables.com/math/algebra/Logarithm.html#log-rules)
- [Rounding error playground](https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=6bd6c68b3539e400f9ca4fdc6fc2eed0)
- [substack's tweet asking about integer log2 in the stdlib](https://twitter.com/substack/status/1236445105197727744)
- [Integer Logarithm, A. Jaffer 2008](https://people.csail.mit.edu/jaffer/III/ilog.pdf)
core: add unstable no_fp_fmt_parse to disable float formatting code
In some projects (e.g. kernel), floating point is forbidden. They can disable
hardware floating point support and use `+soft-float` to avoid fp instructions
from being generated, but as libcore contains the formatting code for `f32`
and `f64`, some fp intrinsics are depended. One could define stubs for these
intrinsics that just panic [1], but it means that if any formatting functions
are accidentally used, mistake can only be caught during the runtime rather
than during compile-time or link-time, and they consume a lot of space without
LTO.
This patch provides an unstable cfg `no_fp_fmt_parse` to disable these.
A panicking stub is still provided for the `Debug` implementation (unfortunately)
because there are some SIMD types that use `#[derive(Debug)]`.
[1]: https://lkml.org/lkml/2021/4/14/1028
* {:.PREC?} already had legitimately useful behavior (recursive formatting of structs using
fixed precision for floats) and I suspect that changes to the output there would be unwelcome.
(besides, precision introduces sinister edge cases where a number can be rounded up to one
of the thresholds)
Thus, the new behavior of Debug is, "dynamically switch to exponential, but only if there's
no precision."
* This could not be implemented in terms of float_to_decimal_common without repeating the branch
on precision, so 'float_to_general_debug' is a new function. The name is '_debug' instead of
'_common' because the considerations in the previous bullet make this logic pretty specific
to Debug.
* 'float_to_decimal_common' is now only used by Display, so I inlined the min_precision argument
and renamed the function accordingly.
This was unsound since a panic in a.next_back() would result in the
length not being updated which would then lead to the same element
being revisited in the side-effect preserving code.
to_digit simplification (less jumps)
I just realised we might be able to make use of the fact that changing case in ascii is easy to help simplify to_digit some more.
It looks a bit cleaner and it looks like it's less jumps and there's less instructions in the generated assembly:
https://godbolt.org/z/84Erh5dhz
The benchmarks don't really tell me much. Maybe a slight improvement on the var radix.
Before:
```
test char::methods::bench_to_digit_radix_10 ... bench: 53,819 ns/iter (+/- 8,314)
test char::methods::bench_to_digit_radix_16 ... bench: 57,265 ns/iter (+/- 10,730)
test char::methods::bench_to_digit_radix_2 ... bench: 55,077 ns/iter (+/- 5,431)
test char::methods::bench_to_digit_radix_36 ... bench: 56,549 ns/iter (+/- 3,248)
test char::methods::bench_to_digit_radix_var ... bench: 43,848 ns/iter (+/- 3,189)
test char::methods::bench_to_digit_radix_10 ... bench: 51,707 ns/iter (+/- 10,946)
test char::methods::bench_to_digit_radix_16 ... bench: 52,835 ns/iter (+/- 2,689)
test char::methods::bench_to_digit_radix_2 ... bench: 51,012 ns/iter (+/- 2,746)
test char::methods::bench_to_digit_radix_36 ... bench: 53,210 ns/iter (+/- 8,645)
test char::methods::bench_to_digit_radix_var ... bench: 40,386 ns/iter (+/- 4,711)
test char::methods::bench_to_digit_radix_10 ... bench: 54,088 ns/iter (+/- 5,677)
test char::methods::bench_to_digit_radix_16 ... bench: 55,972 ns/iter (+/- 17,229)
test char::methods::bench_to_digit_radix_2 ... bench: 52,083 ns/iter (+/- 2,425)
test char::methods::bench_to_digit_radix_36 ... bench: 54,132 ns/iter (+/- 1,548)
test char::methods::bench_to_digit_radix_var ... bench: 41,250 ns/iter (+/- 5,299)
```
After:
```
test char::methods::bench_to_digit_radix_10 ... bench: 48,907 ns/iter (+/- 19,449)
test char::methods::bench_to_digit_radix_16 ... bench: 52,673 ns/iter (+/- 8,122)
test char::methods::bench_to_digit_radix_2 ... bench: 48,509 ns/iter (+/- 2,885)
test char::methods::bench_to_digit_radix_36 ... bench: 50,526 ns/iter (+/- 4,610)
test char::methods::bench_to_digit_radix_var ... bench: 38,618 ns/iter (+/- 3,180)
test char::methods::bench_to_digit_radix_10 ... bench: 54,202 ns/iter (+/- 6,994)
test char::methods::bench_to_digit_radix_16 ... bench: 56,585 ns/iter (+/- 8,448)
test char::methods::bench_to_digit_radix_2 ... bench: 50,548 ns/iter (+/- 1,674)
test char::methods::bench_to_digit_radix_36 ... bench: 52,749 ns/iter (+/- 2,576)
test char::methods::bench_to_digit_radix_var ... bench: 40,215 ns/iter (+/- 3,327)
test char::methods::bench_to_digit_radix_10 ... bench: 50,233 ns/iter (+/- 22,272)
test char::methods::bench_to_digit_radix_16 ... bench: 50,841 ns/iter (+/- 19,981)
test char::methods::bench_to_digit_radix_2 ... bench: 50,386 ns/iter (+/- 4,555)
test char::methods::bench_to_digit_radix_36 ... bench: 52,369 ns/iter (+/- 2,737)
test char::methods::bench_to_digit_radix_var ... bench: 40,417 ns/iter (+/- 2,766)
```
I removed the likely as it resulted in a few less instructions. (It's not been in there long - I added it in the last to_digit iteration).
Make copy/copy_nonoverlapping fn's again
Make copy/copy_nonoverlapping fn's again, rather than intrinsics.
This a short-term change to address issue #84297.
It effectively reverts PRs #81167#81238 (and part of #82967), #83091, and parts of #79684.
Update standard library for IntoIterator implementation of arrays
This PR partially resolves issue #84513 of updating the standard library part.
I haven't found any remaining doctest examples which are using iterators over e.g. &i32 instead of just i32 in the standard library. Can anyone point me to them if there's remaining any?
Thanks!
r? ```@m-ou-se```
While stdlib implementations of the unchecked methods require unchecked
math, there is no reason to gate it behind this for external users. The
reasoning for a separate `step_trait_ext` feature is unclear, and as
such has been merged as well.
Implement indexing slices with pairs of core::ops::Bound<usize>
Closes#49976.
I am not sure about code duplication between `check_range` and `into_maybe_range`. Should be former implemented in terms of the latter? Also this PR doesn't address code duplication between `impl SliceIndex for Range*`.
Format `Struct { .. }` on one line even with `{:#?}`.
The result of `debug_struct("A").finish_non_exhaustive()` before this change:
```
A {
..
}
```
And after this change:
```
A { .. }
```
If there's any fields, the result stays unchanged:
```
A {
field: value,
..
}
Stabilize `peekable_peek_mut`
Resolves#78302. Also adds some documentation on `std::iter::Iterator::peekable()` regarding the new method.
The feature was added in #77491 in Nov' 20, which is recently, but the feature seems reasonably small. Never did a stabilization-pr, excuse my ignorance if there is a protocol I'm not aware of.
Stabilize cmp_min_max_by
I would like to propose cmp::{min_by, min_by_key, max_by, max_by_key}
for stabilization.
These are relatively simple and seemingly uncontroversial functions and
have been unchanged in unstable for a while now.
Closes: #64460
I would like to propose cmp::{min_by, min_by_key, max_by, max_by_key}
for stabilization.
These are relatively simple and seemingly uncontroversial functions and
have been unchanged in unstable for a while now.
Add IEEE 754 compliant fmt/parse of -0, infinity, NaN
This pull request improves the Rust float formatting/parsing libraries to comply with IEEE 754's formatting expectations around certain special values, namely signed zero, the infinities, and NaN. It also adds IEEE 754 compliance tests that, while less stringent in certain places than many of the existing flt2dec/dec2flt capability tests, are intended to serve as the beginning of a roadmap to future compliance with the standard. Some relevant documentation is also adjusted with clarifying remarks.
This PR follows from discussion in https://github.com/rust-lang/rfcs/issues/1074, and closes#24623.
The most controversial change here is likely to be that -0 is now printed as -0. Allow me to explain: While there appears to be community support for an opt-in toggle of printing floats as if they exist in the naively expected domain of numbers, i.e. not the extended reals (where floats live), IEEE 754-2019 is clear that a float converted to a string should be capable of being transformed into the original floating point bit-pattern when it satisfies certain conditions (namely, when it is an actual numeric value i.e. not a NaN and the original and destination float width are the same). -0 is given special attention here as a value that should have its sign preserved. In addition, the vast majority of other programming languages not only output `-0` but output `-0.0` here.
While IEEE 754 offers a broad leeway in how to handle producing what it calls a "decimal character sequence", it is clear that the operations a language provides should be capable of round tripping, and it is confusing to advertise the f32 and f64 types as binary32 and binary64 yet have the most basic way of producing a string and then reading it back into a floating point number be non-conformant with the standard. Further, existing documentation suggested that e.g. -0 would be printed with -0 regardless of the presence of the `+` fmt character, but it prints "+0" instead if given such (which was what led to the opening of #24623).
There are other parsing and formatting issues for floating point numbers which prevent Rust from complying with the standard, as well as other well-documented challenges on the arithmetic level, but I hope that this can be the beginning of motion towards solving those challenges.
Add Result::into_err where the Ok variant is the never type
Equivalent of #66045 but for the inverse situation where `T: Into<!>` rather than `E: Into<!>`.
I'm using the same feature gate name. I can't see why one of these methods would be OK to stabilize but not the other.
Tracking issue: #61695
Remove Option::{unwrap_none, expect_none}.
This removes `Option::unwrap_none` and `Option::expect_none` since we're not going to stabilize them, see https://github.com/rust-lang/rust/issues/62633.
Closes#62633
This commit removes the previous mechanism of differentiating
between "Debug" and "Display" formattings for the sign of -0 so as
to comply with the IEEE 754 standard's requirements on external
character sequences preserving various attributes of a floating
point representation.
In addition, numerous tests are fixed.
Stabilize `unsafe_op_in_unsafe_fn` lint
This makes it possible to override the level of the `unsafe_op_in_unsafe_fn`, as proposed in https://github.com/rust-lang/rust/issues/71668#issuecomment-729770896.
Tracking issue: #71668
r? ```@nikomatsakis``` cc ```@SimonSapin``` ```@RalfJung```
# Stabilization report
This is a stabilization report for `#![feature(unsafe_block_in_unsafe_fn)]`.
## Summary
Currently, the body of unsafe functions is an unsafe block, i.e. you can perform unsafe operations inside.
The `unsafe_op_in_unsafe_fn` lint, stabilized here, can be used to change this behavior, so performing unsafe operations in unsafe functions requires an unsafe block.
For now, the lint is allow-by-default, which means that this PR does not change anything without overriding the lint level.
For more information, see [RFC 2585](https://github.com/rust-lang/rfcs/blob/master/text/2585-unsafe-block-in-unsafe-fn.md)
### Example
```rust
// An `unsafe fn` for demonstration purposes.
// Calling this is an unsafe operation.
unsafe fn unsf() {}
// #[allow(unsafe_op_in_unsafe_fn)] by default,
// the behavior of `unsafe fn` is unchanged
unsafe fn allowed() {
// Here, no `unsafe` block is needed to
// perform unsafe operations...
unsf();
// ...and any `unsafe` block is considered
// unused and is warned on by the compiler.
unsafe {
unsf();
}
}
#[warn(unsafe_op_in_unsafe_fn)]
unsafe fn warned() {
// Removing this `unsafe` block will
// cause the compiler to emit a warning.
// (Also, no "unused unsafe" warning will be emitted here.)
unsafe {
unsf();
}
}
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn denied() {
// Removing this `unsafe` block will
// cause a compilation error.
// (Also, no "unused unsafe" warning will be emitted here.)
unsafe {
unsf();
}
}
```
Prevent specialized ZipImpl from calling `__iterator_get_unchecked` twice with the same index
Fixes#82291
It's open for review, but conflicts with #82289, wait before merging. The conflict involves only the new test, so it should be rather trivial to fix.
Improve slice.binary_search_by()'s best-case performance to O(1)
This PR aimed to improve the [slice.binary_search_by()](https://doc.rust-lang.org/std/primitive.slice.html#method.binary_search_by)'s best-case performance to O(1).
# Noticed
I don't know why the docs of `binary_search_by` said `"If there are multiple matches, then any one of the matches could be returned."`, but the implementation isn't the same thing. Actually, it returns the **last one** if multiple matches found.
Then we got two options:
## If returns the last one is the correct or desired result
Then I can rectify the docs and revert my changes.
## If the docs are correct or desired result
Then my changes can be merged after fully reviewed.
However, if my PR gets merged, another issue raised: this could be a **breaking change** since if multiple matches found, the returning order no longer the last one instead of it could be any one.
For example:
```rust
let mut s = vec![0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
let num = 1;
let idx = s.binary_search(&num);
s.insert(idx, 2);
// Old implementations
assert_eq!(s, [0, 1, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
// New implementations
assert_eq!(s, [0, 1, 1, 1, 2, 1, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
```
# Benchmarking
**Old implementations**
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 4)
test slice::binary_search_l1_with_dups ... bench: 59 ns/iter (+/- 3)
test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 5)
test slice::binary_search_l2_with_dups ... bench: 77 ns/iter (+/- 17)
test slice::binary_search_l3 ... bench: 183 ns/iter (+/- 23)
test slice::binary_search_l3_with_dups ... bench: 185 ns/iter (+/- 19)
```
**New implementations (1)**
Implemented by this PR.
```rust
if cmp == Equal {
return Ok(mid);
} else if cmp == Less {
base = mid
}
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 58 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 4)
test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 6)
test slice::binary_search_l3 ... bench: 200 ns/iter (+/- 30)
test slice::binary_search_l3_with_dups ... bench: 157 ns/iter (+/- 6)
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 8)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 2)
test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 2)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 2)
test slice::binary_search_l3 ... bench: 198 ns/iter (+/- 21)
test slice::binary_search_l3_with_dups ... bench: 158 ns/iter (+/- 11)
```
**New implementations (2)**
Suggested by `@nbdd0121` in [comment](https://github.com/rust-lang/rust/pull/74024#issuecomment-665430239).
```rust
base = if cmp == Greater { base } else { mid };
if cmp == Equal { break }
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 7)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 5)
test slice::binary_search_l2 ... bench: 75 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench: 56 ns/iter (+/- 3)
test slice::binary_search_l3 ... bench: 195 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 7)
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 57 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench: 38 ns/iter (+/- 2)
test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 11)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 4)
test slice::binary_search_l3 ... bench: 194 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 18)
```
I run some benchmarking testings against on two implementations. The new implementation has a lot of improvement in duplicates cases, while in `binary_search_l3` case, it's a little bit slower than the old one.
Implement NOOP_METHOD_CALL lint
Implements the beginnings of https://github.com/rust-lang/lang-team/issues/67 - a lint for detecting noop method calls (e.g, calling `<&T as Clone>::clone()` when `T: !Clone`).
This PR does not fully realize the vision and has a few limitations that need to be addressed either before merging or in subsequent PRs:
* [ ] No UFCS support
* [ ] The warning message is pretty plain
* [ ] Doesn't work for `ToOwned`
The implementation uses [`Instance::resolve`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_middle/ty/instance/struct.Instance.html#method.resolve) which is normally later in the compiler. It seems that there are some invariants that this function relies on that we try our best to respect. For instance, it expects substitutions to have happened, which haven't yet performed, but we check first for `needs_subst` to ensure we're dealing with a monomorphic type.
Thank you to ```@davidtwco,``` ```@Aaron1011,``` and ```@wesleywiser``` for helping me at various points through out this PR ❤️.
Make ptr::write const
~~The code in this PR as of right now is not much more than an experiment.~~
~~This should, if I am not mistaken, in theory compile and pass the tests once the bootstraping compiler is updated. Thus the PR is blocked on that which should happen some time after the February the 9th. Also we might want to wait for #79989 to avoid regressing performance due to using `mem::forget` over `intrinsics::forget`~~.
Add tests for Atomic*::fetch_{min,max}
This ensures that all atomic operations except for fences are tested. This has been useful to test my work on using atomic instructions for atomic operations in cg_clif instead of a global lock.
Implement RFC 2580: Pointer metadata & VTable
RFC: https://github.com/rust-lang/rfcs/pull/2580
~~Before merging this PR:~~
* [x] Wait for the end of the RFC’s [FCP to merge](https://github.com/rust-lang/rfcs/pull/2580#issuecomment-759145278).
* [x] Open a tracking issue: https://github.com/rust-lang/rust/issues/81513
* [x] Update `#[unstable]` attributes in the PR with the tracking issue number
----
This PR extends the language with a new lang item for the `Pointee` trait which is special-cased in trait resolution to implement it for all types. Even in generic contexts, parameters can be assumed to implement it without a corresponding bound.
For this I mostly imitated what the compiler was already doing for the `DiscriminantKind` trait. I’m very unfamiliar with compiler internals, so careful review is appreciated.
This PR also extends the standard library with new unstable APIs in `core::ptr` and `std::ptr`:
```rust
pub trait Pointee {
/// One of `()`, `usize`, or `DynMetadata<dyn SomeTrait>`
type Metadata: Copy + Send + Sync + Ord + Hash + Unpin;
}
pub trait Thin = Pointee<Metadata = ()>;
pub const fn metadata<T: ?Sized>(ptr: *const T) -> <T as Pointee>::Metadata {}
pub const fn from_raw_parts<T: ?Sized>(*const (), <T as Pointee>::Metadata) -> *const T {}
pub const fn from_raw_parts_mut<T: ?Sized>(*mut (),<T as Pointee>::Metadata) -> *mut T {}
impl<T: ?Sized> NonNull<T> {
pub const fn from_raw_parts(NonNull<()>, <T as Pointee>::Metadata) -> NonNull<T> {}
/// Convenience for `(ptr.cast(), metadata(ptr))`
pub const fn to_raw_parts(self) -> (NonNull<()>, <T as Pointee>::Metadata) {}
}
impl<T: ?Sized> *const T {
pub const fn to_raw_parts(self) -> (*const (), <T as Pointee>::Metadata) {}
}
impl<T: ?Sized> *mut T {
pub const fn to_raw_parts(self) -> (*mut (), <T as Pointee>::Metadata) {}
}
/// `<dyn SomeTrait as Pointee>::Metadata == DynMetadata<dyn SomeTrait>`
pub struct DynMetadata<Dyn: ?Sized> {
// Private pointer to vtable
}
impl<Dyn: ?Sized> DynMetadata<Dyn> {
pub fn size_of(self) -> usize {}
pub fn align_of(self) -> usize {}
pub fn layout(self) -> crate::alloc::Layout {}
}
unsafe impl<Dyn: ?Sized> Send for DynMetadata<Dyn> {}
unsafe impl<Dyn: ?Sized> Sync for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Debug for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Unpin for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Copy for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Clone for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Eq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialEq for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Ord for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> PartialOrd for DynMetadata<Dyn> {}
impl<Dyn: ?Sized> Hash for DynMetadata<Dyn> {}
```
API differences from the RFC, in areas noted as unresolved questions in the RFC:
* Module-level functions instead of associated `from_raw_parts` functions on `*const T` and `*mut T`, following the precedent of `null`, `slice_from_raw_parts`, etc.
* Added `to_raw_parts`
The use of module-level functions instead of associated functions
on `<*const T>` or `<*mut T>` follows the precedent of
`ptr::slice_from_raw_parts` and `ptr::slice_from_raw_parts_mut`.