Implement From for more types on Cow
This is basically https://github.com/rust-lang/rust/pull/48191, except that it should be implemented in a way that doesn't break third party crates.
This commit is spawned out of a performance regression investigation in #50496.
In tracking down this regression it turned out that the `expand_statements`
function in the compiler was taking quite a long time. Further investigation
showed two key properties:
* The function was "fast" on glibc 2.24 and slow on glibc 2.23
* The hottest function was memmove from glibc
Combined together it looked like glibc gained an optimization to the memmove
function in 2.24. Ideally we don't want to rely on this optimization, so I
wanted to dig further to see what was happening.
The hottest part of `expand_statements` was `Drop for Drain` in the call to
`splice` where we insert new statements into the original vector. This *should*
be a cheap operation because we're draining and replacing iterators of the exact
same length, but under the hood memmove was being called a lot, causing a
slowdown on glibc 2.23.
It turns out that at least one of the optimizations in glibc 2.24 was that
`memmove` where the src/dst are equal becomes much faster. [This program][prog]
executes in ~2.5s against glibc 2.23 and ~0.3s against glibc 2.24, exhibiting
how glibc 2.24 is optimizing `memmove` if the src/dst are equal.
And all that brings us to what this commit itself is doing. The change here is
purely to `Drop for Drain` to avoid the call to `ptr::copy` if the region being
copied doesn't actually need to be copied. For normal usage of just `Drain`
itself this check isn't really necessary, but because `Splice` internally
contains `Drain` this provides a nice speed boost on glibc 2.23. Overall this
should fix the regression seen in #50496 on glibc 2.23 and also fix the
regression on Windows where `memmove` looks to not have this optimization.
Note that the way `splice` was called in `expand_statements` would cause a
quadratic number of elements to be copied via `memmove` which is likely why the
tuple-stress benchmark showed such a severe regression.
Closes#50496
[prog]: https://gist.github.com/alexcrichton/c05bc51c6771bba5ae5b57561a6c1cd3
… previously in the unstable core::num::Float trait.
Per https://github.com/rust-lang/rust/issues/32110#issuecomment-379503183,
the `abs`, `signum`, and `powi` methods are *not* included for now
since they rely on LLVM intrinsics and we haven’t determined yet whether
those instrinsics lower to calls to libm functions on any platform.
Inline most of the code paths for conversions with boxed slices
This helps with the specific problem described in #49541, obviously without making any large change to how inlining works in the general case.
Everything involved in the conversions is made `#[inline]`, except for the `<Vec<T>>::into_boxed_slice` entry point which is made `#[inline(always)]` after checking that duplicating the function mentioned in the issue prevented its inlining if I only annotate it with
`#[inline]`.
For the record, that function was:
```rust
pub fn foo() -> Box<[u8]> {
vec![0].into_boxed_slice()
}
```
To help the inliner's job, we also hoist a `self.capacity() != self.len` check in `<Vec<T>>::shrink_to_fit` and mark it as `#[inline]` too.
Replace manual iterator exhaust with for_each(drop)
This originally added a dedicated method, `Iterator::exhaust`, and has since been replaced with `for_each(drop)`, which is more idiomatic.
<del>This is just shorthand for `for _ in &mut self {}` or `while let Some(_) = self.next() {}`. This states the intent a lot more clearly than the identical code: run the iterator to completion.
<del>At least personally, my eyes tend to gloss over `for _ in &mut self {}` without fully paying attention to what it does; having a `Drop` implementation akin to:
<del>`for _ in &mut self {}; unsafe { free(self.ptr); }`</del>
<del>Is not as clear as:
<del>`self.exhaust(); unsafe { free(self.ptr); }`
<del>Additionally, I've seen debate over whether `while let Some(_) = self.next() {}` or `for _ in &mut self {}` is more clear, whereas `self.exhaust()` is clearer than both.
Some modules were still using the deprecated `allocator` module, use the
`alloc` module instead.
Some modules were using `super` while it's not needed.
Some modules were more or less ordering them, and other not, so the
latter have been modified to match the others.
Deprecate offset_to; switch core&alloc to using offset_from instead
Bonus: might make code than uses `.len()` on slice iterators faster
cc https://github.com/rust-lang/rust/issues/41079
Add more vec![... ; n] optimizations
vec![0; n], via implementations of SpecFromElem, has an optimization that uses with_capacity_zeroed instead of with_capacity, which will use calloc instead of malloc, and avoid an extra memset.
This PR adds the same optimization for ptr::null, ptr::null_mut, and None, when their in-memory representation is zeroes.
Introduce Vec::resize_with method (see #41758)
In #41758, the libs team decided they preferred `Vec::resize_with` over `Vec::resize_default()`. Here is an implementation to get this moving forward.
I don't know what the removal process for `Vec::resize_default()` should be, so I've left it in place for now. Would be happy to follow up with its removal.
vec![0; n], via implementations of SpecFromElem, has an optimization
that uses with_capacity_zeroed instead of with_capacity, which will use
calloc instead of malloc, and avoid an extra memset.
This adds the same optimization for vec![ptr::null(); n] and
vec![ptr::null_mut(); n], assuming their bit value is 0 (which is true
on all currently supported platforms).
This does so by adding an intermediate trait IsZero, which looks very
much like nonzero::Zeroable, but that one is on the way out, and doesn't
apply to pointers anyways.
Adding such a trait allows to avoid repeating the logic using
with_capacity_zeroed or with_capacity, or making the macro more complex
to support generics.
This helps with the specific problem described in #49541, obviously without
making any large change to how inlining works in the general case.
Everything involved in the conversions is made `#[inline]`, except for the
`<Vec<T>>::into_boxed_slice` entry point which is made `#[inline(always)]`
after checking that duplicating the function mentioned in the issue prevented
its inlining if I only annotate it with `#[inline]`.
For the record, that function was:
```rust
pub fn foo() -> Box<[u8]> {
vec![0].into_boxed_slice()
}
```
To help the inliner's job, we also hoist a `self.capacity() != self.len` check
in `<Vec<T>>::shrink_to_fit` and mark it as `#[inline]` too.
Use f{32,64}::to_bits for is_zero test in vec::SpecFromElem
vec::SpecFromElem provides an optimization to use calloc to fill a Vec
when the element given to fill the Vec is represented by 0.
For floats, the test for that currently used is `x == 0. &&
x.is_sign_positive()`. When compiled in a standalone function, rustc
generates the following assembly:
```
xorps xmm1, xmm1
ucomisd xmm0, xmm1
setnp al
sete cl
and cl, al
movq rax, xmm0
test rax, rax
setns al
and al, cl
ret
```
A simpler test telling us whether the value is represented by 0, is
`x.to_bits() == 0`, which rustc compiles to:
```
movq rax, xmm0
test rax, rax
sete al
ret
```
Not that the test is hot in any way, but it also makes it clearer what
the intent in the rust code is.
vec::SpecFromElem provides an optimization to use calloc to fill a Vec
when the element given to fill the Vec is represented by 0.
For floats, the test for that currently used is `x == 0. &&
x.is_sign_positive()`. When compiled in a standalone function, rustc
generates the following assembly:
```
xorps xmm1, xmm1
ucomisd xmm0, xmm1
setnp al
sete cl
and cl, al
movq rax, xmm0
test rax, rax
setns al
and al, cl
ret
```
A simpler test telling us whether the value is represented by 0, is
`x.to_bits() == 0`, which rustc compiles to:
```
movq rax, xmm0
test rax, rax
sete al
ret
```
Not that the test is hot in any way, but it also makes it clearer what
the intent in the rust code is.
Stabilize FusedIterator
FusedIterator is a marker trait that promises that the implementing
iterator continues to return `None` from `.next()` once it has returned
`None` once (and/or `.next_back()`, if implemented).
The effects of FusedIterator are already widely available through
`.fuse()`, but with stable `FusedIterator`, stable Rust users can
implement this trait for their iterators when appropriate.
Closes#35602
FusedIterator is a marker trait that promises that the implementing
iterator continues to return `None` from `.next()` once it has returned
`None` once (and/or `.next_back()`, if implemented).
The effects of FusedIterator are already widely available through
`.fuse()`, but with stable `FusedIterator`, stable Rust users can
implement this trait for their iterators when appropriate.
The docs currently say, "If the closure returns false, it will try
again, and call the closure on the next element." But this happens
regardless of whether the closure returns true or false.
Make core::ops::Place an unsafe trait
Consumers of `Place` would reasonably expect that the `pointer` function returns a valid pointer to memory that can actually be written to.