Fix a minor mistake in `String::try_reserve_exact` examples
The examples of `String::try_reserve_exact` didn't actually use `try_reserve_exact`, which was probably a minor mistake, and this PR fixed it.
Drop guards in slice sorting derive src pointers from &mut T, which is invalidated by interior mutation in comparison
I tried to run https://github.com/rust-lang/miri-test-libstd on `alloc` with `-Zmiri-track-raw-pointers`, and got a failure on the test `slice::panic_safe`. The test failure has nothing to do with panic safety, it's from how the test tests for panic safety.
I minimized the test failure into this very silly program:
```rust
use std::cell::Cell;
use std::cmp::Ordering;
#[derive(Clone)]
struct Evil(Cell<usize>);
fn main() {
let mut input = vec![Evil(Cell::new(0)); 3];
// Hits the bug pattern via CopyOnDrop in core
input.sort_unstable_by(|a, _b| {
a.0.set(0);
Ordering::Less
});
// Hits the bug pattern via InsertionHole in alloc
input.sort_by(|_a, b| {
b.0.set(0);
Ordering::Less
});
}
```
To fix this, I'm just removing the mutability/uniqueness where it wasn't required.
Implement split_at_spare_mut without Deref to a slice so that the spare slice is valid
~I'm not sure I understand what's going on here correctly. And I'm pretty sure this safety comment needs to be changed. I'm just referring to the same thing that `as_mut_ptr_range` does.~ (Thanks `@RalfJung` for the guidance and clearing things up)
I tried to run https://github.com/rust-lang/miri-test-libstd on alloc with -Zmiri-track-raw-pointers, and got a failure on the test `vec::test_extend_from_within`.
I minimized the test failure into this program:
```rust
#![feature(vec_split_at_spare)]
fn main() {
Vec::<i32>::with_capacity(1).split_at_spare_mut();
}
```
The problem is that the existing implementation is actually getting a pointer range where both pointers are derived from the initialized region of the Vec's allocation, but we need the second one to be valid for the region between len and capacity. (thanks Ralf for clearing this up)
RawVec: don't recompute capacity after allocating.
Currently it sets the capacity to `ptr.len() / mem::size_of::<T>()`
after any buffer allocation/reallocation. This would be useful if
allocators ever returned a `NonNull<[u8]>` with a size larger than
requested. But this never happens, so it's not useful.
Removing this slightly reduces the size of generated LLVM IR, and
slightly speeds up the hot path of `RawVec` growth.
r? `@ghost`
Allow reverse iteration of lowercase'd/uppercase'd chars
The PR implements `DoubleEndedIterator` trait for `ToLowercase` and `ToUppercase`.
This enables reverse iteration of lowercase/uppercase variants of character sequences.
One of use cases: determining whether a char sequence is a suffix of another one.
Example:
```rust
fn endswith_ignore_case(s1: &str, s2: &str) -> bool {
for eob in s1
.chars()
.flat_map(|c| c.to_lowercase())
.rev()
.zip_longest(s2.chars().flat_map(|c| c.to_lowercase()).rev())
{
match eob {
EitherOrBoth::Both(c1, c2) => {
if c1 != c2 {
return false;
}
}
EitherOrBoth::Left(_) => return true,
EitherOrBoth::Right(_) => return false,
}
}
true
}
```
Currently it sets the capacity to `ptr.len() / mem::size_of::<T>()`
after any buffer allocation/reallocation. This would be useful if
allocators ever returned a `NonNull<[u8]>` with a size larger than
requested. But this never happens, so it's not useful.
Removing this slightly reduces the size of generated LLVM IR, and
slightly speeds up the hot path of `RawVec` growth.
The previous implementation used slice::as_mut_ptr_range to derive the
pointer for the spare capacity slice. This is invalid, because that
pointer is derived from the initialized region, so it does not have
provenance over the uninitialized region.
Update example code for Vec::splice to change the length
The current example for `Vec::splice` illustrates the replacement of a section of length 2 with a new section of length 2. This isn't a particularly interesting case for splice, and makes it look a bit like a shorthand for the kind of manipulations that could be done with a mutable slice.
In order to provide a stronger example, this updates the example to use different lengths for the source and destination regions, and uses a slice from the middle of the vector to illustrate that this does not necessarily have to be at the beginning or the end.
Resolves#92067
The src pointers in CopyOnDrop and InsertionHole used to be *mut T, and
were derived via automatic conversion from &mut T. According to Stacked
Borrows 2.1, this means that those pointers become invalidated by
interior mutation in the comparison function.
But there's no need for mutability in this code path. Thus, we can
change the drop guards to use *const and derive those from &T.
Make split_inclusive() on an empty slice yield an empty output
`[].split_inclusive()` currently yields a single, empty slice. That's
different from `"".split_inslusive()`, which yields no output at
all. I think that makes the slice version harder to use.
The case where I ran into this bug was when writing code for
generating a diff between two slices of bytes. I wanted to prefix
removed lines with "-" and a added lines with "+". Due to
`split_inclusive()`'s current behavior, that means that my code prints
just a "-" or "+" for empty files. I suspect most existing callers
have similar "bugs" (which would be fixed by this patch).
Closes#89716.
add BinaryHeap::try_reserve and BinaryHeap::try_reserve_exact
`try_reserve` of many collections were stablized in https://github.com/rust-lang/rust/pull/87993 in 1.57.0. Add `try_reserve` for the rest collections such as `BinaryHeap` should be not controversial.
Use spare_capacity_mut instead of invalid unchecked indexing when joining str
This is a fix for https://github.com/rust-lang/rust/issues/91574
I think in general I'd prefer to see this code implemented with raw pointers or `MaybeUninit::write_slice`, but there's existing code in here based on copying from slice to slice, so converting everything from `&[T]` to `&[MaybeUninit<T>]` is less disruptive.
BTree: improve public descriptions and comments
BTreeSet has always used the term "value" next to and meaning the same thing as "elements" (in the mathematical sense but also used for key-value pairs in BTreeMap), while in the BTreeMap sense these "values" are known as "keys" and definitely not "values". Today I had enough of that.
r? `@Mark-Simulacrum`
Btree: assert more API compatibility
Introducing a member such as `BTreeSet::min()` would silently break compatibility if no code calls the existing `BTreeSet::min(set)`. `BTreeSet` is the only btree class silently bringing in stable members, apart from many occurrences of `#[derive(Debug)]` on iterators.
r? `@Mark-Simulacrum`
Rollup of 11 pull requests
Successful merges:
- #91668 (Remove the match on `ErrorKind::Other`)
- #91678 (Add tests fixed by #90023)
- #91679 (Move core/stream/stream/mod.rs to core/stream/stream.rs)
- #91681 (fix typo in `intrinsics::raw_eq` docs)
- #91686 (Fix `Vec::reserve_exact` documentation)
- #91697 (Delete Utf8Lossy::from_str)
- #91706 (Add unstable book entries for parts of asm that are not being stabilized)
- #91709 (Replace iterator-based set construction by *Set::From<[T; N]>)
- #91716 (Improve x.py logging and defaults a bit more)
- #91747 (Add pierwill to .mailmap)
- #91755 (Fix since attribute for const_linked_list_new feature)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Replace iterator-based set construction by *Set::From<[T; N]>
This uses the array-based construction for `BtreeSet`s and `HashSet`s instead of first creating an iterator. I could also replace the `let mut a = Set::new(); a.insert(...);` fragments if desired.
Fix `Vec::reserve_exact` documentation
The documentation previously said the new capacity cannot overflow `usize`, but in fact it cannot exceed `isize::MAX`.
Update documentation to use `from()` to initialize `HashMap`s and `BTreeMap`s
As of Rust 1.56, `HashMap` and `BTreeMap` both have associated `from()` functions. I think using these in the documentation cleans things up a bit. It allows us to remove some of the `mut`s and avoids the Initialize-Then-Modify anti-pattern.
replace vec::Drain drop loops with drop_in_place
The `Drain::drop` implementation came up in https://github.com/rust-lang/rust/pull/82185#issuecomment-789584796 as potentially interfering with other optimization work due its widespread use somewhere in `println!`
`@rustbot` label T-libs-impl
Suggest try_reserve in try_reserve_exact
During developing #91529 , I found that `try_reserve_exact` suggests `reserve` for further insertions. I think it's a mistake by copy&paste, `try_reserve` is better here.
Implement VecDeque::retain_mut
Part of https://github.com/rust-lang/rust/issues/90829.
In https://github.com/rust-lang/rust/pull/90772, someone suggested that `retain_mut` should also be implemented on `VecDeque`. I think that it follows the same logic (coherency). So first: is it ok? Second: should I create a new feature for it or can we put it into the same one?
r? `@joshtriplett`
For users looking at documentation through IDE popups, this gives them
relevant information rather than the generic trait documentation wording
“Performs the conversion”. For users reading the documentation for a
specific type for any reason, this informs them when the conversion may
allocate or copy significant memory versus when it is always a move or
cheap copy.
Notes on specific cases:
* The new documentation for `From<T> for T` explains that it is not a
conversion at all.
* Also documented `impl<T, U> Into<U> for T where U: From<T>`, the other
central blanket implementation of conversion.
* I did not add documentation to conversions of a specific error type to
a more general error type.
* I did not add documentation to unstable code.
This change was prepared by searching for the text "From<... for" and so
may have missed some cases that for whatever reason did not match. I
also looked for `Into` impls but did not find any worth documenting by
the above criteria.
Add `into_iter().filter().collect()` as a comparison point since it was reported to be faster than `retain`.
Remove clone inside benchmark loop to reduce allocator noise.
Rollup of 10 pull requests
Successful merges:
- #88906 (Implement write() method for Box<MaybeUninit<T>>)
- #90269 (Make `Option::expect` unstably const)
- #90854 (Type can be unsized and uninhabited)
- #91170 (rustdoc: preload fonts)
- #91273 (Fix ICE #91268 by checking that the snippet ends with a `)`)
- #91381 (Android: -ldl must appear after -lgcc when linking)
- #91453 (Document Windows TLS drop behaviour)
- #91462 (Use try_normalize_erasing_regions in needs_drop)
- #91474 (suppress warning about set_errno being unused on DragonFly)
- #91483 (Sync rustfmt subtree)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Implement write() method for Box<MaybeUninit<T>>
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.
Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.
This was discussed in #63291 which this change extends.
Remove unnecessary check in VecDeque::grow
All callers already check that the buffer is full before calling
`grow()`. This is where it makes the most sense, since `grow()` is
`inline(never)` and we don't want to pay for a function call just for
that check.
It could also be argued that it would be correct to call `grow()` even
if the buffer wasn't full yet.
This change breaks no code since `grow()` is not `pub`.
This adds method similar to `MaybeUninit::write` main difference being
it returns owned `Box`. This can be used to elide copy from stack
safely, however it's not currently tested that the optimization actually
occurs.
Analogous methods are not provided for `Rc` and `Arc` as those need to
handle the possibility of sharing. Some version of them may be added in
the future.
This was discussed in #63291 which this change extends.
Introduce `RawVec::reserve_for_push`.
If `Vec::push`'s capacity check fails it calls `RawVec::reserve`, which
then also does a capacity check.
This commit introduces `reserve_for_push` which skips the redundant
capacity check, for some slight compile time speed-ups.
I tried lots of minor variations on this, e.g. different inlining
attributes. This was the best one I could find.
r? `@ghost`
All callers already check that the buffer is full before calling
`grow()`. This is where it makes the most sense, since `grow()` is
`inline(never)` and we don't want to pay for a function call just for
that check.
It could also be argued that it would be correct to call `grow()` even
if the buffer wasn't full yet.
This change breaks no code since `grow()` is not `pub`.
If `Vec::push`'s capacity check fails it calls `RawVec::reserve`, which
then also does a capacity check.
This commit introduces `reserve_for_push` which skips the redundant
capacity check, for some slight compile time speed-ups.
I tried lots of minor variations on this, e.g. different inlining
attributes. This was the best one I could find.
Eliminate an unreachable codepath from String::from_utf8_lossy
`Utf8Lossy`'s `Iterator` implementation ensures that only the **final** chunk has an empty slice for `broken`:
dd549dcab4/library/core/src/str/lossy.rs (L46-L47)
Thus the only way the **first** chunk could have an empty `broken` is if it is the **final** chunk, i.e. there is only one chunk total. And the only way that there could be one chunk total with an empty `broken` is if the whole input is valid utf8 and non-empty.
That condition has already been handled by an early return, so at the point that the first `REPLACEMENT` is being pushed, it's impossible for `first_broken` to be empty.
Fix Iterator::advance_by contract inconsistency
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
Utf8Lossy's Iterator implementation ensures that only the final chunk
has an empty slice for broken. Thus the only way the first chunk could
have an empty broken is if it is the final chunk, i.e. there is only one
chunk total. And the only way that there could be one chunk total is if
the whole input is valid utf8 and non-empty. That condition has already
been handled by an early return, so at the point that the first
REPLACEMENT is being pushed, it's impossible for first_broken to be
empty.
Mark `Arc::from_inner` / `Rc::from_inner` as unsafe
While it's an internal function, it is easy to create invalid Arc/Rcs to
a dangling pointer with it.
Fixes https://github.com/rust-lang/rust/issues/89740
The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n.
It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator.
These statements are inconsistent.
Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too.
While adding some tests I also found a bug in `Take::advance_back_by`.
This commit makes the following functions from `core::str` `const fn`:
- `from_utf8[_mut]` (`feature(const_str_from_utf8)`)
- `from_utf8_unchecked_mut` (`feature(const_str_from_utf8_unchecked_mut)`)
- `Utf8Error::{valid_up_to,error_len}` (`feature(const_str_from_utf8)`)
Add Vec::retain_mut
This is to continue the discussion started in #83218.
Original comment was:
> Take 2 of #34265, since I needed this today.
The reason I think why we should add `retain_mut` is for coherency and for discoverability. For example we have `chunks` and `chunks_mut` or `get` and `get_mut` or `iter` and `iter_mut`, etc. When looking for mutable `retain`, I would expect `retain_mut` to exist. It took me a while to find out about `drain_filter`. So even if it provides an API close to `drain_filter`, just for the discoverability, I think it's worth it.
cc ``````@m-ou-se`````` ``````@jonas-schievink`````` ``````@Mark-Simulacrum``````
Optimize BinaryHeap::extend from Vec
This improves the performance of extending `BinaryHeap`s from vectors directly. Future work may involve extending this optimization to other, similar, cases where the length of the added elements is well-known, but this is not yet done in this PR.
Make RawVec private to alloc
RawVec was previously exposed for compiler-internal use (libarena specifically) in 1acbb0a935
Since it is unstable, doc-hidden and has no associated tracking issue it was never meant for public use. And since
it is no longer used outside alloc itself it can be made private again.
Also remove some functions that are dead due to lack of internal users.
Better document `Box` and `alloc::alloc::box_free` connection
The internal `alloc::alloc::box_free` function requires that its signature matches the `owned_box` struct's declaration, but previously that connection was only documented on the `box_free` function.
This PR makes the documentation two-way to help anyone making theoretical changes to `Box` to see the connection, since changes are more likely to originate from `Box`.
RawVec was previously exposed for compiler-internal use (libarena specifically) in 1acbb0a935
Since it is unstable, doc-hidden and has no associated tracking issue it was never meant for public use. And since
it is no longer used outside alloc itself it can be made private again.
Also remove some functions that are dead due to lack of internal users.
Add #[must_use] to alloc functions that would leak memory
As [requested](https://github.com/rust-lang/rust/pull/89899#issuecomment-955600779) by `@joshtriplett.`
> Please do go ahead and add the ones whose only legitimate use for ignoring the return value is leaking memory. (In a separate PR please.) I think it's sufficiently error-prone to call something like alloc and ignore the result that it's legitimate to require `let _ =` for that.
I added `realloc` myself. Clippy ignored it because of its `mut` argument.
```rust
alloc/src/alloc.rs:123:1 alloc unsafe fn realloc(ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8;
```
Parent issue: #89692
r? `@joshtriplett`
Add #[must_use] to remaining core functions
I've run out of compelling reasons to group functions together across crates so I'm just going to go module-by-module. This is everything remaining from the `core` crate.
Ignored by clippy for reasons unknown:
```rust
core::alloc::Layout unsafe fn for_value_raw<T: ?Sized>(t: *const T) -> Self;
core::any const fn type_name_of_val<T: ?Sized>(_val: &T) -> &'static str;
```
Ignored by clippy because of `mut`:
```rust
str fn split_at_mut(&mut self, mid: usize) -> (&mut str, &mut str);
```
<del>
Ignored by clippy presumably because a caller might want `f` called for side effects. That seems like a bad usage of `map` to me.
```rust
core::cell::Ref<'b, T> fn map<U: ?Sized, F>(orig: Ref<'b, T>, f: F) -> Ref<'b, T>;
core::cell::Ref<'b, T> fn map_split<U: ?Sized, V: ?Sized, F>(orig: Ref<'b, T>, f: F) -> (Ref<'b, U>, Ref<'b, V>);
```
</del>
Parent issue: #89692
r? ```@joshtriplett```
Add #[must_use] to expensive computations
The unifying theme for this commit is weak, admittedly. I put together a list of "expensive" functions when I originally proposed this whole effort, but nobody's cared about that criterion. Still, it's a decent way to bite off a not-too-big chunk of work.
Given the grab bag nature of this commit, the messages I used vary quite a bit. I'm open to wording changes.
For some reason clippy flagged four `BTreeSet` methods but didn't say boo about equivalent ones on `HashSet`. I stared at them for a while but I can't figure out the difference so I added the `HashSet` ones in.
```rust
// Flagged by clippy.
alloc::collections::btree_set::BTreeSet<T> fn difference<'a>(&'a self, other: &'a BTreeSet<T>) -> Difference<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn symmetric_difference<'a>(&'a self, other: &'a BTreeSet<T>) -> SymmetricDifference<'a, T>
alloc::collections::btree_set::BTreeSet<T> fn intersection<'a>(&'a self, other: &'a BTreeSet<T>) -> Intersection<'a, T>;
alloc::collections::btree_set::BTreeSet<T> fn union<'a>(&'a self, other: &'a BTreeSet<T>) -> Union<'a, T>;
// Ignored by clippy, but not by me.
std::collections::HashSet<T, S> fn difference<'a>(&'a self, other: &'a HashSet<T, S>) -> Difference<'a, T, S>;
std::collections::HashSet<T, S> fn symmetric_difference<'a>(&'a self, other: &'a HashSet<T, S>) -> SymmetricDifference<'a, T, S>
std::collections::HashSet<T, S> fn intersection<'a>(&'a self, other: &'a HashSet<T, S>) -> Intersection<'a, T, S>;
std::collections::HashSet<T, S> fn union<'a>(&'a self, other: &'a HashSet<T, S>) -> Union<'a, T, S>;
```
Parent issue: #89692
r? ```@joshtriplett```
Previously, it wasn't clear whether "This could include" was referring
to logic errors, or undefined behaviour. Tweak wording to clarify this
sentence does not relate to UB.
Fix MIRI UB in `Vec::swap_remove`
Fixes#90055
I find it weird that `Vec::swap_remove` read the last element to the stack just to immediately put it back in the `Vec` in place of the one at index `index`. It seems much more natural to me to just read the element at position `index` and then move the last element in its place. I guess this might also slightly improve codegen.
Avoid overflow in `VecDeque::with_capacity_in()`.
The overflow only happens if alloc is compiled with overflow checks enabled and the passed capacity is greater or equal 2^(usize::BITS-1). The overflow shadows the expected "capacity overflow" panic leading to a test failure if overflow checks are enabled for std in the CI.
Unblocks [CI: Enable overflow checks for test (non-dist) builds #89776](https://github.com/rust-lang/rust/pull/89776).
For some reason the overflow is only observable with optimization turned off, but that is a separate issue.
Avoid allocations and copying in Vec::leak
The [`Vec::leak`] method (#62195) is currently implemented by calling `Vec::into_boxed_slice` and `Box::leak`. This shrinks the vector before leaking it, which potentially causes a reallocation and copies the vector's contents.
By avoiding the conversion to `Box`, we can instead leak the vector without any expensive operations, just by returning a slice reference and forgetting the `Vec`. Users who *want* to shrink the vector first can still do so by calling `shrink_to_fit` explicitly.
**Note:** This could break code that uses `Box::from_raw` to “un-leak” the slice returned by `Vec::leak`. However, the `Vec::leak` docs explicitly forbid this, so such code is already incorrect.
[`Vec::leak`]: https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.leak
Optimize VecDeque::append
Optimize `VecDeque::append` to do unsafe copy rather than iterating through each element.
On my `Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz`, the benchmark shows 37% improvements:
```
Master:
custom-bench vec_deque_append 583164 ns/iter
custom-bench vec_deque_append 550040 ns/iter
Patched:
custom-bench vec_deque_append 349204 ns/iter
custom-bench vec_deque_append 368164 ns/iter
```
Additional notes on the context: this is the third attempt to implement a non-trivial version of `VecDeque::append`, the last two are reverted due to unsoundness or regression, see:
- https://github.com/rust-lang/rust/pull/52553, reverted in https://github.com/rust-lang/rust/pull/53571
- https://github.com/rust-lang/rust/pull/53564, reverted in https://github.com/rust-lang/rust/pull/54851
Both cases are covered by existing tests.
Signed-off-by: tabokie <xy.tao@outlook.com>