Introduce the partition_dedup/by/by_key methods for slices
This PR propose to add three methods to the slice type, the `partition_dedup`, `partition_dedup_by` and `partition_dedup_by_key`. The two other methods are based on `slice::partition_dedup_by`.
These methods take a mutable slice, deduplicates it and moves all duplicates to the end of it, returning two mutable slices, the first containing the deduplicated elements and the second all the duplicates unordered.
```rust
let mut slice = [1, 2, 2, 3, 3, 2];
let (dedup, duplicates) = slice.partition_dedup();
assert_eq!(dedup, [1, 2, 3, 2]);
assert_eq!(duplicates, [3, 2]);
```
The benefits of adding these methods is that it is now possible to:
- deduplicate a slice without having to allocate and possibly clone elements on the heap, really useful for embedded stuff that can't allocate for example.
- not loose duplicate elements, because, when using `Vec::dedup`, duplicates elements are dropped. These methods add more flexibillity to the user.
Note that this is near a copy/paste of the `Vec::dedup_by` function, once this method is stable the goal is to replace the algorithm in `Vec` by the following.
```rust
pub fn Vec::dedup_by<F>(&mut self, same_bucket: F)
where F: FnMut(&mut T, &mut T) -> bool
{
let (dedup, _) = self.as_mut_slice().partition_dedup_by(same_bucket);
let len = dedup.len();
self.truncate(len);
}
```
This commit fixes a buffer overflow issue in the standard library
discovered by Scott McMurray where if a large number was passed to
`str::repeat` it may cause and out of bounds write to the buffer of a `Vec`.
This bug was accidentally introduced in #48657 when optimizing the
`str::repeat` function. The bug affects stable Rust releases 1.26.0 to
1.29.0. We plan on backporting this fix to create a 1.29.1 release, and
the 1.30.0 release onwards will include this fix.
The fix in this commit is to introduce a deterministic panic in the case of
capacity overflow. When repeating a slice where the resulting length is larger
than the address space, there’s no way it can succeed anyway!
The standard library and surrounding libraries were briefly checked to see if
there were othere instances of preallocating a vector with a calculation that
may overflow. No instances of this bug (out of bounds write due to a calculation
overflow) were found at this time.
Note that this commit is the first steps towards fixing this issue,
we'll be making a formal post to the Rust security list once these
commits have been merged.
Update to a new pinning API.
~~Blocked on #53843 because of method resolution problems with new pin type.~~
@r? @cramertj
cc @RalfJung @pythonesque anyone interested in #49150
fix some uses of pointer intrinsics with invalid pointers
[Found by miri](https://github.com/solson/miri/pull/446):
* `Vec::into_iter` calls `ptr::read` (and the underlying `copy_nonoverlapping`) with an unaligned pointer to a ZST. [According to LLVM devs](https://bugs.llvm.org/show_bug.cgi?id=38583), this is UB because it contradicts the metadata we are attaching to that pointer.
* `HashMap` creation calls `ptr:.write_bytes` on a NULL pointer with a count of 0. This is likely not currently UB *currently*, but it violates the rules we are setting in https://github.com/rust-lang/rust/pull/53783, and we might want to exploit those rules later (e.g. with more `nonnull` attributes for LLVM).
Probably what `HashMap` really should do is use `NonNull::dangling()` instead of 0 for the empty case, but that would require a more careful analysis of the code.
It seems like ideally, we should do a review of usage of such intrinsics all over libstd to ensure that they use valid pointers even when the size is 0. Is it worth opening an issue for that?
Make Arc cloning mechanics clearer in module docs
Add some more wording to module documentation regarding how
`Arc::clone()` works, as some users have assumed cloning Arc's
to work via dereferencing to inner value as follows:
use std::sync::Arc;
let myarc = Arc::new(1);
let myarcref = myarc.clone();
assert!(1 == myarcref);
Instead of the actual mechanic of referencing the existing
Arc value:
use std::sync::Arg;
let myarc = Arc::new(1);
let myarcref = myarc.clone();
assert!(myarcref == &myarc); // not sure if assert could assert this in the real world
Reoptimize VecDeque::append
~Unfortunately, I don't know if these changes fix the unsoundness mentioned in #53529, so it is stil a WIP.
This is also completely untested.
The VecDeque code contains other unsound code: one example : [reading unitialized memory](https://play.rust-lang.org/?gist=6ff47551769af61fd8adc45c44010887&version=nightly&mode=release&edition=2015) (detected by MIRI), so I think this code will need a bigger refactor to make it clearer and safer.~
Note: this is based on #53571.
r? @SimonSapin
Cc: #53529#52553 @YorickPeterse @jonas-schievink @Pazzaz @shepmaster.
Add some more wording to module documentation regarding how
`Arc::clone()` works, as some users have assumed cloning Arc's
to work via dereferencing to inner value as follows:
use std::sync::Arc;
let myarc = Arc::new(1);
let myarcref = myarc.clone();
assert!(1 == myarcref);
Instead of the actual mechanic of referencing the existing
Arc value:
use std::sync::Arg;
let myarc = Arc::new(1);
let myarcref = myarc.clone();
assert!(myarcref == &myarc); // not sure if assert could assert this
in the real world
move the Pin API into its own module for centralized documentation
This implements the change proposed by @withoutboats in #49150, as suggested by @RalfJung in the review of #53104,
along with the documentation that was originally in it, that was deemed more appropriate in module-level documentation.
r? @RalfJung
docs: minor stylistic changes to str/string docs
std::string::String.repeat(): slightly rephrase to be more in-line with other descriptions.
add ticks around a few keywords in other descriptions.