Rollup merge of #119917 - Zalathar:split-off, r=cuviper

Remove special-case handling of `vec.split_off(0)`

#76682 added special handling to `Vec::split_off` for the case where `at == 0`. Instead of copying the vector's contents into a freshly-allocated vector and returning it, the special-case code steals the old vector's allocation, and replaces it with a new (empty) buffer with the same capacity.

That eliminates the need to copy the existing elements, but comes at a surprising cost, as seen in #119913. The returned vector's capacity is no longer determined by the size of its contents (as would be expected for a freshly-allocated vector), and instead uses the full capacity of the old vector.

In cases where the capacity is large but the size is small, that results in a much larger capacity than would be expected from reading the documentation of `split_off`. This is especially bad when `split_off` is called in a loop (to recycle a buffer), and the returned vectors have a wide variety of lengths.

I believe it's better to remove the special-case code, and treat `at == 0` just like any other value:
- The current documentation states that `split_off` returns a “newly allocated vector”, which is not actually true in the current implementation when `at == 0`.
- If the value of `at` could be non-zero at runtime, then the caller has already agreed to the cost of a full memcpy of the taken elements in the general case. Avoiding that copy would be nice if it were close to free, but the different handling of capacity means that it is not.
- If the caller specifically wants to avoid copying in the case where `at == 0`, they can easily implement that behaviour themselves using `mem::replace`.

Fixes #119913.
This commit is contained in:
Matthias Krüger 2024-01-26 14:43:30 +01:00 committed by GitHub
commit 772e80a650
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 18 additions and 14 deletions

View File

@ -2204,14 +2204,6 @@ fn assert_failed(at: usize, len: usize) -> ! {
assert_failed(at, self.len());
}
if at == 0 {
// the new vector can take over the original buffer and avoid the copy
return mem::replace(
self,
Vec::with_capacity_in(self.capacity(), self.allocator().clone()),
);
}
let other_len = self.len - at;
let mut other = Vec::with_capacity_in(other_len, self.allocator().clone());

View File

@ -958,23 +958,35 @@ fn test_append() {
#[test]
fn test_split_off() {
let mut vec = vec![1, 2, 3, 4, 5, 6];
let orig_ptr = vec.as_ptr();
let orig_capacity = vec.capacity();
let vec2 = vec.split_off(4);
let split_off = vec.split_off(4);
assert_eq!(vec, [1, 2, 3, 4]);
assert_eq!(vec2, [5, 6]);
assert_eq!(split_off, [5, 6]);
assert_eq!(vec.capacity(), orig_capacity);
assert_eq!(vec.as_ptr(), orig_ptr);
}
#[test]
fn test_split_off_take_all() {
let mut vec = vec![1, 2, 3, 4, 5, 6];
// Allocate enough capacity that we can tell whether the split-off vector's
// capacity is based on its size, or (incorrectly) on the original capacity.
let mut vec = Vec::with_capacity(1000);
vec.extend([1, 2, 3, 4, 5, 6]);
let orig_ptr = vec.as_ptr();
let orig_capacity = vec.capacity();
let vec2 = vec.split_off(0);
let split_off = vec.split_off(0);
assert_eq!(vec, []);
assert_eq!(vec2, [1, 2, 3, 4, 5, 6]);
assert_eq!(split_off, [1, 2, 3, 4, 5, 6]);
assert_eq!(vec.capacity(), orig_capacity);
assert_eq!(vec2.as_ptr(), orig_ptr);
assert_eq!(vec.as_ptr(), orig_ptr);
// The split-off vector should be newly-allocated, and should not have
// stolen the original vector's allocation.
assert!(split_off.capacity() < orig_capacity);
assert_ne!(split_off.as_ptr(), orig_ptr);
}
#[test]