//! In-place iterate-and-collect specialization for `Vec`
//!
//! Note: This documents Vec internals; some of the following sections explain implementation
//! details and are best read together with the source of this module.
//!
//! The specialization in this module applies to iterators in the shape of
//! `source.adapter().adapter().adapter().collect::<Vec<U>>()`
//! where `source` is an owning iterator obtained from [`Vec<T>`], [`Box<[T]>`][box] (by conversion to `Vec`)
//! or [`BinaryHeap<T>`]; the adapters each consume one or more items per step
//! (represented by [`InPlaceIterable`]), provide transitive access to `source` (via [`SourceIter`])
//! and thus to the underlying allocation. Finally, the layouts of `T` and `U` must
//! have the same size and alignment; this is currently ensured via const eval instead of trait bounds
//! in the specialized [`SpecFromIter`] implementation.
//!
//! [`BinaryHeap<T>`]: crate::collections::BinaryHeap
//! [box]: crate::boxed::Box
//!
//! By extension, some other collections which use `collect::<Vec<_>>()` internally in their
//! `FromIterator` implementation benefit from this too.
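//!
//! For example, [`BinaryHeap<T>`]'s `FromIterator` implementation currently collects into a
//! `Vec<T>` first and then heapifies it, so a pipeline like the following can also reuse the
//! source allocation (which collections take this route is an implementation detail and may
//! change):
//!
//! ```rust
//! use std::collections::BinaryHeap;
//!
//! let heap: BinaryHeap<u32> = vec![3u32, 1, 2].into_iter().map(|x| x * 2).collect();
//! assert_eq!(heap.len(), 3);
//! ```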
//!
//! Access to the underlying source goes through a further layer of indirection via the private
//! trait [`AsVecIntoIter`] to hide the implementation detail that other collections may use
//! `vec::IntoIter` internally.
//!
//! In-place iteration depends on the interaction of several unsafe traits, implementation
//! details of multiple parts in the iterator pipeline and often requires holistic reasoning
//! across multiple structs since iterators are executed cooperatively rather than having
//! a central evaluator/visitor struct executing all iterator components.
//!
//! # Reading from and writing to the same allocation
//!
//! By its nature collecting in place means that the reader and writer side of the iterator
//! use the same allocation. Since `try_fold()` (used in [`SpecInPlaceCollect`]) takes a
//! reference to the iterator for the duration of the iteration, we can't interleave
//! the step of reading a value and getting a reference to write to. Instead, raw pointers must be
//! used on the reader and writer side.
//!
//! That writes never clobber a yet-to-be-read item is ensured by the [`InPlaceIterable`] requirements.
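//!
//! As a rough sketch of that pattern (not the actual implementation, which threads the write
//! pointer through `try_fold` via the `InPlaceDrop` sink), imagine a read and a write pointer
//! walking the same buffer, where a slot is only written after it has been read:
//!
//! ```rust
//! let mut buf = vec![1u32, 2, 3, 4];
//! let len = buf.len();
//! let base = buf.as_mut_ptr();
//! let mut read = base as *const u32;
//! let mut write = base;
//! for _ in 0..len {
//!     unsafe {
//!         let item = read.read(); // read first ...
//!         read = read.add(1);
//!         write.write(item * 2); // ... then write into the slot that was just freed up
//!         write = write.add(1);
//!     }
//! }
//! assert_eq!(buf, [2, 4, 6, 8]);
//! ```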
//!
//! # Layout constraints
//!
//! [`Allocator`] requires that `allocate()` and `deallocate()` have matching alignment and size.
//! Additionally this specialization doesn't make sense for ZSTs as there is no reallocation to
//! avoid and it would make pointer arithmetic more difficult.
//!
//! [`Allocator`]: core::alloc::Allocator
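//!
//! A minimal illustration of the size/alignment requirement (the real check compares the
//! source and destination item types of the pipeline via const evaluation in the specialized
//! `SpecFromIter` implementation):
//!
//! ```rust
//! use core::mem::{align_of, size_of};
//!
//! // `usize` -> `isize` keeps both size and alignment, so the `cast` example below can run
//! // in place; a `u64` -> `u32` pipeline changes the size and falls back to the generic path.
//! assert_eq!(size_of::<usize>(), size_of::<isize>());
//! assert_eq!(align_of::<usize>(), align_of::<isize>());
//! assert_ne!(size_of::<u64>(), size_of::<u32>());
//! ```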
//!
//! # Drop- and panic-safety
//!
//! Iteration can panic, requiring dropping the already written parts but also the remainder of
//! the source. Iteration can also leave some source items unconsumed which must be dropped.
//! All those drops in turn can panic, in which case we must either leak the allocation or abort
//! to avoid double-drops.
//!
//! This is handled by the [`InPlaceDrop`] guard for sink items (`U`) and by
//! [`vec::IntoIter::forget_allocation_drop_remaining()`] for remaining source items (`T`).
//!
//! If dropping any remaining source item (`T`) panics then [`InPlaceDstBufDrop`] will handle dropping
//! the already collected sink items (`U`) and freeing the allocation.
//!
//! [`vec::IntoIter::forget_allocation_drop_remaining()`]: super::IntoIter::forget_allocation_drop_remaining()
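//!
//! From the caller's perspective the guarantee is simply that every item is dropped exactly
//! once, even if the pipeline panics part-way through. A minimal sketch (the `Counted` type
//! below exists only for illustration):
//!
//! ```rust
//! use std::panic::{catch_unwind, AssertUnwindSafe};
//! use std::sync::atomic::{AtomicUsize, Ordering};
//!
//! static DROPS: AtomicUsize = AtomicUsize::new(0);
//! struct Counted(usize);
//! impl Drop for Counted {
//!     fn drop(&mut self) {
//!         DROPS.fetch_add(1, Ordering::Relaxed);
//!     }
//! }
//!
//! let src: Vec<Counted> = (0..8usize).map(Counted).collect();
//! let result = catch_unwind(AssertUnwindSafe(|| {
//!     src.into_iter()
//!         .map(|c| if c.0 == 4 { panic!("boom") } else { c })
//!         .collect::<Vec<Counted>>()
//! }));
//! assert!(result.is_err());
//! // The already collected items, the item that caused the panic and the unconsumed
//! // tail of the source have all been dropped exactly once.
//! assert_eq!(DROPS.load(Ordering::Relaxed), 8);
//! ```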
//!
//! # O(1) collect
//!
//! The main iteration itself is further specialized when the iterator implements
//! [`TrustedRandomAccessNoCoerce`] to let the optimizer see that it is a counted loop with a single
//! [induction variable]. This can turn some iterators into a noop, i.e. it reduces them from O(n) to
//! O(1). This particular optimization is quite fickle and doesn't always work, see [#79308].
//!
//! [#79308]: https://github.com/rust-lang/rust/issues/79308
//! [induction variable]: https://en.wikipedia.org/wiki/Induction_variable
//!
//! Since unchecked accesses through that trait do not advance the read pointer of `IntoIter`,
//! this would interact unsoundly with the requirements about dropping the tail described above.
//! But since the normal `Drop` implementation of `IntoIter` would suffer from the same problem, it
//! is only correct for `TrustedRandomAccessNoCoerce` to be implemented when the items don't
//! have a destructor. Thus that implicit requirement also makes the specialization safe to use for
//! in-place collection.
//! Note that this safety concern is about the correctness of `impl Drop for IntoIter`,
//! not the guarantees of `InPlaceIterable`.
//!
//! # Adapter implementations
//!
//! The invariants for adapters are documented in [`SourceIter`] and [`InPlaceIterable`], but
//! getting them right can be rather subtle for multiple, sometimes non-local reasons.
//! For example, `InPlaceIterable` would be valid to implement for [`Peekable`], except
//! that it is stateful and cloneable, and `IntoIter`'s clone implementation shortens the
//! underlying allocation. This means that if the iterator has been peeked and then gets cloned,
//! there no longer is enough room, thus breaking an invariant ([#85322]).
//!
//! [#85322]: https://github.com/rust-lang/rust/issues/85322
//! [`Peekable`]: core::iter::Peekable
//!
//!
//! # Examples
//!
//! Some cases that are optimized by this specialization; more can be found in the `Vec`
//! benchmarks:
//!
//! ```rust
//! # #[allow(dead_code)]
//! /// Converts a usize vec into an isize one.
//! pub fn cast(vec: Vec<usize>) -> Vec<isize> {
//!     // Does not allocate, free or panic. On optlevel>=2 it does not loop.
//!     // Of course this particular case could and should be written with `into_raw_parts` and
//!     // `from_raw_parts` instead.
//!     vec.into_iter().map(|u| u as isize).collect()
//! }
//! ```
//!
//! ```rust
//! # #[allow(dead_code)]
//! /// Drops remaining items in `src` and if the layouts of `T` and `U` match it
//! /// returns an empty Vec backed by the original allocation. Otherwise it returns a new
//! /// empty vec.
//! pub fn recycle_allocation<T, U>(src: Vec<T>) -> Vec<U> {
//!     src.into_iter().filter_map(|_| None).collect()
//! }
//! ```
//!
//! ```rust
//! let vec = vec![13usize; 1024];
//! let _ = vec.into_iter()
//!     .enumerate()
//!     .filter_map(|(idx, val)| if idx % 2 == 0 { Some(val + idx) } else { None })
//!     .collect::<Vec<_>>();
//!
//! // is equivalent to the following, but doesn't require bounds checks
//!
//! let mut vec = vec![13usize; 1024];
//! let mut write_idx = 0;
//! for idx in 0..vec.len() {
//!     if idx % 2 == 0 {
//!         vec[write_idx] = vec[idx] + idx;
//!         write_idx += 1;
//!     }
//! }
//! vec.truncate(write_idx);
//! ```

use core::iter::{InPlaceIterable, SourceIter, TrustedRandomAccessNoCoerce};
use core::mem::{self, ManuallyDrop, SizedTypeProperties};
use core::ptr::{self};

use super::{InPlaceDrop, InPlaceDstBufDrop, SpecFromIter, SpecFromIterNested, Vec};

/// Specialization marker for collecting an iterator pipeline into a Vec while reusing the
/// source allocation, i.e. executing the pipeline in place.
#[rustc_unsafe_specialization_marker]
pub(super) trait InPlaceIterableMarker {}

impl<T> InPlaceIterableMarker for T where T: InPlaceIterable {}

impl<T, I> SpecFromIter<T, I> for Vec<T>
where
    I: Iterator<Item = T> + SourceIter<Source: AsVecIntoIter> + InPlaceIterableMarker,
{
    default fn from_iter(mut iterator: I) -> Self {
        // See "Layout constraints" section in the module documentation. We rely on const
        // optimization here since these conditions currently cannot be expressed as trait bounds
        if T::IS_ZST
            || mem::size_of::<T>()
                != mem::size_of::<<<I as SourceIter>::Source as AsVecIntoIter>::Item>()
            || mem::align_of::<T>()
                != mem::align_of::<<<I as SourceIter>::Source as AsVecIntoIter>::Item>()
        {
            // fallback to more generic implementations
            return SpecFromIterNested::from_iter(iterator);
        }
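
        // Layouts are compatible, so we can take over the source allocation: grab the
        // relevant raw parts of the source `IntoIter` before the iteration starts.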
        let (src_buf, src_ptr, dst_buf, dst_end, cap) = unsafe {
            let inner = iterator.as_inner().as_into_iter();
            (
                inner.buf.as_ptr(),
                inner.ptr,
                inner.buf.as_ptr() as *mut T,
                inner.end as *const T,
                inner.cap,
            )
        };

        let len = SpecInPlaceCollect::collect_in_place(&mut iterator, dst_buf, dst_end);

        let src = unsafe { iterator.as_inner().as_into_iter() };
        // check if SourceIter contract was upheld
        // caveat: if it wasn't we might not even make it to this point
        debug_assert_eq!(src_buf, src.buf.as_ptr());
        // check InPlaceIterable contract. This is only possible if the iterator advanced the
        // source pointer at all. If it uses unchecked access via TrustedRandomAccess
        // then the source pointer will stay in its initial position and we can't use it as reference
        if src.ptr != src_ptr {
            debug_assert!(
                unsafe { dst_buf.add(len) as *const _ } <= src.ptr,
                "InPlaceIterable contract violation, write pointer advanced beyond read pointer"
            );
        }

        // The ownership of the allocation and the new `T` values is temporarily moved into `dst_guard`.
        // This is safe because `forget_allocation_drop_remaining` immediately forgets the allocation
        // before any panic can occur in order to avoid any double free, and then proceeds to drop
        // any remaining values at the tail of the source.
        //
        // Note: This access to the source wouldn't be allowed by the TrustedRandomAccessNoCoerce
        // contract (used by SpecInPlaceCollect below). But see the "O(1) collect" section in the
        // module documentation for why this is ok anyway.
        let dst_guard = InPlaceDstBufDrop { ptr: dst_buf, len, cap };
        src.forget_allocation_drop_remaining();
        mem::forget(dst_guard);

        let vec = unsafe { Vec::from_raw_parts(dst_buf, len, cap) };

        vec
    }
}
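
/// Creates the write-side fold closure used by the default `SpecInPlaceCollect`
/// implementation: each call writes one item through the `InPlaceDrop` sink's write pointer
/// and advances it, debug-asserting that the pointer never moves past `src_end`.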
fn write_in_place_with_drop<T>(
    src_end: *const T,
) -> impl FnMut(InPlaceDrop<T>, T) -> Result<InPlaceDrop<T>, !> {
    move |mut sink, item| {
        unsafe {
            // the InPlaceIterable contract cannot be verified precisely here since
            // try_fold has an exclusive reference to the source pointer
            // all we can do is check if it's still in range
            debug_assert!(sink.dst as *const _ <= src_end, "InPlaceIterable contract violation");
            ptr::write(sink.dst, item);
            // Since this executes user code which can panic we have to bump the pointer
            // after each step.
            sink.dst = sink.dst.add(1);
        }
        Ok(sink)
    }
}

/// Helper trait to hold specialized implementations of the in-place iterate-collect loop
trait SpecInPlaceCollect<T, I>: Iterator<Item = T> {
    /// Collects an iterator (`self`) into the destination buffer (`dst`) and returns the number of items
    /// collected. `end` is the last writable element of the allocation and used for bounds checks.
    ///
    /// This method is specialized and one of its implementations makes use of
    /// `Iterator::__iterator_get_unchecked` calls with a `TrustedRandomAccessNoCoerce` bound
    /// on `I` which means the caller of this method must take the safety conditions
    /// of that trait into consideration.
    fn collect_in_place(&mut self, dst: *mut T, end: *const T) -> usize;
}

impl<T, I> SpecInPlaceCollect<T, I> for I
where
    I: Iterator<Item = T>,
{
    #[inline]
    default fn collect_in_place(&mut self, dst_buf: *mut T, end: *const T) -> usize {
        // use try-fold since
        // - it vectorizes better for some iterator adapters
        // - unlike most internal iteration methods, it only takes a &mut self
        // - it lets us thread the write pointer through its innards and get it back in the end
        let sink = InPlaceDrop { inner: dst_buf, dst: dst_buf };
        let sink =
            self.try_fold::<_, _, Result<_, !>>(sink, write_in_place_with_drop(end)).unwrap();
        // iteration succeeded, don't drop head
        unsafe { ManuallyDrop::new(sink).dst.sub_ptr(dst_buf) }
    }
}
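
// When the iterator also implements `TrustedRandomAccessNoCoerce`, the loop below is driven by a
// plain index instead of `try_fold`, which lets the optimizer see the counted loop described in
// the "O(1) collect" section of the module documentation.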
impl<T, I> SpecInPlaceCollect<T, I> for I
where
    I: Iterator<Item = T> + TrustedRandomAccessNoCoerce,
{
    #[inline]
    fn collect_in_place(&mut self, dst_buf: *mut T, end: *const T) -> usize {
        let len = self.size();
        let mut drop_guard = InPlaceDrop { inner: dst_buf, dst: dst_buf };
        for i in 0..len {
            // Safety: The InPlaceIterable contract guarantees that for every element we read
            // one slot in the underlying storage will have been freed up and we can immediately
            // write back the result.
            unsafe {
                let dst = dst_buf.add(i);
                debug_assert!(dst as *const _ <= end, "InPlaceIterable contract violation");
                ptr::write(dst, self.__iterator_get_unchecked(i));
                // Since this executes user code which can panic we have to bump the pointer
                // after each step.
                drop_guard.dst = dst.add(1);
            }
        }
        mem::forget(drop_guard);
        len
    }
}

/// Internal helper trait for in-place iteration specialization.
///
/// Currently this is only implemented by [`vec::IntoIter`] - returning a reference to itself - and
/// [`binary_heap::IntoIter`] which returns a reference to its inner representation.
///
/// Since this is an internal trait it hides the implementation detail `binary_heap::IntoIter`
/// uses `vec::IntoIter` internally.
///
/// [`vec::IntoIter`]: super::IntoIter
/// [`binary_heap::IntoIter`]: crate::collections::binary_heap::IntoIter
///
/// # Safety
///
/// In-place iteration relies on implementation details of `vec::IntoIter`, most importantly that
/// it does not create references to the whole allocation during iteration, only raw pointers.
#[rustc_specialization_trait]
pub(crate) unsafe trait AsVecIntoIter {
    type Item;
    fn as_into_iter(&mut self) -> &mut super::IntoIter<Self::Item>;
}