% Drain

Let's move on to Drain. Drain is largely the same as IntoIter, except that
instead of consuming the Vec, it borrows the Vec and leaves its allocation
untouched. For now we'll only implement the "basic" full-range version.

```rust,ignore
use std::marker::PhantomData;

struct Drain<'a, T: 'a> {
    vec: PhantomData<&'a mut Vec<T>>,
    start: *const T,
    end: *const T,
}

impl<'a, T> Iterator for Drain<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
```

-- wait, this is seeming familiar. Let's do some more compression. Both
IntoIter and Drain have the exact same structure, so let's just factor it out.

```rust
struct RawValIter<T> {
    start: *const T,
    end: *const T,
}

impl<T> RawValIter<T> {
    // unsafe to construct because it has no associated lifetimes.
    // This is necessary to store a RawValIter in the same struct as
    // its actual allocation. OK since it's a private implementation
    // detail.
    unsafe fn new(slice: &[T]) -> Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
            }
        }
    }
}

// Iterator and DoubleEndedIterator impls identical to IntoIter.
```

And IntoIter becomes the following:

```rust,ignore
pub struct IntoIter<T> {
    _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
    iter: RawValIter<T>,
}

impl<T> Iterator for IntoIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> { self.iter.next() }
    fn size_hint(&self) -> (usize, Option<usize>) { self.iter.size_hint() }
}

impl<T> DoubleEndedIterator for IntoIter<T> {
    fn next_back(&mut self) -> Option<T> { self.iter.next_back() }
}

impl<T> Drop for IntoIter<T> {
    fn drop(&mut self) {
        for _ in &mut self.iter {}
    }
}

impl<T> Vec<T> {
    pub fn into_iter(self) -> IntoIter<T> {
        unsafe {
            let iter = RawValIter::new(&self);
            let buf = ptr::read(&self.buf);
            mem::forget(self);

            IntoIter {
                iter: iter,
                _buf: buf,
            }
        }
    }
}
```

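The `ptr::read` followed by `mem::forget` in `into_iter` is there because Rust won't let us move a field out of a value whose type implements Drop. Here is a minimal sketch of that pattern in isolation. `Owner`, `Buf`, `into_parts`, and the drop counter are hypothetical names invented for this illustration, not part of our Vec:

```rust
use std::mem;
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts how many times `Owner::drop` runs (illustration only).
static OWNER_DROPS: AtomicUsize = AtomicUsize::new(0);

// Hypothetical stand-in for RawVec: something that owns a resource.
struct Buf {
    data: Vec<u8>,
}

// Hypothetical stand-in for Vec: owns a Buf and has its own destructor.
struct Owner {
    buf: Buf,
    len: usize,
}

impl Drop for Owner {
    fn drop(&mut self) {
        OWNER_DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

// Destructuring `owner` by value is rejected because Owner implements Drop,
// so we copy the field out bitwise and then forget the original. Forgetting
// it means its destructor never runs on the field we just took.
fn into_parts(owner: Owner) -> (Buf, usize) {
    unsafe {
        let buf = ptr::read(&owner.buf);
        let len = owner.len;
        mem::forget(owner);
        (buf, len)
    }
}
```

After `into_parts` returns, the caller is the sole owner of the `Buf`, and `Owner`'s destructor has not run. The order matters: if anything panicked between the `ptr::read` and the `mem::forget`, both copies would be dropped.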
Note that I've left a few quirks in this design to make upgrading Drain to work
with arbitrary subranges a bit easier. In particular we *could* have RawValIter
drain itself on drop, but that won't work right for a more complex Drain.
We also take a slice to simplify Drain initialization.

Alright, now Drain is really easy:

```rust
use std::marker::PhantomData;

pub struct Drain<'a, T: 'a> {
    vec: PhantomData<&'a mut Vec<T>>,
    iter: RawValIter<T>,
}

impl<'a, T> Iterator for Drain<'a, T> {
    type Item = T;
    fn next(&mut self) -> Option<T> { self.iter.next() }
    fn size_hint(&self) -> (usize, Option<usize>) { self.iter.size_hint() }
}

impl<'a, T> DoubleEndedIterator for Drain<'a, T> {
    fn next_back(&mut self) -> Option<T> { self.iter.next_back() }
}

impl<'a, T> Drop for Drain<'a, T> {
    fn drop(&mut self) {
        for _ in &mut self.iter {}
    }
}

impl<T> Vec<T> {
    pub fn drain(&mut self) -> Drain<T> {
        unsafe {
            // Build the iterator *before* zeroing the length; otherwise the
            // slice we hand to RawValIter would already be empty.
            let iter = RawValIter::new(&self);

            // this is a mem::forget safety thing. If Drain is forgotten, we just
            // leak the whole Vec's contents. Also we need to do this *eventually*
            // anyway, so why not do it now?
            self.len = 0;

            Drain {
                iter: iter,
                vec: PhantomData,
            }
        }
    }
}
```

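To see the behaviour we're aiming for, here's a small sketch against the standard library's Vec rather than our own. One assumption to keep in mind: std's `drain` takes a range argument, so `drain(..)` plays the role of our full-range `drain()`. The helper names `drain_all` and `drain_keeps_allocation` are made up for this illustration:

```rust
// Drains every element out of `v`, leaving `v` empty but still usable.
fn drain_all(v: &mut Vec<String>) -> Vec<String> {
    v.drain(..).collect()
}

// Returns the drained elements, the length of the source afterwards, and
// whether the source's capacity was left alone.
fn drain_keeps_allocation() -> (Vec<String>, usize, bool) {
    let mut v = vec!["a".to_string(), "b".to_string(), "c".to_string()];
    let cap_before = v.capacity();

    // The mutable borrow held by the Drain ends when `collect` finishes.
    let drained = drain_all(&mut v);

    (drained, v.len(), v.capacity() == cap_before)
}
```

The elements come out by value, the Vec ends up with length 0, and the buffer stays around for reuse. That is the difference from `into_iter`, which takes the allocation with it.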
# Handling Zero-Sized Types

It's time. We're going to fight the spectre that is zero-sized types. Safe Rust
*never* needs to care about this, but Vec is very intensive on raw pointers and
raw allocations, which are exactly the two things that care about
zero-sized types. We need to be careful of two things:

* The raw allocator API has undefined behaviour if you pass in 0 for an
  allocation size.
* Raw pointer offsets are no-ops for zero-sized types, which will break our
  C-style pointer iterator.

Thankfully we abstracted out pointer-iterators and allocation handling into
RawValIter and RawVec respectively. How mysteriously convenient.

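The second point is easy to check for yourself. The sketch below uses the safe `wrapping_add` pointer method so that no allocation is needed. `Token`, `is_zst`, and `offset_moves` are hypothetical names introduced only for this example:

```rust
use std::mem;

// A zero-sized type used only for this illustration.
struct Token;

// True if `T` occupies no bytes.
fn is_zst<T>() -> bool {
    mem::size_of::<T>() == 0
}

// True if stepping a `*const T` by `n` elements changes its address.
// Pointer arithmetic is scaled by `size_of::<T>()`, so for a ZST every
// step is a step of zero bytes.
fn offset_moves<T>(p: *const T, n: usize) -> bool {
    p.wrapping_add(n) != p
}
```

For a `*const Token`, `offset_moves` reports `false` no matter how large `n` is, while for a `*const u32` it reports `true`. A `start == end` check therefore can't tell a full ZST slice from an empty one.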
## Allocating Zero-Sized Types

So if the allocator API doesn't support zero-sized allocations, what on earth
do we store as our allocation? Why, `heap::EMPTY` of course! Almost every operation
with a ZST is a no-op since ZSTs have exactly one value, and therefore no state needs
to be considered to store or load them. This actually extends to `ptr::read` and
`ptr::write`: they won't actually look at the pointer at all. As such we *never* need
to change the pointer.

Note however that our previous reliance on running out of memory before overflow is
no longer valid with zero-sized types. We must explicitly guard against capacity
overflow for zero-sized types.

Due to our current architecture, all this means is writing 3 guards, one in each
method of RawVec.

```rust
impl<T> RawVec<T> {
    fn new() -> Self {
        unsafe {
            // !0 is usize::MAX. This branch should be stripped at compile time.
            let cap = if mem::size_of::<T>() == 0 { !0 } else { 0 };

            // heap::EMPTY doubles as "unallocated" and "zero-sized allocation"
            RawVec { ptr: Unique::new(heap::EMPTY as *mut T), cap: cap }
        }
    }

    fn grow(&mut self) {
        unsafe {
            let elem_size = mem::size_of::<T>();

            // since we set the capacity to usize::MAX when elem_size is
            // 0, getting to here necessarily means the Vec is overfull.
            assert!(elem_size != 0, "capacity overflow");

            let align = mem::min_align_of::<T>();

            let (new_cap, ptr) = if self.cap == 0 {
                let ptr = heap::allocate(elem_size, align);
                (1, ptr)
            } else {
                let new_cap = 2 * self.cap;
                let ptr = heap::reallocate(*self.ptr as *mut _,
                                            self.cap * elem_size,
                                            new_cap * elem_size,
                                            align);
                (new_cap, ptr)
            };

            // If allocate or reallocate fail, we'll get `null` back
            if ptr.is_null() { oom() }

            self.ptr = Unique::new(ptr as *mut _);
            self.cap = new_cap;
        }
    }
}

impl<T> Drop for RawVec<T> {
    fn drop(&mut self) {
        let elem_size = mem::size_of::<T>();

        // don't free zero-sized allocations, as they were never allocated.
        if self.cap != 0 && elem_size != 0 {
            let align = mem::min_align_of::<T>();

            let num_bytes = elem_size * self.cap;
            unsafe {
                heap::deallocate(*self.ptr as *mut _, num_bytes, align);
            }
        }
    }
}
```

That's it. We support pushing and popping zero-sized types now. Our iterators
(that aren't provided by slice Deref) are still busted, though.

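The standard library's Vec makes the same choice of reporting `usize::MAX` as the capacity for zero-sized element types, so it serves as a quick sanity check of the idea. This sketch exercises std's Vec, not our RawVec, and `push_units` is a hypothetical helper name:

```rust
// Pushes `n` unit values into std's Vec and reports (len, capacity).
// No allocation happens for a zero-sized element type; only the length
// field changes.
fn push_units(n: usize) -> (usize, usize) {
    let mut v: Vec<()> = Vec::new();
    for _ in 0..n {
        v.push(());
    }
    (v.len(), v.capacity())
}
```

Whatever `n` is, the reported capacity stays at `usize::MAX`, which is why the only way to reach `grow` with a ZST is to have genuinely overflowed the length.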
## Iterating Zero-Sized Types

Zero-sized offsets are no-ops. This means that our current design will always
initialize `start` and `end` as the same value, and our iterators will yield
nothing. The current solution to this is to cast the pointers to integers,
increment, and then cast them back:

```rust,ignore
impl<T> RawValIter<T> {
    unsafe fn new(slice: &[T]) -> Self {
        RawValIter {
            start: slice.as_ptr(),
            end: if mem::size_of::<T>() == 0 {
                ((slice.as_ptr() as usize) + slice.len()) as *const _
            } else if slice.len() == 0 {
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
            }
        }
    }
}
```

Now we have a different bug. Instead of our iterators not running at all, our
iterators now run *forever*. We need to do the same trick in our iterator impls.
Also, our size_hint computation code will divide by 0 for ZSTs. Since we'll
basically be treating the two pointers as if they point to bytes, we'll just
map size 0 to divide by 1.

```rust,ignore
impl<T> Iterator for RawValIter<T> {
    type Item = T;
    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                let result = ptr::read(self.start);
                self.start = if mem::size_of::<T>() == 0 {
                    (self.start as usize + 1) as *const _
                } else {
                    self.start.offset(1)
                };
                Some(result)
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let elem_size = mem::size_of::<T>();
        let len = (self.end as usize - self.start as usize)
                  / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}

impl<T> DoubleEndedIterator for RawValIter<T> {
    fn next_back(&mut self) -> Option<T> {
        if self.start == self.end {
            None
        } else {
            unsafe {
                self.end = if mem::size_of::<T>() == 0 {
                    (self.end as usize - 1) as *const _
                } else {
                    self.end.offset(-1)
                };
                Some(ptr::read(self.end))
            }
        }
    }
}
```

And that's it. Iteration works!
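If you'd like to poke at the ZST trick outside of our Vec, here is a self-contained sketch of the forward iterator. It makes a few assumptions that differ from the code above: it is restricted to `Copy` elements so that reading out of a borrowed slice can't double-drop anything, it uses `add` and wrapping integer arithmetic instead of `offset`, and for ZSTs it reads through `NonNull::dangling()` rather than through the counter, since the counter may not be aligned for every ZST. Treat it as an illustration rather than the definitive implementation:

```rust
use std::mem;
use std::ptr::{self, NonNull};

// Hypothetical, Copy-only variant of RawValIter for illustration.
struct RawValIter<T: Copy> {
    start: *const T,
    end: *const T,
}

impl<T: Copy> RawValIter<T> {
    // Unsafe because the iterator carries no lifetime: the caller must keep
    // the slice alive for as long as the iterator is used.
    unsafe fn new(slice: &[T]) -> Self {
        let start = slice.as_ptr();
        let end = if mem::size_of::<T>() == 0 {
            // For ZSTs the "pointer" is just a counter in integer space.
            (start as usize).wrapping_add(slice.len()) as *const T
        } else {
            // One past the last element of the slice.
            unsafe { start.add(slice.len()) }
        };
        RawValIter { start, end }
    }
}

impl<T: Copy> Iterator for RawValIter<T> {
    type Item = T;

    fn next(&mut self) -> Option<T> {
        if self.start == self.end {
            return None;
        }
        unsafe {
            if mem::size_of::<T>() == 0 {
                // Bump the counter by one "byte" per element.
                self.start = (self.start as usize).wrapping_add(1) as *const T;
                // A dangling but aligned pointer is enough to read a ZST.
                Some(ptr::read(NonNull::<T>::dangling().as_ptr()))
            } else {
                let item = ptr::read(self.start);
                self.start = self.start.add(1);
                Some(item)
            }
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let elem_size = mem::size_of::<T>();
        let bytes = (self.end as usize).wrapping_sub(self.start as usize);
        // Dividing by 1 for ZSTs treats the two pointers as byte counters.
        let len = bytes / if elem_size == 0 { 1 } else { elem_size };
        (len, Some(len))
    }
}
```

Building one of these over a `[(); 5]` yields five units and then stops, and over a `[u32]` slice it yields the elements in order. In both cases `size_hint` matches the number of items remaining.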