clean up vec chapter of tarpl

2015-07-20 11:31:29 -07:00 · 2015-07-20 11:31:29 -07:00 · 5f6e0abe27
commit 5f6e0abe27
parent 42c2f107c1
9 changed files with 192 additions and 177 deletions
--- a/src/doc/tarpl/SUMMARY.md
+++ b/src/doc/tarpl/SUMMARY.md
@ -46,6 +46,7 @@
 	* [Deref](vec-deref.md)
 	* [Insert and Remove](vec-insert-remove.md)
 	* [IntoIter](vec-into-iter.md)
+	* [RawVec](vec-raw.md)
 	* [Drain](vec-drain.md)
 	* [Handling Zero-Sized Types](vec-zsts.md)
 	* [Final Code](vec-final.md)
--- a/src/doc/tarpl/vec-alloc.md
+++ b/src/doc/tarpl/vec-alloc.md
@ -46,7 +46,7 @@ Okay, now we can write growing. Roughly, we want to have this logic:
 if cap == 0:
    allocate()
    cap = 1
-else
+else:
    reallocate
    cap *= 2
 ```
--- a/src/doc/tarpl/vec-dealloc.md
+++ b/src/doc/tarpl/vec-dealloc.md
@ -2,13 +2,13 @@

 Next we should implement Drop so that we don't massively leak tons of resources.
 The easiest way is to just call `pop` until it yields None, and then deallocate
-our buffer. Note that calling `pop` is uneeded if `T: !Drop`. In theory we can
-ask Rust if T needs_drop and omit the calls to `pop`. However in practice LLVM
-is *really* good at removing simple side-effect free code like this, so I wouldn't
-bother unless you notice it's not being stripped (in this case it is).
+our buffer. Note that calling `pop` is unneeded if `T: !Drop`. In theory we can
+ask Rust if `T` `needs_drop` and omit the calls to `pop`. However in practice
+LLVM is *really* good at removing simple side-effect free code like this, so I
+wouldn't bother unless you notice it's not being stripped (in this case it is).

-We must not call `heap::deallocate` when `self.cap == 0`, as in this case we haven't
-actually allocated any memory.
+We must not call `heap::deallocate` when `self.cap == 0`, as in this case we
+haven't actually allocated any memory.


 ```rust,ignore
--- a/src/doc/tarpl/vec-deref.md
+++ b/src/doc/tarpl/vec-deref.md
@ -1,13 +1,15 @@
 % Deref

-Alright! We've got a decent minimal ArrayStack implemented. We can push, we can
-pop, and we can clean up after ourselves. However there's a whole mess of functionality
-we'd reasonably want. In particular, we have a proper array, but none of the slice
-functionality. That's actually pretty easy to solve: we can implement `Deref<Target=[T]>`.
-This will magically make our Vec coerce to and behave like a slice in all sorts of
-conditions.
+Alright! We've got a decent minimal stack implemented. We can push, we can
+pop, and we can clean up after ourselves. However there's a whole mess of
+functionality we'd reasonably want. In particular, we have a proper array, but
+none of the slice functionality. That's actually pretty easy to solve: we can
+implement `Deref<Target=[T]>`. This will magically make our Vec coerce to, and
+behave like, a slice in all sorts of conditions.

-All we need is `slice::from_raw_parts`.
+All we need is `slice::from_raw_parts`. It will correctly handle empty slices
+for us. Later once we set up zero-sized type support it will also Just Work
+for those too.

 ```rust,ignore
 use std::ops::Deref;
@ -36,5 +38,5 @@ impl<T> DerefMut for Vec<T> {
 }
 ```

-Now we have `len`, `first`, `last`, indexing, slicing, sorting, `iter`, `iter_mut`,
-and all other sorts of bells and whistles provided by slice. Sweet!
+Now we have `len`, `first`, `last`, indexing, slicing, sorting, `iter`,
+`iter_mut`, and all other sorts of bells and whistles provided by slice. Sweet!
--- a/src/doc/tarpl/vec-drain.md
+++ b/src/doc/tarpl/vec-drain.md
@ -2,7 +2,7 @@

 Let's move on to Drain. Drain is largely the same as IntoIter, except that
 instead of consuming the Vec, it borrows the Vec and leaves its allocation
-free. For now we'll only implement the "basic" full-range version.
+untouched. For now we'll only implement the "basic" full-range version.

 ```rust,ignore
 use std::marker::PhantomData;
@ -38,6 +38,9 @@ impl<T> RawValIter<T> {
        RawValIter {
            start: slice.as_ptr(),
            end: if slice.len() == 0 {
+                // if `len = 0`, then this is not actually allocated memory.
+                // Need to avoid offsetting because that will give wrong
+                // information to LLVM via GEP.
                slice.as_ptr()
            } else {
                slice.as_ptr().offset(slice.len() as isize)
@ -137,5 +140,7 @@ impl<T> Vec<T> {
 }
 ```

+For more details on the `mem::forget` problem, see the
+[section on leaks][leaks].

-
+[leaks]: leaking.html
--- a/src/doc/tarpl/vec-insert-remove.md
+++ b/src/doc/tarpl/vec-insert-remove.md
@ -1,12 +1,13 @@
 % Insert and Remove

-Something *not* provided but slice is `insert` and `remove`, so let's do those next.
+Something *not* provided by slice is `insert` and `remove`, so let's do those
+next.

 Insert needs to shift all the elements at the target index to the right by one.
 To do this we need to use `ptr::copy`, which is our version of C's `memmove`.
-This copies some chunk of memory from one location to another, correctly handling
-the case where the source and destination overlap (which will definitely happen
-here).
+This copies some chunk of memory from one location to another, correctly
+handling the case where the source and destination overlap (which will
+definitely happen here).

 If we insert at index `i`, we want to shift the `[i .. len]` to `[i+1 .. len+1]`
 using the *old* len.
--- a/src/doc/tarpl/vec-into-iter.md
+++ b/src/doc/tarpl/vec-into-iter.md
@ -11,19 +11,20 @@ allocation.
 IntoIter needs to be DoubleEnded as well, to enable reading from both ends.
 Reading from the back could just be implemented as calling `pop`, but reading
 from the front is harder. We could call `remove(0)` but that would be insanely
-expensive. Instead we're going to just use ptr::read to copy values out of either
-end of the Vec without mutating the buffer at all.
+expensive. Instead we're going to just use ptr::read to copy values out of
+either end of the Vec without mutating the buffer at all.

 To do this we're going to use a very common C idiom for array iteration. We'll
-make two pointers; one that points to the start of the array, and one that points
-to one-element past the end. When we want an element from one end, we'll read out
-the value pointed to at that end and move the pointer over by one. When the two
-pointers are equal, we know we're done.
+make two pointers; one that points to the start of the array, and one that
+points to one-element past the end. When we want an element from one end, we'll
+read out the value pointed to at that end and move the pointer over by one. When
+the two pointers are equal, we know we're done.

 Note that the order of read and offset are reversed for `next` and `next_back`
 For `next_back` the pointer is always *after* the element it wants to read next,
 while for `next` the pointer is always *at* the element it wants to read next.
-To see why this is, consider the case where every element but one has been yielded.
+To see why this is, consider the case where every element but one has been
+yielded.

 The array looks like this:

@ -35,6 +36,10 @@ The array looks like this:
 If E pointed directly at the element it wanted to yield next, it would be
 indistinguishable from the case where there are no more elements to yield.

+Although we don't actually care about it during iteration, we also need to hold
+onto the Vec's allocation information in order to free it once IntoIter is
+dropped.
+
 So we're going to use the following struct:

 ```rust,ignore
@ -46,8 +51,8 @@ struct IntoIter<T> {
 }
 ```

-One last subtle detail: if our Vec is empty, we want to produce an empty iterator.
-This will actually technically fall out doing the naive thing of:
+One last subtle detail: if our Vec is empty, we want to produce an empty
+iterator. This will actually technically fall out doing the naive thing of:

 ```text
 start = ptr
@ -155,139 +160,3 @@ impl<T> Drop for IntoIter<T> {
    }
 }
 ```
-
-We've actually reached an interesting situation here: we've duplicated the logic
-for specifying a buffer and freeing its memory. Now that we've implemented it and
-identified *actual* logic duplication, this is a good time to perform some logic
-compression.
-
-We're going to abstract out the `(ptr, cap)` pair and give them the logic for
-allocating, growing, and freeing:
-
-```rust,ignore
-
-struct RawVec<T> {
-    ptr: Unique<T>,
-    cap: usize,
-}
-
-impl<T> RawVec<T> {
-    fn new() -> Self {
-        assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support");
-        unsafe {
-            RawVec { ptr: Unique::new(heap::EMPTY as *mut T), cap: 0 }
-        }
-    }
-
-    // unchanged from Vec
-    fn grow(&mut self) {
-        unsafe {
-            let align = mem::align_of::<T>();
-            let elem_size = mem::size_of::<T>();
-
-            let (new_cap, ptr) = if self.cap == 0 {
-                let ptr = heap::allocate(elem_size, align);
-                (1, ptr)
-            } else {
-                let new_cap = 2 * self.cap;
-                let ptr = heap::reallocate(*self.ptr as *mut _,
-                                            self.cap * elem_size,
-                                            new_cap * elem_size,
-                                            align);
-                (new_cap, ptr)
-            };
-
-            // If allocate or reallocate fail, we'll get `null` back
-            if ptr.is_null() { oom() }
-
-            self.ptr = Unique::new(ptr as *mut _);
-            self.cap = new_cap;
-        }
-    }
-}
-
-
-impl<T> Drop for RawVec<T> {
-    fn drop(&mut self) {
-        if self.cap != 0 {
-            let align = mem::align_of::<T>();
-            let elem_size = mem::size_of::<T>();
-            let num_bytes = elem_size * self.cap;
-            unsafe {
-                heap::deallocate(*self.ptr as *mut _, num_bytes, align);
-            }
-        }
-    }
-}
-```
-
-And change vec as follows:
-
-```rust,ignore
-pub struct Vec<T> {
-    buf: RawVec<T>,
-    len: usize,
-}
-
-impl<T> Vec<T> {
-    fn ptr(&self) -> *mut T { *self.buf.ptr }
-
-    fn cap(&self) -> usize { self.buf.cap }
-
-    pub fn new() -> Self {
-        Vec { buf: RawVec::new(), len: 0 }
-    }
-
-    // push/pop/insert/remove largely unchanged:
-    // * `self.ptr -> self.ptr()`
-    // * `self.cap -> self.cap()`
-    // * `self.grow -> self.buf.grow()`
-}
-
-impl<T> Drop for Vec<T> {
-    fn drop(&mut self) {
-        while let Some(_) = self.pop() {}
-        // deallocation is handled by RawVec
-    }
-}
-```
-
-And finally we can really simplify IntoIter:
-
-```rust,ignore
-struct IntoIter<T> {
-    _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
-    start: *const T,
-    end: *const T,
-}
-
-// next and next_back literally unchanged since they never referred to the buf
-
-impl<T> Drop for IntoIter<T> {
-    fn drop(&mut self) {
-        // only need to ensure all our elements are read;
-        // buffer will clean itself up afterwards.
-        for _ in &mut *self {}
-    }
-}
-
-impl<T> Vec<T> {
-    pub fn into_iter(self) -> IntoIter<T> {
-        unsafe {
-            // need to use ptr::read to unsafely move the buf out since it's
-            // not Copy.
-            let buf = ptr::read(&self.buf);
-            let len = self.len;
-            mem::forget(self);
-
-            IntoIter {
-                start: *buf.ptr,
-                end: buf.ptr.offset(len as isize),
-                _buf: buf,
-            }
-        }
-    }
-}
-```
-
-Much better.
--- a/src/doc/tarpl/vec-layout.md
+++ b/src/doc/tarpl/vec-layout.md
@ -13,15 +13,15 @@ pub struct Vec<T> {
 # fn main() {}
 ```

-And indeed this would compile. Unfortunately, it would be incorrect. The compiler
-will give us too strict variance, so e.g. an `&Vec<&'static str>` couldn't be used
-where an `&Vec<&'a str>` was expected. More importantly, it will give incorrect
-ownership information to dropck, as it will conservatively assume we don't own
-any values of type `T`. See [the chapter on ownership and lifetimes]
-(lifetimes.html) for details.
+And indeed this would compile. Unfortunately, it would be incorrect. The
+compiler will give us too strict variance, so e.g. an `&Vec<&'static str>`
+couldn't be used where an `&Vec<&'a str>` was expected. More importantly, it
+will give incorrect ownership information to dropck, as it will conservatively
+assume we don't own any values of type `T`. See [the chapter on ownership and
+lifetimes] (lifetimes.html) for details.

-As we saw in the lifetimes chapter, we should use `Unique<T>` in place of `*mut T`
-when we have a raw pointer to an allocation we own:
+As we saw in the lifetimes chapter, we should use `Unique<T>` in place of
+`*mut T` when we have a raw pointer to an allocation we own:


 ```rust
@ -40,9 +40,10 @@ pub struct Vec<T> {

 As a recap, Unique is a wrapper around a raw pointer that declares that:

-* We own at least one value of type `T`
+* We may own a value of type `T`
 * We are Send/Sync iff `T` is Send/Sync
-* Our pointer is never null (and therefore `Option<Vec>` is null-pointer-optimized)
+* Our pointer is never null (and therefore `Option<Vec>` is
+  null-pointer-optimized)

 That last point is subtle. First, it makes `Unique::new` unsafe to call, because
 putting `null` inside of it is Undefined Behaviour. It also throws a
--- a/src/doc/tarpl/vec-raw.md
+++ b/src/doc/tarpl/vec-raw.md
@ -0,0 +1,136 @@
+% RawVec
+
+We've actually reached an interesting situation here: we've duplicated the logic
+for specifying a buffer and freeing its memory. Now that we've implemented it
+and identified *actual* logic duplication, this is a good time to perform some
+logic compression.
+
+We're going to abstract out the `(ptr, cap)` pair and give them the logic for
+allocating, growing, and freeing:
+
+```rust,ignore
+struct RawVec<T> {
+    ptr: Unique<T>,
+    cap: usize,
+}
+
+impl<T> RawVec<T> {
+    fn new() -> Self {
+        assert!(mem::size_of::<T>() != 0, "TODO: implement ZST support");
+        unsafe {
+            RawVec { ptr: Unique::new(heap::EMPTY as *mut T), cap: 0 }
+        }
+    }
+
+    // unchanged from Vec
+    fn grow(&mut self) {
+        unsafe {
+            let align = mem::align_of::<T>();
+            let elem_size = mem::size_of::<T>();
+
+            let (new_cap, ptr) = if self.cap == 0 {
+                let ptr = heap::allocate(elem_size, align);
+                (1, ptr)
+            } else {
+                let new_cap = 2 * self.cap;
+                let ptr = heap::reallocate(*self.ptr as *mut _,
+                                            self.cap * elem_size,
+                                            new_cap * elem_size,
+                                            align);
+                (new_cap, ptr)
+            };
+
+            // If allocate or reallocate fail, we'll get `null` back
+            if ptr.is_null() { oom() }
+
+            self.ptr = Unique::new(ptr as *mut _);
+            self.cap = new_cap;
+        }
+    }
+}
+
+
+impl<T> Drop for RawVec<T> {
+    fn drop(&mut self) {
+        if self.cap != 0 {
+            let align = mem::align_of::<T>();
+            let elem_size = mem::size_of::<T>();
+            let num_bytes = elem_size * self.cap;
+            unsafe {
+                heap::deallocate(*self.ptr as *mut _, num_bytes, align);
+            }
+        }
+    }
+}
+```
+
+And change vec as follows:
+
+```rust,ignore
+pub struct Vec<T> {
+    buf: RawVec<T>,
+    len: usize,
+}
+
+impl<T> Vec<T> {
+    fn ptr(&self) -> *mut T { *self.buf.ptr }
+
+    fn cap(&self) -> usize { self.buf.cap }
+
+    pub fn new() -> Self {
+        Vec { buf: RawVec::new(), len: 0 }
+    }
+
+    // push/pop/insert/remove largely unchanged:
+    // * `self.ptr -> self.ptr()`
+    // * `self.cap -> self.cap()`
+    // * `self.grow -> self.buf.grow()`
+}
+
+impl<T> Drop for Vec<T> {
+    fn drop(&mut self) {
+        while let Some(_) = self.pop() {}
+        // deallocation is handled by RawVec
+    }
+}
+```
+
+And finally we can really simplify IntoIter:
+
+```rust,ignore
+struct IntoIter<T> {
+    _buf: RawVec<T>, // we don't actually care about this. Just need it to live.
+    start: *const T,
+    end: *const T,
+}
+
+// next and next_back literally unchanged since they never referred to the buf
+
+impl<T> Drop for IntoIter<T> {
+    fn drop(&mut self) {
+        // only need to ensure all our elements are read;
+        // buffer will clean itself up afterwards.
+        for _ in &mut *self {}
+    }
+}
+
+impl<T> Vec<T> {
+    pub fn into_iter(self) -> IntoIter<T> {
+        unsafe {
+            // need to use ptr::read to unsafely move the buf out since it's
+            // not Copy, and Vec implements Drop (so we can't destructure it).
+            let buf = ptr::read(&self.buf);
+            let len = self.len;
+            mem::forget(self);
+
+            IntoIter {
+                start: *buf.ptr,
+                end: buf.ptr.offset(len as isize),
+                _buf: buf,
+            }
+        }
+    }
+}
+```
+
+Much better.