rollup merge of #18221 : jkleint/guide-boxes

2014-10-27 09:06:33 -07:00 · 2014-10-27 09:06:33 -07:00 · 83e91fb489
commit 83e91fb489
parent e05c3b7799 d257b37608
1 changed files with 79 additions and 45 deletions
--- a/src/doc/guide.md
+++ b/src/doc/guide.md
@ -3637,40 +3637,72 @@ pub fn as_maybe_owned(&self) -> MaybeOwned<'a> { ... }

 ## Boxes

-All of our references so far have been to variables we've created on the stack.
-In Rust, the simplest way to allocate heap variables is using a *box*.  To
-create a box, use the `box` keyword:
+Most of the types we've seen so far have a fixed size or number of components.
+The compiler needs this fact to lay out values in memory. However, some data
+structures, such as a linked list, do not have a fixed size. You might think to
+implement a linked list with an enum that's either a `Node` or the end of the
+list (`Nil`), like this:

-```{rust}
-let x = box 5i;
+```{rust,ignore}
+enum List {             // error: illegal recursive enum type
+    Node(u32, List),
+    Nil
+}
 ```

-This allocates an integer `5` on the heap, and creates a binding `x` that
-refers to it. The great thing about boxed pointers is that we don't have to
-manually free this allocation! If we write
+But the compiler complains that the type is recursive, that is, it could be
+arbitrarily large. To remedy this, Rust provides a fixed-size container called
+a **box** that can hold any type. You can box up any value with the `box`
+keyword. Our boxed List gets the type `Box<List>` (more on the notation when we
+get to generics):
+
+```{rust}
+enum List {
+    Node(u32, Box<List>),
+    Nil
+}
+
+fn main() {
+    let list = Node(0, box Node(1, box Nil));
+}
+```
+
+A box dynamically allocates memory to hold its contents. The great thing about
+Rust is that that memory is *automatically*, *efficiently*, and *predictably*
+deallocated when you're done with the box.
+
+A box is a pointer type, and you access what's inside using the `*` operator,
+just like regular references. This (rather silly) example dynamically allocates
+an integer `5` and makes `x` a pointer to it:

 ```{rust}
 {
    let x = box 5i;
-    // do stuff
+    println!("{}", *x);     // Prints 5
 }
 ```

-then Rust will automatically free `x` at the end of the block. This isn't
-because Rust has a garbage collector -- it doesn't. Instead, when `x` goes out
-of scope, Rust `free`s `x`. This Rust code will do the same thing as the
-following C code:
+The great thing about boxes is that we don't have to manually free this
+allocation! Instead, when `x` reaches the end of its lifetime -- in this case,
+when it goes out of scope at the end of the block -- Rust `free`s `x`. This
+isn't because Rust has a garbage collector (it doesn't). Instead, by tracking
+the ownership and lifetime of a variable (with a little help from you, the
+programmer), the compiler knows precisely when it is no longer used.
+
+The Rust code above will do the same thing as the following C code:

 ```{c,ignore}
 {
    int *x = (int *)malloc(sizeof(int));
-    // do stuff
+    if (!x) abort();
+    *x = 5;
+    printf("%d\n", *x);
    free(x);
 }
 ```

-This means we get the benefits of manual memory management, but the compiler
-ensures that we don't do something wrong. We can't forget to `free` our memory.
+We get the benefits of manual memory management, while ensuring we don't
+introduce any bugs. We can't forget to `free` our memory.

 Boxes are the sole owner of their contents, so you cannot take a mutable
 reference to them and then use the original box:
@ -3706,48 +3738,50 @@ let mut x = box 5i;
 *x;
 ```

+Boxes are simple and efficient pointers to dynamically allocated values with a
+single owner. They are useful for tree-like structures where the lifetime of a
+child depends solely on the lifetime of its (single) parent. If you need a
+value that must persist as long as any of several referrers, read on.
+
 ## Rc and Arc

-Sometimes, you need to allocate something on the heap, but give out multiple
-references to the memory. Rust's `Rc<T>` (pronounced 'arr cee tee') and
-`Arc<T>` types (again, the `T` is for generics, we'll learn more later) provide
-you with this ability.  **Rc** stands for 'reference counted,' and **Arc** for
-'atomically reference counted.' This is how Rust keeps track of the multiple
-owners: every time we make a new reference to the `Rc<T>`, we add one to its
-internal 'reference count.' Every time a reference goes out of scope, we
-subtract one from the count. When the count is zero, the `Rc<T>` can be safely
-deallocated. `Arc<T>` is almost identical to `Rc<T>`, except for one thing: The
-'atomically' in 'Arc' means that increasing and decreasing the count uses a
-thread-safe mechanism to do so. Why two types? `Rc<T>` is faster, so if you're
-not in a multi-threaded scenario, you can have that advantage. Since we haven't
-talked about threading yet in Rust, we'll show you `Rc<T>` for the rest of this
-section.
+Sometimes, you need a variable that is referenced from multiple places
+(immutably!), lasting as long as any of those places, and disappearing when it
+is no longer referenced. For instance, in a graph-like data structure, a node
+might be referenced from all of its neighbors. In this case, it is not possible
+for the compiler to determine ahead of time when the value can be freed -- it
+needs a little run-time support.

-To create an `Rc<T>`, use `Rc::new()`:
+Rust's **Rc** type provides shared ownership of a dynamically allocated value
+that is automatically freed at the end of its last owner's lifetime. (`Rc`
+stands for 'reference counted,' referring to the way these library types are
+implemented.) This provides more flexibility than single-owner boxes, but has
+some runtime overhead.

-```{rust}
-use std::rc::Rc;
-
-let x = Rc::new(5i);
-```
-
-To create a second reference, use the `.clone()` method:
+To create an `Rc` value, use `Rc::new()`. To create a second owner, use the
+`.clone()` method:

 ```{rust}
 use std::rc::Rc;

 let x = Rc::new(5i);
 let y = x.clone();
+
+println!("{} {}", *x, *y);      // Prints 5 5
 ```

-The `Rc<T>` will live as long as any of its references are alive. After they
-all go out of scope, the memory will be `free`d.
+The `Rc` will live as long as any of its owners are alive. After that, the
+memory will be `free`d.

-If you use `Rc<T>` or `Arc<T>`, you have to be careful about introducing
-cycles. If you have two `Rc<T>`s that point to each other, the reference counts
-will never drop to zero, and you'll have a memory leak. To learn more, check
-out [the section on `Rc<T>` and `Arc<T>` in the pointers
-guide](guide-pointers.html#rc-and-arc).
+**Arc** is an 'atomically reference counted' value, identical to `Rc` except
+that ownership can be safely shared among multiple threads. Why two types?
+`Arc` has more overhead, so if you're not in a multi-threaded scenario, you
+don't have to pay the price.
+
+If you use `Rc` or `Arc`, you have to be careful about introducing cycles. If
+you have two `Rc`s that point to each other, they will happily keep each other
+alive forever, creating a memory leak. To learn more, check out [the section on
+`Rc` and `Arc` in the pointers guide](guide-pointers.html#rc-and-arc).

 # Patterns