Rollup merge of #24782 - steveklabnik:doc_ownership, r=nikomatsakis

Also, as @huonw guessed, move semantics really _does_ make more sense as
a sub-chapter of ownership.
Steve Klabnik 2015-05-05 16:56:01 -04:00
commit 1eae884f39
5 changed files with 756 additions and 583 deletions


@@ -26,7 +26,6 @@
* [References and Borrowing](references-and-borrowing.md)
* [Lifetimes](lifetimes.md)
* [Mutability](mutability.md)
* [Move semantics](move-semantics.md)
* [Enums](enums.md)
* [Match](match.md)
* [Structs](structs.md)


@@ -1,3 +1,297 @@
% Lifetimes
Coming Soon! Until then, check out the [ownership](ownership.html) chapter.
This guide is one of three presenting Rust's ownership system. This is one of
Rust's most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own chapter:
* [ownership][ownership], the key concept
* [borrowing][borrowing], and their associated feature "references"
* lifetimes, which you're reading now
These three chapters are related, and in order. You'll need all three to fully
understand the ownership system.
[ownership]: ownership.html
[borrowing]: references-and-borrowing.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero-cost abstraction. All of the analysis we'll talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call "fighting with the borrow
checker," where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmer's mental model of
how ownership should work doesn't match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, let's learn about lifetimes.
# Lifetimes
Lending out a reference to a resource that someone else owns can be
complicated, however. For example, imagine this set of operations:
- I acquire a handle to some kind of resource.
- I lend you a reference to the resource.
- I decide I'm done with the resource, and deallocate it, while you still have
your reference.
- You decide to use the resource.
Uh oh! Your reference is pointing to an invalid resource. This is called a
"dangling pointer" or "use after free," when the resource is memory.
To fix this, we have to make sure that step four never happens after step
three. The ownership system in Rust does this through a concept called
lifetimes, which describe the scope that a reference is valid for.
When we have a function that takes a reference by argument, we can be implicit
or explicit about the lifetime of the reference:
```rust
// implicit
fn foo(x: &i32) {
}
// explicit
fn bar<'a>(x: &'a i32) {
}
```
The `'a` reads "the lifetime a". Technically, every reference has some lifetime
associated with it, but the compiler lets you elide them in common cases.
Before we get to that, though, let's break the explicit example down:
```rust,ignore
fn bar<'a>(...)
```
This part declares our lifetimes. This says that `bar` has one lifetime, `'a`.
If we had two reference parameters, it would look like this:
```rust,ignore
fn bar<'a, 'b>(...)
```
Then in our parameter list, we use the lifetimes we've named:
```rust,ignore
...(x: &'a i32)
```
If we wanted an `&mut` reference, we'd do this:
```rust,ignore
...(x: &'a mut i32)
```
If you compare `&mut i32` to `&'a mut i32`, they're the same, it's just that
the lifetime `'a` has snuck in between the `&` and the `mut i32`. We read `&mut
i32` as "a mutable reference to an i32" and `&'a mut i32` as "a mutable
reference to an `i32` with the lifetime `'a`".
You'll also need explicit lifetimes when working with [`struct`][struct]s:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // this is the same as `let _y = 5; let y = &_y;`
let f = Foo { x: y };
println!("{}", f.x);
}
```
[struct]: structs.html
As you can see, `struct`s can also have lifetimes. In a similar way to functions,
```rust
struct Foo<'a> {
# x: &'a i32,
# }
```
declares a lifetime, and
```rust
# struct Foo<'a> {
x: &'a i32,
# }
```
uses it. So why do we need a lifetime here? We need to ensure that any reference
to a `Foo` cannot outlive the reference to an `i32` it contains.
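As a minimal sketch of how a declared lifetime can be used beyond the field itself, the block below adds a method to a struct that borrows an `i32`. The names `Holder`, `get`, and `read_through_holder` are hypothetical and are not part of this chapter; they only illustrate that the returned reference is tied to `'a`, the lifetime of the borrowed value.

```rust
// Hypothetical type for illustration: a struct that borrows an `i32`.
struct Holder<'a> {
    value: &'a i32,
}

impl<'a> Holder<'a> {
    // The returned reference carries the lifetime `'a` of the borrowed `i32`,
    // so it may outlive this particular `Holder`, but never the `i32` itself.
    fn get(&self) -> &'a i32 {
        self.value
    }
}

// Builds a temporary `Holder` around `n` and reads the value back through it.
fn read_through_holder(n: &i32) -> i32 {
    let h = Holder { value: n };
    *h.get()
}

fn main() {
    let n = 5;
    println!("{}", read_through_holder(&n));
}
```

The `impl<'a>` line declares the lifetime again for the implementation block, in the same way that `struct Holder<'a>` declares it for the type.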
## Thinking in scopes
A way to think about lifetimes is to visualize the scope that a reference is
valid for. For example:
```rust
fn main() {
let y = &5; // -+ y goes into scope
// |
// stuff // |
// |
} // -+ y goes out of scope
```
Adding in our `Foo`:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // -+ y goes into scope
let f = Foo { x: y }; // -+ f goes into scope
// stuff // |
// |
} // -+ f and y go out of scope
```
Our `f` lives within the scope of `y`, so everything works. What if it didn't?
This code won't work:
```rust,ignore
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
```
Whew! As you can see here, the scopes of `f` and `y` are smaller than the scope
of `x`. But when we do `x = &f.x`, we make `x` a reference to something that's
about to go out of scope.
Named lifetimes are a way of giving these scopes a name. Giving something a
name is the first step towards being able to talk about it.
## 'static
The lifetime named "static" is a special lifetime. It signals that something
has the lifetime of the entire program. Most Rust programmers first come across
`'static` when dealing with strings:
```rust
let x: &'static str = "Hello, world.";
```
String literals have the type `&'static str` because the reference is always
alive: they are baked into the data segment of the final binary. Another
example are globals:
```rust
static FOO: i32 = 5;
let x: &'static i32 = &FOO;
```
This adds an `i32` to the data segment of the binary, and `x` is a reference
to it.
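One place where `'static` tends to show up in practice is a function that returns a reference without taking any reference as input. The sketch below assumes a hypothetical global `GREETING` and a hypothetical function `greeting`; neither appears in this chapter.

```rust
// Hypothetical global, similar in spirit to `FOO` above.
static GREETING: &'static str = "Hello, world.";

// There is no input reference to borrow from, so the returned reference
// has to be valid for the entire program: `'static`.
fn greeting() -> &'static str {
    GREETING
}

fn main() {
    let s: &'static str = greeting();
    println!("{}", s);
}
```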
## Lifetime Elision
Rust supports powerful local type inference in function bodies, but it's
forbidden in item signatures to allow reasoning about the types just based on
the item signature alone. However, for ergonomic reasons a very restricted
secondary inference algorithm called “lifetime elision” applies in function
signatures. It infers only based on the signature components themselves and not
based on the body of the function, only infers lifetime parameters, and does
this with only three easily memorizable and unambiguous rules. This makes
lifetime elision a shorthand for writing an item signature, while not hiding
away the actual types involved as full local inference would if applied to it.
When talking about lifetime elision, we use the terms *input lifetime* and
*output lifetime*. An *input lifetime* is a lifetime associated with a parameter
of a function, and an *output lifetime* is a lifetime associated with the return
value of a function. For example, this function has an input lifetime:
```rust,ignore
fn foo<'a>(bar: &'a str)
```
This one has an output lifetime:
```rust,ignore
fn foo<'a>() -> &'a str
```
This one has a lifetime in both positions:
```rust,ignore
fn foo<'a>(bar: &'a str) -> &'a str
```
Here are the three rules:
* Each elided lifetime in a function's arguments becomes a distinct lifetime
parameter.
* If there is exactly one input lifetime, elided or not, that lifetime is
assigned to all elided lifetimes in the return values of that function.
* If there are multiple input lifetimes, but one of them is `&self` or `&mut
self`, the lifetime of `self` is assigned to all elided output lifetimes.
Otherwise, it is an error to elide an output lifetime.
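The rules above can be sketched with complete function bodies. The names `first_word`, `longest`, `Buffer`, and `contents` are hypothetical and chosen only for illustration: the first function relies on the single-input rule, the second has two input lifetimes and no `self`, so its output lifetime has to be written out, and the method relies on the `&self` rule.

```rust
// One input lifetime: the elided output lifetime is the lifetime of `s`.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// Two input lifetimes and no `self`: eliding the output lifetime would be an
// error, so the signature names `'a` and ties the result to both arguments.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

// Hypothetical type used to show the `&self` rule.
struct Buffer {
    data: String,
}

impl Buffer {
    // Multiple input lifetimes, but one of them is `&self`: the elided output
    // lifetime is the lifetime of `self`, not of `_label`.
    fn contents(&self, _label: &str) -> &str {
        &self.data
    }
}

fn main() {
    let buf = Buffer { data: String::from("stored") };
    println!("{}", first_word("hello world"));
    println!("{}", longest("ab", "c"));
    println!("{}", buf.contents("ignored"));
}
```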
### Examples
Here are some examples of functions with elided lifetimes. We've paired each
example of an elided lifetime with its expanded form.
```rust,ignore
fn print(s: &str); // elided
fn print<'a>(s: &'a str); // expanded
fn debug(lvl: u32, s: &str); // elided
fn debug<'a>(lvl: u32, s: &'a str); // expanded
// In the preceding example, `lvl` doesn't need a lifetime because it's not a
// reference (`&`). Only things relating to references (such as a `struct`
// which contains a reference) need lifetimes.
fn substr(s: &str, until: u32) -> &str; // elided
fn substr<'a>(s: &'a str, until: u32) -> &'a str; // expanded
fn get_str() -> &str; // ILLEGAL, no inputs
fn frob(s: &str, t: &str) -> &str; // ILLEGAL, two inputs
fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &str; // Expanded: Output lifetime is unclear
fn get_mut(&mut self) -> &mut T; // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded
fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command // elided
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded
fn new(buf: &mut [u8]) -> BufWriter; // elided
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded
```


@@ -1,105 +0,0 @@
% Move Semantics
An important aspect of [ownership][ownership] is move semantics. Move
semantics control how and when ownership is transferred between bindings.
[ownership]: ownership.html
For example, consider a type like `Vec<T>`, which owns its contents:
```rust
let v = vec![1, 2, 3];
```
I can assign this vector to another binding:
```rust
let v = vec![1, 2, 3];
let v2 = v;
```
But, if we try to use `v` afterwards, we get an error:
```rust,ignore
let v = vec![1, 2, 3];
let v2 = v;
println!("v[0] is: {}", v[0]);
```
It looks like this:
```text
error: use of moved value: `v`
println!("v[0] is: {}", v[0]);
^
```
A similar thing happens if we define a function which takes ownership, and
try to use something after we've passed it as an argument:
```rust,ignore
fn take(v: Vec<i32>) {
// what happens here isn't important.
}
let v = vec![1, 2, 3];
take(v);
println!("v[0] is: {}", v[0]);
```
Same error: “use of moved value.” When we transfer ownership to something else,
we say that we've "moved" the thing we refer to. You don't need some sort of
special annotation here, it's the default thing that Rust does.
# The details
The reason that we cannot use a binding after we've moved it is subtle, but
important. When we write code like this:
```rust
let v = vec![1, 2, 3];
let v2 = v;
```
The first line creates some data for the vector on the stack, `v`. The vector's
data, however, is stored on the heap, and so it contains a pointer to that
data. When we move `v` to `v2`, it creates a copy of that pointer, for `v2`. Which
would mean two pointers to the contents of the vector on the heap. That would
be a problem: it would violate Rust's safety guarantees by introducing a data
race. Therefore, Rust forbids using `v` after we've done the move.
It's also important to note that optimizations may remove the actual copy of
the bytes, depending on circumstances. So it may not be as inefficient as it
initially seems.
# `Copy` types
We've established that when ownership is transferred to another binding, you
cannot use the original binding. However, there's a [trait][traits] that changes this
behavior, and it's called `Copy`. We haven't discussed traits yet, but for now,
you can think of them as an annotation to a particular type that adds extra
behavior. For example:
```rust
let v = 1;
let v2 = v;
println!("v is: {}", v);
```
In this case, `v` is an `i32`, which implements the `Copy` trait. This means
that, just like a move, when we assign `v` to `v2`, a copy of the data is made.
But, unlike a move, we can still use `v` afterward. This is because an `i32`
has no pointers to data somewhere else, copying it is a full copy.
We will discuss how to make your own types `Copy` in the [traits][traits]
section.
[traits]: traits.html


@@ -1,555 +1,207 @@
% Ownership
This guide presents Rust's ownership system. This is one of Rust's most unique
and compelling features, with which Rust developers should become quite
acquainted. Ownership is how Rust achieves its largest goal, memory safety.
The ownership system has a few distinct concepts: *ownership*, *borrowing*,
and *lifetimes*. We'll talk about each one in turn.
This guide is one of three presenting Rust's ownership system. This is one of
Rust's most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own
chapter:
* ownership, which you're reading now.
* [borrowing][borrowing], and their associated feature "references"
* [lifetimes][lifetimes], an advanced concept of borrowing
These three chapters are related, and in order. You'll need all three to fully
understand the ownership system.
[borrowing]: references-and-borrowing.html
[lifetimes]: lifetimes.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
*zero-cost abstractions*, which means that in Rust, abstractions cost as little
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero cost abstraction. All of the analysis we'll talk about in this guide
of a zero-cost abstraction. All of the analysis we'll talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call "fighting with the borrow
checker," where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmer's mental model of
how ownership should work doesn't match the actual rules that Rust implements.
to Rust experience something we like to call "fighting with the borrow
checker," where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmer's mental model of
how ownership should work doesn't match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, let's learn about ownership.
With that in mind, let's learn about ownership.
# Ownership
At its core, ownership is about *resources*. For the purposes of the vast
majority of this guide, we will talk about a specific resource: memory. The
concept generalizes to any kind of resource, like a file handle, but to make it
more concrete, we'll focus on memory.
When your program allocates some memory, it needs some way to deallocate that
memory. Imagine a function `foo` that allocates four bytes of memory, and then
never deallocates that memory. We call this problem *leaking* memory, because
each time we call `foo`, we're allocating another four bytes. Eventually, with
enough calls to `foo`, we will run our system out of memory. That's no good. So
we need some way for `foo` to deallocate those four bytes. It's also important
that we don't deallocate too many times, either. Without getting into the
details, attempting to deallocate memory multiple times can lead to problems.
In other words, any time some memory is allocated, we need to make sure that we
deallocate that memory once and only once. Too many times is bad, not enough
times is bad. The counts must match.
There's one other important detail with regards to allocating memory. Whenever
we request some amount of memory, what we are given is a handle to that memory.
This handle (often called a *pointer*, when we're referring to memory) is how
we interact with the allocated memory. As long as we have that handle, we can
do something with the memory. Once we're done with the handle, we're also done
with the memory, as we can't do anything useful without a handle to it.
Historically, systems programming languages require you to track these
allocations, deallocations, and handles yourself. For example, if we want some
memory from the heap in a language like C, we do this:
```c
{
int *x = malloc(sizeof(int));
// we can now do stuff with our handle x
*x = 5;
free(x);
}
```
The call to `malloc` allocates some memory. The call to `free` deallocates the
memory. There's also bookkeeping about allocating the correct amount of memory.
Rust combines these two aspects of allocating memory (and other resources) into
a concept called *ownership*. Whenever we request some memory, that handle we
receive is called the *owning handle*. Whenever that handle goes out of scope,
Rust knows that you cannot do anything with the memory anymore, and so
therefore deallocates the memory for you. Here's the equivalent example in
Rust:
[`Variable bindings`][bindings] have a property in Rust: they have ownership
of what they're bound to. This means that when a binding goes out of scope, the
resource that it's bound to is freed. For example:
```rust
{
let x = Box::new(5);
fn foo() {
let v = vec![1, 2, 3];
}
```
The `Box::new` function creates a `Box<T>` (specifically `Box<i32>` in this
case) by allocating a small segment of memory on the heap with enough space to
fit an `i32`. But where in the code is the box deallocated? We said before that
we must have a deallocation for each allocation. Rust handles this for you. It
knows that our handle, `x`, is the owning reference to our box. Rust knows that
`x` will go out of scope at the end of the block, and so it inserts a call to
deallocate the memory at the end of the scope. Because the compiler does this
for us, it's impossible to forget. We always have exactly one deallocation
paired with each of our allocations.
When `v` comes into scope, a new [`Vec<T>`][vect] is created. In this case, the
vector also allocates space on [the heap][heap], for the three elements. When
`v` goes out of scope at the end of `foo()`, Rust will clean up everything
related to the vector, even the heap-allocated memory. This happens
deterministically, at the end of the scope.
This is pretty straightforward, but what happens when we want to pass our box
to a function? Let's look at some code:
[vect]: ../std/vec/struct.Vec.html
[heap]: the-stack-and-the-heap.html
# Move semantics
There's some more subtlety here, though: Rust ensures that there is _exactly
one_ binding to any given resource. For example, if we have a vector, we can
assign it to another binding:
```rust
fn main() {
let x = Box::new(5);
let v = vec![1, 2, 3];
add_one(x);
}
fn add_one(mut num: Box<i32>) {
*num += 1;
}
let v2 = v;
```
This code works, but it's not ideal. For example, let's add one more line of
code, where we print out the value of `x`:
But, if we try to use `v` afterwards, we get an error:
```{rust,ignore}
fn main() {
let x = Box::new(5);
```rust,ignore
let v = vec![1, 2, 3];
add_one(x);
let v2 = v;
println!("{}", x);
}
fn add_one(mut num: Box<i32>) {
*num += 1;
}
println!("v[0] is: {}", v[0]);
```
This does not compile, and gives us an error:
It looks like this:
```text
error: use of moved value: `x`
println!("{}", x);
^
error: use of moved value: `v`
println!("v[0] is: {}", v[0]);
^
```
Remember, we need one deallocation for every allocation. When we try to pass
our box to `add_one`, we would have two handles to the memory: `x` in `main`,
and `num` in `add_one`. If we deallocated the memory when each handle went out
of scope, we would have two deallocations and one allocation, and that's wrong.
So when we call `add_one`, Rust defines `num` as the owner of the handle. And
so, now that we've given ownership to `num`, `x` is invalid. `x`'s value has
"moved" from `x` to `num`. Hence the error: use of moved value `x`.
A similar thing happens if we define a function which takes ownership, and
try to use something after we've passed it as an argument:
To fix this, we can have `add_one` give ownership back when it's done with the
box:
```rust,ignore
fn take(v: Vec<i32>) {
// what happens here isn't important.
}
let v = vec![1, 2, 3];
take(v);
println!("v[0] is: {}", v[0]);
```
Same error: “use of moved value.” When we transfer ownership to something else,
we say that we've "moved" the thing we refer to. You don't need some sort of
special annotation here, it's the default thing that Rust does.
## The details
The reason that we cannot use a binding after we've moved it is subtle, but
important. When we write code like this:
```rust
fn main() {
let x = Box::new(5);
let v = vec![1, 2, 3];
let y = add_one(x);
println!("{}", y);
}
fn add_one(mut num: Box<i32>) -> Box<i32> {
*num += 1;
num
}
let v2 = v;
```
This code will compile and run just fine. Now, we return a `box`, and so the
ownership is transferred back to `y` in `main`. We only have ownership for the
duration of our function before giving it back. This pattern is very common,
and so Rust introduces a concept to describe a handle which temporarily refers
to something another handle owns. It's called *borrowing*, and it's done with
*references*, designated by the `&` symbol.
The first line creates some data for the vector on the [stack][sh], `v`. The
vector's data, however, is stored on the [heap][sh], and so it contains a
pointer to that data. When we move `v` to `v2`, it creates a copy of that pointer,
for `v2`. Which would mean two pointers to the contents of the vector on the
heap. That would be a problem: it would violate Rust's safety guarantees by
introducing a data race. Therefore, Rust forbids using `v` after we've done the
move.
# Borrowing
[sh]: the-stack-and-the-heap.html
Here's the current state of our `add_one` function:
It's also important to note that optimizations may remove the actual copy of
the bytes, depending on circumstances. So it may not be as inefficient as it
initially seems.
## `Copy` types
We've established that when ownership is transferred to another binding, you
cannot use the original binding. However, there's a [trait][traits] that changes this
behavior, and it's called `Copy`. We haven't discussed traits yet, but for now,
you can think of them as an annotation to a particular type that adds extra
behavior. For example:
```rust
fn add_one(mut num: Box<i32>) -> Box<i32> {
*num += 1;
let v = 1;
num
}
let v2 = v;
println!("v is: {}", v);
```
This function takes ownership, because it takes a `Box`, which owns its
contents. But then we give ownership right back.
In this case, `v` is an `i32`, which implements the `Copy` trait. This means
that, just like a move, when we assign `v` to `v2`, a copy of the data is made.
But, unlike a move, we can still use `v` afterward. This is because an `i32`
has no pointers to data somewhere else, copying it is a full copy.
In the physical world, you can give one of your possessions to someone for a
short period of time. You still own your possession, you're just letting someone
else use it for a while. We call that *lending* something to someone, and that
person is said to be *borrowing* that something from you.
We will discuss how to make your own types `Copy` in the [traits][traits]
section.
Rust's ownership system also allows an owner to lend out a handle for a limited
period. This is also called *borrowing*. Here's a version of `add_one` which
borrows its argument rather than taking ownership:
[traits]: traits.html
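As a small preview, here is a sketch of a user-defined type that is `Copy`. The `Point` type and the `shifted` function are hypothetical and are not part of this chapter; the `#[derive(Clone, Copy)]` attribute is the standard way to opt a type in, and it only works when every field is itself `Copy`.

```rust
// Hypothetical type for illustration. Both fields are `i32`, which is `Copy`,
// so the whole struct can derive `Copy` (which also requires `Clone`).
#[derive(Debug, Clone, Copy, PartialEq)]
struct Point {
    x: i32,
    y: i32,
}

// Takes `p` by value. Because `Point` is `Copy`, the caller keeps its own copy.
fn shifted(p: Point) -> Point {
    Point { x: p.x + 1, y: p.y }
}

fn main() {
    let p = Point { x: 1, y: 2 };
    let q = shifted(p);
    // `p` is still usable here: passing it to `shifted` copied it, not moved it.
    println!("{:?} {:?}", p, q);
}
```

Without the `derive` line, the `println!` would fail with the same "use of moved value" error shown earlier for `Vec<i32>`.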
# More than ownership
Of course, if we had to hand ownership back with every function we wrote:
```rust
fn add_one(num: &mut i32) {
*num += 1;
fn foo(v: Vec<i32>) -> Vec<i32> {
// do stuff with v
// hand back ownership
v
}
```
This function borrows an `i32` from its caller, and then increments it. When
the function is over, and `num` goes out of scope, the borrow is over.
We have to change our `main` a bit too:
This would get very tedious. It gets worse the more things we want to take ownership of:
```rust
fn main() {
let mut x = 5;
fn foo(v1: Vec<i32>, v2: Vec<i32>) -> (Vec<i32>, Vec<i32>, i32) {
// do stuff with v1 and v2
add_one(&mut x);
println!("{}", x);
// hand back ownership, and the result of our function
(v1, v2, 42)
}
fn add_one(num: &mut i32) {
*num += 1;
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let (v1, v2, answer) = foo(v1, v2);
```
We don't need to assign the result of `add_one()` anymore, because it doesn't
return anything anymore. This is because we're not passing ownership back,
since we just borrow, not take ownership.
Ugh! The return type, return line, and calling the function get way more
complicated.
# Lifetimes
Luckily, Rust offers a feature, borrowing, which helps us solve this problem.
It's the topic of the next section!
Lending out a reference to a resource that someone else owns can be
complicated, however. For example, imagine this set of operations:
1. I acquire a handle to some kind of resource.
2. I lend you a reference to the resource.
3. I decide I'm done with the resource, and deallocate it, while you still have
your reference.
4. You decide to use the resource.
Uh oh! Your reference is pointing to an invalid resource. This is called a
*dangling pointer* or "use after free," when the resource is memory.
To fix this, we have to make sure that step four never happens after step
three. The ownership system in Rust does this through a concept called
*lifetimes*, which describe the scope that a reference is valid for.
Remember the function that borrowed an `i32`? Let's look at it again.
```rust
fn add_one(num: &mut i32) {
*num += 1;
}
```
Rust has a feature called *lifetime elision*, which allows you to not write
lifetime annotations in certain circumstances. This is one of them. We will
cover the others later. Without eliding the lifetimes, `add_one` looks like
this:
```rust
fn add_one<'a>(num: &'a mut i32) {
*num += 1;
}
```
The `'a` is called a *lifetime*. Most lifetimes are used in places where
short names like `'a`, `'b` and `'c` are clearest, but it's often useful to
have more descriptive names. Let's dig into the syntax in a bit more detail:
```{rust,ignore}
fn add_one<'a>(...)
```
This part _declares_ our lifetimes. This says that `add_one` has one lifetime,
`'a`. If we had two, it would look like this:
```{rust,ignore}
fn add_two<'a, 'b>(...)
```
Then in our parameter list, we use the lifetimes we've named:
```{rust,ignore}
...(num: &'a mut i32)
```
If you compare `&mut i32` to `&'a mut i32`, they're the same, it's just that the
lifetime `'a` has snuck in between the `&` and the `mut i32`. We read `&mut i32` as "a
mutable reference to an i32" and `&'a mut i32` as "a mutable reference to an i32 with the lifetime 'a.'"
Why do lifetimes matter? Well, for example, here's some code:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // this is the same as `let _y = 5; let y = &_y;`
let f = Foo { x: y };
println!("{}", f.x);
}
```
As you can see, `struct`s can also have lifetimes. In a similar way to functions,
```{rust}
struct Foo<'a> {
# x: &'a i32,
# }
```
declares a lifetime, and
```rust
# struct Foo<'a> {
x: &'a i32,
# }
```
uses it. So why do we need a lifetime here? We need to ensure that any reference
to a `Foo` cannot outlive the reference to an `i32` it contains.
## Thinking in scopes
A way to think about lifetimes is to visualize the scope that a reference is
valid for. For example:
```rust
fn main() {
let y = &5; // -+ y goes into scope
// |
// stuff // |
// |
} // -+ y goes out of scope
```
Adding in our `Foo`:
```rust
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let y = &5; // -+ y goes into scope
let f = Foo { x: y }; // -+ f goes into scope
// stuff // |
// |
} // -+ f and y go out of scope
```
Our `f` lives within the scope of `y`, so everything works. What if it didn't?
This code won't work:
```{rust,ignore}
struct Foo<'a> {
x: &'a i32,
}
fn main() {
let x; // -+ x goes into scope
// |
{ // |
let y = &5; // ---+ y goes into scope
let f = Foo { x: y }; // ---+ f goes into scope
x = &f.x; // | | error here
} // ---+ f and y go out of scope
// |
println!("{}", x); // |
} // -+ x goes out of scope
```
Whew! As you can see here, the scopes of `f` and `y` are smaller than the scope
of `x`. But when we do `x = &f.x`, we make `x` a reference to something that's
about to go out of scope.
Named lifetimes are a way of giving these scopes a name. Giving something a
name is the first step towards being able to talk about it.
## 'static
The lifetime named *static* is a special lifetime. It signals that something
has the lifetime of the entire program. Most Rust programmers first come across
`'static` when dealing with strings:
```rust
let x: &'static str = "Hello, world.";
```
String literals have the type `&'static str` because the reference is always
alive: they are baked into the data segment of the final binary. Another
example are globals:
```rust
static FOO: i32 = 5;
let x: &'static i32 = &FOO;
```
This adds an `i32` to the data segment of the binary, and `x` is a reference
to it.
# Shared Ownership
In all the examples we've considered so far, we've assumed that each handle has
a singular owner. But sometimes, this doesn't work. Consider a car. Cars have
four wheels. We would want a wheel to know which car it was attached to. But
this won't work:
```{rust,ignore}
struct Car {
name: String,
}
struct Wheel {
size: i32,
owner: Car,
}
fn main() {
let car = Car { name: "DeLorean".to_string() };
for _ in 0..4 {
Wheel { size: 360, owner: car };
}
}
```
We try to make four `Wheel`s, each with a `Car` that it's attached to. But the
compiler knows that on the second iteration of the loop, there's a problem:
```text
error: use of moved value: `car`
Wheel { size: 360, owner: car };
^~~
note: `car` moved here because it has type `Car`, which is non-copyable
Wheel { size: 360, owner: car };
^~~
```
We need our `Car` to be pointed to by multiple `Wheel`s. We can't do that with
`Box<T>`, because it has a single owner. We can do it with `Rc<T>` instead:
```rust
use std::rc::Rc;
struct Car {
name: String,
}
struct Wheel {
size: i32,
owner: Rc<Car>,
}
fn main() {
let car = Car { name: "DeLorean".to_string() };
let car_owner = Rc::new(car);
for _ in 0..4 {
Wheel { size: 360, owner: car_owner.clone() };
}
}
```
We wrap our `Car` in an `Rc<T>`, getting an `Rc<Car>`, and then use the
`clone()` method to make new references. We've also changed our `Wheel` to have
an `Rc<Car>` rather than just a `Car`.
This is the simplest kind of multiple ownership possible. For example, there's
also `Arc<T>`, which uses more expensive atomic instructions to be the
thread-safe counterpart of `Rc<T>`.
## Lifetime Elision
Rust supports powerful local type inference in function bodies, but its
forbidden in item signatures to allow reasoning about the types just based in
the item signature alone. However, for ergonomic reasons a very restricted
secondary inference algorithm called “lifetime elision” applies in function
signatures. It infers only based on the signature components themselves and not
based on the body of the function, only infers lifetime parameters, and does
this with only three easily memorizable and unambiguous rules. This makes
lifetime elision a shorthand for writing an item signature, while not hiding
away the actual types involved as full local inference would if applied to it.
When talking about lifetime elision, we use the term *input lifetime* and
*output lifetime*. An *input lifetime* is a lifetime associated with a parameter
of a function, and an *output lifetime* is a lifetime associated with the return
value of a function. For example, this function has an input lifetime:
```{rust,ignore}
fn foo<'a>(bar: &'a str)
```
This one has an output lifetime:
```{rust,ignore}
fn foo<'a>() -> &'a str
```
This one has a lifetime in both positions:
```{rust,ignore}
fn foo<'a>(bar: &'a str) -> &'a str
```
Here are the three rules:
* Each elided lifetime in a function's arguments becomes a distinct lifetime
parameter.
* If there is exactly one input lifetime, elided or not, that lifetime is
assigned to all elided lifetimes in the return values of that function.
* If there are multiple input lifetimes, but one of them is `&self` or `&mut
self`, the lifetime of `self` is assigned to all elided output lifetimes.
Otherwise, it is an error to elide an output lifetime.
### Examples
Here are some examples of functions with elided lifetimes. We've paired each
example of an elided lifetime with its expanded form.
```{rust,ignore}
fn print(s: &str); // elided
fn print<'a>(s: &'a str); // expanded
fn debug(lvl: u32, s: &str); // elided
fn debug<'a>(lvl: u32, s: &'a str); // expanded
// In the preceding example, `lvl` doesn't need a lifetime because it's not a
// reference (`&`). Only things relating to references (such as a `struct`
// which contains a reference) need lifetimes.
fn substr(s: &str, until: u32) -> &str; // elided
fn substr<'a>(s: &'a str, until: u32) -> &'a str; // expanded
fn get_str() -> &str; // ILLEGAL, no inputs
fn frob(s: &str, t: &str) -> &str; // ILLEGAL, two inputs
fn frob<'a, 'b>(s: &'a str, t: &'b str) -> &str; // Expanded: Output lifetime is unclear
fn get_mut(&mut self) -> &mut T; // elided
fn get_mut<'a>(&'a mut self) -> &'a mut T; // expanded
fn args<T:ToCStr>(&mut self, args: &[T]) -> &mut Command // elided
fn args<'a, 'b, T:ToCStr>(&'a mut self, args: &'b [T]) -> &'a mut Command // expanded
fn new(buf: &mut [u8]) -> BufWriter; // elided
fn new<'a>(buf: &'a mut [u8]) -> BufWriter<'a> // expanded
```
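The signatures above have no bodies, so they cannot be compiled on their own. As a small runnable sketch, the functions below (`first_word`, `first_word_expanded`, and `longer` are hypothetical names chosen for illustration) show an elided signature, its expanded equivalent, and a two-input case where the output lifetime has to be written out:

```rust
// Elided: there is exactly one input lifetime, so it is assigned to the output.
fn first_word(s: &str) -> &str {
    s.split(' ').next().unwrap_or("")
}

// The same signature with the lifetime written out by hand.
fn first_word_expanded<'a>(s: &'a str) -> &'a str {
    s.split(' ').next().unwrap_or("")
}

// Two input lifetimes and no `self`: eliding the output lifetime would be an
// error, so we name one lifetime and use it for both inputs and the output.
fn longer<'a>(s: &'a str, t: &'a str) -> &'a str {
    if s.len() >= t.len() { s } else { t }
}

fn main() {
    let text = String::from("hello world");
    println!("{}", first_word(&text));
    println!("{}", first_word_expanded(&text));
    println!("{}", longer("borrow", "own"));
}
```

The first two functions are interchangeable from the caller's point of view; elision only saves typing.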
# Related Resources
Coming Soon.


@ -1,3 +1,336 @@
% References and Borrowing
Coming Soon! Until then, check out the [ownership](ownership.html) chapter.
This guide is one of three presenting Rust's ownership system. This is one of
Rust's most unique and compelling features, with which Rust developers should
become quite acquainted. Ownership is how Rust achieves its largest goal,
memory safety. There are a few distinct concepts, each with its own
chapter:
* [ownership][ownership], the key concept
* borrowing, which you're reading now
* [lifetimes][lifetimes], an advanced concept of borrowing
These three chapters are related, and in order. You'll need all three to fully
understand the ownership system.
[ownership]: ownership.html
[lifetimes]: lifetimes.html
# Meta
Before we get to the details, two important notes about the ownership system.
Rust has a focus on safety and speed. It accomplishes these goals through many
zero-cost abstractions, which means that in Rust, abstractions cost as little
as possible in order to make them work. The ownership system is a prime example
of a zero-cost abstraction. All of the analysis we'll talk about in this guide
is _done at compile time_. You do not pay any run-time cost for any of these
features.
However, this system does have a certain cost: learning curve. Many new users
to Rust experience something we like to call 'fighting with the borrow
checker', where the Rust compiler refuses to compile a program that the author
thinks is valid. This often happens because the programmer's mental model of
how ownership should work doesn't match the actual rules that Rust implements.
You probably will experience similar things at first. There is good news,
however: more experienced Rust developers report that once they work with the
rules of the ownership system for a period of time, they fight the borrow
checker less and less.
With that in mind, let's learn about borrowing.
# Borrowing
At the end of the [ownership][ownership] section, we had a nasty function that looked
like this:
```rust
fn foo(v1: Vec<i32>, v2: Vec<i32>) -> (Vec<i32>, Vec<i32>, i32) {
// do stuff with v1 and v2
// hand back ownership, and the result of our function
(v1, v2, 42)
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let (v1, v2, answer) = foo(v1, v2);
```
This is not idiomatic Rust, however, as it doesn't take advantage of borrowing. Here's
the first step:
```rust
fn foo(v1: &Vec<i32>, v2: &Vec<i32>) -> i32 {
// do stuff with v1 and v2
// return the answer
42
}
let v1 = vec![1, 2, 3];
let v2 = vec![1, 2, 3];
let answer = foo(&v1, &v2);
// we can use v1 and v2 here!
```
Instead of taking `Vec<i32>`s as our arguments, we take a reference:
`&Vec<i32>`. And instead of passing `v1` and `v2` directly, we pass `&v1` and
`&v2`. We call the `&T` type a reference, and rather than owning the resource,
it borrows ownership. A binding that borrows something does not deallocate the
resource when it goes out of scope. This means that after the call to `foo()`,
we can use our original bindings again.
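As a concrete sketch of the same idea, the version below gives `foo` a real body. Summing the elements is an assumption made for illustration, since the original only says "do stuff". The point is that both vectors remain usable after the call:

```rust
// Hypothetical body for `foo`: sums the elements of both borrowed vectors.
fn foo(v1: &Vec<i32>, v2: &Vec<i32>) -> i32 {
    let sum1: i32 = v1.iter().sum();
    let sum2: i32 = v2.iter().sum();
    sum1 + sum2
}

fn main() {
    let v1 = vec![1, 2, 3];
    let v2 = vec![1, 2, 3];

    let answer = foo(&v1, &v2);

    // `v1` and `v2` were only borrowed, so both are still usable here.
    println!("{} from {:?} and {:?}", answer, v1, v2);
}
```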
References are immutable, just like bindings. This means that inside of `foo()`,
the vectors can't be changed at all:
```rust,ignore
fn foo(v: &Vec<i32>) {
v.push(5);
}
let v = vec![];
foo(&v);
```
errors with:
```text
error: cannot borrow immutable borrowed content `*v` as mutable
v.push(5);
^
```
Pushing a value mutates the vector, and so we aren't allowed to do it.
# &mut references
There's a second kind of reference: `&mut T`. A mutable reference allows you
to mutate the resource you're borrowing. For example:
```rust
let mut x = 5;
{
let y = &mut x;
*y += 1;
}
println!("{}", x);
```
This will print `6`. We make `y` a mutable reference to `x`, then add one to
the thing `y` points at. You'll notice that `x` had to be marked `mut` as well.
If it wasn't, we couldn't take a mutable borrow to an immutable value.
Otherwise, `&mut` references are just like references. There _is_ a large
difference between the two, and how they interact, though. You can tell
something is fishy in the above example, because we need that extra scope, with
the `{` and `}`. If we remove them, we get an error:
```text
error: cannot borrow `x` as immutable because it is also borrowed as mutable
println!("{}", x);
^
note: previous borrow of `x` occurs here; the mutable borrow prevents
subsequent moves, borrows, or modification of `x` until the borrow ends
let y = &mut x;
^
note: previous borrow ends here
fn main() {
}
^
```
As it turns out, there are rules.
# The Rules
Here are the rules about borrowing in Rust:
First, any borrow must last for a scope no greater than that of the owner.
Second, you may have one or the other of these two kinds of borrows, but not
both at the same time:
* 0 to N references (`&T`) to a resource.
* exactly one mutable reference (`&mut T`)
You may notice that this is very similar to, though not exactly the same as,
the definition of a data race:
> There is a data race when two or more pointers access the same memory
> location at the same time, where at least one of them is writing, and the
> operations are not synchronized.
With references, you may have as many as you'd like, since none of them are
writing. A data race needs two or more pointers to the same memory with at
least one of them writing, but you can only have one `&mut` at a time, and no
`&T` alongside it. This is how Rust prevents data races at compile time: we'll
get errors if we break the rules.
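A minimal sketch of both kinds of borrow, kept in separate scopes so they never overlap (the helpers `total` and `append` are hypothetical names introduced for illustration):

```rust
// Hypothetical helper taking a shared borrow: it can only read.
fn total(v: &Vec<i32>) -> i32 {
    v.iter().sum()
}

// Hypothetical helper taking a mutable borrow: it may change the vector.
fn append(v: &mut Vec<i32>, value: i32) {
    v.push(value);
}

fn main() {
    let mut v = vec![1, 2, 3];

    {
        // Any number of `&T` borrows may coexist.
        let a = &v;
        let b = &v;
        println!("{} {}", total(a), total(b));
    } // both shared borrows end here

    {
        // Exactly one `&mut T`, with no `&T` alongside it.
        let m = &mut v;
        append(m, 4);
    } // the mutable borrow ends here

    println!("{:?}", v);
}
```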
With this in mind, let's consider our example again.
## Thinking in scopes
Here's the code:
```rust,ignore
let mut x = 5;
let y = &mut x;
*y += 1;
println!("{}", x);
```
This code gives us this error:
```text
error: cannot borrow `x` as immutable because it is also borrowed as mutable
println!("{}", x);
^
```
This is because we've violated the rules: we have a `&mut T` pointing to `x`,
and so we aren't allowed to create any `&T`s. One or the other. The note
hints at how to think about this problem:
```text
note: previous borrow ends here
fn main() {
}
^
```
In other words, the mutable borrow is held through the rest of our example. What
we want is for the mutable borrow to end _before_ we try to call `println!` and
make an immutable borrow. In Rust, borrowing is tied to the scope that the
borrow is valid for. And our scopes look like this:
```rust,ignore
let mut x = 5;
let y = &mut x; // -+ &mut borrow of x starts here
// |
*y += 1; // |
// |
println!("{}", x); // -+ - try to borrow x here
// -+ &mut borrow of x ends here
```
The scopes conflict: we can't make an `&x` while `y` is in scope.
So when we add the curly braces:
```rust
let mut x = 5;
{
let y = &mut x; // -+ &mut borrow starts here
*y += 1; // |
} // -+ ... and ends here
println!("{}", x); // <- try to borrow x here
```
There's no problem. Our mutable borrow goes out of scope before we create an
immutable one. Scope is the key to seeing how long a borrow lasts.
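A function call is another way to bound a borrow. In the sketch below, `add_one` is a hypothetical helper: the `&mut` borrow is created for the call and is gone once the call returns, so no extra braces are needed.

```rust
// Hypothetical helper: the `&mut` borrow lasts only for the duration of the call.
fn add_one(y: &mut i32) {
    *y += 1;
}

fn main() {
    let mut x = 5;

    add_one(&mut x);   // the mutable borrow starts and ends on this line

    println!("{}", x); // so an immutable borrow is fine here
}
```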
## Issues borrowing prevents
Why have these restrictive rules? Well, as we noted, these rules prevent data
races. What kinds of issues do data races cause? Here are a few.
### Iterator invalidation
One example is iterator invalidation, which happens when you try to mutate a
collection that you're iterating over. Rust's borrow checker prevents this from
happening:
```rust
let mut v = vec![1, 2, 3];
for i in &v {
println!("{}", i);
}
```
This prints out one through three. As we iterate through the vector, we're
only given references to the elements. And `v` is itself borrowed as immutable,
which means we can't change it while we're iterating:
```rust,ignore
let mut v = vec![1, 2, 3];
for i in &v {
println!("{}", i);
v.push(34);
}
```
Here's the error:
```text
error: cannot borrow `v` as mutable because it is also borrowed as immutable
v.push(34);
^
note: previous borrow of `v` occurs here; the immutable borrow prevents
subsequent moves or mutable borrows of `v` until the borrow ends
for i in &v {
^
note: previous borrow ends here
for i in &v {
println!("{}", i);
v.push(34);
}
^
```
We can't modify `v` because it's borrowed by the loop.
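One way to stay within the rules, sketched here under the assumption that the new elements can be computed up front, is to collect them during the loop and add them once the loop's borrow has ended. The `append_doubled` helper is a hypothetical name introduced for illustration:

```rust
// Hypothetical helper: appends the double of every existing element.
fn append_doubled(v: &mut Vec<i32>) {
    let mut pending = Vec::new();

    for i in v.iter() {
        // Only read from `v` while iterating; stage the new values elsewhere.
        pending.push(*i * 2);
    } // the immutable borrow taken by the loop ends here

    // No iterator is alive any more, so mutating `v` is allowed.
    v.extend(pending);
}

fn main() {
    let mut v = vec![1, 2, 3];
    append_doubled(&mut v);
    println!("{:?}", v);
}
```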
### Use after free
References must not live longer than the resource they refer to. Rust will
check the scopes of your references to ensure that this is true.
If Rust didn't check this property, we could accidentally use a reference
which was invalid. For example:
```rust,ignore
let y: &i32;
{
let x = 5;
y = &x;
}
println!("{}", y);
```
We get this error:
```text
error: `x` does not live long enough
y = &x;
^
note: reference must be valid for the block suffix following statement 0 at
2:16...
let y: &i32;
{
let x = 5;
y = &x;
}
note: ...but borrowed value is only valid for the block suffix following
statement 0 at 4:18
let x = 5;
y = &x;
}
```
In other words, `y` is only valid for the scope where `x` exists. As soon as
`x` goes away, it becomes invalid to refer to it. As such, the error says that
the borrow 'doesn't live long enough' because it's not valid for the right
amount of time.
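A minimal sketch of one fix is to declare `x` in the outer scope, before `y`, so that the value outlives the reference. The `read_after_block` wrapper is a hypothetical name used only to make the example self-contained:

```rust
// Hypothetical wrapper: returns the value read through `y` after the inner block.
fn read_after_block() -> i32 {
    let x = 5;   // `x` now lives until the end of the function
    let y: &i32;

    {
        y = &x;  // the borrow is created inside the block...
    }

    *y           // ...and is still valid here, because `x` has not gone away
}

fn main() {
    println!("{}", read_after_block());
}
```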