% The Rust References and Lifetimes Guide

# Introduction

References are one of the more flexible and powerful tools available in
Rust. They can point anywhere: into the heap, stack, and even into the
interior of another data structure. A reference is as flexible as a C pointer
or C++ reference.

Unlike C and C++ compilers, the Rust compiler includes special static
checks that ensure that programs use references safely.

Despite their complete safety, a reference's representation at runtime
is the same as that of an ordinary pointer in a C program. References introduce
zero overhead. The compiler does all safety checks at compile time.

Although references have rather elaborate theoretical underpinnings
(e.g. region pointers), the core concepts will be familiar to anyone
who has worked with C or C++. The best way to explain how they are
used (and their limitations) is probably just to work through several examples.

# By example

References, sometimes known as *borrowed pointers*, are only valid for
a limited duration. References never claim any kind of ownership
over the data that they point to. Instead, they are used for cases
where you would like to use data for a short time.

Consider a simple struct type `Point`:

~~~
struct Point {x: f64, y: f64}
~~~

We can use this simple definition to allocate points in many different ways. For
example, in this code, each of these local variables contains a point,
but allocated in a different place:

~~~
# struct Point {x: f64, y: f64}
let on_the_stack : Point      = Point {x: 3.0, y: 4.0};
let on_the_heap  : Box<Point> = box Point {x: 7.0, y: 9.0};
~~~

Suppose we wanted to write a procedure that computed the distance between any
two points, no matter where they were stored. One option is to define a function
that takes two arguments of type `Point`; that is, it takes the points by value.
But if we define it this way, calling the function will cause the points to be
copied. For points, this is probably not so bad, but often copies are
expensive. So we'd like to define a function that takes the points by
reference.

~~~
# struct Point {x: f64, y: f64}
# fn sqrt(f: f64) -> f64 { 0.0 }
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    sqrt(x_d * x_d + y_d * y_d)
}
~~~

Now we can call `compute_distance()`:

~~~
# struct Point {x: f64, y: f64}
# let on_the_stack :     Point  =     Point{x: 3.0, y: 4.0};
# let on_the_heap  : Box<Point> = box Point{x: 7.0, y: 9.0};
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
compute_distance(&on_the_stack, &*on_the_heap);
~~~

Here, the `&` operator takes the address of the variable
`on_the_stack`; this is because `on_the_stack` has the type `Point`
(that is, a struct value) and we have to take its address to get a
reference. We also call this _borrowing_ the local variable
`on_the_stack`, because we have created an alias: that is, another
name for the same data.

Likewise, in the case of `on_the_heap`,
the `&` operator is used in conjunction with the `*` operator
to take a reference to the contents of the box.
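
As a rough sketch of the two borrows described above, here is the same idea written against a current stable Rust toolchain. That is an assumption on our part: `Box::new` stands in for the `box` expression used in this guide, the `sqrt` method on `f64` replaces the stubbed `sqrt` function, and `distance_demo` is a hypothetical helper added only for illustration.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Takes both points by reference, so neither argument is moved or copied.
fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper: one point lives on the stack, the other in a box,
// and both are lent to `compute_distance`.
fn distance_demo() -> f64 {
    let on_the_stack = Point { x: 3.0, y: 4.0 };
    let on_the_heap: Box<Point> = Box::new(Point { x: 7.0, y: 7.0 });
    // `&*on_the_heap` dereferences the box, then borrows its contents.
    compute_distance(&on_the_stack, &*on_the_heap)
}
```

Both variables remain owned by `distance_demo`; the callee only sees them for the duration of the call.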

Whenever a caller lends data to a callee, there are some limitations on what
the caller can do with the original. For example, if the contents of a
variable have been lent out, you cannot send that variable to another task. In
addition, the compiler will reject any code that might cause the borrowed
value to be freed or overwrite its component fields with values of different
types (I'll get into what kinds of actions those are shortly). This rule
should make intuitive sense: you must wait for a borrower to return the value
that you lent it (that is, wait for the reference to go out of scope)
before you can make full use of it again.
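
The "wait for the borrower to return the value" rule can be sketched as follows. This is only an illustration, written for current stable Rust; `sum_coords` and `lend_then_mutate` are hypothetical names that do not appear elsewhere in this guide.

```rust
struct Point {
    x: f64,
    y: f64,
}

// Hypothetical helper that only reads through the reference.
fn sum_coords(p: &Point) -> f64 {
    p.x + p.y
}

fn lend_then_mutate() -> f64 {
    let mut p = Point { x: 1.0, y: 2.0 };
    let before = {
        let borrowed = &p; // `p` is lent out here
        sum_coords(borrowed)
    }; // the reference goes out of scope, so the loan ends
    p.x = 10.0; // full use of `p` is available again
    before + sum_coords(&p)
}
```

Moving the assignment to `p.x` inside the inner block, and then reading `borrowed` after it, would make the compiler reject the function.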

# Other uses for the & operator

In the previous example, the value `on_the_stack` was defined like so:

~~~
# struct Point {x: f64, y: f64}
let on_the_stack: Point = Point {x: 3.0, y: 4.0};
~~~

This declaration means that code can only pass `Point` by value to other
functions. As a consequence, we had to explicitly take the address of
`on_the_stack` to get a reference. Sometimes, however, it is more
convenient to move the `&` operator into the definition of `on_the_stack`:

~~~
# struct Point {x: f64, y: f64}
let on_the_stack2: &Point = &Point {x: 3.0, y: 4.0};
~~~

Applying `&` to an rvalue (non-assignable location) is just a convenient
shorthand for creating a temporary and taking its address. A more verbose
way to write the same code is:

~~~
# struct Point {x: f64, y: f64}
let tmp = Point {x: 3.0, y: 4.0};
let on_the_stack2 : &Point = &tmp;
~~~

# Taking the address of fields

The `&` operator is not limited to taking the address of
local variables. It can also take the address of fields or
individual array elements. For example, consider this type definition
for `Rectangle`:

~~~
struct Point {x: f64, y: f64} // as before
struct Size {w: f64, h: f64}
struct Rectangle {origin: Point, size: Size}
~~~

Now, as before, we can define rectangles in a few different ways:

~~~
# struct Point {x: f64, y: f64}
# struct Size {w: f64, h: f64} // as before
# struct Rectangle {origin: Point, size: Size}
let rect_stack = &Rectangle {origin: Point {x: 1.0, y: 2.0},
                             size: Size {w: 3.0, h: 4.0}};
let rect_heap = box Rectangle {origin: Point {x: 5.0, y: 6.0},
                               size: Size {w: 3.0, h: 4.0}};
~~~

In each case, we can extract individual subcomponents with the `&`
operator. For example, I could write:

~~~
# struct Point {x: f64, y: f64} // as before
# struct Size {w: f64, h: f64} // as before
# struct Rectangle {origin: Point, size: Size}
# let rect_stack = &Rectangle {origin: Point {x: 1.0, y: 2.0}, size: Size {w: 3.0, h: 4.0}};
# let rect_heap = box Rectangle {origin: Point {x: 5.0, y: 6.0}, size: Size {w: 3.0, h: 4.0}};
# fn compute_distance(p1: &Point, p2: &Point) -> f64 { 0.0 }
compute_distance(&rect_stack.origin, &rect_heap.origin);
~~~

which would borrow the field `origin` from the rectangle on the stack
as well as from the owned box, and then compute the distance between them.
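
A self-contained sketch of borrowing fields, assuming current stable Rust (`Box::new` instead of `box`); `origin_and_area` is a hypothetical helper used only to exercise the borrows, and the coordinates are chosen so the result is easy to check by hand.

```rust
struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }
struct Rectangle { origin: Point, size: Size }

fn compute_distance(p1: &Point, p2: &Point) -> f64 {
    let x_d = p1.x - p2.x;
    let y_d = p1.y - p2.y;
    (x_d * x_d + y_d * y_d).sqrt()
}

// Hypothetical helper: borrows individual fields of a rectangle on the
// stack and of a rectangle stored in a box.
fn origin_and_area() -> (f64, f64) {
    let rect_stack = Rectangle {
        origin: Point { x: 1.0, y: 2.0 },
        size: Size { w: 3.0, h: 4.0 },
    };
    let rect_heap = Box::new(Rectangle {
        origin: Point { x: 4.0, y: 6.0 },
        size: Size { w: 3.0, h: 4.0 },
    });
    // A field borrow reaches through the box automatically.
    let distance = compute_distance(&rect_stack.origin, &rect_heap.origin);
    let size = &rect_heap.size; // borrow just the `size` field
    (distance, size.w * size.h)
}
```

Each reference points into the interior of its rectangle; neither rectangle is copied or moved.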

# Lifetimes

We’ve seen a few examples of borrowing data. To this point, we’ve glossed
over issues of safety. As stated in the introduction, at runtime a reference
is simply a pointer, nothing more. Therefore, avoiding C's problems with
dangling pointers requires a compile-time safety check.

The basis for the check is the notion of _lifetimes_. A lifetime is a
static approximation of the span of execution during which the pointer
is valid: it always corresponds to some expression or block within the
program.

The compiler will only allow a borrow *if it can guarantee that the data will
not be reassigned or moved for the lifetime of the pointer*. This does not
necessarily mean that the data is stored in immutable memory. For example,
the following function is legal:

~~~
# fn some_condition() -> bool { true }
# struct Foo { f: int }
fn example3() -> int {
    let mut x = box Foo {f: 3};
    if some_condition() {
        let y = &x.f;      // -+ L
        return *y;         //  |
    }                      // -+
    x = box Foo {f: 4};
    // ...
# return 0;
}
~~~

Here, the interior of the variable `x` is being borrowed
and `x` is declared as mutable. However, the compiler can prove that
`x` is not assigned anywhere in the lifetime L of the variable
`y`. Therefore, it accepts the function, even though `x` is mutable
and in fact is mutated later in the function.
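
For readers following along with a current toolchain, the same function can be sketched like this. The translation is ours and rests on a few assumptions: `i32` replaces `int`, `Box::new` replaces `box`, and the condition is passed in as a parameter rather than computed by `some_condition()`.

```rust
struct Foo {
    f: i32,
}

fn example3(some_condition: bool) -> i32 {
    let mut x = Box::new(Foo { f: 3 });
    if some_condition {
        let y = &x.f; // the borrow of the box's interior begins here
        return *y;    // ...and ends when this block is left
    }
    // No borrow of `x` is live at this point, so reassignment is accepted.
    x = Box::new(Foo { f: 4 });
    x.f
}
```

The reassignment sits outside the lifetime of `y`, which is why the compiler accepts it even though `x` was borrowed earlier in the function.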

It may not be clear why we are so concerned about mutating a borrowed
variable. The reason is that the runtime system frees any box
_as soon as its owning reference changes or goes out of
scope_. Therefore, a program like this is illegal (and would be
rejected by the compiler):

~~~ {.ignore}
fn example3() -> int {
    let mut x = box Foo {f: 3};
    let y = &x.f;
    x = box Foo {f: 4};  // Error reported here.
    *y
}
~~~

To make this clearer, consider this diagram showing the state of
memory immediately before the re-assignment of `x`:

~~~ {.text}
    Stack               Exchange Heap

  x +-------------+
    | box {f:int} | ----+
  y +-------------+     |
    | &int        | ----+
    +-------------+     |    +---------+
                        +--> |  f: 3   |
                             +---------+
~~~

Once the reassignment occurs, the memory will look like this:

~~~ {.text}
    Stack               Exchange Heap

  x +-------------+          +---------+
    | box {f:int} | -------> |  f: 4   |
  y +-------------+          +---------+
    | &int        | ----+
    +-------------+     |    +---------+
                        +--> | (freed) |
                             +---------+
~~~

Here you can see that the variable `y` still points at the old `f`
field of `Foo`, which has been freed.

In fact, the compiler can apply the same kind of reasoning to any
memory that is (uniquely) owned by the stack frame. So we could
modify the previous example to introduce additional owned pointers
and structs, and the compiler will still be able to detect possible
mutations. This time, we'll use an analogy to illustrate the concept.

~~~ {.ignore}
fn example3() -> int {
    struct House { owner: Box<Person> }
    struct Person { age: int }

    let mut house = box House {
        owner: box Person {age: 30}
    };

    let owner_age = &house.owner.age;
    house = box House {owner: box Person {age: 40}};  // Error reported here.
    house.owner = box Person {age: 50};               // Error reported here.
    *owner_age
}
~~~

In this case, two errors are reported, one when the variable `house` is
modified and another when `house.owner` is modified. Either modification would
invalidate the pointer `owner_age`.

# Borrowing and enums

The previous example showed that the type system forbids any mutations
of owned boxed values while they are being borrowed. In general, the type
system also forbids borrowing a value as mutable if it is already being
borrowed, either as a mutable reference or an immutable one. This restriction
prevents pointers from pointing into freed memory. There is one other
case where the compiler must be very careful to ensure that pointers
remain valid: pointers into the interior of an `enum`.

Let’s look at the following `Shape` type that can represent both rectangles
and circles:

~~~
struct Point {x: f64, y: f64}; // as before
struct Size {w: f64, h: f64}; // as before
enum Shape {
    Circle(Point, f64),   // origin, radius
    Rectangle(Point, Size)  // upper-left, dimensions
}
~~~

Now we might write a function to compute the area of a shape. This
function takes a reference to a shape, to avoid the need for
copying.

~~~
# struct Point {x: f64, y: f64}; // as before
# struct Size {w: f64, h: f64}; // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# static tau: f64 = 6.28;
fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        Circle(_, radius) => 0.5 * tau * radius * radius,
        Rectangle(_, ref size) => size.w * size.h
    }
}
~~~

The first case matches against circles. Here, the pattern extracts the
radius from the shape variant and the action uses it to compute the
area of the circle. (Like any up-to-date engineer, we use the [tau
circle constant][tau] and not that dreadfully outdated notion of pi).

[tau]: http://www.math.utah.edu/~palais/pi.html

The second match is more interesting. Here we match against a
rectangle and extract its size: but rather than copy the `size`
struct, we use a by-reference binding to create a pointer to it. In
other words, a pattern binding like `ref size` binds the name `size`
to a pointer of type `&Size` into the _interior of the enum_.
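
As a sketch of the same match on a current toolchain, under a couple of assumptions of ours: variants are written with their enum prefix (`Shape::Circle`), and the standard library constant `std::f64::consts::TAU` replaces the hand-written `tau`. The `ref size` binding is kept as in the text.

```rust
use std::f64::consts::TAU;

struct Point { x: f64, y: f64 }
struct Size { w: f64, h: f64 }

enum Shape {
    Circle(Point, f64),     // origin, radius
    Rectangle(Point, Size), // upper-left, dimensions
}

fn compute_area(shape: &Shape) -> f64 {
    match *shape {
        // `radius` is an `f64`, copied out of the variant.
        Shape::Circle(_, radius) => 0.5 * TAU * radius * radius,
        // `ref size` binds `size` as a `&Size` pointing into the enum,
        // so the `Size` struct itself is not copied or moved.
        Shape::Rectangle(_, ref size) => size.w * size.h,
    }
}
```

Because `shape` is only borrowed, the by-reference binding is what lets the second arm look inside the variant without taking anything out of it.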

To make this more clear, let's look at a diagram of memory layout in
the case where `shape` points at a rectangle:

~~~ {.text}
Stack             Memory

+-------+         +---------------+
| shape | ------> | Rectangle(    |
+-------+         |   {x: f64,    |
| size  | -+      |    y: f64},   |
+-------+  +----> |   {w: f64,    |
                  |    h: f64})   |
                  +---------------+
~~~

Here you can see that rectangular shapes are composed of five words of
memory. The first is a tag indicating which variant this enum is
(`Rectangle`, in this case). The next two words are the `x` and `y`
fields for the point and the remaining two are the `w` and `h` fields
for the size. The binding `size` is then a pointer into the inside of
the shape.

Perhaps you can see where the danger lies: if the shape were somehow
to be reassigned, perhaps to a circle, then although the memory used
to store that shape value would still be valid, _it would have a
different type_! The following diagram shows what memory would look
like if code overwrote `shape` with a circle:

~~~ {.text}
Stack             Memory

+-------+         +---------------+
| shape | ------> | Circle(       |
+-------+         |   {x: f64,    |
| size  | -+      |    y: f64},   |
+-------+  +----> |   f64)        |
                  |               |
                  +---------------+
~~~

As you can see, the `size` pointer would be pointing at an `f64`
instead of a struct. This is not good: dereferencing the second field
of an `f64` as if it were a struct with two fields would be a memory
safety violation.

So, in fact, for every `ref` binding, the compiler will impose the
same rules as the ones we saw for borrowing the interior of an owned
box: it must be able to guarantee that the `enum` will not be
overwritten for the duration of the borrow. In fact, the compiler
would accept the example we gave earlier. The example is safe because
the shape pointer has type `&Shape`, which means "reference to
immutable memory containing a `Shape`". If, however, the type of that
pointer were `&mut Shape`, then the ref binding would be ill-typed.
Just as with owned boxes, the compiler will permit `ref` bindings
into data owned by the stack frame even if the data are mutable,
but otherwise it requires that the data reside in immutable memory.

# Returning references

So far, all of the examples we have looked at use references in a
“downward” direction. That is, a method or code block creates a
reference, then uses it within the same scope. It is also
possible to return references as the result of a function, but
as we'll see, doing so requires some explicit annotation.

We could write a subroutine like this:

~~~
struct Point {x: f64, y: f64}
fn get_x<'r>(p: &'r Point) -> &'r f64 { &p.x }
~~~

Here, the function `get_x()` returns a pointer into the structure it
was given. The type of the parameter (`&'r Point`) and return type
(`&'r f64`) both use a new syntactic form that we have not seen so
far. Here the identifier `r` names the lifetime of the pointer
explicitly. So in effect, this function declares that it takes a
pointer with lifetime `r` and returns a pointer with that same
lifetime.

In general, it is only possible to return references if they
are derived from a parameter to the procedure. In that case, the
pointer result will always have the same lifetime as one of the
parameters; named lifetimes indicate which parameter that
is.

In the previous code samples, function parameter types did not include a
lifetime name. The compiler simply creates a fresh name for the lifetime
automatically: that is, the lifetime name is guaranteed to refer to a distinct
lifetime from the lifetimes of all other parameters.

Named lifetimes that appear in function signatures are conceptually
the same as the other lifetimes we have seen before, but they are a bit
abstract: they don’t refer to a specific expression within `get_x()`,
but rather to some expression within the *caller of `get_x()`*. The
lifetime `r` is actually a kind of *lifetime parameter*: it is defined
by the caller to `get_x()`, just as the value for the parameter `p` is
defined by that caller.

In any case, whatever the lifetime of `r` is, the pointer produced by
`&p.x` always has the same lifetime as `p` itself: a pointer to a
field of a struct is valid as long as the struct is valid. Therefore,
the compiler accepts the function `get_x()`.
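
To see the caller's side of this, here is a small sketch. `read_x_through_reference` is a hypothetical caller added for illustration; it is the place where a concrete lifetime is chosen for `'r`.

```rust
struct Point {
    x: f64,
    y: f64,
}

// The returned reference is valid for exactly as long as `p` is.
fn get_x<'r>(p: &'r Point) -> &'r f64 {
    &p.x
}

// Hypothetical caller: `'r` is instantiated here, as (roughly) the part
// of this function body during which `x_ref` is in use.
fn read_x_through_reference() -> f64 {
    let p = Point { x: 3.0, y: 4.0 };
    let x_ref = get_x(&p);
    *x_ref + p.y
}
```

Returning `x_ref` itself from `read_x_through_reference` would be rejected, since `p` is a local that is dropped when the function returns.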

To emphasize this point, let’s look at a variation on the example, this
time one that does not compile:

~~~ {.ignore}
struct Point {x: f64, y: f64}
fn get_x_sh(p: &Point) -> &f64 {
    &p.x // Error reported here
}
~~~

Here, the function `get_x_sh()` takes a reference as input and
returns a reference. As before, the lifetime of the reference
that will be returned is a parameter (specified by the
caller). That means that `get_x_sh()` promises to return a reference
that is valid for as long as the caller would like: this is
subtly different from the first example, which promised to return a
pointer that was valid for as long as its pointer argument was valid.

Within `get_x_sh()`, we see the expression `&p.x` which takes the
address of a field of a `Point`. The presence of this expression
implies that the compiler must guarantee that, so long as the
resulting pointer is valid, the original `Point` won't be moved or changed.

But recall that `get_x_sh()` also promised to
return a pointer that was valid for as long as the caller wanted it to
be. Clearly, `get_x_sh()` is not in a position to make both of these
guarantees; in fact, it cannot guarantee that the pointer will remain
valid at all once it returns, as the parameter `p` may or may not be
live in the caller. Therefore, the compiler will report an error here.

In general, if you borrow a struct or box to create a
reference, it will only be valid within the function
and cannot be returned. This is why the typical way to return references
is to take references as input (the only other case in
which it can be legal to return a reference is if it
points at a static constant).
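
The static case mentioned in parentheses can be sketched in a few lines. The names `ORIGIN_X` and `origin_x` are hypothetical and exist only for this illustration.

```rust
// A static item lives for the entire run of the program.
static ORIGIN_X: f64 = 0.0;

// No reference is taken as input, yet returning one is fine because the
// referent is never freed.
fn origin_x() -> &'static f64 {
    &ORIGIN_X
}
```

Here the returned reference is not derived from any parameter; it is valid for as long as any caller could want, which is what the `'static` lifetime expresses.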

# Named lifetimes

Lifetimes can be named and referenced. For example, the special lifetime
`'static`, which does not go out of scope, can be used to create global
variables and communicate between tasks (see the manual for use cases).

## Parameter Lifetimes

Named lifetimes allow for grouping of parameters by lifetime.
For example, consider this function:

~~~
# struct Point {x: f64, y: f64}; // as before
# struct Size {w: f64, h: f64}; // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, T>(shape: &'r Shape, threshold: f64,
                 a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
~~~

This function takes three references and assigns each the same
lifetime `r`. In practice, this means that, in the caller, the
lifetime `r` will be the *intersection of the lifetimes of the three
region parameters*. This may be overly conservative, as in this
example:

~~~
# struct Point {x: f64, y: f64}; // as before
# struct Size {w: f64, h: f64}; // as before
# enum Shape {
#     Circle(Point, f64),   // origin, radius
#     Rectangle(Point, Size)  // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
# fn select<'r, T>(shape: &Shape, threshold: f64,
#                  a: &'r T, b: &'r T) -> &'r T {
#     if compute_area(shape) > threshold {a} else {b}
# }
                                                     // -+ r
fn select_based_on_unit_circle<'r, T>(               //  |-+ B
    threshold: f64, a: &'r T, b: &'r T) -> &'r T {   //  | |
                                                     //  | |
    let shape = Circle(Point {x: 0., y: 0.}, 1.);    //  | |
    select(&shape, threshold, a, b)                  //  | |
}                                                    //  |-+
                                                     // -+
~~~

In this call to `select()`, the lifetime of the first parameter `shape`
is B, the function body. The other two parameters, `a` and `b`,
share the same lifetime, `r`, which is a lifetime parameter of
`select_based_on_unit_circle()`. The caller will infer the
intersection of these two lifetimes as the lifetime of the returned
value, and hence the return value of `select()` will be assigned a
lifetime of B. This will in turn lead to a compilation error, because
`select_based_on_unit_circle()` is supposed to return a value with the
lifetime `r`.
|
2012-09-15 19:09:21 -05:00
|
|
|
|
|
2012-10-02 13:57:42 -05:00
|
|
|
|
To address this, we can modify the definition of `select()` to
|
2012-09-15 19:09:21 -05:00
|
|
|
|
distinguish the lifetime of the first parameter from the lifetime of
|
|
|
|
|
the latter two. After all, the first parameter is not being
|
2012-10-02 13:57:42 -05:00
|
|
|
|
returned. Here is how the new `select()` might look:
|
2012-09-15 19:09:21 -05:00
|
|
|
|
|
2012-10-02 13:57:42 -05:00
|
|
|
|
~~~
# struct Point {x: f64, y: f64}; // as before
# struct Size {w: f64, h: f64}; // as before
# enum Shape {
#     Circle(Point, f64), // origin, radius
#     Rectangle(Point, Size) // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, 'tmp, T>(shape: &'tmp Shape, threshold: f64,
                       a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
~~~

Here you can see that `shape`'s lifetime is now named `tmp`. The
parameters `a`, `b`, and the return value all have the lifetime `r`.
However, since the lifetime `tmp` is not used anywhere else in the
signature, it would be more concise to just omit the named lifetime
for `shape` altogether:

~~~
# struct Point {x: f64, y: f64}; // as before
# struct Size {w: f64, h: f64}; // as before
# enum Shape {
#     Circle(Point, f64), // origin, radius
#     Rectangle(Point, Size) // upper-left, dimensions
# }
# fn compute_area(shape: &Shape) -> f64 { 0.0 }
fn select<'r, T>(shape: &Shape, threshold: f64,
                 a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold {a} else {b}
}
~~~

This is equivalent to the previous definition.

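
To see the corrected `select()` working end to end, here is a minimal sketch written in present-day Rust syntax, which differs from the snippets in this guide (for example, enum variants are written `Shape::Circle`). The body of `compute_area()` is an assumption made for illustration, since this section only stubs it out; the signatures follow the definitions above.

```rust
// Illustrative sketch in modern Rust; `compute_area` is an assumed implementation.
#[allow(dead_code)]
struct Point { x: f64, y: f64 }
#[allow(dead_code)]
struct Size { w: f64, h: f64 }
#[allow(dead_code)]
enum Shape {
    Circle(Point, f64),    // origin, radius
    Rectangle(Point, Size) // upper-left, dimensions
}

fn compute_area(shape: &Shape) -> f64 {
    match shape {
        Shape::Circle(_, radius) => std::f64::consts::PI * radius * radius,
        Shape::Rectangle(_, size) => size.w * size.h,
    }
}

// `shape` has its own (elided) lifetime; only `a` and `b` share `'r`
// with the return value.
fn select<'r, T>(shape: &Shape, threshold: f64,
                 a: &'r T, b: &'r T) -> &'r T {
    if compute_area(shape) > threshold { a } else { b }
}

// Compiles because the returned reference depends only on `'r`,
// not on the local `shape`, which lives for the function body alone.
fn select_based_on_unit_circle<'r, T>(
    threshold: f64, a: &'r T, b: &'r T) -> &'r T {
    let shape = Shape::Circle(Point { x: 0., y: 0. }, 1.);
    select(&shape, threshold, a, b)
}
```

The unit circle has an area of roughly 3.14, so calling `select_based_on_unit_circle(3.0, &a, &b)` returns `a`, while a threshold of `4.0` returns `b`.
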
## Labeled Control Structures

The same `'name` notation used for lifetimes can also label loops, so
that `break` and `continue` can target a specific enclosing loop:

~~~
'h: for i in range(0u, 10) {
    'g: loop {
        if i % 2 == 0 { continue 'h; }
        if i == 9 { break 'h; }
        break 'g;
    }
}
~~~

> *Note:* Labeled breaks are not currently supported within `while` loops.

Named labels are hygienic and can be used safely within macros.
See the macros guide section on hygiene for more details.

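
As a rough illustration of how labeled `break` and `continue` behave, the sketch below restates the loop above in present-day Rust syntax (`0..10` instead of `range(0u, 10)`). The function name and the `seen` vector are hypothetical additions that make the control flow observable.

```rust
/// Collects the values of `i` that reach the end of the inner loop.
/// Even values skip ahead via `continue 'outer`, and 9 ends the
/// whole iteration via `break 'outer`.
fn odd_numbers_before_nine() -> Vec<u32> {
    let mut seen = Vec::new();
    'outer: for i in 0..10u32 {
        'inner: loop {
            if i % 2 == 0 { continue 'outer; } // next iteration of the `for`
            if i == 9 { break 'outer; }        // leave both loops
            seen.push(i);
            break 'inner;                      // leave only the inner `loop`
        }
    }
    seen
}
```

Calling `odd_numbers_before_nine()` yields the odd numbers 1, 3, 5, and 7: every even `i` is skipped, and the outer loop exits as soon as `i` reaches 9.
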
# Conclusion

So there you have it: a (relatively) brief tour of the lifetime
system. For more details, we refer you to the (yet to be written) reference
document on references, which will explain the full notation
and give more examples.