2013-09-27 19:03:34 -05:00
% The Rust Pointer Guide

Rust's pointers are among its most distinctive and compelling features. Pointers
are also one of the more confusing topics for newcomers to Rust. They can also
be confusing for people coming from other languages that support pointers, such
as C++. This guide will help you understand this important topic.

Be sceptical of non-reference pointers in Rust: use them for a deliberate
purpose, not just to make the compiler happy. Each pointer type comes with an
explanation of when it is appropriate to use. Default to references
unless you're in one of those specific situations.

You may be interested in the [cheat sheet](#cheat-sheet), which gives a quick
overview of the types, names, and purpose of the various pointers.

# An introduction

If you aren't familiar with the concept of pointers, here's a short
introduction. Pointers are a very fundamental concept in systems programming
languages, so it's important to understand them.

## Pointer Basics

When you create a new variable binding, you're giving a name to a value that's
stored at a particular location on the stack. (If you're not familiar with the
"heap" vs. "stack", please check out [this Stack Overflow
question](http://stackoverflow.com/questions/79923/what-and-where-are-the-stack-and-heap),
as the rest of this guide assumes you know the difference.) Like this:

```{rust}
let x = 5i;
let y = 8i;
```

| location | value |
|----------|-------|
| 0xd3e030 | 5     |
| 0xd3e028 | 8     |

We're making up memory locations here; they're just sample values. Anyway, the
point is that `x`, the name we're using for our variable, corresponds to the
memory location `0xd3e030`, and the value at that location is `5`. When we
refer to `x`, we get the corresponding value. Hence, `x` is `5`.

Let's introduce a pointer. In some languages, there is just one type of
'pointer,' but in Rust, we have many types. In this case, we'll use a Rust
**reference**, which is the simplest kind of pointer.

```{rust}
let x = 5i;
let y = 8i;
let z = &y;
```

| location | value    |
|----------|----------|
| 0xd3e030 | 5        |
| 0xd3e028 | 8        |
| 0xd3e020 | 0xd3e028 |

See the difference? Rather than holding an ordinary value such as `5`, a
pointer holds a location in memory. In this case, that is the location of `y`.
`x` and `y` have the type `int`, but `z` has the type `&int`. We can print this
location using the `{:p}` format string:

```{rust}
let x = 5i;
let y = 8i;
let z = &y;

println!("{:p}", z);
```

This would print `0xd3e028`, with our fictional memory addresses.

Because `int` and `&int` are different types, we can't, for example, add them
together:

```{rust,ignore}
let x = 5i;
let y = 8i;
let z = &y;

println!("{}", x + z);
```

This gives us an error:

```{notrust,ignore}
hello.rs:6:24: 6:25 error: mismatched types: expected `int` but found `&int` (expected int but found &-ptr)
hello.rs:6     println!("{}", x + z);
                                  ^
```

We can **dereference** the pointer by using the `*` operator. Dereferencing a
pointer means accessing the value at the location stored in the pointer. This
will work:

```{rust}
let x = 5i;
let y = 8i;
let z = &y;

println!("{}", x + *z);
```

It prints `13`.
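
For readers following along with a current compiler, here is a minimal sketch
of the same steps. It assumes today's syntax, where the guide's `5i` and `int`
are written `5` and `i32`; the helper name `add_through_ref` is made up for
illustration.

```rust
// Hypothetical helper: adds a plain integer to the integer behind a reference.
fn add_through_ref(x: i32, z: &i32) -> i32 {
    // `*z` reads the value stored at the location `z` points to.
    x + *z
}

fn main() {
    let x = 5;
    let y = 8;
    let z = &y; // `z` holds the location of `y`, not a copy of its value

    // Prints an address; the exact value differs from run to run.
    println!("{:p}", z);
    println!("{}", add_through_ref(x, z));
}
```

Only the sum is stable across runs; the address printed by `{:p}` depends on
where the program happens to place `y`.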

That's it! That's all pointers are: they point to some memory location. Not
much else to them. Now that we've discussed the 'what' of pointers, let's
talk about the 'why.'

## Pointer uses

Rust's pointers are quite useful, but in different ways than in other systems
languages. We'll talk about best practices for Rust pointers later in
the guide, but here are some ways that pointers are useful in other languages:

In C, strings are a pointer to a list of `char`s, ending with a null byte.
The only way to use strings is to get quite familiar with pointers.

Pointers are useful to point to memory locations that are not on the stack. For
instance, our earlier example used two stack variables, so we were able to give
them names. But if we allocated some heap memory, we wouldn't have that name
available. In C, `malloc` is used to allocate heap memory, and it returns a
pointer.

As a more general variant of the previous two points, any time you have a
structure that can change in size, you need a pointer. You can't tell at
compile time how much memory to allocate, so you've gotta use a pointer to
point at the memory where it will be allocated, and deal with it at run time.

Pointers are useful in languages that are pass-by-value, rather than
pass-by-reference. Basically, languages can make two choices (this is made-up
syntax; it's not Rust):

```{notrust,ignore}
func foo(x) {
    x = 5
}

func main() {
    i = 1
    foo(i)
    // what is the value of i here?
}
```

In languages that are pass-by-value, `foo` will get a copy of `i`, and so
the original version of `i` is not modified. At the comment, `i` will still be
`1`. In a language that is pass-by-reference, `foo` will get a reference to `i`,
and therefore, can change its value. At the comment, `i` will be `5`.

So what do pointers have to do with this? Well, since pointers point to a
location in memory...

```{notrust,ignore}
func foo(&int x) {
    *x = 5
}

func main() {
    i = 1
    foo(&i)
    // what is the value of i here?
}
```

Even in a language that is pass-by-value, `i` will be `5` at the comment. You
see, because the argument `x` is a pointer, we do send a copy over to `foo`,
but because it points at a memory location, which we then assign to, the
original value is still changed. This pattern is called
'pass-reference-by-value.' Tricky!
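
Rust itself passes arguments by value, so the same trick applies. The sketch
below is one way the made-up `foo` might look in current Rust, assuming `i32`
in place of the guide's `int`. A mutable reference, `&mut`, is needed because
the function writes through the pointer.

```rust
// Hypothetical Rust rendering of the made-up `foo` above.
fn foo(x: &mut i32) {
    // Assign to the location `x` points at, not to `x` itself.
    *x = 5;
}

fn main() {
    let mut i = 1;
    foo(&mut i); // the reference is copied into `foo`; `i` itself is not
    println!("{}", i);
}
```

The reference is what gets copied into `foo`, yet the assignment lands in the
caller's `i`.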

## Common pointer problems

We've talked about pointers, and we've sung their praises. So what's the
downside? Well, Rust attempts to mitigate each of these kinds of problems,
but here are problems with pointers in other languages:

Uninitialized pointers can cause a problem. For example, what does this program
do?

```{notrust,ignore}
&int x;
*x = 5; // whoops!
```

Who knows? We just declare a pointer, but don't point it at anything, and then
set the memory location that it points at to be `5`. But which location? Nobody
knows. This might be harmless, and it might be catastrophic.

When you combine pointers and functions, it's easy to accidentally invalidate
the memory the pointer is pointing to. For example:

```{notrust,ignore}
func make_pointer(): &int {
    x = 5;

    return &x;
}

func main() {
    &int i = make_pointer();
    *i = 5; // uh oh!
}
```

`x` is local to the `make_pointer` function, and therefore, is invalid as soon
as `make_pointer` returns. But we return a pointer to its memory location, and
so back in `main`, we try to use that pointer, and it's a very similar
situation to our first one. Setting invalid memory locations is bad.
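
Rust's answer to this particular problem is to reject the pointer at compile
time and have the function hand back an owned value instead. The sketch below
assumes current Rust syntax, and both function names are hypothetical.

```rust
// A direct translation of `make_pointer` does not compile:
//
//     fn make_pointer() -> &i32 { let x = 5; &x }
//
// The compiler refuses it: nothing longer-lived than `x` exists for the
// returned reference to borrow from.

// Hypothetical alternative: return the value itself.
fn make_value() -> i32 {
    let x = 5;
    x // the value is handed to the caller; nothing is left dangling
}

// Hypothetical alternative: return an owned heap allocation.
fn make_boxed() -> Box<i32> {
    Box::new(5)
}

fn main() {
    let i = make_value();
    let b = make_boxed();
    println!("{} {}", i, *b);
}
```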

As one last example of a big problem with pointers, **aliasing** can be an
issue. Two pointers are said to alias when they point at the same location
in memory. Like this:

```{notrust,ignore}
func mutate(&int i, int j) {
    *i = j;
}

func main() {
    x = 5;
    y = &x;
    z = &x; // y and z are aliased

    run_in_new_thread(mutate, y, 1);
    run_in_new_thread(mutate, z, 100);

    // what is the value of x here?
}
```

In this made-up example, `run_in_new_thread` spins up a new thread, and calls
the given function with its arguments. Since we have two threads, and
they're both operating on aliases to `x`, we can't tell which one finishes
first, and therefore, the value of `x` is actually non-deterministic. Worse,
what if one of them had invalidated the memory location they pointed to? We'd
have the same problem as before, where we'd be setting an invalid location.
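
In Rust, sharing a mutable value between threads has to be spelled out. The
following sketch assumes the standard library's `Arc` and `Mutex` types, which
this guide has not covered, and uses the hypothetical name `run_both`. It is one
possible translation of the made-up example, not the only one.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Writes `j` into the shared integer while holding the lock.
fn mutate(i: &Mutex<i32>, j: i32) {
    *i.lock().unwrap() = j;
}

// Hypothetical driver: two threads write to the same integer.
fn run_both() -> i32 {
    let x = Arc::new(Mutex::new(5));
    let y = Arc::clone(&x); // `y` and `z` refer to the same integer as `x`
    let z = Arc::clone(&x);

    let t1 = thread::spawn(move || mutate(&y, 1));
    let t2 = thread::spawn(move || mutate(&z, 100));
    t1.join().unwrap();
    t2.join().unwrap();

    // Read the final value once both threads have finished.
    let result = *x.lock().unwrap();
    result
}

fn main() {
    println!("{}", run_both());
}
```

The final value is still either `1` or `100`, depending on which thread writes
last, but every write goes through the lock and the memory stays valid until
both threads are done with it.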

## Conclusion

That's a basic overview of pointers as a general concept. As we alluded to
before, Rust has different kinds of pointers, rather than just one, and
mitigates all of the problems that we talked about, too. This does mean that
Rust pointers are slightly more complicated than in other languages, but
it's worth it to not have the problems that simple pointers have.

# References

The most basic type of pointer that Rust has is called a 'reference.' Rust
references look like this:

```{rust}
let x = 5i;
let y = &x;

println!("{}", *y);
println!("{:p}", y);
println!("{}", y);
```

We'd say "`y` is a reference to `x`." The first `println!` prints out the
value of `y`'s referent by using the dereference operator, `*`. The second
one prints out the memory location that `y` points to, by using the pointer
format string. The third `println!` *also* prints out the value of `y`'s
referent, because `println!` will automatically dereference it for us.

Here's a function that takes a reference:

```{rust}
fn succ(x: &int) -> int { *x + 1 }
```

You can also use `&` as an operator to create a reference, so we can
call this function in two different ways:

```{rust}
fn succ(x: &int) -> int { *x + 1 }

fn main() {
    let x = 5i;
    let y = &x;

    println!("{}", succ(y));
    println!("{}", succ(&x));
}
```

Both of these `println!`s will print out `6`.

Of course, if this were real code, we wouldn't bother with the reference, and
would just write:

```{rust}
fn succ(x: int) -> int { x + 1 }
```

References are immutable by default:

```{rust,ignore}
let x = 5i;
let y = &x;

*y = 5; // error: cannot assign to immutable dereference of `&`-pointer `*y`
```

A reference can be made mutable with `mut`, but only if its referent is also
mutable. This works:

```{rust}
let mut x = 5i;
let y = &mut x;
```

This does not:

```{rust,ignore}
let x = 5i;
let y = &mut x; // error: cannot borrow immutable local variable `x` as mutable
```

Immutable pointers are allowed to alias:

```{rust}
let x = 5i;
let y = &x;
let z = &x;
```

Mutable ones, however, are not:

```{rust,ignore}
let mut x = 5i;
let y = &mut x;
let z = &mut x; // error: cannot borrow `x` as mutable more than once at a time
```
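
Mutable borrows are allowed one after another, just not at the same time. A
small sketch, assuming current Rust syntax and a made-up function name:

```rust
// Hypothetical example: two mutable borrows of `x`, one after the other.
fn increment_twice(start: i32) -> i32 {
    let mut x = start;
    {
        let y = &mut x; // first mutable borrow
        *y += 1;
    } // `y` goes out of scope here, ending the first borrow

    let z = &mut x; // second mutable borrow, allowed because `y` is gone
    *z += 1;

    x
}

fn main() {
    println!("{}", increment_twice(5));
}
```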

Despite their complete safety, a reference's representation at runtime is the
same as that of an ordinary pointer in a C program. They introduce zero
overhead. The compiler does all safety checks at compile time. The theory that
allows for this was originally called **region pointers**. Region pointers
evolved into what we know today as **lifetimes**.

Here's the simple explanation: would you expect this code to compile?

```{rust,ignore}
fn main() {
    println!("{}", x);
    let x = 5;
}
```

Probably not. That's because you know that the name `x` is valid from where
it's declared to when it goes out of scope. In this case, that's the end of
the `main` function. So you know this code will cause an error. We call this
duration a 'lifetime'. Let's try a more complex example:

```{rust}
fn main() {
    let x = &mut 5i;

    if *x < 10 {
        let y = &x;

        println!("Oh no: {}", y);
        return;
    }

    *x -= 1;

    println!("Oh no: {}", x);
}
```

Here, we're borrowing a pointer to `x` inside of the `if`. The compiler, however,
is able to determine that that pointer will go out of scope without `x` being
mutated, and therefore, lets us pass. This wouldn't work:

```{rust,ignore}
fn main() {
    let x = &mut 5i;

    if *x < 10 {
        let y = &x;
        *x -= 1;

        println!("Oh no: {}", y);
        return;
    }

    *x -= 1;

    println!("Oh no: {}", x);
}
```

It gives this error:

```{notrust,ignore}
test.rs:5:8: 5:10 error: cannot assign to `*x` because it is borrowed
test.rs:5         *x -= 1;
                  ^~
test.rs:4:16: 4:18 note: borrow of `*x` occurs here
test.rs:4         let y = &x;
                          ^~
```

As you might guess, this kind of analysis is complex for a human, and therefore
hard for a computer, too! There is an entire [guide devoted to references
and lifetimes](guide-lifetimes.html) that goes into lifetimes in
great detail, so if you want the full details, check that out.

## Best practices

In general, prefer stack allocation over heap allocation. Using references to
stack allocated information is preferred whenever possible. Therefore,
references are the default pointer type you should use, unless you have a
specific reason to use a different type. Each of the other pointer types has
its own best practices section explaining when it is appropriate to use.

Use references when you want to use a pointer, but do not want to take ownership.
References just borrow ownership, which is more polite if you don't need the
ownership. In other words, prefer:

```{rust}
fn succ(x: &int) -> int { *x + 1 }
```

to

```{rust}
fn succ(x: Box<int>) -> int { *x + 1 }
```

As a corollary to that rule, references allow you to accept a wide variety of
other pointers, and so are useful so that you don't have to write a number
of variants per pointer. In other words, prefer:

```{rust}
fn succ(x: &int) -> int { *x + 1 }
```

to

```{rust}
fn box_succ(x: Box<int>) -> int { *x + 1 }

fn rc_succ(x: std::rc::Rc<int>) -> int { *x + 1 }
```
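
To illustrate, the reference-taking `succ` can be called with a stack value, a
box, and an `Rc` alike. This is a sketch in current Rust syntax, where `i32`
and `Box::new` stand in for the guide's `int` and `box`; the variable names are
made up.

```rust
use std::rc::Rc;

fn succ(x: &i32) -> i32 {
    *x + 1
}

fn main() {
    let on_stack = 5;
    let boxed = Box::new(5);
    let counted = Rc::new(5);

    // One function serves all three: each call borrows the `i32` inside.
    println!("{}", succ(&on_stack));
    println!("{}", succ(&*boxed));   // dereference the box, then borrow
    println!("{}", succ(&*counted)); // dereference the `Rc`, then borrow
}
```

None of the calls takes ownership, so `boxed` and `counted` remain usable
afterwards.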

# Boxes

`Box<T>` is Rust's 'boxed pointer' type. Boxes provide the simplest form of
heap allocation in Rust. Creating a box looks like this:

```{rust}
let x = box(std::boxed::HEAP) 5i;
```

`box` is a keyword that does 'placement new,' which we'll talk about in a bit.
`box` will be useful for creating a number of heap-allocated types, but is not
quite finished yet. In the meantime, `box`'s type defaults to
`std::boxed::HEAP`, and so you can leave it off:

```{rust}
let x = box 5i;
```

As you might assume from the `HEAP`, boxes are heap allocated. They are
deallocated automatically by Rust when they go out of scope:

```{rust}
{
    let x = box 5i;

    // stuff happens

} // x is destructed and its memory is freed here
```

However, boxes do _not_ use reference counting or garbage collection. Boxes are
what's called an **affine type**. This means that the Rust compiler, at compile
time, determines when the box comes into and goes out of scope, and inserts the
appropriate calls there. Furthermore, boxes are a specific kind of affine type,
known as a **region**. You can read more about regions [in this paper on the
Cyclone programming
language](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf).

You don't need to fully grok the theory of affine types or regions to grok
boxes, though. As a rough approximation, you can treat this Rust code:

```{rust}
{
    let x = box 5i;

    // stuff happens
}
```

as being similar to this C code:

```{notrust,ignore}
{
    int *x;
    x = (int *)malloc(sizeof(int));
    *x = 5;

    // stuff happens

    free(x);
}
```

Of course, this is a 10,000-foot view. It leaves out destructors, for example.
But the general idea is correct: you get the semantics of `malloc`/`free`, but
with some improvements:

1. It's impossible to allocate the incorrect amount of memory, because Rust
   figures it out from the types.
2. You cannot forget to `free` memory you've allocated, because Rust does it
   for you.
3. Rust ensures that this `free` happens at the right time, when the memory is
   truly no longer used. Use-after-free is not possible.
4. Rust enforces that no other writeable pointers alias to this heap memory,
   which means writing to an invalid pointer is not possible.

See the section on references or the [lifetimes guide](guide-lifetimes.html)
for more detail on how lifetimes work.
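
One way to watch the automatic deallocation happen is to box a type whose
destructor records that it ran. The sketch below assumes current Rust syntax;
`Noisy`, its fields, and `scoped_box_demo` are hypothetical names, and the
shared log exists only for the demonstration.

```rust
use std::cell::RefCell;

// Hypothetical type whose destructor appends a line to a shared log.
struct Noisy<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<String>>,
}

impl<'a> Drop for Noisy<'a> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(format!("dropped {}", self.name));
    }
}

fn scoped_box_demo() -> Vec<String> {
    let log = RefCell::new(Vec::new());
    {
        let _x = Box::new(Noisy { name: "x", log: &log });

        // stuff happens
        log.borrow_mut().push(String::from("stuff happens"));
    } // `_x` goes out of scope: its destructor runs and the box is freed

    log.borrow_mut().push(String::from("after the scope"));
    log.into_inner()
}

fn main() {
    for line in scoped_box_demo() {
        println!("{}", line);
    }
}
```

The "dropped" entry lands between the other two, at the closing brace of the
inner scope, without any explicit call to free the box.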

Using boxes and references together is very common. For example:

```{rust}
fn add_one(x: &int) -> int {
    *x + 1
}

fn main() {
    let x = box 5i;

    println!("{}", add_one(&*x));
}
```

In this case, Rust knows that `x` is being 'borrowed' by the `add_one()`
function, and since it's only reading the value, allows it.

We can borrow `x` multiple times, as long as it's not simultaneous:

```{rust}
fn add_one(x: &int) -> int {
    *x + 1
}

fn main() {
    let x = box 5i;

    println!("{}", add_one(&*x));
    println!("{}", add_one(&*x));
    println!("{}", add_one(&*x));
}
```

Or as long as it's not a mutable borrow. This will error:

```{rust,ignore}
fn add_one(x: &mut int) -> int {
    *x + 1
}

fn main() {
    let x = box 5i;

    println!("{}", add_one(&*x)); // error: cannot borrow immutable dereference
                                  // of `&`-pointer as mutable
}
```

Notice we changed the signature of `add_one()` to request a mutable reference.

## Best practices

Boxes are appropriate to use in two situations: recursive data structures
and, occasionally, returning data.

### Recursive data structures

Sometimes, you need a recursive data structure. The simplest is known as a
'cons list':

```{rust}
#[deriving(Show)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

fn main() {
    let list: List<int> = Cons(1, box Cons(2, box Cons(3, box Nil)));
    println!("{}", list);
}
```

This prints:

```{notrust,ignore}
Cons(1, box Cons(2, box Cons(3, box Nil)))
```

The reference to another `List` inside of the `Cons` enum variant must be a box,
because we don't know the length of the list. Because we don't know the length,
we don't know the size, and therefore, we need to heap allocate our list.

Working with recursive or other unknown-sized data structures is the primary
use-case for boxes.
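
In current Rust the same list might look like the sketch below. It assumes
`Box::new` in place of `box`, `#[derive(Debug)]` in place of
`#[deriving(Show)]`, and path-qualified variants; the `sum` function is a
made-up addition that shows how the boxes are traversed.

```rust
#[derive(Debug)]
enum List<T> {
    Cons(T, Box<List<T>>),
    Nil,
}

// Hypothetical helper: walks the list recursively, following each box.
fn sum(list: &List<i32>) -> i32 {
    match list {
        List::Cons(value, rest) => *value + sum(rest),
        List::Nil => 0,
    }
}

fn main() {
    let list = List::Cons(
        1,
        Box::new(List::Cons(2, Box::new(List::Cons(3, Box::new(List::Nil))))),
    );

    println!("{:?}", list);
    println!("{}", sum(&list));
}
```

Each `Cons` stores only a pointer to the rest of the list, so the size of
`List<T>` is known at compile time even though the list's length is not.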

### Returning data

This is important enough to have its own section entirely. The TL;DR is this:
you don't generally want to return pointers, even when you might in a language
like C or C++.

See [Returning Pointers](#returning-pointers) below for more.

# Rc and Arc

This part is coming soon.

## Best practices

This part is coming soon.

# Raw Pointers

This part is coming soon.

## Best practices

This part is coming soon.

# Returning Pointers

In many languages with pointers, you'd return a pointer from a function
so as to avoid copying a large data structure. For example:

```{rust}
struct BigStruct {
    one: int,
    two: int,
    // etc
    one_hundred: int,
}

fn foo(x: Box<BigStruct>) -> Box<BigStruct> {
    return box *x;
}

fn main() {
    let x = box BigStruct {
        one: 1,
        two: 2,
        one_hundred: 100,
    };

    let y = foo(x);
}
```

The idea is that by passing around a box, you're only copying a pointer, rather
than the hundred `int`s that make up the `BigStruct`.

This is an antipattern in Rust. Instead, write this:

```{rust}
struct BigStruct {
    one: int,
    two: int,
    // etc
    one_hundred: int,
}

fn foo(x: Box<BigStruct>) -> BigStruct {
    return *x;
}

fn main() {
    let x = box BigStruct {
        one: 1,
        two: 2,
        one_hundred: 100,
    };

    let y = box foo(x);
}
```

This gives you flexibility without sacrificing performance.

You may think that this gives us terrible performance: return a value and then
immediately box it up?! Isn't that the worst of both worlds? Rust is smarter
than that. There is no copy in this code. `main` allocates enough room for the
`box`, passes a pointer to that memory into `foo` as `x`, and then `foo` writes
the value straight into that pointer. This writes the return value directly into
the allocated box.

This is important enough that it bears repeating: pointers are not for
optimizing returning values from your code. Allow the caller to choose how they
want to use your output.
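
A sketch of the caller-chooses idea in current Rust syntax. `make_big` and
`total` are hypothetical names, and the struct is cut down to three fields.

```rust
struct BigStruct {
    one: i32,
    two: i32,
    one_hundred: i32,
}

// Hypothetical constructor: returns the struct by value, not in a box.
fn make_big() -> BigStruct {
    BigStruct { one: 1, two: 2, one_hundred: 100 }
}

// Hypothetical reader: borrows the struct wherever it happens to live.
fn total(b: &BigStruct) -> i32 {
    b.one + b.two + b.one_hundred
}

fn main() {
    // The caller decides where the value ends up.
    let on_stack = make_big();
    let on_heap = Box::new(make_big());

    println!("{}", total(&on_stack));
    println!("{}", total(&on_heap));
}
```

Because `make_big` returns a plain `BigStruct`, the same function serves
callers that want the value on the stack and callers that want it boxed.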

# Creating your own Pointers

This part is coming soon.

## Best practices

This part is coming soon.

# Patterns and `ref`

When you're trying to match something that's stored in a pointer, there may be
a situation where matching directly isn't the best option available. Let's see
how to properly handle this:

```{rust,ignore}
fn possibly_print(x: &Option<String>) {
    match *x {
        // BAD: cannot move out of a `&`
        Some(s) => println!("{}", s),

        // GOOD: instead take a reference into the memory of the `Option`
        Some(ref s) => println!("{}", *s),
        None => {}
    }
}
```

The `ref s` here means that `s` will be of type `&String`, rather than type
`String`.

This is important when the type you're trying to get access to has a destructor
and you don't want to move it; you just want a reference to it.
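
Here is a compilable variant of the good arm, written as a function that
returns a `String` so the result is easy to check. The name `describe` is made
up, and the sketch assumes current Rust syntax.

```rust
// Hypothetical example: inspect an `Option<String>` behind a reference
// without moving the `String` out of it.
fn describe(x: &Option<String>) -> String {
    match *x {
        // `ref s` borrows the `String` in place; `s` has type `&String`.
        Some(ref s) => format!("got {}", s),
        None => String::from("nothing"),
    }
}

fn main() {
    let name = Some(String::from("Ferris"));

    println!("{}", describe(&name));
    println!("{}", describe(&None));

    // `name` was only borrowed, so it is still usable here.
    println!("{:?}", name);
}
```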

# Cheat Sheet

Here's a quick rundown of Rust's pointer types:

| Type         | Name                | Summary                                                             |
|--------------|---------------------|---------------------------------------------------------------------|
| `&T`         | Reference           | Allows one or more references to read `T`                           |
| `&mut T`     | Mutable Reference   | Allows a single reference to read and write `T`                     |
| `Box<T>`     | Box                 | Heap allocated `T` with a single owner that may read and write `T`. |
| `Rc<T>`      | "arr cee" pointer   | Heap allocated `T` with many readers                                |
| `Arc<T>`     | Arc pointer         | Same as above, but safe sharing across threads                      |
| `*const T`   | Raw pointer         | Unsafe read access to `T`                                           |
| `*mut T`     | Mutable raw pointer | Unsafe read and write access to `T`                                 |

# Related resources

* [API documentation for Box](std/boxed/index.html)
* [Lifetimes guide](guide-lifetimes.html)
* [Cyclone paper on regions](http://www.cs.umd.edu/projects/cyclone/papers/cyclone-regions.pdf), which inspired Rust's lifetime system