% Concurrency

Concurrency and parallelism are incredibly important topics in computer
science, and are also a hot topic in industry today. Computers are gaining more
and more cores, yet many programmers aren't prepared to fully utilize them.

Rust's memory safety features also apply to its concurrency story. Even
concurrent Rust programs must be memory safe, having no data races. Rust's type
system is up to the task, and gives you powerful ways to reason about
concurrent code at compile time.

Before we talk about the concurrency features that come with Rust, it's important
to understand something: Rust is low-level enough that the vast majority of
these features are provided by the standard library, not by the language. This
means that if you don't like some aspect of the way Rust handles concurrency,
you can implement an alternative way of doing things.
[mio](https://github.com/carllerche/mio) is a real-world example of this
principle in action.

## Background: `Send` and `Sync`

Concurrency is difficult to reason about. In Rust, we have a strong, static
type system to help us reason about our code. As such, Rust gives us two traits
to help us make sense of code that can possibly be concurrent.

### `Send`

The first trait we're going to talk about is
[`Send`](../std/marker/trait.Send.html). When a type `T` implements `Send`, it
indicates to the compiler that ownership of a value of this type can be
transferred safely between threads.

This is important to enforce certain restrictions. For example, if we have a
channel connecting two threads, we would want to be able to send some data
down the channel and to the other thread. Therefore, we'd ensure that `Send` was
implemented for that type.

Conversely, if we were wrapping a library with FFI that isn't threadsafe, we
wouldn't want to implement `Send`, and so the compiler will help us enforce
that it can't leave the current thread.

### `Sync`

The second of these traits is called [`Sync`](../std/marker/trait.Sync.html).
When a type `T` implements `Sync`, it indicates to the compiler that something
of this type has no possibility of introducing memory unsafety when used from
multiple threads concurrently.

For example, sharing immutable data with an atomic reference count is
threadsafe. Rust provides a type like this, `Arc<T>`, and it implements `Sync`
(as long as `T` is itself `Send` and `Sync`), so it is safe to share between
threads.

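To make the two traits a little more concrete, here is a small sketch of how
they can be used as compile-time checks. The helpers `require_send`,
`require_sync`, and `marker_trait_demo` are hypothetical names made up for this
illustration; they are not part of the standard library.

```rust
use std::sync::Arc;

// Hypothetical helper: accepts only values whose type is `Send`.
fn require_send<T: Send>(value: T) -> T {
    value
}

// Hypothetical helper: accepts only references to `Sync` types.
fn require_sync<T: Sync>(value: &T) -> &T {
    value
}

fn marker_trait_demo() -> i32 {
    let shared = Arc::new(vec![1, 2, 3]);

    // `Arc<Vec<i32>>` is both `Send` and `Sync`, so both calls compile.
    let shared = require_send(shared);
    let total: i32 = require_sync(&shared).iter().sum();

    // require_send(std::rc::Rc::new(1)); // would not compile: `Rc<T>` is not `Send`
    total
}
```

Nothing here runs on another thread; the point is only that the trait bounds
are checked by the compiler, so a type that isn't `Send` or `Sync` is rejected
before the program ever runs.
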
These two traits allow you to use the type system to make strong guarantees
about the properties of your code under concurrency. Before we demonstrate
why, we need to learn how to create a concurrent Rust program in the first
place!

## Threads

Rust's standard library provides threads, which allow you to run Rust code in
parallel. Here's a basic example of using `std::thread`:

```rust
use std::thread;

fn main() {
    thread::spawn(|| {
        println!("Hello from a thread!");
    });
}
```

The `thread::spawn()` function accepts a closure, which is executed in a
new thread. It returns a handle to the thread, which can be used to
wait for the child thread to finish and extract its result:

```rust
use std::thread;

fn main() {
    let handle = thread::spawn(|| {
        "Hello from a thread!"
    });

    println!("{}", handle.join().unwrap());
}
```

Many languages have the ability to execute threads, but it's wildly unsafe.
There are entire books about how to prevent errors that occur from shared
mutable state. Rust helps out with its type system here as well, by preventing
data races at compile time. Let's talk about how you actually share things
between threads.

## Safe Shared Mutable State

Due to Rust's type system, we have a concept that sounds like a lie: "safe
shared mutable state." Many programmers agree that shared mutable state is
very, very bad.

Someone once said this:

> Shared mutable state is the root of all evil. Most languages attempt to deal
> with this problem through the 'mutable' part, but Rust deals with it by
> solving the 'shared' part.

The same [ownership system](ownership.html) that helps prevent using pointers
incorrectly also helps rule out data races, one of the worst kinds of
concurrency bugs.

As an example, here is a Rust program that would have a data race in many
languages. It will not compile:

```ignore
use std::thread;

fn main() {
    let mut data = vec![1, 2, 3];

    for i in 0..3 {
        thread::spawn(move || {
            data[i] += 1;
        });
    }

    thread::sleep(std::time::Duration::from_millis(50));
}
```

This gives us an error:

```text
8:17 error: capture of moved value: `data`
        data[i] += 1;
        ^~~~
```

Rust knows this wouldn't be safe! If we had a reference to `data` in each
thread, and each thread took ownership of the reference, we'd have three
owners!

So, we need some type that lets us have more than one reference to a value and
that we can share between threads; that is, it must implement `Sync`.

We'll use `Arc<T>`, Rust's standard atomic reference count type, which
wraps a value up with some extra runtime bookkeeping that allows us to
share the ownership of the value between multiple references at the same time.

The bookkeeping consists of a count of how many of these references exist to
the value, hence the reference count part of the name.

The Atomic part means `Arc<T>` can safely be accessed from multiple threads.
To do this, `Arc<T>` updates its internal count with atomic (indivisible)
operations, which can't have data races.

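The reference count can be observed directly. The sketch below is only an
illustration: `count_handles` is a hypothetical helper, and it uses
`Arc::strong_count` from the standard library to read the count before, during,
and after the lifetime of a second handle.

```rust
use std::sync::Arc;

// Returns the reference count before, during, and after a clone's lifetime.
fn count_handles() -> (usize, usize, usize) {
    let data = Arc::new(vec![1, 2, 3]);
    let before = Arc::strong_count(&data);

    let during = {
        // Cloning the `Arc` copies the handle, not the vector.
        let _second = Arc::clone(&data);
        Arc::strong_count(&data)
    }; // `_second` is dropped here, which decrements the count.

    let after = Arc::strong_count(&data);
    (before, during, after)
}
```

Calling `count_handles()` yields `(1, 2, 1)`: the count goes up when a handle is
cloned and back down when that handle is dropped.
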
Here's our example again, this time with `data` wrapped in an `Arc<T>`:

```ignore
use std::thread;
use std::sync::Arc;

fn main() {
    let mut data = Arc::new(vec![1, 2, 3]);

    for i in 0..3 {
        let data = data.clone();
        thread::spawn(move || {
            data[i] += 1;
        });
    }

    thread::sleep(std::time::Duration::from_millis(50));
}
```

We now call `clone()` on our `Arc<T>`, which increases the internal count.
This handle is then moved into the new thread.

And... still gives us an error.

```text
<anon>:11:24 error: cannot borrow immutable borrowed content as mutable
<anon>:11                    data[i] += 1;
                             ^~~~
```

`Arc<T>` assumes one more property about its contents to ensure that it is safe
to share across threads: it assumes its contents are `Sync`. This is true for
our value if it's immutable, but we want to be able to mutate it, so we need
something else to persuade the borrow checker we know what we're doing.

It looks like we need some type that allows us to safely mutate a shared value,
for example a type that can ensure only one thread at a time is able to
mutate the value inside it.

For that, we can use the `Mutex<T>` type!

Here's the working version:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    for i in 0..3 {
        let data = data.clone();
        thread::spawn(move || {
            let mut data = data.lock().unwrap();
            data[i] += 1;
        });
    }

    thread::sleep(std::time::Duration::from_millis(50));
}
```

If we'd tried to use `Mutex<T>` without wrapping it in an `Arc<T>`, we would have
seen another error like:

```text
error: the trait `core::marker::Send` is not implemented for the type `std::sync::mutex::MutexGuard<'_, collections::vec::Vec<u32>>` [E0277]
 thread::spawn(move || {
 ^~~~~~~~~~~~~
note: `std::sync::mutex::MutexGuard<'_, collections::vec::Vec<u32>>` cannot be sent between threads safely
 thread::spawn(move || {
 ^~~~~~~~~~~~~
```

You see, [`Mutex`](../std/sync/struct.Mutex.html) has a
[`lock`](../std/sync/struct.Mutex.html#method.lock)
method which has this signature:

```ignore
fn lock(&self) -> LockResult<MutexGuard<T>>
```

and because `Send` is not implemented for `MutexGuard<T>`, we couldn't have
transferred the guard across thread boundaries on its own.

Let's examine the body of the thread more closely:

```rust
# use std::sync::{Arc, Mutex};
# use std::thread;
# fn main() {
#     let data = Arc::new(Mutex::new(vec![1, 2, 3]));
#     for i in 0..3 {
#         let data = data.clone();
thread::spawn(move || {
    let mut data = data.lock().unwrap();
    data[i] += 1;
});
#     }
#     thread::sleep(std::time::Duration::from_millis(50));
# }
```

First, we call `lock()`, which acquires the mutex's lock. Because this may fail,
it returns a `Result<T, E>`, and because this is just an example, we `unwrap()`
it to get a reference to the data. Real code would have more robust error handling
here. We're then free to mutate it, since we have the lock.

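As a rough idea of what more robust handling could look like, here is a sketch
that recovers from a "poisoned" lock, which is what `lock()` reports when
another thread panicked while holding the mutex. The function names
`increment_all` and `poisoned_demo` are hypothetical, and whether the data is
still trustworthy after such a panic is an assumption that depends on your
program.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// Increments every element, recovering the data even if the lock is poisoned.
fn increment_all(data: &Mutex<Vec<i32>>) {
    let mut guard = match data.lock() {
        Ok(guard) => guard,
        // Assumption: the vector is still usable after a panic elsewhere.
        Err(poisoned) => poisoned.into_inner(),
    };
    for x in guard.iter_mut() {
        *x += 1;
    }
}

fn poisoned_demo() -> Vec<i32> {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));
    let worker = Arc::clone(&data);

    // This thread panics while holding the lock, which poisons the mutex.
    let _ = thread::spawn(move || {
        let _guard = worker.lock().unwrap();
        panic!("worker failed");
    })
    .join();

    increment_all(&data);
    let result = data.lock().unwrap_or_else(|e| e.into_inner()).clone();
    result
}
```

With a plain `unwrap()`, the call in `increment_all` would panic as well once
the mutex is poisoned; matching on the result lets the caller decide what to do
instead.
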
Lastly, while the threads are running, we wait on a short timer. But
this is not ideal: we may have picked a reasonable amount of time to
wait, but it's more likely we'll either be waiting longer than
necessary or not long enough, depending on just how much time the
threads actually take to finish computing when the program runs.

A more precise alternative to the timer would be to use one of the
mechanisms provided by the Rust standard library for synchronizing
threads with each other. Let's talk about one of them: channels.

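Before we get to channels, it's worth noting that the handle returned by
`thread::spawn()`, which we saw earlier, is another such mechanism. The
following is a minimal sketch of the mutex example with the timer replaced by
`join()`; the function name `increment_with_join` is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn increment_with_join() -> Vec<i32> {
    let data = Arc::new(Mutex::new(vec![1, 2, 3]));

    // Keep every handle so that we can wait on each thread later.
    let handles: Vec<_> = (0..3)
        .map(|i| {
            let data = Arc::clone(&data);
            thread::spawn(move || {
                let mut data = data.lock().unwrap();
                data[i] += 1;
            })
        })
        .collect();

    // `join()` blocks until the corresponding thread has finished.
    for handle in handles {
        handle.join().unwrap();
    }

    let result = data.lock().unwrap().clone();
    result
}
```

Because every thread has been joined before the vector is read, the result
doesn't depend on how long the threads happen to take.
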
## Channels

Here's a version of our code that uses channels for synchronization, rather
than waiting for a specific time:

```rust
use std::sync::{Arc, Mutex};
use std::thread;
use std::sync::mpsc;

fn main() {
    let data = Arc::new(Mutex::new(0));

    let (tx, rx) = mpsc::channel();

    for _ in 0..10 {
        let (data, tx) = (data.clone(), tx.clone());

        thread::spawn(move || {
            let mut data = data.lock().unwrap();
            *data += 1;

            // Signal the main thread that this thread is done.
            tx.send(()).unwrap();
        });
    }

    for _ in 0..10 {
        rx.recv().unwrap();
    }
}
```

We use the `mpsc::channel()` function to construct a new channel. We just `send`
a simple `()` down the channel, and then wait for ten of them to come back.

While this channel is just sending a generic signal, we can send any data that
is `Send` over the channel!

```rust
use std::thread;
use std::sync::mpsc;

fn main() {
    let (tx, rx) = mpsc::channel();

    for _ in 0..10 {
        let tx = tx.clone();

        thread::spawn(move || {
            let answer = 42u32;

            // `send` fails only if the receiver has been dropped; ignore that here.
            let _ = tx.send(answer);
        });
    }

    rx.recv().expect("Could not receive answer");
}
```

A `u32` is `Send`: it is a plain value with no references or shared state, so
ownership can move to another thread safely. So we create our threads, ask each
one to calculate the answer, and each then `send()`s us the answer over the
channel.

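Channels can also gather one result from every thread. The sketch below is an
illustration under a few assumptions: `collect_squares` is a hypothetical
helper, each worker sends exactly one value, and the results are sorted because
the order in which threads finish isn't fixed.

```rust
use std::sync::mpsc;
use std::thread;

// Spawns `n` threads, each sending the square of its id, and collects them all.
fn collect_squares(n: u32) -> Vec<u32> {
    let (tx, rx) = mpsc::channel();

    for id in 0..n {
        let tx = tx.clone();
        thread::spawn(move || {
            tx.send(id * id).unwrap();
        });
    }

    // Drop the original sender so the receiver's iterator ends once
    // every worker has finished and dropped its clone.
    drop(tx);

    let mut results: Vec<u32> = rx.iter().collect();
    results.sort();
    results
}
```

Here the channel does double duty: it carries the data, and the end of the
iterator tells us that every sender is gone, so no timer or counter is needed.
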
## Panics

A `panic!` will crash the currently executing thread. You can use Rust's
threads as a simple isolation mechanism:

```rust
use std::thread;

let result = thread::spawn(move || {
    panic!("oops!");
}).join();

assert!(result.is_err());
```

Joining the thread's handle gives us a `Result` back, which allows us to check
if the thread has panicked or not.

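The `Err` case carries the panic's payload, which we can inspect. The following
sketch assumes the payload is a string, which is the usual case for `panic!`
with a message; `panic_message` is a hypothetical helper, and other payload
types fall through to a placeholder.

```rust
use std::thread;

// Runs a closure that panics in its own thread and returns the panic message.
fn panic_message() -> String {
    let result = thread::spawn(|| {
        panic!("oops!");
    })
    .join();

    match result {
        Ok(()) => String::from("no panic"),
        Err(payload) => {
            // The payload is a `Box<dyn Any + Send>`, so we downcast it.
            if let Some(s) = payload.downcast_ref::<&str>() {
                s.to_string()
            } else if let Some(s) = payload.downcast_ref::<String>() {
                s.clone()
            } else {
                String::from("unknown panic payload")
            }
        }
    }
}
```

The calling thread keeps running after the child panics, so it can log the
message, retry the work, or shut down cleanly.
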