docs: More task tutorial
parent
39f114d171
commit
5d8b694e14
@ -2,12 +2,16 @@
|
||||
|
||||
# Introduction
|
||||
|
||||
Rust supports concurrency and parallelism through lightweight tasks.
|
||||
Rust tasks are significantly cheaper to create than traditional
|
||||
threads, with a typical 32-bit system able to run hundreds of
|
||||
thousands simultaneously. Tasks in Rust are what are often referred to
|
||||
as _green threads_, cooperatively scheduled by the Rust runtime onto a
|
||||
small number of operating system threads.
|
||||
The Rust language is designed from the ground up to support pervasive
|
||||
and safe concurrency through lightweight, memory-isolated tasks and
|
||||
message passing.
|
||||
|
||||
Rust tasks are not the same as traditional threads: they are what are
often referred to as _green threads_, cooperatively scheduled by the
Rust runtime onto a small number of operating system threads. Because
tasks are significantly cheaper to create than traditional threads, Rust
can create hundreds of thousands of concurrent tasks on a typical 32-bit
system.
|
||||
|
||||
Tasks provide failure isolation and recovery. When an exception occurs
|
||||
in Rust code (either by calling `fail` explicitly or by otherwise performing
|
||||
@ -16,11 +20,11 @@ to `catch` an exception as in other languages. Instead tasks may monitor
|
||||
each other to detect when failure has occurred.
|
||||
|
||||
Rust tasks have dynamically sized stacks. When a task is first created
|
||||
it starts off with a small amount of stack (in the hundreds to
|
||||
low thousands of bytes, depending on plattform), and more stack is
|
||||
added as needed. A Rust task will never run off the end of the stack as
|
||||
is possible in many other languages, but they do have a stack budget,
|
||||
and if a Rust task exceeds its stack budget then it will fail safely.
|
||||
it starts off with a small amount of stack (currently in the low
|
||||
thousands of bytes, depending on platform) and more stack is acquired as
|
||||
needed. A Rust task will never run off the end of the stack as is
|
||||
possible in many other languages, but it does have a stack budget, and
|
||||
if a Rust task exceeds its stack budget then it will fail safely.
|
||||
|
||||
Tasks make use of Rust's type system to provide strong memory safety
|
||||
guarantees, disallowing shared mutable state. Communication between
|
||||
@ -32,12 +36,12 @@ explore some typical patterns in concurrent Rust code, and finally
|
||||
discuss some of the more exotic synchronization types in the standard
|
||||
library.
|
||||
|
||||
# A note about the libraries
|
||||
## A note about the libraries
|
||||
|
||||
While Rust's type system provides the building blocks needed for safe
|
||||
and efficient tasks, all of the task functionality itself is implemented
|
||||
in the core and standard libraries, which are still under development
|
||||
and do not always present a nice programming interface.
|
||||
and do not always present a consistent interface.
|
||||
|
||||
In particular, there are currently two independent modules that provide
|
||||
a message passing interface to Rust code: `core::comm` and `core::pipes`.
|
||||
@ -66,43 +70,96 @@ concurrency at the moment.
|
||||
[`std::arc`]: std/arc.html
|
||||
[`std::par`]: std/par.html
|
||||
|
||||
# Spawning a task
|
||||
# Basics
|
||||
|
||||
Spawning a task is done using the various spawn functions in the
|
||||
module `task`. Let's begin with the simplest one, `task::spawn()`:
|
||||
The programming interface for creating and managing tasks is contained
|
||||
in the `task` module of the `core` library, making it available to all
|
||||
Rust code by default. At its simplest, creating a task is a matter of
|
||||
calling the `spawn` function, passing a closure to run in the new
|
||||
task.
|
||||
|
||||
~~~~
|
||||
# use io::println;
|
||||
use task::spawn;
|
||||
use io::println;
|
||||
|
||||
let some_value = 22;
|
||||
// Print something profound in a different task using a named function
|
||||
fn print_message() { println("I am running in a different task!"); }
|
||||
spawn(print_message);
|
||||
|
||||
// Print something more profound in a different task using a lambda expression
|
||||
spawn( || println("I am also running in a different task!") );
|
||||
|
||||
// The canonical way to spawn is using `do` notation
|
||||
do spawn {
|
||||
println(~"This executes in the child task.");
|
||||
println(fmt!("%d", some_value));
|
||||
println("I too am running in a different task!");
|
||||
}
|
||||
~~~~
|
||||
|
||||
The argument to `task::spawn()` is a [unique
|
||||
closure](#unique-closures) of type `fn~()`, meaning that it takes no
|
||||
arguments and generates no return value. The effect of `task::spawn()`
|
||||
is to fire up a child task that will execute the closure in parallel
|
||||
with the creator.
|
||||
In Rust, there is nothing special about creating tasks - the language
|
||||
itself doesn't know what a 'task' is. Instead, Rust provides in the
|
||||
type system all the tools necessary to implement safe concurrency,
|
||||
_owned types_ in particular, and leaves the dirty work up to the
|
||||
core library.
|
||||
|
||||
# Communication
|
||||
The `spawn` function has a very simple type signature: `fn spawn(f:
|
||||
~fn())`. Because it accepts only owned closures, and owned closures
|
||||
contain only owned data, `spawn` can safely move the entire closure
|
||||
and all its associated state into an entirely different task for
|
||||
execution. Like any closure, the function passed to spawn may capture
|
||||
an environment that it carries across tasks.
|
||||
|
||||
Now that we have spawned a child task, it would be nice if we could
|
||||
communicate with it. This is done using *pipes*. Pipes are simply a
|
||||
pair of endpoints, with one for sending messages and another for
|
||||
receiving messages. The easiest way to create a pipe is to use
|
||||
`pipes::stream`. Imagine we wish to perform two expensive
|
||||
computations in parallel. We might write something like:
|
||||
~~~
|
||||
# use io::println;
|
||||
# use task::spawn;
|
||||
# fn generate_task_number() -> int { 0 }
|
||||
// Generate some state locally
|
||||
let child_task_number = generate_task_number();
|
||||
|
||||
do spawn {
|
||||
// Capture it in the remote task
|
||||
println(fmt!("I am child number %d", child_task_number));
|
||||
}
|
||||
~~~
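
For readers on a current toolchain, the same capture idea can be sketched in modern Rust. This is an assumption-laden analogue, not the `task` API described in this tutorial: it uses `std::thread::spawn` (operating system threads, not green threads), and `generate_task_number` and `child_message` are hypothetical helpers introduced only for illustration.

```rust
use std::thread;

// Hypothetical stand-in for the tutorial's `generate_task_number`.
fn generate_task_number() -> i32 {
    7
}

// Runs a closure in a child thread and returns the message it builds.
fn child_message(child_task_number: i32) -> String {
    // `move` makes the closure own everything it captures. An `i32` is
    // `Copy`, so the child gets its own copy of the number.
    let handle = thread::spawn(move || format!("I am child number {}", child_task_number));

    // The closure's return value comes back through the join handle.
    handle.join().expect("child thread panicked")
}

fn main() {
    // Generate some state locally, then capture it in the child.
    let n = generate_task_number();
    println!("{}", child_message(n));
}
```

The `move` keyword plays roughly the role that owned closures play in the text: the closure carries its environment with it into the new thread.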
|
||||
|
||||
By default, tasks are multiplexed across the available cores and run in
parallel. On a multicore machine, the following code should therefore
interleave its output in a vaguely random order.
|
||||
|
||||
~~~
|
||||
# use io::print;
|
||||
# use task::spawn;
|
||||
|
||||
for int::range(0, 20) |child_task_number| {
|
||||
do spawn {
|
||||
print(fmt!("I am child number %d\n", child_task_number));
|
||||
}
|
||||
}
|
||||
~~~
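
A rough modern-Rust counterpart of the loop above, offered as a sketch rather than as the tutorial's API: it assumes `std::thread` (one operating system thread per child, so it would not scale to the task counts mentioned in the introduction), and `spawn_children` is a hypothetical name. The order of the printed lines is not deterministic.

```rust
use std::thread;

// Spawns `count` threads, each printing its own number, and returns the
// numbers handed back through the join handles, in spawn order.
fn spawn_children(count: i32) -> Vec<i32> {
    // Collecting eagerly starts every thread before any of them is joined.
    let handles: Vec<thread::JoinHandle<i32>> = (0..count)
        .map(|child_task_number| {
            thread::spawn(move || {
                println!("I am child number {}", child_task_number);
                child_task_number
            })
        })
        .collect();

    // Joining waits for each child in turn; only the printing is unordered.
    handles
        .into_iter()
        .map(|handle| handle.join().expect("child thread panicked"))
        .collect()
}

fn main() {
    let finished = spawn_children(20);
    println!("{} children finished", finished.len());
}
```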
|
||||
|
||||
## Communication
|
||||
|
||||
Now that we have spawned a new task, it would be nice if we could
|
||||
communicate with it. Recall that Rust does not have shared mutable
|
||||
state, so one task may not manipulate variables owned by another task.
|
||||
Instead we use *pipes*.
|
||||
|
||||
Pipes are simply a pair of endpoints: one for sending messages and
another for receiving them. Pipes are low-level communication building
blocks and so come in a variety of forms, each appropriate for a
different use case. Only a few varieties are commonly used, and we will
cover those presently.
|
||||
|
||||
The simplest way to create a pipe is to use the `pipes::stream`
|
||||
function to create a `(Chan, Port)` pair. In Rust parlance a 'channel'
|
||||
is a sending endpoint of a pipe, and a 'port' is the receiving
|
||||
endpoint. Consider the following example of performing two calculations
|
||||
concurrently.
|
||||
|
||||
~~~~
|
||||
use task::spawn;
|
||||
use pipes::{stream, Port, Chan};
|
||||
|
||||
let (chan, port) = stream();
|
||||
let (chan, port): (Chan<int>, Port<int>) = stream();
|
||||
|
||||
do spawn {
|
||||
let result = some_expensive_computation();
|
||||
@ -116,17 +173,19 @@ let result = port.recv();
|
||||
# fn some_other_expensive_computation() {}
|
||||
~~~~
|
||||
|
||||
Let's walk through this code line-by-line. The first line creates a
|
||||
stream for sending and receiving integers:
|
||||
Let's examine this example in detail. The `let` statement first creates a
|
||||
stream for sending and receiving integers (recall that `let` can be
|
||||
used for destructuring patterns, in this case separating a tuple into
|
||||
its component parts).
|
||||
|
||||
~~~~ {.ignore}
|
||||
# use pipes::stream;
|
||||
let (chan, port) = stream();
|
||||
~~~~
|
||||
# use pipes::{stream, Chan, Port};
|
||||
let (chan, port): (Chan<int>, Port<int>) = stream();
|
||||
~~~~
|
||||
|
||||
This port is where we will receive the message from the child task
|
||||
once it is complete. The channel will be used by the child to send a
|
||||
message to the port. The next statement actually spawns the child:
|
||||
The channel will be used by the child task to send data to the parent task,
|
||||
which will wait to receive the data on the port. The next statement
|
||||
spawns the child task.
|
||||
|
||||
~~~~
|
||||
# use task::{spawn};
|
||||
@ -140,14 +199,15 @@ do spawn {
|
||||
}
|
||||
~~~~
|
||||
|
||||
This child will perform the expensive computation send the result
|
||||
over the channel. (Under the hood, `chan` was captured by the
|
||||
closure that forms the body of the child task. This capture is
|
||||
allowed because channels are sendable.)
|
||||
Notice that `chan` was transferred to the child task implicitly by
|
||||
capturing it in the task closure. Both `Chan` and `Port` are sendable
|
||||
types and may be captured into tasks or otherwise transferred between
|
||||
them. In the example, the child task performs an expensive computation
and then sends the result over the captured channel.
|
||||
|
||||
Finally, the parent continues by performing
|
||||
some other expensive computation and then waiting for the child's result
|
||||
to arrive on the port:
|
||||
Finally, the parent continues by performing some other expensive
|
||||
computation and then waiting for the child's result to arrive on the
|
||||
port:
|
||||
|
||||
~~~~
|
||||
# use pipes::{stream, Port, Chan};
|
||||
@ -158,7 +218,73 @@ some_other_expensive_computation();
|
||||
let result = port.recv();
|
||||
~~~~
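
The whole walk-through can be approximated in modern Rust. This sketch assumes `std::sync::mpsc::channel` as the closest analogue of `pipes::stream`; it is not the `pipes` API itself. The two computation functions are hypothetical stand-ins, and `run` is a name chosen here for illustration.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical stand-ins for the tutorial's placeholder computations.
fn some_expensive_computation() -> i32 {
    42
}

fn some_other_expensive_computation() {}

// Spawns a child that sends one result, does other work, then receives it.
fn run() -> i32 {
    // `channel` returns the sending half first, like `(chan, port)` above.
    let (sender, receiver) = channel::<i32>();

    // The sending half is moved into the child by capturing it.
    thread::spawn(move || {
        let result = some_expensive_computation();
        sender.send(result).expect("receiver was dropped");
    });

    // The parent keeps working while the child computes.
    some_other_expensive_computation();

    // Blocks until the child's value arrives.
    receiver.recv().expect("sender was dropped")
}

fn main() {
    println!("result = {}", run());
}
```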
|
||||
|
||||
# Creating a task with a bi-directional communication path
|
||||
The `Port` and `Chan` pair created by `stream` enables efficient
|
||||
communication between a single sender and a single receiver, but
|
||||
multiple senders cannot use a single `Chan`, nor can multiple
|
||||
receivers use a single `Port`. What if our example needed to
|
||||
perform multiple computations across a number of tasks? In that
|
||||
case we might use a `SharedChan`, a type that allows a single
|
||||
`Chan` to be used by multiple senders.
|
||||
|
||||
~~~
|
||||
# use task::spawn;
|
||||
use pipes::{stream, SharedChan};
|
||||
|
||||
let (chan, port) = stream();
|
||||
let chan = SharedChan(move chan);
|
||||
|
||||
for uint::range(0, 3) |init_val| {
|
||||
// Create a new channel handle to distribute to the child task
|
||||
let child_chan = chan.clone();
|
||||
do spawn {
|
||||
child_chan.send(some_expensive_computation(init_val));
|
||||
}
|
||||
}
|
||||
|
||||
let result = port.recv() + port.recv() + port.recv();
|
||||
# fn some_expensive_computation(_i: uint) -> int { 42 }
|
||||
~~~
|
||||
|
||||
Here we transfer ownership of the channel into a new `SharedChan`
|
||||
value. Like `Chan`, `SharedChan` is a non-copyable, owned type
|
||||
(sometimes also referred to as an 'affine' or 'linear' type). Unlike
|
||||
`Chan` though, `SharedChan` may be duplicated with the `clone()`
|
||||
method. A cloned `SharedChan` produces a new handle to the same
|
||||
channel, allowing multiple tasks to send data to a single port.
|
||||
Between `spawn`, `stream` and `SharedChan` we have enough tools
|
||||
to implement many useful concurrency patterns.
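
As a point of comparison, here is how the many-senders pattern might look in modern Rust. It is only a sketch: it assumes that cloning a `std::sync::mpsc::Sender` is an acceptable analogue of `SharedChan::clone`, and both `some_expensive_computation` and `sum_from_children` are hypothetical functions defined for this example.

```rust
use std::sync::mpsc::channel;
use std::thread;

// Hypothetical computation: multiplies its input by ten.
fn some_expensive_computation(init_val: u32) -> i32 {
    (init_val as i32) * 10
}

// Three children share one channel; the parent sums their results.
fn sum_from_children() -> i32 {
    let (sender, receiver) = channel::<i32>();

    for init_val in 0..3u32 {
        // Each child gets its own handle to the same channel.
        let child_sender = sender.clone();
        thread::spawn(move || {
            child_sender
                .send(some_expensive_computation(init_val))
                .expect("receiver was dropped");
        });
    }

    // Results arrive in whatever order the children finish.
    let first = receiver.recv().expect("all senders were dropped");
    let second = receiver.recv().expect("all senders were dropped");
    let third = receiver.recv().expect("all senders were dropped");
    first + second + third
}

fn main() {
    println!("sum = {}", sum_from_children());
}
```

Because addition is commutative, the sum does not depend on the arrival order, which is why this pattern is safe to use without any further coordination.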
|
||||
|
||||
Note that the above `SharedChan` example is somewhat contrived since
|
||||
you could also simply use three `stream` pairs, but it serves to
|
||||
illustrate the point. For reference, written with multiple streams it
|
||||
might look like the example below.
|
||||
|
||||
~~~
|
||||
# use task::spawn;
|
||||
# use pipes::{stream, Port, Chan};
|
||||
|
||||
let ports = do vec::from_fn(3) |init_val| {
|
||||
let (chan, port) = stream();
|
||||
|
||||
do spawn {
|
||||
chan.send(some_expensive_computation(init_val));
|
||||
}
|
||||
|
||||
port
|
||||
};
|
||||
|
||||
// Wait on each port, accumulating the results
|
||||
let result = ports.foldl(0, |accum, port| *accum + port.recv() );
|
||||
# fn some_expensive_computation(_i: uint) -> int { 42 }
|
||||
~~~
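
The one-stream-per-child variant has a similar modern-Rust shape. Again this is a sketch under assumptions: `std::sync::mpsc` stands in for `pipes`, an iterator `fold` stands in for `foldl`, and `some_expensive_computation` and `sum_over_ports` are hypothetical names.

```rust
use std::sync::mpsc::{channel, Receiver};
use std::thread;

// Hypothetical computation: returns its input plus one.
fn some_expensive_computation(init_val: u32) -> i32 {
    (init_val as i32) + 1
}

// Gives every child a private channel and keeps the receiving halves.
fn sum_over_ports() -> i32 {
    let receivers: Vec<Receiver<i32>> = (0..3u32)
        .map(|init_val| {
            let (sender, receiver) = channel();
            thread::spawn(move || {
                sender
                    .send(some_expensive_computation(init_val))
                    .expect("receiver was dropped");
            });
            receiver
        })
        .collect();

    // Wait on each receiver in turn, accumulating the results.
    receivers
        .iter()
        .fold(0, |accum, receiver| accum + receiver.recv().expect("sender was dropped"))
}

fn main() {
    println!("sum = {}", sum_over_ports());
}
```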
|
||||
|
||||
# Unfinished notes
|
||||
|
||||
## Actor patterns
|
||||
|
||||
## Linearity, option dancing, owned closures
|
||||
|
||||
## Creating a task with a bi-directional communication path
|
||||
|
||||
A very common thing to do is to spawn a child task where the parent
|
||||
and child both need to exchange messages with each other. The
|
||||
@ -227,3 +353,4 @@ assert from_child.recv() == ~"0";
|
||||
|
||||
The parent task first calls `DuplexStream` to create a pair of bidirectional endpoints. It then uses `task::spawn` to create the child task, which captures one end of the communication channel. As a result, both parent
|
||||
and child can send and receive data to and from the other.
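
A bidirectional link of this kind can be approximated in modern Rust by pairing two one-way channels. The sketch below is not `DuplexStream`: the `Endpoint` struct and the `duplex` and `round_trip` functions are hypothetical, built on `std::sync::mpsc` purely to illustrate the shape of the pattern.

```rust
use std::sync::mpsc::{channel, Receiver, Sender};
use std::thread;

// Hypothetical endpoint: sends values of type `S`, receives values of type `R`.
struct Endpoint<S, R> {
    tx: Sender<S>,
    rx: Receiver<R>,
}

// Builds two connected endpoints from a pair of one-way channels.
fn duplex<A, B>() -> (Endpoint<A, B>, Endpoint<B, A>) {
    let (a_tx, a_rx) = channel::<A>();
    let (b_tx, b_rx) = channel::<B>();
    (
        Endpoint { tx: a_tx, rx: b_rx },
        Endpoint { tx: b_tx, rx: a_rx },
    )
}

// The parent sends a number; the child replies with its string form.
fn round_trip(value: u32) -> String {
    let (parent, child) = duplex::<u32, String>();

    // The child captures its end of the link.
    let worker = thread::spawn(move || {
        let n = child.rx.recv().expect("parent hung up");
        child.tx.send(n.to_string()).expect("parent hung up");
    });

    parent.tx.send(value).expect("child hung up");
    let reply = parent.rx.recv().expect("child hung up");
    worker.join().expect("child thread panicked");
    reply
}

fn main() {
    println!("child replied: {}", round_trip(0));
}
```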
|
||||
|
||||
|