309 lines
13 KiB
Rust
// Copyright 2013 The Rust Project Developers. See the COPYRIGHT
// file at the top-level directory of this distribution and at
// http://rust-lang.org/COPYRIGHT.
//
// Licensed under the Apache License, Version 2.0 <LICENSE-APACHE or
// http://www.apache.org/licenses/LICENSE-2.0> or the MIT license
// <LICENSE-MIT or http://opensource.org/licenses/MIT>, at your
// option. This file may not be copied, modified, or distributed
// except according to those terms.

/*!

Task death: asynchronous killing, linked failure, exit code propagation.

This file implements two orthogonal building-blocks for communicating failure
between tasks. One is 'linked failure' or 'task killing', that is, a failing
task causing other tasks to fail promptly (even those that are blocked on
pipes or I/O). The other is 'exit code propagation', which affects the result
observed by the parent of a task::try task that itself spawns child tasks
(such as any #[test] function). In both cases the data structures live in
KillHandle.


I. Task killing.

The model for killing involves two atomic flags, the "kill flag" and the
"unkillable flag". Operations on the kill flag include:

- In the taskgroup code (task/spawn.rs), tasks store a clone of their
  KillHandle in their shared taskgroup. Another task in the group that fails
  will use that handle to call kill().
- When a task blocks, it turns its ~Task into a BlockedTask by storing the
  transmuted ~Task pointer inside the KillHandle's kill flag. A task
  trying to block and a task trying to kill it can simultaneously access the
  kill flag, after which the task will get scheduled and fail (no matter who
  wins the race). Likewise, a task trying to wake a blocked task normally and
  a task trying to kill it can simultaneously access the flag; only one will
  get the task to reschedule it.

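The races on the kill flag can be sketched in present-day Rust. This is a
minimal illustration, not this module's code: the constants RUNNING and KILLED,
the function names, and the use of compare_exchange for blocking are
assumptions made for the example, and a plain usize stands in for the
transmuted ~Task pointer.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical encoding for this sketch: two reserved values; anything else
// is treated as the address of a blocked task.
const RUNNING: usize = 0;
const KILLED: usize = 1;

/// The blocking task tries to park its own pointer in the flag.
/// Returns false if a kill arrived first, in which case the task must fail
/// instead of blocking.
fn try_block(flag: &AtomicUsize, task_ptr: usize) -> bool {
    flag.compare_exchange(RUNNING, task_ptr, Ordering::SeqCst, Ordering::SeqCst)
        .is_ok()
}

/// A killer marks the flag. If it finds a parked pointer, it now owns the
/// task and is responsible for rescheduling it so that it can fail.
fn kill(flag: &AtomicUsize) -> Option<usize> {
    match flag.swap(KILLED, Ordering::SeqCst) {
        RUNNING | KILLED => None,
        task_ptr => Some(task_ptr),
    }
}

/// A normal waker (for example a sender on a channel) takes the pointer if
/// it is still there.
fn wake(flag: &AtomicUsize) -> Option<usize> {
    match flag.swap(RUNNING, Ordering::SeqCst) {
        RUNNING | KILLED => None,
        task_ptr => Some(task_ptr),
    }
}
```

Whichever of kill() and wake() swaps first receives the pointer; the other
sees a reserved value and backs off, so the task is rescheduled exactly once.
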
Operations on the unkillable flag include:

- When a task becomes unkillable, it swaps on the flag to forbid any killer
  from waking it up while it's blocked inside the unkillable section. If a
  kill was already pending, the task fails instead of becoming unkillable.
- When a task is done being unkillable, it restores the flag to the normal
  running state. If a kill was received-but-blocked during the unkillable
  section, the task fails at this later point.
- When a task tries to kill another task, before swapping on the kill flag, it
  first swaps on the unkillable flag, to see if it's "allowed" to wake up the
  task. If it isn't, the killed task will receive the signal when it becomes
  killable again. (Of course, a task trying to wake the task normally (e.g.
  sending on a channel) does not access the unkillable flag at all.)

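The three operations on the unkillable flag can be sketched the same way in
present-day Rust. All names and the three state constants here are
hypothetical, and the sketch uses compare_exchange where the text says
"swaps", so treat it as an illustration of the state machine rather than as
the implementation.

```rust
use std::sync::atomic::{AtomicUsize, Ordering::SeqCst};

// Hypothetical states for the unkillable flag in this sketch.
const RUNNING: usize = 0;    // killable, no kill pending
const UNKILLABLE: usize = 1; // inside an unkillable section
const KILLED: usize = 2;     // a kill has been delivered

/// Enter an unkillable section. An Err means the task should fail instead.
fn inhibit_kill(unkillable: &AtomicUsize) -> Result<(), &'static str> {
    match unkillable.compare_exchange(RUNNING, UNKILLABLE, SeqCst, SeqCst) {
        Ok(_) => Ok(()),
        Err(KILLED) => Err("killed before entering the unkillable section"),
        Err(_) => Err("already unkillable"),
    }
}

/// Leave an unkillable section. A kill that arrived in the meantime is
/// reported here, at this later point.
fn allow_kill(unkillable: &AtomicUsize) -> Result<(), &'static str> {
    match unkillable.compare_exchange(UNKILLABLE, RUNNING, SeqCst, SeqCst) {
        Ok(_) => Ok(()),
        Err(KILLED) => Err("killed during the unkillable section"),
        Err(_) => Err("not in an unkillable section"),
    }
}

/// First step of a kill. Returns true only if the killer is allowed to go on
/// to the kill flag and wake the target.
fn kill_may_wake(unkillable: &AtomicUsize) -> bool {
    unkillable.swap(KILLED, SeqCst) == RUNNING
}
```

A kill that lands inside the section leaves KILLED in the flag, so the later
allow_kill() fails its compare_exchange and reports the deferred kill.
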
Why do we not need acquire/release barriers on any of the kill flag swaps?
This is because barriers establish orderings between accesses on different
memory locations, but each kill-related operation is only a swap on a single
location, so atomicity is all that matters. The exception is kill(), which
does a swap on both flags in sequence. kill() needs no barriers because it
does not matter if its two accesses are seen reordered on another CPU: if a
killer does perform both writes, it means it saw a KILL_RUNNING in the
unkillable flag, which means an unkillable task will see KILL_KILLED and fail
immediately (rendering the subsequent write to the kill flag unnecessary).


II. Exit code propagation.

The basic model for exit code propagation, which is used with the "watched"
spawn mode (on by default for linked spawns, off for supervised and unlinked
spawns), is that a parent will wait for all its watched children to exit
before reporting whether it succeeded or failed. A watching parent will only
report success if it succeeded and all its children also reported success;
otherwise, it will report failure. This is most useful for writing test cases:

```
#[test]
fn test_something_in_another_task() {
    do spawn {
        assert!(collatz_conjecture_is_false());
    }
}
```

Here, as the child task will certainly outlive the parent task, we might miss
the failure of the child when deciding whether or not the test case passed.
The watched spawn mode avoids this problem.

In order to propagate exit codes from children to their parents, any
'watching' parent must wait for all of its children to exit before it can
report its final exit status. We achieve this by using an UnsafeArc, using the
reference counting to track how many children are still alive, and using the
unwrap() operation in the parent's exit path to wait for all children to exit.
The UnsafeArc referred to here is actually the KillHandle itself.

This also works transitively: if a "middle" watched child task is itself
watching a grandchild task, the "middle" task will do unwrap() on its own
KillHandle (thereby waiting for the grandchild to exit) before dropping its
reference to its watching parent (which will alert the parent).

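The idea of waiting on a reference count can be illustrated loosely in
present-day Rust. std's Arc has no blocking unwrap(), so this sketch
substitutes an mpsc channel: each child holds a clone of the Sender, and the
parent's wait ends only once every clone has been dropped. The function name
and the use of a channel are assumptions of the example, not this module's
mechanism.

```rust
use std::sync::mpsc;
use std::thread;

/// Spawns `n` children and blocks until all of them have dropped their
/// handle. Returns how many exit notices were received.
fn wait_for_children(n: usize) -> usize {
    let (handle, exits) = mpsc::channel::<usize>();
    for id in 0..n {
        // Each clone plays the role of one reference held by a live child.
        let child_handle = handle.clone();
        thread::spawn(move || {
            // ... the child's work would go here ...
            let _ = child_handle.send(id);
            // child_handle is dropped when the child exits.
        });
    }
    // The parent gives up its own reference before waiting.
    drop(handle);
    // The iterator ends only after every Sender clone has been dropped,
    // that is, after every child has exited.
    exits.iter().count()
}
```
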
While UnsafeArc::unwrap() accomplishes the synchronization, there remains the
matter of reporting the exit codes themselves. This is easiest when an exiting
watched task has no watched children of its own:

- If the task with no watched children exits successfully, it need do nothing.
- If the task with no watched children has failed, it sets a flag in the
  parent's KillHandle ("any_child_failed") to true. It then stays true forever.

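A minimal present-day sketch of this leaf case follows. It assumes, as the
flag's name suggests, that a failing child raises any_child_failed and that
nothing ever lowers it again. Threads stand in for tasks, and the function
name report_status is hypothetical.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;

/// Runs one child per entry of `child_results` (true = the child succeeds)
/// and returns the status a watching parent would report.
fn report_status(parent_succeeded: bool, child_results: &[bool]) -> bool {
    let any_child_failed = Arc::new(AtomicBool::new(false));
    let children: Vec<_> = child_results
        .iter()
        .map(|&ok| {
            let flag = Arc::clone(&any_child_failed);
            thread::spawn(move || {
                // A successful child does nothing; a failed one raises the
                // flag, which is never cleared afterwards.
                if !ok {
                    flag.store(true, Ordering::SeqCst);
                }
            })
        })
        .collect();
    // The parent waits for every watched child before deciding.
    for child in children {
        child.join().unwrap();
    }
    parent_succeeded && !any_child_failed.load(Ordering::SeqCst)
}
```
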
However, if a "middle" watched task with watched children of its own exits
before its child exits, we need to ensure that the grandparent task may still
see a failure from the grandchild task. While we could achieve this by having
each intermediate task block on its handle, this keeps around the other resources
the task was using. To be more efficient, this is accomplished via "tombstones".

A tombstone is a closure, proc() -> bool, which will perform any waiting necessary
to collect the exit code of descendant tasks. In its environment is captured
the KillHandle of whichever task created the tombstone, and perhaps also any
tombstones that that task itself had, and finally also another tombstone,
effectively creating a lazy-list of heap closures.

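The shape of such a lazy list can be sketched with boxed closures in
present-day Rust. The Tombstone alias and make_tombstone are hypothetical, a
plain bool stands in for the captured KillHandle, and the convention that true
means "everything below succeeded" is an assumption of the example.

```rust
/// Hypothetical stand-in for the tombstone type described above.
type Tombstone = Box<dyn FnOnce() -> bool + Send>;

/// Builds a tombstone for a task whose own children reported
/// `own_children_ok`. It captures the tombstones the task had inherited and
/// the next tombstone in the parent's list, and pulls on both when called.
fn make_tombstone(
    own_children_ok: bool,
    inherited: Option<Tombstone>,
    next: Option<Tombstone>,
) -> Tombstone {
    Box::new(move || {
        // Evaluate both links unconditionally: collecting a tombstone also
        // stands for waiting on the tasks behind it.
        let inherited_ok = inherited.map_or(true, |t| t());
        let next_ok = next.map_or(true, |t| t());
        own_children_ok && inherited_ok && next_ok
    })
}
```

Nothing runs until the outermost closure is called, which is what makes the
list lazy; each call recurses into the closures it captured.
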
When a child wishes to exit early and leave tombstones behind for its parent,
it must use a LittleLock (pthread mutex) to synchronize with any possible
sibling tasks which are trying to do the same thing with the same parent.
However, on the other side, when the parent is ready to pull on the tombstones,
it need not use this lock, because the unwrap() serves as a barrier that ensures
no children will remain with references to the handle.

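The asymmetry between the two sides has a close analogue in present-day Rust,
sketched here under assumptions: threads stand in for tasks, a Mutex for the
LittleLock, and joining followed by Arc::try_unwrap for the blocking unwrap().
The function name is hypothetical.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Each of `n` children leaves a record behind for the parent, which then
/// collects all records without taking the lock.
fn collect_from_children(n: usize) -> Vec<usize> {
    let shared = Arc::new(Mutex::new(Vec::new()));
    let children: Vec<_> = (0..n)
        .map(|id| {
            let handle = Arc::clone(&shared);
            thread::spawn(move || {
                // Siblings may be doing the same thing concurrently, so the
                // child side must lock.
                handle.lock().unwrap().push(id);
            })
        })
        .collect();
    for child in children {
        child.join().unwrap();
    }
    // Every child has exited and dropped its reference, so the parent holds
    // the only one and can take the contents without locking.
    let mutex = Arc::try_unwrap(shared).expect("children still hold references");
    let mut ids = mutex.into_inner().unwrap();
    ids.sort();
    ids
}
```
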
The main logic for creating and assigning tombstones can be found in the
function reparent_children_to() in the impl for KillHandle.


IIA. Issues with exit code propagation.

There are two known issues with the current scheme for exit code propagation.

- As documented in issue #8136, the structure makes stack overflow possible
  when collecting tombstones that are very deeply nested. This cannot
  be avoided with the closure representation, as tombstones end up structured in
  a sort of tree. However, notably, the tombstones do not actually need to be
  collected in any particular order, and so a doubly-linked list may be used.
  However, we do not do this yet because DList is in libextra.

- A discussion with Graydon made me realize that if we decoupled the exit code
  propagation from the parents-waiting action, this could result in a simpler
  implementation as the exit codes themselves would not have to be propagated,
  and could instead be propagated implicitly through the taskgroup mechanism
  that we already have. The tombstoning scheme would still be required. I have
  not implemented this because currently we can't receive a linked failure kill
  signal during the task cleanup activity, as that is currently "unkillable",
  and occurs outside the task's unwinder's "try" block, so would require some
  restructuring.

*/

use cast;
use cell::Cell;
use option::{Option, Some, None};
use prelude::*;
use rt::task::Task;
use rt::task::UnwindResult;
use unstable::atomics::{AtomicUint, SeqCst};
use unstable::sync::UnsafeArc;

/// A handle to a blocked task. Usually this means having the ~Task pointer by
/// ownership, but if the task is killable, a killer can steal it at any time.
pub enum BlockedTask {
    Owned(~Task),
    Shared(UnsafeArc<AtomicUint>),
}

/// Per-task state related to task death, killing, failure, etc.
pub struct Death {
    // Action to be done with the exit code. If set, also makes the task wait
    // until all its watched children exit before collecting the status.
    on_exit: Option<proc(UnwindResult)>,
    // nesting level counter for unstable::atomically calls (0 == can deschedule).
    priv wont_sleep: int,
}

impl BlockedTask {
    /// Returns Some if the task was successfully woken; None if already killed.
    pub fn wake(self) -> Option<~Task> {
        match self {
            Owned(task) => Some(task),
            Shared(arc) => unsafe {
                match (*arc.get()).swap(0, SeqCst) {
                    // Zero means another handle already took the task pointer.
                    0 => None,
                    // Nonzero is the transmuted ~Task pointer; take ownership.
                    n => cast::transmute(n),
                }
            }
        }
    }

    /// Create a blocked task, unless the task was already killed.
    pub fn block(task: ~Task) -> BlockedTask {
        Owned(task)
    }

    /// Converts one blocked task handle to a list of many handles to the same.
    pub fn make_selectable(self, num_handles: uint) -> ~[BlockedTask] {
        let handles = match self {
            Owned(task) => {
                let flag = unsafe {
                    AtomicUint::new(cast::transmute(task))
                };
                UnsafeArc::newN(flag, num_handles)
            }
            Shared(arc) => arc.cloneN(num_handles),
        };
        // Even if the task was unkillable before, we use 'Shared' because
        // multiple pipes will have handles. It does not really mean killable.
        handles.move_iter().map(|x| Shared(x)).collect()
    }

    // This assertion has two flavours because the wake involves an atomic op.
    // In the faster version, destructors will fail dramatically instead.
    #[inline] #[cfg(not(test))]
    pub fn assert_already_awake(self) { }
    #[inline] #[cfg(test)]
    pub fn assert_already_awake(self) { assert!(self.wake().is_none()); }

    /// Convert to an unsafe uint value. Useful for storing in a pipe's state flag.
    #[inline]
    pub unsafe fn cast_to_uint(self) -> uint {
        match self {
            Owned(task) => {
                let blocked_task_ptr: uint = cast::transmute(task);
                // The low bit must be clear so that it can serve as the variant tag.
                rtassert!(blocked_task_ptr & 0x1 == 0);
                blocked_task_ptr
            }
            Shared(arc) => {
                // Box the arc so that it, too, is represented by a single pointer.
                let blocked_task_ptr: uint = cast::transmute(~arc);
                rtassert!(blocked_task_ptr & 0x1 == 0);
                // Set the low bit to mark this value as a Shared handle.
                blocked_task_ptr | 0x1
            }
        }
    }

    /// Convert from an unsafe uint value. Useful for retrieving a pipe's state flag.
    #[inline]
    pub unsafe fn cast_from_uint(blocked_task_ptr: uint) -> BlockedTask {
        // The low bit is the tag written by cast_to_uint: 0 for Owned, 1 for Shared.
        if blocked_task_ptr & 0x1 == 0 {
            Owned(cast::transmute(blocked_task_ptr))
        } else {
            let ptr: ~UnsafeArc<AtomicUint> = cast::transmute(blocked_task_ptr & !1);
            Shared(*ptr)
        }
    }
}

impl Death {
    pub fn new() -> Death {
        Death {
            on_exit: None,
            wont_sleep: 0,
        }
    }

    /// Collect failure exit codes from children and propagate them to a parent.
    pub fn collect_failure(&mut self, result: UnwindResult) {
        let result = Cell::new(result);
        self.on_exit.take().map(|on_exit| on_exit(result.take()));
    }

    /// Enter a possibly-nested "atomic" section of code. Just for assertions.
    /// All calls must be paired with a subsequent call to allow_deschedule.
    #[inline]
    pub fn inhibit_deschedule(&mut self) {
        self.wont_sleep += 1;
    }

    /// Exit a possibly-nested "atomic" section of code. Just for assertions.
    /// All calls must be paired with a preceding call to inhibit_deschedule.
    #[inline]
    pub fn allow_deschedule(&mut self) {
        rtassert!(self.wont_sleep != 0);
        self.wont_sleep -= 1;
    }

    /// Ensure that the task is allowed to become descheduled.
    #[inline]
    pub fn assert_may_sleep(&self) {
        if self.wont_sleep != 0 {
            rtabort!("illegal atomic-sleep: attempt to reschedule while \
                      using an Exclusive or LittleLock");
        }
    }
}

impl Drop for Death {
    fn drop(&mut self) {
        // Mustn't be in an atomic or unkillable section at task death.
        rtassert!(self.wont_sleep == 0);
    }
}

#[cfg(test)]
mod test {
    use rt::test::*;
    use super::*;

    // Task blocking tests

    #[test]
    fn block_and_wake() {
        do with_test_task |task| {
            BlockedTask::block(task).wake().unwrap()
        }
    }
}