For libgreen, bookeeping should not be global but rather on a per-pool basis.
Inside libnative, it's known that there must be a global counter with a
mutex/cvar.
The benefit of taking this strategy is to remove this functionality from libstd
to allow fine-grained control of it through libnative/libgreen. Notably, helper
threads in libnative can manually decrement the global count so they don't count
towards the global count of threads. Also, the shutdown process of *all* sched
pools is now dependent on the number of tasks in the pool being 0 rather than
this only being a hardcoded solution for the initial sched pool in libgreen.
This involved adding a Local::try_take() method on the Local trait in order for
the channel wakeup to work inside of libgreen. The channel send was happening
from a SchedTask when there is no Task available in TLS, and now this is
possible to work (remote wakeups are always possible, just a little slower).
* vec::raw::to_ptr is gone
* Pausible => Pausable
* Removing @
* Calling the main task "<main>"
* Removing unused imports
* Removing unused mut
* Bringing some libextra tests up to date
* Allowing compiletest to work at stage0
* Fixing the bootstrap-from-c rmake tests
* assert => rtassert in a few cases
* printing to stderr instead of stdout in fail!()
There was a race in the code previously where schedulers would *immediately*
shut down after spawning the main task (because the global task count would
still be 0). This fixes the logic by blocking the sched pool task in receving on
a port instead of spawning a task into the pool to receive on a port.
The modifications necessary were to have a "simple task" running by the time the
code is executing, but this is a simple enough thing to implement and I forsee
this being necessary to have implemented in the future anyway.
Note that this removes a number of run-pass tests which are exercising behavior
of the old runtime. This functionality no longer exists and is thoroughly tested
inside of libgreen and libnative. There isn't really the notion of "starting the
runtime" any more. The major notion now is "bootstrapping the initial task".
The scheduler pool now has a much more simplified interface. There is now a
clear distinction between creating the pool and then interacting the pool. When
a pool is created, all schedulers are not active, and only later if a spawn is
done does activity occur.
There are four operations that you can do on a pool:
1. Create a new pool. The only argument to this function is the configuration
for the scheduler pool. Currently the only configuration parameter is the
number of threads to initially spawn.
2. Spawn a task into this pool. This takes a procedure and task configuration
options and spawns a new task into the pool of schedulers.
3. Spawn a new scheduler into the pool. This will return a handle on which to
communicate with the scheduler in order to do something like a pinned task.
4. Shut down the scheduler pool. This will consume the scheduler pool, request
all of the schedulers to shut down, and then wait on all the scheduler
threads. Currently this will block the invoking OS thread, but I plan on
making 'Thread::join' not a thread-blocking call.
These operations can be used to encode all current usage of M:N schedulers, as
well as providing a simple interface through which a pool can be modified. There
is currently no way to remove a scheduler from a pool of scheduler, as there's
no way to guarantee that a scheduler has exited. This may be added in the
future, however (as necessary).
This extracts everything related to green scheduling from libstd and introduces
a new libgreen crate. This mostly involves deleting most of std::rt and moving
it to libgreen.
Along with the movement of code, this commit rearchitects many functions in the
scheduler in order to adapt to the fact that Local::take now *only* works on a
Task, not a scheduler. This mostly just involved threading the current green
task through in a few locations, but there were one or two spots where things
got hairy.
There are a few repercussions of this commit:
* tube/rc have been removed (the runtime implementation of rc)
* There is no longer a "single threaded" spawning mode for tasks. This is now
encompassed by 1:1 scheduling + communication. Convenience methods have been
introduced that are specific to libgreen to assist in the spawning of pools of
schedulers.
* Streams are now ~3x faster than before (fewer allocations and more optimized)
* Based on a single-producer single-consumer lock-free queue that doesn't
always have to allocate on every send.
* Blocking via mutexes/cond vars outside the runtime
* Streams work in/out of the runtime seamlessly
* Select now works in/out of the runtime seamlessly
* Streams will now fail!() on send() if the other end has hung up
* try_send() will not fail
* PortOne/ChanOne removed
* SharedPort removed
* MegaPipe removed
* Generic select removed (only one kind of port now)
* API redesign
* try_recv == never block
* recv_opt == block, don't fail
* iter() == Iterator<T> for Port<T>
* removed peek
* Type::new
* Removed rt::comm
In order to keep up to date with changes to the libraries that `llvm-config`
spits out, the dependencies to the LLVM are a dynamically generated rust file.
This file is now automatically updated whenever LLVM is updated to get kept
up-to-date.
At the same time, this cleans out some old cruft which isn't necessary in the
makefiles in terms of dependencies.
Closes#10745Closes#10744
Right now, as pointed out in #8132, it is very easy to introduce a subtle race
in the runtime. I believe that this is the cause of the current flakiness on the
bots.
I have taken the last idea mentioned in that issue which is to use a lock around
descheduling and context switching in order to solve this race.
Closes#8132
The reasons for doing this are:
* The model on which linked failure is based is inherently complex
* The implementation is also very complex, and there are few remaining who
fully understand the implementation
* There are existing race conditions in the core context switching function of
the scheduler, and possibly others.
* It's unclear whether this model of linked failure maps well to a 1:1 threading
model
Linked failure is often a desired aspect of tasks, but we would like to take a
much more conservative approach in re-implementing linked failure if at all.
Closes#8674Closes#8318Closes#8863
These two attributes are no longer useful now that Rust has decided to leave
segmented stacks behind. It is assumed that the rust task's stack is always
large enough to make an FFI call (due to the stack being very large).
There's always the case of stack overflow, however, to consider. This does not
change the behavior of stack overflow in Rust. This is still normally triggered
by the __morestack function and aborts the whole process.
C stack overflow will continue to corrupt the stack, however (as it did before
this commit as well). The future improvement of a guard page at the end of every
rust stack is still unimplemented and is intended to be the mechanism through
which we attempt to detect C stack overflow.
Closes#8822Closes#10155
Now that the type_id intrinsic is working across crates, all of these
unnecessary messages can be removed to have the failure type for a task truly be
~Any and only ~Any
- `begin_unwind` is now generic over any `T: Any + Send`.
- Every value you fail with gets boxed as an `~Any`.
- Because of implementation details, `&'static str` and `~str` are still
handled specially behind the scenes.
- Changed the big macro source string in libsyntax to a raw string
literal, and enabled doc comments there.
Some code cleanup, sorting of import blocks
Removed std::unstable::UnsafeArc's use of Either
Added run-fail tests for the new FailWithCause impls
Changed future_result and try to return Result<(), ~Any>.
- Internally, there is an enum of possible fail messages passend around.
- In case of linked failure or a string message, the ~Any gets
lazyly allocated in future_results recv method.
- For that, future result now returns a wrapper around a Port.
- Moved and renamed task::TaskResult into rt::task::UnwindResult
and made it an internal enum.
- Introduced a replacement typedef `type TaskResult = Result<(), ~Any>`.
Almost all languages provide some form of buffering of the stdout stream, and
this commit adds this feature for rust. A handle to stdout is lazily initialized
in the Task structure as a buffered owned Writer trait object. The buffer
behavior depends on where stdout is directed to. Like C, this line-buffers the
stream when the output goes to a terminal (flushes on newlines), and also like C
this uses a fixed-size buffer when output is not directed at a terminal.
We may decide the fixed-size buffering is overkill, but it certainly does reduce
write syscall counts when piping output elsewhere. This is a *huge* benefit to
any code using logging macros or the printing macros. Formatting emits calls to
`write` very frequently, and to have each of them backed by a write syscall was
very expensive.
In a local benchmark of printing 10000 lines of "what" to stdout, I got the
following timings:
when | terminal | redirected
----------|---------------|--------
before | 0.575s | 0.525s
after | 0.197s | 0.013s
C | 0.019s | 0.004s
I can also confirm that we're buffering the output appropriately in both
situtations. We're still far slower than C, but I believe much of that has to do
with the "homing" that all tasks due, we're still performing an order of
magnitude more write syscalls than C does.
Almost all languages provide some form of buffering of the stdout stream, and
this commit adds this feature for rust. A handle to stdout is lazily initialized
in the Task structure as a buffered owned Writer trait object. The buffer
behavior depends on where stdout is directed to. Like C, this line-buffers the
stream when the output goes to a terminal (flushes on newlines), and also like C
this uses a fixed-size buffer when output is not directed at a terminal.
We may decide the fixed-size buffering is overkill, but it certainly does reduce
write syscall counts when piping output elsewhere. This is a *huge* benefit to
any code using logging macros or the printing macros. Formatting emits calls to
`write` very frequently, and to have each of them backed by a write syscall was
very expensive.
In a local benchmark of printing 10000 lines of "what" to stdout, I got the
following timings:
when | terminal | redirected
----------------------------------
before | 0.575s | 0.525s
after | 0.197s | 0.013s
C | 0.019s | 0.004s
I can also confirm that we're buffering the output appropriately in both
situtations. We're still far slower than C, but I believe much of that has to do
with the "homing" that all tasks due, we're still performing an order of
magnitude more write syscalls than C does.
It's not guaranteed that there will always be an event loop to run, and this
implementation will serve as an incredibly basic one which does not provide any
I/O, but allows the scheduler to still run.
cc #9128
The isn't an ideal patch, and the comment why is in the code. Basically uvio
uses task::unkillable which touches the kill flag for a task, and if the task is
failing due to mismangement of the kill flag, then there will be serious
problems when the task tries to print that it's failing.
When uv's TTY I/O is used for the stdio streams, the file descriptors are put
into a non-blocking mode. This means that other concurrent writes to the same
stream can fail with EAGAIN or EWOULDBLOCK. By all I/O to event-loop I/O, we
avoid this error.
There is one location which cannot move, which is the runtime's dumb_println
function. This was implemented to handle the EAGAIN and EWOULDBLOCK errors and
simply retry again and again.
This commit re-introduces the functionality of __morestack in a way that it was
not originally anticipated. Rust does not currently have segmented stacks,
rather just large stack segments. We do not detect when these stack segments are
overrun currently, but this commit leverages __morestack in order to check this.
This commit purges a lot of the old __morestack and stack limit C++
functionality, migrating the necessary chunks to rust. The stack limit is now
entirely maintained in rust, and the "main logic bits" of __morestack are now
also implemented in rust as well.
I put my best effort into validating that this currently builds and runs successfully on osx and linux 32/64 bit, but I was unable to get this working on windows. We never did have unwinding through __morestack frames, and although I tried poking at it for a bit, I was unable to understand why we don't get unwinding right now.
A focus of this commit is to implement as much of the logic in rust as possible. This involved some liberal usage of `no_split_stack` in various locations, along with some use of the `asm!` macro (scary). I modified a bit of C++ to stop calling `record_sp_limit` because this is no longer defined in C++, rather in rust.
Another consequence of this commit is that `thread_local_storage::{get, set}` must both be flagged with `#[rust_stack]`. I've briefly looked at the implementations on osx/linux/windows to ensure that they're pretty small stacks, and I'm pretty sure that they're definitely less than 20K stacks, so we probably don't have a lot to worry about.
Other things worthy of note:
* The default stack size is now 4MB instead of 2MB. This is so that when we request 2MB to call a C function you don't immediately overflow because you have consumed any stack at all.
* `asm!` is actually pretty cool, maybe we could actually define context switching with it?
* I wanted to add links to the internet about all this jazz of storing information in TLS, but I was only able to find a link for the windows implementation. Otherwise my suggestion is just "disassemble on that arch and see what happens"
* I put my best effort forward on arm/mips to tweak __morestack correctly, we have no ability to test this so an extra set of eyes would be useful on these spots.
* This is all really tricky stuff, so I tried to put as many comments as I thought were necessary, but if anything is still unclear (or I completely forgot to take something into account), I'm willing to write more!
This commit resumes management of the stack boundaries and limits when switching
between tasks. This additionally leverages the __morestack function to run code
on "stack overflow". The current behavior is to abort the process, but this is
probably not the best behavior in the long term (for deails, see the comment I
wrote up in the stack exhaustion routine).
This is 2x faster on 64-bit computers at generating anything larger
than 32-bits.
It has been verified against the canonical C implementation from the
website of the creator of ISAAC64.
Also, move `Rng.next` to `Rng.next_u32` and add `Rng.next_u64` to
take full advantage of the wider word width; otherwise Isaac64 will
always be squeezed down into a u32 wasting half the entropy and
offering no advantage over the 32-bit variant.