The green scheduler can optimize its runtime based on this by deciding to not go
to sleep in epoll() if there is no active I/O and there is a task to be stolen.
This is implemented for librustuv by keeping a count of the number of tasks
which are currently homed. If a task is homed, and then performs a blocking I/O
operation, the count will be nonzero while the task is blocked. The homing count
is intentionally 0 when there are I/O handles, but no handles currently blocked.
The reason for this is that epoll() would only be used to wake up the scheduler
anyway.
The crux of this change was to have a `HomingMissile` contain a mutable borrowed
reference back to the `HomeHandle`. The rest of the change was just dealing with
the fallout from this. The reference is used to decrement the homed handle count
in the `HomingMissile`'s destructor.
Also note that the count maintained is not atomic because all of its
increments/decrements/reads happen on the same I/O thread.
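A minimal sketch of the counting scheme described above, written in present-day Rust rather than the Rust of this commit. The type names follow the message, but the fields and the methods `fire_homing_missile`, `homed_count`, and `should_sleep_in_epoll` are assumptions made for illustration, not librustuv's actual interface. Because each missile holds a mutable borrow, this simplified version allows only one live missile per handle.

```rust
/// Hypothetical stand-in for the per-loop home handle (not librustuv's real type).
pub struct HomeHandle {
    /// Plain integer, not atomic: only the owning I/O thread ever touches it.
    homed: usize,
}

/// Guard for a task that is currently homed. It keeps a mutable borrow of
/// the handle so that its destructor can decrement the count.
pub struct HomingMissile<'a> {
    home: &'a mut HomeHandle,
}

impl HomeHandle {
    pub fn new() -> HomeHandle {
        HomeHandle { homed: 0 }
    }

    /// Marks a task as homed; the count drops again when the missile is dropped.
    pub fn fire_homing_missile(&mut self) -> HomingMissile<'_> {
        self.homed += 1;
        HomingMissile { home: self }
    }

    pub fn homed_count(&self) -> usize {
        self.homed
    }

    /// The scheduler can skip sleeping in epoll() only when nothing is homed
    /// and there is a task available to steal.
    pub fn should_sleep_in_epoll(&self, task_to_steal: bool) -> bool {
        self.homed > 0 || !task_to_steal
    }
}

impl<'a> HomingMissile<'a> {
    /// Reads the count through the missile, since the handle itself is
    /// mutably borrowed for as long as the missile lives.
    pub fn homed_count(&self) -> usize {
        self.home.homed
    }
}

impl<'a> Drop for HomingMissile<'a> {
    fn drop(&mut self) {
        self.home.homed -= 1;
    }
}
```

The mutable borrow is what makes the destructor-based decrement possible without reference counting, at the cost of the handle being inaccessible while a missile is alive.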
This is part of the overall strategy I would like to take when approaching
issue #11165. The only two I/O objects that reasonably want to be "split" are
the network stream objects. Everything else can be "split" by just creating
another version.
The initial idea I had was to literally split the object into a reader and a
writer half, but that would just introduce lots of clutter with extra interfaces
that were a little unnecessary, or it would return a ~Reader and a ~Writer which
means you couldn't access things like the remote peer name or local socket name.
The solution I found to be nicer was to just clone the stream itself. The clone
is just a clone of the handle, nothing fancy going on at the kernel level.
Conceptually I found this very easy to wrap my head around (everything else
supports clone()), and it solved the "split" problem at the same time.
The cloning support is pretty specific per platform/lib combination:
* native/win32 - uses some specific WSA apis to clone the SOCKET handle
* native/unix - uses dup() to get another file descriptor
* green/all - This is where things get interesting. When we support full clones
of a handle, this implies that we're allowing simultaneous writes
and reads to happen. It turns out that libuv doesn't support two
simultaneous reads or writes of the same object. It does support
*one* read and *one* write at the same time, however. Some extra
infrastructure was added to just block concurrent writers/readers
until the previous read/write operation was completed.
I've added tests to the tcp/unix modules to make sure that this functionality is
supported everywhere.
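For illustration, here is how the clone-instead-of-split idea looks with the standard library of present-day Rust, where `TcpStream::try_clone` returns a second handle to the same socket. This is a sketch against today's `std::net`, not the API added in this commit; the helper `echo_via_clone` and the loopback echo server are made up for the example.

```rust
use std::io::{self, Read, Write};
use std::net::{Shutdown, TcpListener, TcpStream};
use std::thread;

/// Sends `msg` over a loopback connection, writing through one handle and
/// reading the echoed bytes back through a clone of that handle.
pub fn echo_via_clone(msg: &[u8]) -> io::Result<Vec<u8>> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Echo server: reads until the client shuts down its write side,
    // then sends everything back and closes the connection.
    let server = thread::spawn(move || -> io::Result<()> {
        let (mut conn, _) = listener.accept()?;
        let mut buf = Vec::new();
        conn.read_to_end(&mut buf)?;
        conn.write_all(&buf)
    });

    let mut writer = TcpStream::connect(addr)?;
    // A second handle to the same socket; both remain full streams.
    let mut reader = writer.try_clone()?;
    // Unlike a split into bare reader/writer halves, each clone can still
    // report the peer and local addresses.
    assert_eq!(reader.peer_addr()?, writer.peer_addr()?);
    assert_eq!(reader.local_addr()?, writer.local_addr()?);

    let payload = msg.to_vec();
    let write_thread = thread::spawn(move || -> io::Result<()> {
        writer.write_all(&payload)?;
        writer.shutdown(Shutdown::Write)
    });

    // Read concurrently with the write happening on the other handle.
    let mut echoed = Vec::new();
    reader.read_to_end(&mut echoed)?;

    write_thread.join().expect("writer thread panicked")?;
    server.join().expect("server thread panicked")?;
    Ok(echoed)
}
```

Calling `echo_via_clone(b"hello")` should hand back the same bytes, with the write and the read issued from different threads on clones of one stream.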
The `malloc` family of functions may return a null pointer for a
zero-size allocation, which should not be interpreted as an
out-of-memory error.
If the implementation does not return a null pointer, then handling
this will result in memory savings for zero-size types.
This also switches some code to `malloc_raw` in order to maintain a
centralized point for handling out-of-memory in `rt::global_heap`.
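The rule above can be sketched as follows. This is not the code in `rt::global_heap`; it is a stand-in built on present-day `std::alloc`, and the helper names `is_oom` and `free_raw` are assumptions. Since `std::alloc::alloc` must not be called with a zero size, the zero-size case is modeled explicitly as a null return, which is one of the results `malloc` is allowed to produce.

```rust
use std::alloc::{alloc, dealloc, handle_alloc_error, Layout};
use std::ptr;

/// A null result means out-of-memory only if a nonzero size was requested.
pub fn is_oom(ptr: *mut u8, size: usize) -> bool {
    ptr.is_null() && size != 0
}

/// Hypothetical centralized allocation point in the spirit of `malloc_raw`.
pub fn malloc_raw(size: usize) -> *mut u8 {
    if size == 0 {
        // Models a malloc(0) that returns NULL; callers must not treat
        // this as a failure and must not dereference the pointer.
        return ptr::null_mut();
    }
    let layout = Layout::from_size_align(size, 1).expect("invalid layout");
    // Safety: the layout has a nonzero size.
    let p = unsafe { alloc(layout) };
    if is_oom(p, size) {
        // Single place where out-of-memory is handled.
        handle_alloc_error(layout);
    }
    p
}

/// Releases memory obtained from `malloc_raw` with the same `size`.
pub fn free_raw(p: *mut u8, size: usize) {
    if p.is_null() {
        return; // zero-size allocations own no memory
    }
    let layout = Layout::from_size_align(size, 1).expect("invalid layout");
    // Safety: `p` came from `alloc` with this exact layout.
    unsafe { dealloc(p, layout) }
}
```

Routing every caller through one function like this keeps the null check and the out-of-memory policy in a single place.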
Closes #11634
All tests except for the homing tests are now working again with the
librustuv/libgreen refactoring. The homing-related tests are currently commented
out and have been moved to the rustuv::homing module.
I plan on refactoring scheduler pool spawning in order to enable more homing
tests in a future commit.
This reimplements librustuv without using the interfaces provided by the
scheduler in libstd. This solely uses the new Runtime trait in order to
interface with the local task and perform the necessary scheduling operations.
The largest snag in this refactoring is reimplementing homing. The new runtime
trait exposes no concept of "homing" a task or forcibly sending a task to a
remote scheduler (there is no concept of a scheduler). In order to reimplement
homing, the transference of tasks is now done at the librustuv level instead of
the scheduler level. This means that all I/O loops now have a concurrent queue
which receives homing messages and requests.
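A rough sketch of what a per-loop homing queue could look like, using a present-day `std::sync::mpsc` channel as the concurrent queue and a plain thread as the I/O loop. The names `IoLoop`, `HomingMessage`, and `run_at_home` are invented for the example; the real librustuv queue and its messages are not shown in this message and may differ substantially.

```rust
use std::sync::mpsc::{channel, Sender};
use std::thread::{self, JoinHandle, ThreadId};

/// Hypothetical messages accepted by an I/O loop's homing queue.
pub enum HomingMessage {
    /// Work that must run on the loop's own thread.
    Home(Box<dyn FnOnce() + Send>),
    /// Ask the loop to exit.
    Shutdown,
}

/// An I/O loop thread plus the sending side of its homing queue.
pub struct IoLoop {
    queue: Sender<HomingMessage>,
    thread: JoinHandle<()>,
}

impl IoLoop {
    pub fn spawn() -> IoLoop {
        let (tx, rx) = channel();
        let thread = thread::spawn(move || {
            // The loop drains its queue; a real event loop would also poll I/O.
            for msg in rx {
                match msg {
                    HomingMessage::Home(work) => work(),
                    HomingMessage::Shutdown => break,
                }
            }
        });
        IoLoop { queue: tx, thread }
    }

    pub fn thread_id(&self) -> ThreadId {
        self.thread.thread().id()
    }

    /// Sends `f` to the loop's thread and blocks until its result comes back.
    pub fn run_at_home<T, F>(&self, f: F) -> T
    where
        F: FnOnce() -> T + Send + 'static,
        T: Send + 'static,
    {
        let (done_tx, done_rx) = channel();
        let work = Box::new(move || {
            let _ = done_tx.send(f());
        });
        self.queue
            .send(HomingMessage::Home(work))
            .expect("I/O loop is gone");
        done_rx.recv().expect("I/O loop dropped the request")
    }

    pub fn shutdown(self) {
        let _ = self.queue.send(HomingMessage::Shutdown);
        self.thread.join().expect("I/O loop panicked");
    }
}
```

In this sketch the work travels to the loop rather than the task itself; the point is only that the transfer is driven by a queue owned by the I/O loop and needs nothing from a scheduler.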
This allows the entire implementation of librustuv to be only dependent on the
runtime trait, severing all dependence of librustuv on the scheduler and related
green-thread functions.
This is all in preparation for the introduction of libgreen and libnative.
At the same time, I also took the liberty of removing all glob imports from
librustuv.
This adds an implementation of the Chase-Lev work-stealing deque to libstd
under std::rt::deque. I've been unable to break the implementation of the deque
itself, and it's not super highly optimized just yet (everything uses a SeqCst
memory ordering).
The major snag in implementing the Chase-Lev deque is that the buffers used to
store data internally cannot get deallocated back to the OS. In the meantime, a
shared buffer pool (synchronized by a normal mutex) is used to
deallocate/allocate buffers from. This is done in hope of not overcommitting too
much memory. It is in theory possible to eventually free the buffers, but one
must be very careful in doing so.
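The buffer pool idea can be sketched like this in present-day Rust. It is only an illustration of a mutex-guarded pool that recycles buffers instead of freeing them; `BufferPool` and its methods are assumed names, and the deque itself, along with the question of when a retired buffer is safe to reuse, is left out.

```rust
use std::sync::Mutex;

/// Hypothetical shared pool of retired buffers, guarded by an ordinary mutex.
pub struct BufferPool<T> {
    idle: Mutex<Vec<Vec<T>>>,
}

impl<T> BufferPool<T> {
    pub fn new() -> BufferPool<T> {
        BufferPool { idle: Mutex::new(Vec::new()) }
    }

    /// Returns an empty buffer with room for at least `cap` items,
    /// reusing a pooled buffer when one is large enough.
    pub fn alloc(&self, cap: usize) -> Vec<T> {
        let mut idle = self.idle.lock().unwrap();
        match idle.iter().position(|b| b.capacity() >= cap) {
            Some(i) => idle.swap_remove(i),
            None => Vec::with_capacity(cap),
        }
    }

    /// Retires a buffer into the pool instead of releasing its memory.
    pub fn free(&self, mut buf: Vec<T>) {
        buf.clear(); // drops the items but keeps the allocation
        self.idle.lock().unwrap().push(buf);
    }

    /// Number of buffers currently waiting to be reused.
    pub fn pooled(&self) -> usize {
        self.idle.lock().unwrap().len()
    }
}
```

Reuse bounds the total memory held by retired buffers, which is the stated goal of not overcommitting, while sidestepping the harder problem of deciding when a buffer can really be freed.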
I was unable to get some good numbers from src/test/bench tests (I don't think
many of them are slamming the work queue that much), but I was able to get some
good numbers from one of my own tests. In a recent rewrite of select::select(),
I found that my implementation was incredibly slow due to contention on the
shared work queue. Upon switching to the parallel deque, I saw the contention
drop to 0 and the runtime go from 1.6s to 0.9s, with most of the time
spent in libuv awakening the schedulers (plus allocations).
Closes #4877
It turns out that libuv was returning ENOSPC to us in our usage of the
uv_ipX_name functions. It also turns out that there may be an off-by-one in
libuv. For now just add one to the buffer size and handle the return value
correctly.
Closes #10663
The reasons for doing this are:
* The model on which linked failure is based is inherently complex
* The implementation is also very complex, and there are few remaining who
fully understand the implementation
* There are existing race conditions in the core context switching function of
the scheduler, and possibly others.
* It's unclear whether this model of linked failure maps well to a 1:1 threading
model
Linked failure is often a desired aspect of tasks, but we would like to take a
much more conservative approach in re-implementing linked failure, if we do so
at all.
Closes #8674
Closes #8318
Closes #8863
These two attributes are no longer useful now that Rust has decided to leave
segmented stacks behind. It is assumed that the rust task's stack is always
large enough to make an FFI call (due to the stack being very large).
There's always the case of stack overflow, however, to consider. This does not
change the behavior of stack overflow in Rust. This is still normally triggered
by the __morestack function and aborts the whole process.
C stack overflow will continue to corrupt the stack, however (as it did before
this commit as well). The future improvement of a guard page at the end of every
rust stack is still unimplemented and is intended to be the mechanism through
which we attempt to detect C stack overflow.
Closes #8822
Closes #10155
In the ideal world, uv I/O could be canceled safely at any time. In reality,
however, we are unable to do this. Right now linked failure is fairly flaky as
implemented in the runtime, making it very difficult to test whether the linked
failure mechanisms inside of the uv bindings are ready for this kind of
interaction.
Right now, all constructors will execute in a task::unkillable block, and all
homing I/O operations will prevent linked failure for the duration of the homing
operation. What this means is that tasks which perform I/O are still susceptible
to linked failure, but the I/O operations themselves will never get interrupted.
Instead, the linked failure will be received at the edge of the I/O operation.
There are a few reasons that this is a desirable move to take:
1. Proof of concept that a third party event loop is possible
2. Clear separation of responsibility between rt::io and the uv-backend
3. Enforce in the future that the event loop is "pluggable" and replaceable
Here's a quick summary of the points of this pull request which make this
possible:
* Two new lang items were introduced: event_loop, and event_loop_factory.
The idea of a "factory" is to define a function which can be called with no
arguments and will return the new event loop as a trait object. This factory
is emitted to the crate map when building an executable. The factory doesn't
have to exist, and when it doesn't then an empty slot is in the crate map and
a basic event loop with no I/O support is provided to the runtime.
* When building an executable, the rustuv crate will be linked by default
(providing a default implementation of the event loop) via a similar method to
injecting a dependency on libstd. This is currently the only location where
the rustuv crate is ever linked.
* There is a new #[no_uv] attribute (implied by #[no_std]) which denies
implicitly linking to rustuv by default
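To make the factory idea concrete, here is a small sketch in present-day Rust. Lang items and the crate map are compiler internals that cannot be reproduced in ordinary code, so the `CrateMap` struct, the `EventLoop` trait methods, and `UvLoop` below are simplified assumptions; only the shape (a no-argument factory returning a trait object, an optional slot, and a basic fallback loop without I/O) comes from the description above.

```rust
/// Hypothetical, heavily reduced event loop interface.
pub trait EventLoop {
    fn name(&self) -> &'static str;
    fn has_io(&self) -> bool;
}

/// A factory takes no arguments and returns the loop as a trait object.
pub type EventLoopFactory = fn() -> Box<dyn EventLoop>;

/// Stand-in for the crate map slot: empty when no factory was emitted.
pub struct CrateMap {
    pub event_loop_factory: Option<EventLoopFactory>,
}

/// Fallback loop with no I/O support, used when the slot is empty.
struct BasicLoop;

impl EventLoop for BasicLoop {
    fn name(&self) -> &'static str {
        "basic"
    }
    fn has_io(&self) -> bool {
        false
    }
}

/// What the runtime would do when it needs a new event loop.
pub fn new_event_loop(map: &CrateMap) -> Box<dyn EventLoop> {
    match map.event_loop_factory {
        Some(factory) => factory(),
        None => Box::new(BasicLoop),
    }
}

/// Hypothetical I/O-capable loop, as a crate like rustuv would provide.
pub struct UvLoop;

impl EventLoop for UvLoop {
    fn name(&self) -> &'static str {
        "uv"
    }
    fn has_io(&self) -> bool {
        true
    }
}

/// The factory such a crate would register.
pub fn uv_factory() -> Box<dyn EventLoop> {
    Box::new(UvLoop)
}
```

With `event_loop_factory: Some(uv_factory as EventLoopFactory)` the runtime gets the I/O-capable loop; with `None`, which corresponds to the `#[no_uv]` case in this sketch, it falls back to the basic loop.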
Closes #5019