This optimizes the `home_for_io` code path by requiring fewer scheduler
operations in some situtations.
When moving to your home scheduler, this no longer forces a context switch if
you're already on the home scheduler. Instead, the homing code now simply pins
you to your current scheduler (making it so you can't be stolen away). If you're
not on your home scheduler, then we context switch away, sending you to your
home scheduler.
When the I/O operation is done, then we also no longer forcibly trigger a
context switch. Instead, the action is cased on whether the task is homed or
not. If a task does not have a home, then the task is re-flagged as not having a
home and no context switch is performed. If a task is homed to the current
scheduler, then we don't do anything, and if the task is homed to a foreign
scheduler, then it's sent along its merry way.
I verified that there are about a third as many `write` syscalls done in print
operations now. Libuv uses write to implement async handles, and the homing
before and after each I/O operation was triggering a write on these async
handles. Additionally, using the terrible benchmark of printing 10k times in a
loop, this drives the runtime from 0.6s down to 0.3s (yay!).
Almost all languages provide some form of buffering of the stdout stream, and
this commit adds this feature for rust. A handle to stdout is lazily initialized
in the Task structure as a buffered owned Writer trait object. The buffer
behavior depends on where stdout is directed to. Like C, this line-buffers the
stream when the output goes to a terminal (flushes on newlines), and also like C
this uses a fixed-size buffer when output is not directed at a terminal.
We may decide the fixed-size buffering is overkill, but it certainly does reduce
write syscall counts when piping output elsewhere. This is a *huge* benefit to
any code using logging macros or the printing macros. Formatting emits calls to
`write` very frequently, and to have each of them backed by a write syscall was
very expensive.
In a local benchmark of printing 10000 lines of "what" to stdout, I got the
following timings:
when | terminal | redirected
----------|---------------|--------
before | 0.575s | 0.525s
after | 0.197s | 0.013s
C | 0.019s | 0.004s
I can also confirm that we're buffering the output appropriately in both
situtations. We're still far slower than C, but I believe much of that has to do
with the "homing" that all tasks due, we're still performing an order of
magnitude more write syscalls than C does.
Almost all languages provide some form of buffering of the stdout stream, and
this commit adds this feature for rust. A handle to stdout is lazily initialized
in the Task structure as a buffered owned Writer trait object. The buffer
behavior depends on where stdout is directed to. Like C, this line-buffers the
stream when the output goes to a terminal (flushes on newlines), and also like C
this uses a fixed-size buffer when output is not directed at a terminal.
We may decide the fixed-size buffering is overkill, but it certainly does reduce
write syscall counts when piping output elsewhere. This is a *huge* benefit to
any code using logging macros or the printing macros. Formatting emits calls to
`write` very frequently, and to have each of them backed by a write syscall was
very expensive.
In a local benchmark of printing 10000 lines of "what" to stdout, I got the
following timings:
when | terminal | redirected
----------------------------------
before | 0.575s | 0.525s
after | 0.197s | 0.013s
C | 0.019s | 0.004s
I can also confirm that we're buffering the output appropriately in both
situtations. We're still far slower than C, but I believe much of that has to do
with the "homing" that all tasks due, we're still performing an order of
magnitude more write syscalls than C does.
This is more progress towards #9128 and all its related tree of issues. This implements a new `BasicLoop` on top of pthreads synchronization primitives (wrapped in `LittleLock`). This also removes the wonky `callback_ms` function from the interface of the event loop.
After #9901 is taking forever to land, I'm going to try to do all this runtime work in much smaller chunks than before. Right now this will not work unless #9901 lands first, but I'm close to landing it (hopefully), and I wanted to go ahead and get this reviewed before throwing it at bors later on down the road.
This "pausible idle callback" is also a bit of a weird idea, but it wasn't as difficult to implement as callback_ms so I'm more semi-ok with it.
It's not guaranteed that there will always be an event loop to run, and this
implementation will serve as an incredibly basic one which does not provide any
I/O, but allows the scheduler to still run.
cc #9128
This is a peculiar function to require event loops to implement, and it's only
used in one spot during tests right now. Instead, a possibly more robust apis
for timers should be used rather than requiring all event loops to implement a
curious-looking function.
The PausibleIdleCallback must have some handle into the event loop, and because
struct destructors are run in order of top-to-bottom in order of fields, this
meant that the event loop was getting destroyed before the idle callback was
getting destroyed.
I can't confirm that this fixes a problem in how we use libuv, but it does
semantically fix a problem for usage with other event loops.
Large topics:
* Implemented `rt::io::net::unix`. We've got an implementation backed by "named pipes" for windows for free from libuv, so I'm not sure if these should be `cfg(unix)` or whether they'd be better placed in `rt::io::pipe` (which is currently kinda useless), or to leave in `unix`. Regardless, we probably shouldn't deny windows of functionality which it certainly has.
* Fully implemented `net::addrinfo`, or at least fully implemented in the sense of making the best attempt to wrap libuv's `getaddrinfo` api
* Moved standard I/O to a libuv TTY instead of just a plain old file descriptor. I found that this interacted better when closing stdin, and it has the added bonus of getting things like terminal dimentions (someone should make a progress bar now!)
* Migrate to `~Trait` instead of a typedef'd object where possible. There are only two more types which are blocked on this, and those are traits which have a method which takes by-value self (there's an open issue on this)
* Drop `rt::io::support::PathLike` in favor of just `ToCStr`. We recently had a lot of Path work done, but it still wasn't getting passed down to libuv (there was an intermediate string conversion), and this allows true paths to work all the way down to libuv (and anything else that can become a C string).
* Removes `extra::fileinput` and `extra::io_util`
Closes#9895Closes#9975Closes#8330Closes#6850 (ported lots of libraries away from std::io)
cc #4248 (implemented unix/dns)
cc #9128 (made everything truly trait objects)
This adds constructors to pipe streams in the new runtime to take ownership of
file descriptors, and also fixes a few tests relating to the std::run changes
(new errors are raised on io_error and one test is xfail'd).
I was seeing a lot of weird behavior with stdin behaving as a tty, and it
doesn't really quite make sense, so instead this moves to using libuv's pipes
instead (which make more sense for stdin specifically).
This prevents piping input to rustc hanging forever.
The general idea is to remove conditions completely from I/O, so in the meantime
remove the read_error condition to mean the same thing as the io_error condition.
The isn't an ideal patch, and the comment why is in the code. Basically uvio
uses task::unkillable which touches the kill flag for a task, and if the task is
failing due to mismangement of the kill flag, then there will be serious
problems when the task tries to print that it's failing.
When uv's TTY I/O is used for the stdio streams, the file descriptors are put
into a non-blocking mode. This means that other concurrent writes to the same
stream can fail with EAGAIN or EWOULDBLOCK. By all I/O to event-loop I/O, we
avoid this error.
There is one location which cannot move, which is the runtime's dumb_println
function. This was implemented to handle the EAGAIN and EWOULDBLOCK errors and
simply retry again and again.
This involved changing a fair amount of code, rooted in how we access the local
IoFactory instance. I added a helper method to the rtio module to access the
optional local IoFactory. This is different than before in which it was assumed
that a local IoFactory was *always* present. Now, a separate io_error is raised
when an IoFactory is not present, yet I/O is requested.