This reifies the computations required for uniformity done by
(the old) `Rng.gen_integer_range` (now `Rng.gen_range`), so that they can
be amortised over many invocations when it is called in a loop.
It also makes the sampling correct, by using a trait + impls for each type
rather than trying to coerce `Int` + `u64` to do the right thing. This
also makes it more extensible: e.g. big integers could & should
implement `SampleRange`.
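As a rough sketch of the reified approach (in current Rust syntax; `Range`, `accept_zone`, and the closure standing in for an `Rng` are illustrative assumptions, not the real API):

```rust
// Sketch: precompute the rejection zone once, then sample many times.
struct Range {
    low: u64,
    range: u64,
    // Largest multiple of `range` that fits in u64; computed once so the
    // cost is amortised over every subsequent sample.
    accept_zone: u64,
}

impl Range {
    fn new(low: u64, high: u64) -> Range {
        assert!(low < high);
        let range = high - low;
        Range { low, range, accept_zone: u64::MAX - u64::MAX % range }
    }

    fn sample(&self, rng: &mut impl FnMut() -> u64) -> u64 {
        loop {
            let v = rng();
            // Reject values past the zone to avoid modulo bias.
            if v < self.accept_zone {
                return self.low + v % self.range;
            }
        }
    }
}

fn main() {
    // Toy xorshift64 standing in for a real Rng.
    let mut state = 0x9E3779B97F4A7C15u64;
    let mut next = move || {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        state
    };
    let dist = Range::new(10, 20); // set-up cost paid once
    for _ in 0..5 {
        print!("{} ", dist.sample(&mut next)); // amortised over the loop
    }
    println!();
}
```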
This commit re-introduces the functionality of __morestack in a way that was
not originally anticipated. Rust does not currently have segmented stacks,
just large stack segments. We do not currently detect when these stack
segments are overrun, but this commit leverages __morestack to check for that.
This commit purges a lot of the old __morestack and stack-limit C++
functionality, migrating the necessary chunks to Rust. The stack limit is now
entirely maintained in Rust, and the "main logic bits" of __morestack are now
implemented in Rust as well.
I put my best effort into validating that this currently builds and runs successfully on OSX and Linux, 32- and 64-bit, but I was unable to get this working on Windows. We never did have unwinding through __morestack frames, and although I poked at it for a bit, I was unable to understand why we don't get unwinding right now.
A focus of this commit is to implement as much of the logic in Rust as possible. This involved some liberal usage of `no_split_stack` in various locations, along with some use of the `asm!` macro (scary). I modified a bit of C++ to stop calling `record_sp_limit`, because this is no longer defined in C++ but rather in Rust.
Another consequence of this commit is that `thread_local_storage::{get, set}` must both be flagged with `#[rust_stack]`. I've briefly looked at the implementations on OSX/Linux/Windows to check that they use fairly small stacks, and I'm pretty sure they're definitely less than 20K, so we probably don't have a lot to worry about.
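For a rough picture of the mechanism, here is a heavily simplified sketch in current Rust; the real code stores the limit in a TLS slot via platform-specific `asm!` and does the comparison in compiler-generated prologues, so everything below (names included) is illustrative only:

```rust
use std::cell::Cell;

thread_local! {
    // Stand-in for the platform TLS slot that holds the stack limit.
    static STACK_LIMIT: Cell<usize> = Cell::new(0);
}

// Hypothetical stand-in for record_sp_limit, which in the real runtime
// writes the limit into TLS with inline assembly on each platform.
fn record_sp_limit(limit: usize) {
    STACK_LIMIT.with(|l| l.set(limit));
}

// The __morestack-style check: compare an (approximate) stack pointer
// against the recorded limit and abort on exhaustion.
fn check_stack(approx_sp: usize) {
    STACK_LIMIT.with(|l| {
        // A recorded limit of 0 disables the check in this sketch.
        if approx_sp < l.get() {
            eprintln!("stack exhausted");
            std::process::abort();
        }
    });
}

fn main() {
    let marker = 0u8;
    record_sp_limit(0);
    check_stack(&marker as *const u8 as usize);
    println!("still within the stack limit");
}
```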
Other things worthy of note:
* The default stack size is now 4MB instead of 2MB. This is so that when we request 2MB of stack to call a C function, you don't immediately overflow just because you have consumed any stack at all.
* `asm!` is actually pretty cool, maybe we could actually define context switching with it?
* I wanted to add links to the internet about all this jazz of storing information in TLS, but I was only able to find a link for the Windows implementation. Otherwise my suggestion is just "disassemble on that arch and see what happens".
* I put my best effort forward on ARM/MIPS to tweak __morestack correctly, but we have no ability to test this, so an extra set of eyes would be useful on these spots.
* This is all really tricky stuff, so I tried to put as many comments as I thought were necessary, but if anything is still unclear (or I completely forgot to take something into account), I'm willing to write more!
This commit resumes management of the stack boundaries and limits when switching
between tasks. This additionally leverages the __morestack function to run code
on "stack overflow". The current behavior is to abort the process, but this is
probably not the best behavior in the long term (for details, see the comment I
wrote up in the stack exhaustion routine).
This lets the C++ code in the rt handle the (slightly) tricky parts of
random number generation: e.g. error detection/handling, and using the
values of the `#define`d options to the various functions.
This provides two methods, `.reseed()` and `::from_seed`, that modify and
create respectively.
Implement this trait for the RNGs in the stdlib for which this makes
sense.
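In sketch form the trait might look like this (the trait name, seed type, and the toy RNG below are assumptions based on the description above, not the actual stdlib code):

```rust
trait SeedableRng<Seed> {
    /// Reseed an existing RNG in place (the "modify" half).
    fn reseed(&mut self, seed: Seed);
    /// Create a new RNG from a seed (the "create" half).
    fn from_seed(seed: Seed) -> Self;
}

// A toy xorshift-style RNG implementing it.
struct XorShiftRng { x: u32, y: u32, z: u32, w: u32 }

impl SeedableRng<[u32; 4]> for XorShiftRng {
    fn reseed(&mut self, s: [u32; 4]) {
        self.x = s[0]; self.y = s[1]; self.z = s[2]; self.w = s[3];
    }
    fn from_seed(s: [u32; 4]) -> XorShiftRng {
        XorShiftRng { x: s[0], y: s[1], z: s[2], w: s[3] }
    }
}

fn main() {
    let mut rng = XorShiftRng::from_seed([1, 2, 3, 4]); // ::from_seed creates
    rng.reseed([5, 6, 7, 8]);                           // .reseed modifies
}
```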
The former reads from e.g. /dev/urandom, while the latter just wraps any
`std::rt::io::Reader` into an interface that implements `Rng`.
This also adds `Rng.fill_bytes` for efficient implementations of the above
(reading 8 bytes at a time is inefficient when you can read 1000), and
removes the dependence on src/rt (i.e. rand_gen_seed), although this last
change requires implementing hand-seeding of the `XorShiftRng` used in the
scheduler on Linux/unixes, since `OSRng` relies on a scheduler existing in
order to read from /dev/urandom.
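A sketch of the reader-wrapping idea in current Rust (`ReaderRng` and the method shapes follow the description above, but the details are assumptions):

```rust
use std::io::Read;

// Wraps any reader into something Rng-like.
struct ReaderRng<R> { reader: R }

impl<R: Read> ReaderRng<R> {
    // One bulk read instead of repeated 8-byte reads.
    fn fill_bytes(&mut self, dest: &mut [u8]) {
        self.reader.read_exact(dest).expect("RNG source failed");
    }

    // Higher-level generation built on top of fill_bytes.
    fn next_u64(&mut self) -> u64 {
        let mut buf = [0u8; 8];
        self.fill_bytes(&mut buf);
        u64::from_le_bytes(buf)
    }
}

fn main() {
    // e.g. /dev/urandom on unixes.
    if let Ok(f) = std::fs::File::open("/dev/urandom") {
        let mut rng = ReaderRng { reader: f };
        println!("{:x}", rng.next_u64());
    }
}
```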
This commit fixes all of the fallout of the previous commit which is an attempt
to refine privacy. There were a few unfortunate leaks which now must be plugged,
and the most horrible one is the current `shouldnt_be_public` module now inside
`std::rt`. I think this either needs a slight reorganization of the runtime,
or it needs to wait for the external users of these modules to be replaced
with their `rt` implementations.
Other fixes involve making things pub which should be pub, and otherwise
updating error messages that now reference privacy instead of referencing an
"unresolved name" (yay!).
Also, documentation & general clean-up:
- remove `gen_char_from`: better served by `sample` or `choose`.
- `gen_bytes` generalised to `gen_vec`.
- `gen_int_range`/`gen_uint_range` merged into `gen_integer_range` and
made properly uniformly distributed. Fixes #8644.
Minor adjustments to other functions.
The trait will keep the `Iterator` naming, but a more concise module
name makes using the free functions less verbose. The module will define
iterables in addition to iterators, as it deals with iteration in
general.
This removes the stacking of type parameters that occurs when invoking
trait methods, and fixes all places in the standard library that were
relying on it. It is somewhat awkward in places; I think we'll probably
want something like the `Foo::<for T>::new()` syntax.
These aren't used for anything at the moment and cause some TLS hits on
some perf-critical code paths. We will need to put more thought into this
in the future.
Fixed a memory leak caused by the singleton idle callback failing to close correctly. The problem was that the close function requires running inside a callback in the event loop, but we were trying to close the idle watcher after the loop returned from run. The fix was to just call run again to process this callback. There is an additional tweak to move the initialization logic fully into bootstrap, so tasks that do not ever call run do not have problems destructing.
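As a toy model of that pattern (entirely illustrative; the real code works against the libuv event loop):

```rust
// Toy model: closing a watcher only takes effect when the loop itself
// processes a callback, so after `run` returns we queue the close and
// run the loop once more.
struct EventLoop {
    callbacks: Vec<Box<dyn FnOnce()>>,
}

impl EventLoop {
    fn new() -> EventLoop {
        EventLoop { callbacks: Vec::new() }
    }

    fn queue(&mut self, cb: Box<dyn FnOnce()>) {
        self.callbacks.push(cb);
    }

    fn run(&mut self) {
        // One pass over the pending callbacks, like a uv_run call.
        for cb in std::mem::take(&mut self.callbacks) {
            cb();
        }
    }
}

fn main() {
    let mut el = EventLoop::new();
    el.run(); // the main body of work; the idle watcher is still open here

    // The fix: enqueue the close as a callback, then run the loop once
    // more so the close executes *inside* the loop rather than after it.
    el.queue(Box::new(|| println!("idle watcher closed")));
    el.run();
}
```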
Instead of a furious storm of idle callbacks we just have one. This is a major performance gain - around 40% on my machine for the ping pong bench.
Also in this PR is a cleanup commit for the scheduler code. Was previously up as a separate PR, but bors load + imminent merge hell led me to roll them together. Was #8549.
Every time run_sched_once performs a 'scheduling action' it needs to
guarantee that it runs at least one more time, so it enqueues another
run_sched_once callback. The primary reason it needs to do this is that not
all async callbacks are guaranteed to run: it is only guaranteed that *a*
callback will run after enqueuing one - some may get dropped.
At the moment this means we wastefully create lots of callbacks to ensure that
there will *definitely* be a callback queued up to continue running the scheduler.
The logic really needs to be tightened up here.
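To make the invariant concrete, here is a minimal simulation (names and structure are illustrative, not the scheduler's actual code):

```rust
struct Sched {
    pending_callbacks: u32,
    remaining_work: u32,
}

impl Sched {
    fn run_sched_once(&mut self) {
        self.pending_callbacks -= 1; // this callback has now run
        if self.remaining_work > 0 {
            self.remaining_work -= 1;    // perform one scheduling action...
            self.pending_callbacks += 1; // ...and guarantee another run
        }
    }
}

fn main() {
    let mut sched = Sched { pending_callbacks: 1, remaining_work: 3 };
    while sched.pending_callbacks > 0 {
        sched.run_sched_once();
    }
    assert_eq!(sched.remaining_work, 0); // progress was never stranded
}
```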