After writing some benchmarks for ebml::reader::vuint_at() I noticed that LLVM doesn't seem to inline the from_be32 function even though it only does a call to the bswap32 intrinsic in the x86_64 case. Marking the functions with #[inline(always)] fixes that and seems to me a reasonable thing to do. I got the following measurements in my vuint_at() benchmarks:
- Before
test ebml::bench::vuint_at_A_aligned ... bench: 1075 ns/iter (+/- 58)
test ebml::bench::vuint_at_A_unaligned ... bench: 1073 ns/iter (+/- 5)
test ebml::bench::vuint_at_D_aligned ... bench: 1150 ns/iter (+/- 5)
test ebml::bench::vuint_at_D_unaligned ... bench: 1151 ns/iter (+/- 6)
- Inline from_be32
test ebml::bench::vuint_at_A_aligned ... bench: 769 ns/iter (+/- 9)
test ebml::bench::vuint_at_A_unaligned ... bench: 795 ns/iter (+/- 6)
test ebml::bench::vuint_at_D_aligned ... bench: 758 ns/iter (+/- 8)
test ebml::bench::vuint_at_D_unaligned ... bench: 759 ns/iter (+/- 8)
- Using vuint_at_slow()
test ebml::bench::vuint_at_A_aligned ... bench: 646 ns/iter (+/- 7)
test ebml::bench::vuint_at_A_unaligned ... bench: 645 ns/iter (+/- 3)
test ebml::bench::vuint_at_D_aligned ... bench: 907 ns/iter (+/- 4)
test ebml::bench::vuint_at_D_unaligned ... bench: 1085 ns/iter (+/- 16)
As expected, inlining from_be32() gave a considerable speedup.
I also compared how the "slow" version fared against the optimized version and noticed that it's
actually a bit faster for small A class integers (using only two bytes) but slower for big D class integers (using four bytes).
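For illustration, the change amounts to putting an inline hint on the thin byte-order helper; a minimal sketch (using today's u32::from_be rather than the raw intrinsic) looks like this:

```rust
// Sketch only: a thin helper like the one discussed above. Without the
// inline hint, the call overhead can dwarf the single bswap it wraps.
#[inline(always)]
fn from_be32(x: u32) -> u32 {
    u32::from_be(x) // a single bswap on little-endian x86_64
}

fn main() {
    assert_eq!(from_be32(0x0102_0304u32.to_be()), 0x0102_0304);
}
```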
If there is a lot of data in thread-local storage some implementations
of pthreads (e.g. glibc) fail if you don't request a stack large enough
-- by adjusting for the minimum size we guarantee that our stacks are
always large enough. Issue #6233.
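A rough sketch of the adjustment being described (the constant and helper name here are illustrative placeholders, not the actual runtime code):

```rust
use std::cmp;

// Never request a stack smaller than the platform minimum; with lots of
// thread-local data, glibc's pthread_create can otherwise fail.
const ASSUMED_MIN_STACK: usize = 16 * 1024; // placeholder (e.g. PTHREAD_STACK_MIN on glibc)

fn adjusted_stack_size(requested: usize) -> usize {
    cmp::max(requested, ASSUMED_MIN_STACK)
}

fn main() {
    assert_eq!(adjusted_stack_size(4 * 1024), 16 * 1024);
    assert_eq!(adjusted_stack_size(1024 * 1024), 1024 * 1024);
}
```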
These methods are sorely needed on readers and writers, and I believe that the
encoding story should be solved with composition. This commit adds back the
missing functions for reading/writing strings on generic Readers/Writers.
Previously this was an rtabort!, indicating a runtime bug. Promote
this to a more intentional abort and print a (slightly) more
informative error message.
Can't test this since our test suite can't handle an abort exit.
This patchset adds intrinsics similar to the to_[be|le][16|32|64] intrinsics but for going in the reverse direction, e.g. from big/little endian to host endian. Implementation-wise they do exactly the same thing as the corresponding to_* functions, but I think it still makes sense to have them, since using the to_* functions in the reverse direction is not entirely intuitive.
The first patch adds the intrinsics, and the two following patches change instances of bswap* to use the [to|from]_* intrinsics instead.
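For reference, the direction these intrinsics add corresponds to today's stable wrappers (shown here instead of the raw intrinsics):

```rust
fn main() {
    let wire: u32 = 0xDEAD_BEEFu32.to_be(); // host -> big endian
    let host: u32 = u32::from_be(wire);     // big endian -> host (the new direction)
    assert_eq!(host, 0xDEAD_BEEF);
}
```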
For libgreen, bookkeeping should not be global but rather on a per-pool basis.
Inside libnative, it's known that there must be a global counter with a
mutex/cvar.
The benefit of taking this strategy is to remove this functionality from libstd
to allow fine-grained control of it through libnative/libgreen. Notably, helper
threads in libnative can manually decrement the global count so they aren't
counted in the global total of threads. Also, the shutdown process of *all* sched
pools is now dependent on the number of tasks in the pool being 0 rather than
this only being a hardcoded solution for the initial sched pool in libgreen.
This involved adding a Local::try_take() method on the Local trait in order for
the channel wakeup to work inside of libgreen. The channel send was happening
from a SchedTask when there was no Task available in TLS, and this now works
(remote wakeups are always possible, just a little slower).
I personally do not have huge amounts of experience in this area, so there's likely a thing or two wrong around the edges. I tried to just copy what libuv is doing as closely as possible with a few tweaks in a few places, but all of the `std::io::net::udp` tests are now run in both native and green settings so the published functionality is all being tested.
This PR adds `std::unsafe::intrinsics::{volatile_load,volatile_store}`, which map to LLVM's `load volatile` and `store volatile` operations, respectively.
This would fix #11172.
I have addressed several uncertainties with this PR in the line comments.
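A small illustration of the semantics, using the stable descendants std::ptr::read_volatile / write_volatile rather than the raw intrinsics added here:

```rust
fn main() {
    let mut flag: u32 = 0;
    let p: *mut u32 = &mut flag;
    unsafe {
        std::ptr::write_volatile(p, 1);            // LLVM `store volatile`
        assert_eq!(std::ptr::read_volatile(p), 1); // LLVM `load volatile`
    }
}
```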
The comments have more information as to why this is done, but the basic idea is
that finding an exported trait is actually a fairly difficult problem. The true
answer lies in whether a trait is ever referenced from another exported method,
and right now this kind of analysis doesn't exist, so the conservative answer of
"yes" is always returned to answer whether a trait is exported.
Closes #11224. Closes #11225.
Right now on linux, an empty executable with LTO still depends on librt because
of the clock_gettime function in rust_builtin.o, but this commit moves this
dependency into a rust function which is subject to elimination via LTO.
At the same time, this also drops libstd's dependency on librt on unices that
are not OSX because the library is only used by extra::time (and now the
dependency is listed in that module instead).
Of the 8 static mutexes that are currently in use by the compiler and its
libraries, 4 of them are currently used for one-time initialization. The
unfortunate side effect of using a static mutex is that the mutex is leaked.
This primitive should provide the basis for efficiently keeping track of
one-time initialization as well as ensuring that it does not leak the internal
mutex that is used.
I have chosen to put this in libstd because libstd is currently making use of a
static initialization mutex (rt::local_ptr), but I can also see a more refined
version of this type being suitable to initialize FFI bindings (such as
initializing LLVM and initializing winsock networking on windows). I also intend
on adding "helper threads" to libnative, and those will greatly benefit from a
simple "once" primitive rather than always reinventing the wheel by using
mutexes and bools.
I would much rather see this primitive built on a mutex that blocks green
threads appropriately, but that does not exist at this time, so it does not
belong outside of `std::unstable`.
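As a usage sketch (shown with today's std::sync::Once; the primitive added here lived under std::unstable and its exact API may differ):

```rust
use std::sync::Once;

static INIT: Once = Once::new();

fn init_bindings() {
    // Runs the closure exactly once, even if called from many threads,
    // without leaking an internal mutex.
    INIT.call_once(|| {
        println!("one-time initialization (e.g. LLVM or winsock setup)");
    });
}

fn main() {
    init_bindings();
    init_bindings(); // second call is a no-op
}
```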
The old `rtio-processes` run-pass test is now moved into libstd's `io::process` module, and all process and TCP tests are now run with `iotest!` (both a native and a green version are tested).
All TCP networking on windows is provided by `ws2_32` which is apparently very similar to unix networking (hurray!).
Move the tests into libstd, use the `iotest!` macro to test both native and uv
bindings, and use the cloexec trick to figure out when the child process fails
in exec.
This patch changes `result::collect` (and adds a new `option::collect`) from creating a `~[T]` to taking an `Iterator`. This makes the function much more flexible, and may replace the need for #10989.
This patch is a little more complicated than it needs to be because of #11084. Once that is fixed we can replace the `CollectIterator` with a `Scan` iterator.
It also fixes a test warning.
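The modern equivalent of this behaviour, as a sketch (not the code from this patch): an iterator of `Result`s collects into a `Result` of a container, stopping at the first error.

```rust
fn main() {
    let ok: Result<Vec<i32>, std::num::ParseIntError> =
        ["1", "2", "3"].iter().map(|s| s.parse::<i32>()).collect();
    assert_eq!(ok, Ok(vec![1, 2, 3]));

    let err: Result<Vec<i32>, std::num::ParseIntError> =
        ["1", "x"].iter().map(|s| s.parse::<i32>()).collect();
    assert!(err.is_err()); // stops at the first parse failure
}
```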
This commit uniforms the short title of modules provided by libstd,
in order to make their roles more explicit when glancing at the index.
Signed-off-by: Luca Bruno <lucab@debian.org>
* vec::raw::to_ptr is gone
* Pausible => Pausable
* Removing @
* Calling the main task "<main>"
* Removing unused imports
* Removing unused mut
* Bringing some libextra tests up to date
* Allowing compiletest to work at stage0
* Fixing the bootstrap-from-c rmake tests
* assert => rtassert in a few cases
* printing to stderr instead of stdout in fail!()
This test also had a race condition in using the cvar/lock, so I fixed that up
as well. The race originated from one half trying to destroy the lock when
another half was using it.
These functions are all unnecessary now, and they only have meaning in the M:N
context. Removing these functions uncovered a bug in the librustuv timer
bindings, but it was fairly easy to cover (and the test is already committed).
These cannot be completely removed just yet due to their usage in the WaitQueue
of extra::sync, and until the mutex in libextra is rewritten it will not be
possible to remove the deferred sends for channels.
This is a very real problem with cvars on normal systems, and channels will not
work at all if spurious wakeups are accepted. This problem is solved with a
synchronized flag (accessed under the cvar's lock) to see whether a signal()
actually happened or whether the wakeup was spurious.
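A minimal sketch of the pattern (using std::sync::Condvar/Mutex; the runtime's own cvar type differs): the waiter re-checks a flag under the lock, so a spurious wakeup simply loops back into wait().

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;

fn main() {
    let pair = Arc::new((Mutex::new(false), Condvar::new()));
    let pair2 = Arc::clone(&pair);

    thread::spawn(move || {
        let (lock, cvar) = &*pair2;
        *lock.lock().unwrap() = true; // record that signal() really happened
        cvar.notify_one();
    });

    let (lock, cvar) = &*pair;
    let mut signaled = lock.lock().unwrap();
    while !*signaled {
        // A wakeup without the flag set is spurious; just wait again.
        signaled = cvar.wait(signaled).unwrap();
    }
}
```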
There was a race in the code previously where schedulers would *immediately*
shut down after spawning the main task (because the global task count would
still be 0). This fixes the logic by having the sched pool task block receiving on
a port instead of spawning a task into the pool to receive on a port.
The modifications necessary were to have a "simple task" running by the time the
code is executing, but this is a simple enough thing to implement and I foresee
this being necessary to have implemented in the future anyway.
Note that this removes a number of run-pass tests which are exercising behavior
of the old runtime. This functionality no longer exists and is thoroughly tested
inside of libgreen and libnative. There isn't really the notion of "starting the
runtime" any more. The major notion now is "bootstrapping the initial task".
This allows creation of different sched pools with different io factories.
Namely, this will be used to test the basic I/O loop in the green crate. This
can also be used to override the global default.
This will prevent a deadlock where a task spins in a try_recv, since channel
communication routines are a clear location for M:N scheduling to happen.
The scheduler pool now has a much more simplified interface. There is now a
clear distinction between creating the pool and interacting with it. When a pool
is created, none of its schedulers are active; activity only occurs later once a
spawn is performed.
There are four operations that you can do on a pool:
1. Create a new pool. The only argument to this function is the configuration
for the scheduler pool. Currently the only configuration parameter is the
number of threads to initially spawn.
2. Spawn a task into this pool. This takes a procedure and task configuration
options and spawns a new task into the pool of schedulers.
3. Spawn a new scheduler into the pool. This will return a handle on which to
communicate with the scheduler in order to do something like a pinned task.
4. Shut down the scheduler pool. This will consume the scheduler pool, request
all of the schedulers to shut down, and then wait on all the scheduler
threads. Currently this will block the invoking OS thread, but I plan on
making 'Thread::join' not a thread-blocking call.
These operations can be used to encode all current usage of M:N schedulers, as
well as providing a simple interface through which a pool can be modified (a
stubbed-out sketch of this interface follows below). There is currently no way to
remove a scheduler from a pool of schedulers, as there's no way to guarantee that
a scheduler has exited. This may be added in the future, however (as necessary).
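A stubbed-out sketch of the shape of this interface (all names and signatures here are hypothetical stand-ins, not the real libgreen API):

```rust
struct PoolConfig { threads: usize }
struct SchedHandle;                   // used e.g. for pinned tasks
struct SchedPool { threads: usize }

impl SchedPool {
    // 1. Create a new pool from a configuration.
    fn new(cfg: PoolConfig) -> SchedPool { SchedPool { threads: cfg.threads } }
    // 2. Spawn a task into the pool (stub: just runs it inline).
    fn spawn(&mut self, task: Box<dyn FnOnce() + Send>) { task() }
    // 3. Spawn a new scheduler and get a handle to communicate with it (stub).
    fn spawn_sched(&mut self) -> SchedHandle { self.threads += 1; SchedHandle }
    // 4. Consume the pool, ask all schedulers to shut down, join their threads.
    fn shutdown(self) {}
}

fn main() {
    let mut pool = SchedPool::new(PoolConfig { threads: 4 });
    pool.spawn(Box::new(|| println!("hello from the pool")));
    let _handle = pool.spawn_sched();
    pool.shutdown();
}
```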
All tests except for the homing tests are now working again with the
librustuv/libgreen refactoring. The homing-related tests are currently commented
out and now placed in the rustuv::homing module.
I plan on refactoring scheduler pool spawning in order to enable more homing
tests in a future commit.
In the compiled version of local_ptr (that with #[thread_local]), the take()
function didn't zero out the previous pointer, allowing for multiple takes (with
fewer runtime assertions being tripped).
This extracts everything related to green scheduling from libstd and introduces
a new libgreen crate. This mostly involves deleting most of std::rt and moving
it to libgreen.
Along with the movement of code, this commit rearchitects many functions in the
scheduler in order to adapt to the fact that Local::take now *only* works on a
Task, not a scheduler. This mostly just involved threading the current green
task through in a few locations, but there were one or two spots where things
got hairy.
There are a few repercussions of this commit:
* tube/rc have been removed (the runtime implementation of rc)
* There is no longer a "single threaded" spawning mode for tasks. This is now
encompassed by 1:1 scheduling + communication. Convenience methods have been
introduced that are specific to libgreen to assist in the spawning of pools of
schedulers.
This commit introduces a new crate called "native" which will be the crate that
implements the 1:1 runtime of rust. This currently entails having an
implementation of std::rt::Runtime inside of libnative as well as moving all of
the native I/O implementations to libnative.
The current snag is that the start lang item must currently be defined in
libnative in order to start running, but this will change in the future.
Cool fact about this crate: there are no extra features that are enabled.
Note that this commit does not include any makefile support necessary for
building libnative, that's all coming in a later commit.
Like the librustuv refactoring, this refactors std::comm to sever all ties with
the scheduler. This means that the entire `comm::imp` module can be deleted in
favor of implementations outside of libstd.
This commit fixes the logging function to be safely implemented, as well as
forcibly requiring a task to be present to use logging macros. This is safely
implemented by transferring ownership of the logger from the task to the local
stack frame in order to perform the print. This means that if a logger does more
logging while logging, a new one will be initialized and then get overwritten
once the initial logging function returns.
Without a scheme such as this, it is possible to unsafely alias two loggers by
logging twice (unsafely borrows from the task twice).
Printing is an incredibly useful debugging utility, and it's not much help if
your debugging prints just trigger an obscure abort when you need them most. In
order to handle this case, forcibly fall back to a libc::write implementation of
printing whenever a local task is not available.
Note that this is *not* a 1:1 fallback. All 1:1 rust tasks will still have a
local Task that it can go through (and stdio will be created through the local
IO factory), this is only a fallback for "no context" rust code (such as that
setting up the context).
It is not the case that all programs will always be able to acquire an instance
of the LocalIo borrow, so this commit exposes this limitation by returning
Option<LocalIo> from LocalIo::borrow().
At the same time, a helper method LocalIo::maybe_raise() has been added in order
to encapsulate the functionality of raising on io_error if there is no local I/O
available.
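A hypothetical stub showing the pattern described (names mirror the text; the real implementation differs):

```rust
struct LocalIo; // stand-in for the real I/O factory handle

impl LocalIo {
    fn borrow() -> Option<LocalIo> {
        None // e.g. no local task context is available right now
    }

    fn maybe_raise<T>(f: impl FnOnce(&mut LocalIo) -> T) -> Option<T> {
        match LocalIo::borrow() {
            Some(mut io) => Some(f(&mut io)),
            None => {
                // The real code raises on the io_error condition here.
                eprintln!("no local I/O available");
                None
            }
        }
    }
}

fn main() {
    assert!(LocalIo::maybe_raise(|_io| 42).is_none());
}
```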
For now, this moves the following modules to std::sync
* UnsafeArc (also removed unwrap method)
* mpsc_queue
* spsc_queue
* atomics
* mpmc_bounded_queue
* deque
We may want to remove some of the queues, but for now this moves things out of
std::rt into std::sync
This module contains many M:N specific concepts. This will no longer be
available with libgreen, and most functions aren't really that necessary today
anyway. New testing primitives will be introduced as they become available for
1:1 and M:N.
A new io::test module is introduced with the new ip4/ip6 address helpers to
continue usage in io tests.
This trait is used to abstract the differences between 1:1 and M:N scheduling
and is the sole dispatch point for the differences between these two scheduling
modes.
This, and the following series of commits, is not intended to compile. Only
after the entire transition is complete are programs expected to compile.
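A hypothetical sketch of what such a dispatch trait can look like (method names are illustrative, not the real trait's surface):

```rust
trait Runtime {
    // Cooperatively yield (a reschedule under M:N, a thread yield under 1:1).
    fn yield_now(&mut self);
    // Spawn a sibling task under the same scheduling mode.
    fn spawn_sibling(&mut self, f: Box<dyn FnOnce() + Send>);
}

struct NativeRuntime; // 1:1 mode: every task is an OS thread

impl Runtime for NativeRuntime {
    fn yield_now(&mut self) { std::thread::yield_now(); }
    fn spawn_sibling(&mut self, f: Box<dyn FnOnce() + Send>) {
        // Detach the handle; a real runtime would track its tasks.
        let _ = std::thread::spawn(f);
    }
}

fn main() {
    let mut rt = NativeRuntime;
    rt.spawn_sibling(Box::new(|| println!("sibling task")));
    rt.yield_now();
}
```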
Could prevent callers from catching the situation and lead to e.g. early
iterator terminations (cf. `Reader::read_byte`), since `None` is only to
be returned on EOF.
… instead of failing.
Make them default methods on the trait, and also make .to_ascii()
a default method while we’re at it.
This uses quite a bit of unsafe code for speed and failure safety, and allocates `2*n` temporary storage.
[Performance](https://gist.github.com/huonw/5547f2478380288a28c2):
| n | new | priority_queue | quick3 |
|-------:|---------:|---------------:|---------:|
| 5 | 200 | 155 | 106 |
| 100 | 6490 | 8750 | 5810 |
| 10000 | 1300000 | 1790000 | 1060000 |
| 100000 | 16700000 | 23600000 | 12700000 |
| sorted | 520000 | 1380000 | 53900000 |
| trend | 1310000 | 1690000 | 1100000 |
(The times are in nanoseconds, having subtracted the set-up time (i.e. the `just_generate` bench target).)
I imagine that there is still significant room for improvement, particularly because both priority_queue and quick3 are doing a static call via `Ord` or `TotalOrd` for the comparisons, while this is using a (boxed) closure.
Also, this code does not `clone`, unlike `quick_sort3`; and is stable, unlike both of the others.
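Today's descendant of this approach is the stable slice sort, which likewise allocates a temporary buffer and takes a comparison closure (a usage sketch, not this PR's code):

```rust
fn main() {
    let mut v = vec![5, 1, 4, 2, 3];
    v.sort_by(|a, b| a.cmp(b)); // stable, merge-based, allocates scratch space
    assert_eq!(v, [1, 2, 3, 4, 5]);
}
```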
Update the next() method to just return self.v in the case that we've reached
the last element that the iterator will yield. This produces the same behavior
as before, but without the cost of updating the field.
Update the size_hint() method to return a better hint now that #9629 is fixed.
very small runs.
This uses a lot of unsafe code for speed; otherwise we would have to sort by
sorting lists of indices and then do a pile of swaps to put everything in the
correct place.
Fixes #9819.
Also, add `.remove_opt` and replace `.unshift` with `.remove(0)`. The
code size reduction seems to compensate for not having the optimised
special cases.
This makes the included benchmark more than 3 times faster.
I haven't landed this fix upstream just yet, but it's opened as
joyent/libuv#1048. For now, I've locally merged it into my fork, and I've
upgraded our repo to point to the new revision.
Closes #11027
This makes the included benchmark more than 3 times faster. Also,
`.unshift(x)` is now faster, since `.insert(0, x)` can reuse the
allocation if necessary.
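For reference, the modern equivalents of these operations on `Vec<T>`:

```rust
fn main() {
    let mut v = vec![2, 3];
    v.insert(0, 1);             // the old `.unshift(1)`
    assert_eq!(v, [1, 2, 3]);
    assert_eq!(v.remove(0), 1); // the old `.shift()`
    assert_eq!(v, [2, 3]);
}
```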
For `str.as_mut_buf`, un-closure-ification is achieved by outright removal (see commit message). The others are replaced by `.as_ptr`, `.as_mut_ptr` and `.len`
`.as_mut_buf` was used exactly once, in `.push_char` which could be
written in a simpler way, using the `&mut ~[u8]` that it already
retrieved. In the rare situation when someone really needs
`.as_mut_buf`-like functionality (getting a `*mut u8`), they can go via
`str::raw::as_owned_vec`.
This code in resolve accidentally forced all types with an impl to become
public. This fixes it by defaulting to inheriting the privacy of what was
previously there and then becoming `true` if nothing else exists.
Closes #10545
### Remove {As,Into,To}{Option,Either,Result} traits.
Expanded, that is:
- `AsOption`
- `IntoOption`
- `ToOption`
- `AsEither`
- `IntoEither`
- `ToEither`
- `AsResult`
- `IntoResult`
- `ToResult`
These were defined for each other but never *used* anywhere. They are
all trivial and so removal will have negligible effect upon anyone.
`Either` has fallen out of favour (and its implementation of these
traits of dubious semantics), `Option<T>` → `Result<T, ()>` was never
really useful and `Result<T, E>` → `Option<T>` should now be done with
`Result.ok()` (mirrored with `Result.err()` for even more usefulness).
In summary, there's really no point in any of these remaining.
### Rename To{Str,Bytes}Consume traits to Into*.
That is:
- `ToStrConsume` → `IntoStr`;
- `ToBytesConsume` → `IntoBytes`.
The removal of the aliasing &mut[] and &[] from `shift_opt` also comes with its simplification.
The above also allows the use of `copy_nonoverlapping_memory` in `[].copy_memory` (I did an audit of each use of `.copy_memory` and `std::vec::bytes::copy_memory`, and I believe none of them are called with arguments that can ever alias). This change requires that `unsafe` code using `copy_memory` **must** respect the aliasing rules of `&mut[]`.
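As an illustration of the non-overlapping requirement with today's APIs (not this PR's code): `ptr::copy_nonoverlapping` demands disjoint regions, which distinct `&mut [u8]` / `&[u8]` borrows already guarantee.

```rust
fn main() {
    let src = [1u8, 2, 3, 4];
    let mut dst = [0u8; 4];
    // Safe wrapper over a non-overlapping copy; &mut dst and &src cannot alias.
    dst.copy_from_slice(&src);
    assert_eq!(dst, src);

    unsafe {
        // The raw form: the caller must guarantee the regions do not overlap.
        std::ptr::copy_nonoverlapping(src.as_ptr(), dst.as_mut_ptr(), src.len());
    }
}
```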
This pull request completely rewrites std::comm and all associated users. Some major bullet points
* Everything now works natively
* oneshots have been removed
* shared ports have been removed
* try_recv no longer blocks (recv_opt blocks)
* constructors are now Chan::new and SharedChan::new
* failure is propagated on send
* stream channels are 3x faster
I have acquired the following measurements on this patch. I compared against Go, but remember that Go's channels are fundamentally different from ours in that sends are blocking by default. This means that it's not really a totally fair comparison, but it's good to see ballpark numbers anyway
```
          oneshot   stream    shared1
std       2.111     3.073     1.730
my        6.639     1.037     1.238
native    5.748     1.017     1.250
go8       1.774     3.575     2.948
go8-inf   slow      0.837     1.376
go8-128   4.832     1.430     1.504
go1       1.528     1.439     1.251
go2       1.753     3.845     3.166
```
I had three benchmarks:
* oneshot - N times, create a "oneshot channel", send on it, then receive on it (no task spawning)
* stream - N times, send from one task to another task, wait for both to complete
* shared1 - create N threads, each of which sends M times, and a port receives N*M times.
The rows are as follows:
* `std` - the current libstd implementation (before this pull request)
* `my` - this pull request's implementation (in M:N mode)
* `native` - this pull request's implementation (in 1:1 mode)
* `goN` - go's implementation with GOMAXPROCS=N. The only relevant value is 8 (I had 8 cores on this machine)
* `goN-X` - go's implementation where the channels in question were created with buffers of size `X` to behave more similarly to rust's channels.
* Streams are now ~3x faster than before (fewer allocations and more optimized)
* Based on a single-producer single-consumer lock-free queue that doesn't
always have to allocate on every send.
* Blocking via mutexes/cond vars outside the runtime
* Streams work in/out of the runtime seamlessly
* Select now works in/out of the runtime seamlessly
* Streams will now fail!() on send() if the other end has hung up
* try_send() will not fail
* PortOne/ChanOne removed
* SharedPort removed
* MegaPipe removed
* Generic select removed (only one kind of port now)
* API redesign (usage sketched below)
* try_recv == never block
* recv_opt == block, don't fail
* iter() == Iterator<T> for Port<T>
* removed peek
* Type::new
* Removed rt::comm
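A usage sketch of the semantics listed above, written against the modern descendant of this API (std::sync::mpsc); the constructor and type names in this PR (Chan::new, Port, recv_opt) differ:

```rust
use std::sync::mpsc::{channel, TryRecvError};
use std::thread;

fn main() {
    let (tx, rx) = channel();

    thread::spawn(move || {
        // In the PR's API this is `chan.send(..)`, which fail!()s if the
        // receiving end has hung up; the modern API returns an Err instead.
        tx.send(42).unwrap();
    });

    // "try_recv == never block": poll until a value or a disconnect shows up.
    loop {
        match rx.try_recv() {
            Ok(n) => { assert_eq!(n, 42); break; }
            Err(TryRecvError::Empty) => thread::yield_now(),
            Err(TryRecvError::Disconnected) => break,
        }
    }
}
```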
docs for copy_memory.
&mut [u8] and &[u8] really shouldn't be overlapping at all (part of the
uniqueness/aliasing guarantee of &mut), so no point in encouraging it.
Understand 'pkgid' in stage0. As a bonus, the snapshot now contains no metadata
(now that those changes have landed), and the snapshot download is half as large
as it used to be!
The problem was that std::run::Process::new() was unwrap()ing the result
of std::io::process::Process::new(), which returns None in the case
where the io_error condition is raised to signal failure to start the
process.
Have std::run::Process::new() similarly return an Option<run::Process>
to reflect the fact that a subprocess might have failed to start. Update
utility functions run::process_status() and run::process_output() to
return Option<ProcessExit> and Option<ProcessOutput>, respectively.
Various parts of librustc and librustpkg needed to be updated to reflect
these API changes.
closes #10754
This adds a bunch of useful Reader and Writer implementations. I'm not a
huge fan of the name `util` but I can't think of a better name and I
don't want to make `std::io` any longer than it already is.
- `Buffer.lines()` returns `LineIterator`, which yields lines using
`.read_line()` (see the sketch after this list).
- `Reader.bytes()` now takes `&mut self` instead of `self`.
- `Reader.read_until()` swallows `EndOfFile`. This also affects
`.read_line()`.
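A small sketch of the lines() behaviour using today's BufRead (the trait descended from Buffer):

```rust
use std::io::{BufRead, Cursor};

fn main() {
    let cursor = Cursor::new("a\nb\nc\n");
    // Each item comes from an internal read_line() call.
    let lines: Vec<String> = cursor.lines().map(|l| l.unwrap()).collect();
    assert_eq!(lines, ["a", "b", "c"]);
}
```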
This replaces the link meta attributes with a pkgid attribute and uses a hash
of this as the crate hash. This makes the crate hash computable by things
other than the Rust compiler. It also switches the hash function to SHA1, since
that is much more likely to be available in shell, Python, etc. than SipHash.
Fixes #10188, #8523.
This moves `std::rand::distributions::{Normal, StandardNormal}` to `...::distributions::normal`, reexporting `Normal` from `distributions` (and similarly for `Exp` and `Exp1`), and adds:
- Log-normal
- Chi-squared
- F
- Student T
all of which are implemented in C++11's random library. Tests in 0424b8aded. Note that these are approximately half documentation & half implementation (of which a significant portion is boilerplate `}`'s and so on).
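A usage sketch with the present-day external successors of std::rand (assuming the `rand` and `rand_distr` crates are added as dependencies):

```rust
use rand::thread_rng;
use rand_distr::{Distribution, LogNormal, Normal};

fn main() {
    let mut rng = thread_rng();

    let normal = Normal::new(0.0, 1.0).unwrap();      // mean 0, std dev 1
    let log_normal = LogNormal::new(0.0, 1.0).unwrap();

    println!("normal sample:     {}", normal.sample(&mut rng));
    println!("log-normal sample: {}", log_normal.sample(&mut rng));
}
```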