The truncation needs to be done in the console logger in order
to catch all the logging output, and because truncation only matters
when outputting to the console.
When strings lose their trailing null, this pattern will become dangerous:
let foo = "bar";
let foo_ptr: *u8 = &foo[0];
Instead we should use c_strs to handle this correctly.
Every time run_sched_once performs a 'scheduling action' it needs to guarantee
that it runs at least one more time, so enqueue another run_sched_once callback.
The primary reason it needs to do this is because not all async callbacks
are guaranteed to run, it's only guaranteed that *a* callback will run after
enqueing one - some may get dropped.
At the moment this means we wastefully create lots of callbacks to ensure that
there will *definitely* be a callback queued up to continue running the scheduler.
The logic really needs to be tightened up here.
multicast functions now take IpAddr (without port), because they dont't
need port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
P. S. Are there any backward compatibility concerns? What is std::rt module, is it a part of public API?
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
An improvement upon the previous pull #8133
Previously it would call:
f(sf1.cmp(&of1), f(sf2.cmp(&of2), ...))
(where s/of1 = 'self/other field 1', and f was
std::cmp::lexical_ordering)
This meant that every .cmp subcall got evaluated when calling a derived
TotalOrd.cmp.
This corrects this to use
let test = sf1.cmp(&of1);
if test == Equal {
let test = sf2.cmp(&of2);
if test == Equal {
// ...
} else {
test
}
} else {
test
}
This gives a lexical ordering by short-circuiting on the first comparison
that is not Equal.
...y/catch
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
I am unsatisfied with using /dev/null as an invalid dynamic library. It is not cross platform.
The `new` constructor uses the task-local RNG to retrieve seeds for the
two key values, which requires the runtime. Exposing a constructor that
takes the keys directly allows HashMaps to be used in programs that wish
to avoid the runtime.
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like
fn foo<S: Str>(strs: &[S])
Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
I don't have any concrete uses for this yet, since the one conversion I've done to `&[S]` so far (see PR #8203) didn't actually need owned strings. But having this here may make using `Str` more attractive.
It also may be worth adding an `into_managed()` function, but that one is less obviously useful than `into_owned()`.
OS X defaults the ulimit for open files to 256 for programs launched
from the Terminal (GUI apps get a higher default). Unfortunately this is
too low for the rt tests, which deliberately overcommit and create a lot
of threads (which means a lot of schedulers, and each scheduler needs at
least 2 fds).
By calling sysctl() and setrlimit() we can bump the fd limit up to the
maximum allowed (on stock OS X it's 10240).
Fixes#7772.
multicast functions now take IpAddr (without port), because they dont't
need port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
In the first commit it is obvious why some of the barriers can be changed to ```Relaxed```, but it is not as obvious for the once I changed in ```kill.rs```. The rationale for those is documented as part of the documenting commit.
Also the last commit is a temporary hack to prevent kill signals from being received in taskgroup cleanup code, which could be fixed in a more principled way once the old runtime is gone.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like
fn foo<S: Str>(strs: &[S])
Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
old design the TLS held the scheduler struct, and the scheduler struct
held the active task. This posed all sorts of weird problems due to
how we wanted to use the contents of TLS. The cleaner approach is to
leave the active task in TLS and have the task hold the scheduler. To
make this work out the scheduler has to run inside a regular task, and
then once that is the case the context switching code is massively
simplified, as instead of three possible paths there is only one. The
logical flow is also easier to follow, as the scheduler struct acts
somewhat like a "token" indicating what is active.
These changes also necessitated changing a large number of runtime
tests, and rewriting most of the runtime testing helpers.
Polish level is "low", as I will very soon start on more scheduler
changes that will require wiping the polish off. That being said there
should be sufficient comments around anything complex to make this
entirely respectable as a standalone commit.
Change the former repetition::
for 5.times { }
to::
do 5.times { }
.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
Change all users of old-style for with internal iterators to using
`do`-loops.
The code in stackwalk.rs does not actually implement the
looping protocol (no break on return false).
The code in gc.rs does not use loop breaks, nor does any code using it.
We remove the capacity to break from the loops in std::gc and implement
the walks using `do { .. }` expressions.
No behavior change.
.intersection(), .union() etc methods in trait std::container::Set use
internal iters. Remove these methods from the trait.
I reported issue #8154 for the reinstatement of iterator-based set algebra
methods to the Set trait.
For bitv and treemap, that lack Iterator implementations of set
operations, preserve them as methods directly on the types themselves.
For HashSet, these methods are replaced by the present .union_iter()
etc.
This removes a bunch of options from the task builder interface that are irrelevant to the new scheduler and were generally unused anyway. It also bumps the stack size of new scheduler tasks so that there's enough room to run rustc and changes the interface to `Thread` to not implicitly join threads on destruction, but instead require an explicit, and mandatory, call to `join`.
Main logic in ```Implement select() for new runtime pipes.```. The guts of the ```PortOne::try_recv()``` implementation are now split up across several functions, ```optimistic_check```, ```block_on```, and ```recv_ready```.
There is one weird FIXME I left open here, in the "implement select" commit -- an assertion I couldn't get to work in the receive path, on an invariant that for some reason doesn't hold with ```SharedPort```. Still investigating this.
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.
An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.
Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.
Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.
The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.
Renamed bytes_iter to byte_iter to match other iterators
Refactored str Iterators to use DoubleEnded Iterators and typedefs instead of wrapper structs
Reordered the Iterator section
Whitespace fixup
Moved clunky `each_split_within` function to the one place in the tree where it's actually needed
Replaced all block doccomments in str with line doccomments
Implement RAI where possible for iterator adaptors such as Map,
Enumerate, Skip, Take, Zip, Cycle (all of the requiring that the adapted
iterator also implements RAI).
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
Implement Clone and DeepClone for functions with 0 to 8 arguments. `extern fn()` is implicitly copyable so it's simple, except there is no way to implement it generically over #n function arguments.
Allows deriving of Clone on structs containing `extern "Rust" fn`.
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
Adds a fence operation to close#8061
Also adds static initializers to for atomic types. Since the fields are private, you aren't able to have `static mut` variables that are an atomic type. Each atomic type's initializer starts at a 0-value (so unset for `AtomicFlag` and false for `AtomicBool`).
Good evening,
This is a superset of @MaikKlein's #7969 commit, that I've fixed up to compile. I had a couple commits I wanted to do on top of @MaikKlein's work that I didn't want to bitrot.
To be more specific:
`UPPERCASETYPE` was changed to `UppercaseType`
`type_new` was changed to `Type::new`
`type_function(value)` was changed to `value.method()`
With the recent fixes to method resolution, we can now remove the
dummy type parameters used as crutches in the iterator module.
For example, the zip adaptor type is just ZipIterator<T, U> now.
Implements various missing tcp & udp methods.. Also fixes handling ipv4-mapped/compatible ipv6 addresses and addresses the XXX on `status_to_maybe_uv_error`.
r? @brson
This moves the raw struct layout of closures, vectors, boxes, and strings into a
new `unstable::raw` module. This is meant to be a centralized location to find
information for the layout of these values.
As safe method, `unwrap`, is provided to convert a rust value to its raw
representation. Unsafe methods to convert back are not provided because they are
rarely used and too numerous to write an implementation for each (not much of a
common pattern).
This is progress on #6790. I tried to get a nice interface for a trait to implement in the raw module, but I was unable to come up with one. The hard part is that there are so many different directions to go from one way to another that it's difficult to find a pattern to follow to implement a trait with. Someone else might have some better luck though.
This moves the raw struct layout of closures, vectors, boxes, and strings into a
new `unstable::raw` module. This is meant to be a centralized location to find
information for the layout of these values.
As safe method, `repr`, is provided to convert a rust value to its raw
representation. Unsafe methods to convert back are not provided because they are
rarely used and too numerous to write an implementation for each (not much of a
common pattern).
First, clean up the uses of "None" and "Some" to always use consistent title case matching the variant names.
Add .chain_mut_ref() which is a missing method. A use case example for this method is extraction of an optional value from an Option\<Container\> value.
This is a cleanup pull request that does:
* removes `os::as_c_charp`
* moves `str::as_buf` and `str::as_c_str` into `StrSlice`
* converts some functions from `StrSlice::as_buf` to `StrSlice::as_c_str`
* renames `StrSlice::as_buf` to `StrSlice::as_imm_buf` (and adds `StrSlice::as_mut_buf` to match `vec.rs`.
* renames `UniqueStr::as_bytes_with_null_consume` to `UniqueStr::to_bytes`
* and other misc cleanups and minor optimizations
My first bit of newsched IO work. Pretty simple and limited in scope.
the RtioTimer trait only has a `sleep(msecs: u64)` method, for now. Taking requests on what else ought to be here.
oh yeah: this resolves#6435
This removes all the code from libextra that depends on libuv. After that it removes three runtime features that existed to support the global uv loop: weak tasks, runtime-global variables, and at_exit handlers.
The networking code doesn't have many users besides servo, so shouldn't have much fallout. The timer code though is useful and will probably break out-of-tree code until the new scheduler lands, but I expect that to be soon.
It also incidentally moves `os::change_dir_locked` to `std::unstable`. This is a function used by test cases to avoid cwd races and in my opinion shouldn't be public (#7870).
Closes#7251 and #7870