* LLVM now has a C interface to LLVMBuildAtomicRMW
* The exception handling support for the JIT seems to have been dropped
* Various interfaces have been added or headers have changed
rvalues aren't going to be used anywhere but as the argument, so
there's no point in copying them. LLVM used to eliminate the copy
later, but why bother emitting it in the first place?
multicast functions now take IpAddr (without port), because they dont't
need port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
P. S. Are there any backward compatibility concerns? What is std::rt module, is it a part of public API?
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
An improvement upon the previous pull #8133
I suspect that this is a race between process exit and the termination of
worker threads used by libuv (if I sleep before exit it doesn't leak). This
isn't going to cause any real problems but should probably be fixed at
some point.
r? @pcwalton
cc #8253
Previously it would call:
f(sf1.cmp(&of1), f(sf2.cmp(&of2), ...))
(where s/of1 = 'self/other field 1', and f was
std::cmp::lexical_ordering)
This meant that every .cmp subcall got evaluated when calling a derived
TotalOrd.cmp.
This corrects this to use
let test = sf1.cmp(&of1);
if test == Equal {
let test = sf2.cmp(&of2);
if test == Equal {
// ...
} else {
test
}
} else {
test
}
This gives a lexical ordering by short-circuiting on the first comparison
that is not Equal.
...y/catch
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
I am unsatisfied with using /dev/null as an invalid dynamic library. It is not cross platform.
rvalues aren't going to be used anywhere but as the argument, so
there's no point in copying them. LLVM used to eliminate the copy
later, but why bother emitting it in the first place?
The `new` constructor uses the task-local RNG to retrieve seeds for the
two key values, which requires the runtime. Exposing a constructor that
takes the keys directly allows HashMaps to be used in programs that wish
to avoid the runtime.
The `new` constructor uses the task-local RNG to retrieve seeds for the
two key values, which requires the runtime. Exposing a constructor that
takes the keys directly allows HashMaps to be used in programs that wish
to avoid the runtime.
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like
fn foo<S: Str>(strs: &[S])
Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
I don't have any concrete uses for this yet, since the one conversion I've done to `&[S]` so far (see PR #8203) didn't actually need owned strings. But having this here may make using `Str` more attractive.
It also may be worth adding an `into_managed()` function, but that one is less obviously useful than `into_owned()`.
RWArc had a clone() method, but it was part of impl RWArc instead of
an implementation of Clone.
Stick with the explicit implementation instead of deriving Clone so we
can have a docstring.
Fixes#8052.
I suspect that this is a race between process exit and the termination of
worker threads used by libuv (if I sleep before exit it doesn't leak). This
isn't going to cause any real problems but should probably be fixed at
some point.
This is preparation for removing `@fn`.
This does *not* use default methods yet, because I don't know
whether they work. If they do, a forthcoming PR will use them.
This also changes the precedence of `as`.
Sigil highlighting isn't perfect (especially how it handles ``&``) but
after having used it for a week I feel it to be considerably nicer than
nothing. As usual, if you don't like it, you can turn it off easily by
overriding the default highlighting.
Generics are not handled specially; this means that for something like
``S<T>``, the ``<`` and ``>`` are highlighted as operators. For myself,
I like this, and there is no way to make it properly context aware
without expanding the syntax matching enormously.
Also, special characters are highlighted properly in strings/chars, e.g.
``"\x00"`` or ``'\Ufedcba98'`` appropriately.
OS X defaults the ulimit for open files to 256 for programs launched
from the Terminal (GUI apps get a higher default). Unfortunately this is
too low for the rt tests, which deliberately overcommit and create a lot
of threads (which means a lot of schedulers, and each scheduler needs at
least 2 fds).
By calling sysctl() and setrlimit() we can bump the fd limit up to the
maximum allowed (on stock OS X it's 10240).
Fixes#7772.
Added functions to cryptoutil.rs that perform an addition after shifting
the 2nd parameter by a specified constant. These function fail!() if integer
overflow will result. Updated the Sha2 implementation to use these functions.
Create a helper function in cryptoutil.rs which feeds 1,000,000 'a's into
a Digest with varying input sizes and then checks the result. This is
essentially the same as one of Sha1's existing tests, so, that test was
re-implemented using this method. New tests were added using this method for
Sha512 and Sha256.
The Sha2 compression functions were re-written to execute the message
scheduling calculations in the same loop as the rest of the compression
function. The compiler is able to generate much better code. Additionally,
innermost part of the compression functions were turned into macros to
reduce code duplicate and to make the functions more concise.
There are 2 main pieces of functionality in cryptoutil.rs:
* A set of unsafe function for efficiently reading and writing u32 and u64
values. All of these functions are fairly easy to audit to confirm that
they do what they are supposed to.
* A FixedBuffer struct. This struct keeps track of input data until there
is enough of it to execute the a function on it which expects a fixed
block of data.
The Sha2 module was rewritten to take advantage of the new functions in
cryptoutil as well as FixedBuffer. The result is that the duplicate code
for maintaining a buffer of input data is removed from the Sha512 and
Sha256 implementation. Additionally, the FixedBuffer code is much more
efficient than the previous code was.
The result_X() methods just calculate an output of a fixed size. They don't
really have much to do with running the actually hash algorithm until the very
last step - the output. It makes much more sense to put all this logic into
the Digest impls for each specific variation on the hash function.
The code was arranged so that the core Sha2 code came first, and then
all of the various implementation of Digest followed along later. The
problem is that the Sha512 compression function code is far away from
the Sha512 Digest implementation, so, if you are trying to read over
the code, you need to scroll all around the file for no good reason. The
code was rearranged so that all of the Sha512 code is in one place and
all of the Sha256 code is in another and so that all impls for a struct
are near the definition of that struct.
multicast functions now take IpAddr (without port), because they dont't
need port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
In the first commit it is obvious why some of the barriers can be changed to ```Relaxed```, but it is not as obvious for the once I changed in ```kill.rs```. The rationale for those is documented as part of the documenting commit.
Also the last commit is a temporary hack to prevent kill signals from being received in taskgroup cleanup code, which could be fixed in a more principled way once the old runtime is gone.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
r? @graydon
Closes#8179
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like
fn foo<S: Str>(strs: &[S])
Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
old design the TLS held the scheduler struct, and the scheduler struct
held the active task. This posed all sorts of weird problems due to
how we wanted to use the contents of TLS. The cleaner approach is to
leave the active task in TLS and have the task hold the scheduler. To
make this work out the scheduler has to run inside a regular task, and
then once that is the case the context switching code is massively
simplified, as instead of three possible paths there is only one. The
logical flow is also easier to follow, as the scheduler struct acts
somewhat like a "token" indicating what is active.
These changes also necessitated changing a large number of runtime
tests, and rewriting most of the runtime testing helpers.
Polish level is "low", as I will very soon start on more scheduler
changes that will require wiping the polish off. That being said there
should be sufficient comments around anything complex to make this
entirely respectable as a standalone commit.
The pipes compiler produced data types that encoded efficient and safe
bounded message passing protocols between two endpoints. It was also
capable of producing unbounded protocols.
It was useful research but was arguably done before its proper time.
I am removing it for the following reasons:
* In practice we used it only for producing the `oneshot` protcol and
the unbounded `stream` protocol and all communication in Rust use those.
* The interface between the proto! macro and the standard library
has a large surface area and was difficult to maintain through
language and library changes.
* It is now written in an old dialect of Rust and generates code
which would likely be considered non-idiomatic.
* Both the compiler and the runtime are difficult to understand,
and likewise the relationship between the generated code and
the library is hard to understand. Debugging is difficult.
* The new scheduler implements `stream` and `oneshot` by hand
in a way that will be significantly easier to maintain.
This shouldn't be taken as an indication that 'channel protocols'
for Rust are not worth pursuing again in the future.
Concerned parties may include: @graydon, @pcwalton, @eholk, @bblum
The most likely candidates for closing are #7666, #3018, #3020, #7021, #7667, #7303, #3658, #3295.
The pipes compiler produced data types that encoded efficient and safe
bounded message passing protocols between two endpoints. It was also
capable of producing unbounded protocols.
It was useful research but was arguably done before its proper time.
I am removing it for the following reasons:
* In practice we used it only for producing the `oneshot` and `stream`
unbounded protocols and all communication in Rust use those.
* The interface between the proto! macro and the standard library
has a large surface area and was difficult to maintain through
language and library changes.
* It is now written in an old dialect of Rust and generates code
which would likely be considered non-idiomatic.
* Both the compiler and the runtime are difficult to understand,
and likewise the relationship between the generated code and
the library is hard to understand. Debugging is difficult.
* The new scheduler implements `stream` and `oneshot` by hand
in a way that will be significantly easier to maintain.
This shouldn't be taken as an indication that 'channel protocols'
for Rust are not worth pursuing again in the future.
Change the former repetition::
for 5.times { }
to::
do 5.times { }
.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
Change all users of old-style for with internal iterators to using
`do`-loops.
The code in stackwalk.rs does not actually implement the
looping protocol (no break on return false).
The code in gc.rs does not use loop breaks, nor does any code using it.
We remove the capacity to break from the loops in std::gc and implement
the walks using `do { .. }` expressions.
No behavior change.
.intersection(), .union() etc methods in trait std::container::Set use
internal iters. Remove these methods from the trait.
I reported issue #8154 for the reinstatement of iterator-based set algebra
methods to the Set trait.
For bitv and treemap, that lack Iterator implementations of set
operations, preserve them as methods directly on the types themselves.
For HashSet, these methods are replaced by the present .union_iter()
etc.
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
Assertions without a message get a generated message that consists of a
prefix plus the stringified expression that is being asserted. That
prefix is currently a unique string, while a static string would be
sufficient and needs less code.
This is a preliminary implementation of `for ... in ... { ...}` using a transitionary keyword `foreach`. Codesize seems to be a little bit down (10% or less non-opt) and otherwise it seems quite trivial to rewrite lambda-based loops to use it. Once we've rewritten the codebase away from lambda-based `for` we can retarget that word at the same production, snapshot, rewrite the keywords in one go, and expire `foreach`.
Feedback welcome. It's a desugaring-based approach which is arguably something we should have been doing for other constructs before. I apologize both for the laziness associated with doing it this way and with any sense that I'm bending rules I put in place previously concerning "never doing desugarings". I put the expansion in `expand.rs` and would be amenable to the argument that the code there needs better factoring / more helpers / to move to a submodule or helper function. It does seem to work at this point, though, and I gather we'd like to get the shift done relatively quickly.
This removes a bunch of options from the task builder interface that are irrelevant to the new scheduler and were generally unused anyway. It also bumps the stack size of new scheduler tasks so that there's enough room to run rustc and changes the interface to `Thread` to not implicitly join threads on destruction, but instead require an explicit, and mandatory, call to `join`.
Assertions without a message get a generated message that consists of a
prefix plus the stringified expression that is being asserted. That
prefix is currently a unique string, while a static string would be
sufficient and needs less code.
Main logic in ```Implement select() for new runtime pipes.```. The guts of the ```PortOne::try_recv()``` implementation are now split up across several functions, ```optimistic_check```, ```block_on```, and ```recv_ready```.
There is one weird FIXME I left open here, in the "implement select" commit -- an assertion I couldn't get to work in the receive path, on an invariant that for some reason doesn't hold with ```SharedPort```. Still investigating this.
Fix is_utf8 and UTF-8 char width functions to deny non-canonical 'overlong encodings' in UTF-8.
We address the function is_utf8 to make it more strict and correct, but no changes are made to the handling of invalid UTF-8.
Fixes issue #3787
An 'overlong encoding' is a codepoint encoded non-minimally using the
utf-8 format. Denying these enforce each codepoint to have only one
valid representation in utf-8.
An example is byte sequence 0xE0 0x80 0x80 which could be interpreted as
U+0, but it's an overlong encoding since the canonical form is just
0x00.
Another example is 0xE0 0x80 0xAF which was previously accepted and is
an overlong encoding of the solidus "/". Directory traversal characters
like / and . form the most compelling argument for why this commit is
security critical.
Factor out common UTF-8 decoding expressions as macros. This commit will
partly duplicate UTF-8 decoding, so it is now present in both
fn is_utf8() and .char_range_at(); the latter using an assumption of
a valid str.
Bytes 0xC0, 0xC1 can only be used to start 2-byte codepoint encodings,
that are 'overlong encodings' of codepoints below 128.
The reference given in a comment -- https://tools.ietf.org/html/rfc3629
-- does in fact already exclude these bytes, so no additional comment
should be needed in the code.
Renamed bytes_iter to byte_iter to match other iterators
Refactored str Iterators to use DoubleEnded Iterators and typedefs instead of wrapper structs
Reordered the Iterator section
Whitespace fixup
Moved clunky `each_split_within` function to the one place in the tree where it's actually needed
Replaced all block doccomments in str with line doccomments
Contiunation of naming cleanup in `libsyntax::ast`:
```rust
ast::node_id => ast::NodeId
ast::local_crate => ast::LOCAL_CRATE
ast::crate_node_id => ast::CRATE_NODE_ID
ast::blk_check_mode => ast::BlockCheckMode
ast::ty_field => ast::TypeField
ast::ty_method => ast::TypeMethod
```
Also moved span field directly into `TypeField` struct and cleaned up overlooked `ast::CrateConfig` renamings from last pull request.
Cheers,
Michael
Implement RAI where possible for iterator adaptors such as Map,
Enumerate, Skip, Take, Zip, Cycle (all of the requiring that the adapted
iterator also implements RAI).
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
Implement Clone and DeepClone for functions with 0 to 8 arguments. `extern fn()` is implicitly copyable so it's simple, except there is no way to implement it generically over #n function arguments.
Allows deriving of Clone on structs containing `extern "Rust" fn`.
r? @graydon Package IDs can now be of the form a/b/c#FOO, where (if a/b/c is
a git repository) FOO is any tag in the repository. Non-numeric
tags only match against package IDs with the same tag, and aren't
compared linearly like numeric versions.
While I was at it, refactored the code that calls `git clone`, and segregated build output properly for different packages.
With an expression like
static w : foo = foo { a:5, ..x };
Rust currently gives the error "constant contains unimplemented expression type". This branch implements support for constant structs with `..base`.
Drop the "Iterator" suffix for the the structs in std::iterator.
Filter, Zip, Chain etc. are shorter type names for when iterator
pipelines need their types written out in full in return value types, so
it's easier to read and write. the iterator module already forms enough
namespace.
This fixes the recently introduced peak memory usage regression by
freeing the intermediate results as soon as they're not required
anymore instead of keeping them around for the whole compilation
process.
Refs #8077
Adds a fence operation to close#8061
Also adds static initializers to for atomic types. Since the fields are private, you aren't able to have `static mut` variables that are an atomic type. Each atomic type's initializer starts at a 0-value (so unset for `AtomicFlag` and false for `AtomicBool`).
#7617
While the code that was there should've been perfectly fine (and seemingly is on linux at least) there seems to be some sort of weird interaction going on with statics and vectors. I couldn't get a smaller test case to reproduce that behaviour. The for loop in `rust::usage` seemingly just goes past the end of the vector thus getting garbage which it tries to pass to malloc somewhere down the line.
In any case, using a fixed length vector seems to mitigate this.
Good evening,
This is a superset of @MaikKlein's #7969 commit, that I've fixed up to compile. I had a couple commits I wanted to do on top of @MaikKlein's work that I didn't want to bitrot.