Let Option be a base for a widely useful one- or zero-item iterator.
Refactor OptionIterator to support any generic element type, so the same
iterator impl can be used for &T, &mut T and T iterators alike.
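
A minimal sketch of the resulting shape in today's Rust, with assumed field names (the era's iterator trait differs from the one shown here): a single generic struct covers owned and borrowed items alike.

    // Hypothetical sketch: one Option-backed iterator, generic over its item type.
    struct OptionIterator<A> {
        opt: Option<A>,
    }

    impl<A> Iterator for OptionIterator<A> {
        type Item = A;
        fn next(&mut self) -> Option<A> {
            // Yield the item once (if present), then None forever.
            self.opt.take()
        }
    }

    fn main() {
        let v = vec![1, 2, 3];
        // The same impl serves T and &T (and &mut T) element types:
        let owned: Vec<i32> = OptionIterator { opt: Some(7) }.collect();
        let borrowed: Vec<&i32> = OptionIterator { opt: v.first() }.collect();
        assert_eq!(owned, vec![7]);
        assert_eq!(borrowed, vec![&1]);
    }
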
This is an alternative version to https://github.com/mozilla/rust/pull/8268, where instead of transitioning to `get()` completely, I transitioned to `unwrap()` completely.
My reasoning for also opening this PR is that having two different functions with identical behavior on a common datatype is bad for consistency and confusing for users, and should be solved as soon as possible. The fact that apparently half the code uses `get()`, and the other half `unwrap()` only makes it worse.
If the final naming decision ends up different, there needs to be a big renaming anyway, but until then it should at least be consistent.
---
- Made naming schemes consistent between Option, Result and Either
- Lifted the quality of the either and result modules to that of option
- Changed Option's Add implementation to work like the Maybe monad (return None if any of the inputs is None)
See https://github.com/mozilla/rust/issues/6002, especially my last comment, and the sketch after this list.
- Removed the duplicate Option::get and renamed all related functions to use the term `unwrap` instead
See also https://github.com/mozilla/rust/issues/7887.
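
A minimal sketch of the described Add semantics, written as a standalone function over i32 for illustration (the actual change is an Add trait impl generic over the element type):

    // If either side is None, the sum is None; otherwise add the contained values.
    fn add_options(a: Option<i32>, b: Option<i32>) -> Option<i32> {
        match (a, b) {
            (Some(x), Some(y)) => Some(x + y),
            _ => None,
        }
    }

    fn main() {
        assert_eq!(add_options(Some(1), Some(2)), Some(3));
        assert_eq!(add_options(Some(1), None), None);
    }
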
Todo:
Add test cases for all functions in the three modules. Even without the few functions I added, the coverage wasn't complete to begin with. But I'd rather do that as a follow-up PR; I've touched too much code here already and need to go through it again later.
fn slice_bytes is marked unsafe since it allows violating the valid
string encoding property; but the function also allowed extending the
lifetime of the slice by mistake, since it returns `&str`.
Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of slice_bytes are region checked correctly.
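
A sketch of the annotated signature with assumed parameter names (the body is only illustrative, not the actual implementation):

    // Tying the result to the input's lifetime makes the borrow checker reject
    // any use of the returned slice that outlives the original string.
    unsafe fn slice_bytes<'a>(s: &'a str, begin: usize, end: usize) -> &'a str {
        // Still unsafe: `begin..end` may split a multi-byte character and
        // violate the valid-encoding property.
        unsafe { std::str::from_utf8_unchecked(&s.as_bytes()[begin..end]) }
    }
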
It seems that relatively new code uses `Foo::new()` instead of `Foo()`, so I wrote a patch to migrate some structs to the former style.
Is this the right direction? If there are any guidelines against using the new()-style, could you add them to the [style guide](https://github.com/omasanori/rust/wiki/Note-style-guide)?
The truncation needs to be done in the console logger in order
to catch all the logging output, and because truncation only matters
when outputting to the console.
Update the ARM Linux support some more. We had a previous patch for the Raspberry Pi. This adds a new target, `arm-unknown-linux-gnueabi`, for more general ARM Linux support.
Build/Host machine: x86_64 Debian testing (jessie) with the `gcc-4.4-arm-linux-gnueabi` package
Tested on targets:
- TS-7800 Feroceon (ARMv5TEJ) running Debian 7.0 wheezy
- Beaglebone black (ARMv7) running Angstrom GNU/Linux v2012.12
- rustc flags: `--target=arm-unknown-linux-gnueabi --linker=arm-linux-gnueabi-gcc`
- Samsung Galaxy S II (to make sure android still works)
- rustc flags: `--target=arm-linux-androideabi --android-cross-path=[path to standalone toolchain]`
Since not all ARM devices (as far as I know, anything older than ARMv6, like the TS-7800 I tested on) support getting the TLS address via the `mrc` instruction, I made it also try the magic address the kernel maps into the address space (0xFFFF0FF0). One or the other should work (and on Android it seems like both work).
Also fixes a bug where rustc would always try to invoke the android assembler for any kind of arm target.
The compiler-generated dtor for DList recurses deeply to drop Nodes.
For big lists this can overflow the stack.
This is a problem for the new scheduler, where split stacks are not implemented.
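
A minimal sketch of the idea in today's Rust, with hypothetical type names (not the actual DList code): drop the nodes iteratively so destructor depth stays constant regardless of list length.

    // The compiler-generated drop would recurse once per node through `next`,
    // so unlink the nodes in a loop instead.
    struct Node<T> {
        elem: T,
        next: Option<Box<Node<T>>>,
    }

    struct List<T> {
        head: Option<Box<Node<T>>>,
    }

    impl<T> Drop for List<T> {
        fn drop(&mut self) {
            let mut cur = self.head.take();
            while let Some(mut node) = cur {
                // Detach the tail before `node` is dropped, so each drop is shallow.
                cur = node.next.take();
            }
        }
    }
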
Thanks @blake2-ppc
When strings lose their trailing null, this pattern will become dangerous:

    let foo = "bar";
    let foo_ptr: *u8 = &foo[0];

Instead we should use c_strs to handle this correctly.
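
A hedged sketch of the safer pattern using today's `std::ffi::CString` (the 2013 counterpart was the c_str module): the C string owns a buffer with a guaranteed trailing NUL, so the raw pointer stays valid and terminated.

    use std::ffi::CString;

    fn main() {
        // CString appends the trailing NUL and keeps the buffer alive.
        let foo = CString::new("bar").expect("no interior NUL bytes");
        let foo_ptr: *const u8 = foo.as_ptr() as *const u8;
        // `foo_ptr` is valid for as long as `foo` is in scope.
        assert!(!foo_ptr.is_null());
    }
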
Every time run_sched_once performs a 'scheduling action' it needs to guarantee
that it runs at least one more time, so it enqueues another run_sched_once callback.
The primary reason it needs to do this is that not all async callbacks
are guaranteed to run; it's only guaranteed that *a* callback will run after
enqueueing one - some may get dropped.
At the moment this means we wastefully create lots of callbacks to ensure that
there will *definitely* be a callback queued up to continue running the scheduler.
The logic really needs to be tightened up here.
With this patch `RUSTFLAGS='--cfg unicode' make check` passed successfully.
* Why doesn't `#[link_name="icuuc"]` make libextra link against libicuuc.so?
* In `extra::unicode::tests`, `use unicode; unicode::is_foo('a')` failed but `use unicode::*; is_foo('a')` succeeded. Is that right?
* LLVM now has a C interface to LLVMBuildAtomicRMW
* The exception handling support for the JIT seems to have been dropped
* Various interfaces have been added or headers have changed
rvalues aren't going to be used anywhere but as the argument, so
there's no point in copying them. LLVM used to eliminate the copy
later, but why bother emitting it in the first place?
multicast functions now take IpAddr (without port), because they don't
need a port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
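
An illustration of the distinction using today's std types (the renamed Uv* types above are the runtime's analogous wrappers):

    use std::net::{IpAddr, Ipv4Addr, SocketAddr};

    fn main() {
        // An IP address alone, which is all the multicast functions need:
        let ip = IpAddr::V4(Ipv4Addr::new(224, 0, 0, 1));
        // A socket address pairs the IP address with a port:
        let addr = SocketAddr::new(ip, 5000);
        println!("ip = {}, socket address = {}", ip, addr);
    }
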
P.S. Are there any backward compatibility concerns? What is the std::rt module? Is it part of the public API?
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
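
A hedged sketch of the range-based check for one case, following the UTF-8 grammar in RFC 3629 (not the actual is_utf8 code):

    // Two-byte sequences are C2-DF followed by a continuation byte 80-BF;
    // checking the byte ranges directly avoids decoding the codepoint.
    fn valid_two_byte(b0: u8, b1: u8) -> bool {
        (0xC2..=0xDF).contains(&b0) && (0x80..=0xBF).contains(&b1)
    }

    fn main() {
        assert!(valid_two_byte(0xC3, 0xA9));  // "é"
        assert!(!valid_two_byte(0xC0, 0xAF)); // overlong encoding rejected
    }
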
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
An improvement upon the previous pull #8133
I suspect that this is a race between process exit and the termination of
worker threads used by libuv (if I sleep before exit it doesn't leak). This
isn't going to cause any real problems but should probably be fixed at
some point.
r? @pcwalton
cc #8253
Previously it would call:

    f(sf1.cmp(&of1), f(sf2.cmp(&of2), ...))

(where s/of1 = 'self/other field 1', and f was
std::cmp::lexical_ordering)
This meant that every .cmp subcall got evaluated when calling a derived
TotalOrd.cmp.
This corrects it to use:

    let test = sf1.cmp(&of1);
    if test == Equal {
        let test = sf2.cmp(&of2);
        if test == Equal {
            // ...
        } else {
            test
        }
    } else {
        test
    }

This gives a lexical ordering by short-circuiting on the first comparison
that is not Equal.
...y/catch
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
A test case was also created for this situation to prevent the problem
occurring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
I am unsatisfied with using /dev/null as an invalid dynamic library. It is not cross-platform.
rvalues aren't going to be used anywhere but as the argument, so
there's no point in copying them. LLVM used to eliminate the copy
later, but why bother emitting it in the first place?
The `new` constructor uses the task-local RNG to retrieve seeds for the
two key values, which requires the runtime. Exposing a constructor that
takes the keys directly allows HashMaps to be used in programs that wish
to avoid the runtime.
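
A minimal sketch of the shape of such a constructor, with hypothetical names (the patch's actual HashMap API may differ):

    // The two hash keys are supplied by the caller instead of being drawn
    // from the task-local RNG, so no runtime support is needed.
    struct KeyedMap {
        k0: u64,
        k1: u64,
        // buckets, length, ... omitted
    }

    impl KeyedMap {
        fn new_with_keys(k0: u64, k1: u64) -> KeyedMap {
            KeyedMap { k0, k1 }
        }
    }

    fn main() {
        // Fixed keys chosen by the caller; fine when randomized, DoS-resistant
        // hashing is not required.
        let _m = KeyedMap::new_with_keys(0x0123_4567, 0x89ab_cdef);
    }
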
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like:

    fn foo<S: Str>(strs: &[S])

Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out to be a &[~str] already.
I don't have any concrete uses for this yet, since the one conversion I've done to `&[S]` so far (see PR #8203) didn't actually need owned strings. But having this here may make using `Str` more attractive.
It also may be worth adding an `into_managed()` function, but that one is less obviously useful than `into_owned()`.
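
A hedged sketch of the same idea with today's types, `Cow` standing in for the Str/~str pair: .into_owned() only copies when the value is still borrowed.

    use std::borrow::Cow;

    fn to_owned_all(strs: Vec<Cow<'_, str>>) -> Vec<String> {
        strs.into_iter()
            // Already-owned strings are moved out as-is; borrowed slices are copied.
            .map(|s| s.into_owned())
            .collect()
    }

    fn main() {
        let owned = to_owned_all(vec![Cow::Borrowed("a"), Cow::Owned(String::from("b"))]);
        assert_eq!(owned, vec!["a", "b"]);
    }
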
RWArc had a clone() method, but it was part of impl RWArc instead of
an implementation of Clone.
Stick with the explicit implementation instead of deriving Clone so we
can have a docstring.
Fixes #8052.
I suspect that this is a race between process exit and the termination of
worker threads used by libuv (if I sleep before exit it doesn't leak). This
isn't going to cause any real problems but should probably be fixed at
some point.
This is preparation for removing `@fn`.
This does *not* use default methods yet, because I don't know
whether they work. If they do, a forthcoming PR will use them.
This also changes the precedence of `as`.
Sigil highlighting isn't perfect (especially how it handles ``&``) but
after having used it for a week I feel it to be considerably nicer than
nothing. As usual, if you don't like it, you can turn it off easily by
overriding the default highlighting.
Generics are not handled specially; this means that for something like
``S<T>``, the ``<`` and ``>`` are highlighted as operators. For myself,
I like this, and there is no way to make it properly context aware
without expanding the syntax matching enormously.
Also, special characters are highlighted properly in strings/chars, e.g.
``"\x00"`` or ``'\Ufedcba98'``.
OS X defaults the ulimit for open files to 256 for programs launched
from the Terminal (GUI apps get a higher default). Unfortunately this is
too low for the rt tests, which deliberately overcommit and create a lot
of threads (which means a lot of schedulers, and each scheduler needs at
least 2 fds).
By calling sysctl() and setrlimit() we can bump the fd limit up to the
maximum allowed (on stock OS X it's 10240).
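
A rough sketch of the approach in Rust using the `libc` crate (a hypothetical helper; the actual fix lives in the C++ runtime code): query kern.maxfilesperproc via sysctl and raise the soft RLIMIT_NOFILE limit toward it.

    #[cfg(target_os = "macos")]
    fn raise_fd_limit() {
        use std::{mem, ptr};
        unsafe {
            // Ask the kernel for the per-process maximum number of open files.
            let mut maxfiles: libc::c_int = 0;
            let mut size = mem::size_of::<libc::c_int>() as libc::size_t;
            libc::sysctlbyname(
                b"kern.maxfilesperproc\0".as_ptr() as *const libc::c_char,
                &mut maxfiles as *mut _ as *mut libc::c_void,
                &mut size,
                ptr::null_mut(),
                0,
            );

            // Raise the soft limit as far as the hard limit and the sysctl allow.
            let mut lim: libc::rlimit = mem::zeroed();
            if libc::getrlimit(libc::RLIMIT_NOFILE, &mut lim) == 0 {
                lim.rlim_cur = (maxfiles as libc::rlim_t).min(lim.rlim_max);
                libc::setrlimit(libc::RLIMIT_NOFILE, &lim);
            }
        }
    }
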
Fixes #7772.
Added functions to cryptoutil.rs that perform an addition after shifting
the 2nd parameter by a specified constant. These functions fail!() if integer
overflow would result. Updated the Sha2 implementation to use these functions.
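
A hedged sketch of what such a helper might look like (name and signature assumed, with `panic!` standing in for the era's fail!()):

    // Add `value << shift` to `acc`, failing if the shift would drop set bits
    // or the addition would overflow.
    fn shift_add_check_overflow(acc: u64, value: u64, shift: u32) -> u64 {
        assert!(shift < 64 && value.leading_zeros() >= shift, "shift overflowed");
        acc.checked_add(value << shift).expect("addition overflowed")
    }

    fn main() {
        assert_eq!(shift_add_check_overflow(8, 2, 3), 24); // 8 + (2 << 3)
    }
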
Create a helper function in cryptoutil.rs which feeds 1,000,000 'a's into
a Digest with varying input sizes and then checks the result. This is
essentially the same as one of Sha1's existing tests, so that test was
re-implemented using this method. New tests were added using this method for
Sha512 and Sha256.
The Sha2 compression functions were rewritten to execute the message
scheduling calculations in the same loop as the rest of the compression
function, which lets the compiler generate much better code. Additionally,
the innermost part of the compression functions was turned into macros to
reduce code duplication and to make the functions more concise.
There are 2 main pieces of functionality in cryptoutil.rs:
* A set of unsafe functions for efficiently reading and writing u32 and u64
values. All of these functions are fairly easy to audit to confirm that
they do what they are supposed to (see the sketch after this list).
* A FixedBuffer struct. This struct keeps track of input data until there
is enough of it to execute a function on it which expects a fixed
block of data.
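
As a rough illustration of the first kind of helper, here is a safe, hypothetical big-endian u32 reader (the real cryptoutil functions are unsafe and operate on whole slices of values):

    // Assemble a big-endian u32 from the first four bytes of `input`.
    fn read_u32_be(input: &[u8]) -> u32 {
        assert!(input.len() >= 4);
        ((input[0] as u32) << 24)
            | ((input[1] as u32) << 16)
            | ((input[2] as u32) << 8)
            | (input[3] as u32)
    }

    fn main() {
        assert_eq!(read_u32_be(&[0xDE, 0xAD, 0xBE, 0xEF]), 0xDEAD_BEEF);
    }
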
The Sha2 module was rewritten to take advantage of the new functions in
cryptoutil as well as FixedBuffer. The result is that the duplicate code
for maintaining a buffer of input data is removed from the Sha512 and
Sha256 implementation. Additionally, the FixedBuffer code is much more
efficient than the previous code was.
The result_X() methods just calculate an output of a fixed size. They don't
really have much to do with running the actual hash algorithm until the very
last step, the output. It makes much more sense to put all this logic into
the Digest impls for each specific variation of the hash function.
The code was arranged so that the core Sha2 code came first, and then
all of the various implementations of Digest followed along later. The
problem is that the Sha512 compression function code is far away from
the Sha512 Digest implementation, so, if you are trying to read over
the code, you need to scroll all around the file for no good reason. The
code was rearranged so that all of the Sha512 code is in one place and
all of the Sha256 code is in another and so that all impls for a struct
are near the definition of that struct.
multicast functions now take IpAddr (without port), because they don't
need a port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
In the first commit it is obvious why some of the barriers can be changed to `Relaxed`, but it is not as obvious for the ones I changed in `kill.rs`. The rationale for those is documented as part of the documenting commit.
Also, the last commit is a temporary hack to prevent kill signals from being received in taskgroup cleanup code, which could be fixed in a more principled way once the old runtime is gone.
A test case was also created for this situation to prevent the problem
occurring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
r? @graydon
Closes #8179
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like:

    fn foo<S: Str>(strs: &[S])

Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out to be a &[~str] already.
In the old design, TLS held the scheduler struct, and the scheduler struct
held the active task. This posed all sorts of weird problems due to
how we wanted to use the contents of TLS. The cleaner approach is to
leave the active task in TLS and have the task hold the scheduler. To
make this work out, the scheduler has to run inside a regular task, and
then once that is the case the context switching code is massively
simplified, as instead of three possible paths there is only one. The
logical flow is also easier to follow, as the scheduler struct acts
somewhat like a "token" indicating what is active.
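
A hedged sketch of the ownership direction described above, in today's Rust with assumed names (the actual runtime types are far richer):

    use std::cell::RefCell;

    struct Scheduler {
        // work queues, event loop, sleep state, ...
    }

    struct Task {
        // The task owns the scheduler, not the other way around.
        sched: Option<Box<Scheduler>>,
    }

    thread_local! {
        // TLS holds only the active task; the scheduler travels inside it,
        // acting as the "token" for what is currently running.
        static ACTIVE_TASK: RefCell<Option<Box<Task>>> = RefCell::new(None);
    }

    fn main() {
        ACTIVE_TASK.with(|t| {
            *t.borrow_mut() = Some(Box::new(Task {
                sched: Some(Box::new(Scheduler {})),
            }));
        });
    }
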
These changes also necessitated changing a large number of runtime
tests, and rewriting most of the runtime testing helpers.
Polish level is "low", as I will very soon start on more scheduler
changes that will require wiping the polish off. That being said there
should be sufficient comments around anything complex to make this
entirely respectable as a standalone commit.
The pipes compiler produced data types that encoded efficient and safe
bounded message passing protocols between two endpoints. It was also
capable of producing unbounded protocols.
It was useful research but was arguably done before its proper time.
I am removing it for the following reasons:
* In practice we used it only for producing the `oneshot` protocol and
the unbounded `stream` protocol, and all communication in Rust uses those.
* The interface between the proto! macro and the standard library
has a large surface area and was difficult to maintain through
language and library changes.
* It is now written in an old dialect of Rust and generates code
which would likely be considered non-idiomatic.
* Both the compiler and the runtime are difficult to understand,
and likewise the relationship between the generated code and
the library is hard to understand. Debugging is difficult.
* The new scheduler implements `stream` and `oneshot` by hand
in a way that will be significantly easier to maintain.
This shouldn't be taken as an indication that 'channel protocols'
for Rust are not worth pursuing again in the future.
Concerned parties may include: @graydon, @pcwalton, @eholk, @bblum
The most likely candidates for closing are #7666, #3018, #3020, #7021, #7667, #7303, #3658, #3295.
The pipes compiler produced data types that encoded efficient and safe
bounded message passing protocols between two endpoints. It was also
capable of producing unbounded protocols.
It was useful research but was arguably done before its proper time.
I am removing it for the following reasons:
* In practice we used it only for producing the `oneshot` and `stream`
unbounded protocols, and all communication in Rust uses those.
* The interface between the proto! macro and the standard library
has a large surface area and was difficult to maintain through
language and library changes.
* It is now written in an old dialect of Rust and generates code
which would likely be considered non-idiomatic.
* Both the compiler and the runtime are difficult to understand,
and likewise the relationship between the generated code and
the library is hard to understand. Debugging is difficult.
* The new scheduler implements `stream` and `oneshot` by hand
in a way that will be significantly easier to maintain.
This shouldn't be taken as an indication that 'channel protocols'
for Rust are not worth pursuing again in the future.
Change the former repetition::

    for 5.times { }

to::

    do 5.times { }

.times() cannot be broken with `break` or `return` anymore; for those
cases, use a numerical range loop instead.
Change all users of old-style for with internal iterators to using
`do`-loops.
The code in stackwalk.rs does not actually implement the
looping protocol (no break on return false).
The code in gc.rs does not use loop breaks, nor does any code using it.
We remove the capacity to break from the loops in std::gc and implement
the walks using `do { .. }` expressions.
No behavior change.
The .intersection(), .union(), etc. methods in the trait std::container::Set use
internal iters. Remove these methods from the trait.
I reported issue #8154 for the reinstatement of iterator-based set algebra
methods to the Set trait.
For bitv and treemap, which lack Iterator implementations of the set
operations, preserve them as methods directly on the types themselves.
For HashSet, these methods are replaced by the present .union_iter()
etc.
* All globals marked as `pub` won't have the `internal` linkage type set
* All global references across crates are forced to use the address of the
global in the other crate via an external reference.
Assertions without a message get a generated message that consists of a
prefix plus the stringified expression that is being asserted. That
prefix is currently a unique string, while a static string would be
sufficient and needs less code.