This PR fixes#7235 and #3371, which removes trailing nulls from `str` types. Instead, it replaces the creation of c strings with a new type, `std::c_str::CString`, which wraps a malloced byte array, and respects:
* No interior nulls
* Ends with a trailing null
r? @graydon Also, notably, make rustpkgtest depend on the rustpkg executable (otherwise, tests that shell out to rustpgk might run when rustpkg doesn't exist).
This commit allows you to write:
extern mod x = "a/b/c";
which means rustc will search in the RUST_PATH for a package with
ID a/b/c, and bind it to the name `x` if it's found.
Incidentally, move get_relative_to from back::rpath into std::path
Mostly optimizing TLS accesses to bring local heap allocation performance
closer to that of oldsched. It's not completely at parity but removing the
branches involved in supporting oldsched and optimizing pthread_get/setspecific
to instead use our dedicated TCB slot will probably make up for it.
It now actually does logging, and is compiled out when `--cfg rtdebug` is not
given to the libstd build, which it isn't by default. This makes the rt
benchmarks 18-50% faster.
Mostly optimizing TLS accesses to bring local heap allocation performance
closer to that of oldsched. It's not completely at parity but removing the
branches involved in supporting oldsched and optimizing pthread_get/setspecific
to instead use our dedicated TCB slot will probably make up for it.
Basically, generic containers should not use the default methods since a
type of elements may not guarantees total order. str could use them
since u8's Ord guarantees total order. Floating point numbers are also
broken with the default methods because of NaN. Thanks for @thestinger.
Timespec also guarantees total order AIUI. I'm unsure whether
extra::semver::Identifier does so I left it alone. Proof needed.
Signed-off-by: OGINO Masanori <masanori.ogino@gmail.com>
FromStr implemented from scratch.
It is overengineered a bit, however.
Old implementation handles errors by fail!()-ing. And it has bugs, like it accepts `127.0.0.1::127.0.0.1` as IPv6 address, and does not handle all ipv4-in-ipv6 schemes. So I decided to implement parser from scratch.
This pull request converts the scheduler from a naive shared queue scheduler to a naive workstealing scheduler. The deque is still a queue inside a lock, but there is still a substantial performance gain. Fiddling with the messaging benchmark I got a ~10x speedup and observed massively reduced memory usage.
There are still *many* locations for optimization, but based on my experience so far it is a clear performance win as it is now.
This is a fairly large rollup, but I've tested everything locally, and none of
it should be platform-specific.
r=alexcrichton (bdfdbdd)
r=brson (d803c18)
r=alexcrichton (a5041d0)
r=bstrie (317412a)
r=alexcrichton (135c85e)
r=thestinger (8805baa)
r=pcwalton (0661178)
r=cmr (9397fe0)
r=cmr (caa4135)
r=cmr (6a21d93)
r=cmr (4dc3379)
r=cmr (0aa5154)
r=cmr (18be261)
r=thestinger (f10be03)
Use the definition, where R is <, <=, >=, or >
[x, ..xs] R [y, ..ys] = if x != y { x R y } else { xs R ys }
Previously, tuples would only implement < and derive the other
comparisons from it; this is incorrect. Included are several testcases
involving NaN comparisons that are now correct.
Previously, tuples would consider an element equal if both a < b and
b < a were false, this was also incorrect.
Use Eq + Ord for lexicographical ordering of sequences.
For each of <, <=, >= or > as R, use::
[x, ..xs] R [y, ..ys] = if x != y { x R y } else { xs R ys }
Previous code using `a < b` and then `!(b < a)` for short-circuiting
fails on cases such as [1.0, 2.0] < [0.0/0.0, 3.0], where the first
element was effectively considered equal.
This is a reopening of #8182, although this removes any abuse of the compiler internals. Now it's just a pure syntax extension (hard coded what the attribute names are).
According to #7887, we've decided to use the syntax of `fn map<U>(f: &fn(&T) -> U) -> U`, which passes a reference to the closure, and to `fn map_move<U>(f: &fn(T) -> U) -> U` which moves the value into the closure. This PR adds these `.map_move()` functions to `Option` and `Result`.
In addition, it has these other minor features:
* Replaces a couple uses of `option.get()`, `result.get()`, and `result.get_err()` with `option.unwrap()`, `result.unwrap()`, and `result.unwrap_err()`. (See #8268 and #8288 for a more thorough adaptation of this functionality.
* Removes `option.take_map()` and `option.take_map_default()`. These two functions can be easily written as `.take().map_move(...)`.
* Adds a better error message to `result.unwrap()` and `result.unwrap_err()`.
The two deletions are because the test cases are very old (still using `class` and modes!), and, as far as I can tell (since they are so old), the areas they test are well tested by other rpass tests.
Some general clean-up relating to deriving:
- `TotalOrd` was too eager, and evaluated the `.cmp` call for every field, even if it could short-circuit earlier.
- the pointer types didn't have impls for `TotalOrd` or `TotalEq`.
- the Makefiles didn't reach deep enough into libsyntax for dependencies.
(Split out from https://github.com/mozilla/rust/pull/8258.)
This results in throwing away alias analysis information, because LLVM
does *not* implement reasoning about these conversions yet.
We specialize zero-size types since a `getelementptr` offset will
return us the same pointer, making it broken as a simple counter.
This lazily initializes the taskgroup structs for ```spawn_unlinked``` tasks. If such a task never spawns another task linked to it (or a descendant of it), its taskgroup is simply never initialized at all. Also if an unlinked task spawns another unlinked task, neither of them will need to initialize their taskgroups. This works for the main task too.
I benchmarked this with the following test case and observed a ~~21% speedup (average over 4 runs: 7.85 sec -> 6.20 sec, 2.5 GHz)~~ 11% speedup, see comment below.
```
use std::task;
use std::cell::Cell;
use std::rt::comm;
static NUM: uint = 1024*256;
fn run(f: ~fn()) {
let mut t = task::task();
t.unlinked();
t.spawn(f);
}
fn main() {
do NUM.times {
let (p,c) = comm::oneshot();
let c = Cell::new(c);
do run { c.take().send(()); }
p.recv();
}
}
```
Better than that in rt::uv::net, because it:
* handles invalid input explicitly, without fail!()
* parses socket address, not just IP
* handles various ipv4-in-ipv6 addresses, like 2001:db8:122:344::192.0.2.33
(see http://tools.ietf.org/html/rfc6052 for example)
* rejects output like `127.0000000.0.1`
* does not allocate heap memory
* have unit tests
`fn slice_bytes` is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.
Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of `slice_bytes` are region checked correctly.
Let Option be a base for a widely useful one- or zero- item iterator.
Refactor OptionIterator to support any generic element type, so the same
iterator impl can be used for both &T, &mut T and T iterators.
This is an alternative version to https://github.com/mozilla/rust/pull/8268, where instead of transitioning to `get()` completely, I transitioned to `unwrap()` completely.
My reasoning for also opening this PR is that having two different functions with identical behavior on a common datatype is bad for consistency and confusing for users, and should be solved as soon as possible. The fact that apparently half the code uses `get()`, and the other half `unwrap()` only makes it worse.
If the final naming decision ends up different, there needs to be a big renaming anyway, but until then it should at least be consistent.
---
- Made naming schemes consistent between Option, Result and Either
- Lifted the quality of the either and result module to that of option
- Changed Options Add implementation to work like the maybe Monad (return None if any of the inputs is None)
See https://github.com/mozilla/rust/issues/6002, especially my last comment.
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
See also https://github.com/mozilla/rust/issues/7887.
Todo:
Adding testcases for all function in the three modules. Even without the few functions I added, the coverage wasn't complete to begin with. But I'd rather do that as a follow up PR, I've touched to much code here already, need to go through them again later.
- Made naming schemes consistent between Option, Result and Either
- Changed Options Add implementation to work like the maybe monad (return None if any of the inputs is None)
- Removed duplicate Option::get and renamed all related functions to use the term `unwrap` instead
fn slice_bytes is marked unsafe since it allows violating the valid
string encoding property; but the function did also allow extending the
lifetime of the slice by mistake, since it's returning `&str`.
Use the annotation `slice_bytes<'a>(&'a str, ...) -> &'a str` so
that all uses of slice_bytes are region checked correctly.
The truncation needs to be done in the console logger in order
to catch all the logging output, and because truncation only matters
when outputting to the console.
When strings lose their trailing null, this pattern will become dangerous:
let foo = "bar";
let foo_ptr: *u8 = &foo[0];
Instead we should use c_strs to handle this correctly.
Every time run_sched_once performs a 'scheduling action' it needs to guarantee
that it runs at least one more time, so enqueue another run_sched_once callback.
The primary reason it needs to do this is because not all async callbacks
are guaranteed to run, it's only guaranteed that *a* callback will run after
enqueing one - some may get dropped.
At the moment this means we wastefully create lots of callbacks to ensure that
there will *definitely* be a callback queued up to continue running the scheduler.
The logic really needs to be tightened up here.
multicast functions now take IpAddr (without port), because they dont't
need port.
Uv* types renamed:
* UvIpAddr -> UvSocketAddr
* UvIpv4 -> UvIpv4SocketAddr
* UvIpv6 -> UvIpv6SocketAddr
"Socket address" is a common name for (ip-address, port) pair (e.g. in
sockaddr_in struct).
P. S. Are there any backward compatibility concerns? What is std::rt module, is it a part of public API?
Use unchecked vec indexing since the vector bounds are checked by the
loop. Iterators are not easy to use in this case since we skip 1-4 bytes
each lap. This part of the commit speeds up is_utf8 for ASCII input.
Check codepoint ranges by checking the byte ranges manually instead of
computing a full decoding for multibyte encodings. This is easy to read
and corresponds to the UTF-8 syntax in the RFC.
No changes to what we accept. A comment notes that surrogate halves are
accepted.
Before:
test str::bench::is_utf8_100_ascii ... bench: 165 ns/iter (+/- 3)
test str::bench::is_utf8_100_multibyte ... bench: 218 ns/iter (+/- 5)
After:
test str::bench::is_utf8_100_ascii ... bench: 130 ns/iter (+/- 1)
test str::bench::is_utf8_100_multibyte ... bench: 156 ns/iter (+/- 3)
An improvement upon the previous pull #8133
Previously it would call:
f(sf1.cmp(&of1), f(sf2.cmp(&of2), ...))
(where s/of1 = 'self/other field 1', and f was
std::cmp::lexical_ordering)
This meant that every .cmp subcall got evaluated when calling a derived
TotalOrd.cmp.
This corrects this to use
let test = sf1.cmp(&of1);
if test == Equal {
let test = sf2.cmp(&of2);
if test == Equal {
// ...
} else {
test
}
} else {
test
}
This gives a lexical ordering by short-circuiting on the first comparison
that is not Equal.
...y/catch
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
And before collect_failure. These are both running user dtors and need to be handled
in the task try/catch block and before the final task cleanup code.
A test case was also created for this situation to prevent the problem
occuring again.
A similar problem was also fixed for the symbol method.
There was some minor code cleanup.
I am unsatisfied with using /dev/null as an invalid dynamic library. It is not cross platform.
The `new` constructor uses the task-local RNG to retrieve seeds for the
two key values, which requires the runtime. Exposing a constructor that
takes the keys directly allows HashMaps to be used in programs that wish
to avoid the runtime.
The method .into_owned() is meant to be used as an optimization when you
need to get a ~str from a Str, but don't want to unnecessarily copy it
if it's already a ~str.
This is meant to ease functions that look like
fn foo<S: Str>(strs: &[S])
Previously they could work with the strings as slices using .as_slice(),
but producing ~str required copying the string, even if the vector
turned out be a &[~str] already.
I don't have any concrete uses for this yet, since the one conversion I've done to `&[S]` so far (see PR #8203) didn't actually need owned strings. But having this here may make using `Str` more attractive.
It also may be worth adding an `into_managed()` function, but that one is less obviously useful than `into_owned()`.