rand: inform the optimiser that indexing is never out-of-bounds.
This uses a bitwise mask to ensure that there's no bounds checking for
the array accesses when generating the next random number. This isn't
costless, but the single instruction is nothing compared to the branch.
A `debug_assert` bounds check is preserved to ensure that refactoring
doesn't accidentally break this (i.e. create values of `cnt` that are
out of bounds, which the masking would silently wrap around).
Before:
test test::rand_isaac ... bench: 990 ns/iter (+/- 24) = 808 MB/s
test test::rand_isaac64 ... bench: 614 ns/iter (+/- 25) = 1302 MB/s
After:
test test::rand_isaac ... bench: 877 ns/iter (+/- 134) = 912 MB/s
test test::rand_isaac64 ... bench: 470 ns/iter (+/- 30) = 1702 MB/s
(It also removes the unsafe code in Isaac64Rng.next_u64, with a *gain*
in performance; today is a good day.)
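A minimal sketch of the masking idea in present-day Rust, not the code
from this change: `Buffer`, its fields, the value of `RAND_SIZE` and the
refill step are assumptions made for illustration. Because the buffer
length is a power of two, `cnt & (RAND_SIZE - 1)` is always a valid
index, so the compiler can usually drop the bounds-check branch.

```rust
// Hypothetical buffer size; it must be a power of two so that
// `RAND_SIZE - 1` is an all-ones bit mask.
const RAND_SIZE: usize = 256;

/// Hypothetical stand-in for a generator's result buffer and cursor.
pub struct Buffer {
    rsl: [u32; RAND_SIZE],
    cnt: usize,
}

impl Buffer {
    pub fn new() -> Self {
        let mut rsl = [0u32; RAND_SIZE];
        // Fill with recognisable values; a real generator would fill
        // this with freshly generated output instead.
        for (i, slot) in rsl.iter_mut().enumerate() {
            *slot = i as u32;
        }
        Buffer { rsl, cnt: RAND_SIZE }
    }

    pub fn next_u32(&mut self) -> u32 {
        if self.cnt == 0 {
            // A real generator would refill `rsl` here (omitted).
            self.cnt = RAND_SIZE;
        }
        self.cnt -= 1;
        // Checked in debug builds only: catches a refactoring that lets
        // `cnt` go out of range, which the mask would otherwise hide by
        // wrapping around.
        debug_assert!(self.cnt < RAND_SIZE);
        // The masked index is always below RAND_SIZE.
        self.rsl[self.cnt & (RAND_SIZE - 1)]
    }
}
```

In a release build the `debug_assert!` compiles to nothing, so only the
mask remains on the hot path.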
The performance hit from these checks is significant, but unoptimized
builds are already incredibly slow. Enabling these checks results in
better test coverage since there are bots doing unoptimized builds, and
the cost is relatively small in the context of an unoptimized build.
This also allows using `JEMALLOC_FLAGS` to override the default
configure flags.
instead of prefix `..`.
This breaks code that looked like:
    match foo {
        [ first, ..middle, last ] => { ... }
    }
Change this code to:
    match foo {
        [ first, middle.., last ] => { ... }
    }
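
For readers on a current toolchain: stable Rust later settled on a
different spelling for the same idea, binding the rest of the slice with
`name @ ..`. The sketch below uses that later syntax, not the syntax
introduced by this change; `split_ends` is a hypothetical helper.

```rust
/// Hypothetical helper: splits a slice into its first element, the
/// middle sub-slice and its last element. Returns `None` when the slice
/// has fewer than two elements.
pub fn split_ends(xs: &[i32]) -> Option<(i32, &[i32], i32)> {
    match xs {
        // `middle @ ..` binds everything between `first` and `last`.
        [first, middle @ .., last] => Some((*first, middle, *last)),
        _ => None,
    }
}
```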
RFC #55.
Closes #16967.
[breaking-change]
I've found that 64k is still too much and continue to see the errors reported
in #14940. Locally, 32k fails and 24k succeeds, so I've trimmed the size down
to 10000, which is what the links in the added comment end up recommending.
It sounds like the limit can still be hit with many threads in play, but I have
yet to reproduce this, so I figure we can wait until that's hit (if it's
possible) and then take action.
I've found that 64k is still too much and continue to see the errors reported
in #14940. Locally, 32k fails and 24k succeeds, so I've trimmed the size down
to 8192, which libuv happens to use as well.
It sounds like the limit can still be hit with many threads in play, but I have
yet to reproduce this, so I figure we can wait until that's hit (if it's
possible) and then take action.
This breaks code that uses the `..xs` form anywhere but at the end of a
slice. For example:
    match foo {
        [ 1, ..xs, 2 ]
        [ ..xs, 1, 2 ]
    }
Add the `#![feature(advanced_slice_patterns)]` gate to reenable the
syntax.
RFC #54.
Closes #16951.
[breaking-change]
itself.
This breaks code like:
    for &x in my_vector.iter() {
        my_vector[2] = "wibble";
        ...
    }
Change this code to not invalidate iterators. For example:
    for i in range(0, my_vector.len()) {
        my_vector[2] = "wibble";
        ...
    }
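
The same index-based rewrite, sketched in present-day Rust syntax
(`0..len` rather than `range`). The function name and element type are
assumptions for illustration. The loop compiles because the range is
built from a length read once up front, so no borrow of the vector is
held while the body mutates it.

```rust
/// Hypothetical example: overwrite element 2 on every iteration while
/// visiting each index. Panics if the vector is non-empty but has fewer
/// than three elements, since `my_vector[2]` would be out of bounds.
pub fn overwrite_while_looping(my_vector: &mut Vec<&'static str>) -> usize {
    let mut visited = 0;
    // `my_vector.len()` is evaluated once; the resulting range does not
    // borrow the vector.
    for i in 0..my_vector.len() {
        my_vector[2] = "wibble";
        // Read the current element by index rather than via an iterator.
        let _current = my_vector[i];
        visited += 1;
    }
    visited
}
```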
The `for-loop-does-not-borrow-iterators` test for #8372 was incorrect
and has been removed.
Closes #16820.
[breaking-change]
This was inspired by seeing a LLVM flatline of **~600MB** when running rustc with jemalloc (each type's `t_box_` is allocated on the heap, creating a lot of fragmentation, which jemalloc can deal with, unlike glibc).
By default, 32-bit Windows executables are restricted to 2GiB of address
space, even on 64-bit Windows where 4GiB is available.
Closes #17043
A match in callee.rs was recognizing some foreign fns as named tuple constructors. A reproducible test case for this is nearly impossible since it depends on the way NodeIds happen to be assigned in different crates.
Fixes #15913