Based on an observation that strings and arguments are always interleaved, thanks to #15832. Additionally optimize invocations where formatting parameters are unspecified for all arguments, e.g. `"{} {:?} {:x}"`, by emptying the `__STATIC_FMTARGS` array. Next, `Arguments::new` replaces an empty slice with `None` so that passing empty `__STATIC_FMTARGS` generates slightly less machine code when `Arguments::new` is inlined. Furthermore, formatting itself treats these cases separately without making redundant copies of formatting parameters.
All in all, this adds a single mov instruction per `write!` in most cases. That's why code size has increased.
This allows code to access the fields of tuples and tuple structs:
let x = (1i, 2i);
assert_eq!(x.1, 2);
struct Point(int, int);
let origin = Point(0, 0);
assert_eq!(origin.0, 0);
assert_eq!(origin.1, 0);
Format specs are ignored and not stored in case they're all default.
Restore default formatting parameters during iteration.
Pass `None` instead of empty slices of format specs to take advantage
of non-nullable pointer optimization.
Generate a call to one of two functions of `fmt::Argument`.
The pointer in the slice must not be null, because enum representations
make that assumption. The `exchange_malloc` function returns a non-null
sentinel for the zero size case, and it must not be passed to the
`exchange_free` lang item.
Since the length is always equal to the true capacity, a branch on the
length is enough for most types. Slices of zero size types are
statically special cased to never attempt deallocation. This is the same
implementation as `Vec<T>`.
Closes#14395
rand: inform the optimiser that indexing is never out-of-bounds.
This uses a bitwise mask to ensure that there's no bounds checking for
the array accesses when generating the next random number. This isn't
costless, but the single instruction is nothing compared to the branch.
A `debug_assert` for "bounds check" is preserved to ensure that
refactoring doesn't accidentally break it (i.e. create values of `cnt`
that are out of bounds with the masking causing it to silently wrap-
around).
Before:
test test::rand_isaac ... bench: 990 ns/iter (+/- 24) = 808 MB/s
test test::rand_isaac64 ... bench: 614 ns/iter (+/- 25) = 1302 MB/s
After:
test test::rand_isaac ... bench: 877 ns/iter (+/- 134) = 912 MB/s
test test::rand_isaac64 ... bench: 470 ns/iter (+/- 30) = 1702 MB/s
(It also removes the unsafe code in Isaac64Rng.next_u64, with a *gain*
in performance; today is a good day.)
The performance hit from these checks is significant, but unoptimized
builds are already incredibly slow. Enabling these checks results in
better test coverage since there are bots doing unoptimized builds, and
the cost is relatively small in the context of an unoptimized build.
This also allows using `JEMALLOC_FLAGS` to override the default
configure flags.
instead of prefix `..`.
This breaks code that looked like:
match foo {
[ first, ..middle, last ] => { ... }
}
Change this code to:
match foo {
[ first, middle.., last ] => { ... }
}
RFC #55.
Closes#16967.
[breaking-change]
I've found that 64k is still too much and continue to see the errors as reported
in #14940. I've locally found that 32k fails, and 24k succeeds, so I've trimmed
the size down to 10000 which the included links in the added comment end up
recommending.
It sounds like the limit can still be hit with many threads in play, but I have
yet to reproduce this, so I figure we can wait until that's hit (if it's
possible) and then take action.
I've found that 64k is still too much and continue to see the errors as reported
in #14940. I've locally found that 32k fails, and 24k succeeds, so I've trimmed
the size down to 8192 which libuv happens to use as well.
It sounds like the limit can still be hit with many threads in play, but I have
yet to reproduce this, so I figure we can wait until that's hit (if it's
possible) and then take action.