Instead of towards negative infinity as is currently the case.
This done so that the obvious expectations of
`midpoint(a, b) == midpoint(b, a)` and
`midpoint(-a, -b) == -midpoint(a, b)` are true, which makes the even
more obvious implementation `(a + b) / 2` true.
https://github.com/rust-lang/rust/issues/110840#issuecomment-2336753931
Const stability checks v2
The const stability system has served us well ever since `const fn` were first stabilized. It's main feature is that it enforces *recursive* validity -- a stable const fn cannot internally make use of unstable const features without an explicit marker in the form of `#[rustc_allow_const_fn_unstable]`. This is done to make sure that we don't accidentally expose unstable const features on stable in a way that would be hard to take back. As part of this, it is enforced that a `#[rustc_const_stable]` can only call `#[rustc_const_stable]` functions. However, some problems have been coming up with increased usage:
- It is baffling that we have to mark private or even unstable functions as `#[rustc_const_stable]` when they are used as helpers in regular stable `const fn`, and often people will rather add `#[rustc_allow_const_fn_unstable]` instead which was not our intention.
- The system has several gaping holes: a private `const fn` without stability attributes whose inherited stability (walking up parent modules) is `#[stable]` is allowed to call *arbitrary* unstable const operations, but can itself be called from stable `const fn`. Similarly, `#[allow_internal_unstable]` on a macro completely bypasses the recursive nature of the check.
Fundamentally, the problem is that we have *three* disjoint categories of functions, and not enough attributes to distinguish them:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
Functions in the first two categories cannot use unstable const features and they can only call functions from the first two categories.
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, all the holes mentioned above have been closed. There's still one potential hole that is hard to avoid, which is when MIR building automatically inserts calls to a particular function in stable functions -- which happens in the panic machinery. Those need to be manually marked `#[rustc_const_stable_indirect]` to be sure they follow recursive const stability. But that's a fairly rare and special case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be constified simply by marking it as `const fn`, and it will then be const-callable from stable `const fn` and subject to recursive const stability requirements. If it is publicly reachable (which implies it cannot be unmarked), it will be const-unstable under the same feature gate. Only if the function ever becomes `#[stable]` does it need a `#[rustc_const_unstable]` or `#[rustc_const_stable]` marker to decide if this should also imply const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to use unstable const lang features (including intrinsics), or (b) `#[stable]` functions that are not yet intended to be const-stable. Adding `#[rustc_const_stable]` is only needed for functions that are actually meant to be directly callable from stable const code. `#[rustc_const_stable_indirect]` is used to mark intrinsics as const-callable and for `#[rustc_const_unstable]` functions that are actually called from other, exposed-on-stable `const fn`. No other attributes are required.
Also see the updated dev-guide at https://github.com/rust-lang/rustc-dev-guide/pull/2098.
I think in the future we may want to tweak this further, so that in the hopefully common case where a public function's const-stability just exactly mirrors its regular stability, we never have to add any attribute. But right now, once the function is stable this requires `#[rustc_const_stable]`.
### Open question
There is one point I could see we might want to do differently, and that is putting `#[rustc_const_unstable]` functions (but not intrinsics) in category 2 by default, and requiring an extra attribute for `#[rustc_const_not_exposed_on_stable]` or so. This would require a bunch of extra annotations, but would have the advantage that turning a `#[rustc_const_unstable]` into `#[rustc_const_stable]` will never change the way the function is const-checked. Currently, we often discover in the const stabilization PR that a function needs some other unstable const things, and then we rush to quickly deal with that. In this alternative universe, we'd work towards getting rid of the `rustc_const_not_exposed_on_stable` before stabilization, and once that is done stabilization becomes a trivial matter. `#[rustc_const_stable_indirect]` would then only be used for intrinsics.
I think I like this idea, but might want to do it in a follow-up PR, as it will need a whole bunch of annotations in the standard library. Also, we probably want to convert all const intrinsics to the "new" form (`#[rustc_intrinsic]` instead of an `extern` block) before doing this to avoid having to deal with two different ways of declaring intrinsics.
Cc `@rust-lang/wg-const-eval` `@rust-lang/libs-api`
Part of https://github.com/rust-lang/rust/issues/129815 (but not finished since this is not yet sufficient to safely let us expose `const fn` from hashbrown)
Fixes https://github.com/rust-lang/rust/issues/131073 by making it so that const-stable functions are always stable
try-job: test-various
Fundamentally, we have *three* disjoint categories of functions:
1. const-stable functions
2. private/unstable functions that are meant to be callable from const-stable functions
3. functions that can make use of unstable const features
This PR implements the following system:
- `#[rustc_const_stable]` puts functions in the first category. It may only be applied to `#[stable]` functions.
- `#[rustc_const_unstable]` by default puts functions in the third category. The new attribute `#[rustc_const_stable_indirect]` can be added to such a function to move it into the second category.
- `const fn` without a const stability marker are in the second category if they are still unstable. They automatically inherit the feature gate for regular calls, it can now also be used for const-calls.
Also, several holes in recursive const stability checking are being closed.
There's still one potential hole that is hard to avoid, which is when MIR
building automatically inserts calls to a particular function in stable
functions -- which happens in the panic machinery. Those need to *not* be
`rustc_const_unstable` (or manually get a `rustc_const_stable_indirect`) to be
sure they follow recursive const stability. But that's a fairly rare and special
case so IMO it's fine.
The net effect of this is that a `#[unstable]` or unmarked function can be
constified simply by marking it as `const fn`, and it will then be
const-callable from stable `const fn` and subject to recursive const stability
requirements. If it is publicly reachable (which implies it cannot be unmarked),
it will be const-unstable under the same feature gate. Only if the function ever
becomes `#[stable]` does it need a `#[rustc_const_unstable]` or
`#[rustc_const_stable]` marker to decide if this should also imply
const-stability.
Adding `#[rustc_const_unstable]` is only needed for (a) functions that need to
use unstable const lang features (including intrinsics), or (b) `#[stable]`
functions that are not yet intended to be const-stable. Adding
`#[rustc_const_stable]` is only needed for functions that are actually meant to
be directly callable from stable const code. `#[rustc_const_stable_indirect]` is
used to mark intrinsics as const-callable and for `#[rustc_const_unstable]`
functions that are actually called from other, exposed-on-stable `const fn`. No
other attributes are required.
Run most `core::num` tests in const context too
This adds some infrastructure for something I was going to use in #131566, but it felt worthwhile enough on its own to merge/discuss separately.
Essentially, right now we tend to rely on UI tests to ensure that things work in const context, rather than just using library tests. This uses a few simple macro tricks to make it *relatively* painless to execute tests in both runtime and compile-time context. And this only applies to the numeric tests, and not anything else.
Recommended to review without whitespace in the diff.
cc `@RalfJung`
merge const_ipv4 / const_ipv6 feature gate into 'ip' feature gate
https://github.com/rust-lang/rust/issues/76205 has been closed a while ago, but there are still some functions that reference it. Those functions are all unstable *and* const-unstable. There's no good reason to use a separate feature gate for their const-stability, so this PR moves their const-stability under the same gate as their regular stability, and therefore removes the remaining references to https://github.com/rust-lang/rust/issues/76205.
core/net: add Ipv[46]Addr::from_octets, Ipv6Addr::from_segments.
Adds:
- `Ipv4Address::from_octets([u8;4])`
- `Ipv6Address::from_octets([u8;16])`
- `Ipv6Address::from_segments([u16;8])`
equivalent to the existing `From` impls.
Advantages:
- Consistent with `to_bits, from_bits`.
- More discoverable than the `From` impls.
- Helps with type inference: it's common to want to convert byte slices to IP addrs. If you try this
```rust
fn foo(x: &[u8]) -> Ipv4Addr {
Ipv4Addr::from(foo.try_into().unwrap())
}
```
it [doesn't work](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=0e2873312de275a58fa6e33d1b213bec). You have to write `Ipv4Addr::from(<[u8;4]>::try_from(x).unwrap())` instead, which is not great. With `from_octets` it is able to infer the right types.
Found this while porting [smoltcp](https://github.com/smoltcp-rs/smoltcp/) from its own IP address types to the `core::net` types.
~~Tracking issues #27709 #76205~~
Tracking issue: https://github.com/rust-lang/rust/issues/131360
Optimize `escape_ascii` using a lookup table
Based upon my suggestion here: https://github.com/rust-lang/rust/pull/125340#issuecomment-2130441817
Effectively, we can take advantage of the fact that ASCII only needs 7 bits to make the eighth bit store whether the value should be escaped or not. This adds a 256-byte lookup table, but 256 bytes *should* be small enough that very few people will mind, according to my probably not incontrovertible opinion.
The generated assembly isn't clearly better (although has fewer branches), so, I decided to benchmark on three inputs: first on a random 200KiB, then on `/bin/cat`, then on `Cargo.toml` for this repo. In all cases, the generated code ran faster on my machine. (an old i7-8700)
But, if you want to try my benchmarking code for yourself:
<details><summary>Criterion code below. Replace <code>/home/ltdk/rustsrc</code> with the appropriate directory.</summary>
```rust
#![feature(ascii_char)]
#![feature(ascii_char_variants)]
#![feature(const_option)]
#![feature(let_chains)]
use core::ascii;
use core::ops::Range;
use criterion::{criterion_group, criterion_main, Criterion};
use rand::{thread_rng, Rng};
const HEX_DIGITS: [ascii::Char; 16] = *b"0123456789abcdef".as_ascii().unwrap();
#[inline]
const fn backslash<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 2) };
let mut output = [ascii::Char::Null; N];
output[0] = ascii::Char::ReverseSolidus;
output[1] = a;
(output, 0..2)
}
#[inline]
const fn hex_escape<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
let mut output = [ascii::Char::Null; N];
let hi = HEX_DIGITS[(byte >> 4) as usize];
let lo = HEX_DIGITS[(byte & 0xf) as usize];
output[0] = ascii::Char::ReverseSolidus;
output[1] = ascii::Char::SmallX;
output[2] = hi;
output[3] = lo;
(output, 0..4)
}
#[inline]
const fn verbatim<const N: usize>(a: ascii::Char) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 1) };
let mut output = [ascii::Char::Null; N];
output[0] = a;
(output, 0..1)
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_old<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
const { assert!(N >= 4) };
match byte {
b'\t' => backslash(ascii::Char::SmallT),
b'\r' => backslash(ascii::Char::SmallR),
b'\n' => backslash(ascii::Char::SmallN),
b'\\' => backslash(ascii::Char::ReverseSolidus),
b'\'' => backslash(ascii::Char::Apostrophe),
b'\"' => backslash(ascii::Char::QuotationMark),
0x00..=0x1F => hex_escape(byte),
_ => match ascii::Char::from_u8(byte) {
Some(a) => verbatim(a),
None => hex_escape(byte),
},
}
}
/// Escapes an ASCII character.
///
/// Returns a buffer and the length of the escaped representation.
const fn escape_ascii_new<const N: usize>(byte: u8) -> ([ascii::Char; N], Range<u8>) {
/// Lookup table helps us determine how to display character.
///
/// Since ASCII characters will always be 7 bits, we can exploit this to store the 8th bit to
/// indicate whether the result is escaped or unescaped.
///
/// We additionally use 0x80 (escaped NUL character) to indicate hex-escaped bytes, since
/// escaped NUL will not occur.
const LOOKUP: [u8; 256] = {
let mut arr = [0; 256];
let mut idx = 0;
loop {
arr[idx as usize] = match idx {
// use 8th bit to indicate escaped
b'\t' => 0x80 | b't',
b'\r' => 0x80 | b'r',
b'\n' => 0x80 | b'n',
b'\\' => 0x80 | b'\\',
b'\'' => 0x80 | b'\'',
b'"' => 0x80 | b'"',
// use NUL to indicate hex-escaped
0x00..=0x1F | 0x7F..=0xFF => 0x80 | b'\0',
_ => idx,
};
if idx == 255 {
break;
}
idx += 1;
}
arr
};
let lookup = LOOKUP[byte as usize];
// 8th bit indicates escape
let lookup_escaped = lookup & 0x80 != 0;
// SAFETY: We explicitly mask out the eighth bit to get a 7-bit ASCII character.
let lookup_ascii = unsafe { ascii::Char::from_u8_unchecked(lookup & 0x7F) };
if lookup_escaped {
// NUL indicates hex-escaped
if matches!(lookup_ascii, ascii::Char::Null) {
hex_escape(byte)
} else {
backslash(lookup_ascii)
}
} else {
verbatim(lookup_ascii)
}
}
fn escape_bytes(bytes: &[u8], f: impl Fn(u8) -> ([ascii::Char; 4], Range<u8>)) -> Vec<ascii::Char> {
let mut vec = Vec::new();
for b in bytes {
let (buf, range) = f(*b);
vec.extend_from_slice(&buf[range.start as usize..range.end as usize]);
}
vec
}
pub fn criterion_benchmark(c: &mut Criterion) {
let mut group = c.benchmark_group("escape_ascii");
group.sample_size(1000);
let rand_200k = &mut [0; 200 * 1024];
thread_rng().fill(&mut rand_200k[..]);
let cat = include_bytes!("/bin/cat");
let cargo_toml = include_bytes!("/home/ltdk/rustsrc/Cargo.toml");
group.bench_function("old_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_old));
});
group.bench_function("new_rand", |b| {
b.iter(|| escape_bytes(rand_200k, escape_ascii_new));
});
group.bench_function("old_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_old));
});
group.bench_function("new_bin", |b| {
b.iter(|| escape_bytes(cat, escape_ascii_new));
});
group.bench_function("old_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_old));
});
group.bench_function("new_cargo_toml", |b| {
b.iter(|| escape_bytes(cargo_toml, escape_ascii_new));
});
group.finish();
}
criterion_group!(benches, criterion_benchmark);
criterion_main!(benches);
```
</details>
My benchmark results:
```
escape_ascii/old_rand time: [1.6965 ms 1.7006 ms 1.7053 ms]
Found 22 outliers among 1000 measurements (2.20%)
4 (0.40%) high mild
18 (1.80%) high severe
escape_ascii/new_rand time: [1.6749 ms 1.6953 ms 1.7158 ms]
Found 38 outliers among 1000 measurements (3.80%)
38 (3.80%) high mild
escape_ascii/old_bin time: [224.59 µs 225.40 µs 226.33 µs]
Found 39 outliers among 1000 measurements (3.90%)
17 (1.70%) high mild
22 (2.20%) high severe
escape_ascii/new_bin time: [164.86 µs 165.63 µs 166.58 µs]
Found 107 outliers among 1000 measurements (10.70%)
43 (4.30%) high mild
64 (6.40%) high severe
escape_ascii/old_cargo_toml
time: [23.397 µs 23.699 µs 24.014 µs]
Found 204 outliers among 1000 measurements (20.40%)
21 (2.10%) high mild
183 (18.30%) high severe
escape_ascii/new_cargo_toml
time: [16.404 µs 16.438 µs 16.483 µs]
Found 88 outliers among 1000 measurements (8.80%)
56 (5.60%) high mild
32 (3.20%) high severe
```
Random: 1.7006ms => 1.6953ms (<1% speedup)
Binary: 225.40µs => 165.63µs (26% speedup)
Text: 23.699µs => 16.438µs (30% speedup)
Stabilize const `ptr::write*` and `mem::replace`
Since `const_mut_refs` and `const_refs_to_cell` have been stabilized, we may now also stabilize the ability to write to places during const evaluation inside our library API. So, we now propose the `const fn` version of `ptr::write` and its variants. This allows us to also stabilize `mem::replace` and `ptr::replace`.
- const `mem::replace`: https://github.com/rust-lang/rust/issues/83164#issuecomment-2338660862
- const `ptr::write{,_bytes,_unaligned}`: https://github.com/rust-lang/rust/issues/86302#issuecomment-2330275266
Their implementation requires an additional internal stabilization of `const_intrinsic_forget`, which is required for `*::write*` and thus `*::replace`. Thus we const-stabilize the internal intrinsics `forget`, `write_bytes`, and `write_via_move`.
Port sort-research-rs test suite to Rust stdlib tests
This PR is a followup to https://github.com/rust-lang/rust/pull/124032. It replaces the tests that test the various sort functions in the standard library with a test-suite developed as part of https://github.com/Voultapher/sort-research-rs. The current tests suffer a couple of problems:
- They don't cover important real world patterns that the implementations take advantage of and execute special code for.
- The input lengths tested miss out on code paths. For example, important safety property tests never reach the quicksort part of the implementation.
- The miri side is often limited to `len <= 20` which means it very thoroughly tests the insertion sort, which accounts for 19 out of 1.5k LoC.
- They are split into to core and alloc, causing code duplication and uneven coverage.
- ~~The randomness is tied to a caller location, wasting the space exploration capabilities of randomized testing.~~ The randomness is not repeatable, as it relies on `std:#️⃣:RandomState::new().build_hasher()`.
Most of these issues existed before https://github.com/rust-lang/rust/pull/124032, but they are intensified by it. One thing that is new and requires additional testing, is that the new sort implementations specialize based on type properties. For example `Freeze` and non `Freeze` execute different code paths.
Effectively there are three dimensions that matter:
- Input type
- Input length
- Input pattern
The ported test-suite tests various properties along all three dimensions, greatly improving test coverage. It side-steps the miri issue by preferring sampled approaches. For example the test that checks if after a panic the set of elements is still the original one, doesn't do so for every single possible panic opportunity but rather it picks one at random, and performs this test across a range of input length, which varies the panic point across them. This allows regular execution to easily test inputs of length 10k, and miri execution up to 100 which covers significantly more code. The randomness used is tied to a fixed - but random per process execution - seed. This allows for fully repeatable tests and fuzzer like exploration across multiple runs.
Structure wise, the tests are previously found in the core integration tests for `sort_unstable` and alloc unit tests for `sort`. The new test-suite was developed to be a purely black-box approach, which makes integration testing the better place, because it can't accidentally rely on internal access. Because unwinding support is required the tests can't be in core, even if the implementation is, so they are now part of the alloc integration tests. Are there architectures that can only build and test core and not alloc? If so, do such platforms require sort testing? For what it's worth the current implementation state passes miri `--target mips64-unknown-linux-gnuabi64` which is big endian.
The test-suite also contains tests for properties that were and are given by the current and previous implementations, and likely relied upon by users but weren't tested. For example `self_cmp` tests that the two parameters `a` and `b` passed into the comparison function are never references to the same object, which if the user is sorting for example a `&mut [Mutex<i32>]` could lead to a deadlock.
Instead of using the hashed caller location as rand seed, it uses seconds since unix epoch / 10, which given timestamps in the CI should be reasonably easy to reproduce, but also allows fuzzer like space exploration.
---
Test run-time changes:
Setup:
```
Linux 6.10
rustc 1.83.0-nightly (f79a912d9 2024-09-18)
AMD Ryzen 9 5900X 12-Core Processor (Zen 3 micro-architecture)
CPU boost enabled.
```
master: e9df22f
Before core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 869.6 ms ± 21.1 ms [User: 1327.6 ms, System: 95.1 ms]
Range (min … max): 845.4 ms … 917.0 ms 10 runs
# MIRIFLAGS="-Zmiri-disable-isolation" to get real time
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 738.44s
```
After core integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/coretests-219cbd0308a49e2f
Time (mean ± σ): 865.1 ms ± 14.7 ms [User: 1283.5 ms, System: 88.4 ms]
Range (min … max): 836.2 ms … 885.7 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/core
finished in 752.35s
```
Before alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 295.0 ms ± 9.9 ms [User: 719.6 ms, System: 35.3 ms]
Range (min … max): 284.9 ms … 319.3 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 322.75s
```
After alloc unit tests:
```
LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloc-19c15e6e8565aa54
Time (mean ± σ): 97.4 ms ± 4.1 ms [User: 297.7 ms, System: 28.6 ms]
Range (min … max): 92.3 ms … 109.2 ms 27 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 309.18s
```
Before alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 103.2 ms ± 1.7 ms [User: 135.7 ms, System: 39.4 ms]
Range (min … max): 99.7 ms … 107.3 ms 28 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 231.35s
```
After alloc integration tests:
```
$ LD_LIBRARY_PATH=build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/ hyperfine build/x86_64-unknown-linux-gnu/stage0-std/x86_64-unknown-linux-gnu/release/deps/alloctests-439e7300c61a8046
Time (mean ± σ): 379.8 ms ± 4.7 ms [User: 4620.5 ms, System: 1157.2 ms]
Range (min … max): 373.6 ms … 386.9 ms 10 runs
$ MIRIFLAGS="-Zmiri-disable-isolation" ./x.py miri library/alloc
finished in 449.24s
```
In my opinion the results don't change iterative library development or CI execution in meaningful ways. For example currently the library doc-tests take ~66s and incremental compilation takes 10+ seconds. However I only have limited knowledge of the various local development workflows that exist, and might be missing one that is significantly impacted by this change.
Stabilize const `{slice,array}::from_mut`
This PR stabilizes the following APIs as const stable as of rust `1.83`:
```rs
// core::array
pub const fn from_mut<T>(s: &mut T) -> &mut [T; 1];
// core::slice
pub const fn from_mut<T>(s: &mut T) -> &mut [T];
```
This is made possible by `const_mut_refs` being stabilized (yay).
Tracking issue: #90206
This commit is a followup to https://github.com/rust-lang/rust/pull/124032. It
replaces the tests that test the various sort functions in the standard library
with a test-suite developed as part of
https://github.com/Voultapher/sort-research-rs. The current tests suffer a
couple of problems:
- They don't cover important real world patterns that the implementations take
advantage of and execute special code for.
- The input lengths tested miss out on code paths. For example, important safety
property tests never reach the quicksort part of the implementation.
- The miri side is often limited to `len <= 20` which means it very thoroughly
tests the insertion sort, which accounts for 19 out of 1.5k LoC.
- They are split into to core and alloc, causing code duplication and uneven
coverage.
- The randomness is not repeatable, as it
relies on `std:#️⃣:RandomState::new().build_hasher()`.
Most of these issues existed before
https://github.com/rust-lang/rust/pull/124032, but they are intensified by it.
One thing that is new and requires additional testing, is that the new sort
implementations specialize based on type properties. For example `Freeze` and
non `Freeze` execute different code paths.
Effectively there are three dimensions that matter:
- Input type
- Input length
- Input pattern
The ported test-suite tests various properties along all three dimensions,
greatly improving test coverage. It side-steps the miri issue by preferring
sampled approaches. For example the test that checks if after a panic the set of
elements is still the original one, doesn't do so for every single possible
panic opportunity but rather it picks one at random, and performs this test
across a range of input length, which varies the panic point across them. This
allows regular execution to easily test inputs of length 10k, and miri execution
up to 100 which covers significantly more code. The randomness used is tied to a
fixed - but random per process execution - seed. This allows for fully
repeatable tests and fuzzer like exploration across multiple runs.
Structure wise, the tests are previously found in the core integration tests for
`sort_unstable` and alloc unit tests for `sort`. The new test-suite was
developed to be a purely black-box approach, which makes integration testing the
better place, because it can't accidentally rely on internal access. Because
unwinding support is required the tests can't be in core, even if the
implementation is, so they are now part of the alloc integration tests. Are
there architectures that can only build and test core and not alloc? If so, do
such platforms require sort testing? For what it's worth the current
implementation state passes miri `--target mips64-unknown-linux-gnuabi64` which
is big endian.
The test-suite also contains tests for properties that were and are given by the
current and previous implementations, and likely relied upon by users but
weren't tested. For example `self_cmp` tests that the two parameters `a` and `b`
passed into the comparison function are never references to the same object,
which if the user is sorting for example a `&mut [Mutex<i32>]` could lead to a
deadlock.
Instead of using the hashed caller location as rand seed, it uses seconds since
unix epoch / 10, which given timestamps in the CI should be reasonably easy to
reproduce, but also allows fuzzer like space exploration.
[`cfg_match`] Generalize inputs
cc #115585
Changes the input type from `item` to `tt`, which makes the macro have the same functionality of `cfg_if`.
Also adds a test to ensure that `stmt_expr_attributes` is not triggered.
Improve documentation for <integer>::from_str_radix
Two improvements to the documentation:
- Document `-` as a valid character for signed integer destinations
- Make the documentation even more clear that extra whitespace and non-digit characters is invalid. Many other languages, e.g. c++, are very permissive in string to integer routines and simply try to consume as much as they can, ignoring the rest. This is trying to make the transition for developers who are used to the conversion semantics in these languages a bit easier.
In the implementation of `force_mut`, I chose performance over safety.
For `LazyLock` this isn't really a choice; the code has to be unsafe.
But for `LazyCell`, we can have a full-safe implementation, but it will
be a bit less performant, so I went with the unsafe approach.