Test benchmarks with `-Z panic-abort-tests`
During test execution, when a `#[bench]` benchmark is encountered it's executed once to check whether it works. Unfortunately that was not compatible with `-Z panic-abort-tests`: the feature works by spawning a subprocess for each test, which prevents the use of dynamic tests as we cannot pass closures to child processes, and before this PR the conversion from benchmark to test was done by turning benchmarks into dynamic tests whose closures execute the benchmark once.
The approach this PR took was to add two new kinds of `TestFn`s: `StaticBenchAsTestFn` and `DynBenchAsTestFn` (⚠️ **this is a breaking change** ⚠️). With that change, a `StaticBenchFn` can be converted into a `StaticBenchAsTestFn` without creating dynamic tests, and making it possible to test `#[bench]` functions with `-Z panic-abort-tests`. The subprocess test runner also had to be updated to perform the conversion from benchmark to test when appropriate.
Along with the bug fix, in the first commit I refactored how tests are executed: rather than executing the test function in multiple places across `libtest`, there is now a private `TestFn::into_runnable()` method, which returns either a `RunnableTest` or `RunnableBench`, on which you can call the `run()` method. This simplified the rest of the changes in the PR.
This PR is best reviewed commit-by-commit.
Fixes https://github.com/rust-lang/rust/issues/73509
Before this commit, both static and dynamic benches were converted to a
DynTestFn, with a boxed closure that ran the benchmarks exactly once.
While this worked, it conflicted with -Z panic-abort-tests as the flag
does not support dynamic tests. With this change, a StaticBenchFn is
converted to a StaticBenchAsTestFn, avoiding any dynamic test creation.
DynBenchFn is also converted to DynBenchAsTestFn for completeness.
Before this commit, tests were invoked in multiple places, especially
due to `-Z panic-abort-tests`, and adding a new test kind meant having
to chase down and update all these places.
This commit creates a new Runnable enum, and its children RunnableTest
and RunnableBench. The rest of the harness will now pass around the enum
rather than constructing and passing around boxed functions. The enum
has two children enums because invoking tests and invoking benchmarks
requires different parameters.
Add a `sysroot` crate to represent the standard library crates
This adds a dummy crate named `sysroot` to represent the standard library target instead of using the `test` crate. This allows the removal of `proc_macro` as a dependency of `test` allowing these 2 crates to build in parallel saving around 9 seconds locally.
By placing the stdout in a CDATA block we avoid almost all escaping, as
there's only two byte sequences you can't sneak into a CDATA and you can
handle that with some only slightly regrettable CDATA-splitting. I've
done this in at least two other implementations of the junit xml format
over the years and it's always worked out. The only quirk new to this
(for me) is smuggling newlines as 
 to avoid literal newlines in the
output.
Allow setting CFG_DISABLE_UNSTABLE_FEATURES to 0
Two locations check whether this build-time environment variable is defined. Allowing it to be explicitly disabled with a "0" value is useful, especially for integrating with external build systems.
Two locations check whether this build-time environment variable is
defined. Allowing it to be explicitly disabled with a "0" value is
useful, especially for integrating with external build systems.
- Remove unnecessary references and dereferences
- Use `.contains` instead of `a <= x && x <= b`
- Use `mem::take` instead of `mem::replace` where possible
libtest: run all tests in their own thread, if supported by the host
This reverts the threading changes of https://github.com/rust-lang/rust/pull/56243, which made it so that with `-j1`, the test harness does not spawn any threads. Those changes were done to enable Miri to run the test harness, but Miri supports threads nowadays, so this is no longer needed. Using a thread for each test is useful because the thread's name can be set to the test's name which makes panic messages consistent between `-j1` and `-j2` runs and also a bit more readable.
I did not revert the HashMap changes of https://github.com/rust-lang/rust/pull/56243; using a deterministic map seems fine for the test harness and the more deterministic testing is the better.
Fixes https://github.com/rust-lang/rust/issues/59122
Fixes https://github.com/rust-lang/rust/issues/70492
Sort tests at compile time, not at startup
Recently, another Miri user was trying to run `cargo miri test` on the crate `iced-x86` with `--features=code_asm,mvex`. This configuration has a startup time of ~18 minutes. That's ~18 minutes before any tests even start to run. The fact that this crate has over 26,000 tests and Miri is slow makes a lot of code which is otherwise a bit sloppy but fine into a huge runtime issue.
Sorting the tests when the test harness is created instead of at startup time knocks just under 4 minutes out of those ~18 minutes. I have ways to remove most of the rest of the startup time, but this change requires coordinating changes of both the compiler and libtest, so I'm sending it separately.
(except for doctests, because there is no compile-time harness)
The UNIX and WASI implementations use `isatty`. The Windows
implementation uses the same logic the `atty` crate uses, including the
hack needed to detect msys terminals.
Implement this trait for `File` and for `Stdin`/`Stdout`/`Stderr` and
their locked counterparts on all platforms. On UNIX and WASI, implement
it for `BorrowedFd`/`OwnedFd`. On Windows, implement it for
`BorrowedHandle`/`OwnedHandle`.
Based on https://github.com/rust-lang/rust/pull/91121
Co-authored-by: Matt Wilkinson <mattwilki17@gmail.com>
Do not panic when a test function returns Result::Err.
Rust's test library allows test functions to return a `Result`, so that the test is deemed to have failed if the function returns a `Result::Err` variant. Currently, this works by having `Result` implement the `Termination` trait and asserting in assert_test_result that `Termination::report()` indicates successful completion. This turns a `Result::Err` into a panic, which is caught and unwound in the test library.
This approach is problematic in certain environments where one wishes to save on both binary size and compute resources when running tests by:
* Compiling all code with `--panic=abort` to avoid having to generate unwinding tables, and
* Running most tests in-process to avoid the overhead of spawning new processes.
This change removes the intermediate panic step and passes a `Result::Err` directly through to the test runner.
To do this, it modifies `assert_test_result` to return a `Result<(), String>` where the `Err` variant holds what was previously the panic message. It changes the types in the `TestFn` enum to return `Result<(), String>`.
This tries to minimise the changes to benchmark tests, so it calls `unwrap()` on the `Result` returned by `assert_test_result`, effectively keeping the same behaviour as before.
Some questions for reviewers:
* Does the change to the return types in the enum `TestFn` constitute a breaking change for the library API? Namely, the enum definition is public but the test library indicates that "Currently, not much of this is meant for users" and most of the library API appears to be marked unstable.
* Is there a way to test this change, i.e., to test that no panic occurs if a test returns `Result::Err`?
* Is there a shorter, more idiomatic way to fold `Result<Result<T,E>,E>` into a `Result<T,E>` than the `fold_err` function I added?
Rust's test library allows test functions to return a Result, so that the test is deemed to have failed if the function returns a Result::Err variant. Currently, this works by having Result implement the Termination trait and asserting in assert_test_result that Termination::report() indicates successful completion. This turns a Result::Err into a panic, which is caught and unwound in the test library.
This approach is problematic in certain environments where one wishes to save on both binary size and compute resources when running tests by:
* Compiling all code with --panic=abort to avoid having to generate unwinding tables, and
* Running most tests in-process to avoid the overhead of spawning new processes.
This change removes the intermediate panic step and passes a Result::Err directly through to the test runner.
To do this, it modifies assert_test_result to return a Result<(), String> where the Err variant holds what was previously the panic message. It changes the types in the TestFn enum to return Result<(), String>.
This tries to minimise the changes to benchmark tests, so it calls unwrap() on the Result returned by assert_test_result, effectively keeping the same behaviour as before.
Currently the documentation of f64::min refers to "IEEE-754 2008" while the documentation of
f64::minimum refers to "IEEE 754-2019".
Note that one has the format IEEE,hyphen,number,space,year while the other is
IEEE,space,number,hyphen,year. The official IEEE site [1] uses the later format and it is also the
one most commonly used throughout the codebase.
Update all comments and - more importantly - documentation to consistently use the official format.
[1] https://standards.ieee.org/ieee/754/4211/
Recently, another Miri user was trying to run `cargo miri test` on the
crate `iced-x86` with `--features=code_asm,mvex`. This configuration has
a startup time of ~18 minutes. That's ~18 minutes before any tests even
start to run. The fact that this crate has over 26,000 tests and Miri is
slow makes a lot of code which is otherwise a bit sloppy but fine into a
huge runtime issue.
Sorting the tests when the test harness is created instead of at startup
time knocks just under 4 minutes out of those ~18 minutes. I have ways
to remove most of the rest of the startup time, but this change requires
coordinating changes of both the compiler and libtest, so I'm sending it
separately.
Bump bootstrap compiler to 1.61.0 beta
This PR bumps the bootstrap compiler to the 1.61.0 beta. The first commit changes the stage0 compiler, the second commit applies the "mechanical" changes and the third and fourth commits apply changes explained in the relevant comments.
r? `@Mark-Simulacrum`
Improve terse test output.
The current terse output gives 112 chars per line, which causes
wraparound for people using 100 char wide terminals, which is very
common.
This commit changes it to be exactly 100 wide, which makes the output
look much nicer.