Prevent futex_wait from actually waiting if a concurrent waker was executed before us
Fixes#2223
Two SC fences were placed in `futex_wake` (after the caller has changed `addr`), and in `futex_wait` (before we read `addr`). This guarantees that `futex_wait` sees the value written to `addr` before the last `futex_wake` call, should one exists, and avoid going into sleep with no one else to wake us up.
ada7b72a87/src/concurrency/weak_memory.rs (L324-L326)
Earlier I proposed to use `fetch_add(0)` to read the latest value in MO, though this isn't the proper way to do it and breaks aliasing: syscall caller may pass in a `*const` from a `&` and Miri complains about write to a `SharedReadOnly` location, causing this test to fail.
ada7b72a87/tests/pass/concurrency/linux-futex.rs (L56-L68)
ui_test: ensure all worker threads stay around
Also organize files such that the by far slowest test (weak_memory/consistency) always starts first. It still finishes last on my system... even after I halved the iteration count.
Fixes https://github.com/rust-lang/miri/issues/2204
Always show stderr on test failure.
fixes#2224
I overengineered the original thing to the point where it became fragile. Let's just always print stderr, unless it was already printed
test ui output also in rustc test suite
`@oli-obk` when I just tried this locally (`./x.py test src/tools/miri --stage 0`), it worked fine. What differences had you seen before?
Add more bench-cargo-miri programs
These example programs are derived from long-running (>15 minutes) tests in the test suites of highly-downloaded crates (if I have my way, that runtime will not be correct for long). They should serve as realistic but also somewhat pathological workloads for the interpreter.
The unicode program stresses the code which looks for adjacent and equal stacks to merge them.
The backtraces program has an uncommonly large working set of borrow tags per borrow stack.
This also updates the .gitignore to ignore files commonly emitted in the course of using these benchmark programs.
---
The benchmark programs are so-named to avoid confusingly duplicating the names of the crates they are benchmarking. Is that a good idea? I started doing this so that I could use `cargo add` but now I'm not entirely sold on the names.
These example programs are derived from long-running (>15 minutes) tests
in the test suites of highly-downloaded crates. They should serve as
realistic but also somewhat pathological workloads for the interpreter.
The unicode program stresses the code which looks for adjacent and equal
stacks to merge them.
The backtrace program has an uncommonly large working set of borrow tags
per borrow stack.
This also updates the .gitignore to ignore files commonly emitted in the
course of using these benchmark programs.