Improve isatty support
Per https://github.com/rust-lang/miri/issues/2292#issuecomment-1171858283, this is an attempt at
> do something more clever with Miri's `isatty` shim
Since Unix -> Unix is very simple, I'm starting with a patch that just does that. Happy to augment/rewrite this based on feedback.
The linked file in libtest specifically only supports stdout. If we're doing this to support terminal applications, I think it would be strange to support one but not all 3 of the standard streams.
The `atty` crate contains a bunch of extra logic that libtest does not contain, in order to support MSYS terminals: db8d55f88e so I think if we're going to do Windows support, we should probably access all that logic somehow. I think it's pretty clear that the implementation is not going to change, so I think if we want to, pasting the contents of the `atty` crate into Miri is on the table, instead of taking a dependency.
use PlaceTy visitor and dedup sime retagging code
I benchmarked this and as far as I can see the difference to the old code is totally within noise. And this makes the code a lot simpler and removes duplication so yay. :)
Make "./miri {build,run,test}" use debug assertions but "./miri install" not
This makes `./miri run`/`./miri test` use the full set of debug assertions (including the rather expensive ones that check consistency of the Stacked Borrows cache), but `./miri install` installs a Miri *without* those debug assertions.
That's the same behavior as cargo, and helps catch Miri bugs with the test suite while making installed Miri usable for larger runs.
move checking ptr tracking on item pop into cold helper function
Before:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 6.729 s ± 0.050 s [User: 6.608 s, System: 0.124 s]
Range (min … max): 6.665 s … 6.799 s 5 runs
Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 20.923 s ± 0.271 s [User: 20.386 s, System: 0.537 s]
Range (min … max): 20.580 s … 21.165 s 5 runs
```
After:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 6.562 s ± 0.023 s [User: 6.430 s, System: 0.135 s]
Range (min … max): 6.544 s … 6.594 s 5 runs
Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 20.375 s ± 0.228 s [User: 19.964 s, System: 0.413 s]
Range (min … max): 20.201 s … 20.736 s 5 runs
```
Nothing major, but we'll take it I guess. 🤷
Fixes https://github.com/rust-lang/miri/issues/2132
reborrow error: clarify that we are reborrowing *from* that tag
`@saethlin` I found the current message not entirely clear, so what do you think about this?
Add a benchmark of the hang-on-test-failure code path
This is the code pattern that produces the performance problem in https://github.com/rust-lang/miri/issues/2273
I figured out what I was stuck on in https://github.com/rust-lang/miri/pull/2315#discussion_r916387919. For a while I was just doing `let x: &[u8] = &[0u8; 4096];` but that doesn't produce the runtime inside `Stack::item_popped` that I was looking for, I think because this allocation is never deallocated. But with `Vec`, I get the profile I'm looking for.
Optimizing Stacked Borrows (part 2): Shrink Item
This moves protectors out of `Item`, storing them both in a global `HashSet` which contains all currently-protected tags as well as a `Vec<SbTag>` on each `Frame` so that when we return from a function we know which tags to remove from the protected set.
This also bit-packs the 64-bit tag and the 2-bit permission together when they are stored in memory. This means we theoretically run out of tags sooner, but I doubt that limit will ever be hit.
Together these optimizations reduce the memory footprint of Miri when executing programs which stress Stacked Borrows by ~66%. For example, running a test with isolation off which only panics currently peaks at ~19 GB, with this PR it peaks at ~6.2 GB.
To-do
- [x] Enforce the 62-bit limit
- [x] Decide if there is a better order to pack the tag and permission in
- [x] Wait for `UnsafeCell` to become infectious, or express offsets + tags in the global protector set
Benchmarks before:
```
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/backtraces/Cargo.toml
Time (mean ± σ): 8.948 s ± 0.253 s [User: 8.752 s, System: 0.158 s]
Range (min … max): 8.619 s … 9.279 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/mse/Cargo.toml
Time (mean ± σ): 2.129 s ± 0.037 s [User: 1.849 s, System: 0.248 s]
Range (min … max): 2.086 s … 2.176 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 3.334 s ± 0.017 s [User: 3.211 s, System: 0.103 s]
Range (min … max): 3.315 s … 3.352 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde2/Cargo.toml
Time (mean ± σ): 3.316 s ± 0.038 s [User: 3.207 s, System: 0.095 s]
Range (min … max): 3.282 s … 3.375 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 6.391 s ± 0.323 s [User: 5.928 s, System: 0.412 s]
Range (min … max): 6.090 s … 6.917 s 5 runs
```
After:
```
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/backtraces/Cargo.toml
Time (mean ± σ): 6.955 s ± 0.051 s [User: 6.807 s, System: 0.132 s]
Range (min … max): 6.900 s … 7.038 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/mse/Cargo.toml
Time (mean ± σ): 1.784 s ± 0.012 s [User: 1.627 s, System: 0.156 s]
Range (min … max): 1.772 s … 1.797 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 2.505 s ± 0.095 s [User: 2.311 s, System: 0.096 s]
Range (min … max): 2.405 s … 2.603 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/serde2/Cargo.toml
Time (mean ± σ): 2.449 s ± 0.031 s [User: 2.306 s, System: 0.100 s]
Range (min … max): 2.395 s … 2.467 s 5 runs
Benchmark 1: cargo +miri miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 3.667 s ± 0.110 s [User: 3.498 s, System: 0.140 s]
Range (min … max): 3.564 s … 3.814 s 5 runs
```
The decrease in system time is probably due to spending less time in the page fault handler.
stacked_borrow now has an item module, and its own FrameExtra. These
serve to protect the implementation of Item (which is a bunch of
bit-packing tricks) from the primary logic of Stacked Borrows, and the
FrameExtra we have separates Stacked Borrows more cleanly from the
interpreter itself.
The new strategy for checking protectors also makes some subtle
performance tradeoffs, so they are now documented in Stack::item_popped
because that function primarily benefits from them, and it also touches
every aspect of them.
Also separating the actual CallId that is protecting a Tag from the Tag
makes it inconvienent to reproduce exactly the same protector errors, so
this also takes the opportunity to use some slightly cleaner English in
those errors. We need to make some change, might as well make it good.
Previously, Item was a struct of a NonZeroU64, an Option which was
usually unset or irrelevant, and a 4-variant enum. So collectively, the
size of an Item was 24 bytes, but only 8 bytes were used for the most
part.
So this takes advantage of the fact that it is probably impossible to
exhaust the total space of SbTags, and steals 3 bits from it to pack the
whole struct into a single u64. This bit-packing means that we reduce
peak memory usage when Miri goes memory-bound by ~3x. We also get CPU
performance improvements of varying size, because not only are we simply
accessing less memory, we can now compare a Vec<Item> using a memcmp
because it does not have any padding.