move checking ptr tracking on item pop into cold helper function
Before:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 6.729 s ± 0.050 s [User: 6.608 s, System: 0.124 s]
Range (min … max): 6.665 s … 6.799 s 5 runs
Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 20.923 s ± 0.271 s [User: 20.386 s, System: 0.537 s]
Range (min … max): 20.580 s … 21.165 s 5 runs
```
After:
```
Benchmark 1: cargo miri run --manifest-path bench-cargo-miri/serde1/Cargo.toml
Time (mean ± σ): 6.562 s ± 0.023 s [User: 6.430 s, System: 0.135 s]
Range (min … max): 6.544 s … 6.594 s 5 runs
Benchmark 2: cargo miri run --manifest-path bench-cargo-miri/unicode/Cargo.toml
Time (mean ± σ): 20.375 s ± 0.228 s [User: 19.964 s, System: 0.413 s]
Range (min … max): 20.201 s … 20.736 s 5 runs
```
Nothing major, but we'll take it I guess. 🤷
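For readers unfamiliar with the pattern, here is a minimal sketch of moving a rarely-taken check into a cold helper. It is not the actual Miri code: `Tracker`, `tracked`, `reports` and `check_tracked` are hypothetical names, and tags are plain `u64`s. Only the hot-path/cold-path split is the point.

```rust
/// Hypothetical stand-in for the state that knows which pointer tags are tracked.
pub struct Tracker {
    tracked: Vec<u64>,
    pub reports: Vec<String>,
}

impl Tracker {
    pub fn new(tracked: Vec<u64>) -> Self {
        Tracker { tracked, reports: Vec::new() }
    }

    /// Hot path: called for every popped item.
    #[inline]
    pub fn item_popped(&mut self, tag: u64) {
        // Tracking is assumed to be rare, so the common case is one cheap check.
        if !self.tracked.is_empty() {
            self.check_tracked(tag);
        }
    }

    /// Cold path: kept out of line so it does not bloat the hot function.
    #[cold]
    #[inline(never)]
    fn check_tracked(&mut self, tag: u64) {
        if self.tracked.contains(&tag) {
            self.reports.push(format!("popped tracked tag {}", tag));
        }
    }
}
```

`#[cold]` hints to the compiler that the call is unlikely, and `#[inline(never)]` keeps the helper's body out of the caller.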
Fixes https://github.com/rust-lang/miri/issues/2132
stacked_borrows now has an item module, and its own FrameExtra. These
serve to protect the implementation of Item (which is a bunch of
bit-packing tricks) from the primary logic of Stacked Borrows, and the
FrameExtra we have separates Stacked Borrows more cleanly from the
interpreter itself.
The new strategy for checking protectors also makes some subtle
performance tradeoffs, so they are now documented in Stack::item_popped
because that function primarily benefits from them, and it also touches
every aspect of them.
Also, separating the actual CallId that is protecting a Tag from the Tag
makes it inconvenient to reproduce exactly the same protector errors, so
this also takes the opportunity to use slightly cleaner English in
those errors. We need to make some change, so we might as well make it good.
Previously, Item was a struct of a NonZeroU64, an Option which was
usually unset or irrelevant, and a 4-variant enum. So collectively, the
size of an Item was 24 bytes, but only 8 bytes were used for the most
part.
So this takes advantage of the fact that it is probably impossible to
exhaust the total space of SbTags, and steals 3 bits from it to pack the
whole struct into a single u64. This bit-packing means that we reduce
peak memory usage when Miri goes memory-bound by ~3x. We also get CPU
performance improvements of varying size, because not only are we simply
accessing less memory, we can now compare a Vec<Item> using a memcmp
because it does not have any padding.
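A rough sketch of the packing idea, for illustration only. The bit layout (tag in the low 61 bits, 2 bits of permission, 1 protector bit), the constant names, and the use of a plain `u64` tag are assumptions of this sketch, not necessarily what the real `Item` does.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Permission {
    Unique,
    SharedReadWrite,
    SharedReadOnly,
    Disabled,
}

/// The whole item in one u64: no padding, so slices of items compare bytewise.
#[derive(Clone, Copy, PartialEq, Eq)]
pub struct Item(u64);

// Assumed layout: bits 0..61 tag, bits 61..63 permission, bit 63 protected.
const TAG_MASK: u64 = u64::MAX >> 3;
const PERM_SHIFT: u32 = 61;
const PERM_MASK: u64 = 0b11 << PERM_SHIFT;
const PROTECTED_SHIFT: u32 = 63;
const PROTECTED_MASK: u64 = 1 << PROTECTED_SHIFT;

impl Item {
    pub fn new(tag: u64, perm: Permission, protected: bool) -> Self {
        // The three stolen bits must be free in the tag.
        assert!(tag <= TAG_MASK, "tag does not fit in 61 bits");
        let perm_bits: u64 = match perm {
            Permission::Unique => 0,
            Permission::SharedReadWrite => 1,
            Permission::SharedReadOnly => 2,
            Permission::Disabled => 3,
        };
        Item(tag | (perm_bits << PERM_SHIFT) | ((protected as u64) << PROTECTED_SHIFT))
    }

    pub fn tag(self) -> u64 {
        self.0 & TAG_MASK
    }

    pub fn perm(self) -> Permission {
        match (self.0 & PERM_MASK) >> PERM_SHIFT {
            0 => Permission::Unique,
            1 => Permission::SharedReadWrite,
            2 => Permission::SharedReadOnly,
            _ => Permission::Disabled,
        }
    }

    pub fn protected(self) -> bool {
        self.0 & PROTECTED_MASK != 0
    }
}
```

With this layout `size_of::<Item>()` is 8 bytes rather than 24, which is where the roughly 3x memory reduction comes from.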
This adds a very simple LRU-like cache which stores the locations of
often-used tags. While the implementation is very simple, the cache hit
rate is incredible at ~99.9% on most programs, and often the element at
position 0 in the cache has a hit rate of 90%. So the sub-optimality of
this cache basically vanishes into the noise in a profile.
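To give a feel for how little machinery an LRU-like cache needs, here is a minimal sketch. `CachedStack`, `CACHE_LEN`, and the hit/miss counters are hypothetical, tags are plain `u64`s, and the sketch assumes the stack is never modified after construction, so it has no cache invalidation at all (a real implementation has to handle that).

```rust
const CACHE_LEN: usize = 4;

pub struct CachedStack {
    /// Tags in the stack, bottom to top.
    items: Vec<u64>,
    /// Most recently used (tag, index) pairs, most recent first.
    cache: [Option<(u64, usize)>; CACHE_LEN],
    pub hits: u64,
    pub misses: u64,
}

impl CachedStack {
    pub fn new(items: Vec<u64>) -> Self {
        CachedStack { items, cache: [None; CACHE_LEN], hits: 0, misses: 0 }
    }

    /// Returns the index of `tag` in the stack, consulting the cache first.
    pub fn find(&mut self, tag: u64) -> Option<usize> {
        let slot = self
            .cache
            .iter()
            .position(|e| matches!(e, Some((t, _)) if *t == tag));
        if let Some(slot) = slot {
            self.hits += 1;
            // Move the hit entry to position 0, shifting the others back by one.
            self.cache[..=slot].rotate_right(1);
            return self.cache[0].map(|(_, idx)| idx);
        }
        self.misses += 1;
        // Slow path: scan the stack from the top.
        let idx = self.items.iter().rposition(|&t| t == tag)?;
        // Drop the last cache entry and put the new one at the front.
        self.cache.rotate_right(1);
        self.cache[0] = Some((tag, idx));
        Some(idx)
    }
}
```

Because a hit moves the entry to the front, repeated lookups of the same tag are served from position 0, which matches the high hit rate reported for that slot.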
Additionally, we keep a range which denotes where there might be an item
granting Unique permission in the stack, so that when we invalidate
Uniques we do not need to scan much of the stack, and often scan nothing
at all.
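The Unique range can be sketched as below. This is an illustration under simplifying assumptions: `Stack`, `Perm`, `unique_range`, `disable_uniques_from` and the `scanned` counter are hypothetical names, and the range is only kept conservative (it may cover items that are not Unique, but never misses one that is).

```rust
use std::ops::Range;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Perm {
    Unique,
    Shared,
    Disabled,
}

pub struct Stack {
    pub perms: Vec<Perm>,
    /// Every Unique item lies inside this range; items outside are never Unique.
    unique_range: Range<usize>,
    /// Number of items inspected while invalidating, for demonstration only.
    pub scanned: usize,
}

impl Stack {
    pub fn new() -> Self {
        Stack { perms: Vec::new(), unique_range: 0..0, scanned: 0 }
    }

    pub fn push(&mut self, perm: Perm) {
        let idx = self.perms.len();
        self.perms.push(perm);
        if perm == Perm::Unique {
            if self.unique_range.is_empty() {
                self.unique_range = idx..idx + 1;
            } else {
                self.unique_range.end = idx + 1;
            }
        }
    }

    /// Disables every Unique item at index `from` or above.
    pub fn disable_uniques_from(&mut self, from: usize) {
        // Only the overlap of [from, len) and the Unique range needs a scan.
        let start = from.max(self.unique_range.start);
        let end = self.unique_range.end.min(self.perms.len());
        if start < end {
            for p in &mut self.perms[start..end] {
                self.scanned += 1;
                if *p == Perm::Unique {
                    *p = Perm::Disabled;
                }
            }
        }
        // No Unique item remains at or above `from`.
        self.unique_range.end = self.unique_range.end.min(from);
    }
}
```

When the only Unique item sits at the bottom of a tall stack, invalidating from index 1 scans nothing, which is the "often scan nothing at all" case.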
* Store the local crates in an Rc<[CrateNum]>
* Move all the allocation history into Stacks
* Clean up the implementation of get_logs_relevant_to a bit
* Pass a ThreadInfo down to grant/access to get the current span lazily
* Rename add_* to log_* for clarity
* Hoist borrow_mut calls out of loops by tweaking the for_each signature
* Explain the parameters of check_protector a bit more
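As an illustration of the borrow_mut hoisting item above, here is a minimal sketch. `Global`, `Stacks`, and this `for_each` signature are hypothetical and much simpler than the real ones; the point is only that the `RefCell` is borrowed once, outside the loop, and the closure receives the already-borrowed state as a parameter.

```rust
use std::cell::RefCell;

/// Hypothetical shared state that the per-stack callback needs to mutate.
pub struct Global {
    pub log: Vec<usize>,
}

pub struct Stacks {
    pub stacks: Vec<u32>,
}

impl Stacks {
    /// Borrows `global` once and passes the same `&mut Global` to every call of `f`,
    /// instead of having `f` call `borrow_mut()` on each iteration.
    pub fn for_each(
        &mut self,
        global: &RefCell<Global>,
        mut f: impl FnMut(usize, &mut u32, &mut Global),
    ) {
        let mut global = global.borrow_mut(); // hoisted out of the loop
        for (i, stack) in self.stacks.iter_mut().enumerate() {
            f(i, stack, &mut *global);
        }
    }
}
```

A call such as `stacks.for_each(&global, |i, s, g| { *s += 1; g.log.push(i); })` then pays for the dynamic borrow check once per traversal rather than once per element.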