handle Box with allocators
This is the Miri side of https://github.com/rust-lang/rust/pull/98847.
Thanks `@DrMeepster` for doing most of the work of getting this test case to pass in Miri. :)
Optimizing Stacked Borrows (part 1?): Cache locations of Tags in a Borrow Stack
Before this PR, a profile of Miri under almost any workload points quite squarely at these regions of code as being incredibly hot (each being ~40% of cycles):
dadcbebfbd/src/stacked_borrows.rs (L259-L269)dadcbebfbd/src/stacked_borrows.rs (L362-L369)
This code is one of at least three reasons that stacked borrows analysis is super-linear: These are both linear in the number of borrows in the stack and they are positioned along the most commonly-taken paths.
I'm addressing the first loop (which is in `Stack::find_granting`) by adding a very very simple sort of LRU cache implemented on a `VecDeque`, which maps recently-looked-up tags to their position in the stack. For `Untagged` access we fall back to the same sort of linear search. But as far as I can tell there are never enough `Untagged` items to be significant.
I'm addressing the second loop by keeping track of the region of stack where there could be items granting `Permission::Unique`. This optimization is incredibly effective because `Read` access tends to dominate and many trips through this code path now skip the loop entirely.
These optimizations result in pretty enormous improvements:
Without raw pointer tagging, `mse` 34.5s -> 2.4s, `serde1` 5.6s -> 3.6s
With raw pointer tagging, `mse` 35.3s -> 2.4s, `serde1` 5.7s -> 3.6s
And there is hardly any impact on memory usage:
Memory usage on `mse` 844 MB -> 848 MB, `serde1` 184 MB -> 184 MB (jitter on these is a few MB).
Support (stat/fstat/lstat)64 on macos
"In order to accommodate advanced capabilities of newer file systems,
the struct stat, struct statfs, and struct dirent data structures
were updated in Mac OSX 10.5."
"TRANSITIONAL DESCRIPTION (NOW DEPRECATED)
The fstat64, lstat64 and stat64 routines are equivalent to their
corresponding non-64-suffixed routine, when 64-bit inodes are in
effect. They were added before there was support for the symbol
variants, and so are now deprecated. Instead of using these, set
the _DARWIN_USE_64_BIT_INODE macro before including header files to
force 64-bit inode support. The stat64 structure used by these deprecated routines is the same
as the stat structure when 64-bit inodes are in effect (see above)."
"HISTORY
An lstat() function call appeared in 4.2BSD. The stat64(),
fstat64(), and lstat64() system calls first appeared in Mac OS X
10.5 (Leopard) and are now deprecated in favor of the corresponding
symbol variants. The fstatat() system call appeared in OS X 10.10"
"In order to accommodate advanced capabilities of newer file systems,
the struct stat, struct statfs, and struct dirent data structures
were updated in Mac OSX 10.5."
"TRANSITIONAL DESCRIPTION (NOW DEPRECATED)
The fstat64, lstat64 and stat64 routines are equivalent to their
corresponding non-64-suffixed routine, when 64-bit inodes are in
effect. They were added before there was support for the symbol
variants, and so are now deprecated. Instead of using these, set
the _DARWIN_USE_64_BIT_INODE macro before including header files to
force 64-bit inode support.
The stat64 structure used by these deprecated routines is the same
as the stat structure when 64-bit inodes are in effect (see above)."
"HISTORY
An lstat() function call appeared in 4.2BSD. The stat64(),
fstat64(), and lstat64() system calls first appeared in Mac OS X
10.5 (Leopard) and are now deprecated in favor of the corresponding
symbol variants. The fstatat() system call appeared in OS X 10.10"
This adds a very simple LRU-like cache which stores the locations of
often-used tags. While the implementation is very simple, the cache hit
rate is incredible at ~99.9% on most programs, and often the element at
position 0 in the cache has a hit rate of 90%. So the sub-optimality of
this cache basicaly vanishes into the noise in a profile.
Additionally, we keep a range which denotes where there might be an item
granting Unique permission in the stack, so that when we invalidate
Uniques we do not need to scan much of the stack, and often scan nothing
at all.
Enable permissive provenance by default
This completes the plan laid out in https://github.com/rust-lang/miri/issues/2133:
- We use permissive provenance with wildcard pointers by default.
- We print a warning on int2ptr casts. `-Zmiri-permissive-provenance` suppresses the warning; `-Zmiri-strict-provenance` turns it into a hard error.
- Raw pointer tagging is now always enabled, so we remove the `-Zmiri-tag-raw-pointers` flag and the code for untagged pointers. (Passing the flag still works, for compatibility -- but we just ignore it, with a warning.)
We also fix an intptrcast issue:
- Only live allocations are considered when computing the AllocId from an address.
So, finally, Miri has a good story for ptr2int2ptr roundtrips *and* no weird false negatives when doing raw pointer stuff with Stacked Borrows. :-) 🎉 Thanks a lot to everyone who helped with this, in particular `@carbotaniuman` who convinced me this is even possible.
Fixes https://github.com/rust-lang/miri/issues/2133
Fixes https://github.com/rust-lang/miri/issues/1866
Fixes https://github.com/rust-lang/miri/issues/1993
Prevent futex_wait from actually waiting if a concurrent waker was executed before us
Fixes#2223
Two SC fences were placed in `futex_wake` (after the caller has changed `addr`), and in `futex_wait` (before we read `addr`). This guarantees that `futex_wait` sees the value written to `addr` before the last `futex_wake` call, should one exists, and avoid going into sleep with no one else to wake us up.
ada7b72a87/src/concurrency/weak_memory.rs (L324-L326)
Earlier I proposed to use `fetch_add(0)` to read the latest value in MO, though this isn't the proper way to do it and breaks aliasing: syscall caller may pass in a `*const` from a `&` and Miri complains about write to a `SharedReadOnly` location, causing this test to fail.
ada7b72a87/tests/pass/concurrency/linux-futex.rs (L56-L68)
make Miri's scheduler proper round-robin
When thread N blocks or yields, we activate thread N+1 next, rather than always activating thread 0. This should guarantee that as long as all threads regularly yield, each thread eventually takes a step again.
Fixes the "multiple loops that yield playing ping-pong" part of https://github.com/rust-lang/miri/issues/1388.
`@cbeuw` I hope this doesn't screw up the scheduler-dependent tests you are adding in your PR.
run Clippy on CI
and fix some things it complains about. Also use `rustup-toolchain` script on CI (reduces code duplication, and good thing to make sure it keeps working, since we recommend it in the docs).
I left `ui_test` out for now; I'll leave those nits to `@oli-obk.` ;)
Save a created event for zero-size reborrows
Currently, we don't save a created event for zero-sized reborrows. Attempting to use something from a zero-sized reborrow is surprisingly common, for example on `minimal-lexical==0.2.1` we previously just emit this:
```
Undefined Behavior: attempting a write access using <187021> at alloc72933[0x0], but that tag does not exist in the borrow stack for this location
--> /root/rust/library/core/src/ptr/mod.rs:1287:9
|
1287 | copy_nonoverlapping(&src as *const T, dst, 1);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| attempting a write access using <187021> at alloc72933[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of an access at alloc72933[0x0..0x8]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
= note: inside `std::ptr::write::<u64>` at /root/rust/library/core/src/ptr/mod.rs:1287:9
note: inside `minimal_lexical::stackvec::StackVec::push_unchecked` at /root/build/src/stackvec.rs:82:13
--> /root/build/src/stackvec.rs:82:13
|
82 | ptr::write(self.as_mut_ptr().add(self.len()), value);
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
... backtrace continues...
```
Which leaves us with the question "where did we make this pointer?" because for every other diagnostic you get a "was created by" note, so I suspect people might be tempted to think there is a Miri bug here. I certainly was.
---
This code duplication is so awful, I'm going to take a look at cleaning it up later. The fact that `ptr_get_alloc_id` can fail in this situation makes things annoying.
Initial work on Miri permissive-exposed-provenance
Miri portions of the changes for portions of a permissive ptr-to-int model for Miri. This is more restrictive than what we currently have so it will probably need a flag once I figure out how to hook that up.
> This implements a form of permissive exposed-address provenance, wherein the only way to expose the address is with a cast to usize (ideally expose_addr). This is more restrictive than C in that stuff like reading the representation bytes (via unions, type-punning, transmute) does not expose the address, only expose_addr. This is less restrictive than C in that a pointer casted from an integer has union provenance of all exposed pointers, not any udi stuff.
There's a few TODOs here, namely related to `fn memory_read` and friends. We pass it the maybe/unreified provenance before `ptr_get_alloc` reifies it into a concrete one, so it doesn't have the `AllocId` (or the SB tag, but that's getting ahead of ourselves). One way this could be fixed is changing `ptr_get_alloc` and (`ptr_try_get_alloc_id` on the rustc side) to return a pointer with the tag fixed up. We could also take in different arguments, but I'm not sure what works best.
The other TODOs here are how permissive this model could be. This currently does not enforce that a ptr-to-int cast happens before the corresponding int-to-ptr (colloquial meaning of happens before, not atomic meaning). Example:
```
let ptr = 0x2000 as *const i32;
let a: i32 = 5;
let a_ptr = &a as *const i32;
// value is 0x2000;
a_ptr as usize;
println!("{}", unsafe { *ptr }); // this is valid
```
We also allow the resulting pointer to dereference different non-contiguous allocations (the "not any udi stuff" mentioned above), which I'm not sure if is allowed by LLVM.
This is the Miri side of https://github.com/rust-lang/rust/pull/95826.
Adjust diagnostics assertion so we don't ICE in setup
Fixes https://github.com/rust-lang/miri/issues/2076 just by handling diagnostics produced during setup. The tracking notes don't have any spans but it's better than an ICE.
It looks like we leak allocations 1..20, and allocations 13..19 don't have any creation notes, and 14 only has a `FreedAlloc` alloc tracking diagnostic.
Make allow_data_races_* public and use it during EnvVars::cleanup
Fixes https://github.com/rust-lang/miri/issues/2020
I've tried for hours now to come up with a test case for this ICE with no luck. I suspect there's something about the way the data race detection works under these conditions that I just don't understand 😩.
But I tried this change out on a handful of crates and I don't see any more ICEs of this form. For whatever reason it seems like `bastion==0.4.5` is a good way to run into this, with the flags
```
MIRIFLAGS="-Zmiri-tag-raw-pointers -Zmiri-panic-on-unsupported -Zmiri-disable-isolation" cargo +miri miri test --no-fail-fast --doc
```
I think all the cases I've run into with this involve both `-Zmiri-panic-on-unsupported` and `-Zmiri-tag-raw-pointers`, so it could be that the combination of an unexpected panic and a machine halt is required.
Pass the correct size to the AllocRange for log_creation
Fixes https://github.com/rust-lang/miri/issues/2127
I guess all I needed was a bit of sleep and reassurance that this diagnostic is the wrong part of that situation.
Print spans where tags are created and invalidated
5225225 called this "automatic tag tracking" and I think that may be a reasonable description, but I would like to kill tag tracking as a primary use of Miri if possible. Tag tracking isn't always possible; for example if the UB is only detected with isolation off and the failing tag is made unstable by removing isolation. (also it's bad UX to run the tool twice)
This is just one of the things we can do with https://github.com/rust-lang/miri/pull/2024
The memory usage of this is _shockingly_ low, I think because the memory usage of Miri is driven by allocations where each byte ends up with its own very large stack. The memory usage in this change is linear with the number of tags, not tags * bytes. If memory usage gets out of control we can cap the number of events we save per allocation, from experience we tend to only use the most recent few in diagnostics but of course there's no guarantee of that so if we can manage to keep everything that would be best.
In many cases now I can tell exactly what these codebases are doing wrong just from the new outputs here, which I think is extremely cool.
New helps generated with plain old `cargo miri test` on `rust-argon2` v1.0.0:
```
test argon2::tests::single_thread_verification_multi_lane_hash ... error: Undefined Behavior: trying to reborrow <1485898> for Unique permission at alloc110523[0x0], but that tag does not exist in the borrow stack for this location
--> /home/ben/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/mem/manually_drop.rs:89:9
|
89 | slot.value
| ^^^^^^^^^^
| |
| trying to reborrow <1485898> for Unique permission at alloc110523[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of a reborrow at alloc110523[0x0..0x20]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <1485898> was created by a retag at offsets [0x0..0x20]
--> src/memory.rs:42:13
|
42 | vec.push(unsafe { &mut (*ptr) });
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: <1485898> was later invalidated at offsets [0x0..0x20]
--> src/memory.rs:42:31
|
42 | vec.push(unsafe { &mut (*ptr) });
| ^^^^^^^^^^^
```
And with `-Zmiri-tag-raw-pointers` on `slab` v0.4.5
```
error: Undefined Behavior: trying to reborrow <2915> for Unique permission at alloc1418[0x0], but that tag does not exist in the borrow stack for this location
--> /tmp/slab-0.4.5/src/lib.rs:835:16
|
835 | match (&mut *ptr1, &mut *ptr2) {
| ^^^^^^^^^^
| |
| trying to reborrow <2915> for Unique permission at alloc1418[0x0], but that tag does not exist in the borrow stack for this location
| this error occurs as part of a reborrow at alloc1418[0x0..0x10]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <2915> was created by a retag at offsets [0x0..0x10]
--> /tmp/slab-0.4.5/src/lib.rs:833:20
|
833 | let ptr1 = self.entries.get_unchecked_mut(key1) as *mut Entry<T>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: <2915> was later invalidated at offsets [0x0..0x20]
--> /tmp/slab-0.4.5/src/lib.rs:834:20
|
834 | let ptr2 = self.entries.get_unchecked_mut(key2) as *mut Entry<T>;
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
And without raw pointer tagging, `cargo miri test` on `half` v1.8.2
```
error: Undefined Behavior: trying to reborrow <untagged> for Unique permission at alloc1340[0x0], but that tag only grants SharedReadOnly permission for this location
--> /home/ben/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/slice/raw.rs:141:9
|
141 | &mut *ptr::slice_from_raw_parts_mut(data, len)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
| |
| trying to reborrow <untagged> for Unique permission at alloc1340[0x0], but that tag only grants SharedReadOnly permission for this location
| this error occurs as part of a reborrow at alloc1340[0x0..0x6]
|
= help: this indicates a potential bug in the program: it performed an invalid operation, but the rules it violated are still experimental
= help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: tag was most recently created at offsets [0x0..0x6]
--> /tmp/half-1.8.2/src/slice.rs:309:22
|
309 | let length = self.len();
| ^^^^^^^^^^
help: this tag was also created here at offsets [0x0..0x6]
--> /tmp/half-1.8.2/src/slice.rs:308:23
|
308 | let pointer = self.as_ptr() as *mut u16;
| ^^^^^^^^^^^^^
```
The second suggestion is close to guesswork, but from experience it tends to be correct (as in, it tends to locate the pointer the user wanted) more often that it doesn't.
* Store the local crates in an Rc<[CrateNum]>
* Move all the allocation history into Stacks
* Clean up the implementation of get_logs_relevant_to a bit
Use atomic RMW for `{mutex, rwlock, cond, srwlock}_get_or_create_id` functions
This is required for #1963
`{mutex, rwlock, cond, srwlock}_get_or_create_id()` currently checks whether an ID field is 0 using an atomic read, allocate one and get a new ID if it is, then write it in a separate atomic write. This is fine without weak memory. For instance, in `pthread_mutex_lock` which may be called by two threads concurrently, only one thread can read 0, create and then write a new ID, the later-run thread will always see the newly created ID and never 0.
```rust
fn pthread_mutex_lock(&mut self, mutex_op: &OpTy<'tcx, Tag>) -> InterpResult<'tcx, i32> {
let this = self.eval_context_mut();
let kind = mutex_get_kind(this, mutex_op)?.check_init()?;
let id = mutex_get_or_create_id(this, mutex_op)?;
let active_thread = this.get_active_thread();
```
However, with weak memory behaviour, both threads may read 0: the first thread has to see 0 because nothing else was written to it, and the second thread is not guaranteed to observe the latest value, causing a duplicate mutex to be created and both threads "successfully" acquiring the lock at the same time.
This is a pretty typical pattern requiring the use of atomic RMWs. RMW *always* reads the latest value in a location, so only one thread can create the new mutex and ID, all others scheduled later will see the new ID.
* Pass a ThreadInfo down to grant/access to get the current span lazily
* Rename add_* to log_* for clarity
* Hoist borrow_mut calls out of loops by tweaking the for_each signature
* Explain the parameters of check_protector a bit more
Add a command line flag to avoid printing to stdout and stderr
This is practical for tests that don't actually care about the output and thus don't want it intermingled with miri's warnings, errors or ICEs
fixes#2083