miri: prune some atomic operation and raw pointer details from stacktrace
Since Miri removes `track_caller` frames from the stacktrace, adding that attribute can help make backtraces more readable (similar to how it makes panic locations better). I made them only show up with `cfg(miri)` to make sure the extra arguments induced by `track_caller` do not cause any runtime performance trouble.
This is also testing the waters for whether the libs team is okay with having these attributes in their code, or whether you'd prefer if we find some other way to do this. If you are fine with this, we will probably want to add it to a lot more functions (all the other atomic operations, to start).
Before:
```
error: Undefined Behavior: Data race detected between Atomic Load on Thread(id = 2) and Write on Thread(id = 1) at alloc1727 (current vector clock = VClock([9, 0, 6]), conflicting timestamp = VClock([0, 6]))
--> /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:2594:23
|
2594 | SeqCst => intrinsics::atomic_load_seqcst(dst),
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Data race detected between Atomic Load on Thread(id = 2) and Write on Thread(id = 1) at alloc1727 (current vector clock = VClock([9, 0, 6]), conflicting timestamp = VClock([0, 6]))
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside `std::sync::atomic::atomic_load::<usize>` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:2594:23
= note: inside `std::sync::atomic::AtomicUsize::load` at /home/r/.rustup/toolchains/miri/lib/rustlib/src/rust/library/core/src/sync/atomic.rs:1719:26
note: inside closure at ../miri/tests/fail/data_race/atomic_read_na_write_race1.rs:22:13
--> ../miri/tests/fail/data_race/atomic_read_na_write_race1.rs:22:13
|
22 | (&*c.0).load(Ordering::SeqCst)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
```
After:
```
error: Undefined Behavior: Data race detected between Atomic Load on Thread(id = 2) and Write on Thread(id = 1) at alloc1727 (current vector clock = VClock([9, 0, 6]), conflicting timestamp = VClock([0, 6]))
--> tests/fail/data_race/atomic_read_na_write_race1.rs:22:13
|
22 | (&*c.0).load(Ordering::SeqCst)
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Data race detected between Atomic Load on Thread(id = 2) and Write on Thread(id = 1) at alloc1727 (current vector clock = VClock([9, 0, 6]), conflicting timestamp = VClock([0, 6]))
|
= help: this indicates a bug in the program: it performed an invalid operation, and caused Undefined Behavior
= help: see https://doc.rust-lang.org/nightly/reference/behavior-considered-undefined.html for further information
= note: inside closure at tests/fail/data_race/atomic_read_na_write_race1.rs:22:13
```
Add `[f32]::sort_floats` and `[f64]::sort_floats`
It's inconvenient to sort a slice or Vec of floats, compared to sorting integers. To simplify numeric code, add a convenience method to `[f32]` and `[f64]` to sort them using `sort_unstable_by` with `total_cmp`.
Rename `<*{mut,const} T>::as_{const,mut}` to `cast_`
This renames the methods to use the `cast_` prefix instead of `as_` to
make it more readable and avoid confusion with `<*mut T>::as_mut()`
which is `unsafe` and returns a reference.
Sorry, didn't notice ACP process exists, opened https://github.com/rust-lang/libs-team/issues/51
See #92675
make vtable pointers entirely opaque
This implements the scheme discussed in https://github.com/rust-lang/unsafe-code-guidelines/issues/338: vtable pointers should be considered entirely opaque and not even readable by Rust code, similar to function pointers.
- We have a new kind of `GlobalAlloc` that symbolically refers to a vtable.
- Miri uses that kind of allocation when generating a vtable.
- The codegen backends, upon encountering such an allocation, call `vtable_allocation` to obtain an actually dataful allocation for this vtable.
- We need new intrinsics to obtain the size and align from a vtable (for some `ptr::metadata` APIs), since direct accesses are UB now.
I had to touch quite a bit of code that I am not very familiar with, so some of this might not make much sense...
r? `@oli-obk`
This renames the methods to use the `cast_` prefix instead of `as_` to
make it more readable and avoid confusion with `<*mut T>::as_mut()`
which is `unsafe` and returns a reference.
See #92675
Improve the function pointer docs
This is #97842 but for function pointers instead of tuples. The concept is basically the same.
* Reduce duplicate impls; show `fn (T₁, T₂, …, Tₙ)` and include a sentence saying that there exists up to twelve of them.
* Show `Copy` and `Clone`.
* Show auto traits like `Send` and `Sync`, and blanket impls like `Any`.
https://notriddle.com/notriddle-rustdoc-test/std/primitive.fn.html
* Reduce duplicate impls; show only the `fn (T)` and include a sentence
saying that there exists up to twelve of them.
* Show `Copy` and `Clone`.
* Show auto traits like `Send` and `Sync`, and blanket impls like `Any`.
core::any: replace some generic types with impl Trait
This gives a cleaner API since the caller only specifies the concrete type they usually want to.
r? `@yaahc`
Fix `Skip::next` for non-fused inner iterators
`iter.skip(n).next()` will currently call `nth` and `next` in succession on `iter`, without checking whether `nth` exhausts the iterator. Using `?` to propagate a `None` value returned by `nth` avoids this.
Use split_once in FromStr docs
Current implementation:
```rust
fn from_str(s: &str) -> Result<Self, Self::Err> {
let coords: Vec<&str> = s.trim_matches(|p| p == '(' || p == ')' )
.split(',')
.collect();
let x_fromstr = coords[0].parse::<i32>()?;
let y_fromstr = coords[1].parse::<i32>()?;
Ok(Point { x: x_fromstr, y: y_fromstr })
}
```
Creating the vector is not necessary, `split_once` does the job better.
Alternatively we could also remove `trim_matches` with `strip_prefix` and `strip_suffix`:
```rust
let (x, y) = s
.strip_prefix('(')
.and_then(|s| s.strip_suffix(')'))
.and_then(|s| s.split_once(','))
.unwrap();
```
The question is how much 'correctness' is too much and distracts from the example. In a real implementation you would also not unwrap (or originally access the vector without bounds checks), but implementing a custom Error and adding a `From<ParseIntError>` and implementing the `Error` trait adds a lot of code to the example which is not relevant to the `FromStr` trait.
Add assertion that `transmute_copy`'s U is not larger than T
This is called out as a safety requirement in the docs, but because knowing this can be done at compile time and constant folded (just like the `align_of` branch is removed), we can just panic here.
I've looked at the asm (using `cargo-asm`) of a function that both is correct and incorrect, and the panic is completely removed, or is unconditional, without needing build-std.
I don't expect this to cause much breakage in the wild. I scanned through https://miri.saethlin.dev/ub for issues that would look like this (error: Undefined Behavior: memory access failed: alloc1768 has size 1, so pointer to 8 bytes starting at offset 0 is out-of-bounds), but couldn't find any.
That doesn't rule out it happening in crates tested that fail earlier for some other reason, though, but it indicates that doing this is rare, if it happens at all. A crater run for this would need to be build and test, since this is a runtime thing.
Also added a few more transmute_copy tests.
Rearrange slice::split_mut to remove bounds check
Closes https://github.com/rust-lang/rust/issues/86313
Turns out that all we need to do here is reorder the bounds checks to convince LLVM that all the bounds checks can be removed. It seems like LLVM just fails to propagate the original length information past the first bounds check and into the second one. With this implementation it doesn't need to, each check can be proven inbounds based on the one immediately previous.
I've gradually convinced myself that this implementation is unambiguously better based on the above logic, but maybe this is still deserving of a codegen test?
Also the mentioned borrowck limitation no longer seems to exist.
Add a special case for align_offset /w stride != 1
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
This generalizes the previous `stride == 1` special case to apply to any
situation where the requested alignment is divisible by the stride. This
in turn allows the test case from #98809 produce ideal assembly, along
the lines of:
leaq 15(%rdi), %rax
andq $-16, %rax
This also produces pretty high quality code for situations where the
alignment of the input pointer isn’t known:
pub unsafe fn ptr_u32(slice: *const u32) -> *const u32 {
slice.offset(slice.align_offset(16) as isize)
}
// =>
movl %edi, %eax
andl $3, %eax
leaq 15(%rdi), %rcx
andq $-16, %rcx
subq %rdi, %rcx
shrq $2, %rcx
negq %rax
sbbq %rax, %rax
orq %rcx, %rax
leaq (%rdi,%rax,4), %rax
Here LLVM is smart enough to replace the `usize::MAX` special case with
a branch-less bitwise-OR approach, where the mask is constructed using
the neg and sbb instructions. This appears to work across various
architectures I’ve tried.
This change ends up introducing more branches and code in situations
where there is less knowledge of the arguments. For example when the
requested alignment is entirely unknown. This use-case was never really
a focus of this function, so I’m not particularly worried, especially
since llvm-mca is saying that the new code is still appreciably faster,
despite all the new branching.
Fixes#98809.
Sadly, this does not help with #72356.
Stabilize `core::ffi::CStr`, `alloc::ffi::CString`, and friends
Stabilize the `core_c_str` and `alloc_c_string` feature gates.
Change `std::ffi` to re-export these types rather than creating type
aliases, since they now have matching stability.
Stabilize the `core_c_str` and `alloc_c_string` feature gates.
Change `std::ffi` to re-export these types rather than creating type
aliases, since they now have matching stability.