This adds the `only-apple`/`ignore-apple` compiletest directive, and
uses that basically everywhere instead of `only-macos`/`ignore-macos`.
Some of the updates in `run-make` are a bit redundant, as they use
`ignore-cross-compile` and won't run on iOS - but using Apple in these
is still more correct, so I've made that change anyhow.
These types are currently rejected for `as` casts by the compiler.
Remove this incorrect check and add codegen tests for all conversions
involving these types.
https://github.com/llvm/llvm-project/pull/89799 changes llvm.dbg.value/declare intrinsics to be in a different, out-of-instruction-line representation. For example
call void @llvm.dbg.declare(...)
becomes
#dbg_declare(...)
Update tests accordingly to work with both the old and new way.
There are a few tests that depend on some target features **not** being
enabled by default, and usually they are correct with the default x86-64
target CPU. However, in downstream builds we have modified the default
to fit our distros -- `x86-64-v2` in RHEL 9 and `x86-64-v3` in RHEL 10
-- and the latter especially trips tests that expect not to have AVX.
These cases are few enough that we can just set them back explicitly.
codegen tests: Tolerate `range()` qualifications in enum tests
Current LLVM can infer range bounds on the i8s involved with these tests, and annotates it. Accept these bounds if present.
`@rustbot` label: +llvm-main
cc `@durin42`
Set writable and dead_on_unwind attributes for sret arguments
Set the `writable` and `dead_on_unwind` attributes for `sret` arguments. This allows call slot optimization to remove more memcpy's.
See https://llvm.org/docs/LangRef.html#parameter-attributes for the specification of these attributes. In short, the statement we're making here is that:
* The return slot is writable.
* The return slot will not be read if the function unwinds.
Fixes https://github.com/rust-lang/rust/issues/90595.
Stop using LLVM struct types for alloca
The alloca type has no semantic meaning, only the size (and alignment, but we specify it explicitly) matter. Using `[N x i8]` is a more direct way to specify that we want `N` bytes, and avoids relying on LLVM's struct layout. It is likely that a future LLVM version will change to an untyped alloca representation.
Split out from #121577.
r? `@ghost`
Dellvmize some intrinsics (use `u32` instead of `Self` in some integer intrinsics)
This implements https://github.com/rust-lang/compiler-team/issues/693 minus what was implemented in #123226.
Note: I decided to _not_ change `shl`/... builder methods, as it just doesn't seem worth it.
r? ``@scottmcm``
For things with easily pre-checked overflow conditions -- shifts and unsigned subtraction -- write then checked methods in such a way that we stop emitting wrapping versions of them.
For example, today <https://rust.godbolt.org/z/qM9YK8Txb> neither
```rust
a.checked_sub(b).unwrap()
```
nor
```rust
a.checked_sub(b).unwrap_unchecked()
```
actually optimizes to `sub nuw`. After this PR they do.
Add the missing inttoptr when we ptrtoint in ptr atomics
Ralf noticed this here: https://github.com/rust-lang/rust/pull/122220#discussion_r1535172094
Our previous codegen forgot to add the cast back to integer type. The code compiles anyway, because of course all locals are in-memory to start with, so previous codegen would do the integer atomic, store the integer to a local, then load a pointer from that local. Which is definitely _not_ what we wanted: That's an integer-to-pointer transmute, so all pointers returned by these `AtomicPtr` methods didn't have provenance. Yikes.
Here's the IR for `AtomicPtr::fetch_byte_add` on 1.76: https://godbolt.org/z/8qTEjeraY
```llvm
define noundef ptr `@atomicptr_fetch_byte_add(ptr` noundef nonnull align 8 %a, i64 noundef %v) unnamed_addr #0 !dbg !7 {
start:
%0 = alloca ptr, align 8, !dbg !12
%val = inttoptr i64 %v to ptr, !dbg !12
call void `@llvm.lifetime.start.p0(i64` 8, ptr %0), !dbg !28
%1 = ptrtoint ptr %val to i64, !dbg !28
%2 = atomicrmw add ptr %a, i64 %1 monotonic, align 8, !dbg !28
store i64 %2, ptr %0, align 8, !dbg !28
%self = load ptr, ptr %0, align 8, !dbg !28
call void `@llvm.lifetime.end.p0(i64` 8, ptr %0), !dbg !28
ret ptr %self, !dbg !33
}
```
r? `@RalfJung`
cc `@nikic`
llvm/llvm-project#87910 infers `nuw` and `nsw` on some `trunc`
instructions we're doing `FileCheck` on. Tolerate but don't require them
to support both release and head LLVM.
`f16` and `f128` step 4: basic library support
This is the next step after https://github.com/rust-lang/rust/pull/121926, another portion of https://github.com/rust-lang/rust/pull/114607
Tracking issue: https://github.com/rust-lang/rust/issues/116909
This PR adds the most basic operations to `f16` and `f128` that get lowered as LLVM intrinsics. This is a very small step but it seemed reasonable enough to add unopinionated basic operations before the larger modules that are built on top of them.
r? ```@Amanieu``` since you were pretty involved in the RFC
cc ```@compiler-errors```
```@rustbot``` label +T-libs-api +S-blocked +F-f16_and_f128
I added this back in 111999, but I no longer think it's a good idea
- It had to get scaled back to only power-of-two things to not break a bunch of targets
- LLVM seems to be getting better at memcpy removal anyway
- Introducing vector instructions has seemed to sometimes (115515) make autovectorization worse
So this removes it from the codegen crates entirely, and instead just tries to use <https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/builder/trait.BuilderMethods.html#method.typed_place_copy> instead of direct `memcpy` so things will still use load/store for immediates.
Re-enable the early otherwise branch optimization
Closes#95162. Fixes#119014.
This is the first part of #121397.
An invalid enum discriminant can come from anywhere. We have to check to see if all successors contain the discriminant statement. This should have a pass to hoist instructions.
r? cjgillot
Don't emit divide-by-zero panic paths in `StepBy::len`
I happened to notice today that there's actually two such calls emitted in the assembly: <https://rust.godbolt.org/z/1Wbbd3Ts6>
Since they're impossible, hopefully telling LLVM that will also help optimizations elsewhere.
Fix argument ABI for overaligned structs on ppc64le
When passing a 16 (or higher) aligned struct by value on ppc64le, it needs to be passed as an array of `i128` rather than an array of `i64`. This will force the use of an even starting doubleword.
For the case of a 16 byte struct with alignment 16 it is important that `[1 x i128]` is used instead of `i128` -- apparently, the latter will get treated similarly to `[2 x i64]`, not exhibiting the correct ABI. Add a `force_array` flag to `Uniform` to support this.
The relevant clang code can be found here:
fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L878-L884)fe2119a7b0/clang/lib/CodeGen/Targets/PPC.cpp (L780-L784)
I think the corresponding psABI wording is this:
> Fixed size aggregates and unions passed by value are mapped to as
> many doublewords of the parameter save area as the value uses in
> memory. Aggregrates and unions are aligned according to their
> alignment requirements. This may result in doublewords being
> skipped for alignment.
In particular the last sentence. Though I didn't find any wording for Clang's behavior of clamping the alignment to 16.
Fixes https://github.com/rust-lang/rust/issues/122767.
r? `@cuviper`