(This is just the data structure changes and some boilerplate match
code that followed from it; the actual emission of these statements
comes in a follow-up commit.)
Add simd math intrinsics and gather/scatter
This PR adds simd math intrinsics for floating-point vectors (sqrt, sin, cos, pow, exp, log, fma, abs, etc.) and the generic simd gather/scatter intrinsics.
std: Ensure OOM is classified as `nounwind`
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
Add -Z no-parallel-llvm flag
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Rollup of 8 pull requests
Successful merges:
- #50531 (Cleanup uses of TypeIdHasher and replace them with StableHasher)
- #50819 (Fix potential divide by zero)
- #50827 (Update LLVM to 56c931901cfb85cd6f7ed44c7d7520a8de1edf97)
- #50829 (CheckLoopVisitor: also visit break expressions)
- #50854 (in which the unused shorthand field pattern debacle/saga continues)
- #50858 (Reorder description for snippets in rustdoc documentation)
- #50883 (Fix warning when building stage0 libcore)
- #50889 (Update clippy)
Failed merges:
Fix potential divide by zero
This should fix#50761
I had trouble reproducing with the provided code, but looking at the stack trace would indicate that this code is the likely cause. I made a number of assumptions here, because I don't have enough context on how the register size is set:
1. I assumed `rest.unit.size.bytes()` can be 0, and it's ok if it's set to 0 before this function is called
2. I assumed that if `rest.unit.size.bytes()` is 0, that we want `rest_count` to also be 0.
remove semicolon -_-
Add rem_bytes to conditional to avoid error when performing mod by 0
Add test file to confirm compilation passes.
Ensure we don't divide or mod by zero in llvm_type. Include test file from issue.
Emit noalias on &mut parameters by default
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
-Z no-mutable-noalias is left as an escape-hatch to debug problems
suspected to stem from this change.
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.