Add -Z no-parallel-llvm flag
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Rollup of 8 pull requests
Successful merges:
- #50531 (Cleanup uses of TypeIdHasher and replace them with StableHasher)
- #50819 (Fix potential divide by zero)
- #50827 (Update LLVM to 56c931901cfb85cd6f7ed44c7d7520a8de1edf97)
- #50829 (CheckLoopVisitor: also visit break expressions)
- #50854 (in which the unused shorthand field pattern debacle/saga continues)
- #50858 (Reorder description for snippets in rustdoc documentation)
- #50883 (Fix warning when building stage0 libcore)
- #50889 (Update clippy)
Failed merges:
Fix potential divide by zero
This should fix#50761
I had trouble reproducing with the provided code, but looking at the stack trace would indicate that this code is the likely cause. I made a number of assumptions here, because I don't have enough context on how the register size is set:
1. I assumed `rest.unit.size.bytes()` can be 0, and it's ok if it's set to 0 before this function is called
2. I assumed that if `rest.unit.size.bytes()` is 0, that we want `rest_count` to also be 0.
remove semicolon -_-
Add rem_bytes to conditional to avoid error when performing mod by 0
Add test file to confirm compilation passes.
Ensure we don't divide or mod by zero in llvm_type. Include test file from issue.
Emit noalias on &mut parameters by default
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
-Z no-mutable-noalias is left as an escape-hatch to debug problems
suspected to stem from this change.
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
Noalias annotations will not be emitted by default if either
-C panic=abort (as previously) or LLVM >= 6.0 (new).
-Z mutable-noalias=no is left as an escape-hatch to allow
debugging problems suspected to stem from this change.
Keep only the language item. This removes some indirection and makes
codegen worse for debug builds, but simplifies code significantly, which
is a good tradeoff to make, in my opinion.
Besides, the codegen can be improved even further with some constant
evaluation improvements that we expect to happen in the future.
This is necessary if we want to implement `[T]::align_to` and is more
useful in general.
This implementation effort has begun during the All Hands and represents
a month of my futile efforts to do any sort of maths. Luckily, I
found the very very nice Chris McDonald (cjm) on IRC who figured out the
core formulas for me! All the thanks for existence of this PR go to
them!
Anyway… Those formulas were mangled by yours truly into the arcane forms
you see here to squeeze out the best assembly possible on most of the
modern architectures (x86 and ARM were evaluated in practice). I mean,
just look at it: *one actual* modulo operation and everything else is
just the cheap single cycle ops! Admitedly, the naive solution might be
faster in some common scenarios, but this code absolutely butchers the
naive solution on the worst case scenario.
Alas, the result of this arcane magic also means that the code pretty
heavily relies on the preconditions holding true and breaking those
preconditions will unleash the UB-est of all UBs! So don’t.