Disable probestack when GCOV profiling is being used
If I compile Firefox with gcov profiling enabled, Firefox crashes at startup because of probestack.
Since it's disabled for PGO, I think it makes sense to disable it for gcov too.
Currently -Z no-verify only controls IR verification prior to
LLVM codegen, while verification is performed unconditionally
both before and after linking with (Thin)LTO.
Generate br for all two target SwitchInts
Instead of only for booleans. This means that `if let` also becomes a br.
Apart from making the IR slightly simpler, this is supported by FastISel (#4353).
Fix building rustc on and for musl hosts.
This fixes all problems I had when trying to compile rustc on a musl-based distribution (with `crt-static = false` in `config.toml`).
This is a fixed version of what ended up being #50105, making it possible to compile rustc on musl targets.
The differences to the old (now merged and subsequently reverted) pull request are:
- The commit (6d9154a830) that caused the regression for which the original commits were reverted in #50709 is left out. This means the corresponding bug #36710 is still not fixed with `+crt-static`.
- The test for issue 36710 is skipped for musl targets (until the issue is properly fixed).
- Building cargo-vendor if `crt-static = false` is needed was broken (cargo-vendor links to some shared libraries if they exist on the system and this produces broken binaries with `+crt-static`)
CC @alexcrichton
(This is just the data structure changes and some boilerplate match
code that followed from it; the actual emission of these statements
comes in a follow-up commit.)
Add simd math intrinsics and gather/scatter
This PR adds simd math intrinsics for floating-point vectors (sqrt, sin, cos, pow, exp, log, fma, abs, etc.) and the generic simd gather/scatter intrinsics.
std: Ensure OOM is classified as `nounwind`
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
Add -Z no-parallel-llvm flag
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Rollup of 8 pull requests
Successful merges:
- #50531 (Cleanup uses of TypeIdHasher and replace them with StableHasher)
- #50819 (Fix potential divide by zero)
- #50827 (Update LLVM to 56c931901cfb85cd6f7ed44c7d7520a8de1edf97)
- #50829 (CheckLoopVisitor: also visit break expressions)
- #50854 (in which the unused shorthand field pattern debacle/saga continues)
- #50858 (Reorder description for snippets in rustdoc documentation)
- #50883 (Fix warning when building stage0 libcore)
- #50889 (Update clippy)
Failed merges:
Fix potential divide by zero
This should fix#50761
I had trouble reproducing with the provided code, but looking at the stack trace would indicate that this code is the likely cause. I made a number of assumptions here, because I don't have enough context on how the register size is set:
1. I assumed `rest.unit.size.bytes()` can be 0, and it's ok if it's set to 0 before this function is called
2. I assumed that if `rest.unit.size.bytes()` is 0, that we want `rest_count` to also be 0.
remove semicolon -_-
Add rem_bytes to conditional to avoid error when performing mod by 0
Add test file to confirm compilation passes.
Ensure we don't divide or mod by zero in llvm_type. Include test file from issue.
Emit noalias on &mut parameters by default
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
-Z no-mutable-noalias is left as an escape-hatch to debug problems
suspected to stem from this change.
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
Noalias annotations will not be emitted by default if either
-C panic=abort (as previously) or LLVM >= 6.0 (new).
-Z mutable-noalias=no is left as an escape-hatch to allow
debugging problems suspected to stem from this change.
Keep only the language item. This removes some indirection and makes
codegen worse for debug builds, but simplifies code significantly, which
is a good tradeoff to make, in my opinion.
Besides, the codegen can be improved even further with some constant
evaluation improvements that we expect to happen in the future.
This is necessary if we want to implement `[T]::align_to` and is more
useful in general.
This implementation effort has begun during the All Hands and represents
a month of my futile efforts to do any sort of maths. Luckily, I
found the very very nice Chris McDonald (cjm) on IRC who figured out the
core formulas for me! All the thanks for existence of this PR go to
them!
Anyway… Those formulas were mangled by yours truly into the arcane forms
you see here to squeeze out the best assembly possible on most of the
modern architectures (x86 and ARM were evaluated in practice). I mean,
just look at it: *one actual* modulo operation and everything else is
just the cheap single cycle ops! Admitedly, the naive solution might be
faster in some common scenarios, but this code absolutely butchers the
naive solution on the worst case scenario.
Alas, the result of this arcane magic also means that the code pretty
heavily relies on the preconditions holding true and breaking those
preconditions will unleash the UB-est of all UBs! So don’t.