Remove `ty_to_def_id`
fixes https://github.com/rust-lang/rust/issues/52341
The uses were mostly convenience and generally "too powerful" (would also have worked for types that weren't interesting at the use site)
r? @eddyb
Add the `amdgpu-kernel` ABI.
Technically, there are requirements imposed by the LLVM
`AMDGPUTargetMachine` on functions with this ABI (eg, the return type
must be void), but I'm unsure exactly where this should be enforced.
Technically, there are requirements imposed by the LLVM
`AMDGPUTargetMachine` on functions with this ABI (eg, the return type
must be void), but I'm unsure exactly where this should be enforced.
Preliminary work for incremental ThinLTO.
Since implementing incremental ThinLTO is a bit more involved than I initially thought, I'm splitting out some of the things that already work. This PR (1) adds a way accessing some ThinLTO information in `rustc` and (2) does some cleanup around CGU/object file naming (which makes things quite a bit nicer).
This is probably best reviewed one commit at a time.
nll experiment: compute SCCs instead of iterative region solving
This is an attempt to speed up region solving by replacing the current iterative dataflow with a SCC computation. The idea is to detect cycles (SCCs) amongst region constraints and then compute just one value per cycle. The graph with all cycles removed is of course a DAG, so we can then solve constraints "bottom up" once the liveness values are known.
I kinda ran out of time this morning so the last commit is a bit sloppy but I wanted to get this posted, let travis run on it, and maybe do a perf run, before I clean it up.
rustc_codegen_llvm: replace the first argument early in FnType::new_vtable.
Fixes#51907 by removing the vtable pointer before the `ArgType` is even created.
This allows any ABI to support trait object method calls, regardless of how it passes `*dyn Trait`.
r? @nikomatsakis
Disable LLVM verification by default
Currently -Z no-verify only controls IR verification prior to LLVM codegen, while verification is performed unconditionally both before and after linking with (Thin)LTO.
Also wondering what the sentiment is on disabling verification by default (and e.g. only enabling it on ALT builds with assertions). This does not seem terribly useful outside of rustc development and it does seem to show up in profiles (at something like 3%).
**EDIT:** A table showing the various configurations and what is enabled when.
| Configuration | Dynamic verification performed | LLVM static assertions compiled in |
| --- | --- | --- |
| alt builds | | yes |
| nightly builds | | no |
| stable builds | | no |
| CI builds | | |
| dev builds in a checkout | | |
Upgrade to LLVM's master branch (LLVM 7)
### Current status
~~Blocked on a [performance regression](https://github.com/rust-lang/rust/pull/51966#issuecomment-402320576). The performance regression has an [upstream LLVM issue](https://bugs.llvm.org/show_bug.cgi?id=38047) and has also [been bisected](https://reviews.llvm.org/D44282) to an LLVM revision.~~
Ready to merge!
---
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* ~~The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]~~
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
cc #50543
Correct some codegen stats counter inconsistencies
I noticed some possible typos/inconsistencies in the codegen counters. For example, `fsub` was getting counted as an integer `sub`, whereas `fadd` was counted as an add. And `addincoming` was only being counted on the initial call.
dbd10f8175/src/librustc_codegen_llvm/builder.rs (L831-L841)
Only remaining inconsistencies I can see are things like `fadd_fast` are counted as `fadd`. But the vector versions like `vector_reduce_fmax_fast` are counted as `vector.reduce.fmax_fast` not as their 'base' versions (`vector_reduce_fmax` is counted as `vector.reduce.fmax`).
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
Store scalar pair bools as i8 in memory
We represent `bool` as `i1` in a `ScalarPair`, unlike other aggregates,
to optimize IR for checked operators and the like. With this patch, we
still do so when the pair is an immediate value, but we use the `i8`
memory type when the value is loaded or stored as an LLVM aggregate.
So `(bool, bool)` looks like an `{ i1, i1 }` immediate, but `{ i8, i8 }`
in memory. When a pair is a direct function argument, `PassMode::Pair`,
it is still passed using the immediate `i1` type, but as a return value
it will use the `i8` memory type. Also, `bool`-like` enum tags will now
use scalar pairs when possible, where they were previously excluded due
to optimization issues.
Fixes#51516.
Closes#51566.
r? @eddyb
cc @nox
Mostly fix metadata_only backend and extract some code out of rustc_codegen_llvm
Removes dependency on the `ar` crate and removes the `llvm.enabled` config option in favour of setting `rust.codegen-backends` to `[]`.
When doing linker-plugin based LTO, write LLVM bitcode obj-files instead of embedding the bitcode into the regular object file.
This PR makes the compiler emit LLVM bitcode object files instead of regular object files with the IR embed when compiling for linker-plugin-based LTO. The reasoning for switching the strategy is this:
- Embedding bitcode in a section of the object file actually makes us save bitcode twice in rlibs and Rust dylibs, once for linker-based LTO and once for rustc-based LTO. That's a waste of space.
- When compiling for plugin-based LTO, one usually has no use for the machine code also present in the object file. Generating it is a waste of time.
- When compiling for plugin-based LTO, `rustc` will skip running ThinLTO because the linker will do that anyway. This has the side effect of then generating poorly optimized machine code, which makes it even less useful (and may lead to users not knowing why their code is slow instead of getting an error).
- Not having machine code available makes it impossible for the linker to silently fall back to not inlining stuff across language boundaries.
- This is what Clang does and according to [the documentation](https://llvm.org/docs/BitCodeFormat.html#native-object-file-wrapper-format) is the better supported option.
- The current behavior (minus the runtime performance problems) is still available via `-Z embed-bitcode` (we might want to do this for `libstd` at some point).
r? @alexcrichton
[cross-lang-lto] Allow the linker to choose the LTO-plugin (which is useful when using LLD)
This PR allows for not specifying an LTO-linker plugin but still let `rustc` invoke the linker with the correct plugin arguments. This is useful when using LLD which does not need the `-plugin` argument. Since LLD is the best linker for this scenario anyway, this change should improve ergonomics quite a bit.
r? @alexcrichton
We represent `bool` as `i1` in a `ScalarPair`, unlike other aggregates,
to optimize IR for checked operators and the like. With this patch, we
still do so when the pair is an immediate value, but we use the `i8`
memory type when the value is loaded or stored as an LLVM aggregate.
So `(bool, bool)` looks like an `{ i1, i1 }` immediate, but `{ i8, i8 }`
in memory. When a pair is a direct function argument, `PassMode::Pair`,
it is still passed using the immediate `i1` type, but as a return value
it will use the `i8` memory type. Also, `bool`-like` enum tags will now
use scalar pairs when possible, where they were previously excluded due
to optimization issues.
incr.comp.: Take names of children into account when computing the ICH of a module's HIR.
Fixes#40876. Red-green tracking does not make this a problem anymore. We should verify this via a perf-run though.
r? @nikomatsakis
Do not allow LLVM to increase a TLS's alignment on macOS.
This addresses the various TLS segfault on macOS 10.10.
Fix#51794.
Fix#51758.
Fix#50867.
Fix#48866.
Fix#46355.
Fix#44056.
Disable probestack when GCOV profiling is being used
If I compile Firefox with gcov profiling enabled, Firefox crashes at startup because of probestack.
Since it's disabled for PGO, I think it makes sense to disable it for gcov too.
Currently -Z no-verify only controls IR verification prior to
LLVM codegen, while verification is performed unconditionally
both before and after linking with (Thin)LTO.
Generate br for all two target SwitchInts
Instead of only for booleans. This means that `if let` also becomes a br.
Apart from making the IR slightly simpler, this is supported by FastISel (#4353).
Fix building rustc on and for musl hosts.
This fixes all problems I had when trying to compile rustc on a musl-based distribution (with `crt-static = false` in `config.toml`).
This is a fixed version of what ended up being #50105, making it possible to compile rustc on musl targets.
The differences to the old (now merged and subsequently reverted) pull request are:
- The commit (6d9154a830) that caused the regression for which the original commits were reverted in #50709 is left out. This means the corresponding bug #36710 is still not fixed with `+crt-static`.
- The test for issue 36710 is skipped for musl targets (until the issue is properly fixed).
- Building cargo-vendor if `crt-static = false` is needed was broken (cargo-vendor links to some shared libraries if they exist on the system and this produces broken binaries with `+crt-static`)
CC @alexcrichton
(This is just the data structure changes and some boilerplate match
code that followed from it; the actual emission of these statements
comes in a follow-up commit.)
Add simd math intrinsics and gather/scatter
This PR adds simd math intrinsics for floating-point vectors (sqrt, sin, cos, pow, exp, log, fma, abs, etc.) and the generic simd gather/scatter intrinsics.
std: Ensure OOM is classified as `nounwind`
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
OOM can't unwind today, and historically it's been optimized as if it can't
unwind. This accidentally regressed with recent changes to the OOM handler, so
this commit adds in a codegen test to assert that everything gets optimized away
after the OOM function is approrpiately classified as nounwind
Closes#50925
Add -Z no-parallel-llvm flag
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Codegen issues commonly only manifest under specific circumstances,
e.g. if multiple codegen units are used and ThinLTO is enabled.
However, these configuration are threaded, making the use of LLVM
debugging facilities hard, as output is interleaved.
This patch adds a -Z no-parallel-llvm flag, which allows disabling
parallelization of codegen and linking, while otherwise preserving
behavior with regard to codegen units and LTO.
Rollup of 8 pull requests
Successful merges:
- #50531 (Cleanup uses of TypeIdHasher and replace them with StableHasher)
- #50819 (Fix potential divide by zero)
- #50827 (Update LLVM to 56c931901cfb85cd6f7ed44c7d7520a8de1edf97)
- #50829 (CheckLoopVisitor: also visit break expressions)
- #50854 (in which the unused shorthand field pattern debacle/saga continues)
- #50858 (Reorder description for snippets in rustdoc documentation)
- #50883 (Fix warning when building stage0 libcore)
- #50889 (Update clippy)
Failed merges:
Fix potential divide by zero
This should fix#50761
I had trouble reproducing with the provided code, but looking at the stack trace would indicate that this code is the likely cause. I made a number of assumptions here, because I don't have enough context on how the register size is set:
1. I assumed `rest.unit.size.bytes()` can be 0, and it's ok if it's set to 0 before this function is called
2. I assumed that if `rest.unit.size.bytes()` is 0, that we want `rest_count` to also be 0.
remove semicolon -_-
Add rem_bytes to conditional to avoid error when performing mod by 0
Add test file to confirm compilation passes.
Ensure we don't divide or mod by zero in llvm_type. Include test file from issue.
Emit noalias on &mut parameters by default
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
-Z no-mutable-noalias is left as an escape-hatch to debug problems
suspected to stem from this change.
Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
Noalias annotations will not be emitted by default if either
-C panic=abort (as previously) or LLVM >= 6.0 (new).
-Z mutable-noalias=no is left as an escape-hatch to allow
debugging problems suspected to stem from this change.
Keep only the language item. This removes some indirection and makes
codegen worse for debug builds, but simplifies code significantly, which
is a good tradeoff to make, in my opinion.
Besides, the codegen can be improved even further with some constant
evaluation improvements that we expect to happen in the future.
This is necessary if we want to implement `[T]::align_to` and is more
useful in general.
This implementation effort has begun during the All Hands and represents
a month of my futile efforts to do any sort of maths. Luckily, I
found the very very nice Chris McDonald (cjm) on IRC who figured out the
core formulas for me! All the thanks for existence of this PR go to
them!
Anyway… Those formulas were mangled by yours truly into the arcane forms
you see here to squeeze out the best assembly possible on most of the
modern architectures (x86 and ARM were evaluated in practice). I mean,
just look at it: *one actual* modulo operation and everything else is
just the cheap single cycle ops! Admitedly, the naive solution might be
faster in some common scenarios, but this code absolutely butchers the
naive solution on the worst case scenario.
Alas, the result of this arcane magic also means that the code pretty
heavily relies on the preconditions holding true and breaking those
preconditions will unleash the UB-est of all UBs! So don’t.