Enable ThinLTO with incremental compilation.
This is an updated version of #52309. This PR allows `rustc` to use (local) ThinLTO and incremental compilation at the same time. In theory this should allow for getting compile-time improvements for small changes while keeping the runtime performance of the generated code roughly the same as when compiling non-incrementally.
The difference to #52309 is that this version also caches the pre-LTO version of LLVM bitcode. This allows for another layer of caching:
1. if the module itself has changed, we have to re-codegen and re-optimize.
2. if the module itself has not changed, but a module it imported from during ThinLTO has, we don't need to re-codegen and don't need to re-run the first optimization phase. Only the second (i.e. ThinLTO-) optimization phase is re-run.
3. if neither the module itself nor any of its imports have changed then we can re-use the final, post-ThinLTO version of the module. (We might have to load its pre-ThinLTO version though so it's available for other modules to import from)
This commit updates the LLVM submodule to the current trunk of LLVM itself. This
brings a few notable improvements for the wasm target:
* Support for wasm atomic instructions is greatly improved
* Renamed memory wasm intrinsics are fully supported
* LLD has fixed a quadratic execution bug with large numbers of relocations in
wasm files.
The compiler-rt submodule has been updated in tandem as well.
This fixes a regression from #53031 where specifying `-C target-cpu=native` is
printing a lot of warnings from LLVM about `native` being an unknown CPU. It
turns out that `native` is indeed an unknown CPU and we have to perform a
mapping to an actual CPU name, but this mapping is only performed in one
location rather than all locations we inform LLVM about the target CPU.
This commit centralizes the mapping of `native` to LLVM's value of the native
CPU, ensuring that all locations we inform LLVM about the `target-cpu` it's
never `native`.
Closes#53322
In some profiling on OSX I saw the `write` syscall as quite high up on
the profiling graph, which is definitely not good! It looks like we're
setting the output stream of an object file as directly to a file
descriptor which means that we run the risk of doing lots of little
writes rather than a few large writes.
This commit fixes this issue by adding a buffered stream on the output,
causing the `write` syscall to disappear from the profiles on OSX.
rustc: Handle linker diagnostics from LLVM
Previously linker diagnostic were being hidden when two modules were linked
together but failed to link. This commit fixes the situation by ensuring that we
have a diagnostic handler installed and also adds support for handling linker
diagnostics.
Fix -Wpessimizing-move warnings in rustllvm/PassWrapper
These are producing warnings when building rustc (`warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]`).
Previously linker diagnostic were being hidden when two modules were linked
together but failed to link. This commit fixes the situation by ensuring that we
have a diagnostic handler installed and also adds support for handling linker
diagnostics.
rustc: Work around an upstream wasm ThinLTO bug
This commit implements a workaround for an [upstream LLVM bug][1] where custom
sections were accidentally duplicated amongst codegen units when ThinLTO passes
were performed. This is due to the fact that custom sections for wasm are stored
as metadata nodes which are automatically imported into modules when ThinLTO
happens. The fix here is to forcibly delete the metadata node from imported
modules before LLVM has a chance to try to copy it over.
[1]: https://bugs.llvm.org/show_bug.cgi?id=38184
This commit implements a workaround for an [upstream LLVM bug][1] where custom
sections were accidentally duplicated amongst codegen units when ThinLTO passes
were performed. This is due to the fact that custom sections for wasm are stored
as metadata nodes which are automatically imported into modules when ThinLTO
happens. The fix here is to forcibly delete the metadata node from imported
modules before LLVM has a chance to try to copy it over.
[1]: https://bugs.llvm.org/show_bug.cgi?id=38184
This commit removes a hack in our ThinLTO passes which removes available
externally functions manually. The [upstream bug][1] has long since been fixed,
so we should be able to rely on LLVM natively for this now!
[1]: https://bugs.llvm.org/show_bug.cgi?id=35736
Preliminary work for incremental ThinLTO.
Since implementing incremental ThinLTO is a bit more involved than I initially thought, I'm splitting out some of the things that already work. This PR (1) adds a way accessing some ThinLTO information in `rustc` and (2) does some cleanup around CGU/object file naming (which makes things quite a bit nicer).
This is probably best reviewed one commit at a time.
Upgrade to LLVM's master branch (LLVM 7)
### Current status
~~Blocked on a [performance regression](https://github.com/rust-lang/rust/pull/51966#issuecomment-402320576). The performance regression has an [upstream LLVM issue](https://bugs.llvm.org/show_bug.cgi?id=38047) and has also [been bisected](https://reviews.llvm.org/D44282) to an LLVM revision.~~
Ready to merge!
---
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* ~~The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]~~
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
cc #50543
This commit upgrades the main LLVM submodule to LLVM's current master branch.
The LLD submodule is updated in tandem as well as compiler-builtins.
Along the way support was also added for LLVM 7's new features. This primarily
includes the support for custom section concatenation natively in LLD so we now
add wasm custom sections in LLVM IR rather than having custom support in rustc
itself for doing so.
Some other miscellaneous changes are:
* We now pass `--gc-sections` to `wasm-ld`
* The optimization level is now passed to `wasm-ld`
* A `--stack-first` option is passed to LLD to have stack overflow always cause
a trap instead of corrupting static data
* The wasm target for LLVM switched to `wasm32-unknown-unknown`.
* The syntax for aligned pointers has changed in LLVM IR and tests are updated
to reflect this.
* The `thumbv6m-none-eabi` target is disabled due to an [LLVM bug][llbug]
Nowadays we've been mostly only upgrading whenever there's a major release of
LLVM but enough changes have been happening on the wasm target that there's been
growing motivation for quite some time now to upgrade out version of LLD. To
upgrade LLD, however, we need to upgrade LLVM to avoid needing to build yet
another version of LLVM on the builders.
The revision of LLVM in use here is arbitrarily chosen. We will likely need to
continue to update it over time if and when we discover bugs. Once LLVM 7 is
fully released we can switch to that channel as well.
[llbug]: https://bugs.llvm.org/show_bug.cgi?id=37382
The crash that happened in #23566 doesn't happen anymore with the LLVM mergefunc
pass enabled and it hugely reduces code size (for example it shaves off 10% of the
final Servo executable). This patch reenables it.
The LLVM PassManager has a PrepareForThinLTO flag, which is intended
when compilation occurs in conjunction with linking by ThinLTO. The
flag has two effects:
* The NameAnonGlobal pass is run after all other passes, which
ensures that all globals have a name.
* In optimized builds, a number of late passes (mainly related to
vectorization and unrolling) are disabled, on the rationale that
these a) will increase codesize of the intermediate artifacts
and b) will be run by ThinLTO again anyway.
This patch enables the use of PrepareForThinLTO if Thin or ThinLocal
linking is used.
The background for this change is the CI failure in #49479, which
we assume to be caused by the NameAnonGlobal pass not being run.
As this changes which passes LLVM runs, this might have performance
(or other) impact, so we want to land this separately.
In `LLVMRustHasFeature()`, rather than using `MCInfo->getFeatureTable()`
that is specific to Rust's LLVM fork, we can use this in LLVM 6:
/// Check whether the subtarget features are enabled/disabled as per
/// the provided string, ignoring all other features.
bool checkFeatures(StringRef FS) const;
Now rustc using external LLVM can also have `target_feature`.
implement minmax intrinsics
This adds the `simd_{fmin,fmax}` intrinsics, which do a vertical (lane-wise) `min`/`max` for floating point vectors that's equivalent to Rust's `min`/`max` for `f32`/`f64`.
It might make sense to make `{f32,f64}::{min,max}` use the `minnum` and `minmax` intrinsics as well.
---
~~HELP: I need some help with these. Either I should go to sleep or there must be something that I must be missing. AFAICT I am calling the `maxnum` builder correctly, yet rustc/LLVM seem to insert a call to `llvm.minnum` there instead...~~ EDIT: Rust's LLVM version is too old :/
Add basic PGO support.
This PR adds two mutually exclusive options for profile usage and generation using LLVM's instruction profile generation (the same as clang uses), `-C pgo-use` and `-C pgo-gen`.
See each commit for details.