LLVM's PassManagerBuilder has a PrepareForThinLTO flag, which is intended
for use when compilation will later be linked with ThinLTO. The
flag has two effects:
* The NameAnonGlobal pass is run after all other passes, which
ensures that all globals have a name.
* In optimized builds, a number of late passes (mainly related to
vectorization and unrolling) are disabled, on the rationale that
these a) will increase codesize of the intermediate artifacts
and b) will be run by ThinLTO again anyway.
This patch enables the use of PrepareForThinLTO if Thin or ThinLocal
linking is used.
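For illustration, a minimal sketch of how the flag can be plumbed through when Thin or ThinLocal linking is requested (the helper name and plumbing here are illustrative, not the actual rustc code):

```cpp
#include "llvm/Transforms/IPO/PassManagerBuilder.h"

// Hypothetical helper: set the flag before populating the pass pipelines.
void configurePassBuilder(llvm::PassManagerBuilder &PMB, bool PrepareForThinLTO) {
  // When true, LLVM appends NameAnonGlobal and skips some late
  // vectorization/unrolling passes, deferring them to the ThinLTO backend.
  PMB.PrepareForThinLTO = PrepareForThinLTO;
}
```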
The background for this change is the CI failure in #49479, which
we assume to be caused by the NameAnonGlobal pass not being run.
As this changes which passes LLVM runs, this might have performance
(or other) impact, so we want to land this separately.
In `LLVMRustHasFeature()`, rather than using `MCInfo->getFeatureTable()`,
which is specific to Rust's LLVM fork, we can use the following, available in LLVM 6:
/// Check whether the subtarget features are enabled/disabled as per
/// the provided string, ignoring all other features.
bool checkFeatures(StringRef FS) const;
Now rustc built against an external LLVM can also support `target_feature`.
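A minimal sketch of what the wrapper can look like with `checkFeatures()` (the conversion helper and exact shape are assumptions; the real rustllvm code may differ):

```cpp
#include "llvm-c/TargetMachine.h"
#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/Support/CBindingWrapping.h"
#include "llvm/Target/TargetMachine.h"
#include <string>

using namespace llvm;

// unwrap() for LLVMTargetMachineRef is not exported by the C headers, so the
// wrapper defines its own conversion helpers.
DEFINE_SIMPLE_CONVERSION_FUNCTIONS(TargetMachine, LLVMTargetMachineRef)

extern "C" bool LLVMRustHasFeature(LLVMTargetMachineRef TM, const char *Feature) {
  const MCSubtargetInfo *MCInfo = unwrap(TM)->getMCSubtargetInfo();
  // checkFeatures() takes "+<feature>" syntax and reports whether the
  // subtarget has that feature enabled, ignoring all other features.
  return MCInfo->checkFeatures(std::string("+") + Feature);
}
```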
LLVM has since removed the `CodeModel::Default` enum value in favor of an
`Optional` implementation throughout LLVM. Let's mirror the same change in Rust
and update the various bindings we call accordingly.
Removed in llvm-mirror/llvm@9aafb854c
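A minimal sketch of what the binding-side mirror can look like (enum name and variants here are illustrative, not the actual rustc bindings):

```cpp
#include "llvm/ADT/Optional.h"
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/ErrorHandling.h"

// Hypothetical Rust-facing enum: a None variant stands in for the removed
// CodeModel::Default, so "no explicit model" becomes an empty Optional that is
// passed along to createTargetMachine().
enum class LLVMRustCodeModel { None, Small, Kernel, Medium, Large };

static llvm::Optional<llvm::CodeModel::Model> fromRust(LLVMRustCodeModel Model) {
  switch (Model) {
  case LLVMRustCodeModel::Small:  return llvm::CodeModel::Small;
  case LLVMRustCodeModel::Kernel: return llvm::CodeModel::Kernel;
  case LLVMRustCodeModel::Medium: return llvm::CodeModel::Medium;
  case LLVMRustCodeModel::Large:  return llvm::CodeModel::Large;
  case LLVMRustCodeModel::None:   return llvm::None; // let LLVM pick per target
  }
  llvm_unreachable("unknown code model");
}
```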
This commit is the next attempt to enable multiple codegen units by default in
release mode, getting some of those sweet, sweet parallelism wins by running
codegen in parallel. Performance should not be lost due to ThinLTO being on by
default as well.
Closes #45320
This commit implements a workaround for #46346 which basically just
avoids triggering the situation in which LLVM's bug
https://bugs.llvm.org/show_bug.cgi?id=35562 arises. More details can be
found in the code itself but this commit is also intended to ...
Closes #46346
In #46382 the logic around linkage preservation with ThinLTO was tweaked, but the
loop that registered all otherwise exported GUID values as "don't internalize
me please" was erroneously too conservative, only asking "external" linkage
items to not be internalized. Instead we actually want the inversion of that
condition: everything *without* "local" linkage should be kept from being internalized.
This commit updates the condition there, adds a test, and...
Closes #46543
Assume at least LLVM 3.9 in rustllvm and rustc_llvm
We bumped the minimum LLVM to 3.9 in #45326. This just cleans up the conditional code in the `rustllvm` C++ wrappers to assume that minimum, and similarly cleans up the `rustc_llvm` build script.
Previously we were too eagerly exporting almost all symbols used in ThinLTO,
which can cause a whole host of problems downstream! This commit fixes
the error by aligning more closely with `lib/LTO/LTO.cpp` in LLVM's codebase,
which only changes the linkage of summaries that are computed as dead.
Closes #46374
This makes it more robust when assertions are disabled,
crashing instead of causing UB.
Also introduces a tidy check to enforce this rule,
which in turn necessitated making tidy run on src/rustllvm.
Fixes #44020
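A minimal sketch of the pattern being enforced (illustrative helper, not the actual rustllvm code): an unconditional failure is used so the check survives builds where `assert()` is compiled out under `NDEBUG`.

```cpp
#include "llvm/Support/ErrorHandling.h"

// Illustrative check: report_fatal_error() aborts even when assert() is
// compiled out, so a broken invariant crashes loudly instead of letting
// execution continue into undefined behavior.
static void checkNonNull(const void *Ptr) {
  if (Ptr == nullptr)
    llvm::report_fatal_error("unexpected null pointer");
}
```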
This commit adds a new target to the compiler: wasm32-unknown-unknown. This
target is a reimagining of what it looks like to generate WebAssembly code from
Rust. Instead of using Emscripten, which can bring with it a weighty runtime, this
target uses only the LLVM backend for WebAssembly and a
"custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than
the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker
is needed; rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this
target.
* Most of the standard library is stubbed out to return an error, but anything
related to allocation works (e.g. `HashMap`, `Vec`, etc.).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new
target.
This target is currently somewhat janky due to how linking works. The "linking"
is currently unconditional whole program LTO (aka LLVM is being used as a
linker). Naturally that means compiling programs is pretty slow! Eventually
though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can
act as a catalyst for further experimentation in Rust with WebAssembly. Breaking
changes are very likely to land to this target, so it's not recommended to rely
on it in any critical capacity yet. We'll let you know when it's "production
ready".
---
Currently testing-wise this target is looking pretty good but isn't complete.
I've got almost the entire `run-pass` test suite working with this target (lots
of tests ignored, but many passing as well). The `core` test suite is still
getting LLVM bugs fixed to get that working and will take some time. Relatively
simple programs all seem to work though!
---
It's worth noting that you may not immediately see the "smallest possible wasm
module" for the input you feed to rustc. For various reasons it's very difficult
to get rid of the final "bloat" in vanilla rustc (again, a real linker should
fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various
integration points if you've got better ideas of how to approach them!
Enable LLVM's TrapUnreachable flag, which tells it to translate
`unreachable` instructions into hardware trap instructions, rather
than allowing control flow to "fall through" into whatever code
happens to follow it in memory.
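For reference, the flag lives on LLVM's `TargetOptions`; a minimal sketch of setting it (the real binding threads this through target machine creation, so the helper here is illustrative):

```cpp
#include "llvm/Target/TargetOptions.h"

// Illustrative: with TrapUnreachable set, `unreachable` lowers to a hardware
// trap (e.g. ud2 on x86) rather than falling through into adjacent code.
llvm::TargetOptions makeTargetOptions() {
  llvm::TargetOptions Options;
  Options.TrapUnreachable = true;
  return Options;
}
```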
First the `addPreservedGUID` function forgot to take care of "alias" summaries.
I'm not 100% sure what this is but the current code now matches upstream. Next
the `computeDeadSymbols` return value wasn't actually being used, but it needed
to be used! Together these should...
Closes #45195
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In another mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where not all the options we use today are necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
Commit c4710203c0 in #43492 made `LLVMRustHasFeature` "more robust"
by using `getFeatureTable()`. However, this function is specific to
Rust's own LLVM fork, not upstream LLVM-4.0, so we need to use
`#if LLVM_RUSTLLVM` to guard this call.
The function should accept feature strings that old LLVM might not
support.
Simplify the code using the same approach used by
LLVMRustPrintTargetFeatures.
Dummify the function for non-4.0 LLVM and update the tests accordingly.
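A rough sketch of the guarded shape described above, assuming the same `unwrap()` conversion helper as in the earlier sketch (the feature-table scan is paraphrased, since `getFeatureTable()` is fork-specific):

```cpp
extern "C" bool LLVMRustHasFeature(LLVMTargetMachineRef TM, const char *Feature) {
#if LLVM_RUSTLLVM
  // Rust's LLVM fork exposes MCSubtargetInfo::getFeatureTable(); scan it and
  // compare the requested feature's bits against the enabled feature bits.
  const MCSubtargetInfo *MCInfo = unwrap(TM)->getMCSubtargetInfo();
  const FeatureBitset &Bits = MCInfo->getFeatureBits();
  for (auto &Entry : MCInfo->getFeatureTable())
    if (StringRef(Entry.Key) == Feature)
      return (Bits & Entry.Value) == Entry.Value;
  return false;
#else
  // Stock LLVM 4.0 has no such table; behave as if the feature is unknown.
  return false;
#endif
}
```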
Add RWPI/ROPI relocation model support
This PR adds support for using LLVM 4's ROPI and RWPI relocation models for ARM.
ROPI (Read-Only Position Independence) and RWPI (Read-Write Position Independence) are two new relocation models in LLVM for the ARM backend ([LLVM changeset](https://reviews.llvm.org/rL278015)). The motivation is that these are the specific strategies we use in userspace [Tock](https://www.tockos.org) apps, so supporting this is an important step (perhaps the final step, but can't confirm yet) in enabling userspace Rust processes.
## Explanation
ROPI makes all code and immutable data accesses PC-relative, but not assumed to be overridden at runtime (so, for example, jumps are always relative).
RWPI uses a base register (`r9`) that stores the address of the GOT in memory, so the runtime (e.g. a kernel) only adjusts r9 to tell running code where the GOT is.
## Complications adding support in Rust
While this landed in LLVM master back in August, the header files in `llvm-c` have not been updated yet to reflect it. Rust replicates that header file's version of the `LLVMRelocMode` enum as the Rust enum `llvm::RelocMode` and uses an implicit cast in the ffi to translate from Rust's notion of the relocation model to the LLVM library's notion.
My current workaround is to replace the `LLVMRelocMode` argument used when creating the `LLVMTargetMachineRef` with an int and to use the hardcoded int representation of the `RelocMode` enum. This is A Bad Idea(tm), but I think very nearly the right thing.
Would a better alternative be to patch rust-llvm to support these enum variants (also a fairly trivial change)?
Replaces the llvm-c exposed LLVMRelocMode, which does not include all
relocation model variants, with a LLVMRustRelocMode modeled after
LLVMRustCodeModel.
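A minimal sketch of the shape of such an enum and its translation to LLVM's `Reloc::Model` (names and variant list are illustrative, not the exact rustllvm definitions):

```cpp
#include "llvm/ADT/Optional.h"
#include "llvm/Support/CodeGen.h"
#include "llvm/Support/ErrorHandling.h"

// Illustrative Rust-facing enum covering the models llvm-c's LLVMRelocMode
// lacks (the ROPI/RWPI variants); an empty Optional means "target default".
enum class LLVMRustRelocMode { Default, Static, PIC, DynamicNoPic, ROPI, RWPI, ROPIRWPI };

static llvm::Optional<llvm::Reloc::Model> fromRust(LLVMRustRelocMode Model) {
  switch (Model) {
  case LLVMRustRelocMode::Default:      return llvm::None;
  case LLVMRustRelocMode::Static:       return llvm::Reloc::Static;
  case LLVMRustRelocMode::PIC:          return llvm::Reloc::PIC_;
  case LLVMRustRelocMode::DynamicNoPic: return llvm::Reloc::DynamicNoPIC;
  case LLVMRustRelocMode::ROPI:         return llvm::Reloc::ROPI;
  case LLVMRustRelocMode::RWPI:         return llvm::Reloc::RWPI;
  case LLVMRustRelocMode::ROPIRWPI:     return llvm::Reloc::ROPI_RWPI;
  }
  llvm_unreachable("unknown reloc mode");
}
```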
Improve naming style in rustllvm.
As per the LLVM style guide, use CamelCase for all locals and classes,
and camelCase for all non-FFI functions.
Also, make names of variables of commonly used types more consistent.
Fixes #38688.
r? @rkruppe
StringRefs have a length and their contents are not usually null-terminated.
The solution is to either copy the string data (in rustc_llvm::diagnostic) or take the size into account (in LLVMRustPrintPasses).
I couldn't trigger a bug caused by this (apparently all the strings returned in practice are actually null-terminated) but this is more correct and more future-proof.
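A minimal sketch of the two options (illustrative helpers, not the exact code at those call sites):

```cpp
#include "llvm/ADT/StringRef.h"
#include <cstdio>
#include <string>

// Option 1: copy the data into an owned, null-terminated std::string before
// handing it to anything that expects a C string.
std::string toOwned(llvm::StringRef S) {
  return S.str(); // copies exactly S.size() bytes and null-terminates
}

// Option 2: keep the StringRef but always pass the length along explicitly.
void printWithLength(llvm::StringRef S) {
  std::printf("%.*s", static_cast<int>(S.size()), S.data());
}
```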