This commit updates our Fat LTO logic to tweak our custom wrapper around LLVM's
"link modules" functionality. Previously whenever the
`LLVMRustLinkInExternalBitcode` function was called it would call LLVM's
`Linker::linkModules` wrapper. Internally this would crate an instance of a
`Linker` which internally creates an instance of an `IRMover`. Unfortunately for
us the creation of `IRMover` is somewhat O(n) with the input module. This means
that every time we linked a module it was O(n) with respect to the entire module
we had built up!
Now the modules we build up during LTO are quite large, so this quickly started
creating an O(n^2) problem for us! Discovered in #48025 it turns out this has
always been a problem and we just haven't noticed it. It became particularly
worse recently though due to most libraries having 16x more object files than
they previously did (1 -> 16).
This commit fixes this performance issue by preserving the `Linker` instance
across all links into the main LLVM module. This means we only create one
`IRMover` and allows LTO to progress much speedier.
From the `cargo-cache` project in #48025 a **full build** locally when from
5m15s to 2m24s. Looking at the timing logs each object file was linked in in
single-digit millisecond rather than hundreds, clearly being a nice improvement!
Closes#48025
This commit introduces a separately compiled backend for Emscripten, avoiding
compiling the `JSBackend` target in the main LLVM codegen backend. This builds
on the foundation provided by #47671 to create a new codegen backend dedicated
solely to Emscripten, removing the `JSBackend` of the main codegen backend in
the process.
A new field was added to each target for this commit which specifies the backend
to use for translation, the default being `llvm` which is the main backend that
we use. The Emscripten targets specify an `emscripten` backend instead of the
main `llvm` one.
There's a whole bunch of consequences of this change, but I'll try to enumerate
them here:
* A *second* LLVM submodule was added in this commit. The main LLVM submodule
will soon start to drift from the Emscripten submodule, but currently they're
both at the same revision.
* Logic was added to rustbuild to *not* build the Emscripten backend by default.
This is gated behind a `--enable-emscripten` flag to the configure script. By
default users should neither check out the emscripten submodule nor compile
it.
* The `init_repo.sh` script was updated to fetch the Emscripten submodule from
GitHub the same way we do the main LLVM submodule (a tarball fetch).
* The Emscripten backend, turned off by default, is still turned on for a number
of targets on CI. We'll only be shipping an Emscripten backend with Tier 1
platforms, though. All cross-compiled platforms will not be receiving an
Emscripten backend yet.
This commit means that when you download the `rustc` package in Rustup for Tier
1 platforms you'll be receiving two trans backends, one for Emscripten and one
that's the general LLVM backend. If you never compile for Emscripten you'll
never use the Emscripten backend, so we may update this one day to only download
the Emscripten backend when you add the Emscripten target. For now though it's
just an extra 10MB gzip'd.
Closes#46819
Building on the work of # 45684 this commit updates the compiler to
unconditionally load the `rustc_trans` crate at runtime instead of linking to it
at compile time. The end goal of this work is to implement # 46819 where rustc
will have multiple backends available to it to load.
This commit starts off by removing the `extern crate rustc_trans` from the
driver. This involved moving some miscellaneous functionality into the
`TransCrate` trait and also required an implementation of how to locate and load
the trans backend. This ended up being a little tricky because the sysroot isn't
always the right location (for example `--sysroot` arguments) so some extra code
was added as well to probe a directory relative to the current dll (the
rustc_driver dll).
Rustbuild has been updated accordingly as well to have a separate compilation
invocation for the `rustc_trans` crate and assembly it accordingly into the
sysroot. Finally, the distribution logic for the `rustc` package was also
updated to slurp up the trans backends folder.
A number of assorted fallout changes were included here as well to ensure tests
pass and such, and they should all be commented inline.
First round of LLVM 6.0.0 compatibility
This includes a number of commits for the first round of upgrading to LLVM 6. There are still [lingering bugs](https://github.com/rust-lang/rust/issues/47683) but I believe all of this will nonetheless be necessary!
Teach rustc about DW_AT_noreturn and a few more DIFlags
We achieve two small things with this PR:
1. We provide definitions for a few additional llvm debuginfo flags
1. We _use_ one of these new flags, `FlagNoReturn`, and add it to debuginfo for functions with the never return type (`!`).
It looks like LLVM also removed it in llvm-mirror/llvm@f45adc29d in favor of the
name "GNU64". This was added in the thought that we'd need such a variant when
adding mips64 support but we ended up not needing it! For now let's just
removing the various support on the Rust side of things.
LLVM has since removed the `CodeModel::Default` enum value in favor of an
`Optional` implementationg throughout LLVM. Let's mirror the same change in Rust
and update the various bindings we call accordingly.
Removed in llvm-mirror/llvm@9aafb854c
LLVM <= 4.0 used a non-standard interpretation of `DW_OP_plus`. In the
DWARF standard, this adds two items on the expressions stack. LLVM's
behavior was more like DWARF's `DW_OP_plus_uconst` -- adding a constant
that follows the op. The patch series starting with [D33892] switched
to the standard DWARF interpretation, so we need to follow.
[D33892]: https://reviews.llvm.org/D33892
Use name-discarding LLVM context
This is only applicable when neither of --emit=llvm-ir or --emit=llvm-bc are not
requested.
In case either of these outputs are wanted, but the benefits of such context are
desired as well, -Zfewer_names option provides the same functionality regardless
of the outputs requested.
Should be a viable fix for https://github.com/rust-lang/rust/issues/46449
This is only applicable when neither of --emit=llvm-ir or --emit=llvm-bc are not
requested.
In case either of these outputs are wanted, but the benefits of such context are
desired as well, -Zfewer_names option provides the same functionality regardless
of the outputs requested.
Re-do the FreeBSD cross-builds to use Clang and libc++. Fixes#44433
Reviving #45077, from @jld:
> The main goal here is to use FreeBSD's normal libc++, instead of
> statically linking the libstdc++ packaged with GCC, because that
> libstdc++ has bugs that cause rustc to deadlock inside LLVM.
>
> But the easiest way to use libc++ is to switch the build from GCC to
> Clang, and the Clang package in the Ubuntu image already knows how to
> cross-compile (given a sysroot and preferably cross-binutils), so the
> toolchain script now uses that instead of building a custom compiler.
>
> This also de-duplicates the build-toolchain.sh script.
#45077 was close but didn't quite make it. I rebased @jld's work off the current `master` and started with that.
I was able to determine that this Travis error (https://github.com/rust-lang/rust/pull/45077#issuecomment-336029862) was ultimately caused by `src/librustc_llvm/build.rs` attempting to follow a wrong value in `LLVM_STATIC_STDCPP` (https://github.com/rust-lang/rust/pull/45077#issuecomment-352639456).
I looked at the downstream port for FreeBSD (https://svnweb.freebsd.org/ports/head/lang/rust/) and it seems like they do not use `--enable-llvm-static-stdcpp`.
Since `libc++` is included in the FreeBSD 10+ base system, we don't need to statically link it either?
So in b989428f7d I have set the FreeBSD build to not actually use `LLVM_STATIC_STDCPP`.
I was able to run `./src/ci/docker/run.sh` with both `dist-i686-freebsd` and `dist-x86_64-freebsd` successfully and in about 1 minute of testing it seemed like the dist-x86_64-freebsd results worked on a FreeBSD 11 system.
It should fix#44433, which seems to be affecting many potential users. Also FreeBSD users should be able to `./x.py build` which should help anyone who wants to upstream fixes for FreeBSD.
Questions:
Does this approach seem to be the right way to go? Do we actually really want to statically link `libc++`? (I tried that here, but it ultimately ran into a roadblock on x86_64: https://github.com/rust-lang/rust/pull/45077#issuecomment-353293414)
Can we rewrite the comment here to be more clear about why some systems aren't going to actually use this option:
b989428f7d/src/bootstrap/compile.rs (L550-L553)
How does this affect users of older FreeBSD systems? It seemed like no one was complaining about using a 10.3 base version in the thread for #45077. FreeBSD seems to only officially support 10.3, 10.4, and 11.x right now, do we have to consider older users? The `libc++` stuff came in for FreeBSD 10, older FreeBSD used `libstdc++`.
Looks like @alexcrichton was leading the discussion on the previous issue:
r? @alexcrichton
Let me know what I can do to help get this through.
This commit is the next attempt to enable multiple codegen units by default in
release mode, getting some of those sweet, sweet parallelism wins by running
codegen in parallel. Performance should not be lost due to ThinLTO being on by
default as well.
Closes#45320
The main goal here is to use FreeBSD's normal libc++, instead of
statically linking the libstdc++ packaged with GCC, because that
libstdc++ has bugs that cause rustc to deadlock inside LLVM.
But the easiest way to use libc++ is to switch the build from GCC to
Clang, and the Clang package in the Ubuntu image already knows how to
cross-compile (given a sysroot and preferably cross-binutils), so the
toolchain script now uses that instead of building a custom compiler.
This also de-duplicates the `build-toolchain.sh` script.
This commit implements a workaround for #46346 which basically just
avoids triggering the situation that LLVM's bug
https://bugs.llvm.org/show_bug.cgi?id=35562 arises. More details can be
found in the code itself but this commit is also intended to ...
Closes#46346
Assume at least LLVM 3.9 in rustllvm and rustc_llvm
We bumped the minimum LLVM to 3.9 in #45326. This just cleans up the conditional code in the `rustllvm` C++ wrappers to assume that minimum, and similarly cleans up the `rustc_llvm` build script.
This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:
* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen
in rust-lang-nursery/stdsimd#114. This was relatively straightforward and is
quite similar to the volatile store intrinsic.
* Next, and much more intrusively, a new type to the backend was added. The
`x86_mmx` type is used in LLVM for a 64-bit vector register and is used in
various intrinsics like `_mm_abs_pi8` as seen in rust-lang-nursery/stdsimd#74.
This new type was added as a new layout option as well as having support added
to the trans backend. The type is enabled with the `#[repr(x86_mmx)]`
attribute which is intended to just be an implementation detail of SIMD in
Rust.
I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes
or thoughts on that would be greatly appreciated!
std: Add a new wasm32-unknown-unknown target
This commit adds a new target to the compiler: wasm32-unknown-unknown. This target is a reimagining of what it looks like to generate WebAssembly code from Rust. Instead of using Emscripten which can bring with it a weighty runtime this instead is a target which uses only the LLVM backend for WebAssembly and a "custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker is needed, rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this target.
* Most of the standard library is stubbed out to return an error, but anything related to allocation works (aka `HashMap`, `Vec`, etc).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new target.
This target is currently somewhat janky due to how linking works. The "linking" is currently unconditional whole program LTO (aka LLVM is being used as a linker). Naturally that means compiling programs is pretty slow! Eventually though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can act as a catalyst for further experimentation in Rust with WebAssembly. Breaking changes are very likely to land to this target, so it's not recommended to rely on it in any critical capacity yet. We'll let you know when it's "production ready".
### Building yourself
First you'll need to configure the build of LLVM and enable this target
```
$ ./configure --target=wasm32-unknown-unknown --set llvm.experimental-targets=WebAssembly
```
Next you'll want to remove any previously compiled LLVM as it needs to be rebuilt with WebAssembly support. You can do that with:
```
$ rm -rf build
```
And then you're good to go! A `./x.py build` should give you a rustc with the appropriate libstd target.
### Test support
Currently testing-wise this target is looking pretty good but isn't complete. I've got almost the entire `run-pass` test suite working with this target (lots of tests ignored, but many passing as well). The `core` test suite is [still getting LLVM bugs fixed](https://reviews.llvm.org/D39866) to get that working and will take some time. Relatively simple programs all seem to work though!
In general I've only tested this with a local fork that makes use of LLVM 5 rather than our current LLVM 4 on master. The LLVM 4 WebAssembly backend AFAIK isn't broken per se but is likely missing bug fixes available on LLVM 5. I'm hoping though that we can decouple the LLVM 5 upgrade and adding this wasm target!
### But the modules generated are huge!
It's worth nothing that you may not immediately see the "smallest possible wasm module" for the input you feed to rustc. For various reasons it's very difficult to get rid of the final "bloat" in vanilla rustc (again, a real linker should fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various integration points if you've got better ideas of how to approach them!
This commit adds a new target to the compiler: wasm32-unknown-unknown. This
target is a reimagining of what it looks like to generate WebAssembly code from
Rust. Instead of using Emscripten which can bring with it a weighty runtime this
instead is a target which uses only the LLVM backend for WebAssembly and a
"custom linker" for now which will hopefully one day be direct calls to lld.
Notable features of this target include:
* There is zero runtime footprint. The target assumes nothing exists other than
the wasm32 instruction set.
* There is zero toolchain footprint beyond adding the target. No custom linker
is needed, rustc contains everything.
* Very small wasm modules can be generated directly from Rust code using this
target.
* Most of the standard library is stubbed out to return an error, but anything
related to allocation works (aka `HashMap`, `Vec`, etc).
* Naturally, any `#[no_std]` crate should be 100% compatible with this new
target.
This target is currently somewhat janky due to how linking works. The "linking"
is currently unconditional whole program LTO (aka LLVM is being used as a
linker). Naturally that means compiling programs is pretty slow! Eventually
though this target should have a linker.
This target is also intended to be quite experimental. I'm hoping that this can
act as a catalyst for further experimentation in Rust with WebAssembly. Breaking
changes are very likely to land to this target, so it's not recommended to rely
on it in any critical capacity yet. We'll let you know when it's "production
ready".
---
Currently testing-wise this target is looking pretty good but isn't complete.
I've got almost the entire `run-pass` test suite working with this target (lots
of tests ignored, but many passing as well). The `core` test suite is still
getting LLVM bugs fixed to get that working and will take some time. Relatively
simple programs all seem to work though!
---
It's worth nothing that you may not immediately see the "smallest possible wasm
module" for the input you feed to rustc. For various reasons it's very difficult
to get rid of the final "bloat" in vanilla rustc (again, a real linker should
fix all this). For now what you'll have to do is:
cargo install --git https://github.com/alexcrichton/wasm-gc
wasm-gc foo.wasm bar.wasm
And then `bar.wasm` should be the smallest we can get it!
---
In any case for now I'd love feedback on this, particularly on the various
integration points if you've got better ideas of how to approach them!
Remove support for the PNaCl target (le32-unknown-nacl)
This removes support for the `le32-unknown-nacl` target which is currently supported by rustc on tier 3. Despite the "nacl" in the name, the target doesn't output native code (x86, ARM, MIPS), instead it outputs binaries in the PNaCl format.
There are two reasons for the removal:
* Google [has announced](https://blog.chromium.org/2017/05/goodbye-pnacl-hello-webassembly.html) deprecation of the PNaCl format. The suggestion is to migrate to wasm. Happens we already have a wasm backend!
* Our PNaCl LLVM backend is provided by the fastcomp patch set that the LLVM fork used by rustc contains in addition to vanilla LLVM (`src/llvm/lib/Target/JSBackend/NaCl`). Upstream LLVM doesn't have PNaCl support. Removing PNaCl support will enable us to move away from fastcomp (#44006) and have a lighter set of patches on top of upstream LLVM inside our LLVM fork. This will help distribution packagers of Rust.
Fixes#42420
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little