This commit removes a hack in our ThinLTO passes which manually removed
available-externally functions. The [upstream bug][1] has long since been fixed,
so we should be able to rely on LLVM natively for this now!
[1]: https://bugs.llvm.org/show_bug.cgi?id=35736
Add the `amdgpu-kernel` ABI.
Technically, there are requirements imposed by the LLVM
`AMDGPUTargetMachine` on functions with this ABI (e.g., the return type
must be void), but I'm unsure exactly where this should be enforced.
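For illustration, a function using the new ABI might be declared roughly as follows (the feature-gate name and the signature are assumptions for the sketch, not taken from this change):

```rust
// Illustrative sketch only; the feature gate and signature are assumptions.
#![feature(abi_amdgpu_kernel)]

// A GPU kernel entry point. Note the unit (void) return type, which
// `AMDGPUTargetMachine` expects for this calling convention.
pub extern "amdgpu-kernel" fn scale(input: *const f32, output: *mut f32, factor: f32) {
    unsafe { *output = *input * factor; }
}
```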
Preliminary work for incremental ThinLTO.
Since implementing incremental ThinLTO is a bit more involved than I initially thought, I'm splitting out some of the things that already work. This PR (1) adds a way of accessing some ThinLTO information in `rustc` and (2) does some cleanup around CGU/object file naming (which makes things quite a bit nicer).
This is probably best reviewed one commit at a time.
Note that this isn't useful yet. More changes will be necessary before we can
actually codegen for this target machine. As such, it is not enabled by default.
This patch is on its own for the benefit of the reviewers.
The LLVM PassManager has a PrepareForThinLTO flag, which is intended to be
used when compilation will later be followed by ThinLTO linking. The
flag has two effects:
* The NameAnonGlobal pass is run after all other passes, which
ensures that all globals have a name.
* In optimized builds, a number of late passes (mainly related to
vectorization and unrolling) are disabled, on the rationale that
these a) will increase the code size of the intermediate artifacts
and b) will be run again by ThinLTO anyway.
This patch enables the use of PrepareForThinLTO if Thin or ThinLocal
linking is used.
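As a rough sketch of that rule (the `Lto` enum and helper below are illustrative; the real code sets the flag on LLVM's PassManagerBuilder through FFI):

```rust
// Illustrative model of "enable PrepareForThinLTO for Thin or ThinLocal".
#[derive(Clone, Copy)]
enum Lto {
    No,
    Fat,
    Thin,
    ThinLocal,
}

fn prepare_for_thin_lto(lto: Lto) -> bool {
    match lto {
        Lto::Thin | Lto::ThinLocal => true,
        Lto::No | Lto::Fat => false,
    }
}
```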
The background for this change is the CI failure in #49479, which
we assume to be caused by the NameAnonGlobal pass not being run.
As this changes which passes LLVM runs, this might have performance
(or other) impact, so we want to land this separately.
implement minmax intrinsics
This adds the `simd_{fmin,fmax}` intrinsics, which do a vertical (lane-wise) `min`/`max` for floating point vectors that's equivalent to Rust's `min`/`max` for `f32`/`f64`.
It might make sense to make `{f32,f64}::{min,max}` use the `minnum` and `maxnum` intrinsics as well.
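For illustration, the lane-wise semantics are roughly equivalent to the scalar loop below (`[f32; 4]` stands in for a real SIMD vector type):

```rust
// Scalar model of `simd_fmin` on 4-lane f32 vectors: each lane behaves like
// `f32::min` applied to that lane's pair of elements.
fn simd_fmin_model(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    let mut out = [0.0f32; 4];
    for i in 0..4 {
        out[i] = a[i].min(b[i]);
    }
    out
}
```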
---
~~HELP: I need some help with these. Either I should go to sleep or there must be something that I must be missing. AFAICT I am calling the `maxnum` builder correctly, yet rustc/LLVM seem to insert a call to `llvm.minnum` there instead...~~ EDIT: Rust's LLVM version is too old :/
Add basic PGO support.
This PR adds two mutually exclusive options for profile usage and generation using LLVM's instrumentation-based profile generation (the same as Clang uses), `-C pgo-use` and `-C pgo-gen`.
See each commit for details.
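For a rough idea of the intended workflow (the exact flag arguments are assumptions here, modeled on Clang's analogous instrumentation-based PGO flow rather than taken from this PR):

```sh
# Assumed shape of the workflow; see the individual commits for the real details.
rustc -O -C pgo-gen=profile-dir main.rs   # 1. build an instrumented binary
./main                                    # 2. run representative workloads
rustc -O -C pgo-use=profile-dir main.rs   # 3. rebuild using the collected profile
```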
Adds `intrinsics::exact_div` to take advantage of the unsafe precondition that the division is exact (no remainder), which reduces the implementation from
```asm
sub rcx, rdx
mov rax, rcx
sar rax, 63
shr rax, 62
lea rax, [rax + rcx]
sar rax, 2
ret
```
down to
```asm
sub rcx, rdx
sar rcx, 2
mov rax, rcx
ret
```
(for `*const i32`)
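The difference comes from that precondition: a sketch of the pattern this enables for pointer-difference code (not the actual library code; `exact_div` is an unstable intrinsic and the call below is an assumption about its shape):

```rust
#![feature(core_intrinsics)]
use std::intrinsics::exact_div;
use std::mem::size_of;

// The caller promises the byte distance is an exact multiple of the element
// size, so LLVM can emit a plain arithmetic shift instead of the usual
// signed-division rounding fixup shown in the first listing above.
unsafe fn elem_offset(a: *const i32, b: *const i32) -> isize {
    let byte_diff = (a as isize) - (b as isize);
    exact_div(byte_diff, size_of::<i32>() as isize)
}
```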
This commit updates our Fat LTO logic to tweak our custom wrapper around LLVM's
"link modules" functionality. Previously whenever the
`LLVMRustLinkInExternalBitcode` function was called it would call LLVM's
`Linker::linkModules` wrapper. Internally this would crate an instance of a
`Linker` which internally creates an instance of an `IRMover`. Unfortunately for
us the creation of `IRMover` is somewhat O(n) with the input module. This means
that every time we linked a module it was O(n) with respect to the entire module
we had built up!
Now the modules we build up during LTO are quite large, so this quickly started
creating an O(n^2) problem for us! Discovered in #48025, it turns out this has
always been a problem and we just haven't noticed it. It became particularly
worse recently though due to most libraries having 16x more object files than
they previously did (1 -> 16).
This commit fixes this performance issue by preserving the `Linker` instance
across all links into the main LLVM module. This means we only create one
`IRMover` and allows LTO to progress much speedier.
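In pseudo-Rust, the shape of the change looks roughly like this (the types and methods are illustrative, not the actual FFI wrappers):

```rust
// `Linker::new` models the O(n) IRMover setup over the destination module;
// constructing it once and reusing it across all incoming modules turns the
// previous O(n^2) loop into roughly O(n).
struct Linker;

impl Linker {
    fn new(_dest: &mut Vec<u8>) -> Linker {
        // expensive: proportional to the (growing) destination module
        Linker
    }

    fn link_in(&mut self, _bitcode: &[u8]) {
        // cheap relative to re-creating the IRMover every time
    }
}

fn link_all(dest: &mut Vec<u8>, modules: &[Vec<u8>]) {
    let mut linker = Linker::new(dest); // created once, not once per module
    for m in modules {
        linker.link_in(m);
    }
}
```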
From the `cargo-cache` project in #48025, a **full build** locally went from
5m15s to 2m24s. Looking at the timing logs, each object file was linked in within
single-digit milliseconds rather than hundreds, clearly a nice improvement!
Closes #48025
This commit introduces a separately compiled backend for Emscripten, avoiding
compiling the `JSBackend` target in the main LLVM codegen backend. This builds
on the foundation provided by #47671 to create a new codegen backend dedicated
solely to Emscripten, removing the `JSBackend` of the main codegen backend in
the process.
This commit adds a new field to each target which specifies the backend to use
for translation; the default is `llvm`, the main backend that we use. The
Emscripten targets specify the `emscripten` backend instead of the main `llvm`
one.
There's a whole bunch of consequences of this change, but I'll try to enumerate
them here:
* A *second* LLVM submodule was added in this commit. The main LLVM submodule
will soon start to drift from the Emscripten submodule, but currently they're
both at the same revision.
* Logic was added to rustbuild to *not* build the Emscripten backend by default.
This is gated behind a `--enable-emscripten` flag to the configure script. By
default users should neither check out the emscripten submodule nor compile
it.
* The `init_repo.sh` script was updated to fetch the Emscripten submodule from
GitHub the same way we do the main LLVM submodule (a tarball fetch).
* The Emscripten backend, turned off by default, is still turned on for a number
of targets on CI. We'll only be shipping an Emscripten backend with Tier 1
platforms, though. All cross-compiled platforms will not be receiving an
Emscripten backend yet.
This commit means that when you download the `rustc` package in Rustup for Tier
1 platforms you'll be receiving two trans backends, one for Emscripten and one
that's the general LLVM backend. If you never compile for Emscripten you'll
never use the Emscripten backend, so we may update this one day to only download
the Emscripten backend when you add the Emscripten target. For now though it's
just an extra 10MB gzip'd.
Closes #46819
Building on the work of #45684, this commit updates the compiler to
unconditionally load the `rustc_trans` crate at runtime instead of linking to it
at compile time. The end goal of this work is to implement #46819, where rustc
will have multiple backends available to it to load.
This commit starts off by removing the `extern crate rustc_trans` from the
driver. This involved moving some miscellaneous functionality into the
`TransCrate` trait and also required an implementation of how to locate and load
the trans backend. This ended up being a little tricky because the sysroot isn't
always the right location (for example, with `--sysroot` arguments), so some extra code
was added as well to probe a directory relative to the current dll (the
rustc_driver dll).
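A rough sketch of the probing idea (the paths, names, and fallback order are illustrative assumptions; the real code also has to handle the actual dylib filenames):

```rust
use std::env;
use std::path::{Path, PathBuf};

// Candidate directories in which to look for the trans backend dylib,
// preferring a location relative to the running binary over the sysroot.
fn backend_search_dirs(sysroot: &Path) -> Vec<PathBuf> {
    let mut dirs = Vec::new();
    if let Ok(exe) = env::current_exe() {
        if let Some(dir) = exe.parent() {
            // Hypothetical layout: a "codegen-backends" directory next to
            // wherever rustc_driver actually lives.
            dirs.push(dir.join("codegen-backends"));
        }
    }
    dirs.push(sysroot.join("lib"));
    dirs
}
```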
Rustbuild has been updated as well to use a separate compilation
invocation for the `rustc_trans` crate and to assemble it into the
sysroot accordingly. Finally, the distribution logic for the `rustc` package was also
updated to slurp up the trans backends folder.
A number of assorted fallout changes were included here as well to ensure tests
pass and such, and they should all be commented inline.
First round of LLVM 6.0.0 compatibility
This includes a number of commits for the first round of upgrading to LLVM 6. There are still [lingering bugs](https://github.com/rust-lang/rust/issues/47683) but I believe all of this will nonetheless be necessary!
Teach rustc about DW_AT_noreturn and a few more DIFlags
We achieve two small things with this PR:
1. We provide definitions for a few additional LLVM debuginfo flags
1. We _use_ one of these new flags, `FlagNoReturn`, and add it to debuginfo for functions whose return type is the never type (`!`).
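For example, a function like the one below has `!` as its return type, so its debuginfo now carries `FlagNoReturn` (mapping to `DW_AT_noreturn`):

```rust
// A diverging function: its return type is `!`, so the new FlagNoReturn is
// attached to its debuginfo.
fn fail(msg: &str) -> ! {
    eprintln!("fatal: {}", msg);
    std::process::exit(1)
}
```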
It looks like LLVM also removed it in llvm-mirror/llvm@f45adc29d in favor of the
name "GNU64". This was added with the thought that we'd need such a variant when
adding mips64 support, but we ended up not needing it! For now let's just
remove the various support on the Rust side of things.
LLVM has since removed the `CodeModel::Default` enum value in favor of an
`Optional` implementation throughout LLVM. Let's mirror the same change in Rust
and update the various bindings we call accordingly.
Removed in llvm-mirror/llvm@9aafb854c
LLVM <= 4.0 used a non-standard interpretation of `DW_OP_plus`. In the
DWARF standard, this pops the top two entries of the expression stack and
pushes their sum. LLVM's behavior was more like DWARF's `DW_OP_plus_uconst` -- adding a constant
that follows the op. The patch series starting with [D33892] switched
to the standard DWARF interpretation, so we need to follow.
[D33892]: https://reviews.llvm.org/D33892
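Concretely, the two operators behave like this on the DWARF expression stack (a toy model for illustration, not LLVM code):

```rust
// Standard DWARF `DW_OP_plus`: pop two entries, push their sum.
fn dw_op_plus(stack: &mut Vec<u64>) {
    let b = stack.pop().expect("DW_OP_plus needs two stack entries");
    let a = stack.pop().expect("DW_OP_plus needs two stack entries");
    stack.push(a.wrapping_add(b));
}

// `DW_OP_plus_uconst N`: pop one entry and add the constant operand that
// follows the opcode -- the behavior old LLVM incorrectly gave to DW_OP_plus.
fn dw_op_plus_uconst(stack: &mut Vec<u64>, n: u64) {
    let a = stack.pop().expect("DW_OP_plus_uconst needs one stack entry");
    stack.push(a.wrapping_add(n));
}
```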
Use name-discarding LLVM context
This is only applicable when neither --emit=llvm-ir nor --emit=llvm-bc is
requested.
In case either of these outputs is wanted, but the benefits of such a context
are desired as well, the -Zfewer_names option provides the same functionality
regardless of the outputs requested.
Should be a viable fix for https://github.com/rust-lang/rust/issues/46449
Re-do the FreeBSD cross-builds to use Clang and libc++. Fixes #44433
Reviving #45077, from @jld:
> The main goal here is to use FreeBSD's normal libc++, instead of
> statically linking the libstdc++ packaged with GCC, because that
> libstdc++ has bugs that cause rustc to deadlock inside LLVM.
>
> But the easiest way to use libc++ is to switch the build from GCC to
> Clang, and the Clang package in the Ubuntu image already knows how to
> cross-compile (given a sysroot and preferably cross-binutils), so the
> toolchain script now uses that instead of building a custom compiler.
>
> This also de-duplicates the build-toolchain.sh script.
#45077 was close but didn't quite make it. I rebased @jld's work off the current `master` and started with that.
I was able to determine that this Travis error (https://github.com/rust-lang/rust/pull/45077#issuecomment-336029862) was ultimately caused by `src/librustc_llvm/build.rs` attempting to follow a wrong value in `LLVM_STATIC_STDCPP` (https://github.com/rust-lang/rust/pull/45077#issuecomment-352639456).
I looked at the downstream port for FreeBSD (https://svnweb.freebsd.org/ports/head/lang/rust/) and it seems like they do not use `--enable-llvm-static-stdcpp`.
Since `libc++` is included in the FreeBSD 10+ base system, we don't need to statically link it either?
So in b989428f7dec7b52d68bed6a21e9b5b0a8086267 I have set the FreeBSD build to not actually use `LLVM_STATIC_STDCPP`.
I was able to run `./src/ci/docker/run.sh` with both `dist-i686-freebsd` and `dist-x86_64-freebsd` successfully and in about 1 minute of testing it seemed like the dist-x86_64-freebsd results worked on a FreeBSD 11 system.
It should fix #44433, which seems to be affecting many potential users. Also, FreeBSD users should be able to `./x.py build`, which should help anyone who wants to upstream fixes for FreeBSD.
Questions:
Does this approach seem to be the right way to go? Do we actually really want to statically link `libc++`? (I tried that here, but it ultimately ran into a roadblock on x86_64: https://github.com/rust-lang/rust/pull/45077#issuecomment-353293414)
Can we rewrite the comment here to be more clear about why some systems aren't going to actually use this option:
b989428f7d/src/bootstrap/compile.rs (L550-L553)
How does this affect users of older FreeBSD systems? It seemed like no one was complaining about using a 10.3 base version in the thread for #45077. FreeBSD seems to only officially support 10.3, 10.4, and 11.x right now, do we have to consider older users? The `libc++` stuff came in for FreeBSD 10, older FreeBSD used `libstdc++`.
Looks like @alexcrichton was leading the discussion on the previous issue:
r? @alexcrichton
Let me know what I can do to help get this through.
This commit is the next attempt to enable multiple codegen units by default in
release mode, getting some of those sweet, sweet parallelism wins by running
codegen in parallel. Performance should not be lost due to ThinLTO being on by
default as well.
Closes #45320