Remove support for the PNaCl target (le32-unknown-nacl)
This removes support for the `le32-unknown-nacl` target which is currently supported by rustc on tier 3. Despite the "nacl" in the name, the target doesn't output native code (x86, ARM, MIPS), instead it outputs binaries in the PNaCl format.
There are two reasons for the removal:
* Google [has announced](https://blog.chromium.org/2017/05/goodbye-pnacl-hello-webassembly.html) deprecation of the PNaCl format. The suggestion is to migrate to wasm. Happens we already have a wasm backend!
* Our PNaCl LLVM backend is provided by the fastcomp patch set that the LLVM fork used by rustc contains in addition to vanilla LLVM (`src/llvm/lib/Target/JSBackend/NaCl`). Upstream LLVM doesn't have PNaCl support. Removing PNaCl support will enable us to move away from fastcomp (#44006) and have a lighter set of patches on top of upstream LLVM inside our LLVM fork. This will help distribution packagers of Rust.
Fixes#42420
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
This commit is a refactoring of the LTO backend in Rust to support compilations
with multiple codegen units. The immediate result of this PR is to remove the
artificial error emitted by rustc about `-C lto -C codegen-units-8`, but longer
term this is intended to lay the groundwork for LTO with incremental compilation
and ultimately be the underpinning of ThinLTO support.
The problem here that needed solving is that when rustc is producing multiple
codegen units in one compilation LTO needs to merge them all together.
Previously only upstream dependencies were merged and it was inherently relied
on that there was only one local codegen unit. Supporting this involved
refactoring the optimization backend architecture for rustc, namely splitting
the `optimize_and_codegen` function into `optimize` and `codegen`. After an LLVM
module has been optimized it may be blocked and queued up for LTO, and only
after LTO are modules code generated.
Non-LTO compilations should look the same as they do today backend-wise, we'll
spin up a thread for each codegen unit and optimize/codegen in that thread. LTO
compilations will, however, send the LLVM module back to the coordinator thread
once optimizations have finished. When all LLVM modules have finished optimizing
the coordinator will invoke the LTO backend, producing a further list of LLVM
modules. Currently this is always a list of one LLVM module. The coordinator
then spawns further work to run LTO and code generation passes over each module.
In the course of this refactoring a number of other pieces were refactored:
* Management of the bytecode encoding in rlibs was centralized into one module
instead of being scattered across LTO and linking.
* Some internal refactorings on the link stage of the compiler was done to work
directly from `CompiledModule` structures instead of lists of paths.
* The trans time-graph output was tweaked a little to include a name on each
bar and inflate the size of the bars a little
APFloat: Rewrite It In Rust and use it for deterministic floating-point CTFE.
As part of the CTFE initiative, we're forced to find a solution for floating-point operations.
By design, IEEE-754 does not explicitly define everything in a deterministic manner, and there is some variability between platforms, at the very least (e.g. NaN payloads).
If types are to evaluate constant expressions involving type (or in the future, const) generics, that evaluation needs to be *fully deterministic*, even across `rustc` host platforms.
That is, if `[T; T::X]` was used in a cross-compiled library, and the evaluation of `T::X` executed a floating-point operation, that operation has to be reproducible on *any other host*, only knowing `T` and the definition of the `X` associated const (as either AST or HIR).
Failure to uphold those rules allows an associated type (e.g. `<Foo as Iterator>::Item`) to be seen as two (or more) different types, depending on the current host, and such type safety violations typically allow writing of a `transmute` in safe code, given enough generics.
The options considered by @rust-lang/compiler were:
1. Ban floating-point operations in generic const-evaluation contexts
2. Emulate floating-point operations in an uniformly deterministic fashion
The former option may seem appealing at first, but floating-point operations *are allowed today*, so they can't be banned wholesale, a distinction has to be made between the code that already works, and future generic contexts. *Moreover*, every computation that succeeded *has to be cached*, otherwise the generic case can be reproduced without any generics. IMO there are too many ways it can go wrong, and a single violation can be enough for an unsoundness hole.
Not to mention we may end up really wanting floating-point operations *anyway*, in CTFE.
I went with the latter option, and seeing how LLVM *already* has a library for this exact purpose (as it needs to perform optimizations independently of host floating-point capabilities), i.e. `APFloat`, that was what I ended up basing this PR on.
But having been burned by the low reusability of bindings that link to LLVM, and because I would *rather* the floating-point operations to be wrong than not deterministic or not memory-safe (`APFloat` does far more pointer juggling than I'm comfortable with), I decided to RIIR.
This way, we have a guarantee of *no* `unsafe` code, a bit more control over the where native floating-point might accidentally be involved, and non-LLVM backends can share it.
I've also ported all the testcases over, *before* any functionality, to catch any mistakes.
Currently the PR replaces all CTFE operations to go through `apfloat::ieee::{Single,Double}`, keeping only the bits of the `f32` / `f64` memory representation in between operations.
Converting from a string also double-checks that `core::num` and `apfloat` agree on the interpretation of a floating-point number literal, in case either of them has any bugs left around.
r? @nikomatsakis
f? @nagisa @est31
<hr/>
Huge thanks to @edef1c for first demoing usable `APFloat` bindings and to @chandlerc for fielding my questions on IRC about `APFloat` peculiarities (also upstreaming some bugfixes).
Thread through the original error when opening archives
This updates the management of opening archives to thread through the original
piece of error information from LLVM over to the end consumer, trans.
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.
[RFC 1974]: https://github.com/rust-lang/rfcs/pull/197
The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.
cc #27389
When writing LLVM IR output demangled fn name in comments
`--emit=llvm-ir` looks like this now:
```
; <alloc::vec::Vec<T> as core::ops::index::IndexMut<core::ops::range::RangeFull>>::index_mut
; Function Attrs: inlinehint uwtable
define internal { i8*, i64 } @"_ZN106_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..index..IndexMut$LT$core..ops..range..RangeFull$GT$$GT$9index_mut17h7f7b576609f30262E"(%"alloc::vec::Vec<u8>"* dereferenceable(24)) unnamed_addr #0 {
start:
...
```
cc https://github.com/integer32llc/rust-playground/issues/15
Enable wasm LLVM backend
Enables compilation to WebAssembly with the LLVM backend using the target triple "wasm32-unknown-unknown". This is the beginning of my work on #38804.
**edit:** The new new target is now wasm32-experimental-emscripten instead of wasm32-unknown-unknown.
The new target is wasm32-experimental-emscripten. Adds a new
configuration option to opt in to building experimental LLVM backends
such as the WebAssembly backend. The target name was chosen to be
similar to the existing wasm32-unknown-emscripten target so that the
build and tests would work with minimal other code changes. When/if the
new target replaces the old target, simply renaming it should just work.
Build instruction profiler runtime as part of compiler-rt
r? @alexcrichton
This is #38608 with some fixes.
Still missing:
- [x] testing with profiler enabled on some builders (on which ones? Should I add the option to some of the already existing configurations, or create a new configuration?);
- [x] enabling distribution (on which builders?);
- [x] documentation.
This support is needed for bindgen to work well on 32-bit Windows, and
also enables people to begin experimenting with C++ FFI support on that
platform.
Fixes#42044.
Add RWPI/ROPI relocation model support
This PR adds support for using LLVM 4's ROPI and RWPI relocation models for ARM.
ROPI (Read-Only Position Independence) and RWPI (Read-Write Position Independence) are two new relocation models in LLVM for the ARM backend ([LLVM changset](https://reviews.llvm.org/rL278015)). The motivation is that these are the specific strategies we use in userspace [Tock](https://www.tockos.org) apps, so supporting this is an important step (perhaps the final step, but can't confirm yet) in enabling userspace Rust processes.
## Explanation
ROPI makes all code and immutable accesses PC relative, but not assumed to be overriden at runtime (so for example, jumps are always relative).
RWPI uses a base register (`r9`) that stores the addresses of the GOT in memory so the runtime (e.g. a kernel) only adjusts r9 tell running code where the GOT is.
## Complications adding support in Rust
While this landed in LLVM master back in August, the header files in `llvm-c` have not been updated yet to reflect it. Rust replicates that header file's version of the `LLVMRelocMode` enum as the Rust enum `llvm::RelocMode` and uses an implicit cast in the ffi to translate from Rust's notion of the relocation model to the LLVM library's notion.
My workaround for this currently is to replace the `LLVMRelocMode` argument to `LLVMTargetMachineRef` with an int and using the hardcoded int representation of the `RelocMode` enum. This is A Bad Idea(tm), but I think very nearly the right thing.
Would a better alternative be to patch rust-llvm to support these enum variants (also a fairly trivial change)?
When -Z profile is passed, the GCDAProfiling LLVM pass is added
to the pipeline, which uses debug information to instrument the IR.
After compiling with -Z profile, the $(OUT_DIR)/$(CRATE_NAME).gcno
file is created, containing initial profiling information.
After running the program built, the $(OUT_DIR)/$(CRATE_NAME).gcda
file is created, containing branch counters.
The created *.gcno and *.gcda files can be processed using
the "llvm-cov gcov" and "lcov" tools. The profiling data LLVM
generates does not faithfully follow the GCC's format for *.gcno
and *.gcda files, and so it will probably not work with other tools
(such as gcov itself) that consume these files.
Replaces the llvm-c exposed LLVMRelocMode, which does not include all
relocation model variants, with a LLVMRustRelocMode modeled after
LLVMRustCodeMode.
LLVM 4.0 Upgrade
Since nobody has done this yet, I decided to get things started:
**Todo:**
* [x] push the relevant commits to `rust-lang/llvm` and `rust-lang/compiler-rt`
* [x] cleanup `.gitmodules`
* [x] Verify if there are any other commits from `rust-lang/llvm` which need backporting
* [x] Investigate / fix debuginfo ("`<optimized out>`") failures
* [x] Use correct emscripten version in docker image
---
Closes#37609.
---
**Test results:**
Everything is green 🎉
Tracking issue: https://github.com/rust-lang/rust/issues/40180
This calling convention can be used for definining interrupt handlers on
32-bit and 64-bit x86 targets. The compiler then uses `iret` instead of
`ret` for returning and ensures that all registers are restored to their
original values.
Usage:
```
extern "x86-interrupt" fn handler(stack_frame: &ExceptionStackFrame) {…}
```
for interrupts and exceptions without error code and
```
extern "x86-interrupt" fn page_fault_handler(stack_frame: &ExceptionStackFrame,
error_code: u64) {…}
```
for exceptions that push an error code (e.g., page faults or general
protection faults). The programmer must ensure that the correct version
is used for each interrupt.
For more details see the [LLVM PR][1] and the corresponding [proposal][2].
[1]: https://reviews.llvm.org/D15567
[2]: http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html