Instead disable creation of assumptions during inlining using an
LLVM opt flag.
The -Z arg-align-attributes option which previously controlled this
behavior is removed.
We represent `bool` as `i1` in a `ScalarPair`, unlike other aggregates,
to optimize IR for checked operators and the like. With this patch, we
still do so when the pair is an immediate value, but we use the `i8`
memory type when the value is loaded or stored as an LLVM aggregate.
So `(bool, bool)` looks like an `{ i1, i1 }` immediate, but `{ i8, i8 }`
in memory. When a pair is a direct function argument, `PassMode::Pair`,
it is still passed using the immediate `i1` type, but as a return value
it will use the `i8` memory type. Also, `bool`-like` enum tags will now
use scalar pairs when possible, where they were previously excluded due
to optimization issues.
This used to be disabled due to LLVM bugs in the handling of
noalias information in conjunction with unwinding. However,
according to #31681 all known LLVM bugs have been fixed by
LLVM 6.0, so it's probably time to reenable this optimization.
Noalias annotations will not be emitted by default if either
-C panic=abort (as previously) or LLVM >= 6.0 (new).
-Z mutable-noalias=no is left as an escape-hatch to allow
debugging problems suspected to stem from this change.
Initially MIR differentiated between arguments and locals, which
introduced a need to add extra copies assigning the argument to a
local, even for simple bindings. This differentiation no longer exists,
but we're still creating those copies, bloating the MIR and LLVM IR we
emit.
Additionally, the current approach means that we create debug info for
both the incoming argument (marking it as an argument), and then
immediately shadow it a local that goes by the same name. This can be
confusing when using e.g. "info args" in gdb, or when e.g. a debugger
with a GUI displays the function arguments separately from the local
variables, especially when the binding is mutable, because the argument
doesn't change, while the local variable does.
This PR is an implementation of [RFC 1974] which specifies a new method of
defining a global allocator for a program. This obsoletes the old
`#![allocator]` attribute and also removes support for it.
[RFC 1974]: https://github.com/rust-lang/rfcs/pull/197
The new `#[global_allocator]` attribute solves many issues encountered with the
`#![allocator]` attribute such as composition and restrictions on the crate
graph itself. The compiler now has much more control over the ABI of the
allocator and how it's implemented, allowing much more freedom in terms of how
this feature is implemented.
cc #27389
LLVM's memory dependence analysis doesn't properly account for calls
that could unwind and thus effectively act as a branching point. This
can lead to stores that are only visible when the call unwinds being
removed, possibly leading to calls to drop() functions with b0rked
memory contents.
As there is no fix for this in LLVM yet and we want to keep
compatibility to current LLVM versions anyways, we have to workaround
this bug by omitting the noalias attribute on &mut function arguments.
Benchmarks suggest that the performance loss by this change is very
small.
Thanks to @RalfJung for pushing me towards not removing too many
noalias annotations and @alexcrichton for helping out with the test for
this bug.
Fixes#29485
This has a number of advantages compared to creating a copy in memory
and passing a pointer. The obvious one is that we don't have to put the
data into memory but can keep it in registers. Since we're currently
passing a pointer anyway (instead of using e.g. a known offset on the
stack, which is what the `byval` attribute would achieve), we only use a
single additional register for each fat pointer, but save at least two
pointers worth of stack in exchange (sometimes more because more than
one copy gets eliminated). On archs that pass arguments on the stack, we
save a pointer worth of stack even without considering the omitted
copies.
Additionally, LLVM can optimize the code a lot better, to a large degree
due to the fact that lots of copies are gone or can be optimized away.
Additionally, we can now emit attributes like nonnull on the data and/or
vtable pointers contained in the fat pointer, potentially allowing for
even more optimizations.
This results in LLVM passes being about 3-7% faster (depending on the
crate), and the resulting code is also a few percent smaller, for
example:
text data filename
5671479 3941461 before/librustc-d8ace771.so
5447663 3905745 after/librustc-d8ace771.so
1944425 2394024 before/libstd-d8ace771.so
1896769 2387610 after/libstd-d8ace771.so
I had to remove a call in the backtrace-debuginfo test, because LLVM can
now merge the tails of some blocks when optimizations are turned on,
which can't correctly preserve line info.
Fixes#22924
Cc #22891 (at least for fat pointers the code is good now)
The current codegen tests only compare IR line counts between similar
rust and C programs, the latter getting compiled with clang. That looked
like a good idea back then, but actually things like lifetime intrinsics
mean that less IR isn't always better, so the metric isn't really
helpful.
Instead, we can start doing tests that check specific aspects of the
generated IR, like attributes or metadata. To do that, we can use LLVM's
FileCheck tool which has a number of useful features for such tests.
To start off, I created some tests for a few things that were recently
added and/or broken.