Use `object` instead of LLVM for reading bitcode from rlibs
Together with changes I plan to make as part of https://github.com/rust-lang/rust/pull/97485, this will allow entirely removing usage of LLVM's archive reader, and thus removing `archive_ro.rs` and `ArchiveWrapper.cpp`.
No functional changes intended.
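For reference, here is a rough sketch of what reading an rlib member with the `object` crate looks like, in place of LLVM's archive reader. This is illustrative only and not the rustc implementation; the member-name check and error handling are assumptions.

```rust
// Illustrative sketch, not the rustc code: pull a member's bytes out of an
// archive (rlib) using the `object` crate's archive reader.
use object::read::archive::ArchiveFile;

fn find_member<'a>(archive_data: &'a [u8], wanted_suffix: &[u8]) -> Option<&'a [u8]> {
    let archive = ArchiveFile::parse(archive_data).ok()?;
    for member in archive.members() {
        let member = member.ok()?;
        // Which member holds the bitcode depends on how rustc packages rlibs;
        // this suffix check is purely illustrative.
        if member.name().ends_with(wanted_suffix) {
            // The member data is borrowed straight from the archive bytes,
            // so no LLVM machinery is needed to get at it.
            return member.data(archive_data).ok();
        }
    }
    None
}
```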
LLVM commit 633f5663c3 removed `createWriteThinLTOBitcodePass`.
This adapts PassWrapper along the lines of the corresponding changes in that upstream commit, 633f5663c3.
Update the minimum external LLVM to 13
With this change, we'll have stable support for LLVM 13 through 15 (pending release).
For reference, the previous increase to LLVM 12 was #90175.
r? `@nagisa`
Add support for generating unique profraw files by default when using `-C instrument-coverage`
Currently, enabling the rustc flag `-C instrument-coverage` instruments the given crate and by default uses the naming scheme `default.profraw` for any instrumented profile files generated while running a binary linked against this crate. As a result, multiple instrumented binaries overwrite one another's profiles, and only the last executable run contains actual coverage results.
This can be overridden by manually setting the environment variable `LLVM_PROFILE_FILE` to use a unique naming scheme.
This PR adds a reasonable default for rustc to use when coverage instrumentation is enabled, similar to how the compiler already treats the same `profraw` files when PGO is enabled.
The new naming scheme is `default_%m_%p.profraw`, using [LLVM's special pattern strings](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#running-the-instrumented-program) to ensure that each generated file is unique.
Today the compiler sets the default for PGO `profraw` files to `default_%m.profraw` to ensure a unique file for each run. The same can be done for the instrumented profile files generated via the `-C instrument-coverage` flag, which LLVM has API support for.
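As a minimal sketch (not the actual compiler code), the default selection amounts to something like the following, where the returned pattern is what gets baked into instrumented binaries when the user has not set `LLVM_PROFILE_FILE` themselves:

```rust
// Hedged sketch of the default profile-name selection described above.
fn default_profraw_pattern(pgo_enabled: bool, instrument_coverage: bool) -> Option<&'static str> {
    if pgo_enabled {
        // Existing PGO default: unique per profiled binary.
        Some("default_%m.profraw")
    } else if instrument_coverage {
        // New coverage default: %m (binary signature) plus %p (process id),
        // so separate binaries and separate runs no longer overwrite
        // each other's profiles.
        Some("default_%m_%p.profraw")
    } else {
        None
    }
}
```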
Linked Issue: https://github.com/rust-lang/rust/issues/100381
r? `@wesleywiser`
Fixes a warning:
warning: llvm-wrapper/RustWrapper.cpp:159:11: warning: enumeration values 'AllocatedPointer' and 'AllocAlign' not handled in switch [-Wswitch]
warning: switch (Kind) {
warning: ^
This was fallout from 130a1df71e.
This obviates the patch that teaches LLVM internals about
`__rust_{re,de}alloc` functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
While we're here, we also emit allocator attributes on
__rust_alloc_zeroed. This should allow LLVM to perform more
optimizations for zeroed blocks, and probably fixes #90032. [This
comment](https://github.com/rust-lang/rust/issues/24194#issuecomment-308791157)
mentions "weird UB-like behaviour with bitvec iterators in
rustc_data_structures" so we may need to back this change out if things
go wrong.
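As a hypothetical example (not from the PR) of code that can benefit: with allocator attributes on `__rust_alloc_zeroed`, LLVM knows the returned block is already zeroed, so reads from a freshly zeroed buffer can be folded away.

```rust
// Hypothetical example: `vec![0u8; n]` lowers to a call to __rust_alloc_zeroed,
// and with the new attributes LLVM can fold this read to 0 without touching memory.
fn first_byte_of_fresh_buffer() -> u8 {
    let buf = vec![0u8; 4096];
    buf[0]
}
```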
The new test cases require LLVM 15, so we copy them into LLVM
14-supporting versions, which we can delete when we drop LLVM 14.
Add support for LLVM ShadowCallStack.
LLVM's ShadowCallStack provides backward-edge control flow integrity protection by using a separate shadow stack to store and retrieve a function's return address.
LLVM currently only supports this for AArch64 targets. The `x18` register is used to hold the pointer to the shadow stack, and therefore this only works on ABIs which reserve `x18`. Further details are available in the [LLVM ShadowCallStack](https://clang.llvm.org/docs/ShadowCallStack.html) docs.
# Usage
`-Zsanitizer=shadow-call-stack`
# Comments/Caveats
* Currently only enabled for the `aarch64-linux-android` target
* Requires the platform to define a runtime to initialize the shadow stack, see the [LLVM docs](https://clang.llvm.org/docs/ShadowCallStack.html) for more detail.
Allow disabling the ThinLTO buffer to support lld's lto-embed-bitcode feature
Hello
This change fixes an issue (https://github.com/rust-lang/rust/issues/84395) in which passing `-lto-embed-bitcode=optimized` to lld when linking Rust code via `linker-plugin-lto` doesn't produce the expected result.
Instead of emitting a single unified module into an `.llvmbc` section of the linked ELF, it emits multiple submodules.
This happens because rustc emits the bitcode modules after running LLVM's `createWriteThinLTOBitcodePass` pass,
which in turn triggers ThinLTO linkage and causes the issue described above.
This patch adds a compiler flag (`-Cemit-thin-lto=<bool>`) that selects between running `createWriteThinLTOBitcodePass` and `createBitcodeWriterPass`.
Note that this pattern of selecting between the two passes is common inside LLVM's own code.
The default is to match the old behavior.
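Sketched in Rust for illustration (names made up, not rustc internals), the choice the new flag controls looks roughly like this:

```rust
// Illustrative sketch of the selection introduced by -Cemit-thin-lto.
enum BitcodeWriter {
    /// Old default: createWriteThinLTOBitcodePass, which writes the module
    /// plus a ThinLTO summary and therefore triggers ThinLTO linkage in lld.
    ThinLto,
    /// createBitcodeWriterPass: plain module bitcode, which lets
    /// `-lto-embed-bitcode=optimized` embed a single unified module.
    Full,
}

fn writer_for(emit_thin_lto: bool) -> BitcodeWriter {
    if emit_thin_lto { BitcodeWriter::ThinLto } else { BitcodeWriter::Full }
}
```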
Remove branch target prologues from `#[naked] fn`
This patch hacks around rust-lang/rust#98768 for now by injecting appropriate attributes into the LLVM IR we emit for naked functions. I intend to pursue this upstream so that these attributes can be removed in general, but wading through C++ is slow going for me.
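For context, a nightly-only sketch of why injected prologues are a problem: a `#[naked]` function must consist of exactly one `asm!` block, so any instruction the compiler inserts ahead of it (such as a branch-target-identification prologue) corrupts the hand-written code. The AArch64 assembly below is illustrative only.

```rust
#![feature(naked_functions)]
use std::arch::asm;

// A naked function is nothing but its asm block: no compiler-generated
// prologue or epilogue is allowed to precede or follow it.
#[naked]
pub unsafe extern "C" fn return_42() -> u64 {
    asm!("mov x0, #42", "ret", options(noreturn));
}
```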
Revert "Work around invalid DWARF bugs for fat LTO"
Since September, the toolchain has not been generating reliable DWARF
information for static variables when LTO is on. This has affected
projects in the embedded space where the use of LTO is typical. In our
case, it has kept us from bumping past the 2021-09-22 nightly toolchain
lest our debugger break. This has been a pretty dramatic regression for
people using debuggers and static variables. See #90357 for more info
and a repro case.
This commit is a mechanical revert of
d5de680e20 from PR #89041, which caused
the issue. (Note on that PR that the commit's author has requested it be
reverted.)
I have locally verified that this fixes #90357 by restoring the
functionality of both the repro case I posted on that bug, and debugger
behavior on real programs. There do not appear to be test cases for this
in the toolchain; if I've missed them, point me at 'em and I'll update
them.
Add an option to control from the rustc CLI whether the resulting `.o`
bitcode module files carry ThinLTO info or regular LTO info.
This allows using `-lto-embed-bitcode=optimized` during linkage
correctly.
Signed-off-by: Ziv Dunkelman <ziv.dunkelman@nextsilicon.com>
This adds the typeid and `vcall_visibility` metadata to vtables when the
`-Cvirtual-function-elimination` flag is set.
The typeid is generated from the trait_ref in the same way as for the
`llvm.type.checked.load` intrinsic.
The offset that is added to the typeid is always 0. This is because LLVM
assumes that vtables are constructed according to the definition in the
Itanium ABI. This includes an "address point" of the vtable. In C++ this
is the offset in the vtable where information for RTTI is placed. Since
there is no RTTI information in Rust's vtables, this "address point" is
always 0. This "address point" in combination with the offset passed to
the `llvm.type.checked.load` intrinsic determines the final function
that should be loaded from the vtable in the
`WholeProgramDevirtualization` pass in LLVM. That's why the
`llvm.type.checked.load` intrinsics are generated with the typeid of the
trait, rather than with that of the function that is called. This
matches what `clang` does for C++.
The vcall_visibility metadata depends on three factors:
1. LTO level: Currently this is always fat LTO, because LLVM only
supports this optimization with fat LTO.
2. Visibility of the trait: If the trait is publicly visible, VFE
can only act on its vtables after linking.
3. Number of CGUs: if there is more than one CGU, even vtables with
restricted visibility could be seen outside of the CGU, so VFE can
only act on them after linking.
To reflect this, there are three visibility levels: Public, LinkageUnit,
and TranslationUnit.
To apply the optimization the `Virtual Function Elim` module flag has to
be set. To apply this optimization post-link the `LTOPostLink` module
flag has to be set.
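Illustrative only (not from the PR), this is the kind of Rust code the optimization targets: each dynamic call lowers to an `llvm.type.checked.load` with the trait's typeid and a slot offset as described above, and a method that is never invoked through `dyn Greet` anywhere in the (fat-LTO) linkage unit becomes a candidate for removal from the vtable and the binary.

```rust
// Hypothetical example of a trait whose unused vtable entry VFE could drop.
trait Greet {
    fn hello(&self) -> &'static str;
    fn unused(&self) -> &'static str;
}

struct English;
impl Greet for English {
    fn hello(&self) -> &'static str { "hello" }
    fn unused(&self) -> &'static str { "never called through dyn Greet" }
}

fn main() {
    // This dynamic call is compiled to a checked vtable load when
    // -Cvirtual-function-elimination is enabled.
    let g: &dyn Greet = &English;
    println!("{}", g.hello());
}
```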
In https://reviews.llvm.org/D125556 upstream changed sext() and zext()
to allow some no-op cases, which previously required use of the *OrSelf()
methods, which I assume is what was going on here. The *OrSelf() methods
got removed in https://reviews.llvm.org/D125559 after two weeks of
deprecation because they came with some bonus (probably-undesired)
behavior. Since the behavior of sext() and zext() changed slightly, I
kept the old *OrSelf() calls in LLVM 14 and earlier, and only use the
new version in LLVM 15.
r? @nikic
This new enum entry was introduced in https://reviews.llvm.org/D122268,
and if I'm reading correctly there's no case where we'd ever encounter
it in our uses of LLVM. To preserve the ability to compile this file
with -Werror -Wswitch we add an explicit case for this entry.
Remove LLVM attribute removal
This was necessary before, because `declare_raw_fn` would always apply
the default optimization attributes to every declared function.
Then `attributes::from_fn_attrs` would have to remove the default
attributes in the case of, e.g. `#[optimize(speed)]` in a `-Os` build.
(see [`src/test/codegen/optimize-attr-1.rs`](03a8cc7df1/src/test/codegen/optimize-attr-1.rs (L33)))
However, every relevant callsite of `declare_raw_fn` (i.e. where we
actually generate code for the function, and not e.g. a call to an
intrinsic, where optimization attributes don't [?] matter)
calls `from_fn_attrs`, so we can remove the attribute setting
from `declare_raw_fn`, and rely on `from_fn_attrs` to apply the correct
attributes all at once.
r? `@ghost` (blocked on #94221)
`@rustbot` label S-blocked
This was necessary before, because `declare_raw_fn` would always apply
the default optimization attributes to every declared function,
and then `attributes::from_fn_attrs` would have to remove the default
attributes in the case of, e.g. `#[optimize(speed)]` in a `-Os` build.
However, every relevant callsite of `declare_raw_fn` (i.e. where we
actually generate code for the function, and not e.g. a call to an
intrinsic, where optimization attributes don't [?] matter)
calls `from_fn_attrs`, so we can simply remove the attribute setting
from `declare_raw_fn`, and rely on `from_fn_attrs` to apply the correct
attributes all at once.
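As an illustrative sketch (made-up types and attribute names, but the same shape as described above): default optimization attributes are no longer applied at declaration time, so nothing has to be removed later, and `from_fn_attrs` computes the final set in one place.

```rust
// Hypothetical mock of the attribute flow; not the rustc_codegen_llvm API.
#[derive(Default)]
struct FnAttrs(Vec<&'static str>);

// Declares the function without applying any default optimization attributes.
fn declare_raw_fn(_name: &str) -> FnAttrs {
    FnAttrs::default()
}

// The single place where the final attributes are decided, e.g. honoring
// #[optimize(speed)] even in an opt-for-size build.
fn from_fn_attrs(attrs: &mut FnAttrs, optimize_speed_override: bool, size_opt_session: bool) {
    if size_opt_session && !optimize_speed_override {
        attrs.0.extend(["optsize", "minsize"]);
    }
}
```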
Add MemTagSanitizer Support
Add support for the LLVM [MemTagSanitizer](https://llvm.org/docs/MemTagSanitizer.html).
On hardware which supports it (see caveats below), the MemTagSanitizer can catch bugs similar to AddressSanitizer and HardwareAddressSanitizer, but with lower overhead.
On a tag mismatch, a SIGSEGV is signaled with code SEGV_MTESERR / SEGV_MTEAERR.
# Usage
`-Zsanitizer=memtag -C target-feature="+mte"`
# Comments/Caveats
* MemTagSanitizer is only supported on AArch64 targets with hardware support
* Requires `-C target-feature="+mte"`
* LLVM MemTagSanitizer currently only performs stack tagging.
# TODO
* Tests
* Example
In https://reviews.llvm.org/D114543 the uwtable attribute gained a flag
so that we can ask for sync uwtables instead of async, as the former are
much cheaper. The default is async, so that's what I've done here, but I
left a TODO that we might be able to do better.
While in here I went ahead and dropped support for removing uwtable
attributes in rustc: we never did it, so I didn't write the extra C++
bridge code to make it work. Maybe I should have done the same thing
with the `sync|async` parameter but we'll see.
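Hedged sketch (illustrative names, not the actual rustc/LLVM bindings) of the distinction the new flag encodes; rustc keeps requesting the default async form for now:

```rust
// Sketch of the uwtable flavor added upstream.
enum UwTableKind {
    Sync,  // cheaper: unwind tables only need to be correct at call sites
    Async, // full tables, unwindable from any instruction (the default)
}

fn chosen_kind() -> UwTableKind {
    // Per the TODO above, some configurations might get away with Sync,
    // but async is the conservative default used here.
    UwTableKind::Async
}
```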