Update list of allowed aarch64 features
I recently added these features to std_detect for aarch64 linux, pending [review](https://github.com/rust-lang/stdarch/pull/1146).
I have commented any features not supported by LLVM 9, the current minimum version for Rust. Some (PAuth at least) were renamed between 9 & 12 and I've left them disabled. TME, however, is not in LLVM 9 but I've left it enabled.
See https://github.com/rust-lang/stdarch/issues/993
Set dso_local for more items
Related to https://github.com/rust-lang/rust/pull/83592. (cc `@nagisa)`
Noticed that on x86_64 with `relocation-model: static` `R_X86_64_GOTPCREL` relocations were still generated in some cases. (related: https://github.com/Rust-for-Linux/linux/issues/135; Rust-for-Linux needs these fixes to successfully build)
First time doing anything with LLVM so not sure whether this is correct but the following are some of the things I've tried to convince myself.
## C equivalent
Example from clang which also sets `dso_local` in these cases:
`clang-12 -fno-PIC -S -emit-llvm test.c`
```C
extern int A;
int* a() {
return &A;
}
int B;
int* b() {
return &B;
}
```
```
; ModuleID = 'test.c'
source_filename = "test.c"
target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
`@A` = external dso_local global i32, align 4
`@B` = dso_local global i32 0, align 4
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32* `@a()` #0 {
ret i32* `@A`
}
; Function Attrs: noinline nounwind optnone uwtable
define dso_local i32* `@b()` #0 {
ret i32* `@B`
}
attributes #0 = { noinline nounwind optnone uwtable "disable-tail-calls"="false" "frame-pointer"="all" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+cx8,+fxsr,+mmx,+sse,+sse2,+x87" "tune-cpu"="generic" "unsafe-fp-math"="false" "use-soft-float"="false" }
!llvm.module.flags = !{!0}
!llvm.ident = !{!1}
!0 = !{i32 1, !"wchar_size", i32 4}
!1 = !{!"clang version 12.0.0 (https://github.com/llvm/llvm-project/ b978a93635b584db380274d7c8963c73989944a1)"}
```
`clang-12 -fno-PIC -c test.c`
`objdump test.o -r`:
```
test.o: file format elf64-x86-64
RELOCATION RECORDS FOR [.text]:
OFFSET TYPE VALUE
0000000000000006 R_X86_64_64 A
0000000000000016 R_X86_64_64 B
RELOCATION RECORDS FOR [.eh_frame]:
OFFSET TYPE VALUE
0000000000000020 R_X86_64_PC32 .text
0000000000000040 R_X86_64_PC32 .text+0x0000000000000010
```
## Comparison to pre-LLVM 12 output
`rustc --emit=obj,llvm-ir --target=x86_64-unknown-none-linuxkernel --crate-type rlib test.rs`
```Rust
#![feature(no_core, lang_items)]
#![no_core]
#[lang="sized"]
trait Sized {}
#[lang="sync"]
trait Sync {}
#[lang = "drop_in_place"]
pub unsafe fn drop_in_place<T: ?Sized>(_: *mut T) {}
impl Sync for i32 {}
pub static STATIC: i32 = 32;
extern {
pub static EXT_STATIC: i32;
}
pub fn a() -> &'static i32 {
&STATIC
}
pub fn b() -> &'static i32 {
unsafe {&EXT_STATIC}
}
```
`objdump test.o -r`
nightly-2021-02-20 (rustc target is `x86_64-linux-kernel`):
```
RELOCATION RECORDS FOR [.text._ZN4test1a17h1024ba65f3424175E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S _ZN4test6STATIC17h3adc41a83746c9ffE
RELOCATION RECORDS FOR [.text._ZN4test1b17h86a6a80c1190ac8dE]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S EXT_STATIC
```
nightly-2021-05-10:
```
RELOCATION RECORDS FOR [.text._ZN4test1a17he846f03bf37b2d20E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_GOTPCREL _ZN4test6STATIC17h5a059515bf3d4968E-0x0000000000000004
RELOCATION RECORDS FOR [.text._ZN4test1b17h7e0f7f80fbd91125E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_GOTPCREL EXT_STATIC-0x0000000000000004
```
This PR:
```
RELOCATION RECORDS FOR [.text._ZN4test1a17he846f03bf37b2d20E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S _ZN4test6STATIC17h5a059515bf3d4968E
RELOCATION RECORDS FOR [.text._ZN4test1b17h7e0f7f80fbd91125E]:
OFFSET TYPE VALUE
0000000000000007 R_X86_64_32S EXT_STATIC
```
# Stabilization report
## Summary
This stabilizes using macro expansion in key-value attributes, like so:
```rust
#[doc = include_str!("my_doc.md")]
struct S;
#[path = concat!(env!("OUT_DIR"), "/generated.rs")]
mod m;
```
See the changes to the reference for details on what macros are allowed;
see Petrochenkov's excellent blog post [on internals](https://internals.rust-lang.org/t/macro-expansion-points-in-attributes/11455)
for alternatives that were considered and rejected ("why accept no more
and no less?")
This has been available on nightly since 1.50 with no major issues.
## Notes
### Accepted syntax
The parser accepts arbitrary Rust expressions in this position, but any expression other than a macro invocation will ultimately lead to an error because it is not expected by the built-in expression forms (e.g., `#[doc]`). Note that decorators and the like may be able to observe other expression forms.
### Expansion ordering
Expansion of macro expressions in "inert" attributes occurs after decorators have executed, analogously to macro expressions appearing in the function body or other parts of decorator input.
There is currently no way for decorators to accept macros in key-value position if macro expansion must be performed before the decorator executes (if the macro can simply be copied into the output for later expansion, that can work).
## Test cases
- https://github.com/rust-lang/rust/blob/master/src/test/ui/attributes/key-value-expansion-on-mac.rs
- https://github.com/rust-lang/rust/blob/master/src/test/rustdoc/external-doc.rs
The feature has also been dogfooded extensively in the compiler and
standard library:
- https://github.com/rust-lang/rust/pull/83329
- https://github.com/rust-lang/rust/pull/83230
- https://github.com/rust-lang/rust/pull/82641
- https://github.com/rust-lang/rust/pull/80534
## Implementation history
- Initial proposal: https://github.com/rust-lang/rust/issues/55414#issuecomment-554005412
- Experiment to see how much code it would break: https://github.com/rust-lang/rust/pull/67121
- Preliminary work to restrict expansion that would conflict with this
feature: https://github.com/rust-lang/rust/pull/77271
- Initial implementation: https://github.com/rust-lang/rust/pull/78837
- Fix for an ICE: https://github.com/rust-lang/rust/pull/80563
## Unresolved Questions
~~https://github.com/rust-lang/rust/pull/83366#issuecomment-805180738 listed some concerns, but they have been resolved as of this final report.~~
## Additional Information
There are two workarounds that have a similar effect for `#[doc]`
attributes on nightly. One is to emulate this behavior by using a limited version of this feature that was stabilized for historical reasons:
```rust
macro_rules! forward_inner_docs {
($e:expr => $i:item) => {
#[doc = $e]
$i
};
}
forward_inner_docs!(include_str!("lib.rs") => struct S {});
```
This also works for other attributes (like `#[path = concat!(...)]`).
The other is to use `doc(include)`:
```rust
#![feature(external_doc)]
#[doc(include = "lib.rs")]
struct S {}
```
The first works, but is non-trivial for people to discover, and
difficult to read and maintain. The second is a strange special-case for
a particular use of the macro. This generalizes it to work for any use
case, not just including files.
I plan to remove `doc(include)` when this is stabilized. The
`forward_inner_docs` workaround will still compile without warnings, but
I expect it to be used less once it's no longer necessary.
Remove CrateNum parameter for queries that only work on local crate
The pervasive `CrateNum` parameter is a remnant of the multi-crate rustc idea.
Using `()` as query key in those cases avoids having to worry about the validity of the query key.
Use the object crate for metadata reading
This allows sharing the metadata reader between cg_llvm, cg_clif and other codegen backends.
This is not currently useful for rlib reading with cg_spirv ([rust-gpu](https://github.com/EmbarkStudios/rust-gpu/)) as it uses tar rather than ar as .rlib format, but it is useful for dylib reading required for loading proc macros. (cc `@eddyb)`
The object crate is already trusted as dependency of libstd through backtrace. As far as I know it supports reading all object file formats used by targets for which we support rust dylibs with crate metadata, but I am not certain. If this happens to not be the case, I could keep using LLVM for reading dylib metadata.
Marked as WIP for a perf run and as it is based on #83637.
Add asm!() support for PowerPC
This includes GPRs and FPRs only.
Note that this does not include PowerPC64.
For my reference, this was mostly duplicated from PR #73214.
Fix `--remap-path-prefix` not correctly remapping `rust-src` component paths and unify handling of path mapping with virtualized paths
This PR fixes#73167 ("Binaries end up containing path to the rust-src component despite `--remap-path-prefix`") by preventing real local filesystem paths from reaching compilation output if the path is supposed to be remapped.
`RealFileName::Named` introduced in #72767 is now renamed as `LocalPath`, because this variant wraps a (most likely) valid local filesystem path.
`RealFileName::Devirtualized` is renamed as `Remapped` to be used for remapped path from a real path via `--remap-path-prefix` argument, as well as real path inferred from a virtualized (during compiler bootstrapping) `/rustc/...` path. The `local_path` field is now an `Option<PathBuf>`, as it will be set to `None` before serialisation, so it never reaches any build output. Attempting to serialise a non-`None` `local_path` will cause an assertion faliure.
When a path is remapped, a `RealFileName::Remapped` variant is created. The original path is preserved in `local_path` field and the remapped path is saved in `virtual_name` field. Previously, the `local_path` is directly modified which goes against its purpose of "suitable for reading from the file system on the local host".
`rustc_span::SourceFile`'s fields `unmapped_path` (introduced by #44940) and `name_was_remapped` (introduced by #41508 when `--remap-path-prefix` feature originally added) are removed, as these two pieces of information can be inferred from the `name` field: if it's anything other than a `FileName::Real(_)`, or if it is a `FileName::Real(RealFileName::LocalPath(_))`, then clearly `name_was_remapped` would've been false and `unmapped_path` would've been `None`. If it is a `FileName::Real(RealFileName::Remapped{local_path, virtual_name})`, then `name_was_remapped` would've been true and `unmapped_path` would've been `Some(local_path)`.
cc `@eddyb` who implemented `/rustc/...` path devirtualisation
This doesn't seem to be necessary anymore, although I don't know
at which point or why that changed.
Forcing -O1 makes some tests fail under NewPM, because NewPM also
performs inlining at -O1, so it ends up performing much more
optimization in practice than before.
rustc: Support Rust-specific features in -Ctarget-feature
Since the beginning of time the `-Ctarget-feature` flag on the command
line has largely been passed unmodified to LLVM. Afterwards, though, the
`#[target_feature]` attribute was stabilized and some of the names in
this attribute do not match the corresponding LLVM name. This is because
Rust doesn't always want to stabilize the exact feature name in LLVM for
the equivalent functionality in Rust. This creates a situation, however,
where in Rust you'd write:
#[target_feature(enable = "pclmulqdq")]
unsafe fn foo() {
// ...
}
but on the command line you would write:
RUSTFLAGS="-Ctarget-feature=+pclmul" cargo build --release
This difference is somewhat odd to deal with if you're a newcomer and
the situation may be made worse with upcoming features like [WebAssembly
SIMD](https://github.com/rust-lang/rust/issues/74372) which may be more
prevalent.
This commit implements a mapping to translate requests via
`-Ctarget-feature` through the same name-mapping functionality that's
present for attributes in Rust going to LLVM. This means that
`+pclmulqdq` will work on x86 targets where as previously it did not.
I've attempted to keep this backwards-compatible where the compiler will
just opportunistically attempt to remap features found in
`-Ctarget-feature`, but if there's something it doesn't understand it
gets passed unmodified to LLVM just as it was before.
Removes unneeded check of `#[no_coverage]` in mapgen
There is an anticipated feature request to support a compiler flag that
only adds coverage for specific files (or perhaps mods). As I thought
about where that change would need to be supported, I realized that
checking the attribute in mapgen (for unused functions) was unnecessary.
The unused functions are only synthesized if they have MIR coverage, and
functions with the `no_coverage` attribute will not have been
instrumented with MIR coverage statements in the first place.
New tests confirm this.
Also, while adding tests, I updated resolved comments and FIXMEs in
other tests, and expanded comments and tests on one remaining issue that
is still not resolved.
r? `@tmandry`
cc: `@wesleywiser`
And adds tests to validate it still works.
There is an anticipated feature request to support a compiler flag that
only adds coverage for specific files (or perhaps mods). As I thought
about where that change would need to be supported, I realized that
checking the attribute in mapgen (for unused functions) was unnecessary.
The unused functions are only synthesized if they have MIR coverage, and
functions with the `no_coverage` attribute will not have been
instrumented with MIR coverage statements in the first place.
New tests confirm this.
Also, while adding tests, I updated resolved comments and FIXMEs in
other tests.
Since the beginning of time the `-Ctarget-feature` flag on the command
line has largely been passed unmodified to LLVM. Afterwards, though, the
`#[target_feature]` attribute was stabilized and some of the names in
this attribute do not match the corresponding LLVM name. This is because
Rust doesn't always want to stabilize the exact feature name in LLVM for
the equivalent functionality in Rust. This creates a situation, however,
where in Rust you'd write:
#[target_feature(enable = "pclmulqdq")]
unsafe fn foo() {
// ...
}
but on the command line you would write:
RUSTFLAGS="-Ctarget-feature=+pclmul" cargo build --release
This difference is somewhat odd to deal with if you're a newcomer and
the situation may be made worse with upcoming features like [WebAssembly
SIMD](https://github.com/rust-lang/rust/issues/74372) which may be more
prevalent.
This commit implements a mapping to translate requests via
`-Ctarget-feature` through the same name-mapping functionality that's
present for attributes in Rust going to LLVM. This means that
`+pclmulqdq` will work on x86 targets where as previously it did not.
I've attempted to keep this backwards-compatible where the compiler will
just opportunistically attempt to remap features found in
`-Ctarget-feature`, but if there's something it doesn't understand it
gets passed unmodified to LLVM just as it was before.
This commit implements both the native linking modifiers infrastructure
as well as an initial attempt at the individual modifiers from the RFC.
It also introduces a feature flag for the general syntax along with
individual feature flags for each modifier.
Fix debuginfo for generators
First, all fields except the discriminant (including `outer_fields`) should be put into structures inside the variant part, which gives an equivalent layout but offers us much better integration with debuggers.
Second, artificial flags in generator variants should be removed.
- Literally, variants are not artificial. We have `yield` statements, upvars and inner variables in the source code.
- Functionally, we don't want debuggers to suppress the variants. It contains the state of the generator, which is useful when debugging. So they shouldn't be marked artificial.
- Debuggers may use artificial flags to find the active variant. In this case, marking variants artificial will make debuggers not work properly.
Fixes#62572.
Fixes#79009.
And refer https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Debuginfo.20for.20generators.
- Literally, variants are not artificial. We have `yield` statements,
upvars and inner variables in the source code.
- Functionally, we don't want debuggers to suppress the variants. It
contains the state of the generator, which is useful when debugging.
So they shouldn't be marked artificial.
- Debuggers may use artificial flags to find the active variant. In
this case, marking variants artificial will make debuggers not work
properly.
Fixes#79009.
All fields except the discriminant (including `outer_fields`)
should be put into structures inside the variant part, which gives
an equivalent layout but offers us much better integration with
debuggers.
Implement RFC 1260 with feature_name `imported_main`.
This is the second extraction part of #84062 plus additional adjustments.
This (mostly) implements RFC 1260.
However there's still one test case failure in the extern crate case. Maybe `LocalDefId` doesn't work here? I'm not sure.
cc https://github.com/rust-lang/rust/issues/28937
r? `@petrochenkov`
The Eq trait has a special hidden function. MIR `InstrumentCoverage`
would add this function to the coverage map, but it is never called, so
the `Eq` trait would always appear uncovered.
Fixes: #83601
The fix required creating a new function attribute `no_coverage` to mark
functions that should be ignored by `InstrumentCoverage` and the
coverage `mapgen` (during codegen).
While testing, I also noticed two other issues:
* spanview debug file output ICEd on a function with no body. The
workaround for this is included in this PR.
* `assert_*!()` macro coverage can appear covered if followed by another
`assert_*!()` macro. Normally they appear uncovered. I submitted a new
Issue #84561, and added a coverage test to demonstrate this issue.
This commit updates rustc, with an applicable LLVM version, to use
LLVM's new `llvm.fpto{u,s}i.sat.*.*` intrinsics to implement saturating
floating-point-to-int conversions. This results in a little bit tighter
codegen for x86/x86_64, but the main purpose of this is to prepare for
upcoming changes to the WebAssembly backend in LLVM where wasm's
saturating float-to-int instructions will now be implemented with these
intrinsics.
This change allows simplifying a good deal of surrounding code, namely
removing a lot of wasm-specific behavior. WebAssembly no longer has any
special-casing of saturating arithmetic instructions and the need for
`fptoint_may_trap` is gone and all handling code for that is now
removed. This means that the only wasm-specific logic is in the
`fpto{s,u}i` instructions which only get used for "out of bounds is
undefined behavior". This does mean that for the WebAssembly target
specifically the Rust compiler will no longer be 100% compatible with
pre-LLVM 12 versions, but it seems like that's unlikely to be relied on
by too many folks.
Note that this change does immediately regress the codegen of saturating
float-to-int casts on WebAssembly due to the specialization of the LLVM
intrinsic not being present in our LLVM fork just yet. I'll be following
up with an LLVM update to pull in those patches, but affects a few other
SIMD things in flight for WebAssembly so I wanted to separate this change.
Eventually the entire `cast_float_to_int` function can be removed when
LLVM 12 is the minimum version, but that will require sinking the
complexity of it into other backends such as Cranelfit.
`fast-math` implies things like functions not being able to accept as an
argument or return as a result, say, `inf` which made these functions
confusingly named or behaving incorrectly, depending on how you
interpret it. Since the time when these intrinsics have been implemented
the intrinsics user's (stdsimd) approach has changed significantly and
so now it is required that these intrinsics operate normally rather than
in "whatever" way.
Fixes#84268
LLVM supports many functions from math.h in its IR. Many of these have
single-instruction variants on various platforms. So, let's add them so
std::arch can use them.
Yes, exact comparison is intentional: rounding must always return a
valid integer-equal value, except for inf/NAN.
Categorize and explain target features support
There are 3 different uses of the `-C target-feature` args passed to rustc:
1. All of the features are passed to LLVM, which uses them to configure code-generation. This is sort-of stabilized since 1.0 though LLVM does change/add/remove target features regularly.
2. Target features which are in [the compiler's allowlist](69e1d22ddb/compiler/rustc_codegen_ssa/src/target_features.rs (L12-L34)) can be used in `cfg!(target_feature)` etc. These may have different names than in LLVM and are renamed before passing them to LLVM.
3. Target features which are in the allowlist and which are stabilized or feature-gate-enabled can be used in `#[target_feature]`.
It can be confusing that `rustc --print target-features` just prints out the LLVM features without separating out the rustc features or even mentioning that the dichotomy exists.
This improves the situation by separating out the rustc and LLVM target features and adding a brief explanation about the difference.
Abbreviated Example Output:
```
$ rustc --print target-features
Features supported by rustc for this target:
adx - Support ADX instructions.
aes - Enable AES instructions.
...
xsaves - Support xsaves instructions.
crt-static - Enables libraries with C Run-time Libraries(CRT) to be statically linked.
Code-generation features supported by LLVM for this target:
16bit-mode - 16-bit mode (i8086).
32bit-mode - 32-bit mode (80386).
...
x87 - Enable X87 float instructions.
xop - Enable XOP instructions.
Use +feature to enable a feature, or -feature to disable it.
For example, rustc -C target-cpu=mycpu -C target-feature=+feature1,-feature2
Code-generation features cannot be used in cfg or #[target_feature],
and may be renamed or removed in a future version of LLVM or rustc.
```
Motivated by #83975.
CC https://github.com/rust-lang/rust/issues/49653
This commit implements the idea of a new ABI for the WebAssembly target,
one called `"wasm"`. This ABI is entirely of my own invention
and has no current precedent, but I think that the addition of this ABI
might help solve a number of issues with the WebAssembly targets.
When `wasm32-unknown-unknown` was first added to Rust I naively
"implemented an abi" for the target. I then went to write `wasm-bindgen`
which accidentally relied on details of this ABI. Turns out the ABI
definition didn't match C, which is causing issues for C/Rust interop.
Currently the compiler has a "wasm32 bindgen compat" ABI which is the
original implementation I added, and it's purely there for, well,
`wasm-bindgen`.
Another issue with the WebAssembly target is that it's not clear to me
when and if the default C ABI will change to account for WebAssembly's
multi-value feature (a feature that allows functions to return multiple
values). Even if this does happen, though, it seems like the C ABI will
be guided based on the performance of WebAssembly code and will likely
not match even what the current wasm-bindgen-compat ABI is today. This
leaves a hole in Rust's expressivity in binding WebAssembly where given
a particular import type, Rust may not be able to import that signature
with an updated C ABI for multi-value.
To fix these issues I had the idea of a new ABI for WebAssembly, one
called `wasm`. The definition of this ABI is "what you write
maps straight to wasm". The goal here is that whatever you write down in
the parameter list or in the return values goes straight into the
function's signature in the WebAssembly file. This special ABI is for
intentionally matching the ABI of an imported function from the
environment or exporting a function with the right signature.
With the addition of a new ABI, this enables rustc to:
* Eventually remove the "wasm-bindgen compat hack". Once this
ABI is stable wasm-bindgen can switch to using it everywhere.
Afterwards the wasm32-unknown-unknown target can have its default ABI
updated to match C.
* Expose the ability to precisely match an ABI signature for a
WebAssembly function, regardless of what the C ABI that clang chooses
turns out to be.
* Continue to evolve the definition of the default C ABI to match what
clang does on all targets, since the purpose of that ABI will be
explicitly matching C rather than generating particular function
imports/exports.
Naturally this is implemented as an unstable feature initially, but it
would be nice for this to get stabilized (if it works) in the near-ish
future to remove the wasm32-unknown-unknown incompatibility with the C
ABI. Doing this, however, requires the feature to be on stable because
wasm-bindgen works with stable Rust.
Allow specifying alignment for functions
Fixes#75072
This allows the user to specify alignment for functions, which can be useful for low level work where functions need to necessarily be aligned to a specific value.
I believe the error cases not covered in the match are caught earlier based on my testing so I had them just return `None`.
Set dso_local for hidden, private and local items
This should probably have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang], only the
portion that is necessary to fix a regression from LLVM 12 that relates to
`-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
Use FromStr trait for number option parsing
Replace `parse_uint` with generic `parse_number` based on `FromStr`.
Use it for parsing inlining threshold to avoid casting later.
Allow clobbering unsupported registers in asm!
Previously registers could only be marked as clobbered if the target feature for that register was enabled. This restriction is now removed.
cc #81092
r? ``@nagisa``
Translate counters from Rust 1-based to LLVM 0-based counter ids
A colleague contacted me and asked why Rust's counters start at 1, when
Clangs appear to start at 0. There is a reason why Rust's internal
counters start at 1 (see the docs), and I tried to keep them consistent
when codegenned to LLVM's coverage mapping format. LLVM should be
tolerant of missing counters, but as my colleague pointed out,
`llvm-cov` will silently fail to generate a coverage report for a
function based on LLVM's assumption that the counters are 0-based.
See:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L170
Apparently, if, for example, a function has no branches, it would have
exactly 1 counter. `CounterValues.size()` would be 1, and (with the
1-based index), the counter ID would be 1. This would fail the check
and abort reporting coverage for the function.
It turns out that by correcting for this during coverage map generation,
by subtracting 1 from the Rust Counter ID (both when generating the
counter increment intrinsic call, and when adding counters to the map),
some uncovered functions (including in tests) now appear covered! This
corrects the coverage for a few tests!
r? `@tmandry`
FYI: `@wesleywiser`
A colleague contacted me and asked why Rust's counters start at 1, when
Clangs appear to start at 0. There is a reason why Rust's internal
counters start at 1 (see the docs), and I tried to keep them consistent
when codegenned to LLVM's coverage mapping format. LLVM should be
tolerant of missing counters, but as my colleague pointed out,
`llvm-cov` will silently fail to generate a coverage report for a
function based on LLVM's assumption that the counters are 0-based.
See:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L170
Apparently, if, for example, a function has no branches, it would have
exactly 1 counter. `CounterValues.size()` would be 1, and (with the
1-based index), the counter ID would be 1. This would fail the check
and abort reporting coverage for the function.
It turns out that by correcting for this during coverage map generation,
by subtracting 1 from the Rust Counter ID (both when generating the
counter increment intrinsic call, and when adding counters to the map),
some uncovered functions (including in tests) now appear covered! This
corrects the coverage for a few tests!
This should have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang]. Only
the obviously correct portion and what is necessary to fix a regression
from LLVM 12 that relates to `-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
Run LLVM coverage instrumentation passes before optimization passes
This matches the behavior of Clang and allows us to remove several
hacks which were needed to ensure functions weren't optimized away
before reaching the instrumentation pass.
Fixes#83429
cc `@richkadel`
r? `@tmandry`
This matches the behavior of Clang and allows us to remove several
hacks which were needed to ensure functions weren't optimized away
before reaching the instrumentation pass.
- Add back `HirIdVec`, with a comment that it will soon be used.
- Add back `*_region` functions, with a comment they may soon be used.
- Remove `-Z borrowck_stats` completely. It didn't do anything.
- Remove `make_nop` completely.
- Add back `current_loc`, which is used by an out-of-tree tool.
- Fix style nits
- Remove `AtomicCell` with `cfg(parallel_compiler)` for consistency.
Found with https://github.com/est31/warnalyzer.
Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
wants to use it in the future?
- Don't change rustc_serialize
I plan to scrap most of the json module in the near future (see
https://github.com/rust-lang/compiler-team/issues/418) and fixing the
tests needed more work than I expected.
TODO: check if any of the comments on the deleted code should be kept.
Import small cold functions
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be code generated in at least one
CGU and available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, no matter how trivial the
implementation.
Increase slightly the default multiplier from 0 to 0.1.
r? `@ghost`
coverage bug fixes and optimization support
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
FYI: `@wesleywiser`
r? `@tmandry`
The frontend shouldn't be deciding whether or not to use mutable
noalias attributes, as this is a pure LLVM concern. Only provide
the necessary information and do the actual decision in
codegen_llvm.
* Use Markdown list syntax and unindent a bit to prevent Markdown
interpreting the nested lists as code blocks
* A few more small typographical cleanups
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
Make source-based code coverage compatible with MIR inlining
When codegenning code coverage use the instance that coverage data was
originally generated for, to ensure basic level of compatibility with
MIR inlining.
Fixes#83061