debuginfo: Refactor debuginfo generation for types
This PR implements the refactoring of the `rustc_codegen_llvm::debuginfo::metadata` module as described in MCP https://github.com/rust-lang/compiler-team/issues/482.
In particular it
- changes names to use `di_node` instead of `metadata`
- uniformly names all functions that build new debuginfo nodes `build_xyz_di_node`
- renames `CrateDebugContext` to `CodegenUnitDebugContext` (which is more accurate)
- removes outdated parts from `compiler/rustc_codegen_llvm/src/debuginfo/doc.md`
- moves `TypeMap` and functions that work directly work with it to a new `type_map` module
- moves enum related builder functions to a new `enums` module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses `SmallVec` instead of `Vec` in many places
- removes the old infrastructure for dealing with recursion cycles (`create_and_register_recursive_type_forward_declaration()`, `RecursiveTypeDescription`, `set_members_of_composite_type()`, `MemberDescription`, `MemberDescriptionFactory`, `prepare_xyz_metadata()`, etc)
- adds `type_map::build_type_with_children()` as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-Style enums so they don't get a `enum$<...>` name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums where classified as C-style even though they have fields
- changes cpp-like naming for generator enums so that NatVis works for them
- changes the position of discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it
The following could be done in subsequent PRs:
- add caching for `closure_saved_names_of_captured_variables`
- add caching for `generator_layout_and_saved_local_names`
- fix inconsistent handling of what is considered a C-style enum wrt to debuginfo
- rename `metadata` module to `types`
- move common generator fields to front instead of appending them
This PR is based on https://github.com/rust-lang/rust/pull/93644 which is not merged yet.
Right now, the changes are all done in one big commit. They could be split into smaller commits but hopefully the list of changes above makes it tractable to review them as a single commit too.
For now: r? `@ghost` (let's see if this affects compile times)
This is a continuation of #60109, which noted that while the ADX
intrinsics were stabilized, the corresponding target feature never was.
This PR follows the same general structure and stabilizes the ADX target
feature.
This commit
- changes names to use di_node instead of metadata
- uniformly names all functions that build new debuginfo nodes build_xyz_di_node
- renames CrateDebugContext to CodegenUnitDebugContext (which is more accurate)
- moves TypeMap and functions that work directly work with it to a new type_map module
- moves and reimplements enum related builder functions to a new enums module
- splits enum debuginfo building for the native and cpp-like cases, since they are mostly separate
- uses SmallVec instead of Vec in many places
- removes the old infrastructure for dealing with recursion cycles (create_and_register_recursive_type_forward_declaration(), RecursiveTypeDescription, set_members_of_composite_type(), MemberDescription, MemberDescriptionFactory, prepare_xyz_metadata(), etc)
- adds type_map::build_type_with_children() as a replacement for dealing with recursion cycles
- adds many (doc-)comments explaining what's going on
- changes cpp-like naming for C-Style enums so they don't get a enum$<...> name (because the NatVis visualizer does not apply to them)
- fixes detection of what is a C-style enum because some enums where classified as C-style even though they have fields
- changes the position of discriminant debuginfo node so it is consistently nested inside the top-level union instead of, sometimes, next to it
This commit makes `AdtDef` use `Interned`. Much the commit is tedious
changes to introduce getter functions. The interesting changes are in
`compiler/rustc_middle/src/ty/adt.rs`.
`Layout` is another type that is sometimes interned, sometimes not, and
we always use references to refer to it so we can't take any advantage
of the uniqueness properties for hashing or equality checks.
This commit renames `Layout` as `LayoutS`, and then introduces a new
`Layout` that is a newtype around an `Interned<LayoutS>`. It also
interns more layouts than before. Previously layouts within layouts
(via the `variants` field) were never interned, but now they are. Hence
the lifetime on the new `Layout` type.
Unlike other interned types, these ones are in `rustc_target` instead of
`rustc_middle`. This reflects the existing structure of the code, which
does layout-specific stuff in `rustc_target` while `TyAndLayout` is
generic over the `Ty`, allowing the type-specific stuff to occur in
`rustc_middle`.
The commit also adds a `HashStable` impl for `Interned`, which was
needed. It hashes the contents, unlike the `Hash` impl which hashes the
pointer.
Currently some `Allocation`s are interned, some are not, and it's very
hard to tell at a use point which is which.
This commit introduces `ConstAllocation` for the known-interned ones,
which makes the division much clearer. `ConstAllocation::inner()` is
used to get the underlying `Allocation`.
In some places it's natural to use an `Allocation`, in some it's natural
to use a `ConstAllocation`, and in some places there's no clear choice.
I've tried to make things look as nice as possible, while generally
favouring `ConstAllocation`, which is the type that embodies more
information. This does require quite a few calls to `inner()`.
The commit also tweaks how `PartialOrd` works for `Interned`. The
previous code was too clever by half, building on `T: Ord` to make the
code shorter. That caused problems with deriving `PartialOrd` and `Ord`
for `ConstAllocation`, so I changed it to build on `T: PartialOrd`,
which is slightly more verbose but much more standard and avoided the
problems.
ARM: Only allow using d16-d31 with asm! when supported by the target
Support can be determined by checking for the "d32" LLVM feature.
r? ```````````````@nagisa```````````````
Improve allowness of the unexpected_cfgs lint
This pull-request improve the allowness (`#[allow(...)]`) of the `unexpected_cfgs` lint.
Before this PR only crate level `#![allow(unexpected_cfgs)]` worked, now with this PR it also work when put around `cfg!` or if it is in a upper level. Making it work ~for the attributes `cfg`, `cfg_attr`, ...~ for the same level is awkward as the current code is design to give "Some parent node that is close to this macro call" (cf. https://doc.rust-lang.org/nightly/nightly-rustc/rustc_expand/base/struct.ExpansionData.html) meaning that allow on the same line as an attribute won't work. I'm note even sure if this would be possible.
Found while working on https://github.com/rust-lang/rust/pull/94298.
r? ````````@petrochenkov````````
Rollup of 9 pull requests
Successful merges:
- #94464 (Suggest adding a new lifetime parameter when two elided lifetimes should match up for traits and impls.)
- #94476 (7 - Make more use of `let_chains`)
- #94478 (Fix panic when handling intra doc links generated from macro)
- #94482 (compiler: fix some typos)
- #94490 (Update books)
- #94496 (tests: accept llvm intrinsic in align-checking test)
- #94498 (9 - Make more use of `let_chains`)
- #94503 (Provide C FFI types via core::ffi, not just in std)
- #94513 (update Miri)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Direct users towards using Rust target feature names in CLI
This PR consists of a couple of changes on how we handle target features.
In particular there is a bug-fix wherein we avoid passing through features that aren't prefixed by `+` or `-` to LLVM. These appear to be causing LLVM to assert, which is pretty poor a behaviour (and also makes it pretty clear we expect feature names to be prefixed).
The other commit, I anticipate to be somewhat more controversial is outputting a warning when users specify a LLVM-specific, or otherwise unknown, feature name on the CLI. In those situations we request users to either replace it with a known Rust feature name (e.g. `bmi` -> `bmi1`) or file a feature request. I've a couple motivations for this: first of all, if users are specifying these features on the command line, I'm pretty confident there is also a need for these features to be usable via `#[cfg(target_feature)]` machinery. And second, we're growing a fair number of backends recently and having ability to provide some sort of unified-ish interface in this place seems pretty useful to me.
Sponsored by: standard.ai
If they are trying to use features rustc doesn't yet know about,
request a feature request.
Additionally, also warn against using feature names without leading `+`
or `-` signs.
rustc_errors: let `DiagnosticBuilder::emit` return a "guarantee of emission".
That is, `DiagnosticBuilder` is now generic over the return type of `.emit()`, so we'll now have:
* `DiagnosticBuilder<ErrorReported>` for error (incl. fatal/bug) diagnostics
* can only be created via a `const L: Level`-generic constructor, that limits allowed variants via a `where` clause, so not even `rustc_errors` can accidentally bypass this limitation
* asserts `diagnostic.is_error()` on emission, just in case the construction restriction was bypassed (e.g. by replacing the whole `Diagnostic` inside `DiagnosticBuilder`)
* `.emit()` returns `ErrorReported`, as a "proof" token that `.emit()` was called
(though note that this isn't a real guarantee until after completing the work on
#69426)
* `DiagnosticBuilder<()>` for everything else (warnings, notes, etc.)
* can also be obtained from other `DiagnosticBuilder`s by calling `.forget_guarantee()`
This PR is a companion to other ongoing work, namely:
* #69426
and it's ongoing implementation:
#93222
the API changes in this PR are needed to get statically-checked "only errors produce `ErrorReported` from `.emit()`", but doesn't itself provide any really strong guarantees without those other `ErrorReported` changes
* #93244
would make the choices of API changes (esp. naming) in this PR fit better overall
In order to be able to let `.emit()` return anything trustable, several changes had to be made:
* `Diagnostic`'s `level` field is now private to `rustc_errors`, to disallow arbitrary "downgrade"s from "some kind of error" to "warning" (or anything else that doesn't cause compilation to fail)
* it's still possible to replace the whole `Diagnostic` inside the `DiagnosticBuilder`, sadly, that's harder to fix, but it's unlikely enough that we can paper over it with asserts on `.emit()`
* `.cancel()` now consumes `DiagnosticBuilder`, preventing `.emit()` calls on a cancelled diagnostic
* it's also now done internally, through `DiagnosticBuilder`-private state, instead of having a `Level::Cancelled` variant that can be read (or worse, written) by the user
* this removes a hazard of calling `.cancel()` on an error then continuing to attach details to it, and even expect to be able to `.emit()` it
* warnings were switched to *only* `can_emit_warnings` on emission (instead of pre-cancelling early)
* `struct_dummy` was removed (as it relied on a pre-`Cancelled` `Diagnostic`)
* since `.emit()` doesn't consume the `DiagnosticBuilder` <sub>(I tried and gave up, it's much more work than this PR)</sub>,
we have to make `.emit()` idempotent wrt the guarantees it returns
* thankfully, `err.emit(); err.emit();` can return `ErrorReported` both times, as the second `.emit()` call has no side-effects *only* because the first one did do the appropriate emission
* `&mut Diagnostic` is now used in a lot of function signatures, which used to take `&mut DiagnosticBuilder` (in the interest of not having to make those functions generic)
* the APIs were already mostly identical, allowing for low-effort porting to this new setup
* only some of the suggestion methods needed some rework, to have the extra `DiagnosticBuilder` functionality on the `Diagnostic` methods themselves (that change is also present in #93259)
* `.emit()`/`.cancel()` aren't available, but IMO calling them from an "error decorator/annotator" function isn't a good practice, and can lead to strange behavior (from the caller's perspective)
* `.downgrade_to_delayed_bug()` was added, letting you convert any `.is_error()` diagnostic into a `delay_span_bug` one (which works because in both cases the guarantees available are the same)
This PR should ideally be reviewed commit-by-commit, since there is a lot of fallout in each.
r? `@estebank` cc `@Manishearth` `@nikomatsakis` `@mark-i-m`
Partially move cg_ssa towards using a single builder
Not all codegen backends can handle hopping between blocks well. For example Cranelift requires blocks to be terminated before switching to building a new block. Rust-gpu requires a `RefCell` to allow hopping between blocks and cg_gcc currently has a buggy implementation of hopping between blocks. This PR reduces the amount of cases where cg_ssa switches between blocks before they are finished and mostly fixes the block hopping in cg_gcc. (~~only `scalar_to_backend` doesn't handle it correctly yet in cg_gcc~~ fixed that one.)
`@antoyo` please review the cg_gcc changes.
Fix several asm! related issues
This is a combination of several fixes, each split into a separate commit. Splitting these into PRs is not practical since they conflict with each other.
Fixes#92378Fixes#85247
r? ``@nagisa``
The previous approach of checking for the reserve-r9 target feature
didn't actually work because LLVM only sets this feature very late when
initializing the per-function subtarget.
Move ty::print methods to Drop-based scope guards
Primary goal is reducing codegen of the TLS access for each closure, which shaves ~3 seconds of bootstrap time over rustc as a whole.
Adopt let else in more places
Continuation of #89933, #91018, #91481, #93046, #93590, #94011.
I have extended my clippy lint to also recognize tuple passing and match statements. The diff caused by fixing it is way above 1 thousand lines. Thus, I split it up into multiple pull requests to make reviewing easier. This is the biggest of these PRs and handles the changes outside of rustdoc, rustc_typeck, rustc_const_eval, rustc_trait_selection, which were handled in PRs #94139, #94142, #94143, #94144.
Guard against unwinding in cleanup code
Currently the only safe guard we have against double unwind is the panic count (which is local to Rust). When double unwinds indeed happen (e.g. C++ exception + Rust panic, or two C++ exceptions), then the second unwind actually goes through and the first unwind is leaked. This can cause UB. cc rust-lang/project-ffi-unwind#6
E.g. given the following C++ code:
```c++
extern "C" void foo() {
throw "A";
}
extern "C" void execute(void (*fn)()) {
try {
fn();
} catch(...) {
}
}
```
This program is well-defined to terminate:
```c++
struct dtor {
~dtor() noexcept(false) {
foo();
}
};
void a() {
dtor a;
dtor b;
}
int main() {
execute(a);
return 0;
}
```
But this Rust code doesn't catch the double unwind:
```rust
extern "C-unwind" {
fn foo();
fn execute(f: unsafe extern "C-unwind" fn());
}
struct Dtor;
impl Drop for Dtor {
fn drop(&mut self) {
unsafe { foo(); }
}
}
extern "C-unwind" fn a() {
let _a = Dtor;
let _b = Dtor;
}
fn main() {
unsafe { execute(a) };
}
```
To address this issue, this PR adds an unwind edge to an abort block, so that the Rust example aborts. This is similar to how clang guards against double unwind (except clang calls terminate per C++ spec and we abort).
The cost should be very small; it's an additional trap instruction (well, two for now, since we use TrapUnreachable, but that's a different issue) for each function with landing pads; if LLVM gains support to encode "abort/terminate" info directly in LSDA like GCC does, then it'll be free. It's an additional basic block though so compile time may be worse, so I'd like a perf run.
r? `@ghost`
`@rustbot` label: F-c_unwind
Specifically, rename the `Const` struct as `ConstS` and re-introduce `Const` as
this:
```
pub struct Const<'tcx>(&'tcx Interned<ConstS>);
```
This now matches `Ty` and `Predicate` more closely, including using
pointer-based `eq` and `hash`.
Notable changes:
- `mk_const` now takes a `ConstS`.
- `Const` was copy, despite being 48 bytes. Now `ConstS` is not, so need a
we need separate arena for it, because we can't use the `Dropless` one any
more.
- Many `&'tcx Const<'tcx>`/`&Const<'tcx>` to `Const<'tcx>` changes
- Many `ct.ty` to `ct.ty()` and `ct.val` to `ct.val()` changes.
- Lots of tedious sigil fiddling.
Specifically, change `Ty` from this:
```
pub type Ty<'tcx> = &'tcx TyS<'tcx>;
```
to this
```
pub struct Ty<'tcx>(Interned<'tcx, TyS<'tcx>>);
```
There are two benefits to this.
- It's now a first class type, so we can define methods on it. This
means we can move a lot of methods away from `TyS`, leaving `TyS` as a
barely-used type, which is appropriate given that it's not meant to
be used directly.
- The uniqueness requirement is now explicit, via the `Interned` type.
E.g. the pointer-based `Eq` and `Hash` comes from `Interned`, rather
than via `TyS`, which wasn't obvious at all.
Much of this commit is boring churn. The interesting changes are in
these files:
- compiler/rustc_middle/src/arena.rs
- compiler/rustc_middle/src/mir/visit.rs
- compiler/rustc_middle/src/ty/context.rs
- compiler/rustc_middle/src/ty/mod.rs
Specifically:
- Most mentions of `TyS` are removed. It's very much a dumb struct now;
`Ty` has all the smarts.
- `TyS` now has `crate` visibility instead of `pub`.
- `TyS::make_for_test` is removed in favour of the static `BOOL_TY`,
which just works better with the new structure.
- The `Eq`/`Ord`/`Hash` impls are removed from `TyS`. `Interned`s impls
of `Eq`/`Hash` now suffice. `Ord` is now partly on `Interned`
(pointer-based, for the `Equal` case) and partly on `TyS`
(contents-based, for the other cases).
- There are many tedious sigil adjustments, i.e. adding or removing `*`
or `&`. They seem to be unavoidable.
Split `pauth` target feature
Per discussion on https://github.com/rust-lang/rust/issues/86941 we'd like to split `pauth` into `paca` and `pacg` in order to better support possible future environments that only have the keys available for address or generic authentication. At the moment LLVM has the one `pauth` target_feature while Linux presents separate `paca` and `pacg` flags for feature detection.
Because the use of [target_feature](https://rust-lang.github.io/rfcs/2045-target-feature.html) will "allow the compiler to generate code under the assumption that this code will only be reached in hosts that support the feature", it does not make sense to simply translate `paca` into the LLVM feature `pauth`, as it will generate code as if `pacg` is available.
To accommodate this we error if only one of the two features is present. If LLVM splits them in the future we can remove this restriction without making a breaking change.
r? ```@Amanieu```
Ensure that queries only return Copy types.
This should pervent the perf footgun of returning a result with an expensive `Clone` impl (like a `Vec` of a hash map).
I went for the stupid solution of allocating on an arena everything that was not `Copy`. Some query results could be made Copy easily, but I did not really investigate.
debuginfo: Fix DW_AT_containing_type vtable debuginfo regression
This PR brings back the `DW_AT_containing_type` attribute for vtables after it has accidentally been removed in #89597.
It also implements a more accurate description of vtables. Instead of describing them as an array of void pointers, the compiler will now emit a struct type description with a field for each entry of the vtable.
r? ``@wesleywiser``
This PR should fix issue https://github.com/rust-lang/rust/issues/93164.
~~The PR is blocked on https://github.com/rust-lang/rust/pull/93154 because both of them modify the `codegen/debug-vtable.rs` test case.~~
Stabilize `-Z instrument-coverage` as `-C instrument-coverage`
(Tracking issue for `instrument-coverage`: https://github.com/rust-lang/rust/issues/79121)
This PR stabilizes support for instrumentation-based code coverage, previously provided via the `-Z instrument-coverage` option. (Continue supporting `-Z instrument-coverage` for compatibility for now, but show a deprecation warning for it.)
Many, many people have tested this support, and there are numerous reports of it working as expected.
Move the documentation from the unstable book to stable rustc documentation. Update uses and documentation to use the `-C` option.
Addressing questions raised in the tracking issue:
> If/when stabilized, will the compiler flag be updated to -C instrument-coverage? (If so, the -Z variant could also be supported for some time, to ease migrations for existing users and scripts.)
This stabilization PR updates the option to `-C` and keeps the `-Z` variant to ease migration.
> The Rust coverage implementation depends on (and automatically turns on) -Z symbol-mangling-version=v0. Will stabilizing this feature depend on stabilizing v0 symbol-mangling first? If so, what is the current status and timeline?
This stabilization PR depends on https://github.com/rust-lang/rust/pull/90128 , which stabilizes `-C symbol-mangling-version=v0` (but does not change the default symbol-mangling-version).
> The Rust coverage implementation implements the latest version of LLVM's Coverage Mapping Format (version 4), which forces a dependency on LLVM 11 or later. A compiler error is generated if attempting to compile with coverage, and using an older version of LLVM.
Given that LLVM 13 has now been released, requiring LLVM 11 for coverage support seems like a reasonable requirement. If people don't have at least LLVM 11, nothing else breaks; they just can't use coverage support. Given that coverage support currently requires a nightly compiler and LLVM 11 or newer, allowing it on a stable compiler built with LLVM 11 or newer seems like an improvement.
The [tracking issue](https://github.com/rust-lang/rust/issues/79121) and the [issue label A-code-coverage](https://github.com/rust-lang/rust/labels/A-code-coverage) link to a few open issues related to `instrument-coverage`, but none of them seem like showstoppers. All of them seem like improvements and refinements we can make after stabilization.
The original `-Z instrument-coverage` support went through a compiler-team MCP at https://github.com/rust-lang/compiler-team/issues/278 . Based on that, `@pnkfelix` suggested that this needed a stabilization PR and a compiler-team FCP.
debuginfo: Make sure that type names for closure and generator environments are unique in debuginfo.
Before this change, closure/generator environments coming from different instantiations of the same generic function were all assigned the same name even though they were distinct types with potentially different data layout. Now we append the generic arguments of the originating function to the type name.
This commit also emits `{closure_env#0}` as the name of these types in order to disambiguate them from the accompanying closure function (which keeps being called `{closure#0}`). Previously both were assigned the same name.
NOTE: Changing debuginfo names like this can break pretty printers and other debugger plugins. I think it's OK in this particular case because the names we are changing were ambiguous anyway. In general though it would be great to have a process for doing changes like these.
Before this change, closure/generator environments coming from different
instantiations of the same generic function were all assigned the same
name even though they were distinct types with potentially different data
layout. Now we append the generic arguments of the originating function
to the type name.
This commit also emits '{closure_env#0}' as the name of these types in
order to disambiguate them from the accompanying closure function
'{closure#0}'. Previously both were assigned the same name.
Add `intrinsics::const_deallocate`
Tracking issue: #79597
Related: #91884
This allows deallocation of a memory allocated by `intrinsics::const_allocate`. At the moment, this can be only used to reduce memory usage, but in the future this may be useful to detect memory leaks (If an allocated memory remains after evaluation, raise an error...?).
Print a helpful message if unwinding aborts when it reaches a nounwind function
This is implemented by routing `TerminatorKind::Abort` back through the panic handler, but with a special flag in the `PanicInfo` which indicates that the panic handler should *not* attempt to unwind the stack and should instead abort immediately.
This is useful for the planned change in https://github.com/rust-lang/lang-team/issues/97 which would make `Drop` impls `nounwind` by default.
### Code
```rust
#![feature(c_unwind)]
fn panic() {
panic!()
}
extern "C" fn nounwind() {
panic();
}
fn main() {
nounwind();
}
```
### Before
```
$ ./test
thread 'main' panicked at 'explicit panic', test.rs:4:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Illegal instruction (core dumped)
```
### After
```
$ ./test
thread 'main' panicked at 'explicit panic', test.rs:4:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'panic in a function that cannot unwind', test.rs:7:1
stack backtrace:
0: 0x556f8f86ec9b - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hdccefe11a6ac4396
1: 0x556f8f88ac6c - core::fmt::write::he152b28c41466ebb
2: 0x556f8f85d6e2 - std::io::Write::write_fmt::h0c261480ab86f3d3
3: 0x556f8f8654fa - std::panicking::default_hook::{{closure}}::h5d7346f3ff7f6c1b
4: 0x556f8f86512b - std::panicking::default_hook::hd85803a1376cac7f
5: 0x556f8f865a91 - std::panicking::rust_panic_with_hook::h4dc1c5a3036257ac
6: 0x556f8f86f079 - std::panicking::begin_panic_handler::{{closure}}::hdda1d83c7a9d34d2
7: 0x556f8f86edc4 - std::sys_common::backtrace::__rust_end_short_backtrace::h5b70ed0cce71e95f
8: 0x556f8f865592 - rust_begin_unwind
9: 0x556f8f85a764 - core::panicking::panic_no_unwind::h2606ab3d78c87899
10: 0x556f8f85b910 - test::nounwind::hade6c7ee65050347
11: 0x556f8f85b936 - test::main::hdc6e02cb36343525
12: 0x556f8f85b7e3 - core::ops::function::FnOnce::call_once::h4d02663acfc7597f
13: 0x556f8f85b739 - std::sys_common::backtrace::__rust_begin_short_backtrace::h071d40135adb0101
14: 0x556f8f85c149 - std::rt::lang_start::{{closure}}::h70dbfbf38b685e93
15: 0x556f8f85c791 - std::rt::lang_start_internal::h798f1c0268d525aa
16: 0x556f8f85c131 - std::rt::lang_start::h476a7ee0a0bb663f
17: 0x556f8f85b963 - main
18: 0x7f64c0822b25 - __libc_start_main
19: 0x556f8f85ae8e - _start
20: 0x0 - <unknown>
thread panicked while panicking. aborting.
Aborted (core dumped)
```
- Fix style errors.
- L4-bender does not yet support dynamic linking.
- Stack unwinding is not yet supported for x86_64-unknown-l4re-uclibc.
For now, just abort on panics.
- Use GNU-style linker options where possible. As suggested by review:
- Use standard GNU-style ld syntax for relro flags.
- Use standard GNU-style optimization flags and logic.
- Use standard GNU-style ld syntax for --subsystem.
- Don't read environment variables in L4Bender linker. Thanks to
CARGO_ENCODED_RUSTFLAGS introduced in #9601, l4-bender's arguments can
now be passed from the L4Re build system without resorting to custom
parsing of environment variables.
Update some rustc dependencies to deduplicate them
This PR updates `rand` and `itertools` in rustc (not the whole workspace) in order to deduplicate them (and hopefully slightly improve compile times).
~~Currently, `object` is still duplicated, but https://github.com/rust-lang/thorin/pull/15 and updating `thorin` in the future will remove the use of version 0.27.~~ Update: Thorin 0.2 has now been released, and this PR updates `rustc_codegen_ssa` to use it and deduplicate the `object` crate.
There's a final tiny rustc dependency, `cfg-if`, which will be left: as both versions 0.1.x and 1.0 looked to be heavily depended on, they will require a few cascading updates to be removed.
Stabilize `-Z print-link-args` as `--print link-args`
We have stable options for adding linker arguments; we should have a
stable option to help debug linker arguments.
Add documentation for the new option. In the documentation, make it clear that
the *exact* format of the output is not a stable guarantee.
Use iterator instead of recursion in `codegen_place`
This PR fixes the FIXME in `codegen_place` about using iterator instead of recursion when processing the `projection` field in `mir::PlaceRef`. At the same time, it also reduces the right drift.
Improve SIMD casts
* Allows `simd_cast` intrinsic to take `usize` and `isize`
* Adds `simd_as` intrinsic, which is the same as `simd_cast` except for saturating float-to-int conversions (matching the behavior of `as`).
cc `@workingjubilee`
Remove LLVMRustMarkAllFunctionsNounwind
This was originally introduced in #10916 as a way to remove all landing
pads when performing LTO. However this is no longer necessary today
since rustc properly marks all functions and call-sites as nounwind
where appropriate.
In fact this is incorrect in the presence of `extern "C-unwind"` which
must create a landing pad when compiled with `-C panic=abort` so that
foreign exceptions are caught and properly turned into aborts.
Remove deprecated LLVM-style inline assembly
The `llvm_asm!` was deprecated back in #87590 1.56.0, with intention to remove
it once `asm!` was stabilized, which already happened in #91728 1.59.0. Now it
is time to remove `llvm_asm!` to avoid continued maintenance cost.
Closes#70173.
Closes#92794.
Closes#87612.
Closes#82065.
cc `@rust-lang/wg-inline-asm`
r? `@Amanieu`
Generate more precise generator names
Currently all generators are named with a `generator$N` suffix, regardless of where they come from. This means an `async fn` shows up as a generator in stack traces, which can be surprising to async programmers since they should not need to know that async functions are implementated using generators.
This change generators a different name depending on the generator kind, allowing us to tell whether the generator is the result of an async block, an async closure, an async fn, or a plain generator.
r? `@tmandry`
cc `@michaelwoerister` `@wesleywiser` `@dpaoliello`
This was originally introduced in #10916 as a way to remove all landing
pads when performing LTO. However this is no longer necessary today
since rustc properly marks all functions and call-sites as nounwind
where appropriate.
In fact this is incorrect in the presence of `extern "C-unwind"` which
must create a landing pad when compiled with `-C panic=abort` so that
foreign exceptions are caught and properly turned into aborts.
Currently all generators are named with a `generator$N` suffix,
regardless of where they come from. This means an `async fn` shows up as
a generator in stack traces, which can be surprising to async
programmers since they should not need to know that async functions are
implementated using generators.
This change generators a different name depending on the generator kind,
allowing us to tell whether the generator is the result of an async
block, an async closure, an async fn, or a plain generator.
Make rlib metadata strip works with MIPSr6 architecture
Because MIPSr6 has many differences with previous MIPSr2 arch, the previous rlib metadata stripping code in `rustc_codegen_ssa` is only for MIPSr2/r3/r5 (which share the same elf e_flags).
This commit fixed this problem. It makes `rustc_codegen_ssa` happy when compiling rustc for MIPSr6 target or hosts.
e_flags REF: e356027016/llvm/include/llvm/BinaryFormat/ELF.h (L562)
The field is also renamed from `ident` to `name. In most cases,
we don't actually need the `Span`. A new `ident` method is added
to `VariantDef` and `FieldDef`, which constructs the full `Ident`
using `tcx.def_ident_span()`. This method is used in the cases
where we actually need an `Ident`.
This makes incremental compilation properly track changes
to the `Span`, without all of the invalidations caused by storing
a `Span` directly via an `Ident`.
Consolidate checking for msvc when generating debuginfo
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
r? ``@michaelwoerister``
This PR is not urgent so please don't let it interrupt your holidays! 🎄🎁
If the target we're generating code for is msvc, then we do two main
things differently: we generate type names in a C++ style instead of a
Rust style and we generate debuginfo for enums differently.
I've refactored the code so that there is one function
(`cpp_like_debuginfo`) which determines if we should use the C++ style
of naming types and other debuginfo generation or the regular Rust one.
Because MIPSr6 has many differences with previous MIPSr2 arch, the previous rlib metadata stripping code in `rustc_codegen_ssa` is only for MIPSr2/r3/r5 (which share the same elf e_flags).
This commit fixed this problem. It makes `rustc_codegen_ssa` happy when compiling rustc for MIPSr6 target or hosts.
`thorin` is a Rust implementation of a DWARF packaging utility that
supports reading DWARF objects from archive files (i.e. rlibs) and
therefore is better suited for integration into rustc.
Signed-off-by: David Wood <david.wood@huawei.com>
In #79570, `-Z split-dwarf-kind={none,single,split}` was replaced by `-C
split-debuginfo={off,packed,unpacked}`. `-C split-debuginfo`'s packed
and unpacked aren't exact parallels to single and split, respectively.
On Unix, `-C split-debuginfo=packed` will put debuginfo into object
files and package debuginfo into a DWARF package file (`.dwp`) and
`-C split-debuginfo=unpacked` will put debuginfo into dwarf object files
and won't package it.
In the initial implementation of Split DWARF, split mode wrote sections
which did not require relocation into a DWARF object (`.dwo`) file which
was ignored by the linker and then packaged those DWARF objects into
DWARF packages (`.dwp`). In single mode, sections which did not require
relocation were written into object files but ignored by the linker and
were not packaged. However, both split and single modes could be
packaged or not, the primary difference in behaviour was where the
debuginfo sections that did not require link-time relocation were
written (in a DWARF object or the object file).
This commit re-introduces a `-Z split-dwarf-kind` flag, which can be
used to pick between split and single modes when `-C split-debuginfo` is
used to enable Split DWARF (either packed or unpacked).
Signed-off-by: David Wood <david.wood@huawei.com>
Actually set IMAGE_SCN_LNK_REMOVE for .rmeta
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Remove `NullOp::Box`
Follow up of #89030 and MCP rust-lang/compiler-team#460.
~1 month later nothing seems to be broken, apart from a small regression that #89332 (1aac85bb716c09304b313d69d30d74fe7e8e1a8e) shows could be regained by remvoing the diverging path, so it shall be safe to continue and remove `NullOp::Box` completely.
r? `@jonas-schievink`
`@rustbot` label T-compiler
Continue supporting -Z instrument-coverage for compatibility for now,
but show a deprecation warning for it.
Update uses and documentation to use the -C option.
Move the documentation from the unstable book to stable rustc
documentation.
Mark drop calls in landing pads `cold` instead of `noinline`
Now that deferred inlining has been disabled in LLVM (#92110), this shouldn't cause catastrophic size blowup.
I confirmed that the test cases from https://github.com/rust-lang/rust/issues/41696#issuecomment-298696944 still compile quickly (<1s) after this change. ~Although note that I wasn't able to reproduce the original issue using a recent rustc/llvm with deferred inlining enabled, so those tests may no longer be representative. I was also unable to create a modified test case that reproduced the original issue.~ (edit: I reproduced it on CI by accident--the first commit timed out on the LLVM 12 builder, because I forgot to make it conditional on LLVM version)
r? `@nagisa`
cc `@arielb1` (this effectively reverts #42771 "mark calls in the unwind path as !noinline")
cc `@RalfJung` (fixes#46515)
edit: also fixes#87055
Allow loading LLVM plugins with both legacy and new pass manager
Opening a draft PR to get feedback and start discussion on this feature. There is already a codegen option `passes` which allow giving a list of LLVM pass names, however we currently can't use a LLVM pass plugin (as described here : https://llvm.org/docs/WritingAnLLVMPass.html), the only available passes are the LLVM built-in ones.
The proposed modification would be to add another codegen option `pass-plugins`, which can be set with a list of paths to shared library files. These libraries are loaded using the LLVM function `PassPlugin::Load`, which calls the expected symbol `lvmGetPassPluginInfo`, and register the pipeline parsing and optimization callbacks.
An example usage with a single plugin and 3 passes would look like this in the `.cargo/config`:
```toml
rustflags = [
"-C", "pass-plugins=/tmp/libLLVMPassPlugin",
"-C", "passes=pass1 pass2 pass3",
]
```
This would give the same functionality as the opt LLVM tool directly integrated in rust build system.
Additionally, we can also not specify the `passes` option, and use a plugin which inserts passes in the optimization pipeline, as one could do using clang.
The `AggregateKind` enum ends up in the final mir `Body`. Currently,
any changes to `AdtDef` (regardless of how significant they are)
will legitimately cause the overall result of `optimized_mir` to change,
invalidating any codegen re-use involving that mir.
This will get worse once we start hashing the `Span` inside `FieldDef`
(which is itself contained in `AdtDef`).
To try to reduce these kinds of invalidations, this commit changes
`AggregateKind::Adt` to store just the `DefId`, instead of the full
`AdtDef`. This allows the result of `optimized_mir` to be unchanged
if the `AdtDef` changes in a way that doesn't actually affect any
of the MIR we build.
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Explicitly set no ELF flags for .rustc section
For a data section, the object crate will set the SHF_ALLOC by default, which is exactly what we don't want. Explicitly set sh_flags to zero to avoid this.
I checked with `objdump -h` that this produces the right flags for ELF.
Fixes#92013.
Remove `SymbolStr`
This was originally proposed in https://github.com/rust-lang/rust/pull/74554#discussion_r466203544. As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences.
Best reviewed one commit at a time.
r? `@oli-obk`
Rollup of 7 pull requests
Successful merges:
- #91566 (Apply path remapping to DW_AT_GNU_dwo_name when producing split DWARF)
- #91926 (Remove `in_band_lifetimes` from `rustc_metadata`)
- #91931 (Remove `in_band_lifetimes` from `rustc_codegen_llvm`)
- #92024 (rustc_codegen_llvm: Give each codegen unit a unique DWARF name on all platforms, not just Apple ones.)
- #92037 (Use a const ParamEnv when in default_method_body_is_const)
- #92047 (Set `RUST_BACKTRACE=0` when running location-detail tests)
- #92050 (Add a space and 2 grave accents )
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
For a data section, the object crate will set the SHF_ALLOC by
default, which is exactly what we don't want. Explicitly set
sh_flags to zero to avoid this.
Apply path remapping to DW_AT_GNU_dwo_name when producing split DWARF
`--remap-path-prefix` doesn't apply to paths to `.o` (in case of packed) or `.dwo` (in case of unpacked) files in `DW_AT_GNU_dwo_name`. GCC also has this bug https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91888
DF_ORIGIN flag signifies that the object being loaded may make reference to the $ORIGIN substitution string.
Some implementations are just ignoring DF_ORIGIN and do substitution for $ORIGIN if present (whatever DF_ORIGIN pr
Set the flag inconditionally if rpath is wanted.
Remove `in_band_lifetimes` from `rustc_codegen_ssa`
See #91867 for more information.
In `compiler/rustc_codegen_ssa/src/coverageinfo/map.rs`, there are several functions with an explicit `'a` lifetime but only a single `&'a self` parameter. These lifetimes should be redundant given lifetime elision, unless the existential `impl Iterator` has weird issues regarding that. Should the redundant lifetimes be removed?
The resulting profile will include the crate name and will be stored in
the `--out-dir` directory.
This implementation makes it convenient to use LLVM time trace together
with cargo, in the contrast to the previous implementation which would
overwrite profiles or store them in `.cargo/registry/..`.
asm: Allow using r9 (ARM) and x18 (AArch64) if they are not reserved by the current target
This supersedes https://github.com/rust-lang/rust/pull/88879.
cc `@Skirmisher`
r? `@joshtriplett`
Remove the reg_thumb register class for asm! on ARM
Also restricts r8-r14 from being used on Thumb1 targets as per #90736.
cc ``@Lokathor``
r? ``@joshtriplett``
Use object crate for .rustc metadata generation
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjusts the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not `fix` #90326 by itself yet, as .llvmbc will need to be
handled separately.
r? `@nagisa`
We already use the object crate for generating uncompressed .rmeta
metadata object files. This switches the generation of compressed
.rustc object files to use the object crate as well. These have
slightly different requirements in that .rmeta should be completely
excluded from any final compilation artifacts, while .rustc should
be part of shared objects, but not loaded into memory.
The primary motivation for this change is #90326: In LLVM 14, the
current way of setting section flags (and in particular, preventing
the setting of SHF_ALLOC) will no longer work. There are other ways
we could work around this, but switching to the object crate seems
like the most elegant, as we already use it for .rmeta, and as it
makes this independent of the codegen backend. In particular, we
don't need separate handling in codegen_llvm and codegen_gcc.
codegen_cranelift should be able to reuse the implementation as
well, though I have omitted that here, as it is not based on
codegen_ssa.
This change mostly extracts the existing code for .rmeta handling
to allow using it for .rustc as well, and adjust the codegen
infrastructure to handle the metadata object file separately: We
no longer create a backend-specific module for it, and directly
produce the compiled module instead.
This does not fix#90326 by itself yet, as .llvmbc will need to be
handled separately.
...because alignment is always nonzero.
This helps eliminate redundant runtime alignment checks, when a DST
is a field of a struct whose remaining fields have alignment 1.
Add support for LLVM coverage mapping format versions 5 and 6
This PR cherry-pick's Swatinem's initial commit in unsubmitted PR #90047.
My additional commit augments Swatinem's great starting point, but adds full support for LLVM
Coverage Mapping Format version 6, conditionally, if compiling with LLVM 13.
Version 6 requires adding the compilation directory when file paths are
relative, and since Rustc coverage maps use relative paths, we should
add the expected compilation directory entry.
Note, however, that with the compilation directory, coverage reports
from `llvm-cov show` can now report file names (when the report includes
more than one file) with the full absolute path to the file.
This would be a problem for test results, but the workaround (for the
rust coverage tests) is to include an additional `llvm-cov show`
parameter: `--compilation-dir=.`
Remove workaround for the forward progress handling in LLVM
this workaround was only needed for LLVM < 12 and the minimum LLVM version was updated to 12 in #90175
Fix ld64 flags
- The `-exported_symbols_list` argument appears to be malformed for `ld64` (if you are not going through `clang`).
- The `-dynamiclib` argument isn't support for `ld64`. It should be guarded behind a compiler flag.
These problems are fixed by these changes. I have also refactored the way linker arguments are generated to be ld/compiler agnostic and therefore less error prone.
These changes are necessary to support cross-compilation to darwin targets.
Leave -Z strip available temporarily as an alias, to avoid breaking
cargo until cargo transitions to using -C strip. (If the user passes
both, the -C version wins.)
This commit refactors linker argument generation to leverage a helper
function that abstracts away details governing how these arguments are
transformed and provided to the linker.
This fixes the misuse of the `-exported_symbols_list` when an ld-like
linker is used rather than a compiler. A compiler would expect
`-Wl,-exported_symbols_list,path` but ld would expect
`-exported_symbols_list` and `path` as two seperate arguments. Prior
to this change, an ld-like linker was given
`-exported_symbols_list,path`.
Linker arguments must transformed when Rust is interacting with the
linker through a compiler. This commit introduces a helper function
that abstracts away details of this transformation.
Record more artifact sizes during self-profiling.
This PR adds artifact size recording for
- "linked artifacts" (executables, RLIBs, dylibs, static libs)
- object files
- dwo files
- assembly files
- crate metadata
- LLVM bitcode files
- LLVM IR files
- codegen unit size estimates
Currently the identifiers emitted for these are hard-coded as string literals. Is it worth adding constants to https://github.com/rust-lang/measureme/blob/master/measureme/src/rustc.rs instead? We don't do that for query names and the like -- but artifact kinds might be more stable than query names.
enable `dotprod` target feature in arm
To implement `vdot` neon insturction in stdarch, we need to enable `dotprod` target feature in arm in rustc.
r? `@Amanieu`
Don't abort compilation after giving a lint error
The only reason to use `abort_if_errors` is when the program is so broken that either:
1. later passes get confused and ICE
2. any diagnostics from later passes would be noise
This is never the case for lints, because the compiler has to be able to deal with `allow`-ed lints.
So it can continue to lint and compile even if there are lint errors.
Closes https://github.com/rust-lang/rust/issues/82761. This is a WIP because I have a feeling it will exit with 0 even if there were lint errors; I don't have a computer that can build rustc locally at the moment.
The only reason to use `abort_if_errors` is when the program is so broken that either:
1. later passes get confused and ICE
2. any diagnostics from later passes would be noise
This is never the case for lints, because the compiler has to be able to deal with `allow`-ed lints.
So it can continue to lint and compile even if there are lint errors.
In https://reviews.llvm.org/D71059 LLVM 11, the time trace profiler was
extended to support multiple threads.
`timeTraceProfilerInitialize` creates a thread local profiler instance.
When a thread finishes `timeTraceProfilerFinishThread` moves a thread
local instance into a global collection of instances. Finally when all
codegen work is complete `timeTraceProfilerWrite` writes data from the
current thread local instance and the instances in global collection
of instances.
Previously, the profiler was intialized on a single thread only. Since
this thread performs no code generation on its own, the resulting
profile was empty.
Update LLVM codegen to initialize & finish time trace profiler on each
code generation thread.
Add LLVM CFI support to the Rust compiler
This PR adds LLVM Control Flow Integrity (CFI) support to the Rust compiler. It initially provides forward-edge control flow protection for Rust-compiled code only by aggregating function pointers in groups identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by defining and using compatible type identifiers (see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e., -Clto).
Thank you, `@eddyb` and `@pcc,` for all the help!
Properly check `target_features` not to trigger an assertion
Fixes#89875
I think it should be a condition instead of an assertion to check if it's a register as it's possible that `reg` is a register class.
Also, this isn't related to the issue directly, but `is_target_supported` doesn't check `target_features` attributes. Is there any way to check it on rustc_codegen_llvm?
r? `@Amanieu`
Avoid a branch on key being local for queries that use the same local and extern providers
Currently based on https://github.com/rust-lang/rust/pull/85810 as it slightly conflicts with it. Only the last two commits are new.
This commit adds LLVM Control Flow Integrity (CFI) support to the Rust
compiler. It initially provides forward-edge control flow protection for
Rust-compiled code only by aggregating function pointers in groups
identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by defining and using compatible type identifiers
(see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e.,
-Clto).
Erase late-bound regions before computing vtable debuginfo name.
Fixes#90019.
The `msvc_enum_fallback()` for computing enum type names needs to access the memory layout of niche enums in order to determine the type name. `compute_debuginfo_vtable_name()` did not properly erase regions before computing type names which made memory layout computation ICE when encountering un-erased regions.
r? `@wesleywiser`
This performs a substitution of code following the pattern:
let <id> = if let <pat> = ... { identity } else { ... : ! };
To simplify it to:
let <pat> = ... { identity } else { ... : ! };
By adopting the let_else feature.
Create more accurate debuginfo for vtables.
Before this PR all vtables would have the same name (`"vtable"`) in debuginfo. Now they get an unambiguous name that identifies the implementing type and the trait that is being implemented.
This is only one of several possible improvements:
- This PR describes vtables as arrays of `*const u8` pointers. It would nice to describe them as structs where function pointer is represented by a field with a name indicative of the method it maps to. However, this requires coming up with a naming scheme that avoids clashes between methods with the same name (which is possible if the vtable contains multiple traits).
- The PR does not update the debuginfo we generate for the vtable-pointer field in a fat `dyn` pointer. Right now there does not seem to be an easy way of getting ahold of a vtable-layout without also knowing the concrete self-type of a trait object.
r? `@wesleywiser`
rustc_driver: Enable the `WARN` log level by default
This commit changes the `tracing_subscriber` initialization in
`rustc_driver` so that the `WARN` verbosity level is enabled by default
when the `RUSTC_LOG` env variable is empty. If the `RUSTC_LOG` env
variable is set, the filter string in the environment variable is
honored, instead.
Fixes#76824Closes#89623
cc ``@eddyb,`` ``@oli-obk``
Turn vtable_allocation() into a query
This PR removes the untracked vtable-const-allocation cache from the `tcx` and turns the `vtable_allocation()` method into a query.
The change is pretty straightforward and should be backportable without too much effort.
Fixes https://github.com/rust-lang/rust/issues/89598.
Before this commit all vtables would have the same name "vtable" in
debuginfo. Now they get a name that identifies the implementing type
and the trait that is being implemented.
On macOS, make strip="symbols" not pass any options to strip
This makes the output with `strip="symbols"` match the result of just
calling `strip` on the output binary, minimizing the size of the binary.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdoFixes#64892.
Introduce `Rvalue::ShallowInitBox`
Polished version of #88700.
Implements MCP rust-lang/compiler-team#460, and should allow #43596 to go forward.
In short, creating an empty box is split from a nullary-op `NullOp::Box` into two steps, first a call to `exchange_malloc`, then a `Rvalue::ShallowInitBox` which transmutes `*mut u8` to a shallow-initialized `Box<T>`. This allows the `exchange_malloc` call to unwind. Details can be found in the MCP.
`NullOp::Box` is not yet removed, purely to make reverting easier in case anything goes wrong as the result of this PR. If revert is needed a reversion of "Use Rvalue::ShallowInitBox for box expression" commit followed by a test bless should be sufficient.
Experiments in #88700 showed a very slight compile-time perf regression due to (supposedly) slightly more time spent in LLVM. We could omit unwind edge generation (in non-`oom=panic` case) in box expression MIR construction to restore perf; but I don't think it's necessary since runtime perf isn't affected and perf difference is rather small.
This PR allows applying a `#[track_caller]` attribute to a
closure/generator expression. The attribute as interpreted as applying
to the compiler-generated implementation of the corresponding trait
method (`FnOnce::call_once`, `FnMut::call_mut`, `Fn::call`, or
`Generator::resume`).
This feature does not have its own feature gate - however, it requires
`#![feature(stmt_expr_attributes)]` in order to actually apply
an attribute to a closure or generator.
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needeed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnOnce::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information throgh a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[tracked_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller` location, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't a big deal.
Fix debuginfo for parameters passed via the ScalarPair abi on Windows
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used
in #81898.
Fixes#88625
Querify `FnAbi::of_{fn_ptr,instance}` as `fn_abi_of_{fn_ptr,instance}`.
*Note: opening this PR as draft because it's based on #88499*
This more or less replicates the `LayoutOf::layout_of` setup from #88499, to replace `FnAbi::of_{fn_ptr,instance}` with `FnAbiOf::fn_abi_of_{fn_ptr,instance}`, and also route them through queries (which `layout_of` has used for a while).
The two changes at the use sites (other than the names) are:
* return type is now wrapped in `&'tcx`
* the value *is* interned, which may affect performance
* the `extra_args` list is now an interned `&'tcx ty::List<Ty<'tcx>>`
* should be cheap (it's empty for anything other than C variadics)
Theoretically, a `FnAbiOfHelpers` implementer could choose to keep the `Result<...>` instead of eagerly erroring, but the only existing users of these APIs are codegen backends, so they don't (want to) take advantage of this.
At least miri could make use of this, since it prefers propagating errors (it "just" doesn't use `FnAbi` yet - cc `@RalfJung).`
The way this is done is probably less efficient than what is possible, because the queries handle the correctness-oriented API (i.e. the split into `fn` pointers vs instances), whereas a lower-level query could end up with more reuse between different instances with identical signatures.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Couple of changes to FileSearch and SearchPath
* Turn a couple of regular comments into doc comments
* Move `get_tools_search_paths` from `FileSearch` to `Session`
* Use Lrc instead of Option to avoid duplication of a `SearchPath`
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used in
PR #81898.
Introduce NullOp::AlignOf
This PR introduces `Rvalue::NullaryOp(NullOp::AlignOf, ty)`, which will be lowered from `align_of`, similar to `size_of` lowering to `Rvalue::NullaryOp(NullOp::SizeOf, ty)`.
The changes are originally part of #88700 but since it's not dependent on other changes and could have performance impact on its own, it's separated into its own PR.
Move *_max methods back to util
change to inline instead of inline(always)
Remove valid_range_exclusive from scalar
Use WrappingRange instead
implement always_valid_for in a safer way
Fix accidental edit
Fix handling of +whole-archive native link modifier.
This PR fixes a bug in `add_upstream_native_libraries` that led to the `+whole-archive` modifier being ignored when linking in native libs.
~~Note that the PR does not address the situation when `+whole-archive` is combined with `+bundle`.~~
`@wesleywiser's` commit adds validation code that turns combining `+whole-archive` with `+bundle` into an error.
Fixes https://github.com/rust-lang/rust/issues/88085.
r? `@petrochenkov`
cc `@wesleywiser` `@gcoakes`
Provide `layout_of` automatically (given tcx + param_env + error handling).
After #88337, there's no longer any uses of `LayoutOf` within `rustc_target` itself, so I realized I could move the trait to `rustc_middle::ty::layout` and redesign it a bit.
This is similar to #88338 (and supersedes it), but at no ergonomic loss, since there's no funky `C: LayoutOf<Ty = Ty>` -> `Ty: TyAbiInterface<C>` generic `impl` chain, and each `LayoutOf` still corresponds to one `impl` (of `LayoutOfHelpers`) for the specific context.
After this PR, this is what's needed to get `trait LayoutOf` (with the `layout_of` method) implemented on some context type:
* `TyCtxt`, via `HasTyCtxt`
* `ParamEnv`, via `HasParamEnv`
* a way to transform `LayoutError`s into the desired error type
* an error type of `!` can be paired with having `cx.layout_of(...)` return `TyAndLayout` *without* `Result<...>` around it, such as used by codegen
* this is done through a new `LayoutOfHelpers` trait (and so is specifying the type of `cx.layout_of(...)`)
When going through this path (and not bypassing it with a manual `impl` of `LayoutOf`), the end result is that only the error case can be customized, the query itself and the success paths are guaranteed to be uniform.
(**EDIT**: just noticed that because of the supertrait relationship, you cannot actually implement `LayoutOf` yourself, the blanket `impl` fully covers all possible context types that could ever implement it)
Part of the motivation for this shape of API is that I've been working on querifying `FnAbi::of_*`, and what I want/need to introduce for that looks a lot like the setup in this PR - in particular, it's harder to express the `FnAbi` methods in `rustc_target`, since they're much more tied to `rustc` concepts.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Include debug info for the allocator shim
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
rustc_target: `TyAndLayout::field` should never error.
This refactor (making `TyAndLayout::field` return `TyAndLayout` without any `Result` around it) is based on a simple observation, regarding `TyAndLayout::field`:
If `cx.layout_of(ty)` succeeds (for some `cx` and `ty`), then `.field(cx, i)` on the resulting `TyAndLayout` should *always* succeed in computing `cx.layout_of(field_ty)` (where `field_ty` is the type of the `i`th field of `ty`).
The reason for this is that no matter which field is chosen, `cx.layout_of(field_ty)` *will have already been computed*, as part of computing `cx.layout_of(ty)`, as we cannot determine the layout of *any* type without considering the layouts of *all* of its fields.
And so it should be fine to turn any errors into ICEs, since they likely indicate a `cx` mismatch, or some other edge case that is due to a compiler bug (as opposed to ever being an user-facing error).
<hr/>
Each commit should probably be reviewed separately, though note that there's some `where` clauses (in `rustc_target::abi::call::*`) that change in most commits.
cc `@nagisa` `@oli-obk`
Make `-Z gcc-ld=lld` work for Apple targets
`-Z gcc-ld=lld` was introduced in #85961. It does not work on Macos because lld needs be either named `ld64` or passed `-flavor darwin` as the first two arguments in order to select the Mach-O flavor. Rust invokes cc (=clang) on Macos for linking which calls `ld` as linker binary and not `ld64`, so just creating an `ld64` binary and modifying the search path with `-B` does not work.
In order to solve this patch does:
* Set the `lld_flavor` for all Apple-derived targets to `LldFlavor::Ld64`. As far as I can see this actually works towards fixing `-Xlinker=rust-lld` as all those targets use the Mach-O object format.
* Copy/hardlink rust-lld to the gcc-ld subdirectory as ld64 next to ld.
* If `-Z gcc-ld=lld` is used and the target lld flavor is Ld64 add `-fuse-ld=/path/to/ld64` to the linker invocation.
Fixes#86945.
Adjust linking order of static nobundle libraries
Link the static libraries with "-bundle" modifier from upstream rust crate right after linking this rust crate.
Some linker such as GNU linker `ld.bdf` treat order of linking as order of dependency.
After this change, static libraries with "-bundle" modifier is linked in the same order as "+bundle" modifier.
So we can change the value of "bundle" modifier without causing linking error.
fix: https://github.com/rust-lang/rust/issues/87541
r? `@petrochenkov`
Link the static libraries with "-bundle" modifier from upstream rust crate
right after linking this rust crate. Some linker such as GNU linker
`ld.bdf` treat order of linking as order of dependency. After this change,
static libraries with "-bundle" modifier is linked in the same order as
"+bundle" modifier. So we can change the value of "bundle" modifier without
causing linking error.
Use custom wrap-around type instead of RangeInclusive
Two reasons:
1. More memory is allocated than necessary for `valid_range` in `Scalar`. The range is not used as an iterator and `exhausted` is never used.
2. `contains`, `count` etc. methods in `RangeInclusive` are doing very unhelpful(and dangerous!) things when used as a wrap-around range. - In general this PR wants to limit potentially confusing methods, that have a low probability of working.
Doing a local perf run, every metric shows improvement except for instructions.
Max-rss seem to have a very consistent improvement.
Sorry - newbie here, probably doing something wrong.
Trait upcasting coercion (part 3)
By using separate candidates for each possible choice, this fixes type-checking issues in previous commits.
r? `@nikomatsakis`
Upgrade to LLVM 13
Work in progress update to LLVM 13. Main changes:
* InlineAsm diagnostics reported using SrcMgr diagnostic kind are now handled. Previously these used a separate diag handler.
* Codegen tests are updated for additional attributes.
* Some data layouts have changed.
* Switch `#[used]` attribute from `llvm.used` to `llvm.compiler.used` to avoid SHF_GNU_RETAIN flag introduced in https://reviews.llvm.org/D97448, which appears to trigger a bug in older versions of gold.
* Set `LLVM_INCLUDE_TESTS=OFF` to avoid Python 3.6 requirement.
Upstream issues:
* ~~https://bugs.llvm.org/show_bug.cgi?id=51210 (InlineAsm diagnostic reporting for module asm)~~ Fixed by 1558bb80c0.
* ~~https://bugs.llvm.org/show_bug.cgi?id=51476 (Miscompile on AArch64 due to incorrect comparison elimination)~~ Fixed by 81b106584f.
* https://bugs.llvm.org/show_bug.cgi?id=51207 (Can't set custom section flags anymore). Problematic change reverted in our fork, https://reviews.llvm.org/D107216 posted for upstream revert.
* https://bugs.llvm.org/show_bug.cgi?id=51211 (Regression in codegen for #83623). This is an optimization regression that we may likely have to eat for this release. The fix for #83623 was based on an incorrect premise, and this needs to be properly addressed in the MergeICmps pass.
The [compile-time impact](https://perf.rust-lang.org/compare.html?start=ef9549b6c0efb7525c9b012148689c8d070f9bc0&end=0983094463497eec22d550dad25576a894687002) is mixed, but quite positive as LLVM upgrades go.
The LLVM 13 final release is scheduled for Sep 21st. The current nightly is scheduled for stable release on Oct 21st.
r? `@ghost`
This commit updates the backtrace crate in libstd now that dependencies
have been updated to use `memchr` from the standard library as well.
This is mostly just making sure deps are up-to-date and have all the
latest-and-greatest fixes and such.
Closesrust-lang/backtrace-rs#432
Rather than relying on `getPointerElementType()` from LLVM function
pointers, we now pass the function type explicitly when building `call`
or `invoke` instructions.
Allow more "unknown argument" strings from linker
Some toolchains emit slightly different errors, e.g.
ppc-vle-gcc: error: unrecognized option '-no-pie'
Remove the aarch64 `crypto` target_feature
The subfeatures `aes` or `sha2` should be used instead.
This can't yet be done for ARM targets as some LLVM intrinsics still require `crypto`.
Also update the runtime feature detection tests in `library/std` to mirror the updates in `stdarch`. This also helps https://github.com/rust-lang/rust/issues/86941
r? ``@Amanieu``
Trait upcasting coercion (part2)
This is the second part of trait upcasting coercion implementation.
Currently this is blocked on #86264 .
The third part might be implemented using unsafety checking
r? `@bjorn3`
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
[debuginfo] Emit associated type bindings in trait object type names.
This PR updates debuginfo type name generation for trait objects to include associated type bindings and auto trait bounds -- so that, for example, the debuginfo type name of `&dyn Iterator<Item=Foo>` and `&dyn Iterator<Item=Bar>` don't both map to just `&dyn Iterator` anymore.
The following table shows examples of debuginfo type names before and after the PR:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `&dyn Iterator` | `&dyn Iterator<Item=u32>` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `&dyn Iterator` | `&(dyn Iterator<Item=u32> + Sync)` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `&dyn SomeTrait<bool, i8>` | `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` |
For targets that need C++-like type names, we use `assoc$<Item,u32>` instead of `Item=u32`:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> > > >` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> >,Sync> >` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `ref$<dyn$<SomeTrait<bool, i8> > >` | `ref$<dyn$<SomeTrait<bool,i8,assoc$<Bar,u32> > >,Send> >` |
The PR also adds self-profiling measurements for debuginfo type name generation (re. https://github.com/rust-lang/rust/issues/86431). It looks like the compiler spends up to 0.5% of its time in that task, so the potential for optimizing it via caching seems limited.
However, the perf run also shows [the biggest regression](https://perf.rust-lang.org/detailed-query.html?commit=585e91c718b0b2c5319e1fffd0ff1e62aaf7ccc2&base_commit=b9197978a90be6f7570741eabe2da175fec75375&benchmark=tokio-webpush-simple-debug&run_name=incr-unchanged) in a test case that does not even invoke the code in question. This suggests that the length of the names we generate here can affect performance by influencing how much data the linker has to copy around.
Fixes https://github.com/rust-lang/rust/issues/86134.
Don't use gc-sections with profile-generate.
When building with profile-generate don't call gc_sections as this can
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.
#78226
r? `@Mark-Simulacrum`
Remove nondeterminism in multiple-definitions test
Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function. Restore the multiple-definitions test.
Resolves#87084.
CTFE/Miri engine Pointer type overhaul
This fixes the long-standing problem that we are using `Scalar` as a type to represent pointers that might be integer values (since they point to a ZST). The main problem is that with int-to-ptr casts, there are multiple ways to represent the same pointer as a `Scalar` and it is unclear if "normalization" (i.e., the cast) already happened or not. This leads to ugly methods like `force_mplace_ptr` and `force_op_ptr`.
Another problem this solves is that in Miri, it would make a lot more sense to have the `Pointer::offset` field represent the full absolute address (instead of being relative to the `AllocId`). This means we can do ptr-to-int casts without access to any machine state, and it means that the overflow checks on pointer arithmetic are (finally!) accurate.
To solve this, the `Pointer` type is made entirely parametric over the provenance, so that we can use `Pointer<AllocId>` inside `Scalar` but use `Pointer<Option<AllocId>>` when accessing memory (where `None` represents the case that we could not figure out an `AllocId`; in that case the `offset` is an absolute address). Moreover, the `Provenance` trait determines if a pointer with a given provenance can be cast to an integer by simply dropping the provenance.
I hope this can be read commit-by-commit, but the first commit does the bulk of the work. It introduces some FIXMEs that are resolved later.
Fixes https://github.com/rust-lang/miri/issues/841
Miri PR: https://github.com/rust-lang/miri/pull/1851
r? `@oli-obk`
Handle non-integer const generic parameters in debuginfo type names.
This PR fixes an ICE introduced by https://github.com/rust-lang/rust/pull/85269 which started emitting const generic arguments for debuginfo names but did not cover the case where such an argument could not be evaluated to a flat string of bits.
The fix implemented in this PR is very basic: If `try_eval_bits()` fails for the constant in question, we fall back to generating a stable hash of the constant and emit that instead. This way we get a (virtually) unique name and side step the problem of generating a string representation of a potentially complex value.
The downside is that the generated name will be rather opaque. E.g. the regression test adds a function `const_generic_fn_non_int<()>` which is then rendered as `const_generic_fn_non_int<{CONST#fe3cfa0214ac55c7}>`. I think it's an open question how to deal with this more gracefully.
I'd be interested in ideas on how to do this better.
r? `@wesleywiser`
cc `@dpaoliello` (do you see any problems with this approach?)
cc `@Mark-Simulacrum` & `@nagisa` (who I've seen comment on debuginfo issues recently -- anyone else?)
Fixes https://github.com/rust-lang/rust/issues/86893
When building with profile-generate request that metadata is kept
during the gc_sections call, as this can sometimes strip out profile
data.
This missing information in the prof files can then result in missing
functions when using the profile information.
Improve opaque pointers support
Opaque pointers are coming, and rustc is not ready.
This adds partial support by passing an explicit load type to LLVM. Two issues I've encountered:
* The necessary type was not available at the point where non-temporal copies were generated. I've pushed the code for that upwards out of the memcpy implementation and moved the position of a cast to make do with the types we have available. (I'm not sure that cast is needed at all, but have retained it in the interest of conservativeness.)
* The `PlaceRef::project_deref()` function used during debuginfo generation seems to be buggy in some way -- though I haven't figured out specifically what it does wrong. Replacing it with `load_operand().deref()` did the trick, but I don't really know what I'm doing here.
Simply shift the bitcast from the store to the load, so that
we can use the destination type. I'm not sure the bitcast is
really necessary, but keeping it for now.
I'm not really sure what is wrong here, but I was getting load
type mismatches in the debuginfo code (which is the only place
using this function).
Replacing the project_deref() implementation with a generic
load_operand + deref did the trick.
This makes load generation compatible with opaque pointers.
The generation of nontemporal copies still accesses the pointer
element type, as fixing this requires more movement.
Refactor linker code
This merges `LinkerInfo` into `CrateInfo` as there is no reason to keep them separate. `LinkerInfo::to_linker` is merged into `get_linker` as both have different logic for each linker type and `to_linker` is directly called after `get_linker`. Also contains a couple of small cleanups.
See the individual commits for all changes.