Add LLVM CFI support to the Rust compiler
This PR adds LLVM Control Flow Integrity (CFI) support to the Rust compiler. It initially provides forward-edge control flow protection for Rust-compiled code only by aggregating function pointers in groups identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by defining and using compatible type identifiers (see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e., -Clto).
Thank you, `@eddyb` and `@pcc,` for all the help!
Properly check `target_features` not to trigger an assertion
Fixes#89875
I think it should be a condition instead of an assertion to check if it's a register as it's possible that `reg` is a register class.
Also, this isn't related to the issue directly, but `is_target_supported` doesn't check `target_features` attributes. Is there any way to check it on rustc_codegen_llvm?
r? `@Amanieu`
Avoid a branch on key being local for queries that use the same local and extern providers
Currently based on https://github.com/rust-lang/rust/pull/85810 as it slightly conflicts with it. Only the last two commits are new.
This commit adds LLVM Control Flow Integrity (CFI) support to the Rust
compiler. It initially provides forward-edge control flow protection for
Rust-compiled code only by aggregating function pointers in groups
identified by their number of arguments.
Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by defining and using compatible type identifiers
(see Type metadata in the design document in the tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e.,
-Clto).
Erase late-bound regions before computing vtable debuginfo name.
Fixes#90019.
The `msvc_enum_fallback()` for computing enum type names needs to access the memory layout of niche enums in order to determine the type name. `compute_debuginfo_vtable_name()` did not properly erase regions before computing type names which made memory layout computation ICE when encountering un-erased regions.
r? `@wesleywiser`
This performs a substitution of code following the pattern:
let <id> = if let <pat> = ... { identity } else { ... : ! };
To simplify it to:
let <pat> = ... { identity } else { ... : ! };
By adopting the let_else feature.
Create more accurate debuginfo for vtables.
Before this PR all vtables would have the same name (`"vtable"`) in debuginfo. Now they get an unambiguous name that identifies the implementing type and the trait that is being implemented.
This is only one of several possible improvements:
- This PR describes vtables as arrays of `*const u8` pointers. It would nice to describe them as structs where function pointer is represented by a field with a name indicative of the method it maps to. However, this requires coming up with a naming scheme that avoids clashes between methods with the same name (which is possible if the vtable contains multiple traits).
- The PR does not update the debuginfo we generate for the vtable-pointer field in a fat `dyn` pointer. Right now there does not seem to be an easy way of getting ahold of a vtable-layout without also knowing the concrete self-type of a trait object.
r? `@wesleywiser`
rustc_driver: Enable the `WARN` log level by default
This commit changes the `tracing_subscriber` initialization in
`rustc_driver` so that the `WARN` verbosity level is enabled by default
when the `RUSTC_LOG` env variable is empty. If the `RUSTC_LOG` env
variable is set, the filter string in the environment variable is
honored, instead.
Fixes#76824Closes#89623
cc ``@eddyb,`` ``@oli-obk``
Turn vtable_allocation() into a query
This PR removes the untracked vtable-const-allocation cache from the `tcx` and turns the `vtable_allocation()` method into a query.
The change is pretty straightforward and should be backportable without too much effort.
Fixes https://github.com/rust-lang/rust/issues/89598.
Before this commit all vtables would have the same name "vtable" in
debuginfo. Now they get a name that identifies the implementing type
and the trait that is being implemented.
On macOS, make strip="symbols" not pass any options to strip
This makes the output with `strip="symbols"` match the result of just
calling `strip` on the output binary, minimizing the size of the binary.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdoFixes#64892.
Introduce `Rvalue::ShallowInitBox`
Polished version of #88700.
Implements MCP rust-lang/compiler-team#460, and should allow #43596 to go forward.
In short, creating an empty box is split from a nullary-op `NullOp::Box` into two steps, first a call to `exchange_malloc`, then a `Rvalue::ShallowInitBox` which transmutes `*mut u8` to a shallow-initialized `Box<T>`. This allows the `exchange_malloc` call to unwind. Details can be found in the MCP.
`NullOp::Box` is not yet removed, purely to make reverting easier in case anything goes wrong as the result of this PR. If revert is needed a reversion of "Use Rvalue::ShallowInitBox for box expression" commit followed by a test bless should be sufficient.
Experiments in #88700 showed a very slight compile-time perf regression due to (supposedly) slightly more time spent in LLVM. We could omit unwind edge generation (in non-`oom=panic` case) in box expression MIR construction to restore perf; but I don't think it's necessary since runtime perf isn't affected and perf difference is rather small.
This PR allows applying a `#[track_caller]` attribute to a
closure/generator expression. The attribute as interpreted as applying
to the compiler-generated implementation of the corresponding trait
method (`FnOnce::call_once`, `FnMut::call_mut`, `Fn::call`, or
`Generator::resume`).
This feature does not have its own feature gate - however, it requires
`#![feature(stmt_expr_attributes)]` in order to actually apply
an attribute to a closure or generator.
This is implemented in the same way as for functions - an extra
location argument is appended to the end of the ABI. For closures,
this argument is *not* part of the 'tupled' argument storing the
parameters - the final closure argument for `#[track_caller]` closures
is no longer a tuple.
For direct (monomorphized) calls, the necessary support was already
implemented - we just needeed to adjust some assertions around checking
the ABI and argument count to take closures into account.
For calls through a trait object, more work was needed.
When creating a `ReifyShim`, we need to create a shim
for the trait method (e.g. `FnOnce::call_mut`) - unlike normal
functions, closures are never invoked directly, and always go through a
trait method.
Additional handling was needed for `InstanceDef::ClosureOnceShim`. In
order to pass location information throgh a direct (monomorphized) call
to `FnOnce::call_once` on an `FnMut` closure, we need to make
`ClosureOnceShim` aware of `#[tracked_caller]`. A new field
`track_caller` is added to `ClosureOnceShim` - this is used by
`InstanceDef::requires_caller` location, allowing codegen to
pass through the extra location argument.
Since `ClosureOnceShim.track_caller` is only used by codegen,
we end up generating two identical MIR shims - one for
`track_caller == true`, and one for `track_caller == false`. However,
these two shims are used by the entire crate (i.e. it's two shims total,
not two shims per unique closure), so this shouldn't a big deal.
Fix debuginfo for parameters passed via the ScalarPair abi on Windows
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used
in #81898.
Fixes#88625
Querify `FnAbi::of_{fn_ptr,instance}` as `fn_abi_of_{fn_ptr,instance}`.
*Note: opening this PR as draft because it's based on #88499*
This more or less replicates the `LayoutOf::layout_of` setup from #88499, to replace `FnAbi::of_{fn_ptr,instance}` with `FnAbiOf::fn_abi_of_{fn_ptr,instance}`, and also route them through queries (which `layout_of` has used for a while).
The two changes at the use sites (other than the names) are:
* return type is now wrapped in `&'tcx`
* the value *is* interned, which may affect performance
* the `extra_args` list is now an interned `&'tcx ty::List<Ty<'tcx>>`
* should be cheap (it's empty for anything other than C variadics)
Theoretically, a `FnAbiOfHelpers` implementer could choose to keep the `Result<...>` instead of eagerly erroring, but the only existing users of these APIs are codegen backends, so they don't (want to) take advantage of this.
At least miri could make use of this, since it prefers propagating errors (it "just" doesn't use `FnAbi` yet - cc `@RalfJung).`
The way this is done is probably less efficient than what is possible, because the queries handle the correctness-oriented API (i.e. the split into `fn` pointers vs instances), whereas a lower-level query could end up with more reuse between different instances with identical signatures.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Couple of changes to FileSearch and SearchPath
* Turn a couple of regular comments into doc comments
* Move `get_tools_search_paths` from `FileSearch` to `Session`
* Use Lrc instead of Option to avoid duplication of a `SearchPath`
Mark all of these as locals so the debugger does not try to interpret
them as being a pointer to the value. This extends the approach used in
PR #81898.
Introduce NullOp::AlignOf
This PR introduces `Rvalue::NullaryOp(NullOp::AlignOf, ty)`, which will be lowered from `align_of`, similar to `size_of` lowering to `Rvalue::NullaryOp(NullOp::SizeOf, ty)`.
The changes are originally part of #88700 but since it's not dependent on other changes and could have performance impact on its own, it's separated into its own PR.
Move *_max methods back to util
change to inline instead of inline(always)
Remove valid_range_exclusive from scalar
Use WrappingRange instead
implement always_valid_for in a safer way
Fix accidental edit
Fix handling of +whole-archive native link modifier.
This PR fixes a bug in `add_upstream_native_libraries` that led to the `+whole-archive` modifier being ignored when linking in native libs.
~~Note that the PR does not address the situation when `+whole-archive` is combined with `+bundle`.~~
`@wesleywiser's` commit adds validation code that turns combining `+whole-archive` with `+bundle` into an error.
Fixes https://github.com/rust-lang/rust/issues/88085.
r? `@petrochenkov`
cc `@wesleywiser` `@gcoakes`
Provide `layout_of` automatically (given tcx + param_env + error handling).
After #88337, there's no longer any uses of `LayoutOf` within `rustc_target` itself, so I realized I could move the trait to `rustc_middle::ty::layout` and redesign it a bit.
This is similar to #88338 (and supersedes it), but at no ergonomic loss, since there's no funky `C: LayoutOf<Ty = Ty>` -> `Ty: TyAbiInterface<C>` generic `impl` chain, and each `LayoutOf` still corresponds to one `impl` (of `LayoutOfHelpers`) for the specific context.
After this PR, this is what's needed to get `trait LayoutOf` (with the `layout_of` method) implemented on some context type:
* `TyCtxt`, via `HasTyCtxt`
* `ParamEnv`, via `HasParamEnv`
* a way to transform `LayoutError`s into the desired error type
* an error type of `!` can be paired with having `cx.layout_of(...)` return `TyAndLayout` *without* `Result<...>` around it, such as used by codegen
* this is done through a new `LayoutOfHelpers` trait (and so is specifying the type of `cx.layout_of(...)`)
When going through this path (and not bypassing it with a manual `impl` of `LayoutOf`), the end result is that only the error case can be customized, the query itself and the success paths are guaranteed to be uniform.
(**EDIT**: just noticed that because of the supertrait relationship, you cannot actually implement `LayoutOf` yourself, the blanket `impl` fully covers all possible context types that could ever implement it)
Part of the motivation for this shape of API is that I've been working on querifying `FnAbi::of_*`, and what I want/need to introduce for that looks a lot like the setup in this PR - in particular, it's harder to express the `FnAbi` methods in `rustc_target`, since they're much more tied to `rustc` concepts.
r? `@nagisa` cc `@oli-obk` `@bjorn3`
Include debug info for the allocator shim
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
Issue Details:
In some cases it is necessary to generate an "allocator shim" to forward various Rust allocation functions (e.g., `__rust_alloc`) to an underlying function (e.g., `malloc`). However, since this allocator shim is a manually created LLVM module it is not processed via the normal module processing code and so no debug info is generated for it (if debugging info is enabled).
Fix Details:
* Modify the `debuginfo` code to allow creating debug info for a module without a `CodegenCx` (since it is difficult, and expensive, to create one just to emit some debug info).
* After creating the allocator shim add in basic debug info.
rustc_target: `TyAndLayout::field` should never error.
This refactor (making `TyAndLayout::field` return `TyAndLayout` without any `Result` around it) is based on a simple observation, regarding `TyAndLayout::field`:
If `cx.layout_of(ty)` succeeds (for some `cx` and `ty`), then `.field(cx, i)` on the resulting `TyAndLayout` should *always* succeed in computing `cx.layout_of(field_ty)` (where `field_ty` is the type of the `i`th field of `ty`).
The reason for this is that no matter which field is chosen, `cx.layout_of(field_ty)` *will have already been computed*, as part of computing `cx.layout_of(ty)`, as we cannot determine the layout of *any* type without considering the layouts of *all* of its fields.
And so it should be fine to turn any errors into ICEs, since they likely indicate a `cx` mismatch, or some other edge case that is due to a compiler bug (as opposed to ever being an user-facing error).
<hr/>
Each commit should probably be reviewed separately, though note that there's some `where` clauses (in `rustc_target::abi::call::*`) that change in most commits.
cc `@nagisa` `@oli-obk`
Make `-Z gcc-ld=lld` work for Apple targets
`-Z gcc-ld=lld` was introduced in #85961. It does not work on Macos because lld needs be either named `ld64` or passed `-flavor darwin` as the first two arguments in order to select the Mach-O flavor. Rust invokes cc (=clang) on Macos for linking which calls `ld` as linker binary and not `ld64`, so just creating an `ld64` binary and modifying the search path with `-B` does not work.
In order to solve this patch does:
* Set the `lld_flavor` for all Apple-derived targets to `LldFlavor::Ld64`. As far as I can see this actually works towards fixing `-Xlinker=rust-lld` as all those targets use the Mach-O object format.
* Copy/hardlink rust-lld to the gcc-ld subdirectory as ld64 next to ld.
* If `-Z gcc-ld=lld` is used and the target lld flavor is Ld64 add `-fuse-ld=/path/to/ld64` to the linker invocation.
Fixes#86945.
Adjust linking order of static nobundle libraries
Link the static libraries with "-bundle" modifier from upstream rust crate right after linking this rust crate.
Some linker such as GNU linker `ld.bdf` treat order of linking as order of dependency.
After this change, static libraries with "-bundle" modifier is linked in the same order as "+bundle" modifier.
So we can change the value of "bundle" modifier without causing linking error.
fix: https://github.com/rust-lang/rust/issues/87541
r? `@petrochenkov`
Link the static libraries with "-bundle" modifier from upstream rust crate
right after linking this rust crate. Some linker such as GNU linker
`ld.bdf` treat order of linking as order of dependency. After this change,
static libraries with "-bundle" modifier is linked in the same order as
"+bundle" modifier. So we can change the value of "bundle" modifier without
causing linking error.
Use custom wrap-around type instead of RangeInclusive
Two reasons:
1. More memory is allocated than necessary for `valid_range` in `Scalar`. The range is not used as an iterator and `exhausted` is never used.
2. `contains`, `count` etc. methods in `RangeInclusive` are doing very unhelpful(and dangerous!) things when used as a wrap-around range. - In general this PR wants to limit potentially confusing methods, that have a low probability of working.
Doing a local perf run, every metric shows improvement except for instructions.
Max-rss seem to have a very consistent improvement.
Sorry - newbie here, probably doing something wrong.
Trait upcasting coercion (part 3)
By using separate candidates for each possible choice, this fixes type-checking issues in previous commits.
r? `@nikomatsakis`
Upgrade to LLVM 13
Work in progress update to LLVM 13. Main changes:
* InlineAsm diagnostics reported using SrcMgr diagnostic kind are now handled. Previously these used a separate diag handler.
* Codegen tests are updated for additional attributes.
* Some data layouts have changed.
* Switch `#[used]` attribute from `llvm.used` to `llvm.compiler.used` to avoid SHF_GNU_RETAIN flag introduced in https://reviews.llvm.org/D97448, which appears to trigger a bug in older versions of gold.
* Set `LLVM_INCLUDE_TESTS=OFF` to avoid Python 3.6 requirement.
Upstream issues:
* ~~https://bugs.llvm.org/show_bug.cgi?id=51210 (InlineAsm diagnostic reporting for module asm)~~ Fixed by 1558bb80c0.
* ~~https://bugs.llvm.org/show_bug.cgi?id=51476 (Miscompile on AArch64 due to incorrect comparison elimination)~~ Fixed by 81b106584f.
* https://bugs.llvm.org/show_bug.cgi?id=51207 (Can't set custom section flags anymore). Problematic change reverted in our fork, https://reviews.llvm.org/D107216 posted for upstream revert.
* https://bugs.llvm.org/show_bug.cgi?id=51211 (Regression in codegen for #83623). This is an optimization regression that we may likely have to eat for this release. The fix for #83623 was based on an incorrect premise, and this needs to be properly addressed in the MergeICmps pass.
The [compile-time impact](https://perf.rust-lang.org/compare.html?start=ef9549b6c0efb7525c9b012148689c8d070f9bc0&end=0983094463497eec22d550dad25576a894687002) is mixed, but quite positive as LLVM upgrades go.
The LLVM 13 final release is scheduled for Sep 21st. The current nightly is scheduled for stable release on Oct 21st.
r? `@ghost`
This commit updates the backtrace crate in libstd now that dependencies
have been updated to use `memchr` from the standard library as well.
This is mostly just making sure deps are up-to-date and have all the
latest-and-greatest fixes and such.
Closesrust-lang/backtrace-rs#432
Rather than relying on `getPointerElementType()` from LLVM function
pointers, we now pass the function type explicitly when building `call`
or `invoke` instructions.
Allow more "unknown argument" strings from linker
Some toolchains emit slightly different errors, e.g.
ppc-vle-gcc: error: unrecognized option '-no-pie'
Remove the aarch64 `crypto` target_feature
The subfeatures `aes` or `sha2` should be used instead.
This can't yet be done for ARM targets as some LLVM intrinsics still require `crypto`.
Also update the runtime feature detection tests in `library/std` to mirror the updates in `stdarch`. This also helps https://github.com/rust-lang/rust/issues/86941
r? ``@Amanieu``
Trait upcasting coercion (part2)
This is the second part of trait upcasting coercion implementation.
Currently this is blocked on #86264 .
The third part might be implemented using unsafety checking
r? `@bjorn3`
Since RFC 3052 soft deprecated the authors field anyway, hiding it from
crates.io, docs.rs, and making Cargo not add it by default, and it is
not generally up to date/useful information, we should remove it from
crates in this repo.
[debuginfo] Emit associated type bindings in trait object type names.
This PR updates debuginfo type name generation for trait objects to include associated type bindings and auto trait bounds -- so that, for example, the debuginfo type name of `&dyn Iterator<Item=Foo>` and `&dyn Iterator<Item=Bar>` don't both map to just `&dyn Iterator` anymore.
The following table shows examples of debuginfo type names before and after the PR:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `&dyn Iterator` | `&dyn Iterator<Item=u32>` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `&dyn Iterator` | `&(dyn Iterator<Item=u32> + Sync)` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `&dyn SomeTrait<bool, i8>` | `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` |
For targets that need C++-like type names, we use `assoc$<Item,u32>` instead of `Item=u32`:
| type | before | after |
|------|---------|-------|
| `&dyn Iterator<Item=u32>>` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> > > >` |
| `&(dyn Iterator<Item=u32>> + Sync)` | `ref$<dyn$<Iterator> >` | `ref$<dyn$<Iterator<assoc$<Item,u32> >,Sync> >` |
| `&(dyn SomeTrait<bool, i8, Bar=u32>> + Send)` | `ref$<dyn$<SomeTrait<bool, i8> > >` | `ref$<dyn$<SomeTrait<bool,i8,assoc$<Bar,u32> > >,Send> >` |
The PR also adds self-profiling measurements for debuginfo type name generation (re. https://github.com/rust-lang/rust/issues/86431). It looks like the compiler spends up to 0.5% of its time in that task, so the potential for optimizing it via caching seems limited.
However, the perf run also shows [the biggest regression](https://perf.rust-lang.org/detailed-query.html?commit=585e91c718b0b2c5319e1fffd0ff1e62aaf7ccc2&base_commit=b9197978a90be6f7570741eabe2da175fec75375&benchmark=tokio-webpush-simple-debug&run_name=incr-unchanged) in a test case that does not even invoke the code in question. This suggests that the length of the names we generate here can affect performance by influencing how much data the linker has to copy around.
Fixes https://github.com/rust-lang/rust/issues/86134.
Don't use gc-sections with profile-generate.
When building with profile-generate don't call gc_sections as this can
can sometimes strip out profile data. This missing information in the
prof files can then result in missing functions when using the profile
information.
#78226
r? `@Mark-Simulacrum`
Remove nondeterminism in multiple-definitions test
Compare all fields in `DllImport` when sorting to avoid nondeterminism in the error for multiple inconsistent definitions of an extern function. Restore the multiple-definitions test.
Resolves#87084.