Rollup of 9 pull requests
Successful merges:
- #104199 (Keep track of the start of the argument block of a closure)
- #105050 (Remove useless borrows and derefs)
- #105153 (Create a hacky fail-fast mode that stops tests at the first failure)
- #105164 (Restore `use` suggestion for `dyn` method call requiring `Sized`)
- #105193 (Disable coverage instrumentation for naked functions)
- #105200 (Remove useless filter in unused extern crate check.)
- #105201 (Do not call fn_sig on non-functions.)
- #105208 (Add AmbiguityError for inconsistent resolution for an import)
- #105214 (update Miri)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
codegen-llvm: never combine DSOLocal and DllImport
Prevent DllImport from being attached to DSOLocal definitions in the LLVM IR. The combination makes no sense, since definitions local to the compilation unit will never be imported from external objects.
Additionally, LLVM will refuse the IR if it encounters the combination (introduced in [1]):
```
if (GV.hasDLLImportStorageClass())
Assert(!GV.isDSOLocal(),
"GlobalValue with DLLImport Storage is dso_local!", &GV);
```
Right now, codegen-llvm will only apply DllImport to constants and rely on call-stubs for functions. Hence, we simply extend the codegen of constants to skip DllImport for any local definitions.
This was discovered when switching the EFI targets to the static relocation model [2]. With this fixed, we can start another attempt at this.
[1] 509132b368
[2] https://github.com/rust-lang/rust/issues/101656
Print all features with --print target-features
This fixes `rustc --print target-features` with respect to aliases and tied features.
Before this change, the print command assumed that each LLVM feature corresponds exactly to one rustc feature. In the case of aliases and tied features, this assumption failed and some features (such as aarch64's "pacg") were missing. With this change, every target feature is listed.
Prevent DllImport from being attached to DSOLocal definitions in the
LLVM IR. The combination makes no sense, since definitions local to the
compilation unit will never be imported from external objects.
Additionally, LLVM will refuse the IR if it encounters the
combination (introduced in [1]):
if (GV.hasDLLImportStorageClass())
Assert(!GV.isDSOLocal(),
"GlobalValue with DLLImport Storage is dso_local!", &GV);
Right now, codegen-llvm will only apply DllImport to constants and rely
on call-stubs for functions. Hence, we simply extend the codegen of
constants to skip DllImport for any local definitions.
This was discovered when switching the EFI targets to the static
relocation model [2]. With this fixed, we can start another attempt at
this.
[1] 509132b368
[2] https://github.com/rust-lang/rust/issues/101656
After https://github.com/llvm/llvm-project/commit/8689f5e landed, LLVM takes the intersection of v8a and v8r as default.
This commit brings back v8a support by explicitly specifying v8a in the feature list.
This should solve #97724.
Use `as_deref` in compiler (but only where it makes sense)
This simplifies some code :3
(there are some changes that are not exacly `as_deref`, but more like "clever `Option`/`Result` method use")
Mark functions created for `raw-dylib` on x86 with DllImport storage class
Fix for #104453
## Issue Details
On x86 Windows, LLVM uses 'L' as the prefix for any private global symbols (`PrivateGlobalPrefix`), so when the `raw-dylib` feature creates an undecorated function symbol that begins with an 'L' LLVM misinterprets that as a private global symbol that it created and so fails the compilation at a later stage since such a symbol must have a definition.
## Fix Details
Mark the function we are creating for `raw-dylib` with `DllImport` storage class (this was already being done for MSVC at a later point for `callee::get_fn` but not for GNU (due to "backwards compatibility")): this will cause LLVM to prefix the name with `__imp_` and so it won't mistake it for a private global symbol.
Pass 128-bit C-style enum enumerator values to LLVM
Pass the full 128 bits of C-style enum enumerators through to LLVM. This means that debuginfo for C-style repr128 enums is now emitted correctly for DWARF platforms (as compared to not being correctly emitted on any platform).
Tracking issue: #56071
Improve generating Custom entry function
This commit is aimed at making compiler-generated entry functions (Basically just C `main` right now) more generic so other targets can do similar things for custom entry. This was initially implemented as part of https://github.com/rust-lang/rust/pull/100316.
Currently, this moves the entry function name and Call convention to the target spec.
Signed-off-by: Ayush Singh <ayushsingh1325@gmail.com>
Fix some misleading target feature aliases
This is the first half of a fix for #100752. It looks like these aliases were added in #78361 and slipped under the radar, as these features are not AVX512. These features _do_ add AVX512 instructions when used _in combination_ with AVX512F, but without AVX512F, these features still provide 128-bit and 256-bit vector instructions. A user might be mislead into thinking these features imply AVX512F (which is true of the actual AVX512 features). This PR allows using the names as defined by LLVM, which matches Intel documentation.
A future PR should change the `std::arch` intrinsics to use these names, and finally remove these aliases from rustc.
r? ```@workingjubilee```
cc ```@Amanieu```
For the next commit, `FunctionCx::codegen_*_terminator` need to take a
`&mut Bx` instead of consuming a `Bx`. This triggers a cascade of
similar changes across multiple functions. The resulting code is more
concise and replaces many `&mut bx` expressions with `bx`.
Perform simple scalar replacement of aggregates (SROA) MIR opt
This is a re-open of https://github.com/rust-lang/rust/pull/85796
I copied the debuginfo implementation (first commit) from `@eddyb's` own SROA PR.
This pass replaces plain field accesses by simple locals when possible.
To be eligible, the replaced locals:
- must not be enums or unions;
- must not be used whole;
- must not have their address taken.
The storage and deinit statements are duplicated on each created local.
cc `@tmiasko` who reviewed the former version of this PR.
interpret: support for per-byte provenance
Also factors the provenance map into its own module.
The third commit does the same for the init mask. I can move it in a separate PR if you prefer.
Fixes https://github.com/rust-lang/miri/issues/2181
r? `@oli-obk`
llvm: dwo only emitted when object code emitted
Fixes#103932.
`CompiledModule` should not think a DWARF object was emitted when a bitcode-only compilation has happened, this can confuse archive file creation (which expects to create an archive containing non-existent dwo files).
r? ``````@michaelwoerister``````
This commit is aimed at making compiler generated entry functions
(Basically just C `main` right now) more generic so other targets can do
similar things for custom entry. This was initially implemented as part
of https://github.com/rust-lang/rust/pull/100316.
Currently, this moves the entry function name and Call convention to the
target spec.
Signed-off-by: Ayush Singh <ayushsingh1325@gmail.com>
Fix Access Violation when using lld & ThinLTO on windows-msvc
Users report an AV at runtime of the compiled binary when using lld and ThinLTO on windows-msvc. The AV occurs when accessing a static value which is defined in one crate but used in another. Based on the disassembly of the cross-crate use, it appears that the use is not correctly linked with the definition and is instead assigned a garbage pointer value.
If we look at the symbol tables for each crates' obj file, we can see what is happening:
*lib.obj*:
```
COFF SYMBOL TABLE
...
00E 00000000 SECT2 notype External | _ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```
*bin.obj*:
```
COFF SYMBOL TABLE
...
010 00000000 UNDEF notype External | __imp__ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```
The use of the symbol has the "import" style symbol name but the declaration doesn't generate any symbol with the same name. As a result, linking the files generates a warning from lld:
> rust-lld: warning: bin.obj: locally defined symbol imported: reproducer::memrchr::FN::h612b61ca0e168901 (defined in lib.obj) [LNK4217]
and the symbol reference remains undefined at runtime leading to the AV.
To fix this, we just need to detect that we are performing ThinLTO (and thus, static linking) and omit the `dllimport` attribute on the extern item in LLVM IR.
Fixes#81408
`CompiledModule` should not think a DWARF object was emitted when a
bitcode-only compilation has happened, this can confuse archive file
creation (which expects to create an archive containing non-existent dwo
files).
Signed-off-by: David Wood <david.wood@huawei.com>
LLVM 16: Update RISCV data layout
The RISCV data layout was changed in 974e2e690b.
This updates all `riscv64*` targets, though I don't really know what the difference between the `gc` and `imac` ones is.
Passes `x test codegen` at LLVM head and with the currently bundled LLVM version. Without this patch, some tests fail with:
> error: internal compiler error: compiler/rustc_codegen_llvm/src/context.rs:192:13: data-layout for target `riscv64gc-unknown-none-elf`, `e-m:e-p:64:64-i64:64-i128:128-n64-S128`, differs from LLVM target's `riscv64` default layout, `e-m:e-p:64:64-i64:64-i128:128-n32:64-S128
Moved type_array function to rustc_codegen_ssa::BaseTypeMethods trait.
This allows using normal alloca function to create arrays as suggested in
https://github.com/rust-lang/rust/pull/104022.
Signed-off-by: Ayush Singh <ayushsingh1325@gmail.com>
LLVM 16: Switch to using MemoryEffects
This adapts the compiler to the changes required by 304f1d59ca.
AFAICT, `WriteOnly` isn't used by the compiler, all `ReadNone` uses were migrated and the remaining use of `ReadOnly` is only for function parameters.
To simplify the FFI, this PR uses an enum to represent `MemoryEffects` across the FFI boundary, which then gets mapped to the matching static factory method when constructing the attribute.
Fixes#103961.
`@rustbot` label +llvm-main
r? `@nikic`
asm: Work around LLVM bug on AArch64
Upstream issue: https://github.com/llvm/llvm-project/issues/58384
LLVM gets confused if we assign a 32-bit value to a 64-bit register, so pass the 32-bit register name to LLVM in that case.
asm: Match clang behavior for inlateout fixed register operands
We have 2 options for representing LLVM constraints for `inlateout` operands on a fixed register (e.g. `r0`): `={r0},0` or `={r0},{r0}`.
This PR changes the behavior to the latter, which matches the behavior of Clang since https://reviews.llvm.org/D87279.
Users report an AV at runtime of the compiled binary when using lld and
ThinLTO on windows-msvc. The AV occurs when accessing a static value
which is defined in one crate but used in another. Based on the
disassembly of the cross-crate use, it appears that the use is not
correctly linked with the definition and is instead assigned a garbage
pointer value.
If we look at the symbol tables for each crates' obj file, we can see
what is happening:
*lib.obj*:
```
COFF SYMBOL TABLE
...
00E 00000000 SECT2 notype External | _ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```
*bin.obj*:
```
COFF SYMBOL TABLE
...
010 00000000 UNDEF notype External | __imp__ZN10reproducer7memrchr2FN17h612b61ca0e168901E
...
```
The use of the symbol has the "import" style symbol name but the
declaration doesn't generate any symbol with the same name. As a result,
linking the files generates a warning from lld:
> rust-lld: warning: bin.obj: locally defined symbol imported: reproducer::memrchr::FN::h612b61ca0e168901 (defined in lib.obj) [LNK4217]
and the symbol reference remains undefined at runtime leading to the AV.
To fix this, we just need to detect that we are performing ThinLTO (and
thus, static linking) and omit the `dllimport` attribute on the extern
item in LLVM IR.
The new implementation doesn't use weak lang items and instead changes
`#[alloc_error_handler]` to an attribute macro just like
`#[global_allocator]`.
The attribute will generate the `__rg_oom` function which is called by
the compiler-generated `__rust_alloc_error_handler`. If no `__rg_oom`
function is defined in any crate then the compiler shim will call
`__rdl_oom` in the alloc crate which will simply panic.
This also fixes link errors with `-C link-dead-code` with
`default_alloc_error_handler`: `__rg_oom` was previously defined in the
alloc crate and would attempt to reference the `oom` lang item, even if
it didn't exist. This worked as long as `__rg_oom` was excluded from
linking since it was not called.
This is a prerequisite for the stabilization of
`default_alloc_error_handler` (#102318).
Don't use usub.with.overflow intrinsic
The canonical form of a usub.with.overflow check in LLVM are separate sub + icmp instructions, rather than a usub.with.overflow intrinsic. Using usub.with.overflow will generally result in worse optimization potential.
The backend will attempt to form usub.with.overflow when it comes to actual instruction selection. This is not fully reliable, but I believe this is a better tradeoff than using the intrinsic in IR.
Fixes#103285.
The canonical form of a usub.with.overflow check in LLVM are
separate sub + icmp instructions, rather than a usub.with.overflow
intrinsic. Using usub.with.overflow will generally result in worse
optimization potential.
The backend will attempt to form usub.with.overflow when it comes
to actual instruction selection. This is not fully reliable, but
I believe this is a better tradeoff than using the intrinsic in
IR.
Fixes#103285.
We have 2 options for representing LLVM constraints for `inlateout`
operands on a fixed register (e.g. `r0`): `={r0},0` or `={r0},{r0}`.
This PR changes the behavior to the latter, which matches the behavior
of Clang since https://reviews.llvm.org/D87279.
More dupe word typos
I only picked those changes (from the regex search) that I am pretty certain doesn't change meaning and is just a typo fix. Do correct me if any fix is undesirable and I can revert those. Thanks.
Remove `-Ztime`
Because it has a lot of overlap with `-Ztime-passes` but is generally less useful. Plus some related cleanups.
Best reviewed one commit at a time.
r? `@davidtwco`
The compiler currently has `-Ztime` and `-Ztime-passes`. I've used
`-Ztime-passes` for years but only recently learned about `-Ztime`.
What's the difference? Let's look at the `-Zhelp` output:
```
-Z time=val -- measure time of rustc processes (default: no)
-Z time-passes=val -- measure time of each rustc pass (default: no)
```
The `-Ztime-passes` description is clear, but the `-Ztime` one is less so.
Sounds like it measures the time for the entire process?
No. The real difference is that `-Ztime-passes` prints out info about passes,
and `-Ztime` does the same, but only for a subset of those passes. More
specifically, there is a distinction in the profiling code between a "verbose
generic activity" and an "extra verbose generic activity". `-Ztime-passes`
prints both kinds, while `-Ztime` only prints the first one. (It took me
a close reading of the source code to determine this difference.)
In practice this distinction has low value. Perhaps in the past the "extra
verbose" output was more voluminous, but now that we only print stats for a
pass if it exceeds 5ms or alters the RSS, `-Ztime-passes` is less spammy. Also,
a lot of the "extra verbose" cases are for individual lint passes, and you need
to also use `-Zno-interleave-lints` to see those anyway.
Therefore, this commit removes `-Ztime` and the associated machinery. One thing
to note is that the existing "extra verbose" activities all have an extra
string argument, so the commit adds the ability to accept an extra argument to
the "verbose" activities.
resolve error when attempting to link a universal library on macOS
Previously attempting to link universal libraries into libraries (but not binaries) would produce an error that "File too small to be an archive". This works around this by invoking `lipo -thin` to extract a library for the target platform when passed a univeral library.
Fixes#55235
It's worth acknowledging that this implementation is kind of a horrible hack. Unfortunately I don't know how to do anything better, hopefully this PR will be a jumping off point.
Previously attempting to link universal libraries into libraries (but not binaries) would produce an error that "File too small to be an archive". This works around this by using `object` to extract a library for the target platform when passed a univeral library.
Fixes#55235
Declare `main` as visibility hidden on targets that default to hidden.
On targets with `default_hidden_visibility` set, which is currrently just WebAssembly, declare the generated `main` function with visibility hidden. This makes it consistent with clang's WebAssembly target, where `main` is just a user function that gets the same visibility as any other user function, which is hidden on WebAssembly unless explicitly overridden.
This will help simplify use cases which in the future may want to automatically wasm-export all visibility-"default" symbols. `main` isn't intended to be wasm-exported, and marking it hidden prevents it from being wasm-exported in that scenario.
LLVM [D131158] changed the SystemZ data layout to always set 64-bit
vector alignment, which used to be conditional on the "vector" feature.
[D131158]: https://reviews.llvm.org/D131158
On targets with `default_hidden_visibility` set, which is currrently
just WebAssembly, declare the generated `main` function with visibility
hidden. This makes it consistent with clang's WebAssembly target, where
`main` is just a user function that gets the same visibility as any
other user function, which is hidden on WebAssembly unless explicitly
overridden.
This will help simplify use cases which in the future may want to
automatically wasm-export all visibility-"default" symbols. `main` isn't
intended to be wasm-exported, and marking it hidden prevents it from
being wasm-exported in that scenario.
Remove support for legacy PM
This removes support for optimizing with LLVM's legacy pass manager, as well as the unstable `-Znew-llvm-pass-manager` option. We have been defaulting to the new PM since LLVM 13 (except for s390x that waited for 14), and LLVM 15 removed support altogether. The only place we still use the legacy PM is for writing the output file, just like `llc` does.
cc #74705
r? ``@nikic``
Implement simd_as for pointers
Expands `simd_as` (and `simd_cast`) to handle pointer-to-pointer, pointer-to-integer, and integer-to-pointer conversions.
cc ``@programmerjake`` ``@thomcc``
On later stages, the feature is already stable.
Result of running:
rg -l "feature.let_else" compiler/ src/librustdoc/ library/ | xargs sed -s -i "s#\\[feature.let_else#\\[cfg_attr\\(bootstrap, feature\\(let_else\\)#"
Add support for MIPS VZ ISA extension
[Link to relevant LLVM line where virt extension is specified](83fab8cee9/llvm/lib/Target/Mips/Mips.td (L172-L173))
This has been tested on mips-unknown-linux-musl with a target-cpu that is >= MIPS32 5 and `target-features=+virt`. The example was checked in a disassembler to ensure the correct assembly sequence was being generated using the virtualization instructions.
Needed additional work:
* MIPS is missing from [the Rust reference CPU feature lists](https://doc.rust-lang.org/reference/attributes/codegen.html#available-features)
Example docs for later:
```md
#### `mips` or `mips64`
This platform requires that `#[target_feature]` is only applied to [`unsafe`
functions][unsafe function]. This target's feature support is currently unstable
and must be enabled by `#![feature(mips_target_feature)]` ([Issue #44839])
[Issue #44839]: https://github.com/rust-lang/rust/issues/44839
Further documentation on these features can be found in the [MIPS Instruction Set
Reference Manual], or elsewhere on [mips.com].
[MIPS Instruction Set Reference Manual]: https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00086-2B-MIPS32BIS-AFP-6.06.pdf
[developer.arm.com]: https://www.mips.com/products/architectures/ase/
Feature | Implicitly Enables | Description
---------------|--------------------|-------------------
`fp64` | | 64-bit Floating Point
`msa` | | "MIPS SIMD Architecture"
`virt` | | Virtualization instructions (VZ ASE)
```
If the above is good I can also submit a PR for that if there's interest in documenting it while it's still unstable. Otherwise that can be dropped, I just wrote it before realizing it was possibly not a good idea.
Relevant to #44839
Add inline-llvm option for disabling/enabling LLVM inlining
In this PR, a new -Z option `inline-llvm` is added in order to be able to turn on/off LLVM inlining.
The capability of turning on/off inlining in LLVM backend is needed for testing performance implications of using recently enabled inlining in rustc's frontend (with -Z inline-mir=yes option, #91743). It would be interesting to see the performance effect using rustc's frontend inlining only without LLVM inlining enabled. Currently LLVM is still doing inlining no mater what value inline-mir is set to. With the option `inline-llvm` being added in this PR, user can turn off LLVM inlining by using `-Z inline-llvm=no` option (the default of inline-llvm is 'yes', LLVM inlining enabled).
Rollup of 5 pull requests
Successful merges:
- #99517 (Display raw pointer as *{mut,const} T instead of *-ptr in errors)
- #99928 (Do not leak type variables from opaque type relation)
- #100473 (Attempt to normalize `FnDef` signature in `InferCtxt::cmp`)
- #100653 (Move the cast_float_to_int fallback code to GCC)
- #100941 (Point at the string inside literal and mention if we need string inte…)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Move the cast_float_to_int fallback code to GCC
Now that we require at least LLVM 13, that codegen backend is always
using its intrinsic `fptosi.sat` and `fptoui.sat` conversions, so it
doesn't need the manual implementation. However, the GCC backend still
needs it, so we can move all of that code down there.
Use object instead of LLVM for reading bitcode from rlibs
Together with changes I plan to make as part of https://github.com/rust-lang/rust/pull/97485 this will allow entirely removing usage of LLVM's archive reader and thus allow removing `archive_ro.rs` and `ArchiveWrapper.cpp`.
Rollup of 9 pull requests
Successful merges:
- #95376 (Add `vec::Drain{,Filter}::keep_rest`)
- #100092 (Fall back when relating two opaques by substs in MIR typeck)
- #101019 (Suggest returning closure as `impl Fn`)
- #101022 (Erase late bound regions before comparing types in `suggest_dereferences`)
- #101101 (interpret: make read-pointer-as-bytes a CTFE-only error with extra information)
- #101123 (Remove `register_attr` feature)
- #101175 (Don't --bless in pre-push hook)
- #101176 (rustdoc: remove unused CSS selectors for `.table-display`)
- #101180 (Add another MaybeUninit array test with const)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
interpret: make read-pointer-as-bytes a CTFE-only error with extra information
Next step in the reaction to https://github.com/rust-lang/rust/issues/99923. Also teaches Miri to implicitly strip provenance in more situations when transmuting pointers to integers, which fixes https://github.com/rust-lang/miri/issues/2456.
Pointer-to-int transmutation during CTFE now produces a message like this:
```
= help: this code performed an operation that depends on the underlying bytes representing a pointer
= help: the absolute address of a pointer is not known at compile-time, so such operations are not supported
```
r? ``@oli-obk``
rustc_middle: Remove `Visibility::Invisible`
It had a different meaning in the past, but now it's only used as an implementation detail of import resolution.
Add pointer masking convenience functions
This PR adds the following public API:
```rust
impl<T: ?Sized> *const T {
fn mask(self, mask: usize) -> *const T;
}
impl<T: ?Sized> *mut T {
fn mask(self, mask: usize) -> *const T;
}
// mod intrinsics
fn mask<T>(ptr: *const T, mask: usize) -> *const T
```
This is equivalent to `ptr.map_addr(|a| a & mask)` but also uses a cool llvm intrinsic.
Proposed in https://github.com/rust-lang/rust/pull/95643#issuecomment-1121562352
cc `@Gankra` `@scottmcm` `@RalfJung`
r? rust-lang/libs-api
Currently they try to be very precise. But they are wrong, i.e. they
don't match what's happening in the loop below. This code isn't hot
enough for it to matter that much.
Because `PassMode::Cast` is by far the largest variant, but is
relatively rare.
This requires making `PassMode` not impl `Copy`, and `Clone` is no
longer necessary. This causes lots of sigil adjusting, but nothing very
notable.
I couldn't find where exactly it's documented, but apperantly pointers to void
type are invalid in llvm - void is only allowed as a return type of functions.
This commit adds the following functions all of which have a signature
`pointer, usize -> pointer`:
- `<*mut T>::mask`
- `<*const T>::mask`
- `intrinsics::ptr_mask`
These functions are equivalent to `.map_addr(|a| a & mask)` but they
utilize `llvm.ptrmask` llvm intrinsic.
*masks your pointers*
Now that we require at least LLVM 13, that codegen backend is always
using its intrinsic `fptosi.sat` and `fptoui.sat` conversions, so it
doesn't need the manual implementation. However, the GCC backend still
needs it, so we can move all of that code down there.
Update the minimum external LLVM to 13
With this change, we'll have stable support for LLVM 13 through 15 (pending release).
For reference, the previous increase to LLVM 12 was #90175.
r? `@nagisa`
Add support for generating unique profraw files by default when using `-C instrument-coverage`
Currently, enabling the rustc flag `-C instrument-coverage` instruments the given crate and by default uses the naming scheme `default.profraw` for any instrumented profile files generated during the execution of a binary linked against this crate. This leads to multiple binaries being executed overwriting one another and causing only the last executable run to contain actual coverage results.
This can be overridden by manually setting the environment variable `LLVM_PROFILE_FILE` to use a unique naming scheme.
This PR adds a change to add support for a reasonable default for rustc to use when enabling coverage instrumentation similar to how the Rust compiler treats generating these same `profraw` files when PGO is enabled.
The new naming scheme is set to `default_%m_%p.profraw` to ensure the uniqueness of each file being generated using [LLVMs special pattern strings](https://clang.llvm.org/docs/SourceBasedCodeCoverage.html#running-the-instrumented-program).
Today the compiler sets the default for PGO `profraw` files to `default_%m.profraw` to ensure a unique file for each run. The same can be done for the instrumented profile files generated via the `-C instrument-coverage` flag as well which LLVM has API support for.
Linked Issue: https://github.com/rust-lang/rust/issues/100381
r? `@wesleywiser`
debuginfo: Generalize C++-like encoding for enums.
The updated encoding should be able to handle niche layouts where more than one variant has fields (as introduced in https://github.com/rust-lang/rust/pull/94075).
The new encoding is more uniform as there is no structural difference between direct-tag, niche-tag, and no-tag layouts anymore. The only difference between those cases is that the "dataful" variant in a niche-tag enum will have a `(start, end)` pair denoting the tag range instead of a single value.
The new encoding now also supports 128-bit tags, which occur in at least some standard library types. These tags are represented as `u64` pairs so that debuggers (which don't always have support for 128-bit integers) can reliably deal with them. The downside is that this adds quite a bit of complexity to the encoding and especially to the corresponding NatVis.
The new encoding seems to increase the size of (x86_64-pc-windows-msvc) debuginfo by 10-15%. The size of binaries is not affected (release builds were built with `-Cdebuginfo=2`, numbers are in kilobytes):
EXE | before | after | relative
-- | -- | -- | --
cargo (debug) | 40453 | 40450 | +0%
ripgrep (debug) | 10275 | 10273 | +0%
cargo (release) | 16186 | 16185 | +0%
ripgrep (release) | 4727 | 4726 | +0%
PDB | before | after | relative
-- | -- | -- | --
cargo (debug) | 236524 | 261412 | +11%
ripgrep (debug) | 53140 | 59060 | +11%
cargo (release) | 148516 | 169620 | +14%
ripgrep (release) | 10676 | 11804 | +11%
Given that the new encoding is more general, this is to be expected. Only platforms using C++-like debuginfo are affected -- which currently is only `*-pc-windows-msvc`.
*TODO*
- [x] Properly update documentation
- [x] Add regression tests for new optimized enum layouts as introduced by #94075.
r? `@wesleywiser`
https://reviews.llvm.org/D120026 changed atomics on thumbv6m to
use libatomic, to ensure that atomic load/store are compatible with
atomic RMW/CAS. However, Rust wants to expose only load/store
without libcalls.
https://reviews.llvm.org/D130480 added support for this behind
the +atomics-32 target feature, so enable that feature.
Introduce an ArchiveBuilderBuilder
This avoids monomorphizing all linker code for each codegen backend and will allow passing in extra information to the archive builder from the codegen backend. I'm going to use this in https://github.com/rust-lang/rust/pull/97485 to allow passing in the right function to extract symbols from object files to a generic archive builder to be used by cg_llvm, cg_clif and cg_gcc.
This avoids monomorphizing all linker code for each codegen backend and
will allow passing in extra information to the archive builder from the
codegen backend.
codegen: use new {re,de,}allocator annotations in llvm
This obviates the patch that teaches LLVM internals about
_rust_{re,de}alloc functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
r? `@nikic`
This obviates the patch that teaches LLVM internals about
_rust_{re,de}alloc functions by putting annotations directly in the IR
for the optimizer.
The sole test change is required to anchor FileCheck to the body of the
`box_uninitialized` method, so it doesn't see the `allocalign` on
`__rust_alloc` and get mad about the string `alloca` showing up. Since I
was there anyway, I added some checks on the attributes to prove the
right attributes got set.
While we're here, we also emit allocator attributes on
__rust_alloc_zeroed. This should allow LLVM to perform more
optimizations for zeroed blocks, and probably fixes#90032. [This
comment](https://github.com/rust-lang/rust/issues/24194#issuecomment-308791157)
mentions "weird UB-like behaviour with bitvec iterators in
rustc_data_structures" so we may need to back this change out if things
go wrong.
The new test cases require LLVM 15, so we copy them into LLVM
14-supporting versions, which we can delete when we drop LLVM 14.
Enable raw-dylib for bin crates
Fixes#93842
When `raw-dylib` is used in a `bin` crate, we need to collect all of the `raw-dylib` functions, generate the import library and add that to the linker command line.
I also changed the tests so that 1) the C++ dlls are created after the Rust dlls, thus there is no chance of accidentally using them in the Rust linking process and 2) disabled generating import libraries when building with MSVC.
Add fine-grained LLVM CFI support to the Rust compiler
This PR improves the LLVM Control Flow Integrity (CFI) support in the Rust compiler by providing forward-edge control flow protection for Rust-compiled code only by aggregating function pointers in groups identified by their return and parameter types.
Forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space) will be provided in later work as part of this project by identifying C char and integer type uses at the time types are encoded (see Type metadata in the design document in the tracking issue https://github.com/rust-lang/rust/issues/89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e., -Clto).
Thank you again, `@eddyb,` `@nagisa,` `@pcc,` and `@tmiasko` for all the help!
Add support for LLVM ShadowCallStack.
LLVMs ShadowCallStack provides backward edge control flow integrity protection by using a separate shadow stack to store and retrieve a function's return address.
LLVM currently only supports this for AArch64 targets. The x18 register is used to hold the pointer to the shadow stack, and therefore this only works on ABIs which reserve x18. Further details are available in the [LLVM ShadowCallStack](https://clang.llvm.org/docs/ShadowCallStack.html) docs.
# Usage
`-Zsanitizer=shadow-call-stack`
# Comments/Caveats
* Currently only enabled for the aarch64-linux-android target
* Requires the platform to define a runtime to initialize the shadow stack, see the [LLVM docs](https://clang.llvm.org/docs/ShadowCallStack.html) for more detail.
This commit improves the LLVM Control Flow Integrity (CFI) support in
the Rust compiler by providing forward-edge control flow protection for
Rust-compiled code only by aggregating function pointers in groups
identified by their return and parameter types.
Forward-edge control flow protection for C or C++ and Rust -compiled
code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code
share the same virtual address space) will be provided in later work as
part of this project by identifying C char and integer type uses at the
time types are encoded (see Type metadata in the design document in the
tracking issue #89653).
LLVM CFI can be enabled with -Zsanitizer=cfi and requires LTO (i.e.,
-Clto).
make vtable pointers entirely opaque
This implements the scheme discussed in https://github.com/rust-lang/unsafe-code-guidelines/issues/338: vtable pointers should be considered entirely opaque and not even readable by Rust code, similar to function pointers.
- We have a new kind of `GlobalAlloc` that symbolically refers to a vtable.
- Miri uses that kind of allocation when generating a vtable.
- The codegen backends, upon encountering such an allocation, call `vtable_allocation` to obtain an actually dataful allocation for this vtable.
- We need new intrinsics to obtain the size and align from a vtable (for some `ptr::metadata` APIs), since direct accesses are UB now.
I had to touch quite a bit of code that I am not very familiar with, so some of this might not make much sense...
r? `@oli-obk`
Allow to disable thinLTO buffer to support lto-embed-bitcode lld feature
Hello
This change is to fix issue (https://github.com/rust-lang/rust/issues/84395) in which passing "-lto-embed-bitcode=optimized" to lld when linking rust code via linker-plugin-lto doesn't produce the expected result.
Instead of emitting a single unified module into a llvmbc section of the linked elf, it emits multiple submodules.
This is caused because rustc emits the BC modules after running llvm `createWriteThinLTOBitcodePass` pass.
Which in turn triggers a thinLTO linkage and causes the said issue.
This patch allows via compiler flag (-Cemit-thin-lto=<bool>) to select between running `createWriteThinLTOBitcodePass` and `createBitcodeWriterPass`.
Note this pattern of selecting between those 2 passes is common inside of LLVM code.
The default is to match the old behavior.
Only compile #[used] as llvm.compiler.used for ELF targets
This returns `#[used]` to how it worked prior to the LLVM 13 update. The intention is not that this is a stable promise.
I'll add tests later today. The tests will test things that we don't actually promise, though.
It's a deliberately small patch, mostly comments. And assuming it's reviewed and lands in time, IMO it should at least be considered for uplifting to beta (so that it can be in 1.59), as the change broke many crates in the ecosystem, even if they are relying on behavior that is not guaranteed.
# Background
LLVM has two ways of preventing removal of an unused variable: `llvm.compiler.used`, which must be present in object files, but allows the linker to remove the value, and `llvm.used` which is supposed to apply to the linker as well, if possible.
Prior to LLVM 13, `llvm.used` and `llvm.compiler.used` were the same on ELF targets, although they were different elsewhere. Prior to our update to LLVM 13, we compiled `#[used]` using `llvm.used` unconditionally, even though we only ever promised behavior like `llvm.compiler.used`.
In LLVM 13, ELF targets gained some support for preventing linker removal of `llvm.used` via the SHF_RETAIN section flag. This has some compatibility issues though: Concretely: some older versions `ld.gold` (specifically ones prior to v2.36, released in Jan 2021) had a bug where it would fail to place a `#[used] #[link_section = ".init_array"]` static in between `__init_array_start`/`__init_array_end`, leading to code that does this failing to run a static constructor. This is technically not a thing we guarantee will work, is a common use case, and is needed in `libstd` (for example, to get access to `std::env::args()` even if Rust does not control `main`, such as when in a `cdylib` crate).
As a result, when updating to LLVM 13, we unconditionally switched to using `llvm.compiler.used`, which mirror the guarantees we make for `#[used]` and doesn't require the latest ld.gold. Unfortunately, this happened to break quite a bit of things in the ecosystem, as non-ELF targets had come to rely on `#[used]` being slightly stronger. In particular, there are cases where it will even break static constructors on these targets[^initinit] (and in fact, breaks way more use cases, as Mach-O uses special sections as an interface to the OS/linker/loader in many places).
As a result, we only switch to `llvm.compiler.used` on ELF[^elfish] targets. The rationale here is:
1. It is (hopefully) identical to the semantics we used prior to the LLVM13 update as prior to that update we unconditionally used `llvm.used`, but on ELF `llvm.used` was the same as `llvm.compiler.used`.
2. It seems to be how Clang compiles this, and given that they have similar (but stronger) compatibility promises, that makes sense.
[^initinit]: For Mach-O targets: It is not always guaranteed that `__DATA,__mod_init_func` is a GC root if it does not have the `S_MOD_INIT_FUNC_POINTERS` flag which we cannot add. In most cases, when ld64 transformed this section into `__DATA_CONST,__mod_init_func` it gets applied, but it's not clear that that is intentional (let alone guaranteed), and the logic is complex enough that it probably happens sometimes, and people in the wild report it occurring.
[^elfish]: Actually, there's not a great way to tell if it's ELF, so I've approximated it.
This is pretty ad-hoc and hacky! We probably should have a firmer set of guarantees here, but this change should relax the pressure on coming up with that considerably, returning it to previous levels.
---
Unsure who should review so leaving it open, but for sure CC `@nikic`
Remove branch target prologues from `#[naked] fn`
This patch hacks around rust-lang/rust#98768 for now via injecting appropriate attributes into the LLVMIR we emit for naked functions. I intend to pursue this upstream so that these attributes can be removed in general, but it's slow going wading through C++ for me.
Revert "Work around invalid DWARF bugs for fat LTO"
Since September, the toolchain has not been generating reliable DWARF
information for static variables when LTO is on. This has affected
projects in the embedded space where the use of LTO is typical. In our
case, it has kept us from bumping past the 2021-09-22 nightly toolchain
lest our debugger break. This has been a pretty dramatic regression for
people using debuggers and static variables. See #90357 for more info
and a repro case.
This commit is a mechanical revert of
d5de680e20 from PR #89041, which caused
the issue. (Note on that PR that the commit's author has requested it be
reverted.)
I have locally verified that this fixes#90357 by restoring the
functionality of both the repro case I posted on that bug, and debugger
behavior on real programs. There do not appear to be test cases for this
in the toolchain; if I've missed them, point me at 'em and I'll update
them.
Adding the option to control from rustc CLI
if the resulted ".o" bitcode module files are with
thinLTO info or regular LTO info.
Allows using "-lto-embed-bitcode=optimized" during linkage
correctly.
Signed-off-by: Ziv Dunkelman <ziv.dunkelman@nextsilicon.com>