Add cross-language LLVM CFI support to the Rust compiler
This PR adds cross-language LLVM Control Flow Integrity (CFI) support to the Rust compiler by adding the `-Zsanitizer-cfi-normalize-integers` option to be used with Clang `-fsanitize-cfi-icall-normalize-integers` for normalizing integer types (see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust -compiled code "mixed binaries" (i.e., for when C or C++ and Rust -compiled code share the same virtual address space). For more information about LLVM CFI and cross-language LLVM CFI support for the Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and -Zsanitizer-cfi-normalize-integers, and requires proper (i.e., non-rustc) LTO (i.e., -Clinker-plugin-lto).
Thank you again, ``@bjorn3,`` ``@nikic,`` ``@samitolvanen,`` and the Rust community for all the help!
This commit adds cross-language LLVM Control Flow Integrity (CFI)
support to the Rust compiler by adding the
`-Zsanitizer-cfi-normalize-integers` option to be used with Clang
`-fsanitize-cfi-icall-normalize-integers` for normalizing integer types
(see https://reviews.llvm.org/D139395).
It provides forward-edge control flow protection for C or C++ and Rust
-compiled code "mixed binaries" (i.e., for when C or C++ and Rust
-compiled code share the same virtual address space). For more
information about LLVM CFI and cross-language LLVM CFI support for the
Rust compiler, see design document in the tracking issue #89653.
Cross-language LLVM CFI can be enabled with -Zsanitizer=cfi and
-Zsanitizer-cfi-normalize-integers, and requires proper (i.e.,
non-rustc) LTO (i.e., -Clinker-plugin-lto).
Currently a `{D,Subd}iagnosticMessage` can be created from any type that
impls `Into<String>`. That includes `&str`, `String`, and `Cow<'static,
str>`, which are reasonable. It also includes `&String`, which is pretty
weird, and results in many places making unnecessary allocations for
patterns like this:
```
self.fatal(&format!(...))
```
This creates a string with `format!`, takes a reference, passes the
reference to `fatal`, which does an `into()`, which clones the
reference, doing a second allocation. Two allocations for a single
string, bleh.
This commit changes the `From` impls so that you can only create a
`{D,Subd}iagnosticMessage` from `&str`, `String`, or `Cow<'static,
str>`. This requires changing all the places that currently create one
from a `&String`. Most of these are of the `&format!(...)` form
described above; each one removes an unnecessary static `&`, plus an
allocation when executed. There are also a few places where the existing
use of `&String` was more reasonable; these now just use `clone()` at
the call site.
As well as making the code nicer and more efficient, this is a step
towards possibly using `Cow<'static, str>` in
`{D,Subd}iagnosticMessage::{Str,Eager}`. That would require changing
the `From<&'a str>` impls to `From<&'static str>`, which is doable, but
I'm not yet sure if it's worthwhile.
Fix Unreadable non-UTF-8 output on localized MSVC
Fixes#35785 by converting non UTF-8 linker output to Unicode using the OEM code page.
Before:
```text
= note: Non-UTF-8 output: LINK : fatal error LNK1181: cannot open input file \'m\x84rchenhaft.obj\'\r\n
```
After:
```text
= note: LINK : fatal error LNK1181: cannot open input file 'märchenhaft.obj'
```
The difference is more dramatic if using a non-ascii language pack for Windows.
only error combining +whole-archive and +bundle for rlibs
Fixes https://github.com/rust-lang/rust/issues/110912
Checking `flavor == RlibFlavor::Normal` was accidentally lost in 601fc8b36b1962285e371cf3c54eeb3b1b9b3a74
https://github.com/rust-lang/rust/pull/105601
That caused combining +whole-archive and +bundle link modifiers on non-rlib crates to fail with a confusing error message saying that combination is unstable for rlibs. In particular, this caused the build to fail when +whole-archive was used on staticlib crates, even though +whole-archive effectively does nothing on non-bin crates because the final linker invocation is left to an external build system.
cc ``@petrochenkov``
Use MIR's `Offset` for pointer `add` too
~~Status: draft while waiting for #110822 to land, since this is built atop that.~~
~~r? `@ghost~~`
Canonical Rust code has mostly moved to `add`/`sub` on pointers, which take `usize`, instead of `offset` which takes `isize`. (And, relatedly, when `sub_ptr` was added it turned out it replaced every single in-tree use of `offset_from`, because `usize` is just so much more useful than `isize` in Rust.)
Unfortunately, `intrinsics::offset` could only accept `*const` and `isize`, so there's a *huge* amount of type conversions back and forth being done. They're identity conversions in the backend, but still end up producing quite a lot of unhelpful MIR.
This PR changes `intrinsics::offset` to accept `*const` *and* `*mut` along with `isize` *and* `usize`. Conveniently, the backends and CTFE already handle this, since MIR's `BinOp::Offset` [already supports all four combinations](adaac6b166/compiler/rustc_const_eval/src/transform/validate.rs (L523-L528)).
To demonstrate the difference, I added some `mir-opt/pre-codegen/` tests around slice indexing. Here's the difference to `[T]::get_mut`, since it uses `<*mut _>::add` internally:
```diff
`@@` -79,30 +70,21 `@@` fn slice_get_mut_usize(_1: &mut [u32], _2: usize) -> Option<&mut u32> {
StorageLive(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageLive(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
_9 = _8 as *mut u32 (PtrToPtr); // scope 11 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _13 = _2 as isize (IntToInt); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageLive(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _15 = _9 as *const u32 (Pointer(MutToConstPointer)); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _14 = Offset(move _15, _13); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_15); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- _7 = move _14 as *mut u32 (PtrToPtr); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_14); // scope 15 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
- StorageDead(_13); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
+ _7 = Offset(_9, _2); // scope 13 at $SRC_DIR/core/src/ptr/mut_ptr.rs:LL:COL
StorageDead(_9); // scope 6 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_12); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
StorageDead(_11); // scope 3 at $SRC_DIR/core/src/slice/index.rs:LL:COL
```
1c1c8e442a (diff-a841b6a4538657add3f39bc895744331453d0625e7aace128b1f604f0b63c8fdR80)
Fixes https://github.com/rust-lang/rust/issues/110912
Checking `flavor == RlibFlavor::Normal` was accidentally lost in
601fc8b36b1962285e371cf3c54eeb3b1b9b3a74
https://github.com/rust-lang/rust/pull/105601
That caused combining +whole-archive and +bundle link modifiers on
non-rlib crates to fail with a confusing error message saying that
combination is unstable for rlibs. In particular, this caused the
build to fail when +whole-archive was used on staticlib crates, even
though +whole-archive effectively does nothing on non-bin crates because
the final linker invocation is left to an external build system.
Nicer ICE for #67981
Provides a slightly nicer ICE for #67981, documenting the problem. A proper fix will be necessary before `#![feature(unsized_fn_params)]` can be stabilized.
The problem is that the design of the `"rust-call"` ABI is fundamentally not compatible with `unsized_fn_params`. `"rust-call"` functions need to collect their arguments into a tuple, but if the arguments are not `Sized`, said tuple is potentially not even a valid type—and if it is, it requires `alloca` to create.
``@rustbot`` label +A-abi +A-codegen +F-unboxed_closures +F-unsized_fn_params
Fixes#35785 by converting non UTF-8 linker output to Unicode using the OEM code page.
Before:
```text
= note: Non-UTF-8 output: LINK : fatal error LNK1181: cannot open input file \'m\x84rchenhaft.obj\'\r\n
```
After:
```text
= note: LINK : fatal error LNK1181: cannot open input file 'märchenhaft.obj'
```
The difference is more dramatic if using a non-ascii language pack for Visual Studio.
Rollup of 10 pull requests
Successful merges:
- #108760 (Add lint to deny diagnostics composed of static strings)
- #109444 (Change tidy error message for TODOs)
- #110419 (Spelling library)
- #110550 (Suggest deref on comparison binop RHS even if type is not Copy)
- #110641 (Add new rustdoc book chapter to describe in-doc settings)
- #110798 (pass `unused_extern_crates` in `librustdoc::doctest::make_test`)
- #110819 (simplify TrustedLen impls)
- #110825 (diagnostics: add test case for already-solved issue)
- #110835 (Make some region folders a little stricter.)
- #110847 (rustdoc-json: Time serialization.)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
They're semantically the same, so this means the backends don't need to handle the intrinsic and means fewer MIR basic blocks in pointer arithmetic code.
Report allocation errors as panics
OOM is now reported as a panic but with a custom payload type (`AllocErrorPanicPayload`) which holds the layout that was passed to `handle_alloc_error`.
This should be review one commit at a time:
- The first commit adds `AllocErrorPanicPayload` and changes allocation errors to always be reported as panics.
- The second commit removes `#[alloc_error_handler]` and the `alloc_error_hook` API.
ACP: https://github.com/rust-lang/libs-team/issues/192Closes#51540Closes#51245
Add offset_of! macro (RFC 3308)
Implements https://github.com/rust-lang/rfcs/pull/3308 (tracking issue #106655) by adding the built in macro `core::mem::offset_of`. Two of the future possibilities are also implemented:
* Nested field accesses (without array indexing)
* DST support (for `Sized` fields)
I wrote this a few months ago, before the RFC merged. Now that it's merged, I decided to rebase and finish it.
cc `@thomcc` (RFC author)
Support AIX-style archive type
Reading facility of AIX big archive has been supported by `object` since 0.30.0.
Writing facility of AIX big archive has already been supported by `ar_archive_writer`, but we need to bump the version to support the new archive type enum.
Add `rustc_fluent_macro` to decouple fluent from `rustc_macros`
Fluent, with all the icu4x it brings in, takes quite some time to compile. `fluent_messages!` is only needed in further downstream rustc crates, but is blocking more upstream crates like `rustc_index`. By splitting it out, we allow `rustc_macros` to be compiled earlier, which speeds up `x check compiler` by about 5 seconds (and even more after the needless dependency on `serde_json` is removed from `rustc_data_structures`).
Encode hashes as bytes, not varint
In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash.
Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true`
Before:
```
( 1) 373418203 (53.7%, 53.7%): 1
( 2) 196240113 (28.2%, 81.9%): 3
( 3) 108157958 (15.6%, 97.5%): 2
( 4) 17213120 ( 2.5%, 99.9%): 4
( 5) 223614 ( 0.0%,100.0%): 9
( 6) 216262 ( 0.0%,100.0%): 10
( 7) 15447 ( 0.0%,100.0%): 5
( 8) 3633 ( 0.0%,100.0%): 19
( 9) 3030 ( 0.0%,100.0%): 8
( 10) 1167 ( 0.0%,100.0%): 18
( 11) 1032 ( 0.0%,100.0%): 7
( 12) 1003 ( 0.0%,100.0%): 6
( 13) 10 ( 0.0%,100.0%): 16
( 14) 10 ( 0.0%,100.0%): 17
( 15) 5 ( 0.0%,100.0%): 12
( 16) 4 ( 0.0%,100.0%): 14
```
After:
```
( 1) 372939136 (53.7%, 53.7%): 1
( 2) 196240140 (28.3%, 82.0%): 3
( 3) 108014969 (15.6%, 97.5%): 2
( 4) 17192375 ( 2.5%,100.0%): 4
( 5) 435 ( 0.0%,100.0%): 5
( 6) 83 ( 0.0%,100.0%): 18
( 7) 79 ( 0.0%,100.0%): 10
( 8) 50 ( 0.0%,100.0%): 9
( 9) 6 ( 0.0%,100.0%): 19
```
The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
Fluent, with all the icu4x it brings in, takes quite some time to
compile. `fluent_messages!` is only needed in further downstream rustc
crates, but is blocking more upstream crates like `rustc_index`. By
splitting it out, we allow `rustc_macros` to be compiled earlier, which
speeds up `x check compiler` by about 5 seconds (and even more after the
needless dependency on `serde_json` is removed from
`rustc_data_structures`).
Preserve argument indexes when inlining MIR
We store argument indexes on VarDebugInfo. Unlike the previous method of relying on the variable index to know whether a variable is an argument, this survives MIR inlining.
We also no longer check if var.source_info.scope is the outermost scope. When a function gets inlined, the arguments to the inner function will no longer be in the outermost scope. What we care about though is whether they were in the outermost scope prior to inlining, which we know by whether we assigned an argument index.
Fixes#83217
I considered using `Option<NonZeroU16>` instead of `Option<u16>` to store the index. I didn't because `TypeFoldable` isn't implemented for `NonZeroU16` and because it looks like due to padding, it currently wouldn't make any difference. But I indexed from 1 anyway because (a) it'll make it easier if later it becomes worthwhile to use a `NonZeroU16` and because the arguments were previously indexed from 1, so it made for a smaller change.
This is my first PR on rust-lang/rust, so apologies if I've gotten anything not quite right.