rust/compiler/rustc_codegen_ssa
bors ceeb5ade20 Auto merge of #93718 - thomcc:used-macho, r=pnkfelix
Only compile #[used] as llvm.compiler.used for ELF targets

This returns `#[used]` to how it worked prior to the LLVM 13 update. The intention is not that this is a stable promise.

I'll add tests later today. The tests will test things that we don't actually promise, though.

It's a deliberately small patch, mostly comments. And assuming it's reviewed and lands in time, IMO it should at least be considered for uplifting to beta (so that it can be in 1.59), as the change broke many crates in the ecosystem, even if they are relying on behavior that is not guaranteed.

# Background

LLVM has two ways of preventing removal of an unused variable: `llvm.compiler.used`, which must be present in object files, but allows the linker to remove the value, and `llvm.used` which is supposed to apply to the linker as well, if possible.

Prior to LLVM 13, `llvm.used` and `llvm.compiler.used` were the same on ELF targets, although they were different elsewhere. Prior to our update to LLVM 13, we compiled `#[used]` using `llvm.used` unconditionally, even though we only ever promised behavior like `llvm.compiler.used`.

In LLVM 13, ELF targets gained some support for preventing linker removal of `llvm.used` via the SHF_RETAIN section flag. This has some compatibility issues though: Concretely: some older versions `ld.gold` (specifically ones prior to v2.36, released in Jan 2021) had a bug where it would fail to place a `#[used] #[link_section = ".init_array"]` static in between `__init_array_start`/`__init_array_end`, leading to code that does this failing to run a static constructor. This is technically not a thing we guarantee will work, is a common use case, and is needed in `libstd` (for example, to get access to `std::env::args()` even if Rust does not control `main`, such as when in a `cdylib` crate).

As a result, when updating to LLVM 13, we unconditionally switched to using `llvm.compiler.used`, which mirror the guarantees we make for `#[used]` and doesn't require the latest ld.gold. Unfortunately, this happened to break quite a bit of things in the ecosystem, as non-ELF targets had come to rely on `#[used]` being slightly stronger. In particular, there are cases where it will even break static constructors on these targets[^initinit] (and in fact, breaks way more use cases, as Mach-O uses special sections as an interface to the OS/linker/loader in many places).

As a result, we only switch to `llvm.compiler.used` on ELF[^elfish] targets. The rationale here is:

1. It is (hopefully) identical to the semantics we used prior to the LLVM13 update as prior to that update we unconditionally used `llvm.used`, but on ELF `llvm.used` was the same as `llvm.compiler.used`.

2. It seems to be how Clang compiles this, and given that they have similar (but stronger) compatibility promises, that makes sense.

[^initinit]: For Mach-O targets: It is not always guaranteed that `__DATA,__mod_init_func` is a GC root if it does not have the `S_MOD_INIT_FUNC_POINTERS` flag which we cannot add. In most cases, when ld64 transformed this section into `__DATA_CONST,__mod_init_func` it gets applied, but it's not clear that that is intentional (let alone guaranteed), and the logic is complex enough that it probably happens sometimes, and people in the wild report it occurring.

[^elfish]: Actually, there's not a great way to tell if it's ELF, so I've approximated it.

This is pretty ad-hoc and hacky! We probably should have a firmer set of guarantees here, but this change should relax the pressure on coming up with that considerably, returning it to previous levels.

---

Unsure who should review so leaving it open, but for sure CC `@nikic`
2022-07-21 06:59:32 +00:00
..

Please read the rustc-dev-guide chapter on Backend Agnostic Codegen.