This is a big map that ends up inside of a `CrateContext` during translation for
all codegen units. This means that any change to the map may end up causing an
incremental recompilation of a codegen unit! In order to reduce the amount of
dependencies here between codegen units and the actual input crate this commit
refactors dealing with exported symbols and such into various queries.
The new queries are largely based on existing queries with filled out
implementations for the local crate in addition to external crates, but the main
idea is that while translating codegen untis no unit needs the entire set of
exported symbols, instead they only need queries about particulare `DefId`
instances every now and then.
The linking stage, however, still generates a full list of all exported symbols
from all crates, but that's going to always happen unconditionally anyway, so no
news there!
This commit moves the `collect_and_partition_translation_items` function into a
query on `TyCtxt` instead of a free function in trans, allowing us to track
dependencies and such of the function.
This commit refactors the `collect_crate_translation_items` function to only
require the `TyCtxt` instead of a `SharedCrateContext` in preparation for
query-ifying this portion of trans.
The SystemZ `LALR` instruction provides PC-relative addressing for
globals, but only to *even* addresses, so other compilers make sure that
such globals are always 2-byte aligned. In Clang, this is modeled with
`TargetInfo::MinGlobalAlign`, and `TargetOptions::min_global_align` now
serves the same purpose for rustc.
In Clang, the only targets that set this are SystemZ, Lanai, and NVPTX,
and the latter two don't have targets in rust master.
This commit started by moving methods from `CrateStore` to queries, but it ended
up necessitating some deeper refactorings to move more items in general to
queries.
Before this commit the *resolver* would walk over the AST and process foreign
modules (`extern { .. }` blocks) and collect `#[link]` annotations. It would
then also process the command line `-l` directives and such. This information
was then stored as precalculated lists in the `CrateStore` object for iterating
over later.
After this, commit, however, this pass no longer happens during resolution but
now instead happens through queries. A query for the linked libraries of a crate
will crawl the crate for `extern` blocks and then process the linkage
annotations at that time.
The symbol map is not good for incremental: it has inputs from every fn
in existence, and it will change if anything changes. One could imagine
cheating with the symbol-map and exempting it from the usual dependency
tracking, since the results are fully deterministic. Instead, I opted to
just add a per-CGU cache, on the premise that recomputing some symbol
names is not going to be so very expensive.
similar to GCC's __attribute((used))__. This attribute prevents LLVM from
optimizing away a non-exported symbol, within a compilation unit (object file),
when there are no references to it.
This is better explained with an example:
```
#[used]
static LIVE: i32 = 0;
static REFERENCED: i32 = 0;
static DEAD: i32 = 0;
fn internal() {}
pub fn exported() -> &'static i32 {
&REFERENCED
}
```
Without optimizations, LLVM pretty much preserves all the static variables and
functions within the compilation unit.
```
$ rustc --crate-type=lib --emit=obj symbols.rs && nm -C symbols.o
0000000000000000 t drop::h1be0f8f27a2ba94a
0000000000000000 r symbols::REFERENCED::hb3bdfd46050bc84c
0000000000000000 r symbols::DEAD::hc2ea8f9bd06f380b
0000000000000000 r symbols::LIVE::h0970cf9889edb56e
0000000000000000 T symbols::exported::h6f096c2b1fc292b2
0000000000000000 t symbols::internal::h0ac1aadbc1e3a494
```
With optimizations, LLVM will drop dead code. Here `internal` is dropped because
it's not a exported function/symbol (i.e. not `pub`lic). `DEAD` is dropped for
the same reason. `REFERENCED` is preserved, even though it's not exported,
because it's referenced by the `exported` function. Finally, `LIVE` survives
because of the `#[used]` attribute even though it's not exported or referenced.
```
$ rustc --crate-type=lib -C opt-level=3 --emit=obj symbols.rs && nm -C symbols.o
0000000000000000 r symbols::REFERENCED::hb3bdfd46050bc84c
0000000000000000 r symbols::LIVE::h0970cf9889edb56e
0000000000000000 T symbols::exported::h6f096c2b1fc292b2
```
Note that the linker knows nothing about `#[used]` and will drop `LIVE`
because no other object references to it.
```
$ echo 'fn main() {}' >> symbols.rs
$ rustc symbols.rs && nm -C symbols | grep LIVE
```
At this time, `#[used]` only works on `static` variables.
Every codegen unit gets its own local counter for generating new symbol
names. This makes bitcode and object files reproducible at the binary
level even when incremental compilation is used.
The `Linkage` enum in librustc_llvm got out of sync with the version in LLVM and it caused two variants of the #[linkage=""] attribute to break.
This adds the functions `LLVMRustGetLinkage` and `LLVMRustSetLinkage` which convert between the Rust Linkage enum and the LLVM one, which should stop this from breaking every time LLVM changes it.
Fixes#33992