Add -Z no-unique-section-names to reduce ELF header bloat.
This change adds a new compiler flag that can help reduce the size of ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most targets), the LLVM backend will generate different section names for each function. For example, a function `func` would generate a section called `.text.func`. Normally this is fine because the linker will merge all those sections into a single one in the binary. However, starting with [LLVM 12](https://github.com/llvm/llvm-project/commit/ee5d1a04), the backend will also generate unique section names for exception handling, resulting in thousands of `.gcc_except_table.*` sections ending up in the final binary because some linkers like LLD don't currently merge or strip these EH sections (see discussion [here](https://reviews.llvm.org/D83655)). This can bloat the ELF headers and string table significantly in binaries that contain many functions.
The new option is analogous to Clang's `-fno-unique-section-names`, and instructs LLVM to generate the same `.text` and `.gcc_except_table` section for each function, resulting in a smaller final binary.
The motivation to add this new option was because we have a binary that ended up with so many ELF sections (over 65,000) that it broke some existing ELF tools, which couldn't handle so many sections.
Here's our old binary:
```
$ readelf --sections old.elf | head -1
There are 71746 section headers, starting at offset 0x2a246508:
$ readelf --sections old.elf | grep shstrtab
[71742] .shstrtab STRTAB 0000000000000000 2977204c ad44bb 00 0 0 1
```
That's an 11MB+ string table. Here's the new binary using this option:
```
$ readelf --sections new.elf | head -1
There are 43 section headers, starting at offset 0x29143ca8:
$ readelf --sections new.elf | grep shstrtab
[40] .shstrtab STRTAB 0000000000000000 29143acc 0001db 00 0 0 1
```
The whole binary size went down by over 20MB, which is quite significant.
Implement -Z location-detail flag
This PR implements the `-Z location-detail` flag as described in https://github.com/rust-lang/rfcs/pull/2091 .
`-Z location-detail=val` controls what location details are tracked when using `caller_location`. This allows users to control what location details are printed as part of panic messages, by allowing them to exclude any combination of filenames, line numbers, and column numbers. This option is intended to provide users with a way to mitigate the size impact of `#[track_caller]`.
Some measurements of the savings of this approach on an embedded binary can be found here: https://github.com/rust-lang/rust/issues/70579#issuecomment-942556822 .
Closes#70580 (unless people want to leave that open as a place for discussion of further improvements).
This is my first real PR to rust, so any help correcting mistakes / understanding side effects / improving my tests is appreciated :)
I have one question: RFC 2091 specified this as a debugging option (I think that is what -Z implies?). Does that mean this can never be stabilized without a separate MCP? If so, do I need to submit an MCP now, or is the initial RFC specifying this option sufficient for this to be merged as is, and then an MCP would be needed for eventual stabilization?
Report fatal lexer errors in `--cfg` command line arguments
Fixes#89358. The erroneous behavior was apparently introduced by `@Mark-Simulacrum` in a678e31911; the idea is to silence individual parser errors and instead emit one catch-all error message after parsing. However, for the example in #89358, a fatal lexer error is created here:
edebf77e00/compiler/rustc_parse/src/lexer/mod.rs (L340-L349)
This fatal error aborts the compilation, and so the call to `new_parser_from_source_str()` never returns and the catch-all error message is never emitted. I have therefore changed the `SilentEmitter` to silence only non-fatal errors; with my changes, for the rustc invocation described in #89358:
```sh
rustc --cfg "abc\""
```
I get the following output:
```
error[E0765]: unterminated double quote string
|
= note: this error occurred on the command line: `--cfg=abc"`
```
Add support for artifact size profiling
This adds support for profiling artifact file sizes (incremental compilation artifacts and query cache to begin with).
Eventually we want to track this in perf.rlo so we can ensure that file sizes do not change dramatically on each pull request.
This relies on support in measureme: https://github.com/rust-lang/measureme/pull/169. Once that lands we can update this PR to not point to a git dependency.
This was worked on together with `@michaelwoerister.`
r? `@wesleywiser`
This change adds a new compiler flag that can help reduce the size of
ELF binaries that contain many functions.
By default, when enabling function sections (which is the default for most
targets), the LLVM backend will generate different section names for each
function. For example, a function "func" would generate a section called
".text.func". Normally this is fine because the linker will merge all those
sections into a single one in the binary. However, starting with LLVM 12
(llvm/llvm-project@ee5d1a0), the backend will
also generate unique section names for exception handling, resulting in
thousands of ".gcc_except_table.*" sections ending up in the final binary
because some linkers don't currently merge or strip these EH sections.
This can bloat the ELF headers and string table significantly in
binaries that contain many functions.
The new option is analogous to Clang's -fno-unique-section-names, and
instructs LLVM to generate the same ".text" and ".gcc_except_table"
section for each function, resulting in smaller object files and
potentially a smaller final binary.
Correct decoding of foreign expansions during incr. comp.
Fixes https://github.com/rust-lang/rust/issues/74946
The original issue was due to a wrong assertion in `expn_hash_to_expn_id`.
The secondary issue was due to a mismatch between the encoding and decoding paths for expansions that are created after the TyCtxt is created.
This largely involves implementing the options debug-info-for-profiling
and profile-sample-use and forwarding them on to LLVM.
AutoFDO can be used on x86-64 Linux like this:
rustc -O -Cdebug-info-for-profiling main.rs -o main
perf record -b ./main
create_llvm_prof --binary=main --out=code.prof
rustc -O -Cprofile-sample-use=code.prof main.rs -o main2
Now `main2` will have feedback directed optimization applied to it.
The create_llvm_prof tool can be obtained from this github repository:
https://github.com/google/autofdoFixes#64892.
Couple of changes to FileSearch and SearchPath
* Turn a couple of regular comments into doc comments
* Move `get_tools_search_paths` from `FileSearch` to `Session`
* Use Lrc instead of Option to avoid duplication of a `SearchPath`
Introduce -Z remap-cwd-prefix switch
This switch remaps any absolute paths rooted under the current
working directory to a new value. This includes remapping the
debug info in `DW_AT_comp_dir` and `DW_AT_decl_file`.
Importantly, this flag does not require passing the current working
directory to the compiler, such that the command line can be
run on any machine (with the same input files) and produce the
same results. This is critical property for debugging compiler
issues that crop up on remote machines.
This is based on adetaylor's dbc4ae7cba
Major Change Proposal: https://github.com/rust-lang/compiler-team/issues/450
Discussed on #38322. Would resolve issue #87325.
Add -Z panic-in-drop={unwind,abort} command-line option
This PR changes `Drop` to abort if an unwinding panic attempts to escape it, making the process abort instead. This has several benefits:
- The current behavior when unwinding out of `Drop` is very unintuitive and easy to miss: unwinding continues, but the remaining drops in scope are simply leaked.
- A lot of unsafe code doesn't expect drops to unwind, which can lead to unsoundness:
- https://github.com/servo/rust-smallvec/issues/14
- https://github.com/bluss/arrayvec/issues/3
- There is a code size and compilation time cost to this: LLVM needs to generate extra landing pads out of all calls in a drop implementation. This can compound when functions are inlined since unwinding will then continue on to process drops in the callee, which can itself unwind, etc.
- Initial measurements show a 3% size reduction and up to 10% compilation time reduction on some crates (`syn`).
One thing to note about `-Z panic-in-drop=abort` is that *all* crates must be built with this option for it to be sound since it makes the compiler assume that dropping `Box<dyn Any>` will never unwind.
cc https://github.com/rust-lang/lang-team/issues/97
We can instead if either the LHS or RHS types contain
`TyKind::Error`. In addition to covering the case where
we would have previously updated `if_let_suggestions`, this might
also prevent redundant errors in other cases as well.
Remove `Session.trait_methods_not_found`
Instead, avoid registering the problematic well-formed obligation
to begin with. This removes global untracked mutable state,
and avoids potential issues with incremental compilation.
Remove `Session.used_attrs` and move logic to `CheckAttrVisitor`
Instead of updating global state to mark attributes as used,
we now explicitly emit a warning when an attribute is used in
an unsupported position. As a side effect, we are to emit more
detailed warning messages (instead of just a generic "unused" message).
`Session.check_name` is removed, since its only purpose was to mark
the attribute as used. All of the callers are modified to use
`Attribute.has_name`
Additionally, `AttributeType::AssumedUsed` is removed - an 'assumed
used' attribute is implemented by simply not performing any checks
in `CheckAttrVisitor` for a particular attribute.
We no longer emit unused attribute warnings for the `#[rustc_dummy]`
attribute - it's an internal attribute used for tests, so it doesn't
mark sense to treat it as 'unused'.
With this commit, a large source of global untracked state is removed.
Instead, avoid registering the problematic well-formed obligation
to begin with. This removes global untracked mutable state,
and avoids potential issues with incremental compilation.
Instead of updating global state to mark attributes as used,
we now explicitly emit a warning when an attribute is used in
an unsupported position. As a side effect, we are to emit more
detailed warning messages (instead of just a generic "unused" message).
`Session.check_name` is removed, since its only purpose was to mark
the attribute as used. All of the callers are modified to use
`Attribute.has_name`
Additionally, `AttributeType::AssumedUsed` is removed - an 'assumed
used' attribute is implemented by simply not performing any checks
in `CheckAttrVisitor` for a particular attribute.
We no longer emit unused attribute warnings for the `#[rustc_dummy]`
attribute - it's an internal attribute used for tests, so it doesn't
mark sense to treat it as 'unused'.
With this commit, a large source of global untracked state is removed.
Fixes#85019
A `SourceFile` created during compilation may have a relative
path (e.g. if rustc itself is invoked with a relative path).
When we write out crate metadata, we convert all relative paths
to absolute paths using the current working direction.
However, the working directory is not included in the crate hash.
This means that the crate metadata can change while the crate
hash remains the same. Among other problems, this can cause a
fingerprint mismatch ICE, since incremental compilation uses
the crate metadata hash to determine if a foreign query is green.
This commit moves the field holding the working directory from
`Session` to `Options`, including it as part of the crate hash.
Plugin interface cleanup
The first commit performs two uncontroversial cleanups. The second commit removes `#[plugin_registrar]` and instead requires you to export a `__rustc_plugin_registrar` function, this will require a change to servo's script_plugins (cc `@jdm)`
Fix feature gate checking of static-nobundle and native_link_modifiers
Feature native_link_modifiers_bundle don't need feature static-nobundle
to work.
Also check the feature gates when using native_link_modifiers from command line options. Current nighly compiler don't check those feature gate.
```
> touch lib.rs
> rustc +nightly lib.rs -L /usr/lib -l static:+bundle=dl --crate-type=rlib
> rustc +nightly lib.rs -L /usr/lib -l dylib:+as-needed=dl --crate-type=dylib -Ctarget-feature=-crt-static
> rustc +nightly lib.rs -L /usr/lib -l static:-bundle=dl --crate-type=rlib
error[E0658]: kind="static-nobundle" is unstable
|
= note: see issue #37403 <https://github.com/rust-lang/rust/issues/37403> for more information
= help: add `#![feature(static_nobundle)]` to the crate attributes to enable
error: aborting due to previous error
For more information about this error, try `rustc --explain E0658`.
```
First found this in https://github.com/rust-lang/rust/pull/85600#discussion_r676612655
Fix overflow in rustc happening if the `err_count()` is reduced in a stage.
This can happen if stashed diagnostics are removed or replaced with fewer errors. The semantics stay the same if built without overflow checks. Fixes#84219.
Background: I came across this independently by running `RUSTFLAGS="-C overflow-checks=on" ./x.py test`. Fixing this will allow us to move on and find further overflow errors with testing or fuzzing.
Add back -Zno-profiler-runtime
This was removed by #85284 in favor of `-Zprofiler-runtime=<name>`.However the suggested `-Zprofiler-runtime=None` doesn't work because`None` is treated as a crate name.
This was removed by #85284 in favor of -Zprofiler-runtime=<name>.
However the suggested -Zprofiler-runtime=None doesn't work because
"None" is treated as a crate name.
Add flag to configure `large_assignments` lint
The `large_assignments` lints detects moves over specified limit. The
limit is configured through `move_size_limit = "N"` attribute placed at
the root of a crate. When attribute is absent, the lint is disabled.
Make it possible to enable the lint without making any changes to the
source code, through a new flag `-Zmove-size-limit=N`. For example, to
detect moves exceeding 1023 bytes in a cargo crate, including all
dependencies one could use:
```
$ env RUSTFLAGS=-Zmove-size-limit=1024 cargo build -vv
```
Lint tracking issue #83518.
Remove unstable `--pretty` flag
It doesn't do anything `--unpretty` doesn't, and due to a bug, also
didn't show up in `--help`. I don't think there's any reason to keep it
around, I haven't seen anyone using it.
Closes https://github.com/rust-lang/rust/issues/36473.
Allow combining -Cprofile-generate and -Cpanic=unwind when targeting MSVC.
The LLVM limitation that previously prevented this has been fixed in LLVM 9 which is older than the oldest LLVM version we currently support.
Fixes https://github.com/rust-lang/rust/issues/61002.
r? ``@nagisa`` (or anyone else from ``@rust-lang/wg-llvm)``
Profile incremental compilation hashing fingerprints
Adds profiling instrumentation for the hashing of incremental compilation fingerprints per query.
This will eventually feed into the `measureme` and `rustc-perf` infrastructure for tracking if computing hashes changes over time.
TODOs:
* [x] Address the FIXME where we are including node interning in the hash timing.
* [ ] Update measureme/summarize to handle this new data: https://github.com/rust-lang/measureme/pull/166
* [ ] ~Update rustc-perf to handle the new data from measureme~ (will be done at a later time)
r? `@ghost`
cc `@michaelwoerister`
MSVC.
The LLVM limitation that previously prevented this has been fixed in LLVM
9 which is older than the oldest LLVM version we currently support.
See https://github.com/rust-lang/rust/issues/61002.
target abi
Implement cfg(target_abi) (RFC 2992)
Add an `abi` field to `TargetOptions`, defaulting to "". Support using
`cfg(target_abi = "...")` for conditional compilation on that field.
Gated by `feature(cfg_target_abi)`.
Add a test for `target_abi`, and a test for the feature gate.
Add `target_abi` to tidy as a platform-specific cfg.
Update targets to use `target_abi`
All eabi targets have `target_abi = "eabi".`
All eabihf targets have `target_abi = "eabihf"`.
`armv6_unknown_freebsd` and `armv7_unknown_freebsd` have `target_abi = "eabihf"`.
All abi64 targets have `target_abi = "abi64"`.
All ilp32 targets have `target_abi = "ilp32"`.
All softfloat targets have `target_abi = "softfloat"`.
All *-uwp-windows-* targets have `target_abi = "uwp"`.
All spe targets have `target_abi = "spe"`.
All macabi targets have `target_abi = "macabi"`.
aarch64-apple-ios-sim has `target_abi = "sim"`.
`x86_64-fortanix-unknown-sgx` has `target_abi = "fortanix"`.
`x86_64-unknown-linux-gnux32` has `target_abi = "x32"`.
Add FIXME entries for targets for which existing values need to change
once `cfg_target_abi` becomes stable. (All of them are tier 3 targets.)
Add a test for `target_abi` in `--print cfg`.
Add an `abi` field to `TargetOptions`, defaulting to "". Support using
`cfg(target_abi = "...")` for conditional compilation on that field.
Gated by `feature(cfg_target_abi)`.
Add a test for `target_abi`, and a test for the feature gate.
Add `target_abi` to tidy as a platform-specific cfg.
This does not add an abi to any existing target.
The `large_assignments` lints detects moves over specified limit. The
limit is configured through `move_size_limit = "N"` attribute placed at
the root of a crate. When attribute is absent, the lint is disabled.
Make it possible to enable the lint without making any changes to the
source code, through a new flag `-Zmove-size-limit=N`. For example, to
detect moves exceeding 1023 bytes in a cargo crate, including all
dependencies one could use:
```
$ env RUSTFLAGS=-Zmove-size-limit=1024 cargo build -vv
```
add `track_path::path` fn for usage in `proc_macro`s
Adds a way to declare a dependency on external files without including them, to either re-trigger the build of a file as well as covering the use case of including dependencies within the `rustc` invocation, such that tools like `sccache`/`cachepot` are able to handle references to external files which are not included.
Ref #73921
Add support for leaf function frame pointer elimination
This PR adds ability for the target specifications to specify frame
pointer emission type that's not just “always” or “whatever cg decides”.
In particular there's a new mode that allows omission of the frame
pointer for leaf functions (those that don't call any other functions).
We then set this new mode for Aarch64-based Apple targets.
Fixes#86196
This PR adds ability for the target specifications to specify frame
pointer emission type that's not just “always” or “whatever cg decides”.
In particular there's a new mode that allows omission of the frame
pointer for leaf functions (those that don't call any other functions).
We then set this new mode for Aarch64-based Apple targets.
Fixes#86196
This creates a CSV with name "closure_profile_XXXXX.csv", where the
variable part is the process id of the compiler.
To profile a cargo project you can run one of the following depending on
if you're compiling a library or a binary:
```
cargo +stage1 rustc --lib -- -Zprofile-closures
cargo +stage1 rustc --bin -- -Zprofile-closures
```
Allow loading of llvm plugins on nightly
Based on a discussion in #82734 / with `@wsmoses.`
Mainly moves [this](0149bc4e7e) behind a -Z flag, so it can only be used on nightly,
as requested by `@nagisa` in https://github.com/rust-lang/rust/issues/82734#issuecomment-835863940
This change allows loading of llvm plugins like Enzyme.
Right now it also requires a shared library LLVM build of rustc for symbol resolution.
```rust
// test.rs
extern { fn __enzyme_autodiff(_: usize, ...) -> f64; }
fn square(x : f64) -> f64 {
return x * x;
}
fn main() {
unsafe {
println!("Hello, world {} {}!", square(3.0), __enzyme_autodiff(square as usize, 3.0));
}
}
```
```
./rustc test.rs -Z llvm-plugins="./LLVMEnzyme-12.so" -C passes="enzyme"
./test
Hello, world 9 6!
```
I will try to figure out how to simplify the usage and get this into stable in a later iteration,
but having this on nightly will already help testing further steps.
Emit warnings for unused fields in custom targets.
Add a warning which lists any fields in a custom target `json` file that aren't used. Currently unrecognized fields are ignored so, for example, a typo in the `json` will silently produce a target which isn't the one intended.
Provide option for specifying the profiler runtime
Currently, if `-Zinstrument-coverage` is enabled, the target is linked
against the `library/profiler_builtins` crate (which pulls in LLVM's
compiler-rt runtime).
This option enables backends to specify an alternative runtime crate for
handling injected instrumentation calls.
Fix force-warns to allow dashes.
The `--force-warns` flag was not allowing lint names with dashes, only supporting underscores. This changes it to allow dashes to match the behavior of the A/W/D/F flags.