Commit Graph

1698 Commits

Author SHA1 Message Date
Oli Scherer
4457ef2c6d Forbid old-style simd_shuffleN intrinsics 2023-08-03 09:29:00 +00:00
bors
abd3637e42 Auto merge of #105545 - erikdesjardins:ptrclean, r=bjorn3
cleanup: remove pointee types

This can't be merged until the oldest LLVM version we support uses opaque pointers, which will be the case after #114148. (Also note `-Cllvm-args="-opaque-pointers=0"` can technically be used in LLVM 15, though I don't think we should support that configuration.)

I initially hoped this would provide some minor perf win, but in https://github.com/rust-lang/rust/pull/105412#issuecomment-1341224450 it had very little impact, so this is only valuable as a cleanup.

As a followup, this will enable #96242 to be resolved.

r? `@ghost`

`@rustbot` label S-blocked
2023-08-01 19:44:17 +00:00
Matthias Krüger
58f963fb65
Rollup merge of #113717 - cuishuang:master, r=Nilstrieb
remove repetitive words
2023-07-31 22:49:47 +02:00
cui fliter
88c7b16e03 remove repetitive words
Signed-off-by: cui fliter <imcusg@gmail.com>
2023-07-31 16:13:02 +08:00
Nicholas Nethercote
c17c8dc78e Remove unnecessary semicolon. 2023-07-31 16:34:13 +10:00
Nicholas Nethercote
5673f47042 Clean up generate_lto_work.
This function has some shared code for the thin LTO and fat LTO cases,
but those cases have so little in common that it's actually clearer to
treat them fully separately.
2023-07-31 16:21:03 +10:00
Nicholas Nethercote
d404699fb1 Fix LLVM thread names on Windows.
PR #112946 tweaked the naming of LLVM threads, but messed things up
slightly, resulting in threads on Windows having names like `optimize
module {} regex.f10ba03eb5ec7975-cgu.0`.

This commit removes the extraneous `{} `.
2023-07-31 16:21:03 +10:00
Nicholas Nethercote
90ce358afa Introduce running_with_any_token closure.
It makes things a little clearer.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
3b44f5b0eb Use standard Rust capitalization rules for names containing "LTO". 2023-07-31 16:21:02 +10:00
Nicholas Nethercote
a08220bcab Tweak structure of the message loop.
The main loop has a *very* complex condition, which includes two
mentions of `codegen_state`. The body of the loop then immediately
switches on the `codegen_state`.

I find it easier to understand if it's a `loop` and we check for exit
conditions after switching on `codegen_state`. We end up with a tiny bit
of code duplication, but it's clear that (a) we never exit in the
`Ongoing` case, (b) we exit in the `Completed` state only if several
things are true (and there's interaction with LTO there), and (c) we
exit in the `Aborted` state if a couple of things are true. Also, the
exit conditions are all simple conjunctions.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
179bf19813 Tweak a loop condition.
This loop condition involves `codegen_state`, `work_items`, and
`running_with_own_token`. But the body of the loop cannot modify
`codegen_state`, so repeatedly checking it is unnecessary.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
d21d31cce7 Move maybe_start_llvm_timer's body into spawn_work.
The two functions are alway called together. This commit factors out the
repeated code.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
3517fe899e Remove CodegenContext::worker.
`CodegenContext` is immutable except for the `worker` field - we clone
`CodegenContext` in multiple places, changing the `worker` field each
time. It's simpler to move the `worker` field out of `CodegenContext`.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
4a120f33f7 Remove ExtraBackendMethods::spawn_thread.
It's no longer used, and `spawn_named_thread` is preferable, because
naming threads is helpful when profiling.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
e78fb95dfa Give the coordinator thread a name.
This is useful when profiling with a profiler like Samply.
2023-07-31 16:21:02 +10:00
Nicholas Nethercote
176610c2cd Remove some unused values in codegen_crate. 2023-07-31 16:21:02 +10:00
Nicholas Nethercote
8b9e3f0dd6 Remove an unnecessary pub. 2023-07-31 16:21:00 +10:00
Nicholas Nethercote
f81fe9d702 Rename MainThreadWorkerState.
The `Worker` is unnecessary, and just makes it longer than necessary.
2023-07-31 16:20:18 +10:00
Nicholas Nethercote
5bef04ed38 Rename things related to the main thread's operations.
It took me some time to understand how the main thread can lend a
jobserver token to an LLVM thread. This commit renames a couple of
things to make it clearer.

- Rename the `LLVMing` variant as `Lending`, because that is a clearer
  description of what is happening.
- Rename `running` as `running_with_own_token`, which makes it clearer
  that there might be one additional LLVM thread running (with a loaned
  token). Also add a comment to its definition.
2023-07-31 16:20:18 +10:00
Nicholas Nethercote
fd017d3c17 Add some assertions.
- Thin and fat LTO can't happen together.
- `NeedsLink` and (non-allocator) `Compiled` work item results can't
  happen together.
2023-07-31 16:20:18 +10:00
Nicholas Nethercote
4f598b852c Add comments to WorkItemResult.
And rename the `Compiled` variant as `Finished`, because that name makes
it clearer there is nothing left to do, contrasting nicely with the
`Needs*` variants.
2023-07-31 16:20:18 +10:00
Nicholas Nethercote
a8c71f0a15 Inline and remove submit_pre_codegened_module_to_llvm.
It has a single callsite, and provides little value.
2023-07-31 16:20:18 +10:00
Matthias Krüger
3ce90b1649 inline format!() args up to and including rustc_codegen_llvm 2023-07-30 14:22:50 +02:00
Erik Desjardins
04303cfb3a cg_ssa: remove pointee types and pointercast/bitcast-of-ptr 2023-07-29 13:18:20 -04:00
Matthias Krüger
c3cd05198a
Rollup merge of #113872 - nnethercote:tweak-cgu-sorting, r=pnkfelix
Tweak CGU sorting in a couple of places.

In `base.rs`, tweak how the CGU size interleaving works. Since #113777, it's much more common to have multiple CGUs with identical sizes. With the existing code these same-sized items ended up in the opposite-to-desired order due to the stable sorting. The code now starts with a reverse sort (like is done in `partitioning.rs`) which gives the behaviour we want. This doesn't matter much for perf, but makes profiles in `samply` look more like what we expect.

In `partitioning.rs`, we can use `sort_by_key` instead of `sort_by_cached_key` because `CGU::size_estimate()` is cheap. (There is an identical CGU sort earlier in that function that already uses `sort_by_key`.)

r? `@pnkfelix`
2023-07-27 06:04:12 +02:00
Oli Scherer
2b444672e1 Use a builder instead of boolean/option arguments 2023-07-25 13:51:15 +00:00
Matthias Krüger
abde841f0a remove redundant clones 2023-07-23 09:48:07 +02:00
bors
1c44af9b79 Auto merge of #111836 - calebzulawski:target-feature-closure, r=workingjubilee
Fix #[inline(always)] on closures with target feature 1.1

Fixes #108655.  I think this is the most obvious solution that isn't overly complicated.  The comment includes more justification, but I think this is likely better than demoting the `#[inline(always)]` to `#[inline]`, since existing code is unaffected.
2023-07-23 00:16:03 +00:00
bors
d908a5b08e Auto merge of #113892 - RalfJung:uninit-undef-poison, r=wesleywiser
clarify MIR uninit vs LLVM undef/poison

In [this LLVM discussion](https://discourse.llvm.org/t/rfc-load-instruction-uninitialized-memory-semantics/67481) I learned that mapping our uninitialized memory in MIR to poison in LLVM would be quite problematic due to the lack of a byte type. I am not sure where to write down this insight but this seems like a reasonable start.
2023-07-21 19:32:17 +00:00
Matthias Krüger
b1d1e99c22
Rollup merge of #113780 - dtolnay:printkindpath, r=b-naber
Support `--print KIND=PATH` command line syntax

As is already done for `--emit KIND=PATH` and `-L KIND=PATH`.

In the discussion of #110785, it was pointed out that `--print KIND=PATH` is nicer than trying to apply the single global `-o` path to `--print`'s output, because in general there can be multiple print requests within a single rustc invocation, and anyway `-o` would already be used for a different meaning in the case of `link-args` and `native-static-libs`.

I am interested in using `--print cfg=PATH` in Buck2. Currently Buck2 works around the lack of support for `--print KIND=PATH` by [indirecting through a Python wrapper script](d43cf3a51a/prelude/rust/tools/get_rustc_cfg.py) to redirect rustc's stdout into the location dictated by the build system.

From skimming Cargo's usages of `--print`, it definitely seems like it would benefit from `--print KIND=PATH` too. Currently it is working around the lack of this by inserting `--crate-name=___ --print=crate-name` so that it can look for a line containing `___` as a delimiter between the 2 other `--print` informations it actually cares about. This is commented as a "HACK" and "abuse". 31eda6f7c3/src/cargo/core/compiler/build_context/target_info.rs (L242) (FYI `@weihanglo` as you dealt with this recently in https://github.com/rust-lang/cargo/pull/11633.)

Mentioning reviewers active in #110785: `@fee1-dead` `@jyn514` `@bjorn3`
2023-07-21 06:52:28 +02:00
Matthias Krüger
2734b5ada9
Rollup merge of #113723 - khei4:khei4/llvm-stats, r=oli-obk,nikic
Resurrect: rustc_llvm: Add a -Z `print-codegen-stats` option to expose LLVM statistics.

This resurrects PR https://github.com/rust-lang/rust/pull/104000, which has sat idle for a while. And I want to see the effect of stack-move optimizations on LLVM (like https://reviews.llvm.org/D153453) :).

I have applied the changes requested by `@oli-obk` and `@nagisa`  https://github.com/rust-lang/rust/pull/104000#discussion_r1014625377 and https://github.com/rust-lang/rust/pull/104000#discussion_r1014642482 in the latest commits.

r? `@oli-obk`

-----

LLVM has a neat [statistics](https://llvm.org/docs/ProgrammersManual.html#the-statistic-class-stats-option) feature that tracks how often optimizations kick in. It's very handy for optimization work. Since we expose the LLVM pass timings, I thought it made sense to expose the LLVM statistics too.

-----
(Edit: fix broken link
(Edit2: fix segmentation fault and use malloc

If `rustc` is built with
```toml
[llvm]
assertions = true
```
Then you can see like
```
rustc +stage1 -Z print-codegen-stats -C opt-level=3  tmp.rs
===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===
         3 aa                           - Number of MayAlias results
       193 aa                           - Number of MustAlias results
       531 aa                           - Number of NoAlias results
...
```

And the current default build emits only
```
$ rustc +stage1 -Z print-codegen-stats -C opt-level=3  tmp.rs
===-------------------------------------------------------------------------===
                          ... Statistics Collected ...
===-------------------------------------------------------------------------===
$
```
This might be better to emit the message to tell assertion flag necessity, but now I can't find how to do that...
2023-07-21 06:52:27 +02:00
David Tolnay
26fd6b15b0
Add note about writing native-static-libs to file 2023-07-20 11:04:32 -07:00
David Tolnay
dcfe94a009
Implement printing to file for link-args and native-static-libs 2023-07-20 11:04:31 -07:00
David Tolnay
6e734fce63
Implement printing to file in llvm_util 2023-07-20 11:04:31 -07:00
David Tolnay
c80cbe4bae
Implement printing to file in codegen_backend.print 2023-07-20 11:04:31 -07:00
David Tolnay
c0dc0c6875
Store individual output file name with every PrintRequest 2023-07-20 11:04:30 -07:00
Ralf Jung
41a73d8251 clarify MIR uninit vs LLVM undef/poison 2023-07-20 18:43:54 +02:00
Matthias Krüger
8c17e0701e
Rollup merge of #113529 - oli-obk:simd_shuffle_evaluated, r=wesleywiser
Permit pre-evaluated constants in simd_shuffle

fixes https://github.com/rust-lang/rust/issues/113500
2023-07-20 17:19:32 +02:00
bors
b14fd2359f Auto merge of #113695 - bjorn3:fix_rlib_cdylib_metadata_handling, r=pnkfelix,petrochenkov
Verify that all crate sources are in sync

This ensures that rustc will not attempt to link against a cdylib as if it is a rust dylib when an rlib for the same crate is available. Previously rustc didn't actually check if any further formats of a crate which has been loaded are of the same version and if they are actually valid. This caused a cdylib to be interpreted as rust dylib as soon as the corresponding rlib was loaded. As cdylibs don't export any rust symbols, linking would fail if rustc decides to link against the cdylib rather than the rlib.

Two crates depended on the previous behavior by separately compiling a test crate as both rlib and dylib. These have been changed to capture their original spirit to the best of my ability while still working when rustc verifies that all crates are in sync. It is unlikely that build systems depend on the current behavior and in any case we are taking a lot of measures to ensure that any change to either the source or the compilation options (including crate type) results in rustc rejecting it as incompatible. We merely didn't do this check here for now obsolete perf reasons.

Fixes https://github.com/rust-lang/rust/issues/10786
Fixes https://github.com/rust-lang/rust/issues/82151
Fixes https://github.com/rust-lang/rust/issues/82972
Closes https://github.com/bevy-cheatbook/bevy-cheatbook/issues/114
2023-07-20 09:00:10 +00:00
Oli Scherer
c7428d5052 Monomorphize constants before inspecting them 2023-07-20 08:53:09 +00:00
bors
a6cdd81eff Auto merge of #108714 - estebank:ice_dump, r=oli-obk
On nightly, dump ICE backtraces to disk

Implement rust-lang/compiler-team#578.

When an ICE is encountered on nightly releases, the new rustc panic handler will also write the contents of the backtrace to disk. If any `delay_span_bug`s are encountered, their backtrace is also added to the file. The platform and rustc version will also be collected.

<img width="1032" alt="Screenshot 2023-03-03 at 2 13 25 PM" src="https://user-images.githubusercontent.com/1606434/222842420-8e039740-4042-4563-b31d-599677171acf.png">

The current behavior will *always* write to disk on nightly builds, regardless of whether the backtrace is printed to the terminal, unless the environment variable `RUSTC_ICE_DISK_DUMP` is set to `0`. This is a compromise and can be changed.
2023-07-20 01:29:17 +00:00
Nicholas Nethercote
8c31219d5c Tweak CGU sorting in a couple of places.
In `base.rs`, tweak how the CGU size interleaving works. Since #113777,
it's much more common to have multiple CGUs with identical sizes. With
the existing code these same-sized items ended up in the
opposite-to-desired order due to the stable sorting. The code now starts
with a reverse sort (like is done in `partitioning.rs`) which gives the
behaviour we want. This doesn't matter much for perf, but makes profiles
in `samply` look more like what we expect.

In `partitioning.rs`, we can use `sort_by_key` instead of
`sort_by_cached_key` because `CGU::size_estimate()` is cheap. (There is
an identical CGU sort earlier in that function that already uses
`sort_by_key`.)
2023-07-20 09:58:13 +10:00
Dylan DPC
c1d6d322f4
Rollup merge of #113716 - DianQK:add-no_builtins-to-function, r=pnkfelix
Add the `no-builtins` attribute to functions when `no_builtins` is applied at the crate level.

**When `no_builtins` is applied at the crate level, we should add the `no-builtins` attribute to each function to ensure it takes effect in LTO.**

This is also the reason why no_builtins does not take effect in LTO as mentioned in #35540.

Now, `#![no_builtins]` should be similar to `-fno-builtin` in clang/gcc, see https://clang.godbolt.org/z/z4j6Wsod5.

Next, we should make `#![no_builtins]` participate in LTO again. That makes sense, as LTO also takes into consideration function-level instruction optimizations, such as the MachineOutliner. More importantly, when a user writes a large `#![no_builtins]` crate, they would like this crate to participate in LTO as well.

We should also add a function-level no_builtins attribute to allow users to have more control over it. This is similar to Clang's `__attribute__((no_builtin))` feature, see https://clang.godbolt.org/z/Wod6KK6eq. Before implementing this feature, maybe we should discuss whether to support more fine-grained control, such as `__attribute__((no_builtin("memcpy")))`.

Related discussions:
- #109821
- #35540

Next (a separate pull request?):
- [ ] Revert #35637
- [ ] Add a function-level `no_builtin` attribute?
2023-07-19 22:37:06 +05:30
bjorn3
8c9a8b63c9 Fix review comments 2023-07-19 14:53:26 +00:00
bjorn3
52853c2694 Don't compress dylib metadata 2023-07-19 14:47:06 +00:00
Esteban Küber
8eb5843a59 On nightly, dump ICE backtraces to disk
Implement rust-lang/compiler-team#578.

When an ICE is encountered on nightly releases, the new rustc panic
handler will also write the contents of the backtrace to disk. If any
`delay_span_bug`s are encountered, their backtrace is also added to the
file. The platform and rustc version will also be collected.
2023-07-19 14:10:07 +00:00
DianQK
cc08749df2
Add the no-builtins attribute to functions when no_builtins is applied at the crate level.
When `no_builtins` is applied at the crate level, we should add the
`no-builtins` attribute to each function to ensure it takes effect in LTO.
2023-07-18 22:15:47 +08:00
chenx97
d3727148a0 support for mips32r6 as a target_arch value 2023-07-18 18:58:18 +08:00
chenx97
a132b3ec03 merge patterns 2023-07-18 18:58:18 +08:00
chenx97
c6e03cd951 support for mips64r6 as a target_arch value 2023-07-18 18:58:18 +08:00