All fields except the discriminant (including `outer_fields`)
should be put into structures inside the variant part, which gives
an equivalent layout but offers us much better integration with
debuggers.
Implement RFC 1260 with feature_name `imported_main`.
This is the second extraction part of #84062 plus additional adjustments.
This (mostly) implements RFC 1260.
However there's still one test case failure in the extern crate case. Maybe `LocalDefId` doesn't work here? I'm not sure.
cc https://github.com/rust-lang/rust/issues/28937
r? `@petrochenkov`
The Eq trait has a special hidden function. MIR `InstrumentCoverage`
would add this function to the coverage map, but it is never called, so
the `Eq` trait would always appear uncovered.
Fixes: #83601
The fix required creating a new function attribute `no_coverage` to mark
functions that should be ignored by `InstrumentCoverage` and the
coverage `mapgen` (during codegen).
While testing, I also noticed two other issues:
* spanview debug file output ICEd on a function with no body. The
workaround for this is included in this PR.
* `assert_*!()` macro coverage can appear covered if followed by another
`assert_*!()` macro. Normally they appear uncovered. I submitted a new
Issue #84561, and added a coverage test to demonstrate this issue.
This commit updates rustc, with an applicable LLVM version, to use
LLVM's new `llvm.fpto{u,s}i.sat.*.*` intrinsics to implement saturating
floating-point-to-int conversions. This results in a little bit tighter
codegen for x86/x86_64, but the main purpose of this is to prepare for
upcoming changes to the WebAssembly backend in LLVM where wasm's
saturating float-to-int instructions will now be implemented with these
intrinsics.
This change allows simplifying a good deal of surrounding code, namely
removing a lot of wasm-specific behavior. WebAssembly no longer has any
special-casing of saturating arithmetic instructions and the need for
`fptoint_may_trap` is gone and all handling code for that is now
removed. This means that the only wasm-specific logic is in the
`fpto{s,u}i` instructions which only get used for "out of bounds is
undefined behavior". This does mean that for the WebAssembly target
specifically the Rust compiler will no longer be 100% compatible with
pre-LLVM 12 versions, but it seems like that's unlikely to be relied on
by too many folks.
Note that this change does immediately regress the codegen of saturating
float-to-int casts on WebAssembly due to the specialization of the LLVM
intrinsic not being present in our LLVM fork just yet. I'll be following
up with an LLVM update to pull in those patches, but affects a few other
SIMD things in flight for WebAssembly so I wanted to separate this change.
Eventually the entire `cast_float_to_int` function can be removed when
LLVM 12 is the minimum version, but that will require sinking the
complexity of it into other backends such as Cranelfit.
`fast-math` implies things like functions not being able to accept as an
argument or return as a result, say, `inf` which made these functions
confusingly named or behaving incorrectly, depending on how you
interpret it. Since the time when these intrinsics have been implemented
the intrinsics user's (stdsimd) approach has changed significantly and
so now it is required that these intrinsics operate normally rather than
in "whatever" way.
Fixes#84268
LLVM supports many functions from math.h in its IR. Many of these have
single-instruction variants on various platforms. So, let's add them so
std::arch can use them.
Yes, exact comparison is intentional: rounding must always return a
valid integer-equal value, except for inf/NAN.
Categorize and explain target features support
There are 3 different uses of the `-C target-feature` args passed to rustc:
1. All of the features are passed to LLVM, which uses them to configure code-generation. This is sort-of stabilized since 1.0 though LLVM does change/add/remove target features regularly.
2. Target features which are in [the compiler's allowlist](69e1d22ddb/compiler/rustc_codegen_ssa/src/target_features.rs (L12-L34)) can be used in `cfg!(target_feature)` etc. These may have different names than in LLVM and are renamed before passing them to LLVM.
3. Target features which are in the allowlist and which are stabilized or feature-gate-enabled can be used in `#[target_feature]`.
It can be confusing that `rustc --print target-features` just prints out the LLVM features without separating out the rustc features or even mentioning that the dichotomy exists.
This improves the situation by separating out the rustc and LLVM target features and adding a brief explanation about the difference.
Abbreviated Example Output:
```
$ rustc --print target-features
Features supported by rustc for this target:
adx - Support ADX instructions.
aes - Enable AES instructions.
...
xsaves - Support xsaves instructions.
crt-static - Enables libraries with C Run-time Libraries(CRT) to be statically linked.
Code-generation features supported by LLVM for this target:
16bit-mode - 16-bit mode (i8086).
32bit-mode - 32-bit mode (80386).
...
x87 - Enable X87 float instructions.
xop - Enable XOP instructions.
Use +feature to enable a feature, or -feature to disable it.
For example, rustc -C target-cpu=mycpu -C target-feature=+feature1,-feature2
Code-generation features cannot be used in cfg or #[target_feature],
and may be renamed or removed in a future version of LLVM or rustc.
```
Motivated by #83975.
CC https://github.com/rust-lang/rust/issues/49653
This commit implements the idea of a new ABI for the WebAssembly target,
one called `"wasm"`. This ABI is entirely of my own invention
and has no current precedent, but I think that the addition of this ABI
might help solve a number of issues with the WebAssembly targets.
When `wasm32-unknown-unknown` was first added to Rust I naively
"implemented an abi" for the target. I then went to write `wasm-bindgen`
which accidentally relied on details of this ABI. Turns out the ABI
definition didn't match C, which is causing issues for C/Rust interop.
Currently the compiler has a "wasm32 bindgen compat" ABI which is the
original implementation I added, and it's purely there for, well,
`wasm-bindgen`.
Another issue with the WebAssembly target is that it's not clear to me
when and if the default C ABI will change to account for WebAssembly's
multi-value feature (a feature that allows functions to return multiple
values). Even if this does happen, though, it seems like the C ABI will
be guided based on the performance of WebAssembly code and will likely
not match even what the current wasm-bindgen-compat ABI is today. This
leaves a hole in Rust's expressivity in binding WebAssembly where given
a particular import type, Rust may not be able to import that signature
with an updated C ABI for multi-value.
To fix these issues I had the idea of a new ABI for WebAssembly, one
called `wasm`. The definition of this ABI is "what you write
maps straight to wasm". The goal here is that whatever you write down in
the parameter list or in the return values goes straight into the
function's signature in the WebAssembly file. This special ABI is for
intentionally matching the ABI of an imported function from the
environment or exporting a function with the right signature.
With the addition of a new ABI, this enables rustc to:
* Eventually remove the "wasm-bindgen compat hack". Once this
ABI is stable wasm-bindgen can switch to using it everywhere.
Afterwards the wasm32-unknown-unknown target can have its default ABI
updated to match C.
* Expose the ability to precisely match an ABI signature for a
WebAssembly function, regardless of what the C ABI that clang chooses
turns out to be.
* Continue to evolve the definition of the default C ABI to match what
clang does on all targets, since the purpose of that ABI will be
explicitly matching C rather than generating particular function
imports/exports.
Naturally this is implemented as an unstable feature initially, but it
would be nice for this to get stabilized (if it works) in the near-ish
future to remove the wasm32-unknown-unknown incompatibility with the C
ABI. Doing this, however, requires the feature to be on stable because
wasm-bindgen works with stable Rust.
Allow specifying alignment for functions
Fixes#75072
This allows the user to specify alignment for functions, which can be useful for low level work where functions need to necessarily be aligned to a specific value.
I believe the error cases not covered in the match are caught earlier based on my testing so I had them just return `None`.
Set dso_local for hidden, private and local items
This should probably have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang], only the
portion that is necessary to fix a regression from LLVM 12 that relates to
`-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
Use FromStr trait for number option parsing
Replace `parse_uint` with generic `parse_number` based on `FromStr`.
Use it for parsing inlining threshold to avoid casting later.
Allow clobbering unsupported registers in asm!
Previously registers could only be marked as clobbered if the target feature for that register was enabled. This restriction is now removed.
cc #81092
r? ``@nagisa``
Translate counters from Rust 1-based to LLVM 0-based counter ids
A colleague contacted me and asked why Rust's counters start at 1, when
Clangs appear to start at 0. There is a reason why Rust's internal
counters start at 1 (see the docs), and I tried to keep them consistent
when codegenned to LLVM's coverage mapping format. LLVM should be
tolerant of missing counters, but as my colleague pointed out,
`llvm-cov` will silently fail to generate a coverage report for a
function based on LLVM's assumption that the counters are 0-based.
See:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L170
Apparently, if, for example, a function has no branches, it would have
exactly 1 counter. `CounterValues.size()` would be 1, and (with the
1-based index), the counter ID would be 1. This would fail the check
and abort reporting coverage for the function.
It turns out that by correcting for this during coverage map generation,
by subtracting 1 from the Rust Counter ID (both when generating the
counter increment intrinsic call, and when adding counters to the map),
some uncovered functions (including in tests) now appear covered! This
corrects the coverage for a few tests!
r? `@tmandry`
FYI: `@wesleywiser`
A colleague contacted me and asked why Rust's counters start at 1, when
Clangs appear to start at 0. There is a reason why Rust's internal
counters start at 1 (see the docs), and I tried to keep them consistent
when codegenned to LLVM's coverage mapping format. LLVM should be
tolerant of missing counters, but as my colleague pointed out,
`llvm-cov` will silently fail to generate a coverage report for a
function based on LLVM's assumption that the counters are 0-based.
See:
https://github.com/llvm/llvm-project/blob/main/llvm/lib/ProfileData/Coverage/CoverageMapping.cpp#L170
Apparently, if, for example, a function has no branches, it would have
exactly 1 counter. `CounterValues.size()` would be 1, and (with the
1-based index), the counter ID would be 1. This would fail the check
and abort reporting coverage for the function.
It turns out that by correcting for this during coverage map generation,
by subtracting 1 from the Rust Counter ID (both when generating the
counter increment intrinsic call, and when adding counters to the map),
some uncovered functions (including in tests) now appear covered! This
corrects the coverage for a few tests!
This should have no real effect in most cases, as e.g. `hidden`
visibility already implies `dso_local` (or at least LLVM IR does not
preserve the `dso_local` setting if the item is already `hidden`), but
it should fix `-Crelocation-model=static` and improve codegen in
executables.
Note that this PR does not exhaustively port the logic in [clang]. Only
the obviously correct portion and what is necessary to fix a regression
from LLVM 12 that relates to `-Crelocation_model=static`.
Fixes#83335
[clang]: 3001d080c8/clang/lib/CodeGen/CodeGenModule.cpp (L945-L1039)
Run LLVM coverage instrumentation passes before optimization passes
This matches the behavior of Clang and allows us to remove several
hacks which were needed to ensure functions weren't optimized away
before reaching the instrumentation pass.
Fixes#83429
cc `@richkadel`
r? `@tmandry`
This matches the behavior of Clang and allows us to remove several
hacks which were needed to ensure functions weren't optimized away
before reaching the instrumentation pass.
- Add back `HirIdVec`, with a comment that it will soon be used.
- Add back `*_region` functions, with a comment they may soon be used.
- Remove `-Z borrowck_stats` completely. It didn't do anything.
- Remove `make_nop` completely.
- Add back `current_loc`, which is used by an out-of-tree tool.
- Fix style nits
- Remove `AtomicCell` with `cfg(parallel_compiler)` for consistency.
Found with https://github.com/est31/warnalyzer.
Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
wants to use it in the future?
- Don't change rustc_serialize
I plan to scrap most of the json module in the near future (see
https://github.com/rust-lang/compiler-team/issues/418) and fixing the
tests needed more work than I expected.
TODO: check if any of the comments on the deleted code should be kept.
Import small cold functions
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be code generated in at least one
CGU and available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, no matter how trivial the
implementation.
Increase slightly the default multiplier from 0 to 0.1.
r? `@ghost`
coverage bug fixes and optimization support
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
FYI: `@wesleywiser`
r? `@tmandry`
The frontend shouldn't be deciding whether or not to use mutable
noalias attributes, as this is a pure LLVM concern. Only provide
the necessary information and do the actual decision in
codegen_llvm.
* Use Markdown list syntax and unindent a bit to prevent Markdown
interpreting the nested lists as code blocks
* A few more small typographical cleanups
Adjusted LLVM codegen for code compiled with `-Zinstrument-coverage` to
address multiple, somewhat related issues.
Fixed a significant flaw in prior coverage solution: Every counter
generated a new counter variable, but there should have only been one
counter variable per function. This appears to have bloated .profraw
files significantly. (For a small program, it increased the size by
about 40%. I have not tested large programs, but there is anecdotal
evidence that profraw files were way too large. This is a good fix,
regardless, but hopefully it also addresses related issues.
Fixes: #82144
Invalid LLVM coverage data produced when compiled with -C opt-level=1
Existing tests now work up to at least `opt-level=3`. This required a
detailed analysis of the LLVM IR, comparisons with Clang C++ LLVM IR
when compiled with coverage, and a lot of trial and error with codegen
adjustments.
The biggest hurdle was figuring out how to continue to support coverage
results for unused functions and generics. Rust's coverage results have
three advantages over Clang's coverage results:
1. Rust's coverage map does not include any overlapping code regions,
making coverage counting unambiguous.
2. Rust generates coverage results (showing zero counts) for all unused
functions, including generics. (Clang does not generate coverage for
uninstantiated template functions.)
3. Rust's unused functions produce minimal stubbed functions in LLVM IR,
sufficient for including in the coverage results; while Clang must
generate the complete LLVM IR for each unused function, even though
it will never be called.
This PR removes the previous hack of attempting to inject coverage into
some other existing function instance, and generates dedicated instances
for each unused function. This change, and a few other adjustments
(similar to what is required for `-C link-dead-code`, but with lower
impact), makes it possible to support LLVM optimizations.
Fixes: #79651
Coverage report: "Unexecuted instantiation:..." for a generic function
from multiple crates
Fixed by removing the aforementioned hack. Some "Unexecuted
instantiation" notices are unavoidable, as explained in the
`used_crate.rs` test, but `-Zinstrument-coverage` has new options to
back off support for either unused generics, or all unused functions,
which avoids the notice, at the cost of less coverage of unused
functions.
Fixes: #82875
Invalid LLVM coverage data produced with crate brotli_decompressor
Fixed by disabling the LLVM function attribute that forces inlining, if
`-Z instrument-coverage` is enabled. This attribute is applied to
Rust functions with `#[inline(always)], and in some cases, the forced
inlining breaks coverage instrumentation and reports.
Make source-based code coverage compatible with MIR inlining
When codegenning code coverage use the instance that coverage data was
originally generated for, to ensure basic level of compatibility with
MIR inlining.
Fixes#83061
Adjust `-Ctarget-cpu=native` handling in cg_llvm
When cg_llvm encounters the `-Ctarget-cpu=native` it computes an
explciit set of features that applies to the target in order to
correctly compile code for the host CPU (because e.g. `skylake` alone is
not sufficient to tell if some of the instructions are available or
not).
However there were a couple of issues with how we did this. Firstly, the
order in which features were overriden wasn't quite right – conceptually
you'd expect `-Ctarget-cpu=native` option to override the features that
are implicitly set by the target definition. However due to how other
`-Ctarget-cpu` values are handled we must adopt the following order
of priority:
* Features from -Ctarget-cpu=*; are overriden by
* Features implied by --target; are overriden by
* Features from -Ctarget-feature; are overriden by
* function specific features.
Another problem was in that the function level `target-features`
attribute would overwrite the entire set of the globally enabled
features, rather than just the features the
`#[target_feature(enable/disable)]` specified. With something like
`-Ctarget-cpu=native` we'd end up in a situation wherein a function
without `#[target_feature(enable)]` annotation would have a broader
set of features compared to a function with one such attribute. This
turned out to be a cause of heavy run-time regressions in some code
using these function-level attributes in conjunction with
`-Ctarget-cpu=native`, for example.
With this PR rustc is more careful about specifying the entire set of
features for functions that use `#[target_feature(enable/disable)]` or
`#[instruction_set]` attributes.
Sadly testing the original reproducer for this behaviour is quite
impossible – we cannot rely on `-Ctarget-cpu=native` to be anything in
particular on developer or CI machines.
cc https://github.com/rust-lang/rust/issues/83027 `@BurntSushi`
When cg_llvm encounters the `-Ctarget-cpu=native` it computes an
explciit set of features that applies to the target in order to
correctly compile code for the host CPU (because e.g. `skylake` alone is
not sufficient to tell if some of the instructions are available or
not).
However there were a couple of issues with how we did this. Firstly, the
order in which features were overriden wasn't quite right – conceptually
you'd expect `-Ctarget-cpu=native` option to override the features that
are implicitly set by the target definition. However due to how other
`-Ctarget-cpu` values are handled we must adopt the following order
of priority:
* Features from -Ctarget-cpu=*; are overriden by
* Features implied by --target; are overriden by
* Features from -Ctarget-feature; are overriden by
* function specific features.
Another problem was in that the function level `target-features`
attribute would overwrite the entire set of the globally enabled
features, rather than just the features the
`#[target_feature(enable/disable)]` specified. With something like
`-Ctarget-cpu=native` we'd end up in a situation wherein a function
without `#[target_feature(enable)]` annotation would have a broader
set of features compared to a function with one such attribute. This
turned out to be a cause of heavy run-time regressions in some code
using these function-level attributes in conjunction with
`-Ctarget-cpu=native`, for example.
With this PR rustc is more careful about specifying the entire set of
features for functions that use `#[target_feature(enable/disable)]` or
`#[instruction_set]` attributes.
Sadly testing the original reproducer for this behaviour is quite
impossible – we cannot rely on `-Ctarget-cpu=native` to be anything in
particular on developer or CI machines.
Allow rustdoc to handle asm! of foreign architectures
This allows rustdoc to process code containing `asm!` for architectures other than the current one. Since this never reaches codegen, we just replace target-specific registers and register classes with a dummy one.
Fixes#82869
Consider functions to be reachable for code coverage purposes, either
when they reach the code generation directly, or indirectly as inlined
part of another function.
When codegenning code coverage use the instance that coverage data was
originally generated for, to ensure basic level of compatibility with
MIR inlining.
The Rust code is often written under an assumption that for generic
methods inline attribute is mostly unnecessary, since for optimized
builds using ThinLTO, a method will be generated in at least one CGU and
available for import.
For example, deref implementations for Box, Vec, MutexGuard, and
MutexGuard are not currently marked as inline, neither is identity
implementation of From trait.
In PGO builds, when functions are determined to be cold, the default
multiplier of zero will stop the import, even for completely trivial
functions.
Increase slightly the default multiplier from 0 to 0.1 to import them
regardless.
This removes all of the code we had in place to work-around LLVM's
handling of forward progress. From this removal excluded is a workaround
where we'd insert a `sideeffect` into clearly infinite loops such as
`loop {}`. This code remains conditionally effective when the LLVM
version is earlier than 12.0, which fixed the forward progress related
miscompilations at their root.
Use u32 over Option<u32> in DebugLoc
~~Changes `Option<u32>` fields in `DebugLoc` to `Option<NonZeroU32>`. Since the respective fields (`line` and `col`) are guaranteed to be 1-based, this layout optimization is a freebie.~~
EDIT: Changes `Option<u32>` fields in `DebugLoc` to `u32`. As `@bugadani` pointed out, an `Option<NonZeroU32>` is probably an unnecessary layer of abstraction since the `None` variant is always used as `UNKNOWN_LINE_NUMBER` (which is just `0`). Also, `SourceInfo` in `metadata.rs` already uses a `u32` instead of an `Option<u32>` to encode the same information, so I think this change is warranted.
Since `@jyn514` raised some concerns over measuring performance in a similar PR (#82255), does this need a perf run?
Upgrade to LLVM 12
This implements the necessary adjustments to make rustc work with LLVM 12. I didn't encounter any major issues so far.
r? `@cuviper`
The definition of this struct changes in LLVM 12 due to the addition
of branch coverage support. To avoid future mismatches, declare our
own struct and then convert between them.
Update measureme dependency to the latest version
This version adds the ability to use `rdpmc` hardware-based performance
counters instead of wall-clock time for measuring duration. This also
introduces a dependency on the `perf-event-open-sys` crate on Linux
which is used when using hardware counters.
r? ```@oli-obk```
Replace const_cstr with cstr crate
This PR replaces the `const_cstr` macro inside `rustc_data_structures` with `cstr` macro from [cstr](https://crates.io/crates/cstr) crate.
The two macros basically serve the same purpose, which is to generate `&'static CStr` from a string literal. `cstr` is better because it validates the literal at compile time, while the existing `const_cstr` does it at runtime when `debug_assertions` is enabled. In addition, the value `cstr` generates can be used in constant context (which is seemingly not needed anywhere currently, though).
This version adds the ability to use `rdpmc` hardware-based performance
counters instead of wall-clock time for measuring duration. This also
introduces a dependency on the `perf-event-open-sys` crate on Linux
which is used when using hardware counters.
Set path of the compile unit to the source directory
As part of the effort to implement split dwarf debug info, we ended up
setting the compile unit location to the output directory rather than
the source directory. Furthermore, it seems like we failed to remap the
prefixes for this as well!
The desired behaviour is to instead set the `DW_AT_GNU_dwo_name` to a
path relative to compiler's working directory. This still allows
debuggers to find the split dwarf files, while not changing the
behaviour of the code that is compiling with regular debug info, and not
changing the compiler's behaviour with regards to reproducibility.
Fixes#82074
cc `@alexcrichton` `@davidtwco`
Don't fail to remove files if they are missing
In the backend we may want to remove certain temporary files, but in
certain other situations these files might not be produced in the first
place. We don't exactly care about that, and the intent is really that
these files are gone after a certain point in the backend.
Here we unify the backend file removing calls to use `ensure_removed`
which will attempt to delete a file, but will not fail if it does not
exist (anymore).
The tradeoff to this approach is, of course, that we may miss instances
were we are attempting to remove files at wrong paths due to some bug –
compilation would silently succeed but the temporary files would remain
there somewhere.
On 32-bit ARM platforms, the register `r14` has the alias `lr`. When used as an output register in `asm!`, rustc canonicalizes the name to `r14`. LLVM only knows the register by the name `lr`, and rejects it. This changes rustc's LLVM code generation to output `lr` instead.
In the backend we may want to remove certain temporary files, but in
certain other situations these files might not be produced in the first
place. We don't exactly care about that, and the intent is really that
these files are gone after a certain point in the backend.
Here we unify the backend file removing calls to use `ensure_removed`
which will attempt to delete a file, but will not fail if it does not
exist (anymore).
The tradeoff to this approach is, of course, that we may miss instances
were we are attempting to remove files at wrong paths due to some bug –
compilation would silently succeed but the temporary files would remain
there somewhere.
As part of the effort to implement split dwarf debug info, we ended up
setting the compile unit location to the output directory rather than
the source directory. Furthermore, it seems like we failed to remap the
prefixes for this as well!
The desired behaviour is to instead set the `DW_AT_GNU_dwo_name` to a
path relative to compiler's working directory. This still allows
debuggers to find the split dwarf files, while not changing the
behaviour of the code that is compiling with regular debug info, and not
changing the compiler's behaviour with regards to reproducibility.
Fixes#82074
Improve SIMD type element count validation
Resolvesrust-lang/stdsimd#53.
These changes are motivated by `stdsimd` moving in the direction of const generic vectors, e.g.:
```rust
#[repr(simd)]
struct SimdF32<const N: usize>([f32; N]);
```
This makes a few changes:
* Establishes a maximum SIMD lane count of 2^16 (65536). This value is arbitrary, but attempts to validate lane count before hitting potential errors in the backend. It's not clear what LLVM's maximum lane count is, but cranelift's appears to be much less than `usize::MAX`, at least.
* Expands some SIMD intrinsics to support arbitrary lane counts. This resolves the ICE in the linked issue.
* Attempts to catch invalid-sized vectors during typeck when possible.
Unresolved questions:
* Generic-length vectors can't be validated in typeck and are only validated after monomorphization while computing layout. This "works", but the errors simply bail out with no context beyond the name of the type. Should these errors instead return `LayoutError` or otherwise provide context in some way? As it stands, users of `stdsimd` could trivially produce monomorphization errors by making zero-length vectors.
cc `@bjorn3`
Avoid a hir access inside get_static
Together with #81056 this ensures that the codegen unit DepNode doesn't have a direct dependency on any part of the hir.
Add a new ABI to support cmse_nonsecure_call
This adds support for the `cmse_nonsecure_call` feature to be able to perform non-secure function call.
See the discussion on Zulip [here](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Support.20for.20callsite.20attributes/near/223054928).
This is a followup to #75810 which added `cmse_nonsecure_entry`. As for that PR, I assume that the changes are small enough to not have to go through a RFC but I don't mind doing one if needed 😃
I did not yet create a tracking issue, but if most of it is fine, I can create one and update the various files accordingly (they refer to the other tracking issue now).
On the Zulip chat, I believe `@jonas-schievink` volunteered to be a reviewer 💯
Add AArch64 big-endian and ILP32 targets
This PR adds 3 new AArch64 targets:
- `aarch64_be-unknown-linux-gnu`
- `aarch64-unknown-linux-gnu_ilp32`
- `aarch64_be-unknown-linux-gnu_ilp32`
It also fixes some ABI issues on big-endian ARM and AArch64.
This commit adds a new ABI to be selected via `extern
"C-cmse-nonsecure-call"` on function pointers in order for the compiler to
apply the corresponding cmse_nonsecure_call callsite attribute.
For Armv8-M targets supporting TrustZone-M, this will perform a
non-secure function call by saving, clearing and calling a non-secure
function pointer using the BLXNS instruction.
See the page on the unstable book for details.
Signed-off-by: Hugues de Valon <hugues.devalon@arm.com>
rustc: Stabilize `-Zrun-dsymutil` as `-Csplit-debuginfo`
This commit adds a new stable codegen option to rustc,
`-Csplit-debuginfo`. The old `-Zrun-dsymutil` flag is deleted and now
subsumed by this stable flag. Additionally `-Zsplit-dwarf` is also
subsumed by this flag but still requires `-Zunstable-options` to
actually activate. The `-Csplit-debuginfo` flag takes one of
three values:
* `off` - This indicates that split-debuginfo from the final artifact is
not desired. This is not supported on Windows and is the default on
Unix platforms except macOS. On macOS this means that `dsymutil` is
not executed.
* `packed` - This means that debuginfo is desired in one location
separate from the main executable. This is the default on Windows
(`*.pdb`) and macOS (`*.dSYM`). On other Unix platforms this subsumes
`-Zsplit-dwarf=single` and produces a `*.dwp` file.
* `unpacked` - This means that debuginfo will be roughly equivalent to
object files, meaning that it's throughout the build directory
rather than in one location (often the fastest for local development).
This is not the default on any platform and is not supported on Windows.
Each target can indicate its own default preference for how debuginfo is
handled. Almost all platforms default to `off` except for Windows and
macOS which default to `packed` for historical reasons.
Some equivalencies for previous unstable flags with the new flags are:
* `-Zrun-dsymutil=yes` -> `-Csplit-debuginfo=packed`
* `-Zrun-dsymutil=no` -> `-Csplit-debuginfo=unpacked`
* `-Zsplit-dwarf=single` -> `-Csplit-debuginfo=packed`
* `-Zsplit-dwarf=split` -> `-Csplit-debuginfo=unpacked`
Note that `-Csplit-debuginfo` still requires `-Zunstable-options` for
non-macOS platforms since split-dwarf support was *just* implemented in
rustc.
There's some more rationale listed on #79361, but the main gist of the
motivation for this commit is that `dsymutil` can take quite a long time
to execute in debug builds and provides little benefit. This means that
incremental compile times appear that much worse on macOS because the
compiler is constantly running `dsymutil` over every single binary it
produces during `cargo build` (even build scripts!). Ideally rustc would
switch to not running `dsymutil` by default, but that's a problem left
to get tackled another day.
Closes#79361
This commit adds a new stable codegen option to rustc,
`-Csplit-debuginfo`. The old `-Zrun-dsymutil` flag is deleted and now
subsumed by this stable flag. Additionally `-Zsplit-dwarf` is also
subsumed by this flag but still requires `-Zunstable-options` to
actually activate. The `-Csplit-debuginfo` flag takes one of
three values:
* `off` - This indicates that split-debuginfo from the final artifact is
not desired. This is not supported on Windows and is the default on
Unix platforms except macOS. On macOS this means that `dsymutil` is
not executed.
* `packed` - This means that debuginfo is desired in one location
separate from the main executable. This is the default on Windows
(`*.pdb`) and macOS (`*.dSYM`). On other Unix platforms this subsumes
`-Zsplit-dwarf=single` and produces a `*.dwp` file.
* `unpacked` - This means that debuginfo will be roughly equivalent to
object files, meaning that it's throughout the build directory
rather than in one location (often the fastest for local development).
This is not the default on any platform and is not supported on Windows.
Each target can indicate its own default preference for how debuginfo is
handled. Almost all platforms default to `off` except for Windows and
macOS which default to `packed` for historical reasons.
Some equivalencies for previous unstable flags with the new flags are:
* `-Zrun-dsymutil=yes` -> `-Csplit-debuginfo=packed`
* `-Zrun-dsymutil=no` -> `-Csplit-debuginfo=unpacked`
* `-Zsplit-dwarf=single` -> `-Csplit-debuginfo=packed`
* `-Zsplit-dwarf=split` -> `-Csplit-debuginfo=unpacked`
Note that `-Csplit-debuginfo` still requires `-Zunstable-options` for
non-macOS platforms since split-dwarf support was *just* implemented in
rustc.
There's some more rationale listed on #79361, but the main gist of the
motivation for this commit is that `dsymutil` can take quite a long time
to execute in debug builds and provides little benefit. This means that
incremental compile times appear that much worse on macOS because the
compiler is constantly running `dsymutil` over every single binary it
produces during `cargo build` (even build scripts!). Ideally rustc would
switch to not running `dsymutil` by default, but that's a problem left
to get tackled another day.
Closes#79361
Refractor a few more types to `rustc_type_ir`
In the continuation of #79169, ~~blocked on that PR~~.
This PR:
- moves `IntVarValue`, `FloatVarValue`, `InferTy` (and friends) and `Variance`
- creates the `IntTy`, `UintTy` and `FloatTy` enums in `rustc_type_ir`, based on their `ast` and `chalk_ir` equilavents, and uses them for types in the rest of the compiler.
~~I will split up that commit to make this easier to review and to have a better commit history.~~
EDIT: done, I split the PR in commits of 200-ish lines each
r? `````@nikomatsakis````` cc `````@jackh726`````
Target stack-probe support configurable finely
This adds capability to configure the target's stack probe support in a
more precise manner than just on/off. In particular now we allow
choosing between always inline-asm, always call or either one of those
depending on the LLVM version.
Note that this removes the ability to turn off the generation of the
stack-probe attribute. This is valid to replace it with inline-asm for all targets because
`probe-stack="inline-asm"` will not generate any machine code on targets
that do not currently support stack probes. This makes support for stack
probes on targets that don't have any right now automatic with LLVM
upgrades in the future.
(This is valid to do based on the fact that clang unconditionally sets
this attribute when `-fstack-clash-protection` is used, AFAICT)
cc #77885
r? `@cuviper`
Use Option::unwrap_or instead of open-coding it
r? ```@oli-obk``` Noticed this while we were talking about the other PR just now 😆
```@rustbot``` modify labels +C-cleanup +T-compiler
This adds capability to configure the target's stack probe support in a
more precise manner than just on/off. In particular now we allow
choosing between always inline-asm, always call or either one of those
depending on the LLVM version on a per-target basis.
Make target-cpu=native detect individual features
This PR makes target-cpu=native check for and enable/disable individual features instead of detecting and targeting a CPU by name. This brings the flag's behavior more in line with clang and gcc and ensures that the host actually supports each feature that we are compiling for.
This should resolve issues with miscompilations on e.g. "Haswell" Pentiums and Celerons that lack support for AVX, and also enable support for `aes` on Broadwell processors that support it. It should also resolve issues with failing to detect feature support in newer CPUs that aren't yet known by LLVM (see: #80633).
Fixes#54688Fixes#48464Fixes#38218
Update and improve `rustc_codegen_{llvm,ssa}` docs
Fixes#75342.
These docs were very out of date and misleading. They even said that
they codegen'd the *AST*!
For some reason, the `rustc_codegen_ssa::base` docs were exactly
identical to the `rustc_codegen_llvm::base` docs. They didn't really
make sense, because they had LLVM-specific information even though
`rustc_codegen_ssa` is supposed to be somewhat generic. So I removed
them as they were misleading.
r? ``@pnkfelix`` maybe?
remove unused return type of dropck::check_drop_obligations()
don't wrap return type in Option in get_macro_by_def_id() since we would always return Some(..)
remove redundant return type of back::write::optimize()
don't Option-wrap return type of compute_type_parameters() since we always return Some(..)
don't return empty Result in assemble_generator_candidates()
don't return empty Result in assemble_closure_candidates()
don't return empty result in assemble_fn_pointer_candidates()
don't return empty result in assemble_candidates_from_impls()
don't return empty result in assemble_candidates_from_auto_impls()
don't return emtpy result in assemble_candidates_for_trait_alias()
don't return empty result in assemble_builtin_bound_candidates()
don't return empty results in assemble_extension_candidates_for_traits_in_scope() and assemble_extension_candidates_for_trait()
remove redundant wrapping of return type of StripItem::strip() since it always returns Some(..)
remove unused return type of assemble_extension_candidates_for_all_traits()
These docs were very out of date and misleading. They even said that
they codegen'd the *AST*!
For some reason, the `rustc_codegen_ssa::base` docs were exactly
identical to the `rustc_codegen_llvm::base` docs. They didn't really
make sense, because they had LLVM-specific information even though
`rustc_codegen_ssa` is supposed to be somewhat generic. So I removed
them as they were misleading.
llvm-dwp concatenates `DW_AT_comp_dir` with `DW_AT_GNU_dwo_name` (only
when `DW_AT_comp_dir` exists), which can result in it failing to find
the DWARF object files.
In earlier testing, `DW_AT_comp_dir` wasn't present in the final
object and the current directory was the output directory.
When running tests through compiletest, the working directory of the
compilation is different from output directory and that resulted in
`DW_AT_comp_dir` being in the object file (and set to the current
working directory, rather than the output directory), and
`DW_AT_GNU_dwo_name` being set to the full path (rather than just
the filename), so llvm-dwp was failing.
This commit changes the compilation directory provided to LLVM to match
the output directory, where DWARF objects are output; and ensures that
only the filename is used for `DW_AT_GNU_dwo_name`.
Signed-off-by: David Wood <david@davidtw.co>
This commit implements Split DWARF support, wiring up the flag (added in
earlier commits) to the modified FFI wrapper (also from earlier
commits).
Signed-off-by: David Wood <david@davidtw.co>
This commit removes the `TargetMachineFactory` struct and adds a
`TargetMachineFactoryFn` type alias which is used everywhere that the
previous, long type was used.
Signed-off-by: David Wood <david@davidtw.co>
This commit modifies the FFI bindings to LLVM required for Split DWARF
support in rustc. In particular:
- `addPassesToEmitFile`'s wrapper, `LLVMRustWriteOutputFile` now takes
a `DwoPath` `const char*`. When disabled, `nullptr` should be provided
which will preserve existing behaviour. When enabled, the path to the
`.dwo` file should be provided.
- `createCompileUnit`'s wrapper, `LLVMRustDIBuilderCreateCompileUnit`
now has two additional arguments, for the `DWOId` and to enable
`SplitDebugInlining`. `DWOId` should always be zero.
- `createTargetMachine`'s wrapper, `LLVMRustCreateTargetMachine` has an
additional argument which should be provided the path to the `.dwo`
when enabled.
Signed-off-by: David Wood <david@davidtw.co>
[mir-opt] Allow debuginfo to be generated for a constant or a Place
Prior to this commit, debuginfo was always generated by mapping a name
to a Place. This has the side-effect that `SimplifyLocals` cannot remove
locals that are only used for debuginfo because their other uses have
been const-propagated.
To allow these locals to be removed, we now allow debuginfo to point to
a constant value. The `ConstProp` pass detects when debuginfo points to
a local with a known constant value and replaces it with the value. This
allows the later `SimplifyLocals` pass to remove the local.
Fixes: #79725
Some macros can create a situation where `fn_sig_span` and `body_span`
map to different files.
New documentation on coverage tests incorrectly assumed multiple test
binaries could just be listed at the end of the `llvm-cov` command,
but it turns out each binary needs a `--object` prefix.
This PR fixes the bug and updates the documentation to correct that
issue. It also fixes a few other minor issues in internal implementation
comments, and adds documentation on getting coverage results for doc
tests.
Prior to this commit, debuginfo was always generated by mapping a name
to a Place. This has the side-effect that `SimplifyLocals` cannot remove
locals that are only used for debuginfo because their other uses have
been const-propagated.
To allow these locals to be removed, we now allow debuginfo to point to
a constant value. The `ConstProp` pass detects when debuginfo points to
a local with a known constant value and replaces it with the value. This
allows the later `SimplifyLocals` pass to remove the local.
Added one more test (two files) showing coverage of generics and unused
functions across crates.
Created and referenced new Issues, as requested.
Added comments.
Added a note about the possible effects of compiler options on LLVM
coverage maps.
Fixes multiple issue with counters, with simplification
Includes a change to the implicit else span in ast_lowering, so coverage
of the implicit else no longer spans the `then` block.
Adds coverage for unused closures and async function bodies.
Fixes: #78542
Adding unreachable regions for known MIR missing from coverage map
Cleaned up PR commits, and removed link-dead-code requirement and tests
Coverage no longer depends on Issue #76038 (`-C link-dead-code` is
no longer needed or enforced, so MSVC can use the same tests as
Linux and MacOS now)
Restrict adding unreachable regions to covered files
Improved the code that adds coverage for uncalled functions (with MIR
but not-codegenned) to avoid generating coverage in files not already
included in the files with covered functions.
Resolved last known issue requiring --emit llvm-ir workaround
Fixed bugs in how unreachable code spans were added.
Add wasm32 support to inline asm
There is some contention around inline asm and wasm, and I really only made this to figure out the process of hacking on rustc, but I figured as long as the code existed, it was worth uploading.
cc `@Amanieu`
Support repr(simd) on ADTs containing a single array field
This is a squash and rebase of `@gnzlbg's` #63531
I've never actually written code in the compiler before so just fumbled my way around until it would build 😅
I imagine there'll be some work we need to do in `rustc_codegen_cranelift` too for this now, but might need some input from `@bjorn3` to know what that is.
cc `@rust-lang/project-portable-simd`
-----
This PR allows using `#[repr(simd)]` on ADTs containing a single array field:
```rust
#[repr(simd)] struct S0([f32; 4]);
#[repr(simd)] struct S1<const N: usize>([f32; N]);
#[repr(simd)] struct S2<T, const N: usize>([T; N]);
```
This should allow experimenting with portable packed SIMD abstractions on nightly that make use of const generics.
Upgrades the coverage map to Version 4
Changes the coverage map injected into binaries compiled with
`-Zinstrument-coverage` to LLVM Coverage Mapping Format, Version 4 (from
Version 3). Note, binaries compiled with this version will require LLVM
tools from at least LLVM Version 11.
r? ``@wesleywiser``
* `rustc` should now compile under LLVM 9 or 10
* Compiler generates an error if `-Z instrument-coverage` is specified
but LLVM version is less than 11
* Coverage tests that require `-Z instrument-coverage` and run codegen
should be skipped if LLVM version is less than 11
This is useful for embedded targets where small code size is desired.
For example, on my project (thumbv7em-none-eabi) this yields a 0.6% code size reduction.
Changes the coverage map injected into binaries compiled with
`-Zinstrument-coverage` to LLVM Coverage Mapping Format, Version 4 (from
Version 3). Note, binaries compiled with this version will require LLVM
tools from at least LLVM Version 11.
It is applied exactly when the return value has an indirect pass mode.
Except for InReg on x86 fastcall, arg attrs are now only used for
optimization purposes and thus are fine to ignore.
The `#[naked]` attribute disabled prologue / epilogue emission for the
function and it is responsibility of a developer to provide them. The
compiler is no position to inline such functions correctly.
Disable inlining of naked functions at LLVM and MIR level.
Fix setting inline hint based on `InstanceDef::requires_inline`
For instances where `InstanceDef::requires_inline` is true, an attempt
is made to set an inline hint though a call to the `inline` function.
The attempt is ineffective, since all attributes will be usually removed
by the second call.
Fix the issue by applying the attributes only once, with user provided
attributes having a priority when provided.
Closes#79108.
Updated the list of white-listed target features for x86
This PR both adds in-source documentation on what to look out for when adding a new (X86) feature set and [adds all that are detectable at run-time in Rust stable as of 1.27.0](https://github.com/rust-lang/stdarch/blob/master/crates/std_detect/src/detect/arch/x86.rs).
This should only enable the use of the corresponding LLVM intrinsics.
Actual intrinsics need to be added separately in rust-lang/stdarch.
It also re-orders the run-time-detect test statements to be more consistent
with the actual list of intrinsics whitelisted and removes underscores not present
in the actual names (which might be mistaken as being part of the name)
The reference for LLVM's feature names used is [this file](https://github.com/llvm/llvm-project/blob/master/llvm/include/llvm/Support/X86TargetParser.def).
This PR was motivated as the compiler end's part for allowing #67329 to be adressed over on rust-lang/stdarch
[self-profiling] Include the estimated size of each cgu in the profile
This is helpful when looking for CGUs where the size estimate isn't a
good indicator of compilation time.
I verified that moving the profiling timer call doesn't affect the
results.
Results:
<img width="297" alt="Screen Shot 2020-11-03 at 7 25 04 AM" src="https://user-images.githubusercontent.com/831192/97985503-5901d100-1da6-11eb-9f10-f3e399702952.png">
`measureme` doesn't have support for custom arg names yet so `arg0` is the CGU name and `arg1` is the estimated size.
For instances where `InstanceDef::requires_inline` is true, an attempt
is made to set an inline hint though a call to the `inline` function.
The attempt is ineffective, since all attributes will be usually removed
by the second call.
Fix the issue by applying the attributes only once, with user provided
attributes having a priority when provided.
Allow making `RUSTC_BOOTSTRAP` conditional on the crate name
Motivation: This came up in the [Zulip stream](https://rust-lang.zulipchat.com/#narrow/stream/233931-t-compiler.2Fmajor-changes/topic/Require.20users.20to.20confirm.20they.20know.20RUSTC_.E2.80.A6.20compiler-team.23350/near/208403962) for https://github.com/rust-lang/compiler-team/issues/350.
See also https://github.com/rust-lang/cargo/pull/6608#issuecomment-458546258; this implements https://github.com/rust-lang/cargo/issues/6627.
The goal is for this to eventually allow prohibiting setting `RUSTC_BOOTSTRAP` in build.rs (https://github.com/rust-lang/cargo/issues/7088).
## User-facing changes
- `RUSTC_BOOTSTRAP=1` still works; there is no current plan to remove this.
- Things like `RUSTC_BOOTSTRAP=0` no longer activate nightly features. In practice this shouldn't be a big deal, since `RUSTC_BOOTSTRAP` is the opposite of stable and everyone uses `RUSTC_BOOTSTRAP=1` anyway.
- `RUSTC_BOOTSTRAP=x` will enable nightly features only for crate `x`.
- `RUSTC_BOOTSTRAP=x,y` will enable nightly features only for crates `x` and `y`.
## Implementation changes
The main change is that `UnstableOptions::from_environment` now requires
an (optional) crate name. If the crate name is unknown (`None`), then the new feature is not available and you still have to use `RUSTC_BOOTSTRAP=1`. In practice this means the feature is only available for `--crate-name`, not for `#![crate_name]`; I'm interested in supporting the second but I'm not sure how.
Other major changes:
- Added `Session::is_nightly_build()`, which uses the `crate_name` of
the session
- Added `nightly_options::match_is_nightly_build`, a convenience method
for looking up `--crate-name` from CLI arguments.
`Session::is_nightly_build()`should be preferred where possible, since
it will take into account `#![crate_name]` (I think).
- Added `unstable_features` to `rustdoc::RenderOptions`
I'm not sure whether this counts as T-compiler or T-lang; _technically_ RUSTC_BOOTSTRAP is an implementation detail, but it's been used so much it seems like this counts as a language change too.
r? `@joshtriplett`
cc `@Mark-Simulacrum` `@hsivonen`
rustc_target: Mark UEFI targets as `is_like_windows`/`is_like_msvc`
And document what `is_like_windows` and `is_like_msvc` actually mean in more detail.
Addresses FIXMEs left from https://github.com/rust-lang/rust/pull/71030.
r? `@nagisa`
Add asm register information for SPIR-V
As discussed in [zulip](https://rust-lang.zulipchat.com/#narrow/stream/182449-t-compiler.2Fhelp/topic/Defining.20asm!.20for.20new.20architecture), we at [rust-gpu](https://github.com/EmbarkStudios/rust-gpu) would like to support `asm!` for our SPIR-V backend. However, we cannot do so purely without frontend support: [this match](d4ea0b3e46/compiler/rustc_target/src/asm/mod.rs (L185)) fails and so `asm!` is not supported ([error reported here](d4ea0b3e46/compiler/rustc_ast_lowering/src/expr.rs (L1095))). To resolve this, we need to stub out register information for SPIR-V to support getting the `asm!` content all the way to [`AsmBuilderMethods::codegen_inline_asm`](https://doc.rust-lang.org/nightly/nightly-rustc/rustc_codegen_ssa/traits/trait.AsmBuilderMethods.html#tymethod.codegen_inline_asm), at which point the rust-gpu backend can do all the parsing and codegen that is needed.
This is a pretty weird PR - adding support for a backend that isn't in-tree feels pretty gross to me, but I don't see an easy way around this. ``@Amanieu`` said I should submit it anyway, so, here we are! Let me know if this needs to go through a more formal process (MCP?) and what I should do to help this along.
I based this off the [wasm asm PR](https://github.com/rust-lang/rust/pull/78684), which unfortunately this PR conflicts with that one quite a bit, sorry for any merge conflict pain :(
---
Some open questions:
- What do we call the register class? Some context, SPIR-V is an SSA-based IR, there are "instructions" that create IDs (referred to as `<id>` in the spec), which can be referenced by other instructions. So, `reg` isn't exactly accurate, they're SSA IDs, not re-assignable registers.
- What happens when a SPIR-V register gets to the LLVM backend? Right now it's a `bug!`, but should that be a `sess.fatal()`? I'm not sure if it's even possible to reach that point, maybe there's a check that prevents the `spirv` target from even reaching that codepath.
This commit grepped for LLVM_VERSION_GE, LLVM_VERSION_LT, get_major_version and
min-llvm-version and statically evaluated every expression possible
(and sensible) assuming that the LLVM version is >=9 now
The discussion seems to have resolved that this lint is a bit "noisy" in
that applying it in all places would result in a reduction in
readability.
A few of the trivial functions (like `Path::new`) are fine to leave
outside of closures.
The general rule seems to be that anything that is obviously an
allocation (`Box`, `Vec`, `vec![]`) should be in a closure, even if it
is a 0-sized allocation.
with an eye on merging `TargetOptions` into `Target`.
`TargetOptions` as a separate structure is mostly an implementation detail of `Target` construction, all its fields logically belong to `Target` and available from `Target` through `Deref` impls.
This PR allows using `#[repr(simd)]` on ADTs containing a
single array field:
```rust
#[repr(simd)] struct S0([f32; 4]);
#[repr(simd)] struct S1<const N: usize>([f32; N]);
#[repr(simd)] struct S2<T, const N: usize>([T; N]);
```
This should allow experimenting with portable packed SIMD
abstractions on nightly that make use of const generics.
The main change is that `UnstableOptions::from_environment` now requires
an (optional) crate name. If the crate name is unknown (`None`), then the new feature is not available and you still have to use `RUSTC_BOOTSTRAP=1`. In practice this means the feature is only available for `--crate-name`, not for `#![crate_name]`; I'm interested in supporting the second but I'm not sure how.
Other major changes:
- Added `Session::is_nightly_build()`, which uses the `crate_name` of
the session
- Added `nightly_options::match_is_nightly_build`, a convenience method
for looking up `--crate-name` from CLI arguments.
`Session::is_nightly_build()`should be preferred where possible, since
it will take into account `#![crate_name]` (I think).
- Added `unstable_features` to `rustdoc::RenderOptions`
There is a user-facing change here: things like `RUSTC_BOOTSTRAP=0` no
longer active nightly features. In practice this shouldn't be a big
deal, since `RUSTC_BOOTSTRAP` is the opposite of stable and everyone
uses `RUSTC_BOOTSTRAP=1` anyway.
- Add tests
Check against `Cheat`, not whether nightly features are allowed.
Nightly features are always allowed on the nightly channel.
- Only call `is_nightly_build()` once within a function
- Use booleans consistently for rustc_incremental
Sessions can't be passed through threads, so `read_file` couldn't take a
session. To be consistent, also take a boolean in `write_file_header`.
Implementing the Graph traits for the BasicCoverageBlock
graph.
optimized replacement of counters with expressions plus new BCB graphviz
* Avoid adding coverage to unreachable blocks.
* Special case for Goto at the end of the body. Make it non-reportable.
Improved debugging and formatting options (from env)
Don't automatically add counters to BCBs without CoverageSpans. They may
still get counters but only if there are dependencies from
other BCBs that have spans, I think.
Make CodeRegions optional for Counters too. It is
possible to inject counters (`llvm.instrprof.increment` intrinsic calls
without corresponding code regions in the coverage map. An expression
can still uses these counter values.
Refactored instrument_coverage.rs -> instrument_coverage/mod.rs, and
then broke up the mod into multiple files.
Compiling with coverage, with the expression optimization, works on
the json5format crate and its dependencies.
Refactored debug features from mod.rs to debug.rs
Add support for SHA256 source file hashing
Adds support for `-Z src-hash-algorithm sha256`, which became available in LLVM 11.
Using an older version of LLVM will cause an error `invalid checksum kind` if the hash algorithm is set to sha256.
r? `@eddyb`
cc #70401 `@est31`
This is helpful when looking for CGUs where the size estimate isn't a
good indicator of compilation time.
I verified that moving the profiling timer call doesn't affect the
results.
foreign_modules query hash table lookups
When compiling a large monolithic crate we're seeing huge times in the `foreign_modules` query due to repeated iteration over foreign modules (in order to find a module by its id). This implements hash table lookups so that which massively reduces time spent in that query in this particular case. We'll need to see if the overhead of creating the hash table has a negative impact on performance in more normal compilation scenarios.
I'm working with `@wesleywiser` on this.
This lets rustc users tweak whether the linker should relax ELF relocations,
namely whether it should emit R_X86_64_GOTPCRELX relocations instead of
R_X86_64_GOTPCREL, as the former is allowed by the ABI to be further
optimised. The default value is whatever the target defines.
Implement -Z function-sections=yes|no
This lets rustc users tweak whether all functions should be put in their own TEXT section, using whatever default value the target defines if the flag is missing.
I'm having fun experimenting with musl libc and trying to implement the start symbol in Rust, that means avoiding code that requires relocations, and AFAIK putting everything in its own section makes the toolchain generate `GOTPCREL` relocations for symbols that could use plain old PC-relative addressing (at least on `x86_64`) if they were all in the same section.
This lets rustc users tweak whether all functions should be put in their own
TEXT section, using whatever default value the target defines if the flag
is missing.
rustc_mir: track inlined callees in SourceScopeData.
We now record which MIR scopes are the roots of *other* (inlined) functions's scope trees, which allows us to generate the correct debuginfo in codegen, similar to what LLVM inlining generates.
This PR makes the `ui` test `backtrace-debuginfo` pass, if the MIR inliner is turned on by default.
Also, `#[track_caller]` is now correct in the face of MIR inlining (cc `@anp).`
Fixes#76997.
r? `@rust-lang/wg-mir-opt`
Updated the added documentation in llvm_util.rs to note which copies of LLVM need to be inspected.
Removed avx512bf16 and avx512vp2intersect because they are unsupported before LLVM 9 with the build with external LLVM 8 being supported
Re-introduced detection testing previously removed for un-requestable features tsc and mmx
Properly define va_arg and va_list for aarch64-apple-darwin
From [Apple][]:
> Because of these changes, the type `va_list` is an alias for `char*`,
> and not for the struct type in the generic procedure call standard.
With this change `/x.py test --stage 1 src/test/ui/abi/variadic-ffi`
passes.
Fixes#78092
[Apple]: https://developer.apple.com/documentation/xcode/writing_arm64_code_for_apple_platforms
This PR both adds in-source documentation on what to look out for
when adding a new (X86) feature set and adds all that are detectable at run-time in Rust stable
as of 1.27.0.
This should only enable the use of the corresponding LLVM intrinsics.
Actual intrinsics need to be added separately in rust-lang/stdarch.
It also re-orders the run-time-detect test statements to be more consistent
with the actual list of intrinsics whitelisted and removes underscores not present
in the actual names (which might be mistaken as being part of the name)