Enable MIR inlining
Continuation of https://github.com/rust-lang/rust/pull/82280 by `@wesleywiser`.
#82280 has shown nice compile time wins could be obtained by enabling MIR inlining.
Most of the issues in https://github.com/rust-lang/rust/issues/81567 are now fixed,
except the interaction with polymorphization which is worked around specifically.
I believe we can proceed with enabling MIR inlining in the near future
(preferably just after beta branching, in case we discover new issues).
Steps before merging:
- [x] figure out the interaction with polymorphization;
- [x] figure out how miri should deal with extern types;
- [x] silence the extra arithmetic overflow warnings;
- [x] remove the codegen fulfilment ICE;
- [x] remove the type normalization ICEs while compiling nalgebra;
- [ ] tweak the inlining threshold.
Added llvm lifetime annotations to function call argument temporaries.
The goal of this change is to ensure that llvm will do stack slot
optimization on these temporaries. This ensures that in code like:
```rust
const A: [u8; 1024] = [0; 1024];
// Placeholder callee that takes the array by value.
fn f(_: [u8; 1024]) {}
fn copy_const() {
    f(A);
    f(A);
}
```
we only use 1024 bytes of stack space, instead of 2048 bytes.
I am new to developing for the Rust compiler, and as such not entirely sure, but I believe this should be sufficient to close #98156.
Also, this does not contain a test case to ensure this keeps working, primarily because I am not sure how to go about testing this. I would love some suggestions as to how that could be approached.
Simplify memory ordering intrinsics
This changes the names of the atomic intrinsics to always fully include their memory ordering arguments.
```diff
- atomic_cxchg
+ atomic_cxchg_seqcst_seqcst
- atomic_cxchg_acqrel
+ atomic_cxchg_acqrel_release
- atomic_cxchg_acqrel_failrelaxed
+ atomic_cxchg_acqrel_relaxed
// And so on.
```
- `seqcst` is no longer implied
- The failure ordering on cxchg is no longer implied in some cases, but now always explicitly part of the name.
- `release` is no longer shortened to just `rel`. That was especially confusing, since `relaxed` also starts with `rel`.
- `acquire` is no longer shortened to just `acq`, such that the names now all match the `std::sync::atomic::Ordering` variants exactly.
- This now allows for more combinations on the compare exchange operations, such as `atomic_cxchg_acquire_release`, which is necessary for #68464.
- This PR only exposes the new possibilities through unstable intrinsics, but not yet through the stable API. That's for [a separate PR](https://github.com/rust-lang/rust/pull/98383) that requires an FCP.
Suffixes for operations with a single memory order:
| Order | Before | After |
|---------|--------------|------------|
| Relaxed | `_relaxed` | `_relaxed` |
| Acquire | `_acq` | `_acquire` |
| Release | `_rel` | `_release` |
| AcqRel | `_acqrel` | `_acqrel` |
| SeqCst | (none) | `_seqcst` |
Suffixes for compare-and-exchange operations with two memory orderings:
| Success | Failure | Before | After |
|---------|---------|--------------------------|--------------------|
| Relaxed | Relaxed | `_relaxed` | `_relaxed_relaxed` |
| Relaxed | Acquire | ❌ | `_relaxed_acquire` |
| Relaxed | SeqCst | ❌ | `_relaxed_seqcst` |
| Acquire | Relaxed | `_acq_failrelaxed` | `_acquire_relaxed` |
| Acquire | Acquire | `_acq` | `_acquire_acquire` |
| Acquire | SeqCst | ❌ | `_acquire_seqcst` |
| Release | Relaxed | `_rel` | `_release_relaxed` |
| Release | Acquire | ❌ | `_release_acquire` |
| Release | SeqCst | ❌ | `_release_seqcst` |
| AcqRel | Relaxed | `_acqrel_failrelaxed` | `_acqrel_relaxed` |
| AcqRel | Acquire | `_acqrel` | `_acqrel_acquire` |
| AcqRel | SeqCst | ❌ | `_acqrel_seqcst` |
| SeqCst | Relaxed | `_failrelaxed` | `_seqcst_relaxed` |
| SeqCst | Acquire | `_failacq` | `_seqcst_acquire` |
| SeqCst | SeqCst | (none) | `_seqcst_seqcst` |
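
The naming scheme in the tables above can be sketched in a few lines. The intrinsics themselves are unstable, so this example only builds the new-style name as a string and performs the matching operation through the stable `compare_exchange` API. The helpers `ordering_name` and `cxchg_name` are hypothetical and exist only for illustration.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Lowercase spelling of an ordering, as used in the new intrinsic names.
fn ordering_name(order: Ordering) -> &'static str {
    match order {
        Ordering::Relaxed => "relaxed",
        Ordering::Acquire => "acquire",
        Ordering::Release => "release",
        Ordering::AcqRel => "acqrel",
        Ordering::SeqCst => "seqcst",
        // `Ordering` is non-exhaustive, so a fallback arm is required.
        _ => "unknown",
    }
}

/// Hypothetical helper: the new-style name always spells out both the
/// success and the failure ordering.
fn cxchg_name(success: Ordering, failure: Ordering) -> String {
    format!(
        "atomic_cxchg_{}_{}",
        ordering_name(success),
        ordering_name(failure)
    )
}

fn main() {
    let value = AtomicUsize::new(5);
    // Stable counterpart of the AcqRel/Relaxed row in the table.
    let previous = value.compare_exchange(5, 10, Ordering::AcqRel, Ordering::Relaxed);
    assert_eq!(previous, Ok(5));
    assert_eq!(value.load(Ordering::SeqCst), 10);
    println!("{}", cxchg_name(Ordering::AcqRel, Ordering::Relaxed));
}
```

Spelling out both orderings means a name can be derived mechanically from a pair of `Ordering` values, which the old suffixes (`_acq_failrelaxed`, `_failacq`, and so on) did not allow.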
rustc_target: Remove some redundant target properties
`is_like_emscripten` is equivalent to `os == "emscripten"`, so it's removed.
`is_like_fuchsia` is equivalent to `os == "fuchsia"`, so it's removed.
`is_like_osx` also falls into the same category and is equivalent to `vendor == "apple"`, but it's commonly used so I kept it as is for now.
`is_like_(solaris,windows,wasm)` are combinations of different operating systems or architectures (see compiler/rustc_target/src/spec/tests/tests_impl.rs) so they are also kept as is.
I think `is_like_wasm` (and maybe `is_like_osx`) are sufficiently closed sets, so we can remove these fields as well and replace them with methods like `fn is_like_wasm() { arch == "wasm32" || arch == "wasm64" }`.
On the other hand, `is_like_solaris` and `is_like_windows` are sufficiently open and I can imagine custom targets introducing other values for `os`.
This is kind of a gray area.
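A rough sketch of what replacing a field with a method could look like. `TargetSketch` is a hypothetical, stripped-down stand-in and not the real `rustc_target` type; only the `arch` and `os` comparisons come from the description above.

```rust
/// Hypothetical stand-in for a target description; the real type has
/// many more fields.
struct TargetSketch {
    arch: &'static str,
    os: &'static str,
}

impl TargetSketch {
    /// Derived from `arch` instead of being stored as a separate field.
    fn is_like_wasm(&self) -> bool {
        self.arch == "wasm32" || self.arch == "wasm64"
    }

    /// The check that replaces the removed `is_like_emscripten` field.
    fn is_emscripten(&self) -> bool {
        self.os == "emscripten"
    }
}

fn main() {
    let target = TargetSketch { arch: "wasm32", os: "emscripten" };
    assert!(target.is_like_wasm());
    assert!(target.is_emscripten());

    let other = TargetSketch { arch: "x86_64", os: "linux" };
    assert!(!other.is_like_wasm());
    assert!(!other.is_emscripten());
}
```

A derived method cannot drift out of sync with `arch`, but it also cannot be overridden by a custom target, which is why this only suits closed sets.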
Update no_default_libraries handling for emscripten target
`@sbc100` says:
> `-sDEFAULT_LIBRARY_FUNCS_TO_INCLUDE=[]` is almost certainly wrong/out-of-date. This setting defaults to the empty list anyway these days so its redundant. Also we now support `-nodefaultlibs` so you can use that, as with other toolchains.
https://github.com/rust-lang/rust/issues/98303#issuecomment-1162163684
Remove the source archive functionality of ArchiveWriter
We now build archives through strictly additive means rather than taking an existing archive and potentially subtracting parts. This is simpler and makes it easier to swap out the archive writer in https://github.com/rust-lang/rust/pull/97485.
Remove dereferencing of Box from codegen
Through #94043, #94414, #94873, and #95328, I've been fixing issues caused by Box being treated like a pointer when it is not a pointer. However, these PRs just introduced special cases for Box. This PR removes those special cases and instead transforms a deref of Box into a deref of the pointer it contains.
Hopefully, this is the end of the Box<T, A> ICEs.
once cell renamings
This PR does the renamings proposed in https://github.com/rust-lang/rust/issues/74465#issuecomment-1153703128
- Move/rename `lazy::{OnceCell, Lazy}` to `cell::{OnceCell, LazyCell}`
- Move/rename `lazy::{SyncOnceCell, SyncLazy}` to `sync::{OnceLock, LazyLock}`
(I used `Lazy...` instead of `...Lazy` as it seems to be more consistent, easier to pronounce, etc)
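A minimal usage sketch of two of the renamed types under their new paths, assuming a toolchain on which `std::cell::OnceCell` and `std::sync::OnceLock` are available without a feature gate. The `global_name` helper is hypothetical.

```rust
use std::cell::OnceCell;
use std::sync::OnceLock;

// Thread-safe variant, usable in a `static` (formerly `lazy::SyncOnceCell`).
static GLOBAL: OnceLock<String> = OnceLock::new();

/// Hypothetical helper: initializes the global on first use.
fn global_name() -> &'static str {
    GLOBAL.get_or_init(|| "initialized once".to_string())
}

fn main() {
    // Single-threaded variant (formerly `lazy::OnceCell`).
    let cell: OnceCell<u32> = OnceCell::new();
    assert!(cell.get().is_none());
    assert_eq!(*cell.get_or_init(|| 42), 42);
    // A second initializer is ignored; the first value sticks.
    assert_eq!(*cell.get_or_init(|| 7), 42);

    assert_eq!(global_name(), "initialized once");
}
```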
`@rustbot` label +T-libs-api -T-libs
Emscripten target: replace -g4 with -g, and -g3 with --profiling-funcs
Emscripten prints the following warning:
```
emcc: warning: please replace -g4 with -gsource-map [-Wdeprecated]
```
`@sbc100`
Move `finish` out of the `Encoder` trait.
This simplifies things, but requires making `CacheEncoder` non-generic.
(This was previously merged as commit 4 in #94732 and then was reverted
in #97905 because it caused a perf regression.)
r? `@ghost`
Support lint expectations for `--force-warn` lints (RFC 2383)
Rustc has a `--force-warn` flag, which overrides lint level attributes and forces the diagnostics to always be warn. This means that, for lint expectations, the diagnostic can't be suppressed as usual. This also means that the expectation would not be fulfilled, even if a lint had been triggered in the expected scope.
This PR now also tracks the expectation ID in the `ForceWarn` level. I've also made some minor adjustments, to possibly catch more bugs and make the whole implementation more robust.
This will probably conflict with https://github.com/rust-lang/rust/pull/97718. That PR should ideally be reviewed and merged first. The conflict itself will be trivial to fix.
---
r? `@wesleywiser`
cc: `@flip1995` since you've helped with the initial review and also discussed this topic with me. 🙃
Follow-up of: https://github.com/rust-lang/rust/pull/87835
Issue: https://github.com/rust-lang/rust/issues/85549
Yeah, and that's it.
Rename rustc_serialize::opaque::Encoder as MemEncoder.
This avoids the name clash with `rustc_serialize::Encoder` (a trait),
and allows lots of qualifiers to be removed and imports to be simplified
(e.g. fewer `as` imports).
(This was previously merged as commit 5 in #94732 and then was reverted
in #97905 because of a perf regression caused by commit 4 in #94732.)
r? `@bjorn3`
Use unchecked mul to compute slice sizes
This allows LLVM to realize that `slice.len() > 0` iff `slice.len() * size_of::<T>() > 0`, allowing a branch on the latter to be folded into the former when dropping vecs and boxed slices, in some cases.
Fixes (partially) #96497
Add the intrinsic
`declare {i8*, i1} @llvm.type.checked.load(i8* %ptr, i32 %offset, metadata %type)`
This is used in the VFE optimization when lowering loading functions
from vtables to LLVM IR. The `metadata` is used to map the function to
all vtables this function could belong to. This ensures that functions
from vtables that might be used somewhere won't get removed.
Rename the `ConstS::val` field as `kind`.
And likewise for the `Const::val` method.
Because its type is called `ConstKind`. Also `val` is a confusing name
because `ConstKind` is an enum with seven variants, one of which is
called `Value`. Also, this gives consistency with `TyS` and `PredicateS`
which have `kind` fields.
The commit also renames a few `Const` variables from `val` to `c`, to
avoid confusion with the `ConstKind::Value` variant.
r? `@BoxyUwU`
Add Apple WatchOS compile targets
Hello,
I would like to add the following target triples for Apple WatchOS as Tier 3 platforms:
armv7k-apple-watchos
arm64_32-apple-watchos
x86_64-apple-watchos-sim
There are some prerequisite pull requests:
https://github.com/rust-lang/compiler-builtins/pull/456 (merged)
https://github.com/alexcrichton/cc-rs/pull/662 (pending)
https://github.com/rust-lang/libc/pull/2717 (merged)
There will be a subsequent PR with standard library changes for WatchOS. Previous compiler and library changes were in a single PR (https://github.com/rust-lang/rust/pull/94736) which is now closed in favour of separate PRs.
Many thanks!
Vlad.
### Tier 3 Target Requirements
Adds support for Apple WatchOS compile targets.
Below are details on how this target meets the requirements for tier 3:
> tier 3 target must have a designated developer or developers (the "target maintainers") on record to be CCed when issues arise regarding the target. (The mechanism to track and CC such developers may evolve over time.)
`@deg4uss3r` has volunteered to be the target maintainer. I am also happy to help if a second maintainer is required.
> Targets must use naming consistent with any existing targets; for instance, a target for the same CPU or OS as an existing Rust target should use the same name for that CPU or OS. Targets should normally use the same names and naming conventions as used elsewhere in the broader ecosystem beyond Rust (such as in other toolchains), unless they have a very good reason to diverge. Changing the name of a target can be highly disruptive, especially once the target reaches a higher tier, so getting the name right is important even for a tier 3 target.
Uses the same naming as the LLVM target, and the same convention as other Apple targets.
> Target names should not introduce undue confusion or ambiguity unless absolutely necessary to maintain ecosystem compatibility. For example, if the name of the target makes people extremely likely to form incorrect beliefs about what it targets, the name should be changed or augmented to disambiguate it.
I don't believe there is any ambiguity here.
> Tier 3 targets may have unusual requirements to build or use, but must not create legal issues or impose onerous legal terms for the Rust project or for Rust developers or users.
I don't see any legal issues here.
> The target must not introduce license incompatibilities.
> Anything added to the Rust repository must be under the standard Rust license (MIT OR Apache-2.0).
> The target must not cause the Rust tools or libraries built for any other host (even when supporting cross-compilation to the target) to depend on any new dependency less permissive than the Rust licensing policy. This applies whether the dependency is a Rust crate that would require adding new license exceptions (as specified by the tidy tool in the rust-lang/rust repository), or whether the dependency is a native library or binary. In other words, the introduction of the target must not cause a user installing or running a version of Rust or the Rust tools to be subject to any new license requirements.
> If the target supports building host tools (such as rustc or cargo), those host tools must not depend on proprietary (non-FOSS) libraries, other than ordinary runtime libraries supplied by the platform and commonly used by other binaries built for the target. For instance, rustc built for the target may depend on a common proprietary C runtime library or console output library, but must not depend on a proprietary code generation library or code optimization library. Rust's license permits such combinations, but the Rust project has no interest in maintaining such combinations within the scope of Rust itself, even at tier 3.
> Targets should not require proprietary (non-FOSS) components to link a functional binary or library.
> "onerous" here is an intentionally subjective term. At a minimum, "onerous" legal/licensing terms include but are not limited to: non-disclosure requirements, non-compete requirements, contributor license agreements (CLAs) or equivalent, "non-commercial"/"research-only"/etc terms, requirements conditional on the employer or employment of any particular Rust developers, revocable terms, any requirements that create liability for the Rust project or its developers or users, or any requirements that adversely affect the livelihood or prospects of the Rust project or its developers or users.
I see no issues with any of the above.
> Neither this policy nor any decisions made regarding targets shall create any binding agreement or estoppel by any party. If any member of an approving Rust team serves as one of the maintainers of a target, or has any legal or employment requirement (explicit or implicit) that might affect their decisions regarding a target, they must recuse themselves from any approval decisions regarding the target's tier status, though they may otherwise participate in discussions.
> This requirement does not prevent part or all of this policy from being cited in an explicit contract or work agreement (e.g. to implement or maintain support for a target). This requirement exists to ensure that a developer or team responsible for reviewing and approving a target does not face any legal threats or obligations that would prevent them from freely exercising their judgment in such approval, even if such judgment involves subjective matters or goes beyond the letter of these requirements.
Only relevant to those making approval decisions.
> Tier 3 targets should attempt to implement as much of the standard libraries as possible and appropriate (core for most targets, alloc for targets that can support dynamic memory allocation, std for targets with an operating system or equivalent layer of system-provided functionality), but may leave some code unimplemented (either unavailable or stubbed out as appropriate), whether because the target makes it impossible to implement or challenging to implement. The authors of pull requests are not obligated to avoid calling any portions of the standard library on the basis of a tier 3 target not implementing those portions.
core and alloc can be used. std support will be added in a subsequent PR.
> The target must provide documentation for the Rust community explaining how to build for the target, using cross-compilation if possible. If the target supports running tests (even if they do not pass), the documentation must explain how to run tests for the target, using emulation if possible or dedicated hardware if necessary.
Use the `--target=<target>` option to cross compile, just like any target. Tests can be run using the WatchOS simulator (see https://developer.apple.com/documentation/xcode/running-your-app-in-the-simulator-or-on-a-device).
> Tier 3 targets must not impose burden on the authors of pull requests, or other developers in the community, to maintain the target. In particular, do not post comments (automated or manual) on a PR that derail or suggest a block on the PR based on a tier 3 target. Do not send automated messages or notifications (via any medium, including via `@`) to a PR author or others involved with a PR regarding a tier 3 target, unless they have opted into such messages.
> Backlinks such as those generated by the issue/PR tracker when linking to an issue or PR are not considered a violation of this policy, within reason. However, such messages (even on a separate repository) must not generate notifications to anyone involved with a PR who has not requested such notifications.
I don't foresee this being a problem.
> Patches adding or updating tier 3 targets must not break any existing tier 2 or tier 1 target, and must not knowingly break another tier 3 target without approval of either the compiler team or the maintainers of the other tier 3 target.
> In particular, this may come up when working on closely related targets, such as variations of the same architecture with different features. Avoid introducing unconditional uses of features that another variation of the target may not have; use conditional compilation or runtime detection, as appropriate, to let each target run code supported by that target.
No other targets should be affected by the pull request.
Revert part of #94732 to improve performance
#94732 was supposed to give small but widespread performance improvements, as judged from three pre-merge performance runs. But the performance run that occurred after merging included a roughly equal number of improvements and regressions, for unclear reasons.
This PR is for a test run reverting those changes, to see what happens.
r? `@ghost`
There are two impls of the `Encoder` trait: `opaque::Encoder` and
`opaque::FileEncoder`. The former encodes into memory and is infallible, the
latter writes to file and is fallible.
Currently, standard `Result`/`?`/`unwrap` error handling is used, but this is a
bit verbose and has non-trivial cost, which is annoying given how rare failures
are (especially in the infallible `opaque::Encoder` case).
This commit changes how `Encoder` fallibility is handled. All the `emit_*`
methods are now infallible. `opaque::Encoder` requires no great changes for
this. `opaque::FileEncoder` now implements a delayed error handling strategy.
If a failure occurs, it records this via the `res` field, and all subsequent
encoding operations are skipped if `res` indicates an error has occurred. Once
encoding is complete, the new `finish` method is called, which returns a
`Result`. In other words, there is now a single `Result`-producing method
instead of many of them.
This has very little effect on how any file errors are reported if
`opaque::FileEncoder` has any failures.
Much of this commit is boring mechanical changes, removing `Result` return
values and `?` or `unwrap` from expressions. The more interesting parts are as
follows.
- serialize.rs: The `Encoder` trait gains an `Ok` associated type. The
`into_inner` method is changed into `finish`, which returns
`Result<Vec<u8>, !>`.
- opaque.rs: The `FileEncoder` adopts the delayed error handling
strategy. Its `Ok` type is a `usize`, returning the number of bytes
written, replacing previous uses of `FileEncoder::position`.
- Various methods that take an encoder now consume it, rather than being
passed a mutable reference, e.g. `serialize_query_result_cache`.
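The delayed error handling strategy described above can be sketched as follows. This is not the actual `FileEncoder`: `DelayedEncoder` and `FailingWriter` are hypothetical names, and the sketch writes through any `std::io::Write` rather than a buffered file. Only the `res` field, the infallible `emit_*` shape, and the single `Result`-returning `finish` follow the description.

```rust
use std::io::{self, Write};

/// Hypothetical encoder illustrating delayed error handling.
struct DelayedEncoder<W: Write> {
    writer: W,
    written: usize,
    /// First error seen, if any; checked only in `finish`.
    res: Result<(), io::Error>,
}

impl<W: Write> DelayedEncoder<W> {
    fn new(writer: W) -> Self {
        DelayedEncoder { writer, written: 0, res: Ok(()) }
    }

    /// Infallible for the caller: failures are recorded, not returned.
    fn emit_u8(&mut self, byte: u8) {
        if self.res.is_err() {
            return; // skip all work once an error has occurred
        }
        match self.writer.write_all(&[byte]) {
            Ok(()) => self.written += 1,
            Err(e) => self.res = Err(e),
        }
    }

    /// The single `Result`-producing method; consumes the encoder.
    fn finish(mut self) -> Result<usize, io::Error> {
        self.res?;
        self.writer.flush()?;
        Ok(self.written)
    }
}

/// Hypothetical writer that always fails, to exercise the error path.
struct FailingWriter;

impl Write for FailingWriter {
    fn write(&mut self, _buf: &[u8]) -> io::Result<usize> {
        Err(io::Error::new(io::ErrorKind::Other, "disk full"))
    }
    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

fn main() {
    let mut ok = DelayedEncoder::new(Vec::new());
    ok.emit_u8(1);
    ok.emit_u8(2);
    assert_eq!(ok.finish().unwrap(), 2);

    let mut bad = DelayedEncoder::new(FailingWriter);
    bad.emit_u8(1);
    bad.emit_u8(2); // silently skipped
    assert!(bad.finish().is_err());
}
```

Callers lose the ability to react to an error at the exact write that caused it, which matches the trade-off stated above: failures are rare, so they are surfaced once at the end.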
Add some unstable target features for the wasm target codegen
I was experimenting with cross-language LTO for the wasm target recently
between Rust and C and found that C was injecting the `+mutable-globals`
flag on all functions. When specifying the corresponding
`-Ctarget-feature=+mutable-globals` feature to Rust it prints a warning
about an unknown feature. I've added the `mutable-globals` feature plus
another few I know of to the list of known features for wasm targets.
These features all continue to be unstable to source code as they were
before.
Fix ICEs from zsts within unsized types with non-zero offsets
- Fixes #97732
- Fixes ICEs while compiling `alloc` with `-Z randomize-layout`
r? `@eddyb`
Various refactors to the incr comp workproduct handling
This is the result of me looking into adding support for having multiple object files for a single codegen unit to incr comp. This is necessary to support inline assembly in cg_clif without requiring partial linking which is not supported on Windows and seems to fail on macOS for some reason. Cg_clif uses an external assembler to handle inline asm and thus produces one object file with regular functions and one object file containing compiled inline asm for each codegen unit which uses inline asm. Current incr comp can't handle this. This PR doesn't yet add support for this, but it makes it easier to do so.
Add support for embedding pretty printers via `#[debugger_visualizer]` attribute
Initial support for [RFC 3191](https://github.com/rust-lang/rfcs/pull/3191) in PR https://github.com/rust-lang/rust/pull/91779 was scoped to supporting embedding NatVis files using a new attribute. This PR implements the pretty printer support as stated in the RFC mentioned above.
This change includes embedding pretty printers in the `.debug_gdb_scripts` just as the pretty printers for rustc are embedded today. Also added additional tests for embedded pretty printers. Additionally cleaned up error checking so all error checking is done up front regardless of the current target.
RFC: https://github.com/rust-lang/rfcs/pull/3191
- The logic is now unified for all targets (wasm targets should also be supported now)
- Additional "symlink" files like `ld64` are eliminated
- lld-wrapper is used for propagating the correct lld flavor
- Cleanup "unwrap or exit" logic in lld-wrapper
Ensure all error checking for `#[debugger_visualizer]` is done up front and not when the `debugger_visualizer` query is run.
Clean up potential ODR violations when embedding pretty printers into the `__rustc_debug_gdb_scripts_section__` section.
Respond to PR comments and update documentation.
Fix e_flags for 32-bit MIPS targets in generated object file
In #95604 the compiler started generating a temporary symbols.o which is added to the linker invocation. This object file has an `e_flags` which is invalid for 32-bit MIPS targets. Even though symbols.o doesn't contain code, linking these targets with [lld fails](https://github.com/llvm/llvm-project/blob/main/lld/ELF/Arch/MipsArchTree.cpp#L76-L79) with
```
rust-lld: error: foo-cgu.0.rcgu.o: ABI 'o32' is incompatible with target ABI 'n64'
```
because it omits the ABI bits (`EF_MIPS_ABI_O32`) so lld assumes it's using the N64 ABI. This breaks linking on nightly for the out-of-tree [mipsel-sony-psx target](https://github.com/ayrtonm/psx-sdk-rs/issues/9), the builtin mipsel-sony-psp target (cc `@overdrivenpotato`) and probably any other 32-bit MIPS target using lld.
This PR sets the ABI in `e_flags` to O32 since that's the only ABI for 32-bit MIPS that LLVM supports. It also sets other `e_flags` bits based on the target to avoid similar issues with the object file arch and PIC. I had to bump the object crate version since some of these constants were [added recently](https://github.com/gimli-rs/object/pull/433). I'm not sure if this PR needs a test, but I can confirm that it fixes the linking issue on both targets I mentioned.
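As a rough illustration of how such an `e_flags` value is assembled, here is a sketch with the constants defined locally using their values from the ELF MIPS ABI. The function `mips32_e_flags`, its CPU-name mapping, the default architecture level, and the unconditional `EF_MIPS_CPIC` bit are assumptions made for the example and may differ from what the PR does.

```rust
// Values from the ELF MIPS ABI, defined locally to keep the sketch
// self-contained.
const EF_MIPS_PIC: u32 = 0x0000_0002;
const EF_MIPS_CPIC: u32 = 0x0000_0004;
const EF_MIPS_ABI_O32: u32 = 0x0000_1000;
const EF_MIPS_ARCH_1: u32 = 0x0000_0000;
const EF_MIPS_ARCH_2: u32 = 0x1000_0000;
const EF_MIPS_ARCH_32R2: u32 = 0x7000_0000;

/// Mask covering the ABI bits of `e_flags`.
const EF_MIPS_ABI_MASK: u32 = 0x0000_F000;

/// Hypothetical helper: combine ABI, architecture level and PIC bits.
fn mips32_e_flags(cpu: &str, position_independent: bool) -> u32 {
    let arch = match cpu {
        "mips1" => EF_MIPS_ARCH_1,
        "mips2" => EF_MIPS_ARCH_2,
        // Assumed default for CPUs not listed in this sketch.
        _ => EF_MIPS_ARCH_32R2,
    };
    // O32 is always set, so a linker never has to guess the ABI.
    let mut flags = EF_MIPS_ABI_O32 | EF_MIPS_CPIC | arch;
    if position_independent {
        flags |= EF_MIPS_PIC;
    }
    flags
}

fn main() {
    let flags = mips32_e_flags("mips2", false);
    assert_eq!(flags & EF_MIPS_ABI_MASK, EF_MIPS_ABI_O32);
    println!("e_flags = {:#010x}", flags);
}
```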
Like we have `add`/`sub` which are the `usize` version of `offset`, this adds the `usize` equivalent of `offset_from`. Like how `.add(d)` replaced a whole bunch of `.offset(d as isize)`, you can see from the changes here that it's fairly common that code actually knows the order between the pointers and *wants* a `usize`, not an `isize`.
As a bonus, this can do `sub nuw`+`udiv exact`, rather than `sub`+`sdiv exact`, which can be optimized slightly better because it doesn't have to worry about negatives. That's why the slice iterators weren't using `offset_from`, though I haven't updated that code in this PR because slices are so perf-critical that I'll do it as its own change.
This is an intrinsic, like `offset_from`, so that it can eventually be allowed in CTFE. It also allows checking the extra safety condition -- see the test confirming that CTFE catches it if you pass the pointers in the wrong order.
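For context, this is the kind of existing pattern the description refers to: code that already knows which pointer comes first, calls `offset_from`, and immediately casts the `isize` result to `usize`. The sketch uses only the stable `offset_from`, because the new method's name is not given here; `elements_between` is a hypothetical helper.

```rust
/// Hypothetical helper: number of elements between two positions of a
/// slice, computed through raw pointers.
fn elements_between(slice: &[u32], from: usize, to: usize) -> usize {
    assert!(from <= to && to <= slice.len());
    // SAFETY: both offsets are within (or one past the end of) `slice`,
    // so both pointers belong to the same allocation.
    let (start, end) = unsafe { (slice.as_ptr().add(from), slice.as_ptr().add(to)) };
    // SAFETY: same allocation, and the distance is a whole number of
    // elements. The assertion above guarantees `end >= start`, so the
    // signed result is never negative and the cast is lossless.
    unsafe { end.offset_from(start) as usize }
}

fn main() {
    let data = [10u32, 20, 30, 40, 50];
    assert_eq!(elements_between(&data, 1, 4), 3);
    assert_eq!(elements_between(&data, 0, 5), 5);
}
```

An unsigned variant would make the `as usize` cast unnecessary and turn the "end is not before start" requirement into part of the safety contract.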