Mark drop calls in landing pads `cold` instead of `noinline`
Now that deferred inlining has been disabled in LLVM (#92110), this shouldn't cause catastrophic size blowup.
I confirmed that the test cases from https://github.com/rust-lang/rust/issues/41696#issuecomment-298696944 still compile quickly (<1s) after this change. ~Although note that I wasn't able to reproduce the original issue using a recent rustc/llvm with deferred inlining enabled, so those tests may no longer be representative. I was also unable to create a modified test case that reproduced the original issue.~ (edit: I reproduced it on CI by accident; the first commit timed out on the LLVM 12 builder because I forgot to make the change conditional on the LLVM version)
r? `@nagisa`
cc `@arielb1` (this effectively reverts #42771 "mark calls in the unwind path as !noinline")
cc `@RalfJung` (fixes #46515)
edit: also fixes #87055
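For readers unfamiliar with the two hints, here is the difference at the language level (a hand-written illustration, not the compiler change itself, which applies the LLVM attribute to drop calls in landing pads):

```rust
// `#[cold]` hints that the function is unlikely to be called (e.g. only on
// the unwind path); the optimizer moves it out of the hot code layout but
// may still inline or shrink it where profitable. A `noinline` hint (in
// Rust terms, `#[inline(never)]`) instead forbids inlining outright, even
// when inlining would be cheap.
#[cold]
fn drop_on_unwind(buf: Box<[u8]>) {
    drop(buf);
}

fn main() {
    drop_on_unwind(vec![0u8; 16].into_boxed_slice());
}
```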
[rustc_builtin_macros] add indices to format_foreign::printf::Substitution::Escape
Fixes #92267.
The problem was that the escape string "%%" does not need to appear at the very beginning of the format string, but
the iterator implementation assumed that it did.
The solution follows the pattern used by `format_foreign::shell::Substitution::Escape`: 8ed935e92d/compiler/rustc_builtin_macros/src/format_foreign.rs (L629)
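A sketch of the shape of the fix (simplified stand-ins, not the actual compiler types): `Escape` now carries the byte range of the `"%%"`, so scanning can resume after an escape found anywhere in the string:

```rust
// Simplified stand-ins; the real iterator and enum carry more variants
// and state.
#[allow(dead_code)]
enum Substitution<'a> {
    Format(&'a str),
    // New: start/end byte indices of the "%%" in the format string, so the
    // iterator knows where to resume scanning.
    Escape((usize, usize)),
}

// Find the next "%%" at or after `from`; the bug was effectively assuming
// the escape could only sit at the very start of the string.
fn next_escape(fmt: &str, from: usize) -> Option<Substitution<'_>> {
    let pos = from + fmt[from..].find("%%")?;
    Some(Substitution::Escape((pos, pos + 2)))
}

fn main() {
    assert!(matches!(
        next_escape("width=%d%%", 0),
        Some(Substitution::Escape((8, 10)))
    ));
}
```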
Fix whitespace in pretty printed PatKind::Range
Follow-up to #92238 fixing one of the FIXMEs.
```rust
macro_rules! repro {
    ($pat:pat) => {
        stringify!($pat)
    };
}

fn main() {
    println!("{}", repro!(0..=1));
}
```
Before: `0 ..=1`
After: `0..=1`
The canonical spacing applied by rustfmt has no space after the lower expr. Rustc's parser diagnostics also do not put a space there:
df96fb166f/compiler/rustc_parse/src/parser/pat.rs (L754)
Implement split_at_spare_mut without Deref to a slice so that the spare slice is valid
~I'm not sure I understand what's going on here correctly. And I'm pretty sure this safety comment needs to be changed. I'm just referring to the same thing that `as_mut_ptr_range` does.~ (Thanks `@RalfJung` for the guidance and clearing things up)
I tried to run https://github.com/rust-lang/miri-test-libstd on alloc with -Zmiri-track-raw-pointers, and got a failure on the test `vec::test_extend_from_within`.
I minimized the test failure into this program:
```rust
#![feature(vec_split_at_spare)]

fn main() {
    Vec::<i32>::with_capacity(1).split_at_spare_mut();
}
```
The problem is that the existing implementation is actually getting a pointer range where both pointers are derived from the initialized region of the Vec's allocation, but we need the second one to be valid for the region between len and capacity. (thanks Ralf for clearing this up)
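A minimal sketch of the corrected approach (simplified; the real implementation is a method on `Vec` using its internal buffer directly): derive both halves from `as_mut_ptr`, which is documented to be valid for the whole allocation, rather than re-deriving the spare pointer from a reference to only the initialized elements:

```rust
use core::mem::MaybeUninit;
use core::slice;

// Sketch only: `as_mut_ptr` yields a pointer valid for the entire
// allocation (len + spare capacity), so the spare slice derived from it is
// valid for the uninitialized tail, which a pointer re-derived from a
// `&mut [T]` of the initialized part would not be.
fn split_at_spare_mut<T>(v: &mut Vec<T>) -> (&mut [T], &mut [MaybeUninit<T>]) {
    let ptr = v.as_mut_ptr();
    let len = v.len();
    let spare_len = v.capacity() - len;
    unsafe {
        let initialized = slice::from_raw_parts_mut(ptr, len);
        let spare =
            slice::from_raw_parts_mut(ptr.add(len).cast::<MaybeUninit<T>>(), spare_len);
        (initialized, spare)
    }
}

fn main() {
    let mut v = Vec::<i32>::with_capacity(1);
    let (init, spare) = split_at_spare_mut(&mut v);
    assert_eq!(init.len(), 0);
    assert!(spare.len() >= 1);
}
```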
Lock bootstrap (x.py) build directory
Closes #76661, closes #80849
`x.py` creates a lock file at `project_root/lock.db`.
r? `@jyn514`, because he was the one who told me about this
Fixes #73026
See also: #64467, #89468
The issue stems from a `FatalError` being silently raised in
`panictry_buffer`. Normally this is not a problem, because
`panictry_buffer` emits the causes of the error, but they are not
themselves fatal, so they get filtered out by the silent emitter.
To fix this, we use a parser entrypoint which doesn't use
`panictry_buffer`, and we handle the error ourselves.
Add Attribute::meta_kind
The `AttrItem::meta` function is called in a lot of places, yet the caller is almost always only interested in the `kind` of the resulting `MetaItem`. Before, the `path` had to be cloned in order to get the kind; now it does not have to be.
There is a larger related "problem". In a lot of places, something wants to know the contents of attributes. This is accessed through `Attribute::meta_item_list`, which calls `AttrItem::meta` (now `AttrItem::meta_kind`), among other methods. When this function is called, the meta item list has to be recreated from scratch. Every time something asks a simple question (like: is this item's list of attributes `#[doc(hidden)]`?), the tokens of the attribute(s) are cloned, parsed, and the results are allocated on the heap. That seems really unnecessary. What would be the best way to cache this? Turn `meta_item_list` into a query, perhaps? Related PR: https://github.com/rust-lang/rust/pull/92227
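To illustrate the API shape with toy types (these are not rustc's real definitions, just the pattern of returning the cheap part instead of the whole struct):

```rust
// Toy stand-ins for the AST types.
#[derive(Clone)]
struct Path(Vec<String>);

enum MetaItemKind {
    Word,
}

#[allow(dead_code)]
struct MetaItem {
    path: Path,
    kind: MetaItemKind,
}

struct AttrItem {
    path: Path,
}

impl AttrItem {
    // Old shape: building a full MetaItem forces a clone of `path`, even
    // when the caller immediately throws it away.
    fn meta(&self) -> MetaItem {
        MetaItem {
            path: self.path.clone(),
            kind: self.parse_kind(),
        }
    }

    // New shape: hand back only the kind; no `path` clone.
    fn meta_kind(&self) -> MetaItemKind {
        self.parse_kind()
    }

    // Stand-in for parsing the attribute's tokens into a kind.
    fn parse_kind(&self) -> MetaItemKind {
        MetaItemKind::Word
    }
}

fn main() {
    let attr = AttrItem { path: Path(vec!["doc".into()]) };
    let _kind = attr.meta_kind(); // no clone of `path`
    let _full = attr.meta(); // clones `path` to build the MetaItem
}
```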
r? rust-lang/rustdoc
Rollup of 7 pull requests
Successful merges:
- #90383 (Extend check for UnsafeCell in consts to cover unions)
- #91375 (config.rs: Add support for a per-target default_linker option.)
- #91480 (rustdoc: use smaller number of colors to distinguish items)
- #92338 (Add try_reserve and try_reserve_exact for OsString)
- #92405 (Add a couple needs-asm-support headers to tests)
- #92435 (Sync rustc_codegen_cranelift)
- #92440 (Fix mobile toggles position)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Sync rustc_codegen_cranelift
The main highlight of this sync is enforcing rustfmt and the absence of warnings on cg_clif's CI. I will open a separate PR to remove the cg_clif exceptions for them from this repo.
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler
Add try_reserve and try_reserve_exact for OsString
Add `try_reserve` and `try_reserve_exact` for OsString.
Part of https://github.com/rust-lang/rust/issues/91789
I will squash the commits once the PR is ready to merge.
Signed-off-by: Xuanwo <github@xuanwo.io>
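For reference, usage mirrors the existing `try_reserve` APIs on `String` and `Vec` (shown here with the eventual shape; at the time of the PR the methods were still feature-gated):

```rust
use std::ffi::OsString;

fn main() {
    let mut name = OsString::new();
    // Fallible reservation: returns Err(TryReserveError) on allocation
    // failure instead of aborting the process.
    if name.try_reserve(10).is_ok() {
        name.push("0123456789");
    }
    // Requests exactly the given additional capacity, without the
    // amortized over-allocation `try_reserve` may perform.
    name.try_reserve_exact(5).expect("allocation failed");
}
```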
config.rs: Add support for a per-target default_linker option.
* src/bootstrap/config.rs (Target) <default_linker>: New field.
(TomlTarget): Likewise.
* src/bootstrap/compile.rs (rustc_cargo_env): Prefer a
target-specified default_linker over a global one if available.
* config.toml.example: Adjust doc.
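The precedence logic is tiny; sketched here with toy types (not the actual bootstrap code):

```rust
// Toy stand-ins for bootstrap's config structs.
struct TargetConfig {
    default_linker: Option<String>, // new per-target field
}

struct Config {
    default_linker: Option<String>, // existing global option
}

// Prefer the target-specified linker; fall back to the global default.
fn default_linker<'a>(config: &'a Config, target: &'a TargetConfig) -> Option<&'a str> {
    target
        .default_linker
        .as_deref()
        .or(config.default_linker.as_deref())
}

fn main() {
    let config = Config { default_linker: Some("cc".to_string()) };
    let target = TargetConfig { default_linker: Some("clang".to_string()) };
    assert_eq!(default_linker(&config, &target), Some("clang"));
}
```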
Store liveness in interval sets for region inference
On the 100,000 line test case from https://github.com/rust-lang/rust/issues/90445, this reduces memory usage from 35 GB to 444 MB at peak (based on DHAT results, though with regular malloc), and yields a 9.4x speedup, with wall time going from 14.5 seconds to 1.5s. Performance results show that for the majority of real-world code this has little to no impact, but it's expected to generally scale better for auto-generated functions and other cases which stress this area of the compiler, as results on #90445 illustrate.
There may also be further room for improvement in future PRs making use of this data structure's benefits over raw bitsets, which are a less perfect fit for representing liveness: liveness is almost always composed of contiguous ranges, not point locations.
Fixes #90445.
Import `SourceFile`s from crate before decoding foreign `Span`
Fixes #92163, fixes #92014
When writing to the incremental cache, we encode all `Span`s
we encounter, regardless of whether or not their `SourceFile`
comes from the local crate, or from a foreign crate.
When we decode a `Span`, we use the `StableSourceFileId` we encoded
to locate the matching `SourceFile` in the current session. If this
id corresponds to a `SourceFile` from another crate, then we need to
have already imported that `SourceFile` into our current session.
This usually happens automatically during resolution / macro expansion,
when we try to resolve definitions from other crates. In certain cases,
however, we may try to load a `Span` from a transitive dependency
without having ever imported the `SourceFile`s from that crate, leading
to an ICE.
This PR fixes the issue by encoding the `SourceFile`'s `CrateNum`
when we encode a `Span`. During decoding, we call `imported_source_files()`
when we encounter a foreign `CrateNum`, which ensures that all
`SourceFile`s from that crate are imported into the current session.
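Roughly, the decode side now does the following (illustrative stand-ins; rustc's real types and signatures differ):

```rust
// Stand-ins for the real rustc types.
struct CrateNum(u32);

#[allow(dead_code)]
struct EncodedSpan {
    cnum: CrateNum, // newly encoded alongside the Span data
    // ... StableSourceFileId, lo/hi offsets, etc.
}

const LOCAL_CRATE: u32 = 0;

fn decode_span(enc: &EncodedSpan) {
    if enc.cnum.0 != LOCAL_CRATE {
        // Ensure every SourceFile of the foreign crate is imported into
        // the current session before resolving the stable id below.
        imported_source_files(&enc.cnum);
    }
    // ... look up the SourceFile by its StableSourceFileId and rebuild
    // the Span relative to it.
}

fn imported_source_files(_cnum: &CrateNum) {
    // Stand-in: lazily imports the crate's SourceFiles from its metadata.
}

fn main() {
    decode_span(&EncodedSpan { cnum: CrateNum(1) });
}
```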
Prevent spurious build failures and other bugs caused by parallel runs of
x.py. We back the lock with SQLite so that we have a cross-platform locking
strategy, one that can be taken almost first thing in the build process
(from Python), which moves the lock as early as possible.
Region inference contains several bitsets which are filled with large intervals
representing liveness. These can cause excessive memory usage, and are
relatively slow when growing to large sizes compared to the IntervalSet.
This is a compact, fast storage for variable-sized sets, typically consisting of
larger ranges. It is less efficient than a bitset if ranges are both small and
the domain size is small, but will still perform acceptably. With enormous
domain sizes and large ranges, the interval set performs much better, as it can
be much more densely packed in memory than the uncompressed bit set alternative.
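For intuition, a minimal sketch of the representation (this is just the idea, not rustc's `IntervalSet`):

```rust
// A sorted Vec of disjoint, non-adjacent inclusive ranges. A liveness range
// covering thousands of contiguous points costs one entry instead of one
// bit per point.
struct IntervalSet {
    ranges: Vec<(u32, u32)>, // sorted by start; disjoint and non-adjacent
}

impl IntervalSet {
    fn new() -> Self {
        IntervalSet { ranges: Vec::new() }
    }

    // Insert [start, end], merging any ranges it overlaps or touches.
    fn insert_range(&mut self, start: u32, end: u32) {
        let mut new_start = start;
        let mut new_end = end;
        self.ranges.retain(|&(s, e)| {
            if s <= end.saturating_add(1) && start <= e.saturating_add(1) {
                // Overlapping or adjacent: absorb into the merged range.
                new_start = new_start.min(s);
                new_end = new_end.max(e);
                false
            } else {
                true
            }
        });
        let pos = self.ranges.partition_point(|&(s, _)| s < new_start);
        self.ranges.insert(pos, (new_start, new_end));
    }

    // Binary search on starts, then check the candidate range's end.
    fn contains(&self, point: u32) -> bool {
        match self.ranges.partition_point(|&(s, _)| s <= point) {
            0 => false,
            i => self.ranges[i - 1].1 >= point,
        }
    }
}

fn main() {
    let mut live = IntervalSet::new();
    live.insert_range(10, 5_000); // one entry, not ~5,000 bits
    live.insert_range(5_001, 9_000); // adjacent: merged into (10, 9_000)
    assert!(live.contains(8_000));
    assert!(!live.contains(9_001));
    assert_eq!(live.ranges.len(), 1);
}
```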
It appears `find_max_slow` comes from the `BinaryHeap` docs, where the
`try_reserve` example is a slow implementation of `find_max`. It has no
relevance to this code in `OsString`, though.