Consistently avoid constructing optimized MIR when not doing codegen
The optimized MIR for closures is being encoded unconditionally, while
being unnecessary for cargo check. This turns out to be especially
costly with MIR inlining enabled, since it triggers computation of
optimized MIR for all callees that are being examined for inlining
purposes https://github.com/rust-lang/rust/pull/77307#issuecomment-751915450.
Skip encoding of optimized MIR for closures, enum constructors, struct
constructors, and trait fns when not doing codegen, like it is already
done for other items since 49433.
Separate out a `hir::Impl` struct
This makes it possible to pass the `Impl` directly to functions, instead
of having to pass each of the many fields one at a time. It also
simplifies matches in many cases.
See `rustc_save_analysis::dump_visitor::process_impl` or `rustdoc::clean::clean_impl` for a good example of how this makes `impl`s easier to work with.
r? `@petrochenkov` maybe?
This makes it possible to pass the `Impl` directly to functions, instead
of having to pass each of the many fields one at a time. It also
simplifies matches in many cases.
The optimized MIR for closures is being encoded unconditionally, while
being unnecessary for cargo check. This turns out to be especially
costly with MIR inlining enabled, since it triggers computation of
optimized MIR for all callees that are being examined for inlining
purposes.
Skip encoding of optimized MIR for closures, enum constructors, struct
constructors, and trait fns when not doing codegen, like it is already
done for other items since 49433.
Remove much of the special-case handling around crate metadata
dependency tracking by replacing `DepKind::CrateMetadata` and the
pre-allocation of corresponding `DepNodes` with on-demand invocation
of the `crate_hash` query.
Make CTFE able to check for UB...
... by not doing any optimizations on the `const fn` MIR used in CTFE. This means we duplicate all `const fn`'s MIR now, once for CTFE, once for runtime. This PR is for checking the perf effect, so we have some data when talking about https://github.com/rust-lang/const-eval/blob/master/rfcs/0000-const-ub.md
To do this, we now have two queries for obtaining mir: `optimized_mir` and `mir_for_ctfe`. It is now illegal to invoke `optimized_mir` to obtain the MIR of a const/static item's initializer, an array length, an inline const expression or an enum discriminant initializer. For `const fn`, both `optimized_mir` and `mir_for_ctfe` work, the former returning the MIR that LLVM should use if the function is called at runtime. Similarly it is illegal to invoke `mir_for_ctfe` on regular functions.
This is all checked via appropriate assertions and I don't think it is easy to get wrong, as there should be no `mir_for_ctfe` calls outside the const evaluator or metadata encoding. Almost all rustc devs should keep using `optimized_mir` (or `instance_mir` for that matter).
Reduce a large memory spike that happens during serialization by writing
the incr comp structures to file by way of a fixed-size buffer, rather
than an unbounded vector.
Effort was made to keep the instruction count close to that of the
previous implementation. However, buffered writing to a file inherently
has more overhead than writing to a vector, because each write may
result in a handleable error. To reduce this overhead, arrangements are
made so that each LEB128-encoded integer can be written to the buffer
with only one capacity and error check. Higher-level optimizations in
which entire composite structures can be written with one capacity and
error check are possible, but would require much more work.
The performance is mostly on par with the previous implementation, with
small to moderate instruction count regressions. The memory reduction is
significant, however, so it seems like a worth-while trade-off.
Move variable into the only branch where it is relevant
At the `if` branch `filter` (the `let` binding) is `None` iff `filter` (the parameter) was `None`.
We can branch on the parameter, move the binding into the `if`, and the complexity of handling
`Option<Option<_>` largely dissolves.
`@rustbot` modify labels +C-cleanup +T-compiler
Note: I have no idea how hot this code is. If this method frequently gets called with a `None` filter, there might be a small perf improvement.
Clean up in `each_child_of_item`
This PR hopes to eliminate some of the surprising elements I encountered while reading the function.
- `macros_only` is checked against inside the loop body, but if it is `true`, the loop is skipped anyway
- only query `span` when relevant
- no need to allocate attribute vector
At the `if` branch `filter` (the `let` binding) is `None` iff `filter` (the parameter) was `None`.
We can branch on the parameter, move the binding into the `if`, and the complexity of handling
`Option<Option<_>` largely dissolves.
Adds checks for:
* `no_core` attribute
* explicitly-enabled `legacy` symbol mangling
* mir_opt_level > 1 (which enables inlining)
I removed code from the `Inline` MIR pass that forcibly disabled
inlining if `-Zinstrument-coverage` was set. The default `mir_opt_level`
does not enable inlining anyway. But if the level is explicitly set and
is greater than 1, I issue a warning.
The new warnings show up in tests, which is much better for diagnosing
potential option conflicts in these cases.
When encoding a proc-macro crate, there may be gaps in the table (since
we only encode the crate root and proc-macro items). Account for this by
checking if the entry is present, rather than using `unwrap()`
Implement lazy decoding of DefPathTable during incremental compilation
PR https://github.com/rust-lang/rust/pull/75813 implemented lazy decoding of the `DefPathTable` from crate metadata. However, it requires decoding the entire `DefPathTable` when incremental compilation is active, so that we can map a decoded `DefPathHash` to a `DefId` from an arbitrary crate.
This PR adds support for lazy decoding of dependency `DefPathTable`s when incremental compilation si active.
When we load the incremental cache and dep
graph, we need the ability to map a `DefPathHash` to a `DefId` in the
current compilation session (if the corresponding definition still
exists).
This is accomplished by storing the old `DefId` (that is, the `DefId`
from the previous compilation session) for each `DefPathHash` we need to
remap. Since a `DefPathHash` includes the owning crate, the old crate is
guaranteed to be the right one (if the definition still exists). We then
use the old `DefIndex` as an initial guess, which we validate by
comparing the expected and actual `DefPathHash`es. In most cases,
foreign crates will be completely unchanged, which means that we our
guess will be correct. If our guess is wrong, we fall back to decoding
the entire `DefPathTable` for the foreign crate. This still represents
an improvement over the status quo, since we can skip decoding the
entire `DefPathTable` for other crates (where all of our guesses were
correct).
The discussion seems to have resolved that this lint is a bit "noisy" in
that applying it in all places would result in a reduction in
readability.
A few of the trivial functions (like `Path::new`) are fine to leave
outside of closures.
The general rule seems to be that anything that is obviously an
allocation (`Box`, `Vec`, `vec![]`) should be in a closure, even if it
is a 0-sized allocation.
with an eye on merging `TargetOptions` into `Target`.
`TargetOptions` as a separate structure is mostly an implementation detail of `Target` construction, all its fields logically belong to `Target` and available from `Target` through `Deref` impls.
foreign_modules query hash table lookups
When compiling a large monolithic crate we're seeing huge times in the `foreign_modules` query due to repeated iteration over foreign modules (in order to find a module by its id). This implements hash table lookups so that which massively reduces time spent in that query in this particular case. We'll need to see if the overhead of creating the hash table has a negative impact on performance in more normal compilation scenarios.
I'm working with `@wesleywiser` on this.
$ touch empty.rs
$ env RUSTC_LOG=debug rustc +stage1 --crate-type=lib empty.rs
Fails with a `BorrowMutError` because source map files are already
borrowed while `features_query` attempts to format a log message
containing a span.
Release the borrow before the query to avoid the issue.
Fixes#75982
The direct parent of a module may not be a module
(e.g. `const _: () = { #[path = "foo.rs"] mod foo; };`).
To find the parent of a module for purposes of resolution, we need to
walk up the tree until we hit a module or a crate root.
Preparation for a subsequent change that replaces
rustc_target::config::Config with its wrapped Target.
On its own, this commit breaks the build. I don't like making
build-breaking commits, but in this instance I believe that it
makes review easier, as the "real" changes of this PR can be
seen much more easily.
Result of running:
find compiler/ -type f -exec sed -i -e 's/target\.target\([)\.,; ]\)/target\1/g' {} \;
find compiler/ -type f -exec sed -i -e 's/target\.target$/target/g' {} \;
find compiler/ -type f -exec sed -i -e 's/target.ptr_width/target.pointer_width/g' {} \;
./x.py fmt
Fixes#77523
Now that hygiene serialization is implemented, we also need to record
`expansion_that_defined` so that we properly handle a foreign
`SyntaxContext`.
Currently, we serialize the same crate metadata for proc-macro crates as
we do for normal crates. This is quite wasteful - almost none of this
metadata is ever used, and much of it can't even be deserialized (if it
contains a foreign `CrateNum`).
This PR changes metadata encoding to skip encoding the majority of crate
metadata for proc-macro crates. Most of the `Lazy<[T]>` fields are left
completetly empty, while the non-lazy fields are left as-is.
Additionally, proc-macros now have a def span that does not include
their body. This was done for normal functions in #75465, but was missed
for proc-macros.
As a result of this PR, we should only ever encode local `CrateNum`s
when encoding proc-macro crates. I've added a specialized serialization
impl for `CrateNum` to assert this.
emit errors during AbstractConst building
There changes are currently still untested, so I don't expect this to pass CI 😆
It seems to me like this is the direction we want to go in, though we didn't have too much of a discussion about this.
r? @oli-obk
Don't query stability data when `staged_api` is off
This data only needs to be encoded when `#![feature(staged_api)]` or `-Zforce-unstable-if-unmarked` is on. Running these queries takes measurable time on large crates with many items, so skip it when the unstable flags have not been enabled.
Previously, we would throw away the `SyntaxContext` of any span with a
dummy location during metadata encoding. This commit makes metadata Span
encoding consistent with incr-cache Span encoding - an 'invalid span'
tag is only used when it doesn't lose any information.