This is step 2 towards fixing #77548.
In the codegen and codegen-units test suites, the `//` comment markers
were kept in order not to affect any source locations. This is because
these tests cannot be automatically `--bless`ed.
This commit removes the obsolete printer and replaces all uses of it
with `FmtPrinter`. Of the replaced uses, all but one use was in `debug!`
logging, two cases were notable:
- `MonoItem::to_string` is used in `-Z print-mono-items` and therefore
affects the output of all codegen-units tests.
- `DefPathBasedNames` was used in `librustc_codegen_llvm/type_of.rs`
with `LLVMStructCreateNamed` and that'll now get different values, but
this should result in no functional change.
Signed-off-by: David Wood <david@davidtw.co>
This commit restricts the substitution polymorphization added in #75255
to only apply to the tupled upvar substitution, rather than all
substitutions, fixing a bunch of regressions when polymorphization is
enabled.
Signed-off-by: David Wood <david@davidtw.co>
By always polymorphizing substitutions, functions which take closures as
arguments (e.g. `impl Fn()`) can have fewer mono items when some of the
argument closures can be polymorphized.
Signed-off-by: David Wood <david@davidtw.co>
This commit disables polymorphisation to resolve regressions related to
closures which inherit unused generic parameters and are then used in
casts or reflection.
Signed-off-by: David Wood <david@davidtw.co>
This commit implements the `unused_generic_params` query, an initial
version of polymorphization which detects when an item does not use
generic parameters and is being needlessly monomorphized as a result.
Signed-off-by: David Wood <david@davidtw.co>
This commit intends to fix an accidental regression from #70846. The
goal of #70846 was to build compiler-builtins with a maximal number of
CGUs to ensure that each module in the source corresponds to an object
file. This high degree of control for compiler-builtins is desirable to
ensure that there's at most one exported symbol per CGU, ideally
enabling compiler-builtins to not conflict with the system libgcc as
often.
In #70846, however, only part of the compiler understands that
compiler-builtins is built with many CGUs. The rest of the compiler
thinks it's building with `sess.codegen_units()`. Notably the
calculation of `sess.lto()` consults `sess.codegen_units()`, which when
there's only one CGU it disables ThinLTO. This means that
compiler-builtins is built without ThinLTO, which is quite harmful to
performance! This is the root of the cause from #73135 where intrinsics
were found to not be inlining trivial functions.
The fix applied in this commit is to remove the special-casing of
compiler-builtins in the compiler. Instead the build system is now
responsible for special-casing compiler-builtins. It doesn't know
exactly how many CGUs will be needed but it passes a large number that
is assumed to be much greater than the number of source-level modules
needed. After reading the various locations in the compiler source, this
seemed like the best solution rather than adding more and more special
casing in the compiler for compiler-builtins.
Closes#73135
`-C incremental` was introduced over two years ago. `-Z incremental` was
kept for transitioning, but it's been long enough now that it should be
ok to remove it.
This commit addresses #64319 by removing the `dylib` crate type from the
list of crate type that exports generic symbols. The bug in #64319
arises because a `dylib` crate type was trying to export a symbol in an
uptream crate but it miscalculated the symbol name of the uptream
symbol. This isn't really necessary, though, since `dylib` crates aren't
that heavily used, so we can just conservatively say that the `dylib`
crate type never exports generic symbols, forcibly removing them from
the exported symbol lists if were to otherwise find them.
The fix here happens in two places:
* First is in the `local_crate_exports_generics` method, indicating that
it's now `false` for the `Dylib` crate type. Only rlibs actually
export generics at this point.
* Next is when we load exported symbols from upstream crate. If, for our
compilation session, the crate may be included from a dynamic library,
then its generic symbols are removed. When the crate was linked into a
dynamic library its symbols weren't exported, so we can't consider
them a candidate to link against.
Overally this should avoid situations where we incorrectly calculate the
upstream symbol names in the face of differnet `share_generics` options,
ultimately...
Closes#64319
drop glue takes in mutable references, it should reflect that in its type
When drop glue begins, it should retag, like all functions taking references do. But to do that, it needs to take the reference at a proper type: `&mut T`, not `*mut T`.
Failing to retag can mean that the memory the reference points to remains frozen, and `EscapeToRaw` on a frozen location is a NOP, meaning later mutations cause a Stacked Borrows violation.
Cc @nikomatsakis @Gankro because Stacked Borrows
Cc @eddyb for the changes to miri argument passing (the intention is to allow passing `*mut [u8]` when `&mut [u8]` is expected and vice versa)
This commit updates the handling of `#[inline(always)]` functions at -O0 to
ensure that it's always inlined regardless of the number of codegen units used.
Closes#45201
This commit tweaks the behavior of inlining functions into multiple codegen
units when rustc is compiling in debug mode. Today rustc will unconditionally
treat `#[inline]` functions by translating them into all codegen units that
they're needed within, marking the linkage as `internal`. This commit changes
the behavior so that in debug mode (compiling at `-O0`) rustc will instead only
translate `#[inline]` functions into *one* codegen unit, forcing all other
codegen units to reference this one copy.
The goal here is to improve debug compile times by reducing the amount of
translation that happens on behalf of multiple codegen units. It was discovered
in #44941 that increasing the number of codegen units had the adverse side
effect of increasing the overal work done by the compiler, and the suspicion
here was that the compiler was inlining, translating, and codegen'ing more
functions with more codegen units (for example `String` would be basically
inlined into all codegen units if used). The strategy in this commit should
reduce the cost of `#[inline]` functions to being equivalent to one codegen
unit, which is only translating and codegen'ing inline functions once.
Collected [data] shows that this does indeed improve the situation from [before]
as the overall cpu-clock time increases at a much slower rate and when pinned to
one core rustc does not consume significantly more wall clock time than with one
codegen unit.
One caveat of this commit is that the symbol names for inlined functions that
are only translated once needed some slight tweaking. These inline functions
could be translated into multiple crates and we need to make sure the symbols
don't collideA so the crate name/disambiguator is mixed in to the symbol name
hash in these situations.
[data]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334880911
[before]: https://github.com/rust-lang/rust/issues/44941#issuecomment-334583384
This commit is an implementation of LLVM's ThinLTO for consumption in rustc
itself. Currently today LTO works by merging all relevant LLVM modules into one
and then running optimization passes. "Thin" LTO operates differently by having
more sharded work and allowing parallelism opportunities between optimizing
codegen units. Further down the road Thin LTO also allows *incremental* LTO
which should enable even faster release builds without compromising on the
performance we have today.
This commit uses a `-Z thinlto` flag to gate whether ThinLTO is enabled. It then
also implements two forms of ThinLTO:
* In one mode we'll *only* perform ThinLTO over the codegen units produced in a
single compilation. That is, we won't load upstream rlibs, but we'll instead
just perform ThinLTO amongst all codegen units produced by the compiler for
the local crate. This is intended to emulate a desired end point where we have
codegen units turned on by default for all crates and ThinLTO allows us to do
this without performance loss.
* In anther mode, like full LTO today, we'll optimize all upstream dependencies
in "thin" mode. Unlike today, however, this LTO step is fully parallelized so
should finish much more quickly.
There's a good bit of comments about what the implementation is doing and where
it came from, but the tl;dr; is that currently most of the support here is
copied from upstream LLVM. This code duplication is done for a number of
reasons:
* Controlling parallelism means we can use the existing jobserver support to
avoid overloading machines.
* We will likely want a slightly different form of incremental caching which
integrates with our own incremental strategy, but this is yet to be
determined.
* This buys us some flexibility about when/where we run ThinLTO, as well as
having it tailored to fit our needs for the time being.
* Finally this allows us to reuse some artifacts such as our `TargetMachine`
creation, where all our options we used today aren't necessarily supported by
upstream LLVM yet.
My hope is that we can get some experience with this copy/paste in tree and then
eventually upstream some work to LLVM itself to avoid the duplication while
still ensuring our needs are met. Otherwise I fear that maintaining these
bindings may be quite costly over the years with LLVM updates!
Drop of arrays is now translated in trans::block in an ugly way that I
should clean up in a later PR, and does not handle panics in the middle
of an array drop, but this commit & PR are growing too big.