rustc_mir: don't build unused unwind cleanup blocks
When building a scope exit, don't build unwind cleanup blocks unless they will actually be used by the unwind path of a drop - the unused blocks are removed by SimplifyCfg, but they can cause a significant performance slowdown before they are removed. That fixes#43511.
Also a few other small MIR cleanups & optimizations.
r? @eddyb
The mutability system now checks where derefs go through borrows in the
loan chain, and can correctly detect mutable borrows inside structs and
tuples.
add documentation for function pointers as a primitive
This PR adds a new kind of primitive to the standard library documentation: Function pointers. It's useful to be able to discuss them separately from closure-trait-objects, and to have something to point to when discussing function pointers as a *type* and not a *trait*.
Fixes#17104
This helps with vectorization in some cases, such as (0..u16::MAX).collect::<Vec<u16>>(),
as LLVM is able to change the loop condition to use equality instead of less than
Run translation and LLVM in parallel when compiling with multiple CGUs
This is still a work in progress but the bulk of the implementation is done, so I thought it would be good to get it in front of more eyes.
This PR makes the compiler start running LLVM while translation is still in progress, effectively allowing for more parallelism towards the end of the compilation pipeline. It also allows the main thread to switch between either translation or running LLVM, which allows to reduce peak memory usage since not all LLVM module have to be kept in memory until linking. This is especially good for incr. comp. but it works just as well when running with `-Ccodegen-units=N`.
In order to help tuning and debugging the work scheduler, the PR adds the `-Ztrans-time-graph` flag which spits out html files that show how work packages where scheduled:
![Building regex](https://user-images.githubusercontent.com/1825894/28679272-f6752bd8-72f2-11e7-8a6c-56207855ce95.png)
(red is translation, green is llvm)
One side effect here is that `-Ztime-passes` might show something not quite correct because trans and LLVM are not strictly separated anymore. I plan to have some special handling there that will try to produce useful output.
One open question is how to determine whether the trans-thread should switch to intermediate LLVM processing.
TODO:
- [x] Restore `-Z time-passes` output for LLVM.
- [x] Update documentation, esp. for work package scheduling.
- [x] Tune the scheduling algorithm.
cc @alexcrichton @rust-lang/compiler
trans::mir::constant - fix assignment error recovery
trans::mir::constant - fix assignment error recovery
We used to not store anything when the RHS of an assignment returned an error, which caused ICEs downstream.
Fixes#43197.
Instead of finding the next free disambiguator by incrementing it until
you find a place, store the next available disambiguator in an hash-map.
This avoids O(n^2) performance when lots of items have the same
un-disambiguated `DefPathData` - e.g. all `use` items have
`DefPathData::Misc`.
We already had a cache for file contents, but we read the source-file
before testing the cache, causing obvious slowness, so this just avoids
loading the source-file when the cache already has the contents.
add docs for references as a primitive
Just like #43529 did for function pointers, here is a new primitive page for references.
This PR will pull in impls on references if it's a reference to a generic type parameter. Initially i was only able to pull in impls that were re-exported from another crate; crate-local impls got a different representation in the AST, and i had to change how types were resolved when cleaning it. (This is the change at the bottom of `librustdoc/clean/mod.rs`, in `resolve_type`.) I'm unsure the full ramifications of the change, but from what it looks like, it shouldn't impact anything major. Likewise, references to generic type parameters also get the `&'a [mut]` linked to the new page.
cc @rust-lang/docs: Is this sufficient information? The listing of trait impls kinda feels redundant (especially if we can get the automated impl listing sorted again), but i still think it's useful to point out that you can use these in a generic context.
Fixes#15654
resolve: Try to fix instability in import suggestions
cc https://github.com/rust-lang/rust/pull/42033
`lookup_import_candidates` walks module graph in DFS order and skips modules that were already visited (which is correct because there can be cycles).
However it means that if we visited `std::prelude::v1::Result::Ok` first, we will never visit `std::result::Result::Ok` because `Result` will be skipped as already visited (note: enums are also modules here), and otherwise, if we visited `std::result::Result::Ok` first, we will never get to `std::prelude::v1::Result::Ok`.
What child module of `std` (`prelude` or `result`) we will visit first, depends on randomized hashing, so we have instability in diagnostics.
With this patch modules' children are visited in stable order in `lookup_import_candidates`, this should fix the issue, but let's see what Travis will say.
r? @oli-obk
Three small fixes for save-analysis
First commit does some naive deduplication of macro uses. We end up with lots of duplication here because of the weird way we get this data (we extract a use for every span generated by a macro use).
Second commit is basically a typo fix.
Third commit is a bit interesting, it partially reverts a change from #40939 where temporary variables in format! (and thus println!) got a span with the primary pointing at the value stored into the temporary (e.g., `x` in `println!("...", x)`). If `format!` had a definition it should point at the temporary in the macro def, but since it is built-in, that is not possible (for now), so `DUMMY_SP` is the best we can do (using the span in the callee really breaks save-analysis because it thinks `x` is a definition as well as a reference).
There aren't a test for this stuff because: the deduplication is filtered by any of the users of save-analysis, so it is purely an efficiency change. I couldn't actually find an example for the second commit that we have any machinery to test, and the third commit is tested by the RLS, so there will be a test once I update the RLS version and and uncomment the previously failing tests).
r? @jseyfried
These need to be inlined across crates to avoid showing up as one-instruction
functions in profiles! In the benchmark from #43578 this decreased the
translation item collection step from 30s to 23s, and looks like it also allowed
vectorization elsewhere of the operations!
Commit c4710203c098b in #43492 make `LLVMRustHasFeature` "more robust"
by using `getFeatureTable()`. However, this function is specific to
Rust's own LLVM fork, not upstream LLVM-4.0, so we need to use
`#if LLVM_RUSTLLVM` to guard this call.
borrowck: skip CFG construction when there is nothing to propagate
CFG construction takes a large amount of time and memory, especially for
large constants. If such a constant contains no actions on lvalues, it
can't have borrowck problems and can be ignored by it.
This removes the 4.9GB borrowck peak from #36799. It seems that HIR had
grown by 300MB and MIR had grown by 500MB from the last massif
collection and that remains to be investigated, but this at least shaves
the borrowck peak.
r? @nikomatsakis
Set `LLVM_LINK_LLVM_DYLIB=ON` -- "If enabled, tools will be linked with
the libLLVM shared library." Rust doesn't ship any of the LLVM tools,
and only needs a few at all for some test cases, so statically linking
the tools is just a waste of space. I've also had memory issues on
slower machines with LLVM debuginfo enabled, when several tools start
linking in parallel consuming several GBs each.
With the default configuration, `build/x86_64-unknown-linux-gnu/llvm`
was 1.5GB before, now down to 731MB. The difference is more drastic
with `--enable-llvm-release-debuginfo`, from 28GB to "only" 13GB.
This does not change the linking behavior of `rustc_llvm`.
Removing nops can allow more basic blocks to be merged, but merging
basic blocks can't allow for more nops to be removed, so we should
remove nops first.
This doesn't matter *that* much, because normally we run SimplifyCfg
several times, but there's no reason not to do it.
I saw MIR cache invalidation somewhat hot on my profiler when per-BB
indexin was used. That shouldn't matter much, but there is no good
reason not to use an iterator.