The LLVM PassManager has a PrepareForThinLTO flag, which is intended
when compilation occurs in conjunction with linking by ThinLTO. The
flag has two effects:
* The NameAnonGlobal pass is run after all other passes, which
ensures that all globals have a name.
* In optimized builds, a number of late passes (mainly related to
vectorization and unrolling) are disabled, on the rationale that
these a) will increase codesize of the intermediate artifacts
and b) will be run by ThinLTO again anyway.
This patch enables the use of PrepareForThinLTO if Thin or ThinLocal
linking is used.
The background for this change is the CI failure in #49479, which
we assume to be caused by the NameAnonGlobal pass not being run.
As this changes which passes LLVM runs, this might have performance
(or other) impact, so we want to land this separately.
added missing implementation hint
Fixes [#50151](https://github.com/rust-lang/rust/issues/50151).
Actually, i don't know, should following code
`let x = |ref x: isize| { x += 1; };`
emit
`note: an implementation of std::ops::AddAssign might be missing for &isize`
or
`note: this is a reference to a type that + can be applied to; you need to dereference this variable once for this operation to work`
or both
Use the correct crt*.o files when linking musl targets.
This is supposed to support optionally using the system copy of musl
libc instead of the included one if supported. This currently only
affects the start files, which is enough to allow building rustc on musl
targets.
Most of the changes are analogous to crt-static.
Excluding the start files is something musl based distributions usually patch into their copy of rustc:
- https://github.com/alpinelinux/aports/blob/eb064c8/community/rust/musl-fix-linux_musl_base.patch
- https://github.com/voidlinux/void-packages/blob/77400fc/srcpkgs/rust/patches/link-musl-dynamically.patch
For third-party distributions that not yet carry those patches it would be nice if it was supported without the need to patch upstream sources.
## Reasons
### What breaks?
Some start files were missed when originally writing the logic to swap in musl start files (gcc comes with its own start files, which are suppressed by -nostdlib, but not manually included later on). This caused #36710, which also affects rustc with the internal llvm copy or any other system libraries that need crtbegin/crtend.
### How is it fixed?
The system linker already has all the logic to decide which start files to include, so we can just defer to it (except of course if it doesn't target musl).
### Why is it optional?
In #40113 it was first tried to remove the start files, which broke compiling musl-targeting static binaries with a glibc-targeting compiler. This is why it eventually landed without removing the start files. Being an option side-steps the issue.
### Why are the start files still installed?
This has the nice side-effect, that the produced rust-std-* binaries can still be used by on a glibc-targeting system with a rustc built against glibc.
## Does it work?
With the following build script (using [musl-cross-make](https://github.com/richfelker/musl-cross-make)): https://shadowice.org/~mixi/rust-musl/build.sh, I was able to cross-compile a musl-host musl-targeting rustc on a glibc-based system. The resulting binaries are at https://shadowice.org/~mixi/rust-musl/binaries/. This also requires #50103 and #50104 (which are also applied to the branch the build script uses).
Rename the 2018 edition lint names
* `rust_2018_breakage` -> `rust_2018_compatibility` - the lint for ensuring
that your code, in the 2015 edition, is compatible with the 2018 edition's
semantics. This is required to pass *before* you enable the 2018 edition.
* `rust_2018_migration` -> `rust_2018_idioms` - the lint for writing idiomatic
code after you've already enabled the 2018 edition
Improve single-use and zero-use lifetime lints
The code now correctly identifies *when* to lint -- or more correctly, anyhow -- but it doesn't yet offer suggestions for how to fix.
(I just remembered when writing this I had meant to go back over some of these cases around e.g. impl Trait and double check that everything is right...)
cc #44752
r? @cramertj
Rollup of 18 pull requests
Successful merges:
- #49423 (Extend tests for RFC1598 (GAT))
- #50010 (Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check))
- #50447 (Fix update-references for tests within subdirectories.)
- #50514 (Pull in a wasm fix from LLVM upstream)
- #50524 (Make DepGraph::previous_work_products immutable)
- #50532 (Don't use Lock for heavily accessed CrateMetadata::cnum_map.)
- #50538 ( Make CrateNum allocation more thread-safe. )
- #50564 (Inline `Span` methods.)
- #50565 (Use SmallVec for DepNodeIndex within dep_graph.)
- #50569 (Allow for specifying a linker plugin for cross-language LTO)
- #50572 (Clarify in the docs that `mul_add` is not always faster.)
- #50574 (add fn `into_inner(self) -> (Idx, Idx)` to RangeInclusive (#49022))
- #50575 (std: Avoid `ptr::copy` if unnecessary in `vec::Drain`)
- #50588 (Move "See also" disambiguation links for primitive types to top)
- #50590 (Fix tuple struct field spans)
- #50591 (Restore RawVec::reserve* documentation)
- #50598 (Remove unnecessary mutable borrow and resizing in DepGraph::serialize)
- #50606 (Retry when downloading the Docker cache.)
Failed merges:
- #50161 (added missing implementation hint)
- #50558 (Remove all reference to DepGraph::work_products)
Map the stack guard page with max protection on NetBSD
On NetBSD the initial mmap() protection of a mapping can not be made
less restrictive with mprotect().
So when mapping a stack guard page, use the maximum protection
we ever want to use, then mprotect() it to the permission we
want it to have initially.
Fixes#50313
* `rust_2018_breakage` -> `rust_2018_compatibility` - the lint for ensuring
that your code, in the 2015 edition, is compatible with the 2018 edition's
semantics. This is required to pass *before* you enable the 2018 edition.
* `rust_2018_migration` -> `rust_2018_idioms` - the lint for writing idiomatic
code after you've already enabled the 2018 edition
Retry when downloading the Docker cache.
As a safety measure, prevent spuriously needing to rebuild the docker image in case the network was reset while downloading.
Also, adjusted the retry function to insert a sleep between retries, because retrying immediately will often just hit the same issue.
Remove unnecessary mutable borrow and resizing in DepGraph::serialize
I might be mistaken, but I noticed this whilst in this file for something else. It appears that this mutable borrow is unnecessary and since it's locking it should be removed. The resizing looks redundant since nothing additional is added to the fingerprints in this function, so that can also be removed.
Restore RawVec::reserve* documentation
When the RawVec::try_reserve* methods were added, they took the place of
the ::reserve* methods in the source file, and new ::reserve* methods
wrapping the new try_reserve* methods were created. But the
documentation didn't move along, such that:
- reserve_* methods are barely documented.
- try_reserve_* methods have unmodified documentation from reserve_*,
such that their documentation indicate they are panicking/aborting.
This moves the documentation back to the right methods, with a
placeholder documentation for the try_reserve* methods.
std: Avoid `ptr::copy` if unnecessary in `vec::Drain`
This commit is spawned out of a performance regression investigation in #50496.
In tracking down this regression it turned out that the `expand_statements`
function in the compiler was taking quite a long time. Further investigation
showed two key properties:
* The function was "fast" on glibc 2.24 and slow on glibc 2.23
* The hottest function was memmove from glibc
Combined together it looked like glibc gained an optimization to the memmove
function in 2.24. Ideally we don't want to rely on this optimization, so I
wanted to dig further to see what was happening.
The hottest part of `expand_statements` was `Drop for Drain` in the call to
`splice` where we insert new statements into the original vector. This *should*
be a cheap operation because we're draining and replacing iterators of the exact
same length, but under the hood memmove was being called a lot, causing a
slowdown on glibc 2.23.
It turns out that at least one of the optimizations in glibc 2.24 was that
`memmove` where the src/dst are equal becomes much faster. [This program][prog]
executes in ~2.5s against glibc 2.23 and ~0.3s against glibc 2.24, exhibiting
how glibc 2.24 is optimizing `memmove` if the src/dst are equal.
And all that brings us to what this commit itself is doing. The change here is
purely to `Drop for Drain` to avoid the call to `ptr::copy` if the region being
copied doesn't actually need to be copied. For normal usage of just `Drain`
itself this check isn't really necessary, but because `Splice` internally
contains `Drain` this provides a nice speed boost on glibc 2.23. Overall this
should fix the regression seen in #50496 on glibc 2.23 and also fix the
regression on Windows where `memmove` looks to not have this optimization.
Note that the way `splice` was called in `expand_statements` would cause a
quadratic number of elements to be copied via `memmove` which is likely why the
tuple-stress benchmark showed such a severe regression.
Closes#50496
[prog]: https://gist.github.com/alexcrichton/c05bc51c6771bba5ae5b57561a6c1cd3
Allow for specifying a linker plugin for cross-language LTO
This PR makes the `-Zcross-lang-lto` flag optionally take the path to the `LLVMgold.so` linker plugin. If this path is specified, `rustc` will invoke the linker with the correct arguments (i.e. `-plugin` and various `-plugin-opt`s).
This can be used to ergonomically enable cross-language LTO for Rust programs with C/C++ dependencies:
```
clang -O2 test.c -otest.o -c -flto=thin
llvm-ar -rv libxxx.a test.o
rustc -L. main.rs -Zcross-lang-lto=/usr/lib64/LLVMgold.so -O -Clink-arg=-fuse-ld=gold
```
- Note that in theory this should work with Gold, LLD, and newer versions of binutils' LD but on my current system I could only get it to work with Gold.
- Also note that this will work best if the Clang version and Rust's LLVM version are close enough. Clang 6.0 works well with the current nightly.
r? @alexcrichton
Inline `Span` methods.
Because they are simple and hot.
This change speeds up some incremental runs of a few rustc-perf
benchmarks, the best by 3%.
Here are the ones with a speedup of at least 1%:
```
coercions
avg: -1.1% min: -3.4% max: -0.2%
html5ever-opt
avg: -0.8% min: -1.7% max: -0.2%
clap-rs-check
avg: -0.3% min: -1.4% max: 0.7%
html5ever
avg: -0.7% min: -1.2% max: -0.4%
html5ever-check
avg: -0.9% min: -1.1% max: -0.8%
clap-rs
avg: -0.4% min: -1.1% max: -0.1%
crates.io-check
avg: -0.8% min: -1.0% max: -0.6%
serde-opt
avg: -0.6% min: -1.0% max: -0.3%
```
Make CrateNum allocation more thread-safe.
This PR makes sure that we can't have race conditions when assigning CrateNums. It's a slight improvement but a larger refactoring of the CrateStore/CrateLoader infrastructure would be good, I think.
r? @Zoxc
Don't use Lock for heavily accessed CrateMetadata::cnum_map.
The `cnum_map` in `CrateMetadata` is used for two things:
1. to map `CrateNums` between crates (used a lot during decoding)
2. to construct the (reverse) post order of the crate graph
For the second case, we need to modify the map after the fact, which is why the map is wrapped in a `Lock`. This is bad for the first case, which does not need the modification and does lots of small reads from the map.
This PR splits case (2) out into a separate `dependencies` field. This allows to make the `cnum_map` immutable (and shifts the interior mutability to a less busy data structure).
Fixes#50502
r? @Zoxc
Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)
So one day I was writing something in my codebase that basically amounted to `impl SliceIndex for (Bound<usize>, Bound<usize>)`, and I said to myself:
*Boy, gee, golly! I never realized bounds checking was so tricky!*
At some point when I had around 60 lines of tests for it, I decided to go see how the standard library does it to see if I missed any edge cases. ...That's when I discovered that libcore only had about 40 lines of tests for slicing altogether, and none of them even used `..=`.
---
This PR includes:
* **Literally the first appearance of the word `get_unchecked_mut` in any directory named `test` or `tests`.**
* Likewise the first appearance of `get_mut` used with _any type of range argument_ in these directories.
* Tests for the panics on overflow with `..=`.
* I wanted to test on `[(); usize::MAX]` as well but that takes linear time in debug mode </3
* A horrible and ugly test-generating macro for the `should_panic` tests that increases the DRYness by a single order of magnitude (which IMO wasn't enough, but I didn't want to go any further and risk making the tests inaccessible to next guy).
* Same stuff for str!
* Actually, the existing `str` tests were pretty good. I just helped filled in the holes.
* [A fix for the bug it caught](https://github.com/rust-lang/rust/issues/50002). (only one ~~sadly~~)
Optimize layout of TypeVariants
This makes references to `Slice` use thin pointers by storing the slice length in the slice itself. `GeneratorInterior` is replaced by storing the movability of generators in `TyGenerator` and the interior witness is stored in `GeneratorSubsts` (which is just a wrapper around `&'tcx Substs`, like `ClosureSubsts`). Finally the fields of `TypeAndMut` is stored inline in `TyRef`. These changes combine to reduce `TypeVariants` from 48 bytes to 24 bytes on x86_64.
r? @michaelwoerister
Prevent spuriously needing to rebuild the docker image when the network
was down.
Also, adjusted the retry function to insert a sleep between retries,
because retrying immediately will often just hit the same issue.