std: Avoid `ptr::copy` if unnecessary in `vec::Drain`
This commit is spawned out of a performance regression investigation in #50496.
In tracking down this regression it turned out that the `expand_statements`
function in the compiler was taking quite a long time. Further investigation
showed two key properties:
* The function was "fast" on glibc 2.24 and slow on glibc 2.23
* The hottest function was memmove from glibc
Combined together it looked like glibc gained an optimization to the memmove
function in 2.24. Ideally we don't want to rely on this optimization, so I
wanted to dig further to see what was happening.
The hottest part of `expand_statements` was `Drop for Drain` in the call to
`splice` where we insert new statements into the original vector. This *should*
be a cheap operation because we're draining and replacing iterators of the exact
same length, but under the hood memmove was being called a lot, causing a
slowdown on glibc 2.23.
It turns out that at least one of the optimizations in glibc 2.24 was that
`memmove` where the src/dst are equal becomes much faster. [This program][prog]
executes in ~2.5s against glibc 2.23 and ~0.3s against glibc 2.24, exhibiting
how glibc 2.24 is optimizing `memmove` if the src/dst are equal.
And all that brings us to what this commit itself is doing. The change here is
purely to `Drop for Drain` to avoid the call to `ptr::copy` if the region being
copied doesn't actually need to be copied. For normal usage of just `Drain`
itself this check isn't really necessary, but because `Splice` internally
contains `Drain` this provides a nice speed boost on glibc 2.23. Overall this
should fix the regression seen in #50496 on glibc 2.23 and also fix the
regression on Windows where `memmove` looks to not have this optimization.
Note that the way `splice` was called in `expand_statements` would cause a
quadratic number of elements to be copied via `memmove` which is likely why the
tuple-stress benchmark showed such a severe regression.
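As a rough illustration of the check described above, here is a minimal sketch. It is not the actual `Drop for Drain` code: `restore_tail` and its parameters are hypothetical names, and it works on a plain `u32` slice rather than on `Vec` internals.

```rust
use std::ptr;

/// Hypothetical stand-in for the tail restoration done when a drain is
/// dropped. Moves `tail_len` elements from `tail_start` down to `start`
/// and reports whether a copy was actually issued.
pub fn restore_tail(buf: &mut [u32], start: usize, tail_start: usize, tail_len: usize) -> bool {
    assert!(start <= tail_start);
    assert!(tail_start + tail_len <= buf.len());

    // The check this commit describes: if the hole left by the drained
    // range has already been refilled (for example by a replacement of the
    // same length), source and destination coincide and the copy is
    // pure overhead.
    if tail_start == start {
        return false;
    }

    let base = buf.as_mut_ptr();
    // SAFETY: both ranges were bounds-checked above, and `ptr::copy`
    // permits overlapping regions (it has `memmove` semantics).
    unsafe {
        ptr::copy(base.add(tail_start), base.add(start), tail_len);
    }
    true
}
```

With `start == tail_start` the function returns without touching memory, which is the case hit on every `splice` call that replaces a range with the same number of elements.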
Closes #50496
[prog]: https://gist.github.com/alexcrichton/c05bc51c6771bba5ae5b57561a6c1cd3
Allow for specifying a linker plugin for cross-language LTO
This PR makes the `-Zcross-lang-lto` flag optionally take the path to the `LLVMgold.so` linker plugin. If this path is specified, `rustc` will invoke the linker with the correct arguments (i.e. `-plugin` and various `-plugin-opt`s).
This can be used to ergonomically enable cross-language LTO for Rust programs with C/C++ dependencies:
```
clang -O2 test.c -otest.o -c -flto=thin
llvm-ar -rv libxxx.a test.o
rustc -L. main.rs -Zcross-lang-lto=/usr/lib64/LLVMgold.so -O -Clink-arg=-fuse-ld=gold
```
- Note that in theory this should work with Gold, LLD, and newer versions of binutils' LD, but on my current system I could only get it to work with Gold.
- Also note that this will work best if the Clang version and Rust's LLVM version are close enough. Clang 6.0 works well with the current nightly.
r? @alexcrichton
Inline `Span` methods.
Because they are simple and hot.
This change speeds up some incremental runs of a few rustc-perf
benchmarks, the best by 3%.
Here are the ones with a speedup of at least 1%:
```
coercions
avg: -1.1% min: -3.4% max: -0.2%
html5ever-opt
avg: -0.8% min: -1.7% max: -0.2%
clap-rs-check
avg: -0.3% min: -1.4% max: 0.7%
html5ever
avg: -0.7% min: -1.2% max: -0.4%
html5ever-check
avg: -0.9% min: -1.1% max: -0.8%
clap-rs
avg: -0.4% min: -1.1% max: -0.1%
crates.io-check
avg: -0.8% min: -1.0% max: -0.6%
serde-opt
avg: -0.6% min: -1.0% max: -0.3%
```
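For readers unfamiliar with the attribute, here is a minimal sketch of what marking small accessors `#[inline]` looks like. The `Span` below is a simplified, hypothetical stand-in with two plain `u32` fields, not the compiler's actual type.

```rust
/// Simplified, hypothetical stand-in for a source span.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Span {
    lo: u32,
    hi: u32,
}

impl Span {
    /// Builds a span, swapping the endpoints if needed so `lo <= hi`.
    #[inline]
    pub fn new(lo: u32, hi: u32) -> Span {
        if lo <= hi {
            Span { lo, hi }
        } else {
            Span { lo: hi, hi: lo }
        }
    }

    /// Start of the span.
    #[inline]
    pub fn lo(self) -> u32 {
        self.lo
    }

    /// End of the span.
    #[inline]
    pub fn hi(self) -> u32 {
        self.hi
    }

    /// Same span with a different start.
    #[inline]
    pub fn with_lo(self, lo: u32) -> Span {
        Span::new(lo, self.hi)
    }
}
```

The attribute is only a hint, but for non-generic methods it also makes the body available to callers in other crates, which is where tiny, frequently called accessors tend to benefit.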
Make CrateNum allocation more thread-safe.
This PR makes sure that we can't have race conditions when assigning CrateNums. It's a slight improvement but a larger refactoring of the CrateStore/CrateLoader infrastructure would be good, I think.
r? @Zoxc
Don't use Lock for heavily accessed CrateMetadata::cnum_map.
The `cnum_map` in `CrateMetadata` is used for two things:
1. to map `CrateNums` between crates (used a lot during decoding)
2. to construct the (reverse) post order of the crate graph
For the second case, we need to modify the map after the fact, which is why the map is wrapped in a `Lock`. This is bad for the first case, which does not need the modification and does lots of small reads from the map.
This PR splits case (2) out into a separate `dependencies` field. This allows the `cnum_map` to be made immutable (and shifts the interior mutability to a less busy data structure).
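The shape of the split can be sketched roughly as follows. This is an illustration only: `Metadata` is a hypothetical, heavily simplified stand-in for `CrateMetadata`, crate numbers are plain `u32`s, and `std::sync::Mutex` stands in for the compiler's own `Lock` type.

```rust
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for `CrateMetadata`.
pub struct Metadata {
    /// Filled in once at construction and never modified, so the many
    /// small reads during decoding need no locking.
    cnum_map: Vec<u32>,
    /// The only part that is modified after the fact; the interior
    /// mutability lives here instead.
    dependencies: Mutex<Vec<u32>>,
}

impl Metadata {
    pub fn new(cnum_map: Vec<u32>) -> Metadata {
        Metadata {
            cnum_map,
            dependencies: Mutex::new(Vec::new()),
        }
    }

    /// Hot path: plain indexed read, no lock taken.
    pub fn map_cnum(&self, local: usize) -> Option<u32> {
        self.cnum_map.get(local).copied()
    }

    /// Cold path: record a dependency through a shared reference.
    pub fn add_dependency(&self, cnum: u32) {
        self.dependencies.lock().unwrap().push(cnum);
    }

    /// Snapshot of the dependencies recorded so far.
    pub fn dependencies(&self) -> Vec<u32> {
        self.dependencies.lock().unwrap().clone()
    }
}
```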
Fixes #50502
r? @Zoxc
Give SliceIndex impls a test suite of girth befitting the implementation (and fix a UTF8 boundary check)
So one day I was writing something in my codebase that basically amounted to `impl SliceIndex for (Bound<usize>, Bound<usize>)`, and I said to myself:
*Boy, gee, golly! I never realized bounds checking was so tricky!*
At some point when I had around 60 lines of tests for it, I decided to go see how the standard library does it to see if I missed any edge cases. ...That's when I discovered that libcore only had about 40 lines of tests for slicing altogether, and none of them even used `..=`.
---
This PR includes:
* **Literally the first appearance of the word `get_unchecked_mut` in any directory named `test` or `tests`.**
* Likewise the first appearance of `get_mut` used with _any type of range argument_ in these directories.
* Tests for the panics on overflow with `..=`.
* I wanted to test on `[(); usize::MAX]` as well but that takes linear time in debug mode </3
* A horrible and ugly test-generating macro for the `should_panic` tests that increases the DRYness by a single order of magnitude (which IMO wasn't enough, but I didn't want to go any further and risk making the tests inaccessible to the next guy).
* Same stuff for str!
  * Actually, the existing `str` tests were pretty good. I just helped fill in the holes.
* [A fix for the bug it caught](https://github.com/rust-lang/rust/issues/50002). (only one ~~sadly~~)
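To give a flavour of the edge cases involved, here is a small sketch using only stable slice and `str` methods. The helper names are hypothetical and this is not the test code from the PR.

```rust
use std::ops::RangeInclusive;
use std::panic;

/// Returns true if `&v[..=end]` panics.
pub fn inclusive_index_panics(v: &[i32], end: usize) -> bool {
    panic::catch_unwind(|| {
        let _ = &v[..=end];
    })
    .is_err()
}

/// Doubles the elements in `range` via `get_mut`, or returns false when
/// the range is out of bounds.
pub fn double_range(v: &mut [i32], range: RangeInclusive<usize>) -> bool {
    match v.get_mut(range) {
        Some(part) => {
            for x in part {
                *x *= 2;
            }
            true
        }
        None => false,
    }
}
```

The interesting inputs are the ones at the edges: `..=usize::MAX` cannot be turned into an exclusive end without overflowing, so `get` has to return `None` and indexing has to panic, and for `str` the end of an inclusive range must land on a character boundary after adding one.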
Optimize layout of TypeVariants
This makes references to `Slice` use thin pointers by storing the slice length in the slice itself. `GeneratorInterior` is replaced by storing the movability of generators in `TyGenerator` and the interior witness is stored in `GeneratorSubsts` (which is just a wrapper around `&'tcx Substs`, like `ClosureSubsts`). Finally the fields of `TypeAndMut` are stored inline in `TyRef`. These changes combine to reduce `TypeVariants` from 48 bytes to 24 bytes on x86_64.
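The thin-versus-fat pointer part of this can be checked with a small sketch. `Header` is a hypothetical type standing in for a slice that records its own length; it is not the compiler's `Slice`.

```rust
use std::mem::size_of;

/// Hypothetical header of a length-prefixed sequence: the length lives in
/// the pointee, so a reference to it carries no extra metadata.
#[repr(C)]
pub struct Header {
    pub len: usize,
    // The elements would follow the header in memory.
}

/// Size of an ordinary slice reference (data pointer plus length).
pub fn fat_ref_size() -> usize {
    size_of::<&[u32]>()
}

/// Size of a reference to the length-prefixed header (data pointer only).
pub fn thin_ref_size() -> usize {
    size_of::<&Header>()
}
```

A slice reference is two words wide and a reference to a sized type is one, so every variant that holds such a reference shrinks by one word when the length moves into the pointee.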
r? @michaelwoerister
Prevent spuriously needing to rebuild the docker image when the network
was down.
Also, adjusted the retry function to insert a sleep between retries,
because retrying immediately will often just hit the same issue.
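The retry-with-sleep idea, sketched in Rust rather than in the CI scripts' own language. The `retry` function, its parameters, and the fixed delay are all assumptions made for illustration.

```rust
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `op` up to `max_attempts` times, sleeping
/// for `delay` after each failed attempt except the last.
pub fn retry<T, E>(
    max_attempts: u32,
    delay: Duration,
    mut op: impl FnMut() -> Result<T, E>,
) -> Result<T, E> {
    assert!(max_attempts > 0);
    let mut attempt = 1;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            // Out of attempts: surface the last error.
            Err(e) if attempt >= max_attempts => return Err(e),
            Err(_) => {
                // Give a transient failure time to clear before retrying.
                thread::sleep(delay);
                attempt += 1;
            }
        }
    }
}
```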
Instead of tracking the "cause" of each bit that gets added, try to
recover that by walking outlives relationships. This is currently
imprecise, since it ignores the "point" where the outlives relationship
is incurred -- but that's ok, since we're about to stop considering that
overall in a later commit. This does seem to affect one error message
negatively; I didn't dig *too* hard to find out why.
./x.py test should be able to run individual tests
Allows the user to run individual tests by specifying a filename, e.g. `./x.py test src/test/run-pass/foo.rs`
Fixes #48483
When the RawVec::try_reserve* methods were added, they took the place of
the ::reserve* methods in the source file, and new ::reserve* methods
wrapping the new try_reserve* methods were created. But the
documentation didn't move along, such that:
- reserve_* methods are barely documented.
- try_reserve_* methods have unmodified documentation from reserve_*,
such that their documentation indicates they are panicking/aborting.
This moves the documentation back to the right methods, with a
placeholder documentation for the try_reserve* methods.
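The relationship between the two method families can be sketched like this. `Buf` and `CapacityOverflow` are hypothetical toy types that only track a capacity number; they are not `RawVec` or its error type.

```rust
/// Hypothetical error returned when a reservation cannot be satisfied.
#[derive(Debug, PartialEq, Eq)]
pub struct CapacityOverflow;

/// Toy buffer that tracks a capacity but allocates nothing.
pub struct Buf {
    cap: usize,
    limit: usize,
}

impl Buf {
    pub fn new(limit: usize) -> Buf {
        Buf { cap: 0, limit }
    }

    pub fn capacity(&self) -> usize {
        self.cap
    }

    /// Fallible variant: reports failure to the caller and never panics.
    pub fn try_reserve(&mut self, additional: usize) -> Result<(), CapacityOverflow> {
        let needed = self.cap.checked_add(additional).ok_or(CapacityOverflow)?;
        if needed > self.limit {
            return Err(CapacityOverflow);
        }
        self.cap = needed;
        Ok(())
    }

    /// Infallible variant: a thin wrapper that panics where `try_reserve`
    /// would return an error. The panic behaviour belongs in the
    /// documentation of this method, not the fallible one.
    pub fn reserve(&mut self, additional: usize) {
        if self.try_reserve(additional).is_err() {
            panic!("capacity overflow");
        }
    }
}
```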
Currently on CI we predominately compile LLVM with the default system compiler
which means gcc on Linux, some version of Clang on OSX, MSVC on Windows, and
gcc on MinGW. This commit switches Linux, OSX, and Windows to all use Clang
6.0.0 to build LLVM (aka the C/C++ compiler as part of the bootstrap). This
looks to generate faster code according to #49879 which translates to a faster
rustc (as LLVM internally is faster).
The major changes here were to the containers that build Linux releases,
basically adding a new step that uses the previous gcc 4.8 compiler to compile
the next Clang 6.0.0 compiler. Otherwise the OSX and Windows scripts have been
updated to download precompiled versions of Clang 6 and configure the build to
use them.
Note that `cc` was updated here to fix using `clang-cl` with `cc-rs` on MSVC, as
well as an update to `sccache` on Windows which was needed to correctly work
with `clang-cl`. Finally the MinGW compiler is entirely left out here
intentionally as it's currently thought that Clang can't generate C++ code for
MinGW and we need to use gcc, but this should be verified eventually.
Refactor auto trait handling in librustdoc to be accessible from librustc.
These commits transfer some of the functionality introduced in https://github.com/rust-lang/rust/pull/47833 to librustc with the intention of making the tools to work with auto traits accessible to third-party code, for example [rust-semverver](https://github.com/rust-lang-nursery/rust-semverver).
Some rough edges remain, and I'm certain some of the FIXMEs introduced will need some discussion, most notably the fairly ugly overall approach to pull out the core logic into librustc, which was previously fairly tightly coupled with various bits and bobs from librustdoc.
cc @Aaron1011