travis: Parallelize tests on Android
Currently our slowest test suite on Android, run-pass, takes over 5 times longer
than the x86_64 component (~400s -> ~2200s). QEMU emulation does typically add
overhead, but not 5x for this kind of workload. One of the slowest parts of the
Android process is that *compilation* happens serially. Tests themselves need to
run single-threaded on the emulator (due to how the test harness works), and
this forces the compiles themselves to be single-threaded as well.
Travis now gives us more than one core per machine, so it'd be much better if we
could take advantage of them! The emulator itself is still fundamentally
single-threaded, but we should see a nice speedup by being able to send it
binaries to run much more quickly.
It turns out that we've already got all the tools to do this in-tree. The
qemu-test-{server,client} that are in use for the ARM Linux testing are a
perfect match for the Android emulator. This commit migrates the custom adb
management code in compiletest/rustbuild to the same qemu-test-{server,client}
implementation that ARM Linux uses.
This allows us to lift the parallelism restriction on the compiletest test
suites, namely run-pass. Consequently, although we'll still run the tests
themselves in essentially single-threaded mode, we'll be able to compile all of
them in parallel, keeping the pipeline much fuller and using more cores for the
work at hand. Additionally, the architecture here should be a bit speedier as it
has less overhead than adb, which spawns a whole new process on both the host
and the emulator!
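Here's a generic sketch of the shape of that pipeline (not rustbuild's actual
code; all names below are made up): compile test binaries on many threads, but
serialize the run step behind a single lock since the device can only run one
test at a time.
```rust
use std::sync::{Arc, Mutex};
use std::thread;

fn main() {
    // Stand-in for the single emulator/device that runs one test at a time.
    let device = Arc::new(Mutex::new(()));
    let tests: Vec<String> = (0..8).map(|i| format!("run-pass/test-{}.rs", i)).collect();

    let handles: Vec<_> = tests
        .into_iter()
        .map(|test| {
            let device = Arc::clone(&device);
            thread::spawn(move || {
                // Compilation happens fully in parallel across threads...
                let binary = format!("{}.bin", test); // placeholder for rustc output

                // ...but only one binary runs on the device at a time.
                let _guard = device.lock().unwrap();
                println!("running {} on the emulator", binary);
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
}
```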
Locally on an 8-core machine I've seen the run-pass test suite speed up from
taking nearly an hour to only taking 6 minutes. I don't think we'll see quite as
drastic a speedup on Travis, but I'm hoping this change can place the Android tests
well below 2 hours instead of just above 2 hours.
Because the client/server here are now repurposed for more than just QEMU,
they've been renamed to `remote-test-{server,client}`.
Note that this PR does not currently modify how debuginfo tests are executed on
Android. While parallelizable it wouldn't be quite as easy, so that's left to
another day. Thankfully that test suite is much smaller than the run-pass test
suite.
As a final fix I discovered that the ARM and Android test suites were actually
running all library unit tests (e.g. stdtest, coretest, etc) twice. I've
corrected that to only run tests once which should also give a nice boost in
overall cycle time here.
Recently we switched from the win32 MinGW toolchain to the pthreads-based
toolchain. We ship `gcc.exe` from this toolchain with the `rust-mingw` package
in the standard distribution but the pthreads version of `gcc.exe` depends on
`libwinpthread-1.dll`. While we're shipping this DLL for the compiler to depend
on, we're not shipping it for gcc. As a workaround, just copy the DLL to
`gcc.exe`'s location and don't attempt to share it for now.
cc https://github.com/rust-lang/rust/issues/31840#issuecomment-297478538
Instead of hard-coding the command to run, using the environment
variable `GDB_CMD` (that defaults to `gdb`) allows using a different
debugger than the default `gdb` executable.
This makes it possible to use `cgdb` as the debugger, which provides a nicer
user interface. Note that one has to set `GDB_CMD="cgdb --"` (note the trailing
`--`) so that cgdb passes the proper arguments on to `gdb`.
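For illustration only, the same default-with-override pattern expressed in Rust
(the actual change lives in the test Makefiles; `gdb_command` is a made-up
helper):
```rust
use std::env;

// Use `GDB_CMD` if it is set, otherwise fall back to plain `gdb`.
fn gdb_command() -> String {
    env::var("GDB_CMD").unwrap_or_else(|_| "gdb".to_string())
}

fn main() {
    println!("debugger command: {}", gdb_command());
}
```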
This commit shrinks the size of the aforementioned table from
2,102 bytes to 1,197 bytes. This is achieved by observing that
most u16 entries share a common upper byte. Specifically:
- SINGLETONS now uses two tables, one of (upper byte, lower count)
pairs and another holding a series of lower bytes. For each upper
byte, the given number of lower bytes is read and compared.
- NORMAL now uses a variable length format for the count of "true"
codepoints and "false" codepoints (one byte with MSB unset, or
two big-endian bytes with the first MSB set).
The code size and relative performance remain roughly the same, as this
commit tries to optimize for both. The new table and algorithm have been
verified to be equivalent to the older ones.
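Here's a sketch of decoding the NORMAL run-length format described above; the
function name and signature are illustrative rather than the table's actual API:
```rust
// Runs alternate between "true" and "false" codepoints, starting with "true".
// Each run length is one byte with the MSB unset, or two big-endian bytes
// with the MSB of the first byte set.
fn lookup_normal(code: u32, normal: &[u8]) -> bool {
    let mut remaining = code as i64;
    let mut bytes = normal.iter().copied();
    let mut current = true;
    while let Some(b) = bytes.next() {
        let run_len = if b & 0x80 != 0 {
            (((b & 0x7f) as i64) << 8) | bytes.next().unwrap_or(0) as i64
        } else {
            b as i64
        };
        remaining -= run_len;
        if remaining < 0 {
            break; // the code point falls inside this run
        }
        current = !current;
    }
    current
}
```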
Add intrinsics & target features for rd{rand,seed}
One question is whether or not we want to map feature name `rdrnd` to `rdrand` instead.
EDIT: as for use case, I would like to port my rdrand crate from inline assembly to these intrinsics.
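As a usage sketch with today's `std::arch` API (which postdates this PR, so the
exact paths shown are the current ones rather than what this PR adds):
```rust
#[cfg(target_arch = "x86_64")]
fn hardware_random() -> Option<u64> {
    if !is_x86_feature_detected!("rdrand") {
        return None;
    }
    let mut value = 0u64;
    // SAFETY: guarded by the runtime feature check above.
    let ok = unsafe { std::arch::x86_64::_rdrand64_step(&mut value) };
    if ok == 1 { Some(value) } else { None }
}
```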
This commit adds a new flag to the configure script,
`--enable-extended`, which is intended for specifying a desire to
compile the full suite of Rust tools such as Cargo, the RLS, etc. This
is also an indication that the build system should create combined
installers such as the pkg/exe/msi artifacts.
Currently the `--enable-extended` flag just indicates that combined
installers should be built, and Cargo is itself not compiled just yet
but rather only downloaded from its location. The intention here is to
quickly get to feature parity with the current release process and then
we can start improving it afterwards.
All new files in this PR inside `src/etc/installer` are copied from the
rust-packaging repository.
Reduce the size of static data in std_unicode::tables
`BoolTrie` works well for sets of code points spread out through most of Unicode’s range, but it uses a lot of space for sets with few, mostly low, code points.
This switches a few of its instances to a similar but simpler trie data structure.
CC @raphlinus, who wrote the original `BoolTrie`.
## Before
`size_of::<BoolTrie>()` is 1552, which is added to `t.r3.len() * 8 + t.r5.len() + t.r6.len() * 8`:
* `Cc_table`: 1632
* `White_Space_table`: 1656
* `Pattern_White_Space_table`: 1640
* Total: 4928 bytes
## After
`size_of::<SmallBoolTrie>()` is 32, which is added to `t.r1.len() + t.r2.len() * 8`:
* `Cc_table`: 51
* `White_Space_table`: 273
* `Pattern_White_Space_table`: 193
* Total: 517 bytes
## Difference
Every Rust program with `std` statically linked should be about 4 KB smaller.
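For reference, a sketch of the simpler trie; the field names follow the size
arithmetic above, though the exact types and lookup are my reconstruction rather
than a copy of the source:
```rust
pub struct SmallBoolTrie {
    r1: &'static [u8],  // first level: one chunk index per 64-code-point block
    r2: &'static [u64], // leaves: 64-bit bitmap chunks
}

impl SmallBoolTrie {
    fn lookup(&self, c: char) -> bool {
        let c = c as u32;
        match self.r1.get((c >> 6) as usize) {
            // Test the bit for `c` inside the chunk selected by the first level.
            Some(&chunk) => (self.r2[chunk as usize] >> (c & 63)) & 1 != 0,
            // Code points past the end of the table are not in the set.
            None => false,
        }
    }
}
```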
- `--emit=asm --target=nvptx64-nvidia-cuda` can be used to turn a crate
into a PTX module (a `.s` file).
- intrinsics like `__syncthreads` and `blockIdx.x` are exposed as
`"platform-intrinsics"`.
- "cabi" has been implemented for the nvptx and nvptx64 architectures.
i.e. `extern "C"` works.
- a new ABI, `"ptx-kernel"`. That can be used to generate "global"
functions. Example: `extern "ptx-kernel" fn kernel() { .. }`. All
other functions are "device" functions.
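A minimal sketch of the new ABI, assuming a nightly compiler with the unstable
`abi_ptx` feature and an invocation along the lines of
`rustc --target nvptx64-nvidia-cuda --emit=asm --crate-type=lib kernel.rs`:
```rust
#![feature(abi_ptx)]
#![no_std]

// An ordinary function lowers to a "device" function, callable only from GPU code.
fn square(x: f32) -> f32 {
    x * x
}

// The new ABI marks this as a "global" (kernel) function, launchable from the host.
#[no_mangle]
pub unsafe extern "ptx-kernel" fn square_kernel(input: *const f32, output: *mut f32) {
    *output = square(*input);
}
```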
The problem occurred due to lines like
```
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
```
in `UnicodeData.txt`, which the script previously interpreted as two
separate characters, even though the pair represents the whole range.
Fixes #34318.
gdb: Fix pretty-printing special-cased Rust types
gdb trunk now reports fully qualified type names, just like lldb. Move the lldb code for extracting unqualified names to a shared file.
For current releases of gdb, `extract_type_name` should just be a no-op.
Fixes #35155
Make version check in gdb_rust_pretty_printing.py more compatible.
Some versions of Python don't support the `major` field on the object returned by `sys.version_info`.
Fixes #35724
r? @brson
LLVM upgrade
As discussed in https://internals.rust-lang.org/t/need-help-with-emscripten-port/3154/46 I'm trying to update the used LLVM checkout in Rust.
I basically took @shepmaster's code and applied it on top (though I did the commits manually; the [original commits have better descriptions](https://github.com/rust-lang/rust/compare/master...avr-rust:avr-support)).
With these changes I was able to build rustc. `make check` throws one last error on `run-pass/issue-28950.rs`. Output: https://gist.github.com/badboy/bcdd3bbde260860b6159aa49070a9052
I took the metadata changes as is and they seem to work, though it now uses the module in another step. I'm not sure if this is the best and correct way.
Things to do:
* [x] ~~Make `run-pass/issue-28950.rs` pass~~ unrelated
* [x] Find out how the `PositionIndependentExecutable` setting is now used
* [x] Is the `llvm::legacy` still the right way to do these things?
cc @brson @alexcrichton
Use the same procedure as Python to determine whether a character is
printable, described in [PEP 3138]. In particular, this means that the
following character classes are escaped:
- Cc (Other, Control)
- Cf (Other, Format)
- Cs (Other, Surrogate), even though they can't appear in Rust strings
- Co (Other, Private Use)
- Cn (Other, Not Assigned)
- Zl (Separator, Line)
- Zp (Separator, Paragraph)
- Zs (Separator, Space), except for the ASCII space `' '` (`0x20`)
This allows for user-friendly inspection of strings that are not
English (e.g. compare `"\u{e9}\u{e8}\u{ea}"` to `"éèê"`).
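For example, with a reasonably recent Rust, Debug formatting keeps printable
non-ASCII text intact and only escapes the classes listed above:
```rust
fn main() {
    // Printable non-ASCII stays as-is in Debug output.
    println!("{:?}", "éèê");        // prints "éèê", not "\u{e9}\u{e8}\u{ea}"
    // U+200B ZERO WIDTH SPACE is in Cf (Other, Format), so it is escaped.
    println!("{:?}", "a\u{200b}b"); // prints "a\u{200b}b"
}
```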
Fixes #34318.
[PEP 3138]: https://www.python.org/dev/peps/pep-3138/
If local-rust is the same as the current version, then force a local-rebuild
In Debian, we would like the option to build/rebuild the current release from
*either* the current or previous stable release. So we use enable-local-rust
instead of enable-local-rebuild, and read the bootstrap key dynamically from
whatever is installed locally.
In general, it does not make much sense to allow enable-local-rust without also
setting the bootstrap key, since the build would fail otherwise.
(The way I detect "the bootstrap key of [the local] rustc installation" is a bit hacky, suggestions welcome.)
Avoid redundant downloads when bootstrapping
If the local file is available, then verify it against the hash we just
downloaded, and if it matches then we don't need to download it again.
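A hypothetical sketch of that check, written in Rust for illustration (the real
logic lives in the Python bootstrap script, and the helper names here are made
up):
```rust
use std::fs;
use std::process::Command;

// Returns true if we still need to download `local_path`.
fn needs_download(local_path: &str, expected_sha256: &str) -> bool {
    if fs::metadata(local_path).is_err() {
        return true; // file not present at all
    }
    // Hash the existing file and compare against the freshly downloaded checksum.
    match Command::new("sha256sum").arg(local_path).output() {
        Ok(out) => !String::from_utf8_lossy(&out.stdout).starts_with(expected_sha256),
        Err(_) => true, // can't verify, so re-download to be safe
    }
}
```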
Add x86 intrinsics for bit manipulation (BMI 1.0, BMI 2.0, and TBM).
This PR adds the LLVM x86 intrinsics for the bit manipulation instruction sets (BMI 1.0, BMI 2.0, and TBM).
The objective of this pull-request is to allow building a library that implements all the algorithms offered by those instruction sets, using compiler intrinsics for the targets that support them (by means of `target_feature`).
The target features added are:
- `bmi`: Bit Manipulation Instruction Set 1.0, available in Intel >= Haswell and AMD's >= Jaguar/Piledriver,
- `bmi2`: Bit Manipulation Instruction Set 2.0, available in Intel >= Haswell and AMD's >= Excavator,
- `tbm`: Trailing Bit Manipulation, available only in AMD's Piledriver (won't be available in newer CPUs).
The intrinsics added are:
- BMI 1.0:
- `bextr`: Bit field extract (with register).
- BMI 2.0:
- `bzhi`: Zero high bits starting with specified bit position.
- `pdep`: Parallel bits deposit.
- `pext`: Parallel bits extract.
- TBM:
- `bextri`: Bit field extract (with immediate).
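As a usage sketch with today's `std::arch` API (which postdates this PR; the
paths shown are the current ones), here's `pext` with a portable fallback for
CPUs without BMI2:
```rust
#[cfg(target_arch = "x86_64")]
fn parallel_bits_extract(value: u64, mask: u64) -> u64 {
    if is_x86_feature_detected!("bmi2") {
        // SAFETY: only executed when the CPU reports BMI2 support.
        return unsafe { std::arch::x86_64::_pext_u64(value, mask) };
    }
    // Portable fallback: gather the bits of `value` selected by `mask`
    // and pack them into the low bits of the result.
    let (mut result, mut out_bit) = (0u64, 0);
    for bit in 0..64 {
        if (mask >> bit) & 1 == 1 {
            result |= ((value >> bit) & 1) << out_bit;
            out_bit += 1;
        }
    }
    result
}
```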
Convert makefiles to build LLVM/compiler-rt with CMake
This is certainly buggy, but I have successfully built on x86_64-unknown-linux-gnu and x86_64-pc-windows-gnu. I haven't built successfully on mac yet, and I've seen mysterious test failures on Linux, but I'm interested in throwing this at the bots to see what they think.
Efficient trie lookup for boolean Unicode properties
Replace binary search of ranges with trie lookup using leaves of
64-bit bitmap chunks. Benchmarks suggest this is approximately 10x
faster than the bsearch approach.
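Roughly, the lookup splits a code point into trie-level indices plus a 6-bit
offset into a 64-bit leaf chunk. The sketch below follows that shape; the field
names and exact layout are my reconstruction, not a copy of the source:
```rust
pub struct BoolTrie {
    pub r1: [u64; 32],       // leaves for U+0000..U+0800
    pub r2: [u8; 992],       // first level for U+0800..U+10000
    pub r3: &'static [u64],  // leaves
    pub r4: [u8; 256],       // first level for U+10000..U+110000
    pub r5: &'static [u8],   // second level
    pub r6: &'static [u64],  // leaves
}

fn chunk_contains(chunk: u64, c: u32) -> bool {
    (chunk >> (c & 63)) & 1 != 0
}

pub fn lookup(trie: &BoolTrie, c: char) -> bool {
    let c = c as u32;
    if c < 0x800 {
        chunk_contains(trie.r1[(c >> 6) as usize], c)
    } else if c < 0x10000 {
        let leaf = trie.r2[(c >> 6) as usize - 0x20];
        chunk_contains(trie.r3[leaf as usize], c)
    } else {
        let child = trie.r4[(c >> 12) as usize - 0x10];
        let leaf = trie.r5[(child as usize) * 64 + ((c >> 6) & 0x3f) as usize];
        chunk_contains(trie.r6[leaf as usize], c)
    }
}
```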
gdb Pretty Print: generic encoded enums were failing on reference/pointer types
If you debug this program using **gdb**
```rust
fn main() {
    let x = 10;
    let y = Some(&x);
    // additional code
}
```
and try to print **y**'s value from the debugger, you get the following:
```
(gdb) print y
Python Exception <class 'gdb.error'> Cannot convert value to int.:
$1 = {RUST$ENCODED$ENUM$0$None = Some = {0x7fff5fbff97c}}
```
What happens is that inside **debugger_pretty_printers_common.py** the method `is_null_variant` doesn't have any special handling for pointer values so it ends up calling `.as_integer()` on `discriminant_val` (which holds a pointer) and fails.
Considering it needs to handle pointers and return _true_ when the pointer is _null_, I modified the `.as_integer()` method in **gdb_rust_pretty_printing.py** to take pointers into consideration.
After this modification **gdb** prints **y** like this:
```
(gdb) print y
$1 = Some = {0x7fff5fbff97c}
```
Now, it would be nice to print something useful (instead of a pointer address) but the pretty printer doesn't currently handle references/pointers so that's a completely different subject.
When bootstrapping Rust using a previously built toolchain, I noticed
a number of libraries were not copied in. As a result, the copied-in
rustc fails to execute because it can't find all its dependencies.
Add them to the local_stage0.sh script.
Forcing them to be embedded in makefiles precludes being able to run them in
rustbuild, and adding them to compiletest gives us a great way to leverage
future enhancements to our "all-encompassing test suite runner", as well as
just moving more things into Rust.
All tests are still Makefile-based in the sense that they rely on `make` being
available to run them, but there's no longer any Makefile-trickery to run them
and rustbuild can now run them out of the box as well.
rustdoc: Only record the same impl once
Due to inlining it is possible to visit the same module multiple times during `<Cache as DocFolder>::fold_crate`, so we keep track of the modules we've already visited.
fixes #33054
r? @alexcrichton
Sanity check Python on OSX for LLDB tests
Two primary changes:
* Don't get past the configure stage if `python` isn't coming from `/usr/bin`
* Call `debugger.Terminate()` to prevent segfaults on newer versions of LLDB.
Closes #32994
Due to inlining it is possible to visit the same module multiple times
during `<Cache as DocFolder>::fold_crate`, so we keep track of the
modules we've already visited.
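A generic sketch of the dedup idea (the type and field names are hypothetical,
not rustdoc's actual `Cache` layout):
```rust
use std::collections::HashSet;

struct Cache {
    visited_modules: HashSet<u64>, // stand-in for rustdoc's item IDs
}

impl Cache {
    fn fold_module(&mut self, module_id: u64) {
        // `insert` returns false if the module was already seen, so an
        // inlined re-visit doesn't record the same impls a second time.
        if !self.visited_modules.insert(module_id) {
            return;
        }
        // ... record impls for this module ...
    }
}
```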
Adds a comment which explains the trie structure, and also does a
little arithmetic on lookup (no measurable impact, looks like modern
CPUs do this arithmetic in parallel with the memory lookup to find the
node) to save a bit of space. As a result, the memory impact of the
compiled tables is within a couple hundred bytes of the old
bsearch-range structure.
Replace binary search of ranges with trie lookup using leaves of
64-bit bitmap chunks. Benchmarks suggest this is approximately 10x
faster than the bsearch approach.
This commit removes all infrastructure from the repository for our so-called
snapshots to instead bootstrap the compiler from stable releases. Bootstrapping
from a previously stable release is a long-desired feature of distros because
they're not fans of downloading binary stage0 blobs from us. Additionally, this
makes our own CI easier as we can decommission all of the snapshot builders and
start having a regular cadence to when we update the stage0 compiler.
A new `src/etc/get-stage0.py` script was added which shares some code with
`src/bootstrap/bootstrap.py` to read a new file, `src/stage0.txt`, which lists
the current stage0 compiler as well as the Cargo we bootstrap from. This script
will download the relevant `rustc` package and unpack it into `$target/stage0`
as we do today.
One problem of bootstrapping from stable releases is that we're not able to
compile unstable code (e.g. all the `#![feature]` directives in libcore/libstd).
To overcome this we employ two strategies:
* The bootstrap key of the previous compiler is hardcoded into `src/stage0.txt`
(enabled as a result of #32731) and exported by the build system. This enables
nightly features in the compiler we download.
* The standard library and compiler are pinned to a specific stage0, which
doesn't change, so we're guaranteed that we'll continue compiling as we start
from a known fixed source.
The process for making a release will also need to be tweaked now to continue
the cadence of bootstrapping from the previous release. This process looks like:
1. Merge `beta` to `stable`
2. Produce a new stable compiler.
3. Change `master` to bootstrap from this new stable compiler.
4. Merge `master` to `beta`
5. Produce a new beta compiler
6. Change `master` to bootstrap from this new beta compiler.
Step 3 above should involve very few changes as `master` was previously
bootstrapping from `beta` which is the same as `stable` at that point in time.
Step 6, however, is where we benefit from removing lots of `#[cfg(stage0)]` and
get to use new features. This also shouldn't slow the release down too much, as
steps 1-5 require little work other than waiting, and step 6 just needs to
happen at some point during a release cycle; it's not time-sensitive.
Closes #29555. Closes #29557.
Right now on the most recent version of LLDB installed on OSX we'll segfault on
all the LLDB tests if this isn't called (unfortunately). Hopefully we've updated
LLDB on the bots to actually get this working everywhere!
Closes #32994
This commit rewrites all of the tidy checks we have, namely:
* featureck
* errorck
* tidy
* binaries
into Rust under a new `tidy` tool inside of the `src/tools` directory. This at
the same time deletes all the corresponding Python tidy checks so we can be sure
to only have one source of truth for all the tidy checks.
cc #31590
rustbuild: Fix dist for non-host targets
The `rust-std` package that we produce is expected to have not only the standard
library but also libtest for compiling unit tests. Unfortunately this does not
currently happen due to the way rustbuild is structured.
There are currently two main stages of compilation in rustbuild, one for the
standard library and one for the compiler. This is primarily done to allow us to
fill in the sysroot right after the standard library has finished compiling to
continue compiling the rest of the crates. Consequently the entire compiler does
not have to explicitly depend on the standard library, and this also should
allow us to pull in crates.io dependencies into the build in the future because
they'll just naturally build against the std we just produced.
These phases, however, do not represent a cross-compiled build. Target-only
builds also require libtest, and libtest is currently part of the
all-encompassing "compiler build". There's unfortunately no way to learn about
just libtest and its dependencies (in a great and robust fashion), so to ensure
that we can copy the right artifacts over, this commit introduces a new build
step, libtest.
The new libtest build step has documentation, dist, and link steps as std/rustc
already do. The compiler now depends on libtest instead of libstd, and all
compiler crates can now assume that test and its dependencies are implicitly
part of the sysroot (hence explicit dependencies being removed). This makes the
build a tad less parallel as in theory many rustc crates can be compiled in
parallel with libtest, but this likely isn't where we really need parallelism
either (all the time is still spent in the compiler).
All in all this allows the `dist-std` step to depend on both libstd and libtest,
so `rust-std` packages produced by rustbuild should start having both the
standard library and libtest.
Closes #32523
This commit improves the compile time of `rustc_platform_intrinsics` from 23s to
3.6s if compiling with `-O` and from 77s to 17s if compiling with `-O -g`. The
compiled rlib size also drops from 3.1M to 1.2M.
The wins here were gained by removing the destructors associated with `Type` by
removing the internal `Box` and `Vec` indirections. These destructors meant that
a lot of landing pads and extra code were generated to manage the runtime
representations. Instead, everything can basically be statically computed and
shoved into rodata, so all we need is a giant string compare to look up what's
what.
Closes #28273
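A hypothetical before/after sketch of the indirection change (the type names are
illustrative, not the real `rustc_platform_intrinsics` definitions):
```rust
// Before: owned indirections give the type a destructor, so the compiler emits
// drop glue and landing pads throughout the huge intrinsic tables.
enum OwnedType {
    Vector(Box<OwnedType>, usize),
    Aggregate(Vec<OwnedType>),
    Int(u8),
}

// After: borrowing from 'static data needs no destructor, so the whole table
// can be computed at compile time and placed in rodata.
enum StaticType {
    Vector(&'static StaticType, usize),
    Aggregate(&'static [StaticType]),
    Int(u8),
}

// A table entry like this lives entirely in read-only data.
static I32X4: StaticType = StaticType::Vector(&StaticType::Int(32), 4);
```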
This defines the `_mm256_blendv_pd` and `_mm256_blendv_ps` intrinsics.
The `_mm256_blend_pd` and `_mm256_blend_ps` intrinsics are not available
as LLVM intrinsics. In Clang they are implemented using the
shufflevector builtin.
Intel reference: https://software.intel.com/en-us/node/524070.
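A usage sketch with today's `std::arch` paths (this commit only defines the
internal platform intrinsics; the stable API shown here came later):
```rust
#[cfg(target_arch = "x86_64")]
fn blendv_demo() {
    if !is_x86_feature_detected!("avx") {
        return;
    }
    unsafe {
        use std::arch::x86_64::*;
        let a = _mm256_set1_ps(1.0);
        let b = _mm256_set1_ps(2.0);
        // Per lane, a set sign bit in `mask` selects the lane from `b`,
        // otherwise the lane comes from `a`.
        let mask = _mm256_set_ps(-1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0);
        let blended = _mm256_blendv_ps(a, b, mask);
        let mut out = [0.0f32; 8];
        _mm256_storeu_ps(out.as_mut_ptr(), blended);
        println!("{:?}", out);
    }
}
```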