Improvements to building and CI for mingw/msys
I was getting error messages when trying to follow the build instructions for the mingw build of Rust, and managed to track the issue down to an incompatibility between Rust's bootstrap program and MSYS2's version of git. Essentially, the problem is that MSYS2's git works in emulated unix-y paths, but bootstrap expects a Windows path. I found a workaround for this by using relative paths instead of absolute paths.
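To illustrate why relative paths sidestep the mismatch, here is a minimal Python sketch. It is not bootstrap's actual code (bootstrap is written in Rust), and the checkout location `C:\rust` with its MSYS2 spelling `/c/rust` is a hypothetical example. The absolute forms disagree, but the path relative to the checkout root is the same in both worlds.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Hypothetical checkout root, spelled the way each tool would report it.
WINDOWS_ROOT = PureWindowsPath(r"C:\rust")  # what a native Windows program sees
MSYS_ROOT = PurePosixPath("/c/rust")        # what MSYS2's git reports


def relative_to_root(path, root):
    """Return `path` relative to `root`, using forward slashes."""
    return path.relative_to(root).as_posix()


# The same submodule directory, in both absolute spellings (hypothetical paths).
win = relative_to_root(PureWindowsPath(r"C:\rust\src\llvm-project"), WINDOWS_ROOT)
posix = relative_to_root(PurePosixPath("/c/rust/src/llvm-project"), MSYS_ROOT)

print(win)
print(posix)
```

Both calls yield the same relative path, so a tool that only ever exchanges relative paths with git never has to translate between the two absolute spellings.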
Along with that fix, this PR also updates the build instructions for MinGW to be compatible with modern versions of MSYS2, and makes some changes to CI to make sure that MSYS2's version of git is tested. In particular, I'm suggesting using the [MSYS2 github action](https://github.com/marketplace/actions/setup-msys2) specially made for this purpose, which is much less hacky than the old approach and gives us more control over which packages are installed. I also removed as many alternate versions of key tools as I could find from PATH, to avoid accidental usage, and cleaned up some abuses of the `CUSTOM_MINGW` environment variable.
This fixes https://github.com/rust-lang/rust/issues/105696 and fixes https://github.com/rust-lang/rust/issues/117567
This seems to fix two sporadic errors that have been appearing in CI.
One is an issue with cmake being unable to verify that cmake is able to
build a simple test program. The other is an `invalid r_symbolnum`
linking error when trying to build one of cranelift's tests.
This is intended as a temporary fix until we can figure out how to
resolve those issues.
Bump Fuchsia, build tests, and use 8 core bots
- Build Fuchsia on 8 cores instead of 16
- Skip building cranelift for Fuchsia
- Bump Fuchsia (includes building tests)
This includes a change to the upstream build_fuchsia_from_rust_ci script that builds a minimal set of tests, to improve coverage on this builder. This would have caught https://github.com/rust-lang/rust-clippy/issues/11952 and #119593.
See prior discussion on #119400 about building on 8 cores instead of 16. This PR combines changes from that and #119399, plus clean up.
r? `@Mark-Simulacrum`
This commit temporarily reverts the addition of M1 runners on GitHub
Actions to work around a billing issue related to their beta. It also
removes the `aarch64-apple` job, which was only added after the addition
of M1 runners. Since it has never been tested on the prior hardware, we
are skipping the tests to reduce the risk of build failures.
update which targets we test Miri on
I hope this doesn't cost too much time; running only the "pass" tests should be reasonably fast (1-2 minutes on my system).
Fixes https://github.com/rust-lang/rust/issues/117167
This avoids needlessly building cg_clif for other targets and makes it
easier for the dist code to determine if it should distribute cg_clif as
a component.
ci: add a runner for vanilla LLVM 17
For CI cost, this can be seen as replacing the llvm-14 runner we dropped in #114148.
Also, I've set `IS_NOT_LATEST_LLVM` in the llvm-16 runner, since that's not the latest anymore.
Remove wasm32-unknown-emscripten tests from CI
This builder tested the wasm32-unknown-emscripten target, which is tier 2 (and so not eligible for testing). In the recent beta [promotion](https://github.com/rust-lang/rust/pull/116362#issuecomment-1744960904), we ran into a problem with this target: emscripten doesn't support passing environment variables into the std environment, so we can't enable RUSTC_BOOTSTRAP for libtest in order to pass -Zunstable-options.
We worked around this for the beta/stable branches, but given this problem, and its tier 2 status, just dropping the target's tests entirely seems warranted. Downgrading to tier 3 may also be a good idea, but that is a separate conversation not proposed here.
Raise minimum supported Apple OS versions
This implements the proposal to raise the minimum supported Apple OS versions as laid out in the now-completed MCP (https://github.com/rust-lang/compiler-team/issues/556).
As of this PR, rustc and the stdlib now support these versions as the baseline:
- macOS: 10.12 Sierra
- iOS: 10
- tvOS: 10
- watchOS: 5 (Unchanged)
In addition to everything this breaks indirectly, these changes also remove the `armv7-apple-ios` target (currently tier 3) because the oldest supported iOS device now uses ARMv7s. I'm not sure what the policy around tier 3 target removal is, but shimming it is not an option due to the linker refusing.
[Per comment](https://github.com/rust-lang/compiler-team/issues/556#issuecomment-1297175073), this requires an FCP to merge. cc `@wesleywiser`.
CI: use smaller machines in PR runs
- `mingw-check`: job-linux-16c -> job-linux-4c
  - job-linux-4c: ~20 min in the auto job
  - job-linux-16c: ~13 min in the PR job
  - With this PR the PR job regresses to almost 21 min, which is OK.
- `mingw-check-tidy`: job-linux-16c -> job-linux-4c; the job is small enough, so reduce it to the minimum
  - job-linux-16c: ~3 min
  - With this PR it regresses to almost 5 min, which is OK.
- `x86_64-gnu-tools`: stays on job-linux-16c; this is the longest job in PR runs, so don't touch it
  - job-linux-8c: ~1.30 hour in the auto job
  - job-linux-16c: ~1 hour in the PR job (affected by #114613, actual time ~30 min)
- `x86_64-gnu-llvm-15`: stays on job-linux-16c; don't change this one either
  - job-linux-8c: ~1.30 hour in the auto job
  - job-linux-16c: ~30 min in the PR job
Noticed while working on https://github.com/rust-lang/rust/pull/114621, so the current times are affected by the docker images always being rebuilt (but PR images were always rebuilt before too, so never mind).
CI: include workflow name in concurrency group
Currently, this won't change anything, because we only have one relevant workflow (`CI`), but for future proofing we should probably include the workflow name in the concurrency group.
Found by `@klensy` [here](https://github.com/rust-lang/rust/pull/113059#discussion_r1247213606).
Github action to periodically `cargo update` to keep dependencies current
Opens a PR periodically with the results of `cargo update`. If an unmerged PR for the branch `cargo_update` already exists, it will edit that PR and reopen it if necessary.
~~This also uses [`cargo-upgrades`](https://gitlab.com/kornelski/cargo-upgrades) to provide a list of available major upgrades in the PR body.~~
It includes the list of changes output by `cargo update` in the commit message and PR body. Note that this output is currently sub-optimal due to https://github.com/rust-lang/cargo/issues/9408, but if updates are made more regularly that is less likely to show up.
Example PR: https://github.com/pitaj/rust/pull/2
Example action run: https://github.com/pitaj/rust/actions/runs/5035731903
Prior discussion: https://rust-lang.zulipchat.com/#narrow/stream/242791-t-infra/topic/dependabot.20updates.3F
Up for discussion:
- What period do we want? Currently weekly
- What user should it use? Currently "Github Actions"
- Do we need the extra security provided by executing `cargo update` and `cargo-upgrades` in a separate job?
If not I can simplify it to not need artifacts.
- PR message wording
- PR should probably always be `rollup=always`?
- What branch should it use?
- What should it do if no updates are available? Currently fails the job on empty commit
- Should the yml file live in `src/ci` instead of directly under workflows?
- ~~Is using the latest nightly toolchain enough to ensure compatibility with `Cargo.lock` and `Cargo.toml`s in master?~~
Now pulls the bootstrap version from stage0.json
r? infra
- Keep Cargo.lock dependencies current
- Presents output from `cargo update` in commit and PR
- Edit existing open PR, otherwise open a new one
- Skip if existing open PR is S-waiting-on-bors
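The decision in the last two bullets can be sketched as a small pure function. This is only an illustration in Python, not the workflow's actual implementation: the function name, its return values, and the idea of passing the open PR's labels as a list are all assumptions made for the example.

```python
from typing import Iterable, Optional


def choose_action(open_pr_labels: Optional[Iterable[str]]) -> str:
    """Decide what to do with the `cargo_update` branch's pull request.

    `open_pr_labels` is None when no open PR exists (hypothetical
    convention for this sketch); otherwise it holds that PR's labels.
    """
    if open_pr_labels is None:
        return "open"  # no open PR yet: create a new one
    if "S-waiting-on-bors" in set(open_pr_labels):
        return "skip"  # already approved and queued: leave it alone
    return "edit"      # refresh the existing PR with the new update


print(choose_action(None))
print(choose_action(["S-waiting-on-review"]))
print(choose_action(["S-waiting-on-bors", "rollup=always"]))
```

Keeping the decision separate from the API calls makes the skip rule easy to check without touching GitHub.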
Remove aws cli install.
All runner images have the AWS CLI 2 installed, so there isn't a really strong reason to install our own version anymore.
The version we were installing was 1.27.122. The runner images currently have 2.11.x (the exact version varies by image).
I do not have the means to really test if the new version has any issues. I looked at all the `aws` commands, and none of them seem to be doing anything unusual. The page at https://docs.aws.amazon.com/cli/latest/userguide/cliv2-migration-changes.html contains a list of all the breaking changes, and I didn't see anything that looked important.
Optimize builder sizes
The infra-team is continuously monitoring the efficiency of the CI system in an effort to improve overall build times and resource usage. Some builders have used much less than their allocated resources, so we are testing smaller builder sizes for them.
r? `@pietroalbini`
The infra-team is continuously monitoring the efficiency of the build
system in an effort to improve overall build times and resource usage.
The builders for some of the `x86_64-gnu` targets have used far fewer
resources than allocated in the past, so we are testing a smaller
builder size for them.
The infra-team is continuously monitoring the efficiency of the build
system in an effort to improve overall build times and resource usage.
The builder for the `mingw-check` target has used far fewer resources
than allocated in the past, so we are testing a smaller builder size for
it.
The infra-team is continuously monitoring the efficiency of the build
system in an effort to improve overall build times and resource usage.
The builders for the `i686-gnu` targets have used far fewer resources
than allocated in the past, so we are testing a smaller builder size for
them.
Like #107044, this will let us track compatibility with LLVM 16 going
forward, especially after we eventually upgrade our own to the next.
This also drops `tidy` here and in `x86_64-gnu-llvm-15`, syncing with
that change in #106085.
These builders aren't particularly high on overall average CPU usage and finish in typically around
30 minutes. Cutting their core counts will hopefully not significantly increase wall-time while
cutting costs, allowing us to shift some of the wins into our slower builders.
For now this keeps all the configuration identical (AFAICT) but we'll
likely want to play with the specifics to move some of the slower
builders to larger machines and the faster builders to smaller machines,
likely reducing overall usage and improving CI times.
Previously, it would only run on changes to subtrees, submodules, or select directories.
That made it so that changes to the compiler that broke tools would only be detected on a full bors merge.
This makes it so the tools builder runs by default, making it easier to catch breaking changes to clippy (which was the most affected).
Enable Cargo's sparse protocol in CI
This enables the sparse protocol in CI in order to exercise and dogfood it. This is intended to test the production server in a real-world situation.
Closes #107342
Port pgo.sh to Python
This PR ports the `pgo.sh` multi stage build file from bash to Python, to make it easier to add new functionality and gather statistics. Main changes:
1) `pgo.sh` rewritten from Bash to Python. Jump from ~200 Bash LOC to ~650 Python LOC. Bash is, unsurprisingly, more concise for running scripts and binaries.
2) Better logging. Each separate stage is now clearly separated in logs, and the logs can be quickly grepped to find out which stage has completed or failed, and how long it took.
3) Better statistics. At the end of the run, there is now a table that shows the duration of the individual stages, along with each stage's percentage of the total workflow run:
```
2023-01-15T18:13:49.9896916Z stage-build INFO: Timer results
2023-01-15T18:13:49.9902185Z ---------------------------------------------------------
2023-01-15T18:13:49.9902605Z Build rustc (LLVM PGO): 1815.67s (21.47%)
2023-01-15T18:13:49.9902949Z Gather profiles (LLVM PGO): 418.73s ( 4.95%)
2023-01-15T18:13:49.9903269Z Build rustc (rustc PGO): 584.46s ( 6.91%)
2023-01-15T18:13:49.9903835Z Gather profiles (rustc PGO): 806.32s ( 9.53%)
2023-01-15T18:13:49.9904154Z Build rustc (LLVM BOLT): 1662.92s (19.66%)
2023-01-15T18:13:49.9904464Z Gather profiles (LLVM BOLT): 715.18s ( 8.46%)
2023-01-15T18:13:49.9914463Z Final build: 2454.00s (29.02%)
2023-01-15T18:13:49.9914798Z Total duration: 8457.27s
2023-01-15T18:13:49.9915305Z ---------------------------------------------------------
```
A sample run can be seen [here](https://github.com/rust-lang/rust/actions/runs/3923980164/jobs/6707932029).
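A table like the one above can be produced with a small stage timer. The following is a minimal sketch, not the code from `stage-build.py`: the `Timer` class, its `stage` context manager, and the injectable `clock` parameter are hypothetical names chosen for this example. It sticks to the standard library and to syntax available in Python 3.6.

```python
import time
from contextlib import contextmanager
from typing import Callable, Iterator, List, Tuple


class Timer:
    """Records how long each named stage takes and renders a summary."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock  # injectable so the timer can be tested
        self.stages: List[Tuple[str, float]] = []

    @contextmanager
    def stage(self, name: str) -> Iterator[None]:
        start = self.clock()
        try:
            yield
        finally:
            # Record the duration even if the stage raised an exception.
            self.stages.append((name, self.clock() - start))

    def report(self) -> str:
        total = sum(duration for _, duration in self.stages)
        total_label = "Total duration:"
        labels = [name + ":" for name, _ in self.stages]
        width = max(len(label) for label in labels + [total_label])
        lines = []
        for label, (_, duration) in zip(labels, self.stages):
            percent = duration / total * 100 if total > 0 else 0.0
            lines.append(
                f"{label:<{width}} {duration:>9.2f}s ({percent:5.2f}%)"
            )
        lines.append(f"{total_label:<{width}} {total:>9.2f}s")
        return "\n".join(lines)


if __name__ == "__main__":
    timer = Timer()
    with timer.stage("Build rustc (LLVM PGO)"):
        time.sleep(0.02)  # stand-in for real work
    with timer.stage("Final build"):
        time.sleep(0.01)
    print(timer.report())
```

Using a context manager keeps the timing code out of the stage bodies, and recording in `finally` means a failed stage still shows up in the summary.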
I tried to keep the code compatible with Python 3.6 and to avoid external dependencies, which required me to reimplement some small pieces of functionality (like formatting bytes). I suppose that it shouldn't be so hard to upgrade to a newer Python or install dependencies in the CI container, but I'd like to avoid it if it won't be needed.
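As an example of the kind of helper meant here, a dependency-free byte formatter might look like the sketch below. The function name `format_bytes`, the binary units, and the two-decimal output are assumptions for illustration and may differ from what `stage-build.py` actually does.

```python
def format_bytes(size: int) -> str:
    """Format a byte count with binary units, e.g. 1536 -> '1.50 KiB'."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(size)
    for unit in units[:-1]:
        if abs(value) < 1024.0:
            return f"{value:.2f} {unit}"
        value /= 1024.0
    # Anything this large is reported in the biggest unit we know.
    return f"{value:.2f} {units[-1]}"


print(format_bytes(1536))
print(format_bytes(5 * 1024 ** 3))
```

This uses only f-strings and built-ins, so it runs on Python 3.6 without installing anything in the CI container.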
The code is in a single file `stage-build.py`, so it's a bit cluttered. I can also separate it into multiple files, although having it in a single file has some benefits. The code could definitely be nicer, but I'm a bit wary of introducing a lot of abstraction and similar stuff, as long as the code is stuffed into a single file.
Currently, the Python pipeline should faithfully mirror the bash pipeline one-to-one. After this PR, I'd like to try to optimize it, e.g. by caching the LLVM builds on S3.
r? `@Mark-Simulacrum`
This duplicates mingw-check into two jobs where one job
runs `tidy` only while the other job does not. The tidy
job will not cancel other jobs on failure.