LLVM appears to prefer (spend less time optimizing) long if chains if
it receives them in approzimately source order.
This fixes a ~10% regression for optimized builds of the encoding benchmark
on perf.rlo due to the changes to decision tree construction.
Special-case literals in `parse_bottom_expr`.
This makes parsing faster, particularly for code with large constants,
for two reasons:
- it skips all the keyword comparisons for literals;
- it skips the allocation done by the `mk_expr` call in
`parse_literal_maybe_minus`.
r? @petrochenkov
rustc: replace `TyCtxt<'a, 'gcx, 'tcx>` with `TyCtxt<'gcx, 'tcx>`.
This first lifetime parameter of `TyCtxt` has been phantom for a while, thanks to @Zoxc, but was never removed, and I'm doing this now in preparation for removing the `'gcx`/`'tcx` split.
I wasn't going to do this as a separate step, and instead start converting uses of `TyCtxt` to a single-lifetime alias of it (e.g. `type TyCx<'tcx> = TyCtxt<'tcx, 'tcx, 'tcx>;`) but it turns out (as @Zoxc rightly predicted) that there is far more fallout from not needing a lifetime for the first parameter of `TyCtxt`.
That is, going from `TyCtxt<'a, 'gcx, 'tcx>` to `TyCtxt<'tcx, 'gcx, 'tcx>` (the first commit in this PR) has the largest amount of fallout out of all the changes we might make (because it can require removing the `'a` parameter of `struct`s containing `tcx: TyCtxt<'a, ...>`), and is the hardest to automate (because `'a` is used everywhere, not just with `TyCtxt`, unlike, say `'gcx, 'tcx` -> `'tcx`).
So I'm submitting this now to get it out of the way and reduce further friction in the future.
**EDIT**: for the `rustfmt` commit, I used https://github.com/rust-lang/rustfmt/issues/1324#issuecomment-482109952, and manually filtered out some noise, like in #61735, but unlike that PR, there was also a weird bug to work around.
It should be reviewed separately, and dropped if unwanted.
cc @rust-lang/compiler r? @nikomatsakis
ci: Disable LLVM/debug assertions for distcheck
The purpose of distcheck is to test `./x.py test` from a tarball, not to
test that all assertions pass all the time. These assertions are largely
just redundant with other builders, so skip the assertions for now and
save a good chunk of time on CI.
cc #61185
Allow attributes in formal function parameters
Implements https://github.com/rust-lang/rust/issues/60406.
This is my first contribution to the compiler and since this is a large and complex project, I am not fully aware of the consequences of the changes I have made.
**TODO**
- [x] Forbid some built-in attributes.
- [x] Expand cfg/cfg_attr
Rollup of 9 pull requests
Successful merges:
- #60187 (Generator optimization: Overlap locals that never have storage live at the same time)
- #61348 (Implement Clone::clone_from for Option and Result)
- #61568 (Use Symbol, Span in libfmt_macros)
- #61632 (ci: Collect CPU usage statistics on Azure)
- #61654 (use pattern matching for slices destructuring)
- #61671 (implement nth_back for Range(Inclusive))
- #61688 (is_fp and is_floating_point do the same thing, remove the former)
- #61705 (Pass cflags rather than cxxflags to LLVM as CMAKE_C_FLAGS)
- #61734 (Migrate rust-by-example to MdBook2)
Failed merges:
r? @ghost
ci: Collect CPU usage statistics on Azure
This commit adds a script which we'll execute on Azure Pipelines which
is intended to run in the background and passively collect CPU usage
statistics for our builders. The intention here is that we can use this
information over time to diagnose issues with builders, see where we can
optimize our build, fix parallelism issues, etc. This might not end up
being too useful in the long run but it's data we've wanted to collect
for quite some time now, so here's a stab at it!
Comments about how this is intended to work can be found in the python
script used here to collect CPU usage statistics.
Closes#48828
Use Symbol, Span in libfmt_macros
I'm not super happy with this, personally, but I think it might be a decent start -- happy to take suggestions as to how to expand this or change things further.
r? @estebank
Fixes#60795
Generator optimization: Overlap locals that never have storage live at the same time
The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](08bfe16129/src/libstd/macros.rs (L365-L381))), which we rely on to implement the optimization.
More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**
To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.
Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout.
For the layout computation, we use a simplified approach.
1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.
Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".
So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:
You can think of a generator as an enum, where some fields are shared between variants. e.g.
```rust
enum Generator {
Unresumed,
Poisoned,
Returned,
Suspend0(A, B, X),
Suspend1(B),
Suspend2(A, Y, Z),
}
```
where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.
Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.
---
Depends on:
- [x] #59897 Multi-variant layouts for generators
- [x] #60840 Preserve local scopes in generator MIR
- [x] #61373 Emit StorageDead along unwind paths for generators
Before merging:
- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at https://github.com/rust-lang/rust/pull/59897#issuecomment-489468627)
Generator optimization: Overlap locals that never have storage live at the same time
The specific goal of this optimization is to optimize async fns which use `await!`. Notably, `await!` has an enclosing scope around the futures it awaits ([definition](08bfe16129/src/libstd/macros.rs (L365-L381))), which we rely on to implement the optimization.
More generally, the optimization allows overlapping the storage of some locals which are never storage-live at the same time. **We care about storage-liveness when computing the layout, because knowing a field is `StorageDead` is the only way to prove it will not be accessed, either directly or through a reference.**
To determine whether we can overlap two locals in the generator layout, we look at whether they might *both* be `StorageLive` at any point in the MIR. We use the `MaybeStorageLive` dataflow analysis for this. We iterate over every location in the MIR, and build a bitset for each local of the locals it might potentially conflict with.
Next, we assign every saved local to one or more variants. The variants correspond to suspension points, and we include the set of locals live across a given suspension point in the variant. (Note that we use liveness instead of storage-liveness here; this ensures that the local has actually been initialized in each variant it has been included in. If the local is not live across a suspension point, then it doesn't need to be included in that variant.). It's important to note that the variants are a "view" into our layout.
For the layout computation, we use a simplified approach.
1. Start with the set of locals assigned to only one variant. The rest are disqualified.
2. For each pair of locals which may conflict *and are not assigned to the same variant*, we pick one local to disqualify from overlapping.
Disqualified locals go into a non-overlapping "prefix" at the beginning of our layout. This means they always have space reserved for them. All the locals that are allowed to overlap in each variant are then laid out after this prefix, in the "overlap zone".
So, if A and B were disqualified, and X, Y, and Z were all eligible for overlap, our generator might look something like this:
You can think of a generator as an enum, where some fields are shared between variants. e.g.
```rust
enum Generator {
Unresumed,
Poisoned,
Returned,
Suspend0(A, B, X),
Suspend1(B),
Suspend2(A, Y, Z),
}
```
where every mention of `A` and `B` refer to the same field, which does not move when changing variants. Note that `A` and `B` would automatically be sent to the prefix in this example. Assuming that `X` is never `StorageLive` at the same time as either `Y` or `Z`, it would be allowed to overlap with them.
Note that if two locals (`Y` and `Z` in this case) are assigned to the same variant in our generator, their memory would never overlap in the layout. Thus they can both be eligible for the overlapping section, even if they are storage-live at the same time.
---
Depends on:
- [x] #59897 Multi-variant layouts for generators
- [x] #60840 Preserve local scopes in generator MIR
- [x] #61373 Emit StorageDead along unwind paths for generators
Before merging:
- [x] ~Wrap the types of all generator fields in `MaybeUninitialized` in layout::ty::field~ (opened #60889)
- [x] Make PR description more complete (e.g. explain why storage liveness is important and why we have to check every location)
- [x] Clean up TODO
- [x] Fix the layout code to enforce that the same field never moves around in the generator
- [x] Add tests for async/await
- [x] ~Reduce # bits we store by half, since the conflict relation is symmetric~ (note: decided not to do this, for simplicity)
- [x] Store liveness information for each yield point in our `GeneratorLayout`, that way we can emit more useful debuginfo AND tell miri which fields are definitely initialized for a given variant (see discussion at https://github.com/rust-lang/rust/pull/59897#issuecomment-489468627)
Add deny(unused_lifetimes) to all the crates that have deny(internal).
@Zoxc brought up, regarding #61722, that we don't force the removal of unused lifetimes.
Turns out that it's not that bad to enable for compiler crates (I wonder why it's not `warn` by default?).
I would've liked to enable `single_use_lifetimes` as well, but https://github.com/rust-lang/rust/issues/53738 makes it unusable for now.
For the `rustfmt` commit, I used https://github.com/rust-lang/rustfmt/issues/1324#issuecomment-482109952, and manually filtered out some noise.
r? @oli-obk cc @rust-lang/compiler
Remove some legacy proc macro flavors
Namely
- `IdentTT` (`foo! ident { ... }`). Can be replaced with `foo! { ident ... }` or something similar.
- `MultiDecorator`. Can be replaced by `MultiModifier` (aka `LegacyAttr` after renaming).
- `DeclMacro`. It was a less powerful duplicate of `NormalTT` (aka `LegacyBang` after renaming) and can be replaced by it.
Stuff like this slows down any attempts to refactor the expansion infra, so it's desirable to retire it already.
I'm not sure whether a lang team decision is necessary, but would be nice to land this sooner because I have some further work in this area scheduled.
The documentation commit (a9397fd0d5) describes how the remaining variants are different from each other and shows that there's actually some system behind them.
The last commit renames variants of `SyntaxExtension` in more systematic way.
- `ProcMacro` -> `Bang`
- `NormalTT` -> `LegacyBang`
- `AttrProcMacro` -> `Attr`
- `MultiModifier` -> `LegacyAttr`
- `ProcMacroDerive` -> `Derive`
- `BuiltinDerive` -> `LegacyDerive`
All the `Legacy*` variants are AST-based, as opposed to "modern" token-based variants.
move some tests into subfolders
This reduces the size of the test folders without making the moved tests harder to find.
Is this kind of change desired/worth the effort?
This commit adds a script which we'll execute on Azure Pipelines which
is intended to run in the background and passively collect CPU usage
statistics for our builders. The intention here is that we can use this
information over time to diagnose issues with builders, see where we can
optimize our build, fix parallelism issues, etc. This might not end up
being too useful in the long run but it's data we've wanted to collect
for quite some time now, so here's a stab at it!
Comments about how this is intended to work can be found in the python
script used here to collect CPU usage statistics.
Closes#48828