trans: generalize immediate temporaries to all MIR locals.
Added `Mir::local_index` which gives you a unified index for `Arg`, `Var`, `Temp` and `ReturnPointer`.
Also available is `Mir::count_locals` which returns the total number of the above locals.
This simplifies a lot of the code which can treat all of the local lvalues in the same manner.
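As a rough illustration of what a unified index buys, here is a minimal standalone sketch. The `Local` enum and `LocalCounts` struct are hypothetical stand-ins rather than the real MIR types, and the ordering (arguments, then variables, then temporaries, then the return pointer) is an assumption made for the example.

```rust
/// Hypothetical stand-in for the four kinds of local lvalues named above.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Local {
    Arg(usize),
    Var(usize),
    Temp(usize),
    ReturnPointer,
}

/// Hypothetical summary of how many locals of each kind a function body has.
pub struct LocalCounts {
    pub args: usize,
    pub vars: usize,
    pub temps: usize,
}

impl LocalCounts {
    /// Total number of locals, counting the single return pointer.
    pub fn count_locals(&self) -> usize {
        self.args + self.vars + self.temps + 1
    }

    /// Map a local to one dense index in `0..count_locals()`.
    /// Assumed order: args, vars, temps, return pointer.
    pub fn local_index(&self, local: Local) -> usize {
        match local {
            Local::Arg(i) => i,
            Local::Var(i) => self.args + i,
            Local::Temp(i) => self.args + self.vars + i,
            Local::ReturnPointer => self.args + self.vars + self.temps,
        }
    }
}
```

With a dense index like this, per-local data can live in a single `Vec` of length `count_locals()` instead of one table per kind of local.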
If we had `-> impl Iterator`, I could have added a bunch of useful `Ty` or `Lvalue` iterators for all locals.
We could of course manually write such iterators as they are needed.
The only place which currently takes advantage of unified locals is trans' alloca elision.
Currently it's not as good as it could be, due to our usage of `llvm.dbg.declare` in debug mode.
But passing some arguments and variables as immediates has some effect on release-mode `libsyntax`:
Old trans:
```
time: 11.500; rss: 710MB translation
time: 0.002; rss: 710MB assert dep graph
time: 0.000; rss: 710MB serialize dep graph
time: 4.410; rss: 628MB llvm function passes [0]
time: 84.485; rss: 633MB llvm module passes [0]
time: 23.898; rss: 634MB codegen passes [0]
time: 0.002; rss: 634MB codegen passes [0]
time: 113.408; rss: 634MB LLVM passes
```
`-Z orbit`, previously:
```
time: 12.588; rss: 723MB translation
time: 0.002; rss: 723MB assert dep graph
time: 0.000; rss: 723MB serialize dep graph
time: 4.597; rss: 642MB llvm function passes [0]
time: 77.347; rss: 646MB llvm module passes [0]
time: 24.703; rss: 648MB codegen passes [0]
time: 0.002; rss: 615MB codegen passes [0]
time: 107.233; rss: 615MB LLVM passes
```
`-Z orbit`, after this PR:
```
time: 13.820; rss: 672MB translation
time: 0.002; rss: 672MB assert dep graph
time: 0.000; rss: 672MB serialize dep graph
time: 3.969; rss: 591MB llvm function passes [0]
time: 72.294; rss: 595MB llvm module passes [0]
time: 24.610; rss: 597MB codegen passes [0]
time: 0.002; rss: 597MB codegen passes [0]
time: 101.439; rss: 597MB LLVM passes
```
Implementation of #34168
r? @brson
cc @alexcrichton
cc @steveklabnik
cc @jonathandturner
I only updated `librustc_privacy/diagnostics.rs`, and I already found a case where the code doesn't throw the expected error code (E0448).
Fixes #34168.
- src links/redirects to extern fn from another crate had an extra '/'.
- src links to `pub use` of a crate module had an extra '/'.
- src links to renamed reexports from another crate used the new name
for the link but should use the original name.
We have no C++ and an incredibly small amount of C code as part of the build, so
there's not really much need for us to strictly check the version of compilers
as we're not really stressing anything. LLVM is a pretty huge chunk of C++ but
it should be the responsibility of LLVM to ensure that it can build with a
particular clang/gcc version, not ours (as this logic changes over time).
These version checks seem to basically just buy us a regular stream of PRs every
six weeks or so when a new version is released, so they're not really buying us
much. As a result, remove them and we can add them back piecemeal perhaps as a
blacklist if we really need to.
Pretty-print attributes on tuple structs and add tests
This adds support to the pretty printer to print attributes added to tuple struct elements. Furthermore, it adds a test that makes sure we will print attributes on all variant data types.
Fix ICE in memory categorization of tuple patterns
Fixes https://github.com/rust-lang/rust/issues/34334
It seems to be ok for `pat_ty` to return `Err` even if type checking is done, because it uses `infcx.node_ty` which is supposed to return `Err` for all kinds of erroneous types so its callers could quickly bail out with `?`.
r? @arielb1
Fixed the `TAGS.rustc.emacs` and `TAGS.rustc.vi` make targets.
(They were added to `ctags.mk` in PR #33256, but I guess I must have
only tested running `make TAGS.emacs TAGS.rustc.emacs` and not `make
TAGS.rustc.emacs` on its own.)
Specialize .zip() for efficient slice and slice iteration
The idea is to introduce a private trait TrustedRandomAccess and specialize .zip() for random access iterators into a counted loop.
The implementation in the PR is internal and has no visible effect on the API.
Why a counted loop? So that each slice iterator compiles to just a pointer, and both pointers are indexed with the same loop counter value in the generated code. When this succeeds, copying loops are readily recognized and replaced with memcpy, and addition loops autovectorize well.
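The counted-loop shape can be sketched in safe code for the special case of two shared slices. This is not the PR's implementation, which goes through the private TrustedRandomAccess trait; `CountedZip` and `dot` are hypothetical names used only for illustration.

```rust
/// Hypothetical illustration type: zips two slices with one shared counter.
pub struct CountedZip<'a, A, B> {
    a: &'a [A],
    b: &'a [B],
    index: usize,
    len: usize,
}

impl<'a, A, B> CountedZip<'a, A, B> {
    pub fn new(a: &'a [A], b: &'a [B]) -> Self {
        // Truncate both slices to the common length up front, so the loop
        // only needs one counter and one end condition.
        let len = a.len().min(b.len());
        CountedZip { a: &a[..len], b: &b[..len], index: 0, len }
    }
}

impl<'a, A, B> Iterator for CountedZip<'a, A, B> {
    type Item = (&'a A, &'a B);

    fn next(&mut self) -> Option<Self::Item> {
        if self.index < self.len {
            let i = self.index;
            self.index += 1;
            // Copy the slice references out so the items borrow for 'a.
            let (a, b): (&'a [A], &'a [B]) = (self.a, self.b);
            Some((&a[i], &b[i]))
        } else {
            None
        }
    }

    fn size_hint(&self) -> (usize, Option<usize>) {
        let remaining = self.len - self.index;
        (remaining, Some(remaining))
    }
}

/// Example use: a dot product driven by the counted zip.
pub fn dot(xs: &[f32], ys: &[f32]) -> f32 {
    CountedZip::new(xs, ys).map(|(x, y)| x * y).sum()
}
```

The point of this shape is that the only loop-exit condition is `index < len`, rather than two independent end-of-iterator checks. Whether a given loop then becomes a memcpy or gets vectorized still depends on the optimizer.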
The TrustedRandomAccess approach works very well on the surface. Microbenchmarks optimize well, following the ideas above, and that is a dramatic improvement of .zip()'s codegen.
```rust
// old zip before this PR: bad, byte-for-byte loop
// with specialized zip: memcpy
pub fn copy_zip(xs: &[u8], ys: &mut [u8]) {
    for (a, b) in ys.iter_mut().zip(xs) {
        *a = *b;
    }
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized
pub fn add_zip(xs: &[f32], ys: &mut [f32]) {
for (a, b) in ys.iter_mut().zip(xs) { *a += *b; }
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized (!!)
pub fn add_zip3(xs: &[f32], ys: &[f32], zs: &mut [f32]) {
for ((a, b), c) in zs.iter_mut().zip(xs).zip(ys) { *a += *b * *c; }
}
```
Yet in more complex situations, the .zip() loop can still fall back to its old behavior, where phantom null checks introduce spurious early-exit conditionals into the loop. Remember that a NULL inside `Option<(&T, &T)>` makes it a `None` value and a premature (in this case) end of the loop.
So even if we have 1) an explicit `Some` in the code and 2) the types of the pointers are `&T` or `&mut T` which are nonnull, we can still get a phantom null check at that point.
One example that illustrates the difference is `copy_zip` with slice versus Vec arguments. The involved iterator types are exactly the same, but the Vec version doesn't compile down to memcpy. Investigating this, the function argument metadata emitted to LLVM plays the biggest role. As eddyb summarized, we need nonnull for the loop to autovectorize and noalias for it to be replaced with memcpy.
There was an experiment to use `assume` to add a non-null assumption on each of the two elements in the specialized zip iterator, but this only helped in some of the test cases and regressed others. Instead I think the nonnull/noalias metadata issue is something we need to solve separately anyway.
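The reason a null check can appear at all is the representation of `Option` around references: `None` is encoded as the null pointer instead of a separate tag. A minimal sketch that checks this (the function names are hypothetical; only the `Option<&T>` case is a documented layout guarantee, the tuple case is reported without being asserted):

```rust
use std::mem::size_of;

/// `Option<&T>` is guaranteed to be pointer-sized: `None` is the null pointer.
pub fn option_ref_is_pointer_sized() -> bool {
    size_of::<Option<&u8>>() == size_of::<&u8>()
}

/// Sizes of `Option<(&u8, &u8)>` and `(&u8, &u8)`. In practice these tend to
/// match because one of the references serves as the niche, but that layout
/// is not a documented guarantee.
pub fn option_pair_sizes() -> (usize, usize) {
    (size_of::<Option<(&u8, &u8)>>(), size_of::<(&u8, &u8)>())
}
```

Because `Some` and `None` differ only in whether a pointer is null, a test for `None` after inlining is a null test on a pointer, which the optimizer can only remove if it knows that pointer is nonnull.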
These have conditionally implemented TrustedRandomAccess:
- Enumerate
- Zip
These have not implemented it:
- Map is side-effectful. The forward case would be workable, but the double ended case is complicated.
- Chain, exact length semantics unclear
- Filter, FilterMap, FlatMap and many others don't offer random access and/or exact length