book: Fixed links in book/compiler-plugins.md
Updated the links referring to roman_numerals.rs and lint_plugin_test.rs. Went from src/test/auxiliary/ to src/test/run-pass-fulldeps/auxiliary/.
Remove unzip() SizeHint hack
This was using an invalid iterator, so it is likely to end in buggy
behaviour.
It also doesn't even benefit many types in std, including Vec, so removing it
shouldn't cause any problems.
Fixes: #33468
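For illustration, an unzip that forwards no size hint at all can be sketched as below. This is a minimal sketch, not the actual std code: `unzip_simple` is a hypothetical free function, and the `Default + Extend` bounds are assumptions chosen to keep it self-contained.

```rust
// Hypothetical stand-in for `Iterator::unzip`, written as a free function.
fn unzip_simple<A, B, I, FromA, FromB>(iter: I) -> (FromA, FromB)
where
    I: Iterator<Item = (A, B)>,
    FromA: Default + Extend<A>,
    FromB: Default + Extend<B>,
{
    let mut left = FromA::default();
    let mut right = FromB::default();
    for (a, b) in iter {
        // Push one element at a time; no size hint is forwarded,
        // so no fake iterator is needed.
        left.extend(Some(a));
        right.extend(Some(b));
    }
    (left, right)
}

fn main() {
    let pairs = vec![(1, 'a'), (2, 'b'), (3, 'c')];
    let (nums, chars): (Vec<i32>, Vec<char>) = unzip_simple(pairs.into_iter());
    assert_eq!(nums, vec![1, 2, 3]);
    assert_eq!(chars, vec!['a', 'b', 'c']);
}
```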
trans: generalize immediate temporaries to all MIR locals.
Added `Mir::local_index` which gives you a unified index for `Arg`, `Var`, `Temp` and `ReturnPointer`.
Also available is `Mir::count_locals` which returns the total number of the above locals.
This simplifies a lot of the code which can treat all of the local lvalues in the same manner.
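A rough sketch of what a unified index could look like follows. The `Mir` and `Lvalue` types here are simplified stand-ins, and the ordering (arguments, then variables, then temporaries, then the return pointer) is an assumption made for the example, not something stated above.

```rust
// Simplified stand-ins for the compiler's types; all names and the
// index ordering are assumptions made for this sketch.
#[derive(Clone, Copy, Debug)]
enum Lvalue {
    Arg(usize),
    Var(usize),
    Temp(usize),
    ReturnPointer,
}

struct Mir {
    arg_count: usize,
    var_count: usize,
    temp_count: usize,
}

impl Mir {
    // Total number of locals: args + vars + temps + the return pointer.
    fn count_locals(&self) -> usize {
        self.arg_count + self.var_count + self.temp_count + 1
    }

    // Maps every kind of local into one contiguous index space.
    fn local_index(&self, lvalue: Lvalue) -> usize {
        match lvalue {
            Lvalue::Arg(i) => i,
            Lvalue::Var(i) => self.arg_count + i,
            Lvalue::Temp(i) => self.arg_count + self.var_count + i,
            Lvalue::ReturnPointer => self.arg_count + self.var_count + self.temp_count,
        }
    }
}

fn main() {
    let mir = Mir { arg_count: 2, var_count: 3, temp_count: 4 };
    // One flat table can now hold per-local data for every kind of local.
    let mut is_immediate = vec![false; mir.count_locals()];
    is_immediate[mir.local_index(Lvalue::Var(1))] = true;
    assert_eq!(mir.local_index(Lvalue::Var(1)), 3);
    assert_eq!(mir.local_index(Lvalue::ReturnPointer), 9);
    assert!(is_immediate[3]);
}
```

With a single index space, a pass such as alloca elision can keep one vector of per-local state instead of four.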
If we had `-> impl Iterator`, I could have added a bunch of useful `Ty` or `Lvalue` iterators for all locals.
We could of course manually write such iterators as they are needed.
The only place which currently takes advantage of unified locals is trans' alloca elision.
Currently it's not as good as it could be, due to our usage of `llvm.dbg.declare` in debug mode.
But passing some arguments and variables as immediates has some effect on release-mode `libsyntax`:
Old trans:
```
time: 11.500; rss: 710MB translation
time: 0.002; rss: 710MB assert dep graph
time: 0.000; rss: 710MB serialize dep graph
time: 4.410; rss: 628MB llvm function passes [0]
time: 84.485; rss: 633MB llvm module passes [0]
time: 23.898; rss: 634MB codegen passes [0]
time: 0.002; rss: 634MB codegen passes [0]
time: 113.408; rss: 634MB LLVM passes
```
`-Z orbit`, previously:
```
time: 12.588; rss: 723MB translation
time: 0.002; rss: 723MB assert dep graph
time: 0.000; rss: 723MB serialize dep graph
time: 4.597; rss: 642MB llvm function passes [0]
time: 77.347; rss: 646MB llvm module passes [0]
time: 24.703; rss: 648MB codegen passes [0]
time: 0.002; rss: 615MB codegen passes [0]
time: 107.233; rss: 615MB LLVM passes
```
`-Z orbit`, after this PR:
```
time: 13.820; rss: 672MB translation
time: 0.002; rss: 672MB assert dep graph
time: 0.000; rss: 672MB serialize dep graph
time: 3.969; rss: 591MB llvm function passes [0]
time: 72.294; rss: 595MB llvm module passes [0]
time: 24.610; rss: 597MB codegen passes [0]
time: 0.002; rss: 597MB codegen passes [0]
time: 101.439; rss: 597MB LLVM passes
```
Implementation of #34168
r? @brson
cc @alexcrichton
cc @steveklabnik
cc @jonathandturner
I only updated `librustc_privacy/diagnostics.rs`, and I already found a case where the code doesn't throw the expected error code (E0448).
Fixes #34168.
Pretty-print attributes on tuple structs and add tests
This adds support to the pretty printer to print attributes added to tuple struct elements. Furthermore, it adds a test that makes sure we will print attributes on all variant data types.
Fix ICE in memory categorization of tuple patterns
Fixes https://github.com/rust-lang/rust/issues/34334
It seems to be ok for `pat_ty` to return `Err` even if type checking is done, because it uses `infcx.node_ty` which is supposed to return `Err` for all kinds of erroneous types so its callers could quickly bail out with `?`.
r? @arielb1
Fixed the `TAGS.rustc.emacs` and `TAGS.rustc.vi` make targets.
(They were added to `ctags.mk` in PR #33256, but I guess I must have
only tested running `make TAGS.emacs TAGS.rustc.emacs` and not `make
TAGS.rustc.emacs` on its own.)
Specialize .zip() for efficient slice and slice iteration
The idea is to introduce a private trait TrustedRandomAccess and specialize .zip() for random access iterators into a counted loop.
The implementation in the PR is internal and has no visible effect on the API.
Why a counted loop? To have each slice iterator compile to just a pointer, with both pointers indexed by the same loop counter value in the generated code. When this succeeds, copying loops are readily recognized and replaced with memcpy, and addition loops autovectorize well.
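The counted-loop shape can be written by hand as a rough illustration. This is only a sketch of the loop form the specialization aims for, not the code in the PR; `add_counted` is a hypothetical name.

```rust
// Hand-written counted loop: one counter indexes both slices.
fn add_counted(xs: &[f32], ys: &mut [f32]) {
    // Like zip, stop at the shorter of the two inputs.
    let len = std::cmp::min(xs.len(), ys.len());
    // Re-slice to `len` so the indexing below stays within bounds.
    let xs = &xs[..len];
    let ys = &mut ys[..len];
    for i in 0..len {
        ys[i] += xs[i];
    }
}

fn main() {
    let xs = [10.0_f32, 20.0];
    let mut ys = [1.0_f32, 2.0, 3.0];
    add_counted(&xs, &mut ys);
    assert_eq!(ys, [11.0, 22.0, 3.0]);
}
```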
The TrustedRandomAccess approach works very well on the surface. Microbenchmarks optimize well, following the ideas above, and that is a dramatic improvement of .zip()'s codegen.
```rust
// old zip before this PR: bad, byte-for-byte loop
// with specialized zip: memcpy
pub fn copy_zip(xs: &[u8], ys: &mut [u8]) {
for (a, b) in ys.iter_mut().zip(xs) {
*a = *b;
}
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized
pub fn add_zip(xs: &[f32], ys: &mut [f32]) {
for (a, b) in ys.iter_mut().zip(xs) { *a += *b; }
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized (!!)
pub fn add_zip3(xs: &[f32], ys: &[f32], zs: &mut [f32]) {
for ((a, b), c) in zs.iter_mut().zip(xs).zip(ys) { *a += *b * *c; }
}
```
Yet in more complex situations, the .zip() loop can still fall back to its old behavior, where phantom null checks introduce spurious early-exit conditions into the loop. Remember that a NULL inside Option<(&T, &T)> makes it a `None` value, which in this case means a premature end of the loop.
So even if we have 1) an explicit `Some` in the code and 2) the types of the pointers are `&T` or `&mut T` which are nonnull, we can still get a phantom null check at that point.
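The reason a null check can appear at all is that `Option<&T>` uses the null pointer to represent `None`. A small check of that layout fact (the `first_of` helper is hypothetical, added only for the example):

```rust
use std::mem::size_of;

// Hypothetical helper: returns `None` for an empty slice. At the machine
// level that `None` is a null pointer, which is what a check tests for.
fn first_of(xs: &[u8]) -> Option<&u8> {
    xs.first()
}

fn main() {
    // `Option<&T>` has no separate discriminant: null encodes `None`.
    assert_eq!(size_of::<Option<&u8>>(), size_of::<&u8>());
    assert!(first_of(&[]).is_none());
    assert_eq!(first_of(&[7, 8]), Some(&7));
}
```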
One example that illustrates the difference is `copy_zip` with slice versus Vec arguments. The iterator types involved are exactly the same, but the Vec version doesn't compile down to memcpy. On investigation, the function argument metadata emitted to LLVM plays the biggest role. As eddyb summarized, we need nonnull for the loop to autovectorize and noalias for it to be replaced with memcpy.
There was an experiment to use `assume` to add a non-null assumption on each of the two elements in the specialized zip iterator, but this only helped in some of the test cases and regressed others. Instead I think the nonnull/noalias metadata issue is something we need to solve separately anyway.
These have conditionally implemented TrustedRandomAccess:
- Enumerate
- Zip
These have not implemented it:
- Map, because it is side-effectful. The forward case would be workable, but the double-ended case is complicated.
- Chain, because its exact length semantics are unclear
- Filter, FilterMap, FlatMap and many others, because they don't offer random access and/or exact length
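To make the shape of the idea concrete, here is a safe, simplified sketch. The trait `RandomAccess` and the adapters `Enumerated` and `CountedZip` are hypothetical names invented for this example; the real trait is private, and this sketch uses neither specialization nor unchecked access.

```rust
// Hypothetical, safe stand-in for the private trait described above.
trait RandomAccess {
    type Item;
    fn len(&self) -> usize;
    // Returns the element at `index`; callers keep `index < len()`.
    fn get(&self, index: usize) -> Self::Item;
}

impl<'a, T> RandomAccess for &'a [T] {
    type Item = &'a T;
    fn len(&self) -> usize {
        let slice: &'a [T] = *self;
        <[T]>::len(slice)
    }
    fn get(&self, index: usize) -> &'a T {
        let slice: &'a [T] = *self;
        &slice[index]
    }
}

// Enumerate-like adapter: random access survives because the index is known.
struct Enumerated<A>(A);

impl<A: RandomAccess> RandomAccess for Enumerated<A> {
    type Item = (usize, A::Item);
    fn len(&self) -> usize {
        self.0.len()
    }
    fn get(&self, index: usize) -> Self::Item {
        (index, self.0.get(index))
    }
}

// Zip driven by a single counter shared by both sides.
struct CountedZip<A, B> {
    a: A,
    b: B,
    index: usize,
    len: usize,
}

impl<A: RandomAccess, B: RandomAccess> CountedZip<A, B> {
    fn new(a: A, b: B) -> Self {
        let len = std::cmp::min(a.len(), b.len());
        CountedZip { a, b, index: 0, len }
    }
}

impl<A: RandomAccess, B: RandomAccess> Iterator for CountedZip<A, B> {
    type Item = (A::Item, B::Item);
    fn next(&mut self) -> Option<Self::Item> {
        if self.index < self.len {
            let i = self.index;
            self.index += 1;
            Some((self.a.get(i), self.b.get(i)))
        } else {
            None
        }
    }
}

fn main() {
    let xs: &[u8] = &[1, 2, 3];
    let ys: &[u8] = &[10, 20];
    let sums: Vec<(usize, u8)> = CountedZip::new(Enumerated(xs), ys)
        .map(|((i, a), b)| (i, a + b))
        .collect();
    assert_eq!(sums, vec![(0, 11), (1, 22)]);
}
```

An adapter like Filter cannot implement such a trait because it knows neither its exact length nor which underlying element sits at a given index.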
Revert using ? for try! in the libsyntax pretty printer
The use of `...?` instead of `try!(...)` in libsyntax makes extracting libsyntax into syntex quite painful since it's not stable yet. This makes backports take much longer and causes a lot of problems for the syntex dependencies. Even if it were stable, it would take a few release cycles until syntex would be able to use it. Since it's not stable, and since this feature is just syntax sugar, it would be most helpful if we could remove it.
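The two spellings are interchangeable in a printer-style function. In the sketch below, `try_like!` is a hypothetical macro written to mirror what `try!` expanded to, and the two printing functions are invented for the comparison; none of this is libsyntax code.

```rust
use std::fmt::{self, Write};

// Hypothetical macro mirroring what `try!(...)` expanded to.
macro_rules! try_like {
    ($expr:expr) => {
        match $expr {
            Ok(value) => value,
            Err(err) => return Err(From::from(err)),
        }
    };
}

// Macro form: every fallible call is wrapped explicitly.
fn print_with_macro(out: &mut String, name: &str) -> fmt::Result {
    try_like!(out.write_str("fn "));
    try_like!(out.write_str(name));
    out.write_str("()")
}

// Operator form: same control flow, shorter spelling.
fn print_with_question_mark(out: &mut String, name: &str) -> fmt::Result {
    out.write_str("fn ")?;
    out.write_str(name)?;
    out.write_str("()")
}

fn main() {
    let mut a = String::new();
    let mut b = String::new();
    print_with_macro(&mut a, "main").unwrap();
    print_with_question_mark(&mut b, "main").unwrap();
    assert_eq!(a, b);
}
```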
cc #34311
[MIR] Cache drops for early scope exits
Previously we would rebuild all drops on every early exit from a scope, which for code like:
```rust
match x {
A => return 1,
B => return 2,
...
C => return 27
}
```
would produce 27 identical chains of drops, one for each return, basically an `O(n*m)` explosion. [This](https://cloud.githubusercontent.com/assets/679122/16125192/3355e32c-33fb-11e6-8564-c37cab2477a0.png) is such a case for a match on an 80-variant enum with 3 droppable variables in scope.
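The effect of caching can be shown with a toy model. This is a deliberately simplified sketch, assuming a single scope whose drop list never changes; `ScopeBuilder` and its methods are hypothetical and do not reflect the actual MIR builder.

```rust
// Toy model of one scope: a list of values to drop and the blocks
// emitted so far. All names here are hypothetical.
struct ScopeBuilder {
    drops: Vec<&'static str>,
    blocks: Vec<String>,
    cached_exit: Option<usize>,
}

impl ScopeBuilder {
    fn new(drops: Vec<&'static str>) -> Self {
        ScopeBuilder { drops, blocks: Vec::new(), cached_exit: None }
    }

    // Emits a fresh chain of drop blocks and returns its entry block index.
    fn exit_uncached(&mut self) -> usize {
        let entry = self.blocks.len();
        // Values are dropped in reverse order of declaration.
        for name in self.drops.iter().rev() {
            self.blocks.push(format!("drop({})", name));
        }
        entry
    }

    // Emits the chain once and reuses its entry block afterwards.
    fn exit_cached(&mut self) -> usize {
        if let Some(entry) = self.cached_exit {
            return entry;
        }
        let entry = self.exit_uncached();
        self.cached_exit = Some(entry);
        entry
    }
}

fn main() {
    let mut plain = ScopeBuilder::new(vec!["a", "b", "c"]);
    let mut cached = ScopeBuilder::new(vec!["a", "b", "c"]);
    for _ in 0..27 {
        plain.exit_uncached();
        cached.exit_cached();
    }
    // 27 exits * 3 drops without the cache, 3 blocks with it.
    assert_eq!(plain.blocks.len(), 81);
    assert_eq!(cached.blocks.len(), 3);
}
```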
For [`::core::iter::Iterator::partial_cmp`](6edea2cfda/src/libcore/iter/iterator.rs (L1909)) the CFG looked like [this](https://cloud.githubusercontent.com/assets/679122/16122708/ce0024d8-33f0-11e6-93c2-e1c44b910db2.png) (after initial SimplifyCfg). With this patch the CFG looks like [this](https://cloud.githubusercontent.com/assets/679122/16122806/294fb16e-33f1-11e6-95f6-16c5438231af.png) instead.
Some numbers (overall very small wins, since neither crate has many places that hit this corner case):
| | old time | old rss | new time | new rss |
|-------------------------|----------|---------|----------|----------|
| core dump | 0.879 | 224MB | 0.871 | 223MB |
| core MIR passes | 0.759 | 224MB | 0.718 | 223MB |
| core MIR codegen passes | 1.762 | 230MB | 1.442 | 228MB |
| core trans | 3.263 | 279MB | 3.116 | 278MB |
| core llvm passes | 5.611 | 263MB | 5.565 | 263MB |
| std dump | 0.487 | 190MB | 0.475 | 192MB |
| std MIR passes | 0.311 | 190MB | 0.288 | 192MB |
| std MIR codegen passes | 0.753 | 195MB | 0.720 | 197MB |
| std trans | 2.589 | 287MB | 2.523 | 287MB |
| std llvm passes | 7.268 | 245MB | 7.447 | 246MB |