Pretty-print attributes on tuple structs and add tests
This adds support to the pretty printer to print attributes added to tuple struct elements. Furthermore, it adds a test that makes sure we will print attributes on all variant data types.
Fix ICE in memory categorization of tuple patterns
Fixes https://github.com/rust-lang/rust/issues/34334
It seems to be ok for `pat_ty` to return `Err` even if type checking is done, because it uses `infcx.node_ty` which is supposed to return `Err` for all kinds of erroneous types so its callers could quickly bail out with `?`.
r? @arielb1
Fixed the `TAGS.rustc.emacs` and `TAGS.rustc.vi` make targets.
(They were added to `ctags.mk` in PR #33256, but I guess I must have
only tested running `make TAGS.emacs TAGS.rustc.emacs` and not `make
TAGS.rustc.emacs` on its own.)
Specialize .zip() for efficient slice and slice iteration
The idea is to introduce a private trait TrustedRandomAccess and specialize .zip() for random access iterators into a counted loop.
The implementation in the PR is internal and has no visible effect in the API
Why a counted loop? To have each slice iterator compile to just a pointer, and both pointers are indexed with the same loop counter value in the generated code. When this succeeds, copying loops are readily recognized and replaced with memcpy and addition loops autovectorize well.
The TrustedRandomAccess approach works very well on the surface. Microbenchmarks optimize well, following the ideas above, and that is a dramatic improvement of .zip()'s codegen.
```rust
// old zip before this PR: bad, byte-for-byte loop
// with specialized zip: memcpy
pub fn copy_zip(xs: &[u8], ys: &mut [u8]) {
for (a, b) in ys.iter_mut().zip(xs) {
*a = *b;
}
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized
pub fn add_zip(xs: &[f32], ys: &mut [f32]) {
for (a, b) in ys.iter_mut().zip(xs) { *a += *b; }
}
// old zip before this PR: single addition per iteration
// with specialized zip: vectorized (!!)
pub fn add_zip3(xs: &[f32], ys: &[f32], zs: &mut [f32]) {
for ((a, b), c) in zs.iter_mut().zip(xs).zip(ys) { *a += *b * *c; }
}
```
Yet in more complex situations, the .zip() loop can still fall back to its old behavior where phantom null checks throw in fake premature end of the loop conditionals. Remember that a NULL inside
Option<(&T, &T)> makes it a `None` value and a premature (in this case)
end of the loop.
So even if we have 1) an explicit `Some` in the code and 2) the types of the pointers are `&T` or `&mut T` which are nonnull, we can still get a phantom null check at that point.
One example that illustrates the difference is `copy_zip` with slice versus Vec arguments. The involved iterator types are exactly the same, but the Vec version doesn't compile down to memcpy. Investigating into this, the function argument metadata emitted to llvm plays the biggest role. As eddyb summarized, we need nonnull for the loop to autovectorize and noalias for it to replace with memcpy.
There was an experiment to use `assume` to add a non-null assumption on each of the two elements in the specialized zip iterator, but this only helped in some of the test cases and regressed others. Instead I think the nonnull/noalias metadata issue is something we need to solve separately anyway.
These have conditionally implemented TrustedRandomAccess
- Enumerate
- Zip
These have not implemented it
- Map is sideeffectful. The forward case would be workable, but the double ended case is complicated.
- Chain, exact length semantics unclear
- Filter, FilterMap, FlatMap and many others don't offer random access and/or exact length
(They were added to `ctags.mk` in PR #33256, but I guess I must have
only tested running `make TAGS.emacs TAGS.rustc.emacs` and not `make
TAGS.rustc.emacs` on its own.)
This adds support to the pretty printer to print attributes
added to tuple struct elements. Furthermore, it adds a test
that makes sure we will print attributes on all variant data
types.
Revert using ? for try! in the libsyntax pretty printer
The use of ...?instead of try!(...) in libsyntax makes extracting libsyntax into syntex quite painful since it's not stable yet. This makes backports take a much longer time and causes a lot of problems for the syntex dependencies. Even if it was, it'd take a few release cycles until syntex would be able to use it. Since it's not stable and that this feature is just syntax sugar, it would be most helpful if we could remove it.
cc #34311
[MIR] Cache drops for early scope exits
Previously we would rebuild all drops on every early exit from a scope, which for code like:
```rust
match x {
A => return 1,
B => return 2,
...
C => return 27
}
```
would produce 27 exactly same chains of drops for each return, basically a `O(n*m)` explosion. [This](https://cloud.githubusercontent.com/assets/679122/16125192/3355e32c-33fb-11e6-8564-c37cab2477a0.png) is such a case for a match on 80-variant enum with 3 droppable variables in scope.
For [`::core::iter::Iterator::partial_cmp`](6edea2cfda/src/libcore/iter/iterator.rs (L1909)) the CFG looked like [this](https://cloud.githubusercontent.com/assets/679122/16122708/ce0024d8-33f0-11e6-93c2-e1c44b910db2.png) (after initial SimplifyCfg). With this patch the CFG looks like [this](https://cloud.githubusercontent.com/assets/679122/16122806/294fb16e-33f1-11e6-95f6-16c5438231af.png) instead.
Some numbers (overall very small wins, however neither of the crates have many cases which abuse this corner case):
| | old time | old rss | new time | new rss |
|-------------------------|----------|---------|----------|----------|
| core dump | 0.879 | 224MB | 0.871 | 223MB |
| core MIR passes | 0.759 | 224MB | 0.718 | 223MB |
| core MIR codegen passes | 1.762 | 230MB | 1.442 | 228MB |
| core trans | 3.263 | 279MB | 3.116 | 278MB |
| core llvm passes | 5.611 | 263MB | 5.565 | 263MB |
| std dump | 0.487 | 190MB | 0.475 | 192MB |
| std MIR passes | 0.311 | 190MB | 0.288 | 192MB |
| std MIR codegen passes | 0.753 | 195MB | 0.720 | 197MB |
| std trans | 2.589 | 287MB | 2.523 | 287MB |
| std llvm passes | 7.268 | 245MB | 7.447 | 246MB |
Previously we would rebuild all drops on every early exit from a scope, which for code like:
```rust
match x {
a => return 1,
b => return 2,
...
z => return 27
}
```
would produce 27 exactly same chains of drops for each return, a O(n*m) explosion in drops.