Add ArrayVec and AccumulateVec to reduce heap allocations during interning of slices
Updates `mk_tup`, `mk_type_list`, and `mk_substs` to allow interning directly from iterators. The previous PR, #37220, changed some of the calls to pass a borrowed slice from `Vec` instead of directly passing the iterator, and these changes further optimize that to avoid the allocation entirely.
This change yields 50% less malloc calls in [some cases](https://pastebin.mozilla.org/8921686). It also yields decent, though not amazing, performance improvements:
```
futures-rs-test 4.091s vs 4.021s --> 1.017x faster (variance: 1.004x, 1.004x)
helloworld 0.219s vs 0.220s --> 0.993x faster (variance: 1.010x, 1.018x)
html5ever-2016- 3.805s vs 3.736s --> 1.018x faster (variance: 1.003x, 1.009x)
hyper.0.5.0 4.609s vs 4.571s --> 1.008x faster (variance: 1.015x, 1.017x)
inflate-0.1.0 3.864s vs 3.883s --> 0.995x faster (variance: 1.232x, 1.005x)
issue-32062-equ 0.309s vs 0.299s --> 1.033x faster (variance: 1.014x, 1.003x)
issue-32278-big 1.614s vs 1.594s --> 1.013x faster (variance: 1.007x, 1.004x)
jld-day15-parse 1.390s vs 1.326s --> 1.049x faster (variance: 1.006x, 1.009x)
piston-image-0. 10.930s vs 10.675s --> 1.024x faster (variance: 1.006x, 1.010x)
reddit-stress 2.302s vs 2.261s --> 1.019x faster (variance: 1.010x, 1.026x)
regex.0.1.30 2.250s vs 2.240s --> 1.005x faster (variance: 1.087x, 1.011x)
rust-encoding-0 1.895s vs 1.887s --> 1.005x faster (variance: 1.005x, 1.018x)
syntex-0.42.2 29.045s vs 28.663s --> 1.013x faster (variance: 1.004x, 1.006x)
syntex-0.42.2-i 13.925s vs 13.868s --> 1.004x faster (variance: 1.022x, 1.007x)
```
We implement a small-size optimized vector, intended to be used primarily for collection of presumed to be short iterators. This vector cannot be "upsized/reallocated" into a heap-allocated vector, since that would require (slow) branching logic, but during the initial collection from an iterator heap-allocation is possible.
We make the new `AccumulateVec` and `ArrayVec` generic over implementors of the `Array` trait, of which there is currently one, `[T; 8]`. In the future, this is likely to expand to other values of N.
Huge thanks to @nnethercote for collecting the performance and other statistics mentioned above.
Every codegen unit gets its own local counter for generating new symbol
names. This makes bitcode and object files reproducible at the binary
level even when incremental compilation is used.
The `Linkage` enum in librustc_llvm got out of sync with the version in LLVM and it caused two variants of the #[linkage=""] attribute to break.
This adds the functions `LLVMRustGetLinkage` and `LLVMRustSetLinkage` which convert between the Rust Linkage enum and the LLVM one, which should stop this from breaking every time LLVM changes it.
Fixes#33992
There were various places that we are invoking `drain_fulfillment_cx`
with a "result" of `()`. This is kind of pointless, since it amounts to
just a call to `select_all_or_error` along with some extra overhead.
rustc_trans: don't round up the DST prefix size to its alignment.
Fixes#35815 by using `ty::layout` and `min_size` to compute the size of the DST prefix.
`ty::layout::Struct::min_size` is not rounded up to alignment, which could be smaller for the DST field.
Address ICEs running w/ incremental compilation and building glium
Fixes for various ICEs I encountered trying to build glium with incremental compilation enabled. Building glium now works. Of the 4 ICEs, I have test cases for 3 of them -- I didn't isolate a test for the last commit and kind of want to go do other things -- most notably, figuring out why incremental isn't saving much *effort*.
But if it seems worthwhile and I can come back and try to narrow down the problem.
r? @michaelwoerister
Fixes#34991Fixes#32015
The way we do HIR inlining introduces reads of the "Hir" into the graph,
but this Hir in fact belongs to other crates, so when we try to load
later, we ICE because the Hir nodes in question don't belond to the
crate (and we haven't done inlining yet). This pass rewrites those HIR
nodes to the metadata from which the inlined HIR was loaded.
MSVC requires unwinding code to be split to a tree of *funclets*, where each funclet
can only branch to itself or to to its parent.
Luckily, the code we generates matches this pattern. Recover that structure in
an analyze pass and translate according to that.