Add slice::sort_by_cached_key as a memoised sort_by_key
At present, `slice::sort_by_key` calls its key function twice for each comparison that is made. When the key function is expensive (which can often be the case when `sort_by_key` is chosen over `sort_by`), this can lead to very suboptimal behaviour.
To address this, I've introduced a new slice method, `sort_by_cached_key`, which has identical semantic behaviour to `sort_by_key`, except that it guarantees the key function will only be called once per element.
Where there are `n` elements and the key function is `O(m)`:
- `slice::sort_by_cached_key` time complexity is `O(m n log m n)`, compared to `slice::sort_by_key`'s `O(m n + n log n)`.
- `slice::sort_by_cached_key` space complexity remains at `O(n + m)`. (Technically, it now reserves a slice of size `n`, whereas before it reserved a slice of size `n/2`.)
`slice::sort_unstable_by_key` has not been given an analogue, as it is important that unstable sorts are in-place, which is not a property that is guaranteed here. However, this also means that `slice::sort_unstable_by_key` is likely to be slower than `slice::sort_by_cached_key` when the key function does not have negligible complexity. We might want to explore this trade-off further in the future.
Benchmarks (for a vector of 100 `i32`s):
```
# Lexicographic: `|x| x.to_string()`
test bench_sort_by_key ... bench: 112,638 ns/iter (+/- 19,563)
test bench_sort_by_cached_key ... bench: 15,038 ns/iter (+/- 4,814)
# Identity: `|x| *x`
test bench_sort_by_key ... bench: 1,346 ns/iter (+/- 238)
test bench_sort_by_cached_key ... bench: 1,839 ns/iter (+/- 765)
# Power: `|x| x.pow(31)`
test bench_sort_by_key ... bench: 3,624 ns/iter (+/- 738)
test bench_sort_by_cached_key ... bench: 1,997 ns/iter (+/- 311)
# Abs: `|x| x.abs()`
test bench_sort_by_key ... bench: 1,546 ns/iter (+/- 174)
test bench_sort_by_cached_key ... bench: 1,668 ns/iter (+/- 790)
```
(So it seems functions that are single operations do perform slightly worse with this method, but for pretty much any more complex key, you're better off with this optimisation.)
I've definitely found myself using expensive keys in the past and wishing this optimisation was made (e.g. for https://github.com/rust-lang/rust/pull/47415). This feels like both desirable and expected behaviour, at the small cost of slightly more stack allocation and minute degradation in performance for extremely trivial keys.
Resolves#34447.
Fix implicit closure return type generation for libsyntax
The `lambda` function for constructing closures in libsyntax was explicitly setting the return type to `_`, which resulted in incorrect corresponding syntax (as `|| -> _ x` is not valid, without the enclosing brackets). This meant the generated code, when printed, was invalid.
I also took the opportunity to slightly improve the generated code for the `RustcEncodable::encode` method for unit structs.
Fixes#42213.
implement minmax intrinsics
This adds the `simd_{fmin,fmax}` intrinsics, which do a vertical (lane-wise) `min`/`max` for floating point vectors that's equivalent to Rust's `min`/`max` for `f32`/`f64`.
It might make sense to make `{f32,f64}::{min,max}` use the `minnum` and `minmax` intrinsics as well.
---
~~HELP: I need some help with these. Either I should go to sleep or there must be something that I must be missing. AFAICT I am calling the `maxnum` builder correctly, yet rustc/LLVM seem to insert a call to `llvm.minnum` there instead...~~ EDIT: Rust's LLVM version is too old :/
Instead, expose apparently-fallible conversions in cases where
the implementation happens to be infallible for a given target.
Having an associated type / return type in a public API change
based on the target is a portability hazard.
rustbuild: Fail the build if we build Cargo twice
This commit updates the `ToolBuild` step to stream Cargo's JSON messages, parse
them, and record all libraries built. If we build anything twice (aka Cargo)
it'll most likely happen due to dependencies being recompiled which is caught by
this check.
This commit updates the `ToolBuild` step to stream Cargo's JSON messages, parse
them, and record all libraries built. If we build anything twice (aka Cargo)
it'll most likely happen due to dependencies being recompiled which is caught by
this check.
- Change nested_visit_map so it will recusively check functions
- Add visit_stmt and visit_expr for impl Visitor for CheckAttrVisitor and check for incorrect
inline and repr attributes on staements and expressions
- Add regression test for isssue #43988
Fix confusing doc for `scan`
The comment "the value passed on to the next iteration" confused me since it sounded more like what Haskell's [scanl](http://hackage.haskell.org/package/base-4.11.0.0/docs/Prelude.html#v:scanl) does where the closure's return value serves as both the "yielded value" *and* the new value of the "state".
I tried changing the example to make it clear that the closure's return value is decoupled from the state argument.
rustbuild: Disable docs on cross-compiled builds
This commit disables building documentation on cross-compiled compilers, for
example ARM/MIPS/PowerPC/etc. Currently I believe we're not getting much use out
of these documentation artifacts and they often take 10-15 minutes total to
build as it requires building rustdoc/rustbook and then also generating all the
documentation, especially for the reference and the book itself.
In an effort to cut down on the amount of work that we're doing on dist CI
builders in light of recent timeouts this was some relatively low hanging fruit
to cut which in theory won't have much impact on the ecosystem in the hopes that
the documentation isn't used too heavily anyway.
While initial analysis in #48827 showed only shaving 5 minutes off local builds
the same 5 minute conclusion was drawn from #48826 which ended up having nearly
a half-hour impact on the bots. In that sense I'm hoping that we can land this
and test out what happens on CI to see how it affects timing.
Note that all tier 1 platforms, Windows, Mac, and Linux, will continue to
generate documentation.
Use an uninitialized buffer in GenericRadix::fmt_int, like in Display::fmt for numeric types
The code using a slice of that buffer is only ever going to use
bytes that are subsequently initialized.