Add `Ord::cmp` for primitives as a `BinOp` in MIR
Update: most of this OP was written months ago. See https://github.com/rust-lang/rust/pull/118310#issuecomment-2016940014 below for where we got to recently that made it ready for review.
---
There are dozens of reasonable ways to implement `Ord::cmp` for integers using comparison, bit-ops, and branches. Those differences are irrelevant at the rust level, however, so we can make things better by adding `BinOp::Cmp` at the MIR level:
1. Exactly how to implement it is left up to the backends, so LLVM can use whatever pattern its optimizer best recognizes and cranelift can use whichever pattern codegens the fastest.
2. By not inlining those details for every use of `cmp`, we drastically reduce the amount of MIR generated for `derive`d `PartialOrd`, while also making it more amenable to MIR-level optimizations.
Having extremely careful `if` ordering to μoptimize resource usage on broadwell (#63767) is great, but it really feels to me like libcore is the wrong place to put that logic. Similarly, using subtraction [tricks](https://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign) (#105840) is arguably even nicer, but depends on the optimizer understanding it (https://github.com/llvm/llvm-project/issues/73417) to be practical. Or maybe [bitor is better than add](https://discourse.llvm.org/t/representing-in-ir/67369/2?u=scottmcm)? But maybe only on a future version that [has `or disjoint` support](https://discourse.llvm.org/t/rfc-add-or-disjoint-flag/75036?u=scottmcm)? And just because one of those forms happens to be good for LLVM, there's no guarantee that it'd be the same form that GCC or Cranelift would rather see -- especially given their very different optimizers. Not to mention that if LLVM gets a spaceship intrinsic -- [which it should](https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Suboptimal.20inlining.20in.20std.20function.20.60binary_search.60/near/404250586) -- we'll need at least a rustc intrinsic to be able to call it.
As for simplifying it in Rust, we now regularly inline `{integer}::partial_cmp`, but it's quite a large amount of IR. The best way to see that is with 8811efa88b (diff-d134c32d028fbe2bf835fef2df9aca9d13332dd82284ff21ee7ebf717bfa4765R113) -- I added a new pre-codegen MIR test for a simple 3-tuple struct, and this PR change it from 36 locals and 26 basic blocks down to 24 locals and 8 basic blocks. Even better, as soon as the construct-`Some`-then-match-it-in-same-BB noise is cleaned up, this'll expose the `Cmp == 0` branches clearly in MIR, so that an InstCombine (#105808) can simplify that to just a `BinOp::Eq` and thus fix some of our generated code perf issues. (Tracking that through today's `if a < b { Less } else if a == b { Equal } else { Greater }` would be *much* harder.)
---
r? `@ghost`
But first I should check that perf is ok with this
~~...and my true nemesis, tidy.~~
Codegen const panic messages as function calls
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting shared library (see [perf]).
A sample improvement from nightly:
```
leaq str.0(%rip), %rdi
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdx
movl $25, %esi
callq *_ZN4core9panicking5panic17h17cabb89c5bcc999E@GOTPCREL(%rip)
```
to this PR:
```
leaq .Lalloc_d6aeb8e2aa19de39a7f0e861c998af13(%rip), %rdi
callq *_RNvNtNtCsduqIKoij8JB_4core9panicking11panic_const23panic_const_div_by_zero@GOTPCREL(%rip)
```
[perf]: https://perf.rust-lang.org/compare.html?start=a7e4de13c1785819f4d61da41f6704ed69d5f203&end=64fbb4f0b2d621ff46d559d1e9f5ad89a8d7789b&stat=instructions:u
Unbox and unwrap the contents of `StatementKind::Coverage`
The payload of coverage statements was historically a structure with several fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace `Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid accidentally bloating it in the future.
``@rustbot`` label +A-code-coverage
We already use `Instance` at declaration sites when available to glean
additional information about possible abstractions of the type in use.
This does the same when possible at callsites as well.
The primary purpose of this change is to allow CFI to alter how it
generates type information for indirect calls through `Virtual`
instances.
The payload of coverage statements was historically a structure with several
fields, so it was boxed to avoid bloating `StatementKind`.
Now that the payload is a single relatively-small enum, we can replace
`Box<Coverage>` with just `CoverageKind`.
This patch also adds a size assertion for `StatementKind`, to avoid
accidentally bloating it in the future.
This skips emitting extra arguments at every callsite (of which there
can be many). For a librustc_driver build with overflow checks enabled,
this cuts 0.7MB from the resulting binary.
Stabilize associated type bounds (RFC 2289)
This PR stabilizes associated type bounds, which were laid out in [RFC 2289]. This gives us a shorthand to express nested type bounds that would otherwise need to be expressed with nested `impl Trait` or broken into several `where` clauses.
### What are we stabilizing?
We're stabilizing the associated item bounds syntax, which allows us to put bounds in associated type position within other bounds, i.e. `T: Trait<Assoc: Bounds...>`. See [RFC 2289] for motivation.
In all position, the associated type bound syntax expands into a set of two (or more) bounds, and never anything else (see "How does this differ[...]" section for more info).
Associated type bounds are stabilized in four positions:
* **`where` clauses (and APIT)** - This is equivalent to breaking up the bound into two (or more) `where` clauses. For example, `where T: Trait<Assoc: Bound>` is equivalent to `where T: Trait, <T as Trait>::Assoc: Bound`.
* **Supertraits** - Similar to above, `trait CopyIterator: Iterator<Item: Copy> {}`. This is almost equivalent to breaking up the bound into two (or more) `where` clauses; however, the bound on the associated item is implied whenever the trait is used. See #112573/#112629.
* **Associated type item bounds** - This allows constraining the *nested* rigid projections that are associated with a trait's associated types. e.g. `trait Trait { type Assoc: Trait2<Assoc2: Copy>; }`.
* **opaque item bounds (RPIT, TAIT)** - This allows constraining associated types that are associated with the opaque without having to *name* the opaque. For example, `impl Iterator<Item: Copy>` defines an iterator whose item is `Copy` without having to actually name that item bound.
The latter three are not expressible in surface Rust (though for associated type item bounds, this will change in #120752, which I don't believe should block this PR), so this does represent a slight expansion of what can be expressed in trait bounds.
### How does this differ from the RFC?
Compared to the RFC, the current implementation *always* desugars associated type bounds to sets of `ty::Clause`s internally. Specifically, it does *not* introduce a position-dependent desugaring as laid out in [RFC 2289], and in particular:
* It does *not* desugar to anonymous associated items in associated type item bounds.
* It does *not* desugar to nested RPITs in RPIT bounds, nor nested TAITs in TAIT bounds.
This position-dependent desugaring laid out in the RFC existed simply to side-step limitations of the trait solver, which have mostly been fixed in #120584. The desugaring laid out in the RFC also added unnecessary complication to the design of the feature, and introduces its own limitations to, for example:
* Conditionally lowering to nested `impl Trait` in certain positions such as RPIT and TAIT means that we inherit the limitations of RPIT/TAIT, namely lack of support for higher-ranked opaque inference. See this code example: https://github.com/rust-lang/rust/pull/120752#issuecomment-1979412531.
* Introducing anonymous associated types makes traits no longer object safe, since anonymous associated types are not nameable, and all associated types must be named in `dyn` types.
This last point motivates why this PR is *not* stabilizing support for associated type bounds in `dyn` types, e.g, `dyn Assoc<Item: Bound>`. Why? Because `dyn` types need to have *concrete* types for all associated items, this would necessitate a distinct lowering for associated type bounds, which seems both complicated and unnecessary compared to just requiring the user to write `impl Trait` themselves. See #120719.
### Implementation history:
Limited to the significant behavioral changes and fixes and relevant PRs, ping me if I left something out--
* #57428
* #108063
* #110512
* #112629
* #120719
* #120584Closes#52662
[RFC 2289]: https://rust-lang.github.io/rfcs/2289-associated-type-bounds.html