This significantly improves performance. For example, on the
simple-raytracer benchmark it goes from a 13% improvement over LLVM to
a 39% improvement over LLVM.
Sometimes it is necessary for handling vector to scalar pair transmutes,
but if the types are the same there is no need for this.
This improves runtime performance on simple-raytracer by 12%.
Sync rustc_codegen_cranelift
For cg_clif itself, there have been a couple of bug fixes since the last sync, a Cranelift update, and implementations of all remaining SIMD platform intrinsics used by `std::simd` (`std::arch` is still missing a lot, though). Most of the diff comes from reworking the cg_clif build system.
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler
Rollup of 11 pull requests
Successful merges:
- #106407 (Improve proc macro attribute diagnostics)
- #106960 (Teach parser to understand fake anonymous enum syntax)
- #107085 (Custom MIR: Support binary and unary operations)
- #107086 (Print PID holding bootstrap build lock on Linux)
- #107175 (Fix escaping inference var ICE in `point_at_expr_source_of_inferred_type`)
- #107204 (suggest qualifying bare associated constants)
- #107248 (abi: add AddressSpace field to Primitive::Pointer)
- #107272 (Implement ObjectSafe and WF in the new solver)
- #107285 (Implement `Generator` and `Future` in the new solver)
- #107286 (ICE in new solver if we see an inference variable)
- #107313 (Add Style Team Triagebot config)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
InstCombine away intrinsic validity assertions
This optimization (currently) fires 246 times on the standard library. It seems to fire hardly at all on the big crates in the benchmark suite. Interesting.
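For context, here is a minimal sketch of the kind of code where such assertions show up. As far as I know, `mem::zeroed` and `MaybeUninit::assume_init` are implemented in the standard library on top of validity-assertion intrinsics, and once the type is concrete (here `u32`) the outcome of the check is known at compile time. The helper names `zeroed_u32` and `init_u32` are hypothetical and only serve the illustration; this does not show the MIR pass itself.

```rust
use std::mem::{self, MaybeUninit};

// Hypothetical helper: `mem::zeroed` checks that all-zero bytes are valid for `T`.
// For `T = u32` that check always passes, so nothing needs to remain at runtime.
fn zeroed_u32() -> u32 {
    // SAFETY: the all-zero bit pattern is a valid `u32`.
    unsafe { mem::zeroed() }
}

// Hypothetical helper: `assume_init` checks that `T` is inhabited.
// `u32` is inhabited, so this check is also statically known to pass.
fn init_u32(value: u32) -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(value);
    // SAFETY: `slot` was initialized by the `write` above.
    unsafe { slot.assume_init() }
}

fn main() {
    println!("{} {}", zeroed_u32(), init_u32(7));
}
```

That the big benchmark crates rarely hit this is plausible if they seldom call these functions with concrete types directly, but that is a guess and not something measured here.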
...and remove it from `PointeeInfo`, which isn't meant for this.
There are still various places (marked with FIXMEs) that assume all pointers
have the same size and alignment. Fixing this requires parsing non-default
address spaces in the data layout string, which will be done in a followup.
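To illustrate what the followup involves, here is a simplified sketch of pulling per-address-space pointer specs out of an LLVM-style data layout string, where pointer entries have the form `p[n]:<size>:<abi>[...]` and a missing `n` means address space 0. This is not rustc's parser; `PointerSpec` and `pointer_specs` are hypothetical names, and optional trailing fields (preferred alignment, index size) are ignored.

```rust
use std::collections::HashMap;

/// Pointer size and ABI alignment, in bits, for one address space.
/// (Hypothetical type for this sketch, not a rustc type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct PointerSpec {
    size_bits: u32,
    abi_align_bits: u32,
}

/// Collects the `p[n]:<size>:<abi>` entries of a data layout string,
/// keyed by address space number. Malformed entries are skipped.
fn pointer_specs(data_layout: &str) -> HashMap<u32, PointerSpec> {
    let mut specs = HashMap::new();
    for entry in data_layout.split('-') {
        // Only lowercase `p...` entries describe pointers.
        let Some(rest) = entry.strip_prefix('p') else { continue };
        let mut fields = rest.split(':');
        // The first field is the address space number; empty means 0.
        let space = match fields.next() {
            Some("") => 0u32,
            Some(n) => match n.parse::<u32>() {
                Ok(n) => n,
                Err(_) => continue,
            },
            None => continue,
        };
        let (Some(size), Some(abi)) = (fields.next(), fields.next()) else { continue };
        if let (Ok(size_bits), Ok(abi_align_bits)) = (size.parse::<u32>(), abi.parse::<u32>()) {
            specs.insert(space, PointerSpec { size_bits, abi_align_bits });
        }
    }
    specs
}

fn main() {
    // Example layout string with three non-default pointer address spaces.
    let layout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128";
    let specs = pointer_specs(layout);
    let mut spaces: Vec<_> = specs.keys().copied().collect();
    spaces.sort();
    for space in spaces {
        println!("address space {space}: {:?}", specs[&space]);
    }
}
```

With a table like this, code that currently assumes one pointer size could look the size up by address space instead.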
Various cleanups around pre-TyCtxt queries and functions
part of #105462
based on https://github.com/rust-lang/rust/pull/106776 (everything starting at [0e2b39f](0e2b39fd1f) is new in this PR)
r? `@petrochenkov`
I think this should be most of the uncontroversial part of #105462.
This is an additional 17% improvement on ./y.rs build --sysroot none
Benchmark 1: ./y_before.bin build --sysroot none
Time (mean ± σ): 1.533 s ± 0.022 s [User: 1.411 s, System: 0.471 s]
Range (min … max): 1.517 s … 1.589 s 10 runs
Benchmark 2: ./y_after.bin build --sysroot none
Time (mean ± σ): 1.311 s ± 0.020 s [User: 1.232 s, System: 0.428 s]
Range (min … max): 1.298 s … 1.366 s 10 runs
Summary
'./y_after.bin build --sysroot none' ran
1.17 ± 0.02 times faster than './y_before.bin build --sysroot none'
By avoiding some redundant rustc calls and stripping debuginfo for
wrappers, ./y.rs build --sysroot none now runs 44% faster.
Benchmark 1: ./y_before.bin build --sysroot none
Time (mean ± σ): 2.200 s ± 0.038 s [User: 2.140 s, System: 0.653 s]
Range (min … max): 2.171 s … 2.303 s 10 runs
Benchmark 2: ./y_after.bin build --sysroot none
Time (mean ± σ): 1.528 s ± 0.020 s [User: 1.388 s, System: 0.490 s]
Range (min … max): 1.508 s … 1.580 s 10 runs
Summary
'./y_after.bin build --sysroot none' ran
1.44 ± 0.03 times faster than './y_before.bin build --sysroot none'
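For reference, a small sketch of how the percentages relate to the mean times reported by the benchmarks. It assumes "N% faster" means the ratio of mean wall times (before / after), which matches the summary lines; the function names are hypothetical. The second function shows the same result expressed as wall time saved, which is a smaller number.

```rust
/// How many times faster the "after" run is, as a ratio of mean wall times.
fn speedup(before_s: f64, after_s: f64) -> f64 {
    before_s / after_s
}

/// Fraction of the original wall time that was saved.
fn time_saved(before_s: f64, after_s: f64) -> f64 {
    (before_s - after_s) / before_s
}

fn main() {
    // Mean times in seconds, taken from the two benchmark runs.
    let runs = [("1.533 s -> 1.311 s", 1.533, 1.311), ("2.200 s -> 1.528 s", 2.200, 1.528)];
    for (label, before, after) in runs {
        println!(
            "{label}: {:.2}x faster, {:.1}% less wall time",
            speedup(before, after),
            100.0 * time_saved(before, after)
        );
    }
}
```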