This significantly improves performance. For example, on the
simple-raytracer benchmark it goes from a 13% improvement over LLVM to
a 39% improvement over LLVM.
Sometimes it is necessary for handling vector to scalar pair transmutes,
but if the types are the same there is no need for this.
This improves runtime performance on simple-raytracer by 12%.
Use stable metric for const eval limit instead of current terminator-based logic
This patch adds a `MirPass` that inserts a new MIR instruction `ConstEvalCounter` to any loops and function calls in the CFG. This instruction is used during Const Eval to count against the `const_eval_limit`, and emit the `StepLimitReached` error, replacing the current logic which uses Terminators only.
The new method of counting loops and function calls should be more stable across compiler versions (i.e., changes to MIR generation or optimization should no longer cause crates that previously compiled successfully to stop compiling).
Also see: #103877
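To illustrate why counting at loops and calls is more stable than counting every terminator, here is a minimal toy model. It is not rustc code: `Instr`, `eval`, and `looped` are hypothetical names, and the "program" is a flat instruction list rather than real MIR. The `Counter` variant stands in for `ConstEvalCounter`; only it counts against the limit, so the amount of other work inside a loop body does not affect the result.

```rust
// Toy model only: these types are hypothetical and do not mirror rustc's MIR.
#[derive(Clone, Copy)]
enum Instr {
    /// Ordinary work; never counted against the limit.
    Work,
    /// Stand-in for `ConstEvalCounter`, placed once per loop iteration or call.
    Counter,
    /// Jump back to `target`, at most `times` times in total.
    Loop { target: usize, times: u32 },
}

#[derive(Debug, PartialEq)]
enum EvalError {
    StepLimitReached,
}

/// Runs `program` and returns how many `Counter` instructions were executed,
/// or an error once that count exceeds `limit`.
fn eval(program: &[Instr], limit: u32) -> Result<u32, EvalError> {
    let mut counted = 0u32;
    let mut pc = 0usize;
    // Remaining back-jumps for each `Loop` instruction, indexed by position.
    let mut remaining: Vec<u32> = program
        .iter()
        .map(|i| match i {
            Instr::Loop { times, .. } => *times,
            _ => 0,
        })
        .collect();

    while pc < program.len() {
        match program[pc] {
            Instr::Work => pc += 1,
            Instr::Counter => {
                counted += 1;
                if counted > limit {
                    return Err(EvalError::StepLimitReached);
                }
                pc += 1;
            }
            Instr::Loop { target, .. } => {
                if remaining[pc] > 0 {
                    remaining[pc] -= 1;
                    pc = target;
                } else {
                    pc += 1;
                }
            }
        }
    }
    Ok(counted)
}

/// Builds a loop whose body is one `Counter` followed by
/// `work_per_iteration` `Work` instructions, with `times` back-jumps.
fn looped(work_per_iteration: usize, times: u32) -> Vec<Instr> {
    let mut program = vec![Instr::Counter];
    program.extend(std::iter::repeat(Instr::Work).take(work_per_iteration));
    program.push(Instr::Loop { target: 0, times });
    program
}
```

In this sketch, `eval(&looped(2, 4), 10)` and `eval(&looped(20, 4), 10)` report the same count, because the count depends on the number of iterations and not on how many instructions the body happens to contain.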
Sync rustc_codegen_cranelift
For cg_clif itself there have been a couple of bug fixes since the last sync, a Cranelift update, and implementations of all remaining simd platform intrinsics used by `std::simd` (`std::arch` is still missing a lot, though). Most of the diff is from reworking the cg_clif build system, though.
r? `@ghost`
`@rustbot` label +A-codegen +A-cranelift +T-compiler
Rollup of 11 pull requests
Successful merges:
- #106407 (Improve proc macro attribute diagnostics)
- #106960 (Teach parser to understand fake anonymous enum syntax)
- #107085 (Custom MIR: Support binary and unary operations)
- #107086 (Print PID holding bootstrap build lock on Linux)
- #107175 (Fix escaping inference var ICE in `point_at_expr_source_of_inferred_type`)
- #107204 (suggest qualifying bare associated constants)
- #107248 (abi: add AddressSpace field to Primitive::Pointer)
- #107272 (Implement ObjectSafe and WF in the new solver)
- #107285 (Implement `Generator` and `Future` in the new solver)
- #107286 (ICE in new solver if we see an inference variable)
- #107313 (Add Style Team Triagebot config)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
InstCombine away intrinsic validity assertions
This optimization (currently) fires 246 times on the standard library. It seems to fire hardly at all on the big crates in the benchmark suite. Interesting.
...and remove it from `PointeeInfo`, which isn't meant for this.
There are still various places (marked with FIXMEs) that assume all pointers
have the same size and alignment. Fixing this requires parsing non-default
address spaces in the data layout string, which will be done in a followup.
Various cleanups around pre-TyCtxt queries and functions
part of #105462
based on https://github.com/rust-lang/rust/pull/106776 (everything starting at [0e2b39f](0e2b39fd1f) is new in this PR)
r? `@petrochenkov`
I think this should be most of the uncontroversial part of #105462.
This is an additional 17% improvement on ./y.rs build --sysroot none
Benchmark 1: ./y_before.bin build --sysroot none
Time (mean ± σ): 1.533 s ± 0.022 s [User: 1.411 s, System: 0.471 s]
Range (min … max): 1.517 s … 1.589 s 10 runs
Benchmark 2: ./y_after.bin build --sysroot none
Time (mean ± σ): 1.311 s ± 0.020 s [User: 1.232 s, System: 0.428 s]
Range (min … max): 1.298 s … 1.366 s 10 runs
Summary
'./y_after.bin build --sysroot none' ran
1.17 ± 0.02 times faster than './y_before.bin build --sysroot none'
By avoiding some redundant rustc calls and stripping debuginfo for
wrappers, ./y.rs build --sysroot none now runs 44% faster.
Benchmark 1: ./y_before.bin build --sysroot none
Time (mean ± σ): 2.200 s ± 0.038 s [User: 2.140 s, System: 0.653 s]
Range (min … max): 2.171 s … 2.303 s 10 runs
Benchmark 2: ./y_after.bin build --sysroot none
Time (mean ± σ): 1.528 s ± 0.020 s [User: 1.388 s, System: 0.490 s]
Range (min … max): 1.508 s … 1.580 s 10 runs
Summary
'./y_after.bin build --sysroot none' ran
1.44 ± 0.03 times faster than './y_before.bin build --sysroot none'
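For reference, the 17% and 44% figures in the two benchmark runs above are speedup ratios (mean time before divided by mean time after), not reductions in wall-clock time. The sketch below recomputes both from the reported means; the function names are illustrative only.

```rust
/// Speedup factor: ratio of mean wall-clock times (before / after).
fn speedup(before_s: f64, after_s: f64) -> f64 {
    before_s / after_s
}

/// Fraction of wall-clock time saved relative to the "before" run.
fn time_saved(before_s: f64, after_s: f64) -> f64 {
    1.0 - after_s / before_s
}
```

With the means reported above, `speedup(1.533, 1.311)` is about 1.17 and `speedup(2.200, 1.528)` is about 1.44, while `time_saved` gives roughly 0.14 and 0.31 for the same pairs, so "44% faster" corresponds to about 31% less time per build.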