62ebe3a2b1
Use the same DISubprogram for each instance of the same inlined function within a caller

# Issue Details:

The call to `panic` within a function like `Option::unwrap` is translated to LLVM as a `tail call` (as it will never return). When multiple calls to the same function like this are inlined, LLVM will notice the common `tail call` block (i.e., loading the same panic string + location info and then calling `panic`) and merge them together.

When merging these instructions together, LLVM will also attempt to merge the debug locations, but this fails (i.e., debug info is dropped) because Rust emits a new `DISubprogram` at each inline site. LLVM therefore doesn't recognize that these are actually the same function and concludes that there isn't a common debug location.

As an example of this, consider the following program:

```rust
#[no_mangle]
fn add_numbers(x: &Option<i32>, y: &Option<i32>) -> i32 {
    let x1 = x.unwrap();
    let y1 = y.unwrap();
    x1 + y1
}
```

When building for x86_64 Windows using 1.72 it generates (note the lack of `.cv_loc` before the call to `panic`, thus it will be attributed to the same line as the `addq` instruction):

```llvm
	.cv_loc	0 1 3 0                        # src\lib.rs:3:0
	addq	$40, %rsp
	retq
	leaq	.Lalloc_f570dea0a53168780ce9a91e67646421(%rip), %rcx
	leaq	.Lalloc_629ace53b7e5b76aaa810d549cc84ea3(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17h12e60b9063f6dee8E
	int3
```

# Fix Details:

Cache the `DISubprogram` emitted for each inlined function instance within a caller so that this can be reused if that instance is encountered again.

Ideally, we would also deduplicate child scopes and variables, however my attempt to do that with #114643 resulted in asserts when building for Linux (#115156) which would require some deep changes to Rust to fix (#115455).

Instead, when using an inlined function as a debug scope, we will also create a new child scope such that subsequent child scopes and variables do not collide (from LLVM's perspective).

After this change the above assembly now (with <https://reviews.llvm.org/D159226> as well) shows the `panic!` was inlined from `unwrap` in `option.rs` at line 935 into the current function in `lib.rs` at line 0 (line 0 is emitted since it is ambiguous which line to use as there were two inline sites that led to this same code):

```llvm
	.cv_loc	0 1 3 0                        # src\lib.rs:3:0
	addq	$40, %rsp
	retq
	.cv_inline_site_id 6 within 0 inlined_at 1 0 0
	.cv_loc	6 2 935 0                      # library\core\src\option.rs:935:0
	leaq	.Lalloc_5f55955de67e57c79064b537689facea(%rip), %rcx
	leaq	.Lalloc_e741d4de8cb5801e1fd7a6c6795c1559(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17hde1558f32d5b1c04E
	int3
```
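
The caching idea in the fix can be illustrated with a small, self-contained sketch. This is not the compiler's actual code: `InstanceId`, `ScopeId`, `CallerDebugScopes`, and `scopes_for_inline_site` are hypothetical names invented here, and plain integers stand in for LLVM debug-info handles. It only shows the shape of the approach, assuming one cache per caller: the subprogram for a given inlined instance is created once and reused, while every inline site still gets its own child scope.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for one inlined function instance
/// (for example, a particular monomorphization of `Option::unwrap`).
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct InstanceId(u32);

/// Hypothetical stand-in for a debug-scope handle
/// (a `DISubprogram` or a child scope beneath one).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ScopeId(u32);

/// Per-caller state: at most one cached subprogram per inlined instance.
#[derive(Default)]
struct CallerDebugScopes {
    subprograms: HashMap<InstanceId, ScopeId>,
    next_scope: u32,
}

impl CallerDebugScopes {
    /// Allocates a new, unique scope handle.
    fn fresh_scope(&mut self) -> ScopeId {
        let id = ScopeId(self.next_scope);
        self.next_scope += 1;
        id
    }

    /// Returns `(subprogram, child_scope)` for one inline site.
    /// The subprogram is shared across inline sites of the same instance;
    /// the child scope is always new, so nested scopes and variables from
    /// different inline sites do not collide.
    fn scopes_for_inline_site(&mut self, callee: InstanceId) -> (ScopeId, ScopeId) {
        let cached = self.subprograms.get(&callee).copied();
        let subprogram = match cached {
            Some(existing) => existing,
            None => {
                let created = self.fresh_scope();
                self.subprograms.insert(callee, created);
                created
            }
        };
        let child = self.fresh_scope();
        (subprogram, child)
    }
}
```

Calling `scopes_for_inline_site` twice with the same `InstanceId` yields the same subprogram handle but two distinct child scopes, which mirrors the two `unwrap` inline sites in `add_numbers`.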
||
---|---|---|
auxiliary | ||
avr | ||
cffi | ||
dllimports | ||
enum | ||
instrument-xray | ||
intrinsics | ||
issues | ||
lib-optimizations | ||
loongarch-abi | ||
macos | ||
naked-fn | ||
non-terminate | ||
remap_path_prefix | ||
repr | ||
riscv-abi | ||
sanitizer | ||
simd | ||
simd-intrinsic | ||
src-hash-algorithm | ||
unwind-abis | ||
aarch64-struct-align-128.rs | ||
abi-efiapi.rs | ||
abi-main-signature-16bit-c-int.rs | ||
abi-main-signature-32bit-c-int.rs | ||
abi-repr-ext.rs | ||
abi-sysv64.rs | ||
abi-x86_64_sysv.rs | ||
abi-x86-interrupt.rs | ||
addr-of-mutate.rs | ||
adjustments.rs | ||
align-byval-vector.rs | ||
align-byval.rs | ||
align-enum.rs | ||
align-fn.rs | ||
align-offset.rs | ||
align-struct.rs | ||
alloc-optimisation.rs | ||
array-clone.rs | ||
array-codegen.rs | ||
array-equality.rs | ||
array-map.rs | ||
ascii-char.rs | ||
asm-clobber_abi.rs | ||
asm-clobbers.rs | ||
asm-may_unwind.rs | ||
asm-maybe-uninit.rs | ||
asm-multiple-options.rs | ||
asm-options.rs | ||
asm-powerpc-clobbers.rs | ||
asm-sanitize-llvm.rs | ||
asm-target-clobbers.rs | ||
async-fn-debug-awaitee-field.rs | ||
async-fn-debug-msvc.rs | ||
async-fn-debug.rs | ||
atomic-operations.rs | ||
autovectorize-f32x4.rs | ||
binary-search-index-no-bound-check.rs | ||
bool-cmp.rs | ||
box-uninit-bytes.rs | ||
bpf-alu32.rs | ||
branch-protection.rs | ||
call-llvm-intrinsics.rs | ||
call-metadata.rs | ||
catch-unwind.rs | ||
cdylib-external-inline-fns.rs | ||
cf-protection.rs | ||
cfguard-checks.rs | ||
cfguard-disabled.rs | ||
cfguard-nochecks.rs | ||
cfguard-non-msvc.rs | ||
codemodels.rs | ||
coercions.rs | ||
cold-call-declare-and-call.rs | ||
comparison-operators-2-tuple.rs | ||
comparison-operators-newtype.rs | ||
const_scalar_pair.rs | ||
consts.rs | ||
dealloc-no-unwind.rs | ||
debug-alignment.rs | ||
debug-column-msvc.rs | ||
debug-column.rs | ||
debug-compile-unit-path.rs | ||
debug-limited.rs | ||
debug-line-directives-only.rs | ||
debug-line-tables-only.rs | ||
debug-linkage-name.rs | ||
debug-vtable.rs | ||
debuginfo-constant-locals.rs | ||
debuginfo-generic-closure-env-names.rs | ||
debuginfo-inline-callsite-location.rs | ||
deduced-param-attrs.rs | ||
default-requires-uwtable.rs | ||
drop-in-place-noalias.rs | ||
drop.rs | ||
dst-vtable-align-nonzero.rs | ||
dst-vtable-size-range.rs | ||
enable-lto-unit-splitting.rs | ||
export-no-mangle.rs | ||
external-no-mangle-fns.rs | ||
external-no-mangle-statics.rs | ||
fastcall-inreg.rs | ||
fatptr.rs | ||
fewer-names.rs | ||
float_math.rs | ||
fn-impl-trait-self.rs | ||
foo.s | ||
force-frame-pointers.rs | ||
force-no-unwind-tables.rs | ||
force-unwind-tables.rs | ||
frame-pointer.rs | ||
function-arguments-noopt.rs | ||
function-arguments.rs | ||
gdb_debug_script_load.rs | ||
generator-debug-msvc.rs | ||
generator-debug.rs | ||
generic-debug.rs | ||
global_asm_include.rs | ||
global_asm_x2.rs | ||
global_asm.rs | ||
inherit_overflow.rs | ||
inline-always-works-always.rs | ||
inline-debuginfo.rs | ||
inline-function-args-debug-info.rs | ||
inline-hint.rs | ||
instrument-coverage.rs | ||
instrument-mcount.rs | ||
integer-cmp.rs | ||
integer-overflow.rs | ||
internalize-closures.rs | ||
intrinsic-no-unnamed-attr.rs | ||
iter-repeat-n-trivial-drop.rs | ||
layout-size-checks.rs | ||
lifetime_start_end.rs | ||
link_section.rs | ||
link-dead-code.rs | ||
llvm-ident.rs | ||
loads.rs | ||
local-generics-in-exe-internalized.rs | ||
lto-removes-invokes.rs | ||
mainsubprogram.rs | ||
mainsubprogramstart.rs | ||
match-optimized.rs | ||
match-optimizes-away.rs | ||
match-unoptimized.rs | ||
mem-replace-big-type.rs | ||
mem-replace-simple-type.rs | ||
merge-functions.rs | ||
method-declaration.rs | ||
mir_zst_stores.rs | ||
mir-inlined-line-numbers.rs | ||
move-before-nocapture-ref-arg.rs | ||
move-operands.rs | ||
no_builtins-at-crate.rs | ||
no-assumes-on-casts.rs | ||
no-dllimport-w-cross-lang-lto.rs | ||
no-jump-tables.rs | ||
no-plt.rs | ||
noalias-box-off.rs | ||
noalias-box.rs | ||
noalias-flag.rs | ||
noalias-refcell.rs | ||
noalias-rwlockreadguard.rs | ||
noalias-unpin.rs | ||
noreturn-uninhabited.rs | ||
noreturnflag.rs | ||
nounwind.rs | ||
nrvo.rs | ||
optimize-attr-1.rs | ||
option-as-slice.rs | ||
option-nonzero-eq.rs | ||
packed.rs | ||
panic-abort-windows.rs | ||
panic-in-drop-abort.rs | ||
panic-unwind-default-uwtable.rs | ||
personality_lifetimes.rs | ||
pgo-counter-bias.rs | ||
pgo-instrumentation.rs | ||
pic-relocation-model.rs | ||
pie-relocation-model.rs | ||
ptr-arithmetic.rs | ||
ptr-read-metadata.rs | ||
README.md | ||
refs.rs | ||
repeat-trusted-len.rs | ||
scalar-pair-bool.rs | ||
set-discriminant-invalid.rs | ||
slice_as_from_ptr_range.rs | ||
slice-as_chunks.rs | ||
slice-indexing.rs | ||
slice-init.rs | ||
slice-iter-fold.rs | ||
slice-iter-len-eq-zero.rs | ||
slice-iter-nonnull.rs | ||
slice-position-bounds-check.rs | ||
slice-ref-equality.rs | ||
slice-reverse.rs | ||
slice-windows-no-bounds-check.rs | ||
some-abis-do-extend-params-to-32-bits.rs | ||
some-global-nonnull.rs | ||
sparc-struct-abi.rs | ||
split-lto-unit.rs | ||
sroa-fragment-debuginfo.rs | ||
sse42-implies-crc32.rs | ||
stack-probes-call.rs | ||
stack-probes-inline.rs | ||
stack-protector.rs | ||
static-relocation-model-msvc.rs | ||
staticlib-external-inline-fns.rs | ||
stores.rs | ||
swap-large-types.rs | ||
swap-small-types.rs | ||
target-cpu-on-functions.rs | ||
target-feature-inline-closure.rs | ||
target-feature-overrides.rs | ||
thread-local.rs | ||
tied-features-strength.rs | ||
to_vec.rs | ||
trailing_zeros.rs | ||
transmute-optimized.rs | ||
transmute-scalar.rs | ||
try_identity.rs | ||
try_question_mark_nop.rs | ||
tune-cpu-on-functions.rs | ||
tuple-layout-opt.rs | ||
unchecked_shifts.rs | ||
unchecked-float-casts.rs | ||
uninit-consts.rs | ||
union-abi.rs | ||
unwind-and-panic-abort.rs | ||
unwind-extern-exports.rs | ||
unwind-extern-imports.rs | ||
used_with_arg.rs | ||
var-names.rs | ||
vec-as-ptr.rs | ||
vec-calloc.rs | ||
vec-in-place.rs | ||
vec-iter-collect-len.rs | ||
vec-optimizes-away.rs | ||
vec-shrink-panik.rs | ||
vecdeque_no_panic.rs | ||
virtual-function-elimination-32bit.rs | ||
virtual-function-elimination.rs | ||
wasm_casts_trapping.rs | ||
wasm_exceptions.rs | ||
zip.rs | ||
zst-offset.rs |

The files here use the LLVM FileCheck framework, documented at
<https://llvm.org/docs/CommandGuide/FileCheck.html>.

One extension worth noting is the use of revisions as custom prefixes for FileCheck. If your codegen test has different behavior based on the chosen target or different compiler flags that you want to exercise, you can use a revisions annotation, like so:

    // revisions: aaa bbb
    // [bbb] compile-flags: --flags-for-bbb

After specifying those variations, you can write different expected, or explicitly unexpected, output by using `<prefix>-SAME:` and `<prefix>-NOT:`, like so:

    // CHECK: expected code
    // aaa-SAME: emitted-only-for-aaa
    // aaa-NOT: emitted-only-for-bbb
    // bbb-NOT: emitted-only-for-aaa
    // bbb-SAME: emitted-only-for-bbb
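
To make the two annotations concrete, here is a minimal sketch of a test file that combines them. The revision names `opt` and `noopt`, the function `add_one`, and every check pattern are illustrative assumptions, not an existing test in this directory. The IR that rustc emits differs between versions and targets, so the patterns are placeholders to adapt rather than verified expectations.

```rust
// revisions: opt noopt
// [opt] compile-flags: -C opt-level=3
// [noopt] compile-flags: -C opt-level=0

// Hypothetical function under test. `#[no_mangle]` keeps the symbol name
// stable so that a FileCheck label can anchor on it.
// CHECK-LABEL: @add_one
#[no_mangle]
pub fn add_one(x: u32) -> u32 {
    // Applies to every revision (placeholder pattern).
    // CHECK: add i32
    // Applies only when the `opt` revision is checked (placeholder pattern).
    // opt-NOT: call
    x.wrapping_add(1)
}
```

Lines prefixed with `CHECK` are evaluated for every revision, while lines prefixed with a revision name are evaluated only for that revision. Tests in this directory usually also declare `#![crate_type = "lib"]` so that no `main` function is needed; it is left out of the sketch to keep it short.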