rust/compiler/rustc_codegen_gcc
bors 154ae32a55 Auto merge of #114643 - dpaoliello:inlinedebuginfo, r=wesleywiser
Use the same DISubprogram for each instance of the same inlined function within a caller

# Issue Details:
The call to `panic` within a function like `Option::unwrap` is translated to LLVM as a `tail call` (as it will never return). When multiple calls to a function like this are inlined, LLVM notices the common `tail call` block (i.e., loading the same panic string + location info and then calling `panic`) and merges the copies together.

When merging these instructions, LLVM also attempts to merge their debug locations, but this fails (i.e., debug info is dropped) because Rust emits a new `DISubprogram` at each inline site. LLVM therefore doesn't recognize that these are actually the same function and concludes that there is no common debug location.

As an example, when building for x86_64 Windows (note the lack of a `.cv_loc` before the call to `panic`, so the call is attributed to the same line as the `addq` instruction):

```
	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	leaq	.Lalloc_f570dea0a53168780ce9a91e67646421(%rip), %rcx
	leaq	.Lalloc_629ace53b7e5b76aaa810d549cc84ea3(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17h12e60b9063f6dee8E
	int3
```

# Fix Details:
Cache the `DISubprogram` emitted for each inlined function instance within a caller so that it can be reused if that instance is encountered again. This also requires caching the `DILexicalBlock` and `DIVariable` objects to avoid creating duplicates.

After this change the above assembly now looks like:

```
	.cv_loc	0 1 23 0                        # src\lib.rs:23:0
	addq	$40, %rsp
	retq
	.cv_inline_site_id 5 within 0 inlined_at 1 0 0
	.cv_inline_site_id 6 within 5 inlined_at 1 12 0
	.cv_loc	6 2 935 0                       # library\core\src\option.rs:935:0
	leaq	.Lalloc_5f55955de67e57c79064b537689facea(%rip), %rcx
	leaq	.Lalloc_e741d4de8cb5801e1fd7a6c6795c1559(%rip), %r8
	movl	$43, %edx
	callq	_ZN4core9panicking5panic17hde1558f32d5b1c04E
	int3
```
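
The caching idea described under "Fix Details" can be sketched as follows. This is a minimal illustration, not the code from the PR: `Instance`, `DiSubprogram`, `CallerDebugContext` and `scope_for_inlined` are hypothetical stand-ins, and the real change also caches the `DILexicalBlock` and `DIVariable` objects, which is omitted here.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a monomorphized function instance.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Instance(String);

/// Hypothetical stand-in for a handle to a `DISubprogram` node.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DiSubprogram(u32);

/// Debug-info state kept while generating code for one caller.
#[derive(Default)]
struct CallerDebugContext {
    /// One entry per callee instance inlined into this caller.
    inlined_scopes: HashMap<Instance, DiSubprogram>,
    /// Number of `DISubprogram` nodes created so far.
    created: u32,
}

impl CallerDebugContext {
    /// Return the cached node for `callee`, creating it on first use only.
    fn scope_for_inlined(&mut self, callee: &Instance) -> DiSubprogram {
        if let Some(&scope) = self.inlined_scopes.get(callee) {
            return scope;
        }
        let scope = DiSubprogram(self.created);
        self.created += 1;
        self.inlined_scopes.insert(callee.clone(), scope);
        scope
    }
}

fn main() {
    let mut caller = CallerDebugContext::default();
    let unwrap = Instance("Option::<u32>::unwrap".to_string());
    // Two inline sites of the same callee share one node.
    let first = caller.scope_for_inlined(&unwrap);
    let second = caller.scope_for_inlined(&unwrap);
    assert_eq!(first, second);
    println!("nodes created: {}", caller.created);
}
```

Because both inline sites now refer to the same node, a merged instruction can keep a debug location that names that function.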
2023-08-22 20:15:29 +00:00
.github/workflows Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
build_sysroot Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
crate_patches
example add a csky-unknown-linux-gnuabiv2 target 2023-08-14 23:02:36 +08:00
patches Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
src Auto merge of #114643 - dpaoliello:inlinedebuginfo, r=wesleywiser 2023-08-22 20:15:29 +00:00
tests
tools Apply changes to fix python linting errors 2023-06-16 20:56:01 -04:00
.gitignore Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
.rustfmt.toml
build.sh
Cargo.lock Update Cargo.lock 2023-06-19 20:44:01 -04:00
cargo.sh
Cargo.toml
clean_all.sh
config.sh
failing-ui-tests12.txt
failing-ui-tests.txt Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
LICENSE-APACHE
LICENSE-MIT
messages.ftl UPDATE - replace expected_simd error with one from codegen_ssa 2023-07-20 00:20:00 -04:00
prepare_build.sh
prepare.sh
Readme.md Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
rust-toolchain Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00
rustup.sh
test.sh Merge commit '1bbee3e217d75e7bc3bfe5d8c1b35e776fce96e6' into sync-cg_gcc-2023-06-19 2023-06-19 18:51:02 -04:00

WIP libgccjit codegen backend for rust

Chat on IRC

This is a GCC codegen for rustc, which means it can be loaded by the existing rustc frontend, but benefits from GCC: more architectures are supported and GCC's optimizations are used.

Despite its name, libgccjit can be used for ahead-of-time compilation, as is used here.

Motivation

The primary goal of this project is to be able to compile Rust code on platforms unsupported by LLVM. A secondary goal is to check if using the gcc backend will provide any run-time speed improvement for the programs compiled using rustc.

Building

This requires a patched libgccjit in order to work. The patches in this repository need to be applied. (Those patches should work when applied on master, but in case they don't, they are known to work when applied on 079c23cfe079f203d5df83fea8e92a60c7d7e878.) You can also use my fork of gcc which already includes these patches.

To build it (most of these instructions come from here, so don't hesitate to take a look there if you encounter an issue):

$ git clone https://github.com/antoyo/gcc
$ sudo apt install flex libmpfr-dev libgmp-dev libmpc3 libmpc-dev
$ mkdir gcc-build gcc-install
$ cd gcc-build
# `--enable-checking=release` enables extra checks which help to find bugs.
$ ../gcc/configure \
    --enable-host-shared \
    --enable-languages=jit \
    --enable-checking=release \
    --disable-bootstrap \
    --disable-multilib \
    --prefix=$(pwd)/../gcc-install
$ make -j4 # You can replace `4` with another number depending on how many cores you have.

If you want to run the libgccjit tests, you will also need to enable the C++ language in the configure command:

--enable-languages=jit,c++

Then to run libgccjit tests:

$ cd gcc # from the `gcc-build` folder
$ make check-jit
# To run one specific test:
$ make check-jit RUNTESTFLAGS="-v -v -v jit.exp=jit.dg/test-asm.cc"

Put the path to your custom build of libgccjit in the file gcc_path.

$ dirname $(readlink -f `find . -name libgccjit.so`) > gcc_path

You also need to set RUST_COMPILER_RT_ROOT:

$ git clone https://github.com/llvm/llvm-project llvm --depth 1 --single-branch
$ export RUST_COMPILER_RT_ROOT="$PWD/llvm/compiler-rt"

Then you can run commands like this:

$ ./prepare.sh # download and patch sysroot src and install hyperfine for benchmarking
$ LIBRARY_PATH=$(cat gcc_path) LD_LIBRARY_PATH=$(cat gcc_path) ./build.sh --release

To run the tests:

$ ./test.sh --release

Usage

$cg_gccjit_dir is the directory you cloned this repo into in the following instructions.

Cargo

$ CHANNEL="release" $cg_gccjit_dir/cargo.sh run

If you compiled cg_gccjit in debug mode (i.e., you didn't pass --release to ./test.sh), you should use CHANNEL="debug" instead or omit CHANNEL="release" completely.

Rustc

You should prefer using the Cargo method.

$ rustc +$(cat $cg_gccjit_dir/rust-toolchain) -Cpanic=abort -Zcodegen-backend=$cg_gccjit_dir/target/release/librustc_codegen_gcc.so --sysroot $cg_gccjit_dir/build_sysroot/sysroot my_crate.rs
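
As a rough sketch, the same invocation can be assembled programmatically, for example from a build script. The helper `rustc_args` is hypothetical, and the assumption that a debug build of the backend lands in target/debug is based on the standard Cargo layout rather than on anything this repository documents.

```rust
use std::path::Path;

/// Build the argument list for the rustc invocation shown above.
/// `toolchain` is the content of the rust-toolchain file and `channel`
/// is "release" or "debug", matching how cg_gccjit was built.
fn rustc_args(cg_gccjit_dir: &Path, toolchain: &str, channel: &str, krate: &str) -> Vec<String> {
    // Assumption: Cargo puts the backend in target/<channel>/.
    let backend = cg_gccjit_dir
        .join("target")
        .join(channel)
        .join("librustc_codegen_gcc.so");
    let sysroot = cg_gccjit_dir.join("build_sysroot").join("sysroot");
    vec![
        format!("+{}", toolchain),
        "-Cpanic=abort".to_string(),
        format!("-Zcodegen-backend={}", backend.display()),
        "--sysroot".to_string(),
        sysroot.display().to_string(),
        krate.to_string(),
    ]
}

fn main() {
    // The checkout path and the toolchain name below are made up.
    let args = rustc_args(
        Path::new("/opt/rustc_codegen_gcc"),
        "nightly",
        "release",
        "my_crate.rs",
    );
    println!("rustc {}", args.join(" "));
}
```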

Env vars

CG_GCCJIT_INCR_CACHE_DISABLED
Don't cache object files in the incremental cache. Useful during development of cg_gccjit to make it possible to use incremental mode for all analyses performed by rustc without caching object files when their content should have been changed by a change to cg_gccjit.
CG_GCCJIT_DISPLAY_CG_TIME
Display the time it took to perform codegen for a crate
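
For illustration only, a flag like CG_GCCJIT_DISPLAY_CG_TIME could be consumed along these lines. This is not taken from the backend's source: the helpers `flag_enabled` and `timed` are hypothetical, and treating the value "1" as "enabled" is an assumption.

```rust
use std::time::{Duration, Instant};

/// Assumption: a flag counts as enabled when the variable is set to "1".
fn flag_enabled(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Run `codegen` and return its result together with the elapsed time.
fn timed<T>(codegen: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let result = codegen();
    (result, start.elapsed())
}

fn main() {
    let display_time =
        flag_enabled(std::env::var("CG_GCCJIT_DISPLAY_CG_TIME").ok().as_deref());
    // Stand-in for the real codegen work of a crate.
    let (units, elapsed) = timed(|| (0..1000u32).filter(|n| n % 7 == 0).count());
    if display_time {
        println!("codegen of {} units took {:?}", units, elapsed);
    }
}
```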

Debugging

Sometimes, libgccjit will crash and output an error like this:

during RTL pass: expand
libgccjit.so: error: in expmed_mode_index, at expmed.h:249
0x7f0da2e61a35 expmed_mode_index
	../../../gcc/gcc/expmed.h:249
0x7f0da2e61aa4 expmed_op_cost_ptr
	../../../gcc/gcc/expmed.h:271
0x7f0da2e620dc sdiv_cost_ptr
	../../../gcc/gcc/expmed.h:540
0x7f0da2e62129 sdiv_cost
	../../../gcc/gcc/expmed.h:558
0x7f0da2e73c12 expand_divmod(int, tree_code, machine_mode, rtx_def*, rtx_def*, rtx_def*, int)
	../../../gcc/gcc/expmed.c:4335
0x7f0da2ea1423 expand_expr_real_2(separate_ops*, rtx_def*, machine_mode, expand_modifier)
	../../../gcc/gcc/expr.c:9240
0x7f0da2cd1a1e expand_gimple_stmt_1
	../../../gcc/gcc/cfgexpand.c:3796
0x7f0da2cd1c30 expand_gimple_stmt
	../../../gcc/gcc/cfgexpand.c:3857
0x7f0da2cd90a9 expand_gimple_basic_block
	../../../gcc/gcc/cfgexpand.c:5898
0x7f0da2cdade8 execute
	../../../gcc/gcc/cfgexpand.c:6582

To see the code which causes this error, call the following function:

gcc_jit_context_dump_to_file(ctxt, "/tmp/output.c", 1 /* update_locations */)

This will create a C-like file and add the locations into the IR pointing to this C file. Then, rerun the program and it will output the location in the second line:

libgccjit.so: /tmp/something.c:61322:0: error: in expmed_mode_index, at expmed.h:249

Or add a breakpoint to add_error in gdb and print the line number using:

p loc->m_line
p loc->m_filename->m_buffer
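
When many such errors need to be triaged, the file:line:column location printed after the dump can be extracted with a small helper. The sketch below assumes the message always has the shape shown in the example line above; `parse_location` is a hypothetical name, not part of libgccjit or of this repository.

```rust
/// Extract `(file, line, column)` from a libgccjit error line such as
/// `libgccjit.so: /tmp/something.c:61322:0: error: ...`.
/// Returns `None` when the line carries no location.
fn parse_location(line: &str) -> Option<(String, u32, u32)> {
    let rest = line.strip_prefix("libgccjit.so: ")?;
    let (location, _message) = rest.split_once(": error:")?;
    // Split from the right so that colons inside the path are preserved.
    let mut parts = location.rsplitn(3, ':');
    let column = parts.next()?.parse::<u32>().ok()?;
    let line_no = parts.next()?.parse::<u32>().ok()?;
    let file = parts.next()?.to_string();
    Some((file, line_no, column))
}

fn main() {
    let msg = "libgccjit.so: /tmp/something.c:61322:0: error: in expmed_mode_index, at expmed.h:249";
    match parse_location(msg) {
        Some((file, line, column)) => println!("{}:{}:{}", file, line, column),
        None => println!("no location in this message"),
    }
}
```

An error printed before the dump was enabled has no location, so the helper returns `None` for it.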

To print a debug representation of a tree:

debug_tree(expr);

(defined in print-tree.h)

To print a debug representation of a gimple struct:

debug_gimple_stmt(gimple_struct)

To get the rustc command to run in gdb, add the --verbose flag to cargo build.

To have the correct file paths in gdb instead of /usr/src/debug/gcc/libstdc++-v3/libsupc++/eh_personality.cc, it may help to run the following at the beginning of the gdb session:

set substitute-path /usr/src/debug/gcc /path/to/gcc-repo/gcc

TODO(antoyo): but that's not what I remember I was doing.

How to use a custom-built rustc

  • Build the stage2 compiler and link it as a toolchain (rustup toolchain link debug-current build/x86_64-unknown-linux-gnu/stage2).
  • Clean and rebuild the codegen with debug-current as the toolchain in the file rust-toolchain.

How to install a forked git-subtree

Using git-subtree with rustc requires a patched git to make it work. The PR that is needed is here. Use the following instructions to install it:

git clone git@github.com:tqc/git.git
cd git
git checkout tqc/subtree
make
make install
cd contrib/subtree
make
cp git-subtree ~/bin

Then, do a sync with this command:

PATH="$HOME/bin:$PATH" ~/bin/git-subtree push -P compiler/rustc_codegen_gcc/ ../rustc_codegen_gcc/ sync_branch_name
cd ../rustc_codegen_gcc
git checkout master
git pull
git checkout sync_branch_name
git merge master

TODO: write a script that does the above.

https://rust-lang.zulipchat.com/#narrow/stream/301329-t-devtools/topic/subtree.20madness/near/258877725

How to use mem-trace

rustc needs to be built without jemalloc so that mem-trace can overload malloc: jemalloc is linked statically, so an LD_PRELOAD-ed library won't have a chance to intercept the calls to malloc.

How to build a cross-compiling libgccjit

Building libgccjit

  • Follow these instructions: https://preshing.com/20141119/how-to-build-a-gcc-cross-compiler/ with the following changes:
  • Configure gcc with ../gcc/configure --enable-host-shared --disable-multilib --enable-languages=c,jit,c++ --disable-bootstrap --enable-checking=release --prefix=/opt/m68k-gcc/ --target=m68k-linux --without-headers.
  • Some shells, like fish, don't define the environment variable $MACHTYPE.
  • Add CFLAGS="-Wno-error=attributes -g -O2" at the end of the configure command for building glibc (CFLAGS="-Wno-error=attributes -Wno-error=array-parameter -Wno-error=stringop-overflow -Wno-error=array-bounds -g -O2" for glibc 2.31, which is useful for Debian).

Configuring rustc_codegen_gcc

  • Set TARGET_TRIPLE="m68k-unknown-linux-gnu" in config.sh.
  • Since rustc doesn't support this architecture yet, set it back to TARGET_TRIPLE="mips-unknown-linux-gnu" (or another target having the same attributes). Alternatively, create a target specification file (note that the arch specified in this file must be supported by the rust compiler).
  • Set linker='-Clinker=m68k-linux-gcc'.
  • Set the path to the cross-compiling libgccjit in gcc_path.
  • Comment out the line: context.add_command_line_option("-masm=intel"); in src/base.rs.
  • (might not be necessary) Disable the compilation of libstd.so (and possibly libcore.so?).