rust/compiler/rustc_codegen_llvm
bors 4916e2b9e6 Auto merge of #98393 - michaelwoerister:new-cpp-like-enum-debuginfo, r=wesleywiser
debuginfo: Generalize C++-like encoding for enums.

The updated encoding should be able to handle niche layouts where more than one variant has fields (as introduced in https://github.com/rust-lang/rust/pull/94075).

The new encoding is more uniform as there is no structural difference between direct-tag, niche-tag, and no-tag layouts anymore. The only difference between those cases is that the "dataful" variant in a niche-tag enum will have a `(start, end)` pair denoting the tag range instead of a single value.

The new encoding now also supports 128-bit tags, which occur in at least some standard library types. These tags are represented as `u64` pairs so that debuggers (which don't always have support for 128-bit integers) can reliably deal with them. The downside is that this adds quite a bit of complexity to the encoding and especially to the corresponding NatVis.

The new encoding seems to increase the size of (x86_64-pc-windows-msvc) debuginfo by 10-15%. The size of binaries is not affected (release builds were built with `-Cdebuginfo=2`, numbers are in kilobytes):

EXE | before | after | relative
-- | -- | -- | --
cargo (debug) | 40453 | 40450 | +0%
ripgrep (debug) | 10275 | 10273 | +0%
cargo (release) | 16186 | 16185 | +0%
ripgrep (release) | 4727 | 4726 | +0%

PDB | before | after | relative
-- | -- | -- | --
cargo (debug) | 236524 | 261412 | +11%
ripgrep (debug) | 53140 | 59060 | +11%
cargo (release) | 148516 | 169620 | +14%
ripgrep (release) | 10676 | 11804 | +11%

Given that the new encoding is more general, this is to be expected. Only platforms using C++-like debuginfo are affected -- which currently is only `*-pc-windows-msvc`.

*TODO*
- [x] Properly update documentation
- [x] Add regression tests for new optimized enum layouts as introduced by #94075.

r? `@wesleywiser`
2022-08-15 12:59:53 +00:00
..
src Auto merge of #98393 - michaelwoerister:new-cpp-like-enum-debuginfo, r=wesleywiser 2022-08-15 12:59:53 +00:00
Cargo.toml Add fine-grained LLVM CFI support to the Rust compiler 2022-07-23 10:51:34 -07:00
README.md

The codegen crate contains the code to convert from MIR into LLVM IR, and then from LLVM IR into machine code. In general it contains code that runs towards the end of the compilation process.

For more information about how codegen works, see the rustc dev guide.