b3f1379509
Encode hashes as bytes, not varint In a few places, we store hashes as `u64` or `u128` and then apply `derive(Decodable, Encodable)` to the enclosing struct/enum. It is more efficient to encode hashes directly than try to apply some varint encoding. This PR adds two new types `Hash64` and `Hash128` which are produced by `StableHasher` and replace every use of storing a `u64` or `u128` that represents a hash. Distribution of the byte lengths of leb128 encodings, from `x build --stage 2` with `incremental = true` Before: ``` ( 1) 373418203 (53.7%, 53.7%): 1 ( 2) 196240113 (28.2%, 81.9%): 3 ( 3) 108157958 (15.6%, 97.5%): 2 ( 4) 17213120 ( 2.5%, 99.9%): 4 ( 5) 223614 ( 0.0%,100.0%): 9 ( 6) 216262 ( 0.0%,100.0%): 10 ( 7) 15447 ( 0.0%,100.0%): 5 ( 8) 3633 ( 0.0%,100.0%): 19 ( 9) 3030 ( 0.0%,100.0%): 8 ( 10) 1167 ( 0.0%,100.0%): 18 ( 11) 1032 ( 0.0%,100.0%): 7 ( 12) 1003 ( 0.0%,100.0%): 6 ( 13) 10 ( 0.0%,100.0%): 16 ( 14) 10 ( 0.0%,100.0%): 17 ( 15) 5 ( 0.0%,100.0%): 12 ( 16) 4 ( 0.0%,100.0%): 14 ``` After: ``` ( 1) 372939136 (53.7%, 53.7%): 1 ( 2) 196240140 (28.3%, 82.0%): 3 ( 3) 108014969 (15.6%, 97.5%): 2 ( 4) 17192375 ( 2.5%,100.0%): 4 ( 5) 435 ( 0.0%,100.0%): 5 ( 6) 83 ( 0.0%,100.0%): 18 ( 7) 79 ( 0.0%,100.0%): 10 ( 8) 50 ( 0.0%,100.0%): 9 ( 9) 6 ( 0.0%,100.0%): 19 ``` The remaining 9 or 10 and 18 or 19 are `u64` and `u128` respectively that have the high bits set. As far as I can tell these are coming primarily from `SwitchTargets`.
The codegen
crate contains the code to convert from MIR into LLVM IR,
and then from LLVM IR into machine code. In general it contains code
that runs towards the end of the compilation process.
For more information about how codegen works, see the rustc dev guide.