76c500ec6c
Let a portion of DefPathHash uniquely identify the DefPath's crate. This allows to directly map from a `DefPathHash` to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with `DefPathHash` instead of `DefId` a lot. It also allows to reliably and cheaply check for `DefPathHash` collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code. The following new piece of documentation describes the most interesting aspects of the changes: ```rust /// A `DefPathHash` is a fixed-size representation of a `DefPath` that is /// stable across crate and compilation session boundaries. It consists of two /// separate 64-bit hashes. The first uniquely identifies the crate this /// `DefPathHash` originates from (see [StableCrateId]), and the second /// uniquely identifies the corresponding `DefPath` within that crate. Together /// they form a unique identifier within an entire crate graph. /// /// There is a very small chance of hash collisions, which would mean that two /// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation /// with such a hash collision would very probably lead to an ICE and, in the /// worst case, to a silent mis-compilation. The compiler therefore actively /// and exhaustively checks for such hash collisions and aborts compilation if /// it finds one. /// /// `DefPathHash` uses 64-bit hashes for both the crate-id part and the /// crate-internal part, even though it is likely that there are many more /// `LocalDefId`s in a single crate than there are individual crates in a crate /// graph. Since we use the same number of bits in both cases, the collision /// probability for the crate-local part will be quite a bit higher (though /// still very small). /// /// This imbalance is not by accident: A hash collision in the /// crate-local part of a `DefPathHash` will be detected and reported while /// compiling the crate in question. Such a collision does not depend on /// outside factors and can be easily fixed by the crate maintainer (e.g. by /// renaming the item in question or by bumping the crate version in a harmless /// way). /// /// A collision between crate-id hashes on the other hand is harder to fix /// because it depends on the set of crates in the entire crate graph of a /// compilation session. Again, using the same crate with a different version /// number would fix the issue with a high probability -- but that might be /// easier said then done if the crates in questions are dependencies of /// third-party crates. /// /// That being said, given a high quality hash function, the collision /// probabilities in question are very small. For example, for a big crate like /// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there /// is a probability of roughly 1 in 14,750,000,000 of a crate-internal /// collision occurring. For a big crate graph with 1000 crates in it, there is /// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision. ``` Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book? r? `@pnkfelix` (feel free to re-assign) |
||
---|---|---|
.. | ||
rustc | ||
rustc_apfloat | ||
rustc_arena | ||
rustc_ast | ||
rustc_ast_lowering | ||
rustc_ast_passes | ||
rustc_ast_pretty | ||
rustc_attr | ||
rustc_builtin_macros | ||
rustc_codegen_cranelift | ||
rustc_codegen_llvm | ||
rustc_codegen_ssa | ||
rustc_data_structures | ||
rustc_driver | ||
rustc_error_codes | ||
rustc_errors | ||
rustc_expand | ||
rustc_feature | ||
rustc_fs_util | ||
rustc_graphviz | ||
rustc_hir | ||
rustc_hir_pretty | ||
rustc_incremental | ||
rustc_index | ||
rustc_infer | ||
rustc_interface | ||
rustc_lexer | ||
rustc_lint | ||
rustc_lint_defs | ||
rustc_llvm | ||
rustc_macros | ||
rustc_metadata | ||
rustc_middle | ||
rustc_mir | ||
rustc_mir_build | ||
rustc_parse | ||
rustc_parse_format | ||
rustc_passes | ||
rustc_plugin_impl | ||
rustc_privacy | ||
rustc_query_impl | ||
rustc_query_system | ||
rustc_resolve | ||
rustc_save_analysis | ||
rustc_serialize | ||
rustc_session | ||
rustc_span | ||
rustc_symbol_mangling | ||
rustc_target | ||
rustc_trait_selection | ||
rustc_traits | ||
rustc_ty_utils | ||
rustc_type_ir | ||
rustc_typeck |