Let a portion of DefPathHash uniquely identify the DefPath's crate.
This allows to directly map from a `DefPathHash` to the crate it originates from, without constructing side tables to do that mapping -- something that is useful for incremental compilation where we deal with `DefPathHash` instead of `DefId` a lot.
It also allows to reliably and cheaply check for `DefPathHash` collisions which allows the compiler to gracefully abort compilation instead of running into a subsequent ICE at some random place in the code.
The following new piece of documentation describes the most interesting aspects of the changes:
```rust
/// A `DefPathHash` is a fixed-size representation of a `DefPath` that is
/// stable across crate and compilation session boundaries. It consists of two
/// separate 64-bit hashes. The first uniquely identifies the crate this
/// `DefPathHash` originates from (see [StableCrateId]), and the second
/// uniquely identifies the corresponding `DefPath` within that crate. Together
/// they form a unique identifier within an entire crate graph.
///
/// There is a very small chance of hash collisions, which would mean that two
/// different `DefPath`s map to the same `DefPathHash`. Proceeding compilation
/// with such a hash collision would very probably lead to an ICE and, in the
/// worst case, to a silent mis-compilation. The compiler therefore actively
/// and exhaustively checks for such hash collisions and aborts compilation if
/// it finds one.
///
/// `DefPathHash` uses 64-bit hashes for both the crate-id part and the
/// crate-internal part, even though it is likely that there are many more
/// `LocalDefId`s in a single crate than there are individual crates in a crate
/// graph. Since we use the same number of bits in both cases, the collision
/// probability for the crate-local part will be quite a bit higher (though
/// still very small).
///
/// This imbalance is not by accident: A hash collision in the
/// crate-local part of a `DefPathHash` will be detected and reported while
/// compiling the crate in question. Such a collision does not depend on
/// outside factors and can be easily fixed by the crate maintainer (e.g. by
/// renaming the item in question or by bumping the crate version in a harmless
/// way).
///
/// A collision between crate-id hashes on the other hand is harder to fix
/// because it depends on the set of crates in the entire crate graph of a
/// compilation session. Again, using the same crate with a different version
/// number would fix the issue with a high probability -- but that might be
/// easier said then done if the crates in questions are dependencies of
/// third-party crates.
///
/// That being said, given a high quality hash function, the collision
/// probabilities in question are very small. For example, for a big crate like
/// `rustc_middle` (with ~50000 `LocalDefId`s as of the time of writing) there
/// is a probability of roughly 1 in 14,750,000,000 of a crate-internal
/// collision occurring. For a big crate graph with 1000 crates in it, there is
/// a probability of 1 in 36,890,000,000,000 of a `StableCrateId` collision.
```
Given the probabilities involved I hope that no one will ever actually see the error messages. Nonetheless, I'd be glad about some feedback on how to improve them. Should we create a GH issue describing the problem and possible solutions to point to? Or a page in the rustc book?
r? `@pnkfelix` (feel free to re-assign)
Use u32 over Option<u32> in DebugLoc
~~Changes `Option<u32>` fields in `DebugLoc` to `Option<NonZeroU32>`. Since the respective fields (`line` and `col`) are guaranteed to be 1-based, this layout optimization is a freebie.~~
EDIT: Changes `Option<u32>` fields in `DebugLoc` to `u32`. As `@bugadani` pointed out, an `Option<NonZeroU32>` is probably an unnecessary layer of abstraction since the `None` variant is always used as `UNKNOWN_LINE_NUMBER` (which is just `0`). Also, `SourceInfo` in `metadata.rs` already uses a `u32` instead of an `Option<u32>` to encode the same information, so I think this change is warranted.
Since `@jyn514` raised some concerns over measuring performance in a similar PR (#82255), does this need a perf run?
Sync rustc_codegen_cranelift
The main highlight of this sync is removal of support for the old x86 Cranelift backend. This made it possible to use native atomic instructions rather than hackishly using a global mutex. 128bit integer support has also seen a few bugfixes and performance improvements. And finally I have formatted everything using the same rustfmt config as the rest of this repo.
r? ````@ghost````
````@rustbot```` label +A-codegen +A-cranelift +T-compiler
rustdoc: Add an unstable option to print all unversioned files
This allows sharing those files between different doc invocations
without having to know their names ahead of time.
Helps with https://github.com/rust-lang/docs.rs/issues/1302.
r? ````@GuillaumeGomez```` cc ````@pietroalbini```` ````@Nemo157````
improve offset_from docs
`@thomcc` pointed out that the current docs leave it kind of unclear how one can satisfy the "no wrapping around `isize` or the address space" requirement of `offset_from`, so make the docs clearer about that.
FWIW, I don't think I entirely agree with that second paragraph about large objects (that I left mostly unchanged here). LLVM, to my knowledge, fundamentally assumes that all allocations fit into an `isize::MAX`. So in that sense creating a larger allocation is simply UB. I would expect a guarantee that Rust heap allocation methods will never return allocations larger than `isize::MAX` (or rather, Rust heap allocation methods should require that the `Layout` is no larger than `isize::MAX`). However, I cannot find any such requirement documented currently. Large allocations are not mentioned at all in the allocator docs, which is quite surprising -- even if we say that such allocations are not insta-UB (which I think is incompatible with LLVM), they are still extremely footgunny since `ptr::offset`/`ptr::add` do not support offsetting by more than `isize::MAX` bytes.
Furthermore, the allocator docs don't even say anything about allocations wrapping around the address space. But that is certainly something allocators must ensure never happens; we cannot expect clients to defend against this.
Cc `@rust-lang/wg-allocators`
Cleanup rustdoc warnings
## Clean up error reporting for deprecated passes
Using `error!` here goes all the way back to the original commit, https://github.com/rust-lang/rust/pull/8540. I don't see any reason to use logging; rustdoc should use diagnostics wherever possible. See https://github.com/rust-lang/rust/pull/81932#issuecomment-785291244 for further context.
- Use spans for deprecated attributes
- Use a proper diagnostic for unknown passes, instead of error logging
- Add tests for unknown passes
- Improve some wording in diagnostics
## Report that `doc(plugins)` doesn't work using diagnostics instead of `eprintln!`
This also adds a test for the output.
This was added in https://github.com/rust-lang/rust/pull/52194. I don't see any particular reason not to use diagnostics here, I think it was just missed in https://github.com/rust-lang/rust/pull/50541.
Improve transmute docs with further clarifications
Closes#82493.
Please let me know if any of the new wording sounds off, English is not my mother tongue.
Remove RefCell around `module_trait_cache`
This builds on https://github.com/rust-lang/rust/pull/82018 and should not be merged before.
## Don't require a `DocContext` for `report_diagnostic`
This is needed for the next commit, which needs mutable access to the `cx` from
within the `decorate` closure.
- Change `as_local_hir_id` to an associated function, since it only
needs a `TyCtxt`
- Change `source_span_for_markdown_range` to only take a `TyCtxt`
## Remove RefCell around module_trait_cache
This is mostly just changing lots of functions from `&DocContext` to `&mut DocContext`.
Prevent specialized ZipImpl from calling `__iterator_get_unchecked` twice with the same index
Fixes#82291
It's open for review, but conflicts with #82289, wait before merging. The conflict involves only the new test, so it should be rather trivial to fix.
Make some Option, Result methods unstably const
The following methods are now unstably const:
- Option::transpose
- Option::flatten
- Result::flatten
While some methods for could likely be made `const` in the future, nearly all of them require something to be dropped at compile-time, which isn't currently supported. The functions listed above should have a trivial path to stabilization.
Change built-in kernel targets to be os = none throughout
Whether for Rust's own `target_os`, LLVM's triples, or GNU config's, the
OS-related have fields have been for code running *on* that OS, not code
hat is *part* of the OS.
The difference is huge, as syscall interfaces are nothing like
freestanding interfaces. Kernels are (hypervisors and other more exotic
situations aside) freestanding programs that use the interfaces provided
by the hardware. It's *those* interfaces, the ones external to the
program being built and its software dependencies, that are the content
of the target.
For the Linux Kernel in particular, `target_env: "gnu"` is removed for
the same reason: that `-gnu` refers to glibc or GNU/linux, neither of
which applies to the kernel itself.
Relates to #74247
Move check only relevant in error case out of critical path
Move the check for potentially forgotten `return` in a tail expression
of arbitrary expressions into the coercion error branch to avoid
computing unncessary coercion checks on successful code.
Follow up to #81458.
Rollup of 7 pull requests
Successful merges:
- #80845 (Make ItemKind::ExternCrate looks like hir::ItemKind::ExternCrate to make transition over hir::ItemKind simpler)
- #82708 (Warn on `#![doc(test(...))]` on items other than the crate root and use future incompatible lint)
- #82714 (Detect match arm body without braces)
- #82736 (Bump optimization from mir_opt_level 2 to 3 and 3 to 4 and make "release" be level 2 by default)
- #82782 (Make rustc shim's verbose output include crate_name being compiled.)
- #82797 (Update tests names to start with `issue-`)
- #82809 (rustdoc: Use substrings instead of split to grab enum variant paths)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rustdoc: Use substrings instead of split to grab enum variant paths
Both versions are about equally readable, but this version avoids scanning the entire path and building an intermediate array (`split()` in Rust is a lazy iterator, but not in JavaScript).
Make rustc shim's verbose output include crate_name being compiled.
This change is mainly motivated by an issue with the environment printing I added in PR 82403: multiple rustc invocations progress in parallel, and the environment output, spanning multiple lines, gets interleaved in ways make it difficult to extra the enviroment settings.
(This aforementioned difficulty is more of a hiccup than an outright show-stopper, because the environment variables tend to be the same for all of the rustc invocations, so it doesn't matter too much if one mixes up which lines one is looking at. But still: Better to fix it.)
Warn on `#![doc(test(...))]` on items other than the crate root and use future incompatible lint
Part of #82672.
This PR does multiple things:
* Create a new `INVALID_DOC_ATTRIBUTE` lint which is also "future incompatible", allowing us to use it as a warning for the moment until it turns (eventually) into a hard error.
* Use this link when `#![doc(test(...))]` isn't used at the crate level.
* Make #82702 use this new lint as well.
r? ``@jyn514``
Make ItemKind::ExternCrate looks like hir::ItemKind::ExternCrate to make transition over hir::ItemKind simpler
It was surprisingly difficult to make this change, mostly because of two issues:
* We now store the `ExternCrate` name in the parent struct (`clean::Item`), which forced me to modify the json conversion code a bit more than expected.
* The second problem was that, since we now have a `Some(name)`, it was trying to render it, ending up in a panic because we ended up in a `unreachable` statement. The solution was simply to add `!item.is_extern_crate()` in `formats::renderer` before calling `cx.item(item, &cache)?;`.
I'll continue to replace all the `clean::ItemKind` variants one by one until it looks exactly like `hir::ItemKind`. Then we'll simply discard the rustdoc type. Once this done, we'll be able to discard `clean::Item` too to use `hir::Item`.
r? ``@jyn514``
Improve slice.binary_search_by()'s best-case performance to O(1)
This PR aimed to improve the [slice.binary_search_by()](https://doc.rust-lang.org/std/primitive.slice.html#method.binary_search_by)'s best-case performance to O(1).
# Noticed
I don't know why the docs of `binary_search_by` said `"If there are multiple matches, then any one of the matches could be returned."`, but the implementation isn't the same thing. Actually, it returns the **last one** if multiple matches found.
Then we got two options:
## If returns the last one is the correct or desired result
Then I can rectify the docs and revert my changes.
## If the docs are correct or desired result
Then my changes can be merged after fully reviewed.
However, if my PR gets merged, another issue raised: this could be a **breaking change** since if multiple matches found, the returning order no longer the last one instead of it could be any one.
For example:
```rust
let mut s = vec![0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
let num = 1;
let idx = s.binary_search(&num);
s.insert(idx, 2);
// Old implementations
assert_eq!(s, [0, 1, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
// New implementations
assert_eq!(s, [0, 1, 1, 1, 2, 1, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
```
# Benchmarking
**Old implementations**
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 4)
test slice::binary_search_l1_with_dups ... bench: 59 ns/iter (+/- 3)
test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 5)
test slice::binary_search_l2_with_dups ... bench: 77 ns/iter (+/- 17)
test slice::binary_search_l3 ... bench: 183 ns/iter (+/- 23)
test slice::binary_search_l3_with_dups ... bench: 185 ns/iter (+/- 19)
```
**New implementations (1)**
Implemented by this PR.
```rust
if cmp == Equal {
return Ok(mid);
} else if cmp == Less {
base = mid
}
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 58 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 4)
test slice::binary_search_l2 ... bench: 76 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 6)
test slice::binary_search_l3 ... bench: 200 ns/iter (+/- 30)
test slice::binary_search_l3_with_dups ... bench: 157 ns/iter (+/- 6)
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 8)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 2)
test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 2)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 2)
test slice::binary_search_l3 ... bench: 198 ns/iter (+/- 21)
test slice::binary_search_l3_with_dups ... bench: 158 ns/iter (+/- 11)
```
**New implementations (2)**
Suggested by `@nbdd0121` in [comment](https://github.com/rust-lang/rust/pull/74024#issuecomment-665430239).
```rust
base = if cmp == Greater { base } else { mid };
if cmp == Equal { break }
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 59 ns/iter (+/- 7)
test slice::binary_search_l1_with_dups ... bench: 37 ns/iter (+/- 5)
test slice::binary_search_l2 ... bench: 75 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench: 56 ns/iter (+/- 3)
test slice::binary_search_l3 ... bench: 195 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 7)
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1 ... bench: 57 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench: 38 ns/iter (+/- 2)
test slice::binary_search_l2 ... bench: 77 ns/iter (+/- 11)
test slice::binary_search_l2_with_dups ... bench: 57 ns/iter (+/- 4)
test slice::binary_search_l3 ... bench: 194 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench: 151 ns/iter (+/- 18)
```
I run some benchmarking testings against on two implementations. The new implementation has a lot of improvement in duplicates cases, while in `binary_search_l3` case, it's a little bit slower than the old one.
Both versions are about equally readable, but this version avoids scanning
the entire path and building an intermediate array (`split()` in Rust is
a lazy iterator, but not in JavaScript).
1. I added `--test` based on review feedback from simulacrum: I decided I would
rather include such extra context than get confused later on by its absence.
(However, I chose to encode it differently than how `[RUSTC-TIMING]` does... I
don't have much basis for doing so, other than `--test` to me more directly
reflects what it came from.)
2. I also decided to include `[RUSTC-SHIM]` at start of all of these lines
driven by the verbosity level, to make to clear where these lines of text
originate from. (Basically, I skimmed over the output and realized that a casual
observer might not be able to tell where this huge set of new lines were coming
from.)