Implement `slice::split_once` and `slice::rsplit_once`
Feature gate is `slice_split_once` and tracking issue is #112811. These are equivalents to the existing `str::split_once` and `str::rsplit_once` methods.
Update cargo
5 commits in 794d0a82547f3081044c0aca7b6083733ce51344..6fa6fdc7606cfa664f9bee2fb33ee2ed904f4e88
2023-10-03 23:19:33 +0000 to 2023-10-10 23:06:08 +0000
- test(build): generalize test assertion for non-rustup env (rust-lang/cargo#12804)
- chore: Sort dependency tables (rust-lang/cargo#12803)
- fix(install): Suggest an alternative version on MSRV failure (rust-lang/cargo#12798)
- rustdoc: remove the word "Version" from test cases (rust-lang/cargo#12800)
- Add unsupported lowercase `-z` flag suggestion for `-Z` flag (rust-lang/cargo#12788)
r? ghost
Document `diagnostic_namespace` feature
This adds it to the rust unstable book.
FWIW: I couldn't find a way to serve the book locally (please send help), so I can't check that this renders correctly.
cc `@weiznich`
Add explicit-endian String::from_utf16 variants
This adds the following APIs under `feature(str_from_utf16_endian)`:
```rust
impl String {
pub fn from_utf16le(v: &[u8]) -> Result<String, FromUtf16Error>;
pub fn from_utf16le_lossy(v: &[u8]) -> String;
pub fn from_utf16be(v: &[u8]) -> Result<String, FromUtf16Error>;
pub fn from_utf16be_lossy(v: &[u8]) -> String;
}
```
These are versions of `String::from_utf16` that explicitly take [UTF-16LE and UTF-16BE](https://unicode.org/faq/utf_bom.html#gen7). Notably, we can do better than just the obvious `decode_utf16(v.array_chunks::<2>().copied().map(u16::from_le_bytes)).collect()` in that:
- We handle the case where the byte slice is not an even number of bytes, and
- In the case that the UTF-16 is native endian and the slice is aligned, we can forward to `String::from_utf16`.
If the Unicode Consortium actively defines how to handle character replacement when decoding a UTF-16 bytestream with a trailing odd byte, I was unable to find reference. However, the behavior implemented here is fairly self-evidently correct: replace the single errant byte with the replacement character.
Also consider call and yield as MIR SSA.
The SSA analysis on MIR only considered `Assign` statements as defining a SSA local.
This PR adds assignments as part of a `Call` or `Yield` terminator in that category.
This mainly allows to perform CopyProp on a call return place.
The only subtlety is in the dominance property: the assignment is only complete at the beginning of the target block.
Rollup of 7 pull requests
Successful merges:
- #109422 (rustdoc-search: add impl disambiguator to duplicate assoc items)
- #116250 (On type error of closure call argument, point at earlier calls that affected inference)
- #116444 (add test for const-eval error in dead code during monomorphization)
- #116503 (Update docs for mips target tier demotion.)
- #116559 (Mark `new_in` as `const` for BTree collections)
- #116560 (In smir use `FxIndexMap` to store indexed ids)
- #116574 (Update books)
r? `@ghost`
`@rustbot` modify labels: rollup
In smir use `FxIndexMap` to store indexed ids
Previously we used `vec` for storing indexed types, which is fine for small cases but will lead to huge performance issues when we use `smir` for real world cases.
Addresses https://github.com/rust-lang/project-stable-mir/issues/35
r? ``@oli-obk``
Update docs for mips target tier demotion.
These mips targets were demoted in #113274, but the documentation was not updated. I have also elected to document this in the release notes for 1.72 because I think that should have been included.
On type error of closure call argument, point at earlier calls that affected inference
Mitigate part of https://github.com/rust-lang/rust/issues/71209.
When we encounter a type error on a specific argument of a closure call argument, where the closure's definition doesn't have a type specified, look for other calls of the closure to try and find the specific call that cased that argument to be inferred of the expected type.
```
error[E0308]: mismatched types
--> $DIR/unboxed-closures-type-mismatch.rs:30:18
|
LL | identity(1u16);
| -------- ^^^^ expected `u8`, found `u16`
| |
| arguments to this function are incorrect
|
note: expected because the closure was earlier called with an argument of type `u8`
--> $DIR/unboxed-closures-type-mismatch.rs:29:18
|
LL | identity(1u8);
| -------- ^^^ expected because this argument is of type `u8`
| |
| in this closure call
note: closure parameter defined here
--> $DIR/unboxed-closures-type-mismatch.rs:28:25
|
LL | let identity = |x| x;
| ^
help: change the type of the numeric literal from `u16` to `u8`
|
LL | identity(1u8);
| ~~
```
rustdoc-search: add impl disambiguator to duplicate assoc items
Preview (to see the difference, click the link and pay attention to the specific function that comes up):
| Before | After |
|--|--|
| [`simd<i64>, simd<i64> -> simd<i64>`](https://doc.rust-lang.org/nightly/std/?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) | [`simd<i64>, simd<i64> -> simd<i64>`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=simd%3Ci64%3E%2C%20simd%3Ci64%3E%20-%3E%20simd%3Ci64%3E) |
| [`cow, vec -> bool`](https://doc.rust-lang.org/nightly/std/?search=cow%2C%20vec%20-%3E%20bool) | [`cow, vec -> bool`](https://notriddle.com/rustdoc-demo-html-3/impl-disambiguate-search/std/index.html?search=cow%2C%20vec%20-%3E%20bool)
Helps with #90929
This changes the search results, specifically, when there's more than one impl with an associated item with the same name. For example, the search queries `simd<i8> -> simd<i8>` and `simd<i64> -> simd<i64>` don't link to the same function, but most of the functions have the same names.
This change should probably be FCP-ed, especially since it adds a new anchor link format for `main.js` to handle, so that URLs like `struct.Vec.html#impl-AsMut<[T]>-for-Vec<T,+A>/method.as_mut` redirect to `struct.Vec.html#method.as_mut-2`. It's a strange design, but there are a few reasons for it:
* I'd like to avoid making the HTML bigger. Obviously, fixing this bug is going to add at least a little more data to the search index, but adding more HTML penalises viewers for the benefit of searchers.
* Breaking `struct.Vec.html#method.len` would also be a disappointment.
On the other hand:
* The path-style anchors might be less prone to link rot than the numbered anchors. It's definitely less likely to have URLs that appear to "work", but silently point at the wrong thing.
* This commit arranges the path-style anchor to redirect to the numbered anchor. Nothing stops rustdoc from doing the opposite, making path-style anchors the default and redirecting the "legacy" numbered ones.
### The bug
On the "Before" links, this example search calls for `i64`:
![image](https://github.com/rust-lang/rust/assets/1593513/9431d89d-41dc-4f68-bbb1-3e2704a973d2)
But if I click any of the results, I get `f64` instead.
![image](https://github.com/rust-lang/rust/assets/1593513/6d89c692-1847-421a-84d9-22e359d9cf82)
The PR fixes this problem by adding enough information to the search result `href` to disambiguate methods with different types but the same name.
More detailed description of the problem at:
https://github.com/rust-lang/rust/pull/109422#issuecomment-1491089293
> When a struct/enum/union has multiple impls with different type parameters, it can have multiple methods that have the same name, but which are on different impls. Besides Simd, [Any](https://doc.rust-lang.org/nightly/std/any/trait.Any.html?search=any%3A%3Adowncast) also demonstrates this pattern. It has three methods named `downcast`, on three different impls.
>
> When that happens, it presents a challenge in linking to the method. Normally we link like `#method.foo`. When there are multiple `foo`, we number them like `#method.foo`, `#method.foo-1`, `#method.foo-2`, etc.
>
> It also presents a challenge for our search code. Currently we store all the variants in the index, but don’t have any way to generate unambiguous URLs in the results page, or to distinguish them in the SERP.
>
> To fix this, we need three things:
>
> 1. A fragment format that fully specifies the impl type parameters when needed to disambiguate (`#impl-SimdOrd-for-Simd<i64,+LANES>/method.simd_max`)
> 2. A search index that stores methods with enough information to disambiguate the impl they were on.
> 3. A search results interface that can display multiple methods on the same type with the same name, when appropriate OR a disambiguation landing section on item pages?
>
> For reviewers: it can be hard to see the new fragment format in action since it immediately gets rewritten to the numbered form.
coverage: Unbox and simplify `bcb_filtered_successors`
This is a small cleanup in the coverage instrumentor's graph-building code.
---
This function already has access to the MIR body, so instead of taking a reference to a terminator, it's simpler and easier to pass in a basic block index.
There is no need to box the returned iterator if we instead add appropriate lifetime captures, and make `short_circuit_preorder` generic over the type of iterator it expects.
We can also greatly simplify the function's implementation by observing that the only difference between its two cases is whether we take all of a BB's successors, or just the first one.
---
`@rustbot` label +A-code-coverage
use env variable to control thread ids in rustc_log
Currently, when parallel rustc is enabled, even if the number of threads is 1, the thread ID will be included before all the logs.
E.g.
`WARN rustc_mir_build::thir::pattern::const_to_pat ...`
=>
`2:rustcWARN rustc_mir_build::thir::pattern::const_to_pat ...`
This makes the logs confusing and results in inconsistent UI test results for serial and parallel rustc. Therefore I think we should let users decide whether thread id information is needed through explicit control.
miri: make NaN generation non-deterministic
This implements the [LLVM semantics for NaN generation](https://llvm.org/docs/LangRef.html#behavior-of-floating-point-nan-values). I will soon submit an RFC to make this also officially the Rust semantics, but it has been our de-facto semantics for a long time so there's no reason Miri has to wait for that RFC. This PR just better aligns Miri with codegen.
This PR does that just for the operations that have MIR primitives; a future PR will adjust the intrinsics.
coverage: Separate initial span extraction from span processing
One of the main subtasks of coverage instrumentation is looking through MIR to determine a list of source code spans that require coverage counters.
That task is in turn subdivided into a few main steps:
- Getting the initial spans from MIR statements/terminators
- Processing the list of spans to merge or truncate nearby spans as necessary
- Grouping the processed spans by their corresponding coverage graph node
---
This PR enforces a firmer separation between the first two steps (span extraction and span processing), which ends up slightly simplifying both steps, since they don't need to deal with state that is only meaningful for the other step.
---
`@rustbot` label +A-code-coverage
Improve handling of assertion failures with very long conditions
It's not perfectly clear what the best behaviour is here, but I think this is an improvement.
r? `@matthewjasper`
cc `@m-ou-se`
This function already has access to the MIR body, so instead of taking a
reference to a terminator, it's simpler and easier to pass in a basic block
index.
There is no need to box the returned iterator if we instead add appropriate
lifetime captures, since `short_circuit_preorder` is now generic over the type
of iterator it expects.
We can also greatly simplify the function's implementation by observing that
the only difference between its two cases is whether we take all of a BB's
successors, or just the first one.
There are several that are unused and can be removed.
And there are some calls to `to_string`, which can be expressed more
nicely as a `foo_to_string` call, and then `to_string` need not be
`pub`. (This requires adding `pat_to_string`).
This enum was mainly needed to track the precise origin of a span in MIR, for
debug printing purposes. Since the old debug code was removed in #115962, we
can replace it with just the span itself.