Use more slice patterns inside the compiler
Nothing super noteworthy. Just replacing the common 'fragile' pattern of "length check followed by indexing or unwrap" with slice patterns for legibility and 'robustness'.
r? ghost
use stable sort to sort multipart diagnostics
I think a stable sort should be used to sort the different parts of a multipart selection. The current unstable sort uses the text of the suggestion as a tie-breaker. That just doesn't seem right, and the order of the input is a better choice I think, because it gives the diagnostic author more control.
This came up when I was building a suggestion where
```rust
fn foo() {}
```
must be turned into an unsafe function, and an attribute must be added
```rust
#[target_feature(enable = "...")]
unsafe fn foo() {}
```
In this example, the two suggestions occur at the same position, but the order is extremely important: unsafe must come after the attribute. But the situation changes if there is a pub/pub(crate), and if the unsafe is already present. It just out that because of the suggestion text, there is no way for me to order the suggestions correctly.
This change probably should be tested, but are there tests of the diagnostics code itself in the tests?
r? ```@estebank```
Some `const { }` asserts for #128200
The correctness of code in #128200 relies on an array being sorted (so that it can be used in binary search later), which is currently enforced with `// tidy-alphabetical` (and characters being written in `\u{XXXX}` form), as well as lack of duplicate entries with conflicting keys, which is not currently enforced.
This PR changes it to using a `const{ }` assertion (and also checks for duplicate entries). Sadly, we cannot use the recently-stabilized `is_sorted_by_key` here, because it is not const (but it would not allow us to check for uniqueness anyways). Instead, let's write a manual loop.
Alternative approach (perfect hash function): #128463
r? `@ghost`
On short error format, append primary span label to message
The `error-format=short` output only displays the path, error code and main error message all in the same line. We now add the primary span label as well after the error message, to provide more context.
The `error-format=short` output only displays the path, error code and
main error message all in the same line. We now add the primary span label
as well after the error message, to provide more context.
Change output normalization logic to be linear against size of output
Modify the rendered output normalization routine to scan each character *once* and construct a `String` to be printed out to the terminal *once*, instead of using `String::replace` in a loop multiple times. The output doesn't change, but the time spent to prepare a diagnostic is now faster (or rather, closer to what it was before #127528).
When a suggestion part is for already present code, do not highlight it. If after that there are no highlights left, do not show the suggestion at all.
Fix clippy lint suggestion incorrectly treated as `span_help`.
Scan strings to be normalized for printing in a linear scan and collect
the resulting `String` only once.
Use a binary search when looking for chars to be replaced, instead of a
`HashMap::get`.
Replace ASCII control chars with Unicode Control Pictures
Replace ASCII control chars like `CR` with Unicode Control Pictures like `␍`:
```
error: bare CR not allowed in doc-comment
--> $DIR/lex-bare-cr-string-literal-doc-comment.rs:3:32
|
LL | /// doc comment with bare CR: '␍'
| ^
```
Centralize the checking of unicode char width for the purposes of CLI display in one place. Account for the new replacements. Remove unneeded tracking of "zero-width" unicode chars, as we calculate these in the `SourceMap` as needed now.
We already point these out quite aggressively, telling people not to use them, but would normally be rendered as nothing. Having them visible will make it easier for people to actually deal with them.
```
error: unicode codepoint changing visible direction of text present in literal
--> $DIR/unicode-control-codepoints.rs:26:22
|
LL | println!("{:?}", '�');
| ^-^
| ||
| |'\u{202e}'
| this literal contains an invisible unicode text flow control codepoint
|
= note: these kind of unicode codepoints change the way text flows on applications that support them, but can cause confusion because they change the order of characters on the screen
= help: if their presence wasn't intentional, you can remove them
help: if you want to keep them but make them visible in your source code, you can escape them
|
LL | println!("{:?}", '\u{202e}');
| ~~~~~~~~
```
vs the previous
```
error: unicode codepoint changing visible direction of text present in literal
--> $DIR/unicode-control-codepoints.rs:26:22
|
LL | println!("{:?}", '');
| ^-
| ||
| |'\u{202e}'
| this literal contains an invisible unicode text flow control codepoint
|
= note: these kind of unicode codepoints change the way text flows on applications that support them, but can cause confusion because they change the order of characters on the screen
= help: if their presence wasn't intentional, you can remove them
help: if you want to keep them but make them visible in your source code, you can escape them
|
LL | println!("{:?}", '\u{202e}');
| ~~~~~~~~
```
No longer track "zero-width" chars in `SourceMap`, read directly from the line when calculating the `display_col` of a `BytePos`. Move `char_width` to `rustc_span` and use it from the emitter.
This change allows the following to properly align in terminals (depending on the font, the replaced control codepoints are rendered as 1 or 2 width, on my terminal they are rendered as 1, on VSCode text they are rendered as 2):
```
error: this file contains an unclosed delimiter
--> $DIR/issue-68629.rs:5:17
|
LL | ␜␟ts␀![{i
| -- unclosed delimiter
| |
| unclosed delimiter
LL | ␀␀ fn rݻoa>rݻm
| ^
```