Emit only one nbsp error per file
Fixes#106101.
See https://github.com/rust-lang/rust/issues/106098 for an explanation of how someone would end up with a large number of these nbsp characters in their source code, which is why I think rustc needs to handle this specific case in a friendlier way.
Emit a single error for contiguous sequences of unknown tokens
Closes#106101
On encountering a sequence of identical source characters which are unknown tokens, note the amount of subsequent characters and advance past them silently. The old behavior was to emit an error and 'help' note for every single one.
`@rustbot` label +A-diagnostics +A-parser
Recover from where clauses placed before tuple struct bodies
Open to any suggestions regarding the phrasing of the diagnostic.
Fixes#100790.
`@rustbot` label A-diagnostics
r? diagnostics
Rollup of 9 pull requests
Successful merges:
- #104531 (Provide a better error and a suggestion for `Fn` traits with lifetime params)
- #105899 (`./x doc library --open` opens `std`)
- #106190 (Account for multiple multiline spans with empty padding)
- #106202 (Trim more paths in obligation types)
- #106234 (rustdoc: simplify settings, help, and copy button CSS by not reusing)
- #106236 (docs/test: add docs and a UI test for `E0514` and `E0519`)
- #106259 (Update Clippy)
- #106260 (Fix index out of bounds issues in rustdoc)
- #106263 (Formatter should not try to format non-Rust files)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Currently, given `Fn`-family traits with lifetime params like
`Fn<'a>(&'a str) -> bool`, many unhelpful errors show up. These are a
bit confusing.
This commit allows these situations to suggest simply using
higher-ranked trait bounds like `for<'a> Fn(&'a str) -> bool`.
Properly calculate best failure in macro matching
Previously, we used spans. This was not good. Sometimes, the span of the token that failed to match may come from a position later in the file which has been transcribed into a token stream way earlier in the file. If precisely this token fails to match, we think that it was the best match because its span is so high, even though other arms might have gotten further in the token stream.
We now try to properly use the location in the token stream.
This needs a little cleanup as the `best_failure` field is getting out of hand but it should be mostly good to go. I hope I didn't violate too many abstraction boundaries..
Always suggest as `MachineApplicable` in `recover_intersection_pat`
This resolves one FIXME in `recover_intersection_pat` by always applying `MachineApplicable` when suggesting, as `bindings_after_at` is now stable.
This also separates a test to apply `// run-rustfix`.
Signed-off-by: Yuki Okushi <jtitor@2k36.org>
Previously, we used spans. This was not good. Sometimes, the span of the
token that failed to match may come from a position later in the file
which has been transcribed into a token stream way earlier in the file.
If precisely this token fails to match, we think that it was the best
match because its span is so high, even though other arms might have
gotten further in the token stream.
We now try to properly use the location in the token stream.
Remove `token::Lit` from `ast::MetaItemLit`.
Currently `ast::MetaItemLit` represents the literal kind twice. This PR removes that redundancy. Best reviewed one commit at a time.
r? `@petrochenkov`
propagate the error from parsing enum variant to the parser and emit out
While parsing enum variant, the error message always disappear
Because the error message that emit out is from main error of parser
The information of enum variant disappears while parsing enum variant with error
We only check the syntax of expecting token, i.e, in case https://github.com/rust-lang/rust/issues/103869
It will error it without telling the message that this error is from pasring enum variant.
Propagate the sub-error from parsing enum variant to the main error of parser by chaining it with map_err
Check the sub-error before emitting the main error of parser and attach it.
Fix https://github.com/rust-lang/rust/issues/103869
These two methods both produce a `MetaItemLit`, and then some of the
call sites convert the `MetaItemLit` to a `token::Lit` with
`as_token_lit`.
This commit parameterises these two methods with a `mk_lit_char`
closure, which can be used to produce either `MetaItemLit` or
`token::Lit` directly as necessary.
Fix ICE on invalid variable declarations in macro calls
This fixes ICE that happens with invalid variable declarations in macro calls like:
```rust
macro_rules! m { ($s:stmt) => {} }
m! { var x }
m! { auto x }
m! { mut x }
```
Found this is because of not collecting tokens on recovery, so I changed to force collect them.
Fixes https://github.com/rust-lang/rust/issues/103529.
Previously, the `recover_local_after_let` function was called from the
body of the `recover_stmt_local` function. Unifying these two functions
make it more simple and more readable.
Because the error message that emit out is from main error of parser
The information of enum variant disappears while parsing enum variant with error
We only check the syntax of expecting token, i.e, in case #103869
It will error it without telling the message that this error is from pasring enum variant.
Propagate the sub-error from parsing enum variant to the main error of parser by chaining it with map_err
Check the sub-error before emitting the main error of parser and attach it.
Fix#103869
`token::Lit` contains a `kind` field that indicates what kind of literal
it is. `ast::MetaItemLit` currently wraps a `token::Lit` but also has
its own `kind` field. This means that `ast::MetaItemLit` encodes the
literal kind in two different ways.
This commit changes `ast::MetaItemLit` so it no longer wraps
`token::Lit`. It now contains the `symbol` and `suffix` fields from
`token::Lit`, but not the `kind` field, eliminating the redundancy.
Lower them into a single item with multiple resolutions instead.
This also allows to remove additional `NodId`s and `DefId`s related to those additional items.
`check_builtin_attribute` calls `parse_meta` to convert an `Attribute`
to a `MetaItem`, which it then checks. However, many callers of
`check_builtin_attribute` start with a `MetaItem`, and then convert it
to an `Attribute` by calling `cx.attribute(meta_item)`. This `MetaItem`
to `Attribute` to `MetaItem` conversion is silly.
This commit adds a new function `check_builtin_meta_item`, which can be
called instead from these call sites. `check_builtin_attribute` also now
calls it. The commit also renames `check_meta` as `check_attr` to better
match its arguments.
Use `as_deref` in compiler (but only where it makes sense)
This simplifies some code :3
(there are some changes that are not exacly `as_deref`, but more like "clever `Option`/`Result` method use")
Split `MacArgs` in two.
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`), where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them having partial overlap. I find this makes the code hard to read. It also leads to various unreachable code paths, and allows invalid values (such as accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and disallows the invalid values.
r? `@petrochenkov`
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's
used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all
three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`),
where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them
having partial overlap. I find this makes the code hard to read. It also leads
to various unreachable code paths, and allows invalid values (such as
accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is
now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro
case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and
disallows the invalid values.
Rollup of 8 pull requests
Successful merges:
- #101162 (Migrate rustc_resolve to use SessionDiagnostic, part # 1)
- #103386 (Don't allow `CoerceUnsized` into `dyn*` (except for trait upcasting))
- #103405 (Detect incorrect chaining of if and if let conditions and recover)
- #103594 (Fix non-associativity of `Instant` math on `aarch64-apple-darwin` targets)
- #104006 (Add variant_name function to `LangItem`)
- #104494 (Migrate GUI test to use functions)
- #104516 (rustdoc: clean up sidebar width CSS)
- #104550 (fix a typo)
Failed merges:
- #104554 (Use `ErrorGuaranteed::unchecked_claim_error_was_emitted` less)
r? `@ghost`
`@rustbot` modify labels: rollup
Use `token::Lit` in `ast::ExprKind::Lit`.
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals are lowered when HIR is crated. Attribute literals are lowered during parsing.
r? `@petrochenkov`
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.
This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
Recover from function pointer types with generic parameter list
Give a more helpful error when encountering function pointer types with a generic parameter list like `fn<'a>(&'a str) -> bool` or `fn<T>(T) -> T` and suggest moving lifetime parameters to a `for<>` parameter list.
I've added a bunch of extra code to properly handle (unlikely?) corner cases like `for<'a> fn<'b>()` (where there already exists a `for<>` parameter list) correctly suggesting `for<'a, 'b> fn()` (merging the lists). If you deem this useless, I can simplify the code by suggesting nothing at all in this case.
I am quite open to suggestions regarding the wording of the diagnostic messages.
Fixes#103487.
``@rustbot`` label A-diagnostics
r? diagnostics
Delay `include_bytes` to AST lowering
Hopefully addresses #65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
Recover wrong-cased keywords that start items
(_this pr was inspired by [this tweet](https://twitter.com/Azumanga/status/1552982326409367561)_)
r? `@estebank`
We've talked a bit about this recovery, but I just wanted to make sure that this is the right approach :)
For now I've only added the case insensitive recovery to `use`s, since most other items like `impl` blocks, modules, functions can start with multiple keywords which complicates the matter.
Rollup of 9 pull requests
Successful merges:
- #102763 (Some diagnostic-related nits)
- #103443 (Parser: Recover from using colon as path separator in imports)
- #103675 (remove redundent "<>" for ty::Slice with reference type)
- #104046 (bootstrap: add support for running Miri on a file)
- #104115 (Migrate crate-search element to CSS variables)
- #104190 (Ignore "Change InferCtxtBuilder from enter to build" in git blame)
- #104201 (Add check in GUI test for file loading failure)
- #104211 (⬆️ rust-analyzer)
- #104231 (Update mailmap)
Failed merges:
- #104169 (Migrate `:target` rules to use CSS variables)
r? `@ghost`
`@rustbot` modify labels: rollup
Parser: Recover from using colon as path separator in imports
I don't know if this is the right approach, any feedback is welcome.
r? ```@compiler-errors```
Fixes#103269
Some diagnostic-related nits
1. Use `&mut Diagnostic` instead of `&mut DiagnosticBuilder<'_, T>`
2. Make `diag.span_suggestions` take an `IntoIterator` instead of `Iterator`, just to remove some `.into_iter` calls on the caller.
idk if I should add a lint to make sure people use `&mut Diagnostic` instead of `&mut DiagnosticBuilder<'_, T>` in cases where we're just, e.g., adding subdiagnostics to the diagnostic... maybe a followup.
It deals with eight cases: ints, floats, and the six quoted types
(char/byte/strings). For ints and floats we have an early return, and
the other six types fall through to the code at the end, which makes the
function hard to read.
This commit rearranges things to avoid the early returns.
There are three kinds of "byte" literals: byte literals, byte string
literals, and raw byte string literals. None are allowed to have
non-ASCII chars in them.
Two `EscapeError` variants exist for when that constraint is violated.
- `NonAsciiCharInByte`: used for byte literals and byte string literals.
- `NonAsciiCharInByteString`: used for raw byte string literals.
As a result, the messages for raw byte string literals use different
wording, without good reason. Also, byte string literals are incorrectly
described as "byte constants" in some error messages.
This commit eliminates `NonAsciiCharInByteString` so the three cases are
handled similarly, and described correctly. The `mode` is enough to
distinguish them.
Note: Some existing error messages mention "byte constants" and some
mention "byte literals". I went with the latter here, because it's a
more correct name, as used by the Reference.
It's passed to numerous places where we just need an `is_byte` bool.
Passing the bool avoids the need for some assertions.
Also rename `is_bytes()` as `is_byte()`, to better match `Mode::Byte`,
`Mode::ByteStr`, and `Mode::RawByteStr`.
Gate some parser recovery behind the check
Mainly in `expr.rs`. `may_recover` doesn't do anything useful yet until I implement that on top of #103439.
r? `@compiler-errors`
Change #[suggestion_*] attributes to use style="..."
As discussed [on Zulip](https://rust-lang.zulipchat.com/#narrow/stream/336883-i18n/topic/.23100717.20tool_only_span_suggestion), this changes `#[(multipart_)suggestion_{short,verbose,hidden}(...)]` attributes to plain `#[(multipart_)suggestion(...)]` attributes with a `style = "{short,verbose,hidden}"` parameter.
It also adds a new style, `tool-only`, that corresponds to `tool_only_span_suggestion`/`tool_only_multipart_suggestion` and causes the suggestion to not be shown in human-readable output at all.
Best reviewed commit-by-commit, there's a bit of noise in there.
cc #100717 `@compiler-errors`
r? `@davidtwco`
Track where diagnostics were created.
This implements the `-Ztrack-diagnostics` flag, which uses `#[track_caller]` to track where diagnostics are created. It is meant as a debugging tool much like `-Ztreat-err-as-bug`.
For example, the following code...
```rust
struct A;
struct B;
fn main(){
let _: A = B;
}
```
...now emits the following error message:
```
error[E0308]: mismatched types
--> src\main.rs:5:16
|
5 | let _: A = B;
| - ^ expected struct `A`, found struct `B`
| |
| expected due to this
-Ztrack-diagnostics: created at compiler\rustc_infer\src\infer\error_reporting\mod.rs:2275:31
```
Add flag to forbid recovery in the parser
To start the effort of fixing #103534, this adds a new flag to the parser, which forbids the parser from doing recovery, which it shouldn't do in macros.
This doesn't add any new checks for recoveries yet and is just here to bikeshed the names for the functions here before doing more.
r? `@compiler-errors`
Rollup of 6 pull requests
Successful merges:
- #101293 (Recover when unclosed char literal is parsed as a lifetime in some positions)
- #101908 (Suggest let for assignment, and some code refactor)
- #103192 (rustdoc: Eliminate uses of `EarlyDocLinkResolver::all_traits`)
- #103226 (Check `needs_infer` before `needs_drop` during HIR generator analysis)
- #103249 (resolve: Revert "Set effective visibilities for imports more precisely")
- #103305 (Move some tests to more reasonable places)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Flatten diagnostic slug modules
This makes it easier to grep for the slugs in the code.
See https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Localization.20infra.20interferes.20with.20grepping.20for.20error for more discussion about it.
This was mostly done with a few regexes and a bunch of manual work. This also exposes a pretty annoying inconsistency for the extra labels. Some of the extra labels are defined as additional properties in the fluent message (which makes them not prefixed with the crate name) and some of them are new fluent messages themselves (which makes them prefixed with the crate name). I don't know whether we want to clean this up at some point but it's useful to know.
r? `@davidtwco`
Fix `let` keyword removal suggestion in structs
(1.) Fixes a bug where, given this code:
```rust
struct Foo {
let x: i32,
}
```
We were parsing the field name as `let` instead of `x`, which causes issues later on in the type-checking phase.
(2.) Also, suggestions for `let: i32` as a field regressed, displaying this extra `help:` which is removed by this PR
```
help: remove the let, the `let` keyword is not allowed in struct field definitions
|
2 - let: i32,
2 + : i32,
```
(3.) Makes the suggestion text a bit more succinct, since we don't need to re-explain that `let` is not allowed in this position (since it's in a note that follows). This causes the suggestion to render inline as well.
cc `@gimbles,` this addresses a few nits I mentioned in your PR.
It's now only used in one function. Also, the "should we glue the
tokens?" check is only necessary when pushing a `TokenTree::Token`, not
when pushing a `TokenTree::Delimited`.
As part of this, we now do the "should we glue the tokens?" check
immediately, which avoids having look back at the previous token. It
also puts all the logic dealing with token gluing in a single place.
Remove `expr_parentheses_needed` from `ParseSess`
Not sure why this method needed to exist on `ParseSess`, but we can achieve the same behavior by just inlining it everywhere.
Group together more size assertions.
Also add a few more assertions for some relevant token-related types.
And fix an erroneous comment in `rustc_errors`.
r? `@lqd`
Improve errors for incomplete functions in struct definitions
Given the following code:
```rust
fn main() {}
struct Foo {
fn
}
```
[playground](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=29139f870511f6918324be5ddc26c345)
The current output is:
```
Compiling playground v0.0.1 (/playground)
error: functions are not allowed in struct definitions
--> src/main.rs:4:5
|
4 | fn
| ^^
|
= help: unlike in C++, Java, and C#, functions are declared in `impl` blocks
= help: see https://doc.rust-lang.org/book/ch05-03-method-syntax.html for more information
error: could not compile `playground` due to previous error
```
In this case, rustc should suggest escaping `fn` to use it as an identifier.
Migrate more of rustc_parse to SessionDiagnostic
Still far from complete, but I thought I'd add a checkpoint here because rebasing was starting to get annoying.
Rollup of 5 pull requests
Successful merges:
- #102143 (Recover from struct nested in struct)
- #102178 (bootstrap: the backtrace feature is stable, no need to allow it any more)
- #102197 (Stabilize const `BTree{Map,Set}::new`)
- #102267 (Don't set RUSTC in the bootstrap build script)
- #102270 (Remove benches from `rustc_middle`)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
`Cursor` is currently hidden, and the main tokenization path uses
`rustc_lexer::first_token` which involves constructing a new `Cursor`
for every single token, which is weird. Also, `first_token` also can't
handle empty input, so callers have to check for that first.
This commit makes `Cursor` public, so `StringReader` can contain a
`Cursor`, which results in a simpler structure. The commit also changes
`StringReader::advance_token` so it returns an `Option<Token>`,
simplifying the the empty input case.
`TokenTreesReader` wraps a `StringReader`, but the `into_token_trees`
function obscures this. This commit moves to a more straightforward
control flow.
The spacing computation is done in two parts. In the first part
`next_token` and `bump` use `Spacing::Alone` to mean "preceded by
whitespace" and `Spacing::Joint` to mean the opposite. In the second
part `parse_token_tree_other` then adjusts the `spacing` value to mean
the usual thing (i.e. "is the following token joinable punctuation?").
This shift in meaning is very confusing and it took me some time to
understand what was going on.
This commit changes the first part to use a bool, and adds some
comments, which makes things much clearer.
`parse_token_tree` is basically a match with four arms: `Eof`,
`OpenDelim`, `CloseDelim`, and "other". It has two call sites, and at
each call site one of the arms is unreachable. It's also not inlined.
This commit removes `parse_token_tree` by splitting it into four
functions and inlining them. This avoids some repeated conditional
tests and also some non-inlined function calls on the hot path.
FIX - ambiguous Diagnostic link in docs
UPDATE - rename diagnostic_items to IntoDiagnostic and AddToDiagnostic
[Gardening] FIX - formatting via `x fmt`
FIX - rebase conflicts. NOTE: Confirm wheather or not we want to handle TargetDataLayoutErrorsWrapper this way
DELETE - unneeded allow attributes in Handler method
FIX - broken test
FIX - Rebase conflict
UPDATE - rename residual _SessionDiagnostic and fix LintDiag link
On later stages, the feature is already stable.
Result of running:
rg -l "feature.let_else" compiler/ src/librustdoc/ library/ | xargs sed -s -i "s#\\[feature.let_else#\\[cfg_attr\\(bootstrap, feature\\(let_else\\)#"
make `mk_attr_id` part of `ParseSess`
Updates #48685
The current `mk_attr_id` uses the `AtomicU32` type, which is not very efficient and adds a lot of lock contention in a parallel environment.
This PR refers to the task list in #48685, uses `mk_attr_id` as a method of the `AttrIdGenerator` struct, and adds a new field `attr_id_generator` to `ParseSess`.
`AttrIdGenerator` uses the `WorkerLocal`, which has two advantages: 1. `Cell` is more efficient than `AtomicU32`, and does not increase any lock contention. 2. We put the index of the work thread in the first few bits of the generated `AttrId`, so that the `AttrId` generated in different threads can be easily guaranteed to be unique.
cc `@cjgillot`
Initial implementation of dyn*
This PR adds extremely basic and incomplete support for [dyn*](https://smallcultfollowing.com/babysteps//blog/2022/03/29/dyn-can-we-make-dyn-sized/). The goal is to get something in tree behind a flag to make collaboration easier, and also to make sure the implementation so far is not unreasonable. This PR does quite a few things:
* Introduce `dyn_star` feature flag
* Adds parsing for `dyn* Trait` types
* Defines `dyn* Trait` as a sized type
* Adds support for explicit casts, like `42usize as dyn* Debug`
* Including const evaluation of such casts
* Adds codegen for drop glue so things are cleaned up properly when a `dyn* Trait` object goes out of scope
* Adds codegen for method calls, at least for methods that take `&self`
Quite a bit is still missing, but this gives us a starting point. Note that this is never intended to become stable surface syntax for Rust, but rather `dyn*` is planned to be used as an implementation detail for async functions in dyn traits.
Joint work with `@nikomatsakis` and `@compiler-errors.`
r? `@bjorn3`
The primary purpose of this commit is to introduce the
dyn_star flag so we can begin experimenting with implementation.
In order to have something to do in the feature gate test, we also add
parser support for `dyn* Trait` objects. These are currently treated
just like `dyn Trait` objects, but this will change in the future.
Note that for now `dyn* Trait` is experimental syntax to enable
implementing some of the machinery needed for async fn in dyn traits
without fully supporting the feature.
`To` is better than `Create` for indicating that this is a non-consuming
conversion, rather than creating something out of nothing.
And the addition of `Attr` is because the current names makes them sound
like they relate to `TokenStream`, but really they relate to
`AttrTokenStream`.
These two type names are long and have long matching prefixes. I find
them hard to read, especially in combinations like
`AttrAnnotatedTokenStream::new(vec![AttrAnnotatedTokenTree::Token(..)])`.
This commit renames them as `AttrToken{Stream,Tree}`.