Adjust span of fn argument declaration
Span of a fn argument declaration goes from:
```
fn foo(i : i32 , ...)
^^^^^^^^
```
to:
```
fn foo(i : i32 , ...)
^^^^^^^
```
That is, we don't include the extra spacing up to the trailing comma, which I think is more correct.
cc https://github.com/rust-lang/rust/pull/99646#discussion_r944568074
r? ``@estebank``
---
The two tests that had dramatic changes in their rendering I think actually are improved, though they are kinda poor spans both before and after the changes. 🤷 Thoughts?
`Parser::parse_bottom_expr` currently constructs an empty `attrs` and
then passes it to a large number of other functions. This makes the code
harder to read than it should be, because it's not clear that many
`attrs` arguments are always empty.
This commit removes `attrs` and the passing, simplifying a lot of
functions. The commit also renames `Parser::mk_expr` (which takes an
`attrs` argument) as `mk_expr_with_attrs`, and introduces a new
`mk_expr` which creates an expression with no attributes, which is the
more common case.
Stringify non-shorthand visibility correctly
This makes `stringify!(pub(in crate))` evaluate to `pub(in crate)` rather than `pub(crate)`, matching the behavior before the `crate` shorthand was removed. Further, this changes `stringify!(pub(in super))` to evaluate to `pub(in super)` rather than the current `pub(super)`. If the latter is not desired (it is _technically_ breaking), it can be undone.
Fixes#99981
`@rustbot` label +C-bug +regression-from-stable-to-beta +T-compiler
Use `&mut Diagnostic` instead of `&mut DiagnosticBuilder` unless needed
This seems to be the established convention (02ff9e0) when `DiagnosticBuilder` was first added. I am guilty of introducing some of these.
Improve diagnostics for `const a: = expr;`
Adds a suggestion to write a type when there is a colon, but the type is not present.
I've also shrunk spans a little, so the suggestions are a little nicer.
Resolves#100146
r? `@compiler-errors`
Use Parser's `restrictions` instead of `let_expr_allowed`
This also means that the `ALLOW_LET` flag is reset properly for subexpressions, so we can properly deny things like `a && (b && let c = d)`. Also the parser is a tiny bit smaller now.
It doesn't reject _all_ bad `let` expr usages, just a bit more.
cc `@c410-f3r`
From 72 bytes to 12 bytes (on x86-64).
There are two parts to this:
- Changing various source code offsets from 64-bit to 32-bit. This is
not a problem because the rest of rustc also uses 32-bit source code
offsets. This means `Token` is no longer `Copy` but this causes no
problems.
- Removing the `RawStrError` from `LiteralKind`. Raw string literal
invalidity is now indicated by a `None` value within
`RawStr`/`RawByteStr`, and the new `validate_raw_str` function can be
used to re-lex an invalid raw string literal to get the `RawStrError`.
There is one very small change in behaviour. Previously, if a raw string
literal matched both the `InvalidStarter` and `TooManyHashes` cases,
the latter would override the former. This has now changed, because
`raw_double_quoted_string` now uses `?` and so returns immediately upon
detecting the `InvalidStarter` case. I think this is a slight
improvement to report the earlier-detected error, and it explains the
change in the `test_too_many_hashes` test.
The commit also removes a couple of comments that refer to #77629 and
say that the size of these types don't affect performance. These
comments are wrong, though the performance effect is small.
Rollup of 5 pull requests
Successful merges:
- #99311 (change maybe_body_owned_by to take local def id)
- #99862 (Improve type mismatch w/ function signatures)
- #99895 (don't call type ascription "cast")
- #99900 (remove some manual hash stable impls)
- #99903 (Add diagnostic when using public instead of pub)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Add diagnostic when using public instead of pub
Forwarding from https://github.com/rust-lang/rust/pull/99706
I accidentally broke something(??) in git and the commits in that PR are absolutely not what I did in that branch
Anyways, this is the PR for this now. Adding tests again in a minute.
cc `@davidtwco`
Remove `TreeAndSpacing`.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
r? `@petrochenkov`
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
This commit updates the source_file_to_parser and the
maybe_source_file_to_parse function's doc comments which currently
refer to a config parameter. The doc comments have been updated to
refer to the 'session' parameter similar to the doc comment for
try_file_to_source_file, which also takes a &Session parameter.
Implement `for<>` lifetime binder for closures
This PR implements RFC 3216 ([TI](https://github.com/rust-lang/rust/issues/97362)) and allows code like the following:
```rust
let _f = for<'a, 'b> |a: &'a A, b: &'b B| -> &'b C { b.c(a) };
// ^^^^^^^^^^^--- new!
```
cc ``@Aaron1011`` ``@cjgillot``
diagnostics: error messages when struct literals fail to parse
If an expression is supplied where a field is expected, the parser can become convinced that it's a shorthand field syntax when it's not.
This PR addresses it by explicitly recording the permitted `:` token immediately after the identifier, and also adds a suggestion to insert the name of the field if it looks like a complex expression.
Fixes#98917
Fix last `let_chains` blocker
In order to forbid things like `let x = (let y = 1);` or `if let a = 1 && { let x = let y = 1; } {}`, the parser **HAS** to know the context of `let`.
This context thing is not a surprise in the parser because you can see **a lot** of ad hoc fixes mixing parsing logic with validation logic creating code that looks more like spaghetti with tomato sauce.
To make things even greater, a new ad hoc fix was added to only allow `let`s in a valid `let_chains` context by checking the previously processed token. This was the only solution I could think of and believe me, I thought about it for a long time 👍
In the long term, it should be preferable to segregate different responsibilities or create a more robust and cleaner parser framework.
cc https://github.com/rust-lang/rust/pull/94927
cc https://github.com/rust-lang/rust/issues/53667
Use less string interning
This removes string interning in a couple of places where doing so won't result in perf improvements. I also switched one place to use pre-interned symbols.
Avoid some `&str` to `String` conversions with `MultiSpan::push_span_label`
This patch removes some`&str` to `String` conversions with `MultiSpan::push_span_label`.
The `rustc_lint_diagnostics` attribute is used by the diagnostic
translation/struct migration lints to identify calls where
non-translatable diagnostics or diagnostics outwith impls are being
created. Any function used in creating a diagnostic should be annotated
with this attribute so this commit adds the attribute to many more
functions.
Signed-off-by: David Wood <david.wood@huawei.com>
macros: use typed identifiers in diag and subdiag derive
Using typed identifiers instead of strings with the Fluent identifiers in the diagnostic and subdiagnostic derives - this enables the diagnostic derive to benefit from the compile-time validation that comes with typed identifiers, namely that use of a non-existent Fluent identifier will not compile.
r? `````@oli-obk`````
As in the diagnostic derive, using typed identifiers in the
subdiagnostic derive improves the diagnostics of using the subdiagnostic
derive as Fluent messages will be confirmed to exist at compile-time.
Signed-off-by: David Wood <david.wood@huawei.com>
Using typed identifiers instead of strings with the Fluent identifier
enables the diagnostic derive to benefit from the compile-time
validation that comes with typed identifiers - use of a non-existent
Fluent identifier will not compile.
Signed-off-by: David Wood <david.wood@huawei.com>
Fix pretty printing of empty bound lists in where-clause
Repro:
```rust
macro_rules! assert_item_stringify {
($item:item $expected:literal) => {
assert_eq!(stringify!($item), $expected);
};
}
fn main() {
assert_item_stringify! {
fn f<'a, T>() where 'a:, T: {}
"fn f<'a, T>() where 'a:, T: {}"
}
}
```
Previously this assertion would fail because rustc renders the where-clause as `where 'a, T` which is invalid syntax.
This PR makes the above assertion pass.
This bug also affects `-Zunpretty=expanded`. The intention is for that to emit syntactically valid code, but the buggy output is not valid Rust syntax.
```console
$ rustc <(echo "fn f<'a, T>() where 'a:, T: {}") -Zunpretty=expanded
#![feature(prelude_import)]
#![no_std]
#[prelude_import]
use ::std::prelude::rust_2015::*;
#[macro_use]
extern crate std;
fn f<'a, T>() where 'a, T {}
```
```console
$ rustc <(echo "fn f<'a, T>() where 'a:, T: {}") -Zunpretty=expanded | rustc -
error: expected `:`, found `,`
--> <anon>:7:23
|
7 | fn f<'a, T>() where 'a, T {}
| ^ expected `:`
```
Improve parser diagnostics
This pr fixes https://github.com/rust-lang/rust/issues/93867 and contains a couple of diagnostics related changes to the parser.
Here is a short list with some of the changes:
- don't suggest the same thing that is the current token
- suggest removing the current token if the following token is one of the suggestions (maybe incorrect)
- tell the user to put a type or lifetime after where if there is none (as a warning)
- reduce the amount of tokens suggested (via the new eat_noexpect and check_noexpect methods)
If any of these changes are undesirable, i can remove them, thanks!
Recover missing comma after match arm
If we're missing a comma after a match arm expression, try parsing another pattern and a following `=>`. If we find both of those, then recover by suggesting to insert a `,`.
Fixes#80112
Move conditions out of recover/report functions.
`Parser` has six recover/report functions that are passed a boolean, and
nothing is done if the boolean has a particular value.
This PR moves the tests outside the functions. This has the following effects.
- The number of lines of code goes down.
- Some `use` items become shorter.
- Avoids the strangeness whereby 11 out of 12 calls to
`maybe_recover_from_bad_qpath` pass `true` as the second argument.
- Makes it clear at the call site that only one of
`maybe_recover_from_bad_type_plus` and `maybe_report_ambiguous_plus` will be
run.
r? `@estebank`
Rollup of 6 pull requests
Successful merges:
- #89685 (refactor: VecDeques Iter fields to private)
- #97172 (Optimize the diagnostic generation for `extern unsafe`)
- #97395 (Miri call ABI check: ensure type size+align stay the same)
- #97431 (don't do `Sized` and other return type checks on RPIT's real type)
- #97555 (Source code page: line number click adds `NaN`)
- #97558 (Fix typos in comment)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Optimize the diagnostic generation for `extern unsafe`
This PR does the following about diagnostic generation when parsing foreign mod:
1. Fixes the FIXME about avoiding depending on the error message text.
2. Continue parsing when `unsafe` is followed by `{` (just like `unsafe extern {...}`).
3. Add test case.
To render the message of a Fluent attribute, the identifier of the
Fluent message must be known. `DiagnosticMessage::FluentIdentifier`
contains both the message's identifier and optionally the identifier of
an attribute. Generated constants for each attribute would therefore
need to be named uniquely (amongst all error messages) or be able to
refer to only the attribute identifier which will be combined with a
message identifier later. In this commit, the latter strategy is
implemented as part of the `Diagnostic` type's functions for adding
subdiagnostics of various kinds.
Signed-off-by: David Wood <david.wood@huawei.com>
Parse expression after `else` as a condition if followed by `{`
Fixes#49361.
Two things:
1. This wording needs help. I can never find a natural/intuitive phrasing when I write diagnostics 😅
2. Do we even want to show the "wrap in braces" case? I would assume most of the time the "add an `if`" case is the right one.
Remove hacks in `make_token_stream`.
`make_tokenstream` has three commented hacks, and a comment at the top
referring to #67062. These hacks have no observable effect, at least as judged
by running the test suite. The hacks were added in #82608, with an explanation
[here](https://github.com/rust-lang/rust/pull/82608#issuecomment-812877329). It
appears that one of the following is true: (a) they never did anything useful,
(b) they do something useful but we have no test coverage for them, or (c)
something has changed in the meantime that means they are no longer necessary.
This commit removes the hacks and the comments, in the hope that (b) is not
true.
r? `@Aaron1011`
Begin fixing all the broken doctests in `compiler/`
Begins to fix#95994.
All of them pass now but 24 of them I've marked with `ignore HELP (<explanation>)` (asking for help) as I'm unsure how to get them to work / if we should leave them as they are.
There are also a few that I marked `ignore` that could maybe be made to work but seem less important.
Each `ignore` has a rough "reason" for ignoring after it parentheses, with
- `(pseudo-rust)` meaning "mostly rust-like but contains foreign syntax"
- `(illustrative)` a somewhat catchall for either a fragment of rust that doesn't stand on its own (like a lone type), or abbreviated rust with ellipses and undeclared types that would get too cluttered if made compile-worthy.
- `(not-rust)` stuff that isn't rust but benefits from the syntax highlighting, like MIR.
- `(internal)` uses `rustc_*` code which would be difficult to make work with the testing setup.
Those reason notes are a bit inconsistently applied and messy though. If that's important I can go through them again and try a more principled approach. When I run `rg '```ignore \(' .` on the repo, there look to be lots of different conventions other people have used for this sort of thing. I could try unifying them all if that would be helpful.
I'm not sure if there was a better existing way to do this but I wrote my own script to help me run all the doctests and wade through the output. If that would be useful to anyone else, I put it here: https://github.com/Elliot-Roberts/rust_doctest_fixing_tool
Overhaul `MacArgs`
Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one use of three places where `TokenKind::Interpolated` values are created.
r? `@petrochenkov`
The value in `MacArgs::Eq` is currently represented as a `Token`.
Because of `TokenKind::Interpolated`, `Token` can be either a token or
an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a
literal or macro call AST fragment, and then is later lowered to a
literal token. But this is very non-obvious. `Token` is a much more
general type than what is needed.
This commit restricts things, by introducing a new type `MacArgsEqKind`
that is either an AST expression (pre-lowering) or an AST literal
(post-lowering). The downside is that the code is a bit more verbose in
a few places. The benefit is that makes it much clearer what the
possibilities are (though also shorter in some other places). Also, it
removes one use of `TokenKind::Interpolated`, taking us a step closer to
removing that variant, which will let us make `Token` impl `Copy` and
remove many "handle Interpolated" code paths in the parser.
Things to note:
- Error messages have improved. Messages like this:
```
unexpected token: `"bug" + "found"`
```
now say "unexpected expression", which makes more sense. Although
arbitrary expressions can exist within tokens thanks to
`TokenKind::Interpolated`, that's not obvious to anyone who doesn't
know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for
the value expression.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
`make_tokenstream` has three commented hacks, and a comment at the top
referring to #67062. These hacks have no observable effect, at least as judged
by running the test suite. The hacks were added in #82608, with an explanation
[here](https://github.com/rust-lang/rust/pull/82608#issuecomment-812877329). It
appears that one of the following is true: (a) they never did anything useful,
(b) they do something useful but we have no test coverage for them, or (c)
something has changed in the meantime that means they are no longer necessary.
This commit removes the hacks and the comments, in the hope that (b) is not
true.
Change `span_suggestion` (and variants) to take `impl ToString` rather
than `String` for the suggested code, as this simplifies the
requirements on the diagnostic derive.
Signed-off-by: David Wood <david.wood@huawei.com>
rustc_ast: Harmonize delimiter naming with `proc_macro::Delimiter`
Compiler cannot reuse `proc_macro::Delimiter` directly due to extra impls, but can at least use the same naming.
After this PR the only difference between these two enums is that `proc_macro::Delimiter::None` is turned into `token::Delimiter::Invisible`.
It's my mistake that the invisible delimiter is called `None` on stable, during the stabilization I audited the naming and wrote the docs, but missed the fact that the `None` naming gives a wrong and confusing impression about what this thing is.
cc https://github.com/rust-lang/rust/pull/96421
r? ``@nnethercote``
Less `NoDelim`
Currently there are several places where `NoDelim` (which really means "implicit delimiter" or "invisible delimiter") is used to mean "no delimiter". The name `NoDelim` is a bit misleading, and may be a cause.
This PR changes these places, e.g. by changing a `DelimToken` to `Option<DelimToken>` and then using `None` to mean "no delimiter". As a result, the *only* place where `NoDelim` values are now produced is within:
- `Delimiter::to_internal()`, when converting from `Delimiter::None`.
- `FlattenNonterminals::process_token()`, when converting `TokenKind::Interpolated`.
r? ````@petrochenkov````
This lets us clone just the parts within a `TokenTree` that need
cloning, rather than the entire thing. This is a surprisingly large
performance win, up to 4% on `async-std-1.10.0`.
This makes `CloseDelim` handling more like `OpenDelim` handling, which
produces `OpenDelim` and pushes the stack at the same time. It requires
some adjustment to `parse_token_tree` now that we don't remain within
the frame after getting the `CloseDelim`.
A Google search of the error message fails to return any relevant
resuts, suggesting this has never occurred in practice. And removeing it
reduces instruction counts by up to 2% on some benchmarks.
The loop is there to handle a `NoDelim` open/close token. This commit
changes `TokenCursor::inlined_next` so it never returns such a token.
This is a performance win because the conditional test in `bump()` is
removed.
If the parser needs changing in the future to handle `NoDelim` tokens,
then `inlined_next()` can easily be changed to return them.
The `DelimToken` here is `NoDelim`, which means the returned delim
tokens will just be ignored by `Parser::bump()`. This commit changes
things so the delim tokens won't be returned.
This will facilitate the change in the next commit.
`boolean` arguments aren't great, but the function is only used in three
places within this one file.
In particular, avoid wrapping a token within `TokenTree::Token` and then
immediately matching it and returning the token within. Just return the
token immediately.
Parse inner attributes on inline const block
According to https://github.com/rust-lang/rust/pull/84414#issuecomment-826150936, inner attributes are intended to be supported *"in all containers for statements (or some subset of statements)"*.
This PR adds inner attribute parsing and pretty-printing for inline const blocks (https://github.com/rust-lang/rust/issues/76001), which contain statements just like an unsafe block or a loop body.
```rust
let _ = const {
#![allow(...)]
let x = ();
x
};
```
Improve diagnostics for unterminated nested block comment
close#95283
(This is my first time try to messing around with rust compiler and might get a lot of things wrong... 🙇 )
By heap allocating the argument within `NtPath`, `NtVis`, and `NtStmt`.
This slightly reduces cumulative and peak allocation amounts, most
notably on `deep-vector`.
This commit updates the signatures of all diagnostic functions to accept
types that can be converted into a `DiagnosticMessage`. This enables
existing diagnostic calls to continue to work as before and Fluent
identifiers to be provided. The `SessionDiagnostic` derive just
generates normal diagnostic calls, so these APIs had to be modified to
accept Fluent identifiers.
In addition, loading of the "fallback" Fluent bundle, which contains the
built-in English messages, has been implemented.
Each diagnostic now has "arguments" which correspond to variables in the
Fluent messages (necessary to render a Fluent message) but no API for
adding arguments has been added yet. Therefore, diagnostics (that do not
require interpolation) can be converted to use Fluent identifiers and
will be output as before.
`MultiSpan` contains labels, which are more complicated with the
introduction of diagnostic translation and will use types from
`rustc_errors` - however, `rustc_errors` depends on `rustc_span` so
`rustc_span` cannot use types like `DiagnosticMessage` without
dependency cycles. Introduce a new `rustc_error_messages` crate that can
contain `DiagnosticMessage` and `MultiSpan`.
Signed-off-by: David Wood <david.wood@huawei.com>
Introduce a `DiagnosticMessage` type that will enable diagnostic
messages to be simple strings or Fluent identifiers.
`DiagnosticMessage` is now used in the implementation of the standard
`DiagnosticBuilder` APIs.
Signed-off-by: David Wood <david.wood@huawei.com>
Suggest `i += 1` when we see `i++` or `++i`
Closes#83502 (for `i++` and `++i`; `--i` should be covered by #82987, and `i--`
is tricky to handle).
This is a continuation of #83536.
r? `@estebank`
Fix multiline attributes handling in doctests
Fixes#55713.
I needed to have access to the `unclosed_delims` field in order to check that the attribute was completely parsed and didn't have missing parts, so I created a getter for it.
r? `@notriddle`
Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.