When a macro is used in the trailing expression position of a block
(e.g. `fn foo() { my_macro!() }`), we currently parse it as an
expression, rather than a statement. As a result, we ended up
using the `NodeId` of the containing statement as our `lint_node_id`,
even though we don't normally do this for macro calls.
If such a macro expands to an expression with a `#[cfg]` attribute,
then the trailing statement can get removed entirely. This lead to
an ICE, since we were usng the `NodeId` of the expression to emit
a lint.
Ths commit makes us skip updating `lint_node_id` when handling
a macro in trailing expression position. This will cause us to
lint at the closest parent of the macro call.
Instead of updating global state to mark attributes as used,
we now explicitly emit a warning when an attribute is used in
an unsupported position. As a side effect, we are to emit more
detailed warning messages (instead of just a generic "unused" message).
`Session.check_name` is removed, since its only purpose was to mark
the attribute as used. All of the callers are modified to use
`Attribute.has_name`
Additionally, `AttributeType::AssumedUsed` is removed - an 'assumed
used' attribute is implemented by simply not performing any checks
in `CheckAttrVisitor` for a particular attribute.
We no longer emit unused attribute warnings for the `#[rustc_dummy]`
attribute - it's an internal attribute used for tests, so it doesn't
mark sense to treat it as 'unused'.
With this commit, a large source of global untracked state is removed.
Fixes#87877
This change interacts badly with `noop_flat_map_stmt`,
which synthesizes multiple statements with the same `NodeId`.
I'm working on a better fix that will still allow us to
remove this special case. For now, let's revert the change
to fix the ICE.
This reverts commit a4262cc984, reversing
changes made to 8ee962f88e.
Display an extra note for trailing semicolon lint with trailing macro
Currently, we parse macros at the end of a block
(e.g. `fn foo() { my_macro!() }`) as expressions, rather than
statements. This means that a macro invoked in this position
cannot expand to items or semicolon-terminated expressions.
In the future, we might want to start parsing these kinds of macros
as statements. This would make expansion more 'token-based'
(i.e. macro expansion behaves (almost) as if you just textually
replaced the macro invocation with its output). However,
this is a breaking change (see PR #78991), so it will require
further discussion.
Since the current behavior will not be changing any time soon,
we need to address the interaction with the
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` lint. Since we are parsing
the result of macro expansion as an expression, we will emit a lint
if there's a trailing semicolon in the macro output. However, this
results in a somewhat confusing message for users, since it visually
looks like there should be no problem with having a semicolon
at the end of a block
(e.g. `fn foo() { my_macro!() }` => `fn foo() { produced_expr; }`)
To help reduce confusion, this commit adds a note explaining
that the macro is being interpreted as an expression. Additionally,
we suggest adding a semicolon after the macro *invocation* - this
will cause us to parse the macro call as a statement. We do *not*
use a structured suggestion for this, since the user may actually
want to remove the semicolon from the macro definition (allowing
the block to evaluate to the expression produced by the macro).
Currently, we parse macros at the end of a block
(e.g. `fn foo() { my_macro!() }`) as expressions, rather than
statements. This means that a macro invoked in this position
cannot expand to items or semicolon-terminated expressions.
In the future, we might want to start parsing these kinds of macros
as statements. This would make expansion more 'token-based'
(i.e. macro expansion behaves (almost) as if you just textually
replaced the macro invocation with its output). However,
this is a breaking change (see PR #78991), so it will require
further discussion.
Since the current behavior will not be changing any time soon,
we need to address the interaction with the
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` lint. Since we are parsing
the result of macro expansion as an expression, we will emit a lint
if there's a trailing semicolon in the macro output. However, this
results in a somewhat confusing message for users, since it visually
looks like there should be no problem with having a semicolon
at the end of a block
(e.g. `fn foo() { my_macro!() }` => `fn foo() { produced_expr; }`)
To help reduce confusion, this commit adds a note explaining
that the macro is being interpreted as an expression. Additionally,
we suggest adding a semicolon after the macro *invocation* - this
will cause us to parse the macro call as a statement. We do *not*
use a structured suggestion for this, since the user may actually
want to remove the semicolon from the macro definition (allowing
the block to evaluate to the expression produced by the macro).
These attributes are currently discarded.
This may change in the future (see #63221), but for now,
placing inert attributes on a macro invocation does nothing,
so we should warn users about it.
Technically, it's possible for there to be attribute macro
on the same macro invocation (or at a higher scope), which
inspects the inert attribute. For example:
```rust
#[look_for_inline_attr]
#[inline]
my_macro!()
#[look_for_nested_inline]
mod foo { #[inline] my_macro!() }
```
However, this would be a very strange thing to do.
Anyone running into this can manually suppress the warning.
When we need to emit a lint at a macro invocation, we currently use the
`NodeId` of its parent definition (e.g. the enclosing function). This
means that any `#[allow]` / `#[deny]` attributes placed 'closer' to the
macro (e.g. on an enclosing block or statement) will have no effect.
This commit computes a better `lint_node_id` in `InvocationCollector`.
When we visit/flat_map an AST node, we assign it a `NodeId` (earlier
than we normally would), and store than `NodeId` in current
`ExpansionData`. When we collect a macro invocation, the current
`lint_node_id` gets cloned along with our `ExpansionData`, allowing it
to be used if we need to emit a lint later on.
This improves the handling of `#[allow]` / `#[deny]` for
`SEMICOLON_IN_EXPRESSIONS_FROM_MACROS` and some `asm!`-related lints.
The 'legacy derive helpers' lint retains its current behavior
(I've inlined the now-removed `lint_node_id` function), since
there isn't an `ExpansionData` readily available.
Rollup of 6 pull requests
Successful merges:
- #87085 (Search result colors)
- #87090 (Make BTreeSet::split_off name elements like other set methods do)
- #87098 (Unignore some pretty printing tests)
- #87099 (Upgrade `cc` crate to 1.0.69)
- #87101 (Suggest a path separator if a stray colon is found in a match arm)
- #87102 (Add GUI test for "go to first" feature)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:
* Derives macros no longer lose spans when their input is modified
by eager cfg-expansion. This is accomplished by performing eager
cfg-expansion on the token stream that we pass to the derive
proc-macro
* Inner attributes now preserve spans in all cases, including when we
have multiple inner attributes in a row.
This is accomplished through the following changes:
* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
These are very similar to a normal `TokenTree`, but they also track
the position of attributes and attribute targets within the stream.
They are built when we collect tokens during parsing.
An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
`AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
is created during the parsing of a nested AST node to make the 'outer'
AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`),
we tokenize and reparse our target, capturing additional information about the locations of
`#[cfg]` and `#[cfg_attr]` attributes at any depth within the target.
This is a performance optimization, allowing us to perform less work
in the typical case where captured tokens never have eager cfg-expansion run.
expand: Do not ICE when a legacy AST-based macro attribute produces and empty expression
Fixes https://github.com/rust-lang/rust/issues/80251
The reported error is the same as for `let _ = #[cfg(FALSE)] EXPR;`
StructField -> FieldDef ("field definition")
Field -> ExprField ("expression field", not "field expression")
FieldPat -> PatField ("pattern field", not "field pattern")
Also rename visiting and other methods working on them.
When token-based attribute handling is implemeneted in #80689,
we will need to access tokens from `HasAttrs` (to perform
cfg-stripping), and we will to access attributes from `HasTokens` (to
construct a `PreexpTokenStream`).
This PR merges the `HasAttrs` and `HasTokens` traits into a new
`AstLike` trait. The previous `HasAttrs` impls from `Vec<Attribute>` and `AttrVec`
are removed - they aren't attribute targets, so the impls never really
made sense.
Crate root is sufficiently different from `mod` items, at least at syntactic level.
Also remove customization point for "`mod` item or crate root" from AST visitors.
Along the way, we also implement a handful of diagnostics improvements
and fixes, particularly with respect to the special handling of `||` in
place of `|` and when there are leading verts in function params, which
don't allow top-level or-patterns anyway.
Fixes#81543
After we expand a macro, we try to parse the resulting tokens as a AST
node. This commit makes several improvements to how we handle spans when
an error occurs:
* Only ovewrite the original `Span` if it's a dummy span. This preserves
a more-specific span if one is available.
* Use `self.prev_token` instead of `self.token` when emitting an error
message after encountering EOF, since an EOF token always has a dummy
span
* Make `SourceMap::next_point` leave dummy spans unused. A dummy span
does not have a logical 'next point', since it's a zero-length span.
Re-using the span span preserves its 'dummy-ness' for other checks
Make `-Z time-passes` less noisy
- Add the module name to `pre_AST_expansion_passes` and don't make it a
verbose event (since it normally doesn't take very long, and it's
emitted many times)
- Don't make the following rustdoc events verbose; they're emitted many times.
+ build_extern_trait_impl
+ build_local_trait_impl
+ build_primitive_trait_impl
+ get_auto_trait_impls
+ get_blanket_trait_impls
- Remove the `get_auto_trait_and_blanket_synthetic_impls` rustdoc event; it's wholly
covered by get_{auto,blanket}_trait_impls and not very useful.
I found this while working on https://github.com/rust-lang/rust/pull/81275 but it's independent of those changes.
- Add the module name to `pre_AST_expansion_passes` and don't make it a
verbose event (since it normally doesn't take very long, and it's
emitted many times)
- Don't make the following rustdoc events verbose; they're emitted many times.
+ build_extern_trait_impl
+ build_local_trait_impl
+ build_primitive_trait_impl
+ get_auto_trait_impls
+ get_blanket_trait_impls
- Remove `get_auto_trait_and_blanket_synthetic_impls`; it's wholly
covered by get_{auto,blanket}_trait_impls and not very useful.
Fixes#81007
Previously, we would fail to collect tokens in the proper place when
only builtin attributes were present. As a result, we would end up with
attribute tokens in the collected `TokenStream`, leading to duplication
when we attempted to prepend the attributes from the AST node.
We now explicitly track when token collection must be performed due to
nomterminal parsing.
Fix some clippy lints
Happy to revert these if you think they're less readable, but personally I like them better now (especially the `else { if { ... } }` to `else if { ... }` change).
We now collect tokens for the underlying node wrapped by `StmtKind`
instead of storing tokens directly in `Stmt`.
`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.
Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.
Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
Do not collect tokens for doc comments
Doc comment is a single token and AST has all the information to re-create it precisely.
Doc comments are also responsible for majority of calls to `collect_tokens` (with `num_calls == 1` and `num_calls == 0`, cc https://github.com/rust-lang/rust/pull/78736).
(I also moved token collection into `fn parse_attribute` to deduplicate code a bit.)
r? `@Aaron1011`
rustc_ast: Do not panic by default when visiting macro calls
Panicking by default made sense when we didn't have HIR or MIR and everything worked on AST, but now all AST visitors run early and majority of them have to deal with macro calls, often by ignoring them.
The second commit renames `visit_mac` to `visit_mac_call`, the corresponding structures were renamed earlier in https://github.com/rust-lang/rust/pull/69589.
Split out statement attributes changes from #78306
This is the same as PR https://github.com/rust-lang/rust/pull/78306, but `unused_doc_comments` is modified to explicitly ignore statement items (which preserves the current behavior).
This shouldn't have any user-visible effects, so it can be landed without lang team discussion.
---------
When the 'early' and 'late' visitors visit an attribute target, they
activate any lint attributes (e.g. `#[allow]`) that apply to it.
This can affect warnings emitted on sibiling attributes. For example,
the following code does not produce an `unused_attributes` for
`#[inline]`, since the sibiling `#[allow(unused_attributes)]` suppressed
the warning.
```rust
trait Foo {
#[allow(unused_attributes)] #[inline] fn first();
#[inline] #[allow(unused_attributes)] fn second();
}
```
However, we do not do this for statements - instead, the lint attributes
only become active when we visit the struct nested inside `StmtKind`
(e.g. `Item`).
Currently, this is difficult to observe due to another issue - the
`HasAttrs` impl for `StmtKind` ignores attributes for `StmtKind::Item`.
As a result, the `unused_doc_comments` lint will never see attributes on
item statements.
This commit makes two interrelated fixes to the handling of inert
(non-proc-macro) attributes on statements:
* The `HasAttr` impl for `StmtKind` now returns attributes for
`StmtKind::Item`, treating it just like every other `StmtKind`
variant. The only place relying on the old behavior was macro
which has been updated to explicitly ignore attributes on item
statements. This allows the `unused_doc_comments` lint to fire for
item statements.
* The `early` and `late` lint visitors now activate lint attributes when
invoking the callback for `Stmt`. This ensures that a lint
attribute (e.g. `#[allow(unused_doc_comments)]`) can be applied to
sibiling attributes on an item statement.
For now, the `unused_doc_comments` lint is explicitly disabled on item
statements, which preserves the current behavior. The exact locatiosn
where this lint should fire are being discussed in PR #78306
When the 'early' and 'late' visitors visit an attribute target, they
activate any lint attributes (e.g. `#[allow]`) that apply to it.
This can affect warnings emitted on sibiling attributes. For example,
the following code does not produce an `unused_attributes` for
`#[inline]`, since the sibiling `#[allow(unused_attributes)]` suppressed
the warning.
```rust
trait Foo {
#[allow(unused_attributes)] #[inline] fn first();
#[inline] #[allow(unused_attributes)] fn second();
}
```
However, we do not do this for statements - instead, the lint attributes
only become active when we visit the struct nested inside `StmtKind`
(e.g. `Item`).
Currently, this is difficult to observe due to another issue - the
`HasAttrs` impl for `StmtKind` ignores attributes for `StmtKind::Item`.
As a result, the `unused_doc_comments` lint will never see attributes on
item statements.
This commit makes two interrelated fixes to the handling of inert
(non-proc-macro) attributes on statements:
* The `HasAttr` impl for `StmtKind` now returns attributes for
`StmtKind::Item`, treating it just like every other `StmtKind`
variant. The only place relying on the old behavior was macro
which has been updated to explicitly ignore attributes on item
statements. This allows the `unused_doc_comments` lint to fire for
item statements.
* The `early` and `late` lint visitors now activate lint attributes when
invoking the callback for `Stmt`. This ensures that a lint
attribute (e.g. `#[allow(unused_doc_comments)]`) can be applied to
sibiling attributes on an item statement.
For now, the `unused_doc_comments` lint is explicitly disabled on item
statements, which preserves the current behavior. The exact locatiosn
where this lint should fire are being discussed in PR #78306
This allows us to avoid synthesizing tokens in `prepend_attr`, since we
have the original tokens available.
We still need to synthesize tokens when expanding `cfg_attr`,
but this is an unavoidable consequence of the syntax of `cfg_attr` -
the user does not supply the `#` and `[]` tokens that a `cfg_attr`
expands to.
This approach lives exclusively in the parser, so struct expr bodies
that are syntactically correct on their own but are otherwise incorrect
will still emit confusing errors, like in the following case:
```rust
fn foo() -> Foo {
bar: Vec::new()
}
```
```
error[E0425]: cannot find value `bar` in this scope
--> src/file.rs:5:5
|
5 | bar: Vec::new()
| ^^^ expecting a type here because of type ascription
error[E0214]: parenthesized type parameters may only be used with a `Fn` trait
--> src/file.rs:5:15
|
5 | bar: Vec::new()
| ^^^^^ only `Fn` traits may use parentheses
error[E0107]: wrong number of type arguments: expected 1, found 0
--> src/file.rs:5:10
|
5 | bar: Vec::new()
| ^^^^^^^^^^ expected 1 type argument
```
If that field had a trailing comma, that would be a parse error and it
would trigger the new, more targetted, error:
```
error: struct literal body without path
--> file.rs:4:17
|
4 | fn foo() -> Foo {
| _________________^
5 | | bar: Vec::new(),
6 | | }
| |_^
|
help: you might have forgotten to add the struct literal inside the block
|
4 | fn foo() -> Foo { Path {
5 | bar: Vec::new(),
6 | } }
|
```
Partially address last part of #34255.
We currently only attach tokens when parsing a `:stmt` matcher for a
`macro_rules!` macro. Proc-macro attributes on statements are still
unstable, and need additional work.
Fixes#75050
Previously, we would unconditionally suppress the panic hook during
proc-macro execution. This commit adds a new flag
-Z proc-macro-backtrace, which allows running the panic hook for
easier debugging.