A recent PR increased the size, which caused regressions. This uses the
existing generic infrastructure to differentiate between the hot path
and the diagnostics path.
Encode spans relative to the enclosing item -- enable on nightly
Follow-up to #84373 with the flag `-Zincremental-relative-spans` set by default.
This PR seeks to remove one of the main shortcomings of incremental: the handling of spans.
Changing the contents of a function may require redoing part of the compilation process for another function in another file because of span information is changed.
Within one file: all the spans in HIR change, so typechecking had to be re-done.
Between files: spans of associated types/consts/functions change, so type-based resolution needs to be re-done (hygiene information is stored in the span).
The flag `-Zincremental-relative-spans` encodes local spans relative to the span of an item, stored inside the `source_span` query.
Trap: stashed diagnostics are referenced by the "raw" span, so stealing them requires to remove the span's parent.
In order to avoid too much traffic in the span interner, span encoding uses the `ctxt_or_tag` field to encode:
- the parent when the `SyntaxContext` is 0;
- the `SyntaxContext` when the parent is `None`.
Even with this, the PR creates a lot of traffic to the Span interner, when a Span has both a LocalDefId parent and a non-root SyntaxContext. They appear in lowering, when we add a parent to all spans, including those which come from macros, and during inlining when we mark inlined spans.
The last commit changes how queries of `LocalDefId` manage their cache. I can put this in a separate PR if required.
Possible future directions:
- validate that all spans are marked in HIR validation;
- mark macro-expanded spans relative to the def-site and not the use-site.
Properly calculate best failure in macro matching
Previously, we used spans. This was not good. Sometimes, the span of the token that failed to match may come from a position later in the file which has been transcribed into a token stream way earlier in the file. If precisely this token fails to match, we think that it was the best match because its span is so high, even though other arms might have gotten further in the token stream.
We now try to properly use the location in the token stream.
This needs a little cleanup as the `best_failure` field is getting out of hand but it should be mostly good to go. I hope I didn't violate too many abstraction boundaries..
Previously, we used spans. This was not good. Sometimes, the span of the
token that failed to match may come from a position later in the file
which has been transcribed into a token stream way earlier in the file.
If precisely this token fails to match, we think that it was the best
match because its span is so high, even though other arms might have
gotten further in the token stream.
We now try to properly use the location in the token stream.
Remove `token::Lit` from `ast::MetaItemLit`.
Currently `ast::MetaItemLit` represents the literal kind twice. This PR removes that redundancy. Best reviewed one commit at a time.
r? `@petrochenkov`
compiler: remove unnecessary imports and qualified paths
Some of these imports were necessary before Edition 2021, others were already in the prelude.
I hope it's fine that this PR is so spread-out across files :/
This migrates everything but the `mbe` and `proc_macro` modules. It also
contains a few cleanups and drive-by/accidental diagnostic improvements
which can be seen in the diff for the UI tests.
This is required to distinguish between cooked and raw byte string
literals in an `ast::LitKind`, without referring to an adjacent
`token::Lit`. It's a prerequisite for the next commit.
There is code for converting `Attribute` (syntactic) to `MetaItem`
(semantic). There is also code for the reverse direction. The reverse
direction isn't really necessary; it's currently only used when
generating attributes, e.g. in `derive` code.
This commit adds some new functions for creating `Attributes`s directly,
without involving `MetaItem`s: `mk_attr_word`, `mk_attr_name_value_str`,
`mk_attr_nested_word`, and
`ExtCtxt::attr_{word,name_value_str,nested_word}`.
These new methods replace the old functions for creating `Attribute`s:
`mk_attr_inner`, `mk_attr_outer`, and `ExtCtxt::attribute`. Those
functions took `MetaItem`s as input, and relied on many other functions
that created `MetaItems`, which are also removed: `mk_name_value_item`,
`mk_list_item`, `mk_word_item`, `mk_nested_word_item`,
`{MetaItem,MetaItemKind,NestedMetaItem}::token_trees`,
`MetaItemKind::attr_args`, `MetaItemLit::{from_lit_kind,to_token}`,
`ExtCtxt::meta_word`.
Overall this cuts more than 100 lines of code and makes thing simpler.
`check_builtin_attribute` calls `parse_meta` to convert an `Attribute`
to a `MetaItem`, which it then checks. However, many callers of
`check_builtin_attribute` start with a `MetaItem`, and then convert it
to an `Attribute` by calling `cx.attribute(meta_item)`. This `MetaItem`
to `Attribute` to `MetaItem` conversion is silly.
This commit adds a new function `check_builtin_meta_item`, which can be
called instead from these call sites. `check_builtin_attribute` also now
calls it. The commit also renames `check_meta` as `check_attr` to better
match its arguments.
Change multiline span ASCII art visual order
Tweak the ASCII art for nested multiline spans so that we minimize line overlaps.
Partially addresses https://github.com/rust-lang/rust/issues/61017.
`Lit::from_included_bytes` calls `Lit::from_lit_kind`, but the two call
sites only need the resulting `token::Lit`, not the full `ast::Lit`.
This commit changes those call sites to use `LitKind::to_token_lit`,
which means `from_included_bytes` can be removed.
Split `MacArgs` in two.
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`), where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them having partial overlap. I find this makes the code hard to read. It also leads to various unreachable code paths, and allows invalid values (such as accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and disallows the invalid values.
r? `@petrochenkov`
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's
used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all
three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`),
where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them
having partial overlap. I find this makes the code hard to read. It also leads
to various unreachable code paths, and allows invalid values (such as
accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is
now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro
case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and
disallows the invalid values.
Use `token::Lit` in `ast::ExprKind::Lit`.
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals are lowered when HIR is crated. Attribute literals are lowered during parsing.
r? `@petrochenkov`
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.
This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
Show note where the macro failed to match
When feeding the wrong tokens, it used to fail with a very generic error that wasn't very helpful. This change tries to help by noting where specifically the matching went wrong.
```rust
macro_rules! uwu {
(a a a b) => {};
}
uwu! { a a a c }
```
```diff
error: no rules expected the token `c`
--> macros.rs:5:14
|
1 | macro_rules! uwu {
| ---------------- when calling this macro
...
4 | uwu! { a a a c }
| ^ no rules expected this token in macro call
|
+note: while trying to match `b`
+ --> macros.rs:2:12
+ |
+2 | (a a a b) => {};
+ | ^
```
Delay `include_bytes` to AST lowering
Hopefully addresses #65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
Add the `#[derive_const]` attribute
Closes#102371. This is a minimal patchset for the attribute to work. There are no restrictions on what traits this attribute applies to.
r? `````@oli-obk`````
For now, we only collect the small info for the `best_failure`, but
using this tracker, we can easily extend it in the future to track
things with more performance overhead.
We cannot retry cases where the macro failed with a parser error that
was emitted already, as that would cause us to emit the same error to
the user twice.
This moves out the matching part of expansion into a new function. This
function will try to match the macro and return an error if it failed to
match. A tracker can be used to get more information about the matching.
Track where diagnostics were created.
This implements the `-Ztrack-diagnostics` flag, which uses `#[track_caller]` to track where diagnostics are created. It is meant as a debugging tool much like `-Ztreat-err-as-bug`.
For example, the following code...
```rust
struct A;
struct B;
fn main(){
let _: A = B;
}
```
...now emits the following error message:
```
error[E0308]: mismatched types
--> src\main.rs:5:16
|
5 | let _: A = B;
| - ^ expected struct `A`, found struct `B`
| |
| expected due to this
-Ztrack-diagnostics: created at compiler\rustc_infer\src\infer\error_reporting\mod.rs:2275:31
```
Add flag to forbid recovery in the parser
To start the effort of fixing #103534, this adds a new flag to the parser, which forbids the parser from doing recovery, which it shouldn't do in macros.
This doesn't add any new checks for recoveries yet and is just here to bikeshed the names for the functions here before doing more.
r? `@compiler-errors`
Only apply `ProceduralMasquerade` hack to older versions of `rental`
The latest version of `rental` (v0.5.6) contains a fix that allows it to
compile without relying on the pretty-print back-compat hack.
Hopefully, there are no longer any crates relying on the affected
versions of the (much less popular) `procedural-masquerade` crate. This
should allow us to target the pretty-print back-compat hack specifically
to older versions of `rental`, and specifically mention upgrading to
`rental` v0.5.6 in the lint message.
The latest version of `rental` (v0.5.6) contains a fix that allows it to
compile without relying on the pretty-print back-compat hack.
Hopefully, there are no longer any crates relying on the affected
versions of the (much less popular) `procedural-masquerade` crate. This
should allow us to target the pretty-print back-compat hack specifically
to older versions of `rental`, and specifically mention upgrading to
`rental` v0.5.6 in the lint message.
Remove `TokenStreamBuilder`
`TokenStreamBuilder` is used to combine multiple token streams. It can be removed, leaving the code a little simpler and a little faster.
r? `@Aaron1011`
`TokenStreamBuilder` exists to concatenate multiple `TokenStream`s
together. This commit removes it, and moves the concatenation
functionality directly into `TokenStream`, via two new methods
`push_tree` and `push_stream`. This makes things both simpler and
faster.
`push_tree` is particularly important. `TokenStreamBuilder` only had a
single `push` method, which pushed a stream. But in practice most of the
time we push a single token tree rather than a stream, and `push_tree`
avoids the need to build a token stream with a single entry (which
requires two allocations, one for the `Lrc` and one for the `Vec`).
The main `push_tree` use arises from a change to one of the `ToInternal`
impls in `proc_macro_server.rs`. It now returns a `SmallVec` instead of
a `TokenStream`. This return value is then iterated over by
`concat_trees`, which does `push_tree` on each element. Furthermore, the
use of `SmallVec` avoids more allocations, because there is always only
one or two token trees.
Note: the removed `TokenStreamBuilder::push` method had some code to
deal with a quadratic blowup case from #57735. This commit removes the
code. I tried and failed to reproduce the blowup from that PR, before
and after this change. Various other changes have happened to
`TokenStreamBuilder` in the meantime, so I suspect the original problem
is no longer relevant, though I don't have proof of this. Generally
speaking, repeatedly extending a `Vec` without pre-determining its
capacity is *not* quadratic. It's also incredibly common, within rustc
and many other Rust programs, so if there were performance problems
there you'd think it would show up in other places, too.
`TokenTree::Punct` is handled outside the `match`. This commits moves it
inside the `match`, avoiding the need for the `return`s and making it
easier to read.
FIX - ambiguous Diagnostic link in docs
UPDATE - rename diagnostic_items to IntoDiagnostic and AddToDiagnostic
[Gardening] FIX - formatting via `x fmt`
FIX - rebase conflicts. NOTE: Confirm wheather or not we want to handle TargetDataLayoutErrorsWrapper this way
DELETE - unneeded allow attributes in Handler method
FIX - broken test
FIX - Rebase conflict
UPDATE - rename residual _SessionDiagnostic and fix LintDiag link
On later stages, the feature is already stable.
Result of running:
rg -l "feature.let_else" compiler/ src/librustdoc/ library/ | xargs sed -s -i "s#\\[feature.let_else#\\[cfg_attr\\(bootstrap, feature\\(let_else\\)#"
`To` is better than `Create` for indicating that this is a non-consuming
conversion, rather than creating something out of nothing.
And the addition of `Attr` is because the current names makes them sound
like they relate to `TokenStream`, but really they relate to
`AttrTokenStream`.
These two type names are long and have long matching prefixes. I find
them hard to read, especially in combinations like
`AttrAnnotatedTokenStream::new(vec![AttrAnnotatedTokenTree::Token(..)])`.
This commit renames them as `AttrToken{Stream,Tree}`.
Debuginfo line information for macro invocations are collapsed by
default - line information are replaced by the line of the outermost
expansion site. Using `-Zdebug-macros` disables this behaviour.
When the `collapse_debuginfo` feature is enabled, the default behaviour
is reversed so that debuginfo is not collapsed by default. In addition,
the `#[collapse_debuginfo]` attribute is available and can be applied to
macro definitions which will then have their line information collapsed.
Signed-off-by: David Wood <david.wood@huawei.com>
Fix a bunch of typo
This PR will fix some typos detected by [typos].
I only picked the ones I was sure were spelling errors to fix, mostly in
the comments.
[typos]: https://github.com/crate-ci/typos
proc_macro/bridge: send diagnostics over the bridge as a struct
This removes some RPC when creating and emitting diagnostics, and
simplifies the bridge slightly.
After this change, there are no remaining methods which take advantage
of the support for `&mut` references to objects in the store as
arguments, meaning that support for them could technically be removed if
we wanted. The only remaining uses of immutable references into the
store are `TokenStream` and `SourceFile`.
r? `@eddyb`
This PR will fix some typos detected by [typos].
I only picked the ones I was sure were spelling errors to fix, mostly in
the comments.
[typos]: https://github.com/crate-ci/typos
In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.
Migrations for rustc_expand transcribe.rs
This PR includes some migrations to the new diagnostics API for the `rustc_expand` module.
r? ```@davidtwco```
- Rename `ast::Lit::token` as `ast::Lit::token_lit`, because its type is
`token::Lit`, which is not a token. (This has been confusing me for a
long time.)
reasonable because we have an `ast::token::Lit` inside an `ast::Lit`.
- Rename `LitKind::{from,to}_lit_token` as
`LitKind::{from,to}_token_lit`, to match the above change and
`token::Lit`.
Implement `#[rustc_default_body_unstable]`
This PR implements a new stability attribute — `#[rustc_default_body_unstable]`.
`#[rustc_default_body_unstable]` controls the stability of default bodies in traits.
For example:
```rust
pub trait Trait {
#[rustc_default_body_unstable(feature = "feat", isssue = "none")]
fn item() {}
}
```
In order to implement `Trait` user needs to either
- implement `item` (even though it has a default implementation)
- enable `#![feature(feat)]`
This is useful in conjunction with [`#[rustc_must_implement_one_of]`](https://github.com/rust-lang/rust/pull/92164), we may want to relax requirements for a trait, for example allowing implementing either of `PartialEq::{eq, ne}`, but do so in a safe way — making implementation of only `PartialEq::ne` unstable.
r? `@Aaron1011`
cc `@nrc` (iirc you were interested in this wrt `read_buf`), `@danielhenrymantilla` (you were interested in the related `#[rustc_must_implement_one_of]`)
P.S. This is my first time working with stability attributes, so I'm not sure if I did everything right 😅
This removes some RPC when creating and emitting diagnostics, and
simplifies the bridge slightly.
After this change, there are no remaining methods which take advantage
of the support for `&mut` references to objects in the store as
arguments, meaning that support for them could technically be removed if
we wanted. The only remaining uses of immutable references into the
store are `TokenStream` and `SourceFile`.
Remove `TreeAndSpacing`.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
r? `@petrochenkov`
This is done by having the crossbeam dependency inserted into the
proc_macro server code from the server side, to avoid adding a
dependency to proc_macro.
In addition, this introduces a -Z command-line option which will switch
rustc to run proc-macros using this cross-thread executor. With the
changes to the bridge in #98186, #98187, #98188 and #98189, the
performance of the executor should be much closer to same-thread
execution.
In local testing, the crossbeam executor was substantially more
performant than either of the two existing CrossThread strategies, so
they have been removed to keep things simple.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
This attribute allows to mark default body of a trait function as
unstable. This means that implementing the trait without implementing
the function will require enabling unstable feature.
This is useful in conjunction with `#[rustc_must_implement_one_of]`,
we may want to relax requirements for a trait, for example allowing
implementing either of `PartialEq::{eq, ne}`, but do so in a safe way
-- making implementation of only `PartialEq::ne` unstable.
Revert "Stabilize $$ in Rust 1.63.0"
This mechanically reverts commit 9edaa76adc, the one commit from #95860.
https://github.com/rust-lang/rust/issues/99035; the behavior of `$$crate` is potentially unexpected and not ready to be stabilized. https://github.com/rust-lang/rust/pull/99193 attempts to forbid `$$crate` without also destabilizing `$$` more generally.
`@rustbot` modify labels +T-compiler +T-lang +P-medium +beta-nominated +relnotes
(applying the labels I think are accurate from the issue and alternative partial revert)
cc `@Mark-Simulacrum`
This method is still only used for Literal::subspan, however the
implementation only depends on the Span component, so it is simpler and
more efficient for now to pass down only the information that is needed.
In the future, if more information about the Literal is required in the
implementation (e.g. to validate that spans line up as expected with
source text), that extra information can be added back with extra
arguments.
This builds on the symbol infrastructure built for `Ident` to replicate
the `LitKind` and `Lit` structures in rustc within the `proc_macro`
client, allowing literals to be fully created and interacted with from
the client thread. Only parsing and subspan operations still require
sync RPC.
Doing this for all unicode identifiers would require a dependency on
`unicode-normalization` and `rustc_lexer`, which is currently not
possible for `proc_macro` due to it being built concurrently with `std`
and `core`. Instead, ASCII identifiers are validated locally, and an RPC
message is used to validate unicode identifiers when needed.
String values are interned on the both the server and client when
deserializing, to avoid unnecessary copies and keep Ident cheap to copy and
move. This appears to be important for performance.
The client-side interner is based roughly on the one from rustc_span, and uses
an arena inspired by rustc_arena.
RPC messages passing symbols always include the full value. This could
potentially be optimized in the future if it is revealed to be a
performance bottleneck.
Despite now having a relevant implementaion of Display for Ident, ToString is
still specialized, as it is a hot-path for this object.
The symbol infrastructure will also be used for literals in the next
part.
Emit warning when named arguments are used positionally in format
Addresses Issue 98466 by emitting an error if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466
Implement `for<>` lifetime binder for closures
This PR implements RFC 3216 ([TI](https://github.com/rust-lang/rust/issues/97362)) and allows code like the following:
```rust
let _f = for<'a, 'b> |a: &'a A, b: &'b B| -> &'b C { b.c(a) };
// ^^^^^^^^^^^--- new!
```
cc ``@Aaron1011`` ``@cjgillot``
Addresses Issue 98466 by emitting a warning if a named argument
is used like a position argument (i.e. the name is not used in
the string to be formatted).
Fixes rust-lang#98466