A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
More derive output improvements
This PR includes:
- Some test improvements.
- Some cosmetic changes to derive output that make the code look more like what a human would write.
- Some more fundamental improvements to `cmp` and `partial_cmp` generation.
r? `@Mark-Simulacrum`
It's common to see repeated assertions like this in derived `clone` and
`eq` methods:
```
let _: ::core::clone::AssertParamIsClone<u32>;
let _: ::core::clone::AssertParamIsClone<u32>;
```
This commit avoids them.
Fixup missing renames from `#[main]` to `#[rustc_main]`
In #84217 `#[main]` was removed and replaced with `#[rustc_main]`. In some places the rename was forgotten, which makes the current code confusing, because at first glance it seems that `#[main]` is still around. Perform the renames also in these places.
I noticed this (after first being confused by it) when working on #97802.
r? `@petrochenkov`
(since you reviewed the other PR)
In fc357039f9 `#[main]` was removed and replaced with `#[rustc_main]`.
In some place the rename was forgotten, which makes the current code
confusing, because at first glance it seems that `#[main]` is still
around. Perform the renames also in these places.
Both functions do some modifying of streams using `make_mut`:
- `push` sometimes glues the first token of the next stream to the last
token of the first stream.
- `build` appends tokens to the first stream.
By doing all of this in the one place, things are simpler. The first
stream can be modified in both ways (if necessary) in the one place, and
any next stream with the first token removed doesn't need to be stored.
It's a weird function: it lets you modify the token stream in the middle
of iteration. There is only one call site, and it is only used for the
rare `ProceduralMasquerade` legacy case.
This avoids the name clash with `rustc_serialize::Encoder` (a trait),
and allows lots qualifiers to be removed and imports to be simplified
(e.g. fewer `as` imports).
(This was previously merged as commit 5 in #94732 and then was reverted
in #97905 because of a perf regression caused by commit 4 in #94732.)
Don't suggest adding `let` in certain `if` conditions
Avoid being too eager to suggest `let` in an `if` condition with an `=`, namely when the LHS of the `=` isn't even valid as a pattern (to a first degree approximation).
This heustic I came up with kinda sucks. Let me know if it needs to be refined.
This avoids the name clash with `rustc_serialize::Encoder` (a trait),
and allows lots qualifiers to be removed and imports to be simplified
(e.g. fewer `as` imports).
There are two impls of the `Encoder` trait: `opaque::Encoder` and
`opaque::FileEncoder`. The former encodes into memory and is infallible, the
latter writes to file and is fallible.
Currently, standard `Result`/`?`/`unwrap` error handling is used, but this is a
bit verbose and has non-trivial cost, which is annoying given how rare failures
are (especially in the infallible `opaque::Encoder` case).
This commit changes how `Encoder` fallibility is handled. All the `emit_*`
methods are now infallible. `opaque::Encoder` requires no great changes for
this. `opaque::FileEncoder` now implements a delayed error handling strategy.
If a failure occurs, it records this via the `res` field, and all subsequent
encoding operations are skipped if `res` indicates an error has occurred. Once
encoding is complete, the new `finish` method is called, which returns a
`Result`. In other words, there is now a single `Result`-producing method
instead of many of them.
This has very little effect on how any file errors are reported if
`opaque::FileEncoder` has any failures.
Much of this commit is boring mechanical changes, removing `Result` return
values and `?` or `unwrap` from expressions. The more interesting parts are as
follows.
- serialize.rs: The `Encoder` trait gains an `Ok` associated type. The
`into_inner` method is changed into `finish`, which returns
`Result<Vec<u8>, !>`.
- opaque.rs: The `FileEncoder` adopts the delayed error handling
strategy. Its `Ok` type is a `usize`, returning the number of bytes
written, replacing previous uses of `FileEncoder::position`.
- Various methods that take an encoder now consume it, rather than being
passed a mutable reference, e.g. `serialize_query_result_cache`.
The change was "Show invisible delimiters (within comments) when pretty
printing". It's useful to show these delimiters, but is a breaking
change for some proc macros.
Fixes#97608.
Permit `asm_const` and `asm_sym` to reference generic params
Related #96557
These constructs will be allowed:
```rust
fn foofoo<const N: usize>() {}
unsafe fn foo<const N: usize>() {
asm!("/* {0} */", const N);
asm!("/* {0} */", const N + 1);
asm!("/* {0} */", sym foofoo::<N>);
}
fn barbar<T>() {}
unsafe fn bar<T>() {
asm!("/* {0} */", const std::mem::size_of::<T>());
asm!("/* {0} */", const std::mem::size_of::<(T, T)>());
asm!("/* {0} */", sym barbar::<T>);
asm!("/* {0} */", sym barbar::<(T, T)>);
}
```
`@Amanieu,` I didn't switch inline asms to use `DefKind::InlineAsm`, as I see little value doing that; given that no type inference is needed, it will only make typecking slower and more complex but will have no real gains. I did switch them to follow the same code path as inline asm during symbol resolution, though.
The `error: unconstrained generic constant` you mentioned in #76001 is due to the fact that `to_const` will actually add a wfness obligation to the constant, which we don't need for `asm_const`, so I have that removed.
`@rustbot` label: +A-inline-assembly +F-asm
Begin fixing all the broken doctests in `compiler/`
Begins to fix#95994.
All of them pass now but 24 of them I've marked with `ignore HELP (<explanation>)` (asking for help) as I'm unsure how to get them to work / if we should leave them as they are.
There are also a few that I marked `ignore` that could maybe be made to work but seem less important.
Each `ignore` has a rough "reason" for ignoring after it parentheses, with
- `(pseudo-rust)` meaning "mostly rust-like but contains foreign syntax"
- `(illustrative)` a somewhat catchall for either a fragment of rust that doesn't stand on its own (like a lone type), or abbreviated rust with ellipses and undeclared types that would get too cluttered if made compile-worthy.
- `(not-rust)` stuff that isn't rust but benefits from the syntax highlighting, like MIR.
- `(internal)` uses `rustc_*` code which would be difficult to make work with the testing setup.
Those reason notes are a bit inconsistently applied and messy though. If that's important I can go through them again and try a more principled approach. When I run `rg '```ignore \(' .` on the repo, there look to be lots of different conventions other people have used for this sort of thing. I could try unifying them all if that would be helpful.
I'm not sure if there was a better existing way to do this but I wrote my own script to help me run all the doctests and wade through the output. If that would be useful to anyone else, I put it here: https://github.com/Elliot-Roberts/rust_doctest_fixing_tool
Overhaul `MacArgs`
Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one use of three places where `TokenKind::Interpolated` values are created.
r? `@petrochenkov`
The value in `MacArgs::Eq` is currently represented as a `Token`.
Because of `TokenKind::Interpolated`, `Token` can be either a token or
an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a
literal or macro call AST fragment, and then is later lowered to a
literal token. But this is very non-obvious. `Token` is a much more
general type than what is needed.
This commit restricts things, by introducing a new type `MacArgsEqKind`
that is either an AST expression (pre-lowering) or an AST literal
(post-lowering). The downside is that the code is a bit more verbose in
a few places. The benefit is that makes it much clearer what the
possibilities are (though also shorter in some other places). Also, it
removes one use of `TokenKind::Interpolated`, taking us a step closer to
removing that variant, which will let us make `Token` impl `Copy` and
remove many "handle Interpolated" code paths in the parser.
Things to note:
- Error messages have improved. Messages like this:
```
unexpected token: `"bug" + "found"`
```
now say "unexpected expression", which makes more sense. Although
arbitrary expressions can exist within tokens thanks to
`TokenKind::Interpolated`, that's not obvious to anyone who doesn't
know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for
the value expression.
The comment on this function explains that it's a specialized version of
`maybe_whole_expr`. But `maybe_whole_expr` doesn't do anything with
`NtIdent`, so `is_whole_expr` also doesn't need to.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
The two paths are equivalent -- they both end up calling `visit_expr()`.
I have kept the more restrictive path, the one that requires that
`token` be an expression nonterminal. (The next commit will simplify this
function further.)
Add `BoundKind` in `visit_param_bounds` to check questions in bounds
From the FIXME in the impl of `AstValidator`. Better bound checks by adding `BoundCtxt` type parameter to `visit_param_bound`
cc `@ecstatic-morse`
This lets us clone just the parts within a `TokenTree` that need
cloning, rather than the entire thing. This is a surprisingly large
performance win, up to 4% on `async-std-1.10.0`.
Report undeclared lifetimes during late resolution.
First step in https://github.com/rust-lang/rust/pull/91557
We reuse the rib design of the current resolution framework. Specific `LifetimeRib` and `LifetimeRibKind` types are introduced. The most important variant is `LifetimeRibKind::Generics`, which happens each time we encounter something which may introduce generic lifetime parameters. It can be an item or a `for<...>` binder. The `LifetimeBinderKind` specifies how this rib behaves with respect to in-band lifetimes.
r? `@petrochenkov`
Remove last vestiges of skippng ident span hashing
This removes a comment that no longer applies, and properly hashes
the full ident for path segments.
Implement sym operands for global_asm!
Tracking issue: #93333
This PR is pretty much a complete rewrite of `sym` operand support for inline assembly so that the same implementation can be shared by `asm!` and `global_asm!`. The main changes are:
- At the AST level, `sym` is represented as a special `InlineAsmSym` AST node containing a path instead of an `Expr`.
- At the HIR level, `sym` is split into `SymStatic` and `SymFn` depending on whether the path resolves to a static during AST lowering (defaults to `SynFn` if `get_early_res` fails).
- `SymFn` is just an `AnonConst`. It runs through typeck and we just collect the resulting type at the end. An error is emitted if the type is not a `FnDef`.
- `SymStatic` directly holds a path and the `DefId` of the `static` that it is pointing to.
- The representation at the MIR level is mostly unchanged. There is a minor change to THIR where `SymFn` is a constant instead of an expression.
- At the codegen level we need to apply the target's symbol mangling to the result of `tcx.symbol_name()` depending on the target. This is done by calling the LLVM name mangler, which handles all of the details.
- On Mach-O, all symbols have a leading underscore.
- On x86 Windows, different mangling is used for cdecl, stdcall, fastcall and vectorcall.
- No mangling is needed on other platforms.
r? `@nagisa`
cc `@eddyb`
Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
By heap allocating the argument within `NtPath`, `NtVis`, and `NtStmt`.
This slightly reduces cumulative and peak allocation amounts, most
notably on `deep-vector`.
Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
More robust fallback for `use` suggestion
Our old way to suggest where to add `use`s would first look for pre-existing `use`s in the relevant crate/module, and if there are *no* uses, it would fallback on trying to use another item as the basis for the suggestion.
But this was fragile, as illustrated in issue #87613
This PR instead identifies span of the first token after any inner attributes, and uses *that* as the fallback for the `use` suggestion.
Fix#87613
then we just suggest the first legal position where you could inject a use.
To do this, I added `inject_use_span` field to `ModSpans`, and populate it in
parser (it is the span of the first token found after inner attributes, if any).
Then I rewrote the use-suggestion code to utilize it, and threw out some stuff
that is now unnecessary with this in place. (I think the result is easier to
understand.)
Then I added a test of issue 87613.
Generator drop tracking: improve break and continue handling
This PR fixes two related issues.
One, sometimes break or continue have a block target instead of an expression target. This seems to mainly happen with try blocks. Since the drop tracking analysis only works on expressions, if we see a block target for break or continue, we substitute the last expression of the block as the target instead.
Two, break and continue were incorrectly being treated as the same, so continue would also show up as an exit from the loop or block. This patch corrects the way continue is handled by keeping a stack of loop entry points and uses those to find the target of the continue.
Fixes#93197
r? `@nikomatsakis`
This commit fixes two issues.
One, sometimes break or continue have a block target instead of an
expression target. This seems to mainly happen with try blocks. Since
the drop tracking analysis only works on expressions, if we see a block
target for break or continue, we substitute the last expression of the
block as the target instead.
Two, break and continue were incorrectly being treated as the same, so
continue would also show up as an exit from the loop or block. This
patch corrects the way continue is handled by keeping a stack of loop
entry points and uses those to find the target of the continue.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
currently panics on failure (e.g. if the input is too short, or on a
bad `Result` discriminant), and in some places it returns an error
(e.g. on a bad `Option` discriminant). The number of places where
either happens is surprisingly small, just because the binary
representation has very little redundancy and a lot of input reading
can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
`.rlink` file production, and there's a `FIXME` comment suggesting it
should change to a binary format, and (b) in a few tests in
non-fundamental ways. Indeed #85993 is open to remove it entirely.
And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.
Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
optimization for small counts that the impl for `Result<T, E>` has,
because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
`collect`, which is nice; the one for `Vec` uses unsafe code, because
that gave better perf on some benchmarks.
Fix star handling in block doc comments
Fixes#92872.
Some extra explanation about this PR and why https://github.com/rust-lang/rust/pull/92357 created this regression: when we merge doc comment kinds for example in:
```rust
/// he
/**
* hello
*/
#[doc = "boom"]
```
We don't want to remove the empty lines between them. However, to correctly compute the "horizontal trim", we still need it, so instead, I put back a part of the "vertical trim" directly in the "horizontal trim" computation so it doesn't impact the output buffer but allows us to correctly handle the stars.
r? `@camelid`
Add Attribute::meta_kind
The `AttrItem::meta` function is being called on a lot of places, however almost always the caller is only interested in the `kind` of the result `MetaItem`. Before, the `path` had to be cloned in order to get the kind, now it does not have to be.
There is a larger related "problem". In a lot of places, something wants to know contents of attributes. This is accessed through `Attribute::meta_item_list`, which calls `AttrItem::meta` (now `AttrItem::meta_kind`), among other methods. When this function is called, the meta item list has to be recreated from scratch. Everytime something asks a simple question (like is this item/list of attributes `#[doc(hidden)]`?), the tokens of the attribute(s) are cloned, parsed and the results are allocated on the heap. That seems really unnecessary. What would be the best way to cache this? Turn `meta_item_list` into a query perhaps? Related PR: https://github.com/rust-lang/rust/pull/92227
r? rust-lang/rustdoc
ast: Avoid aborts on fatal errors thrown from mutable AST visitor
Set the node to some dummy value and rethrow the error instead.
When using the old aborting `visit_clobber` in `InvocationCollector::visit_crate` the next tests abort due to fatal errors:
```
ui\modules\path-invalid-form.rs
ui\modules\path-macro.rs
ui\modules\path-no-file-name.rs
ui\parser\issues\issue-5806.rs
ui\parser\mod_file_with_path_attr.rs
```
Follow up to https://github.com/rust-lang/rust/pull/91313.
Remove `SymbolStr`
This was originally proposed in https://github.com/rust-lang/rust/pull/74554#discussion_r466203544. As well as removing the icky `SymbolStr` type, it allows the removal of a lot of `&` and `*` occurrences.
Best reviewed one commit at a time.
r? `@oli-obk`
Stabilise `feature(const_generics_defaults)`
`feature(const_generics_defaults)` is complete implementation wise and has a pretty extensive test suite so I think is ready for stabilisation.
needs stabilisation report and maybe an RFC 😅
r? `@lcnr`
cc `@rust-lang/project-const-generics`
Fix bug with `#[doc]` string single-character last lines
Fixes#90618.
This is because `.iter().all(|c| c == '*')` returns `true` if there is no character checked. And in case the last line has only one character, it simply returns `true`, making the last line behind removed.
TraitKind -> Trait
TyAliasKind -> TyAlias
ImplKind -> Impl
FnKind -> Fn
All `*Kind`s in AST are supposed to be enums.
Tuple structs are converted to braced structs for the types above, and fields are reordered in syntactic order.
Also, mutable AST visitor now correctly visit spans in defaultness, unsafety, impl polarity and constness.
Optimize bidi character detection.
Should fix most of the performance regression of the bidi character detection (#90514), to be confirmed with a perf run.