A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.
This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.
The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`
These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.
This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.
These changes also simplifies `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
The change was "Show invisible delimiters (within comments) when pretty
printing". It's useful to show these delimiters, but is a breaking
change for some proc macros.
Fixes#97608.
Overhaul `MacArgs`
Motivation:
- Clarify some code that I found hard to understand.
- Eliminate one use of three places where `TokenKind::Interpolated` values are created.
r? `@petrochenkov`
The value in `MacArgs::Eq` is currently represented as a `Token`.
Because of `TokenKind::Interpolated`, `Token` can be either a token or
an arbitrary AST fragment. In practice, a `MacArgs::Eq` starts out as a
literal or macro call AST fragment, and then is later lowered to a
literal token. But this is very non-obvious. `Token` is a much more
general type than what is needed.
This commit restricts things, by introducing a new type `MacArgsEqKind`
that is either an AST expression (pre-lowering) or an AST literal
(post-lowering). The downside is that the code is a bit more verbose in
a few places. The benefit is that makes it much clearer what the
possibilities are (though also shorter in some other places). Also, it
removes one use of `TokenKind::Interpolated`, taking us a step closer to
removing that variant, which will let us make `Token` impl `Copy` and
remove many "handle Interpolated" code paths in the parser.
Things to note:
- Error messages have improved. Messages like this:
```
unexpected token: `"bug" + "found"`
```
now say "unexpected expression", which makes more sense. Although
arbitrary expressions can exist within tokens thanks to
`TokenKind::Interpolated`, that's not obvious to anyone who doesn't
know compiler internals.
- In `parse_mac_args_common`, we no longer need to collect tokens for
the value expression.
Using an obviously-placeholder syntax. An RFC would still be needed before this could have any chance at stabilization, and it might be removed at any point.
But I'd really like to have it in nightly at least to ensure it works well with try_trait_v2, especially as we refactor the traits.
This commit rearranges the `match`. The new code avoids testing for
`MacArgs::Eq` twice, at the cost of repeating the `self.print_path()`
call. I think this is worthwhile because it puts the `match` in a more
standard and readable form.
Parse inner attributes on inline const block
According to https://github.com/rust-lang/rust/pull/84414#issuecomment-826150936, inner attributes are intended to be supported *"in all containers for statements (or some subset of statements)"*.
This PR adds inner attribute parsing and pretty-printing for inline const blocks (https://github.com/rust-lang/rust/issues/76001), which contain statements just like an unsafe block or a loop body.
```rust
let _ = const {
#![allow(...)]
let x = ();
x
};
```
It's only needed for macro expansion, not as a general element in the
AST. This commit removes it, adds `NtOrTt` for the parser and macro
expansion cases, and renames the variants in `NamedMatch` to better
match the new type.
The `print_expr` method already places an `ibox(INDENT_UNIT)` around
every expr that gets printed. Some exprs were then using `self.head`
inside of that, which does its own `cbox(INDENT_UNIT)`, resulting in two
levels of indentation:
while true {
stuff;
}
This commit fixes those cases to produce the expected single level of
indentation within every expression containing a block.
while true {
stuff;
}
Previously the pretty printer would compute indentation always relative
to whatever column a block begins at, like this:
fn demo(arg1: usize,
arg2: usize);
This is never the thing to do in the dominant contemporary Rust style.
Rustfmt's default and the style used by the vast majority of Rust
codebases is block indentation:
fn demo(
arg1: usize,
arg2: usize,
);
where every indentation level is a multiple of 4 spaces and each level
is indented relative to the indentation of the previous line, not the
position that the block starts in.
Pretty printer algorithm revamp step 2
This PR follows #92923 as a second chunk of modernizations backported from https://github.com/dtolnay/prettyplease into rustc_ast_pretty.
I've broken this up into atomic commits that hopefully are sensible in isolation. At every commit, the pretty printer is compilable and has runtime behavior that is identical to before and after the PR. None of the refactoring so far changes behavior.
The general theme of this chunk of commits is: the logic in the old pretty printer is doing some very basic things (pushing and popping tokens on a ring buffer) but expressed in a too-low-level way that I found makes it quite complicated/subtle to reason about. There are a number of obvious invariants that are "almost true" -- things like `self.left == self.buf.offset` and `self.right == self.buf.offset + self.buf.data.len()` and `self.right_total == self.left_total + self.buf.data.sum()`. The reason these things are "almost true" is the implementation tends to put updating one side of the invariant unreasonably far apart from updating the other side, leaving the invariant broken while unrelated stuff happens in between. The following code from master is an example of this:
e5e2b0be26/compiler/rustc_ast_pretty/src/pp.rs (L314-L317)
In this code the `advance_right` is reserving an entry into which to write a next token on the right side of the ring buffer, the `check_stack` is doing something totally unrelated to the right boundary of the ring buffer, and the `scan_push` is actually writing the token we previously reserved space for. Much of what this PR is doing is rearranging code to shrink the amount of stuff in between when an invariant is broken to when it is restored, until the whole thing can be factored out into one indivisible method call on the RingBuffer type.
The end state of the PR is that we can entirely eliminate `self.left` (because it's now just equal to `self.buf.offset` always) and `self.right` (because it's equal to `self.buf.offset + self.buf.data.len()` always) and the whole `Token::Eof` state which used to be the value of tokens that have been reserved space for but not yet written.
I found without these changes the pretty printer implementation to be hard to reason about and I wasn't able to confidently introduce improvements like trailing commas in `prettyplease` until after this refactor. The logic here is 43 years old at this point (Graydon translated it as directly as possible from the 1979 pretty printing paper) and while there are advantages to following the paper as closely as possible, in `prettyplease` I decided if we're going to adapt the algorithm to work better for Rust syntax, it was worthwhile making it easier to follow than the original.
ProjectionPredicate should be able to handle both associated types and consts so this adds the
first step of that. It mainly just pipes types all the way down, not entirely sure how to handle
consts, but hopefully that'll come with time.
Remove deprecated LLVM-style inline assembly
The `llvm_asm!` was deprecated back in #87590 1.56.0, with intention to remove
it once `asm!` was stabilized, which already happened in #91728 1.59.0. Now it
is time to remove `llvm_asm!` to avoid continued maintenance cost.
Closes#70173.
Closes#92794.
Closes#87612.
Closes#82065.
cc `@rust-lang/wg-inline-asm`
r? `@Amanieu`
Rename Printer constructor from mk_printer() to Printer::new()
The original naming is left over from 2011 which was before impl blocks and associated functions existed.
21313d623a/src/comp/pretty/pp.rs
Fix unclosed boxes in pretty printing of TraitAlias
This was causing trait aliases to not even render at all in stringified / pretty printed output.
```rust
macro_rules! repro {
($item:item) => {
stringify!($item)
};
}
fn main() {
println!("{:?}", repro!(pub trait Trait<T> = Sized where T: 'a;));
}
```
Before: `""`
After: `"pub trait Trait<T> = Sized where T: 'a;"`
The fix is copied from how `head`/`end` for `ItemKind::Use`, `ItemKind::ExternCrate`, and `ItemKind::Mod` are all done in the pretty printer:
dd3ac41495/compiler/rustc_ast_pretty/src/pprust/state.rs (L1178-L1184)
Remove &self from PrintState::to_string
The point of `PrintState::to_string` is to create a `State` and evaluate the caller's closure on it:
e9fbe79292/compiler/rustc_ast_pretty/src/pprust/state.rs (L868-L872)
Making the caller *also* construct and pass in a `State`, which is then ignored, was confusing.
Fix spacing and ordering of words in pretty printed Impl
Follow-up to #92238 fixing one of the FIXMEs.
```rust
macro_rules! repro {
($item:item) => {
stringify!($item)
};
}
fn main() {
println!("{}", repro!(impl<T> Struct<T> {}));
println!("{}", repro!(impl<T> const Trait for T {}));
}
```
Before: `impl <T> Struct<T> {}`
After: `impl<T> Struct<T> {}`
Before: `impl const <T> Trait for T {}` 😿
After: `impl<T> const Trait for T {}`
Fix whitespace in pretty printed PatKind::Range
Follow-up to #92238 fixing one of the FIXMEs.
```rust
macro_rules! repro {
($pat:pat) => {
stringify!($pat)
};
}
fn main() {
println!("{}", repro!(0..=1));
}
```
Before: `0 ..=1`
After: `0..=1`
The canonical spacing applied by rustfmt has no space after the lower expr. Rustc's parser diagnostics also do not put a space there:
df96fb166f/compiler/rustc_parse/src/parser/pat.rs (L754)