`tokenstream::Spacing` appears on all `TokenTree::Token` instances,
both punct and non-punct. Its current usage:
- `Joint` means "can join with the next token *and* that token is a
punct".
- `Alone` means "cannot join with the next token *or* can join with the
next token but that token is not a punct".
The fact that `Alone` is used for two different cases is awkward.
This commit augments `tokenstream::Spacing` with a new variant
`JointHidden`, resulting in:
- `Joint` means "can join with the next token *and* that token is a
punct".
- `JointHidden` means "can join with the next token *and* that token is a
not a punct".
- `Alone` means "cannot join with the next token".
This *drastically* improves the output of `print_tts`. For example,
this:
```
stringify!(let a: Vec<u32> = vec![];)
```
currently produces this string:
```
let a : Vec < u32 > = vec! [] ;
```
With this PR, it now produces this string:
```
let a: Vec<u32> = vec![] ;
```
(The space after the `]` is because `TokenTree::Delimited` currently
doesn't have spacing information. The subsequent commit fixes this.)
The new `print_tts` doesn't replicate original code perfectly. E.g.
multiple space characters will be condensed into a single space
character. But it's much improved.
`print_tts` still produces the old, uglier output for code produced by
proc macros. Because we have to translate the generated code from
`proc_macro::Spacing` to the more expressive `token::Spacing`, which
results in too much `proc_macro::Along` usage and no
`proc_macro::JointHidden` usage. So `space_between` still exists and
is used by `print_tts` in conjunction with the `Spacing` field.
This change will also help with the removal of `Token::Interpolated`.
Currently interpolated tokens are pretty-printed nicely via AST pretty
printing. `Token::Interpolated` removal will mean they get printed with
`print_tts`. Without this change, that would result in much uglier
output for code produced by decl macro expansions. With this change, AST
pretty printing and `print_tts` produce similar results.
The commit also tweaks the comments on `proc_macro::Spacing`. In
particular, it refers to "compound tokens" rather than "multi-char
operators" because lifetimes aren't operators.
Properly print cstr literals in `proc_macro::Literal::to_string`
Previously we printed the contents of the string, rather than the actual string literal (e.g. `the c string` instead of `c"the c string"`).
Fixes#112820
cc #105723
without this, the only way to create a `LitKind::Byte` is by
doing `"b'a'".parse::<Literal>()`, this solves that by enabling
`Literal::byte_character(b'a')`
It lints against features that are inteded to be internal to the
compiler and standard library. Implements MCP #596.
We allow `internal_features` in the standard library and compiler as those
use many features and this _is_ the standard library from the "internal to the compiler and
standard library" after all.
Marking some features as internal wasn't exactly the most scientific approach, I just marked some
mostly obvious features. While there is a categorization in the macro,
it's not very well upheld (should probably be fixed in another PR).
We always pass `-Ainternal_features` in the testsuite
About 400 UI tests and several other tests use internal features.
Instead of throwing the attribute on each one, just always allow them.
There's nothing wrong with testing internal features^^
The status quo is highly confusing, since the overlap is not apparent,
and specialization is not a feature of Rust. This addresses #87545;
I'm not certain if it closes it, since that issue might also be trackign
a *general* solution for hiding specializing impls automatically.
Added byte position range for `proc_macro::Span`
Currently, the [`Debug`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#impl-Debug-for-Span) implementation for [`proc_macro::Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#) calls the debug function implemented in the trait implementation of `server::Span` for the type `Rustc` in the `rustc-expand` crate.
The current implementation, of the referenced function, looks something like this:
```rust
fn debug(&mut self, span: Self::Span) -> String {
if self.ecx.ecfg.span_debug {
format!("{:?}", span)
} else {
format!("{:?} bytes({}..{})", span.ctxt(), span.lo().0, span.hi().0)
}
}
```
It returns the byte position of the [`Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#) as an interpolated string.
Because this is currently the only way to get a spans position in the file, I might lead someone, who is interested in this information, to parsing this interpolated string back into a range of bytes, which I think is a very non-rusty way.
The proposed `position()`, method implemented in this PR, gives the ability to directly get this info.
It returns a [`std::ops::Range`](https://doc.rust-lang.org/std/ops/struct.Range.html#) wrapping the lowest and highest byte of the [`Span`](https://doc.rust-lang.org/beta/proc_macro/struct.Span.html#).
I put it behind the `proc_macro_span` feature flag because many of the other functions that have a similar footprint also are annotated with it, I don't actually know if this is right.
It would be great if somebody could take a look at this, thank you very much in advanced.
Use associated items of `char` instead of freestanding items in `core::char`
The associated functions and constants on `char` have been stable since 1.52 and the freestanding items have soft-deprecated since 1.62 (https://github.com/rust-lang/rust/pull/95566). This PR ~~marks them as "deprecated in future", similar to the integer and floating point modules (`core::{i32, f32}` etc)~~ replaces all uses of `core::char::*` with `char::*` to prepare for future deprecation of `core::char::*`.
While working on some other changes in the bridge, I noticed that when
running a nested proc-macro (which is currently only possible using
the unstable `TokenStream::expand_expr`), any symbols held by the
proc-macro client would be invalidated, as the same thread would be used
for the nested macro by default, and the interner doesn't handle nested
use.
After discussing with @eddyb, we decided the best approach might be to
force the use of the cross-thread executor for nested invocations, as it
will never re-use thread-local storage, avoiding the issue. This
shouldn't impact performance, as expand_expr is still unstable, and
infrequently used.
This was chosen rather than making the client symbol interner handle
nested invocations, as that would require replacing the internal
interner `Vec` with a `BTreeMap` (as valid symbol id ranges could now be
disjoint), and the symbol interner is known to be fairly perf-sensitive.
This patch adds checks to the execution strategy to use the cross-thread
executor when doing nested invocations. An alternative implementation
strategy could be to track this information in the `ExtCtxt`, however a
thread-local in the `proc_macro` crate was chosen to add an assertion so
that `rust-analyzer` is aware of the issue if it implements
`expand_expr` in the future.
r? @eddyb
This removes some RPC when creating and emitting diagnostics, and
simplifies the bridge slightly.
After this change, there are no remaining methods which take advantage
of the support for `&mut` references to objects in the store as
arguments, meaning that support for them could technically be removed if
we wanted. The only remaining uses of immutable references into the
store are `TokenStream` and `SourceFile`.
This is done by having the crossbeam dependency inserted into the
proc_macro server code from the server side, to avoid adding a
dependency to proc_macro.
In addition, this introduces a -Z command-line option which will switch
rustc to run proc-macros using this cross-thread executor. With the
changes to the bridge in #98186, #98187, #98188 and #98189, the
performance of the executor should be much closer to same-thread
execution.
In local testing, the crossbeam executor was substantially more
performant than either of the two existing CrossThread strategies, so
they have been removed to keep things simple.
This method is still only used for Literal::subspan, however the
implementation only depends on the Span component, so it is simpler and
more efficient for now to pass down only the information that is needed.
In the future, if more information about the Literal is required in the
implementation (e.g. to validate that spans line up as expected with
source text), that extra information can be added back with extra
arguments.
This builds on the symbol infrastructure built for `Ident` to replicate
the `LitKind` and `Lit` structures in rustc within the `proc_macro`
client, allowing literals to be fully created and interacted with from
the client thread. Only parsing and subspan operations still require
sync RPC.
Doing this for all unicode identifiers would require a dependency on
`unicode-normalization` and `rustc_lexer`, which is currently not
possible for `proc_macro` due to it being built concurrently with `std`
and `core`. Instead, ASCII identifiers are validated locally, and an RPC
message is used to validate unicode identifiers when needed.
String values are interned on the both the server and client when
deserializing, to avoid unnecessary copies and keep Ident cheap to copy and
move. This appears to be important for performance.
The client-side interner is based roughly on the one from rustc_span, and uses
an arena inspired by rustc_arena.
RPC messages passing symbols always include the full value. This could
potentially be optimized in the future if it is revealed to be a
performance bottleneck.
Despite now having a relevant implementaion of Display for Ident, ToString is
still specialized, as it is a hot-path for this object.
The symbol infrastructure will also be used for literals in the next
part.
Unfortunately, as it is difficult to depend on crates from within proc_macro,
this is done by vendoring a copy of the hasher as a module rather than
depending on the rustc_hash crate.
This probably doesn't have a substantial impact up-front, however will be more
relevant once symbols are interned within the proc_macro client.
This greatly reduces round-trips to fetch relevant extra information about the
token in proc macro code, and avoids RPC messages to create Group tokens.
This greatly reduces round-trips to fetch relevant extra information about the
token in proc macro code, and avoids RPC messages to create Punct tokens.
proc_macro/bridge: remove `#[repr(C)]` from non-ABI-relevant types.
Not sure how this happened, maybe some of these were passed through the bridge a long time ago?
r? `@bjorn3`
This is an experimental patch to try to reduce the codegen complexity of
TokenStream's FromIterator and Extend implementations for downstream
crates, by moving the core logic into a helper type. This might help
improve build performance of crates which depend on proc_macro as
iterators are used less, and the compiler may take less time to do
things like attempt specializations or other iterator optimizations.
The change intentionally sacrifices some optimization opportunities,
such as using the specializations for collecting iterators derived from
Vec::into_iter() into Vec.
This is one of the simpler potential approaches to reducing the amount
of code generated in crates depending on proc_macro, so it seems worth
trying before other more-involved changes.
This significantly reduces the cost of common interactions with TokenStream
when running with the CrossThread execution strategy, by reducing the number of
RPC calls required.
Remove migrate borrowck mode
Closes#58781Closes#43234
# Stabilization proposal
This PR proposes the stabilization of `#![feature(nll)]` and the removal of `-Z borrowck`. Current borrow checking behavior of item bodies is currently done by first infering regions *lexically* and reporting any errors during HIR type checking. If there *are* any errors, then MIR borrowck (NLL) never occurs. If there *aren't* any errors, then MIR borrowck happens and any errors there would be reported. This PR removes the lexical region check of item bodies entirely and only uses MIR borrowck. Because MIR borrowck could never *not* be run for a compiled program, this should not break any programs. It does, however, change diagnostics significantly and allows a slightly larger set of programs to compile.
Tracking issue: #43234
RFC: https://github.com/rust-lang/rfcs/blob/master/text/2094-nll.md
Version: 1.63 (2022-06-30 => beta, 2022-08-11 => stable).
## Motivation
Over time, the Rust borrow checker has become "smarter" and thus allowed more programs to compile. There have been three different implementations: AST borrowck, MIR borrowck, and polonius (well, in progress). Additionally, there is the "lexical region resolver", which (roughly) solves the constraints generated through HIR typeck. It is not a full borrow checker, but does emit some errors.
The AST borrowck was the original implementation of the borrow checker and was part of the initially stabilized Rust 1.0. In mid 2017, work began to implement the current MIR borrow checker and that effort ompleted by the end of 2017, for the most part. During 2018, efforts were made to migrate away from the AST borrow checker to the MIR borrow checker - eventually culminating into "migrate" mode - where HIR typeck with lexical region resolving following by MIR borrow checking - being active by default in the 2018 edition.
In early 2019, migrate mode was turned on by default in the 2015 edition as well, but with MIR borrowck errors emitted as warnings. By late 2019, these warnings were upgraded to full errors. This was followed by the complete removal of the AST borrow checker.
In the period since, various errors emitted by the MIR borrow checker have been improved to the point that they are mostly the same or better than those emitted by the lexical region resolver.
While there do remain some degradations in errors (tracked under the [NLL-diagnostics tag](https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Aissue+label%3ANLL-diagnostics), those are sufficiently small and rare enough that increased flexibility of MIR borrow check-only is now a worthwhile tradeoff.
## What is stabilized
As said previously, this does not fundamentally change the landscape of accepted programs. However, there are a [few](https://github.com/rust-lang/rust/issues?q=is%3Aopen+is%3Aissue+label%3ANLL-fixed-by-NLL) cases where programs can compile under `feature(nll)`, but not otherwise.
There are two notable patterns that are "fixed" by this stabilization. First, the `scoped_threads` feature, which is a continutation of a pre-1.0 API, can sometimes emit a [weird lifetime error](https://github.com/rust-lang/rust/issues/95527) without NLL. Second, actually seen in the standard library. In the `Extend` impl for `HashMap`, there is an implied bound of `K: 'a` that is available with NLL on but not without - this is utilized in the impl.
As mentioned before, there are a large number of diagnostic differences. Most of them are better, but some are worse. None are serious or happen often enough to need to block this PR. The biggest change is the loss of error code for a number of lifetime errors in favor of more general "lifetime may not live long enough" error. While this may *seem* bad, the former error codes were just attempts to somewhat-arbitrarily bin together lifetime errors of the same type; however, on paper, they end up being roughly the same with roughly the same kinds of solutions.
## What isn't stabilized
This PR does not completely remove the lexical region resolver. In the future, it may be possible to remove that (while still keeping HIR typeck) or to remove it together with HIR typeck.
## Tests
Many test outputs get updated by this PR. However, there are number of tests specifically geared towards NLL under `src/test/ui/nll`
## History
* On 2017-07-14, [tracking issue opened](https://github.com/rust-lang/rust/issues/43234)
* On 2017-07-20, [initial empty MIR pass added](https://github.com/rust-lang/rust/pull/43271)
* On 2017-08-29, [RFC opened](https://github.com/rust-lang/rfcs/pull/2094)
* On 2017-11-16, [Integrate MIR type-checker with NLL](https://github.com/rust-lang/rust/pull/45825)
* On 2017-12-20, [NLL feature complete](https://github.com/rust-lang/rust/pull/46862)
* On 2018-07-07, [Don't run AST borrowck on mir mode](https://github.com/rust-lang/rust/pull/52083)
* On 2018-07-27, [Add migrate mode](https://github.com/rust-lang/rust/pull/52681)
* On 2019-04-22, [Enable migrate mode on 2015 edition](https://github.com/rust-lang/rust/pull/59114)
* On 2019-08-26, [Don't downgrade errors on 2015 edition](https://github.com/rust-lang/rust/pull/64221)
* On 2019-08-27, [Remove AST borrowck](https://github.com/rust-lang/rust/pull/64790)
The change was "Show invisible delimiters (within comments) when pretty
printing". It's useful to show these delimiters, but is a breaking
change for some proc macros.
Fixes#97608.
Reduce the amount of unstable features used in libproc_macro
This makes it easier to adapt the source for stable when copying it into rust-analyzer to load rustc compiled proc macros.
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).
Implement `panic::update_hook`
Add a new function `panic::update_hook` to allow creating panic hooks that forward the call to the previously set panic hook, without race conditions. It works by taking a closure that transforms the old panic hook into a new one, while ensuring that during the execution of the closure no other thread can modify the panic hook. This is a small function so I hope it can be discussed here without a formal RFC, however if you prefer I can write one.
Consider the following example:
```rust
let prev = panic::take_hook();
panic::set_hook(Box::new(move |info| {
println!("panic handler A");
prev(info);
}));
```
This is a common pattern in libraries that need to do something in case of panic: log panic to a file, record code coverage, send panic message to a monitoring service, print custom message with link to github to open a new issue, etc. However it is impossible to avoid race conditions with the current API, because two threads can execute in this order:
* Thread A calls `panic::take_hook()`
* Thread B calls `panic::take_hook()`
* Thread A calls `panic::set_hook()`
* Thread B calls `panic::set_hook()`
And the result is that the original panic hook has been lost, as well as the panic hook set by thread A. The resulting panic hook will be the one set by thread B, which forwards to the default panic hook. This is not considered a big issue because the panic handler setup is usually run during initialization code, probably before spawning any other threads.
Using the new `panic::update_hook` function, this race condition is impossible, and the result will be either `A, B, original` or `B, A, original`.
```rust
panic::update_hook(|prev| {
Box::new(move |info| {
println!("panic handler A");
prev(info);
})
});
```
I found one real world use case here: 988cf403e7/src/detection.rs (L32) the workaround is to detect the race condition and panic in that case.
The pattern of `take_hook` + `set_hook` is very common, you can see some examples in this pull request, so I think it's natural to have a function that combines them both. Also using `update_hook` instead of `take_hook` + `set_hook` reduces the number of calls to `HOOK_LOCK.write()` from 2 to 1, but I don't expect this to make any difference in performance.
### Unresolved questions:
* `panic::update_hook` takes a closure, if that closure panics the error message is "panicked while processing panic" which is not nice. This is a consequence of holding the `HOOK_LOCK` while executing the closure. Could be avoided using `catch_unwind`?
* Reimplement `panic::set_hook` as `panic::update_hook(|_prev| hook)`?
This feature is aimed at giving proc macros access to powers similar to
those used by builtin macros such as `format_args!` or `concat!`. These
macros are able to accept macros in place of string literal parameters,
such as the format string, as they perform recursive macro expansion
while being expanded.
This can be especially useful in many cases thanks to helper macros like
`concat!`, `stringify!` and `include_str!` which are often used to
construct string literals at compile-time in user code.
For now, this method only allows expanding macros which produce
literals, although more expresisons will be supported before the method
is stabilized.