Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
Remove `<mbe::TokenTree as Clone>`
`mbe::TokenTree` doesn't really need to implement `Clone`, and getting rid of that impl leads to some speed-ups.
r? `@petrochenkov`
Loading the fallback bundle in compilation sessions that won't go on to
emit any errors unnecessarily degrades compile time performance, so
lazily create the Fluent bundle when it is first required.
Signed-off-by: David Wood <david.wood@huawei.com>
When a `macro_rules! foo { ... }` invocation is compiled the name used
is `foo`, not `macro_rules!`. This is different to all other macro
invocations, and confused me when I was inserted debugging println
statements for macro evaluation.
This commit changes it to `macro_rules` (or just `macro`), which is what
I expected. There are no externally visible changes.
expand: Remove `ParseSess::missing_fragment_specifiers`
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
cc https://github.com/rust-lang/rust/pull/95747#issuecomment-1091619403
r? ``@nnethercote``
Left overs of #95761
These are just nits. Feel free to close this PR if all modifications are not worth merging.
* `#![feature(decl_macro)]` is not needed anymore in `rustc_expand`
* `tuple_impls` does not require `$Tuple:ident`. I guess it is there to enhance readability?
r? ```@petrochenkov```
refactor: simplify few string related interactions
Few small optimizations:
check_doc_keyword: don't alloc string for emptiness check
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice.
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use FxHashSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
It was used for deduplicating some errors for legacy code which are mostly deduplicated even without that, but at cost of global mutable state, which is not a good tradeoff.
Remove explicit delimiter token trees from `Delimited`.
They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.
r? `@petrochenkov`
They were introduced by the final commit in #95159 and gave a
performance win. But since the introduction of `MatcherLoc` they are no
longer needed. This commit reverts that change, making the code a bit
simpler.
check_doc_alias_value: get argument as Symbol to prevent needless string convertions
check_doc_attrs: don't alloc vec, iterate over slice. Vec introduced in #83149, but no perf run posted on merge
replace as_str() check with symbol check
get_single_str_from_tts: don't prealloc string
trivial string to str replace
LifetimeScopeForPath::NonElided use Vec<Symbol> instead of Vec<String>
AssertModuleSource use BTreeSet<Symbol> instead of BTreeSet<String>
CrateInfo.crate_name replace FxHashMap<CrateNum, String> with FxHashMap<CrateNum, Symbol>
By heap allocating the argument within `NtPath`, `NtVis`, and `NtStmt`.
This slightly reduces cumulative and peak allocation amounts, most
notably on `deep-vector`.
Currently it's called in `parse_tt` every time a match rule is invoked.
This commit moves it so it's called instead once per match rule, in
`compile_declarative_macro. This is a performance win.
The commit also moves `compute_locs` out of `TtParser`, because there's
no longer any reason for it to be in there.
Use the proc-macro descr to track their individual expansions with
self-profiling events. This will help diagnose performance issues
with slow proc-macros.
In #95555 this was moved out of `parse_tt_inner` and `nameize` into
`compute_locs`. But the next commit will be moving `compute_locs`
outwards to a place that isn't suitable for the missing fragment
identifier checking. So this reinstates the old checking.
Add an option for enabling and disabling Fluent's directionality
isolation markers in output. Disabled by default as these can render in
some terminals and applications.
Signed-off-by: David Wood <david.wood@huawei.com>
Extend loading of Fluent bundles so that bundles can be loaded from the
sysroot based on the language requested by the user, or using a nightly
flag.
Sysroot bundles are loaded from `$sysroot/share/locale/$locale/*.ftl`.
Signed-off-by: David Wood <david.wood@huawei.com>
This commit updates the signatures of all diagnostic functions to accept
types that can be converted into a `DiagnosticMessage`. This enables
existing diagnostic calls to continue to work as before and Fluent
identifiers to be provided. The `SessionDiagnostic` derive just
generates normal diagnostic calls, so these APIs had to be modified to
accept Fluent identifiers.
In addition, loading of the "fallback" Fluent bundle, which contains the
built-in English messages, has been implemented.
Each diagnostic now has "arguments" which correspond to variables in the
Fluent messages (necessary to render a Fluent message) but no API for
adding arguments has been added yet. Therefore, diagnostics (that do not
require interpolation) can be converted to use Fluent identifiers and
will be output as before.
`MultiSpan` contains labels, which are more complicated with the
introduction of diagnostic translation and will use types from
`rustc_errors` - however, `rustc_errors` depends on `rustc_span` so
`rustc_span` cannot use types like `DiagnosticMessage` without
dependency cycles. Introduce a new `rustc_error_messages` crate that can
contain `DiagnosticMessage` and `MultiSpan`.
Signed-off-by: David Wood <david.wood@huawei.com>
Introduce a `DiagnosticMessage` type that will enable diagnostic
messages to be simple strings or Fluent identifiers.
`DiagnosticMessage` is now used in the implementation of the standard
`DiagnosticBuilder` APIs.
Signed-off-by: David Wood <david.wood@huawei.com>
Reduce unnecessary escaping in proc_macro::Literal::character/string
I noticed that https://doc.rust-lang.org/proc_macro/struct.Literal.html#method.character is producing unreadable literals that make macro-expanded code unnecessarily hard to read. Since the proc macro server was using `escape_unicode()`, every char is escaped using `\u{…}` regardless of whether there is any need to do so. For example `Literal::character('=')` would previously produce `'\u{3d}'` which unnecessarily obscures the meaning when reading the macro-expanded code.
I've changed Literal::string also in this PR because `str`'s `Debug` impl is also smarter than just calling `escape_debug` on every char. For example `Literal::string("ferris's")` would previously produce `"ferris\'s"` but will now produce `"ferris's"`.
`parse_tt` currently traverses a `&[TokenTree]` to do matching. But this
is a bad representation for the traversal.
- `TokenTree` is nested, and there's a bunch of expensive and fiddly
state required to handle entering and exiting nested submatchers.
- There are three positions (sequence separators, sequence Kleene ops,
and end of the matcher) that are represented by an index that exceeds
the end of the `&[TokenTree]`, which is clumsy and error-prone.
This commit introduces a new representation called `MatcherLoc` that is
designed specifically for matching. It fixes all the above problems,
making the code much easier to read. A `&[TokenTree]` is converted to a
`&[MatcherLoc]` before matching begins. Despite the cost of the
conversion, it's still a net performance win, because various pieces of
traversal state are computed once up-front, rather than having to be
recomputed repeatedly during the macro matching.
Some improvements worth noting.
- `parse_tt_inner` is *much* easier to read. No more having to compare
`idx` against `len` and read comments to understand what the result
means.
- The handling of `Delimited` in `parse_tt_inner` is now trivial.
- The three end-of-sequence cases in `parse_tt_inner` are now handled in
three separate match arms, and the control flow is much simpler.
- `nameize` is no longer recursive.
- There were two places that issued "missing fragment specifier" errors:
one in `parse_tt_inner()`, and one in `nameize()`. Presumably the
latter was never executed. There's now a single place issuing these
errors, in `compute_locs()`.
- The number of heap allocations done for a `check full` build of
`async-std-1.10.0` (an extreme example of heavy macro use) drops from
11.8M to 2.6M, and most of these occur outside of macro matching.
- The size of `MatcherPos` drops from 64 bytes to 16 bytes. Small enough
that it no longer needs boxing, which partly accounts for the
reduction in allocations.
- The rest of the drop in allocations is due to the removal of
`MatcherKind`, because we no longer need to record anything for the
parent matcher when entering a submatcher.
- Overall it reduces code size by 45 lines.
It's only used in one place, and there we clone and then make a bunch of
modifications. It's clearer if we duplicate more explicitly, and there's
a symmetry now between `sequence()` and `empty_sequence()`.