Introduce support for `async` bound modifier on `Fn*` traits
Adds `async` to the list of `TraitBoundModifiers`, which instructs AST lowering to map the trait to an async flavor of the trait. For now, this is only supported for `Fn*` to `AsyncFn*`, and I expect that this manual mapping via lang items will be replaced with a better system in the future.
The motivation for adding these bounds is to separate the users of async closures from the exact trait desugaring of their callable bounds. Instead of users needing to be concerned with the `AsyncFn` trait, they should be able to write `async Fn()` and it will desugar to whatever underlying trait we decide is best for the lowering of async closures.
Note: rustfmt support can be done in the rustfmt repo after a subtree sync.
Be more careful about interpreting a label/lifetime as a mistyped char literal.
Currently the parser interprets any label/lifetime in certain positions as a mistyped char literal, on the assumption that the trailing single quote was accidentally omitted. In such cases it gives an error with a suggestion to add the trailing single quote, and then puts the appropriate char literal into the AST. This behaviour was introduced in #101293.
This is reasonable for a case like this:
```
let c = 'a;
```
because `'a'` is a valid char literal. It's less reasonable for a case like this:
```
let c = 'abc;
```
because `'abc'` is not a valid char literal.
Prior to #120329 this could result in some sub-optimal suggestions in error messages, but nothing else. But #120329 changed `LitKind::from_token_lit` to assume that the char/byte/string literals it receives are valid, and to assert if not. This is reasonable because the lexer does not produce invalid char/byte/string literals in general. But in this "interpret label/lifetime as unclosed char literal" case the parser can produce an invalid char literal with contents such as `abc`, which triggers an assertion failure.
This PR changes the parser so it's more cautious about interpreting labels/lifetimes as unclosed char literals.
Fixes#120397.
r? `@compiler-errors`
Improve handling of expressions in patterns
Closes#112593.
Methodcalls' dots in patterns are silently recovered as commas (e.g. `Foo("".len())` -> `Foo("", len())`) so extra diagnostics are emitted:
```rs
struct Foo(u8, String, u8);
fn bar(foo: Foo) -> bool {
match foo {
Foo(4, "yippee".yeet(), 7) => true,
_ => false
}
}
```
```
error: expected one of `)`, `,`, `...`, `..=`, `..`, or `|`, found `.`
--> main.rs:5:24
|
5 | Foo(4, "yippee".yeet(), 7) => true,
| ^
| |
| expected one of `)`, `,`, `...`, `..=`, `..`, or `|`
| help: missing `,`
error[E0531]: cannot find tuple struct or tuple variant `yeet` in this scope
--> main.rs:5:25
|
5 | Foo(4, "yippee".yeet(), 7) => true,
| ^^^^ not found in this scope
error[E0023]: this pattern has 4 fields, but the corresponding tuple struct has 3 fields
--> main.rs:5:13
|
1 | struct Foo(u8, String, u8);
| -- ------ -- tuple struct has 3 fields
...
5 | Foo(4, "yippee".yeet(), 7) => true,
| ^ ^^^^^^^^ ^^^^^^ ^ expected 3 fields, found 4
error: aborting due to 3 previous errors
```
This PR checks for patterns that ends with a dot and a lowercase ident (as structs/variants should be uppercase):
```
error: expected a pattern, found a method call
--> main.rs:5:16
|
5 | Foo(4, "yippee".yeet(), 7) => true,
| ^^^^^^^^^^^^^^^ method calls are not allowed in patterns
error: aborting due to 1 previous error
```
Also check for expressions:
```rs
fn is_idempotent(x: f32) -> bool {
match x {
x * x => true,
_ => false,
}
}
fn main() {
let mut t: [i32; 5];
let t[0] = 1;
}
```
```
error: expected a pattern, found an expression
--> main.rs:3:9
|
3 | x * x => true,
| ^^^^^ arbitrary expressions are not allowed in patterns
error: expected a pattern, found an expression
--> main.rs:10:9
|
10 | let t[0] = 1;
| ^^^^ arbitrary expressions are not allowed in patterns
```
Would be cool if the compiler could suggest adding a guard for `match`es, but I've no idea how to do it.
---
`@rustbot` label +A-diagnostics +A-parser +A-patterns +C-enhancement
Currently the parser will interpret any label/lifetime in certain
positions as a mistyped char literal, on the assumption that the
trailing single quote was accidentally omitted. This is reasonable for a
something like 'a (because 'a' would be valid) but not reasonable for a
something like 'abc (because 'abc' is not valid).
This commit restricts this behaviour only to labels/lifetimes that would
be valid char literals, via the new `could_be_unclosed_char_literal`
function. The commit also augments the `label-is-actually-char.rs` test
in a couple of ways:
- Adds testing of labels/lifetimes with identifiers longer than one
char, e.g. 'abc.
- Adds a new match with simpler patterns, because the
`recover_unclosed_char` call in `parse_pat_with_range_pat` was not
being exercised (in this test or any other ui tests).
Fixes#120397, an assertion failure, which was caused by this behaviour
in the parser interacting with some new stricter char literal checking
added in #120329.
Error codes are integers, but `String` is used everywhere to represent
them. Gross!
This commit introduces `ErrCode`, an integral newtype for error codes,
replacing `String`. It also introduces a constant for every error code,
e.g. `E0123`, and removes the `error_code!` macro. The constants are
imported wherever used with `use rustc_errors::codes::*`.
With the old code, we have three different ways to specify an error code
at a use point:
```
error_code!(E0123) // macro call
struct_span_code_err!(dcx, span, E0123, "msg"); // bare ident arg to macro call
\#[diag(name, code = "E0123")] // string
struct Diag;
```
With the new code, they all use the `E0123` constant.
```
E0123 // constant
struct_span_code_err!(dcx, span, E0123, "msg"); // constant
\#[diag(name, code = E0123)] // constant
struct Diag;
```
The commit also changes the structure of the error code definitions:
- `rustc_error_codes` now just defines a higher-order macro listing the
used error codes and nothing else.
- Because that's now the only thing in the `rustc_error_codes` crate, I
moved it into the `lib.rs` file and removed the `error_codes.rs` file.
- `rustc_errors` uses that macro to define everything, e.g. the error
code constants and the `DIAGNOSTIC_TABLES`. This is in its new
`codes.rs` file.
Properly recover from trailing attr in body
When encountering an attribute in a body, we try to recover from an attribute on an expression (as opposed to a statement). We need to properly clean up when the attribute is at the end of the body where a tail expression would be.
Fix#118164, fix#118575.
When encountering an attribute in a body, we try to recover from an
attribute on an expression (as opposed to a statement). We need to
properly clean up when the attribute is at the end of the body where a
tail expression would be.
Fix#118164.
They can't contain `\x` escapes, which means they can't contain high
bytes, which means we can used `unescape_unicode` instead of
`unescape_mixed` to unescape them. This avoids unnecessary used of
`MixedUnit`.
`unescape_literal` becomes `unescape_unicode`, and `unescape_c_string`
becomes `unescape_mixed`. Because rfc3349 will mean that C string
literals will no longer be the only mixed utf8 literals.
Remove special handling of `box` expressions from parser
#108471 added a temporary hack to parse `box expr`. It's been almost a year since then, so I think it's safe to remove the special handling.
As a drive-by cleanup, move `parser/removed-syntax*` tests to their own directory.
Detect `NulInCStr` error earlier.
By making it an `EscapeError` instead of a `LitError`. This makes it like the other errors produced when checking string literals contents, e.g. for invalid escape sequences or bare CR chars.
NOTE: this means these errors are issued earlier, before expansion, which changes behaviour. It will be possible to move the check back to the later point if desired. If that happens, it's likely that all the string literal contents checks will be delayed together.
One nice thing about this: the old approach had some code in `report_lit_error` to calculate the span of the nul char from a range. This code used a hardwired `+2` to account for the `c"` at the start of a C string literal, but this should have changed to a `+3` for raw C string literals to account for the `cr"`, which meant that the caret in `cr"` nul error messages was one short of where it should have been. The new approach doesn't need any of this and avoids the off-by-one error.
r? ```@fee1-dead```
By making it an `EscapeError` instead of a `LitError`. This makes it
like the other errors produced when checking string literals contents,
e.g. for invalid escape sequences or bare CR chars.
NOTE: this means these errors are issued earlier, before expansion,
which changes behaviour. It will be possible to move the check back to
the later point if desired. If that happens, it's likely that all the
string literal contents checks will be delayed together.
One nice thing about this: the old approach had some code in
`report_lit_error` to calculate the span of the nul char from a range.
This code used a hardwired `+2` to account for the `c"` at the start of
a C string literal, but this should have changed to a `+3` for raw C
string literals to account for the `cr"`, which meant that the caret in
`cr"` nul error messages was one short of where it should have been. The
new approach doesn't need any of this and avoids the off-by-one error.
One consequence is that errors returned by
`maybe_new_parser_from_source_str` now must be consumed, so a bunch of
places that previously ignored those errors now cancel them. (Most of
them explicitly dropped the errors before. I guess that was to indicate
"we are explicitly ignoring these", though I'm not 100% sure.)
Two different lifetimes are conflated. This doesn't matter right now,
but needs to be fixed for the next commit to work. And the more
descriptive lifetime names make the code easier to read.
Each of these has a single call site: `source_file_to_parser`,
`try_file_to_source_file`, `file_to_source_file`. Having them separate
just makes the code longer and harder to read.
Also, `maybe_file_to_stream` doesn't need to be `pub`.
In #119606 I added them and used a `_mv` suffix, but that wasn't great.
A `with_` prefix has three different existing uses.
- Constructors, e.g. `Vec::with_capacity`.
- Wrappers that provide an environment to execute some code, e.g.
`with_session_globals`.
- Consuming chaining methods, e.g. `Span::with_{lo,hi,ctxt}`.
The third case is exactly what we want, so this commit changes
`DiagnosticBuilder::foo_mv` to `DiagnosticBuilder::with_foo`.
Thanks to @compiler-errors for the suggestion.
Because it takes an error code after the span. This avoids the confusing
overlap with the `DiagCtxt::struct_span_err` method, which doesn't take
an error code.
The existing uses are replaced in one of three ways.
- In a function that also has calls to `emit`, just rearrange the code
so that exactly one of `delay_as_bug` or `emit` is called on every
path.
- In a function returning a `DiagnosticBuilder`, use
`downgrade_to_delayed_bug`. That's good enough because it will get
emitted later anyway.
- In `unclosed_delim_err`, one set of errors is being replaced with
another set, so just cancel the original errors.
The old code was very hard to understand, involving an
`emit_without_consuming` call *and* a `delay_as_bug_without_consuming`
call.
With slight changes both calls can be avoided. Not creating the error
until later is crucial, as is the early return in the `if recovered`
block.
It took me some time to come up with this reworking -- it went through
intermediate states much further from the original code than this final
version -- and it's isn't obvious at a glance that it is equivalent. But
I think it is, and the unchanged test behaviour is good supporting
evidence.
The commit also changes `check_trailing_angle_brackets` to return
`Option<ErrorGuaranteed>`. This provides a stricter proof that it
emitted an error message than asserting `dcx.has_errors().is_some()`,
which would succeed if any error had previously been emitted anywhere.
It's not clear why this was here, because the created error is returned
as a normal error anyway.
Nor is it clear why removing the call works. The change doesn't affect
any tests; `tests/ui/parser/issues/issue-102182-impl-trait-recover.rs`
looks like the only test that could have been affected.
Instead of taking `seq` as a mutable reference,
`maybe_recover_struct_lit_bad_delims` now consumes `seq` on the recovery
path, and returns `seq` unchanged on the non-recovery path. The commit
also combines an `if` and a `match` to merge two identical paths.
Also change `recover_seq_parse_error` so it receives a `PErr` instead of
a `PResult`, because all the call sites now handle the `Ok`/`Err`
distinction themselves.
In this parsing recovery function, we only need to emit the previously
obtained error message and mark `expr` as erroneous in the case where we
actually recover.
This works for most of its call sites. This is nice, because `emit` very
much makes sense as a consuming operation -- indeed,
`DiagnosticBuilderState` exists to ensure no diagnostic is emitted
twice, but it uses runtime checks.
For the small number of call sites where a consuming emit doesn't work,
the commit adds `DiagnosticBuilder::emit_without_consuming`. (This will
be removed in subsequent commits.)
Likewise, `emit_unless` becomes consuming. And `delay_as_bug` becomes
consuming, while `delay_as_bug_without_consuming` is added (which will
also be removed in subsequent commits.)
All this requires significant changes to `DiagnosticBuilder`'s chaining
methods. Currently `DiagnosticBuilder` method chaining uses a
non-consuming `&mut self -> &mut Self` style, which allows chaining to
be used when the chain ends in `emit()`, like so:
```
struct_err(msg).span(span).emit();
```
But it doesn't work when producing a `DiagnosticBuilder` value,
requiring this:
```
let mut err = self.struct_err(msg);
err.span(span);
err
```
This style of chaining won't work with consuming `emit` though. For
that, we need to use to a `self -> Self` style. That also would allow
`DiagnosticBuilder` production to be chained, e.g.:
```
self.struct_err(msg).span(span)
```
However, removing the `&mut self -> &mut Self` style would require that
individual modifications of a `DiagnosticBuilder` go from this:
```
err.span(span);
```
to this:
```
err = err.span(span);
```
There are *many* such places. I have a high tolerance for tedious
refactorings, but even I gave up after a long time trying to convert
them all.
Instead, this commit has it both ways: the existing `&mut self -> Self`
chaining methods are kept, and new `self -> Self` chaining methods are
added, all of which have a `_mv` suffix (short for "move"). Changes to
the existing `forward!` macro lets this happen with very little
additional boilerplate code. I chose to add the suffix to the new
chaining methods rather than the existing ones, because the number of
changes required is much smaller that way.
This doubled chainging is a bit clumsy, but I think it is worthwhile
because it allows a *lot* of good things to subsequently happen. In this
commit, there are many `mut` qualifiers removed in places where
diagnostics are emitted without being modified. In subsequent commits:
- chaining can be used more, making the code more concise;
- more use of chaining also permits the removal of redundant diagnostic
APIs like `struct_err_with_code`, which can be replaced easily with
`struct_err` + `code_mv`;
- `emit_without_diagnostic` can be removed, which simplifies a lot of
machinery, removing the need for `DiagnosticBuilderState`.
rustc_span: More consistent span combination operations
Also add more tests for using `tt` in addition to `ident`, and some other minor tweaks, see individual commits.
This is a part of https://github.com/rust-lang/rust/pull/119412 that doesn't yet add side tables for metavariable spans.
Rollup of 10 pull requests
Successful merges:
- #118521 (Enable address sanitizer for MSVC targets using INFERASANLIBS linker flag)
- #119026 (std::net::bind using -1 for openbsd which in turn sets it to somaxconn.)
- #119195 (Make named_asm_labels lint not trigger on unicode and trigger on format args)
- #119204 (macro_rules: Less hacky heuristic for using `tt` metavariable spans)
- #119362 (Make `derive(Trait)` suggestion more accurate)
- #119397 (Recover parentheses in range patterns)
- #119417 (Uplift some miscellaneous coroutine-specific machinery into `check_closure`)
- #119539 (Fix typos)
- #119540 (Don't synthesize host effect args inside trait object types)
- #119555 (Add codegen test for RVO on MaybeUninit)
r? `@ghost`
`@rustbot` modify labels: rollup