Originally, this was kinda half-allowed. There were some primitive
checks in place that looked at the span to see whether the input was
likely a literal. These "source literal" checks are needed because the
spans created during `format_args` parsing only make sense when it is
indeed a literal that was written in the source code directly.
This is orthogonal to the restriction that the first argument must be a
"direct literal", not being exanpanded from macros. This restriction was
imposed by [RFC 2795] on the basis of being too confusing. But this was
only concerned with the argument of the invocation being a literal, not
whether it was a source literal (maybe in spirit it meant it being a
source literal, this is not clear to me).
Since the original check only really cared about source literals (which
is good enough to deny the `format_args!(concat!())` example), macros
expanding to `format_args` invocations were able to use implicit
captures if they spanned the string in a way that lead back to a source
string.
The "source literal" checks were not strict enough and caused ICEs in
certain cases (see # 106191 (the space is intended to avoid spammy
backreferences)). So I tightened it up in # 106195 to really only work
if it's a direct source literal.
This caused the `indoc` crate to break. `indoc` transformed the source
literal by removing whitespace, which made it not a "source literal"
anymore (which is required to fix the ICE). But since `indoc` spanned
the literal in ways that made the old check think that it's a literal,
it was able to use implicit captures (which is useful and nice for the
users of `indoc`).
This commit properly seperates the previously introduced concepts of
"source literal" and "direct literal" and therefore allows `indoc`
invocations, which don't create "source literals" to use implicit
captures again.
[RFC 2795]: https://rust-lang.github.io/rfcs/2795-format-args-implicit-identifiers.html#macro-hygiene
tidy: enforce comment blocks to have an even number of backticks
After PR #108694, most unmatched backticks in `compiler/` comments have been eliminated. This PR adds a tidy lint to ensure no new unmatched backticks are added, and either addresses the lint in the remaining instances it found, or allows it.
Very often, backtick containing sections wrap around lines, for example:
```Rust
// This function takes a tuple `(Vec<String>,
// Box<[u8]>)` and transforms it into `Vec<u8>`.
```
The lint is implemented to work on top of blocks, counting each line with a `//` into a block, and counting if there are an odd or even number of backticks in the entire block, instead of looking at just a single line.
This makes it easier to open the messages file while developing on features.
The commit was the result of automatted changes:
for p in compiler/rustc_*; do mv $p/locales/en-US.ftl $p/messages.ftl; rmdir $p/locales; done
for p in compiler/rustc_*; do sed -i "s#\.\./locales/en-US.ftl#../messages.ftl#" $p/src/lib.rs; done
allow negative numeric literals in `concat!`
Fixes#106837
While *technically* negative numeric literals are implemented as unary operations, users can reasonably expect that negative literals are treated the same as positive literals.
Relax ordering rules for `asm!` operands
The `asm!` and `global_asm!` macros require their operands to appear strictly in the following order:
- Template strings
- Positional operands
- Named operands
- Explicit register operands
- `clobber_abi`
- `options`
This is overly strict and can be inconvienent when building complex `asm!` statements with macros. This PR relaxes the ordering requirements as follows:
- Template strings must still come before all other operands.
- Positional operands must still come before named and explicit register operands.
- Named and explicit register operands can be freely mixed.
- `options` and `clobber_abi` can appear in any position after the template strings.
r? ```````@joshtriplett```````
This resolves an inconsistency in naming style for functions
on the parser, between functions parsing specific kinds of items
and those for expressions, favoring the parse_item_[sth] style
used by functions for items. There are multiple advantages
of that style:
* functions of both categories are collected in the same place
in the rustdoc output.
* it helps with autocompletion, as you can narrow down your
search for a function to those about expressions.
* it mirrors rust's path syntax where less specific things
come first, then it gets more specific, i.e.
std::collections::hash_map::Entry
The disadvantage is that it doesn't "read like a sentence"
any more, but I think the advantages weigh more greatly.
This change was mostly application of this command:
sed -i -E 's/(fn |\.)parse_([[:alnum:]_]+)_expr/\1parse_expr_\2/' compiler/rustc_parse/src/parser/*.rs
Plus very minor fixes outside of rustc_parse, and an invocation
of x fmt.
Instead of loading the Fluent resources for every crate in
`rustc_error_messages`, each crate generates typed identifiers for its
own diagnostics and creates a static which are pulled together in the
`rustc_driver` crate and provided to the diagnostic emitter.
Signed-off-by: David Wood <david.wood@huawei.com>
If you do `derive(PartialEq)` on a packed struct, the output shown by
`-Zunpretty=expanded` includes expressions like this:
```
{ self.x } == { other.x }
```
This is invalid syntax. This doesn't break compilation, because the AST
nodes are constructed within the compiler. But it does mean anyone using
`-Zunpretty=expanded` output as a guide for hand-written impls could get
a nasty surprise.
This commit fixes things by instead using this form:
```
({ self.x }) == ({ other.x })
```
Currently, deriving on packed structs has some non-trivial limitations,
related to the fact that taking references on unaligned fields is UB.
The current approach to field accesses in derived code:
- Normal case: `&self.0`
- In a packed struct that derives `Copy`: `&{self.0}`
- In a packed struct that doesn't derive `Copy`: `&self.0`
Plus, we disallow deriving any builtin traits other than `Default` for any
packed generic type, because it's possible that there might be
misaligned fields. This is a fairly broad restriction.
Plus, we disallow deriving any builtin traits other than `Default` for most
packed types that don't derive `Copy`. (The exceptions are those where the
alignments inherently satisfy the packing, e.g. in a type with
`repr(packed(N))` where all the fields have alignments of `N` or less
anyway. Such types are pretty strange, because the `packed` attribute is
not having any effect.)
This commit introduces a new, simpler approach to field accesses:
- Normal case: `&self.0`
- In a packed struct: `&{self.0}`
In the latter case, this requires that all fields impl `Copy`, which is
a new restriction. This means that the following example compiles under
the old approach and doesn't compile under the new approach.
```
#[derive(Debug)]
struct NonCopy(u8);
#[derive(Debug)
#[repr(packed)]
struct MyType(NonCopy);
```
(Note that the old approach's support for cases like this was brittle.
Changing the `u8` to a `u16` would be enough to stop it working. So not
much capability is lost here.)
However, the other constraints from the old rules are removed. We can now
derive builtin traits for packed generic structs like this:
```
trait Trait { type A; }
#[derive(Hash)]
#[repr(packed)]
pub struct Foo<T: Trait>(T, T::A);
```
To allow this, we add a `T: Copy` bound in the derived impl and a `T::A:
Copy` bound in where clauses. So `T` and `T::A` must impl `Copy`.
We can now also derive builtin traits for packed structs that don't derive
`Copy`, so long as the fields impl `Copy`:
```
#[derive(Hash)]
#[repr(packed)]
pub struct Foo(u32);
```
This includes types that hand-impl `Copy` rather than deriving it, such as the
following, that show up in winapi-0.2:
```
#[derive(Clone)]
#[repr(packed)]
struct MyType(i32);
impl Copy for MyType {}
```
The new approach is simpler to understand and implement, and it avoids
the need for the `unsafe_derive_on_repr_packed` check.
One exception is required for backwards-compatibility: we allow `[u8]`
fields for now. There is a new lint for this,
`byte_slice_in_packed_struct_with_derive`.
Special-case deriving `PartialOrd` for enums with dataless variants
I was able to get slightly better codegen by flipping the derived `PartialOrd` logic for two-variant enums. I also tried to document the implementation of the derive macro to make the special-case logic a little clearer.
```rs
#[derive(PartialEq, PartialOrd)]
pub enum A<T> {
A,
B(T)
}
```
```diff
impl<T: ::core::cmp::PartialOrd> ::core::cmp::PartialOrd for A<T> {
#[inline]
fn partial_cmp(
&self,
other: &A<T>,
) -> ::core::option::Option<::core::cmp::Ordering> {
let __self_tag = ::core::intrinsics::discriminant_value(self);
let __arg1_tag = ::core::intrinsics::discriminant_value(other);
- match ::core::cmp::PartialOrd::partial_cmp(&__self_tag, &__arg1_tag) {
- ::core::option::Option::Some(::core::cmp::Ordering::Equal) => {
- match (self, other) {
- (A::B(__self_0), A::B(__arg1_0)) => {
- ::core::cmp::PartialOrd::partial_cmp(__self_0, __arg1_0)
- }
- _ => ::core::option::Option::Some(::core::cmp::Ordering::Equal),
- }
+ match (self, other) {
+ (A::B(__self_0), A::B(__arg1_0)) => {
+ ::core::cmp::PartialOrd::partial_cmp(__self_0, __arg1_0)
}
- cmp => cmp,
+ _ => ::core::cmp::PartialOrd::partial_cmp(&__self_tag, &__arg1_tag),
}
}
}
```
Godbolt: [Current](https://godbolt.org/z/GYjEzG1T8), [New](https://godbolt.org/z/GoK78qx15)
I'm not sure how common a case comparing two enums like this (such as `Option`) is, and if it's worth the slowdown of adding a special case to the derive. If it causes overall regressions it might be worth just manually implementing this for `Option`.
The `asm!` and `global_asm!` macros require their operands to appear
strictly in the following order:
- Template strings
- Positional operands
- Named operands
- Explicit register operands
- `clobber_abi`
- `options`
This is overly strict and can be inconvienent when building complex
`asm!` statements with macros. This PR relaxes the ordering requirements
as follows:
- Template strings must still come before all other operands.
- Positional operands must still come before named and explicit register
operands.
- Named and explicit register operands can be freely mixed.
- `options` and `clobber_abi` can appear in any position.
Move format_args!() into AST (and expand it during AST lowering)
Implements https://github.com/rust-lang/compiler-team/issues/541
This moves FormatArgs from rustc_builtin_macros to rustc_ast_lowering. For now, the end result is the same. But this allows for future changes to do smarter things with format_args!(). It also allows Clippy to directly access the ast::FormatArgs, making things a lot easier.
This change turns the format args types into lang items. The builtin macro used to refer to them by their path. After this change, the path is no longer relevant, making it easier to make changes in `core`.
This updates clippy to use the new language items, but this doesn't yet make clippy use the ast::FormatArgs structure that's now available. That should be done after this is merged.
Mark `proc_macro_decls_static` as always used
This would have avoided a bug in https://github.com/rust-lang/rust/pull/104860.
In practice this shouldn't matter since nothing uses the query other than the `dead_code` lint, but this isn't documented as an internal-only query so it seems nice for it to be accurate. I think for `dead_code` it doesn't matter because the relevant code is generated by `rustc_builtin_macros` and isn't linted.
I think `@tmiasko` or `@bjorn3` would be a good reviewer?
r? `@tmiasko`
This would have avoided a bug in https://github.com/rust-lang/rust/pull/104860.
In practice this shouldn't matter since nothing uses the query other than the `dead_code` lint,
but this isn't documented as an internal-only query so it seems nice for it to be accurate.
I think for `dead_code` it doesn't matter because the relevant code is generated by `rustc_builtin_macros` and isn't linted.
Remove `token::Lit` from `ast::MetaItemLit`.
Currently `ast::MetaItemLit` represents the literal kind twice. This PR removes that redundancy. Best reviewed one commit at a time.
r? `@petrochenkov`
compiler: remove unnecessary imports and qualified paths
Some of these imports were necessary before Edition 2021, others were already in the prelude.
I hope it's fine that this PR is so spread-out across files :/
This migrates everything but the `mbe` and `proc_macro` modules. It also
contains a few cleanups and drive-by/accidental diagnostic improvements
which can be seen in the diff for the UI tests.
Shrink `rustc_parse_format::Piece`
This makes both variants closer together in size (previously they were different by 208 bytes -- 16 vs 224). This may make things worse, but it's worth a try.
r? `@nnethercote`
This makes both variants closer together in size (previously they were
different by 208 bytes -- 16 vs 224). This may make things worse, but
it's worth a try.
Add `type_ascribe!` macro as placeholder syntax for type ascription
This makes it still possible to test the internal semantics of type ascription even once the `:`-syntax is removed from the parser. The macro now gets used in a bunch of UI tests that test the semantics and not syntax of type ascription.
I might have forgotten a few tests but this should hopefully be most of them. The remaining ones will certainly be found once type ascription is removed from the parser altogether.
Part of #101728
rustc_ast_lowering: Stop lowering imports into multiple items
Lower them into a single item with multiple resolutions instead.
This also allows to remove additional `NodId`s and `DefId`s related to those additional items.
`token::Lit` contains a `kind` field that indicates what kind of literal
it is. `ast::MetaItemLit` currently wraps a `token::Lit` but also has
its own `kind` field. This means that `ast::MetaItemLit` encodes the
literal kind in two different ways.
This commit changes `ast::MetaItemLit` so it no longer wraps
`token::Lit`. It now contains the `symbol` and `suffix` fields from
`token::Lit`, but not the `kind` field, eliminating the redundancy.
This is required to distinguish between cooked and raw byte string
literals in an `ast::LitKind`, without referring to an adjacent
`token::Lit`. It's a prerequisite for the next commit.
Lower them into a single item with multiple resolutions instead.
This also allows to remove additional `NodId`s and `DefId`s related to those additional items.
There is code for converting `Attribute` (syntactic) to `MetaItem`
(semantic). There is also code for the reverse direction. The reverse
direction isn't really necessary; it's currently only used when
generating attributes, e.g. in `derive` code.
This commit adds some new functions for creating `Attributes`s directly,
without involving `MetaItem`s: `mk_attr_word`, `mk_attr_name_value_str`,
`mk_attr_nested_word`, and
`ExtCtxt::attr_{word,name_value_str,nested_word}`.
These new methods replace the old functions for creating `Attribute`s:
`mk_attr_inner`, `mk_attr_outer`, and `ExtCtxt::attribute`. Those
functions took `MetaItem`s as input, and relied on many other functions
that created `MetaItems`, which are also removed: `mk_name_value_item`,
`mk_list_item`, `mk_word_item`, `mk_nested_word_item`,
`{MetaItem,MetaItemKind,NestedMetaItem}::token_trees`,
`MetaItemKind::attr_args`, `MetaItemLit::{from_lit_kind,to_token}`,
`ExtCtxt::meta_word`.
Overall this cuts more than 100 lines of code and makes thing simpler.
In `Expander::expand` the code currently uses `mk_attr_outer` to convert
a `MetaItem` to an `Attribute`, and then follows that with
`meta_item_list` which converts back. This commit avoids the unnecessary
conversions.
There was one wrinkle: the existing conversions caused the bogus `<>` on
`Default<>` to be removed. With the conversion gone, we get a second
error message about the `<>`. This is a rare case, so I think it
probably doesn't matter much.
`check_builtin_attribute` calls `parse_meta` to convert an `Attribute`
to a `MetaItem`, which it then checks. However, many callers of
`check_builtin_attribute` start with a `MetaItem`, and then convert it
to an `Attribute` by calling `cx.attribute(meta_item)`. This `MetaItem`
to `Attribute` to `MetaItem` conversion is silly.
This commit adds a new function `check_builtin_meta_item`, which can be
called instead from these call sites. `check_builtin_attribute` also now
calls it. The commit also renames `check_meta` as `check_attr` to better
match its arguments.
Use `as_deref` in compiler (but only where it makes sense)
This simplifies some code :3
(there are some changes that are not exacly `as_deref`, but more like "clever `Option`/`Result` method use")
`MacArgs` is an enum with three variants: `Empty`, `Delimited`, and `Eq`. It's
used in two ways:
- For representing attribute macro arguments (e.g. in `AttrItem`), where all
three variants are used.
- For representing function-like macros (e.g. in `MacCall` and `MacroDef`),
where only the `Delimited` variant is used.
In other words, `MacArgs` is used in two quite different places due to them
having partial overlap. I find this makes the code hard to read. It also leads
to various unreachable code paths, and allows invalid values (such as
accidentally using `MacArgs::Empty` in a `MacCall`).
This commit splits `MacArgs` in two:
- `DelimArgs` is a new struct just for the "delimited arguments" case. It is
now used in `MacCall` and `MacroDef`.
- `AttrArgs` is a renaming of the old `MacArgs` enum for the attribute macro
case. Its `Delimited` variant now contains a `DelimArgs`.
Various other related things are renamed as well.
These changes make the code clearer, avoids several unreachable paths, and
disallows the invalid values.
The current approach to field accesses in derived code:
- Normal case: `&self.0`
- In a packed struct that derives `Copy`: `&{self.0}`
- In a packed struct that doesn't derive `Copy`: `let Self(ref x) = *self`
The `let` pattern used in the third case is equivalent to the simpler
field access in the first case. This commit changes the third case to
use a field access.
The commit also combines two boolean arguments (`is_packed` and
`always_copy`) into a single field (`copy_fields`) earlier, to save
passing both around.
Instead of `ast::Lit`.
Literal lowering now happens at two different times. Expression literals
are lowered when HIR is crated. Attribute literals are lowered during
parsing.
This commit changes the language very slightly. Some programs that used
to not compile now will compile. This is because some invalid literals
that are removed by `cfg` or attribute macros will no longer trigger
errors. See this comment for more details:
https://github.com/rust-lang/rust/pull/102944#issuecomment-1277476773
Delay `include_bytes` to AST lowering
Hopefully addresses #65818.
This PR introduces a new `ExprKind::IncludedBytes` which stores the path and bytes of a file included with `include_bytes!()`. We can then create a literal from the bytes during AST lowering, which means we don't need to escape the bytes into valid UTF8 which is the cause of most of the overhead of embedding large binary blobs.
Add the `#[derive_const]` attribute
Closes#102371. This is a minimal patchset for the attribute to work. There are no restrictions on what traits this attribute applies to.
r? `````@oli-obk`````
`#[test]`: Point at return type if `Termination` bound is unsatisfied
Together with #103142 (already merged) this fully fixes#50291.
I don't consider my current solution of changing a few spans “here and there” very clean since the
failed obligation is a `FunctionArgumentObligation` and we point at a type instead of a function argument.
If you agree with me on this point, I can offer to keep the spans of the existing nodes and instead inject
`let _: AssertRetTyIsTermination<$ret_ty>;` (type to be defined in `libtest`) similar to `AssertParamIsEq` etc.
used by some built-in derive-macros.
I haven't tried that approach yet though and cannot promise that it would actually work out or
be “cleaner” for that matter.
````@rustbot```` label A-libtest A-diagnostics
r? ````@estebank````
The new implementation doesn't use weak lang items and instead changes
`#[alloc_error_handler]` to an attribute macro just like
`#[global_allocator]`.
The attribute will generate the `__rg_oom` function which is called by
the compiler-generated `__rust_alloc_error_handler`. If no `__rg_oom`
function is defined in any crate then the compiler shim will call
`__rdl_oom` in the alloc crate which will simply panic.
This also fixes link errors with `-C link-dead-code` with
`default_alloc_error_handler`: `__rg_oom` was previously defined in the
alloc crate and would attempt to reference the `oom` lang item, even if
it didn't exist. This worked as long as `__rg_oom` was excluded from
linking since it was not called.
This is a prerequisite for the stabilization of
`default_alloc_error_handler` (#102318).
Sort tests at compile time, not at startup
Recently, another Miri user was trying to run `cargo miri test` on the crate `iced-x86` with `--features=code_asm,mvex`. This configuration has a startup time of ~18 minutes. That's ~18 minutes before any tests even start to run. The fact that this crate has over 26,000 tests and Miri is slow makes a lot of code which is otherwise a bit sloppy but fine into a huge runtime issue.
Sorting the tests when the test harness is created instead of at startup time knocks just under 4 minutes out of those ~18 minutes. I have ways to remove most of the rest of the startup time, but this change requires coordinating changes of both the compiler and libtest, so I'm sending it separately.
(except for doctests, because there is no compile-time harness)
This fixes a typo first appearing in #94624
in which test-macro diagnostic uses "a" article twice.
Since I searched sources for " a a " sequences,
I also fixed the same issue in a few source files where I found it.
Signed-off-by: Petr Portnov <gh@progrm-jarvis.ru>
On later stages, the feature is already stable.
Result of running:
rg -l "feature.let_else" compiler/ src/librustdoc/ library/ | xargs sed -s -i "s#\\[feature.let_else#\\[cfg_attr\\(bootstrap, feature\\(let_else\\)#"
These two type names are long and have long matching prefixes. I find
them hard to read, especially in combinations like
`AttrAnnotatedTokenStream::new(vec![AttrAnnotatedTokenTree::Token(..)])`.
This commit renames them as `AttrToken{Stream,Tree}`.
Recently, another Miri user was trying to run `cargo miri test` on the
crate `iced-x86` with `--features=code_asm,mvex`. This configuration has
a startup time of ~18 minutes. That's ~18 minutes before any tests even
start to run. The fact that this crate has over 26,000 tests and Miri is
slow makes a lot of code which is otherwise a bit sloppy but fine into a
huge runtime issue.
Sorting the tests when the test harness is created instead of at startup
time knocks just under 4 minutes out of those ~18 minutes. I have ways
to remove most of the rest of the startup time, but this change requires
coordinating changes of both the compiler and libtest, so I'm sending it
separately.
Replace `rustc_data_structures::thin_vec::ThinVec` with `thin_vec::ThinVec`
`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.
This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.
The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
`ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
avoid some unnecessary allocations.
r? `@spastorino`
Revert let_chains stabilization
This is the revert against master, the beta revert was already done in #100538.
Bumps the stage0 compiler which already has it reverted.
Separate CountIsStar from CountIsParam in rustc_parse_format.
`rustc_parse_format`'s parser would result in the exact same output for `{:.*}` and `{:.0$}`, making it hard for diagnostics to handle these cases properly.
This splits those cases by adding a new `CountIsStar` enum variant.
This fixes#100995
Prerequisite for https://github.com/rust-lang/rust/pull/100996
`rustc_data_structures::thin_vec::ThinVec` looks like this:
```
pub struct ThinVec<T>(Option<Box<Vec<T>>>);
```
It's just a zero word if the vector is empty, but requires two
allocations if it is non-empty. So it's only usable in cases where the
vector is empty most of the time.
This commit removes it in favour of `thin_vec::ThinVec`, which is also
word-sized, but stores the length and capacity in the same allocation as
the elements. It's good in a wider variety of situation, e.g. in enum
variants where the vector is usually/always non-empty.
The commit also:
- Sorts some `Cargo.toml` dependency lists, to make additions easier.
- Sorts some `use` item lists, to make additions easier.
- Changes `clean_trait_ref_with_bindings` to take a
`ThinVec<TypeBinding>` rather than a `&[TypeBinding]`, because this
avoid some unnecessary allocations.
Fix rustc_parse_format precision & width spans
When a `precision`/`width` was `CountIsName - {:name$}` or `CountIs - {:10}` the `precision_span`/`width_span` was set to `None`
For `width` the name span in `CountIsName(_, name_span)` had its `.start` off by one
r? ``@fee1-dead`` / cc ``@PrestonFrom`` since this is similar to #99987
Migrate rustc_ast_passes diagnostics to `SessionDiagnostic` and translatable messages (first part)
Doing a full migration of the `rustc_ast_passes` crate.
Making a draft here since there's not yet a tracking issue for the migrations going on.
`@rustbot` label +A-translation
In some places we use `Vec<Attribute>` and some places we use
`ThinVec<Attribute>` (a.k.a. `AttrVec`). This results in various points
where we have to convert between `Vec` and `ThinVec`.
This commit changes the places that use `Vec<Attribute>` to use
`AttrVec`. A lot of this is mechanical and boring, but there are
some interesting parts:
- It adds a few new methods to `ThinVec`.
- It implements `MapInPlace` for `ThinVec`, and introduces a macro to
avoid the repetition of this trait for `Vec`, `SmallVec`, and
`ThinVec`.
Overall, it makes the code a little nicer, and has little effect on
performance. But it is a precursor to removing
`rustc_data_structures::thin_vec::ThinVec` and replacing it with
`thin_vec::ThinVec`, which is implemented more efficiently.