Commit Graph

76 Commits

Author SHA1 Message Date
Nicholas Nethercote
99f5c79d64 Shrink Token.
From 72 bytes to 12 bytes (on x86-64).

There are two parts to this:
- Changing various source code offsets from 64-bit to 32-bit. This is
  not a problem because the rest of rustc also uses 32-bit source code
  offsets. This means `Token` is no longer `Copy` but this causes no
  problems.
- Removing the `RawStrError` from `LiteralKind`. Raw string literal
  invalidity is now indicated by a `None` value within
  `RawStr`/`RawByteStr`, and the new `validate_raw_str` function can be
  used to re-lex an invalid raw string literal to get the `RawStrError`.

There is one very small change in behaviour. Previously, if a raw string
literal matched both the `InvalidStarter` and `TooManyHashes` cases,
the latter would override the former. This has now changed, because
`raw_double_quoted_string` now uses `?` and so returns immediately upon
detecting the `InvalidStarter` case. I think this is a slight
improvement to report the earlier-detected error, and it explains the
change in the `test_too_many_hashes` test.

The commit also removes a couple of comments that refer to #77629 and
say that the size of these types doesn't affect performance. These
comments are wrong, though the performance effect is small.
2022-08-01 08:53:04 +10:00
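A minimal sketch of the raw-string part of 99f5c79d64, under stated assumptions: the error enum below is a simplified stand-in for `RawStrError`, the input is the text after the `r` prefix, and `lex_raw_str` and `raw_string_unvalidated` are hypothetical helper names. Only `raw_double_quoted_string` and `validate_raw_str` are named in the commit message, and their real signatures may differ. The sketch shows invalidity carried as `None`, re-lexing to recover the error, and `?` returning early on `InvalidStarter`.

```rust
use std::convert::TryFrom;

/// Simplified stand-in for the lexer's `RawStrError` (fields are assumptions).
#[derive(Debug, PartialEq)]
enum RawStrError {
    InvalidStarter { bad_char: char },
    NoTerminator { expected: u32, found: u32 },
    TooManyHashes { found: u32 },
}

/// Hypothetical helper: counts the opening hashes and looks for a matching
/// terminator, without checking that the hash count fits in a `u8`.
fn raw_string_unvalidated(s: &str) -> Result<u32, RawStrError> {
    let bytes = s.as_bytes();
    let n_start = bytes.iter().take_while(|&&b| b == b'#').count();
    match s[n_start..].chars().next() {
        Some('"') => {}
        other => {
            return Err(RawStrError::InvalidStarter { bad_char: other.unwrap_or('\0') });
        }
    }
    let body = &bytes[n_start + 1..];
    let mut max_found: usize = 0;
    for (i, &b) in body.iter().enumerate() {
        if b == b'"' {
            let n_end = body[i + 1..]
                .iter()
                .take(n_start)
                .take_while(|&&h| h == b'#')
                .count();
            if n_end == n_start {
                return Ok(n_start as u32);
            }
            max_found = max_found.max(n_end);
        }
    }
    Err(RawStrError::NoTerminator { expected: n_start as u32, found: max_found as u32 })
}

/// `?` returns immediately on `InvalidStarter`, so `TooManyHashes` can no
/// longer override it.
fn raw_double_quoted_string(s: &str) -> Result<u8, RawStrError> {
    let n_hashes = raw_string_unvalidated(s)?;
    u8::try_from(n_hashes).map_err(|_| RawStrError::TooManyHashes { found: n_hashes })
}

/// What the token carries: `None` marks an invalid raw string literal.
fn lex_raw_str(s: &str) -> Option<u8> {
    raw_double_quoted_string(s).ok()
}

/// Re-lexes an invalid literal to recover the detailed error.
fn validate_raw_str(s: &str) -> Result<(), RawStrError> {
    raw_double_quoted_string(s).map(|_| ())
}
```

Dropping the error from the token keeps the common (valid) path small; the cost is a second scan, paid only when an error has to be reported.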
Nicholas Nethercote
e6b9fccfb1 Add a size assertion for Token. 2022-08-01 08:27:43 +10:00
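One way to write such a size assertion is a compile-time `const` check. The sketch below uses two hypothetical `#[repr(C)]` structs rather than the real `Token`, so the sizes are illustrative only; it does not reproduce the macro or the numbers used in the commit.

```rust
use std::mem::size_of;

/// Hypothetical token with a 64-bit length field.
#[allow(dead_code)]
#[repr(C)]
struct WideToken {
    kind: u8,
    len: u64,
}

/// Hypothetical token with a 32-bit length field.
#[allow(dead_code)]
#[repr(C)]
struct NarrowToken {
    kind: u8,
    len: u32,
}

// Compilation fails if either condition stops holding, so an accidental
// size regression is caught at build time rather than in a benchmark.
const _: () = assert!(size_of::<NarrowToken>() == 8);
const _: () = assert!(size_of::<NarrowToken>() < size_of::<WideToken>());
```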
Nicholas Nethercote
ddf62b5bd4 Inline TokenStreamBuilder::push.
Because it's small and hot.
2022-08-01 08:27:41 +10:00
Nicholas Nethercote
c01a36d5e4 Avoid an unnecessary return. 2022-08-01 08:25:56 +10:00
Nicholas Nethercote
bd23d68b41 Remove StringReader::end_src_index.
It's not needed, always being set to the end of the text.
2022-08-01 08:11:15 +10:00
Nicholas Nethercote
55185992d6 Improve shebang handling.
Avoid doing stuff until it's necessary.
2022-08-01 08:11:15 +10:00
Nicholas Nethercote
332dffb1f9 Remove TreeAndSpacing.
A `TokenStream` contains a `Lrc<Vec<(TokenTree, Spacing)>>`. But this is
not quite right. `Spacing` makes sense for `TokenTree::Token`, but does
not make sense for `TokenTree::Delimited`, because a
`TokenTree::Delimited` cannot be joined with another `TokenTree`.

This commit fixes this problem, by adding `Spacing` to `TokenTree::Token`,
changing `TokenStream` to contain a `Lrc<Vec<TokenTree>>`, and removing the
`TreeAndSpacing` typedef.

The commit removes these two impls:
- `impl From<TokenTree> for TokenStream`
- `impl From<TokenTree> for TreeAndSpacing`

These were useful, but also resulted in code with many `.into()` calls
that was hard to read, particularly for anyone not highly familiar with
the relevant types. This commit makes some other changes to compensate:
- `TokenTree::token()` becomes `TokenTree::token_{alone,joint}()`.
- `TokenStream::token_{alone,joint}()` are added.
- `TokenStream::delimited` is added.

This results in things like this:
```rust
TokenTree::token(token::Semi, stmt.span).into()
```
changing to this:
```rust
TokenStream::token_alone(token::Semi, stmt.span)
```
This makes the type of the result, and its spacing, clearer.

These changes also simplify `Cursor` and `CursorRef`, because they no longer
need to distinguish between `next` and `next_with_spacing`.
2022-07-29 15:52:15 +10:00
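The shape described in 332dffb1f9 can be sketched with simplified types. Assumptions: `Rc` stands in for `Lrc`, tokens are plain strings, delimiters are a pair of `char`s, and spans are omitted, so the constructor signatures here are not the real ones.

```rust
use std::rc::Rc;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Spacing {
    Alone,
    Joint,
}

/// `Spacing` lives on the `Token` variant only; `Delimited` has none.
#[derive(Clone, Debug, PartialEq)]
enum TokenTree {
    Token(String, Spacing),
    Delimited(char, char, TokenStream),
}

/// The stream holds bare trees, with no `(TokenTree, Spacing)` pairs.
#[derive(Clone, Debug, PartialEq)]
struct TokenStream(Rc<Vec<TokenTree>>);

impl TokenTree {
    fn token_alone(text: &str) -> TokenTree {
        TokenTree::Token(text.to_string(), Spacing::Alone)
    }

    fn token_joint(text: &str) -> TokenTree {
        TokenTree::Token(text.to_string(), Spacing::Joint)
    }
}

impl TokenStream {
    fn new(trees: Vec<TokenTree>) -> TokenStream {
        TokenStream(Rc::new(trees))
    }

    /// Builds a one-token stream directly, with no `.into()` at the call site.
    fn token_alone(text: &str) -> TokenStream {
        TokenStream::new(vec![TokenTree::token_alone(text)])
    }

    fn token_joint(text: &str) -> TokenStream {
        TokenStream::new(vec![TokenTree::token_joint(text)])
    }

    fn delimited(open: char, close: char, inner: TokenStream) -> TokenStream {
        TokenStream::new(vec![TokenTree::Delimited(open, close, inner)])
    }
}
```

With named constructors, a call such as `TokenStream::token_alone(";")` states both the result type and the spacing, which is the readability gain the commit message describes.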
Takayuki Maeda
77d6176e69 remove unnecessary to_string and String::new 2022-06-13 15:48:40 +09:00
Jacob Pratt
49c82f31a8 Remove crate visibility usage in compiler 2022-05-20 20:04:54 -04:00
David Wood
73fa217bc1 errors: span_suggestion takes impl ToString
Change `span_suggestion` (and variants) to take `impl ToString` rather
than `String` for the suggested code, as this simplifies the
requirements on the diagnostic derive.

Signed-off-by: David Wood <david.wood@huawei.com>
2022-04-29 02:05:20 +01:00
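A minimal sketch of the signature change in 73fa217bc1, assuming a hypothetical free function and `Suggestion` struct; the real `span_suggestion` is a diagnostic-builder method with more parameters (span, applicability).

```rust
/// Hypothetical record of a suggestion.
#[derive(Debug, PartialEq)]
struct Suggestion {
    msg: String,
    code: String,
}

/// Accepting `impl ToString` lets callers pass `&str`, `String`, `char`,
/// or any `Display` type without converting first.
fn span_suggestion(msg: &str, suggestion: impl ToString) -> Suggestion {
    Suggestion {
        msg: msg.to_string(),
        code: suggestion.to_string(),
    }
}
```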
Vadim Petrochenkov
2733ec1be3 rustc_ast: Harmonize delimiter naming with proc_macro::Delimiter 2022-04-28 10:04:29 +03:00
Ellen
f697955c1e tut tut tut 2022-04-27 08:51:33 +01:00
Dylan DPC
946d76ec0e Rollup merge of #95859 - rainy-me:unterminated-nested-block-comment, r=petrochenkov
Improve diagnostics for unterminated nested block comment

close #95283

(This is my first time messing around with the Rust compiler, so I might have gotten a lot of things wrong... 🙇 )
2022-04-16 07:12:44 +02:00
rainy-me
1b7008dc77 refactor: change to use peekable 2022-04-14 21:18:27 +09:00
Matthias Krüger
7c2d57e0fa couple of clippy::complexity fixes 2022-04-13 22:51:34 +02:00
rainy-me
4a0f8d5175 improve diagnostics for unterminated nested block comment 2022-04-14 03:22:02 +09:00
Dylan DPC
86388f6171 Rollup merge of #95251 - GrishaVar:hashes-u16-to-u8, r=dtolnay
Reduce max hash in raw strings from u16 to u8

[Relevant discussion](https://rust-lang.zulipchat.com/#narrow/stream/237824-t-lang.2Fdoc/topic/Max.20raw.20string.20delimiters)
2022-03-31 00:26:31 +02:00
Grisha Vartanyan
759d1e6af8 Update error message & remove outdated test comment 2022-03-30 18:20:30 +02:00
Michael Goulet
928388bad2 Make fatal DiagnosticBuilder yield never 2022-03-27 22:25:32 -07:00
mark
bb8d4307eb rustc_error: make ErrorReported impossible to construct
There are a few places where we have to construct it, though, and a few
places that are more invasive to change. To do this, we create a
constructor with a long obvious name.
2022-03-16 10:35:24 -05:00
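The pattern in bb8d4307eb can be sketched as a type with a private field plus one deliberately long-named escape hatch. The constructor name and `emit_error` below are hypothetical, chosen for illustration; the commit message does not give the real names.

```rust
mod errors {
    /// The private `()` field means code outside this module cannot write
    /// `ErrorReported(())` itself.
    #[derive(Clone, Copy, Debug, PartialEq, Eq)]
    pub struct ErrorReported(());

    impl ErrorReported {
        /// Hypothetical escape hatch: the long name makes every use easy
        /// to find and hard to reach for by accident.
        pub fn claim_error_was_emitted_without_checking() -> ErrorReported {
            ErrorReported(())
        }
    }

    /// The normal way to obtain a value: actually emit an error.
    pub fn emit_error(msg: &str) -> ErrorReported {
        eprintln!("error: {}", msg);
        ErrorReported(())
    }
}
```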
mark
e489a94dee rename ErrorReported -> ErrorGuaranteed 2022-03-02 09:45:25 -06:00
Caio
e3e902bb06 4 - Make more use of let_chains
Continuation of #94376.

cc #53667
2022-02-28 07:49:56 -03:00
Eduard-Mihai Burtescu
b7e95dee65 rustc_errors: let DiagnosticBuilder::emit return a "guarantee of emission". 2022-02-23 06:38:52 +00:00
Eduard-Mihai Burtescu
02ff9e0aef Replace &mut DiagnosticBuilder, in signatures, with &mut Diagnostic. 2022-02-23 05:38:19 +00:00
est31
2ef8af6619 Adopt let else in more places 2022-02-19 17:27:43 +01:00
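For readers unfamiliar with the feature, a small `let ... else` sketch follows. The function is a hypothetical example, not code from the commit.

```rust
/// Splits `key = value`, bailing out early when there is no `=`.
fn parse_key_value(line: &str) -> Option<(&str, &str)> {
    // The `else` block must diverge (return, break, continue, or panic).
    let Some((key, value)) = line.split_once('=') else {
        return None;
    };
    Some((key.trim(), value.trim()))
}
```

Compared with a `match` or `if let`, the happy path stays unindented and the bindings remain in scope for the rest of the function.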
Matthias Krüger
637d8b89e8 Rollup merge of #94011 - est31:let_else, r=lcnr
Even more let_else adoptions

Continuation of #89933, #91018, #91481, #93046, #93590.
2022-02-17 23:00:59 +01:00
est31
60f969a4f2 Adopt let_else in even more places 2022-02-16 22:43:39 +01:00
Erin Petra Sofiya Moon
e59cda9ee1 suggest using raw string literals when invalid escapes appear
I'd guess about 70% of "bad escape" cases occur when someone meant to
use a raw string literal because they're passing it directly to
`Regex::new()`. This emits an advisory (`Applicability::MaybeIncorrect`)
`help:` suggestion that the user use an `r""` string, on top of the
normal notes about looking at the string literal documentation/spec.
2022-02-14 15:11:38 -05:00
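A rough sketch of the idea in e59cda9ee1, assuming hypothetical helper names and a simplified escape table (it does not validate the digits after `\x` or `\u`, and it gives up when the body contains a double quote instead of adding `#` delimiters).

```rust
/// Returns true if the literal body contains a backslash followed by a
/// character that does not start a recognized escape.
fn has_invalid_escape(body: &str) -> bool {
    let mut chars = body.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next() {
                Some('n' | 'r' | 't' | '\\' | '0' | '\'' | '"' | 'x' | 'u' | '\n') => {}
                _ => return true,
            }
        }
    }
    false
}

/// Suggests an `r"..."` form when an invalid escape is present.
fn suggest_raw_string(body: &str) -> Option<String> {
    if has_invalid_escape(body) && !body.contains('"') {
        Some(format!("r\"{}\"", body))
    } else {
        None
    }
}
```

A regex-like body such as `\d+` triggers the suggestion, while a body with only valid escapes such as `\n` does not.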
Esteban Kuber
d68add9ecc review comment: plural of emoji is emoji 2021-11-23 20:36:19 +00:00
Esteban Kuber
21224e6ee0 Account for confusable codepoints when recovering emoji identifiers 2021-11-23 20:36:19 +00:00
Esteban Kuber
5a68abb094 Tokenize emoji as if they were valid identifiers
In the lexer, consider emojis to be valid identifiers and reject
them later to avoid knock down parse errors.
2021-11-23 20:35:07 +00:00
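The approach in 5a68abb094 can be sketched as a lexer that accepts emoji while scanning an identifier and marks the token invalid, so that a later stage reports one error instead of a cascade. Assumptions: the token enum and function names are hypothetical, the emoji test is a crude code-point range rather than a proper Unicode property lookup, and the identifier rule is simplified (it does not reject a leading digit).

```rust
#[derive(Debug, PartialEq)]
enum Tok<'a> {
    Ident(&'a str),
    /// Lexed as one unit; rejected later with a single diagnostic.
    InvalidIdent(&'a str),
}

/// Crude approximation of "is an emoji" (assumption, not a Unicode table).
fn is_rough_emoji(c: char) -> bool {
    matches!(c as u32, 0x1F300..=0x1FAFF)
}

/// Lexes one identifier-like word from the front of `src`, returning the
/// token and the remaining input.
fn lex_ident(src: &str) -> Option<(Tok<'_>, &str)> {
    let end = src
        .char_indices()
        .find(|&(_, c)| !(c.is_alphanumeric() || c == '_' || is_rough_emoji(c)))
        .map_or(src.len(), |(i, _)| i);
    if end == 0 {
        return None;
    }
    let (word, rest) = src.split_at(end);
    let tok = if word.chars().any(is_rough_emoji) {
        Tok::InvalidIdent(word)
    } else {
        Tok::Ident(word)
    };
    Some((tok, rest))
}
```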
5225225
09e59c2875 Inline printable function 2021-11-16 08:06:31 +00:00
5225225
52199c93bb Suggest removing the non-printing characters 2021-11-16 08:06:30 +00:00
5225225
de05d3ec31 Print full char literal on error if any are non-printing 2021-11-16 08:06:30 +00:00
Hans Kratz
7885233df0 Optimize literal, doc comment lint as well, extract function. 2021-11-04 23:31:42 +01:00
Hans Kratz
a5b25a2cfa Create subslice as that leads to a smaller code size. 2021-11-04 17:03:13 +01:00
Hans Kratz
2d9f0e2c50 Optimize bidi character detection. 2021-11-04 12:01:26 +01:00
Pietro Albini
cdd3b8624f fix formatting 2021-11-01 10:39:43 +01:00
Esteban Küber
c0b134582a Lint against RTL unicode codepoints in literals and comments
Address CVE-2021-42574.
2021-10-31 13:14:04 +01:00
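A sketch of the kind of check this lint needs, assuming hypothetical function names. The nine code points are the bidirectional formatting characters associated with CVE-2021-42574; the byte prefilter is one possible fast path and is not claimed to match the compiler's implementation.

```rust
/// U+202A..=U+202E (LRE, RLE, PDF, LRO, RLO) and
/// U+2066..=U+2069 (LRI, RLI, FSI, PDI).
fn is_text_flow_control(c: char) -> bool {
    matches!(c, '\u{202A}'..='\u{202E}' | '\u{2066}'..='\u{2069}')
}

fn contains_text_flow_control_chars(s: &str) -> bool {
    // Every code point above encodes in UTF-8 as three bytes starting with
    // 0xE2, so text without that byte can be skipped cheaply.
    if !s.as_bytes().contains(&0xE2) {
        return false;
    }
    s.chars().any(is_text_flow_control)
}
```

Most source text is ASCII, so the prefilter rejects it without decoding any characters; text that does contain `0xE2` (for example a euro sign) falls through to the exact check.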
Fabian Wolff
0d8245b5b1 Improve diagnostics if a character literal contains combining marks 2021-09-10 19:23:37 +02:00
Anton Golov
a03fbfe2ff Warn when an escaped newline skips multiple lines 2021-08-11 11:35:08 +02:00
Cameron Steffen
4380056397 Rollup merge of #87659 - FabianWolff:issue-87397, r=davidtwco
Fix invalid suggestions for non-ASCII characters in byte constants

Fixes #87397.
2021-08-02 09:36:51 -05:00
bors
4e282795d7 Auto merge of #87662 - FabianWolff:rb-string, r=estebank
Suggest `br` if the unknown string prefix `rb` is found

Currently, for the following code:
```rust
fn main() {
    rb"abc";
}
```
we issue the following suggestion:
```
help: consider inserting whitespace here
  |
2 |     rb "abc";
  |       --
```
With my changes (only in edition 2021, where unknown prefixes became an error), I get:
```
help: use `br` for a raw byte string
  |
2 |     br"abc";
  |     ^^
```
2021-07-31 20:20:18 +00:00
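The decision described in this PR can be sketched as a small lookup. The function name and return shape are hypothetical; only the two help messages come from the text above.

```rust
/// Returns the help message and the replacement text for an unknown
/// literal prefix.
fn help_for_unknown_prefix(prefix: &str) -> (&'static str, String) {
    if prefix == "rb" {
        // `rb` is almost certainly a transposed `br`.
        ("use `br` for a raw byte string", "br".to_string())
    } else {
        // Generic fallback: separate the prefix from the literal.
        ("consider inserting whitespace here", format!("{} ", prefix))
    }
}
```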
Fabian Wolff
f2c9654dcd Suggest br if the unknown string prefix rb is found 2021-07-31 15:37:36 +02:00
Fabian Wolff
c1abb6f4d6 Fix invalid suggestions for non-ASCII characters in byte constants 2021-07-31 15:21:11 +02:00
Anton Golov
5d59b4412e Add warning when whitespace is not skipped after an escaped newline. 2021-07-30 16:26:39 +02:00
Ryan Levick
d4e384bc1d rename rust_2021_token_prefixes to rust_2021_prefixes_incompatible_syntax 2021-07-06 20:13:36 +02:00
Ryan Levick
81c11a212e rust_2021_token_prefixes 2021-07-06 20:13:16 +02:00
Ryan Levick
6c87772e3c Rename reserved_prefix lint to reserved_prefixes 2021-07-06 20:12:55 +02:00
Mara Bos
7490305e13 No reserved_prefix suggestion in proc macro call_site. 2021-06-26 23:11:14 +08:00