Commit Graph

30 Commits

Author SHA1 Message Date
Vadim Petrochenkov
2733ec1be3 rustc_ast: Harmonize delimiter naming with proc_macro::Delimiter 2022-04-28 10:04:29 +03:00
Nicholas Nethercote
643e9f707e Introduced Cursor::next_with_spacing_ref.
This lets us clone just the parts within a `TokenTree` that need
cloning, rather than the entire thing. This is a surprisingly large
performance win, up to 4% on `async-std-1.10.0`.
2022-04-21 13:49:40 +10:00
Nicholas Nethercote
5b653c1a43 Inline Cursor::next_with_spacing. 2022-04-20 12:43:25 +10:00
Nicholas Nethercote
aefbbeec34 Inline and remove TokenTree::{open_tt,close_tt}.
They both have a single call site.
2022-04-19 17:02:48 +10:00
Nicholas Nethercote
ad566b78f2 Tweak Cursor::next_with_spacing.
This makes it more like `CursorRef::next_with_spacing`. There is no
performance effect, just a consistency improvement.
2022-04-19 17:02:48 +10:00
Yuri Astrakhan
5160f8f843 Spellchecking compiler comments
This PR cleans up the rest of the spelling mistakes in the compiler comments. This PR does not change any literal or code spelling issues.
2022-03-30 15:14:15 -04:00
Caio
ef5601b321 2 - Make more use of let_chains
Continuation of #94376.

cc #53667
2022-02-26 13:45:43 -03:00
Nicholas Nethercote
416399dc10 Make Decodable and Decoder infallible.
`Decoder` has two impls:
- opaque: this impl is already partly infallible, i.e. in some places it
  currently panics on failure (e.g. if the input is too short, or on a
  bad `Result` discriminant), and in some places it returns an error
  (e.g. on a bad `Option` discriminant). The number of places where
  either happens is surprisingly small, just because the binary
  representation has very little redundancy and a lot of input reading
  can occur even on malformed data.
- json: this impl is fully fallible, but it's only used (a) for the
  `.rlink` file production, and there's a `FIXME` comment suggesting it
  should change to a binary format, and (b) in a few tests in
  non-fundamental ways. Indeed #85993 is open to remove it entirely.

And the top-level places in the compiler that call into decoding just
abort on error anyway. So the fallibility is providing little value, and
getting rid of it leads to some non-trivial performance improvements.

Much of this commit is pretty boring and mechanical. Some notes about
a few interesting parts:
- The commit removes `Decoder::{Error,error}`.
- `InternIteratorElement::intern_with`: the impl for `T` now has the same
  optimization for small counts that the impl for `Result<T, E>` has,
  because it's now much hotter.
- Decodable impls for SmallVec, LinkedList, VecDeque now all use
  `collect`, which is nice; the one for `Vec` uses unsafe code, because
  that gave better perf on some benchmarks.
2022-01-22 10:38:31 +11:00
EliseZeroTwo
7402eb001b
fix: inner attribute followed by outer attribute causing ICE 2021-10-25 17:31:27 +02:00
klensy
56a2a2ae1f don't clone attrs 2021-05-30 22:44:40 +03:00
klensy
f43ee8ebf6 fix few typos 2021-04-19 15:57:08 +03:00
Aaron Hill
a93c4f05de
Implement token-based handling of attributes during expansion
This PR modifies the macro expansion infrastructure to handle attributes
in a fully token-based manner. As a result:

* Derives macros no longer lose spans when their input is modified
  by eager cfg-expansion. This is accomplished by performing eager
  cfg-expansion on the token stream that we pass to the derive
  proc-macro
* Inner attributes now preserve spans in all cases, including when we
  have multiple inner attributes in a row.

This is accomplished through the following changes:

* New structs `AttrAnnotatedTokenStream` and `AttrAnnotatedTokenTree` are introduced.
  These are very similar to a normal `TokenTree`, but they also track
  the position of attributes and attribute targets within the stream.
  They are built when we collect tokens during parsing.
  An `AttrAnnotatedTokenStream` is converted to a regular `TokenStream` when
  we invoke a macro.
* Token capturing and `LazyTokenStream` are modified to work with
  `AttrAnnotatedTokenStream`. A new `ReplaceRange` type is introduced, which
  is created during the parsing of a nested AST node to make the 'outer'
  AST node aware of the attributes and attribute target stored deeper in the token stream.
* When we need to perform eager cfg-expansion (either due to `#[derive]` or `#[cfg_eval]`),
we tokenize and reparse our target, capturing additional information about the locations of
`#[cfg]` and `#[cfg_attr]` attributes at any depth within the target.
This is a performance optimization, allowing us to perform less work
in the typical case where captured tokens never have eager cfg-expansion run.
2021-04-11 01:31:36 -04:00
Guillaume Gomez
f4a19ca851
Fix typo in TokenStream documentation 2021-04-05 22:58:07 +02:00
Joshua Nelson
441dc3640a Remove (lots of) dead code
Found with https://github.com/est31/warnalyzer.

Dubious changes:
- Is anyone else using rustc_apfloat? I feel weird completely deleting
  x87 support.
- Maybe some of the dead code in rustc_data_structures, in case someone
  wants to use it in the future?
- Don't change rustc_serialize

  I plan to scrap most of the json module in the near future (see
  https://github.com/rust-lang/compiler-team/issues/418) and fixing the
  tests needed more work than I expected.

TODO: check if any of the comments on the deleted code should be kept.
2021-03-27 22:16:33 -04:00
Josh Stone
72ebebe474 Use iter::zip in compiler/ 2021-03-26 09:32:31 -07:00
Harald van Dijk
95e096d623
Change x64 size checks to not apply to x32.
Rust contains various size checks conditional on target_arch = "x86_64",
but these checks were never intended to apply to
x86_64-unknown-linux-gnux32. Add target_pointer_width = "64" to the
conditions.
2021-03-06 16:02:48 +00:00
Aaron Hill
ccfc292999
Refactor token collection to capture trailing token immediately 2021-01-22 00:33:03 -05:00
pierwill
9a240e4857 Edit rustc_ast::tokenstream docs
Fix some punctuation and wording, and add intra-documentation links.
2021-01-03 11:54:56 -08:00
Aaron Hill
530a629635
Remove pretty-print/reparse hack, and add derive-specific hack 2020-12-29 09:36:42 -05:00
Aaron Hill
de88bf148b
Properly handle attributes on statements
We now collect tokens for the underlying node wrapped by `StmtKind`
instead of storing tokens directly in `Stmt`.

`LazyTokenStream` now supports capturing a trailing semicolon after it
is initially constructed. This allows us to avoid refactoring statement
parsing to wrap the parsing of the semicolon in `parse_tokens`.

Attributes on item statements
(e.g. `fn foo() { #[bar] struct MyStruct; }`) are now treated as
item attributes, not statement attributes, which is consistent with how
we handle attributes on other kinds of statements. The feature-gating
code is adjusted so that proc-macro attributes are still allowed on item
statements on stable.

Two built-in macros (`#[global_allocator]` and `#[test]`) needed to be
adjusted to support being passed `Annotatable::Stmt`.
2020-11-26 17:08:35 -05:00
Dániel Buga
a7f2bb6343 Reserve space in advance 2020-11-13 11:19:25 +01:00
Vadim Petrochenkov
19dbb02a89 Expand NtExpr tokens only in key-value attributes 2020-11-03 00:53:43 +03:00
Vadim Petrochenkov
6b63e9b990 Do not remove tokens before AST json serialization 2020-11-01 00:03:35 +03:00
Vadim Petrochenkov
d0c63bccc5 parser: Cleanup LazyTokenStream and avoid some clones
by using a named struct instead of a closure.
2020-10-31 01:56:34 +03:00
bors
22e6b9c689 Auto merge of #77250 - Aaron1011:feature/flat-token-collection, r=petrochenkov
Rewrite `collect_tokens` implementations to use a flattened buffer

Instead of trying to collect tokens at each depth, we 'flatten' the
stream as we go allong, pushing open/close delimiters to our buffer
just like regular tokens. One capturing is complete, we reconstruct a
nested `TokenTree::Delimited` structure, producing a normal
`TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is
produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This
closure stores a clone of the original `TokenCursor`, plus a record of the
number of calls to `next()/next_desugared()`. This is sufficient to reconstruct
the tokenstream seen by the callback without storing any additional state. If
the tokenstream is never used (e.g. when a captured `macro_rules!` argument is
never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the
  start/end of a delimited group.

* It can be easily extended to allow replacing tokens an an arbitrary
  'depth' by just using `Vec::splice` at the proper position. This is
  important for PR #76130, which requires us to track information about
  attributes along with tokens.

* The lazy approach to `TokenStream` construction allows us to easily
  parse an AST struct, and then decide after the fact whether we need a
  `TokenStream`. This will be useful when we start collecting tokens for
  `Attribute` - we can discard the `LazyTokenStream` if the parsed
  attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be neglibile (see
https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a
small slowdown on a few benchmarks, but it only rises above 1% for incremental
builds, where it represents a larger fraction of the much smaller instruction
count. There a ~1% speedup on a few other incremental benchmarks - my guess is
that the speedups and slowdowns will usually cancel out in practice.
2020-10-21 15:03:14 +00:00
Aaron Hill
593fdd3d45
Rewrite collect_tokens implementations to use a flattened buffer
Instead of trying to collect tokens at each depth, we 'flatten' the
stream as we go allong, pushing open/close delimiters to our buffer
just like regular tokens. One capturing is complete, we reconstruct a
nested `TokenTree::Delimited` structure, producing a normal
`TokenStream`.

The reconstructed `TokenStream` is not created immediately - instead, it is
produced on-demand by a closure (wrapped in a new `LazyTokenStream` type). This
closure stores a clone of the original `TokenCursor`, plus a record of the
number of calls to `next()/next_desugared()`. This is sufficient to reconstruct
the tokenstream seen by the callback without storing any additional state. If
the tokenstream is never used (e.g. when a captured `macro_rules!` argument is
never passed to a proc macro), we never actually create a `TokenStream`.

This implementation has a number of advantages over the previous one:

* It is significantly simpler, with no edge cases around capturing the
  start/end of a delimited group.

* It can be easily extended to allow replacing tokens an an arbitrary
  'depth' by just using `Vec::splice` at the proper position. This is
  important for PR #76130, which requires us to track information about
  attributes along with tokens.

* The lazy approach to `TokenStream` construction allows us to easily
  parse an AST struct, and then decide after the fact whether we need a
  `TokenStream`. This will be useful when we start collecting tokens for
  `Attribute` - we can discard the `LazyTokenStream` if the parsed
  attribute doesn't need tokens (e.g. is a builtin attribute).

The performance impact seems to be neglibile (see
https://github.com/rust-lang/rust/pull/77250#issuecomment-703960604). There is a
small slowdown on a few benchmarks, but it only rises above 1% for incremental
builds, where it represents a larger fraction of the much smaller instruction
count. There a ~1% speedup on a few other incremental benchmarks - my guess is
that the speedups and slowdowns will usually cancel out in practice.
2020-10-19 13:59:18 -04:00
Aaron Hill
f6aec82d4d
Avoid cloning the contents of a TokenStream in a few places 2020-10-19 12:30:41 -04:00
est31
49d4a756f1 Remove unused code from rustc_ast 2020-10-14 04:14:32 +02:00
Aleksey Kladov
ccf41dd5eb Rename IsJoint -> Spacing
To match better naming from proc-macro
2020-09-03 17:32:45 +02:00
mark
9e5f7d5631 mv compiler to compiler/ 2020-08-30 18:45:07 +03:00