rust/crates
bors e64610dbbe Auto merge of #17037 - davidsemakula:token-set-collisions, r=Veykril
internal: improve `TokenSet` implementation and add reserved keywords

The current `TokenSet` type represents "A bit-set of `SyntaxKind`s" as a newtype `u128`.
Internally, each `SyntaxKind` variant is flagged in the bit-set by setting the n-th least significant bit (LSB) via a bitwise left shift, where n is the variant's discriminant.
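
For illustration, a minimal sketch of that `u128`-backed scheme. The name `SmallTokenSet` and the use of raw `u16` discriminants in place of `SyntaxKind` are assumptions made to keep the example self-contained; this is not the code from `token_set.rs`.

```rust
/// Hypothetical stand-in for a `u128`-backed token set.
/// Raw `u16` discriminants are used in place of `SyntaxKind`.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct SmallTokenSet(u128);

impl SmallTokenSet {
    pub const EMPTY: SmallTokenSet = SmallTokenSet(0);

    /// Builds a set from discriminants; each must be below 128.
    pub const fn new(discriminants: &[u16]) -> SmallTokenSet {
        let mut bits = 0u128;
        let mut i = 0;
        // `while` is used because `for` loops are not allowed in `const fn`.
        while i < discriminants.len() {
            bits |= mask(discriminants[i]);
            i += 1;
        }
        SmallTokenSet(bits)
    }

    /// Union is a single bitwise OR of the two words.
    pub const fn union(self, other: SmallTokenSet) -> SmallTokenSet {
        SmallTokenSet(self.0 | other.0)
    }

    /// Membership is a single bitwise AND against the mask.
    pub const fn contains(&self, discriminant: u16) -> bool {
        self.0 & mask(discriminant) != 0
    }
}

/// Sets the n-th least significant bit, where n is the discriminant.
const fn mask(discriminant: u16) -> u128 {
    1u128 << (discriminant as usize)
}
```

With this layout, `SmallTokenSet::new(&[0, 5])` sets bits 0 and 5 only, so any discriminant of 128 or more has no bit of its own.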

Edit: This is problematic because there are currently ~121 token `SyntaxKind`s, so adding new token kinds for the missing reserved keywords pushes the number of token `SyntaxKind`s above 128, which makes this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation overflow.
~~This is problematic because there's currently 266 SyntaxKinds, so this ["mask"](7a8374c162/crates/parser/src/token_set.rs (L31-L33)) operation silently overflows in release mode.~~
~~This leads to a single flag/bit in the bit-set being shared by multiple `SyntaxKind`s~~.
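
As a rough illustration of what a shift of 128 or more does on a `u128`, the standard library's `checked_shl` and `wrapping_shl` make both outcomes explicit. The helper names below are hypothetical. To my understanding a plain `<<` panics when overflow checks are enabled and otherwise behaves like the wrapping variant, but treat that as an assumption rather than something this PR verifies.

```rust
/// Hypothetical helper: the mask for `discriminant`, or `None` when the
/// shift amount does not fit in the 128 available bits.
pub fn checked_mask(discriminant: u32) -> Option<u128> {
    1u128.checked_shl(discriminant)
}

/// Hypothetical helper: the mask with the shift amount wrapped modulo 128,
/// so two different discriminants can end up sharing one bit.
pub fn wrapping_mask(discriminant: u32) -> u128 {
    1u128.wrapping_shl(discriminant)
}
```

For example, `checked_mask(128)` is `None`, while `wrapping_mask(130)` yields the same bit as `wrapping_mask(2)`.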

This PR:
- Changes the wrapped type for `TokenSet` from `u128` to `[u64; 3]` ~~`[u*; N]` (currently `[u16; 17]`) where `u*` can be any desirable unsigned integer type and `N` is the minimum array length needed to represent all token `SyntaxKind`s without any collisions~~.
- Edit: Add assertion that `TokenSet`s only include token `SyntaxKind`s
- Edit: Add ~7 missing [reserved keywords](https://doc.rust-lang.org/stable/reference/keywords.html#reserved-keywords)
- ~~Moves the definition of the `TokenSet` type to grammar codegen in xtask, so that `N` is adjusted automatically (depending on the chosen `u*` "base" type) when new `SyntaxKind`s are added~~.
- ~~Updates the `token_set_works_for_tokens` unit test to include the `__LAST` `SyntaxKind` as a way of catching overflows in tests.~~
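
A minimal sketch of the `[u64; 3]` layout with a token-only assertion, under stated assumptions: `WideTokenSet`, the raw `u16` discriminants, and the value of `LAST_TOKEN_DISCRIMINANT` are all hypothetical placeholders, not the names or values used in the parser crate.

```rust
/// Hypothetical sketch of a bit-set backed by `[u64; 3]` (192 bits).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
pub struct WideTokenSet([u64; 3]);

/// Assumed discriminant of the last token kind (placeholder value).
const LAST_TOKEN_DISCRIMINANT: usize = 130;

impl WideTokenSet {
    pub const EMPTY: WideTokenSet = WideTokenSet([0; 3]);

    /// Builds a set from discriminants, rejecting anything past the last token kind.
    pub const fn new(discriminants: &[u16]) -> WideTokenSet {
        let mut words = [0u64; 3];
        let mut i = 0;
        while i < discriminants.len() {
            let d = discriminants[i] as usize;
            assert!(d <= LAST_TOKEN_DISCRIMINANT, "expected a token kind");
            // Pick the 64-bit word, then the bit inside that word.
            words[d / 64] |= 1 << (d % 64);
            i += 1;
        }
        WideTokenSet(words)
    }

    /// Union is a word-by-word bitwise OR.
    pub const fn union(self, other: WideTokenSet) -> WideTokenSet {
        WideTokenSet([
            self.0[0] | other.0[0],
            self.0[1] | other.0[1],
            self.0[2] | other.0[2],
        ])
    }

    /// Non-token discriminants are never members, and never index out of bounds.
    pub const fn contains(&self, discriminant: u16) -> bool {
        let d = discriminant as usize;
        d <= LAST_TOKEN_DISCRIMINANT && self.0[d / 64] & (1 << (d % 64)) != 0
    }
}
```

Three `u64` words give 192 bits, which leaves headroom beyond the ~128 token kinds while keeping `union` a fixed three-word operation.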

~~Currently `u16` is arbitrarily chosen as the `u*` "base" type mostly because it strikes a good balance (IMO) between unused bits and readability of the generated `TokenSet` code (especially the [`union` method](7a8374c162/crates/parser/src/token_set.rs (L26-L28))), but I'm open to other suggestions or a better methodology for choosing `u*` type.~~

~~I considered using a third-party crate for the bit-set, but a direct implementation seems simple enough without adding any new dependencies. I'm not strongly opposed to using a third-party crate though, if that's preferred.~~

~~Finally, I haven't had the chance to review issues to figure out whether there are any parser issues caused by collisions due to the current implementation that may be fixed by this PR - I just stumbled upon the issue while adding "new" keywords to solve #16858~~

Edit: fixes #16858
2024-04-16 07:00:12 +00:00
base-db Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
cfg internal: Thread edition through to parsing/tt-to-syntax-tree routines for macros 2024-04-14 16:02:38 +02:00
flycheck Run cargo test per workspace in the test explorer 2024-04-13 06:22:58 +03:30
hir Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
hir-def Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
hir-expand internal: Thread edition through to parsing/tt-to-syntax-tree routines for macros 2024-04-14 16:02:38 +02:00
hir-ty Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
ide Adjust package.json semantic highlighting items 2024-04-15 17:00:03 +02:00
ide-assists Auto merge of #17037 - davidsemakula:token-set-collisions, r=Veykril 2024-04-16 07:00:12 +00:00
ide-completion Revert "Auto merge of #17073 - roife:better-inline-preview, r=Veykril" 2024-04-15 18:24:15 -04:00
ide-db fix: Fix impl Trait<Self> causing stackoverflows 2024-04-15 15:41:20 +02:00
ide-diagnostics Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
ide-ssr internal: Thread edition through to parsing/tt-to-syntax-tree routines for macros 2024-04-14 16:02:38 +02:00
intern Fix new clippy lints 2024-04-01 17:55:56 +02:00
limit Simplify 2024-04-06 13:12:07 +02:00
load-cargo
mbe internal: Thread edition through to parsing/tt-to-syntax-tree routines for macros 2024-04-14 16:02:38 +02:00
parser internal: simplify TokenSet implementation 2024-04-15 17:33:09 +03:00
paths
proc-macro-api Consider ADT generic parameter defaults for unsubstituted layout calculations 2024-04-03 09:01:27 +02:00
proc-macro-srv
proc-macro-srv-cli
profile
project-model Arc CrateData::cfg_options 2024-04-06 13:55:10 +02:00
rust-analyzer Generally optimize diagnostics performance 2024-04-15 22:15:41 +02:00
salsa Fix new clippy lints 2024-04-01 17:55:56 +02:00
sourcegen Fix new clippy lints 2024-04-01 17:55:56 +02:00
span Deduplicate Edition enum 2024-04-14 15:29:01 +02:00
stdx
syntax internal: Thread edition through to parsing/tt-to-syntax-tree routines for macros 2024-04-14 16:02:38 +02:00
test-fixture Use Edition::CURRENT 2024-04-14 15:30:29 +02:00
test-utils Fix new clippy lints 2024-04-01 17:55:56 +02:00
text-edit
toolchain
tt
vfs [vfs] Don't confuse paths with source roots that have the same prefix 2024-04-08 15:48:04 -07:00
vfs-notify