Compile `unicode-normalization` faster
Various optimizations and cleanups aimed at improving compilation of `unicode-normalization`, which is notable for having several very large `match`es with many char ranges.
Best reviewed one commit at a time.
r? `@oli-obk`
Big const allocations hash a large amount of data for interning:
the whole bytes buffer, and the 1/8th sized initmask, with FxHash.
This hash function is made for shorter keys.
This only hashes the length, and head and tail of these buffers, to
limit possible collisions while avoiding most of the hashing work.
Previously we only show at most 6 lines of suggestions and, if the
suggestions are more than 6 lines apart, we've just showed ... at the
end. This is probably fine, but quite confusing in my opinion.
This commit is an attempt to show ... in places where there is nothing
to suggest instead, for example:
Before:
```text
help: consider enclosing expression in a block
|
3 ~ 'l: { match () { () => break 'l,
4 |
5 |
6 |
7 |
8 |
...
```
After:
```text
help: consider enclosing expression in a block
|
3 ~ 'l: { match () { () => break 'l,
4 |
...
31|
32~ } };
|
```
Make #[cfg(bootstrap)] not error in proc macros on later stages
As was discovered in https://github.com/rust-lang/rust/pull/93628#issuecomment-1154697627,
adding #[cfg(bootstrap)] to a rust-internal proc macro crate
would yield an unexpected cfg name error, at least on later
stages wher the bootstrap cfg arg wasn't set.
rustc already passes arguments to mark bootstrap as expected,
however the means of delivery through the RUSTFLAGS env var
is unable to reach proc macro crates, as described
in the issue linked in the code this commit touches.
This wouldn't be an issue for cfg args that get passed through
RUSTFLAGS, as they would never become *active* either, so
any usage of one of these flags in a proc macro's code would
legitimately yield a lint warning. But since dc30258,
rust takes extra measures to pass --cfg=bootstrap even in
proc macros, by passing it via the wrapper. Thus, we need
to send the flags to mark bootstrap as expected also from the
wrapper, so that #[cfg(bootstrap)] also works from proc macros.
I want to thank `Urgau` and `jplatte` for helping me find the cause of this. ❤️
debuginfo: Fix NatVis for Rc and Arc with unsized pointees.
Currently, the NatVis for `Rc<T>` and `Arc<T>` does not support unsized `T`. For both `Rc<T>` and `Rc<dyn SomeTrait>` the visualizers fail:
```txt
[Reference count] : -> must be used on pointers and . on structures
[Weak reference count] : -> must be used on pointers and . on structures
```
This PR fixes the visualizers. For slices we can even give show the elements, so one now gets something like:
```txt
slice_rc : { len=3 }
[Length] : 3
[Reference count] : 41
[Weak reference count] : 2
[0] : 1
[1] : 2
[2] : 3
```
r? `@wesleywiser`
Entry and_modify doc
This PR modifies the documentation for [HashMap](https://doc.rust-lang.org/std/collections/struct.HashMap.html#) and [BTreeMap](https://doc.rust-lang.org/std/collections/struct.BTreeMap.html#) by introducing examples for `and_modify`. `and_modify` is a function that tends to give more idiomatic rust code when dealing with these data structures -- yet it lacked examples and was hidden away. This PR adds that and addresses #98122.
I've made some choices which I tried to explain in my commits. This is my first time contributing to rust, so hopefully, I made the right choices.
Support lint expectations for `--force-warn` lints (RFC 2383)
Rustc has a `--force-warn` flag, which overrides lint level attributes and forces the diagnostics to always be warn. This means, that for lint expectations, the diagnostic can't be suppressed as usual. This also means that the expectation would not be fulfilled, even if a lint had been triggered in the expected scope.
This PR now also tracks the expectation ID in the `ForceWarn` level. I've also made some minor adjustments, to possibly catch more bugs and make the whole implementation more robust.
This will probably conflict with https://github.com/rust-lang/rust/pull/97718. That PR should ideally be reviewed and merged first. The conflict itself will be trivial to fix.
---
r? `@wesleywiser`
cc: `@flip1995` since you've helped with the initial review and also discussed this topic with me. 🙃
Follow-up of: https://github.com/rust-lang/rust/pull/87835
Issue: https://github.com/rust-lang/rust/issues/85549
Yeah, and that's it.
This simplifies things, but requires making `CacheEncoder` non-generic.
(This was previously merged as commit 4 in #94732 and then was reverted
in #97905 because it caused a perf regression.)
This commit removes the `a == b` early return, which isn't useful in
practice, and replaces it with one that helps matches with many ranges,
including char ranges.
The code is clearer and simpler without it. Note that the `a == b` early
return at the top of the function means the `a == b` test at the end of
the function could never succeed.
It's never executed when running the entire test suite. I think it's
because of the early return at the top of the function if `a.ty() != ty`
succeeds.
This is a performance win for `unicode-normalization`.
The commit also removes the closure, which isn't necessary. And
reformulates the comparison into a form I find easier to read.