Shrink Unicode tables (even more)
This shrinks the Unicode tables further, building upon the wins in #68232 (the previous counts differ due to an interim Unicode version update, see #69929).
The new data structure is around 3x slower on a benchmark that sequentially looks up every Unicode scalar value in each of the included data sets. Note that for ASCII, the exposed functions on `char` use direct branches, so ASCII retains the same performance regardless of internal optimizations (or pessimizations). Also note that the size reduction due to the skip list (which is where the performance loss comes from) is around 40%, so I believe the performance loss is acceptable, as the routines are still quite fast. Any code where this is hot should probably be using a custom data structure anyway (e.g., a raw bitset) or something optimized for frequently seen values.
This PR updates both the bitset data structure, and introduces a new data structure similar to a skip list. For more details, see the [main.rs] of the table generator, which describes both. The commits mostly work individually and document size wins.
As before, this is tested on all valid chars to have the same results as nightly (and the canonical Unicode data sets). Happily, no bugs were found.
[main.rs]: https://github.com/rust-lang/rust/blob/fb4a715e18b/src/tools/unicode-table-generator/src/main.rs
Set | Previous | New | % of old | Codepoints | Ranges |
----------------|---------:|------:|-----------:|-----------:|-------:|
Alphabetic | 3055 | 1599 | 52% | 132875 | 695 |
Case Ignorable | 2136 | 949 | 44% | 2413 | 410 |
Cased | 934 | 359 | 38% | 4286 | 141 |
Cc | 43 | 9 | 20% | 65 | 2 |
Grapheme Extend | 1774 | 813 | 46% | 1979 | 344 |
Lowercase | 985 | 867 | 88% | 2344 | 652 |
N | 1266 | 419 | 33% | 1781 | 133 |
Uppercase | 934 | 777 | 83% | 1911 | 643 |
White_Space | 140 | 37 | 26% | 25 | 10 |
----------------|----------|-------|------------|------------|--------|
Total | 11267 | 5829 | 51% | - | - |
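
To give a rough feel for why a run-based encoding is smaller but slower than a bitset, here is a minimal sketch. It is not the generator's actual format: `RunLengthSet`, `TOY`, and the toy data are hypothetical names made up for illustration, and the sketch omits the coarser index that a skip-list-like structure would use to avoid scanning from the start.

```rust
/// Hypothetical, simplified set encoding (not the real table format):
/// `runs` holds alternating lengths of "not in set" and "in set" stretches,
/// starting at code point 0.
pub struct RunLengthSet {
    runs: &'static [u32],
}

impl RunLengthSet {
    pub const fn new(runs: &'static [u32]) -> Self {
        Self { runs }
    }

    /// Linear scan over the runs. Storing lengths instead of one bit per
    /// code point is what saves space, and the scan is what costs time.
    pub fn contains(&self, c: char) -> bool {
        let needle = c as u32;
        let mut start = 0u32; // first code point covered by the current run
        for (i, &len) in self.runs.iter().enumerate() {
            let end = start + len;
            if needle < end {
                // Odd-indexed runs are the "in set" stretches.
                return i % 2 == 1;
            }
            start = end;
        }
        false // past the last run: not in the set
    }
}

/// Toy data: ASCII digits (48..58) and ASCII lowercase letters (97..123).
pub static TOY: RunLengthSet = RunLengthSet::new(&[48, 10, 39, 26]);
```

With this toy data, `TOY.contains('5')` and `TOY.contains('a')` are true while `TOY.contains('A')` is false. Four integers describe two ranges, which is why the sparse sets benefit the most.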
Create output dir in rustdoc markdown render
The `rustdoc` command fails on a standalone markdown document because the output directory (which defaults to `doc/`) is not created, even when it is specified with the `--output` argument.
This PR creates the output directory before creating the file, which avoids an unexpected and unclear error.
I am not sure about the returned error code. I did not find a table explaining them. So I simply put the same error code that is returned when `File::create` fails because they are both related to file-system errors.
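
The shape of the fix can be sketched as follows. This is a standalone illustration and not the code in rustdoc: `write_rendered` and its parameters are hypothetical names, and the sketch returns an `io::Result` instead of rustdoc's exit codes.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper: make sure `output_dir` exists, then write the
/// rendered document into it.
pub fn write_rendered(output_dir: &Path, file_name: &str, html: &str) -> io::Result<()> {
    // `create_dir_all` creates missing parent directories and succeeds
    // if the directory already exists.
    fs::create_dir_all(output_dir)?;
    let mut out = File::create(output_dir.join(file_name))?;
    out.write_all(html.as_bytes())
}
```

Both calls fail with file-system errors, which is why it seemed reasonable to report them the same way.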
Resolves #70431
submodules: update clippy from 1ff81c1b to 70b93aab
Changes:
````
remove redundant import
rustup https://github.com/rust-lang/rust/pull/68404
rustup https://github.com/rust-lang/rust/pull/69644
rustup https://github.com/rust-lang/rust/pull/70344
Move verbose_file_reads to restriction
move redundant_pub_crate to nursery
readme: explain how to run only a single lint on a codebase
Remove dependency on `matches` crate
Move useless_transmute to nursery
nursery group -> style
Update for PR feedback
Auto merge of #5314 - ehuss:remove-git2, r=flip1995
Lint for `pub(crate)` items that are not crate visible due to the visibility of the module that contains them
````
Fixes #70456
Mozilla's IRC service was shut down in March 2020. The official
instant messaging variant has been Discord for a while, and most of
the links were already replaced by #61524.
This was the last line that came up with `irc.mozilla.org` or any
combination of "irc.*#[a-z]+" in a `git grep`:
    git grep -i -E "irc.*#[a-z]+"
As there is only one other link directly to Rust's discord, I used the
same Markdown link `[rust-discord]` as in `bootstrap/README.md` to
stay consistent. This might come in handy if the chat platform changes
at a later point again.
As an aside: for those interested in the use of IRC, Mozilla's [wiki]
still offers a lot of in-depth knowledge.
[wiki]: https://wiki.mozilla.org/IRC
Move arg/constraint partition check to validation & improve recovery
- In the first commit, we move the check rejecting e.g., `<'a, Item = u8, String>` from the parser into AST validation.
- We then use this to improve the code for parsing generic arguments.
- And we add recovery for e.g., `<Item = >` (missing), `<Item = 42>` (constant), and `<Item = 'a>` (lifetime).
This is also preparatory work for supporting https://github.com/rust-lang/rust/issues/70256.
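
As a rough illustration of the partition check (not the actual AST validation code), the rule "all arguments come before all constraints" can be sketched like this. `GenericArg` and `first_misplaced_arg` are hypothetical names that stand in for the real AST types.

```rust
/// Hypothetical stand-in for an entry in an angle-bracketed argument list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GenericArg {
    /// A lifetime, type, or const argument, e.g. `'a` or `String`.
    Arg,
    /// An associated type constraint, e.g. `Item = u8`.
    Constraint,
}

/// Returns the index of the first argument that appears after a constraint,
/// or `None` if the list is correctly partitioned.
pub fn first_misplaced_arg(args: &[GenericArg]) -> Option<usize> {
    // No constraint at all means there is nothing to violate.
    let first_constraint = args.iter().position(|a| *a == GenericArg::Constraint)?;
    args[first_constraint..]
        .iter()
        .position(|a| *a == GenericArg::Arg)
        .map(|offset| first_constraint + offset)
}
```

For `<'a, Item = u8, String>` the list is `[Arg, Constraint, Arg]`, so the sketch reports index 2 (`String`). Running the check after parsing means the parser can accept the whole list and keep going.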
r? @varkor
Implement -Zlink-native-libraries
This implements a flag `-Zlink-native-libraries=yes/no`. If set to true/yes, or unspecified, then
native libraries referenced via `#[link]` attributes will be put on the linker line (i.e., unchanged
behaviour).
If `-Zlink-native-libraries=no` is specified then rustc will not add the native libraries to the link
line. The assumption is that the outer build system driving the build already knows about the native
libraries and will specify them to the linker directly (for example via `-Clink-arg=`).
Addresses issue #70093
In practice, for the two data sets that still use the bitset encoding (uppercase
and lowercase) this is not a significant win, so just drop it entirely. It costs
us about 5 bytes, and the complexity is nontrivial.
This arranges for the sparser sets (everything except lower and uppercase) to be
encoded in a significantly smaller context. However, it is also a performance
trade-off (roughly 3x slower than the bitset encoding). The 40% size reduction
is deemed to be sufficiently important to merit this performance loss,
particularly as it is unlikely that this code is hot anywhere (and if it is,
paying the memory cost for a bitset that directly represents the data seems
worthwhile).
Alphabetic : 1599 bytes (- 937 bytes)
Case_Ignorable : 949 bytes (- 822 bytes)
Cased : 359 bytes (- 429 bytes)
Cc : 9 bytes (- 15 bytes)
Grapheme_Extend: 813 bytes (- 675 bytes)
Lowercase : 863 bytes
N : 419 bytes (- 619 bytes)
Uppercase : 776 bytes
White_Space : 37 bytes (- 46 bytes)
Total table sizes: 5824 bytes (-3543 bytes)
non-exhaustive diagnostic: add note re. scrutinee type
This fixes https://github.com/rust-lang/rust/issues/67259 by adding a note:
```
= note: the matched value is of type &[i32]
```
to non-exhaustive pattern matching errors.
r? @varkor @estebank
Remove `no_integrated_as` mode.
Specifically, remove both `-Z no_integrated_as` and
`TargetOptions::no_integrated_as`. The latter was only used for the
`msp430_none_elf` platform, for which it's no longer required.
r? @alexcrichton
Move the query system to a dedicated crate
The query system `rustc::ty::query` is split out into the `rustc_query_system` crate.
Some commits are unformatted, to ease rebasing.
Based on #67761 and #69910.
r? @Zoxc
Rollup of 4 pull requests
Successful merges:
- #65222 (Proposal: `fold_self` and `try_fold_self` for Iterators)
- #69887 (clean up E0404 explanation)
- #70068 (use "gcc" instead of "cc" on *-sun-solaris systems when linking)
- #70470 (Clean up E0463 explanation)
Failed merges:
r? @ghost
use "gcc" instead of "cc" on *-sun-solaris systems when linking
On illumos and Solaris systems, Rust will use GCC as the link editor.
Rust does this by invoking "cc", which on many (Linux and perhaps BSD)
systems is generally either GCC or a GCC-compatible front-end. On
historical Solaris systems, "cc" was often the Sun Studio compiler.
This history casts a long shadow, and as such, even most modern
illumos-based operating systems tend to install GCC as "gcc", without
also making it available as "cc".
We should invoke GCC as "gcc" on such systems to ensure we get the right
compiler driver.
Proposal: `fold_self` and `try_fold_self` for Iterators
This pull request proposes & implements two new methods on Iterators: `fold_self` and `try_fold_self`. These are variants of `fold` and `try_fold` that use the first element in the iterator as the initial accumulator.
Let me know if a public feature like this requires an RFC, or if this pull request is sufficient as a place for discussion.
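
To make the intended semantics concrete, here is a minimal sketch written as a free function. The name `fold_self_sketch` is made up for this example, and the proposed methods live on `Iterator` itself, so this is an illustration of the behaviour and not the implementation in this PR.

```rust
/// Folds the iterator using its first element as the initial accumulator.
/// Returns `None` for an empty iterator, since there is no first element.
pub fn fold_self_sketch<I, F>(mut iter: I, f: F) -> Option<I::Item>
where
    I: Iterator,
    F: FnMut(I::Item, I::Item) -> I::Item,
{
    // Take the first element, or bail out with `None` if there is none.
    let first = iter.next()?;
    // Fold the remaining elements into it.
    Some(iter.fold(first, f))
}
```

For example, `fold_self_sketch(vec![1, 2, 3, 4].into_iter(), |a, b| a + b)` yields `Some(10)`. The accumulator and the items have the same type, so no separate initial value is needed.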
Enable blessing of mir opt tests
cc @rust-lang/wg-mir-opt
cc @RalfJung
Long overdue, but now you can finally just add a
```rust
// EMIT_MIR rustc.function_name.MirPassName.before.mir
```
(or `after.mir` since most of the time you want to know the MIR after a pass). A `--bless` invocation will automatically create the files for you.
I suggest we do this for all mir opt tests that have all of the MIR in their source anyway.
If you use `rustc.function.MirPass.diff` you only get the diff that the MIR pass causes on the MIR.
Fixes #67865
Test and fix gdb pretty printing more
Over time I had oversimplified the test case for #68098: it does not have an internal node to print, so it did not test what it claimed to test. I then realized that I had also failed to spot the same mistake when reviewing #70111, where it is more likely to occur in the wild. Now both test cases fail if you put back the flawed Python code.
r? @Mark-Simulacrum