Add some regression tests
Closes #64848 (fixed by #67631)
Closes #65918 (ICE is hidden by #67000, no longer ICE)
Closes #66473 (fixed by #68084)
Closes #67550 (set mir-opt-level to 3)
r? @Centril
Optimize size/speed of Unicode datasets
The overall implementation has the same general idea as the prior approach,
which was based on a compressed trie structure, but modified to use less space
(and, coincidentally, be an overall performance improvement).
Sizes | Old | New | New/Old
-- | -- | -- | --
Alphabetic | 4616 | 2982 | 64.60%
Case_Ignorable | 3144 | 2112 | 67.18%
Cased | 2376 | 934 | 39.31%
Cc | 19 | 43 | 226.32%
Grapheme_Extend | 3072 | 1734 | 56.45%
Lowercase | 2328 | 985 | 42.31%
N | 2648 | 1239 | 46.79%
Uppercase | 1978 | 934 | 47.22%
White_Space | 241 | 140 | 58.09%
| | |
Total | 20422 | 11103 | 54.37%
This table shows the size of the old and new tables in bytes. The most important
of these tables is "Grapheme_Extend", as it is present in essentially all Rust
programs due to being called from `str`'s Debug impl (`char::escape_debug`). In
a representative case given by this [blog post] for the embedded world, this PR
shrinks the final binary by 1,604 bytes, from 14,440 to 12,836.
The performance of these new tables, based on a (rough) benchmark that linearly
scans the entire valid set of chars and queries each `is_*` function, is roughly
50% better, though in some cases it is either on par or slightly (3-5%) worse. In
practice, I believe the size benefits of this PR are the main concern. The new
implementation has been tested to be equivalent to the current nightly in terms
of returned values on the set of valid chars.
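
The equivalence check described above can be sketched as follows. This is only an illustration, not the compression scheme from this PR: it swaps in a plain sorted range table with binary search, and the names `WHITE_SPACE`, `lookup`, and `count_mismatches` are hypothetical. It assumes that `char::is_whitespace` from the standard library is the reference implementation of the White_Space property.

```rust
use std::cmp::Ordering;

/// Sorted, non-overlapping inclusive ranges for White_Space
/// (hypothetical stand-in table, not the generated output of this PR).
const WHITE_SPACE: &[(u32, u32)] = &[
    (0x0009, 0x000D),
    (0x0020, 0x0020),
    (0x0085, 0x0085),
    (0x00A0, 0x00A0),
    (0x1680, 0x1680),
    (0x2000, 0x200A),
    (0x2028, 0x2029),
    (0x202F, 0x202F),
    (0x205F, 0x205F),
    (0x3000, 0x3000),
];

/// Binary search for the range containing `c`, if any.
fn lookup(table: &[(u32, u32)], c: char) -> bool {
    let cp = c as u32;
    table
        .binary_search_by(|&(lo, hi)| {
            if hi < cp {
                Ordering::Less
            } else if lo > cp {
                Ordering::Greater
            } else {
                Ordering::Equal
            }
        })
        .is_ok()
}

/// Linearly scan every valid `char` and count disagreements
/// between a candidate lookup and a reference lookup.
fn count_mismatches(
    candidate: impl Fn(char) -> bool,
    reference: impl Fn(char) -> bool,
) -> usize {
    (0..=0x10FFFFu32)
        .filter_map(char::from_u32) // skips the surrogate range
        .filter(|&c| candidate(c) != reference(c))
        .count()
}
```

Calling `count_mismatches(|c| lookup(WHITE_SPACE, c), char::is_whitespace)` should return zero if the table matches the standard library. The same scan loop, wrapped in a timer, gives the kind of rough benchmark mentioned above.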
A (relatively) high-level explanation of the specific compression scheme used
can be found [in the generator].
This is split into three commits -- the first adds the generator which produces
the Rust code for the tables, the second adds support code for the lookup, and
the third actually swaps the current implementation out for the new one.
[blog post]: https://jamesmunns.com/blog/fmt-unreasonably-expensive/
[in the generator]: https://github.com/Mark-Simulacrum/rust/blob/unicode-tables/src/tools/unicode-table-generator/src/raw_emitter.rs
Promoteds can contain raw pointers, but these must still only point to immutable allocations
Fixes #67601
r? @RalfJung
cc @wesleywiser: to avoid changing behaviour in this PR, const prop uses the constant rules for interning, but there is now at least an explicit mode for it, so we can revisit this in the future.
Rollup of 12 pull requests
Successful merges:
- #67784 (Reset Formatter flags on exit from pad_integral)
- #67914 (Don't run const propagation on items with inconsistent bounds)
- #68141 (use winapi for non-stdlib Windows bindings)
- #68211 (Add failing example for E0170 explanation)
- #68219 (Untangle ZST validation from integer validation and generalize it to all zsts)
- #68222 (Update the wasi-libc bundled with libstd)
- #68226 (Avoid calling tcx.hir().get() on CRATE_HIR_ID)
- #68227 (Update to a version of cmake with windows arm64 support)
- #68229 (Update iovec to a version with no winapi dependency)
- #68230 (Update libssh2-sys to a version that can build for aarch64-pc-windows…)
- #68231 (Better support for cross compilation on Windows.)
- #68233 (Update compiler_builtins with changes to fix 128 bit integer remainder for aarch64 windows.)
Failed merges:
r? @ghost
Update compiler_builtins with changes to fix 128 bit integer remainder for aarch64 windows.
I have been investigating enabling panic=unwind for aarch64-pc-windows-msvc (see #65313) and building rustc and cargo hosted on aarch64-pc-windows-msvc.
Better support for cross compilation on Windows.
I have been investigating enabling panic=unwind for aarch64-pc-windows-msvc (see #65313) and building rustc and cargo hosted on aarch64-pc-windows-msvc.
Without the libpath changes we were trying to link a mix of amd64 and arm64 binaries.
Without the cmake system name change, the llvm build was trying to run an arm64 build tool on the x86_64 build machine.
That said, I haven't tested all different combinations here and am very open to resolving this a different way.
Update libssh2-sys to a version that can build for aarch64-pc-windows…
I have been investigating enabling panic=unwind for aarch64-pc-windows-msvc (see #65313) and building rustc and cargo hosted on aarch64-pc-windows-msvc.
Update iovec to a version with no winapi dependency
I have been investigating enabling panic=unwind for aarch64-pc-windows-msvc (see #65313) and building rustc and cargo hosted on aarch64-pc-windows-msvc.
Update to a version of cmake with windows arm64 support
I have been investigating enabling panic=unwind for aarch64-pc-windows-msvc (see #65313) and building rustc and cargo hosted on aarch64-pc-windows-msvc.
Avoid calling tcx.hir().get() on CRATE_HIR_ID
This was causing an ICE when enabling trace logging for an unrelated
module, since the arguments to `trace!` ended up getting evaluated.
Don't run const propagation on items with inconsistent bounds
Fixes #67696
Using `#![feature(trivial_bounds)]`, it's possible to write functions
with unsatisfiable 'where' clauses, making them uncallable. However, the
user can act as if these 'where' clauses are true inside the body of the
function, leading to code that would normally be impossible to write.
Since const propagation can run even without any user-written calls to a
function, we need to explicitly check for these uncallable functions.
Reset Formatter flags on exit from pad_integral
This fixes a bug where after calling pad_integral with appropriate flags, the
fill and alignment flags would be set to '0' and 'Right' and left as such even
after exiting pad_integral, which meant that future calls on the same Formatter
would get incorrect flags reported.
This is quite difficult to observe in practice, as almost all formatting
implementations don't call `Display::fmt` directly, but rather use
`write!` or a similar macro, which means that they cannot observe the effects of
the wrong flags (as `write!` creates a fresh Formatter instance). However, we
include a test case.
A manual check leads me to believe this is the only case where we failed to reset the flags appropriately, but I could have missed something.
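
A minimal sketch of how the stale flags could be observed, assuming a hand-written `Display` impl that forwards to an integer's `Display::fmt` on the same `Formatter`. The `FlagProbe` type is hypothetical and is not the test case added by this PR. It records the fill and alignment before and after the inner call; with the fix, both snapshots should agree.

```rust
use std::cell::Cell;
use std::fmt::{self, Alignment, Display, Formatter};

/// Hypothetical helper that snapshots (fill, align) around an inner
/// integer formatting call made on the *same* Formatter.
#[derive(Default)]
struct FlagProbe {
    before: Cell<Option<(char, Option<Alignment>)>>,
    after: Cell<Option<(char, Option<Alignment>)>>,
}

impl Display for FlagProbe {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        self.before.set(Some((f.fill(), f.align())));
        // Integer formatting goes through `pad_integral`; with `{:08}` the
        // zero-padding path temporarily switches fill to '0' and align to Right.
        Display::fmt(&1u32, f)?;
        self.after.set(Some((f.fill(), f.align())));
        Ok(())
    }
}
```

Formatting a `FlagProbe` with `format!("{:08}", probe)` and then comparing `probe.before.get()` with `probe.after.get()` shows whether the flags were restored. Before the fix, the second snapshot would be expected to report `'0'` and `Some(Alignment::Right)`.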
Add unreachable propagation mir optimization pass
@oli-obk suggested we create a MIR pass that optimizes away basic blocks that lead only to basic blocks with terminator kind **unreachable**. This is a first take on this, which we started with @gilescope at RustFest Impl Days.
The test currently fails when the compiled program runs (undefined behaviour). Is there a way to avoid running the compiled program?
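
To illustrate the idea, here is a toy sketch of unreachable propagation on a simplified control-flow graph. It does not use rustc's MIR types: the `Terminator` enum and `propagate_unreachable` function are hypothetical, blocks are identified by index, and statements inside blocks are ignored entirely, which the real pass cannot do.

```rust
/// Toy terminator kinds (an assumption; far simpler than real MIR).
#[derive(Debug, Clone, PartialEq)]
enum Terminator {
    Goto(usize),
    Switch(Vec<usize>),
    Return,
    Unreachable,
}

/// Repeatedly rewrite blocks whose successors are unreachable until
/// nothing changes. `blocks[i]` is the terminator of block `i`.
fn propagate_unreachable(blocks: &mut [Terminator]) {
    loop {
        let mut changed = false;
        for i in 0..blocks.len() {
            let replacement = match &blocks[i] {
                // A jump into an unreachable block is itself unreachable.
                Terminator::Goto(t) if blocks[*t] == Terminator::Unreachable => {
                    Some(Terminator::Unreachable)
                }
                // Drop unreachable switch targets; collapse if possible.
                Terminator::Switch(targets) => {
                    let live: Vec<usize> = targets
                        .iter()
                        .copied()
                        .filter(|&t| blocks[t] != Terminator::Unreachable)
                        .collect();
                    if live.len() == targets.len() {
                        None
                    } else if live.is_empty() {
                        Some(Terminator::Unreachable)
                    } else if live.len() == 1 {
                        Some(Terminator::Goto(live[0]))
                    } else {
                        Some(Terminator::Switch(live))
                    }
                }
                _ => None,
            };
            if let Some(term) = replacement {
                blocks[i] = term;
                changed = true;
            }
        }
        if !changed {
            break;
        }
    }
}
```

For the graph `[Switch([1, 2]), Goto(3), Return, Unreachable]`, block 1 becomes `Unreachable` and block 0 collapses to `Goto(2)`. The fixpoint loop is the simplest way to handle chains of blocks; a real pass would more likely use a worklist or a single traversal in a suitable order.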
perf: Eagerly convert literals to consts
Previously, even literal constants were being converted to an `Unevaluated` constant for evaluation later. This seems unnecessary, as no more information is needed to convert the literal to a MIR constant.
Hopefully this will also minimise the performance impact of #67717, as far fewer constant evaluations are needed.
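
The difference between the two strategies can be sketched with a toy model. None of these types are rustc's: `Expr`, `Const`, `lower_lazy`, `lower_eager`, and `eval` are hypothetical names, and the counter merely stands in for the cost of a deferred constant evaluation.

```rust
/// Toy expression type (an assumption, not rustc's HIR or MIR).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Lit(i64),
    Neg(Box<Expr>),
}

/// Toy constant: either already a value, or deferred for later evaluation.
#[derive(Debug, PartialEq)]
enum Const {
    Value(i64),
    Unevaluated(Expr),
}

/// Old behaviour in this model: everything is deferred, even literals.
fn lower_lazy(e: Expr) -> Const {
    Const::Unevaluated(e)
}

/// New behaviour in this model: literals become values immediately.
fn lower_eager(e: Expr) -> Const {
    match e {
        Expr::Lit(v) => Const::Value(v),
        other => Const::Unevaluated(other),
    }
}

fn eval_expr(e: &Expr) -> i64 {
    match e {
        Expr::Lit(v) => *v,
        Expr::Neg(inner) => -eval_expr(inner),
    }
}

/// Resolve a constant, counting how many deferred evaluations were needed.
fn eval(c: Const, evaluations: &mut usize) -> i64 {
    match c {
        Const::Value(v) => v,
        Const::Unevaluated(e) => {
            *evaluations += 1;
            eval_expr(&e)
        }
    }
}
```

In this model, lowering `Expr::Lit(7)` eagerly needs zero deferred evaluations, while the lazy path needs one for the same result. Non-literal expressions are deferred in both cases.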