Do not hash leading zero bytes of i64 numbers in Sip128 hasher
I was poking into the stable hasher, trying to improve its performance by compressing the number of hashed bytes. First I was experimenting with LEB128, but it was painful to implement because of the many assumptions that the SipHasher makes, so I tried something simpler - just ignoring leading zero bytes. For example, if an 8-byte integer can fit into a 4-byte integer, I will just hash the four bytes.
I wonder if this could produce any hashing ambiguity. Originally I thought so, but then I struggled to find any counter-example where this could cause different values to have the same hash. I'd be glad for any examples that could be broken by this (there are some ways of mitigating it if that would be the case). It could happen if you had e.g. 2x `u8` vs 1x `u16` hashed after one another in two different runs, but that can also happen now, without this "trick". And with collections, it should be fine, because the length is included in their hash.
I gathered some statistics for common values used in the `clap` benchmark. I observed that especially `i64` often had very low values, so I started with that type, let's see what perf does on CI.
There are some tradeoffs that we can try:
1) What types to use this optimization for? `u64`, `u32`, `u16`? Locally it was a slight loss for `u64`, I noticed that its values are often quite large.
2) What byte sizes to check? E.g. we can only distinguish between `u64`/`u32` or `u64`/`u8` instead of `u64`/`u32`/`u16`/`u8` to reduce branching (with `i64` it seemed to be better to go all the way down to `u8` locally though).
(The macro was introduced because I expect that I will be trying out this "trick" for different types).
Can you please schedule a perf. run? Thanks.
r? `@the8472`
Ignore flaky `panic-short-backtrace-windows-x86_64.rs` test for now
Mitigates (but does not fix) #92000.
It has been causing a lot of spurious test failures recently that slow
down the bors queue.
Update cargo
10 commits in fcef61230c3b6213b6b0d233a36ba4ebd1649ec3..358e79fe56fe374649275ca7aebaafd57ade0e8d
2021-12-17 02:30:38 +0000 to 2022-01-04 18:39:45 +0000
- Make rmeta_required no longer depend on whether timing is enabled (rust-lang/cargo#10254)
- The first version of pull request template (rust-lang/cargo#10218)
- Stabilize the `strip` profile option, now that rustc has stable `-C strip` (rust-lang/cargo#10088)
- Update docs for windows ssh-agent. (rust-lang/cargo#10248)
- Fix typo: substract -> subtract (rust-lang/cargo#10244)
- timings: Fix tick mark alignment (rust-lang/cargo#10239)
- Remove unused lifetimes (rust-lang/cargo#10238)
- Make levenshtein distance case insensitive. (rust-lang/cargo#10224)
- [docs] Adds basic CI yaml for GitHub Actions (rust-lang/cargo#10212)
- Add function for parsing already-read manifest (rust-lang/cargo#10209)
Rollup of 7 pull requests
Successful merges:
- #91587 (core::ops::unsize: improve docs for DispatchFromDyn)
- #91907 (Allow `_` as the length of array types and repeat expressions)
- #92515 (RustWrapper: adapt for an LLVM API change)
- #92516 (Do not use deprecated -Zsymbol-mangling-version in bootstrap)
- #92530 (Move `contains` method of Option and Result lower in docs)
- #92546 (Update books)
- #92551 (rename StackPopClean::None to Root)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rename StackPopClean::None to Root
With https://github.com/rust-lang/rust/pull/90102, `StackPopClean::None` is now only used for the "root" frame of the stack, so adjust its name accordingly and add an assertion.
r? `@oli-obk`
Update books
## reference
3 commits in 06f9e61931bcf58b91dfe6c924057e42ce273ee1..f8ba2f12df60ee19b96de24ae5b73af3de8a446b
2021-12-17 07:31:40 -0800 to 2022-01-03 11:02:08 -0800
- Switch the default edition for examples to 2021 (rust-lang/reference#1125)
- Clarify behavior of x87 FP registers in inline assembly (rust-lang/reference#1126)
- Add inline assembly to the reference (rust-lang/reference#1105)
## book
36 commits in 8a0bb3c96e71927b80fa2286d7a5a5f2547c6aa4..d3740fb7aad0ea4a80ae20f64dee3a8cfc0c5c3c
2021-12-22 20:54:27 -0500 to 2022-01-03 21:46:04 -0500
- Add a concrete example of an optional value. Fixesrust-lang/book#2848.
- match isn't really an operator. Fixesrust-lang/book#2859
- Edits to edits of chapter 6
- Make fixes recommended by shellcheck
- Use shellcheck
- SIGH fix all the typos that were missed while spellcheck was broken
- SIGH add all the words to the dictionary that were missed while spellcheck was broken
- Remove test_harness from the dictionary
- sigh, the xkcd sandwich one
- Install aspell in CI
- set -eu in all bash scripts
- typo: assignement -> assignment
- Fix quotes
- Snapshot of ch12 for nostarch
- Use 'static lifetime earlier because that's more correct. Fixesrust-lang/book#2864.
- Add does_not_compile annotation to intermediate steps that don't compile
- Sidestep who provides output streams. Fixesrust-lang/book#2933.
- Remove note about primitive obsession. Fixesrust-lang/book#2863.
- Remove sentence encouraging writing tests on your own. Fixesrust-lang/book#2223.
- Bump mdBook version to 0.4.14 in workflow main.yml
- Past tense make better sense
- Past tense makes better sense
- Update the edition in all the listings' Cargo.toml
- Update the book to either say 2021 edition or not talk about editions
- Remove most of the 2018 edition text from the title page. Fixesrust-lang/book#2852.
- Fix word wrapping
- Emphasize return type is mandatory
- fix title capitalization
- Further edits to mention of --include-ignored, propagate to src
- feat: mention `cargo test -- --include-ignored`
- wording: get rid of "to from"
- interchanged position of `binary` and `library`
- Fix wrong word typo
- Further edits in rust-analyzer text
- appendix-04 IDE integration: Replaced rls by rust-analyzer
- Update link to Italian translation. Connects to rust-lang/book#2484.
## rustc-dev-guide
3 commits in 9bf0028b557798ddd07a6f652e4d0c635d3d6620..875464457c4104686faf667f47848aa7b0f0a744
2021-12-20 21:53:57 +0900 to 2021-12-28 22:17:49 -0600
- Update link to moved section (rust-lang/rustc-dev-guide#1282)
- Fix link in contributing.md (rust-lang/rustc-dev-guide#1280)
- Streamline "Getting Started" (rust-lang/rustc-dev-guide#1279)
Move `contains` method of Option and Result lower in docs
Follow-up to #92444 trying to get the `Option` and `Result` rustdocs in better shape.
This addresses the request in https://github.com/rust-lang/rust/issues/62358#issuecomment-645676285. The `contains` methods are previously too high up in the docs on both `Option` and `Result` — stuff like `ok` and `map` and `and_then` should all be featured higher than `contains`. All of those are more ubiquitously useful than `contains`.
Do not use deprecated -Zsymbol-mangling-version in bootstrap
`-Zsymbol-mangling-version` now produces warnings unconditionally. So if you want to use legacy mangling for the compiler (`new-symbol-mangling = false` in `config.toml`), the build is now littered with warnings.
However, with this change, stage 1 `std` doesn't compile:
```
error: `-C symbol-mangling-version=legacy` requires `-Z unstable-options`
```
Even after the bootstrap compiler is updated and it will support `-Csymbol-mangling-version`, the bootstrap code would either need to use `-Z` for the legacy mangling or use `-C` in combination with `-Z unstable-options` (because `-C` + legacy is not allowed without the unstable options). Should we just add `-Z unstable-options` to `std` compilation to resolve this?
Btw I use legacy mangling because the new mangling is not supported by [Hotspot](https://github.com/KDAB/hotspot).
RustWrapper: adapt for an LLVM API change
No functional changes intended.
The LLVM commit ec501f15a8 removed the signed version of `createExpression`.
This adapts the Rust LLVM wrappers accordingly.
Rollup of 7 pull requests
Successful merges:
- #91754 (Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().)
- #91884 (Constify `Box<T, A>` methods)
- #92107 (Actually set IMAGE_SCN_LNK_REMOVE for .rmeta)
- #92456 (Make the documentation of builtin macro attributes accessible)
- #92507 (Suggest single quotes when char expected, str provided)
- #92525 (intra-doc: Make `Receiver::into_iter` into a clickable link)
- #92532 (revert #92254 "Bump gsgdt to 0.1.3")
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
intra-doc: Make `Receiver::into_iter` into a clickable link
The documentation on `std::sync::mpsc::Iter` and `std::sync::mpsc::TryIter` provides links to the corresponding `Receiver` methods, unlike `std::sync::mpsc::IntoIter` does.
This was left out in c59b188aae
Related to #29377
Suggest single quotes when char expected, str provided
If a type mismatch occurs where a char is expected and a string literal is provided, suggest changing the double quotes to single quotes.
We already provide this suggestion in the other direction ( ' -> " ).
Especially useful for new rust devs used to a language in which single/double quotes are interchangeable.
Fixes#92479.
Actually set IMAGE_SCN_LNK_REMOVE for .rmeta
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().
This is an attempted fix of issue #91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when `read_to_end` or `read_to_string` were called. Both delegate to `std::io::default_read_to_end`, which creates a buffer that is of length >0, and forwards it to `std::io::Stdin::read()`. The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in `0 < N < 4`, causing the bug.
The current modification creates an internal buffer, much like the one used for the write functions
I'd also like to acknowledge the help of ``@agausmann`` and ``@hkratz`` in detecting and isolating the bug, and for suggestions that made the fix possible.
Couple disclaimers:
- Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know _where_ that test should be added.
- Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects ~or performance loss in benchmarks.~ The testing suite runs on my computer, and it does fix the issue noted in #91722.
- Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.
Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.
Edit: Closes#91722
Remove special-cased stable hashing for HIR module
All other 'containers' (e.g. `impl` blocks) hashed their contents
in the normal, order-dependent way. However, `Mod` was hashing
its contents in a (sort-of) order-independent way. However, the
exact order is exposed to consumers through `Mod.item_ids`,
and through query results like `hir_module_items`. Therefore,
stable hashing needs to take the order of items into account,
to avoid fingerprint ICEs.
Unforuntately, I was unable to directly build a reproducer
for the ICE, due to the behavior of `Fingerprint::combine_commutative`.
This operation swaps the upper and lower `u64` when constructing the
result, which makes the function non-associative. Since we start
the hashing of module items by combining `Fingerprint::ZERO` with
the first item, it's difficult to actually build an example where
changing the order of module items leaves the final hash unchanged.
However, this appears to have been hit in practice in #92218
While we're not able to reproduce it, the fact that proc-macros
are involved (which can give an entire module the same span, preventing
any span-related invalidations) makes me confident that the root
cause of that issue is our method of hashing module items.
This PR removes all of the special handling for `Mod`, instead deriving
a `HashStable` implementation. This makes `Mod` consistent with other
'contains' like `Impl`, which hash their contents through the typical
derive of `HashStable`.
Currently the output of a command like `./x.py build --stage 0 library/std` is this:
```
Updating only changed submodules
Submodules updated in 0.02 seconds
extracting [...]
Compiling [...]
Finished dev [unoptimized] target(s) in 17.53s
Building stage0 std artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Compling [...]
Finished release [optimized + debuginfo] target(s) in 21.99s
Copying stage0 std from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
Build completed successfully in 0:00:51
```
I find the part before the "Building stage0 std artifacts" a bit confusing.
After this commit, it looks like this:
```
Updating only changed submodules
Submodules updated in 0.02 seconds
extracting [...]
Building rustbuild
Compiling [...]
Finished dev [unoptimized] target(s) in 17.53s
Building stage0 std artifacts (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
Compling [...]
Finished release [optimized + debuginfo] target(s) in 21.99s
Copying stage0 std from stage0 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu / x86_64-unknown-linux-gnu)
Build completed successfully in 0:00:51
```
The "Building rustbuild" label makes it clear what the first cargo build
invocation is for. The indentation of the "Submodules updated" line
indicates it is a sub-step of a parent task.
- Do not `#[doc(hidden)]` the `#[derive]` macro attribute
- Add a link to the reference section to `derive`'s inherent docs
- Do the same for `#[test]` and `#[global_allocator]`
- Fix `GlobalAlloc` link (why is it on `core` and not `alloc`?)
- Try `no_inline`-ing the `std` reexports from `core`
- Revert "Try `no_inline`-ing the `std` reexports from `core`"
- Address PR review
- Also document the unstable macros
The documentation on `std::sync::mpsc::Iter` and `std::sync::mpsc::TryIter` provides links to the corresponding `Receiver` methods, unlike `std::sync::mpsc::IntoIter` does.
This was left out in c59b188aae
Related to #29377
Rollup of 6 pull requests
Successful merges:
- #90102 (Remove `NullOp::Box`)
- #92011 (Use field span in `rustc_macros` when emitting decode call)
- #92402 (Suggest while let x = y when encountering while x = y)
- #92409 (Couple of libtest cleanups)
- #92418 (Fix spacing in pretty printed PatKind::Struct with no fields)
- #92444 (Consolidate Result's and Option's methods into fewer impl blocks)
Failed merges:
- #92483 (Stabilize `result_cloned` and `result_copied`)
r? `@ghost`
`@rustbot` modify labels: rollup