Do not hash leading zero bytes of i64 numbers in Sip128 hasher
I was poking into the stable hasher, trying to improve its performance by compressing the number of hashed bytes. First I was experimenting with LEB128, but it was painful to implement because of the many assumptions that the SipHasher makes, so I tried something simpler - just ignoring leading zero bytes. For example, if an 8-byte integer can fit into a 4-byte integer, I will just hash the four bytes.
I wonder if this could produce any hashing ambiguity. Originally I thought so, but then I struggled to find any counter-example where this could cause different values to have the same hash. I'd be glad for any examples that could be broken by this (there are some ways of mitigating it if that would be the case). It could happen if you had e.g. 2x `u8` vs 1x `u16` hashed after one another in two different runs, but that can also happen now, without this "trick". And with collections, it should be fine, because the length is included in their hash.
I gathered some statistics for common values used in the `clap` benchmark. I observed that especially `i64` often had very low values, so I started with that type, let's see what perf does on CI.
There are some tradeoffs that we can try:
1) What types to use this optimization for? `u64`, `u32`, `u16`? Locally it was a slight loss for `u64`, I noticed that its values are often quite large.
2) What byte sizes to check? E.g. we can only distinguish between `u64`/`u32` or `u64`/`u8` instead of `u64`/`u32`/`u16`/`u8` to reduce branching (with `i64` it seemed to be better to go all the way down to `u8` locally though).
(The macro was introduced because I expect that I will be trying out this "trick" for different types).
Can you please schedule a perf. run? Thanks.
r? `@the8472`
Ignore flaky `panic-short-backtrace-windows-x86_64.rs` test for now
Mitigates (but does not fix) #92000.
It has been causing a lot of spurious test failures recently that slow
down the bors queue.
Update cargo
10 commits in fcef61230c3b6213b6b0d233a36ba4ebd1649ec3..358e79fe56fe374649275ca7aebaafd57ade0e8d
2021-12-17 02:30:38 +0000 to 2022-01-04 18:39:45 +0000
- Make rmeta_required no longer depend on whether timing is enabled (rust-lang/cargo#10254)
- The first version of pull request template (rust-lang/cargo#10218)
- Stabilize the `strip` profile option, now that rustc has stable `-C strip` (rust-lang/cargo#10088)
- Update docs for windows ssh-agent. (rust-lang/cargo#10248)
- Fix typo: substract -> subtract (rust-lang/cargo#10244)
- timings: Fix tick mark alignment (rust-lang/cargo#10239)
- Remove unused lifetimes (rust-lang/cargo#10238)
- Make levenshtein distance case insensitive. (rust-lang/cargo#10224)
- [docs] Adds basic CI yaml for GitHub Actions (rust-lang/cargo#10212)
- Add function for parsing already-read manifest (rust-lang/cargo#10209)
Rollup of 7 pull requests
Successful merges:
- #91587 (core::ops::unsize: improve docs for DispatchFromDyn)
- #91907 (Allow `_` as the length of array types and repeat expressions)
- #92515 (RustWrapper: adapt for an LLVM API change)
- #92516 (Do not use deprecated -Zsymbol-mangling-version in bootstrap)
- #92530 (Move `contains` method of Option and Result lower in docs)
- #92546 (Update books)
- #92551 (rename StackPopClean::None to Root)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
rename StackPopClean::None to Root
With https://github.com/rust-lang/rust/pull/90102, `StackPopClean::None` is now only used for the "root" frame of the stack, so adjust its name accordingly and add an assertion.
r? `@oli-obk`
Update books
## reference
3 commits in 06f9e61931bcf58b91dfe6c924057e42ce273ee1..f8ba2f12df60ee19b96de24ae5b73af3de8a446b
2021-12-17 07:31:40 -0800 to 2022-01-03 11:02:08 -0800
- Switch the default edition for examples to 2021 (rust-lang/reference#1125)
- Clarify behavior of x87 FP registers in inline assembly (rust-lang/reference#1126)
- Add inline assembly to the reference (rust-lang/reference#1105)
## book
36 commits in 8a0bb3c96e71927b80fa2286d7a5a5f2547c6aa4..d3740fb7aad0ea4a80ae20f64dee3a8cfc0c5c3c
2021-12-22 20:54:27 -0500 to 2022-01-03 21:46:04 -0500
- Add a concrete example of an optional value. Fixesrust-lang/book#2848.
- match isn't really an operator. Fixesrust-lang/book#2859
- Edits to edits of chapter 6
- Make fixes recommended by shellcheck
- Use shellcheck
- SIGH fix all the typos that were missed while spellcheck was broken
- SIGH add all the words to the dictionary that were missed while spellcheck was broken
- Remove test_harness from the dictionary
- sigh, the xkcd sandwich one
- Install aspell in CI
- set -eu in all bash scripts
- typo: assignement -> assignment
- Fix quotes
- Snapshot of ch12 for nostarch
- Use 'static lifetime earlier because that's more correct. Fixesrust-lang/book#2864.
- Add does_not_compile annotation to intermediate steps that don't compile
- Sidestep who provides output streams. Fixesrust-lang/book#2933.
- Remove note about primitive obsession. Fixesrust-lang/book#2863.
- Remove sentence encouraging writing tests on your own. Fixesrust-lang/book#2223.
- Bump mdBook version to 0.4.14 in workflow main.yml
- Past tense make better sense
- Past tense makes better sense
- Update the edition in all the listings' Cargo.toml
- Update the book to either say 2021 edition or not talk about editions
- Remove most of the 2018 edition text from the title page. Fixesrust-lang/book#2852.
- Fix word wrapping
- Emphasize return type is mandatory
- fix title capitalization
- Further edits to mention of --include-ignored, propagate to src
- feat: mention `cargo test -- --include-ignored`
- wording: get rid of "to from"
- interchanged position of `binary` and `library`
- Fix wrong word typo
- Further edits in rust-analyzer text
- appendix-04 IDE integration: Replaced rls by rust-analyzer
- Update link to Italian translation. Connects to rust-lang/book#2484.
## rustc-dev-guide
3 commits in 9bf0028b557798ddd07a6f652e4d0c635d3d6620..875464457c4104686faf667f47848aa7b0f0a744
2021-12-20 21:53:57 +0900 to 2021-12-28 22:17:49 -0600
- Update link to moved section (rust-lang/rustc-dev-guide#1282)
- Fix link in contributing.md (rust-lang/rustc-dev-guide#1280)
- Streamline "Getting Started" (rust-lang/rustc-dev-guide#1279)
Move `contains` method of Option and Result lower in docs
Follow-up to #92444 trying to get the `Option` and `Result` rustdocs in better shape.
This addresses the request in https://github.com/rust-lang/rust/issues/62358#issuecomment-645676285. The `contains` methods are previously too high up in the docs on both `Option` and `Result` — stuff like `ok` and `map` and `and_then` should all be featured higher than `contains`. All of those are more ubiquitously useful than `contains`.
Do not use deprecated -Zsymbol-mangling-version in bootstrap
`-Zsymbol-mangling-version` now produces warnings unconditionally. So if you want to use legacy mangling for the compiler (`new-symbol-mangling = false` in `config.toml`), the build is now littered with warnings.
However, with this change, stage 1 `std` doesn't compile:
```
error: `-C symbol-mangling-version=legacy` requires `-Z unstable-options`
```
Even after the bootstrap compiler is updated and it will support `-Csymbol-mangling-version`, the bootstrap code would either need to use `-Z` for the legacy mangling or use `-C` in combination with `-Z unstable-options` (because `-C` + legacy is not allowed without the unstable options). Should we just add `-Z unstable-options` to `std` compilation to resolve this?
Btw I use legacy mangling because the new mangling is not supported by [Hotspot](https://github.com/KDAB/hotspot).
RustWrapper: adapt for an LLVM API change
No functional changes intended.
The LLVM commit ec501f15a8 removed the signed version of `createExpression`.
This adapts the Rust LLVM wrappers accordingly.
Rollup of 7 pull requests
Successful merges:
- #91754 (Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().)
- #91884 (Constify `Box<T, A>` methods)
- #92107 (Actually set IMAGE_SCN_LNK_REMOVE for .rmeta)
- #92456 (Make the documentation of builtin macro attributes accessible)
- #92507 (Suggest single quotes when char expected, str provided)
- #92525 (intra-doc: Make `Receiver::into_iter` into a clickable link)
- #92532 (revert #92254 "Bump gsgdt to 0.1.3")
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
intra-doc: Make `Receiver::into_iter` into a clickable link
The documentation on `std::sync::mpsc::Iter` and `std::sync::mpsc::TryIter` provides links to the corresponding `Receiver` methods, unlike `std::sync::mpsc::IntoIter` does.
This was left out in c59b188aaeadea32625534250d1f5120420be000
Related to #29377
Suggest single quotes when char expected, str provided
If a type mismatch occurs where a char is expected and a string literal is provided, suggest changing the double quotes to single quotes.
We already provide this suggestion in the other direction ( ' -> " ).
Especially useful for new rust devs used to a language in which single/double quotes are interchangeable.
Fixes#92479.
Actually set IMAGE_SCN_LNK_REMOVE for .rmeta
The code intended to set the IMAGE_SCN_LNK_REMOVE flag for the
.rmeta section, however the value of this flag was set to zero.
Instead use the actual value provided by the object crate.
This dates back to the original introduction of this code in
PR #84449, so we were never setting this flag. As I'm not on
Windows, I'm not sure whether that means we were embedding .rmeta
into executables, or whether the section ended up getting stripped
for some other reason.
Modifications to `std::io::Stdin` on Windows so that there is no longer a 4-byte buffer minimum in read().
This is an attempted fix of issue #91722, where a too-small buffer was passed to the read function of stdio on Windows. This caused an error to be returned when `read_to_end` or `read_to_string` were called. Both delegate to `std::io::default_read_to_end`, which creates a buffer that is of length >0, and forwards it to `std::io::Stdin::read()`. The latter method returns an error if the length of the buffer is less than 4, as there might not be enough space to allocate a UTF-16 character. This creates a problem when the buffer length is in `0 < N < 4`, causing the bug.
The current modification creates an internal buffer, much like the one used for the write functions
I'd also like to acknowledge the help of ``@agausmann`` and ``@hkratz`` in detecting and isolating the bug, and for suggestions that made the fix possible.
Couple disclaimers:
- Firstly, I didn't know where to put code to replicate the bug found in the issue. It would probably be wise to add that case to the testing suite, but I'm afraid that I don't know _where_ that test should be added.
- Secondly, the code is fairly fundamental to IO operations, so my fears are that this may cause some undesired side effects ~or performance loss in benchmarks.~ The testing suite runs on my computer, and it does fix the issue noted in #91722.
- Thirdly, I left the "surrogate" field in the Stdin struct, but from a cursory glance, it seems to be serving the same purpose for other functions. Perhaps merging the two would be appropriate.
Finally, this is my first pull request to the rust language, and as such some things may be weird/unidiomatic/plain out bad. If there are any obvious improvements I could do to the code, or any other suggestions, I would appreciate them.
Edit: Closes#91722
Remove special-cased stable hashing for HIR module
All other 'containers' (e.g. `impl` blocks) hashed their contents
in the normal, order-dependent way. However, `Mod` was hashing
its contents in a (sort-of) order-independent way. However, the
exact order is exposed to consumers through `Mod.item_ids`,
and through query results like `hir_module_items`. Therefore,
stable hashing needs to take the order of items into account,
to avoid fingerprint ICEs.
Unforuntately, I was unable to directly build a reproducer
for the ICE, due to the behavior of `Fingerprint::combine_commutative`.
This operation swaps the upper and lower `u64` when constructing the
result, which makes the function non-associative. Since we start
the hashing of module items by combining `Fingerprint::ZERO` with
the first item, it's difficult to actually build an example where
changing the order of module items leaves the final hash unchanged.
However, this appears to have been hit in practice in #92218
While we're not able to reproduce it, the fact that proc-macros
are involved (which can give an entire module the same span, preventing
any span-related invalidations) makes me confident that the root
cause of that issue is our method of hashing module items.
This PR removes all of the special handling for `Mod`, instead deriving
a `HashStable` implementation. This makes `Mod` consistent with other
'contains' like `Impl`, which hash their contents through the typical
derive of `HashStable`.
- Do not `#[doc(hidden)]` the `#[derive]` macro attribute
- Add a link to the reference section to `derive`'s inherent docs
- Do the same for `#[test]` and `#[global_allocator]`
- Fix `GlobalAlloc` link (why is it on `core` and not `alloc`?)
- Try `no_inline`-ing the `std` reexports from `core`
- Revert "Try `no_inline`-ing the `std` reexports from `core`"
- Address PR review
- Also document the unstable macros
The documentation on `std::sync::mpsc::Iter` and `std::sync::mpsc::TryIter` provides links to the corresponding `Receiver` methods, unlike `std::sync::mpsc::IntoIter` does.
This was left out in c59b188aaeadea32625534250d1f5120420be000
Related to #29377
Rollup of 6 pull requests
Successful merges:
- #90102 (Remove `NullOp::Box`)
- #92011 (Use field span in `rustc_macros` when emitting decode call)
- #92402 (Suggest while let x = y when encountering while x = y)
- #92409 (Couple of libtest cleanups)
- #92418 (Fix spacing in pretty printed PatKind::Struct with no fields)
- #92444 (Consolidate Result's and Option's methods into fewer impl blocks)
Failed merges:
- #92483 (Stabilize `result_cloned` and `result_copied`)
r? `@ghost`
`@rustbot` modify labels: rollup
Consolidate Result's and Option's methods into fewer impl blocks
`Result`'s and `Option`'s methods have historically been separated up into `impl` blocks based on their trait bounds, with the bounds specified on type parameters of the impl block. I find this unhelpful because closely related methods, like `unwrap_or` and `unwrap_or_default`, end up disproportionately far apart in source code and rustdocs:
<pre>
impl<T> Option<T> {
pub fn unwrap_or(self, default: T) -> T {
...
}
<img alt="one eternity later" src="https://user-images.githubusercontent.com/1940490/147780325-ad4e01a4-c971-436e-bdf4-e755f2d35f15.jpg" width="750">
}
impl<T: Default> Option<T> {
pub fn unwrap_or_default(self) -> T {
...
}
}
</pre>
I'd prefer for method to be in as few impl blocks as possible, with the most logical grouping within each impl block. Any bounds needed can be written as `where` clauses on the method instead:
```rust
impl<T> Option<T> {
pub fn unwrap_or(self, default: T) -> T {
...
}
pub fn unwrap_or_default(self) -> T
where
T: Default,
{
...
}
}
```
*Warning: the end-to-end diff of this PR is computed confusingly by git / rendered confusingly by GitHub; it's practically impossible to review that way. I've broken the PR into commits that move small groups of methods for which git behaves better — these each should be easily individually reviewable.*
Use field span in `rustc_macros` when emitting decode call
This will cause backtraces to point to the location of
the field in the struct/enum, rather than the derive macro.
This makes it clear which field was being decoded when the
backtrace was captured (which is especially useful if
there are multiple fields with the same type).
Remove `NullOp::Box`
Follow up of #89030 and MCP rust-lang/compiler-team#460.
~1 month later nothing seems to be broken, apart from a small regression that #89332 (1aac85bb716c09304b313d69d30d74fe7e8e1a8e) shows could be regained by remvoing the diverging path, so it shall be safe to continue and remove `NullOp::Box` completely.
r? `@jonas-schievink`
`@rustbot` label T-compiler
Add `#[rustc_clean(loaded_from_disk)]` to assert loading of query result
Currently, you can use `#[rustc_clean]` to assert to that a particular
query (technically, a `DepNode`) is green or red. However, a green
`DepNode` does not mean that the query result was actually deserialized
from disk - we might have never re-run a query that needed the result.
Some incremental tests are written as regression tests for ICEs that
occured during query result decoding. Using
`#[rustc_clean(loaded_from_disk="typeck")]`, you can now assert
that the result of a particular query (e.g. `typeck`) was actually
loaded from disk, in addition to being green.
No functional changes intended.
The LLVM commit
ec501f15a8
removed the signed version of `createExpression`. This adapts the Rust
LLVM wrappers accordingly.
Move `PatKind::Lit` checking from ast_validation to ast lowering
Fixes#92074
This allows us to insert an `ExprKind::Err` when an invalid expression
is used in a literal pattern, preventing later stages of compilation
from seeing an unexpected literal pattern.
Rustdoc: use ThinVec for GenericArgs bindings
The bindings are almost always empty. This reduces the size of `PathSegment` and `GenericArgs` by about one fourth.
Stabilize -Z symbol-mangling-version=v0 as -C symbol-mangling-version=v0
This allows selecting `v0` symbol-mangling without an unstable option. Selecting `legacy` still requires -Z unstable-options.
This does not change the default symbol-mangling-version. See https://github.com/rust-lang/rust/pull/89917 for a pull request changing the default. Rationale, from #89917:
Rust's current mangling scheme depends on compiler internals; loses information about generic parameters (and other things) which makes for a worse experience when using external tools that need to interact with Rust symbol names; is inconsistent; and can contain . characters which aren't universally supported. Therefore, Rust has defined its own symbol mangling scheme which is defined in terms of the Rust language, not the compiler implementation; encodes information about generic parameters in a reversible way; has a consistent definition; and generates symbols that only use the characters A-Z, a-z, 0-9, and _.
Support for the new Rust symbol mangling scheme has been added to upstream tools that will need to interact with Rust symbols (e.g. debuggers).
This pull request allows enabling the new v0 symbol-mangling-version.
See #89917 for references to the implementation of v0, and for references to the tool changes to decode Rust symbols.
Support [x; n] expressions in concat_bytes!
Currently trying to use `concat_bytes!` with a repeating array value like `[42; 5]` results in an error:
```
error: expected a byte literal
--> src/main.rs:3:27
|
3 | let x = concat_bytes!([3; 4]);
| ^^^^^^
|
= note: only byte literals (like `b"foo"`, `b's'`, and `[3, 4, 5]`) can be passed to `concat_bytes!()`
```
This makes it so repeating array syntax can be used the same way normal arrays can be. The RFC doesn't explicitly mention repeat expressions, but it seems reasonable to allow them as well, since normal arrays are allowed.
It is possible to make the compiler get stuck compiling forever with `concat_bytes!([3; 999999999])`, but I don't think that's much of an issue since you can do that already with `const X: [u8; 999999999] = [3; 999999999];`.
Contributes to #87555.
Track caller of slice split and swap
Improves error location for `slice.split_at*()` and `slice.swap()`.
These are generic inline functions, so the `#[track_caller]` on them is free — only changes a value of an argument already passed to panicking code.
Remove effect of `#[no_link]` attribute on name resolution
Previously it hid all non-macro names from other crates.
This has no relation to linking and can change name resolution behavior in some cases (e.g. glob conflicts), in addition to just producing the "unresolved name" errors.
I can kind of understand the possible reasoning behind the current behavior - if you can use names from a `no_link` crates then you can use, for example, functions too, but whether it will actually work or produce link-time errors will depend on random factors like inliner behavior.
(^^^ This is not the actual reason why the current behavior exist, I've looked through git history and it's mostly accidental.)
I think this risk is ok for such an obscure attribute, and we don't need to specifically prevent use of non-macro items from such crates.
(I'm not actually sure why would anyone use `#[no_link]` on a crate, even if it's macro only, if you aware of any use cases, please share. IIRC, at some point it was used for crates implementing custom derives - the now removed legacy ones, not the current proc macros.)
Extracted from https://github.com/rust-lang/rust/pull/91795.