Update `rand` in the stdlib tests, and remove the `getrandom` feature from it.
The main goal is actually removing `getrandom`, so that eventually we can allow running the stdlib test suite on tier3 targets which don't have `getrandom` support. Currently those targets can only run the subset of stdlib tests that exist in uitests, and (generally speaking), we prefer not to test libstd functionality in uitests, which came up recently in https://github.com/rust-lang/rust/pull/104095 and https://github.com/rust-lang/rust/pull/104185. Additionally, the fact that we can't update `rand`/`getrandom` means we're stuck with the old set of tier3 targets, so can't test new ones.
~~Anyway, I haven't checked that this actually does allow use on tier3 targets (I think it does not, as some work is needed in stdlib submodules) but it moves us slightly closer to this, and seems to allow at least finally updating our `rand` dep, which definitely improves the status quo.~~ Checked and works now.
For the most part, our tests and benchmarks are fine using hard-coded seeds. A couple tests seem to fail with this (stuff manipulating the environment expecting no collisions, for example), or become pointless (all inputs to a function become equivalent). In these cases I've done a (gross) dance (ab)using `RandomState` and `Location::caller()` for some extra "entropy".
Trying to share that code seems *way* more painful than it's worth given that the duplication is a 7-line function, even if the lines are quite gross. (Keeping in mind that sharing it would require adding `rand` as a non-dev dep to std, and exposing a type from it publicly, all of which sounds truly awful, even if done behind a perma-unstable feature).
See also some previous attempts:
- https://github.com/rust-lang/rust/pull/86963 (in particular https://github.com/rust-lang/rust/pull/86963#issuecomment-885438936 which explains why this is non-trivial)
- https://github.com/rust-lang/rust/pull/89131
- https://github.com/rust-lang/rust/pull/96626#issuecomment-1114562857 (I tried in that PR at the same time, but settled for just removing the usage of `thread_rng()` from the benchmarks, since that was the main goal).
- https://github.com/rust-lang/rust/pull/104185
- Probably more. It's very tempting of a thing to "just update".
r? `@Mark-Simulacrum`
Fix documentation for `with_capacity` and `reserve` families of methods
Fixes#95614
Documentation for the following methods
- `with_capacity`
- `with_capacity_in`
- `with_capacity_and_hasher`
- `reserve`
- `reserve_exact`
- `try_reserve`
- `try_reserve_exact`
was inconsistent and often not entirely correct where they existed on the following types
- `Vec`
- `VecDeque`
- `String`
- `OsString`
- `PathBuf`
- `BinaryHeap`
- `HashSet`
- `HashMap`
- `BufWriter`
- `LineWriter`
since the allocator is allowed to allocate more than the requested capacity in all such cases, and will frequently "allocate" much more in the case of zero-sized types (I also checked `BufReader`, but there the docs appear to be accurate as it appears to actually allocate the exact capacity).
Some effort was made to make the documentation more consistent between types as well.
Documentation for the following methods
with_capacity
with_capacity_in
with_capacity_and_hasher
reserve
reserve_exact
try_reserve
try_reserve_exact
was inconsistent and often not entirely correct where they existed on the following types
Vec
VecDeque
String
OsString
PathBuf
BinaryHeap
HashSet
HashMap
BufWriter
LineWriter
since the allocator is allowed to allocate more than the requested capacity in all such cases, and will frequently "allocate" much more in the case of zero-sized types (I also checked BufReader, but there the docs appear to be accurate as it appears to actually allocate the exact capacity).
Some effort was made to make the documentation more consistent between types as well.
Fix with_capacity* methods for Vec
Fix *reserve* methods for Vec
Fix docs for *reserve* methods of VecDeque
Fix docs for String::with_capacity
Fix docs for *reserve* methods of String
Fix docs for OsString::with_capacity
Fix docs for *reserve* methods on OsString
Fix docs for with_capacity* methods on HashSet
Fix docs for *reserve methods of HashSet
Fix docs for with_capacity* methods of HashMap
Fix docs for *reserve methods on HashMap
Fix expect messages about OOM in doctests
Fix docs for BinaryHeap::with_capacity
Fix docs for *reserve* methods of BinaryHeap
Fix typos
Fix docs for with_capacity on BufWriter and LineWriter
Fix consistent use of `hasher` between `HashMap` and `HashSet`
Fix warning in doc test
Add test for capacity of vec with ZST
Fix doc test error
Updated the HashMap's documentation to include two references to
add_modify.
The first is when the `Entry` API is mentioned at the beginning. I was
hesitant to change the "attack" example (although I believe that it is
perfect example of where `add_modify` should be used) because both uses
work equally, but one is more idiomatic (`add_modify`).
The second is with the `entry` function that is used for the `Entry`
API. The code example was a perfect use for `add_modify`, which is why
it was changed to reflect that.
Tweak insert docs
For `{Hash, BTree}Map::insert`, I always have to take a few extra seconds to think about the slight weirdness about the fact that if we "did not" insert (which "sounds" false), we return true, and if we "did" insert, (which "sounds" true), we return false.
This tweaks the doc comments for the `insert` methods of those types (as well as what looks like a rustc internal data structure that I found just by searching the codebase for "If the set did") to first use the "Returns whether _something_" pattern used in e.g. `remove`, where we say that `remove` "returns whether the value was present".
Expose `get_many_mut` and `get_many_unchecked_mut` to HashMap
This pull-request expose the function [`get_many_mut`](https://docs.rs/hashbrown/0.12.0/hashbrown/struct.HashMap.html#method.get_many_mut) and [`get_many_unchecked_mut`](https://docs.rs/hashbrown/0.12.0/hashbrown/struct.HashMap.html#method.get_many_unchecked_mut) from `hashbrown` to the standard library `HashMap` type. They obviously keep the same API and are added under the (new) `map_many_mut` feature.
- `get_many_mut`: Attempts to get mutable references to `N` values in the map at once.
- `get_many_unchecked_mut`: Attempts to get mutable references to `N` values in the map at once, without validating that the values are unique.
As currently written, when a logic error occurs in a collection's trait
parameters, this allows *completely arbitrary* misbehavior, so long as
it does not cause undefined behavior in std. However, because the extent
of misbehavior is not specified, it is allowed for *any* code in std to
start misbehaving in arbitrary ways which are not formally UB; consider
the theoretical example of a global which gets set on an observed logic
error. Because the misbehavior is only bound by not resulting in UB from
safe APIs and the crate-level encapsulation boundary of all of std, this
makes writing user unsafe code that utilizes std theoretically
impossible, as it now relies on undocumented QOI that unrelated parts of
std cannot be caused to misbehave by a misuse of std::collections APIs.
In practice, this is a nonconcern, because std has reasonable QOI and an
implementation that takes advantage of this freedom is essentially a
malicious implementation and only compliant by the most langauage-lawyer
reading of the documentation.
To close this hole, we just add a small clause to the existing logic
error paragraph that ensures that any misbehavior is limited to the
collection which observed the logic error, making it more plausible to
prove the soundness of user unsafe code.
This is not meant to be formal; a formal refinement would likely need to
mention that values derived from the collection can also misbehave after a
logic error is observed, as well as define what it means to "observe" a
logic error in the first place. This fix errs on the side of informality
in order to close the hole without complicating a normal reading which
can assume a reasonable nonmalicious QOI.
See also [discussion on IRLO][1].
[1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640
It is not obvious (at least for me) that complexity of iteration over hash tables depends on capacity and not length. Especially comparing with other containers like Vec or String. I think, this behaviour is worth mentioning.
I run benchmark which tests iteration time for maps with length 50 and different capacities and get this results:
```
capacity - time
64 - 203.87 ns
256 - 351.78 ns
1024 - 607.87 ns
4096 - 965.82 ns
16384 - 3.1188 us
```
If you want to dig why it behaves such way, you can look current implementation in [hashbrown code](f3a9f211d0/src/raw/mod.rs (L1933)).
Benchmarks code would be presented in PR related to this commit.
Remove `#[rustc_deprecated]`
This removes `#[rustc_deprecated]` and introduces diagnostics to help users to the right direction (that being `#[deprecated]`). All uses of `#[rustc_deprecated]` have been converted. CI is expected to fail initially; this requires #95958, which includes converting `stdarch`.
I plan on following up in a short while (maybe a bootstrap cycle?) removing the diagnostics, as they're only intended to be short-term.
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Improve doc wording for retain on some collections
I found the documentation wording on the various retain methods on many collections to be unusual.
I tried to invert the relation by switching `such that` with `for which` .
This updates the standard library's documentation to use the new syntax. The
documentation is worthwhile to update as it should be more idiomatic
(particularly for features like this, which are nice for users to get acquainted
with). The general codebase is likely more hassle than benefit to update: it'll
hurt git blame, and generally updates can be done by folks updating the code if
(and when) that makes things more readable with the new format.
A few places in the compiler and library code are updated (mostly just due to
already having been done when this commit was first authored).