This commit does the following.
- Changes the terminology for capacities used within HashMap's code.
"Internal capacity" is now consistently "raw capacity", and "usable
capacity" is now consistently just "capacity". This makes the code
easier to understand.
- Reworks capacity and raw capacity computations. Raw capacity
computations are now handled in a single place:
`DefaultResizePolicy::raw_capacity()` (a sketch follows this list). This
function correctly returns zero when given zero, which means that the
following cases now result in a capacity of zero when they previously
did not.
* `Hash{Map,Set}::with_capacity(0)`
* `Hash{Map,Set}::with_capacity_and_hasher(0)`
* `Hash{Map,Set}::shrink_to_fit()`, when used with a hash map/set whose
elements have all been removed
- Strengthens the language used in the comments describing the above
functions, to make it clearer when they will result in a map/set with
a capacity of zero. The new language is based on the language used for
the corresponding functions in `Vec`.
- Adds tests for the above zero-capacity cases.
- Removes `test_resize_policy` because it is no longer useful.
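A minimal sketch of a raw-capacity computation along these lines (the load
factor and the power-of-two rounding are illustrative assumptions, not the
exact values used in libstd):
```
// Hypothetical sketch only: the load factor and the rounding to a power
// of two are assumptions for illustration.
fn raw_capacity(capacity: usize) -> usize {
    if capacity == 0 {
        // A requested capacity of zero maps to a raw capacity of zero,
        // so no buckets need to be allocated.
        0
    } else {
        // Reserve enough buckets to keep `capacity` elements under the
        // load factor, then round up to the next power of two.
        (capacity * 11 / 10).next_power_of_two()
    }
}

fn main() {
    assert_eq!(raw_capacity(0), 0);
    assert_eq!(raw_capacity(29), 32);
}
```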
The following `HashMap` creation functions don't allocate heap storage for elements.
```
HashMap::new()
HashMap::default()
HashMap::with_hasher()
```
This is good, because it's surprisingly common to create a HashMap and never
use it. So that case should be cheap.
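One way to observe this from user code is via `capacity()`, which reports
zero for a map that has not allocated:
```
use std::collections::HashMap;

fn main() {
    // No heap storage is allocated until the first insert.
    let map: HashMap<String, u32> = HashMap::new();
    assert_eq!(map.capacity(), 0);
}
```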
However, `HashSet` does not have the same behaviour. The corresponding creation
functions *do* allocate heap storage for a default non-zero capacity (32
slots, enough for 29 elements).
```
HashSet::new()
HashSet::default()
HashSet::with_hasher()
```
This commit gives `HashSet` the same behaviour as `HashMap`, by simply calling
the corresponding `HashMap` functions (something `HashSet` already does for
`with_capacity` and `with_capacity_and_hasher`). It also reformats one existing
`HashSet` construction to use a consistent single-line format.
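Conceptually the change amounts to forwarding construction to the inner map,
roughly like this (a simplified sketch, not the actual libstd code):
```
use std::collections::HashMap;
use std::hash::Hash;

// Simplified sketch: a set is a map whose values are (), so its creation
// functions can simply forward to the map's non-allocating ones.
struct Set<T> {
    map: HashMap<T, ()>,
}

impl<T: Eq + Hash> Set<T> {
    fn new() -> Set<T> {
        // Forwarding to HashMap::new() means no buckets are allocated here.
        Set { map: HashMap::new() }
    }
}

fn main() {
    let s: Set<u32> = Set::new();
    assert_eq!(s.map.capacity(), 0);
}
```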
This speeds up rustc itself by 1.01--1.04x on most of the non-tiny
rustc-benchmarks.
* We never want to make guarantees about protecting against attacks.
* "True randomness" is not the right terminology to be using in this
context.
* There is significantly more nuance to the performance of SipHash than
"somewhat slow".
Implement RFC 1581 (`FusedIterator`)
* [ ] Implement on patterns. See https://github.com/rust-lang/rust/issues/27721#issuecomment-239638642.
* [ ] Handle OS Iterators. A bunch of iterators (`Args`, `Env`, etc.) in libstd wrap platform specific iterators. The current ones all appear to be well-behaved but can we assume that future ones will be?
* [ ] Does someone want to audit this? On first glance, all of the iterators on which I implemented `FusedIterator` appear to be well-behaved but there are a *lot* of them so a second pair of eyes would be nice.
* I haven't touched rustc internal iterators (or the internal rand) because rustc doesn't actually call `fuse()`.
* `FusedIterator` can't be implemented on `std::io::{Bytes, Chars}`.
Closes: #35602 (Tracking Issue)
Implements: rust-lang/rfcs#1581
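For reference, marking a well-behaved iterator as fused looks like this (a toy
counter type, not one of the iterators touched here):
```
use std::iter::FusedIterator;

// A toy iterator that keeps returning None once exhausted, so it is
// safe to mark as fused.
struct CountDown(u32);

impl Iterator for CountDown {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.0 == 0 {
            None
        } else {
            self.0 -= 1;
            Some(self.0)
        }
    }
}

// The marker impl: promises that next() keeps returning None after the
// first None, letting Iterator::fuse() skip its bookkeeping.
impl FusedIterator for CountDown {}

fn main() {
    let mut it = CountDown(2).fuse();
    assert_eq!(it.next(), Some(1));
    assert_eq!(it.next(), Some(0));
    assert_eq!(it.next(), None);
    assert_eq!(it.next(), None);
}
```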
Because of changes to how Rust acquires randomness, HashMap is not
guaranteed to be DoS resistant. This commit reflects these changes in
the docs themselves and provides an alternative method of creating
a hasher that is resistant, should that be needed.
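The general escape hatch is to supply an explicitly chosen hasher via
`with_hasher`; a sketch (with `DefaultHasher` standing in for whatever keyed,
DoS-resistant hasher fits your threat model):
```
use std::collections::HashMap;
use std::collections::hash_map::DefaultHasher;
use std::hash::BuildHasherDefault;

fn main() {
    // BuildHasherDefault<DefaultHasher> is only a stand-in here; swap in a
    // keyed, DoS-resistant hasher if that is what your threat model needs.
    let mut map: HashMap<String, u32, BuildHasherDefault<DefaultHasher>> =
        HashMap::with_hasher(BuildHasherDefault::default());
    map.insert("answer".to_string(), 42);
    assert_eq!(map["answer"], 42);
}
```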
This is a rebase and extension of #31356 where we cache the keys in thread local
storage. This should give us a nice speed boost in creating hash maps along with
mostly retaining the property that all maps have a nondeterministic iteration
order.
Closes #27243
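A rough sketch of the caching idea (the names, key type, and randomness source
are illustrative, not the actual libstd internals):
```
use std::cell::Cell;

// Illustrative only: cache a pair of randomly generated keys per thread so
// that creating many maps does not hit the OS randomness source each time.
thread_local! {
    static KEYS: Cell<Option<(u64, u64)>> = Cell::new(None);
}

// Stand-in for the real source of randomness.
fn fresh_keys() -> (u64, u64) {
    (0x243f_6a88_85a3_08d3, 0x1319_8a2e_0370_7344)
}

fn hasher_keys() -> (u64, u64) {
    KEYS.with(|cell| match cell.get() {
        Some(keys) => keys, // reuse the cached keys
        None => {
            let keys = fresh_keys(); // only the first map pays for randomness
            cell.set(Some(keys));
            keys
        }
    })
}

fn main() {
    assert_eq!(hasher_keys(), hasher_keys());
}
```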
rand: don't block before random pool is initialized
If we attempt a read with getrandom() on Linux, the syscall can block
before the random pool is initialized, unless the GRND_NONBLOCK flag is
passed. This flag causes getrandom() to instead return EAGAIN while the
pool is uninitialized. To avoid blocking downstream users of crate or std
functionality who have no way to opt out of this behavior, this change
makes Rust read bytes from /dev/urandom while getrandom() would block,
and switch to getrandom() once it becomes available. Fixes #32953.
Signed-off-by: Doug Goldstein <cardoe@cardoe.com>
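The fallback amounts to something like the following (a simplified,
Linux-only sketch using the `libc` crate; the real code lives in std's sys
module and also remembers whether getrandom() has become usable):
```
use std::fs::File;
use std::io::Read;

const GRND_NONBLOCK: libc::c_uint = 0x0001; // Linux flag value

fn fill_random(buf: &mut [u8]) {
    // Try getrandom() without blocking on an uninitialized pool.
    let ret = unsafe {
        libc::syscall(
            libc::SYS_getrandom,
            buf.as_mut_ptr() as *mut libc::c_void,
            buf.len(),
            GRND_NONBLOCK,
        )
    };
    if ret < 0 {
        // EAGAIN (pool not yet initialized) or syscall unavailable:
        // fall back to /dev/urandom, which never blocks.
        let mut f = File::open("/dev/urandom").expect("failed to open /dev/urandom");
        f.read_exact(buf).expect("failed to read /dev/urandom");
    }
}

fn main() {
    let mut key = [0u8; 16];
    fill_random(&mut key);
    println!("{:?}", key);
}
```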
Make HashSet::insert documentation more consistent
I have made the HashSet::insert documentation more consistent in its use of the term 'value' vs 'key'. Also clarified that the return value refers to whether _this_ value was already present, instead of the ambiguous 'a value'.
r? @steveklabnik
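The behaviour being documented, concretely:
```
use std::collections::HashSet;

fn main() {
    let mut set = HashSet::new();
    // The set did not have this value present, so insert returns true.
    assert!(set.insert(2));
    // This value is now present, so inserting it again returns false.
    assert!(!set.insert(2));
}
```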
The algorithm implemented here is linear in the size of the two b-trees. It
first creates a `MergeIter` from the two b-trees and then builds a new b-tree
by pushing key-value pairs from the `MergeIter` into nodes at the right heights.
Three functions for stealing have been added to the implementation of `Handle` as
well as a getter for the height of a `NodeRef`.
The docs have been updated with performance information about `BTreeMap::append` and
the remark about B has been removed now that it is the same for all instances of `BTreeMap`.
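From the caller's side the merge looks like this (on duplicate keys, the
values from the appended map win):
```
use std::collections::BTreeMap;

fn main() {
    let mut a = BTreeMap::new();
    a.insert(1, "a");
    a.insert(2, "a");

    let mut b = BTreeMap::new();
    b.insert(2, "b");
    b.insert(3, "b");

    // Moves everything from `b` into `a`, leaving `b` empty; the value for
    // the duplicate key 2 comes from `b`.
    a.append(&mut b);

    assert!(b.is_empty());
    assert_eq!(a.len(), 3);
    assert_eq!(a[&2], "b");
}
```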