This commit aims to prepare the `std::hash` module for alpha by formalizing its
current interface whileholding off on adding `#[stable]` to the new APIs. The
current usage with the `HashMap` and `HashSet` types is also reconciled by
separating out composable parts of the design. The primary goal of this slight
redesign is to separate the concepts of a hasher's state from a hashing
algorithm itself.
The primary change of this commit is to separate the `Hasher` trait into a
`Hasher` and a `HashState` trait. Conceptually the old `Hasher` trait was
actually just a factory for various states, but hashing had very little control
over how these states were used. Additionally the old `Hasher` trait was
actually fairly unrelated to hashing.
This commit redesigns the existing `Hasher` trait to match what the notion of a
`Hasher` normally implies with the following definition:
trait Hasher {
type Output;
fn reset(&mut self);
fn finish(&self) -> Output;
}
This `Hasher` trait emphasizes that hashing algorithms may produce outputs other
than a `u64`, so the output type is made generic. Other than that, however, very
little is assumed about a particular hasher. It is left up to implementors to
provide specific methods or trait implementations to feed data into a hasher.
The corresponding `Hash` trait becomes:
trait Hash<H: Hasher> {
fn hash(&self, &mut H);
}
The old default of `SipState` was removed from this trait as it's not something
that we're willing to stabilize until the end of time, but the type parameter is
always required to implement `Hasher`. Note that the type parameter `H` remains
on the trait to enable multidispatch for specialization of hashing for
particular hashers.
Note that `Writer` is not mentioned in either of `Hash` or `Hasher`, it is
simply used as part `derive` and the implementations for all primitive types.
With these definitions, the old `Hasher` trait is realized as a new `HashState`
trait in the `collections::hash_state` module as an unstable addition for
now. The current definition looks like:
trait HashState {
type Hasher: Hasher;
fn hasher(&self) -> Hasher;
}
The purpose of this trait is to emphasize that the one piece of functionality
for implementors is that new instances of `Hasher` can be created. This
conceptually represents the two keys from which more instances of a
`SipHasher` can be created, and a `HashState` is what's stored in a
`HashMap`, not a `Hasher`.
Implementors of custom hash algorithms should implement the `Hasher` trait, and
only hash algorithms intended for use in hash maps need to implement or worry
about the `HashState` trait.
The entire module and `HashState` infrastructure remains `#[unstable]` due to it
being recently redesigned, but some other stability decision made for the
`std::hash` module are:
* The `Writer` trait remains `#[experimental]` as it's intended to be replaced
with an `io::Writer` (more details soon).
* The top-level `hash` function is `#[unstable]` as it is intended to be generic
over the hashing algorithm instead of hardwired to `SipHasher`
* The inner `sip` module is now private as its one export, `SipHasher` is
reexported in the `hash` module.
And finally, a few changes were made to the default parameters on `HashMap`.
* The `RandomSipHasher` default type parameter was renamed to `RandomState`.
This renaming emphasizes that it is not a hasher, but rather just state to
generate hashers. It also moves away from the name "sip" as it may not always
be implemented as `SipHasher`. This type lives in the
`std::collections::hash_map` module as `#[unstable]`
* The associated `Hasher` type of `RandomState` is creatively called...
`Hasher`! This concrete structure lives next to `RandomState` as an
implemenation of the "default hashing algorithm" used for a `HashMap`. Under
the hood this is currently implemented as `SipHasher`, but it draws an
explicit interface for now and allows us to modify the implementation over
time if necessary.
There are many breaking changes outlined above, and as a result this commit is
a:
[breaking-change]
fmt::Show is for debugging, and can and should be implemented for
all public types. This trait is used with `{:?}` syntax. There still
exists #[derive(Show)].
fmt::String is for types that faithfully be represented as a String.
Because of this, there is no way to derive fmt::String, all
implementations must be purposeful. It is used by the default format
syntax, `{}`.
This will break most instances of `{}`, since that now requires the type
to impl fmt::String. In most cases, replacing `{}` with `{:?}` is the
correct fix. Types that were being printed specifically for users should
receive a fmt::String implementation to fix this.
Part of #20013
[breaking-change]
This commit is an implementation of [RFC 503][rfc] which is a stabilization
story for the prelude. Most of the RFC was directly applied, removing reexports.
Some reexports are kept around, however:
* `range` remains until range syntax has landed to reduce churn.
* `Path` and `GenericPath` remain until path reform lands. This is done to
prevent many imports of `GenericPath` which will soon be removed.
* All `io` traits remain until I/O reform lands so imports can be rewritten all
at once to `std::io::prelude::*`.
This is a breaking change because many prelude reexports have been removed, and
the RFC can be consulted for the exact list of removed reexports, as well as to
find the locations of where to import them.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/0503-prelude-stabilization.md
[breaking-change]
Closes#20068
post-unboxed-closure-conversion. This requires a fair amount of
annoying coercions because all the `map` etc types are defined
generically over the `F`, so the automatic coercions don't propagate;
this is compounded by the need to use `let` and not `as` due to
stage0. That said, this pattern is to a large extent temporary and
unusual.
lifetimes. This currently causes an ICE; it should (ideally) work, but
failing that at least give a structured error. For the purposes of
this PR, though, workaround is fine.
This commit performs a second pass stabilization of the `std::default` module.
The module was already marked `#[stable]`, and the inheritance of `#[stable]`
was removed since this attribute was applied. This commit adds the `#[stable]`
attribute to the trait definition and one method name, along with all
implementations found in the standard distribution.
Using a type alias for iterator implementations is fragile since this
exposes the implementation to users of the iterator, and any changes
could break existing code.
This commit changes the iterators of `BTreeSet` to use
proper new types, rather than type aliases. However, since it is
fair-game to treat a type-alias as the aliased type, this is a:
[breaking-change].
- The following operator traits now take their arguments by value: `Add`, `Sub`, `Mul`, `Div`, `Rem`, `BitAnd`, `BitOr`, `BitXor`, `Shl`, `Shr`. This breaks all existing implementations of these traits.
- The binary operation `a OP b` now "desugars" to `OpTrait::op_method(a, b)` and consumes both arguments.
- `String` and `Vec` addition have been changed to reuse the LHS owned value, and to avoid internal cloning. Only the following asymmetric operations are available: `String + &str` and `Vec<T> + &[T]`, which are now a short-hand for the "append" operation.
[breaking-change]
---
This passes `make check` locally. I haven't touch the unary operators in this PR, but converting them to by value should be very similar to this PR. I can work on them after this gets the thumbs up.
@nikomatsakis r? the compiler changes
@aturon r? the library changes. I think the only controversial bit is the semantic change of the `Vec`/`String` `Add` implementation.
cc #19148
I am trying to add an implementation of `bitor` for `BTreeSet`. I think I am most of the way there, but I am going to need some guidance to take it all the way.
When I run `make check`, I get:
```
error: cannot move out of dereference of `&`-pointer
self.union(_rhs).map(|&i| i).collect::<BTreeSet<T>>()
^~
```
I'd appreciate any nudges in the right direction. If I can figure this one out, I am sure I will be able to implement `bitand`, `bitxor`, and `sub` as well.
/cc @Gankro
---
**Update**
I have added implementations for `BitOr`, `BitAnd`, `BitXor`, and `Sub` for `BTreeSet`.
Add initial attempt at implementing BitOr for BTreeSet.
Update the implementation of the bitor operator for BTreeSets.
`make check` ran fine through this.
Add implementations for BitAnd, BitXor, and Sub as well.
Remove the FIXME comment and add unstable flags.
Add doctests for the bitop functions.
Now that we have an overloaded comparison (`==`) operator, and that `Vec`/`String` deref to `[T]`/`str` on method calls, many `as_slice()`/`as_mut_slice()`/`to_string()` calls have become redundant. This patch removes them. These were the most common patterns:
- `assert_eq(test_output.as_slice(), "ground truth")` -> `assert_eq(test_output, "ground truth")`
- `assert_eq(test_output, "ground truth".to_string())` -> `assert_eq(test_output, "ground truth")`
- `vec.as_mut_slice().sort()` -> `vec.sort()`
- `vec.as_slice().slice(from, to)` -> `vec.slice(from_to)`
---
Note that e.g. `a_string.push_str(b_string.as_slice())` has been left untouched in this PR, since we first need to settle down whether we want to favor the `&*b_string` or the `b_string[]` notation.
This is rebased on top of #19167
cc @alexcrichton @aturon
Add a rustdoc test for union to exhibit how it is used.
There is already a test for union in the test namespace, but this commit
adds a doctest that will appear in the rustdocs.
Add a doctest for the difference function.
Add a doctest for the symmetric_difference function.
Add a doctest for the intersection function.
Update the union et al. doctests based on @Gankro's comments.
Make the union et al. doctests a bit more readable.
* Renames/deprecates the simplest and most obvious methods
* Adds FIXME(conventions)s for outstanding work
* Marks "handled" methods as unstable
NOTE: the semantics of reserve and reserve_exact have changed!
Other methods have had their semantics changed as well, but in a
way that should obviously not typecheck if used incorrectly.
Lots of work and breakage to come, but this handles most of the core
APIs and most eggregious breakage. Future changes should *mostly* focus on
niche collections, APIs, or simply back-compat additions.
[breaking-change]
* Moves multi-collection files into their own directory, and splits them into seperate files
* Changes exports so that each collection has its own module
* Adds underscores to public modules and filenames to match standard naming conventions
(that is, treemap::{TreeMap, TreeSet} => tree_map::TreeMap, tree_set::TreeSet)
* Renames PriorityQueue to BinaryHeap
* Renames SmallIntMap to VecMap
* Miscellanious fallout fixes
[breaking-change]
As part of the collections reform RFC, this commit removes all collections
traits in favor of inherent methods on collections themselves. All methods
should continue to be available on all collections.
This is a breaking change with all of the collections traits being removed and
no longer being in the prelude. In order to update old code you should move the
trait implementations to inherent implementations directly on the type itself.
Note that some traits had default methods which will also need to be implemented
to maintain backwards compatibility.
[breaking-change]
cc #18424
Replaces BTree with BTreeMap and BTreeSet, which are completely new implementations.
BTreeMap's internal Node representation is particularly inefficient at the moment to
make this first implementation easy to reason about and fairly safe. Both collections
are also currently missing some of the tooling specific to sorted collections, which
is planned as future work pending reform of these APIs. General implementation issues
are discussed with TODOs internally
Perf results on x86_64 Linux:
test treemap::bench::find_rand_100 ... bench: 76 ns/iter (+/- 4)
test treemap::bench::find_rand_10_000 ... bench: 163 ns/iter (+/- 6)
test treemap::bench::find_seq_100 ... bench: 77 ns/iter (+/- 3)
test treemap::bench::find_seq_10_000 ... bench: 115 ns/iter (+/- 1)
test treemap::bench::insert_rand_100 ... bench: 111 ns/iter (+/- 1)
test treemap::bench::insert_rand_10_000 ... bench: 996 ns/iter (+/- 18)
test treemap::bench::insert_seq_100 ... bench: 486 ns/iter (+/- 20)
test treemap::bench::insert_seq_10_000 ... bench: 800 ns/iter (+/- 15)
test btree::map::bench::find_rand_100 ... bench: 74 ns/iter (+/- 4)
test btree::map::bench::find_rand_10_000 ... bench: 153 ns/iter (+/- 5)
test btree::map::bench::find_seq_100 ... bench: 82 ns/iter (+/- 1)
test btree::map::bench::find_seq_10_000 ... bench: 108 ns/iter (+/- 0)
test btree::map::bench::insert_rand_100 ... bench: 220 ns/iter (+/- 1)
test btree::map::bench::insert_rand_10_000 ... bench: 620 ns/iter (+/- 16)
test btree::map::bench::insert_seq_100 ... bench: 411 ns/iter (+/- 12)
test btree::map::bench::insert_seq_10_000 ... bench: 534 ns/iter (+/- 14)
BTreeMap still has a lot of room for optimization, but it's already beating out TreeMap on most access patterns.
[breaking-change]