Cache conscious hashmap table
Right now the internal HashMap representation is 3 unziped arrays hhhkkkvvv, I propose to change it to hhhkvkvkv (in further iterations kvkvkvhhh may allow inplace grow). A previous attempt is at #21973.
This layout is generally more cache conscious as it makes the value immediately accessible after a key matches. The separated hash arrays is a _no-brainer_ because of how the RH algorithm works and that's unchanged.
**Lookups**: Upon a successful match in the hash array the code can check the key and immediately have access to the value in the same or next cache line (effectively saving a L[1,2,3] miss compared to the current layout).
**Inserts/Deletes/Resize**: Moving values in the table (robin hooding it) is faster because it touches consecutive cache lines and uses less instructions.
Some backing benchmarks (besides the ones bellow) for the benefits of this layout can be seen here as well http://www.reedbeta.com/blog/2015/01/12/data-oriented-hash-table/
The obvious drawbacks is: padding can be wasted between the key and value. Because of that keys(), values() and contains() can consume more cache and be slower.
Total wasted padding between items (C being the capacity of the table).
* Old layout: C * (K-K padding) + C * (V-V padding)
* Proposed: C * (K-V padding) + C * (V-K padding)
In practice padding between K-K and V-V *can* be smaller than K-V and V-K. The overhead is capped(ish) at sizeof u64 - 1 so we can actually measure the worst case (u8 at the end of key type and value with aliment of 1, _hardly the average case in practice_).
Starting from the worst case the memory overhead is:
* `HashMap<u64, u8>` 46% memory overhead. (aka *worst case*)
* `HashMap<u64, u16>` 33% memory overhead.
* `HashMap<u64, u32>` 20% memory overhead.
* `HashMap<T, T>` 0% memory overhead
* Worst case based on sizeof K + sizeof V:
| x | 16 | 24 | 32 | 64 | 128 |
|----------------|--------|--------|--------|-------|-------|
| (8+x+7)/(8+x) | 1.29 | 1.22 | 1.18 | 1.1 | 1.05 |
I've a test repo here to run benchmarks https://github.com/arthurprs/hashmap2/tree/layout
```
➜ hashmap2 git:(layout) ✗ cargo benchcmp hhkkvv:: hhkvkv:: bench.txt
name hhkkvv:: ns/iter hhkvkv:: ns/iter diff ns/iter diff %
grow_10_000 922,064 783,933 -138,131 -14.98%
grow_big_value_10_000 1,901,909 1,171,862 -730,047 -38.38%
grow_fnv_10_000 443,544 418,674 -24,870 -5.61%
insert_100 2,469 2,342 -127 -5.14%
insert_1000 23,331 21,536 -1,795 -7.69%
insert_100_000 4,748,048 3,764,305 -983,743 -20.72%
insert_10_000 321,744 290,126 -31,618 -9.83%
insert_int_bigvalue_10_000 749,764 407,547 -342,217 -45.64%
insert_str_10_000 337,425 334,009 -3,416 -1.01%
insert_string_10_000 788,667 788,262 -405 -0.05%
iter_keys_100_000 394,484 374,161 -20,323 -5.15%
iter_keys_big_value_100_000 402,071 620,810 218,739 54.40%
iter_values_100_000 424,794 373,004 -51,790 -12.19%
iterate_100_000 424,297 389,950 -34,347 -8.10%
lookup_100_000 189,997 186,554 -3,443 -1.81%
lookup_100_000_bigvalue 192,509 189,695 -2,814 -1.46%
lookup_10_000 154,251 145,731 -8,520 -5.52%
lookup_10_000_bigvalue 162,315 146,527 -15,788 -9.73%
lookup_10_000_exist 132,769 128,922 -3,847 -2.90%
lookup_10_000_noexist 146,880 144,504 -2,376 -1.62%
lookup_1_000_000 137,167 132,260 -4,907 -3.58%
lookup_1_000_000_bigvalue 141,130 134,371 -6,759 -4.79%
lookup_1_000_000_bigvalue_unif 567,235 481,272 -85,963 -15.15%
lookup_1_000_000_unif 589,391 453,576 -135,815 -23.04%
merge_shuffle 1,253,357 1,207,387 -45,970 -3.67%
merge_simple 40,264,690 37,996,903 -2,267,787 -5.63%
new 6 5 -1 -16.67%
with_capacity_10e5 3,214 3,256 42 1.31%
```
```
➜ hashmap2 git:(layout) ✗ cargo benchcmp hhkkvv:: hhkvkv:: bench.txt
name hhkkvv:: ns/iter hhkvkv:: ns/iter diff ns/iter diff %
iter_keys_100_000 391,677 382,839 -8,838 -2.26%
iter_keys_1_000_000 10,797,360 10,209,898 -587,462 -5.44%
iter_keys_big_value_100_000 414,736 662,255 247,519 59.68%
iter_keys_big_value_1_000_000 10,147,837 12,067,938 1,920,101 18.92%
iter_values_100_000 440,445 377,080 -63,365 -14.39%
iter_values_1_000_000 10,931,844 9,979,173 -952,671 -8.71%
iterate_100_000 428,644 388,509 -40,135 -9.36%
iterate_1_000_000 11,065,419 10,042,427 -1,022,992 -9.24%
```
macros: clean up scopes of expanded `#[macro_use]` imports
This PR changes the scope of macro-expanded `#[macro_use]` imports to match that of unexpanded `#[macro_use]` imports. For example, this would be allowed:
```rust
example!();
macro_rules! m { () => { #[macro_use(example)] extern crate example_crate; } }
m!();
```
This PR also enforces the full shadowing restrictions from RFC 1560 on `#[macro_use]` imports (currently, we only enforce the weakened restrictions from #36767).
This is a [breaking-change], but I believe it is highly unlikely to cause breakage in practice.
r? @nrc
Error monitor should emit error to stderr instead of stdout
We are pretty consistent about emitting to stderr, except for when there is actually an error, in which case we emit to stdout. This seems a bit backwards. This PR just changes that exception to emit to stderr. This is useful for the RLS since the LS protocol uses stdout (grrr).
r? @alexcrichton
Avoid allocations in `Decoder::read_str`.
`opaque::Decoder::read_str` is very hot within `rustc` due to its use in
the reading of crate metadata, and it currently returns a `String`. This
commit changes it to instead return a `Cow<str>`, which avoids a heap
allocation.
This change reduces the number of calls to `malloc` by almost 10% in
some benchmarks.
This is a [breaking-change] to libserialize.
Add comparison operators to boolean const eval.
I think it might be worth adding tests here, but since I don't know how or where to do that, I have not done so yet. Willing to do so if asked and given an explanation as to how.
Fixes#37047.
Changed 0 into '0'
Right now `0` is an undefined production rule.
[Documentation following the grammar specification](https://doc.rust-lang.org/nightly/std/fmt/#sign0) strongly suggests `'0'` is meant as it is used as a character literal.
r? @steveklabnik
ICH: Enable some cases in trait definition hashing.
Enable some test cases originally written by @eulerdisk. The tests can be enabled now because @MathieuBordere has fixed the underlying problem in #36974.
r? @nikomatsakis
Merge `Printer::token` and `Printer::size`.
Logically, it's a vector of pairs, so might as well represent it that
way.
The commit also changes `scan_stack` so that it is initialized with the
default size, instead of the excessive `55 * linewidth` size, which it
usually doesn't get even close to reaching.
Book: Be very explicit of lifetimes being descriptive
... not prescriptive. Pointed out in https://users.rust-lang.org/t/what-if-i-get-lifetimes-wrong/7535/4, which was a revelation to me and made me think this should be more clear in the book. I'm not sure if I got this entirely right or if the wording is good, but I figured a PR is more helpful than a simple issue.
r? @steveklabnik
Small Note: There's also https://github.com/rust-lang/book, should I have sent the PR there? It doesn't coincide with the online book though, so I figured it's better of here.
Add method str::repeat(self, usize) -> String
It is relatively simple to repeat a string n times:
`(0..n).map(|_| s).collect::<String>()`. It becomes slightly more
complicated to do it “right” (sizing the allocation up front), which
warrants a method that does it for us.
This method is useful in writing testcases, or when generating text.
`format!()` can be used to repeat single characters, but not repeating
strings like this.
rustdoc: print non-self arguments of bare functions and struct methods on their own line
This change alters the formatting rustdoc uses when it creates function and struct method documentation. For bare functions, each argument is printed on its own line. For struct methods, non-self arguments are printed on their own line. In both cases, no line breaks are introduced if there are no arguments, and for struct methods, no line breaks are introduced if there is only a single self argument. This should aid readability of long function signatures and allow for greater comprehension of these functions.
I've run rustdoc with these changes on my crate egg-mode and its set of dependencies and put the result [on my server](https://shiva.icesoldier.me/doc-custom/egg_mode/). Of note, here are a few shortcut links that highlight the changes:
* [Bare function with a long signature](https://shiva.icesoldier.me/doc-custom/egg_mode/place/fn.reverse_geocode.html)
* [Struct methods, with single self argument and with self and non-self arguments](https://shiva.icesoldier.me/doc-custom/egg_mode/tweet/struct.Timeline.html#method.reset)
* [Bare functions with no arguments](https://shiva.icesoldier.me/doc-custom/rand/fn.thread_rng.html) and [struct methods with no arguments](https://shiva.icesoldier.me/doc-custom/hyper/client/struct.Client.html#method.new) are left unchanged.
This PR consists of two commits: one for bare functions and one for struct methods.
Turn compatibility lint `match_of_unit_variant_via_paren_dotdot` into a hard error
The lint was introduced 10 months ago and made deny-by-default 7 months ago.
In case someone is still using it, https://github.com/rust-lang/rust/pull/36868 contains a stable replacement.
r? @nikomatsakis
configure: Add options for separate musl roots
This allows using the `./configure` script to enable rustbuild to compile
multiple musl targets at once. We'll hopefully use this soon on our bots to
produce a bunch of targets.