Performance-oriented hashtable.

Previously, Rust's hashtable was totally unoptimized. It used an Option
per key-value pair, and used very naive open addressing.

The old hashtable had very high variance in lookup time. For an example,
see the 'find_notexisting' benchmark below. This is fixed by letting keys
with long probe sequence lengths steal the good spots of keys in 'lucky'
spots with a low probe sequence length (Robin Hood hashing). This reduces
hashtable probe length variance, while maintaining the same mean.
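
As an illustration only, the spot-stealing rule can be sketched as
below. This is not the code in this commit: the table type, the u64
keys, the modulo "hash" and the fixed capacity are all assumptions made
to keep the example short.

```rust
// Hypothetical, simplified table: u64 keys and values, fixed capacity,
// and "hashing" by taking the key modulo the capacity.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Slot {
    key: u64,
    value: u64,
}

struct RobinHoodTable {
    slots: Vec<Option<Slot>>,
}

impl RobinHoodTable {
    fn new(capacity: usize) -> Self {
        RobinHoodTable { slots: vec![None; capacity] }
    }

    // Slot the key would occupy if there were no collisions.
    fn ideal_index(&self, key: u64) -> usize {
        (key as usize) % self.slots.len()
    }

    // Probe sequence length: how far `index` is from the key's ideal slot.
    fn probe_len(&self, key: u64, index: usize) -> usize {
        let cap = self.slots.len();
        (index + cap - self.ideal_index(key)) % cap
    }

    // Insert with linear probing. Whenever the resident of a slot is
    // closer to its ideal slot than the entry being inserted, the two
    // swap and probing continues with the evicted resident.
    fn insert(&mut self, key: u64, value: u64) {
        let cap = self.slots.len();
        let mut entry = Slot { key, value };
        let mut index = self.ideal_index(key);
        let mut dist = 0usize;
        for _ in 0..cap {
            let current = self.slots[index];
            match current {
                None => {
                    self.slots[index] = Some(entry);
                    return;
                }
                Some(resident) if resident.key == entry.key => {
                    // Same key: overwrite the value in place.
                    self.slots[index] = Some(entry);
                    return;
                }
                Some(resident) => {
                    let resident_dist = self.probe_len(resident.key, index);
                    if resident_dist < dist {
                        // The resident is "luckier": steal its spot.
                        self.slots[index] = Some(entry);
                        entry = resident;
                        dist = resident_dist;
                    }
                }
            }
            index = (index + 1) % cap;
            dist += 1;
        }
        panic!("table is full");
    }
}
```

With a capacity of 8, inserting keys 0, 1 and 8 in that order leaves 8
in slot 1 and moves 1 to slot 2. The probe lengths are 0, 1, 1 instead
of the 0, 0, 2 that plain linear probing would give: same total, less
spread.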

Also, other optimization liberties were taken. Everything is as cache
aware as possible, and this hashtable should perform extremely well for
both large and small keys and values.

Benchmarks:

comprehensive_old_hashmap         378 ns/iter (+/- 8)
comprehensive_new_hashmap         206 ns/iter (+/- 4)
1.8x faster

old_hashmap_as_queue              238 ns/iter (+/- 8)
new_hashmap_as_queue              119 ns/iter (+/- 2)
2x faster

old_hashmap_insert                172 ns/iter (+/- 8)
new_hashmap_insert                146 ns/iter (+/- 11)
1.18x faster

old_hashmap_find_existing         50 ns/iter (+/- 12)
new_hashmap_find_existing         35 ns/iter (+/- 6)
1.43x faster

old_hashmap_find_notexisting      49 ns/iter (+/- 49)
new_hashmap_find_notexisting      34 ns/iter (+/- 4)
1.44x faster

Memory usage of old hashtable (64-bit assumed):

aligned(8+sizeof(K)+sizeof(V))/0.75 + 6 words

Memory usage of new hashtable:

(aligned(sizeof(K))
+ aligned(sizeof(V))
+ 8)/0.9 + 6.5 words
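
To compare the two formulas for concrete key and value sizes, here is a
rough sketch of the per-entry terms only (the fixed 6 and 6.5 word
overheads are left out). It assumes that aligned() means rounding up to
a multiple of 8 bytes; the function names are made up for this example.

```rust
// Assumed word size on a 64-bit target, in bytes.
const WORD: usize = 8;

// Assumption: "aligned" rounds a byte count up to a multiple of 8.
fn aligned(bytes: usize) -> usize {
    (bytes + WORD - 1) / WORD * WORD
}

// Per-entry term of the old formula:
// aligned(8 + sizeof(K) + sizeof(V)) / 0.75
fn old_bytes_per_entry(key_size: usize, value_size: usize) -> f64 {
    aligned(8 + key_size + value_size) as f64 / 0.75
}

// Per-entry term of the new formula:
// (aligned(sizeof(K)) + aligned(sizeof(V)) + 8) / 0.9
fn new_bytes_per_entry(key_size: usize, value_size: usize) -> f64 {
    (aligned(key_size) + aligned(value_size) + 8) as f64 / 0.9
}
```

Under these assumptions, 8-byte keys and values cost 32 bytes per entry
in the old table and about 26.7 in the new one, while 4-byte keys and
values cost about 21.3 in the old table and about 26.7 in the new one.
So the new layout is not always smaller.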

BUT accesses are much more cache friendly. In fact, if the probe
sequence length is below 8, only two cache lines worth of hashes will be
pulled into cache. This is unlike the old version which would have to
stride over the stored keys and values, and would be more cache
unfriendly the bigger the stored values got.
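
The two-cache-line bound can be checked with a little arithmetic. The
sketch below assumes 64-byte cache lines, 8-byte hashes stored in one
contiguous cache-line-aligned array, and a probe that does not wrap
around the end of the table; none of these numbers are taken from the
code itself.

```rust
// Assumed sizes, in bytes.
const CACHE_LINE_BYTES: usize = 64;
const HASH_BYTES: usize = 8;

// Number of cache lines of the hash array touched by a probe that
// starts at `start_index` and inspects `probe_len` further slots.
fn cache_lines_touched(start_index: usize, probe_len: usize) -> usize {
    let first_byte = start_index * HASH_BYTES;
    let last_byte = (start_index + probe_len) * HASH_BYTES + HASH_BYTES - 1;
    last_byte / CACHE_LINE_BYTES - first_byte / CACHE_LINE_BYTES + 1
}
```

A probe sequence length below 8 inspects at most 8 hashes, which is 64
bytes, so it touches one cache line if it starts on a line boundary and
two otherwise.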

And did you notice the higher load factor? We can now reasonably get a
load factor of 0.9 with very good performance.
Clark Gaebel 2014-02-28 22:23:53 -05:00
parent 3316a0e6b2
commit 5bdbd21009
