ff6c6ce917
Improve siphash performance for longer data Use `ptr::copy_nonoverlapping` (aka memcpy) to load an u64 from the byte stream. This is correct for any alignment, and the compiler will use the appropriate instruction to load the data. Also contains small tweaks that should benefit hashing short data too, both the commit that removes a variable and the autovectorization of the hash state initialization (in SipHash::reset). Benchmarks show that hashing longer data benefits for the improved word loading. Before (using benchmarks from the first commit in the PR): The before benchmark is a bit noisy. ``` test hash::sip::bench_bytes_4 ... bench: 41 ns/iter (+/- 0) = 97 MB/s test hash::sip::bench_bytes_7 ... bench: 49 ns/iter (+/- 2) = 142 MB/s test hash::sip::bench_bytes_8 ... bench: 42 ns/iter (+/- 4) = 190 MB/s test hash::sip::bench_bytes_a_16 ... bench: 57 ns/iter (+/- 14) = 280 MB/s test hash::sip::bench_bytes_b_32 ... bench: 85 ns/iter (+/- 74) = 376 MB/s test hash::sip::bench_bytes_c_128 ... bench: 278 ns/iter (+/- 33) = 460 MB/s test hash::sip::bench_long_str ... bench: 825 ns/iter (+/- 103) test hash::sip::bench_str_of_8_bytes ... bench: 151 ns/iter (+/- 66) test hash::sip::bench_str_over_8_bytes ... bench: 59 ns/iter (+/- 3) test hash::sip::bench_str_under_8_bytes ... bench: 47 ns/iter (+/- 56) test hash::sip::bench_u32 ... bench: 39 ns/iter (+/- 93) = 205 MB/s test hash::sip::bench_u32_keyed ... bench: 40 ns/iter (+/- 88) = 200 MB/s test hash::sip::bench_u64 ... bench: 54 ns/iter (+/- 96) = 148 MB/s ``` After: ``` test hash::sip::bench_bytes_4 ... bench: 41 ns/iter (+/- 3) = 97 MB/s test hash::sip::bench_bytes_7 ... bench: 48 ns/iter (+/- 0) = 145 MB/s test hash::sip::bench_bytes_8 ... bench: 35 ns/iter (+/- 1) = 228 MB/s test hash::sip::bench_bytes_a_16 ... bench: 45 ns/iter (+/- 1) = 355 MB/s test hash::sip::bench_bytes_b_32 ... bench: 60 ns/iter (+/- 0) = 533 MB/s test hash::sip::bench_bytes_c_128 ... bench: 161 ns/iter (+/- 5) = 795 MB/s test hash::sip::bench_long_str ... bench: 514 ns/iter (+/- 5) test hash::sip::bench_str_of_8_bytes ... bench: 44 ns/iter (+/- 0) test hash::sip::bench_str_over_8_bytes ... bench: 51 ns/iter (+/- 0) test hash::sip::bench_str_under_8_bytes ... bench: 52 ns/iter (+/- 6) test hash::sip::bench_u32 ... bench: 40 ns/iter (+/- 2) = 200 MB/s test hash::sip::bench_u32_keyed ... bench: 39 ns/iter (+/- 1) = 205 MB/s test hash::sip::bench_u64 ... bench: 36 ns/iter (+/- 1) = 222 MB/s ``` |
||
---|---|---|
.. | ||
fmt | ||
hash | ||
num | ||
any.rs | ||
atomic.rs | ||
cell.rs | ||
char.rs | ||
clone.rs | ||
cmp.rs | ||
intrinsics.rs | ||
iter.rs | ||
lib.rs | ||
mem.rs | ||
nonzero.rs | ||
ops.rs | ||
option.rs | ||
ptr.rs | ||
result.rs | ||
slice.rs | ||
str.rs | ||
tuple.rs |