rust/library/core/tests
bors ffdf18d144 Auto merge of #88788 - falk-hueffner:speedup-int-log10-branchless, r=joshtriplett
Speedup int log10 branchless

This is achieved with a branchless, bit-twiddling implementation of the case x < 100_000, which is then used as a building block for the wider integer types.
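
As a rough illustration of the building-block idea (a minimal sketch, not the exact bit-twiddling formulation merged in this PR), the small case can be computed without data-dependent branches and then reused for wider widths:

```rust
// Sketch only: for 1 <= x < 100_000, sum threshold comparisons, which
// typically compile to flag-setting instructions rather than jumps.
fn log10_lt_100_000(x: u32) -> u32 {
    debug_assert!(x >= 1 && x < 100_000);
    (x >= 10) as u32 + (x >= 100) as u32 + (x >= 1_000) as u32 + (x >= 10_000) as u32
}

// The small case then serves as a building block, e.g. for u32:
fn log10_u32(x: u32) -> u32 {
    debug_assert!(x > 0);
    if x >= 100_000 {
        // x / 100_000 < 100_000 for any u32, so the small case applies.
        5 + log10_lt_100_000(x / 100_000)
    } else {
        log10_lt_100_000(x)
    }
}
```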

Benchmark on an Intel i7-8700K (Coffee Lake):

```
name                                   old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
num::int_log::u8_log10_predictable     165          169                     4    2.42%   x 0.98
num::int_log::u8_log10_random          438          423                   -15   -3.42%   x 1.04
num::int_log::u8_log10_random_small    438          423                   -15   -3.42%   x 1.04
num::int_log::u16_log10_predictable    633          417                  -216  -34.12%   x 1.52
num::int_log::u16_log10_random         908          471                  -437  -48.13%   x 1.93
num::int_log::u16_log10_random_small   945          471                  -474  -50.16%   x 2.01
num::int_log::u32_log10_predictable    1,496        1,340                -156  -10.43%   x 1.12
num::int_log::u32_log10_random         1,076        873                  -203  -18.87%   x 1.23
num::int_log::u32_log10_random_small   1,145        874                  -271  -23.67%   x 1.31
num::int_log::u64_log10_predictable    4,005        3,171                -834  -20.82%   x 1.26
num::int_log::u64_log10_random         1,247        1,021                -226  -18.12%   x 1.22
num::int_log::u64_log10_random_small   1,265        921                  -344  -27.19%   x 1.37
num::int_log::u128_log10_predictable   39,667       39,579                -88   -0.22%   x 1.00
num::int_log::u128_log10_random        6,456        6,696                 240    3.72%   x 0.96
num::int_log::u128_log10_random_small  4,108        3,903                -205   -4.99%   x 1.05
```

Benchmark on an M1 Mac Mini:

```
name                                   old ns/iter  new ns/iter  diff ns/iter   diff %  speedup
num::int_log::u8_log10_predictable     143          130                   -13   -9.09%   x 1.10
num::int_log::u8_log10_random          375          325                   -50  -13.33%   x 1.15
num::int_log::u8_log10_random_small    376          325                   -51  -13.56%   x 1.16
num::int_log::u16_log10_predictable    500          322                  -178  -35.60%   x 1.55
num::int_log::u16_log10_random         794          405                  -389  -48.99%   x 1.96
num::int_log::u16_log10_random_small   1,035        405                  -630  -60.87%   x 2.56
num::int_log::u32_log10_predictable    1,144        894                  -250  -21.85%   x 1.28
num::int_log::u32_log10_random         832          786                   -46   -5.53%   x 1.06
num::int_log::u32_log10_random_small   832          787                   -45   -5.41%   x 1.06
num::int_log::u64_log10_predictable    2,681        2,057                -624  -23.27%   x 1.30
num::int_log::u64_log10_random         1,015        806                  -209  -20.59%   x 1.26
num::int_log::u64_log10_random_small   1,004        795                  -209  -20.82%   x 1.26
num::int_log::u128_log10_predictable   56,825       56,526               -299   -0.53%   x 1.01
num::int_log::u128_log10_random        9,056        8,861                -195   -2.15%   x 1.02
num::int_log::u128_log10_random_small  1,528        1,527                  -1   -0.07%   x 1.00
```

The 128-bit case remains ridiculously slow because LLVM fails to optimize division by a constant 128-bit value into multiplications. This could be worked around, but it seems preferable to fix this in LLVM.
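
A hypothetical illustration of that codegen difference (consistent with the claim above, not taken from this PR): the 64-bit division by a constant is strength-reduced to a multiply-and-shift sequence, while the 128-bit one ends up as a call to a slow runtime division helper such as `__udivti3`:

```rust
// u64 / constant: LLVM emits a multiply-by-reciprocal sequence.
pub fn u64_div_by_ten_pow_5(x: u64) -> u64 {
    x / 100_000
}

// u128 / constant 128-bit value: currently lowered to a library call
// (e.g. __udivti3) instead of multiplications.
pub fn u128_div_by_ten_pow_20(x: u128) -> u128 {
    x / 100_000_000_000_000_000_000
}
```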

From u32 upward, a table lookup (as suggested [here](https://github.com/rust-lang/rust/issues/70887#issuecomment-881099813)) is still faster, but it requires a hardware `leading_zeros` instruction to be viable and might clog up the cache.
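
For comparison, a sketch of that table-lookup approach for u32 (assumed shape, not the exact code from the linked comment): derive a guess from floor(log2) via `leading_zeros`, then correct it with a single table comparison.

```rust
const POWERS_OF_10: [u32; 10] =
    [1, 10, 100, 1_000, 10_000, 100_000, 1_000_000, 10_000_000, 100_000_000, 1_000_000_000];

fn log10_u32_lookup(x: u32) -> u32 {
    debug_assert!(x > 0);
    // 1233 / 4096 approximates log10(2); the resulting guess is exact or one too high.
    let guess = ((32 - x.leading_zeros()) * 1233) >> 12;
    guess - (x < POWERS_OF_10[guess as usize]) as u32
}
```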
2021-10-12 03:18:54 +00:00
fmt Use a test value that doesn't depend on the handling of even/odd rounding 2021-10-03 20:15:12 -07:00
hash move object safety test to library/core 2021-08-15 13:00:25 -04:00
iter implement advance_(back_)_by on more iterators 2021-09-30 21:23:28 +02:00
num Auto merge of #88788 - falk-hueffner:speedup-int-log10-branchless, r=joshtriplett 2021-10-12 03:18:54 +00:00
ops
alloc.rs
any.rs Add test for issue 84666. 2021-06-03 16:13:45 +02:00
array.rs Also cfg flag auxiliar function 2021-10-08 06:40:24 -03:00
ascii.rs
atomic.rs
bool.rs
cell.rs Add a few tests for UnsafeCell 2021-08-31 16:32:01 -07:00
char.rs Further simplification of to_digit 2021-06-10 20:16:35 +01:00
clone.rs
cmp.rs
const_ptr.rs Revert "Revert tests added by PR 81167." 2021-06-27 12:05:17 +02:00
intrinsics.rs
lazy.rs
lib.rs Rollup merge of #75644 - c410-f3r:array, r=yaahc 2021-10-09 17:08:38 +02:00
macros.rs Allow leading pipe in matches!() patterns. 2021-07-15 22:05:45 +03:00
manually_drop.rs Test ManuallyDrop::clone_from. 2021-07-05 11:55:45 +00:00
mem.rs Remove the deprecated core::raw and std::raw module. 2021-07-03 14:03:27 +08:00
nonzero.rs
ops.rs
option.rs const fn for option copied, take & replace + tests 2021-08-29 13:19:17 +02:00
pattern.rs
pin.rs
ptr.rs Bump cfgs 2021-04-04 14:57:05 -04:00
result.rs Update to new bootstrap compiler 2021-06-28 11:30:49 -04:00
slice.rs Move to the top of file 2021-08-31 08:28:51 -07:00
str_lossy.rs
str.rs
task.rs
time.rs Make Duration's Debug format pad to width 2021-09-16 03:09:31 +02:00
tuple.rs
unicode.rs