rust/benches at 7f319c74783b9e0684f1a2e29b977e0518257277 - rust

History

bors caca2121ff Auto merge of #74024 - Folyd:master, r=m-ou-se

Improve slice.binary_search_by()'s best-case performance to O(1)

This PR aimed to improve the [slice.binary_search_by()](https://doc.rust-lang.org/std/primitive.slice.html#method.binary_search_by)'s best-case performance to O(1).

# Noticed

I don't know why the docs of `binary_search_by` said `"If there are multiple matches, then any one of the matches could be returned."`, but the implementation isn't the same thing. Actually, it returns the **last one** if multiple matches found.

Then we got two options:

## If returns the last one is the correct or desired result

Then I can rectify the docs and revert my changes.

## If the docs are correct or desired result

Then my changes can be merged after fully reviewed.

However, if my PR gets merged, another issue raised: this could be a **breaking change** since if multiple matches found, the returning order no longer the last one instead of it could be any one.

For example:
```rust
let mut s = vec![0, 1, 1, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55];
let num = 1;
let idx = s.binary_search(&num);
s.insert(idx, 2);

// Old implementations
assert_eq!(s, [0, 1, 1, 1, 1, 2, 2, 3, 5, 8, 13, 21, 34, 42, 55]);

// New implementations
assert_eq!(s, [0, 1, 1, 1, 2, 1, 2, 3, 5, 8, 13, 21, 34, 42, 55]);
```

# Benchmarking

**Old implementations**
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 4)
test slice::binary_search_l1_with_dups ... bench:          59 ns/iter (+/- 3)
test slice::binary_search_l2           ... bench:          76 ns/iter (+/- 5)
test slice::binary_search_l2_with_dups ... bench:          77 ns/iter (+/- 17)
test slice::binary_search_l3           ... bench:         183 ns/iter (+/- 23)
test slice::binary_search_l3_with_dups ... bench:         185 ns/iter (+/- 19)
```

**New implementations (1)**

Implemented by this PR.
```rust
if cmp == Equal {
    return Ok(mid);
} else if cmp == Less {
    base = mid
}
```
```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          58 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 4)
test slice::binary_search_l2           ... bench:          76 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 6)
test slice::binary_search_l3           ... bench:         200 ns/iter (+/- 30)
test slice::binary_search_l3_with_dups ... bench:         157 ns/iter (+/- 6)

$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 8)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 2)
test slice::binary_search_l2           ... bench:          77 ns/iter (+/- 2)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 2)
test slice::binary_search_l3           ... bench:         198 ns/iter (+/- 21)
test slice::binary_search_l3_with_dups ... bench:         158 ns/iter (+/- 11)

```

**New implementations (2)**

Suggested by `@nbdd0121` in [comment](https://github.com/rust-lang/rust/pull/74024#issuecomment-665430239).
```rust
base = if cmp == Greater { base } else { mid };
if cmp == Equal { break }
```

```sh
$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          59 ns/iter (+/- 7)
test slice::binary_search_l1_with_dups ... bench:          37 ns/iter (+/- 5)
test slice::binary_search_l2           ... bench:          75 ns/iter (+/- 3)
test slice::binary_search_l2_with_dups ... bench:          56 ns/iter (+/- 3)
test slice::binary_search_l3           ... bench:         195 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench:         151 ns/iter (+/- 7)

$ ./x.py bench --stage 1 library/libcore
test slice::binary_search_l1           ... bench:          57 ns/iter (+/- 2)
test slice::binary_search_l1_with_dups ... bench:          38 ns/iter (+/- 2)
test slice::binary_search_l2           ... bench:          77 ns/iter (+/- 11)
test slice::binary_search_l2_with_dups ... bench:          57 ns/iter (+/- 4)
test slice::binary_search_l3           ... bench:         194 ns/iter (+/- 15)
test slice::binary_search_l3_with_dups ... bench:         151 ns/iter (+/- 18)

```

I run some benchmarking testings against on two implementations. The new implementation has a lot of improvement in duplicates cases, while in `binary_search_l3` case, it's a little bit slower than the old one.

2021-03-05 20:12:13 +00:00

ascii

…

char

Slight perf improvement on char::to_ascii_lowercase

2021-02-06 19:56:43 +00:00

hash

…

num

…

any.rs

…