rust/library
bors 9340e5c1b9 Auto merge of #103779 - the8472:simd-str-contains, r=thomcc
x86_64 SSE2 fast-path for str.contains(&str) and short needles

Based on Wojciech Muła's [SIMD-friendly algorithms for substring searching](http://0x80.pl/articles/simd-strfind.html#sse-avx2)

The two-way algorithm is Big-O efficient but it needs to preprocess the needle
to find a "critical factorization" of it. This additional work is significant
for short needles. Additionally it mostly advances needle.len() bytes at a time.

The SIMD-based approach used here on the other hand can advance based on its
vector width, which can exceed the needle length. Except for pathological cases,
but due to being limited to small needles the worst case blowup is also small.

benchmarks taken on a Zen2, compiled with `-Ccodegen-units=1`:

```
OLD:
test str::bench_contains_16b_in_long                     ... bench:         504 ns/iter (+/- 14) = 5061 MB/s
test str::bench_contains_2b_repeated_long                ... bench:         948 ns/iter (+/- 175) = 2690 MB/s
test str::bench_contains_32b_in_long                     ... bench:         445 ns/iter (+/- 6) = 5732 MB/s
test str::bench_contains_bad_naive                       ... bench:         130 ns/iter (+/- 1) = 569 MB/s
test str::bench_contains_bad_simd                        ... bench:          84 ns/iter (+/- 8) = 880 MB/s
test str::bench_contains_equal                           ... bench:         142 ns/iter (+/- 7) = 394 MB/s
test str::bench_contains_short_long                      ... bench:         677 ns/iter (+/- 25) = 3768 MB/s
test str::bench_contains_short_short                     ... bench:          27 ns/iter (+/- 2) = 2074 MB/s

NEW:
test str::bench_contains_16b_in_long                     ... bench:          82 ns/iter (+/- 0) = 31109 MB/s
test str::bench_contains_2b_repeated_long                ... bench:          73 ns/iter (+/- 0) = 34945 MB/s
test str::bench_contains_32b_in_long                     ... bench:          71 ns/iter (+/- 1) = 35929 MB/s
test str::bench_contains_bad_naive                       ... bench:           7 ns/iter (+/- 0) = 10571 MB/s
test str::bench_contains_bad_simd                        ... bench:          97 ns/iter (+/- 41) = 762 MB/s
test str::bench_contains_equal                           ... bench:           4 ns/iter (+/- 0) = 14000 MB/s
test str::bench_contains_short_long                      ... bench:          73 ns/iter (+/- 0) = 34945 MB/s
test str::bench_contains_short_short                     ... bench:          12 ns/iter (+/- 0) = 4666 MB/s
```
2022-11-17 04:47:11 +00:00
..
alloc generalize str.contains() tests to a range of haystack sizes 2022-11-15 18:30:07 +01:00
backtrace@07872f28cd Update backtrace 2022-09-02 16:09:58 -04:00
core Auto merge of #103779 - the8472:simd-str-contains, r=thomcc 2022-11-17 04:47:11 +00:00
panic_abort Remove std's transitive dependency on cfg-if 0.1 2022-11-02 18:01:20 -04:00
panic_unwind Remove std's transitive dependency on cfg-if 0.1 2022-11-02 18:01:20 -04:00
portable-simd Fix rustdoc lints 2022-11-06 17:21:22 -05:00
proc_macro Bump version placeholders to release 2022-11-06 17:11:02 -05:00
profiler_builtins
rtstartup Remove custom frame info registration on i686-pc-windows-gnu 2022-08-23 16:12:58 +08:00
rustc-std-workspace-alloc
rustc-std-workspace-core
rustc-std-workspace-std
std Rollup merge of #104401 - RalfJung:mpsc-leak, r=Amanieu 2022-11-16 08:36:12 +01:00
stdarch@790411f93c library: update stdarch submodule 2022-10-13 09:41:16 +08:00
test Rollup merge of #103681 - RalfJung:libtest-thread, r=thomcc 2022-11-04 18:52:26 +01:00
unwind Move most of unwind's build script to lib.rs 2022-11-14 14:24:12 +00:00