The innermost loop of TwoWaySearcher checks the boundary of the haystack
vs position + needle.len(), and it checks the last byte of the needle
against the byteset.
If these two steps are combined by using the indexing of the last
needle byte's position as bounds check, the algorithm improves its
throughput. We improve the innermost loop by reducing the number of
instructions used, and elminating the panic case for the checked
indexing that was previously used.
Selected benchmarks from the external/workspace testsuite. Benchmarks
improve across the board.
```
before:
test bb_in_aa::twoway_find ... bench: 4,229 ns/iter (+/- 1,305) = 23646 MB/s
test bb_in_aa::twoway_rfind ... bench: 3,873 ns/iter (+/- 101) = 25819 MB/s
test short_1let_long::twoway_find ... bench: 7,075 ns/iter (+/- 29) = 360 MB/s
test short_1let_long::twoway_rfind ... bench: 6,640 ns/iter (+/- 79) = 384 MB/s
test short_2let_long::twoway_find ... bench: 3,823 ns/iter (+/- 16) = 667 MB/s
test short_2let_long::twoway_rfind ... bench: 3,774 ns/iter (+/- 44) = 675 MB/s
test short_3let_long::twoway_find ... bench: 3,582 ns/iter (+/- 47) = 712 MB/s
test short_3let_long::twoway_rfind ... bench: 3,616 ns/iter (+/- 34) = 705 MB/s
with this commit:
test bb_in_aa::twoway_find ... bench: 2,952 ns/iter (+/- 20) = 33875 MB/s
test bb_in_aa::twoway_rfind ... bench: 2,939 ns/iter (+/- 99) = 34025 MB/s
test short_1let_long::twoway_find ... bench: 4,593 ns/iter (+/- 4) = 555 MB/s
test short_1let_long::twoway_rfind ... bench: 4,592 ns/iter (+/- 76) = 555 MB/s
test short_2let_long::twoway_find ... bench: 2,804 ns/iter (+/- 3) = 909 MB/s
test short_2let_long::twoway_rfind ... bench: 2,807 ns/iter (+/- 40) = 908 MB/s
test short_3let_long::twoway_find ... bench: 3,105 ns/iter (+/- 120) = 821 MB/s
test short_3let_long::twoway_rfind ... bench: 3,019 ns/iter (+/- 50) = 844 MB/s
```
- `bb_in_aa`: fast skip due to byteset filter loop improves.
- 1/2/3let: Searches for 1, 2, or 3 ascii bytes improves.
Fix quadratic behavior in StrSearcher in reverse search with periodic
needles.
This commit adds the missing pieces for the "short period" case in
reverse search. The short case will show up when the needle is literally
periodic, for example "abababab".
Two way uses a "critical factorization" of the needle: x = u v.
Searching matches v first, if mismatch at character k, skip k forward.
Matching u, if mismatch, skip period(x) forward.
To avoid O(mn) behavior after mismatch in u, memorize the already
matched prefix.
The short period case requires that |u| < period(x).
For the reverse search we need to compute a different critical
factorization x = u' v' where |v'| < period(x), because we are searching
for the reversed needle. A short v' also benefits the algorithm in
general.
The reverse critical factorization is computed quickly by using the same
maximal suffix algorithm, but terminating as soon as we have a location
with local period equal to period(x).
This adds extra fields crit_pos_back and memory_back for the reverse
case. The new overhead for TwoWaySearcher::new is low, and additionally
I think the "short period" case is uncommon in many applications of
string search.
The maximal_suffix methods were updated in documentation and the
algorithms updated to not use !0 and wrapping add, variable left is now
1 larger, offset 1 smaller.
Use periodicity when computing byteset: in the periodic case, just
iterate over one period instead of the whole needle.
Example before (rfind) after (twoway_rfind) benchmark shows the removal
of quadratic behavior.
needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100
```
test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s
test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s
```
part of #24407
I'm not sure whether I should be trying to explain the general rule in the E0210 explanation or just point people to the RFC. However, if we go with the latter option I think that the RFC will need to be revised slightly, since it is not quite as gentle as I would like.
Also, the link to RFC 1023 is not the correct one (it should be https://github.com/rust-lang/rfcs/blob/master/text/1023-rebalancing-coherence.md), but the correct one is too long. I'm aware of @michaelsproul's PR https://github.com/rust-lang/rust/pull/26290 from awhile back, but it doesn't seem to be working. Has there not been a new snapshot yet?
#27360 removed a padding field full of uint8_t's, but didn't remove
the use. This didn't get picked up presumably because (a) bors
doesn't have any BSD builders, and/or (b) #[cfg]'d out blocks don't
get linted.
```
rustc: x86_64-unknown-freebsd/stage1/lib/rustlib/x86_64-unknown-freebsd/lib/liblibc
src/liblibc/lib.rs:1099:42: 1099:49 error: unused import, #[deny(unused_imports)] on by default
src/liblibc/lib.rs:1099 use types::common::c99::{uint8_t, uint32_t, int32_t};
^~~~~~~
error: aborting due to previous error
fatal runtime error: Could not unwind stack, error = 159555904
```
this makes the second code block consistent with the first code block -- other than being in reversed order, the first code block claims b is u16 and c is u32, whereas the second code block claims the opposite. seems to be an obvious typo.
This is *mostly* reducing *my* use of *italics* but there's some other misc changes interspersed as I went along.
This updates the italicizing alphabetically from `a` to `ra`.
r? @steveklabnik
#27360 removed a padding field full of uint8_t's, but didn't remove
the use. This didn't get picked up presumably because (a) bors
doesn't have any BSD builders, and/or (b) #[cfg]'d out blocks don't
get linted.
```
rustc: x86_64-unknown-freebsd/stage1/lib/rustlib/x86_64-unknown-freebsd/lib/liblibc
src/liblibc/lib.rs:1099:42: 1099:49 error: unused import, #[deny(unused_imports)] on by default
src/liblibc/lib.rs:1099 use types::common::c99::{uint8_t, uint32_t, int32_t};
^~~~~~~
error: aborting due to previous error
fatal runtime error: Could not unwind stack, error = 159555904
```
There are still problems in both the design and implementation of this, so we don't want it landing in 1.2.
cc @arielb1 @nikomatsakis
cc #27364
r? @alexcrichton
The following APIs were all marked with a `#[stable]` tag:
* process::Child::id
* error::Error::is
* error::Error::downcast
* error::Error::downcast_ref
* error::Error::downcast_mut
* io::Error::get_ref
* io::Error::get_mut
* io::Error::into_inner
* hash::Hash::hash_slice
* hash::Hasher::write_{i,u}{8,16,32,64,size}
This isn't actually necessary any more with the advent of `$crate` and changes
in the compiler to expand macros to `::core::$foo` in the context of a
`#![no_std]` crate.
The libcore inner module was also trimmed down a bit to the bare bones.
As there’s no C++ runtime any more there’s really no point in having anything but Rust tags being made.
I’ve also taken the liberty of excluding the compiler parts of this in the `librust%,,` pattern substitution. Whether or not this is “correct” will depend on whether you want tags for the compiler or for general use. For myself, I want it for general use.
I’m not sure how much people use the tags files anyway. I definitely do, but with Racer existing the tags files aren’t quite so necessary.
This is a minor [breaking-change], as it changes what
`boxed_str.to_owned()` does (previously it would deref to `&str` and
call `to_owned` on that to get a `String`). However `Box<str>` is such an
exceptionally rare type that this is not expected to be a serious
concern. Also a `Box<str>` can be freely converted to a `String` to
obtain the previous result anyway.