This commit stabilizes the prelude::v1 module of libcore after verifying that
it's a subset of the prelude of the standard library with the addition of a few
extension traits.
StrSearcher: Implement the complete reverse case for the two way algorithm
Fix quadratic behavior in StrSearcher in reverse search with periodic
needles.
This commit adds the missing pieces for the "short period" case in
reverse search. The short case will show up when the needle is literally
periodic, for example "abababab".
Two way uses a "critical factorization" of the needle: x = u v.
Searching matches v first, if mismatch at character k, skip k forward.
Matching u, if mismatch, skip period(x) forward.
To avoid O(mn) behavior after mismatch in u, memorize the already
matched prefix.
The short period case requires that |u| < period(x).
For the reverse search we need to compute a different critical
factorization x = u' v' where |v'| < period(x), because we are searching
for the reversed needle. A short v' also benefits the algorithm in
general.
The reverse critical factorization is computed quickly by using the same
maximal suffix algorithm, but terminating as soon as we have a location
with local period equal to period(x).
This adds extra fields crit_pos_back and memory_back for the reverse
case. The new overhead for TwoWaySearcher::new is low, and additionally
I think the "short period" case is uncommon in many applications of
string search.
The maximal_suffix methods were updated in documentation and the
algorithms updated to not use !0 and wrapping add, variable left is now
1 larger, offset 1 smaller.
Use periodicity when computing byteset: in the periodic case, just
iterate over one period instead of the whole needle.
Example before (rfind) after (twoway_rfind) benchmark shows the removal
of quadratic behavior.
needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100
```
test periodic::rfind ... bench: 1,926,595 ns/iter (+/- 11,390) = 10 MB/s
test periodic::twoway_rfind ... bench: 51,740 ns/iter (+/- 66) = 386 MB/s
```
This implements https://github.com/rust-lang/rfcs/pull/1199 (except for doing all the platform intrinsics).
Things remaining for SIMD (not necessarily in this PR):
- [x] I (@huonw) am signed up to ensure the compiler matches the RFC, when it lands
- [x] the platform specific intrinsics aren't properly type checked at the moment (LLVM will throw a "random" assertion)
- [ ] there's a lot of useful intrinsics that are missing, including whole platforms (mips, powerpc)
- [ ] the target-feature `cfg` detection/adding is not so great at the moment
- [x] I think the platform specific intrinsics should go in their own `extern` ABI (i.e. not `"rust-intrinsic"`)
(I'm adjusting the RFC to reflect the latter.)
I think it would be very nice for this to land without requiring the RFC to land first, because of the first point, and because this is the only way for any further work to happen/be experimented with, without requiring people to build/install/multirust a compiler from a custom branch.
r? @alexcrichton
This is purposely separate to the "rust-intrinsic" ABI, because these
intrinsics are theoretically going to become stable, and should be fine
to be independent of the compiler/language internals since they're
intimately to the platform.
This is theoretically a breaking change, but GitHub search turns up no
uses of it, and most non-built-in cfg's are passed via cargo features,
which look like `feature = "..."`, and hence can't overlap.
All of the modules in the standard library were just straight reexports of those
in libcore, so remove all the "macro modules" from the standard library and just
reexport what's in core directly.
This commit removes the call to `panic!("Some tests failed")` at the end of all
tests run when running with libtest. The panic is replaced with
`std::process::exit` to have a nonzero error code, but this change both:
1. Makes the test runner no longer print out the extraneous panic message at the
end of a failing test run that some tests failed. (this is already summarized
in the output of the test run).
2. When running tests with `RUST_BACKTRACE` set it removes an extraneous
backtrace from the output (only failing tests will have their backtraces in
the output.