Commit Graph

94 Commits

Author SHA1 Message Date
Alex Crichton
48615a68fb std: Account for CRLF in {str, BufRead}::lines
This commit is an implementation of [RFC 1212][rfc] which tweaks the behavior of
the `str::lines` and `BufRead::lines` iterators. Both iterators now account for
`\r\n` sequences in addition to `\n`, allowing for less surprising behavior
across platforms (especially in the `BufRead` case). Splitting *only* on the
`\n` character can still be achieved with `split('\n')` in both cases.

The `str::lines_any` function is also now deprecated as `str::lines` is a
drop-in replacement for it.
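
A minimal sketch of the difference described above, assuming a current Rust toolchain. The helper names `lines_of` and `split_on_newline` are hypothetical and exist only for illustration:

```rust
/// Collects the lines of `text` using `str::lines`, which treats both
/// `\n` and `\r\n` as line terminators.
fn lines_of(text: &str) -> Vec<&str> {
    text.lines().collect()
}

/// Splits strictly on `\n`, so a preceding `\r` stays attached to the line.
fn split_on_newline(text: &str) -> Vec<&str> {
    text.split('\n').collect()
}

fn main() {
    let text = "one\r\ntwo\nthree";
    println!("{:?}", lines_of(text));
    println!("{:?}", split_on_newline(text));
}
```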

[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/1212-line-endings.md

Closes #28032
2015-09-03 23:01:41 -07:00
bors
dfe9326941 Auto merge of #28148 - eefriedman:binary_heap, r=alexcrichton 2015-09-02 01:33:20 +00:00
Eli Friedman
b82c42c153 Add missing stability markings to BinaryHeap. 2015-09-01 01:22:57 -07:00
bors
b0f77ba26a Auto merge of #28101 - ijks:24214-str-bytes, r=alexcrichton
Specifically, `count`, `last`, and `nth` are implemented to use the
methods of the underlying slice iterator.

Partially closes #24214.
2015-08-31 09:15:55 +00:00
Daan Rijks
dacf2725ec Add overrides to iterator methods for str::Bytes
Specifically, `count`, `last`, and `nth` are implemented to use the
methods of the underlying slice iterator.
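
For illustration, the three overridden methods can be exercised as follows. This is only a usage sketch (the helper `byte_stats` is a hypothetical name); it shows the observable results, not the override itself:

```rust
/// Returns (count, last byte, byte at index 1) for the bytes of `s`.
fn byte_stats(s: &str) -> (usize, Option<u8>, Option<u8>) {
    (s.bytes().count(), s.bytes().last(), s.bytes().nth(1))
}

fn main() {
    println!("{:?}", byte_stats("abc"));
}
```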

Partially closes #24214.
2015-08-30 17:32:50 +02:00
Andrew Paseltiner
f9b63d3973 implement RFC 1194 2015-08-28 12:41:54 -04:00
bors
de67d62c6b Auto merge of #27474 - bluss:twoway-reverse, r=brson
StrSearcher: Implement the complete reverse case for the two way algorithm

Fix quadratic behavior in StrSearcher in reverse search with periodic
needles.

This commit adds the missing pieces for the "short period" case in
reverse search. The short case will show up when the needle is literally
periodic, for example "abababab".

Two way uses a "critical factorization" of the needle: x = u v.

Searching matches v first; on a mismatch at character k, skip k forward.
Then match u; on a mismatch, skip period(x) forward.

To avoid O(mn) behavior after mismatch in u, memorize the already
matched prefix.

The short period case requires that |u| < period(x).

For the reverse search we need to compute a different critical
factorization x = u' v' where |v'| < period(x), because we are searching
for the reversed needle. A short v' also benefits the algorithm in
general.

The reverse critical factorization is computed quickly by using the same
maximal suffix algorithm, but terminating as soon as we have a location
with local period equal to period(x).

This adds extra fields crit_pos_back and memory_back for the reverse
case. The new overhead for TwoWaySearcher::new is low, and additionally
I think the "short period" case is uncommon in many applications of
string search.

The maximal_suffix methods were updated in documentation and the
algorithms updated to not use !0 and wrapping add, variable left is now
1 larger, offset 1 smaller.

Use periodicity when computing byteset: in the periodic case, just
iterate over one period instead of the whole needle.

Example before (rfind) after (twoway_rfind) benchmark shows the removal
of quadratic behavior.

needle: "ab" * 100, haystack: ("bb" + "ab" * 100) * 100

```
test periodic::rfind           ... bench:   1,926,595 ns/iter (+/- 11,390) = 10 MB/s
test periodic::twoway_rfind    ... bench:      51,740 ns/iter (+/- 66) = 386 MB/s
```
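
The benchmark input above can be reproduced roughly as follows. This is a sketch that rebuilds the needle and haystack with `str::repeat` and calls `rfind`; the function name `periodic_rfind` is hypothetical and this is not the original benchmark code:

```rust
/// Builds needle = "ab" * 100 and haystack = ("bb" + "ab" * 100) * 100,
/// then searches for the needle from the back.
fn periodic_rfind() -> Option<usize> {
    let needle = "ab".repeat(100);
    let unit = format!("bb{}", needle);
    let haystack = unit.repeat(100);
    haystack.rfind(needle.as_str())
}

fn main() {
    // The last occurrence starts right after the final "bb".
    println!("{:?}", periodic_rfind());
}
```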
2015-08-18 02:02:57 +00:00
bors
e2bebf32fa Auto merge of #27696 - bluss:into-boxed-str, r=alexcrichton
Rename String::into_boxed_slice -> into_boxed_str

This is the name that was decided in rust-lang/rfcs#1152, and it's
better if we say “boxed str” for `Box<str>`.

The old name `String::into_boxed_slice` is deprecated.
2015-08-14 01:06:37 +00:00
Ulrik Sverdrup
bec64090a7 Rename String::into_boxed_slice -> into_boxed_str
This is the name that was decided in rust-lang/rfcs#1152, and it's
better if we say “boxed str” for `Box<str>`.

The old name `String::into_boxed_slice` is deprecated.
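
A short usage sketch of the renamed method together with the reverse conversion, assuming a current toolchain. The helper `round_trip` is a hypothetical name:

```rust
/// Converts a `String` to `Box<str>` and back again.
fn round_trip(s: String) -> String {
    let boxed: Box<str> = s.into_boxed_str();
    boxed.into_string()
}

fn main() {
    println!("{}", round_trip(String::from("boxed str")));
}
```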
2015-08-13 14:02:00 +02:00
Alex Crichton
8d90d3f368 Remove all unstable deprecated functionality
This commit removes all unstable and deprecated functions in the standard
library. A release was recently cut (1.3) which makes this a good time for some
spring cleaning of the deprecated functions.
2015-08-12 14:55:17 -07:00
Ulrik Sverdrup
c5a1d8c3db StrSearcher: Add tests for rfind(&str)
Add tests for .rfind(&str), using the reverse searcher case for
substring search.
2015-08-02 20:08:35 +02:00
Alexis Beingessner
3e954a8cb2 implement Clone for Box<str>, closes #27323
This is a minor [breaking-change], as it changes what
`boxed_str.to_owned()` does (previously it would deref to `&str` and
call `to_owned` on that to get a `String`). However `Box<str>` is such an
exceptionally rare type that this is not expected to be a serious
concern. Also a `Box<str>` can be freely converted to a `String` to
obtain the previous behaviour anyway.
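
The behaviour change can be sketched like this, assuming a current toolchain; `duplicate` and `to_string_explicit` are hypothetical helper names. With `Clone` implemented, `to_owned` on a `Box<str>` yields another `Box<str>`, and the explicit conversion recovers a `String`:

```rust
/// Both `clone` and `to_owned` now produce a `Box<str>`.
fn duplicate(boxed: &Box<str>) -> (Box<str>, Box<str>) {
    (boxed.clone(), boxed.to_owned())
}

/// Explicit conversion to `String`, matching the old `to_owned` result.
fn to_string_explicit(boxed: Box<str>) -> String {
    boxed.into_string()
}

fn main() {
    let boxed: Box<str> = "hello".into();
    let (cloned, owned) = duplicate(&boxed);
    println!("{} {} {}", cloned, owned, to_string_explicit(boxed));
}
```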
2015-07-29 18:43:01 -07:00
Jonathan Reem
e24423091f Implement Clone for Box<[T]> where T: Clone
Closes #25097
2015-07-28 01:43:17 -07:00
Alexis Beingessner
bfa0e1f58a Add RawVec to unify raw Vecish code 2015-07-17 08:29:15 -07:00
bors
dd46cf8b22 Auto merge of #26241 - SimonSapin:derefmut-for-string, r=alexcrichton
See https://github.com/rust-lang/rfcs/issues/1157
2015-07-13 23:47:06 +00:00
Simon Sapin
3226858e50 Fix tests for changes in #26241. 2015-07-13 23:28:58 +02:00
Simon Sapin
7469914e96 Add str::split_at_mut 2015-07-13 16:21:43 +02:00
bors
05d8767289 Auto merge of #26957 - wesleywiser:rename_connect_to_join, r=alexcrichton
Fixes #26900
2015-07-12 22:05:59 +00:00
bors
50d305e498 Auto merge of #26966 - nagisa:tail-init, r=alexcrichton
Fixes #26906
2015-07-12 13:16:24 +00:00
Jonathan Reem
69521affbb Add String::into_boxed_slice and Box<str>::into_string
Implements merged RFC 1152.

Closes #26697.
2015-07-11 21:31:56 -07:00
Simonas Kazlauskas
7a90865db5 Implement RFC 1058 2015-07-12 00:47:56 +03:00
Wesley Wiser
93ddee6cee Change some instances of .connect() to .join() 2015-07-10 19:40:46 -04:00
Ulrik Sverdrup
836f32e769 Use vec![elt; n] where possible
The common pattern `iter::repeat(elt).take(n).collect::<Vec<_>>()` is
exactly equivalent to `vec![elt; n]`, do this replacement in the whole
tree.

(Actually, vec![] is smart enough to only call clone n - 1 times, while
the former solution would call clone n times, and this fact is
virtually irrelevant in practice.)
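
A small sketch of the two forms side by side; `via_repeat` and `via_macro` are hypothetical names used only to compare the results:

```rust
use std::iter;

/// The older pattern: repeat the element, take `n`, and collect.
fn via_repeat<T: Clone>(elt: T, n: usize) -> Vec<T> {
    iter::repeat(elt).take(n).collect::<Vec<_>>()
}

/// The replacement: the `vec!` macro with a repeat count.
fn via_macro<T: Clone>(elt: T, n: usize) -> Vec<T> {
    vec![elt; n]
}

fn main() {
    println!("{:?}", via_repeat(7u8, 3));
    println!("{:?}", via_macro(7u8, 3));
}
```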
2015-07-09 11:05:32 +02:00
bors
7fc0675f35 Auto merge of #26327 - bluss:two-way, r=aturon
Update substring search to use the Two Way algorithm

To improve our substring search performance, revive the two way searcher
and adapt it to the Pattern API.

Fixes #25483, a performance bug: that particular case now completes faster
in optimized rust than in ruby (but they share the same order of magnitude).

Many thanks to @gereeter who helped me understand the reverse case
better and wrote the comment explaining `next_back` in the code.

I used quickcheck to fuzz test forward and reverse searching thoroughly.

The two way searcher implements both forward and reverse search,
but not double ended search. The forward and reverse parts of the two
way searcher are completely independent.

The two way searcher algorithm has very small, constant space overhead,
requiring no dynamic allocation. Our implementation is relatively fast,
especially due to the `byteset` addition to the algorithm, which speeds
up many no-match cases.

A bad case for the two way algorithm is:

```
let haystack = (0..10_000).map(|_| "dac").collect::<String>();
let needle = (0..100).map(|_| "bac").collect::<String>();
```

For this particular case, two way is not much faster than the naive
implementation it replaces.
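
As a self-contained sketch, the bad case can be run through `str::find` like this. The wrapper `bad_case` is a hypothetical name, and no timing is measured here:

```rust
/// Builds the haystack and needle from the bad case and searches forward.
fn bad_case() -> Option<usize> {
    let haystack = (0..10_000).map(|_| "dac").collect::<String>();
    let needle = (0..100).map(|_| "bac").collect::<String>();
    // The haystack contains no 'b', so the search finds nothing.
    haystack.find(needle.as_str())
}

fn main() {
    println!("{:?}", bad_case());
}
```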
2015-06-30 18:09:51 +00:00
Johannes Oertel
239d9c2b09 Remove remaining use of bit_vec_append_splitoff feature gate. 2015-06-24 12:08:57 +02:00
Ulrik Sverdrup
b890b7bbc7 StrSearcher: Update substring search to use the Two Way algorithm
To improve our substring search performance, revive the two way searcher
and adapt it to the Pattern API.

Fixes #25483, a performance bug: that particular case now completes faster
in optimized rust than in ruby (but they share the same order of magnitude).

Many thanks to @gereeter who helped me understand the reverse case
better and wrote the comment explaining `next_back` in the code.

I used quickcheck to fuzz test forward and reverse searching thoroughly.

The two way searcher implements both forward and reverse search,
but not double ended search. The forward and reverse parts of the two
way searcher are completely independent.

The two way searcher algorithm has very small, constant space overhead,
requiring no dynamic allocation. Our implementation is relatively fast,
especially due to the `byteset` addition to the algorithm, which speeds
up many no-match cases.

A bad case for the two way algorithm is:

```
let haystack = (0..10_000).map(|_| "dac").collect::<String>();
let needle = (0..100).map(|_| "bac").collect::<String>();
```

For this particular case, two way is not much faster than the naive
implementation it replaces.
2015-06-21 19:58:50 +02:00
Alex Crichton
b4a2823cd6 More test fixes and fallout of stability changes 2015-06-17 09:07:17 -07:00
Alex Crichton
ce1a965cf5 Fallout in tests and docs from feature renamings 2015-06-17 09:07:16 -07:00
bors
b5b3a99f84 Auto merge of #26190 - Veedrac:no-iter, r=alexcrichton
Pull request for #26188.
2015-06-11 18:10:08 +00:00
bors
2fbbd54afe Auto merge of #26122 - bluss:borrow-box, r=alexcrichton
Implement Borrow<T> and BorrowMut<T> for Box<T: ?Sized>
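
A brief sketch of what these impls allow, assuming a current toolchain. The generic helper `len_of` is hypothetical; it accepts anything that can be borrowed as `str`, including a `Box<str>`:

```rust
use std::borrow::Borrow;

/// Returns the byte length of anything that borrows as `str`.
fn len_of<B: Borrow<str>>(value: B) -> usize {
    let s: &str = value.borrow();
    s.len()
}

fn main() {
    let boxed: Box<str> = "abc".into();
    println!("{}", len_of(boxed));
    println!("{}", len_of(String::from("hello")));
}
```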
2015-06-11 03:25:45 +00:00
bors
fbb13543fc Auto merge of #25839 - bluss:str-split-at-impl, r=alexcrichton
Implement RFC rust-lang/rfcs#1123

Add str method str::split_at(mid: usize) -> (&str, &str).

Also a minor cleanup in the collections::str module. Remove redundant slicing of self.
2015-06-11 00:22:27 +00:00
Joshua Landau
ca7418b846 Removed many pointless calls to *iter() and iter_mut() 2015-06-10 21:14:03 +01:00
Ulrik Sverdrup
d43bf53948 Add str::split_at
Implement RFC rust-lang/rfcs#1123

Add str method str::split_at(mid: usize) -> (&str, &str).
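
A minimal usage sketch; `halves` is a hypothetical wrapper. Note that `split_at` panics if `mid` does not fall on a `char` boundary:

```rust
/// Splits `s` into the parts before and after byte offset `mid`.
fn halves(s: &str, mid: usize) -> (&str, &str) {
    s.split_at(mid)
}

fn main() {
    let (left, right) = halves("hello world", 5);
    println!("{:?} {:?}", left, right);
}
```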
2015-06-10 09:15:07 +02:00
bors
f06e026578 Auto merge of #26039 - SimonSapin:case-mapping, r=alexcrichton
* Add “complex” mappings to `char::to_lowercase` and `char::to_uppercase`, making them sometimes yield more than one `char`: #25800. `str::to_lowercase` and `str::to_uppercase` are affected as well.
* Add `char::to_titlecase`, since it’s the same algorithm (just different data). However this does **not** add `str::to_titlecase`, as that would require UAX#29 Unicode Text Segmentation which we decided not to include in `std`: https://github.com/rust-lang/rfcs/pull/1054 I made `char::to_titlecase` immediately `#[stable]`, since it’s so similar to `char::to_uppercase` that’s already stable. Let me know if it should be `#[unstable]` for a while.
* Add a special case for upper-case Sigma in word-final position in `str::to_lowercase`: #26035. This is the only language-independent conditional mapping currently in `SpecialCasing.txt`.
* Stabilize `str::to_lowercase` and `str::to_uppercase`. The `&self -> String` on `str` signature seems straightforward enough, and the only relevant issue I’ve found is #24536 about naming. But `char` already has stable methods with the same name, and deprecating them for a rename doesn’t seem worth it.
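
The word-final Sigma rule from the list above can be sketched as follows, assuming a current toolchain. The helper `lower` is a hypothetical name, and the Greek word is only sample input:

```rust
/// Lowercases a string with `str::to_lowercase`.
fn lower(s: &str) -> String {
    s.to_lowercase()
}

fn main() {
    // The two medial Sigmas become U+03C3; the final one becomes U+03C2.
    println!("{}", lower("ΟΔΥΣΣΕΥΣ"));
}
```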

r? @alexcrichton
2015-06-09 20:00:32 +00:00
Ulrik Sverdrup
4fdb4cfa89 Implement Borrow<T> and BorrowMut<T> for Box<T: ?Sized> 2015-06-09 16:15:38 +02:00
Simon Sapin
6369dcbad8 Move collectionstest::char into coretest::char 2015-06-09 13:08:29 +02:00
bors
02c33b690b Auto merge of #26077 - SimonSapin:patch-6, r=alexcrichton
With the latter provided by the `From` conversion trait, the former is now completely redundant. Their code is identical. Let’s deprecate now and plan to remove in the next cycle. (It’s `#[unstable]`.)

r? @alexcrichton 
CC @nagisa
2015-06-08 20:52:33 +00:00
Simon Sapin
c57a4124ff Address a review comment and fix a bootstrapping issue 2015-06-08 19:50:28 +02:00
Simon Sapin
c160192f5f Replace usage of String::from_str with String::from 2015-06-08 16:55:35 +02:00
Johannes Oertel
b36ed7d2ed Implement RFC 839
Closes #25976.
2015-06-08 12:05:33 +02:00
Simon Sapin
f901086b0d Correctly map upper-case Sigma to lower-case in word-final position. Fix #26035. 2015-06-06 12:37:11 +02:00
Simon Sapin
d316487ec1 Add char::to_titlecase
But not str::to_titlecase which would require UAX#29 Unicode Text Segmentation
which we decided not to include in `std`:
https://github.com/rust-lang/rfcs/pull/1054
2015-06-06 12:37:11 +02:00
Simon Sapin
addaa5b1ff Add complex (but unconditional) Unicode case mapping. Fix #25800
As a result, the iterator returned by `char::to_uppercase` sometimes
yields two or three `char`s instead of just one.
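
For illustration, two characters whose uppercase mapping expands to several `char`s, assuming a current toolchain. The helper `upper` is a hypothetical name:

```rust
/// Collects the uppercase mapping of a single `char` into a `String`.
fn upper(c: char) -> String {
    c.to_uppercase().collect()
}

fn main() {
    // U+00DF (sharp s) maps to two chars, U+FB03 (ffi ligature) to three.
    println!("{} {}", upper('\u{00DF}'), upper('\u{FB03}'));
}
```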
2015-06-06 12:37:10 +02:00
Niko Matsakis
2c5e784d6f add const_fn features 2015-05-29 09:42:54 -04:00
Eduard Burtescu
377b0900ae Use const fn to abstract away the contents of UnsafeCell & friends. 2015-05-27 11:19:03 +03:00
Johannes Oertel
f95c812311 Implement append and split_off for BitSet (RFC 509) 2015-05-10 21:46:32 +02:00
bors
e8b4c84e39 Auto merge of #24890 - jooert:bitvec-append-split_off, r=alexcrichton
cc #19986 

r? @Gankro
2015-05-07 00:20:25 +00:00
Johannes Oertel
d55a7e8bc4 Implement append and split_off for BitVec (RFC 509) 2015-05-06 09:29:07 +02:00
Steven Allen
decf395221 Implement retain for vec_deque 2015-05-04 23:04:06 -04:00
bors
6517a0e90e Auto merge of #25047 - sinkuu:vec_intoiter_override, r=alexcrichton
Override methods `count`, `last`, and `nth` in vec::IntoIter.
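
A usage sketch of the three methods on `vec::IntoIter`; the helper `into_iter_stats` is a hypothetical name and only shows the observable results, not the overrides themselves:

```rust
/// Returns (count, last element, element at index 1) via `Vec::into_iter`.
fn into_iter_stats(v: Vec<i32>) -> (usize, Option<i32>, Option<i32>) {
    (
        v.clone().into_iter().count(),
        v.clone().into_iter().last(),
        v.into_iter().nth(1),
    )
}

fn main() {
    println!("{:?}", into_iter_stats(vec![10, 20, 30]));
}
```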

#24214
2015-05-04 04:05:37 +00:00