This is an updated version of #19711. The merge and subsequent rebase on that branch were more trouble than they were worth, so I am just resubmitting the relevant change here.
If this PR is accepted, then #19711 can be closed.
/cc @Gankro
This pull request:
*Renames `BinaryHeap::top` to `BinaryHeap::peek`
*Stabilizes `front/back/front_mut/back_mut` in `DList` and `RingBuf`
*Stabilizes `swap` in `RingBuf`
in accordance with rust-lang/rfcs#509.
Note that this PR does not address `Bitv::{get,set}` or HashMap's iterators, nor does it move `std::vec` to `std::collections::vec`, all of which still need to be done.
Because of the method renaming, this is a [breaking-change].
This commit starts out by consolidating all `str` extension traits into one
`StrExt` trait to be included in the prelude. This means that
`UnicodeStrPrelude`, `StrPrelude`, and `StrAllocating` have all been merged into
one `StrExt` exported by the standard library. Some functionality is currently
duplicated with the `StrExt` present in libcore.
This commit also currently avoids any methods which require any form of pattern
to operate. These functions will be stabilized via a separate RFC.
Next, stability of methods and structures are as follows:
Stable
* from_utf8_unchecked
* CowString - after moving to std::string
* StrExt::as_bytes
* StrExt::as_ptr
* StrExt::bytes/Bytes - also made a struct instead of a typedef
* StrExt::char_indices/CharIndices - CharOffsets was renamed
* StrExt::chars/Chars
* StrExt::is_empty
* StrExt::len
* StrExt::lines/Lines
* StrExt::lines_any/LinesAny
* StrExt::slice_unchecked
* StrExt::trim
* StrExt::trim_left
* StrExt::trim_right
* StrExt::words/Words - also made a struct instead of a typedef
Unstable
* from_utf8 - the error type was changed to a `Result`, but the error type has
yet to prove itself
* from_c_str - this function will be handled by the c_str RFC
* FromStr - this trait will have an associated error type eventually
* StrExt::escape_default - needs iterators at least, unsure if it should make
the cut
* StrExt::escape_unicode - needs iterators at least, unsure if it should make
the cut
* StrExt::slice_chars - this function has yet to prove itself
* StrExt::slice_shift_char - awaiting conventions about slicing and shifting
* StrExt::graphemes/Graphemes - this functionality may only be in libunicode
* StrExt::grapheme_indices/GraphemeIndices - this functionality may only be in
libunicode
* StrExt::width - this functionality may only be in libunicode
* StrExt::utf16_units - this functionality may only be in libunicode
* StrExt::nfd_chars - this functionality may only be in libunicode
* StrExt::nfkd_chars - this functionality may only be in libunicode
* StrExt::nfc_chars - this functionality may only be in libunicode
* StrExt::nfkc_chars - this functionality may only be in libunicode
* StrExt::is_char_boundary - naming is uncertain with container conventions
* StrExt::char_range_at - naming is uncertain with container conventions
* StrExt::char_range_at_reverse - naming is uncertain with container conventions
* StrExt::char_at - naming is uncertain with container conventions
* StrExt::char_at_reverse - naming is uncertain with container conventions
* StrVector::concat - this functionality may be replaced with iterators, but
it's not certain at this time
* StrVector::connect - as with concat, may be deprecated in favor of iterators
Deprecated
* StrAllocating and UnicodeStrPrelude have been merged into StrExit
* eq_slice - compiler implementation detail
* from_str - use the inherent parse() method
* is_utf8 - call from_utf8 instead
* replace - call the method instead
* truncate_utf16_at_nul - this is an implementation detail of windows and does
not need to be exposed.
* utf8_char_width - moved to libunicode
* utf16_items - moved to libunicode
* is_utf16 - moved to libunicode
* Utf16Items - moved to libunicode
* Utf16Item - moved to libunicode
* Utf16Encoder - moved to libunicode
* AnyLines - renamed to LinesAny and made a struct
* SendStr - use CowString<'static> instead
* str::raw - all functionality is deprecated
* StrExt::into_string - call to_string() instead
* StrExt::repeat - use iterators instead
* StrExt::char_len - use .chars().count() instead
* StrExt::is_alphanumeric - use .chars().all(..)
* StrExt::is_whitespace - use .chars().all(..)
Pending deprecation -- while slicing syntax is being worked out, these methods
are all #[unstable]
* Str - while currently used for generic programming, this trait will be
replaced with one of [], deref coercions, or a generic conversion trait.
* StrExt::slice - use slicing syntax instead
* StrExt::slice_to - use slicing syntax instead
* StrExt::slice_from - use slicing syntax instead
* StrExt::lev_distance - deprecated with no replacement
Awaiting stabilization due to patterns and/or matching
* StrExt::contains
* StrExt::contains_char
* StrExt::split
* StrExt::splitn
* StrExt::split_terminator
* StrExt::rsplitn
* StrExt::match_indices
* StrExt::split_str
* StrExt::starts_with
* StrExt::ends_with
* StrExt::trim_chars
* StrExt::trim_left_chars
* StrExt::trim_right_chars
* StrExt::find
* StrExt::rfind
* StrExt::find_str
* StrExt::subslice_offset
Part of #18424
This commit changes the semantics of `reserve` and `capacity` for Bitv and BitvSet to match conventions. It also introduces the notion of `reserve_index` and `reserve_index_exact` for collections with maximum-index-based capacity semantics.
Deprecates free function constructors in favour of functions on Bitv itself.
Changes `Bitv::pop` to return an Option rather than panicking.
Deprecates and renames several methods in favour of conventions.
Marks several blessed methods as unstable.
This commit also substantially refactors Bitv and BitvSet's implementations. The new implementation is simpler, cleaner, better documented, and more robust against overflows. It also reduces coupling between Bitv and BitvSet. Tests have been seperated into seperate submodules.
Fixes#16958
[breaking-change]
post-unboxed-closure-conversion. This requires a fair amount of
annoying coercions because all the `map` etc types are defined
generically over the `F`, so the automatic coercions don't propagate;
this is compounded by the need to use `let` and not `as` due to
stage0. That said, this pattern is to a large extent temporary and
unusual.
This commit:
*Renames `BinaryHeap::top` to `BinaryHeap::peek`
*Stabilizes `front/back/front_mut/back_mut` in `DList` and `RingBuf`
*Stabilizes `swap` in `RingBuf`
Because of the method renaming, this is a [breaking-change].
This commit starts out by consolidating all `str` extension traits into one
`StrExt` trait to be included in the prelude. This means that
`UnicodeStrPrelude`, `StrPrelude`, and `StrAllocating` have all been merged into
one `StrExt` exported by the standard library. Some functionality is currently
duplicated with the `StrExt` present in libcore.
This commit also currently avoids any methods which require any form of pattern
to operate. These functions will be stabilized via a separate RFC.
Next, stability of methods and structures are as follows:
Stable
* from_utf8_unchecked
* CowString - after moving to std::string
* StrExt::as_bytes
* StrExt::as_ptr
* StrExt::bytes/Bytes - also made a struct instead of a typedef
* StrExt::char_indices/CharIndices - CharOffsets was renamed
* StrExt::chars/Chars
* StrExt::is_empty
* StrExt::len
* StrExt::lines/Lines
* StrExt::lines_any/LinesAny
* StrExt::slice_unchecked
* StrExt::trim
* StrExt::trim_left
* StrExt::trim_right
* StrExt::words/Words - also made a struct instead of a typedef
Unstable
* from_utf8 - the error type was changed to a `Result`, but the error type has
yet to prove itself
* from_c_str - this function will be handled by the c_str RFC
* FromStr - this trait will have an associated error type eventually
* StrExt::escape_default - needs iterators at least, unsure if it should make
the cut
* StrExt::escape_unicode - needs iterators at least, unsure if it should make
the cut
* StrExt::slice_chars - this function has yet to prove itself
* StrExt::slice_shift_char - awaiting conventions about slicing and shifting
* StrExt::graphemes/Graphemes - this functionality may only be in libunicode
* StrExt::grapheme_indices/GraphemeIndices - this functionality may only be in
libunicode
* StrExt::width - this functionality may only be in libunicode
* StrExt::utf16_units - this functionality may only be in libunicode
* StrExt::nfd_chars - this functionality may only be in libunicode
* StrExt::nfkd_chars - this functionality may only be in libunicode
* StrExt::nfc_chars - this functionality may only be in libunicode
* StrExt::nfkc_chars - this functionality may only be in libunicode
* StrExt::is_char_boundary - naming is uncertain with container conventions
* StrExt::char_range_at - naming is uncertain with container conventions
* StrExt::char_range_at_reverse - naming is uncertain with container conventions
* StrExt::char_at - naming is uncertain with container conventions
* StrExt::char_at_reverse - naming is uncertain with container conventions
* StrVector::concat - this functionality may be replaced with iterators, but
it's not certain at this time
* StrVector::connect - as with concat, may be deprecated in favor of iterators
Deprecated
* StrAllocating and UnicodeStrPrelude have been merged into StrExit
* eq_slice - compiler implementation detail
* from_str - use the inherent parse() method
* is_utf8 - call from_utf8 instead
* replace - call the method instead
* truncate_utf16_at_nul - this is an implementation detail of windows and does
not need to be exposed.
* utf8_char_width - moved to libunicode
* utf16_items - moved to libunicode
* is_utf16 - moved to libunicode
* Utf16Items - moved to libunicode
* Utf16Item - moved to libunicode
* Utf16Encoder - moved to libunicode
* AnyLines - renamed to LinesAny and made a struct
* SendStr - use CowString<'static> instead
* str::raw - all functionality is deprecated
* StrExt::into_string - call to_string() instead
* StrExt::repeat - use iterators instead
* StrExt::char_len - use .chars().count() instead
* StrExt::is_alphanumeric - use .chars().all(..)
* StrExt::is_whitespace - use .chars().all(..)
Pending deprecation -- while slicing syntax is being worked out, these methods
are all #[unstable]
* Str - while currently used for generic programming, this trait will be
replaced with one of [], deref coercions, or a generic conversion trait.
* StrExt::slice - use slicing syntax instead
* StrExt::slice_to - use slicing syntax instead
* StrExt::slice_from - use slicing syntax instead
* StrExt::lev_distance - deprecated with no replacement
Awaiting stabilization due to patterns and/or matching
* StrExt::contains
* StrExt::contains_char
* StrExt::split
* StrExt::splitn
* StrExt::split_terminator
* StrExt::rsplitn
* StrExt::match_indices
* StrExt::split_str
* StrExt::starts_with
* StrExt::ends_with
* StrExt::trim_chars
* StrExt::trim_left_chars
* StrExt::trim_right_chars
* StrExt::find
* StrExt::rfind
* StrExt::find_str
* StrExt::subslice_offset
`String::push(&mut self, ch: char)` currently has a single code path that calls `Char::encode_utf8`. This adds a fast path for ASCII `char`s, which are represented as a single byte in UTF-8.
Benchmarks of stage1 libcollections at the intermediate commit show that the fast path very significantly improves the performance of repeatedly pushing an ASCII `char`, but does not significantly affect the performance for a non-ASCII `char` (where the fast path is not taken).
```
bench_push_char_one_byte 59552 ns/iter (+/- 2132) = 167 MB/s
bench_push_char_one_byte_with_fast_path 6563 ns/iter (+/- 658) = 1523 MB/s
bench_push_char_two_bytes 71520 ns/iter (+/- 3541) = 279 MB/s
bench_push_char_two_bytes_with_slow_path 71452 ns/iter (+/- 4202) = 279 MB/s
bench_push_str_one_byte 38910 ns/iter (+/- 2477) = 257 MB/s
```
A benchmark of pushing a one-byte-long `&str` is added for comparison, but its performance [has varied a lot lately](https://github.com/rust-lang/rust/pull/19640#issuecomment-67741561). (When the input is fixed, `s.push_str("x")` could be used just as well as `s.push('x')`.)
This patch marks `clone` stable, as well as the `Clone` trait, but
leaves `clone_from` unstable. The latter will be decided by the beta.
The patch also marks most manual implementations of `Clone` as stable,
except where the APIs are otherwise deprecated or where there is
uncertainty about providing `Clone`.
r? @alexcrichton
TL;DR I wrongly implemented these two ops, namely `"prefix" + "suffix".to_string()` gives back `"suffixprefix"`. Let's remove them.
The correct implementation of these operations (`lhs.clone().push_str(rhs.as_slice())`) is really wasteful, because the lhs has to be cloned and the rhs gets moved/consumed just to be dropped (no buffer reuse). For this reason, I'd prefer to drop the implementation instead of fixing it. This leaves us with the fact that you'll be able to do `String + &str` but not `&str + String`, which may be unexpected.
r? @aturon
Closes#19952
It is useful to move all the elements out of a hashmap without deallocating
the underlying buffer. It came up in IRC, and this patch implements it as
`drain`.
r? @Gankro
cc: @frankmcsherry
`String::push(&mut self, ch: char)` currently has a single code path
that calls `Char::encode_utf8`.
Perhaps it could be faster for ASCII `char`s, which are represented as
a single byte in UTF-8.
This commit leaves the method unchanged,
adds a copy of it with the fast path,
and adds benchmarks to compare them.
Results show that the fast path very significantly improves the performance
of repeatedly pushing an ASCII `char`,
but does not significantly affect the performance for a non-ASCII `char`
(where the fast path is not taken).
Output of `make check-stage1-collections NO_REBUILD=1 PLEASE_BENCH=1 TESTNAME=string::tests::bench_push`
```
test string::tests::bench_push_char_one_byte ... bench: 59552 ns/iter (+/- 2132) = 167 MB/s
test string::tests::bench_push_char_one_byte_with_fast_path ... bench: 6563 ns/iter (+/- 658) = 1523 MB/s
test string::tests::bench_push_char_two_bytes ... bench: 71520 ns/iter (+/- 3541) = 279 MB/s
test string::tests::bench_push_char_two_bytes_with_slow_path ... bench: 71452 ns/iter (+/- 4202) = 279 MB/s
test string::tests::bench_push_str ... bench: 24 ns/iter (+/- 2)
test string::tests::bench_push_str_one_byte ... bench: 38910 ns/iter (+/- 2477) = 257 MB/s
```
A benchmark of pushing a one-byte-long `&str` is added for comparison,
but its performance [has varied a lot lately](
https://github.com/rust-lang/rust/pull/19640#issuecomment-67741561).
(When the input is fixed, `s.push_str("x")` could be used
instead of `s.push('x')`.)
The old logic would be ok with *either* 0 or all 1s in the last word,
because it didn't compute a proper mask for the case where nbits is an
exact multiple of u32::BITS.
Add mask_for_bits() to compute this properly, and use it in all(). Add
all/none assertions to most of the tests. Note in particular, the all-zero
bitv in test_32_elements() was incorrectly all()==true before this patch.
- Fix typos on Blocks and MutBlocks.
- Use slice_to_mut() for creating blocks_mut().
- Deref the block parameter in get().
- Access nbits separately from mutating set in pop().
Part of #18424
This commit changes the semantics of `reserve` and `capacity` for Bitv and BitvSet to match conventions. It also introduces the notion of `reserve_index` and `reserve_index_exact` for collections with maximum-index-based capacity semantics.
Deprecates free function constructors in favour of functions on Bitv itself.
Changes `Bitv::pop` to return an Option rather than panicking.
Deprecates and renames several methods in favour of conventions.
Marks several blessed methods as unstable.
This commit also substantially refactors Bitv and BitvSet's implementations. The new implementation is simpler, cleaner, better documented, and more robust against overflows. It also reduces coupling between Bitv and BitvSet. Tests have been seperated into seperate submodules.
Fixes#16958
[breaking-change]
This patch marks `clone` stable, as well as the `Clone` trait, but
leaves `clone_from` unstable. The latter will be decided by the beta.
The patch also marks most manual implementations of `Clone` as stable,
except where the APIs are otherwise deprecated or where there is
uncertainty about providing `Clone`.
The `is_power_of_two()` method of the `UnsignedInt` trait currently returns `true` for `self == 0`. Zero is not a power of two, assuming an integral exponent `k >= 0`. I've therefore moved this functionality to the new method `is_power_of_two_or_zero()` and reformed `is_power_of_two()` to return false for `self == 0`.
To illustrate the usefulness of the existence of both functions, consider `HashMap`. Its capacity must be zero or a power of two; conversely, it also requires a (non-zero) power of two for key and val alignment.
Also, added a small amount of documentation regarding #18604.
Rewrite how the HRTB algorithm matches impls against obligations. Instead of impls providing higher-ranked trait-references, impls now once again only have early-bound regions. The skolemization checks are thus moved out into trait matching itself. This allows to implement "perfect forwarding" impls like those described in #19730. This PR builds on a previous PR that was already reviewed by @pnkfelix.
r? @pnkfelix
Fixes#19730
lifetimes. This currently causes an ICE; it should (ideally) work, but
failing that at least give a structured error. For the purposes of
this PR, though, workaround is fine.
This commit is part of a series that introduces a `std::thread` API to
replace `std::task`.
In the new API, `spawn` returns a `JoinGuard`, which by default will
join the spawned thread when dropped. It can also be used to join
explicitly at any time, returning the thread's result. Alternatively,
the spawned thread can be explicitly detached (so no join takes place).
As part of this change, Rust processes now terminate when the main
thread exits, even if other detached threads are still running, moving
Rust closer to standard threading models. This new behavior may break code
that was relying on the previously implicit join-all.
In addition to the above, the new thread API also offers some built-in
support for building blocking abstractions in user space; see the module
doc for details.
Closes#18000
[breaking-change]
This commit merges the `rustrt` crate into `std`, undoing part of the
facade. This merger continues the paring down of the runtime system.
Code relying on the public API of `rustrt` will break; some of this API
is now available through `std::rt`, but is likely to change and/or be
removed very soon.
[breaking-change]
It is useful to move all the elements out of some collections without
deallocating the underlying buffer. It came up in IRC, and this patch
implements it as `drain`. This has been discussed as part of RFC 509.
r? @Gankro
cc: @frankmcsherry
followed by a semicolon.
This allows code like `vec![1i, 2, 3].len();` to work.
This breaks code that uses macros as statements without putting
semicolons after them, such as:
fn main() {
...
assert!(a == b)
assert!(c == d)
println(...);
}
It also breaks code that uses macros as items without semicolons:
local_data_key!(foo)
fn main() {
println("hello world")
}
Add semicolons to fix this code. Those two examples can be fixed as
follows:
fn main() {
...
assert!(a == b);
assert!(c == d);
println(...);
}
local_data_key!(foo);
fn main() {
println("hello world")
}
RFC #378.
Closes#18635.
[breaking-change]