fmt::Show is for debugging, and can and should be implemented for
all public types. This trait is used with `{:?}` syntax. There still
exists #[derive(Show)].
fmt::String is for types that faithfully be represented as a String.
Because of this, there is no way to derive fmt::String, all
implementations must be purposeful. It is used by the default format
syntax, `{}`.
This will break most instances of `{}`, since that now requires the type
to impl fmt::String. In most cases, replacing `{}` with `{:?}` is the
correct fix. Types that were being printed specifically for users should
receive a fmt::String implementation to fix this.
Part of #20013
[breaking-change]
macro_rules! is like an item that defines a macro. Other items don't have a
trailing semicolon, or use a paren-delimited body.
If there's an argument for matching the invocation syntax, e.g. parentheses for
an expr macro, then I think that applies more strongly to the *inner*
delimiters on the LHS, wrapping the individual argument patterns.
This commit is an implementation of [RFC 494][rfc] which removes the entire
`std::c_vec` module and redesigns the `std::c_str` module as `std::ffi`.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/0494-c_str-and-c_vec-stability.md
The interface of the new `CString` is outlined in the linked RFC, the primary
changes being:
* The `ToCStr` trait is gone, meaning the `with_c_str` and `to_c_str` methods
are now gone. These two methods are replaced with a `CString::from_slice`
method.
* The `CString` type is now just a wrapper around `Vec<u8>` with a static
guarantee that there is a trailing nul byte with no internal nul bytes. This
means that `CString` now implements `Deref<Target = [c_char]>`, which is where
it gains most of its methods from. A few helper methods are added to acquire a
slice of `u8` instead of `c_char`, as well as including a slice with the
trailing nul byte if necessary.
* All usage of non-owned `CString` values is now done via two functions inside
of `std::ffi`, called `c_str_to_bytes` and `c_str_to_bytes_with_nul`. These
functions are now the one method used to convert a `*const c_char` to a Rust
slice of `u8`.
Many more details, including newly deprecated methods, can be found linked in
the RFC. This is a:
[breaking-change]
Closes#20444
This removes a large array of deprecated functionality, regardless of how
recently it was deprecated. The purpose of this commit is to clean out the
standard libraries and compiler for the upcoming alpha release.
Some notable compiler changes were to enable warnings for all now-deprecated
command line arguments (previously the deprecated versions were silently
accepted) as well as removing deriving(Zero) entirely (the trait was removed).
The distribution no longer contains the libtime or libregex_macros crates. Both
of these have been deprecated for some time and are available externally.
Prior to 9bae6ec828 from_utf8_lossy had a minor optimization in place that avoided having to loop from the beginning of the input slice.
Recently 4908017d59 implemented Utf8Error::InvalidByte which makes this possible again.
This commit is an implementation of [RFC 503][rfc] which is a stabilization
story for the prelude. Most of the RFC was directly applied, removing reexports.
Some reexports are kept around, however:
* `range` remains until range syntax has landed to reduce churn.
* `Path` and `GenericPath` remain until path reform lands. This is done to
prevent many imports of `GenericPath` which will soon be removed.
* All `io` traits remain until I/O reform lands so imports can be rewritten all
at once to `std::io::prelude::*`.
This is a breaking change because many prelude reexports have been removed, and
the RFC can be consulted for the exact list of removed reexports, as well as to
find the locations of where to import them.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/0503-prelude-stabilization.md
[breaking-change]
Closes#20068
This commit is an implementation of [RFC 526][rfc] which is a change to alter
the definition of the old `fmt::FormatWriter`. The new trait, renamed to
`Writer`, now only exposes one method `write_str` in order to guarantee that all
implementations of the formatting traits can only produce valid Unicode.
[rfc]: https://github.com/rust-lang/rfcs/blob/master/text/0526-fmt-text-writer.md
One of the primary improvements of this patch is the performance of the
`.to_string()` method by avoiding an almost-always redundant UTF-8 check. This
is a breaking change due to the renaming of the trait as well as the loss of the
`write` method, but migration paths should be relatively easy:
* All usage of `write` should move to `write_str`. If truly binary data was
being written in an implementation of `Show`, then it will need to use a
different trait or an altogether different code path.
* All usage of `write!` should continue to work as-is with no modifications.
* All usage of `Show` where implementations just delegate to another should
continue to work as-is.
[breaking-change]
Closes#20352
This patch marks `PartialEq`, `Eq`, `PartialOrd`, and `Ord` as
`#[stable]`, as well as the majorify of manual implementaitons of these
traits. The traits match the [reform
RFC](https://github.com/rust-lang/rfcs/pull/439).
Along the way, two changes are made:
* The recently-added type parameters for `Ord` and `Eq` are
removed. These were mistakenly added while adding them to `PartialOrd`
and `PartialEq`, but they don't make sense given the laws that are
required for (and use cases for) `Ord` and `Eq`.
* More explicit laws are added for `PartialEq` and `PartialOrd`,
connecting them to their associated mathematical concepts.
In the future, many of the impls should be generalized; see
since generalizing later is not a breaking change.
[breaking-change]
This commit performs a second pass over the `std::string` module, performing the
following actions:
* The name `std::string` is now stable.
* The `String::from_utf8` function is now stable after having been altered to
return a new `FromUtf8Error` structure. The `FromUtf8Error` structure is now
stable as well as its `into_bytes` and `utf8_error` methods.
* The `String::from_utf8_lossy` function is now stable.
* The `String::from_chars` method is now deprecated in favor of `.collect()`
* The `String::from_raw_parts` method is now stable
* The `String::from_str` function remains experimental
* The `String::from_raw_buf` function remains experimental
* The `String::from_raw_buf_len` function remains experimental
* The `String::from_utf8_unchecked` function is now stable
* The `String::from_char` function is now deprecated in favor of
`repeat(c).take(n).collect()`
* The `String::grow` function is now deprecated in favor of
`.extend(repeat(c).take(n)`
* The `String::capacity` method is now stable
* The `String::reserve` method is now stable
* The `String::reserve_exact` method is now stable
* The `String::shrink_to_fit` method is now stable
* The `String::pop` method is now stable
* The `String::as_mut_vec` method is now stable
* The `String::is_empty` method is now stable
* The `IntoString` trait is now deprecated (there are no implementors)
* The `String::truncate` method is now stable
* The `String::insert` method is now stable
* The `String::remove` method is now stable
* The `String::push` method is now stable
* The `String::push_str` method is now stable
* The `String::from_utf16` function is now stable after its error type has now
become an opaque structure to carry more semantic information in the future.
A number of these changes are breaking changes, but the migrations should be
fairly straightforward on a case-by-case basis (outlined above where possible).
[breaking-change]
This commit starts out by consolidating all `str` extension traits into one
`StrExt` trait to be included in the prelude. This means that
`UnicodeStrPrelude`, `StrPrelude`, and `StrAllocating` have all been merged into
one `StrExt` exported by the standard library. Some functionality is currently
duplicated with the `StrExt` present in libcore.
This commit also currently avoids any methods which require any form of pattern
to operate. These functions will be stabilized via a separate RFC.
Next, stability of methods and structures are as follows:
Stable
* from_utf8_unchecked
* CowString - after moving to std::string
* StrExt::as_bytes
* StrExt::as_ptr
* StrExt::bytes/Bytes - also made a struct instead of a typedef
* StrExt::char_indices/CharIndices - CharOffsets was renamed
* StrExt::chars/Chars
* StrExt::is_empty
* StrExt::len
* StrExt::lines/Lines
* StrExt::lines_any/LinesAny
* StrExt::slice_unchecked
* StrExt::trim
* StrExt::trim_left
* StrExt::trim_right
* StrExt::words/Words - also made a struct instead of a typedef
Unstable
* from_utf8 - the error type was changed to a `Result`, but the error type has
yet to prove itself
* from_c_str - this function will be handled by the c_str RFC
* FromStr - this trait will have an associated error type eventually
* StrExt::escape_default - needs iterators at least, unsure if it should make
the cut
* StrExt::escape_unicode - needs iterators at least, unsure if it should make
the cut
* StrExt::slice_chars - this function has yet to prove itself
* StrExt::slice_shift_char - awaiting conventions about slicing and shifting
* StrExt::graphemes/Graphemes - this functionality may only be in libunicode
* StrExt::grapheme_indices/GraphemeIndices - this functionality may only be in
libunicode
* StrExt::width - this functionality may only be in libunicode
* StrExt::utf16_units - this functionality may only be in libunicode
* StrExt::nfd_chars - this functionality may only be in libunicode
* StrExt::nfkd_chars - this functionality may only be in libunicode
* StrExt::nfc_chars - this functionality may only be in libunicode
* StrExt::nfkc_chars - this functionality may only be in libunicode
* StrExt::is_char_boundary - naming is uncertain with container conventions
* StrExt::char_range_at - naming is uncertain with container conventions
* StrExt::char_range_at_reverse - naming is uncertain with container conventions
* StrExt::char_at - naming is uncertain with container conventions
* StrExt::char_at_reverse - naming is uncertain with container conventions
* StrVector::concat - this functionality may be replaced with iterators, but
it's not certain at this time
* StrVector::connect - as with concat, may be deprecated in favor of iterators
Deprecated
* StrAllocating and UnicodeStrPrelude have been merged into StrExit
* eq_slice - compiler implementation detail
* from_str - use the inherent parse() method
* is_utf8 - call from_utf8 instead
* replace - call the method instead
* truncate_utf16_at_nul - this is an implementation detail of windows and does
not need to be exposed.
* utf8_char_width - moved to libunicode
* utf16_items - moved to libunicode
* is_utf16 - moved to libunicode
* Utf16Items - moved to libunicode
* Utf16Item - moved to libunicode
* Utf16Encoder - moved to libunicode
* AnyLines - renamed to LinesAny and made a struct
* SendStr - use CowString<'static> instead
* str::raw - all functionality is deprecated
* StrExt::into_string - call to_string() instead
* StrExt::repeat - use iterators instead
* StrExt::char_len - use .chars().count() instead
* StrExt::is_alphanumeric - use .chars().all(..)
* StrExt::is_whitespace - use .chars().all(..)
Pending deprecation -- while slicing syntax is being worked out, these methods
are all #[unstable]
* Str - while currently used for generic programming, this trait will be
replaced with one of [], deref coercions, or a generic conversion trait.
* StrExt::slice - use slicing syntax instead
* StrExt::slice_to - use slicing syntax instead
* StrExt::slice_from - use slicing syntax instead
* StrExt::lev_distance - deprecated with no replacement
Awaiting stabilization due to patterns and/or matching
* StrExt::contains
* StrExt::contains_char
* StrExt::split
* StrExt::splitn
* StrExt::split_terminator
* StrExt::rsplitn
* StrExt::match_indices
* StrExt::split_str
* StrExt::starts_with
* StrExt::ends_with
* StrExt::trim_chars
* StrExt::trim_left_chars
* StrExt::trim_right_chars
* StrExt::find
* StrExt::rfind
* StrExt::find_str
* StrExt::subslice_offset
`String::push(&mut self, ch: char)` currently has a single code path that calls `Char::encode_utf8`. This adds a fast path for ASCII `char`s, which are represented as a single byte in UTF-8.
Benchmarks of stage1 libcollections at the intermediate commit show that the fast path very significantly improves the performance of repeatedly pushing an ASCII `char`, but does not significantly affect the performance for a non-ASCII `char` (where the fast path is not taken).
```
bench_push_char_one_byte 59552 ns/iter (+/- 2132) = 167 MB/s
bench_push_char_one_byte_with_fast_path 6563 ns/iter (+/- 658) = 1523 MB/s
bench_push_char_two_bytes 71520 ns/iter (+/- 3541) = 279 MB/s
bench_push_char_two_bytes_with_slow_path 71452 ns/iter (+/- 4202) = 279 MB/s
bench_push_str_one_byte 38910 ns/iter (+/- 2477) = 257 MB/s
```
A benchmark of pushing a one-byte-long `&str` is added for comparison, but its performance [has varied a lot lately](https://github.com/rust-lang/rust/pull/19640#issuecomment-67741561). (When the input is fixed, `s.push_str("x")` could be used just as well as `s.push('x')`.)
TL;DR I wrongly implemented these two ops, namely `"prefix" + "suffix".to_string()` gives back `"suffixprefix"`. Let's remove them.
The correct implementation of these operations (`lhs.clone().push_str(rhs.as_slice())`) is really wasteful, because the lhs has to be cloned and the rhs gets moved/consumed just to be dropped (no buffer reuse). For this reason, I'd prefer to drop the implementation instead of fixing it. This leaves us with the fact that you'll be able to do `String + &str` but not `&str + String`, which may be unexpected.
r? @aturon
Closes#19952
`String::push(&mut self, ch: char)` currently has a single code path
that calls `Char::encode_utf8`.
Perhaps it could be faster for ASCII `char`s, which are represented as
a single byte in UTF-8.
This commit leaves the method unchanged,
adds a copy of it with the fast path,
and adds benchmarks to compare them.
Results show that the fast path very significantly improves the performance
of repeatedly pushing an ASCII `char`,
but does not significantly affect the performance for a non-ASCII `char`
(where the fast path is not taken).
Output of `make check-stage1-collections NO_REBUILD=1 PLEASE_BENCH=1 TESTNAME=string::tests::bench_push`
```
test string::tests::bench_push_char_one_byte ... bench: 59552 ns/iter (+/- 2132) = 167 MB/s
test string::tests::bench_push_char_one_byte_with_fast_path ... bench: 6563 ns/iter (+/- 658) = 1523 MB/s
test string::tests::bench_push_char_two_bytes ... bench: 71520 ns/iter (+/- 3541) = 279 MB/s
test string::tests::bench_push_char_two_bytes_with_slow_path ... bench: 71452 ns/iter (+/- 4202) = 279 MB/s
test string::tests::bench_push_str ... bench: 24 ns/iter (+/- 2)
test string::tests::bench_push_str_one_byte ... bench: 38910 ns/iter (+/- 2477) = 257 MB/s
```
A benchmark of pushing a one-byte-long `&str` is added for comparison,
but its performance [has varied a lot lately](
https://github.com/rust-lang/rust/pull/19640#issuecomment-67741561).
(When the input is fixed, `s.push_str("x")` could be used
instead of `s.push('x')`.)
followed by a semicolon.
This allows code like `vec![1i, 2, 3].len();` to work.
This breaks code that uses macros as statements without putting
semicolons after them, such as:
fn main() {
...
assert!(a == b)
assert!(c == d)
println(...);
}
It also breaks code that uses macros as items without semicolons:
local_data_key!(foo)
fn main() {
println("hello world")
}
Add semicolons to fix this code. Those two examples can be fixed as
follows:
fn main() {
...
assert!(a == b);
assert!(c == d);
println(...);
}
local_data_key!(foo);
fn main() {
println("hello world")
}
RFC #378.
Closes#18635.
[breaking-change]
This commit performs a second pass stabilization of the `std::default` module.
The module was already marked `#[stable]`, and the inheritance of `#[stable]`
was removed since this attribute was applied. This commit adds the `#[stable]`
attribute to the trait definition and one method name, along with all
implementations found in the standard distribution.
This commit collapses the various prelude traits for slices into just one trait:
* SlicePrelude/SliceAllocPrelude => SliceExt
* CloneSlicePrelude/CloneSliceAllocPrelude => CloneSliceExt
* OrdSlicePrelude/OrdSliceAllocPrelude => OrdSliceExt
* PartialEqSlicePrelude => PartialEqSliceExt