fmt: Update docs and mention :#? pretty-printing
Expose `:#?` well in the docs for fmt and Debug itself. Also update some out of date information and fix formatting in `std::fmt` docs.
This removes a footgun, since it is a reasonable assumption to make that
pointers to `T` will be aligned to `align_of::<T>()`. This also matches
the behaviour of C/C++. `min_align_of` is now deprecated.
Closes#21611.
This removes a footgun, since it is a reasonable assumption to make that
pointers to `T` will be aligned to `align_of::<T>()`. This also matches
the behaviour of C/C++. `min_align_of` is now deprecated.
Closes#21611.
This commit stabilizes the `str::{matches, rmatches}` functions and iterators,
but renames the unstable feature for the `str::{matches,rmatches}_indices`
function to `str_match_indices` due to the comment present on the functions
about the iterator's return value.
This commit shards the broad `core` feature of the libcore library into finer
grained features. This split groups together similar APIs and enables tracking
each API separately, giving a better sense of where each feature is within the
stabilization process.
A few minor APIs were deprecated along the way:
* Iterator::reverse_in_place
* marker::NoCopy
Instead of a fast branch with a sized iterator falling back to a potentially poorly optimized iterate-and-push loop, a single efficient loop can serve all cases.
In my benchmark runs, I see some good gains, but also some regressions, possibly due to different inlining choices by the compiler. YMMV.
Various methods in both libcore/char.rs and librustc_unicode/char.rs were previously marked with #[inline], now every method is marked in char's impl blocks.
Partially fixes#26124.
EDIT: I'm not familiar with pull reqests (yet), apparently Github added my second commit to thit PR...
Fixes#26124
Implement RFC rust-lang/rfcs#1123
Add str method str::split_at(mid: usize) -> (&str, &str).
Also a minor cleanup in the collections::str module. Remove redundant slicing of self.
* Add “complex” mappings to `char::to_lowercase` and `char::to_uppercase`, making them yield sometimes more than on `char`: #25800. `str::to_lowercase` and `str::to_uppercase` are affected as well.
* Add `char::to_titlecase`, since it’s the same algorithm (just different data). However this does **not** add `str::to_titlecase`, as that would require UAX#29 Unicode Text Segmentation which we decided not to include in of `std`: https://github.com/rust-lang/rfcs/pull/1054 I made `char::to_titlecase` immediately `#[stable]`, since it’s so similar to `char::to_uppercase` that’s already stable. Let me know if it should be `#[unstable]` for a while.
* Add a special case for upper-case Sigma in word-final position in `str::to_lowercase`: #26035. This is the only language-independent conditional mapping currently in `SpecialCasing.txt`.
* Stabilize `str::to_lowercase` and `str::to_uppercase`. The `&self -> String` on `str` signature seems straightforward enough, and the only relevant issue I’ve found is #24536 about naming. But `char` already has stable methods with the same name, and deprecating them for a rename doesn’t seem worth it.
r? @alexcrichton
With the latter is provided by the `From` conversion trait, the former is now completely redundant. Their code is identical. Let’s deprecate now and plan to remove in the next cycle. (It’s `#[unstable]`.)
r? @alexcrichton
CC @nagisa
I had to use `impl<'a, V: Copy> Extend<(usize, &'a V)> for VecMap<V>` instead of `impl<'a, V: Copy> Extend<(&'a usize, &'a V)> for VecMap<V>` as that's what is needed for doing
```rust
let mut a = VecMap::new();
let b = VecMap::new();
b.insert(1, "foo");
a.extend(&b)
```
I can squash the commits after review.
r? @Gankro
The recent bug that was found in LinkedList reminded me of some general cleanup
that's been waiting for some time.
- Use a loop from the front in Drop, it works just as well and without an unsafe block
- Change Rawlink methods to use `unsafe` in an idiomatic way. This does mean that
we need an unsafe block for each dereference of a raw link. Even then, the extent
of unsafe-critical code is even larger of course, since safety depends on the whole
data structure's integrity. This is a general problem we are aware of.
- Some cleanup just to try to decrease the amount of Rawlink handling.
This wasn't marked inline, so wasn't being inlined cross-crate. It's
actually a no-op function, since it's a wrapper around `mem::transmute`.
Marking it inline means that programs calling it can see that it's a
no-op and act accordingly during optimisation.
This wasn't marked inline, so wasn't being inlined cross-crate. It's
actually a no-op function, since it's a wrapper around `mem::transmute`.
Marking it inline means that programs calling it can see that it's a
no-op and act accordingly during optimisation.
collections: Make BinaryHeap panic safe in sift_up / sift_down
Use a struct called Hole that keeps track of an invalid location
in the vector and fills the hole on drop.
I include a run-pass test that the current BinaryHeap fails, and the new
one passes.
NOTE: The BinaryHeap will still be inconsistent after a comparison fails. It will
not have the heap property. What we fix is just that elements will be valid
values.
This is actually a performance win -- the new code does not bother to write in `zeroed()`
values in the holes, it just leaves them as they were.
Net result is something like a 5% decrease in runtime for `BinaryHeap::from_vec`. This
can be further improved by using unchecked indexing (I confirmed it makes a difference,
not a surprise with the non-sequential access going on), but let's leave that for another PR.
Safety first 😉Fixes#25842
Use a struct called Hole that keeps track of an invalid location
in the vector and fills the hole on drop.
I include a run-pass test that the current BinaryHeap fails, and the new
one passes.
Fixes#25842
The `debug_builders` feature is up for 1.1 stabilization in #24028. This commit stabilizes the API as-is with no changes.
Some nits that @alexcrichton mentioned that may be worth discussing now if anyone cares:
* Should `debug_tuple_struct` and `DebugTupleStruct` be used instead of `debug_tuple` and `DebugTuple`? It's more typing but is a technically more correct name.
* `DebugStruct` and `DebugTuple` have `field` methods while `DebugSet`, `DebugMap` and `DebugList` have `entry` methods. Should we switch those to something else for consistency?
cc @alexcrichton @aturon
collections: Reorder slice methods to improve API docs
We have an evolutionary history whose traces are still visible in the
slice docs today.
Some heuristics:
* Group method and method_mut together
* Group method and method_by together
* Group by use case, here we have roughly:
Basic interrogators (len)
Mutation (swap)
Iterators (iter)
Segmentation (split)
Searching (contains)
Permutations (permutations)
Misc (clone_from_slice)
We have an evolutionary history whose traces are still visible in the
slice docs today.
Some heuristics:
* Group method and method_mut together
* Group method and method_by together
* Group by use case, here we have roughly:
Basic interrogators (len)
Mutation (swap)
Iterators (iter)
Segmentation (split)
Searching (contains)
Permutations (permutations)
Misc (clone_from_slice)
Use stable code in doc examples (libcollections)
Main task is to change from String::from_str to String::from in examples for String
(the latter constructor is stable). While I'm at it, also remove redundant feature flags,
fix some other instances of unstable code in examples (in examples for stable
methods), and remove some use of usize in examples too.
Prefer String::from over from_str; String::from_str is unstable while
String::from is stable. Promote the latter by using it in examples.
Simply migrating unstable function to the closest alternative.
Some modest running-time improvements to `std::collections::BitSet` on bit-sets of varying set-membership densities. This is work originally from [here](https://github.com/rayglover/alt_collections). (Benchmarks copied below)
```
std::collections::BitSet / alt_collections::BitSet
copy_dense ... 3.08x
copy_sparse ... 4.22x
count_dense ... 11.01x
count_sparse ... 8.11x
from_bytes ... 1.47x
intersect_dense ... 6.54x
intersect_sparse ... 4.37x
union_dense ... 5.53x
union_sparse ... 5.60x
```
The exception is `from_bytes`, which I've left unaltered since the optimization is rather obscure.
Compiling with the cpu feature `popcnt` gave a further ~10% improvement on my machine, but this wasn't factored in to the benchmarks above.
Similar improvements could be made to `BitVec`, although that would probably require more substantial changes.
criticism welcome!
Using regular pointer arithmetic to iterate collections of zero-sized types
doesn't work, because we'd get the same pointer all the time. Our
current solution is to convert the pointer to an integer, add an offset
and then convert back, but this inhibits certain optimizations.
What we should do instead is to convert the pointer to one that points
to an i8\*, and then use a LLVM GEP instructions without the inbounds
flag to perform the pointer arithmetic. This allows to generate pointers
that point outside allocated objects without causing UB (as long as you
don't dereference them), and it wraps around using two's complement,
i.e. it behaves exactly like the wrapping_* operations we're currently
using, with the added benefit of LLVM being able to better optimize the
resulting IR.
Using regular pointer arithmetic to iterate collections of zero-sized types
doesn't work, because we'd get the same pointer all the time. Our
current solution is to convert the pointer to an integer, add an offset
and then convert back, but this inhibits certain optimizations.
What we should do instead is to convert the pointer to one that points
to an i8*, and then use a LLVM GEP instructions without the inbounds
flag to perform the pointer arithmetic. This allows to generate pointers
that point outside allocated objects without causing UB (as long as you
don't dereference them), and it wraps around using two's complement,
i.e. it behaves exactly like the wrapping_* operations we're currently
using, with the added benefit of LLVM being able to better optimize the
resulting IR.
The functions BitSet::{iter,union,symmetric_difference} each had docs that claimed u32s were output when their actual output each end up being usizes.
r? @steveklabnik
We don't need to copy any elements if `at` is behind the last element
in the map. The last element is at index `self.v.len() - 1`, so we
should not copy if `at` is greater **or equals** `self.v.len()`.
r? @Gankro
Several Minor API / Reference Documentation Fixes
- Fix a few small errors in the reference.
- Fix paper cuts in the API docs.
Fixes#24882Fixes#25233Fixes#25250
We don't need to copy any elements if `at` is behind the last element
in the map. The last element is at index `self.v.len() - 1`, so we
should not copy if `at` is greater or equals `self.v.len()`.
I was profiling my code again and this time AsRef<str> for String
was eating up a considerable chunk of my runtime; adding the inline
annotation made the program run almost twice as fast!
While I was at it I also added the annotation to other implementations
of AsRef as well as AsMut.
Ideally this trait implementation would be unstable, requiring crates to opt-in
if they would like the functionality, but that's not currently how stability
works so the implementation needs to be removed entirely.
This may come back at a future date, but for now the conservative option is to
remove it.
[breaking-change]
Ideally this trait implementation would be unstable, requiring crates to opt-in
if they would like the functionality, but that's not currently how stability
works so the implementation needs to be removed entirely.
This may come back at a future date, but for now the conservative option is to
remove it.
[breaking-change]
collections: Convert SliceConcatExt to use associated types
Coherence now allows this, we have `SliceConcatExt<T> for [V] where T: Sized + Clone` and` SliceConcatExt<str> for [S]`, these don't conflict because
str is never Sized.
According to rust-lang/rfcs#235, `VecDeque` should have this method (`VecDeque` was called `RingBuf` at the time) but it was never implemented.
I marked this stable since "1.0.0" because it's stable in `Vec`.
Coherence now allows this, we have SliceConcatExt<T> for [V] where T: Sized
+ Clone and SliceConcatExt<str> for [S], these don't conflict because
str is never Sized.
This commit does two things: it adds an example for indexing vectors, and it changes the \"Examples\" section to use full sentences.
This change was spurred by someone in the #rust IRC channel asking if there was a `.set()` method for changing the `i`-th value of a vector (they had missed that `Vec` implements `IndexMut`, which is easy to do if you're not aware of that trait).
This makes the `bit::vec::bench::bench_bit_vec_big_union` benchmark go
from `774 ns/iter (+/- 190)` to `602 ns/iter (+/- 5)`.
(There's room for more work here too: if one can guarantee 128-bit
alignment for the vector, the compiler actually optimises `union`,
`intersection` etc. to SIMD instructions, which end up being ~5x faster
that the original version, and 4x faster than the optimised version in
this patch.)
This makes the `bit::vec::bench::bench_bit_vec_big_union` benchmark go
from `774 ns/iter (+/- 190)` to `602 ns/iter (+/- 5)`.
(There's room for more work here too: if one can guarantee 128-bit
alignment for the vector, the compiler actually optimises `union`,
`intersection` etc. to SIMD instructions, which end up being ~5x faster
that the original version, and 4x faster than the optimised version in
this patch.)
collections: Implement String::drain(range) according to RFC 574
`.drain(range)` is unstable and under feature(collections_drain).
This adds a safe way to remove any range of a String as efficiently as
possible.
As noted in the code, this drain iterator has none of the memory safety
issues of the vector version.
RFC tracking issue is #23055
These implementations were intended to be unstable, but currently the stability
attributes cannot handle a stable trait with an unstable `impl` block. This
commit also audits the rest of the standard library for explicitly-`#[unstable]`
impl blocks. No others were removed but some annotations were changed to
`#[stable]` as they're defacto stable anyway.
One particularly interesting `impl` marked `#[stable]` as part of this commit
is the `Add<&[T]>` impl for `Vec<T>`, which uses `push_all` and implicitly
clones all elements of the vector provided.
Closes#24791
[breaking-change]
`.drain(range)` is unstable and under feature(collections_drain).
This adds a safe way to remove any range of a String as efficiently as
possible.
As noted in the code, this drain iterator has none of the memory safety
issues of the vector version.
RFC tracking issue is #23055
These implementations were intended to be unstable, but currently the stability
attributes cannot handle a stable trait with an unstable `impl` block. This
commit also audits the rest of the standard library for explicitly-`#[unstable]`
impl blocks. No others were removed but some annotations were changed to
`#[stable]` as they're defacto stable anyway.
One particularly interesting `impl` marked `#[stable]` as part of this commit
is the `Add<&[T]>` impl for `Vec<T>`, which uses `push_all` and implicitly
clones all elements of the vector provided.
Closes#24791
Implement Vec::drain(\<range type\>) from rust-lang/rfcs#574, tracking issue #23055.
This is a big step forward for vector usability. This is an introduction of an API for removing a range of *m* consecutive elements from a vector, as efficently as possible.
New features:
- Introduce trait `std::collections::range::RangeArgument` implemented by all four built-in range types.
- Change `Vec::drain()` to use `Vec::drain<R: RangeArgument>(R)`
Implementation notes:
- Use @Gankro's idea for memory safety: Use `set_len` on the source vector when creating the iterator, to make sure that the part of the vector that will be modified is unreachable. Fix up things in Drain's destructor — but even if it doesn't run, we don't expose any moved-out-from slots of the vector.
- This `.drain<R>(R)` very close to how it is specified in the RFC.
- Introduced as unstable
- Drain reuses the slice iterator — copying and pasting the same iterator pointer arithmetic again felt very bad
- The `usize` index as a range argument in the RFC is not included. The ranges trait would have to change to accomodate it.
Please help me with:
- Name and location of the new ranges trait.
- Design of the ranges trait
- Understanding Niko's comments about variance (Note: for a long time I was using a straight up &mut Vec in the iterator, but I changed this to permit reusing the slice iterator).
Previous PR and discussion: #23071
Improve example for as_string and add example for as_vec
Provide a better example of `as_string` / `DerefString`'s unique capabilities.
Use an example where (for an unspecified reason) you need a &String, and
show how `as_string` solves the problem without needing an allocation.
Changes the style guidelines regarding unit tests to recommend using a
sub-module named "tests" instead of "test" for unit tests as "test"
might clash with imports of libtest.
This is an implementation of [RFC 1030][rfc] which adds these traits to the
prelude and additionally removes all inherent `into_iter` methods on collections
in favor of the trait implementation (which is now accessible by default).
[rfc]: https://github.com/rust-lang/rfcs/pull/1030
This is technically a breaking change due to the prelude additions and removal
of inherent methods, but it is expected that essentially no code breaks in
practice.
[breaking-change]
Closes#24538
as accepted in [RFC 526](https://github.com/rust-lang/rfcs/blob/master/text/0526-fmt-text-writer.md).
Note that this brand new method is marked as **stable**. I judged this safe enough: it’s simple enough that it’s very unlikely to change. Still, I can mark it unstable instead if you prefer.
r? @alexcrichton
For now, words() is left in (but deprecated), and Words is a type alias for
struct SplitWhitespace.
Also cleaned up references to s.words() throughout codebase.
Closes#15628
This commit removes all the old casting/generic traits from `std::num` that are
no longer in use by the standard library. This additionally removes the old
`strconv` module which has not seen much use in quite a long time. All generic
functionality has been supplanted with traits in the `num` crate and the
`strconv` module is supplanted with the [rust-strconv crate][rust-strconv].
[rust-strconv]: https://github.com/lifthrasiir/rust-strconv
This is a breaking change due to the removal of these deprecated crates, and the
alternative crates are listed above.
[breaking-change]
This implementation is currently about 3-4 times faster than using the `.to_string()` based approach.
I would also suggest we deprecate `String::from_str` since it's redundant with the stable `String::from` method, but I'll leave that for a future PR.
It's just as convenient, but it's much faster. Using write! requires an
extra call to fmt::write and a extra dynamically dispatched call to
Arguments' Display format function.
This is an implementation of [RFC 1030][rfc] which adds these traits to the
prelude and additionally removes all inherent `into_iter` methods on collections
in favor of the trait implementation (which is now accessible by default).
[rfc]: https://github.com/rust-lang/rfcs/pull/1030
This is technically a breaking change due to the prelude additions and removal
of inherent methods, but it is expected that essentially no code breaks in
practice.
[breaking-change]
Closes#24538
It's just as convenient, but it's much faster. Using write! requires an
extra call to fmt::write and a extra dynamically dispatched call to
Arguments' Display format function.
This patch
1. renames libunicode to librustc_unicode,
2. deprecates several pieces of libunicode (see below), and
3. removes references to deprecated functions from
librustc_driver and libsyntax. This may change pretty-printed
output from these modules in cases involving wide or combining
characters used in filenames, identifiers, etc.
The following functions are marked deprecated:
1. char.width() and str.width():
--> use unicode-width crate
2. str.graphemes() and str.grapheme_indices():
--> use unicode-segmentation crate
3. str.nfd_chars(), str.nfkd_chars(), str.nfc_chars(), str.nfkc_chars(),
char.compose(), char.decompose_canonical(), char.decompose_compatible(),
char.canonical_combining_class():
--> use unicode-normalization crate
The meaning of each variant of this enum was somewhat ambiguous and it's uncler
that we wouldn't even want to add more enumeration values in the future. As a
result this error has been altered to instead become an opaque structure.
Learning about the "first invalid byte index" is still an unstable feature, but
the type itself is now stable.
Right now, if the user requests to increase the vector size via reserve() or push_back() and the request brings the attempted memory above usize::MAX, we panic.
With this change there is only a panic if the minimum requested memory that could meet the requirement is above usize::MAX- otherwise it simply requests its largest capacity possible, usize::MAX.
...to be less confusing. Since 0 is the smallest number possible for usize, it doesn't make sense to mention it if it's already included, and it should be more clear that the length of the vector is a valid index with the new wording.
r? @steveklabnik
The meaning of each variant of this enum was somewhat ambiguous and it's uncler
that we wouldn't even want to add more enumeration values in the future. As a
result this error has been altered to instead become an opaque structure.
Learning about the "first invalid byte index" is still an unstable feature, but
the type itself is now stable.
In addition to being nicer, this also allows you to use `sum` and `product` for
iterators yielding custom types aside from the standard integers.
Due to removing the `AdditiveIterator` and `MultiplicativeIterator` trait, this
is a breaking change.
[breaking-change]
This adds the missing methods and turns `str::pattern` in a user facing module, as per RFC.
This also contains some big internal refactorings:
- string iterator pairs are implemented with a central macro to reduce redundancy
- Moved all tests from `coretest::str` into `collectionstest::str` and left a note to prevent the two sets of tests drifting apart further.
See https://github.com/rust-lang/rust/issues/22477
- Added missing reverse versions of methods
- Added [r]matches()
- Generated the string pattern iterators with a macro
- Added where bounds to the methods returning reverse iterators
for better error messages.