Improve .chars().count()
Use a simpler loop to count the `char`s of a string: count the
number of non-continuation bytes. Use `count += <conditional>`, a pattern
the compiler understands well and can apply loop optimizations to.
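A minimal sketch of the idea (not the exact library code; the helper name `char_count` is just for illustration):
```rust
// UTF-8 continuation bytes have the bit pattern 10xxxxxx; as a signed byte
// they fall in -128..=-65, so "non-continuation" is simply `>= -64`.
fn char_count(s: &str) -> usize {
    let mut count = 0;
    for &byte in s.as_bytes() {
        // `count += <conditional>`: a form the optimizer handles well.
        count += ((byte as i8) >= -64) as usize;
    }
    count
}

fn main() {
    assert_eq!(char_count("héllo"), "héllo".chars().count());
}
```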
Benchmark descriptions and results for two compiler configurations (with and without AVX):
- ascii: ASCII text
- cy: Cyrillic text
- jp: Japanese text
- words ascii: counting each `split_whitespace` item from the ASCII text
- words jp: counting each `split_whitespace` item from the Japanese text
```
x86-64 rustc -Copt-level=3

name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff %
count_ascii        1,453 (1755 MB/s)  1,398 (1824 MB/s)           -55   -3.79%
count_cy           5,990 (856 MB/s)   2,545 (2016 MB/s)        -3,445  -57.51%
count_jp           3,075 (1169 MB/s)  1,772 (2029 MB/s)        -1,303  -42.37%
count_words_ascii  4,157 (521 MB/s)   1,797 (1205 MB/s)        -2,360  -56.77%
count_words_jp     3,337 (1071 MB/s)  1,772 (2018 MB/s)        -1,565  -46.90%

x86-64 rustc -Ctarget-feature=+avx -Copt-level=3

name               orig_ ns/iter      cmov_ ns/iter      diff ns/iter   diff %
count_ascii        1,444 (1766 MB/s)  763 (3343 MB/s)            -681  -47.16%
count_cy           5,871 (874 MB/s)   1,527 (3360 MB/s)        -4,344  -73.99%
count_jp           2,874 (1251 MB/s)  1,073 (3351 MB/s)        -1,801  -62.67%
count_words_ascii  4,131 (524 MB/s)   1,871 (1157 MB/s)        -2,260  -54.71%
count_words_jp     3,253 (1099 MB/s)  1,331 (2686 MB/s)        -1,922  -59.08%
```
I briefly explored a more involved blocked algorithm (looking at 8 or more bytes at a time),
but the code in this PR always won on `count_words_ascii` in particular (counting
many small strings); this solution is an improvement without tradeoffs.
The problem occurred due to lines like
```
3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;
4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;
```
in `UnicodeData.txt`, which the script previously interpreted as two
individual characters, although they represent an entire range.
Fixes #34318.
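A hedged sketch of the intended interpretation (the real fix is in the generator script; `first_code_point` is just illustrative):
```rust
// A "First"/"Last" pair of lines describes one inclusive range of code
// points with shared properties, not two separate characters.
fn first_code_point(line: &str) -> u32 {
    u32::from_str_radix(line.split(';').next().unwrap(), 16).unwrap()
}

fn main() {
    let first = first_code_point("3400;<CJK Ideograph Extension A, First>;Lo;0;L;;;;;N;;;;;");
    let last = first_code_point("4DB5;<CJK Ideograph Extension A, Last>;Lo;0;L;;;;;N;;;;;");
    // 0x3400..=0x4DB5 covers 6582 code points, all with the same properties.
    assert_eq!(last - first + 1, 6582);
}
```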
Document convention for using both fmt::Write and io::Write
Using a trait's methods (like `Write::write_fmt` as used in `writeln!` and other macros) requires importing that trait directly (not just the module containing it). Both `fmt::Write` and `io::Write` provide compatible `Write::write_fmt` methods, and code can use `writeln!` and other macros on both an object implementing `fmt::Write` (such as a `String`) and an object implementing `io::Write` (such as `Stderr`). However, importing both `Write` traits produces an error due to the name conflict.
The convention I've seen renames both of them on import, to `FmtWrite` and `IoWrite` respectively. Document that convention in the Rust documentation for `write!` and `writeln!`, with examples.
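A minimal sketch of that convention, assuming nothing beyond the standard library:
```rust
// Rename the traits on import so both `write_fmt` methods are in scope
// without a name conflict.
use std::fmt::Write as FmtWrite;
use std::io::Write as IoWrite;

fn main() {
    let mut buf = String::new();
    // `String` implements `fmt::Write`.
    writeln!(buf, "hello from fmt::Write").unwrap();

    // `Stderr` implements `io::Write`.
    writeln!(std::io::stderr(), "hello from io::Write").unwrap();
}
```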
Add .wrapping_offset() methods
.wrapping_offset() exposes the `arith_offset` intrinsic in the core
module (as methods on raw pointers, next to `offset`). This is the
first step in making it possible to stabilize the interface later.
`arith_offset` is a useful tool for developing iterators for two
reasons:
1. `arith_offset` is used by the slice's iterator, the most important
iterator in libcore, and it is natural that Rust users need the same
power available to implement similar iterators.
2. It is a good way to implement raw pointer iterations with step
greater than one.
The name seems to fit the style of methods like `wrapping_add`.
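A small usage sketch (the array and step size here are only for illustration):
```rust
fn main() {
    let data = [1u32, 2, 3, 4, 5, 6];
    let end = data.as_ptr().wrapping_offset(data.len() as isize);
    let mut p = data.as_ptr();

    // Walk the slice two elements at a time; the pointer arithmetic itself
    // is safe, only the dereference needs `unsafe`.
    while p < end {
        unsafe { println!("{}", *p) };
        p = p.wrapping_offset(2);
    }
}
```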
Add impls for `&Wrapping`. Also `Sum`, `Product` impls for both `Wrapping` and `&Wrapping`.
There are two changes here (split into two commits):
- Ops for `&Wrapping` references (`Add`, `Sub`, `Mul`, etc.), similar to the way they are implemented for primitives.
- Impls for `iter::{Sum,Product}` for `Wrapping`.
As far as I know `impl` stability attributes don't really matter so I didn't bother breaking up the macro for two different kinds of stability. Happy to change if it does matter.
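A brief illustration of what the new impls allow (values chosen to make the wraparound visible):
```rust
use std::num::Wrapping;

fn main() {
    // `Sum` for `Wrapping`: 250 + 10 wraps to 4 in u8 arithmetic.
    let total: Wrapping<u8> = [Wrapping(250u8), Wrapping(10u8)].iter().cloned().sum();
    assert_eq!(total, Wrapping(4));

    // Ops on references, as for primitives: 200 + 100 wraps to 44.
    let (a, b) = (Wrapping(200u8), Wrapping(100u8));
    assert_eq!(&a + &b, Wrapping(44));
}
```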
Add Iterator trait TrustedLen to enable better FromIterator / Extend
This trait attempts to improve FromIterator / Extend code by enabling it to trust the iterator to produce an exact number of elements, which means that reallocation needs to happen only once and is moved out of the loop.
`TrustedLen` differs from `ExactSizeIterator` in that it attempts to include _more_ iterators by allowing for the case that the iterator's len does not fit in `usize`. Consumers must check for this case (for example they could panic, since they can't allocate a collection of that size).
For example, `chain` can be `TrustedLen` and all numerical ranges can be `TrustedLen`. All they need to do is report an exact size if it fits in `usize`, and `None` as the upper bound otherwise.
The trait describes its contract like this:
```
An iterator that reports an accurate length using size_hint.
The iterator reports a size hint where it is either exact
(lower bound is equal to upper bound), or the upper bound is `None`.
The upper bound must only be `None` if the actual iterator length is
larger than `usize::MAX`.
The iterator must produce exactly the number of elements it reported.
This trait must only be implemented when the contract is upheld.
Consumers of this trait must inspect `.size_hint()`’s upper bound.
```
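A rough sketch, not the actual libcore code, of how a consumer could act on that contract (`extend_trusted` is a made-up name):
```rust
// Imagine an `I: TrustedLen` bound here; then the upper bound of
// `size_hint()` is either the exact length or `None` (length > usize::MAX).
fn extend_trusted<T, I: Iterator<Item = T>>(vec: &mut Vec<T>, iter: I) {
    match iter.size_hint() {
        (low, Some(high)) => {
            debug_assert_eq!(low, high);
            // Reallocation happens once, outside the loop.
            vec.reserve(high);
        }
        // More elements than fit in usize: no allocation can hold them.
        (_, None) => panic!("iterator length overflows usize"),
    }
    for item in iter {
        vec.push(item);
    }
}

fn main() {
    let mut v = vec![0u32];
    extend_trusted(&mut v, 1..4); // a range reports an exact size
    assert_eq!(v, [0, 1, 2, 3]);
}
```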
Fixes #37232
Copyediting on documentation for write! and writeln!
Fix various sentence fragments, missing articles, and other grammatical issues in the documentation for write! and writeln!.
Also fix the links (and link names) for common return types.
(Noticed when preparing https://github.com/rust-lang/rust/pull/37472 ; posted separately to avoid mixing the new documentation with copyedits to existing documentation.)
Prevent exhaustive matching of Ordering to allow for future extension
The C++11 atomic memory model defines a `memory_order_consume` ordering which is generally equivalent to `memory_order_acquire` but can allow better code generation by avoiding memory barrier instructions. Most compilers (including LLVM) currently do not implement this ordering directly and instead treat it identically to `memory_order_acquire`, including adding a memory barrier instruction.
There is currently [work](http://open-std.org/Jtc1/sc22/wg21/docs/papers/2016/p0098r1.pdf) to support consume ordering in compilers, and it would be a shame if Rust did not support this. This PR therefore reserves a `__Nonexhaustive` variant in `Ordering` so that adding a new ordering is not a breaking change in the future.
This is a [breaking-change] since it disallows exhaustive matching on `Ordering`; however, a search of all Rust code on GitHub shows that no code does this. This makes sense since `Ordering` is typically only used as a parameter to an atomic operation.
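A small illustration of the consequence for downstream code (the match arms here are just an example):
```rust
use std::sync::atomic::Ordering;

fn describe(order: Ordering) -> &'static str {
    match order {
        Ordering::Relaxed => "relaxed",
        Ordering::Acquire => "acquire",
        Ordering::Release => "release",
        Ordering::AcqRel => "acquire-release",
        Ordering::SeqCst => "sequentially consistent",
        // A wildcard arm is now required, so a consume-style variant can be
        // added later without breaking this match.
        _ => "other",
    }
}

fn main() {
    println!("{}", describe(Ordering::SeqCst));
}
```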
Most of the Rust community agrees that the `vec!` macro is clearer when
called with square brackets `[]` instead of parentheses `()`. Most of
these occurrences date from before macros allowed different kinds of
brackets.
There is one call left unchanged in a pretty-print test, as the pretty
printer still wants it to use parentheses.
improve docs for Index and IndexMut
This mainly replaces the boring `Foo`/`Bar` example for `IndexMut` with a better one.
It also adds an explanation of the syntactic sugar for `v[index]`.
Closes #36329.