stabilize slice_align_to
This is very hard to implement correctly by hand and leads to [serious bugs](https://github.com/llogiq/bytecount/pull/42) when done incorrectly. Moreover, stabilizing it is needed so that code which opportunistically exploits alignment can run under miri. Code using `align_to`/`align_to_mut` thus gets the benefit of a well-tested implementation *and* of being able to run in miri to test for (some kinds of) UB.
This PR also clarifies the guarantee wrt. the middle part being as long as possible. Should the docs say under which circumstances the middle part could be shorter? Currently, that can only happen when running in miri.
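For reference, a small usage sketch of the stabilized API (example values arbitrary, not from the PR):

````rust
fn main() {
    let bytes = [1u8; 13];
    // SAFETY: any initialized bytes are valid `u32`s, so reinterpreting the
    // middle part is fine here.
    let (prefix, middle, suffix) = unsafe { bytes.align_to::<u32>() };
    // The middle part is as long as possible (except under miri, as
    // discussed above), so the prefix is shorter than one `u32`.
    assert!(prefix.len() < 4);
    assert_eq!(prefix.len() + middle.len() * 4 + suffix.len(), bytes.len());
    for word in middle {
        assert_eq!(*word, 0x0101_0101);
    }
}
````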
debug_assert to ensure that from_raw_parts is only used properly aligned
This does not help nearly as much as I would have hoped because everybody uses the distributed libstd, which is compiled without debug assertions. For this reason, I am not sure this is even worth it. OTOH, this would have caught the misalignment fixed by https://github.com/rust-lang/rust/issues/42789 *if* there had been any tests actually using ZSTs with alignment >1 (we do have a CI runner with debug assertions enabled in libstd), and it currently seems to [fail in the rg testsuite](https://ci.appveyor.com/project/rust-lang/rust/build/1.0.8403/job/v7dfdcgn8ay5j6sb). So maybe it is worth it after all.
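For illustration, roughly the kind of check this adds (a hypothetical standalone wrapper, not the actual libcore code):

````rust
use std::mem;
use std::slice;

/// Hypothetical wrapper showing the shape of the check: in debug builds,
/// assert that the pointer handed to `from_raw_parts` is non-null and
/// properly aligned for `T`.
unsafe fn from_raw_parts_checked<'a, T>(data: *const T, len: usize) -> &'a [T] {
    debug_assert!(
        !data.is_null() && data as usize % mem::align_of::<T>() == 0,
        "slice::from_raw_parts called with a null or misaligned pointer"
    );
    slice::from_raw_parts(data, len)
}

fn main() {
    let v = [1u32, 2, 3];
    let s = unsafe { from_raw_parts_checked(v.as_ptr(), v.len()) };
    assert_eq!(s, &[1u32, 2, 3]);
}
````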
I have seen the attribute `#[rustc_inherit_overflow_checks]` in some places, does that make it so that the *caller's* debug status is relevant? Is there a similar attribute for `debug_assert!`? That could even subsume `rustc_inherit_overflow_checks`: Something like `rustc_inherit_debug_flag` could affect *all* places that change the generated code depending on whether we are in debug or release mode. In fact, given that we have to keep around the MIR for generic functions anyway, is there ever a reason *not* to handle the debug flag that way? I guess currently we apply debug flags like `cfg` so this is dropped early during the MIR pipeline?
EDIT: I learned from @eddyb that because of how `debug_assert!` works, this is not realistic. Well, we could still have it for the rustc CI runs and then maybe, eventually, when libstd gets compiled client-side and there is both a debug and a release build... then this will also benefit users.^^
unsized ManuallyDrop
I think this matches what @eddyb had in https://github.com/rust-lang/rust/pull/52711 originally.
~~However, I have never added a `CoerceUnsized` before so I am not sure if I did this right. I copied the `unstable` attribute on the `impl` from elsewhere, but AFAIK it is useless because `impl`'s are insta-stable... so shouldn't this rather say "stable since 1.30"?~~
This is insta-stable and hence requires FCP, at least.
Fixes https://github.com/rust-lang/rust/issues/47034
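For illustration, a rough sketch of what unsizing a `ManuallyDrop` looks like from the user side (using the slice case; trait objects work the same way):

````rust
use std::mem::ManuallyDrop;

fn main() {
    let array = ManuallyDrop::new([1u8, 2, 3]);
    // With `T: ?Sized` allowed on ManuallyDrop, a reference to
    // `ManuallyDrop<[u8; 3]>` can be unsize-coerced to `&ManuallyDrop<[u8]>`.
    let slice: &ManuallyDrop<[u8]> = &array;
    assert_eq!(slice.len(), 3);
    assert_eq!(slice[0], 1);
}
````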
slices: fix ZST slice iterators making up pointers; debug_assert alignment in from_raw_parts
This fixes the problem that we were fabricating pointers out of thin air. I also managed to share more code between the mutable and shared iterators while reducing the number of macros.
I am not sure how useful it really is to add a `debug_assert!` in libcore. Everybody gets a release version of that anyway, right? Is there at least a CI job that runs the test suite with a debug version?
Fixes #42789
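For illustration, the kind of situation this matters for (hypothetical type, not part of the PR): a zero-sized type with alignment greater than 1, where the iterator must not hand out made-up pointers like `1 as *const Zst`:

````rust
// Hypothetical ZST with alignment 4: size 0, but references to it still
// have to be 4-byte aligned.
#[repr(align(4))]
struct Zst;

fn main() {
    let v = [Zst, Zst, Zst];
    assert_eq!(std::mem::size_of::<Zst>(), 0);
    for z in v.iter() {
        // Every reference the iterator yields must respect the alignment.
        assert_eq!(z as *const Zst as usize % 4, 0);
    }
    assert_eq!(v.iter().count(), 3);
}
````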
Add ExactChunks::remainder and ExactChunks::into_remainder
These allow getting the leftover items of the slice that are not iterated
as part of the iterator because they do not fill a complete chunk.
The mutable version consumes the iterator because otherwise we would either
a) have to borrow from the iterator instead of taking the lifetime of
the underlying slice, which is not what *any* of the other iterator
functions does, or
b) allow returning multiple mutable references to the same data.
The current behaviour of consuming the iterator is consistent with
IterMut::into_slice for the normal iterator.
----
This is related to https://github.com/rust-lang/rust/issues/47115#issuecomment-392685177 and the following comments.
The discussion there was initially about a way to get the "tail" of the iterator (everything from the slice that has not been iterated yet), but that gives rather unintuitive behaviour and is inconsistent with how the other slice iterators work.
Unintuitive because `next_back` would have no effect on the tail (or otherwise the tail could not include the remainder items). Inconsistent because a) the general idea of the slice iterators seems to be to only ever return items that were not iterated yet (and not to provide a way to access the same item twice), b) we would return a "flat" `&[T]` slice while the iterator's shape is `&[[T]]`, and c) the mutable variant would have to borrow from the iterator instead of the underlying slice (all other iterator functions borrow from the underlying slice!).
As such, I've only implemented functions to get the remainder. This also allows the implementation to stay completely safe (and work on slices instead of raw pointers), while getting the tail would either be inefficient or would have to be implemented around raw pointers.
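A short usage sketch of the intended behaviour (written against the stable names `chunks_exact`/`chunks_exact_mut`; in this PR the iterator is still called `ExactChunks`):

````rust
fn main() {
    let v = [1, 2, 3, 4, 5, 6, 7];
    let mut iter = v.chunks_exact(3);
    assert_eq!(iter.next(), Some(&[1, 2, 3][..]));
    assert_eq!(iter.next(), Some(&[4, 5, 6][..]));
    assert_eq!(iter.next(), None);
    // The leftover item that never formed a complete chunk:
    assert_eq!(iter.remainder(), &[7]);

    let mut w = [1, 2, 3, 4, 5];
    let mut iter = w.chunks_exact_mut(2);
    for chunk in &mut iter {
        chunk[0] = 0;
    }
    // `into_remainder` consumes the iterator, so there is never more than
    // one mutable borrow of the leftover data.
    let rem = iter.into_remainder();
    rem[0] = 0;
    assert_eq!(w, [0, 2, 0, 4, 0]);
}
````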
CC @kerollmops
Implement always-fallible TryFrom for usize/isize conversions that are infallible on some platforms
This reverts commit 837d6c70233715a0ae8e15c703d40e3046a2f36a "Remove TryFrom impls that might become conditionally-infallible with a portability lint".
This fixes #49415 by adding (restoring) missing `TryFrom` impls for integer conversions to or from `usize` or `isize`, making them always fallible at the type system level (that is, with `Error=TryFromIntError`) even though they happen to be infallible on some platforms (for some values of `size_of::<usize>()`).
They had been removed to leave open the possibility of conditionally having some of them be infallible `From` impls instead, depending on the platform, with the [portability lint](https://github.com/rust-lang/rfcs/pull/1868) warning when they are used in code that is not already opting into non-portability. For example, `#[allow(some_lint)] usize::from(x: u64)` would be valid in code that only targets 64-bit platforms.
This PR gives up on this possibility for two reasons:
* Based on discussion with @aturon, it seems that the portability lint is not happening any time soon. It’s better to have the conversions be available *at all* than keep blocking them for so long. Portability-lint-gated platform-specific APIs can always be added separately later.
* For code that is fine with fallibility, the alternative would force it to opt into "non-portability" even though there would be no real portability issue.
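For illustration, a small sketch of the call-site ergonomics these impls give (example values arbitrary):

````rust
use std::convert::TryFrom;

fn main() {
    // u64 -> usize is always a `TryFrom` conversion, even on 64-bit targets
    // where it can never actually fail.
    let x: u64 = 42;
    let y = usize::try_from(x).expect("value out of range for usize");
    assert_eq!(y, 42);

    // On a 32-bit target this returns an error; on a 64-bit target it
    // succeeds, but the call site looks the same either way.
    let big: u64 = 1 << 40;
    match usize::try_from(big) {
        Ok(v) => println!("fits in usize: {}", v),
        Err(e) => println!("does not fit in usize: {}", e),
    }
}
````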
Stabilize Iterator::flatten in 1.29, fixes #48213.
This PR stabilizes [`Iterator::flatten`](https://doc.rust-lang.org/nightly/std/iter/trait.Iterator.html#method.flatten) in *version 1.29* (1.28 goes to beta in 10 days; I don't think there's enough time to land this before then, but let's see...).
Tracking issue is: #48213.
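For reference, a quick sketch of what the method does:

````rust
fn main() {
    let nested = vec![vec![1, 2], vec![3], vec![4, 5, 6]];
    // `flatten` covers the common `flat_map(|x| x)` pattern directly.
    let flat: Vec<i32> = nested.into_iter().flatten().collect();
    assert_eq!(flat, [1, 2, 3, 4, 5, 6]);

    // It also removes one level of `Option`/`Result` from an iterator.
    let maybe_numbers = vec![Some(1), None, Some(3)];
    let present: Vec<i32> = maybe_numbers.into_iter().flatten().collect();
    assert_eq!(present, [1, 3]);
}
````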
cc @bluss re. itertools.
r? @SimonSapin
ping @pietroalbini -- let's do a crater run when this passes CI :)
Document round-off error in the `.mod_euc()` method, see issue #50179
Due to a round-off error, the method `.mod_euc()` of both `f32` and `f64` can produce mathematically invalid outputs. If `self` is negative and much smaller in magnitude than the modulus `rhs`, then `self + rhs` in the first branch cannot be represented in the given precision and results in `rhs`. In the strict mathematical sense this breaks the definition, but given the limitations of floating point arithmetic the result can be thought of as the closest representable value to the true result, although it is not strictly in the domain `[0.0, rhs)` of the function; it is rather the left-side asymptotic limit. It would be desirable to produce the mathematically more sound approximation `0.0`, the right-side asymptotic limit, but that would break the property that `self == self.div_euc(rhs) * rhs + self.mod_euc(rhs)`.
The discussion in issue #50179 did not reach a satisfying conclusion about which property is more important, but at least we can document the behaviour, which is what this pull request does.
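A small example of the behaviour in question (using `rem_euclid`, the name under which `mod_euc` was later stabilized):

````rust
fn main() {
    let rhs = 10.0_f64;
    let small_negative = -1e-17_f64;

    // Mathematically the result is 10 - 1e-17, but that value is not
    // representable as an f64: `self % rhs + rhs` rounds to exactly `rhs`,
    // so the output lies just outside the domain [0.0, rhs).
    let r = small_negative.rem_euclid(rhs);
    assert_eq!(r, rhs);
}
````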
Optimize sum of Durations by using a custom function
The current `impl Sum for Duration` uses `fold` to perform several `add`s (or really `checked_add`s) of durations. In doing so, it has to guarantee that the number of nanoseconds is valid after every addition. If you squeeze the current implementation into a single function, it looks roughly like this:
````rust
fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration {
    let mut sum = Duration::new(0, 0);
    for rhs in iter {
        if let Some(mut secs) = sum.secs.checked_add(rhs.secs) {
            let mut nanos = sum.nanos + rhs.nanos;
            if nanos >= NANOS_PER_SEC {
                nanos -= NANOS_PER_SEC;
                if let Some(new_secs) = secs.checked_add(1) {
                    secs = new_secs;
                } else {
                    panic!("overflow when adding durations");
                }
            }
            sum = Duration { secs, nanos }
        } else {
            panic!("overflow when adding durations");
        }
    }
    sum
}
````
We only need to check whether `nanos` is in the correct range when producing the final answer, so we can use a more optimized version like this:
````rust
fn sum<I: Iterator<Item = Duration>>(iter: I) -> Duration {
    let mut total_secs: u64 = 0;
    let mut total_nanos: u64 = 0;
    for entry in iter {
        total_secs = total_secs
            .checked_add(entry.secs)
            .expect("overflow in iter::sum over durations");
        total_nanos = match total_nanos.checked_add(entry.nanos as u64) {
            Some(n) => n,
            None => {
                total_secs = total_secs
                    .checked_add(total_nanos / NANOS_PER_SEC as u64)
                    .expect("overflow in iter::sum over durations");
                (total_nanos % NANOS_PER_SEC as u64) + entry.nanos as u64
            }
        };
    }
    total_secs = total_secs
        .checked_add(total_nanos / NANOS_PER_SEC as u64)
        .expect("overflow in iter::sum over durations");
    total_nanos = total_nanos % NANOS_PER_SEC as u64;
    Duration {
        secs: total_secs,
        nanos: total_nanos as u32,
    }
}
````
We now only convert `total_nanos` into `total_secs` (1) if `total_nanos` overflows and (2) at the end of the function, when we have to output a valid `Duration`. This gave a 5-22% performance improvement when I benchmarked it, depending on how big the `nanos` values of the `Duration`s in `iter` were.
Specialize StepBy<Range(Inclusive)>
Part of #51557, related to #43064, #31155
As discussed in the above issues, `step_by` optimizes very badly on ranges, which is due to
1. the special-casing of the first `StepBy::next()` call, and
2. the need to do two additions, of `n - 1` and of `1`, inside the range's `next()`.
This PR eliminates both by overriding `next()` to always produce the current element and also step ahead by `n` elements in one go. The generated code is much better, even identical in the case of a `Range` with constant `start` and `end` where `start+step` can't overflow. Without constant bounds it's a bit longer than the manual loop. `RangeInclusive` doesn't optimize as nicely but is still much better than the original asm.
Unsigned integers optimize better than signed ones for some reason.
See the following two links for a comparison.
[godbolt: specialization for ..](https://godbolt.org/g/haHLJr)
[godbolt: specialization for ..=](https://godbolt.org/g/ewyMu6)
`RangeFrom`, the only other range with an `Iterator` implementation, can't be specialized like this without changing behaviour due to overflow: there is no way to save "finished-ness".
The approach cannot be used in general, because it would produce the side effects of the underlying iterator too early.
May obsolete #51435, haven't checked.
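A rough sketch of the idea (a simplified standalone type, not the actual libcore specialization): yield the current `start` and step ahead by the full step in one go, instead of special-casing the first call and doing `n - 1` plus `1` single steps:

````rust
struct SimpleStepBy {
    start: u32,
    end: u32,
    step: u32, // the full step, i.e. the argument to `step_by`
}

impl Iterator for SimpleStepBy {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.start >= self.end {
            return None;
        }
        let item = self.start;
        // One saturating jump instead of `n - 1` single steps plus `1`;
        // saturating so the last iteration just runs past `end`.
        self.start = self.start.saturating_add(self.step);
        Some(item)
    }
}

fn main() {
    let v: Vec<u32> = SimpleStepBy { start: 0, end: 10, step: 3 }.collect();
    assert_eq!(v, [0u32, 3, 6, 9]);
    // Same elements as the std iterator:
    let w: Vec<u32> = (0u32..10).step_by(3).collect();
    assert_eq!(v, w);
}
````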
The originally generated code was highly suboptimal. This brings it close to, or in some cases exactly equal to, the code of a manual while loop by eliminating a branch and the double stepping of n-1 + 1 steps. The intermediate trait lets us circumvent the specialization type-inference bugs.
Add Ref/RefMut map_split method
As proposed [here](https://internals.rust-lang.org/t/make-refcell-support-slice-splitting/7707).
TLDR: Add a `map_split` method that allows multiple `RefMut`s to exist simultaneously so long as they refer to non-overlapping regions of the original `RefCell`. This is useful for things like the slice `split_at_mut` method.
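For illustration, usage looks roughly like this:

````rust
use std::cell::{RefCell, RefMut};

fn main() {
    let cell = RefCell::new([1, 2, 3, 4, 5]);

    // Split one RefMut into two non-overlapping RefMuts, analogous to
    // `split_at_mut` on slices.
    let borrow = cell.borrow_mut();
    let (mut left, mut right) = RefMut::map_split(borrow, |v| v.split_at_mut(2));
    left[0] = 10;
    right[0] = 30;
    drop(left);
    drop(right);

    assert_eq!(*cell.borrow(), [10, 2, 30, 4, 5]);
}
````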