mikros/rust - rust - Gitea.pterpstra.com

Author	SHA1	Message	Date
bors	2d91939bb7	Auto merge of #107634 - scottmcm:array-drain, r=thomcc Improve the `array::map` codegen The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](`7c46fb2111/library/core/src/array/mod.rs (L865-L912)`) function, I had some ideas that ended up working better than I'd expected. There's three main ideas in here, split over three commits: 1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily 2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant) 3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`). This one's a larger commit as to do it I ended up adding a new `pub(crate)` trait, but hopefully those changes are still straight-forward. (No libs-api changes; everything should be completely implementation-detail-internal.) It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations still (#103830) to get further -- but this seems to go much better than before. And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞 r? 
`@thomcc` As a simple example, this test ```rust pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] { x.map(|x| 13 * x + 7) } ``` On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808` ```llvm start: %array.i.i.i.i = alloca [64 x i32], align 4 %_3.sroa.5.i.i.i = alloca [65 x i32], align 4 %_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8 ``` (and yes, that's a 65-element array `alloca` despite 64-element input and output) But with this PR it's only `sub rsp, 520` ```llvm start: %array.i.i.i.i.i.i = alloca [64 x i32], align 4 %array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4 ``` Similarly, the loop it emits on nightly is scalar-only and horrifying ```nasm .LBB0_1: mov esi, 64 mov edi, 0 cmp rdx, 64 je .LBB0_3 lea rsi, [rdx + 1] mov qword ptr [rsp + 784], rsi mov r8d, dword ptr [rsp + 4*rdx + 528] mov edi, 1 lea edx, [r8 + 2*r8] lea r8d, [r8 + 4*rdx] add r8d, 7 .LBB0_3: test edi, edi je .LBB0_11 mov dword ptr [rsp + 4*rcx + 272], r8d cmp rsi, 64 jne .LBB0_6 xor r8d, r8d mov edx, 64 test r8d, r8d jne .LBB0_8 jmp .LBB0_11 .LBB0_6: lea rdx, [rsi + 1] mov qword ptr [rsp + 784], rdx mov edi, dword ptr [rsp + 4*rsi + 528] mov r8d, 1 lea esi, [rdi + 2*rdi] lea edi, [rdi + 4*rsi] add edi, 7 test r8d, r8d je .LBB0_11 .LBB0_8: mov dword ptr [rsp + 4*rcx + 276], edi add rcx, 2 cmp rcx, 64 jne .LBB0_1 ``` whereas with this PR it's unrolled and vectorized ```nasm vpmulld ymm1, ymm0, ymmword ptr [rsp + 64] vpaddd ymm1, ymm1, ymm2 vmovdqu ymmword ptr [rsp + 328], ymm1 vpmulld ymm1, ymm0, ymmword ptr [rsp + 96] vpaddd ymm1, ymm1, ymm2 vmovdqu ymmword ptr [rsp + 360], ymm1 ``` (though sadly still stack-to-stack)	2023-02-13 10:18:48 +00:00
Scott McMurray	5bc328fdef	Allow canonicalizing the `array::map` loop in trusted cases	2023-02-04 16:44:51 -08:00
Lukas Markeffsky	76e216f29b	Use associated items of `char` instead of freestanding items in `core::char`	2023-01-14 11:58:41 +01:00
Scott McMurray	9d68a1a74c	Tune RepeatWith::try_fold and Take::for_each and Vec::extend_trusted	2022-11-24 19:14:19 -08:00
Scott McMurray	d62b903892	`VecDeque::resize` should re-use the buffer in the passed-in element Today it always copies it for every appended element, but one of those clones is avoidable.	2022-11-15 00:53:26 -08:00
The 8472	43c353fff7	simplification: do not process the ArrayChunks remainder in fold()	2022-11-07 21:44:25 +01:00
Matthias Krüger	6deca5f067	Rollup merge of #100220 - scottmcm:fix-by-ref-sized, r=joshtriplett Properly forward `ByRefSized::fold` to the inner iterator cc `@timvermeulen`, who noticed this mistake in https://github.com/rust-lang/rust/pull/100214#issuecomment-1207317625	2022-08-24 18:20:08 +02:00
bors	6c943bad02	Auto merge of #99541 - timvermeulen:flatten_cleanup, r=the8472 Refactor iteration logic in the `Flatten` and `FlatMap` iterators The `Flatten` and `FlatMap` iterators both delegate to `FlattenCompat`: ```rust struct FlattenCompat<I, U> { iter: Fuse<I>, frontiter: Option<U>, backiter: Option<U>, } ``` Every individual iterator method that `FlattenCompat` implements needs to carefully manage this state, checking whether the `frontiter` and `backiter` are present, and storing the current iterator appropriately if iteration is aborted. This has led to methods such as `next`, `advance_by`, and `try_fold` all having similar code for managing the iterator's state. I have extracted this common logic of iterating the inner iterators with the option to exit early into an `iter_try_fold` method: ```rust impl<I, U> FlattenCompat<I, U> where I: Iterator<Item: IntoIterator<IntoIter = U>>, { fn iter_try_fold<Acc, Fold, R>(&mut self, acc: Acc, fold: Fold) -> R where Fold: FnMut(Acc, &mut U) -> R, R: Try<Output = Acc>, { ... } } ``` It passes each of the inner iterators to the given function as long as it keeps succeeding. It takes care of managing `FlattenCompat`'s state, so that the actual `Iterator` methods don't need to. The resulting code that makes use of this abstraction is much more straightforward: ```rust fn next(&mut self) -> Option<U::Item> { #[inline] fn next<U: Iterator>((): (), iter: &mut U) -> ControlFlow<U::Item> { match iter.next() { None => ControlFlow::CONTINUE, Some(x) => ControlFlow::Break(x), } } self.iter_try_fold((), next).break_value() } ``` Note that despite being implemented in terms of `iter_try_fold`, `next` is still able to benefit from `U`'s `next` method. It therefore does not take the performance hit that implementing `next` directly in terms of `Self::try_fold` causes (in some benchmarks). 
This PR also adds `iter_try_rfold` which captures the shared logic of `try_rfold` and `advance_back_by`, as well as `iter_fold` and `iter_rfold` for folding without early exits (used by `fold`, `rfold`, `count`, and `last`). Benchmark results: ``` before after bench_flat_map_sum 423,255 ns/iter 414,338 ns/iter bench_flat_map_ref_sum 1,942,139 ns/iter 2,216,643 ns/iter bench_flat_map_chain_sum 1,616,840 ns/iter 1,246,445 ns/iter bench_flat_map_chain_ref_sum 4,348,110 ns/iter 3,574,775 ns/iter bench_flat_map_chain_option_sum 780,037 ns/iter 780,679 ns/iter bench_flat_map_chain_option_ref_sum 2,056,458 ns/iter 834,932 ns/iter ``` I added the last two benchmarks specifically to demonstrate an extreme case where `FlatMap::next` can benefit from custom internal iteration of the outer iterator, so take it with a grain of salt. We should probably do a perf run to see if the changes to `next` are worth it in practice.	2022-08-19 02:34:30 +00:00
Scott McMurray	7680c8b690	Properly forward `ByRefSized::fold` to the inner iterator	2022-08-14 22:55:30 -07:00
austinabell	00bc9e8ac4	fix(iter::skip): Optimize `next` and `nth` implementations of `Skip`	2022-08-14 13:25:13 -04:00
Tim Vermeulen	3f7004920c	Move `fold` logic to `iter_fold` method and reuse it in `count` and `last`	2022-08-05 03:43:39 +02:00
Maybe Waffle	4db628a801	Remove incorrect impl `TrustedLen` for `ArrayChunks` As explained in the review of the previous attempt to add `ArrayChunks`, adapters that shrink the length can't implement `TrustedLen`.	2022-08-01 19:16:24 +04:00
Ross MacArthur	f5485181ca	Use `array::IntoIter` for the `ArrayChunks` remainder	2022-08-01 16:39:30 +04:00
Ross MacArthur	ca3d1010bb	Add `Iterator::array_chunks()`	2022-08-01 16:39:27 +04:00
Tim Vermeulen	e52837c362	Add note to test about `Unfuse`	2022-07-18 21:53:35 +02:00
Tim Vermeulen	50c612faef	Fix `Skip::next` for non-fused inner iterators	2022-07-18 21:10:47 +02:00
Ross MacArthur	bbdff1fff4	Add `Iterator::next_chunk`	2022-06-21 08:57:02 +02:00
est31	cdb8e64bc7	Use Box::new() instead of box syntax in core tests	2022-05-29 01:44:11 +02:00
Matthias Krüger	c183d4a510	Rollup merge of #94115 - scottmcm:iter-process-by-ref, r=yaahc Let `try_collect` take advantage of `try_fold` overrides No public API changes. With this change, `try_collect` (#94047) is no longer going through the `impl Iterator for &mut impl Iterator`, and thus will be able to use `try_fold` overrides instead of being forced through `next` for every element. Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56 This might as well go to the same person as my last `try_process` PR (#93572), so r? `@yaahc`	2022-03-18 21:50:44 +01:00
bors	21b0325c68	Auto merge of #94738 - Urgau:rustbuild-check-cfg-values, r=Mark-Simulacrum Enable conditional checking of values in the Rust codebase This pull-request enables conditional checking of (well known) values in the Rust codebase. Well known values were added in https://github.com/rust-lang/rust/pull/94362. All the `target_*` values are taken from all the built-in targets, which is why some extra values needed to be added, as they are not (yet?) defined in any built-in targets. r? `@Mark-Simulacrum`	2022-03-13 18:34:00 +00:00
Scott McMurray	7ef74bc8b9	Let `try_collect` take advantage of `try_fold` overrides Without this change it's going through `&mut impl Iterator`, which handles `?Sized` and thus currently can't forward generics. Here's the test added, to see that it fails before this PR (once a new enough nightly is out): https://play.rust-lang.org/?version=nightly&mode=debug&edition=2021&gist=462f2896f2fed2c238ee63ca1a7e7c56	2022-03-10 00:16:06 -08:00
Loïc BRANSTETT	e3ea59ada5	Remove unexpected #[cfg(target_pointer_width = "8")] in tests	2022-03-09 00:30:17 +01:00
fren_gor	04b3162764	Add collect_into	2022-02-20 01:57:32 +01:00
Arthur Lafrance	47d5196a00	Add a `try_collect()` helper method to `Iterator` Tweaked `try_collect()` to accept more `Try` types Updated feature attribute for tracking issue	2022-02-16 14:26:39 -08:00
tamaron	83242897fb	add tests	2022-02-02 23:07:02 +09:00
Lucas Kent	08829853d3	Replace usages of vec![].into_iter with [].into_iter	2022-01-09 14:09:25 +11:00
Mara Bos	1acb44f03c	Use IntoIterator for array impl everywhere.	2021-12-04 19:40:33 +01:00
kit	aef59e4fb8	Add a `try_reduce` method to the Iterator trait	2021-12-04 15:17:14 +11:00
The8472	3f9b26dc64	Fix Iterator::advance_by contract inconsistency The `advance_by(n)` docs state that in the error case `Err(k)` that k is always less than n. It also states that `advance_by(0)` may return `Err(0)` to indicate an exhausted iterator. These statements are inconsistent. Since only one implementation (Skip) actually made use of that I changed it to return Ok(()) in that case too. While adding some tests I also found a bug in `Take::advance_back_by`.	2021-11-19 13:00:23 +01:00
The8472	2c6e67105e	implement advance_(back_)_by on more iterators	2021-09-30 21:23:28 +02:00
Frank Steffahn	8d2bb9389a	Consistent spelling of "adapter" in the standard library Change all occurrences of "(A|a)daptor" to "(A|a)dapter".	2021-07-30 17:23:07 +02:00
The8472	8dd903cc77	implement ConstSizeIntoIterator for &[T;N] in addition to [T;N] Due to #20400 the corresponding TrustedLen impls need a helper trait instead of directly adding `Item = &[T;N]` bounds. Since TrustedLen is a public trait this in turn means the helper trait needs to be public. Since it's just a workaround for a compiler deficit it's marked hidden, unstable and unsafe.	2021-07-16 20:38:42 +02:00
The8472	bd1c39dc6c	implement TrustedLen for Flatten/FlatMap if the U: IntoIterator == [T; N] This only works if arrays are passed directly instead of array iterators because we need to be sure that they have not been advanced before Flatten does its size calculation.	2021-07-15 22:59:30 +02:00
The8472	b4734b7c38	disable test on platforms that don't support unwinding	2021-06-20 12:20:05 +02:00
The8472	8b518542d0	fix panic-safety in specialized Zip::next_back This was unsound since a panic in a.next_back() would result in the length not being updated which would then lead to the same element being revisited in the side-effect preserving code.	2021-06-19 02:20:51 +02:00
Muhammad Mominul Huque	507d97b26e	Update expressions where we can use array's IntoIterator implementation	2021-06-02 16:09:04 +06:00
Mara Bos	8dc0ae24bc	Remove Option::{unwrap_none, expect_none}.	2021-03-14 12:54:34 +01:00
Giacomo Stevanato	c1bfb9a78d	Add relevant test	2021-03-05 19:09:23 +01:00
Mara	ee796c6523	Rollup merge of #82289 - SkiFire13:fix-issue-82282, r=m-ou-se Fix underflow in specialized ZipImpl::size_hint Fixes #82282	2021-03-05 10:57:19 +01:00
Giacomo Stevanato	8b9ac4d415	Add test for underflow in specialized Zip's size_hint	2021-03-03 21:16:08 +01:00
Ryan Levick	ee65416f0d	Fix core tests	2021-03-03 11:22:49 +01:00
Giacomo Stevanato	f241c10223	Improve flatten-fuse tests	2021-01-23 21:33:38 +01:00
Daniel Conley	0c78500426	library/core/tests/iter documentation and cleanup	2021-01-22 17:57:08 -05:00
Daniel Conley	bc830a274b	library/core/tests/iter rearrange & add back missed doc comments	2021-01-22 17:57:07 -05:00
Daniel Conley	1e3a2def67	library/core/test/iter add newlines between tests	2021-01-22 16:58:21 -05:00
Daniel Conley	3ce97000e1	library/core/test/iter.rs split attempt 2	2021-01-21 19:36:32 -05:00
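
The `array::map` test case quoted in commit 2d91939bb7 above can be checked for behavior on stable Rust. The sketch below reuses `long_integer_map` exactly as written in that commit message; `long_integer_loop` is a hypothetical reference version added here only for comparison and is not part of the PR. It says nothing about stack usage or vectorization, which depend on the compiler version and flags.

```rust
/// The function from the commit message, unchanged.
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
    x.map(|x| 13 * x + 7)
}

/// Hypothetical reference version (not from the PR): the same
/// computation written as an explicit loop over the elements.
pub fn long_integer_loop(x: [u32; 64]) -> [u32; 64] {
    let mut out = [0u32; 64];
    for (dst, src) in out.iter_mut().zip(x) {
        *dst = 13 * src + 7;
    }
    out
}

fn main() {
    // Fill the input with 0, 1, 2, ... so every lane differs.
    let mut input = [0u32; 64];
    for (i, slot) in input.iter_mut().enumerate() {
        *slot = i as u32;
    }
    // Both versions must agree element by element.
    assert_eq!(long_integer_map(input), long_integer_loop(input));
    println!("{:?}", &long_integer_map(input)[..4]);
}
```

To compare generated code rather than results, one option is to compile both functions with optimizations and inspect the assembly, as the commit message does on godbolt.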

46 Commits
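
The early-exit pattern described in commit 6c943bad02 (passing each inner iterator to a function until one yields a value) can be approximated on stable Rust with `Iterator::try_fold` and `ControlFlow`. This is a minimal sketch under stated assumptions, not the `FlattenCompat` implementation: `next_from_inner` is a hypothetical helper, it works on a plain slice of iterators rather than `frontiter`/`backiter` state, and it matches on the result by hand instead of calling `break_value`.

```rust
use std::ops::ControlFlow;

/// Hypothetical helper (not from the PR): return the next item from the
/// first inner iterator that still has one, or `None` if all are exhausted.
fn next_from_inner<U: Iterator>(inners: &mut [U]) -> Option<U::Item> {
    let flow = inners.iter_mut().try_fold((), |(), iter| match iter.next() {
        // This inner iterator is empty: keep going with the next one.
        None => ControlFlow::Continue(()),
        // Found an item: stop the fold early and carry it out.
        Some(x) => ControlFlow::Break(x),
    });
    match flow {
        ControlFlow::Break(x) => Some(x),
        ControlFlow::Continue(()) => None,
    }
}

fn main() {
    let mut inners = vec![
        Vec::<i32>::new().into_iter(),
        vec![1, 2].into_iter(),
        vec![3].into_iter(),
    ];
    // Drain everything through the helper, one item per call.
    while let Some(x) = next_from_inner(&mut inners) {
        println!("{x}");
    }
}
```

Unlike the real adapter, this sketch re-polls exhausted inner iterators on every call, which is only safe here because `vec::IntoIter` is fused.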