Improve the `array::map` codegen
The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](7c46fb2111/library/core/src/array/mod.rs (L865-L912)) function, I had some ideas that ended up working better than I'd expected.
There are three main ideas in here, split over three commits:
1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily
2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant)
3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`). This one's a larger commit since doing it meant adding a new `pub(crate)` trait, but hopefully those changes are still straightforward; a rough sketch of the idea follows below.
(No libs-api changes; everything should be completely implementation-detail-internal.)
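As a rough sketch of idea 3 (hypothetical names, not the actual `core` code, and omitting the drop guard the real implementation needs for panic safety): when the iterator is statically guaranteed to produce at least `N` items, the output array can be filled directly without threading `Option`s through the loop.
```rust
use core::mem::{transmute_copy, MaybeUninit};

/// SAFETY: the caller must guarantee that `iter` yields at least `N` items.
unsafe fn collect_exact<T, I, const N: usize>(mut iter: I) -> [T; N]
where
    I: Iterator<Item = T>,
{
    // An array of uninitialized slots (the standard `MaybeUninit` array pattern).
    let mut out: [MaybeUninit<T>; N] = MaybeUninit::uninit().assume_init();
    for slot in out.iter_mut() {
        // No `Option` bookkeeping survives into the loop body: the length
        // guarantee lets us consume each `next()` result immediately.
        slot.write(iter.next().unwrap_unchecked());
    }
    // The `[MaybeUninit<T>; N] -> [T; N]` transmute-equivalent copy mentioned above.
    transmute_copy(&out)
}
```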
It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations (#103830) to get further -- but this seems to go much better than before. And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞
r? `@thomcc`
As a simple example, this test
```rust
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
    x.map(|x| 13 * x + 7)
}
```
On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808`
```llvm
start:
%array.i.i.i.i = alloca [64 x i32], align 4
%_3.sroa.5.i.i.i = alloca [65 x i32], align 4
%_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8
```
(and yes, that's a 6**5**-element array `alloca` despite 6**4**-element input and output)
But with this PR it's only `sub rsp, 520`
```llvm
start:
%array.i.i.i.i.i.i = alloca [64 x i32], align 4
%array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4
```
Similarly, the loop it emits on nightly is scalar-only and horrifying
```nasm
.LBB0_1:
mov esi, 64
mov edi, 0
cmp rdx, 64
je .LBB0_3
lea rsi, [rdx + 1]
mov qword ptr [rsp + 784], rsi
mov r8d, dword ptr [rsp + 4*rdx + 528]
mov edi, 1
lea edx, [r8 + 2*r8]
lea r8d, [r8 + 4*rdx]
add r8d, 7
.LBB0_3:
test edi, edi
je .LBB0_11
mov dword ptr [rsp + 4*rcx + 272], r8d
cmp rsi, 64
jne .LBB0_6
xor r8d, r8d
mov edx, 64
test r8d, r8d
jne .LBB0_8
jmp .LBB0_11
.LBB0_6:
lea rdx, [rsi + 1]
mov qword ptr [rsp + 784], rdx
mov edi, dword ptr [rsp + 4*rsi + 528]
mov r8d, 1
lea esi, [rdi + 2*rdi]
lea edi, [rdi + 4*rsi]
add edi, 7
test r8d, r8d
je .LBB0_11
.LBB0_8:
mov dword ptr [rsp + 4*rcx + 276], edi
add rcx, 2
cmp rcx, 64
jne .LBB0_1
```
whereas with this PR it's unrolled and vectorized
```nasm
vpmulld ymm1, ymm0, ymmword ptr [rsp + 64]
vpaddd ymm1, ymm1, ymm2
vmovdqu ymmword ptr [rsp + 328], ymm1
vpmulld ymm1, ymm0, ymmword ptr [rsp + 96]
vpaddd ymm1, ymm1, ymm2
vmovdqu ymmword ptr [rsp + 360], ymm1
```
(though sadly still stack-to-stack)
`slice::sort_unstable` will fall back to heapsort if it repeatedly fails to find
a good pivot. By making the core child update code branchless it is much faster.
On Zen3, sorting 10k `u64` values while forcing the sort to pick heapsort improves from 455us to 278us.
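A minimal sketch of the branchless child update, assuming a plain `u64` max-heap (this is not the library's exact code): the result of the comparison is added to the child index instead of being branched on.
```rust
fn sift_down(v: &mut [u64], mut node: usize) {
    loop {
        let mut child = 2 * node + 1;
        if child >= v.len() {
            break;
        }
        if child + 1 < v.len() {
            // Branchless child update: bumps `child` by 1 when the right child
            // is larger, compiling to a setcc/cmov-style sequence rather than
            // an unpredictable branch.
            child += (v[child] < v[child + 1]) as usize;
        }
        if v[node] >= v[child] {
            break;
        }
        v.swap(node, child);
        node = child;
    }
}
```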
Stabilize feature `cstr_from_bytes_until_nul`
This PR seeks to stabilize `cstr_from_bytes_until_nul`.
Partially addresses #95027
This function has only been on nightly for about 10 months, but I think it is simple enough that there isn't harm in discussing stabilization. It has also had at least a handful of mentions on both the user forum and the Discord, so it seems like it's already in use, or at least known.
This needs FCP still.
Comment on potential discussion points:
- eventual conversion of `CStr` to be a single thin pointer: this function will still be useful to provide a safe way to create a `CStr` after this change.
- should this return a length too, to address concerns about the `CStr` change? I don't see it as being particularly useful, and it seems less ergonomic (i.e. returning `Result<(&CStr, usize), FromBytesUntilNulError>`). I think users that also need this length without the additional `strlen` call are likely better off using a combination of other methods, but this is up for discussion
- `CString::from_vec_until_nul`: this is also useful, but it doesn't even have a nightly implementation merged yet. I propose feature gating that separately, as opposed to blocking this `CStr` implementation on that
Possible alternatives:
A user can use `from_bytes_with_nul` on a slice truncated just past the first nul, e.g. `my_slice[..=my_slice.iter().position(|&c| c == 0).unwrap()]`. However, that is significantly less ergonomic, and it is a bit more work for the compiler to optimize compared to the direct `memchr` call that this wraps.
## New stable API
```rs
// both in core::ffi
pub struct FromBytesUntilNulError(());
impl CStr {
    pub const fn from_bytes_until_nul(
        bytes: &[u8]
    ) -> Result<&CStr, FromBytesUntilNulError>
}
```
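For illustration (not taken from the PR itself), a typical use of the newly stable function:
```rust
use core::ffi::CStr;

fn main() {
    // A buffer that contains a nul-terminated string followed by other data.
    let buf = b"hello\0 trailing bytes";
    let cstr = CStr::from_bytes_until_nul(buf).expect("buffer contains no nul byte");
    assert_eq!(cstr.to_bytes(), b"hello");
}
```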
cc `@ericseppanen` (original author), `@Mark-Simulacrum` (original reviewer), `@m-ou-se` (brought up some issues with the thin-pointer `CStr`)
`@rustbot` modify labels: +T-libs-api +needs-fcp
Memory prefetching prefers forward scanning over backwards scanning, and the codegen is usually better. For the types most sensitive to this, such as integers, the plan is to merge bidirectionally at once, so there is no benefit in scanning backwards.
The largest perf gains are seen for fully ascending and descending inputs, which see 1.5x speedups. Random inputs benefit too, and some patterns can lose out, but these losses are minimal.
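As a minimal illustration of the preferred access pattern (not the library's merge routine), a forward-scanning merge walks both input runs and the output strictly left to right:
```rust
// Merge two sorted runs into `out`, scanning everything forward only.
fn merge_forward<T: Ord + Copy>(a: &[T], b: &[T], out: &mut Vec<T>) {
    let (mut i, mut j) = (0, 0);
    while i < a.len() && j < b.len() {
        if a[i] <= b[j] {
            out.push(a[i]);
            i += 1;
        } else {
            out.push(b[j]);
            j += 1;
        }
    }
    // Copy whichever run still has elements left.
    out.extend_from_slice(&a[i..]);
    out.extend_from_slice(&b[j..]);
}
```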
Unify stable and unstable sort implementations in same core module
This moves the stable sort implementation to the `core::slice::sort` module. By virtue of being in core it can't access `Vec`. The two `Vec`s used by merge sort, `buf` and `runs`, are modelled as custom types that implement the very limited required `Vec` interface with the help of provided allocation and free functions. This is done to allow future re-use of functions and logic between stable and unstable sort, such as `insert_head`.
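A sketch of that pattern, with illustrative names rather than the library's: the caller in `alloc` provides allocation and free functions, and the sort works against a minimal guard type instead of `Vec`.
```rust
// A buffer that lives for the duration of the sort, built only from
// caller-provided allocation and free functions (core itself cannot allocate).
struct BufGuard<T> {
    ptr: *mut T,
    capacity: usize,
    elem_dealloc_fn: unsafe fn(*mut T, usize),
}

impl<T> BufGuard<T> {
    fn new(
        capacity: usize,
        elem_alloc_fn: fn(usize) -> *mut T,
        elem_dealloc_fn: unsafe fn(*mut T, usize),
    ) -> Self {
        Self { ptr: elem_alloc_fn(capacity), capacity, elem_dealloc_fn }
    }
}

impl<T> Drop for BufGuard<T> {
    fn drop(&mut self) {
        // Hand the buffer back through the caller-provided free function.
        unsafe { (self.elem_dealloc_fn)(self.ptr, self.capacity) }
    }
}
```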
This is in preparation of #100856 and #104116. It only moves code; it *doesn't* change any of the sort-related logic. This unlocks the ability to share `insert_head`, `insert_tail`, `swap_if_less`, `merge`, and more.
Tagging `@Mark-Simulacrum`: I hope this allows progress on #100856; by moving `merge_sort` here, future changes should be easier to review.
Add heapsort fallback in `select_nth_unstable`
Addresses #102451 and #106933.
`slice::select_nth_unstable` uses a quickselect implementation based on the same pattern-defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst case running time (known as introsort). However, `slice::select_nth_unstable` does not have such a fallback strategy, which leads to it having a worst case running time of O(n²) instead. #102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to calculate its median, whereas a pathological input of the same length takes over 2.5s.
This PR adds an iteration limit to `select_nth_unstable`, falling back to heapsort, which ensures an O(n log n) worst case running time (introselect). With this change, there was no noticeable slowdown for the random input, but the same pathological input now takes only ~1.2ms.
In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.
I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that #102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
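For illustration, a minimal sketch of the introselect pattern described above (not the library's implementation, which reuses its existing partition and heapsort routines): quickselect runs under an iteration budget and falls back to a method with a guaranteed O(n log n) worst case once the budget is exhausted.
```rust
use core::cmp::Ordering;

fn select_nth_sketch<T: Ord>(v: &mut [T], nth: usize) {
    assert!(nth < v.len());
    // Budget of roughly 2 * log2(len) quickselect iterations.
    let mut limit = 2 * (usize::BITS - v.len().leading_zeros());
    let (mut lo, mut hi) = (0, v.len());
    while hi - lo > 1 {
        if limit == 0 {
            // Fallback with a guaranteed O(n log n) worst case; the PR uses
            // heapsort here. Sorting the remaining range puts the nth element
            // in its final position.
            v[lo..hi].sort_unstable();
            return;
        }
        limit -= 1;
        // Simple Lomuto partition around the last element of the range.
        let pivot = hi - 1;
        let mut store = lo;
        for i in lo..pivot {
            if v[i] < v[pivot] {
                v.swap(i, store);
                store += 1;
            }
        }
        v.swap(store, pivot);
        match store.cmp(&nth) {
            Ordering::Equal => return,
            Ordering::Greater => hi = store,
            Ordering::Less => lo = store + 1,
        }
    }
}
```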
Adjust inlining attributes around panic_immediate_abort
The goal of `panic_immediate_abort` is to permit the panic runtime and formatting code paths to be optimized away. But while poking through some disassembly of a small program compiled with that option, I found that was not the case. Enabling LTO did address that specific issue, but enabling LTO is a steep price to pay for this feature doing its job.
This PR fixes that, by tweaking two things:
* All the slice indexing functions that we `const_eval_select` on get `#[inline]`. `objdump -dC` told me that originally some `_ct` functions could end up in an executable. I won't pretend to understand what's going on there.
* Normalize attributes across all `panic!` wrappers: use `inline(never)` + `cold` normally, and `inline` when `panic_immediate_abort` is enabled (a sketch of the pattern follows this list).
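A sketch of that attribute pattern in a user crate, using a hypothetical Cargo feature named `immediate_abort` (the real change applies to the `panic_immediate_abort` std feature and the library's internal panic wrappers):
```rust
// Cold and never inlined normally, but inlinable when the abort-on-panic
// configuration is active, so the panic/formatting path can be optimized away
// at the caller.
#[cfg_attr(not(feature = "immediate_abort"), inline(never), cold)]
#[cfg_attr(feature = "immediate_abort", inline)]
pub fn bounds_check_failed(index: usize, len: usize) -> ! {
    if cfg!(feature = "immediate_abort") {
        std::process::abort();
    }
    panic!("index {index} out of range for slice of length {len}")
}
```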
But also, with LTO and `panic_immediate_abort` enabled, this patch knocks ~709 kB out of the `.text` segment of `librustc_driver.so`. That is slightly surprising to me, my best theory is that this shifts some inlining earlier in compilation, enabling some subsequent optimizations. The size improvement of `librustc_driver.so` with `panic_immediate_abort` due to this patch is greater with LTO than without LTO, which I suppose backs up this theory.
I do not know how to test this. I would quite like to, because I think what this is solving was an accidental regression. This only works with `-Zbuild-std`, which is a Cargo flag and thus can't be used in a rustc codegen test.
r? `@thomcc`
---
I do not seriously think anyone is going to use a compiler built with `panic_immediate_abort`, but I wanted a big complicated Rust program to try this out on, and the compiler is one.
Add slice methods for indexing via an array of indices.
Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.
---
# Update:
- Based on feedback, removed the `&[T]` variant of this API, and removed the requirements for the indices to be sorted.
# Description
This adds the following slice methods to `core`:
```rust
impl<T> [T] {
    pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
    pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Option<[&mut T; N]>;
}
```
This allows creating multiple mutable references to disjoint positions in a slice, which previously required writing some awkward code with `split_at_mut()` or `iter_mut()`. For the bounds-checked variant, the indices are checked against each other and against the bounds of the slice, which requires `N * (N + 1) / 2` comparison operations.
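As a sketch of what that check amounts to (a hypothetical helper, not the actual implementation): each index is compared against the slice length and against every later index.
```rust
// N bounds checks plus N*(N-1)/2 pairwise overlap checks = N*(N+1)/2 comparisons.
fn indices_are_valid<const N: usize>(indices: &[usize; N], len: usize) -> bool {
    for (i, &idx) in indices.iter().enumerate() {
        if idx >= len {
            return false;
        }
        for &later in &indices[i + 1..] {
            if idx == later {
                return false;
            }
        }
    }
    true
}
```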
This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many
Care has been taken that the implementation passes miri borrow checks, and generates straight-forward assembly (though this was only checked on x86_64).
# Example
```rust
let v = &mut [1, 2, 3, 4];
let [a, b] = v.get_many_mut([0, 2]).unwrap();
std::mem::swap(a, b);
*b += 100;
assert_eq!(v, &[3, 2, 101, 4]);
```
# Codegen Examples
<details>
<summary>Click to expand!</summary>
Disclaimer: Taken from local tests with the standalone implementation.
## Unchecked Indexing:
```rust
pub unsafe fn example_unchecked(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    slice.get_many_unchecked_mut(indices)
}
```
```nasm
example_unchecked:
mov rcx, qword, ptr, [r9]
mov r8, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
lea rcx, [rdx, +, 8*rcx]
lea r8, [rdx, +, 8*r8]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rax], rcx
mov qword, ptr, [rax, +, 8], r8
mov qword, ptr, [rax, +, 16], rdx
ret
```
## Checked Indexing (Option):
```rust
pub unsafe fn example_option(slice: &mut [usize], indices: [usize; 3]) -> Option<[&mut usize; 3]> {
    slice.get_many_mut(indices)
}
```
```nasm
mov r10, qword, ptr, [r9, +, 8]
mov rcx, qword, ptr, [r9, +, 16]
cmp rcx, r10
je .LBB0_7
mov r9, qword, ptr, [r9]
cmp rcx, r9
je .LBB0_7
cmp rcx, r8
jae .LBB0_7
cmp r10, r9
je .LBB0_7
cmp r9, r8
jae .LBB0_7
cmp r10, r8
jae .LBB0_7
lea r8, [rdx, +, 8*r9]
lea r9, [rdx, +, 8*r10]
lea rcx, [rdx, +, 8*rcx]
mov qword, ptr, [rax], r8
mov qword, ptr, [rax, +, 8], r9
mov qword, ptr, [rax, +, 16], rcx
ret
.LBB0_7:
mov qword, ptr, [rax], 0
ret
```
## Checked Indexing (Panic):
```rust
pub fn example_panic(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    let len = slice.len();
    match slice.get_many_mut(indices) {
        Some(s) => s,
        None => {
            let tmp = indices;
            index_many::sorted_bound_check_failed(&tmp, len)
        }
    }
}
```
```nasm
example_panic:
sub rsp, 56
mov rax, qword, ptr, [r9]
mov r10, qword, ptr, [r9, +, 8]
mov r9, qword, ptr, [r9, +, 16]
cmp r9, r10
je .LBB0_6
cmp r9, rax
je .LBB0_6
cmp r9, r8
jae .LBB0_6
cmp r10, rax
je .LBB0_6
cmp rax, r8
jae .LBB0_6
cmp r10, r8
jae .LBB0_6
lea rax, [rdx, +, 8*rax]
lea r8, [rdx, +, 8*r10]
lea rdx, [rdx, +, 8*r9]
mov qword, ptr, [rcx], rax
mov qword, ptr, [rcx, +, 8], r8
mov qword, ptr, [rcx, +, 16], rdx
mov rax, rcx
add rsp, 56
ret
.LBB0_6:
mov qword, ptr, [rsp, +, 32], rax
mov qword, ptr, [rsp, +, 40], r10
mov qword, ptr, [rsp, +, 48], r9
lea rcx, [rsp, +, 32]
mov edx, 3
call index_many::bound_check_failed
ud2
```
</details>
# Extensions
There are multiple optional extensions to this.
## Indexing With Ranges
This could easily be expanded to allow indexing with `[I; N]` where `I: SliceIndex<Self>`. I wanted to keep the initial implementation simple, so I didn't include it yet.
## Panicking Variant
We could also add this method:
```rust
impl<T> [T] {
    fn index_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
}
```
This would work similarly to the regular index operator and panic on out-of-bounds indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the `Option` variant.
This is implemented in the standalone crate, and it is used as the basis for the codegen examples above.
Use a faster allocation size check in slice::from_raw_parts
I've been perusing the codegen changes that result from turning on the standard library debug assertions. The previous check here uses saturating arithmetic, which in my experience sometimes makes LLVM just fail to optimize things around the saturating operation.
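Roughly, the new-style check boils down to a single comparison against a precomputed maximum length (an illustrative version, not the library's exact code):
```rust
use core::mem;

// Instead of multiplying `len` by the element size with overflow-aware
// arithmetic and testing the product, compare `len` once against the largest
// length whose total size still fits in `isize::MAX`.
pub fn len_check_new<T>(len: usize) -> bool {
    let max_len = if mem::size_of::<T>() == 0 {
        usize::MAX
    } else {
        isize::MAX as usize / mem::size_of::<T>()
    };
    len <= max_len
}
```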
Here is a demo of the codegen difference: https://godbolt.org/z/WMEqrjajW
Before:
```asm
example::len_check_old:
mov rax, rdi
mov ecx, 3
mul rcx
setno cl
test rax, rax
setns al
and al, cl
ret
example::len_check_old:
mov rax, rdi
mov ecx, 8
mul rcx
setno cl
test rax, rax
setns al
and al, cl
ret
```
After:
```asm
example::len_check_new:
movabs rax, 3074457345618258603
cmp rdi, rax
setb al
ret
example::len_check_new:
shr rdi, 60
sete al
ret
```
Running rustc-perf locally, this looks like up to a 4.5% improvement when `debug-assertions-std = true`.
Thanks `@LegionMammal978` (I think that's you?) for turning my idea into a much cleaner implementation.
r? `@thomcc`
Use ptr::metadata in <[T]>::len implementation
This avoids duplication of ptr::metadata code.
I believe this is acceptable, as the previous approach essentially duplicated `ptr::metadata` because back then the `rustc_allow_const_fn_unstable` annotation did not exist.
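A small illustration of the idea (this requires the unstable `ptr_metadata` feature on nightly and is not the library's code, which relies on internal const-stability machinery): the length of a slice is exactly its pointer metadata.
```rust
#![feature(ptr_metadata)]

// The metadata of a `*const [T]` is its length as a `usize`.
pub fn slice_len<T>(s: &[T]) -> usize {
    core::ptr::metadata(s as *const [T])
}
```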
I would like somebody to ping `@rust-lang/wg-const-eval` as the documentation says:
> Always ping `@rust-lang/wg-const-eval` if you are adding more rustc_allow_const_fn_unstable attributes to any const fn.
More slice::partition_point examples
After seeing the discussion of `binary_search` vs `partition_point` in #101999, I thought some more example code could be helpful.
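For illustration (not necessarily one of the examples added by the PR), `partition_point` returns the first index at which the predicate stops holding, i.e. the insertion point in a sorted slice:
```rust
fn main() {
    let v = [1, 2, 3, 5, 6, 8];
    // Predicate holds for 1, 2, 3 (indices 0..3) and fails starting at 5.
    let idx = v.partition_point(|&x| x < 5);
    assert_eq!(idx, 3); // 4 or 5 could be inserted at index 3, keeping `v` sorted
}
```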