Commit Graph

474 Commits

Author SHA1 Message Date
zhangyunhao
e107ca0f0b Optimize break patterns 2023-02-22 16:02:35 +00:00
Dylan DPC
e802713941
Rollup merge of #106933 - schuelermine:fix/doc/102451, r=Amanieu
Update documentation of select_nth_unstable and select_nth_unstable_by and select_nth_unstable_by_key to state O(n log n) worst case complexity

See #102451
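
For reference, a quick usage sketch of the API whose documentation is being updated (illustrative, not taken from the PR): after the call, the element at the given index is the one sorted order would place there, with smaller-or-equal elements before it and greater-or-equal elements after it.
```rust
fn main() {
    let mut v = [5_u32, 1, 4, 2, 3];
    // Reorder `v` so that index 2 holds the element sorted order would put there.
    let (before, median, after) = v.select_nth_unstable(2);
    assert_eq!(*median, 3);
    assert!(before.iter().all(|&x| x <= 3) && after.iter().all(|&x| x >= 3));
}
```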
2023-02-19 13:03:40 +05:30
Anselm Schüler
f1e649b378 Update documentation of select_nth_unstable and select_nth_unstable_by and select_nth_unstable_by_key to state O(n log n) worst case complexity
Also remove erroneous / in doc comment
2023-02-18 16:18:34 +01:00
bors
2d91939bb7 Auto merge of #107634 - scottmcm:array-drain, r=thomcc
Improve the `array::map` codegen

The `map` method on arrays [is documented as sometimes performing poorly](https://doc.rust-lang.org/std/primitive.array.html#note-on-performance-and-stack-usage), and after [a question on URLO](https://users.rust-lang.org/t/try-trait-residual-o-trait-and-try-collect-into-array/88510?u=scottmcm) prompted me to take another look at the core [`try_collect_into_array`](7c46fb2111/library/core/src/array/mod.rs (L865-L912)) function, I had some ideas that ended up working better than I'd expected.

There are three main ideas in here, split over three commits:
1. Don't use `array::IntoIter` when we can avoid it, since that seems to not get SRoA'd, meaning that every step writes things like loop counters into the stack unnecessarily
2. Don't return arrays in `Result`s unnecessarily, as that doesn't seem to optimize away even with `unwrap_unchecked` (perhaps because it needs to get moved into a new LLVM type to account for the discriminant)
3. Don't distract LLVM with all the `Option` dances when we know for sure we have enough items (like in `map` and `zip`).  This one's a larger commit as to do it I ended up adding a new `pub(crate)` trait, but hopefully those changes are still straight-forward.

(No libs-api changes; everything should be completely implementation-detail-internal.)

It's still not completely fixed -- I think it needs pcwalton's `memcpy` optimizations still (#103830) to get further -- but this seems to go much better than before.  And the remaining `memcpy`s are just `transmute`-equivalent (`[T; N] -> ManuallyDrop<[T; N]>` and `[MaybeUninit<T>; N] -> [T; N]`), so hopefully those will be easier to remove with LLVM16 than the previous subobject copies 🤞

r? `@thomcc`

As a simple example, this test
```rust
pub fn long_integer_map(x: [u32; 64]) -> [u32; 64] {
    x.map(|x| 13 * x + 7)
}
```
On nightly <https://rust.godbolt.org/z/xK7548TGj> takes `sub rsp, 808`
```llvm
start:
  %array.i.i.i.i = alloca [64 x i32], align 4
  %_3.sroa.5.i.i.i = alloca [65 x i32], align 4
  %_5.i = alloca %"core::iter::adapters::map::Map<core::array::iter::IntoIter<u32, 64>, [closure@/app/example.rs:2:11: 2:14]>", align 8
```
(and yes, that's a 6**5**-element array `alloca` despite 6**4**-element input and output)

But with this PR it's only `sub rsp, 520`
```llvm
start:
  %array.i.i.i.i.i.i = alloca [64 x i32], align 4
  %array1.i.i.i = alloca %"core::mem::manually_drop::ManuallyDrop<[u32; 64]>", align 4
```

Similarly, the loop it emits on nightly is scalar-only and horrifying
```nasm
.LBB0_1:
        mov     esi, 64
        mov     edi, 0
        cmp     rdx, 64
        je      .LBB0_3
        lea     rsi, [rdx + 1]
        mov     qword ptr [rsp + 784], rsi
        mov     r8d, dword ptr [rsp + 4*rdx + 528]
        mov     edi, 1
        lea     edx, [r8 + 2*r8]
        lea     r8d, [r8 + 4*rdx]
        add     r8d, 7
.LBB0_3:
        test    edi, edi
        je      .LBB0_11
        mov     dword ptr [rsp + 4*rcx + 272], r8d
        cmp     rsi, 64
        jne     .LBB0_6
        xor     r8d, r8d
        mov     edx, 64
        test    r8d, r8d
        jne     .LBB0_8
        jmp     .LBB0_11
.LBB0_6:
        lea     rdx, [rsi + 1]
        mov     qword ptr [rsp + 784], rdx
        mov     edi, dword ptr [rsp + 4*rsi + 528]
        mov     r8d, 1
        lea     esi, [rdi + 2*rdi]
        lea     edi, [rdi + 4*rsi]
        add     edi, 7
        test    r8d, r8d
        je      .LBB0_11
.LBB0_8:
        mov     dword ptr [rsp + 4*rcx + 276], edi
        add     rcx, 2
        cmp     rcx, 64
        jne     .LBB0_1
```

whereas with this PR it's unrolled and vectorized
```nasm
	vpmulld	ymm1, ymm0, ymmword ptr [rsp + 64]
	vpaddd	ymm1, ymm1, ymm2
	vmovdqu	ymmword ptr [rsp + 328], ymm1
	vpmulld	ymm1, ymm0, ymmword ptr [rsp + 96]
	vpaddd	ymm1, ymm1, ymm2
	vmovdqu	ymmword ptr [rsp + 360], ymm1
```
(though sadly still stack-to-stack)
2023-02-13 10:18:48 +00:00
bors
96834f0231 Auto merge of #107191 - Voultapher:reverse-timsort-scan-direction, r=thomcc
Reverse Timsort scan direction

Another PR in the series of stable sort improvements. Best reviewed by looking at the individual commits.

The main perf gain here is for fully ascending (sorted) or reversed inputs of cheap-to-compare types such as `u64`; these see a ~1.5x speedup.

![timsort_evo2_hot_u64_10k](https://user-images.githubusercontent.com/6864584/213913351-cfdf452f-a37c-4bc6-a811-d10c60e66eca.png)

![timsort_evo2_hot_string_10k](https://user-images.githubusercontent.com/6864584/213913354-d9cc395a-2b48-4f54-b687-09174b9e35ce.png)

Types such as string with indirect pre-fetching see only minor changes. Further speedups are planned in future PRs, so I wouldn't spend too much time on benchmarks here.
2023-02-13 04:06:04 +00:00
Lukas Bergdoll
ee0376c368 Split branches in heapsort child selection
This allows even better code-gen (cmp + adc), while also more clearly
communicating the intent.
2023-02-11 09:32:52 +01:00
Lukas Bergdoll
7e072199a6 Speedup heapsort by 1.5x by making it branchless
`slice::sort_unstable` will fall back to heapsort if it repeatedly fails to find
a good pivot. By making the core child update code branchless it is much faster.
On Zen3 sorting 10k `u64` and forcing the sort to pick heapsort, results in:

455us -> 278us
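
As a rough sketch of the branchless child update described above (illustrative, assuming `u64` elements; not the actual std code), the larger child's index can be computed with a bool-to-usize add instead of an if/else on the data:
```rust
fn sift_down(v: &mut [u64], mut node: usize) {
    loop {
        let mut child = 2 * node + 1;
        if child >= v.len() {
            break;
        }
        // Pick the larger of the two children without branching on the data.
        child += (child + 1 < v.len() && v[child] < v[child + 1]) as usize;
        if v[node] >= v[child] {
            break;
        }
        v.swap(node, child);
        node = child;
    }
}

fn heapsort(v: &mut [u64]) {
    // Build a max-heap, then repeatedly move the maximum to the end.
    for i in (0..v.len() / 2).rev() {
        sift_down(v, i);
    }
    for end in (1..v.len()).rev() {
        v.swap(0, end);
        sift_down(&mut v[..end], 0);
    }
}
```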
2023-02-10 18:05:12 +01:00
Michael Goulet
aee4570adf
Rollup merge of #107429 - tgross35:from-bytes-until-null-stabilization, r=dtolnay
Stabilize feature `cstr_from_bytes_until_nul`

This PR seeks to stabilize `cstr_from_bytes_until_nul`.

Partially addresses #95027

This function has only been on nightly for about 10 months, but I think it is simple enough that there isn't much harm in discussing stabilization. It has also had at least a handful of mentions on both the user forum and the Discord, so it seems like it's already in use or at least known.

This needs FCP still.

Comment on potential discussion points:
- eventual conversion of `CStr` to be a single thin pointer: this function will still be useful to provide a safe way to create a `CStr` after this change.
- should this return a length too, to address concerns about the `CStr` change? I don't see it as being particularly useful, and it seems less ergonomic (i.e. returning `Result<(&CStr, usize), FromBytesUntilNulError>`). I think users that also need this length without the additional `strlen` call are likely better off using a combination of other methods, but this is up for discussion
- `CString::from_vec_until_nul`: this is also useful, but it doesn't even have a nightly implementation merged yet. I propose feature gating that separately, as opposed to blocking this `CStr` implementation on that

Possible alternatives:

A user can use `from_bytes_with_nul` on a slice up to and including the first nul, i.e. `my_slice[..=my_slice.iter().position(|&c| c == 0).unwrap()]`. However, that is significantly less ergonomic, and is a bit more work for the compiler to optimize compared to the direct `memchr` call that this wraps.

## New stable API

```rs
// both in core::ffi

pub struct FromBytesUntilNulError(());

impl CStr {
    pub const fn from_bytes_until_nul(
        bytes: &[u8]
    ) -> Result<&CStr, FromBytesUntilNulError>
}
```
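
A small usage sketch of the API being stabilized (illustrative, not from the PR):
```rust
use core::ffi::CStr;

fn main() {
    // Everything after the first nul byte is ignored.
    let bytes = b"hello\0 trailing junk";
    let cstr = CStr::from_bytes_until_nul(bytes).expect("no nul byte found");
    assert_eq!(cstr.to_str(), Ok("hello"));
}
```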

cc ```@ericseppanen``` original author, ```@Mark-Simulacrum``` original reviewer, ```@m-ou-se``` brought up some issues on the thin pointer CStr

```@rustbot``` modify labels: +T-libs-api +needs-fcp
2023-02-08 20:01:24 -08:00
Scott McMurray
5bc328fdef Allow canonicalizing the array::map loop in trusted cases 2023-02-04 16:44:51 -08:00
Trevor Gross
877e9f5d3a Change 'from_bytes_until_nul' to const stable 2023-02-01 02:14:07 -05:00
Lukas Markeffsky
2fbe9274aa improve panic message for slice windows and chunks 2023-01-31 23:49:42 +01:00
Lukas Bergdoll
5eff264533 Document missing unsafe blocks 2023-01-23 09:12:25 +01:00
Lukas Bergdoll
f297afa0c9 Flip scanning direction of stable sort
Memory pre-fetching prefers forward scanning over backwards scanning, and the
code-gen is usually better. For the most sensitive types such as integers, runs
are planned to be merged bidirectionally at once, so there is no benefit in
scanning backwards.

The largest perf gains are seen for fully ascending and descending inputs, which
see 1.5x speedups. Random inputs benefit too, and some patterns can lose out,
but these losses are minimal.
2023-01-22 12:01:06 +01:00
Lukas Bergdoll
a3065a1a34 Unify insertion sort implementations
Avoid duplicate insertion sort implementations.
Optimize implementations.
2023-01-22 11:55:35 +01:00
Lukas Bergdoll
703ff60d9f Use NonNull in merge_sort
This makes the intent of the pointer clearer and avoids problems
if the allocation returns a null pointer.
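
A small sketch of the pattern (the callback name below is made up, not the actual merge_sort parameter): wrapping the caller-provided allocation result in `NonNull` surfaces a null return immediately instead of letting it flow into pointer arithmetic.
```rust
use core::ptr::NonNull;

// Hypothetical helper: `elem_alloc_fn` stands in for the allocation callback.
fn scratch_buffer<T>(len: usize, elem_alloc_fn: impl Fn(usize) -> *mut T) -> NonNull<T> {
    NonNull::new(elem_alloc_fn(len)).expect("allocation function returned a null pointer")
}
```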
2023-01-21 10:17:06 +01:00
Michael Goulet
68b390ae2a
Rollup merge of #104672 - Voultapher:unify-sort-modules, r=thomcc
Unify stable and unstable sort implementations in same core module

This moves the stable sort implementation to the core::slice::sort module. By virtue of being in core it can't access `Vec`. The two `Vec` used by merge sort, `buf` and `runs`, are modelled as custom types that implement the very limited required `Vec` interface with the help of provided allocation and free functions. This is done to allow future re-use of functions and logic between stable and unstable sort, such as `insert_head`.

This is in preparation for #100856 and #104116. It only moves code; it *doesn't* change any of the sort-related logic. This unlocks the ability to share `insert_head`, `insert_tail`, `swap_if_less`, `merge` and more.

Tagging ````@Mark-Simulacrum````. I hope this allows progress on #100856; by moving `merge_sort` here I hope future changes will be easier to review.
2023-01-20 21:33:21 -05:00
Matthias Krüger
788671c1c6
Rollup merge of #106997 - Sp00ph:introselect, r=scottmcm
Add heapsort fallback in `select_nth_unstable`

Addresses #102451 and #106933.

`slice::select_nth_unstable` uses a quick select implementation based on the same pattern-defeating quicksort algorithm that `slice::sort_unstable` uses. `slice::sort_unstable` uses a recursion limit and falls back to heapsort if there were too many bad pivot choices, to ensure O(n log n) worst case running time (known as introsort). However, `slice::select_nth_unstable` does not have such a fallback strategy, which leads to it having a worst case running time of O(n²) instead. #102451 links to a playground which generates pathological inputs that show this quadratic behavior. On my machine, a randomly generated slice of length `1 << 19` takes ~200µs to calculate its median, whereas a pathological input of the same length takes over 2.5s. This PR adds an iteration limit to `select_nth_unstable`, falling back to heapsort, which ensures an O(n log n) worst case running time (introselect). With this change, there was no noticeable slowdown for the random input, but the same pathological input now takes only ~1.2ms. In the future it might be worth implementing something like Median of Medians or Fast Deterministic Selection instead, which guarantee O(n) running time for all possible inputs. I've left this as a `FIXME` for now and only implemented the heapsort fallback to minimize the needed code changes.

I still think we should clarify in the `select_nth_unstable` docs that the worst case running time isn't currently O(n) (the original reason that #102451 was opened), but I think it's a lot better to be able to guarantee O(n log n) instead of O(n²) for the worst case.
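
A much-simplified sketch of the introselect idea described above (illustrative only, using a basic Lomuto partition and `sort_unstable` as a stand-in for the heapsort fallback; this is not the library code):
```rust
// Lomuto partition around the last element; returns the pivot's final index.
fn partition(v: &mut [i32]) -> usize {
    let pivot = v.len() - 1;
    let mut store = 0;
    for i in 0..pivot {
        if v[i] <= v[pivot] {
            v.swap(i, store);
            store += 1;
        }
    }
    v.swap(store, pivot);
    store
}

fn introselect(v: &mut [i32], nth: usize) {
    assert!(nth < v.len());
    let (mut lo, mut hi) = (0, v.len());
    // Allow roughly 2 * log2(len) partitioning rounds before giving up.
    let mut limit = 2 * (usize::BITS - v.len().leading_zeros());
    while hi - lo > 1 {
        if limit == 0 {
            // Pathological input: switch to an O(n log n) fallback
            // (std uses heapsort here).
            v[lo..hi].sort_unstable();
            return;
        }
        limit -= 1;
        let mid = lo + partition(&mut v[lo..hi]);
        if nth == mid {
            return;
        } else if nth < mid {
            hi = mid;
        } else {
            lo = mid + 1;
        }
    }
}
```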
2023-01-18 06:59:22 +01:00
Matthias Krüger
0ed2549802
Rollup merge of #106889 - scottmcm:windows-mut, r=cuviper
Mention the lack of `windows_mut` in `windows`

This is a common request, going back to at least 2015 (#23783), so mention in the docs that it can't be done and offer a workaround using <https://doc.rust-lang.org/std/cell/struct.Cell.html#method.as_slice_of_cells>.

(See also URLO threads like <https://internals.rust-lang.org/t/a-windows-mut-method-on-slice/16941/10?u=scottmcm>.)
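
For illustration, a minimal sketch of that `Cell`-based workaround (the function and numbers are made up, not from the docs): overlapping windows of `&Cell<T>` are fine because `Cell` provides interior mutability without aliasing `&mut` references.
```rust
use std::cell::Cell;

fn add_one_to_each_window(slice: &mut [i32]) {
    let cells = Cell::from_mut(slice).as_slice_of_cells();
    for window in cells.windows(3) {
        for c in window {
            c.set(c.get() + 1);
        }
    }
}

fn main() {
    let mut v = [0; 5];
    add_one_to_each_window(&mut v);
    // Middle elements are covered by more windows than the ends.
    assert_eq!(v, [1, 2, 3, 2, 1]);
}
```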
2023-01-17 20:21:27 +01:00
Markus Everling
273c6c3913 Add heapsort fallback in select_nth_unstable 2023-01-17 19:38:37 +01:00
The 8472
9db0134018 replace manual ptr arithmetic with ptr_sub 2023-01-15 17:38:05 +01:00
Scott McMurray
38917ee9e9 Mention the lack of windows_mut in windows 2023-01-14 15:31:32 -08:00
André Vennberg
2fea03f5e6 Fix some missed double spaces. 2023-01-14 18:26:38 +01:00
André Vennberg
0b35f448f8 Remove various double spaces in source comments. 2023-01-14 17:22:04 +01:00
jonathanCogan
72067c77bd Replace libstd, libcore, liballoc in docs. 2022-12-30 14:00:40 +01:00
Yuki Okushi
342d1b7f01
Rollup merge of #105584 - raffimolero:patch-1, r=JohnTitor
add assert messages if chunks/windows are length 0
2022-12-22 08:32:09 +09:00
Scott McMurray
a37d42133c Another as_chunks example
I really liked this structure that dtolnay brought up in #105316, so I wanted to put it in the docs to help others use it.
2022-12-17 18:41:14 -08:00
Hannes Körber
9671dd239d doc: Fix a few small issues
* A few typos around generic types (`;` vs `,`)
* Use inline code formatting for code fragments
* One instance of wrong wording
2022-12-15 14:05:03 +01:00
raffimolero
46f6e39ac6
add assert messages if chunks/windows are length 0 2022-12-12 12:28:40 +08:00
bors
f058493307 Auto merge of #105262 - eduardosm:more-inline-always, r=thomcc
Make some trivial functions `#[inline(always)]`

This is some kind of follow-up of PRs like https://github.com/rust-lang/rust/pull/85218, https://github.com/rust-lang/rust/pull/84061, https://github.com/rust-lang/rust/pull/87150. Functions that do very basic operations are made `#[inline(always)]` to avoid pessimizing them in debug builds when compared to using built-in operations directly.
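
As a sketch of the pattern being applied (the type below is made up, not one of the touched functions): a trivial accessor gets `#[inline(always)]` so a debug build doesn't pay a function call for what is essentially a single instruction.
```rust
pub struct Wrapper(u32);

impl Wrapper {
    #[inline(always)]
    pub fn get(&self) -> u32 {
        self.0
    }
}
```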
2022-12-09 15:42:18 +00:00
Eduardo Sánchez Muñoz
00e7b54d46 Make some trivial functions #[inline(always)] 2022-12-07 17:11:17 +01:00
Ralf Jung
ee21454e61 attempt to clarify align_to docs 2022-12-05 11:37:55 +01:00
bors
32e613bbaa Auto merge of #104999 - saethlin:immediate-abort-inlining, r=thomcc
Adjust inlining attributes around panic_immediate_abort

The goal of `panic_immediate_abort` is to permit the panic runtime and formatting code paths to be optimized away. But while poking through some disassembly of a small program compiled with that option, I found that was not the case. Enabling LTO did address that specific issue, but enabling LTO is a steep price to pay for this feature doing its job.

This PR fixes that, by tweaking two things:
* All the slice indexing functions that we `const_eval_select` on get `#[inline]`. `objdump -dC` told me that originally some `_ct` functions could end up in an executable. I won't pretend to understand what's going on there.
* Normalize attributes across all `panic!` wrappers: use `inline(never) + cold` normally, and `inline` when `panic_immediate_abort` is enabled.
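
As a sketch of the attribute scheme in the second bullet (the function below is hypothetical, not the std source):
```rust
// Cold and never inlined normally; inlined when panic_immediate_abort is
// enabled so the whole panic path can be optimized away.
#[cfg_attr(not(feature = "panic_immediate_abort"), inline(never), cold)]
#[cfg_attr(feature = "panic_immediate_abort", inline)]
fn slice_index_fail(index: usize, len: usize) -> ! {
    panic!("index {index} out of range for slice of length {len}")
}
```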

But also, with LTO and `panic_immediate_abort` enabled, this patch knocks ~709 kB out of the `.text` segment of `librustc_driver.so`. That is slightly surprising to me, my best theory is that this shifts some inlining earlier in compilation, enabling some subsequent optimizations. The size improvement of `librustc_driver.so` with `panic_immediate_abort` due to this patch is greater with LTO than without LTO, which I suppose backs up this theory.

I do not know how to test this. I would quite like to, because I think what this is solving was an accidental regression. This only works with `-Zbuild-std` which is a cargo flag, and thus can't be used in a rustc codegen test.

r? `@thomcc`

---

I do not seriously think anyone is going to use a compiler built with `panic_immediate_abort`, but I wanted a big complicated Rust program to try this out on, and the compiler is such.
2022-12-02 20:07:23 +00:00
Ben Kimock
906c3601fa Adjust inlining attributes around panic_immediate_abort 2022-11-29 09:24:01 -05:00
Manish Goregaokar
1dd515f273
Rollup merge of #83608 - Kimundi:index_many, r=Mark-Simulacrum
Add slice methods for indexing via an array of indices.

Disclaimer: It's been a while since I contributed to the main Rust repo, apologies in advance if this is large enough already that it should've been an RFC.

---

# Update:

- Based on feedback, removed the `&[T]` variant of this API, and removed the requirements for the indices to be sorted.

# Description

This adds the following slice methods to `core`:

```rust
impl<T> [T] {
    pub unsafe fn get_many_unchecked_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
    pub fn get_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> Option<[&mut T; N]>;
}
```

This allows creating multiple mutable references to disjoint positions in a slice, which previously required writing some awkward code with `split_at_mut()` or `iter_mut()`. For the bound-checked variant, the indices are checked against each other and against the bounds of the slice, which requires `N * (N + 1) / 2` comparison operations.

This has a proof-of-concept standalone implementation here: https://crates.io/crates/index_many

Care has been taken that the implementation passes miri borrow checks, and generates straight-forward assembly (though this was only checked on x86_64).

# Example

```rust
let v = &mut [1, 2, 3, 4];
let [a, b] = v.get_many_mut([0, 2]).unwrap();
std::mem::swap(a, b);
*b += 100;
assert_eq!(v, &[3, 2, 101, 4]);
```

# Codegen Examples

<details>
  <summary>Click to expand!</summary>

Disclaimer: Taken from local tests with the standalone implementation.

## Unchecked Indexing:

```rust
pub unsafe fn example_unchecked(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    slice.get_many_unchecked_mut(indices)
}
```

```nasm
example_unchecked:
 mov     rcx, qword, ptr, [r9]
 mov     r8, qword, ptr, [r9, +, 8]
 mov     r9, qword, ptr, [r9, +, 16]
 lea     rcx, [rdx, +, 8*rcx]
 lea     r8, [rdx, +, 8*r8]
 lea     rdx, [rdx, +, 8*r9]
 mov     qword, ptr, [rax], rcx
 mov     qword, ptr, [rax, +, 8], r8
 mov     qword, ptr, [rax, +, 16], rdx
 ret
```

## Checked Indexing (Option):

```rust
pub unsafe fn example_option(slice: &mut [usize], indices: [usize; 3]) -> Option<[&mut usize; 3]> {
    slice.get_many_mut(indices)
}
```

```nasm
 mov     r10, qword, ptr, [r9, +, 8]
 mov     rcx, qword, ptr, [r9, +, 16]
 cmp     rcx, r10
 je      .LBB0_7
 mov     r9, qword, ptr, [r9]
 cmp     rcx, r9
 je      .LBB0_7
 cmp     rcx, r8
 jae     .LBB0_7
 cmp     r10, r9
 je      .LBB0_7
 cmp     r9, r8
 jae     .LBB0_7
 cmp     r10, r8
 jae     .LBB0_7
 lea     r8, [rdx, +, 8*r9]
 lea     r9, [rdx, +, 8*r10]
 lea     rcx, [rdx, +, 8*rcx]
 mov     qword, ptr, [rax], r8
 mov     qword, ptr, [rax, +, 8], r9
 mov     qword, ptr, [rax, +, 16], rcx
 ret
.LBB0_7:
 mov     qword, ptr, [rax], 0
 ret
```

## Checked Indexing (Panic):

```rust
pub fn example_panic(slice: &mut [usize], indices: [usize; 3]) -> [&mut usize; 3] {
    let len = slice.len();
    match slice.get_many_mut(indices) {
        Some(s) => s,
        None => {
            let tmp = indices;
            index_many::sorted_bound_check_failed(&tmp, len)
        }
    }
}
```

```nasm
example_panic:
 sub     rsp, 56
 mov     rax, qword, ptr, [r9]
 mov     r10, qword, ptr, [r9, +, 8]
 mov     r9, qword, ptr, [r9, +, 16]
 cmp     r9, r10
 je      .LBB0_6
 cmp     r9, rax
 je      .LBB0_6
 cmp     r9, r8
 jae     .LBB0_6
 cmp     r10, rax
 je      .LBB0_6
 cmp     rax, r8
 jae     .LBB0_6
 cmp     r10, r8
 jae     .LBB0_6
 lea     rax, [rdx, +, 8*rax]
 lea     r8, [rdx, +, 8*r10]
 lea     rdx, [rdx, +, 8*r9]
 mov     qword, ptr, [rcx], rax
 mov     qword, ptr, [rcx, +, 8], r8
 mov     qword, ptr, [rcx, +, 16], rdx
 mov     rax, rcx
 add     rsp, 56
 ret
.LBB0_6:
 mov     qword, ptr, [rsp, +, 32], rax
 mov     qword, ptr, [rsp, +, 40], r10
 mov     qword, ptr, [rsp, +, 48], r9
 lea     rcx, [rsp, +, 32]
 mov     edx, 3
 call    index_many::bound_check_failed
 ud2
```
</details>

# Extensions

There are multiple optional extensions to this.

## Indexing With Ranges

This could easily be expanded to allow indexing with `[I; N]` where `I: SliceIndex<Self>`.  I wanted to keep the initial implementation simple, so I didn't include it yet.

## Panicking Variant

We could also add this method:

```rust
impl<T> [T] {
    fn index_many_mut<const N: usize>(&mut self, indices: [usize; N]) -> [&mut T; N];
}
```

This would work similarly to the regular index operator and panic with out-of-bound indices. The advantage would be that we could more easily ensure good codegen with a useful panic message, which is non-trivial with the `Option` variant.

This is implemented in the standalone implementation, and used as basis for the codegen examples here and there.
2022-11-22 01:26:05 -05:00
Lukas Bergdoll
4b5844fbe9 Document all unsafe blocks
There were several unsafe blocks in the existing implementation that
were not documented with a SAFETY comment.
2022-11-21 14:30:56 +01:00
Lukas Bergdoll
1ec59cdcd1 Remove debug unused 2022-11-21 14:20:31 +01:00
Lukas Bergdoll
dbc0ed2a10 Unify stable and unstable sort implementations in same core module
This moves the stable sort implementation to the core::slice::sort module. By
virtue of being in core it can't access `Vec`. The two `Vec` used by merge sort,
`buf` and `runs`, are modelled as custom types that implement the very limited
required `Vec` interface with the help of provided allocation and free
functions. This is done to allow future re-use of functions and logic between
stable and unstable sort, such as `insert_head`.
2022-11-20 20:35:40 +01:00
Felix S. Klock II
98993af828 add examples to chunks remainder methods. Also fixed some links to rchunk remainder methods. 2022-11-20 11:43:23 -05:00
Marvin Löbel
3fe37b8c6e Add get_many_mut methods to slice 2022-11-20 11:19:11 -05:00
Manish Goregaokar
8aca6ccedd
Rollup merge of #102977 - lukas-code:is-sorted-hrtb, r=m-ou-se
remove HRTB from `[T]::is_sorted_by{,_key}`

Changes the signature of `[T]::is_sorted_by{,_key}` to match `[T]::binary_search_by{,_key}` and make code like https://github.com/rust-lang/rust/issues/53485#issuecomment-885393452 compile.

Tracking issue: https://github.com/rust-lang/rust/issues/53485

~~Do we need an ACP for something like this?~~ Edit: Filed ACP here: https://github.com/rust-lang/libs-team/issues/121
2022-11-18 17:48:16 -05:00
Dylan DPC
64e737c07c
Rollup merge of #104111 - yancyribbens:add-mutable-to-the-description-for-as-simd-mut, r=scottmcm
rustdoc: Add mutable to the description

Add mutable to the description to differentiate [as_simd](https://github.com/rust-lang/rust/blob/master/library/core/src/slice/mod.rs#L3654) from [as_simd_mut](https://github.com/rust-lang/rust/blob/master/library/core/src/slice/mod.rs#L3654).
2022-11-09 19:21:24 +05:30
yancy
f67ee43fe3 rustdoc: Add mutable to the description 2022-11-07 17:02:48 +01:00
yancy
d62582f92a rustdoc: Add mutable to the description 2022-11-07 16:51:23 +01:00
Ben Kimock
458aaa5a23 Print the precondition we violated, and visible through output capture
Co-authored-by: Ralf Jung <post@ralfj.de>
2022-10-26 22:09:17 -04:00
Dylan DPC
8ed3a80b9a
Rollup merge of #103287 - saethlin:faster-len-check, r=thomcc
Use a faster allocation size check in slice::from_raw_parts

I've been perusing through the codegen changes that result from turning on the standard library debug assertions. The previous check in here uses saturating arithmetic, which in my experience sometimes makes LLVM just fail to optimize things around the saturating operation.

Here is a demo of the codegen difference: https://godbolt.org/z/WMEqrjajW
Before:
```asm
example::len_check_old:
        mov     rax, rdi
        mov     ecx, 3
        mul     rcx
        setno   cl
        test    rax, rax
        setns   al
        and     al, cl
        ret

example::len_check_old:
        mov     rax, rdi
        mov     ecx, 8
        mul     rcx
        setno   cl
        test    rax, rax
        setns   al
        and     al, cl
        ret
```
After:
```asm
example::len_check_new:
        movabs  rax, 3074457345618258603
        cmp     rdi, rax
        setb    al
        ret

example::len_check_new:
        shr     rdi, 60
        sete    al
        ret
```
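
At the Rust level, the two checks compare roughly as follows (an illustrative sketch, not the exact std code; `len` is the element count):
```rust
use core::mem::size_of;

// Old style: saturating multiply, then compare the byte size against isize::MAX.
fn len_check_old<T>(len: usize) -> bool {
    len.saturating_mul(size_of::<T>()) <= isize::MAX as usize
}

// New style: one comparison against a precomputed bound, which LLVM can lower
// to a single `cmp` (or a shift when the element size is a power of two).
fn len_check_new<T>(len: usize) -> bool {
    size_of::<T>() == 0 || len <= isize::MAX as usize / size_of::<T>()
}
```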

Running rustc-perf locally, this looks like up to a 4.5% improvement when `debug-assertions-std = true`.

Thanks ```@LegionMammal978``` (I think that's you?) for turning my idea into a much cleaner implementation.

r? ```@thomcc```
2022-10-26 11:29:53 +05:30
bors
56f132565e Auto merge of #100848 - xfix:use-metadata-for-slice-len, r=thomcc
Use ptr::metadata in <[T]>::len implementation

This avoids duplication of ptr::metadata code.

I believe this is acceptable as the previous approach essentially duplicated `ptr::metadata` because back then the `rustc_allow_const_fn_unstable` annotation did not exist.

I would like somebody to ping `@rust-lang/wg-const-eval` as the documentation says:

> Always ping `@rust-lang/wg-const-eval` if you are adding more rustc_allow_const_fn_unstable attributes to any const fn.
2022-10-24 04:14:46 +00:00
Ben Kimock
cfcb0a2135 Use a faster allocation size check in slice::from_raw_parts 2022-10-20 00:30:00 -04:00
Matthias Krüger
18431b66ce
Rollup merge of #102507 - scottmcm:more-binary-search-docs, r=m-ou-se
More slice::partition_point examples

After seeing the discussion of `binary_search` vs `partition_point` in #101999, I thought some more example code could be helpful.
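
For context, a minimal `partition_point` usage sketch (not taken from the PR):
```rust
fn main() {
    // Index of the first element for which the predicate is false,
    // i.e. the insertion point for 4 in this sorted slice.
    let v = [1, 2, 3, 5, 8, 13];
    let idx = v.partition_point(|&x| x < 4);
    assert_eq!(idx, 3);
}
```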
2022-10-18 21:18:46 +02:00
Scott McMurray
5b9a02a87d More slice::partition_point examples 2022-10-15 14:03:56 -07:00
Lukas Markeffsky
a02ec4cf18 remove HRTB from [T]::is_sorted_by{,_key} 2022-10-12 18:39:22 +02:00