Move to intra-doc links for library/core/src/alloc/{layout, global, mod}.rs
Helps with #75080. The files already contained intra-doc links, so there are only minor changes.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
Known issues:
* [`handle_alloc_error`]: Link from `core` to `alloc` could not be resolved.
* [`slice`]: slice is a primitive type, but could not be resolved; had to use [`crate::slice`] instead.
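For illustration, here is a hand-made sketch (placeholder item, not the actual diff) of what the conversion from URL-based doc links to intra-doc links looks like:

```rust
//! Illustrative sketch only; the doc comments are placeholders, not the PR's diff.

/// Old style: a hand-written URL relative to the generated documentation.
///
/// See [`Layout`] for the block's size and alignment.
///
/// [`Layout`]: ../../std/alloc/struct.Layout.html
pub fn url_style() {}

/// New style: an intra-doc link that rustdoc resolves from the item path,
/// so it keeps working even if the generated file layout changes.
///
/// See [`Layout`] for the block's size and alignment.
///
/// [`Layout`]: core::alloc::Layout
pub fn intra_doc_style() {}
```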
Move to intra-doc links for /library/core/src/intrinsics.rs
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
Known issues:
* The following `f32`/`f64` primitive methods cannot be resolved:
  `powi`, `sqrt`, `sin`, `cos`, `powf`, `exp`, `exp2`, `ln`, `log2`, `log10`,
  `mul_add`, `abs`, `copysign`, `floor`, `ceil`, `trunc`, `round`
* Links from `core` to `std`:
  * [`std::pointer::*`]
  * [`std::process::abort`]
  * [`from_raw_parts`]
  * [`Vec::append`]
* Links with anchors?
  I provided a separate commit that replaces links containing anchors with intra-doc links.
  The anchor information gets lost in the process, so it's questionable whether those links
  should actually be replaced.
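To illustrate the anchor point, a hypothetical pair of doc comments (not code from the PR): the URL form can target a section anchor, while the intra-doc form can only name the item.

```rust
/// URL style: the link can jump straight to a section anchor on the item's page.
///
/// See the [safety section].
///
/// [safety section]: ../../std/alloc/trait.GlobalAlloc.html#safety
pub fn with_anchor() {}

/// Intra-doc style: only the item can be named, so the `#safety` anchor
/// is lost and the link lands at the top of the page instead.
///
/// See the [safety section].
///
/// [safety section]: core::alloc::GlobalAlloc
pub fn without_anchor() {}
```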
enable align_to tests in Miri
With https://github.com/rust-lang/miri/issues/1074 resolved, we can enable these tests in Miri.
I also tweaked the test sizes to get reasonable execution times with decent test coverage.
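For context, a minimal example of the `slice::align_to` behaviour these tests exercise (not one of the actual test cases):

```rust
fn main() {
    let bytes = [0u8; 33];
    // Safety: any bit pattern is a valid u32, so reinterpreting the aligned
    // middle part of the byte slice is sound.
    let (prefix, middle, suffix) = unsafe { bytes.align_to::<u32>() };
    // Every byte is accounted for by exactly one of the three pieces.
    assert_eq!(prefix.len() + 4 * middle.len() + suffix.len(), bytes.len());
}
```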
Use min_specialization in libcore
Getting `TrustedRandomAccess` to work is the main interesting thing here.
- `get_unchecked` is now an unstable, hidden method on `Iterator`
- The contract for `TrustedRandomAccess` is made clearer in documentation
- Fixed a bug where `Debug` would create aliasing references when using the specialized zip impl
- Added tests for the side effects of `next_back` and `nth`.
Closes #68536.
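For readers unfamiliar with the feature, here is a minimal, self-contained sketch of what `min_specialization` allows (nightly-only, and not libcore's actual `TrustedRandomAccess` code):

```rust
#![feature(min_specialization)]

trait Describe {
    fn describe(&self) -> &'static str;
}

// Blanket impl with a `default` method that specializations may override.
impl<T> Describe for T {
    default fn describe(&self) -> &'static str {
        "something"
    }
}

// A fully concrete impl is "always applicable", which is the kind of impl
// min_specialization (unlike full specialization) restricts us to.
impl Describe for u8 {
    fn describe(&self) -> &'static str {
        "a byte"
    }
}

fn main() {
    assert_eq!(0u8.describe(), "a byte");
    assert_eq!("hi".describe(), "something");
}
```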
The `stride == 1` case can be computed more efficiently through `-p (mod a)`.
That then translates to a nice and short sequence of LLVM instructions:
```llvm
%address = ptrtoint i8* %p to i64
%negptr = sub i64 0, %address
%offset = and i64 %negptr, %a_minus_one
```
This produces pretty much ideal code-gen when the function is used in
isolation. Typical use of this function will, however, involve using
the result to offset a pointer:
```llvm
%aligned = getelementptr inbounds i8, i8* %p, i64 %offset
```
This still looks very good, but LLVM does not really translate it into
what would be considered ideal machine code (on any target). For example,
this is the codegen we obtain for an unknown alignment:
```asm
; x86_64
dec rsi
mov rax, rdi
neg rax
and rax, rsi
add rax, rdi
```
In particular, negating a pointer is not something CISC architectures
like x86_64 are designed to optimise for. They are much better at
offsetting pointers, so we'd love to utilise this ability and produce
code that's more like this:
```asm
; x86_64
lea rax, [rsi + rdi - 1]
neg rsi
and rax, rsi
```
To achieve this we need to give LLVM an opportunity to apply the
various peephole optimisations it performs during DAG selection. In
particular, the `and` instruction appears to be a major inhibitor here.
We cannot, sadly, get rid of this load-bearing operation, but we can
reorder operations so that LLVM has more to work with around this
instruction.
One such ordering is proposed in #75579 and results in LLVM IR that
looks broadly like this:
```llvm
; using add enables `lea` and similar CISCisms
%offset_ptr = add i64 %address, %a_minus_one
%mask = sub i64 0, %a
%masked = and i64 %offset_ptr, %mask
; can be folded with `gepi` that may follow
%offset = sub i64 %masked, %address
```
…and generates the intended x86_64 machine code. One might also wonder
how the increased amount of code would impact a RISC target. Turns out
not much:
```asm
; aarch64 previous              ; aarch64 new
    sub x8, x1, #1                  add x8, x1, x0
    neg x9, x0                      sub x8, x8, #1
    and x8, x9, x8                  neg x9, x1
    add x0, x0, x8                  and x0, x8, x9
```
(and similarly for ppc, sparc, mips, riscv, etc)
The only target that seems to do worse is… wasm32.
Onto actual measurements: the best way to evaluate snippets like these
is to use llvm-mca. Much as the AArch64 assembly would suggest, there
isn't any performance difference to be found; both snippets execute in
the same number of cycles on the CPUs I tried. On x86_64, however, we
get a throughput improvement of >50%!
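As a Rust-level sketch of the two orderings (hypothetical helper names; the real code lives inside `align_offset`), both compute `-p (mod a)` for a power-of-two `a`, but the second phrasing is the one that lowers to the friendlier instruction sequence:

```rust
/// Original ordering: negate the address, then mask by `a - 1`.
fn offset_mask_first(addr: usize, a: usize) -> usize {
    debug_assert!(a.is_power_of_two());
    addr.wrapping_neg() & (a - 1)
}

/// Reordered version: round the address up to a multiple of `a`
/// (the leading `add` is what enables `lea` on x86_64), then subtract.
fn offset_round_up(addr: usize, a: usize) -> usize {
    debug_assert!(a.is_power_of_two());
    (addr.wrapping_add(a - 1) & a.wrapping_neg()).wrapping_sub(addr)
}

fn main() {
    // The two formulations agree for every address and power-of-two alignment.
    for addr in 0..256usize {
        for shift in 0..8 {
            let a = 1usize << shift;
            assert_eq!(offset_mask_first(addr, a), offset_round_up(addr, a));
        }
    }
    println!("both orderings agree");
}
```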
Improve codegen for `align_offset`
In this PR the `align_offset` implementation is changed/improved to produce better code in certain scenarios, such as when the pointer type has a stride of 1 or when building at low optimisation levels.
While these changes do not achieve the "ideal" codegen referenced in #75579, they get significantly closer to it. I'm not sure the codegen can actually get much better with this function returning the offset rather than the aligned pointer.
See the descriptions for separate commits for further information.
docs(marker/copy): provide example for `&T` being `Copy`
### Edited 2020-08-16 (most recent)
In the current documentation for the `Copy` marker trait, there is a section
with examples of structs that can implement `Copy`. Currently there is no example
showing that shared references (`&T`) are also `Copy`.
It is worth having a dedicated example for `&T` being `Copy`, because shared
references are an integral part of the language, and the fact that they are `Copy`
is not as intuitive as it is for other types with this behaviour, such as `i32` or `bool`.
The example picks up on the previous non-`Copy` struct and shows that
structs can be `Copy`, even when they hold a shared reference to a non-`Copy` type.
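The added example looks roughly like this (names are mine, not necessarily the exact code that landed): a struct holding only a shared reference to a non-`Copy` type can itself derive `Copy`.

```rust
// A non-`Copy` type: it owns heap data.
struct PointList {
    points: Vec<(f64, f64)>,
}

// Holding only a `&PointList` keeps the struct `Copy`,
// because `&T` is `Copy` for any `T`.
#[derive(Copy, Clone)]
struct PointListView<'a> {
    list: &'a PointList,
}

fn main() {
    let list = PointList { points: vec![(0.0, 0.0), (1.0, 2.0)] };
    let view = PointListView { list: &list };
    let copied = view; // copied, not moved: `view` stays usable
    assert_eq!(view.list.points.len(), copied.list.points.len());
}
```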
-----------------------------------------
### Edited 2020-08-02, 3:28 p.m.
I've just realized that it says "in addition to the **implementors listed below**", which makes this PR kind of "wrong", because `&T` is indeed in the "implementors listed below".
Maybe we can instead show an example with `&T` in the [When can my type be Copy](https://doc.rust-lang.org/std/marker/trait.Copy.html#when-can-my-type-be-copy) section.
What I really want to achieve is to make it more obvious that `&T` is also `Copy`, because I think it is very valuable to know, and it wasn't obvious to me until I read about it in a forum post.
What do you think? I would create another PR for that.
**Please feel free to close this PR.**
-----------------------------------
### Original post
In the current documentation about the `Copy` marker trait, there is a section
about "additional implementors", which list additional implementors of the `Copy` trait.
The fact that shared references are also `Copy` is mixed with another point,
which makes it hard to recognize and make it seem not as important.
This clarifies the fact that shared references are also `Copy`, by mentioning it as a
separate item in the list of "additional implementors".
See also
X-Link mem::{swap, take, replace}
Since it's easy to end up at one of these functions when you really wanted one of the others, cross-link them with descriptions of why you'd want to use each.
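For context (not part of the diff itself), a quick sketch of what the three functions do and why they're easy to mix up:

```rust
use std::mem;

fn main() {
    // mem::swap: exchange two values in place.
    let (mut a, mut b) = (String::from("a"), String::from("b"));
    mem::swap(&mut a, &mut b);
    assert_eq!((a.as_str(), b.as_str()), ("b", "a"));

    // mem::replace: move a new value in, get the old one back.
    let mut s = String::from("old");
    let old = mem::replace(&mut s, String::from("new"));
    assert_eq!((old.as_str(), s.as_str()), ("old", "new"));

    // mem::take: like replace, but with Default::default() as the new value.
    let mut t = String::from("taken");
    let taken = mem::take(&mut t);
    assert_eq!((taken.as_str(), t.as_str()), ("taken", ""));
}
```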