Implement [T]::align_to
Note that this PR deviates from what is accepted by RFC slightly by making `align_offset` to return an offset in elements, rather than bytes. This is necessary to sanely support `[T]::align_to` and also simply makes more sense™. The caveat is that trying to align a pointer of ZST is now an equivalent to `is_aligned` check, rather than anything else (as no number of ZST elements will align a misaligned ZST pointer).
It also implements the `align_to` slightly differently than proposed in the RFC to properly handle cases where size of T and U aren’t co-prime.
Furthermore, a promise is made that the slice containing `U`s will be as large as possible (contrary to the RFC) – otherwise the function is quite useless.
The implementation uses quite a few underhanded tricks and takes advantage of the fact that alignment is a power-of-two quite heavily to optimise the machine code down to something that results in as few known-expensive instructions as possible. Currently calling `ptr.align_offset` with an unknown-at-compile-time `align` results in code that has just a single "expensive" modulo operation; the rest is "cheap" arithmetic and bitwise ops.
cc https://github.com/rust-lang/rust/issues/44488 @oli-obk
As mentioned in the commit message for align_offset, many thanks go to Chris McDonald.
Keep only the language item. This removes some indirection and makes
codegen worse for debug builds, but simplifies code significantly, which
is a good tradeoff to make, in my opinion.
Besides, the codegen can be improved even further with some constant
evaluation improvements that we expect to happen in the future.
This is necessary if we want to implement `[T]::align_to` and is more
useful in general.
This implementation effort has begun during the All Hands and represents
a month of my futile efforts to do any sort of maths. Luckily, I
found the very very nice Chris McDonald (cjm) on IRC who figured out the
core formulas for me! All the thanks for existence of this PR go to
them!
Anyway… Those formulas were mangled by yours truly into the arcane forms
you see here to squeeze out the best assembly possible on most of the
modern architectures (x86 and ARM were evaluated in practice). I mean,
just look at it: *one actual* modulo operation and everything else is
just the cheap single cycle ops! Admitedly, the naive solution might be
faster in some common scenarios, but this code absolutely butchers the
naive solution on the worst case scenario.
Alas, the result of this arcane magic also means that the code pretty
heavily relies on the preconditions holding true and breaking those
preconditions will unleash the UB-est of all UBs! So don’t.
Rewrite docs for `std::ptr`
This PR attempts to resolve#29371.
This is a fairly major rewrite of the `std::ptr` docs, and deserves a fair bit of scrutiny. It adds links to the GNU libc docs for various instrinsics, adds internal links to types and functions referenced in the docs, adds new, more complex examples for many functions, and introduces a common template for discussing unsafety of functions in `std::ptr`.
All functions in `std::ptr` (with the exception of `ptr::eq`) are unsafe because they either read from or write to a raw pointer. The "Safety" section now informs that the function is unsafe because it dereferences a raw pointer and requires that any pointer to be read by the function points to "a valid vaue of type `T`".
Additionally, each function imposes some subset of the following conditions on its arguments.
* The pointer points to valid memory.
* The pointer points to initialized memory.
* The pointer is properly aligned.
These requirements are discussed in the "Undefined Behavior" section along with the consequences of using functions that perform bitwise copies without requiring `T: Copy`. I don't love my new descriptions of the consequences of making such copies. Perhaps the old ones were good enough?
Some issues which still need to be addressed before this is merged:
- [ ] The new docs assert that `drop_in_place` is equivalent to calling `read` and discarding the value. Is this correct?
- [ ] Do `write_bytes` and `swap_nonoverlapping` require properly aligned pointers?
- [ ] The new example for `drop_in_place` is a lackluster.
- [ ] Should these docs rigorously define what `valid` memory is? Or should is that the job of the reference? Should we link to the reference?
- [ ] Is it correct to require that pointers that will be read from refer to "valid values of type `T`"?
- [x] I can't imagine ever using `{read,write}_volatile` with non-`Copy` types. Should I just link to {read,write} and say that the same issues with non-`Copy` types apply?
- [x] `write_volatile` doesn't link back to `read_volatile`.
- [ ] Update docs for the unstable [`swap_nonoverlapping`](https://github.com/rust-lang/rust/issues/42818)
- [ ] Update docs for the unstable [unsafe pointer methods RFC](https://github.com/rust-lang/rfcs/pull/1966)
Looking forward to your feedback.
r? @steveklabnik
- Add links to the GNU libc docs for `memmove`, `memcpy`, and
`memset`, as well as internally linking to other functions in `std::ptr`
- List sources of UB for all functions.
- Add example to `ptr::drop_in_place` and compares it to `ptr::read`.
- Add examples which more closely mirror real world uses for the
functions in `std::ptr`. Also, move the reimplementation of `mem::swap`
to the examples of `ptr::read` and use a more interesting example for
`copy_nonoverlapping`.
- Change module level description
Adds intrinsics::exact_div to take advantage of the unsafe, which reduces the implementation from
```asm
sub rcx, rdx
mov rax, rcx
sar rax, 63
shr rax, 62
lea rax, [rax + rcx]
sar rax, 2
ret
```
down to
```asm
sub rcx, rdx
sar rcx, 2
mov rax, rcx
ret
```
(for `*const i32`)
The comment referred to a variable using an incorrect name. (it has probably been renamed since the comment was written, or the comment was copied elsewhere - I noted the example in libcore has the `tmp` name for the temporary variable.)
* Bump the release version to 1.25
* Bump the bootstrap compiler to the recent beta
* Allow using unstable rustdoc features on beta - this fix has been applied to
the beta branch but needed to go to the master branch as well.
This commit adds compiler support for two basic operations needed for binding
SIMD on x86 platforms:
* First, a `nontemporal_store` intrinsic was added for the `_mm_stream_ps`, seen
in rust-lang-nursery/stdsimd#114. This was relatively straightforward and is
quite similar to the volatile store intrinsic.
* Next, and much more intrusively, a new type to the backend was added. The
`x86_mmx` type is used in LLVM for a 64-bit vector register and is used in
various intrinsics like `_mm_abs_pi8` as seen in rust-lang-nursery/stdsimd#74.
This new type was added as a new layout option as well as having support added
to the trans backend. The type is enabled with the `#[repr(x86_mmx)]`
attribute which is intended to just be an implementation detail of SIMD in
Rust.
I'm not 100% certain about how the `x86_mmx` type was added, so any extra eyes
or thoughts on that would be greatly appreciated!
- updates documentation on volatile memory intrinsics, now the case of
zero-sized types is mentioned explicitly.
Volatile memory operations which doesn't affect memory at all are omitted
in LLVM backend, e.g. if number of elements is zero or type used in
generic specialisation is zero-sized, then LLVM intrinsic or related code
is not generated. This was not explicitly documented before in Rust
documentation and potentially could cause issues.
Mark it with the `unreachable` feature and put it into the `mem` module.
This is a pretty straight-forward API that can already be simulated in
stable Rust by using `transmute` to create an uninhabited enum that can
be matched.
LLVM currently doesn't remove the "bypass if argument is zero" assembly inside branches where the value is known to be non-zero, pessimizing code that uses uN::leading_zeros
We've got a freshly minted beta compiler, let's update to use that on nightly!
This has a few other changes associated with it as well
* A bump to the rustc version number (to 1.19.0)
* Movement of the `cargo` and `rls` submodules to their "proper" location in
`src/tools/{cargo,rls}`. Now that Cargo workspaces support the `exclude`
option this can work.
* Updates of the `cargo` and `rls` submodules to their master branches.
* Tweak to the `src/stage0.txt` format to be more amenable for Cargo version
numbers. On the beta channel Cargo will bootstrap from a different version
than rustc (e.g. the version numbers are different), so we need different
configuration for this.
* Addition of `dev` as a readable key in the `src/stage0.txt` format. If present
then stage0 compilers are downloaded from `dev-static.rust-lang.org` instead
of `static.rust-lang.org`. This is added to accomodate our updated release
process with Travis and AppVeyor.
Drop of arrays is now translated in trans::block in an ugly way that I
should clean up in a later PR, and does not handle panics in the middle
of an array drop, but this commit & PR are growing too big.