In the comments of (closed, defunct) pull request #54884, Mazdak
"Centril" Farrokhzad noted that must-use annotations didn't work on an
associated function (what other communities might call a "static
method"). Subsequent logging revealed that in this case we have a
`Def::Method`, whereas the lint pass was only matching on
`Def::Fn`. (One could argue that those def-names are thereby
misleading—must-use for self-ful methods have always worked—but
documenting or reworking that can be left to another day.)
A very minor issue, `lifetime` was missing from the error list.
I left `literal` in the list, even though it is unstable. It looks like it may stabilize soon anyways.
The upcoming SIMD support in the wasm target is unique from the other
platforms where it's either unconditionally available or not available,
there's no halfway where a subsection of the program can use it but no
other parts of the program can use it. In this world it's valid for wasm
SIMD args to always be passed by value and there's no need to pass them
by reference.
This commit adds a new custom target specification option
`simd_types_indirect` which defaults to `true`, but the wasm backend
disables this and sets it to `false`.
[NLL] Check user types are well-formed
Also contains a change of span for AscribeUserType.
I'm not quite sure if this was what @nikomatsakis was thinking.
Closes#54620
r? @nikomatsakis
Rollup of 16 pull requests
Successful merges:
- #54755 (Documents reference equality by address (#54197))
- #54811 (During rustc bootstrap, make default for `optimize` independent of `debug`)
- #54825 (NLL says "borrowed content" instead of more precise "dereference of raw pointer")
- #54860 (Add doc comments about safest way to initialize a vector of zeros)
- #54869 (Fix mobile docs)
- #54891 (Fix tracking issue for Once::is_completed)
- #54913 (doc fix: it's auto traits that make for automatic implementations)
- #54920 (Fix handling of #[must_use] on unit and uninhabited types)
- #54932 (A handful of random string-related improvements)
- #54936 (impl Eq+Hash for TyLayout)
- #54950 (std: Synchronize global allocator on wasm32)
- #54956 ("(using ..." doesn't have the matching ")")
- #54958 (add a macro for static (compile-time) assertions)
- #54967 (Remove incorrect span for second label inner macro invocation)
- #54983 (Fix slice's benchmarks)
- #54989 (Fix spelling in the documentation to htmldocck.py)
Failed merges:
r? @ghost
Fix spelling in the documentation to htmldocck.py
I was reading through htmldocck.py, and decided to attempt to clean it up a little bit. Let me know if you disagree with my edits.
std: Synchronize global allocator on wasm32
We originally didn't have threads, and now we're starting to add them!
Make sure we properly synchronize access to dlmalloc when the `atomics`
feature is enabled for `wasm32-unknown-unknown`.
doc fix: it's auto traits that make for automatic implementations
Being a marker trait is not good enough (that just means "no items in the trait").
r? @alexcrichton who [originally wrote these docs](0a13f1abaf).
Add doc comments about safest way to initialize a vector of zeros
This adds more information about the vec! macro as discussed in #54628. I think this is a good starting point, but I think additional detail is needed so that we can explain why vec! is safer than the alternatives.
NLL says "borrowed content" instead of more precise "dereference of raw pointer"
Part of #52663.
Previously, move errors involving the dereference of a raw pointer would
say "borrowed content". This commit changes it to say "dereference of
raw pointer".
r? @nikomatsakis
cc @pnkfelix
Documents reference equality by address (#54197)
Clarification of the use of `ptr::eq` to test equality of references via address by pointer coercion, regarding issue #54197 .
The same example as in `ptr::eq` docs is shown here to clarify that
`PartialEq` compares values pointed-to instead of via address (which can be desired in some cases)
It looks like we tend to use angle-brackets around the placeholder in
the few other places we use `Applicability::HasPlaceholders`, but that
would be confusing here, so ...
Use MaybeUninit in liballoc
All code by @japaric. This is a re-submission of a part of https://github.com/rust-lang/rust/pull/53508 that hopefully does not regress performance.
Support for disabling PLT for better function call performance
This PR gives `rustc` the ability to skip the PLT when generating function calls into shared libraries. This can improve performance by reducing branch indirection.
AFAIK, the only advantage of using the PLT is to allow for ELF lazy binding. However, since Rust already [enables full relro for security](https://github.com/rust-lang/rust/pull/43170), lazy binding was disabled anyway.
This is a little known feature which is supported by [GCC](https://gcc.gnu.org/onlinedocs/gcc/Code-Gen-Options.html) and [Clang](https://clang.llvm.org/docs/ClangCommandLineReference.html#cmdoption-clang-fplt) as `-fno-plt` (some Linux distros [enable it by default](https://git.archlinux.org/svntogit/packages.git/tree/trunk/makepkg.conf?h=packages/pacman#n40) for all builds).
Implementation inspired by [this patch](https://reviews.llvm.org/D39079#change-YvkpNDlMs_LT) which adds `-fno-plt` support to Clang.
## Performance
I didn't run a lot of benchmarks, but these are the results on my machine for a `clap` [benchmark](https://github.com/clap-rs/clap/blob/master/benches/05_ripgrep.rs):
```
name control ns/iter no-plt ns/iter diff ns/iter diff % speedup
build_app_long 11,097 10,733 -364 -3.28% x 1.03
build_app_short 11,089 10,742 -347 -3.13% x 1.03
build_help_long 186,835 182,713 -4,122 -2.21% x 1.02
build_help_short 80,949 78,455 -2,494 -3.08% x 1.03
parse_clean 12,385 12,044 -341 -2.75% x 1.03
parse_complex 19,438 19,017 -421 -2.17% x 1.02
parse_lots 431,493 421,421 -10,072 -2.33% x 1.02
```
A small performance improvement across the board, with no downsides. It's likely binaries which make a lot of function calls into dynamic libraries could see even more improvements. [This comment](https://patchwork.ozlabs.org/patch/468993/#1028255) suggests that, in some cases, `-fno-plt` could improve PIC/PIE code performance by 10%.
## Security benefits
**Bonus**: some of the speculative execution attacks rely on the PLT, by disabling it we reduce a big attack surface and reduce the need for [`retpoline`](https://reviews.llvm.org/D41723).
## Remaining PLT calls
The compiled binaries still have plenty of PLT calls, coming from C/C++ libraries. Building dependencies with `CFLAGS=-fno-plt CXXFLAGS=-fno-plt` removes them.
Disable the PLT where possible to improve performance
for indirect calls into shared libraries.
This optimization is enabled by default where possible.
- Add the `NonLazyBind` attribute to `rustllvm`:
This attribute informs LLVM to skip PLT calls in codegen.
- Disable PLT unconditionally:
Apply the `NonLazyBind` attribute on every function.
- Only enable no-plt when full relro is enabled:
Ensures we only enable it when we have linker support.
- Add `-Z plt` as a compiler option
This commit extends the existing lang items functionality to assert
that the `#[lang_item]` attribute is only found on the appropriate item
for any given lang item. That is, language items representing traits
must only ever have their corresponding attribute placed on a trait, for
example.
This adds an implementation of thread local storage for the
`wasm32-unknown-unknown` target when the `atomics` feature is
implemented. This, however, comes with a notable caveat of that it
requires a new feature of the standard library, `wasm-bindgen-threads`,
to be enabled.
Thread local storage for wasm (when `atomics` are enabled and there's
actually more than one thread) is powered by the assumption that an
external entity can fill in some information for us. It's not currently
clear who will fill in this information nor whose responsibility it
should be long-term. In the meantime there's a strategy being gamed out
in the `wasm-bindgen` project specifically, and the hope is that we can
continue to test and iterate on the standard library without committing
to a particular strategy yet.
As to the details of `wasm-bindgen`'s strategy, LLVM doesn't currently
have the ability to emit custom `global` values (thread locals in a
`WebAssembly.Module`) so we leverage the `wasm-bindgen` CLI tool to do
it for us. To that end we have a few intrinsics, assuming two global values:
* `__wbindgen_current_id` - gets the current thread id as a 32-bit
integer. It's `wasm-bindgen`'s responsibility to initialize this
per-thread and then inform libstd of the id. Currently `wasm-bindgen`
performs this initialization as part of the `start` function.
* `__wbindgen_tcb_{get,set}` - in addition to a thread id it's assumed
that there's a global available for simply storing a pointer's worth
of information (a thread control block, which currently only contains
thread local storage). This would ideally be a native `global`
injected by LLVM, but we don't have a great way to support that right
now.
To reiterate, this is all intended to be unstable and purely intended
for testing out Rust on the web with threads. The story is very likely
to change in the future and we want to make sure that we're able to do
that!