BTreeMap: refactor Entry out of map.rs into its own file
btree/map.rs is approaching the 3000 line mark, splitting out the entry
code buys about 500 lines of headroom.
I've created this PR because the changes I've made in #77438 will push `map.rs` over the 3000 line limit and cause tidy to complain.
I picked `Entry` to factor out because it feels less tightly coupled to the rest of `BTreeMap` than the various iterator implementations.
Related: #60302
Use posix_spawn() on unix if program is a path
Previously `Command::spawn` would fall back to the non-posix_spawn based
implementation if the `PATH` environment variable was possibly changed.
On systems with a modern (g)libc `posix_spawn()` can be significantly
faster. If program is a path itself the `PATH` environment variable is
not used for the lookup and it should be safe to use the
`posix_spawnp()` method. [1]
We found this, because we have a cli application that effectively runs a
lot of subprocesses. It would sometimes noticeably hang while printing
output. Profiling showed that the process was spending the majority of
time in the kernel's `copy_page_range` function while spawning
subprocesses. During this time the process is completely blocked from
running, explaining why users were reporting the cli app hanging.
Through this we discovered that `std::process::Command` has a fast and
slow path for process execution. The fast path is backed by
`posix_spawnp()` and the slow path by fork/exec syscalls being called
explicitly. Using fork for process creation is supposed to be fast, but
it slows down as your process uses more memory. It's not because the
kernel copies the actual memory from the parent, but it does need to
copy the references to it (see `copy_page_range` above!). We ended up
using the slow path, because the command spawn implementation in falls
back to the slow path if it suspects the PATH environment variable was
changed.
Here is a smallish program demonstrating the slowdown before this code
change:
```
use std::process::Command;
use std::time::Instant;
fn main() {
let mut args = std::env::args().skip(1);
if let Some(size) = args.next() {
// Allocate some memory
let _xs: Vec<_> = std::iter::repeat(0)
.take(size.parse().expect("valid number"))
.collect();
let mut command = Command::new("/bin/sh");
command
.arg("-c")
.arg("echo hello");
if args.next().is_some() {
println!("Overriding PATH");
command.env("PATH", std::env::var("PATH").expect("PATH env var"));
}
let now = Instant::now();
let child = command
.spawn()
.expect("failed to execute process");
println!("Spawn took: {:?}", now.elapsed());
let output = child.wait_with_output().expect("failed to wait on process");
println!("Output: {:?}", output);
} else {
eprintln!("Usage: prog [size]");
std::process::exit(1);
}
()
}
```
Running it and passing different amounts of elements to use to allocate
memory shows that the time taken for `spawn()` can differ quite
significantly. In latter case the `posix_spawnp()` implementation is 30x
faster:
```
$ cargo run --release 10000000
...
Spawn took: 324.275µs
hello
$ cargo run --release 10000000 changepath
...
Overriding PATH
Spawn took: 2.346809ms
hello
$ cargo run --release 100000000
...
Spawn took: 387.842µs
hello
$ cargo run --release 100000000 changepath
...
Overriding PATH
Spawn took: 13.434677ms
hello
```
[1]: 5f72f9800b/posix/execvpe.c (L81)
BTreeMap: improve gdb introspection of BTreeMap with ZST keys or values
I accidentally pushed an earlier revision in #77788: it changes the index of tuples for BTreeSet from ""[{}]".format(i) to "key{}".format(i). Which doesn't seem to make the slightest difference on my linux box nor on CI. In fact, gdb doesn't make any distinction between "key{}" and "val{}" for a BTreeMap either, leading to confusing output if you test more. But easy to improve.
r? @Mark-Simulacrum
liballoc: VecDeque: Add binary search functions
I am submitting rust-lang/rfcs#2997 as a PR as suggested by @scottmcm
I haven't yet created a tracking issue - if there's a favorable feedback I'll create one and update the issue links in the unstable attribs.
Remove shrink_to_fit from default ToString::to_string implementation.
As suggested by `@scottmcm` on Zulip. shrink_to_fit() seems like the wrong thing to do here in most use cases of to_string(). Would be intereseting to see if it makes any difference in a timer run.
r? `@joshtriplett`
Deny broken intra-doc links in linkchecker
Since rustdoc isn't warning about these links, check for them manually.
This also fixes the broken links that popped up from the lint.
stabilize union with 'ManuallyDrop' fields and 'impl Drop for Union'
As [discussed by @SimonSapin and @withoutboats](https://github.com/rust-lang/rust/issues/55149#issuecomment-634692020), this PR proposes to stabilize parts of the `untagged_union` feature gate:
* It will be possible to have a union with field type `ManuallyDrop<T>` for any `T`.
* While at it I propose we also stabilize `impl Drop for Union`; to my knowledge, there are no open concerns around this feature.
In the RFC discussion, we also talked about allowing `&mut T` as another non-`Copy` non-dropping type, but that felt to me like an overly specific exception so I figured we'd wait if there is actually any use for such a special case.
Some things remain unstable and still require the `untagged_union` feature gate:
* Union with fields that do not drop, are not `Copy`, and are not `ManuallyDrop<_>`. The reason to not stabilize this is to avoid semver concerns around libraries adding `Drop` implementations later. (This is already not fully semver compatible as, to my knowledge, the borrow checker will exploit the non-dropping nature of any type, but it seems prudent to avoid further increasing the amount of trouble adding an `impl Drop` can cause.)
Due to this, quite a few tests still need the `untagged_union` feature, but I think the ones where I could remove the feature flag provide good test coverage for the stable part.
Cc @rust-lang/lang
Cleanup cloudabi mutexes and condvars
This gets rid of lots of unnecessary unsafety.
All the AtomicU32s were wrapped in UnsafeCell or UnsafeCell<MaybeUninit>, and raw pointers were used to get to the AtomicU32 inside. This change cleans that up by using AtomicU32 directly.
Also replaces a UnsafeCell<u32> by a safer Cell<u32>.
@rustbot modify labels: +C-cleanup
Static mutex is static
StaticMutex is only ever used with as a static (as the name already suggests). So it doesn't have to be generic over a lifetime, but can simply assume 'static.
This 'static lifetime guarantees the object is never moved, so this is no longer a manually checked requirement for unsafe calls to lock().
@rustbot modify labels: +T-libs +A-concurrency +C-cleanup
For backtrace, use StaticMutex instead of a raw sys Mutex.
The code used the very unsafe `sys::mutex::Mutex` directly, and built its own unlock-on-drop wrapper around it. The StaticMutex wrapper already provides that and is easier to use safely.
@rustbot modify labels: +T-libs +C-cleanup
Use futex-based thread-parker for Wasm32.
This uses the existing `sys_common/thread_parker/futex.rs` futex-based thread parker (that was already used for Linux) for wasm32 as well (if the wasm32 atomics target feature is enabled, which is not the case by default).
Wasm32 provides the basic futex operations as instructions: https://webassembly.github.io/threads/syntax/instructions.html
These are now exposed from `sys::futex::{futex_wait, futex_wake}`, just like on Linux. So, `thread_parker/futex.rs` stays completely unmodified.
Refactor io/buffered.rs into submodules
This pull request splits `BufWriter`, `BufReader`, `LineWriter`, and `LineWriterShim` (along with their associated tests) into separate submodules. It contains no functional changes. This change is being made in anticipation of adding another type of buffered writer which can be switched between line- and block-buffering mode.
Part of a series of pull requests resolving #60673.
Add `str::{Split,RSplit,SplitN,RSplitN,SplitTerminator,RSplitTerminator,SplitInclusive}::as_str` methods
tl;dr this allows viewing unyelded part of str-split-iterators, like so:
```rust
let mut split = "Mary had a little lamb".split(' ');
assert_eq!(split.as_str(), "Mary had a little lamb");
split.next();
assert_eq!(split.as_str(), "had a little lamb");
split.by_ref().for_each(drop);
assert_eq!(split.as_str(), "");
```
--------------
This PR adds semi-identical `as_str` methods to most str-split-iterators with signatures like `&'_ Split<'a, P: Pattern<'a>> -> &'a str` (Note: output `&str` lifetime is bound to the `'a`, not the `'_`). The methods are similar to [`Chars::as_str`]
`SplitInclusive::as_str` is under `"str_split_inclusive_as_str"` feature gate, all other methods are under `"str_split_as_str"` feature gate.
Before this PR you had to sum `len`s of all yielded parts or collect into `String` to emulate `as_str`.
[`Chars::as_str`]: https://doc.rust-lang.org/core/str/struct.Chars.html#method.as_str
StaticMutex is only ever used with as a static (as the name already
suggests). So it doesn't have to be generic over a lifetime, but can
simply assume 'static.
This 'static lifetime guarantees the object is never moved, so this is
no longer a manually checked requirement for unsafe calls to lock().
The comment said it's UB to call lock() while it is locked. That'd be
quite a useless Mutex. :) It was supposed to say 'locked by the same
thread', not just 'locked'.
warning: the operation is ineffective. Consider reducing it to
`self.segments()[0]`
--> library/std/src/net/ip.rs:1265:9
|
1265 | (self.segments()[0] & 0xffff) == 0xfe80
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(clippy::identity_op)]` on by default
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#identity_op
warning: the operation is ineffective. Consider reducing it to
`self.segments()[1]`
--> library/std/src/net/ip.rs:1266:16
|
1266 | && (self.segments()[1] & 0xffff) == 0
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#identity_op
warning: the operation is ineffective. Consider reducing it to
`self.segments()[2]`
--> library/std/src/net/ip.rs:1267:16
|
1267 | && (self.segments()[2] & 0xffff) == 0
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#identity_op
warning: the operation is ineffective. Consider reducing it to
`self.segments()[3]`
--> library/std/src/net/ip.rs:1268:16
|
1268 | && (self.segments()[3] & 0xffff) == 0
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#identity_op
Signed-off-by: wcampbell <wcampbell1995@gmail.com>
warning: struct update has no effect, all the fields in the struct have
already been specified
--> library/std/src/net/addr.rs:367:19
|
367 | ..unsafe { mem::zeroed() }
| ^^^^^^^^^^^^^^^^^^^^^^^^
|
= note: `#[warn(clippy::needless_update)]` on by default
= help: for further information visit
https://rust-lang.github.io/rust-clippy/master/index.html#needless_update
Replace absolute paths with relative ones
Modern compilers allow reaching external crates
like std or core via relative paths in modules
outside of lib.rs and main.rs.
Remove unsafety from sys/unsupported and add deny(unsafe_op_in_unsafe_fn).
Replacing `UnsafeCell`s by a `Cell`s simplifies things and makes the mutex and rwlock implementations safe. Other than that, only unsafety in strlen() contained unsafe code.
@rustbot modify labels: +F-unsafe-block-in-unsafe-fn +C-cleanup
Stabilize slice_partition_at_index
This stabilizes slice_partition_at_index, including renaming `partition_at_index*` -> `select_nth_unstable*`.
Closes#55300
r? `@Amanieu`
Implement `AsRawFd` for `StdinLock` etc. on WASI.
WASI implements `AsRawFd` for `Stdin`, `Stdout`, and `Stderr`, so
implement it for `StdinLock`, `StdoutLock`, and `StderrLock` as well.
r? @alexcrichton
The stabilisation issue, #73413, has an open item for documentation.
I looked at the docs and it is all there, but I felt it could do with
some minor wording improvement.
I looked at the `str::strip_prefix` docs for a template. (That
resulted in me slightly changing that doc too.)
I de-linkified `None` and `Some`, as I felt that rather noisy.. I
searched stdlib, and these don't seem to be usually linkified.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
"Some is returned with <some value>" is an awkward construction.
The use of the passive voice is a bit odd, and doesn't seem like the
house style.
So say instead "returns X, wrapped in `Some`", for which there is some
other precedent in stdlib.
Instead of repeating "with the prefix removed", say "after the
prefix". This is a bit clearer that the original is not modified.
Signed-off-by: Ian Jackson <ijackson@chiark.greenend.org.uk>
This stabilizes the functionality in slice_partition_at_index,
but under the names `select_nth_unstable*`. The functions
`partition_at_index*` are left as deprecated, to be removed in
a later release.
Closes#55300
Avoid SeqCst or static mut in mach_timebase_info and QueryPerformanceFrequency caches
This patch went through a couple iterations but the end result is replacing a pattern where an `AtomicUsize` (updated with many SeqCst ops) guards a `static mut` with a single `AtomicU64` that is known to use 0 as a value indicating that it is not initialized.
The code in both places exists to cache values used in the conversion of Instants to Durations on macOS, iOS, and Windows.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), but it's much simpler, safer, and in practice we'd expect it to be faster everywhere where Relaxed operations on AtomicU64 are cheaper than SeqCst operations on AtomicUsize, which is a lot of places.
Anyway, it also removes a bunch of unsafe code and greatly simplifies the logic, so IMO that alone would be worth it unless it was a regression.
If you want to take a look at the assembly output though, see https://godbolt.org/z/rbr6vn for x86_64, https://godbolt.org/z/cqcbqv for aarch64 (Note that this just the output of the mac side, but i'd expect the windows part to be the same and don't feel like doing another godbolt for it). There are several versions of this function in the godbolt:
- `info_new`: version in the current patch
- `info_less_new`: version in initial PR
- `info_original`: version currently in the tree
- `info_orig_but_better_orderings`: a version that just tries to change the original code's orderings from SeqCst to the (probably) minimal orderings required for soundness/correctness.
The biggest concern I have here is if we can use AtomicU64, or if there are targets that dont have it that this code supports. AFAICT: no. (If that changes in the future, it's easy enough to do something different for them)
r? `@Amanieu` because he caught a couple issues last time I tried to do a patch reducing orderings 😅
---
<details>
<summary>I rewrote this whole message so the original is inside here</summary>
I happened to notice the code we use for caching the result of mach_timebase_info uses SeqCst exclusively.
However, thinking a little more, it's actually pretty easy to avoid the static mut by packing the timebase info into an AtomicU64.
This entirely avoids needing to do the compare_exchange. The AtomicU64 can be read/written using Relaxed ops, which on current macos/ios platforms (x86_64/aarch64) have no overhead compared to direct loads/stores. This simplifies the code and makes it a lot safer too.
I have no numbers to prove that this improves performance (It seems a little futile to benchmark something like this), although it should do that on both targets it applies to.
That said, it also removes a bunch of unsafe code and simplifies the logic (arguably at least — there are only two states now, initialized or not), so I think it's a net win even without concrete numbers.
If you want to take a look at the assembly output though, see below. It has the new version, the original, and a version of the original with lower Orderings (which is still worse than the version in this PR)
- godbolt.org/z/obfqf9 x86_64-apple-darwin
- godbolt.org/z/Wz5cWc aarch64-unknown-linux-gnu (godbolt can't do aarch64-apple-ios but that doesn't matter here)
A different (and more efficient) option than this would be to just use the AtomicU64 and use the knowledge that after initialization the denominator should be nonzero... That felt like it's relying on too many things I'm not confident in, so I didn't want to do that.
</details>
rust-lang/rust#77147 simplifies things by splitting this Mutex type
into two types matching the two use cases: StaticMutex and MovableMutex.
To support the behavior of StaticMutex, we move part of the mutex
implementation into libstd.
Allow generic parameters in intra-doc links
Fixes#62834.
---
The contents of the generics will be mostly ignored (except for warning
if fully-qualified syntax is used, which is currently unsupported in
intra-doc links - see issue #74563).
* Allow links like `Vec<T>`, `Result<T, E>`, and `Option<Box<T>>`
* Allow links like `Vec::<T>::new()`
* Warn on
* Unbalanced angle brackets (e.g. `Vec<T` or `Vec<T>>`)
* Missing type to apply generics to (`<T>` or `<Box<T>>`)
* Use of fully-qualified syntax (`<Vec as IntoIterator>::into_iter`)
* Invalid path separator (`Vec:<T>:new`)
* Too many angle brackets (`Vec<<T>>`)
* Empty angle brackets (`Vec<>`)
Note that this implementation *does* allow some constructs that aren't
valid in the actual Rust syntax, for example `Box::<T>new()`. That may
not be supported in rustdoc in the future; it is an implementation
detail.
doc: disambiguate stat in MetadataExt::as_raw_stat
A few architectures in `os::linux::raw` import `libc::stat`, rather than
defining that type directly. However, that also imports the _function_
called `stat`, which makes this doc link ambiguous:
error: `crate::os::linux::raw::stat` is both a struct and a function
--> library/std/src/os/linux/fs.rs:21:19
|
21 | /// [`stat`]: crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ambiguous link
|
= note: `-D broken-intra-doc-links` implied by `-D warnings`
help: to link to the struct, prefix with the item type
|
21 | /// [`stat`]: struct@crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to link to the function, add parentheses
|
21 | /// [`stat`]: crate::os::linux::raw::stat()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We want the `struct`, so it's now prefixed accordingly.
fix __rust_alloc_error_handler comment
`__rust_alloc_error_handler` was added in the same `extern` block as the allocator functions, but the comment there was not actually correct for `__rust_alloc_error_handler`. So move it down to the rest of the default allocator handling with a fixed comment. At least the comment reflects my understanding of what happens, please check carefully. :)
r? @Amanieu Cc @haraldh
Link to documentation-specific guidelines.
Changed contribution information URL because it's not obvious how to get from the current URL to the documentation-specific content.
The current URL points to this "Getting Started" page, which contains nothing specific about documentation[*] and instead launches into how to *build* `rustc` which is not a strict prerequisite for contributing documentation fixes:
* https://rustc-dev-guide.rust-lang.org/getting-started.html
[*] The most specific content is a "Writing documentation" bullet point which is not itself a link to anything (I guess a patch for that might be helpful too).
### Why?
Making this change will make it easier for people who wish to make small "drive by" documentation fixes (and read contribution guidelines ;) ) which I find are often how I start contributing to a project. (Exhibit A: https://github.com/rust-lang/rust/pull/77050 :) )
### Background
My impression is the change of content linked is an unintentional change due to a couple of other changes:
* Originally, the link pointed to `contributing.md` which started with a "table of contents" linking to each section. But the content in `contributing.md` was removed and replaced with a link to the "Getting Started" section here:
* 3f6928f1f6 (diff-6a3371457528722a734f3c51d9238c13L1)
But the changed link doesn't actually point to the equivalent content, which is now located here:
* https://rustc-dev-guide.rust-lang.org/contributing.html
(If the "Guide to Rustc Development" is now considered the canonical location of "How to Contribute" content it might be a good idea to merge some of the "Contributing" Introduction section into the "Getting Started" section.)
* This was then compounded by changing the link from `contributing.md` to `contributing.html` here:
* https://github.com/rust-lang/rust/pull/74037/files#diff-242481015141f373dcb178e93cffa850L88
In order to even find the new location of the previous `contributing.md` content I ended up needing to do a GitHub search of the `rust-lang` org for the phrase "Documentation improvements are very welcome". :D
Fix error checking in posix_spawn implementation of Command
* Check for errors returned from posix_spawn*_init functions
* Check for non-zero return value from posix_spawn functions
A few architectures in `os::linux::raw` import `libc::stat`, rather than
defining that type directly. However, that also imports the _function_
called `stat`, which makes this doc link ambiguous:
error: `crate::os::linux::raw::stat` is both a struct and a function
--> library/std/src/os/linux/fs.rs:21:19
|
21 | /// [`stat`]: crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^ ambiguous link
|
= note: `-D broken-intra-doc-links` implied by `-D warnings`
help: to link to the struct, prefix with the item type
|
21 | /// [`stat`]: struct@crate::os::linux::raw::stat
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
help: to link to the function, add parentheses
|
21 | /// [`stat`]: crate::os::linux::raw::stat()
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
We want the `struct`, so it's now prefixed accordingly.
`DirEntry` contains a `ReadDir` handle, which used to just be a wrapper
on `Arc<InnerReadDir>`. Commit af75314ecd added `end_of_stream: bool`
which is not needed by `DirEntry`, but adds 8 bytes after padding. We
can let `DirEntry` have an `Arc<InnerReadDir>` directly to avoid that.
The posix_spawnattr_init & posix_spawn_file_actions_init might fail,
but their return code is not checked.
Check for non-zero return code and destroy only succesfully initialized
objects.
The cvt function compares the argument with -1 and when equal returns a new
io::Error constructed from errno. It is used together posix_spawn_* functions.
This is incorrect. Those functions do not set errno. Instead they return
non-zero error code directly.
Check for non-zero return code and use it to construct a new io::Error.
(docs): make mutex error comment consistent with codebase
Although exceptionally minor, I found this stands out from other error reporting language used in doc comments. With the existence of the `failure` crate, I suppose this could be slightly ambiguous. In any case, this change brings the particular comment into a consistent state with other mentions of returning errors.
BTreeMap: comment why drain_filter's size_hint is somewhat pessimistic
The `size_hint` of the `DrainFilter` iterator doesn't adjust as you iterate. This hardly seems important to me, but there has been a comparable PR #64383 in the past. I guess a scenario is that you first iterate half the map manually and keep most of the key/value pairs in the map, and then tell the predicate to drain most of the key/value pairs and `.collect` the iterator over the remaining half of the map.
I am totally ambivalent whether this is better or not.
r? @Mark-Simulacrum
Give `impl Trait` in a `const fn` its own feature gate
...previously it was gated under `#![feature(const_fn)]`.
I think we actually want to do this in all const-contexts? If so, this should be `#![feature(const_impl_trait)]` instead. I don't think there's any way to make use of `impl Trait` within a `const` initializer.
cc #77463
r? `@oli-obk`
Eliminate bounds checking in slice::Windows
This is how `<core::slice::Windows as Iterator>::next` looks right now:
```rust
fn next(&mut self) -> Option<&'a [T]> {
if self.size > self.v.len() {
None
} else {
let ret = Some(&self.v[..self.size]);
self.v = &self.v[1..];
ret
}
}
```
The line with `self.v = &self.v[1..];` relies on assumption that `self.v` is definitely not empty at this point. Else branch is taken when `self.size <= self.v.len()`, so `self.v` can be empty if `self.size` is zero. In practice, since `Windows` is never created directly but rather trough `[T]::windows` which panics when `size` is zero, `self.size` is never zero. However, the compiler doesn't know about this check, so it keeps the code which checks bounds and panics.
Using `NonZeroUsize` lets the compiler know about this invariant and reliably eliminate bounds checking without `unsafe` on `-O2`. Here is assembly of `Windows<'a, u32>::next` before and after this change ([goldbolt](https://godbolt.org/z/xrefzx)):
<details>
<summary>Before</summary>
```
example::next:
push rax
mov rcx, qword ptr [rdi + 8]
mov rdx, qword ptr [rdi + 16]
cmp rdx, rcx
jbe .LBB0_2
xor eax, eax
pop rcx
ret
.LBB0_2:
test rcx, rcx
je .LBB0_5
mov rax, qword ptr [rdi]
mov rsi, rax
add rsi, 4
add rcx, -1
mov qword ptr [rdi], rsi
mov qword ptr [rdi + 8], rcx
pop rcx
ret
.LBB0_5:
lea rdx, [rip + .L__unnamed_1]
mov edi, 1
xor esi, esi
call qword ptr [rip + core::slice::slice_index_order_fail@GOTPCREL]
ud2
.L__unnamed_2:
.ascii "./example.rs"
.L__unnamed_1:
.quad .L__unnamed_2
.asciz "\f\000\000\000\000\000\000\000\016\000\000\000\027\000\000"
```
</details>
<details>
<summary>After</summary>
```
example::next:
mov rcx, qword ptr [rdi + 8]
mov rdx, qword ptr [rdi + 16]
cmp rdx, rcx
jbe .LBB0_2
xor eax, eax
ret
.LBB0_2:
mov rax, qword ptr [rdi]
lea rsi, [rax + 4]
add rcx, -1
mov qword ptr [rdi], rsi
mov qword ptr [rdi + 8], rcx
ret
```
</details>
Note the lack of call to `core::slice::slice_index_order_fail` in second snippet.
#### Possible reasons _not_ to merge this PR:
* this changes the error message on panic in `[T]::windows`. However, AFAIK this messages are not covered by backwards compatibility policy.
Add PartialEq impls for Vec <-> slice
This is a follow-up to #71660 and rust-lang/rfcs#2917 to add two more missing vec/slice PartialEq impls:
```
impl<A, B> PartialEq<[B]> for Vec<A> where A: PartialEq<B> { .. }
impl<A, B> PartialEq<Vec<B>> for [A] where A: PartialEq<B> { .. }
```
Since this is insta-stable, it should go through the `@rust-lang/libs` FCP process. Note that I used version 1.47.0 for the `stable` attribute because I assume this will not merge before the 1.46.0 branch is cut next week.
Remove `Box::leak_with_alloc`
Add leak-test for box with allocator
Rename `AllocErr` to `AllocError` in leak-test
Add `Box::alloc` and adjust examples to use the new API
Rollup of 11 pull requests
Successful merges:
- #76784 (Add some docs to rustdoc::clean::inline and def_id functions)
- #76911 (fix VecDeque::iter_mut aliasing issues)
- #77400 (Fix suggestions for x.py setup)
- #77515 (Update to chalk 0.31)
- #77568 (inliner: use caller param_env)
- #77571 (Use matches! for core::char methods)
- #77582 (Move `EarlyOtherwiseBranch` to mir-opt-level 2)
- #77590 (Update RLS and Rustfmt)
- #77605 (Fix rustc_def_path to show the full path and not the trimmed one)
- #77614 (Let backends access span information)
- #77624 (Add c as a shorthand check alternative for new options #77603)
Failed merges:
r? `@ghost`
Support static linking with glibc and target-feature=+crt-static
With this change, it's possible to build on a linux-gnu target and pass
RUSTFLAGS='-C target-feature=+crt-static' or the equivalent via a
`.cargo/config.toml` file, and get a statically linked executable.
Update to libc 0.2.78, which adds support for static linking with glibc.
Add `crt_static_respected` to the `linux_base` target spec.
Update `android_base` and `linux_musl_base` accordingly. Avoid enabling
crt_static_respected on Android platforms, since that hasn't been
tested.
Closes https://github.com/rust-lang/rust/issues/65447.
Implement advance_by, advance_back_by for iter::Chain
Part of #77404.
This PR does two things:
- implement `Chain::advance[_back]_by` in terms of `advance[_back]_by` on `self.a` and `advance[_back]_by` on `self.b`
- change `Chain::nth[_back]` to use `advance[_back]_by` on `self.a` and `nth[_back]` on `self.b`
This ensures that `Chain::nth` can take advantage of an efficient `nth` implementation on the second iterator, in case it doesn't implement `advance_by`.
cc `@scottmcm` in case you want to review this
Replace some once(x).chain(once(y)) with [x, y] IntoIter
Now that we have by-value array iterators that are [already used](25c8c53dd9/compiler/rustc_hir/src/def.rs (L305-L307))...
For example,
```diff
- once(self.type_ns).chain(once(self.value_ns)).chain(once(self.macro_ns)).filter_map(|it| it)
+ IntoIter::new([self.type_ns, self.value_ns, self.macro_ns]).filter_map(|it| it)
```
BTreeMap: admit the existence of leaf edges in comments
The btree code is ambiguous about leaf edges (i.e., edges within leaf nodes). Iteration relies on them heavily, but some of the comments suggest there are no leaf edges (extracted from #77025)
r? @Mark-Simulacrum
Use more intra-doc-links in `core::fmt`
This is a follow-up to #75819, which encountered some broken links due to #75176, so this PR contains the links that are blocked on #75176.
r? @jyn514
Allows getting the slices directly, rather than just through an iterator as in `array_chunks(_mut)`. The constructors for those iterators are then written in terms of these methods, so the iterator constructors no longer have any `unsafe` of their own.
Hint the maximum length permitted by invariant of slices
One of the safety invariants of references, and in particular of references to slices, is that they may not cover more than `isize::MAX` bytes. The unsafe `from_raw_parts` constructors of slices explicitly requires the caller to guarantee this fact. Violating it would also be UB with regards to the semantics of generated llvm code.
This effectively bounds the length of a (non-ZST) slice from above by a compile time constant. But when the length is loaded from a function argument it appears llvm is not aware of this requirement. The additional value range assertions allow some further elision of code branches, including overflow checks, especially in the presence of artithmetic on the indices.
This may have a performance impact, adding more code to a common method but allowing more optimization. I'm not quite sure, is the Rust side of const-prop strong enough to elide the irrelevant match branches?
Fixes: #67186
Uses assume to check the length against a constant upper bound. The
inlined result then informs the optimizer of the sound value range.
This was tried with unreachable_unchecked before which introduces a
branch. This has the advantage of not being executed in sound code but
complicates basic blocks. It resulted in ~2% increased compile time in
some worst cases.
Add a codegen test for the assumption, testing the issue from #67186
Rollup of 8 pull requests
Successful merges:
- #77072 (Minor `hash_map` doc adjustments + item attribute orderings)
- #77368 (Backport LLVM apfloat commit to rustc_apfloat)
- #77445 (BTreeMap: complete the compile-time test_variance test case)
- #77504 (Support vectors with fewer than 8 elements for simd_select_bitmask)
- #77513 (Change DocFragments from enum variant fields to structs with a nested enum)
- #77518 (Only use Fira Sans for the first `td` in item lists)
- #77521 (Move target feature whitelist from cg_llvm to cg_ssa)
- #77525 (Enable RenameReturnPlace MIR optimization on mir-opt-level >= 2)
Failed merges:
r? `@ghost`
BTreeMap: complete the compile-time test_variance test case
Some of the items added to the new `test_sync` belonged in the old `test_variance` as well. And fixed inconsistent paths to nearby modules.
r? @Mark-Simulacrum
Minor `hash_map` doc adjustments + item attribute orderings
This PR is really a couple visual changes glued together:
1. Some of the doc comments for items in `std::collections::hash_map` referenced the names of types without escaping their formatting (e.g. using "VacantEntry" instead of "`VacantEntry`") - the ones I could find were changed to the latter
2. The vast majority of pre-item attributes seem to place doc comments as the first attribute (instead of things like `#[feature(...)]`), so the few that had the other order were changed.
3. Also ordering related: the general trend seems to be that `#[feature]` attributes follow `#[inline]`, so I swapped the two lines in places where that ordering was reversed. This is primarily a change based on stylistic continuity and aesthetics - I'm not sure how important that actually is / should be.
I figured this would be pretty uncontroversial, but some of these might have been intentional for reasons I don't know about - if so, I'd be happy to remove the relevant changes. Of these, the final set of changes is probably the most unnecessary, so it also might be better to leave those out (in favor of reducing code churn).
Implement as_ne_bytes() for integers and floats
This is related to issue #64464.
I am pretty sure that these functions are actually const-ify-able, and technically as_bits() can also be implemented for floats, but I might need some comments on both.
Implement Make `handle_alloc_error` default to panic (for no_std + liballoc)
Related: https://github.com/rust-lang/rust/issues/66741
Guarded with `#![feature(default_alloc_error_handler)]` a default
`alloc_error_handler` is called, if a custom allocator is used and no
other custom `#[alloc_error_handler]` is defined.
Unbox mutexes and condvars on some platforms
Both mutexes and condition variables contained a Box containing the actual os-specific object. This was done because moving these objects may cause undefined behaviour on some platforms.
However, this is not needed on Windows[1], Wasm[2], cloudabi[2], and 'unsupported'[3], were the box was only needlessly making them less efficient.
This change gets rid of the box on those platforms.
On those platforms, `Condvar` can no longer verify it is only used with one `Mutex`, as mutexes no longer have a stable address. This was addressed and considered acceptable in #76932.
[1]\: https://docs.microsoft.com/en-us/windows/win32/api/synchapi/nf-synchapi-initializesrwlock
[2]\: These are just a single atomic integer together with futex wait/wake calls/instructions.
[3]\: The `unsupported` platform doesn't support multiple threads at all.
Use less divisions in display u128/i128
This PR is an absolute mess, and I need to test if it improves the speed of fmt::Display for u128/i128, but I think it's correct.
It hopefully is more efficient by cutting u128 into at most 2 u64s, and also chunks by 1e16 instead of just 1e4.
Also I specialized the implementations for uints to always be non-false because it bothered me that it was checked at all
Do not merge until I benchmark it and also clean up the god awful mess of spaghetti.
Based on prior work in #44583
cc: `@Dylan-DPC`
Due to work on `itoa` and suggestion in original issue:
r? `@dtolnay`
Allow Weak::as_ptr and friends for unsized T
Relaxes `impl<T> Weak<T>` to `impl<T: ?Sized> Weak<T>` for the methods `rc::Weak::as_ptr`, `into_raw`, and `from_raw`.
Follow-up to #73845, which did most of the impl work to make these functions work for `T: ?Sized`.
We still have to adjust the implementation of `Weak::from_raw` here, however, because I missed a use of `ptr.is_null()` previously. This check was necessary when `into`/`from_raw` were first implemented, as `into_raw` returned `ptr::null()` for dangling weak. However, we now just (wrapping) offset dangling weaks' pointers the same as nondangling weak, so the null check is no longer necessary (or even hit). (I can submit just 17a928f as a separate PR if desired.)
As a nice side effect, moves the `fn is_dangling` definition closer to `Weak::new`, which creates the dangling weak.
This technically stabilizes that "something like `align_of_val_raw`" is possible to do. However, I believe the part of the functionality required by these methods here -- specifically, getting the alignment of a pointee from a pointer where it may be dangling iff the pointee is `Sized` -- is uncontroversial enough to stabilize these methods without a way to implement them on stable Rust.
r? `@RalfJung,` who reviewed #73845.
ATTN: This changes (relaxes) the (input) generic bounds on stable fn!
Remove --cfg dox from rustdoc.rs
This was added in https://github.com/rust-lang/rust/pull/53076 because
several dependencies were using `cfg(dox)` instead of `cfg(rustdoc)` (now `cfg(doc)`).
I ran `rg 'cfg\(dox\)'` on the source tree with no matches, so I think
this is now safe to remove.
r? `@Mark-Simulacrum`
cc `@QuietMisdreavus` :)
Only use LOCAL_{STDOUT,STDERR} when set_{print/panic} is used.
The thread local `LOCAL_STDOUT` and `LOCAL_STDERR` are only used by the `test` crate to capture output from tests when running them in the same process in differen threads. However, every program will check these variables on every print, even outside of testing.
This involves allocating a thread local key, and registering a thread local destructor. This can be somewhat expensive.
This change keeps a global flag (`LOCAL_STREAMS`) which will be set to `true` when either of these local streams is used. (So, effectively only in test and benchmark runs.) When this flag is off, these thread locals are not even looked at and therefore will not be initialized on the first output on every thread, which also means no thread local destructors will be registered.
---
Together with https://github.com/rust-lang/rust/pull/77154, this should make output a little bit more efficient.
Fix Debug implementations of some of the HashMap and BTreeMap iterator types
HashMap's `ValuesMut`, BTreeMaps `ValuesMut`, IntoValues and `IntoKeys` structs were printing both keys and values on their Debug implementations. But they are iterators over either keys or values. Irrelevant values should not be visible. With this PR, they only show relevant fields.
This fixes#75297.
[Here's an example code.](https://play.rust-lang.org/?version=nightly&mode=debug&edition=2018&gist=0c79356ed860e347a0c1a205616f93b7) This prints this on nightly:
```
ValuesMut { inner: IterMut { range: [(1, "hello"), (2, "goodbye")], length: 2 } }
IntoKeys { inner: [(1, "hello"), (2, "goodbye")] }
IntoValues { inner: [(1, "hello"), (2, "goodbye")] }
[(2, "goodbye"), (1, "hello")]
```
After the patch this example prints these instead:
```
["hello", "goodbye"]
["hello", "goodbye"]
[1, 2]
["hello", "goodbye"]
```
I didn't add test cases for them, since I couldn't see any tests for Debug implementations anywhere. But please let me know if I should add it to a specific place.
r? @dtolnay
Use posix_spawn on musl targets
The posix_spawn had been available in a form suitable for use in a
Command implementation since musl 0.9.12. Use it in a preference to a
fork when possible, to benefit from CLONE_VM|CLONE_VFORK used there.
Previously `Command::spawn` would fall back to the non-posix_spawn based
implementation if the `PATH` environment variable was possibly changed.
On systems with a modern (g)libc `posix_spawn()` can be significantly
faster. If program is a path itself the `PATH` environment variable is
not used for the lookup and it should be safe to use the
`posix_spawnp()` method. [1]
We found this, because we have a cli application that effectively runs a
lot of subprocesses. It would sometimes noticeably hang while printing
output. Profiling showed that the process was spending the majority of
time in the kernel's `copy_page_range` function while spawning
subprocesses. During this time the process is completely blocked from
running, explaining why users were reporting the cli app hanging.
Through this we discovered that `std::process::Command` has a fast and
slow path for process execution. The fast path is backed by
`posix_spawnp()` and the slow path by fork/exec syscalls being called
explicitly. Using fork for process creation is supposed to be fast, but
it slows down as your process uses more memory. It's not because the
kernel copies the actual memory from the parent, but it does need to
copy the references to it (see `copy_page_range` above!). We ended up
using the slow path, because the command spawn implementation in falls
back to the slow path if it suspects the PATH environment variable was
changed.
Here is a smallish program demonstrating the slowdown before this code
change:
```
use std::process::Command;
use std::time::Instant;
fn main() {
let mut args = std::env::args().skip(1);
if let Some(size) = args.next() {
// Allocate some memory
let _xs: Vec<_> = std::iter::repeat(0)
.take(size.parse().expect("valid number"))
.collect();
let mut command = Command::new("/bin/sh");
command
.arg("-c")
.arg("echo hello");
if args.next().is_some() {
println!("Overriding PATH");
command.env("PATH", std::env::var("PATH").expect("PATH env var"));
}
let now = Instant::now();
let child = command
.spawn()
.expect("failed to execute process");
println!("Spawn took: {:?}", now.elapsed());
let output = child.wait_with_output().expect("failed to wait on process");
println!("Output: {:?}", output);
} else {
eprintln!("Usage: prog [size]");
std::process::exit(1);
}
()
}
```
Running it and passing different amounts of elements to use to allocate
memory shows that the time taken for `spawn()` can differ quite
significantly. In latter case the `posix_spawnp()` implementation is 30x
faster:
```
$ cargo run --release 10000000
...
Spawn took: 324.275µs
hello
$ cargo run --release 10000000 changepath
...
Overriding PATH
Spawn took: 2.346809ms
hello
$ cargo run --release 100000000
...
Spawn took: 387.842µs
hello
$ cargo run --release 100000000 changepath
...
Overriding PATH
Spawn took: 13.434677ms
hello
```
[1]: 5f72f9800b/posix/execvpe.c (L81)
Add accessors to Command.
This adds some accessor methods to `Command` to provide a way to access the values set when building the `Command`. An example where this can be useful is to display the command to be executed. This is roughly based on the [`ProcessBuilder`](13b73cdaf7/src/cargo/util/process_builder.rs (L105-L134)) in Cargo.
Possible concerns about the API:
- Values with NULs on Unix will be returned as `"<string-with-nul>"`. I don't think it is practical to avoid this, since otherwise a whole separate copy of all the values would need to be kept in `Command`.
- Does not handle `arg0` on Unix. This can be awkward to support in `get_args` and is rarely used. I figure if someone really wants it, it can be added to `CommandExt` as a separate method.
- Does not offer a way to detect `env_clear`. I'm uncertain if it would be useful for anyone.
- Does not offer a way to get an environment variable by name (`get_env`). I figure this can be added later if anyone really wants it. I think the motivation for this is weak, though. Also, the API could be a little awkward (return a `Option<Option<&OsStr>>`?).
- `get_envs` could skip "cleared" entries and just return `&OsStr` values instead of `Option<&OsStr>`. I'm on the fence here. My use case is to display a shell command, and I only intend it to be roughly equivalent to the actual execution, and I probably won't display `None` entries. I erred on the side of providing extra information, but I suspect many situations will just filter out the `None`s.
- Could implement more iterator stuff (like `DoubleEndedIterator`).
I have not implemented new std items before, so I'm uncertain if the existing issue should be reused, or if a new tracking issue is needed.
cc #44434