Call the OS function to set the main thread's name on program init
Normally, `Thread::spawn` takes care of setting the thread's name, if
one was provided, but since the main thread wasn't created by calling
`Thread::spawn`, we need to call that function in `std::rt::init`.
This is mainly useful for system tools like debuggers and profilers
which might show the thread name to a user. Prior to these changes, gdb
and WinDbg would show all thread names except the main thread's name to
a user. I've validated that this patch resolves the issue for both
debuggers.
Lazily allocate and initialize pthread locks.
Lazily allocate and initialize pthread locks.
This allows {Mutex, Condvar, RwLock}::new() to be const, while still using the platform's native locks for features like priority inheritance and debug tooling. E.g. on macOS, we cannot directly use the (private) APIs that pthread's locks are implemented with, making it impossible for us to use anything other than pthread while still preserving priority inheritance, etc.
This PR doesn't yet make the public APIs const. That's for a separate PR with an FCP.
Tracking issue: https://github.com/rust-lang/rust/issues/93740
Tweak insert docs
For `{Hash, BTree}Map::insert`, I always have to take a few extra seconds to think about the slight weirdness about the fact that if we "did not" insert (which "sounds" false), we return true, and if we "did" insert, (which "sounds" true), we return false.
This tweaks the doc comments for the `insert` methods of those types (as well as what looks like a rustc internal data structure that I found just by searching the codebase for "If the set did") to first use the "Returns whether _something_" pattern used in e.g. `remove`, where we say that `remove` "returns whether the value was present".
Expose `get_many_mut` and `get_many_unchecked_mut` to HashMap
This pull-request expose the function [`get_many_mut`](https://docs.rs/hashbrown/0.12.0/hashbrown/struct.HashMap.html#method.get_many_mut) and [`get_many_unchecked_mut`](https://docs.rs/hashbrown/0.12.0/hashbrown/struct.HashMap.html#method.get_many_unchecked_mut) from `hashbrown` to the standard library `HashMap` type. They obviously keep the same API and are added under the (new) `map_many_mut` feature.
- `get_many_mut`: Attempts to get mutable references to `N` values in the map at once.
- `get_many_unchecked_mut`: Attempts to get mutable references to `N` values in the map at once, without validating that the values are unique.
library/std: Bump compiler_builtins
Some neat changes include faster float conversions & fixes for AVR 🙂
(note that's it's my first time upgrading `compiler_builtins`, so I'm not 100% sure if bumping `library/std/Cargo.toml` is enough; certainly seems to be so, though.)
Put a bound on collection misbehavior
As currently written, when a logic error occurs in a collection's trait parameters, this allows *completely arbitrary* misbehavior, so long as it does not cause undefined behavior in std. However, because the extent of misbehavior is not specified, it is allowed for *any* code in std to start misbehaving in arbitrary ways which are not formally UB; consider the theoretical example of a global which gets set on an observed logic error. Because the misbehavior is only bound by not resulting in UB from safe APIs and the crate-level encapsulation boundary of all of std, this makes writing user unsafe code that utilizes std theoretically impossible, as it now relies on undocumented QOI (quality of implementation) that unrelated parts of std cannot be caused to misbehave by a misuse of std::collections APIs.
In practice, this is a nonconcern, because std has reasonable QOI and an implementation that takes advantage of this freedom is essentially a malicious implementation and only compliant by the most langauage-lawyer reading of the documentation.
To close this hole, we just add a small clause to the existing logic error paragraph that ensures that any misbehavior is limited to the collection which observed the logic error, making it more plausible to prove the soundness of user unsafe code.
This is not meant to be formal; a formal refinement would likely need to mention that values derived from the collection can also misbehave after a logic error is observed, as well as define what it means to "observe" a logic error in the first place. This fix errs on the side of informality in order to close the hole without complicating a normal reading which can assume a reasonable nonmalicious QOI.
See also [discussion on IRLO][1].
[1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640
r? rust-lang/libs-api ```@rustbot``` label +T-libs-api -T-libs
This technically adds a new guarantee to the documentation, though I argue as written it's one already implicitly provided.
Rollup of 6 pull requests
Successful merges:
- #97089 (Improve settings theme display)
- #97229 (Document the current aliasing rules for `Box<T>`.)
- #97371 (Suggest adding a semicolon to a closure without block)
- #97455 (Stabilize `toowned_clone_into`)
- #97565 (Add doc alias `memset` to `write_bytes`)
- #97569 (Remove `memset` alias from `fill_with`.)
Failed merges:
r? `@ghost`
`@rustbot` modify labels: rollup
Implement [OsStr]::join
Implements join for `OsStr` and `OsString` slices:
```Rust
let strings = [OsStr::new("hello"), OsStr::new("dear"), OsStr::new("world")];
assert_eq!("hello dear world", strings.join(OsStr::new(" ")));
````
This saves one from converting to strings and back, or from implementing it manually.
This PR has been re-filed after #96744 was first accidentally merged and then reverted.
The change is instantly stable and thus:
r? rust-lang/libs-api `@rustbot` label +T-libs-api -T-libs
cc `@thomcc` `@m-ou-se` `@faptc`
Remove "sys isn't exported yet" phrase
The oldest occurence is from 9e224c2bf1,
which is from the pre-1.0 days. In the years since then, std::sys still
hasn't been exported, and the last attempt was met with strong criticism:
https://github.com/rust-lang/rust/pull/97151
Thus, removing the "yet" part makes a lot of sense.
Use Box::new() instead of box syntax in library tests
The tests inside `library/*` have no reason to use `box` syntax as they have 0 performance relevance. Therefore, we can safely remove them (instead of having to use alternatives like the one in #97293).
The oldest occurence is from 9e224c2bf1,
which is from the pre-1.0 days. In the years since then, std::sys still
hasn't been exported, and the last attempt was met with strong criticism:
https://github.com/rust-lang/rust/pull/97151
Thus, removing the "yet" part makes a lot of sense.
Finish bumping stage0
It looks like the last time had left some remaining cfg's -- which made me think
that the stage0 bump was actually successful. This brings us to a released 1.62
beta though.
This now brings us to cfg-clean, with the exception of check-cfg-features in bootstrap;
I'd prefer to leave that for a separate PR at this time since it's likely to be more tricky.
cc https://github.com/rust-lang/rust/pull/97147#issuecomment-1132845061
r? `@pietroalbini`
Normally, `Thread::spawn` takes care of setting the thread's name, if
one was provided, but since the main thread wasn't created by calling
`Thread::spawn`, we need to call that function in `std::rt::init`.
This is mainly useful for system tools like debuggers and profilers
which might show the thread name to a user. Prior to these changes, gdb
and WinDbg would show all thread names except the main thread's name to
a user. I've validated that this patch resolves the issue for both
debuggers.
It looks like the last time had left some remaining cfg's -- which made me think
that the stage0 bump was actually successful. This brings us to a released 1.62
beta though.
Add section on common message styles for Result::expect
Based on a question from https://github.com/rust-lang/project-error-handling/issues/50#issuecomment-1092339937
~~One thing I haven't decided on yet, should I duplicate this section on `Option::expect`, link to this section, or move it somewhere else and link to that location from both docs?~~: I ended up moving the section to `std::error` and referencing it from both `Result::expect` and `Option::expect`'s docs.
I think this section, when combined with the similar update I made on [`std::panic!`](https://doc.rust-lang.org/nightly/std/macro.panic.html#when-to-use-panic-vs-result) implies that we should possibly more aggressively encourage and support the "expect as precondition" style described in this section. The consensus among the libs team seems to be that panic should be used for bugs, not expected potential failure modes. The "expect as error message" style seems to align better with the panic for unrecoverable errors style where they're seen as normal errors where the only difference is a desire to kill the current execution unit (aka erlang style error handling). I'm wondering if we should be providing a panic hook similar to `human-panic` or more strongly recommending the "expect as precondition" style of expect message.
explain how to turn integers into fn ptrs
(with an intermediate raw ptr, not a direct transmute)
Direct int2ptr transmute, under the semantics I am imagining, will produce a ptr with "invalid" provenance that is invalid to deref or call. We cannot give it the same semantics as int2ptr casts since those do [something complicated](https://www.ralfj.de/blog/2022/04/11/provenance-exposed.html).
To my great surprise, that is already what the example in the `transmute` docs does. :) I still added a comment to say that that part is important, and I added a section explicitly talking about this to the `fn()` type docs.
With https://github.com/rust-lang/miri/pull/2151, Miri will start complaining about direct int-to-fnptr transmutes (in the sense that it is UB to call the resulting pointer).
As currently written, when a logic error occurs in a collection's trait
parameters, this allows *completely arbitrary* misbehavior, so long as
it does not cause undefined behavior in std. However, because the extent
of misbehavior is not specified, it is allowed for *any* code in std to
start misbehaving in arbitrary ways which are not formally UB; consider
the theoretical example of a global which gets set on an observed logic
error. Because the misbehavior is only bound by not resulting in UB from
safe APIs and the crate-level encapsulation boundary of all of std, this
makes writing user unsafe code that utilizes std theoretically
impossible, as it now relies on undocumented QOI that unrelated parts of
std cannot be caused to misbehave by a misuse of std::collections APIs.
In practice, this is a nonconcern, because std has reasonable QOI and an
implementation that takes advantage of this freedom is essentially a
malicious implementation and only compliant by the most langauage-lawyer
reading of the documentation.
To close this hole, we just add a small clause to the existing logic
error paragraph that ensures that any misbehavior is limited to the
collection which observed the logic error, making it more plausible to
prove the soundness of user unsafe code.
This is not meant to be formal; a formal refinement would likely need to
mention that values derived from the collection can also misbehave after a
logic error is observed, as well as define what it means to "observe" a
logic error in the first place. This fix errs on the side of informality
in order to close the hole without complicating a normal reading which
can assume a reasonable nonmalicious QOI.
See also [discussion on IRLO][1].
[1]: https://internals.rust-lang.org/t/using-std-collections-and-unsafe-anything-can-happen/16640
Document rounding for floating-point primitive operations and string parsing
The docs for floating point don't have much to say at present about either the precision of their results or rounding behaviour.
As I understand it[^1][^2], Rust doesn't support operating with non-default rounding directions, so we need only describe roundTiesToEven.
[^1]: https://github.com/rust-lang/rust/issues/41753#issuecomment-299322887
[^2]: https://github.com/llvm/llvm-project/issues/8472#issuecomment-980888781
This PR makes a start by documenting that for primitive operations and `from_str()`.
Use const initializer for LOCAL_PANIC_COUNT
This reduces the size of the __getit function for LOCAL_PANIC_COUNT and should speed up accesses of LOCAL_PANIC_COUNT a bit.
Make write/print macros eagerly drop temporaries
This PR fixes the 2 regressions in #96434 (`println` and `eprintln`) and changes all the other similar macros (`write`, `writeln`, `print`, `eprint`) to match the old pre-#94868 behavior of `println` and `eprintln`.
argument position | before #94868 | after #94868 | after this PR
--- |:---:|:---:|:---:
`write!($tmp, "…", …)` | 😡 | 😡 | 😺
`write!(…, "…", $tmp)` | 😡 | 😡 | 😺
`writeln!($tmp, "…", …)` | 😡 | 😡 | 😺
`writeln!(…, "…", $tmp)` | 😡 | 😡 | 😺
`print!("…", $tmp)` | 😡 | 😡 | 😺
`println!("…", $tmp)` | 😺 | 😡 | 😺
`eprint!("…", $tmp)` | 😡 | 😡 | 😺
`eprintln!("…", $tmp)` | 😺 | 😡 | 😺
`panic!("…", $tmp)` | 😺 | 😺 | 😺
Example of code that is affected by this change:
```rust
use std::sync::Mutex;
fn main() {
let mutex = Mutex::new(0);
print!("{}", mutex.lock().unwrap()) /* no semicolon */
}
```
You can see several real-world examples like this in the Crater links at the top of #96434. This code failed to compile prior to this PR as follows, but works after this PR.
```console
error[E0597]: `mutex` does not live long enough
--> src/main.rs:5:18
|
5 | print!("{}", mutex.lock().unwrap()) /* no semicolon */
| ^^^^^^^^^^^^---------
| |
| borrowed value does not live long enough
| a temporary with access to the borrow is created here ...
6 | }
| -
| |
| `mutex` dropped here while still borrowed
| ... and the borrow might be used here, when that temporary is dropped and runs the `Drop` code for type `MutexGuard`
```
Stabilize `Ipv6Addr::to_ipv4_mapped`
CC https://github.com/rust-lang/rust/issues/27709 (tracking issue for the `ip` feature which contains more
functions)
The function `Ipv6Addr::to_ipv4` is bad because it also returns an IPv4
address for the IPv6 loopback address `::1`. Stabilize
`Ipv6Addr::to_ipv4_mapped` so we can recommend that function instead.
Fix typo in futex RwLock::write_contended.
I wrote `state` where I should've used `s`.
This was spotted by `@Warrenren.`
This change removes the unnecessary `s` variable to prevent that mistake.
Fortunately, this typo didn't affect the correctness of the lock, as the
second half of the condition (!has_writers_waiting) is enough for
correctness, which explains why this mistake didn't show up during
testing.
Fixes https://github.com/rust-lang/rust/issues/97162
Fix rusty grammar in `std::error::Reporter` docs
### Commit
I initially saw "print's" instead of "prints" at the start of the doc comment for `std::error::Reporter`, while reading the docs for that type. Then I figured 'probably more where that came from', so, as well as correcting the foregoing to "prints", I've patched up these three minor solecisms (well, two [types](https://en.wikipedia.org/wiki/Type%E2%80%93token_distinction), three [tokens](https://en.wikipedia.org/wiki/Type%E2%80%93token_distinction)):
- One use of the indicative which should be subjunctive - indeed the sentence immediately following it, which mirrors its structure, _does_ use the subjunctive ([L871](https://github.com/rust-lang/rust/blob/master/library/std/src/error.rs?plain=1#L871)). Replaced with the subjunctive.
- Two separate clauses joined with commas ([L975](https://github.com/rust-lang/rust/blob/master/library/std/src/error.rs?plain=1#L975), [L1023](https://github.com/rust-lang/rust/blob/master/library/std/src/error.rs?plain=1#L1023)). Replaced the first with a semicolon and the second with a period. Admittedly those judgements are pretty much 100% subjective, based on my sense of how the sentences flowed into each other (though ofc the _replacement of the comma itself_ is not subjective or opinion-based).
I know this is silly and finicky, but I hope it helps tidy up the docs a bit for future readers!
### PR notes
**This is very much non-urgent (and, honestly, non-important).** I just figured it might be a nice quality-of-life improvement and bit of tidying up for the core contributors themselves not to have to do. 🙂
I'm tagging Steve, per the [contributing guidelines](https://rustc-dev-guide.rust-lang.org/contributing.html#r) ("Steve usually reviews documentation changes. So if you were to make a documentation change, add `r? `@steveklabnik`"):`
r? `@steveklabnik`
I wrote `state` where I should've used `s`.
This removes the unnecessary `s` variable to prevent that mistake.
Fortunately, this typo didn't affect the correctness of the lock, as the
second half of the condition (!has_writers_waiting) is enough for
correctness, which explains why this mistake didn't show up during
testing.
From reading the source code, it appears like the desired semantic of
std::unix::rand is to always provide some bytes and never block. For
that reason GRND_NONBLOCK is checked before calling getrandom(0), so
that getrandom(0) won't block. If it would block, then the function
falls back to using /dev/urandom, which for the time being doesn't
block. There are some drawbacks to using /dev/urandom, however, and so
getrandom(GRND_INSECURE) was created as a replacement for this exact
circumstance.
getrandom(GRND_INSECURE) is the same as /dev/urandom, except:
- It won't leave a warning in dmesg if used at early boot time, which is
a common occurance (and the reason why I found this issue);
- It won't introduce a tiny delay at early boot on newer kernels when
/dev/urandom tries to opportunistically create jitter entropy;
- It only requires 1 syscall, rather than 3.
Other than that, it returns the same "quality" of randomness as
/dev/urandom, and never blocks.
It's only available on kernels ≥5.6, so we try to use it, cache the
result of that attempt, and fall back to to the previous code if it
didn't work.
It is not obvious (at least for me) that complexity of iteration over hash tables depends on capacity and not length. Especially comparing with other containers like Vec or String. I think, this behaviour is worth mentioning.
I run benchmark which tests iteration time for maps with length 50 and different capacities and get this results:
```
capacity - time
64 - 203.87 ns
256 - 351.78 ns
1024 - 607.87 ns
4096 - 965.82 ns
16384 - 3.1188 us
```
If you want to dig why it behaves such way, you can look current implementation in [hashbrown code](f3a9f211d0/src/raw/mod.rs (L1933)).
Benchmarks code would be presented in PR related to this commit.
* For read and read_buf, only the front slice of a discontiguous
VecDeque is copied. The VecDeque is advanced after reading, making any
back slice available for reading with a second call to Read::read(_buf).
* For write, the VecDeque always appends the entire slice to the end,
growing its allocation when necessary.
Remove libstd's calls to `C-unwind` foreign functions
Remove all libstd and its dependencies' usage of `extern "C-unwind"`.
This is a prerequiste of a WIP PR which will forbid libraries calling `extern "C-unwind"` functions to be compiled in `-Cpanic=unwind` and linked against `panic_abort` (this restriction is necessary to address soundness bug #96926).
Cargo will ensure all crates are compiled with the same `-Cpanic` but the std is only compiled `-Cpanic=unwind` but needs the ability to be linked into `-Cpanic=abort`.
Currently there are two places where `C-unwind` is used in libstd:
* `__rust_start_panic` is used for interfacing to the panic runtime. This could be `extern "Rust"`
* `_{rdl,rg}_oom`: a shim `__rust_alloc_error_handler` will be generated by codegen to call into one of these; they can also be `extern "Rust"` (in fact, the generated shim is used as `extern "Rust"`, so I am not even sure why these are not, probably because they used to `extern "C"` and was changed to `extern "C-unwind"` when we allow alloc error hooks to unwind, but they really should just be using Rust ABI).
For dependencies, there is only one `extern "C-unwind"` function call, in `unwind` crate. This can be expressed as a re-export.
More dicussions can be seen in the Zulip thread: https://rust-lang.zulipchat.com/#narrow/stream/210922-project-ffi-unwind/topic/soundness.20in.20mixed.20panic.20mode
`@rustbot` label: T-libs F-c_unwind
Make HashMap fall back to RtlGenRandom if BCryptGenRandom fails
With PR #84096, Rust `std::collections::hash_map::RandomState` changed from using `RtlGenRandom()` ([msdn](https://docs.microsoft.com/en-us/windows/win32/api/ntsecapi/nf-ntsecapi-rtlgenrandom)) to `BCryptGenRandom()` ([msdn](https://docs.microsoft.com/en-us/windows/win32/api/bcrypt/nf-bcrypt-bcryptgenrandom)) as its source of secure randomness after much discussion ([here](https://github.com/rust-random/getrandom/issues/65#issuecomment-753634074), among other places).
Unfortunately, after that PR landed, Mozilla Firefox started experiencing fairly-rare crashes during startup while attempting to initialize the `env_logger` crate. ([docs for env_logger](https://docs.rs/env_logger/latest/env_logger/)) The root issue is that on some machines, `BCryptGenRandom()` will fail with an `Access is denied. (os error 5)` error message. ([Bugzilla issue 1754490](https://bugzilla.mozilla.org/show_bug.cgi?id=1754490)) (Discussion in issue #94098)
Note that this is happening upon startup of Firefox's unsandboxed Main Process, so this behavior is different and separate from previous issues ([like this](https://bugzilla.mozilla.org/show_bug.cgi?id=1746254)) where BCrypt DLLs were blocked by process sandboxing. In the case of sandboxing, we knew we were doing something abnormal and expected that we'd have to resort to abnormal measures to make it work.
However, in this case we are in a regular unsandboxed process just trying to initialize `env_logger` and getting a panic. We suspect that this may be caused by a virus scanner or some other security software blocking the loading of the BCrypt DLLs, but we're not completely sure as we haven't been able to replicate locally.
It is also possible that Firefox is not the only software affected by this; we just may be one of the pieces of Rust software that has the telemetry and crash reporting necessary to catch it.
I have read some of the historical discussion around using `BCryptGenRandom()` in Rust code, and I respect the decision that was made and agree that it was a good course of action, so I'm not trying to open a discussion about a return to `RtlGenRandom()`. Instead, I'd like to suggest that perhaps we use `RtlGenRandom()` as a "fallback RNG" in the case that BCrypt doesn't work.
This pull request implements this fallback behavior. I believe this would improve the robustness of this essential data structure within the standard library, and I see only 2 potential drawbacks:
1. Slight added overhead: It should be quite minimal though. The first call to `sys::rand::hashmap_random_keys()` will incur a bit of initialization overhead, and every call after will incur roughly 2 non-atomic global reads and 2 easily predictable branches. Both should be negligible compared to the actual cost of generating secure random numbers
2. `RtlGenRandom()` is deprecated by Microsoft: Technically true, but as mentioned in [this comment on GoLang](https://github.com/golang/go/issues/33542#issuecomment-626124873), this API is ubiquitous in Windows software and actually removing it would break lots of things. Also, Firefox uses it already in [our C++ code](https://searchfox.org/mozilla-central/rev/5f88c1d6977e03e22d3420d0cdf8ad0113c2eb31/mfbt/RandomNum.cpp#25), and [Chromium uses it in their code as well](https://source.chromium.org/chromium/chromium/src/+/main:base/rand_util_win.cc) (which transitively means that Microsoft uses it in their own web browser, Edge). If there did come a time when Microsoft truly removes this API, it should be easy enough for Rust to simply remove the fallback in the code I've added here
Fix use of SetHandleInformation on UWP
The use of `SetHandleInformation` (introduced in #96441 to make `HANDLE` inheritable) breaks UWP builds because it is not available for UWP targets.
Proposed workaround: duplicate the `HANDLE` with `inherit = true` and immediately close the old one. Traditional Windows Desktop programs are not affected.
cc `@ChrisDenton`
Add rustc_nonnull_optimization_guaranteed to Owned/Borrowed Fd/Socket
PR #94586 added support for using
`rustc_nonnull_optimization_guaranteed` on values where the "null" value
is the all-ones bitpattern.
Now that #94586 has made it to the stage0 compiler, add
`rustc_nonnull_optimization_guaranteed` to `OwnedFd`, `BorrowedFd`,
`OwnedSocket`, and `BorrowedSocket`, since these types all exclude
all-ones bitpatterns.
This allows `Option<OwnedFd>`, `Option<BorrowedFd>`, `Option<OwnedSocket>`,
and `Option<BorrowedSocket>` to be used in FFI declarations, as described
in the [I/O safety RFC].
[I/O safety RFC]: https://github.com/rust-lang/rfcs/blob/master/text/3128-io-safety.md#ownedfd-and-borrowedfdfd-1
ExitCode::exit_process() method
cc `@yaahc` / #93840
(eeek, hit ctrl-enter before I meant to and right after realizing the branch name was wrong. oh, well)
I feel like it makes sense to have the `exit(ExitCode)` function as a method or at least associated function on ExitCode, but maybe that would hurt discoverability? Probably not as much if it's at the top of the `process::exit()` documentation or something, but idk. Also very unsure about the name, I'd like something that communicates that you are exiting with *this* ExitCode, but with a method name being postfix it doesn't seem to flow. `code.exit_process_with()` ? `.exit_process_with_self()` ? Blech. Maybe it doesn't matter, since ideally just `code.exit()` or something would be clear simply by the name and single parameter but 🤷
Also I'd like to touch up the `ExitCode` docs (which I did a bit here), but that would probably be good in a separate PR, right? Since I think the beta deadline is coming up.
Clarify what values `BorrowedHandle`, `OwnedHandle` etc. can hold.
Reword the documentation to clarify that when `BorrowedHandle`, `OwnedHandle`, or `HandleOrNull` hold the value `-1`, it always means the current process handle, and not `INVALID_HANDLE_VALUE`.
`-1` should only mean `INVALID_HANDLE_VALUE` after a call to a function documented to return that to report errors, which should lead I/O functions to produce errors rather than succeeding and producing `OwnedHandle` or `BorrowedHandle` values. So if a consumer of an `OwnedHandle` or `BorrowedHandle` ever sees them holding a `-1`, it should always mean the current process handle.
PR #94586 added support for using
`rustc_nonnull_optimization_guaranteed` on values where the "null" value
is the all-ones bitpattern.
Now that #94586 has made it to the stage0 compiler, add
`rustc_nonnull_optimization_guaranteed` to `OwnedFd`, `BorrowedFd`,
`OwnedSocket`, and `BorrowedSocket`, since these types all exclude
all-ones bitpatterns.
This allows `Option<OwnedFd>`, `Option<BorrowedFd>`, `Option<OwnedSocket>`,
and `Option<BorrowedSocket>` to be used in FFI declarations, as described
in the [I/O safety RFC].
[I/O safety RFC]: https://github.com/rust-lang/rfcs/blob/master/text/3128-io-safety.md#ownedfd-and-borrowedfdfd-1
In the previous implementation, if the standard streams were open,
but the RLIMIT_NOFILE value was below three, the poll would fail
with EINVAL:
> ERRORS: EINVAL The nfds value exceeds the RLIMIT_NOFILE value.
Switch to the existing fcntl based implementation to avoid the issue.
Clarify that when `BorrowedHandle`, `OwnedHandle`, or `HandleOrNull`
hold the value `-1`, it always means the current process handle, and not
`INVALID_HANDLE_VALUE`.
Make `BorrowedFd::borrow_raw` a const fn.
Making `BorrowedFd::borrow_raw` a const fn allows it to be used to
create a constant `BorrowedFd<'static>` holding constants such as
`AT_FDCWD`. This will allow [`rustix::fs::cwd`] to become a const fn.
For consistency, make similar changes to `BorrowedHandle::borrow_raw`
and `BorrowedSocket::borrow_raw`.
[`rustix::fs::cwd`]: https://docs.rs/rustix/latest/rustix/fs/fn.cwd.html
r? `@joshtriplett`
CC #27709 (tracking issue for the `ip` feature which contains more
functions)
The function `Ipv6Addr::to_ipv4` is bad because it also returns an IPv4
address for the IPv6 loopback address `::1`. Stabilize
`Ipv6Addr::to_ipv4_mapped` so we can recommend that function instead.
Issue #84096 changed the hashmap RNG to use BCryptGenRandom instead of
RtlGenRandom on Windows.
Mozilla Firefox started experiencing random failures in
env_logger::Builder::new() (Issue #94098) during initialization of their
unsandboxed main process with an "Access Denied" error message from
BCryptGenRandom(), which is used by the HashMap contained in
env_logger::Builder
The root cause appears to be a virus scanner or other software interfering
with BCrypt DLLs loading.
This change adds a fallback option if BCryptGenRandom is unusable for
whatever reason. It will fallback to RtlGenRandom in this case.
Fixes#94098
Revert "Implement [OsStr]::join", which was merged without FCP.
This reverts commit 4fcbc53820, see https://github.com/rust-lang/rust/pull/96744. (I'm terribly sorry, and truly don't remember r+ing it, or even having seen it before yesterday, which is... genuinely very worrisome for me).
r? `@m-ou-se`
Improve floating point documentation
This is my attempt to improve/solve https://github.com/rust-lang/rust/issues/95468 and https://github.com/rust-lang/rust/issues/73328 .
Added/refined explanations:
- Refine the "NaN as a special value" top level explanation of f32
- Refine `const NAN` docstring: add an explanation about there being multitude of NaN bitpatterns and disclaimer about the portability/stability guarantees.
- Refine `fn is_sign_positive` and `fn is_sign_negative` docstrings: add disclaimer about the sign bit of NaNs.
- Refine `fn min` and `fn max` docstrings: explain the semantics and their relationship to the standard and libm better.
- Refine `fn trunc` docstrings: explain the semantics slightly more.
- Refine `fn powi` docstrings: add disclaimer that the rounding behaviour might be different from `powf`.
- Refine `fn copysign` docstrings: add disclaimer about payloads of NaNs.
- Refine `minimum` and `maximum`: add disclaimer that "propagating NaN" doesn't mean that propagating the NaN bit patterns is guaranteed.
- Refine `max` and `min` docstrings: add "ignoring NaN" to bring the one-row explanation to parity with `minimum` and `maximum`.
Cosmetic changes:
- Reword `NaN` and `NAN` as plain "NaN", unless they refer to the specific `const NAN`.
- Reword "a number" to `self` in function docstrings to clarify.
- Remove "Returns NAN if the number is NAN" from `abs`, as this is told to be the default behavior in the top explanation.
Remove `#[rustc_deprecated]`
This removes `#[rustc_deprecated]` and introduces diagnostics to help users to the right direction (that being `#[deprecated]`). All uses of `#[rustc_deprecated]` have been converted. CI is expected to fail initially; this requires #95958, which includes converting `stdarch`.
I plan on following up in a short while (maybe a bootstrap cycle?) removing the diagnostics, as they're only intended to be short-term.
Add more diagnostic items
This just adds a handful diagnostic items I noticed were missing.
Would it be worth doing this for all of the remaining types? I'm willing to do it if it'd be helpful.
Create clippy lint against unexpectedly late drop for temporaries in match scrutinee expressions
A new clippy lint for issue 93883 (https://github.com/rust-lang/rust/issues/93883). Relies on a new trait in `marker` (called `SignificantDrop` to enable linting), which is why this PR is for the rust-lang repo and not the clippy repo.
changelog: new lint [`significant_drop_in_scrutinee`]
Remove hard links from `env::current_exe` security example
The security example shows that `env::current_exe` will return the path used when the program was started. This is not really surprising considering how hard links work: after `ln foo bar`, the two files are _equivalent_. It is _not_ the case that `bar` is a “link” to `foo`, nor is `foo` a link to `bar`. They are simply two names for the same underlying data.
The security vulnerability linked to seems to be different: there an attacker would start a SUID binary from a directory under the control of the attacker. The binary would respawn itself by executing the program found at `/proc/self/exe` (which the attacker can control). This is a real problem. In my opinion, the example given here doesn’t really show the same problem, it just shows a misunderstanding of what hard links are.
I looked through the history a bit and found that the example was introduced in https://github.com/rust-lang/rust/pull/33526. That PR actually has two commits, and the first (8478d48dad) explains the race condition at the root of the linked security vulnerability. The second commit proceeds to replace the explanation with the example we have today.
This commit reverts most of the second commit from https://github.com/rust-lang/rust/pull/33526.
Add aliases for std::fs::canonicalize
The aliases are `realpath` and `GetFinalPathNameByHandle` which are explicitly mentioned in `canonicalize`'s documentation.
Use 64-bit time on 32-bit linux-gnu
The standard library suffered the [Year 2038 problem][Y2038] in two main places on targets with 32-bit `time_t`:
- In `std::time::SystemTime`, we stored a `timespec` that has `time_t` seconds. This is now changed to directly store 64-bit seconds and nanoseconds, and on 32-bit linux-gnu we try to use `__clock_gettime64` (glibc 2.34+) to get the larger timestamp.
- In `std::fs::Metadata`, we store a `stat64`, which has 64-bit `off_t` but still 32-bit `time_t`, and unfortunately that is baked in the API by the (deprecated) `MetadataExt::as_raw_stat()`. However, we can use `statx` for 64-bit `statx_timestamp` to store in addition to the `stat64`, as we already do to support creation time, and the rest of the `MetadataExt` methods can return those full values. Note that some filesystems may still be limited in their actual timestamp support, but that's not something Rust can change.
There remain a few places that need `timespec` for system call timeouts -- I leave that to future work.
[Y2038]: https://en.wikipedia.org/wiki/Year_2038_problem
Add a dedicated length-prefixing method to `Hasher`
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Fixes#94026
r? rust-lang/libs
---
The core of this change is the following two new methods on `Hasher`:
```rust
pub trait Hasher {
/// Writes a length prefix into this hasher, as part of being prefix-free.
///
/// If you're implementing [`Hash`] for a custom collection, call this before
/// writing its contents to this `Hasher`. That way
/// `(collection![1, 2, 3], collection![4, 5])` and
/// `(collection![1, 2], collection![3, 4, 5])` will provide different
/// sequences of values to the `Hasher`
///
/// The `impl<T> Hash for [T]` includes a call to this method, so if you're
/// hashing a slice (or array or vector) via its `Hash::hash` method,
/// you should **not** call this yourself.
///
/// This method is only for providing domain separation. If you want to
/// hash a `usize` that represents part of the *data*, then it's important
/// that you pass it to [`Hasher::write_usize`] instead of to this method.
///
/// # Examples
///
/// ```
/// #![feature(hasher_prefixfree_extras)]
/// # // Stubs to make the `impl` below pass the compiler
/// # struct MyCollection<T>(Option<T>);
/// # impl<T> MyCollection<T> {
/// # fn len(&self) -> usize { todo!() }
/// # }
/// # impl<'a, T> IntoIterator for &'a MyCollection<T> {
/// # type Item = T;
/// # type IntoIter = std::iter::Empty<T>;
/// # fn into_iter(self) -> Self::IntoIter { todo!() }
/// # }
///
/// use std:#️⃣:{Hash, Hasher};
/// impl<T: Hash> Hash for MyCollection<T> {
/// fn hash<H: Hasher>(&self, state: &mut H) {
/// state.write_length_prefix(self.len());
/// for elt in self {
/// elt.hash(state);
/// }
/// }
/// }
/// ```
///
/// # Note to Implementers
///
/// If you've decided that your `Hasher` is willing to be susceptible to
/// Hash-DoS attacks, then you might consider skipping hashing some or all
/// of the `len` provided in the name of increased performance.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_length_prefix(&mut self, len: usize) {
self.write_usize(len);
}
/// Writes a single `str` into this hasher.
///
/// If you're implementing [`Hash`], you generally do not need to call this,
/// as the `impl Hash for str` does, so you can just use that.
///
/// This includes the domain separator for prefix-freedom, so you should
/// **not** call `Self::write_length_prefix` before calling this.
///
/// # Note to Implementers
///
/// The default implementation of this method includes a call to
/// [`Self::write_length_prefix`], so if your implementation of `Hasher`
/// doesn't care about prefix-freedom and you've thus overridden
/// that method to do nothing, there's no need to override this one.
///
/// This method is available to be overridden separately from the others
/// as `str` being UTF-8 means that it never contains `0xFF` bytes, which
/// can be used to provide prefix-freedom cheaper than hashing a length.
///
/// For example, if your `Hasher` works byte-by-byte (perhaps by accumulating
/// them into a buffer), then you can hash the bytes of the `str` followed
/// by a single `0xFF` byte.
///
/// If your `Hasher` works in chunks, you can also do this by being careful
/// about how you pad partial chunks. If the chunks are padded with `0x00`
/// bytes then just hashing an extra `0xFF` byte doesn't necessarily
/// provide prefix-freedom, as `"ab"` and `"ab\u{0}"` would likely hash
/// the same sequence of chunks. But if you pad with `0xFF` bytes instead,
/// ensuring at least one padding byte, then it can often provide
/// prefix-freedom cheaper than hashing the length would.
#[inline]
#[unstable(feature = "hasher_prefixfree_extras", issue = "88888888")]
fn write_str(&mut self, s: &str) {
self.write_length_prefix(s.len());
self.write(s.as_bytes());
}
}
```
With updates to the `Hash` implementations for slices and containers to call `write_length_prefix` instead of `write_usize`.
`write_str` defaults to using `write_length_prefix` since, as was pointed out in the issue, the `write_u8(0xFF)` approach is insufficient for hashers that work in chunks, as those would hash `"a\u{0}"` and `"a"` to the same thing. But since `SipHash` works byte-wise (there's an internal buffer to accumulate bytes until a full chunk is available) it overrides `write_str` to continue to use the add-non-UTF-8-byte approach.
---
Compatibility:
Because the default implementation of `write_length_prefix` calls `write_usize`, the changed hash implementation for slices will do the same thing the old one did on existing `Hasher`s.
Use futex-based locks and thread parker on {Free, Open, DragonFly}BSD.
This switches *BSD to our futex-based locks and thread parker.
Tracking issue: https://github.com/rust-lang/rust/issues/93740
This is a draft, because this still needs a new version of the `libc` crate to be published that includes https://github.com/rust-lang/libc/pull/2770.
r? `@Amanieu`
This accomplishes two main goals:
- Make it clear who is responsible for prefix-freedom, including how they should do it
- Make it feasible for a `Hasher` that *doesn't* care about Hash-DoS resistance to get better performance by not hashing lengths
This does not change rustc-hash, since that's in an external crate, but that could potentially use it in future.
Implement [OsStr]::join
Implements join for `OsStr` and `OsString` slices:
```Rust
let strings = [OsStr::new("hello"), OsStr::new("dear"), OsStr::new("world")];
assert_eq!("hello dear world", strings.join(OsStr::new(" ")));
````
This saves one from converting to strings and back, or from implementing it manually.
Relax memory ordering used in SameMutexCheck
`SameMutexCheck` only requires atomicity for `self.addr`, but does not need ordering of other memory accesses in either the success or failure case. Using `Relaxed`, the code still correctly handles the case when two threads race to store an address.
Relax memory ordering used in `min_stack`
`min_stack` does not provide any synchronization guarantees to its callers, and only requires atomicity for `MIN` itself, so relaxed memory ordering is sufficient.
This allows using `ReadBuf` in a builder-like style and to setup a `ReadBuf` and
pass it to `read_buf` in a single expression, e.g.,
```
// With this PR:
reader.read_buf(ReadBuf::uninit(buf).assume_init(init_len))?;
// Previously:
let mut buf = ReadBuf::uninit(buf);
buf.assume_init(init_len);
reader.read_buf(&mut buf)?;
```
Signed-off-by: Nick Cameron <nrc@ncameron.org>
The security example shows that `env::current_exe` will return the
path used when the program was started. This is not really surprising
considering how hard links work: after `ln foo bar`, the two files are
_equivalent_. It is _not_ the case that `bar` is a “link” to `foo`,
nor is `foo` a link to `bar`. They are simply two names for the same
underlying data.
The security vulnerability linked to seems to be different: there an
attacker would start a SUID binary from a directory under the control
of the attacker. The binary would respawn itself by executing the
program found at `/proc/self/exe` (which the attacker can control).
This is a real problem. In my opinion, the example given here doesn’t
really show the same problem, it just shows a misunderstanding of what
hard links are.
I looked through the history a bit and found that the example was
introduced in #33526. That PR actually has two commits, and the
first (8478d48dad) explains the race
condition at the root of the linked security vulnerability. The second
commit proceeds to replace the explanation with the example we have
today.
This commit reverts most of the second commit from #33526.
`SameMutexCheck` only requires atomicity for `self.addr`, but does not need ordering of other memory accesses in either the success or failure case. Using `Relaxed`, the code still correctly handles the case when two threads race to store an address.
`min_stack` does not provide any synchronization guarantees to its callers, and only requires atomicity for `MIN` itself, so relaxed memory ordering is sufficient.
rustdoc: Resolve doc links referring to `macro_rules` items
cc https://github.com/rust-lang/rust/issues/81633
UPD: the fallback to considering *all* `macro_rules` in the crate for unresolved names is not removed in this PR, it will be removed separately and will be run through crater.
Make [e]println macros eagerly drop temporaries (for backport)
This PR extracts the subset of #96455 which is only the parts necessary for fixing the 1.61-beta regressions in #96434.
My larger PR #96455 contains a few other changes relative to the pre-#94868 behavior; those are not necessary to backport into 1.61.
argument position | before #94868 | after #94868 | after this PR
--- |:---:|:---:|:---:
`write!($tmp, "…", …)` | 😡 | 😡 | 😡
`write!(…, "…", $tmp)` | 😡 | 😡 | 😡
`writeln!($tmp, "…", …)` | 😡 | 😡 | 😡
`writeln!(…, "…", $tmp)` | 😡 | 😡 | 😡
`print!("…", $tmp)` | 😡 | 😡 | 😡
`println!("…", $tmp)` | 😺 | 😡 | 😺
`eprint!("…", $tmp)` | 😡 | 😡 | 😡
`eprintln!("…", $tmp)` | 😺 | 😡 | 😺
`panic!("…", $tmp)` | 😺 | 😺 | 😺
Revert "Re-export core::ffi types from std::ffi"
This reverts commit 9aed829fe6.
Fixes https://github.com/rust-lang/rust/issues/96435 , a regression
in crates doing `use std::ffi::*;` and `use std::os::raw::*;`.
We can re-add this re-export once the `core::ffi` types
are stable, and thus the `std::os::raw` types can become re-exports as
well, which will avoid the conflict. (Type aliases to the same type
still conflict, but re-exports of the same type don't.)
Windows: Make stdin pipes synchronous
Stdin pipes do not need to be used asynchronously within the standard library. This is a first step in making pipes mostly synchronous.
r? `@m-ou-se`
std: directly use pthread in UNIX parker implementation
`Mutex` and `Condvar` are being replaced by more efficient implementations, which need thread parking themselves (see #93740). Therefore we should use the `pthread` synchronization primitives directly. Also, we can avoid allocating the mutex and condition variable because the `Parker` struct is being placed in an `Arc` anyways.
This basically is just a copy of the current `Mutex` and `Condvar` code, which will however be removed (again, see #93740). An alternative implementation could be to use dedicated private `OsMutex` and `OsCondvar` types, but all the other platforms supported by std actually have their own thread parking primitives.
I used `Pin` to guarantee a stable address for the `Parker` struct, while the current implementation does not, rather using extra unsafe declaration. Since the thread struct is shared anyways, I assumed this would not add too much clutter while being clearer.
Make EncodeWide implement FusedIterator
[`EncodeUtf16`](https://doc.rust-lang.org/std/str/struct.EncodeUtf16.html) and [`EncodeWide`](https://doc.rust-lang.org/std/os/windows/ffi/struct.EncodeWide.html) currently serve similar purposes: They convert from UTF-8 to UTF-16 and WTF-8 to WTF-16, respectively. `EncodeUtf16` wraps a &str, whereas `EncodeWide` wraps an &OsStr.
When Iteration has concluded, these iterators wrap an empty slice, which will forever yield `None` values. Hence, `EncodeUtf16` rightfully implements `FusedIterator`. However, `EncodeWide` in contrast does not, even though it serves an almost identical purpose.
This PR attempts to fix that issue. I consider this change minor and non-controversial, hence why I have not added a RFC/FCP. Please let me know if the stability attribute is wrong or contains a wrong version number. Thanks in advance.
Fixes https://github.com/rust-lang/rust/issues/96368
This reverts commit 9aed829fe6.
Fixes https://github.com/rust-lang/rust/issues/96435 , a regression
in crates doing `use std::ffi::*;` and `use std::os::raw::*;`.
We can re-add this re-export once the `core::ffi` types
are stable, and thus the `std::os::raw` types can become re-exports as
well, which will avoid the conflict. (Type aliases to the same type
still conflict, but re-exports of the same type don't.)
Define a dedicated error type for `HandleOrNull` and `HandleOrInvalid`.
Define `NullHandleError` and `InvalidHandleError` types, that implement std::error::Error, and use them as the error types in `HandleOrNull` and `HandleOrInvalid`,
This addresses [this concern](https://github.com/rust-lang/rust/issues/87074#issuecomment-1080031167).
This is the same as #95387.
r? `@joshtriplett`
Mutex and Condvar are being replaced by more efficient implementations, which need thread parking themselves (see #93740). Therefore use the pthread synchronization primitives directly. Also, avoid allocating because the Parker struct is being placed in an Arc anyways.
Windows Command: Don't run batch files using verbatim paths
Fixes#95178
Note that the first commit does some minor refactoring (moving command line argument building to args.rs). The actual changes are in the second.
Reduce allocations for path conversions on Windows
Previously, UTF-8 to UTF-16 Path conversions on Windows unnecessarily allocate twice, as described in #96297. This commit fixes that issue.
Improve Windows path prefix parsing
This PR fixes improves parsing of Windows path prefixes. `parse_prefix` now supports both types of separators on Windows (`/` and `\`).
[fuchsia] Add implementation for `current_exe`
This implementation returns a best attempt at the current exe path. On
fuchsia, fdio will always use `argv[0]` as the process name and if it is
not set then an error will be returned. Because this is not guaranteed
to be the case, this implementation returns an error if `argv` does not
contain any elements.
remove_dir_all_recursive: treat ELOOP the same as ENOTDIR
On older Linux kernels (I tested on 4.4, corresponding to Ubuntu 16.04), opening a symlink using `O_DIRECTORY | O_NOFOLLOW` returns `ELOOP` instead of `ENOTDIR`. We should handle it the same, since a symlink is still not a directory and needs to be `unlink`ed.
Use sys::unix::locks::futex* on wasm+atomics.
This removes the wasm-specific lock implementations and instead re-uses the implementations from sys::unix.
Tracking issue: https://github.com/rust-lang/rust/issues/93740
cc ``@alexcrichton``
Improve AddrParseError description
The existing description was incorrect for socket addresses, and misleading: users would see “invalid IP address syntax” and suppose they were supposed to provide an IP address rather than a socket address.
I contemplated making it two variants (IP, socket), but realised we can do still better for the IPv4 and IPv6 types, so here it is as six.
I contemplated more precise error descriptions (e.g. “invalid IPv6 socket address syntax: expected a decimal scope ID after %”), but that’s a more invasive change, and probably not worthwhile anyway.
Making `BorrowedFd::borrow_raw` a const fn allows it to be used to
create a constant `BorrowedFd<'static>` holding constants such as
`AT_FDCWD`. This will allow [`rustix::fs::cwd`] to become a const fn.
For consistency, make similar changes to `BorrowedHandle::borrow_raw`
and `BorrowedSocket::borrow_raw`.
[`rustix::fs::cwd`]: https://docs.rs/rustix/latest/rustix/fs/fn.cwd.html
This implementation returns a best attempt at the current exe path. On
fuchsia, fdio will always use `argv[0]` as the process name and if it is
not set then an error will be returned. Because this is not guaranteed
to be the case, this implementation returns an error if `argv` does not
contain any elements.
The existing description was incorrect for socket addresses, and
misleading: users would see “invalid IP address syntax” and suppose they
were supposed to provide an IP address rather than a socket address.
I contemplated making it two variants (IP, socket), but realised we can
do still better for the IPv4 and IPv6 types, so here it is as six.
I contemplated more precise error descriptions (e.g. “invalid IPv6
socket address syntax: expected a decimal scope ID after %”), but that’s
a more invasive change, and probably not worthwhile anyway.
Use a single ReentrantMutex implementation on all platforms.
This replaces all platform specific ReentrantMutex implementations by the one I added in #95727 for Linux, since that one does not depend on any platform specific details.
r? `@Amanieu`
fix error handling for pthread_sigmask(3)
Errors from `pthread_sigmask(3)` were handled using `cvt()`, which expects a return value of `-1` on error and uses `errno`.
However, `pthread_sigmask(3)` returns `0` on success and an error number otherwise.
Fix it by replacing `cvt()` with `cvt_nz()`.
Use u32 instead of i32 for futexes.
This changes futexes from i32 to u32. The [Linux man page](https://man7.org/linux/man-pages/man2/futex.2.html) uses `uint32_t` for them, so I'm not sure why I used i32 for them. Maybe because I first used them for thread parkers, where I used -1, 0, and 1 as the states.
(Wasm's `memory.atomic.wait32` does use `i32`, because wasm doesn't support `u32`.)
It doesn't matter much, but using the unsigned type probably results in fewer surprises when shifting bits around or using comparison operators.
r? ```@Amanieu```
Create (unstable) 2024 edition
[On Zulip](https://rust-lang.zulipchat.com/#narrow/stream/213817-t-lang/topic/Deprecating.20macro.20scoping.20shenanigans/near/272860652), there was a small aside regarding creating the 2024 edition now as opposed to later. There was a reasonable amount of support and no stated opposition.
This change creates the 2024 edition in the compiler and creates a prelude for the 2024 edition. There is no current difference between the 2021 and 2024 editions. Cargo and other tools will need to be updated separately, as it's not in the same repository. This change permits the vast majority of work towards the next edition to proceed _now_ instead of waiting until 2024.
For sanity purposes, I've merged the "hello" UI tests into a single file with multiple revisions. Otherwise we'd end up with a file per edition, despite them being essentially identical.
````@rustbot```` label +T-lang +S-waiting-on-review
Not sure on the relevant team, to be honest.
Windows: Use a pipe relay for chaining pipes
Fixes#95759
This fixes the issue by chaining pipes synchronously and manually pumping messages between them. It's not ideal but it has the advantage of not costing anything if pipes are not chained ("don't pay for what you don't use") and it also avoids breaking existing code that rely on our end of the pipe being asynchronous (which includes rustc's own testing framework).
Libraries can avoid needing this by using their own pipes to chain commands.
Document that DirEntry holds the directory open
I had a bug where holding onto DirEntry structs caused file descriptor exhaustion, and thought it would be good to document this.