Add drain_filter method to HashMap and HashSet
Add `HashMap::drain_filter` and `HashSet::drain_filter`, implementing part of rust-lang/rfcs#2140. These new methods are unstable. The tracking issue is #59618.
The added iterators behave the same as `BTreeMap::drain_filter` and `BTreeSet::drain_filter`, except their iteration order is arbitrary. The unit tests are adapted from `alloc::collections::btree`.
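For readers without a nightly toolchain, the semantics can be approximated on stable Rust. This is only a sketch, not the `hashbrown` implementation: `drain_filter_sketch` is a hypothetical helper that needs `K: Clone` and collects eagerly, whereas the real method returns a lazy iterator.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Hypothetical stable-Rust approximation of `HashMap::drain_filter`:
/// removes every entry for which `pred` returns `true` and returns the
/// removed entries. Iteration order is arbitrary, as with the real method.
fn drain_filter_sketch<K, V, F>(map: &mut HashMap<K, V>, mut pred: F) -> Vec<(K, V)>
where
    K: Eq + Hash + Clone,
    F: FnMut(&K, &mut V) -> bool,
{
    // First pass: record the keys of the entries selected by the predicate.
    let selected: Vec<K> = map
        .iter_mut()
        .filter_map(|(k, v)| if pred(k, v) { Some(k.clone()) } else { None })
        .collect();

    // Second pass: move the selected entries out of the map.
    selected
        .into_iter()
        .filter_map(|k| map.remove_entry(&k))
        .collect()
}
```

Entries for which the predicate returns `false` stay in the map untouched.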
This branch rewrites `HashSet` to be a wrapper around `hashbrown::HashSet` rather than `std::collections::HashMap`.
(Both are themselves wrappers around `hashbrown::HashMap`, so the in-memory representation is the same either way.) This lets `std` re-use more iterator code from `hashbrown`. Without this change, we would need to duplicate much more code to implement `HashSet::drain_filter`.
This branch also updates the `hashbrown` crate to version 0.9.0. Aside from changes related to the `DrainFilter` iterators, this version only changes features that are not used in libstd or rustc. And it updates `indexmap` to version 1.6.0, whose only change is compatibility with `hashbrown` 0.9.0.
The calling convention of pthread_getattr_np() is to initialize the
pthread_attr_t, so _destroy() is only necessary on success (and _init()
isn't necessary beforehand). On the other hand, FreeBSD wants the
attr_t to be initialized before pthread_attr_get_np(), and therefore it
should always be destroyed afterwards.
Implement Seek::stream_position() for BufReader
Optimization over `BufReader::seek()` for getting the current position without flushing the internal buffer.
Related to #31100. Based on the code in #70577.
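A minimal usage sketch, assuming a toolchain on which `Seek::stream_position` is available; the helper name `position_after_reading` and the 64-byte in-memory source are made up for illustration.

```rust
use std::io::{self, BufReader, Cursor, Read, Seek};

/// Reads `n` bytes through a `BufReader` and reports the logical position.
/// The underlying cursor has usually been read further ahead (the buffer
/// was filled), but `stream_position` accounts for the buffered bytes.
fn position_after_reading(n: usize) -> io::Result<u64> {
    let mut reader = BufReader::new(Cursor::new(vec![0u8; 64]));
    let mut scratch = vec![0u8; n];
    reader.read_exact(&mut scratch)?;
    // Does not discard the internal buffer, unlike `seek(SeekFrom::Current(0))`.
    reader.stream_position()
}
```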
time.rs: Make spelling of "Darwin" consistent
On line 89 of this file, the OS name is written as "Darwin", but on line 162 it is written in all-caps. Darwin is usually spelt as a standard proper noun, i.e. "Darwin", rather than in all-caps.
This change makes that form consistent in both places.
Make `Ipv4Addr` and `Ipv6Addr` const tests unit tests under `library`
These tests are about the standard library, not the compiler itself, so they should live in `library`; see #76268.
Use Arc::clone and Rc::clone in documentation
This PR replaces uses of `x.clone()` by `Rc::clone(&x)` (or `Arc::clone(&x)`) to better match the documentation for those types.
@rustbot modify labels: T-doc
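A short illustration of the preferred form; the function name `share_config` is hypothetical.

```rust
use std::rc::Rc;

/// Returns two handles to the same heap allocation.
fn share_config() -> (Rc<String>, Rc<String>) {
    let original = Rc::new(String::from("config"));
    // `Rc::clone(&original)` makes it obvious that only the reference count
    // is incremented; `original.clone()` does the same thing but can be
    // mistaken for a deep copy of the `String`.
    let second = Rc::clone(&original);
    (original, second)
}
```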
Functions such as `is_enclave_range` and `is_user_range` in
`sgx::os::fortanix_sgx::mem` are often used to make sure memory ranges
passed to an enclave from untrusted code or passed to other trusted code
functions are safe to use for their intended purpose. Currently, these
functions do not perform any checks to make sure the range provided
doesn't overflow when adding the range length to the base address. While
debug builds will panic if overflow occurs, release builds will simply
wrap the result, leading to false positive results for either function.
The burden is placed on application authors to know to perform overflow
checks on their own before calling these functions, which can easily
lead to security vulnerabilities if omitted. Additionally, since such
checks are performed in the Intel SGX SDK versions of these functions,
developers migrating from Intel SGX SDK code may expect these functions
to operate the same.
This commit adds explicit overflow checking to `is_enclave_range` and
`is_user_range`, returning `false` if overflow occurs in order to
prevent misuse of invalid memory ranges. It also alters the checks to
account for ranges that lie exactly at the end of the address space,
where calculating `p + len` would overflow despite the range being
valid.
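The kind of check described above can be sketched as follows. This is an illustration rather than the code from the commit: `is_range_within` is a hypothetical function, and the `base` and `size` parameters stand in for the enclave's base address and size, which the real functions obtain internally.

```rust
/// Hypothetical overflow-safe containment check: is `[p, p + len)` inside
/// the region `[base, base + size)`?
fn is_range_within(p: usize, len: usize, base: usize, size: usize) -> bool {
    let start = p;
    // Work with the address of the *last* byte (`p + len - 1`) so that a
    // range ending exactly at the top of the address space does not overflow.
    let end = if len == 0 {
        start
    } else {
        match start.checked_add(len - 1) {
            Some(end) => end,
            // The range wraps around the address space: reject it.
            None => return false,
        }
    };
    // Last valid address of the region, computed with the same care.
    let region_last = match size.checked_sub(1).and_then(|s| base.checked_add(s)) {
        Some(last) => last,
        None => return false,
    };
    start >= base && end <= region_last
}
```

Returning `false` on overflow means a wrapped range can never be mistaken for a valid one, in debug and release builds alike.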
rustdoc: do not use plain summary for trait impls
Fixes #38386.
Fixes #48332.
Fixes #49430.
Fixes #62741.
Fixes #73474.
Unfortunately this is not quite ready to go because the newly-working links trigger a bunch of linkcheck failures. The failures are tough to fix because the links are resolved relative to the implementor, which could be anywhere in the module hierarchy.
(In the current docs, these links end up rendering as uninterpreted markdown syntax, so I don't think these failures are any worse than the status quo. It might be acceptable to just add them to the linkchecker whitelist.)
Ideally this could be fixed with intra-doc links ~~but it isn't working for me: I am currently investigating if it's possible to solve it this way.~~ Opened #73829.
EDIT: This is now ready!
Convert many files to intra-doc links
Helps with https://github.com/rust-lang/rust/issues/75080
r? @poliorcetics
I recommend reviewing one commit at a time, but the diff is small enough you can do it all at once if you like :)
Applied `#![deny(unsafe_op_in_unsafe_fn)]` in library/std/src/wasi
partial fix for #73904
There are still more places in [mod.rs]( 38fab2ea92/library/std/src/sys/wasi/mod.rs) where this was not applied, because that file pulls in files from `../unsupported`,
like:
```
#[path = "../unsupported/cmath.rs"]
pub mod cmath;
```
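For context, this is roughly what the lint requires. The sketch applies it as an outer attribute to a single hypothetical function (`read_first`) instead of crate-wide.

```rust
/// With `unsafe_op_in_unsafe_fn` denied, the body of an `unsafe fn` is no
/// longer an implicit unsafe block: each unsafe operation needs its own
/// `unsafe { .. }` with a justification.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn read_first(ptr: *const u8) -> u8 {
    // SAFETY: the caller guarantees that `ptr` is valid for reads.
    unsafe { *ptr }
}
```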
Make all methods of `std::net::Ipv4Addr` const
Make the following methods of `std::net::Ipv4Addr` unstable const under the `const_ipv4` feature:
- `octets`
- `is_loopback`
- `is_private`
- `is_link_local`
- `is_global` (unstable)
- `is_shared` (unstable)
- `is_ietf_protocol_assignment` (unstable)
- `is_benchmarking` (unstable)
- `is_reserved` (unstable)
- `is_multicast`
- `is_broadcast`
- `is_documentation`
- `to_ipv6_compatible`
- `to_ipv6_mapped`
This would make all methods of `Ipv4Addr` const.
Of these methods, `is_global`, `is_broadcast`, `to_ipv6_compatible`, and `to_ipv6_mapped` require a change in implementation.
Part of #76205
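As an illustration of what const methods allow, the following sketch evaluates a few of them at compile time. It assumes a toolchain on which these methods are already usable in const contexts without a feature gate; at the time of this PR they were behind `const_ipv4`.

```rust
use std::net::Ipv4Addr;

// All of these are computed at compile time.
const LOCALHOST: Ipv4Addr = Ipv4Addr::new(127, 0, 0, 1);
const OCTETS: [u8; 4] = LOCALHOST.octets();
const IS_LOOPBACK: bool = LOCALHOST.is_loopback();
const IS_PRIVATE: bool = Ipv4Addr::new(10, 0, 0, 1).is_private();
```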
Add a note for Ipv4Addr::to_ipv6_compatible
Previous discussion: #75019
> I think adding a comment saying "This isn't typically the method you want; these addresses don't typically function on modern systems. Use `to_ipv6_mapped` instead." would be a good first step, whether this method gets marked as deprecated or not.
_Originally posted by @joshtriplett in https://github.com/rust-lang/rust/pull/75150#issuecomment-680267745_
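The difference between the two conversions is easy to see side by side. The helper `both_embeddings` is hypothetical; the two methods are the existing `Ipv4Addr` ones.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Returns the IPv4-mapped (`::ffff:a.b.c.d`) and the IPv4-compatible
/// (`::a.b.c.d`) forms of the same address, in that order.
fn both_embeddings(v4: Ipv4Addr) -> (Ipv6Addr, Ipv6Addr) {
    (v4.to_ipv6_mapped(), v4.to_ipv6_compatible())
}
```

Only the sixth segment differs: `0xffff` for the mapped form, `0` for the compatible form.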
- Use intra-doc links for `std::io` in `std::fs`
- Use intra-doc links for File::read in unix/ext/fs.rs
- Remove explicit intra-doc links for `true` in `net/addr.rs`
- Use intra-doc links in alloc/src/sync.rs
- Use intra-doc links in src/ascii.rs
- Switch to intra-doc links in alloc/rc.rs
- Use intra-doc links in core/pin.rs
- Use intra-doc links in std/prelude
- Use shorter links in `std/fs.rs`
`io` is already in scope.
Make all methods of `std::net::Ipv6Addr` const
Make the following methods of `std::net::Ipv6Addr` unstable const under the `const_ipv6` feature:
- `segments`
- `is_unspecified`
- `is_loopback`
- `is_global` (unstable)
- `is_unique_local`
- `is_unicast_link_local_strict`
- `is_documentation`
- `multicast_scope`
- `is_multicast`
- `to_ipv4_mapped`
- `to_ipv4`
This would make all methods of `Ipv6Addr` const.
Changed the implementation of `is_unspecified` and `is_loopback` to use a `match` instead of `==`; no other methods required a change.
All these methods are dependent on `segments`, the current implementation of which requires unstable `const_fn_transmute` ([PR#75085](https://github.com/rust-lang/rust/pull/75085)).
Part of #76205
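The `match`-based approach can be sketched like this. `is_loopback_sketch` is a hypothetical free function operating on the segments, not the std implementation; the point is that pattern matching is allowed in a `const fn`, while `==` on arrays goes through the non-const `PartialEq` trait. The sketch assumes a toolchain on which `Ipv6Addr::segments` is usable in const contexts.

```rust
use std::net::Ipv6Addr;

/// Hypothetical const check for `::1`, written with a pattern match
/// instead of comparing against another array with `==`.
const fn is_loopback_sketch(segments: [u16; 8]) -> bool {
    matches!(segments, [0, 0, 0, 0, 0, 0, 0, 1])
}

// Evaluated at compile time.
const LOCALHOST_IS_LOOPBACK: bool = is_loopback_sketch(Ipv6Addr::LOCALHOST.segments());
const UNSPECIFIED_IS_LOOPBACK: bool = is_loopback_sketch(Ipv6Addr::UNSPECIFIED.segments());
```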
Makes the following methods of `std::net::Ipv4Addr` unstable const under the `const_ipv4` feature:
- `is_global`
- `is_reserved`
- `is_broadcast`
- `to_ipv6_compatible`
- `to_ipv6_mapped`
This results in all methods of `Ipv4Addr` being const.
Also adds tests for these methods in a const context.
Make the following methods of `std::net::Ipv6Addr` unstable const under the `const_ipv6` feature:
- `segments`
- `is_unspecified`
- `is_loopback`
- `is_global` (unstable)
- `is_unique_local`
- `is_unicast_link_local_strict`
- `is_documentation`
- `multicast_scope`
- `is_multicast`
- `to_ipv4_mapped`
- `to_ipv4`
Changed the implementation of `is_unspecified` and `is_loopback` to use a `match` instead of `==`.
Part of #76205
rename get_{ref,mut} to assume_init_{ref,mut} in MaybeUninit
References #63568
Rework with comments addressed from #66174
I have replaced all the occurrences I could find; hopefully I didn't miss anything.
r? @RalfJung
(thanks @danielhenrymantilla for the initial work on this)
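A small usage sketch of the renamed methods, assuming a toolchain on which `assume_init_ref`, `assume_init_mut` and `MaybeUninit::write` are available; `write_then_read` is a hypothetical helper.

```rust
use std::mem::MaybeUninit;

/// Initializes a slot, mutates it in place, then reads it by reference.
fn write_then_read() -> u32 {
    let mut slot = MaybeUninit::<u32>::uninit();
    slot.write(42);
    // SAFETY: `slot` was initialized by the `write` call above.
    unsafe {
        *slot.assume_init_mut() += 1;
    }
    // SAFETY: still initialized; we only borrow the value.
    let value: &u32 = unsafe { slot.assume_init_ref() };
    *value
}
```

The new names mirror `assume_init` and make it clearer at the call site that the caller is asserting initialization.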
This reverts commit 7e2548fe69.
Now I know why it was redefined: it seems like it's potentially because
of the orphan rule. Here are the error messages:
error[E0119]: conflicting implementations of trait `std::fmt::Debug` for type `!`:
--> src/primitive_docs.rs:236:1
|
6 | impl Debug for ! {
| ^^^^^^^^^^^^^^^^
|
= note: conflicting implementation in crate `core`:
- impl std::fmt::Debug for !;
error[E0117]: only traits defined in the current crate can be implemented for arbitrary types
--> src/primitive_docs.rs:236:1
|
6 | impl Debug for ! {
| ^^^^^^^^^^^^^^^-
| | |
| | `!` is not defined in the current crate
| impl doesn't use only types from inside the current crate
|
= note: define and implement a trait or new type instead
Constify the following methods of `std::net::Ipv4Addr`:
- `octets`
- `is_loopback`
- `is_private`
- `is_link_local`
- `is_shared`
- `is_ietf_protocol_assignment`
- `is_benchmarking`
- `is_multicast`
- `is_documentation`
Also insta-stabilizes these methods as const.
Possible because of the stabilization of const integer arithmetic and control flow.
vars() rather than vars function
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Use [xxx()] rather than the [xxx] function
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Env text representation of function intra-doc link
Suggested by @jyn514
Link join_paths in env doc for parity
Change xxx to env::xxx for lib env doc
Add link requested by @jyn514
Fix doc build with same link
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Fix missing intra-doc link
Fix added whitespace in doc
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Add brackets for `join_paths`
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Use unused link join_paths
Removed same link for join_paths
Co-authored-by: Joshua Nelson <joshua@yottadb.com>
Remove unused link join_paths
Update compiler-builtins
Update the compiler-builtins dependency to include latest changes.
This allows for `aarch64-unknown-linux-musl` to pass all tests.
Fixes #57820 and fixes #46651
Substantial refactor to the design of LineWriter
# Preamble
This is the first in a series of pull requests designed to move forward with https://github.com/rust-lang/rust/issues/60673 (and the related [5 year old FIXME](ea7181b5f7/src/libstd/io/stdio.rs (L459-L461))), which calls for an update to `Stdout` such that it can be block-buffered rather than line-buffered under certain circumstances (such as a `tty`, or a user setting the mode with a function call). This pull request refactors the logic of `LineWriter` into a `LineWriterShim`, which operates on a `BufWriter` by mutable reference, such that it is easy to invoke the line-writing logic on an existing `BufWriter` without having to construct a new `LineWriter`.
Additionally, fixes #72721
## A note on flushing
Because the word **flush** tends to be pretty overloaded in this discussion, I'm going to use the word **unbuffered** to refer to a `BufWriter` sending its data to the wrapped writer via `write`, without calling `flush` on it, and I'll be using **flushed** when referring to sending data via flush, which recursively writes the data all the way to the final sink.
For example, given a `T = BufWriter<BufWriter<File>>`, saying that `T` **unbuffers** its data means that it is sent to the inner `BufWriter`, but not necessarily to the `File`, whereas saying that `T` **flushes** its data means that it is delivered (via `Write::flush`) all the way to the `File`.
# Goals
Once it became clear (for reasons described below) that the best way to approach this would involve refactoring `LineWriter` to work more directly on `BufWriter`'s internals, I established the following design goals for the refactor:
- Do not duplicate logic with `BufWriter`. It's great at buffering and then unbuffering data, so use the existing logic as much as possible.
- Minimize superfluous copying of data into `BufWriter`'s buffer.
- Eliminate calls to `BufWriter::flush` and instead do the same thing as `BufWriter::write`, which is to only write to the wrapped writer (rather than flushing all the way down to the final data sink).
- Uphold the "at-most 1 write of new data" convention of `Write::write`
- Minimize or eliminate dropping errors (that is, eliminate the parts of the old design that threw away errors because `write` *must* report if any bytes were written)
- As much as possible, attempt to fully flush completed lines, and *not* flush partial lines. One of the advantages of this design is that, so long as we don't encounter lines larger than the `BufWriter`'s capacity, partial lines will never be unbuffered, while completed lines will *always* be unbuffered (with subsequent calls to `LineWriter::write` retrying failed writes before processing new data).
# Design
There are two major & related parts of the design.
First, a new internal struct, `LineWriterShim`, is added. This struct implements all of the actual logic of line-writing in a `Write` implementation, but it only operates on an `&mut BufWriter`. This means that this shim can be constructed on-the-fly to apply line writing logic to an existing `BufWriter`. This is in fact how `LineWriter` has been updated to operate, and it is also how `Stdout` is being updated in my [development branch](https://github.com/Lucretiel/rust/tree/stdout-block-buffer) to switch which mode it wants to use at runtime.
[An example of how this looks in practice](f24f272df6/src/libstd/io/stdio.rs (L479-L484))
The second major part of the design is that the line-buffering logic, implemented in `LineWriterShim`, has been updated to work slightly more directly on the internals of `BufWriter`. Mostly it makes use of the public interface (particularly `buffer()` and `get_mut()`), but it also controls the flushing of the buffer with `flush_buf` rather than `flush`, and it writes to the buffer infallibly with a new `write_to_buffer` method. This has several advantages:
- Data no longer has to round trip through the `BufWriter`'s buffer. If the user provides a complete line, that line is written directly to the inner writer (after ensuring the existing buffer is flushed).
- The conventional contract of `write`—that at-most 1 attempt to write new data is made—is much more cleanly upheld, because we don't have to perform fallible flushes and perform semi-complicated logic of trying to pretend errors at different stages didn't happen. Instead, after attempting to write lines directly to the buffer, we can infallibly add trailing data to the buffer without allowing any attempts to continue writing it to the `inner` writer.
- Perhaps most importantly, `LineWriter` *no longer performs a full flush on every line.* This makes its behavior much more consistent with `BufWriter`, which unbuffers data to its inner writer, without trying to flush it all the way to the final device. Previously, `LineWriter` had no choice but to use `flush` to ensure that the lines were unbuffered, but by writing directly to `inner` via `get_mut()` (when appropriate), we can use a more correct behavior.
## New(ish) line buffering logic
The logic for line writing has been cleaned up, as described above. It now follows this algorithm for `write`, with minor adjustments for `write_all` and `write_vectored`:
- Does our input data contain a newline?
  - If no:
    - simply use the regular `BufWriter::write` to write it; this will append it to the buffer and/or flush it as necessary based on how full the buffer is and how much input data there is.
    - additionally, if the current buffer ends with `'\n'`, attempt to immediately flush it with `flush_buf` before calling `BufWriter::write`. This reproduces the old `needs_flush` behavior and ensures completed lines are flushed as soon as possible. The reason we only check if the buffer *ends* with `'\n'` is discussed later.
  - If yes:
    - First, `flush_buf`
    - Then use `bufwriter.get_mut().write()` to write the input data directly to the underlying writer, up to the last newline. Make at most one attempt at this.
      - If it errors, return the error
      - If it succeeds with a full write, add the remaining data (between the last newline and the end of the input) to the buffer. In order to uphold the "at-most 1 attempt to write new data" convention, no attempts are made to write this data to the inner writer (though obviously a subsequent write may immediately flush it, e.g., if it totally filled the buffer's capacity).
      - If it only partially succeeds, buffer the data only up to the last newline. We do this to try to avoid writing partial lines to the inner writer where possible (that is, whenever the lines are shorter than the total buffer capacity).
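The algorithm above can be sketched in a self-contained form. This is not the actual `LineWriterShim`: `LineShimSketch`, its `cap` field and its `Vec<u8>` buffer are stand-ins for `BufWriter`'s internals, and `write_all` and `write_vectored` are omitted.

```rust
use std::io::{self, Write};

/// Hypothetical, simplified stand-in for `BufWriter` plus the line-writing shim.
struct LineShimSketch<W: Write> {
    inner: W,
    buf: Vec<u8>,
    cap: usize,
}

impl<W: Write> LineShimSketch<W> {
    fn new(inner: W, cap: usize) -> Self {
        Self { inner, buf: Vec::with_capacity(cap), cap }
    }

    /// "Unbuffer": hand buffered bytes to `inner` without calling `inner.flush()`.
    fn flush_buf(&mut self) -> io::Result<()> {
        while !self.buf.is_empty() {
            let n = self.inner.write(&self.buf)?;
            if n == 0 {
                return Err(io::ErrorKind::WriteZero.into());
            }
            self.buf.drain(..n);
        }
        Ok(())
    }

    /// Infallibly copy as much of `data` as fits into the buffer.
    fn write_to_buffer(&mut self, data: &[u8]) -> usize {
        let n = data.len().min(self.cap - self.buf.len());
        self.buf.extend_from_slice(&data[..n]);
        n
    }

    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        match data.iter().rposition(|&b| b == b'\n') {
            None => {
                // A completed line is still buffered: try to get it out first.
                if self.buf.last() == Some(&b'\n') {
                    self.flush_buf()?;
                }
                // Then behave like a regular buffered write.
                if self.buf.len() + data.len() > self.cap {
                    self.flush_buf()?;
                }
                if data.len() >= self.cap {
                    self.inner.write(data)
                } else {
                    Ok(self.write_to_buffer(data))
                }
            }
            Some(idx) => {
                self.flush_buf()?;
                let lines = &data[..=idx];
                // Exactly one attempt to write new data to the inner writer.
                let written = self.inner.write(lines)?;
                let tail = if written == lines.len() {
                    &data[written..] // full write: buffer the partial line
                } else {
                    &lines[written..] // partial write: buffer up to the last newline
                };
                Ok(written + self.write_to_buffer(tail))
            }
        }
    }
}
```

With a `Vec<u8>` as the inner writer, writing `b"ab"` and then `b"c\nde"` leaves `abc\n` in the inner writer and `de` in the buffer.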
While it was not my intention for this behavior to diverge from the existing `LineWriter` algorithm, this updated design emerged very naturally once `LineWriter` wasn't burdened with having to only operate via `BufWriter::flush`. There are essentially two main changes to observable behavior:
- `flush` is no longer used to unbuffer lines. They are only written to the writer wrapped by `LineWriter`; this inner writer might do its own buffering. This change makes `LineWriter` consistent with the behavior of `BufWriter`. This is probably the most obvious user-visible change; it's the one I most expect to provoke issue reports, if any are provoked.
- Unless a line exceeds the capacity of the buffer, partial lines are not unbuffered (without the user manually calling flush). This is a less surprising behavior, and is enabled because `LineWriter` now has more precise control of what data is buffered and when it is unbuffered. I'd be surprised if anyone is relying on `LineWriter` unbuffering or flushing *partial* lines that are shorter than the capacity, so I'm not worried about this one.
None of these changes are inconsistent with any published documentation of `LineWriter`. Nonetheless, like all changes with user-facing behavior changes, this design will obviously have to be very carefully scrutinized.
# Alternative designs and design rationale
The initial goal of this project was to provide a way for the `LineWriter` logic to be operable directly on a `BufWriter`, so that the updated `Stdout` doesn't need to do something convoluted like `enum { BufWriter, LineWriter }` (which ends up being ~~impossible~~ difficult to transition between states after being constructed). The design went through several iterations before arriving at the current draft.
The major first version simply involved adding methods like `write_line_buffered` to `BufWriter`; these would contain the actual logic of line-buffered writing, and would additionally have the advantages (described above) of operating directly on the internals of `BufWriter`. The idea was that `LineWriter` would simply call these methods, and the updated `Stdout` would use either `BufWriter::write` or `BufWriter::write_line_buffered`, depending on what mode it was in.
The major issue with this design is that it loses the ability to take advantage of the `io::Write` trait, which provides several useful default implementations of the various io methods, such as `write_fmt` and `write_all`, just using the core methods. For this reason, the `write_line_buffered` design was retained, but moved into a separate struct called `LineWriterShim` which operates on an `&mut BufWriter`. As part of this move, the logic was lightly retooled to not touch the innards of `BufWriter` directly, but instead to make use of the unexported helper methods like `flush_buf`.
The other design evolutions were mostly related to answering questions like "how much data should be buffered", "how should partial line writes be handled", etc. As much as possible I tried to answer these by emulating the current `LineWriter` logic (which, for example, retries partial line writes on subsequent calls to `write`) while still meeting the refactor design goals.
# Next steps
~Currently, this design fails a few `LineWriter` tests, mostly because they expect `LineWriter` to *fully* flush its content. There are also some changes to the way that `LineWriter` buffers data *after* writing completed lines, aimed at ensuring that partial lines are not unbuffered prematurely. I want to make sure I fully understand the intent behind these tests before I either update the test or update this design so that they pass.~
However, in the meantime I wanted to get this published so that feedback could start to accumulate on it. There's a lot of errata around how I arrived at this design that didn't really fit in this overlong document, so please ask questions about anything that is confusing or unclear and hopefully I can explain more of the rationale that led to it.
# Test updates
This design required some tests to be updated; I've researched the intent behind these tests (mostly via `git blame`) and updated them appropriately. Those changes are cataloged here.
- `test_line_buffer_fail_flush`: This test was added as a regression test for #32085, and is intended to ensure that errors from `flush` aren't propagated when preceded by a successful `write`. Because this type of issue is no longer possible (`write` calls `buffer.get_mut().write()` instead of `buffer.write(); buffer.flush();`), I'm simply removing this test entirely. Other, similar error invariants related to errors during write-retrying are handled in other test cases.
- `erroneous_flush_retried`: This test was added as a regression test for #37807, and was intended to ensure that flush-retrying (via `needs_flush`) and error-ignoring were being handled correctly (ironically, this issue was caused by the flush-error-ignoring, above). Half of that issue is not possible by design with this refactor, because we no longer make fallible i/o calls that might produce errors we have to ignore after unbuffering lines. The `should_flush` behavior is captured by checking for a trailing newline in the `LineWriter` buffer; this test now checks that behavior.
- `line_vectored`: changes here were pretty minor, mostly related to when partial lines are or aren't written. The old implementation of `write_vectored` used very complicated logic to precisely determine the location of the last newline and precisely write up to that point; this required doing several consecutive fallible writes, with all the complex error handling or ignoring issues that come with it. The updated design does at-most one write of a subset of total buffers (that is, it doesn't split in the middle of a buffer), even if that means writing partial lines. One of the major advantages of the new design is that the underlying vectored write operation on the device can be taken advantage of, even with small writes, so long as they include a newline; previously these were unconditionally buffered then written.
- `line_vectored_partial_and_errors`: Pretty similar to `line_vectored`, above; this test is for basic error recovery in `write_vectored` for vectored writes. As previously discussed, the mocked behavior being tested for (errors ignored under certain circumstances) no longer occurs, so I've simplified the test while doing my best to retain its spirit.
Abort when foreign exceptions are caught by catch_unwind
Prior to this PR, foreign exceptions were not caught by catch_unwind, and instead passed through invisibly. This represented a painful soundness hole in some libraries ([take_mut](https://github.com/Sgeo/take_mut/blob/master/src/lib.rs#L37)), which relied on `catch_unwind` to handle all possible exit paths from a closure.
With this PR, foreign exceptions are now caught by `catch_unwind` and will trigger an abort since catching foreign exceptions is currently UB according to the latest proposals by the FFI unwind project group.
cc @rust-lang/wg-ffi-unwind
[AVR] Replace broken 'avr-unknown-unknown' target with 'avr-unknown-gnu-atmega328' target
The `avr-unknown-unknown` target has never worked correctly, always trying to invoke
the host linker and failing. It aimed to be a mirror of AVR-GCC's
default handling of the `avr-unknown-unknown` triple (assume bare
minimum chip features, silently skip linking runtime libraries, etc).
This behaviour is broken-by-default as it will cause a miscompiled executable
when flashed.
This patch improves the AVR builtin target specifications to instead
expose only an 'avr-unknown-gnu-atmega328' target. This target system is
`gnu`, as it uses the AVR-GCC frontend along with avr-binutils. The
target triple ABI is 'atmega328'.
In the future, it should be possible to replace the dependency on
AVR-GCC and binutils by using the in-progress AVR LLD and compiler-rt support.
Perhaps at that point it would make sense to add an
'avr-unknown-unknown-atmega328' target as a better default when
implemented.
There is no current intention to add in-tree AVR target specifications for other
AVR microcontrollers - this one can serve as a reference implementation
for other devices via `rustc --print target-spec-json
avr-unknown-gnu-atmega328p`.
There should be no users of the existing 'avr-unknown-unknown' Rust
target as a custom target specification JSON has always been
recommended, and the avr-unknown-unknown target could never pass the
linking step anyway.
Update docs for SystemTime Windows implementation
Windows now uses `GetSystemTimePreciseAsFileTime` (since #69858) on versions of Windows that support it.
Call into fastfail on abort in libpanic_abort on Windows x86(_64)
This partially resolves #73215, though only for x86 targets. This code is directly lifted from [libstd](13290e83a6/library/std/src/sys/windows/mod.rs (L315)). `__fastfail` is the preferred way to abort a process on Windows as it will hook into debugger toolchains.
Other platforms expose a `_rust_abort` symbol which wraps `std::sys::abort_internal`. This would also work on Windows, but is a slightly larger change as we'd need to make sure that the symbol is properly exposed to the linker. I'm inlining the call to `__fastfail`, but the indirection through `rust_abort` might be a cleaner approach.
A different instruction must be used on ARM architectures. I'd like to verify this works first before tackling ARM.
Minor changes to Ipv4Addr
* Impl IntoInner rather than AsInner for Ipv4Addr
* Add some comments
* Add test to show endianness of Ipv4Addr display
Report an ambiguity if both modules and primitives are in scope for intra-doc links
Closes https://github.com/rust-lang/rust/issues/75381
- Add a new `prim@` disambiguator, since both modules and primitives are in the same namespace
- Refactor `report_ambiguity` into a closure
Additionally, I noticed that rustdoc would previously allow `[struct@char]` if `char` resolved to a primitive (not if it had a DefId). I fixed that and added a test case.
I also need to update libstd to use `prim@char` instead of `type@char`. If possible I would also like to refactor `ambiguity_error` to use `Disambiguator` instead of its own hand-rolled match - that ran into issues with `prim@` (I updated one and not the other) and it would be better for them to be in sync.
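For illustration, this is how the disambiguators might appear in a doc comment. The function `is_ascii_letter` is hypothetical, and the sketch assumes a rustdoc that understands `prim@`.

```rust
/// Returns `true` if `c` is an ASCII letter.
///
/// The argument is the primitive [`prim@char`]; related helper functions
/// live in the [`mod@std::char`] module. Without a disambiguator, a bare
/// `char` link would be ambiguous when both are in scope.
pub fn is_ascii_letter(c: char) -> bool {
    c.is_ascii_alphabetic()
}
```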
Switch to intra-doc links in `std::macros`
Part of #75080.
---
* Switch to intra-doc links in `std::macros`
* Fix typo in module docs
* Link to `std::io::stderr` instead of `std::io::Stderr` to match the
link text
* Link to `std::io::stdout`
---
@rustbot modify labels: A-intra-doc-links T-doc T-rustdoc
Document that slice refers to any pointer type to a sequence
I was recently confused about the way slices are represented in memory. The necessary information was not available in the std-docs directly, but was a mix of different material from the reference and book.
This PR should clear up the definition of slices a bit more in the documentation. Especially the fact that the term slice refers to the pointer/reference type, e.g. `&[T]`, and not `[T]`.
It also documents that slice pointers are twice the size of pointers to `Sized` types, as this concept may be unfamiliar to users coming from other languages that do not have the concept of "fat pointers" (especially C/C++).
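The size difference is easy to observe directly. The helper `pointer_sizes` below is a hypothetical name used only for this illustration.

```rust
use std::mem::size_of;

/// Returns the size in bytes of a thin reference (`&u8`) and of a slice
/// reference (`&[u8]`), which carries a length next to the data pointer.
fn pointer_sizes() -> (usize, usize) {
    (size_of::<&u8>(), size_of::<&[u8]>())
}
```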
I've documented why this was important to me and my findings in [this blog post](https://codecrash.me/understanding-rust-slices).
r? @lcnr
clarify documentation of remove_dir errors
remove_dir will error if the path doesn't exist or isn't a directory.
It's useful to clarify that this is "remove dir or fail" not "remove dir
if it exists".
I don't think this belongs in the title. "Removes an existing, empty
directory" is strangely worded-- there's no such thing as a non-existing
directory. Better to just say explicitly it will return an error.
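A quick way to see the "remove dir or fail" behavior is shown below. The helper `remove_missing_dir` and the directory name are made up, and the sketch assumes no directory with that name exists in the temp directory.

```rust
use std::{env, fs, io, process};

/// Tries to remove a directory that (by assumption) does not exist.
fn remove_missing_dir() -> io::Result<()> {
    let path = env::temp_dir().join(format!("remove_dir_sketch_missing_{}", process::id()));
    // `remove_dir` does not silently succeed here; it reports an error.
    fs::remove_dir(&path)
}
```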
Remove `#[cfg(miri)]` from OnceCell tests
They were carried over from once_cell crate, but they are not entirely
correct (as miri now supports more things), and we don't run miri
tests for std, so let's just remove them.
Maybe one day we'll run miri in std, but then we can just re-install
these attributes.
Move to intra doc links for std::io
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
r? @jyn514
I had no problems with those files so I added some small links here and there.
Switch to intra-doc links in /src/sys/unix/ext/*.rs
Partial fix for #75080
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
r? @jyn514
These two links are not resolving to either `crate::fs::File...` or `fs::File...`
```
# unix/ext/fs.rs
27: /// [`File::read`]: ../../../../std/fs/struct.File.html#method.read
130: /// [`File::write`]: ../../../../std/fs/struct.File.html#method.write
```
Move to intra doc links for ascii.rs and panic.rs
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
I also updated the doc to fix the wording in `AsciiExt` since it is now deprecated.
The two files only needed small changes, so I bundled them together.
Some links could not be changed to make them work; I believe those are known issues with primitive types.
Move to intra doc links in std::net
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
The links for `true` and `false` had to stay, otherwise `rustdoc` complained. Is this intended?
Add sanitizer support on FreeBSD
Restarting #47337. Everything is better now, no more weird llvm problems, well not everything:
Unfortunately, the sanitizers don't have proper support for versioned symbols (https://github.com/google/sanitizers/issues/628), so `libc`'s usage of `stat@FBSD_1.0` and so on explodes, e.g. in calling `std::fs::metadata`.
Building std (now easy thanks to cargo `-Zbuild-std`) and libc with `freebsd12/13` config via the `LIBC_CI=1` env variable is a good workaround…
```
LIBC_CI=1 RUSTFLAGS="-Z sanitizer=address" cargo +san-test -Zbuild-std run --target x86_64-unknown-freebsd --verbose
```
…*except* std won't build because there's no `st_lspare` in the ino64 version of the struct, so an std patch is required:
```diff
--- i/src/libstd/os/freebsd/fs.rs
+++ w/src/libstd/os/freebsd/fs.rs
@@ -66,8 +66,6 @@ pub trait MetadataExt {
fn st_flags(&self) -> u32;
#[stable(feature = "metadata_ext2", since = "1.8.0")]
fn st_gen(&self) -> u32;
- #[stable(feature = "metadata_ext2", since = "1.8.0")]
- fn st_lspare(&self) -> u32;
}
#[stable(feature = "metadata_ext", since = "1.1.0")]
@@ -136,7 +134,4 @@ impl MetadataExt for Metadata {
fn st_flags(&self) -> u32 {
self.as_inner().as_inner().st_flags as u32
}
- fn st_lspare(&self) -> u32 {
- self.as_inner().as_inner().st_lspare as u32
- }
}
```
I guess std could like.. detect that `libc` isn't built for the old ABI, and replace the implementation of `st_lspare` with a panic?
std/sys/unix/time: make it easier for LLVM to optimize `Instant` subtraction.
This PR is the minimal change necessary to get LLVM to optimize `if self.t.tv_nsec >= other.t.tv_nsec` to branchless instructions (at least on x86_64), inspired by @m-ou-se's own attempts at optimizing `Instant` subtraction.
I stumbled over this by looking at the total number of instructions executed by `rustc -Z self-profile`, and found that after disabling ASLR, the largest source of non-determinism remaining was from this `if` taking one branch or the other, depending on the values involved.
The reason this code is even called so many times to make a difference, is that `measureme` (the `-Z self-profile` implementation) currently uses `Instant::elapsed` for its event timestamps (of which there can be millions).
I doubt it's critical to land this, although perhaps it could slightly improve some forms of benchmarking.
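For readers unfamiliar with the code in question, here is a minimal sketch of the kind of timespec subtraction being discussed. The `Timespec` struct, its field names and `sub_timespec` are hypothetical stand-ins, not the actual `std::sys::unix::time` code, and whether a given compiler version lowers the `if` to branchless instructions is not guaranteed.

```rust
const NSEC_PER_SEC: i64 = 1_000_000_000;

/// Hypothetical stand-in for the platform `timespec` (seconds + nanoseconds).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Timespec {
    secs: i64,
    nanos: i64,
}

/// Computes `later - earlier` as (whole seconds, nanoseconds).
/// Assumes `later >= earlier` and that both `nanos` fields are in `0..NSEC_PER_SEC`.
fn sub_timespec(later: Timespec, earlier: Timespec) -> (u64, u32) {
    // Both arms produce a pair of plain arithmetic expressions, which gives
    // the optimizer a chance to select between them without a jump.
    let (secs, nanos) = if later.nanos >= earlier.nanos {
        (later.secs - earlier.secs, later.nanos - earlier.nanos)
    } else {
        // Borrow one second so the nanosecond part stays non-negative.
        (
            later.secs - earlier.secs - 1,
            later.nanos + NSEC_PER_SEC - earlier.nanos,
        )
    };
    (secs as u64, nanos as u32)
}
```

The second arm is the "borrow" case, taken whenever the later timestamp has the smaller nanosecond field; which arm runs depends purely on the measured values, which is why it showed up as non-determinism.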
Change Debug impl of SocketAddr and IpAddr to match their Display output
This has already been done for `SocketAddrV4`, `SocketAddrV6`, `Ipv4Addr` and `Ipv6Addr`. I don't see a point in keeping the hard-to-read derived impl, especially when pretty printing:
V4(
127.0.0.1
)
From the `Display`, one can easily and unambiguously see if it's V4 or V6. Two examples:
```
127.0.0.1:443
[2001:db8:85a3::8a2e:370:7334]:443
```
Luckily the docs explicitly state that `Debug` output is not stable and that it may be changed at any time.
Using `Display` as `Debug` is very convenient for configuration structs (e.g. for webservers) that often just have a `derive(Debug)` and are printed that way to the one starting the server.
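As an illustration of the pattern (not the actual `std::net` source), a hand-written `Debug` impl can simply delegate to `Display`. The `Addr` enum below is a hypothetical stand-in for `IpAddr`.

```rust
use std::fmt;
use std::net::{Ipv4Addr, Ipv6Addr};

/// Hypothetical stand-in for `std::net::IpAddr`.
enum Addr {
    V4(Ipv4Addr),
    V6(Ipv6Addr),
}

impl fmt::Display for Addr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Addr::V4(a) => fmt::Display::fmt(a, f),
            Addr::V6(a) => fmt::Display::fmt(a, f),
        }
    }
}

// Instead of `#[derive(Debug)]`, forward to `Display` so that `{:?}` and
// `{:#?}` print the bare address rather than `V4(...)`.
impl fmt::Debug for Addr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        fmt::Display::fmt(self, f)
    }
}
```

A struct that contains such a field and derives `Debug` then prints the address in its familiar form, including under `{:#?}`.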
Improve documentation on process::Child.std* fields
As a relative beginner, it took a while for me to figure out I could just steal the references to avoid partially moving the child and thus retain ability to call functions on it (and store it in structs etc).
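A minimal sketch of one way to do this, using `Option::take` on the `stdout` field so that the `Child` value itself stays whole. The helper name `run_and_capture` is made up for illustration, and the usage shown after the block assumes a Unix-like system with an `echo` binary on the `PATH`.

```rust
use std::io::{self, Read};
use std::process::{Command, ExitStatus, Stdio};

/// Runs `cmd`, returning its captured stdout and its exit status.
fn run_and_capture(cmd: &mut Command) -> io::Result<(String, ExitStatus)> {
    let mut child = cmd.stdout(Stdio::piped()).spawn()?;

    // `take()` moves the handle out of the `Option` field and leaves `None`
    // behind, so `child` is not partially moved.
    let mut stdout = child.stdout.take().expect("stdout was requested as piped");

    let mut output = String::new();
    stdout.read_to_string(&mut output)?;

    // `child` is still a complete value, so `wait` (or `kill`, or storing it
    // in a struct) remains possible.
    let status = child.wait()?;
    Ok((output, status))
}
```

For example, `run_and_capture(Command::new("echo").arg("hello"))` should return the child's output together with a successful status. Borrowing with `child.stdout.as_mut()` is an alternative when the handle should stay inside the `Child`.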
This solves several problems
- race conditions where a file is truncated while copying from it. if we blindly trusted
the file size this would lead to an infinite loop
- proc files appearing empty to copy_file_range but not to read/write
https://github.com/coreutils/coreutils/commit/4b04a0c
- copy_file_range returning 0 for some filesystems (overlay? bind mounts?)
inside docker, again leading to an infinite loop
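To illustrate the underlying idea only (this is a generic read/write loop, not the `copy_file_range` code touched here), a copy can terminate on what the reader actually returns instead of on a previously queried file size. The helper name `copy_until_eof` is hypothetical.

```rust
use std::io::{self, Read, Write};

/// Copies from `reader` to `writer` until `read` reports end of input.
/// The loop never consults a file size, so a file that is truncated or that
/// reports a misleading length cannot keep it spinning.
fn copy_until_eof<R: Read, W: Write>(reader: &mut R, writer: &mut W) -> io::Result<u64> {
    let mut buf = [0u8; 8192];
    let mut total = 0u64;
    loop {
        let n = match reader.read(&mut buf) {
            // A zero-length read is the only end-of-input signal we trust.
            Ok(0) => return Ok(total),
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        writer.write_all(&buf[..n])?;
        total += n as u64;
    }
}
```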
Expand function pointer docs
Be more explicit in the ABI section, and add a section on how to obtain a function pointer, which can be somewhat confusing.
Cc https://github.com/rust-lang/rust/issues/75239
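As a quick illustration of the "how to obtain a function pointer" part, here is a small sketch showing the two common sources: a named function item and a non-capturing closure. All names are made up for the example.

```rust
fn add_one(x: i32) -> i32 {
    x + 1
}

/// Takes a plain function pointer rather than a generic `Fn` parameter.
fn apply(f: fn(i32) -> i32, x: i32) -> i32 {
    f(x)
}

fn examples() -> (i32, i32) {
    // A function item coerces to a function pointer when the type is spelled out.
    let from_item: fn(i32) -> i32 = add_one;

    // A closure that captures nothing coerces to a function pointer as well.
    let from_closure: fn(i32) -> i32 = |x| x * 2;

    (apply(from_item, 1), apply(from_closure, 21))
}
```

A closure that captures a local variable would not coerce this way; it can only be passed through one of the `Fn*` traits.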
Move to intra doc links whenever possible within std/src/lib.rs
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
There are some things like
```rust
//! [`Option<T>`]: option::Option
```
that will either be fixed in the future or have open issues about them.
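For comparison, a small sketch of what the intra-doc link style looks like in ordinary doc comments. The function is made up for the example; the links name items by path instead of by relative HTML file.

```rust
/// Returns the first character of `s`, or `None` if `s` is empty.
///
/// The result is an [`Option`], produced by calling [`Iterator::next`] on the
/// character iterator. With intra-doc links, rustdoc resolves both names by
/// path, so no `../option/enum.Option.html`-style target is needed.
pub fn first_char(s: &str) -> Option<char> {
    s.chars().next()
}
```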
Fix minor things in the `f32` primitive docs
All of these were review comments in #74621 that I first fixed in that PR, but later accidentally overwrote by a force push.
Thanks @the8472 for noticing.
r? @KodrAus
Fix wasi::fs::OpenOptions to imply write when append is on
This PR fixes a bug in `OpenOptions` on the `wasi` platform: write mode is currently not implied when only `append` is enabled.
As explained in the [doc of OpenOptions#append](https://doc.rust-lang.org/std/fs/struct.OpenOptions.html#method.append), calling `.append(true)` should imply `.write(true)` as well.
## Reproduce
Given the simple Rust program below:
```rust
use std::fs::OpenOptions;
use std::io::Write;
fn main() {
let mut file = OpenOptions::new()
.write(true)
.create(true)
.open("foo.txt")
.unwrap();
writeln!(file, "abc").unwrap();
}
```
it compiles to wasm successfully and runs under the `wasmtime` runtime:
```sh
$ rustc --target wasm32-wasi write.rs
$ ~/wasmtime/target/debug/wasmtime run --dir=. write.wasm
$ cat foo.txt
abc
```
However, when I change `.write(true)` to `.append(true)`, it fails to run with the error "Capabilities insufficient":
```sh
$ ~/wasmtime/target/debug/wasmtime run --dir=. append.wasm
thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: Os { code: 76, kind: Other, message: "Capabilities insufficient" }', append.rs:10:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Error: failed to run main module `append.wasm`
...
```
This is because the opened file lacks the required "rights":
```sh
$ RUST_LOG=trace ~/wasmtime/target/debug/wasmtime run --dir=. append.wasm 2>&1 | grep validate_rights
TRACE wasi_common::entry > | validate_rights failed: required rights = HandleRights { base: fd_write (0x40), inheriting: empty (0x0) }; actual rights = HandleRights { base: fd_seek|fd_fdstat_set_flags|fd_sync|fd_tell|fd_advise|fd_filestat_set_times|poll_fd_readwrite (0x88000bc), inheriting: empty (0x0) }
```
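For reference, a minimal sketch of the behaviour the docs promise: with only `.append(true)` (plus `.create(true)`), writes are expected to succeed without an explicit `.write(true)`. The helper name `append_line` is made up for the example.

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Appends `line` plus a newline to the file at `path`, creating it if needed.
/// Note that `.write(true)` is deliberately absent: `.append(true)` is
/// documented to imply it.
fn append_line(path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new().append(true).create(true).open(path)?;
    writeln!(file, "{}", line)
}
```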
Add Ipv6Addr::to_ipv4_mapped
* add Ipv6Addr::to_ipv4_mapped
* ~~deprecate Ipv4Addr::to_ipv6_compatible & Ipv6Addr::to_ipv4~~ reference: #75150
According to [IETF RFC 4291](https://tools.ietf.org/html/rfc4291#page-10), the "IPv4-Compatible IPv6 address" is deprecated.
> 2.5.5.1. IPv4-Compatible IPv6 Address
>
> The "IPv4-Compatible IPv6 address" was defined to assist in the IPv6
> transition. The format of the "IPv4-Compatible IPv6 address" is as
> follows:
>
> | 80 bits | 16 | 32 bits |
> +--------------------------------------+--------------------------+
> |0000..............................0000|0000| IPv4 address |
> +--------------------------------------+----+---------------------+
>
> Note: The IPv4 address used in the "IPv4-Compatible IPv6 address"
> must be a globally-unique IPv4 unicast address.
>
> The "IPv4-Compatible IPv6 address" is now deprecated because the
> current IPv6 transition mechanisms no longer use these addresses.
> New or updated implementations are not required to support this
> address type.
And the current implementation of `Ipv4Addr::to_ipv6_compatible` is incorrect: it does not check whether the IPv4 address is a globally-unique IPv4 unicast address.
Please let me know if there are any issues with this pull request.
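To make the distinction concrete, here is a sketch of the check an IPv4-mapped conversion performs, written as a free function so it does not depend on the new method being available. The name `ipv4_mapped` is hypothetical. Only addresses of the form `::ffff:a.b.c.d` are accepted, so IPv4-compatible addresses (`::a.b.c.d`) are rejected.

```rust
use std::net::{Ipv4Addr, Ipv6Addr};

/// Returns the embedded IPv4 address if `addr` is IPv4-mapped
/// (`::ffff:a.b.c.d`), and `None` otherwise.
fn ipv4_mapped(addr: &Ipv6Addr) -> Option<Ipv4Addr> {
    match addr.octets() {
        // 80 zero bits, then 16 one bits, then the 32-bit IPv4 address.
        [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff, a, b, c, d] => {
            Some(Ipv4Addr::new(a, b, c, d))
        }
        _ => None,
    }
}
```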
Improve `f32` and `f64` primitive documentation
I noticed that the docs for the primitive floats were fairly short. I first only wanted to add the IEEE specification information (compare [the reference](https://doc.rust-lang.org/reference/types/numeric.html)), but then also added some more beginner-friendly docs. Let me know what you think!
Random doc team assign:
r? @rylev
Std panicking unsafe block in unsafe fn
Partial fix of #73904.
This encloses `unsafe` operations in `unsafe fn` in `libstd/ffi/panicking.rs`.
I also made a two-line change to `libstd/thread/local.rs` to add the necessary `unsafe` block without breaking everything else.
@rustbot modify labels: F-unsafe-block-in-unsafe-fn
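For context, a small sketch of what "enclosing `unsafe` operations in `unsafe fn`" means in practice. The function is made up for the example, and the lint is applied per item here only to keep the snippet self-contained.

```rust
/// Reads the value behind `ptr`.
///
/// # Safety
///
/// `ptr` must be non-null, properly aligned, and point to an initialized `u32`.
#[deny(unsafe_op_in_unsafe_fn)]
unsafe fn read_u32(ptr: *const u32) -> u32 {
    // With the lint enabled, the body of an `unsafe fn` is no longer an
    // implicit unsafe context, so the dereference needs its own block.
    // SAFETY: the caller upholds the contract documented above.
    unsafe { *ptr }
}
```

The explicit block narrows the unsafe region to the one operation that needs it and gives a natural place for the `SAFETY` comment.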
Unfortunately, sanitizers do not support versioned symbols[1],
so they break filesystem access via the legacy, pre-ino64 ABI.
To use sanitizers on FreeBSD >= 12, we need to build the libc
crate with LIBC_CI=1 to use the new ABI -- including the libc
used for std. But that removes the st_lspare field std was
expecting for the deprecated metadata extension.
Add a way to skip that field to allow the build to work.
[1]: https://github.com/google/sanitizers/issues/628
Move to intra-doc links in library/std/src/path.rs
Helps with #75080.
@rustbot modify labels: T-doc, A-intra-doc-links, T-rustdoc
Known issue: The following links are broken (they are inside trait impls, undocumented in this file, inheriting from the original doc):
- [`Hasher`]
- [`Self`] (referencing `../primitive.slice.html`)
- [`Ordering`]
remove_dir will error if the path doesn't exist or isn't a directory.
It's useful to clarify that this is "remove dir or fail" not "remove dir
if it exists".
I don't think this belongs in the title. "Removes an existing, empty
directory" is strangely worded-- there's no such thing as a non-existing
directory. Better to just say explicitly it will return an error.
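A short sketch of the distinction: `remove_dir` returns an error for a missing path, so "remove if it exists" has to be built on top by the caller. The helper name `remove_dir_if_exists` is made up for the example.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Removes the empty directory at `path`.
/// Returns `Ok(true)` if it was removed and `Ok(false)` if it did not exist;
/// any other failure (not a directory, not empty, ...) is passed through.
fn remove_dir_if_exists(path: &Path) -> io::Result<bool> {
    match fs::remove_dir(path) {
        Ok(()) => Ok(true),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(false),
        Err(e) => Err(e),
    }
}
```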
Implement `into_keys` and `into_values` for associative maps
This PR implements `into_keys` and `into_values` for HashMap and BTreeMap types. They are implemented as unstable, under `map_into_keys_values` feature.
Fixes #55214.
r? @dtolnay
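A brief usage sketch, assuming a toolchain where `into_keys` and `into_values` are available without the feature gate. `BTreeMap` is used so that the order is deterministic; the helper names are made up for the example.

```rust
use std::collections::BTreeMap;

/// Consumes the map and returns its keys, without cloning them.
fn owned_keys(map: BTreeMap<String, u32>) -> Vec<String> {
    map.into_keys().collect()
}

/// Consumes the map and returns its values, in key order.
fn owned_values(map: BTreeMap<String, u32>) -> Vec<u32> {
    map.into_values().collect()
}
```

Before these methods, the same result needed `map.into_iter().map(|(k, _)| k)` or a clone of each key from `keys()`.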
All #[cfg(unix)] platforms follow the POSIX standard and define _SC_IOV_MAX so
that we rely purely on POSIX semantics to determine the limits on I/O vector
count.
Keep the I/O vector count limit in a `SyncOnceCell` to avoid the overhead of
repeatedly calling `sysconf` as these limits are guaranteed to not change during
the lifetime of a process by POSIX.
Both Linux and macOS enforce limits on the vector count when performing vectored
I/O via the readv and writev system calls and return EINVAL when these limits
are exceeded. This changes the standard library to handle those limits as short
reads and writes to avoid forcing its users to query these limits using
platform specific mechanisms.
Co-authored-by: Weiyi Wang <wwylele@gmail.com>
Co-authored-by: Adam Reichold <adam.reichold@t-online.de>
Co-authored-by: Josh Stone <cuviper@gmail.com>
Co-authored-by: Scott McMurray <scottmcm@users.noreply.github.com>
Co-authored-by: tmiasko <tomasz.miasko@gmail.com>
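The two ideas in the commit message above (cache the limit once, then truncate the buffer list so an oversized call becomes a short write) can be sketched as follows. This is not the actual `std` implementation: `query_iov_max` is a hypothetical stand-in for `sysconf(_SC_IOV_MAX)` with a fixed value, and `OnceLock` is used as the stabilized name of `SyncOnceCell`.

```rust
use std::io::{self, IoSlice, Write};
use std::sync::OnceLock;

/// Hypothetical stand-in for `sysconf(_SC_IOV_MAX)`; the value is an assumption.
fn query_iov_max() -> usize {
    1024
}

/// Returns the cached limit, querying it only on first use.
fn max_iov() -> usize {
    static LIMIT: OnceLock<usize> = OnceLock::new();
    *LIMIT.get_or_init(query_iov_max)
}

/// Passes at most `max_iov()` buffers to the writer. The caller sees a short
/// write instead of an `EINVAL`-style error and can retry with the rest.
fn write_vectored_clamped<W: Write>(w: &mut W, bufs: &[IoSlice<'_>]) -> io::Result<usize> {
    let n = bufs.len().min(max_iov());
    w.write_vectored(&bufs[..n])
}
```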
Previously `std::fs::File::metadata` on wasm32-wasi would call `fd_filestat_get`
to get metadata associated with the fd, but that fd is opened without the
RIGHTS_FD_FILESTAT_GET right, so the call fails on a correctly implemented WASI
environment.
This change instead adds the missing rights when opening an fd.
Remove links to rejected errata 4406 for RFC 4291
Fixes #74198.
For now I simply removed the links. The docs seem clear enough to me, but I'm no expert in the domain, so don't hesitate to correct me if more is needed.
cc @ghanan94.
@rustbot modify labels: T-doc, T-libs
Fix RefUnwindSafe & UnwindSafe impls for lazy::SyncLazy
I *think* we should implement those unconditionally with respect to `F`.
The user code can't observe the closure in any way, and we poison lazy if the closure itself panics.
But I've never fully wrapped my head around `UnwindSafe` traits, so 🤷♂️
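To show what "unconditionally with respect to `F`" looks like, here is a sketch on a hypothetical `MyLazy` type (not the real `SyncLazy`, whose bounds go through `SyncOnceCell<T>`). The impls constrain only `T`, so the type is unwind-safe even when the initializer closure type is not.

```rust
use std::panic::{RefUnwindSafe, UnwindSafe};

/// Hypothetical lazy cell: a cached value plus the not-yet-run initializer.
struct MyLazy<T, F> {
    value: Option<T>,
    init: Option<F>,
}

// Writing these impls by hand replaces the automatically derived ones,
// which would otherwise also require `F` to be (Ref)UnwindSafe.
impl<T: UnwindSafe, F> UnwindSafe for MyLazy<T, F> {}
impl<T: RefUnwindSafe, F> RefUnwindSafe for MyLazy<T, F> {}

/// Compile-time check helper: only callable for unwind-safe types.
fn assert_unwind_safe<X: UnwindSafe + RefUnwindSafe>() {}
```

For instance, `assert_unwind_safe::<MyLazy<i32, Box<dyn FnOnce() -> i32>>>()` compiles with these impls, although the boxed closure type on its own does not implement either trait.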
This commit is a proof-of-concept for switching the standard library's
backtrace symbolication mechanism on most platforms from libbacktrace to
gimli. The standard library's support for `RUST_BACKTRACE=1` requires
in-process parsing of object files and DWARF debug information to
interpret it and print the filename/line number of stack frames as part
of a backtrace.
Historically this support in the standard library has come from a
library called "libbacktrace". The libbacktrace library seems to have
been extracted from gcc at some point and is written in C. We've had a
lot of issues with libbacktrace over time, unfortunately, though. The
library does not appear to be actively maintained since we've had
patches sit for months-to-years without comments. We have discovered a
good number of soundness issues with the library itself, both when
parsing valid DWARF as well as invalid DWARF. This is enough of an issue
that the libs team has previously decided that we cannot feed untrusted
inputs to libbacktrace. This also doesn't take into account the
portability of libbacktrace which has been difficult to manage and
maintain over time. While possible there are lots of exceptions and it's
the main C dependency of the standard library right now.
For years it's been the desire to switch over to a Rust-based solution
for symbolicating backtraces. It's been assumed that we'll be using the
Gimli family of crates for this purpose, which are targeted at safely
and efficiently parsing DWARF debug information. I've been working
recently to shore up the Gimli support in the `backtrace` crate. As of a
few weeks ago the `backtrace` crate, by default, uses Gimli when loaded
from crates.io. This transition has gone well enough that I figured it
was time to start talking seriously about this change to the standard
library.
This commit is a preview of what's probably the best way to integrate
the `backtrace` crate into the standard library with the Gimli feature
turned on. While today it's used as a crates.io dependency, this commit
switches the `backtrace` crate to a submodule of this repository which
will need to be updated manually. This is not done lightly, but is
thought to be the best solution. The primary reason for this is that the
`backtrace` crate needs to do some pretty nontrivial filesystem
interactions to locate debug information. Working without `std::fs` is
not an option, and while it might be possible to do some sort of
trait-based solution when prototyped it was found to be too unergonomic.
Using a submodule allows the `backtrace` crate to build as a submodule
of the `std` crate itself, enabling it to use `std::fs` and such.
Otherwise this adds new dependencies to the standard library. This step
requires extra attention because this means that these crates are now
going to be included with all Rust programs by default. It's important
to note, however, that we're already shipping libbacktrace with all Rust
programs by default and it has a bunch of C code implementing all of
this internally anyway, so we're basically already switching
already-shipping functionality to Rust from C.
* `object` - this crate is used to parse object file headers and
contents. Very low-level support is used from this crate and almost
all of it is disabled. Largely we're just using struct definitions as
well as convenience methods internally to read bytes and such.
* `addr2line` - this is the main meat of the implementation for
symbolication. This crate depends on `gimli` for DWARF parsing and
then provides interfaces needed by the `backtrace` crate to turn an
address into a filename / line number. This crate is actually pretty
small (fits in a single file almost!) and mirrors most of what
`dwarf.c` does for libbacktrace.
* `miniz_oxide` - the libbacktrace crate transparently handles
compressed debug information which is compressed with zlib. This crate
is used to decompress compressed debug sections.
* `gimli` - not actually used directly, but a dependency of `addr2line`.
* `adler32` - not used directly either, but a dependency of
`miniz_oxide`.
The goal of this change is to improve the safety of backtrace
symbolication in the standard library, especially in the face of
possibly malformed DWARF debug information. Even to this day we're still
seeing segfaults in libbacktrace which could possibly become security
vulnerabilities. This change should almost entirely eliminate this
possibility while also paving the way forward to adding more features
like split debug information.
Some references for those interested are:
* Original addition of libbacktrace - #12602
* OOM with libbacktrace - #24231
* Backtrace failure due to use of uninitialized value - #28447
* Possibility to feed untrusted data to libbacktrace - #21889
* Soundness fix for libbacktrace - #33729
* Crash in libbacktrace - #39468
* Support for macOS, never merged - ianlancetaylor/libbacktrace#2
* Performance issues with libbacktrace - #29293, #37477
* Update procedure is quite complicated due to how many patches we
need to carry - #50955
* Libbacktrace doesn't work on MinGW with dynamic libs - #71060
* Segfault in libbacktrace on macOS - #71397
Switching to Rust will not make us immune to all of these issues. The
crashes are expected to go away, but correctness and performance may
still have bugs arise. The gimli and `backtrace` crates, however, are
actively maintained unlike libbacktrace, so this should enable us to at
least efficiently apply fixes as situations come up.
This commit updates the src/stdarch submodule primarily to include
rust-lang/stdarch#874 which updated and revamped WebAssembly SIMD
intrinsics and renamed WebAssembly atomics intrinsics. This is all
unstable surface area of the standard library so the changes should be
ok here. The SIMD updates also enable SIMD intrinsics to be used by any
program at any time, yay!
cc #74372, a tracking issue I've opened for the stabilization of SIMD
intrinsics
This has already been done for `SocketAddrV4`, `SocketAddrV6`,
`Ipv4Addr` and `Ipv6Addr`. I don't see a point in keeping the
hard-to-read derived impl, especially when pretty printing:
V4(
127.0.0.1
)
From the `Display`, one can easily and unambiguously see if it's V4 or
V6. Using `Display` as `Debug` is very convenient for configuration
structs (e.g. for webservers) that often just have a `derive(Debug)`
and are printed that way to the user.